
# Putting It All Together: Futures, Tasks, and Threads

As we saw earlier, threads provide one approach to concurrency. Another approach is using async with futures and streams.

If you have been wondering when to choose one over the other, the answer is: it depends. In many cases, the choice isn't threads or async but rather threads and async.

Many operating systems have supplied threading-based concurrency models for decades now, and many programming languages support them as a result.

These models are not without their own tradeoffs.

On many OSes, they use a fair bit of memory, and they come with some overhead for starting up and shutting down.

Threads are also only an option when your OS and hardware support them.

Unlike modern desktop and mobile computers, some embedded systems don't have an OS at all, so they also don't have threads.

The async model provides a different and complementary set of tradeoffs.

In the async model, concurrent operations don't require their own threads.

Instead, they can run on tasks (just as when we used trpl::spawn_task to kick off work from a synchronous function in the streams section).

A task is similar to a thread, but instead of being managed by the OS, it is managed by library-level code: the runtime.

Previously, we saw that we could build a stream by using an async channel and spawning an async task that we could call from synchronous code.

We can do the exact same thing with a thread.

Where we previously used trpl::spawn_task and trpl::sleep, here we replace those with the thread::spawn and thread::sleep APIs from the standard library in the get_intervals function.

```rust
use std::{thread, time::Duration};

use trpl::{ReceiverStream, Stream};

fn get_intervals() -> impl Stream<Item = u32> {
    let (tx, rx) = trpl::channel();

    // This is *not* `trpl::spawn` but `std::thread::spawn`!
    thread::spawn(move || {
        let mut count = 0;
        loop {
            // Likewise, this is *not* `trpl::sleep` but `std::thread::sleep`!
            thread::sleep(Duration::from_millis(1));
            count += 1;

            if let Err(send_error) = tx.send(count) {
                eprintln!("Could not send interval {count}: {send_error}");
                break;
            };
        }
    });

    ReceiverStream::new(rx)
}
```

If you run this code, its output is identical to that of the earlier version.

Notice how little changes here from the perspective of the calling code.

What's more, even though one of our functions spawned an async task on the runtime and the other spawned an OS thread, the resulting streams were unaffected by the difference.

Despite their similarities, these two behave very differently, although we might have a hard time measuring it in this very simple example.

For instance, we could spawn millions of async tasks on any modern personal computer.

If we tried to do that with threads, we would literally run out of memory.

There is a reason that these APIs are so similar.

Threads act as a boundary for sets of synchronous operations; concurrency is possible between threads.

Tasks act as a boundary for sets of asynchronous operations.

Concurrency is possible both between and within tasks, because a task can switch between futures in its body.

Finally, futures are Rust's most granular unit of concurrency, and each future may represent a tree of other futures.

The runtime (specifically, its executor) manages tasks and tasks manage futures.

In this regard, tasks are similar to lightweight, runtime-managed threads with added capabilities that come from being managed by a runtime instead of by the operating system.

This doesn't mean that async tasks are always better than threads (or vice versa).

Concurrency with threads is in some ways a simpler programming model than concurrency with async.

This can be either a strength or a weakness.

Threads are somewhat "fire and forget".

They have no native equivalent to a future, so they simply run to completion without being interrupted except by the OS itself.

That is, they have no built-in support for intratask concurrency the way futures do.

Threads in Rust also have no mechanisms for cancellation (a topic we haven't covered explicitly in this chapter, but one that was implied by the fact that whenever we ended a future, its state got cleaned up correctly).
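As a small illustration (standard library only, not code from the book), the closest a plain thread gets to a future's lifecycle is join, which blocks until the thread has run to completion; nothing in the API lets the caller cancel it partway:

```rust
use std::{thread, time::Duration};

fn main() {
    // Once spawned, a thread has no native cancellation mechanism: it
    // simply runs until its closure returns (or the process exits).
    let handle = thread::spawn(|| {
        thread::sleep(Duration::from_millis(10));
        42
    });

    // The only built-in synchronization point is `join`, which blocks
    // the calling thread until the spawned thread runs to completion.
    let result = handle.join().unwrap();
    assert_eq!(result, 42);
    println!("thread ran to completion with {result}");
}
```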

These limitations also make threads harder to compose than futures.

It is much more difficult, for example, to use threads to build helpers such as the timeout and throttle methods that we built earlier in this chapter.

The fact that futures are richer data structures means they can be composed together more naturally as we have seen.

Tasks give us additional control over futures, allowing us to choose where and how to group them.

It turns out that threads and tasks often work very well together, because tasks (in some runtimes) can be moved around between threads.

In fact, under the hood, the runtime we have been using, including the spawn_blocking and spawn_task functions, is multithreaded by default.

Many runtimes use an approach called work stealing to transparently move tasks around between threads, based on how those threads are currently being utilized, to improve the system's overall performance.

This approach actually requires threads and tasks, and therefore futures.

When thinking about which method to use when, consider these rules of thumb:

  • If the work is very parallelizable, such as processing a bunch of data where each part can be processed separately, threads are a better choice.
  • If the work is very concurrent, such as handling messages from a bunch of different sources that may come in at different intervals or different rates, async is a better choice.

If you need both concurrency and parallelism, you don't have to choose between threads and async.

You can use them together freely, letting each one play the part it is best at.

The code below shows a fairly common example of this kind of mix in real-world Rust code.

```rust
use std::{thread, time::Duration};

fn main() {
    let (tx, mut rx) = trpl::channel();

    thread::spawn(move || {
        for i in 1..11 {
            tx.send(i).unwrap();
            thread::sleep(Duration::from_secs(1));
        }
    });

    trpl::run(async {
        while let Some(message) = rx.recv().await {
            println!("{message}");
        }
    });
}
```

Here we begin by creating an async channel, then spawn a thread that takes ownership of the sender side of the channel.

Within the thread, we send the numbers 1-10, sleeping for a second between each.

Finally, we run a future created with an async block passed to trpl::run just as we have throughout the chapter.

In this future, we await those messages, just as in the other message-passing examples we have seen.

Returning to the scenario we opened the chapter with, imagine running a set of video encoding tasks using a dedicated thread (because video encoding is compute-bound) but notifying the UI that those operations are done with an async channel.

There are countless examples of these kinds of combinations in real-world use cases.