finished ch17.5

This commit is contained in:
darkicewolf50 2025-03-31 14:55:57 -06:00
parent 0e0e0ea857
commit c1bdb1043e
3 changed files with 221 additions and 4 deletions


@ -91,6 +91,20 @@
 "title": "Traits for Async"
 }
 },
+{
+"id": "c00c13dd25b12ad4",
+"type": "leaf",
+"state": {
+"type": "markdown",
+"state": {
+"file": "Futures, Tasks and Threads Together.md",
+"mode": "source",
+"source": false
+},
+"icon": "lucide-file",
+"title": "Futures, Tasks and Threads Together"
+}
+},
 {
 "id": "2a974ca5442d705f",
 "type": "leaf",
@ -144,7 +158,7 @@
 }
 }
 ],
-"currentTab": 5
+"currentTab": 6
 }
 ],
 "direction": "vertical"
@ -287,10 +301,11 @@
 "command-palette:Open command palette": false
 }
 },
-"active": "ee4116419493acd3",
+"active": "c00c13dd25b12ad4",
 "lastOpenFiles": [
-"Futures in Sequence.md",
 "Traits for Async.md",
+"Futures, Tasks and Threads Together.md",
+"Futures in Sequence.md",
 "Any Number of Futures.md",
 "Futures and Async.md",
 "Async, Await, Futures and Streams.md",
@ -315,7 +330,6 @@
 "minigrep/src/lib.rs",
 "Test_Organization.md",
 "Traits.md",
-"Modules and Use.md",
 "does_not_compile.svg",
 "Untitled.canvas",
 "Good and Bad Code/Commenting Pratices",


@ -0,0 +1 @@
# Putting It All Together: Futures, Tasks and Threads


@ -201,3 +201,205 @@ This was in terms of `Unpin`, not `Pin`.
How does `Pin` relate to `Unpin` and why does `Future` need `self` to be in a `Pin` type to call `poll`?
Remember from before: a series of await points in a future gets compiled into a state machine, and the compiler makes sure that state machine follows all of Rust's normal rules around safety, which includes borrowing and ownership.
In order to make this work, Rust looks at what data is needed between one await point and either the next await point or the end of the async block.
Each variant of the state machine gets the access it needs to the data that will be used in that section of the source code, whether by taking ownership of that data or by getting a mutable or immutable reference to it.
If we get anything wrong about the ownership or references in a given async block, the borrow checker will tell us.
When we want to move around the future that corresponds to that block, like moving it into a `Vec` to pass to `join_all`, things get trickier.
When we move a future, either by pushing it into a data structure to use as an iterator with `join_all` or by returning it from a function, this actually means moving the state machine Rust creates for us.
Unlike most other types in Rust, the futures Rust creates for async blocks can end up with references to themselves in the fields of any given variant.
The illustration below shows this.
<img src="https://doc.rust-lang.org/book/img/trpl17-04.svg" />
By default, any object that has a reference to itself is unsafe to move, because references always point to the actual memory address of whatever they refer to.
If you move the data structure itself, those internal references will be left pointing to the old location.
However, that memory location is now invalid.
For one thing, its value will not be updated when you make changes to the data structure.
More importantly, the computer is now free to reuse that memory for other purposes.
You could end up reading completely unrelated data later.
<img src="https://doc.rust-lang.org/book/img/trpl17-05.svg" />
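The stale-pointer problem can be sketched in safe code. The `SelfAware` struct below is hypothetical: it records the address of its own field as a plain integer, an assumption made so the example needs no `unsafe` (a real self-referential future would hold an actual reference). Once the value is moved into a `Box`, the recorded address no longer matches where the data lives.

```rust
/// Hypothetical stand-in for a self-referential type.
struct SelfAware {
    value: u32,
    /// Address of `value` when `remember` was last called.
    /// Stored as an integer and never dereferenced.
    value_addr: usize,
}

impl SelfAware {
    /// Record where `value` currently lives in memory.
    fn remember(&mut self) {
        self.value_addr = &self.value as *const u32 as usize;
    }

    /// True when the recorded address no longer matches the real one.
    fn is_stale(&self) -> bool {
        self.value_addr != &self.value as *const u32 as usize
    }
}

/// Returns (stale before the move, stale after the move).
fn move_makes_address_stale() -> (bool, bool) {
    let mut on_stack = SelfAware { value: 7, value_addr: 0 };
    on_stack.remember();
    let before = on_stack.is_stale();

    // Moving the struct to the heap copies its bytes to a new location;
    // the recorded address still names the old stack slot.
    let on_heap = Box::new(on_stack);
    let after = on_heap.is_stale();

    (before, after)
}
```

If `value_addr` were a real reference instead of an integer, the second check is exactly the situation where it would dangle.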
Theoretically, the Rust compiler could try to update every reference to an object whenever it gets moved, but that could add a lot of performance overhead.
This is especially true if a whole web of references needs updating.
If we could instead ensure that the data structure *doesn't move in memory*, we then wouldn't have to update any references.
This is exactly what Rust's borrow checker requires: in safe code, it prevents you from moving any item with an active reference to it.
`Pin` builds on that to give us the exact guarantee we need.
When we *pin* a value by wrapping a pointer to that value in `Pin`, it can no longer move.
Thus, if you have `Pin<Box<SomeType>>`, you actually pin the `SomeType` value, *not* the `Box` pointer.
The image below illustrates this.
<img src="https://doc.rust-lang.org/book/img/trpl17-06.svg" />
In fact, the `Box` pointer can still move around freely.
We care about making sure the data ultimately being referenced stays in place.
If a pointer moves around, *but the data it points to stays in the same place*, there is no potential problem.
As an independent exercise, look at the docs for these types as well as the `std::pin` module and try to work out how you would do this with a `Pin` wrapping a `Box`.
The key is that the self-referential type cannot move, because it is still pinned.
<img src="https://doc.rust-lang.org/book/img/trpl17-07.svg" />
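A small sketch of the point that the `Box` pointer may move while the pinned data stays put. The `Pinned` type here is hypothetical; it uses `std::marker::PhantomPinned` to opt out of `Unpin`, standing in for a self-referential future.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

/// Hypothetical type that must not move once pinned.
struct Pinned {
    data: String,
    _pin: PhantomPinned,
}

/// Address of the `data` field inside the pinned heap allocation.
fn data_addr(p: &Pin<Box<Pinned>>) -> usize {
    // Field access auto-derefs through `Pin<Box<_>>` to `Pinned`.
    &p.data as *const String as usize
}

/// Returns the data address before and after moving the `Pin<Box<_>>` handle.
fn moving_the_box_keeps_data_in_place() -> (usize, usize) {
    let pinned: Pin<Box<Pinned>> = Box::pin(Pinned {
        data: String::from("state"),
        _pin: PhantomPinned,
    });
    let before = data_addr(&pinned);

    // This moves the pointer into the Vec's buffer; the heap allocation
    // that holds `Pinned` is not touched.
    let holder = vec![pinned];
    let after = data_addr(&holder[0]);

    (before, after)
}
```

Both addresses are equal: only the handle changed location, which is why a `Vec` of `Pin<Box<...>>` values can be built and passed around freely.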
However, most types are perfectly safe to move around, even if they happen to be behind a `Pin` pointer.
We only need to think about pinning when the items have internal references.
Primitive values such as numbers and Booleans are safe to move because they obviously don't have any internal references.
Neither do most types you normally work with in Rust.
You can move around a `Vec`, for example, without worrying.
Given what we have seen so far, if you have a `Pin<Vec<String>>`, you would have to do everything via the safe but restrictive APIs provided by `Pin`, even though a `Vec<String>` is always safe to move if there are no other references to it.
We need a way to tell the compiler that it is fine to move items around in cases like this. This is where `Unpin` comes into play.
`Unpin` is a marker trait, similar to the `Send` and `Sync` traits, and thus has no functionality of its own.
Marker traits exist only to tell the compiler to use the type implementing a given trait in a particular context.
`Unpin` informs the compiler that a given type does *not* need to uphold any guarantees about whether the value in question can be safely moved.
Just like `Send` and `Sync`, the compiler implements `Unpin` automatically for all types where it can prove it is safe.
The special case is where `Unpin` is *not* implemented for a type.
The notation for this is `impl !Unpin for *SomeType*`, where `*SomeType*` is the name of a type that *does* need to uphold those guarantees to be safe whenever a pointer to that type is used in a `Pin`.
There are two important things to remember about the relationship between `Pin` and `Unpin`:
- `Unpin` is the "normal case", `!Unpin` is the special case
- Whether a type implements `Unpin` or `!Unpin` *only* matters when you are using a pinned pointer to that type like `Pin<&mut *SomeType*>`
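To see the "normal case" in practice, here is a minimal sketch that asks the compiler whether some common types are `Unpin`. The helper `requires_unpin` and the type `NeedsPinning` are hypothetical names introduced for illustration. Writing `impl !Unpin` directly needs an unstable compiler feature, so the sketch opts out through `std::marker::PhantomPinned`, which works on stable Rust.

```rust
use std::marker::PhantomPinned;

/// Compiles only when `T: Unpin`; does nothing interesting at runtime.
fn requires_unpin<T: Unpin>() -> bool {
    true
}

/// Hypothetical type that is `!Unpin` because it contains `PhantomPinned`.
#[allow(dead_code)]
struct NeedsPinning {
    _pin: PhantomPinned,
}

/// Every call below type-checks, so all of these types are `Unpin`.
fn common_types_are_unpin() -> bool {
    requires_unpin::<u32>()
        && requires_unpin::<bool>()
        && requires_unpin::<String>()
        && requires_unpin::<Vec<String>>()
    // `requires_unpin::<NeedsPinning>()` would be rejected at compile time,
    // because `PhantomPinned` does not implement `Unpin`.
}
```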
To make that concrete, think about a `String`: it has a length and the Unicode characters that make it up.
We can wrap a `String` in `Pin`.
However, `String` automatically implements `Unpin`, as do most other types in Rust.
<img src="https://doc.rust-lang.org/book/img/trpl17-08.svg" />
Pinning a `String`; the dotted line indicates that the `String` implements the `Unpin` trait, and thus is not pinned.
As a result, we can do things that would be illegal if `String` implemented `!Unpin` instead, such as replacing one string with another at the exact same location in memory.
This wouldn't violate the `Pin` contract, because `String` has no internal references that make it unsafe to move around.
This is precisely why it implements `Unpin` rather than `!Unpin`.
<img src="https://doc.rust-lang.org/book/img/trpl17-09.svg" />
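The replacement described above can be sketched with the standard library alone. The function name is hypothetical; the calls it makes (`Pin::new` and assignment through the pinned pointer) are available only because `String: Unpin`.

```rust
use std::pin::Pin;

/// Pins a `String`, then overwrites it in place through the `Pin`.
fn replace_pinned_string() -> String {
    let mut s = String::from("hello");

    // `Pin::new` accepts only pointers to `Unpin` targets.
    let mut pinned: Pin<&mut String> = Pin::new(&mut s);

    // `Pin<&mut T>` implements `DerefMut` when `T: Unpin`, so we can
    // assign a whole new value to the same memory location.
    *pinned = String::from("goodbye");

    s
}
```

For a `!Unpin` type, neither `Pin::new` nor the assignment through `DerefMut` would compile, which is how the compiler enforces the pinning guarantee.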
Now we know enough to understand the errors reported for that `join_all` call from before.
There we originally tried to move the futures produced by the async blocks into a `Vec<Box<dyn Future<Output = ()>>>`.
As we have seen, those futures may have internal references, so they don't implement `Unpin`.
They need to be pinned and then we can pass the `Pin` type into the `Vec`, confident that the underlying data in the futures will *not* be moved.
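A minimal sketch of that fix, using only the standard library. It stores pinned, boxed futures in a `Vec` and then drives them one after another with a do-nothing waker. Note the assumptions: this sequential loop is a stand-in for illustration and is not how `join_all` works (which polls the futures concurrently), `NoopWake` and `run_pinned_futures` are hypothetical names, and the futures return `u32` rather than `()` so the result can be inspected.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; enough for futures that never return `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

fn run_pinned_futures() -> Vec<u32> {
    // Each async block has its own anonymous type, so we box them as trait
    // objects. `Box::pin` pins the future on the heap, so moving the
    // `Pin<Box<_>>` into the Vec does not move the future itself.
    let futures: Vec<Pin<Box<dyn Future<Output = u32>>>> = vec![
        Box::pin(async { 1u32 }),
        Box::pin(async { 2u32 }),
    ];

    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let mut results = Vec::new();
    for mut fut in futures {
        // Busy-poll until ready; a real runtime would wait to be woken.
        loop {
            if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
                results.push(value);
                break;
            }
        }
    }
    results
}
```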
`Pin` and `Unpin` are mostly important for building lower-level libraries, or when you are building a runtime itself, rather than for day-to-day Rust.
When you see these traits in error messages, now you will have a better idea of how to fix your code.
Note, the combination of `Pin` and `Unpin` makes it possible to safely implement a whole class of complex types in Rust that would otherwise prove challenging because they are self-referential.
Types that require `Pin` show up most commonly in async Rust today.
Every once in a while, you may see them in other contexts too.
The specifics of how `Pin` and `Unpin` work, and the rules they are required to uphold, are covered extensively in the API documentation for `std::pin`, so you can check there for more info.
In fact, there is a whole book on async Rust programming: [Asynchronous Programming in Rust](https://rust-lang.github.io/async-book/).
## The `Stream` Trait
As we learned earlier, streams are similar to asynchronous iterators.
Unlike `Iterator` and `Future`, `Stream` has no definition in the standard library (as of writing this), but there *is* a very common definition from the `futures` crate used throughout the ecosystem.
Here is a review of the `Iterator` and `Future` traits before going into how a `Stream` trait might merge them together.
From `Iterator`, we have the idea of a sequence: its `next` method provides an `Option<Self::Item>`.
From `Future`, we have the idea of readiness over time: the `poll` method provides a `Poll<Self::Output>`.
To represent a sequence of items that become ready over time, we define a `Stream` trait that puts those features together.
```rust
use std::pin::Pin;
use std::task::{Context, Poll};

trait Stream {
    type Item;

    fn poll_next(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
    ) -> Poll<Option<Self::Item>>;
}
```
Here the `Stream` trait defines an associated type called `Item` for the type of the items produced by the stream.
This is similar to `Iterator`, where there may be zero to many items, and unlike `Future`, where there is always a single `Output`, even if it is the unit type `()`.
`Stream` also defines a method to get those items.
We call it `poll_next`, to make it clear that it polls in the same way `Future::poll` does and produces a sequence of items in the same way `Iterator::next` does.
Its return type combines `Poll` with `Option`.
The outer type is `Poll`, because it has to be checked for readiness, just as a future does.
The inner type is `Option`, because it needs to signal whether there are more messages, just as an iterator does.
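To make the `Poll<Option<...>>` return type concrete, here is a hedged sketch that implements the trait for a tiny hand-written stream and polls it manually. `Countdown`, `NoopWake`, and `collect_all` are hypothetical names for illustration, and the `Stream` trait is redeclared locally so the example depends only on the standard library.

```rust
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Yields `remaining`, `remaining - 1`, ..., 1 and then ends.
struct Countdown {
    remaining: u32,
}

impl Stream for Countdown {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.remaining == 0 {
            // Ready, and the sequence is finished.
            Poll::Ready(None)
        } else {
            // Ready, with one more item. `Countdown: Unpin`, so mutating
            // through the `Pin` is allowed.
            let item = self.remaining;
            self.remaining -= 1;
            Poll::Ready(Some(item))
        }
    }
}

/// Waker that does nothing; this stream never returns `Pending`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a stream by hand until it reports the end of the sequence.
fn collect_all<S: Stream + Unpin>(mut stream: S) -> Vec<S::Item> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut items = Vec::new();
    loop {
        match Pin::new(&mut stream).poll_next(&mut cx) {
            Poll::Ready(Some(item)) => items.push(item),
            Poll::Ready(None) => break,
            // A real runtime would park here until the waker fires.
            Poll::Pending => continue,
        }
    }
    items
}
```

Calling `collect_all(Countdown { remaining: 3 })` walks through all three states of the return type that matter here: `Ready(Some(_))` for each item and `Ready(None)` at the end.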
Something very similar to this will likely end up as part of Rust's standard library.
In the meantime, it is part of the toolkit of most runtimes, so you can rely on it, and everything that will be covered should apply generally.
In the example we saw previously in the section on streaming, we didn't use `poll_next` or `Stream`, but instead used `next` and `StreamExt`.
We *could* work directly with `poll_next` by hand-writing our own `Stream` state machines, just as we *could* work with futures directly via their `poll` method.
Using `await` is much nicer, and the `StreamExt` trait supplies the `next` method so we can do just this:
```rust
trait StreamExt: Stream {
    async fn next(&mut self) -> Option<Self::Item>
    where
        Self: Unpin;

    // other methods...
}
```
Note: The definition that we used earlier in the chapter looks slightly different from this.
This is because it supports versions of Rust that did not yet support using async functions in traits.
As a result it looks like this:
```rust
fn next(&mut self) -> Next<'_, Self> where Self: Unpin;
```
This `Next` type is a `struct` that implements `Future` and allows us to name the lifetime of the reference to `self` with `Next<'_, Self>`, so that `await` can work with this method.
The `StreamExt` trait also has some interesting methods available to use with streams.
`StreamExt` is automatically implemented for every type that implements `Stream`.
These traits are defined separately to enable the community to iterate on convenience APIs without affecting the foundational trait.
In the version of `StreamExt` used in the `trpl` crate, the trait not only defines the `next` method but also supplies a default implementation of `next` that correctly handles the details of calling `Stream::poll_next`.
This means that even when you need to write your own streaming data type, you *only* have to implement `Stream`, and then anyone who uses your data type can use `StreamExt` and its methods with it automatically.
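The pieces above can be sketched together using only the standard library. This is an illustration under stated assumptions, not the actual code of the `futures` or `trpl` crates: the traits are redeclared locally, and `Countdown`, `NoopWake`, `block_on`, and `drain_countdown` are hypothetical helpers. It shows a `Next` future, a default `next` method, and a blanket implementation that gives every `Stream` the extension methods.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Stream {
    type Item;
    fn poll_next(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>>;
}

/// Future returned by `next`; it borrows the stream for the lifetime `'a`.
struct Next<'a, S> {
    stream: &'a mut S,
}

impl<S: Stream + Unpin> Future for Next<'_, S> {
    type Output = Option<S::Item>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `S: Unpin`, so re-pinning the borrowed stream is allowed.
        Pin::new(&mut *self.stream).poll_next(cx)
    }
}

trait StreamExt: Stream {
    /// Default implementation that hides the call to `poll_next`.
    fn next(&mut self) -> Next<'_, Self>
    where
        Self: Unpin + Sized,
    {
        Next { stream: self }
    }
}

// Blanket implementation: every `Stream` gets `StreamExt` for free.
impl<S: Stream> StreamExt for S {}

/// Hand-written stream that only implements `Stream`.
struct Countdown {
    remaining: u32,
}

impl Stream for Countdown {
    type Item = u32;

    fn poll_next(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<Option<u32>> {
        if self.remaining == 0 {
            return Poll::Ready(None);
        }
        let item = self.remaining;
        self.remaining -= 1;
        Poll::Ready(Some(item))
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor, for demonstration only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn drain_countdown() -> Vec<u32> {
    block_on(async {
        let mut stream = Countdown { remaining: 2 };
        let mut seen = Vec::new();
        // `next` comes from `StreamExt`, even though `Countdown` only
        // implements `Stream`.
        while let Some(n) = stream.next().await {
            seen.push(n);
        }
        seen
    })
}
```

`Countdown` never mentions `StreamExt`, yet `stream.next().await` works, which is the separation between the foundational trait and the convenience API described above.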