Shared-State Concurrency
Message passing is not the only way of handling concurrency.
Another way is to let multiple threads access the same shared data.
Consider this part of the slogan from the Go language documentation:
"Do not communicate by sharing memory."
What would communicating by sharing memory look like?
And why would message-passing enthusiasts caution against sharing memory?
Channels in any programming language are similar to single ownership: once you transfer a value down a channel, you should no longer use that value.
Shared-memory concurrency is like multiple ownership: multiple threads can access the same memory location at the same time.
As we saw in Chapter 15, where smart pointers made multiple ownership possible, multiple ownership can add complexity because these different owners need managing.
Rust's type system and ownership rules greatly assist in getting this management correct.
As an example, let's look at mutexes, one of the more common concurrency primitives for shared memory.
Using Mutexes to Allow Access to Data from One Thread at a Time
Mutex is an abbreviation for mutual exclusion, as in, a mutex allows only one thread to access some data at any given time.
To access the data in a mutex, a thread must first signal that it wants access by asking to acquire the mutex's lock.
The lock is a data structure, part of the mutex, that keeps track of who currently has exclusive access to the data.
The mutex is described as guarding the data it holds via the locking system.
Mutexes have a reputation for being difficult to use because you must remember two rules:
- You must attempt to acquire the lock before using the data
- When you are done with the data that the mutex guards, you must unlock the data so other threads can acquire the lock
A real-world metaphor for a mutex is a panel discussion at a conference with only one microphone.
Before a panelist can speak, they have to ask or signal that they want to use the microphone.
When they get the microphone, they can talk for as long as they want and then hand the microphone to the next panelist who requests to speak.
If a panelist forgets to hand the microphone off when they are finished with it, no one else is able to speak.
If management of the shared microphone goes wrong, the panel won't work as planned.
Management of mutexes can be incredibly tricky to get right, which is why so many people are enthusiastic about channels.
However, thanks to Rust's type system and ownership rules, you can't get locking and unlocking wrong.
The API of Mutex<T>
As an example of how to use a mutex, we'll start by using one in a single-threaded context:
use std::sync::Mutex;

fn main() {
    let m = Mutex::new(5);

    {
        let mut num = m.lock().unwrap();
        *num = 6;
    }

    println!("m = {m:?}");
}
As with many types, we create a Mutex<T> using the associated function new.
To access the data inside the mutex, we use the lock method to acquire the lock.
This call blocks the current thread so it can't do any work until it's our turn to have the lock.
The call to lock would fail if another thread holding the lock had panicked; in that case, no one would ever be able to get the lock.
Here we choose to unwrap and have this thread panic if we find ourselves in that situation.
After we acquire the lock, we can treat the return value, named num in this case, as a mutable reference to the data inside.
The type system ensures that we acquire a lock before using the value in m.
The type of m is Mutex<i32>, not i32, so we must call lock to be able to use the i32 value; the type system won't let us access the inner i32 otherwise.
Mutex<T> is a smart pointer.
More accurately, the call to lock returns a smart pointer called MutexGuard, wrapped in a LockResult that we handled with the call to unwrap.
The MutexGuard smart pointer implements Deref to point at our inner data.
It also has a Drop implementation that releases the lock automatically when a MutexGuard goes out of scope, which happens at the end of the inner scope.
As a result, we don't risk forgetting to release the lock and blocking the mutex from being used by other threads, because the lock release happens automatically.
After dropping the lock, we can print the mutex value and see that we were able to change the inner i32 to 6.
Sharing a Mutex<T> Between Multiple Threads
Now let's try to share a value between multiple threads using Mutex<T>.
We'll spin up 10 threads and have each increment a counter by 1, so the counter goes from 0 to 10.
The following example gives a compiler error, and we'll use that error to learn a bit more about using Mutex<T> and how Rust helps us use it correctly.
use std::sync::Mutex;
use std::thread;

fn main() {
    let counter = Mutex::new(0);
    let mut handles = vec![];

    for _ in 0..10 {
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}
Here we create the counter variable to hold an i32 inside a Mutex<T>, just as before.
Then we create 10 threads by iterating over a range of numbers.
We use thread::spawn and give all the threads the same closure: one that moves the counter into the thread, acquires a lock on the Mutex<T> by calling the lock method, and then adds 1 to the value in the mutex.
When a thread finishes running its closure, num will go out of scope and release the lock so another thread can acquire it.
In the main thread, we collect all the join handles.
Then, as we did before, we call join on each handle to make sure all the threads finish.
Once they have, the main thread acquires the lock and prints the result of the program.
Here is the compiler error:
$ cargo run
Compiling shared-state v0.1.0 (file:///projects/shared-state)
error[E0382]: borrow of moved value: `counter`
--> src/main.rs:21:29
|
5 | let counter = Mutex::new(0);
| ------- move occurs because `counter` has type `Mutex<i32>`, which does not implement the `Copy` trait
...
8 | for _ in 0..10 {
| -------------- inside of this loop
9 | let handle = thread::spawn(move || {
| ------- value moved into closure here, in previous iteration of loop
...
21 | println!("Result: {}", *counter.lock().unwrap());
| ^^^^^^^ value borrowed here after move
|
help: consider moving the expression out of the loop so it is only moved once
|
8 ~ let mut value = counter.lock();
9 ~ for _ in 0..10 {
10 | let handle = thread::spawn(move || {
11 ~ let mut num = value.unwrap();
|
For more information about this error, try `rustc --explain E0382`.
error: could not compile `shared-state` (bin "shared-state") due to 1 previous error
The error message states that the counter value was moved in the previous iteration of the loop.
Rust is telling us that we can't move ownership of counter into multiple threads.
Let's fix this compiler error with the multiple-ownership method we discussed in Chapter 15.
Multiple Ownership with Multiple Threads
In Chapter 15, we gave a value multiple owners by using the smart pointer Rc<T> to create a reference-counted value.
Let's do the same here and see what happens.
We'll wrap the Mutex<T> in Rc<T> and clone the Rc<T> before moving ownership to the thread.
use std::rc::Rc;
use std::sync::Mutex;
use std::thread;

fn main() {
    let counter = Rc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Rc::clone(&counter);
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}
Once again we get a compilation error, but this time it's a different one.
$ cargo run
Compiling shared-state v0.1.0 (file:///projects/shared-state)
error[E0277]: `Rc<Mutex<i32>>` cannot be sent between threads safely
--> src/main.rs:11:36
|
11 | let handle = thread::spawn(move || {
| ------------- ^------
| | |
| ______________________|_____________within this `{closure@src/main.rs:11:36: 11:43}`
| | |
| | required by a bound introduced by this call
12 | | let mut num = counter.lock().unwrap();
13 | |
14 | | *num += 1;
15 | | });
| |_________^ `Rc<Mutex<i32>>` cannot be sent between threads safely
|
= help: within `{closure@src/main.rs:11:36: 11:43}`, the trait `Send` is not implemented for `Rc<Mutex<i32>>`, which is required by `{closure@src/main.rs:11:36: 11:43}: Send`
note: required because it's used within this closure
--> src/main.rs:11:36
|
11 | let handle = thread::spawn(move || {
| ^^^^^^^
note: required by a bound in `spawn`
--> file:///home/.rustup/toolchains/1.82/lib/rustlib/src/rust/library/std/src/thread/mod.rs:675:8
|
672 | pub fn spawn<F, T>(f: F) -> JoinHandle<T>
| ----- required by a bound in this function
...
675 | F: Send + 'static,
| ^^^^ required by this bound in `spawn`
For more information about this error, try `rustc --explain E0277`.
error: could not compile `shared-state` (bin "shared-state") due to 1 previous error
Here is the important part to focus on: `Rc<Mutex<i32>>` cannot be sent between threads safely.
The compiler also tells us why: the trait Send is not implemented for Rc<Mutex<i32>>.
We'll discuss Send in the next section; for now, know that it's one of the traits that ensures the types we use with threads are meant for use in concurrent situations.
Unfortunately, Rc<T> is not safe to share across threads.
When Rc<T> manages the reference count, it adds to the count for each call to clone and subtracts from the count when each clone is dropped.
But it doesn't use any concurrency primitives to make sure that changes to the count can't be interrupted by another thread.
This could lead to wrong counts, causing subtle bugs that could in turn lead to memory leaks or a value being dropped before we're done with it.
What we need is a type exactly like Rc<T>, but one that makes changes to the reference count in a thread-safe way.
Atomic Reference Counting with Arc<T>
Fortunately, Arc<T> is a type like Rc<T> that is safe to use in concurrent situations.
The a stands for atomic, meaning it's an atomically reference-counted type.
Atomics are an additional kind of concurrency primitive that we won't cover in detail here; see the standard library documentation for std::sync::atomic for more information.
For now, know that atomics work like primitive types but are safe to share across threads.

You might then wonder why all primitive types aren't atomic and why standard library types aren't implemented to use Arc<T> by default.
The reason is the performance penalty that comes with the thread-safety guarantees: you only want to pay it when you really need it.
If you're just performing operations on values within a single thread, your code can run faster if it doesn't have to enforce the guarantees atomics provide.
Let's update our example to use Arc<T>.
Arc<T> and Rc<T> have the same API, so we fix our program by changing the use line, the call to new, and the call to clone.
Here is the updated code:
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let counter = Arc::new(Mutex::new(0));
    let mut handles = vec![];

    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        let handle = thread::spawn(move || {
            let mut num = counter.lock().unwrap();

            *num += 1;
        });
        handles.push(handle);
    }

    for handle in handles {
        handle.join().unwrap();
    }

    println!("Result: {}", *counter.lock().unwrap());
}
This code will print the following:

Result: 10

We counted from 0 to 10.
That may not seem very impressive, but it taught us a lot about Mutex<T> and thread safety.
You could also use this program's structure to do more complicated operations than just incrementing a counter.
Using this strategy, you can divide a calculation into independent parts, split those parts across threads, and then use a Mutex<T> to have each thread update the final result with its part.
Note that if you are doing simple numerical operations, there are types simpler than Mutex<T> provided by the std::sync::atomic module of the standard library.
These types provide safe, concurrent, atomic access to primitive types.
We chose to use Mutex<T> with a primitive type for this example so we could show how Mutex<T> works.
Similarities Between RefCell<T>/Rc<T> and Mutex<T>/Arc<T>
You might have noticed that counter is immutable, yet we could get a mutable reference to the value inside it.
This means Mutex<T> provides interior mutability, just as the Cell family does.
In the same way we used RefCell<T> in Chapter 15 to mutate contents inside an Rc<T>, we use Mutex<T> to mutate contents inside an Arc<T>.
Another detail to notice is that Rust can't protect you from all kinds of logic errors when you use Mutex<T>.
Recall that using Rc<T> came with the risk of creating reference cycles, where two Rc<T> values refer to each other, causing memory leaks.
Similarly, Mutex<T> comes with the risk of creating deadlocks.
These occur when an operation needs to lock two resources and two threads have each acquired one of the locks, causing them to wait for each other forever.
If you're interested, you can research deadlock mitigation strategies for mutexes in any language and have a go at implementing them in Rust.
The standard library API documentation for Mutex<T> and MutexGuard offers useful information.
Next, we'll talk about the Send and Sync traits and how we can use them with custom types.