mirror of
https://github.com/darkicewolf50/RustBrock.git
synced 2025-06-15 04:54:17 -06:00
390 lines
16 KiB
Markdown
390 lines
16 KiB
Markdown
# Reference Cycles Can Leak Memory
|
|
It is not impossible but difficult to accidentally create memory that is never cleaned up (known as a *memory leak*). This is despite Rust's memory safety guarantees.
|
|
|
|
Preventing memory leaks is entirely is not one of Rust's guarantees.
|
|
|
|
This means that memory leaks are memory safe in Rust.
|
|
|
|
|
|
Rusts allows for memory leaks by using `Rc<T>` and `RefCell<T>`.
|
|
|
|
These two make it possible to create references where items refer to each other other in aa cycle.
|
|
|
|
This will create a memory leak because the reference count of e4ach item in the cycle will never get to 0 and the values will not be dropped.
|
|
|
|
## Creating a Reference Cycle
|
|
Lets look into how a reference cycle could happen and how to prevent it.
|
|
|
|
We will start with the definition of the `List` enum and a `tail` method.
|
|
```rust
|
|
use crate::List::{Cons, Nil};
|
|
use std::cell::RefCell;
|
|
use std::rc::Rc;
|
|
|
|
#[derive(Debug)]
|
|
enum List {
|
|
Cons(i32, RefCell<Rc<List>>),
|
|
Nil,
|
|
}
|
|
|
|
impl List {
|
|
fn tail(&self) -> Option<&RefCell<Rc<List>>> {
|
|
match self {
|
|
Cons(_, item) => Some(item),
|
|
Nil => None,
|
|
}
|
|
}
|
|
}
|
|
|
|
fn main() {}
|
|
```
|
|
Here the second element in the `Cons` variant is now `RefCell<Rc<List>>` means that instead of having the ability to modify the `i32` value as we did before.
|
|
|
|
We instead want to modify the `List` value a `Cons` variant is pointing to.
|
|
|
|
Here we also add a `tail` method to make it convenient for us to access the second item if we have a `Cons` variant.
|
|
|
|
In the next example we add to `main` function's body.
|
|
This code create a list in `a` and a list in `b` that points to the list in `a`
|
|
|
|
It then modifies the list in `a` to point to `b`, this creates a reference cycle.
|
|
|
|
We show what is happening via `println!` statements to show what the reference count is at various points in the process.
|
|
```rust
|
|
fn main() {
|
|
let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil))));
|
|
|
|
println!("a initial rc count = {}", Rc::strong_count(&a));
|
|
println!("a next item = {:?}", a.tail());
|
|
|
|
let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a))));
|
|
|
|
println!("a rc count after b creation = {}", Rc::strong_count(&a));
|
|
println!("b initial rc count = {}", Rc::strong_count(&b));
|
|
println!("b next item = {:?}", b.tail());
|
|
|
|
if let Some(link) = a.tail() {
|
|
*link.borrow_mut() = Rc::clone(&b);
|
|
}
|
|
|
|
println!("b rc count after changing a = {}", Rc::strong_count(&b));
|
|
println!("a rc count after changing a = {}", Rc::strong_count(&a));
|
|
|
|
// Uncomment the next line to see that we have a cycle;
|
|
// it will overflow the stack
|
|
// println!("a next item = {:?}", a.tail());
|
|
}
|
|
```
|
|
Here we create a `Rc<List>` instance holding a `List` value in the variable `a` with an initial list of `5, Nil`.
|
|
|
|
Next we create another `Rc<List>` instance holding another `List` value in the variable `b` that contains the value 10 and points to the list in `a`.
|
|
|
|
We then modify `a` so that it points a `b` instead of `Nil`.
|
|
|
|
This creates a reference cycle.
|
|
|
|
We do this by using the `tail` method to get a reference to the `RefCell<Rc<List>>` in `a` which we put in the variable `link`.
|
|
|
|
Then we use the `borrow_mut` method on the `RefCell<Rc<LIST>>` to change the value inside form a `Rc<List>` that holds a `Nil` value to the `Rc<List>` in `b`.
|
|
|
|
When we run this, while keeping the last `println!` commented out for the moment, we get this output
|
|
```
|
|
$ cargo run
|
|
Compiling cons-list v0.1.0 (file:///projects/cons-list)
|
|
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.53s
|
|
Running `target/debug/cons-list`
|
|
a initial rc count = 1
|
|
a next item = Some(RefCell { value: Nil })
|
|
a rc count after b creation = 2
|
|
b initial rc count = 1
|
|
b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) })
|
|
b rc count after changing a = 2
|
|
a rc count after changing a = 2
|
|
```
|
|
We can see the reference count of the `Rc<List>` instances in both `a` and `b` are 2 after we change the list in `a` to point to `b`.
|
|
|
|
At the end of `main` Rust will drop the variable `b`, this decreases the reference count of the `b` `Rc<List>` instance from 2 to 1.
|
|
|
|
The memory of that `Rc<List>` has on heap won't be dropped at this point because the reference count is still not 0, it is 1 currently.
|
|
|
|
Next Rust drops `a`, this also decreases the reference count from the `a` `Rc<List>` instance from 2 to 1.
|
|
|
|
This instance's memory cant be dropped either because of the reference form the other `Rc<List>` instance still referes to it.
|
|
|
|
This memory allocated to the list will remain uncollected forever.
|
|
|
|
Here is a visual aid to show this.
|
|
<img src="https://doc.rust-lang.org/book/img/trpl15-04.svg" />
|
|
If you now uncomment the last `println!` and run the program.
|
|
|
|
Rust will try to print this cycle with `a` pointing to `a` and so forth until it overflows the stack.
|
|
|
|
Compared to a real-world program, the consequences of creating a reference cycle in this example aren't very dire.
|
|
|
|
Right after we create the reference cycle, the program ends.
|
|
|
|
|
|
If this were a more complex program allocating lots of memory in a cycle and held onto it for a long time, the program will use more memory than it needs and might overwhelm the system, causing it to run out of available memory.
|
|
|
|
If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested combinations of types with interior mutability and reference counting, you must ensure that you don't create these cycles.
|
|
|
|
You are unable to rely on Rust to catch them.
|
|
|
|
Creating a reference cycle would be a logic bug in your program that you should use automated tests, code reviews and other software development practices to minimize.
|
|
|
|
Another solution for avoiding this is reorganizing your data structs so that some references express ownership and some references don't.
|
|
|
|
This results in you can have cycles made up of some ownership relationships and some non-ownership relationships, and only the ownership relationships affect whether or not a value can be dropped.
|
|
|
|
In the previous example we always want `Cons` variants to own their list, so reorganizing the data struct isn't possible.
|
|
|
|
Now we will look at an example using graphs made up of parent nodes and child nodes to see when non-ownership relationships are an appropriate way to prevent reference cycles.
|
|
|
|
## Preventing Reference Cycles: Turning a `Rc<T>` into a `Weak<T>`
|
|
|
|
So far we have demonstrated that calling `Rc::clone` increases the `strong_count` of a `Rc<T>` instance, and a `Rc<T>` instance is only cleaned up if its `strong_count` is 0.
|
|
|
|
You can also create a *weak reference* to the value within a `Rc<T>` instance by calling `Rc::downgrade` and passing a reference to the `Rc<T>`.
|
|
|
|
Strong references don't express an ownership relationship.
|
|
|
|
Weak references don't express an ownership relationship, and their count doesn't affect when a `Rc<T>` instance is cleaned up.
|
|
|
|
They will no cause a reference cycle because any cycle involving weak references will be broken once the strong reference count of values involved is 0.
|
|
|
|
When you call `Rc::downgrade`, you get a smart point of type `Weak<T>`.
|
|
|
|
Instead of increasing the `strong_count` in the `Rc<T>` by 1, using `Rc::downgrade` increases the weak count by `1`.
|
|
|
|
The `Rc<T>` type uses `weak_count` to keep track of how many `Weak<T>` references exist, this is similar to `strong_count`.
|
|
|
|
The differences is that `weak_count` doesn't need to be 0 for the `Rc<T>` instance to be cleaned up.
|
|
|
|
Due to the value that `Weak<T>` references could be referred to might have been dropped, to do anything with a value that a `Weak` is pointing to, you must ensure that the value will still exist.
|
|
|
|
You can do this by calling the `upgrade` method on a `Weak<T>` instance, this will return an `Option<Rc<T>>`.
|
|
|
|
You will get a result of `Some` if the `Rc<T>` value has not been dropped yet and a results of `None` if the `Rc<T>` value has been dropped.
|
|
|
|
Due to `upgrade` returning an `Option<Rc<T>>`, Rust will ensure that the `Some` case and the `None` cases are handled and there will no be an invalid pointer.
|
|
|
|
Next we will show an example where rather than using a list whose items know only about the next item, we will create a tree whose items know about their children items *and* their parent items.
|
|
|
|
### Creating a Tree Data Structure a Node with Child Nodes
|
|
We will start with building a tree with nodes that know about their child nodes.
|
|
|
|
We will create a struct called `Node` that holds its own `i32` value as well as references to its children `Node` values.
|
|
```rust
|
|
use std::cell::RefCell;
|
|
use std::rc::Rc;
|
|
|
|
#[derive(Debug)]
|
|
struct Node {
|
|
value: i32,
|
|
children: RefCell<Vec<Rc<Node>>>,
|
|
}
|
|
```
|
|
Here we want a `Node` to own its children, and we want to share that ownership with variables so we can access each `Node` in the tree directly.
|
|
|
|
We do this by defining the `Vec<T>` items to be values of type `Rc<Node>`.
|
|
|
|
We also want to be able to modify which nodes are children of another node.
|
|
|
|
Now we have a `RefCell<T>` in children around the `Vec<Rc<Node>>`.
|
|
|
|
Next we will use the struct definition and create one `Node` instance named `leaf` with the value 3 and no children and another instance named `branch` with the vale 5 and `leaf` as one of its children.
|
|
```rust
|
|
fn main() {
|
|
let leaf = Rc::new(Node {
|
|
value: 3,
|
|
children: RefCell::new(vec![]),
|
|
});
|
|
|
|
let branch = Rc::new(Node {
|
|
value: 5,
|
|
children: RefCell::new(vec![Rc::clone(&leaf)]),
|
|
});
|
|
}
|
|
```
|
|
Here we clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the `Node` in `leaf` now has two owners.
|
|
|
|
These are `leaf` and `branch`.
|
|
|
|
We can get from `branch` to `leaf` through `branch.children`.
|
|
|
|
There is no way to get from `leaf` to `branch`.
|
|
|
|
The reason that `leaf` has no reference to `branch` and doesn't know they are related.
|
|
|
|
The reason that `leaf` has no reference to `branch` is that it doesn't know they are related.
|
|
|
|
We want `leaf` to know that `branch` is its parent.
|
|
|
|
### Adding a Reference form a `child` to Its Parent
|
|
To make the child node aware of the relationship to its parent, we need to add a `parent` field to our `Node` struct definition.
|
|
|
|
The trouble is in deciding what the type of `parent` should be.
|
|
|
|
We know that it cannot contain a `Rc<T>`, because that would create a reference cycle with `leaf.parent` pointing to `branch` and `branch.children` pointing to `leaf`.
|
|
|
|
This would cause their `strong_count` values to never be 0.
|
|
|
|
|
|
Thinking about the relationships another way, a parent node should own its children.
|
|
|
|
If a parent node is dropped, then its child nodes should be dropped as well.
|
|
|
|
A child should not own its parent.
|
|
|
|
If a child node, the parent should still exist.
|
|
|
|
We should use weak references in this case.
|
|
|
|
So we instead of `Rc<T>` we will make the type of `parent` use `Weak<T>`.
|
|
|
|
Specifically a `RefCell<Weak<Node>>`. Now the `Node` definition will look like this.
|
|
```rust
|
|
use std::cell::RefCell;
|
|
use std::rc::{Rc, Weak};
|
|
|
|
#[derive(Debug)]
|
|
struct Node {
|
|
value: i32,
|
|
parent: RefCell<Weak<Node>>,
|
|
children: RefCell<Vec<Rc<Node>>>,
|
|
}
|
|
```
|
|
Now a node will be able to refer to its parent node but doesn't own its parent.
|
|
|
|
In the next example we will update `main` to use this new definition so the `leaf` node will have a way to refer to its parent `branch`.
|
|
```rust
|
|
fn main() {
|
|
let leaf = Rc::new(Node {
|
|
value: 3,
|
|
parent: RefCell::new(Weak::new()),
|
|
children: RefCell::new(vec![]),
|
|
});
|
|
|
|
println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
|
|
|
|
let branch = Rc::new(Node {
|
|
value: 5,
|
|
parent: RefCell::new(Weak::new()),
|
|
children: RefCell::new(vec![Rc::clone(&leaf)]),
|
|
});
|
|
|
|
*leaf.parent.borrow_mut() = Rc::downgrade(&branch);
|
|
|
|
println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
|
|
}
|
|
```
|
|
Creating the `leaf` node is similar to the example above with the exception of the `parent` field.
|
|
|
|
`leaf` starts out without a parent, so we create a new , empty `Weak<node>` reference instance.
|
|
|
|
At this point when we try to get a reference to the parent of `leaf` by using the `upgrade` method, we get a `None` value.
|
|
|
|
We will see this in the output from the first `println!` statement.
|
|
```
|
|
leaf parent = None
|
|
```
|
|
|
|
Here when we create the `branch` node, it will also have a new `Weak<Node>` reference in the `parent` field, this is because the `branch` doesn't have a parent node.
|
|
|
|
We will still have `leaf` as one of the children of `branch`.
|
|
|
|
Once we have the `Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>` reference to its parent.
|
|
|
|
We will use the `borrow_mut` method on the `RefCell<Weak<Node>>` in the `parent` field of `leaf`.
|
|
|
|
Then we use the `Rc::downgrade` function to create a `Weak<Node>` reference to `branch` form the `Rc<Node>` in `branch`.
|
|
|
|
When we then print the parent of `leaf` again, this time we will get a `Some` variant holding `branch`
|
|
|
|
Now `leaf` will know the relationship to the parent and will have access to its parent.
|
|
|
|
When we print `leaf`, we need to avoid the cycle that eventually end in a stack overflow, like what we had before.
|
|
|
|
The `Weak<Node>` references are printed as `(Weak)`
|
|
```
|
|
leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) },
|
|
children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) },
|
|
children: RefCell { value: [] } }] } })
|
|
```
|
|
Th lack of infinite output indicates that this code didn't create a reference cycle.
|
|
|
|
We can also tell this by looking at the values we get form calling `Rc::strong_count` and `Rc::weak_count`.
|
|
|
|
### Visualizing Changes to `strong_count` and `weak_count`
|
|
Now we will look at how the `strong_count` and `weak_count` values of the `Rc<Node>` instances change by creating a new inner scope and moving the creation of `branch` into that scope`.
|
|
|
|
Doing this we can then see what happens when `branch` and then dropped when it goes out of scope.
|
|
|
|
Here we can see the modifications to `main`
|
|
```rust
|
|
fn main() {
|
|
let leaf = Rc::new(Node {
|
|
value: 3,
|
|
parent: RefCell::new(Weak::new()),
|
|
children: RefCell::new(vec![]),
|
|
});
|
|
|
|
println!(
|
|
"leaf strong = {}, weak = {}",
|
|
Rc::strong_count(&leaf),
|
|
Rc::weak_count(&leaf),
|
|
);
|
|
|
|
{
|
|
let branch = Rc::new(Node {
|
|
value: 5,
|
|
parent: RefCell::new(Weak::new()),
|
|
children: RefCell::new(vec![Rc::clone(&leaf)]),
|
|
});
|
|
|
|
*leaf.parent.borrow_mut() = Rc::downgrade(&branch);
|
|
|
|
println!(
|
|
"branch strong = {}, weak = {}",
|
|
Rc::strong_count(&branch),
|
|
Rc::weak_count(&branch),
|
|
);
|
|
|
|
println!(
|
|
"leaf strong = {}, weak = {}",
|
|
Rc::strong_count(&leaf),
|
|
Rc::weak_count(&leaf),
|
|
);
|
|
}
|
|
|
|
println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
|
|
println!(
|
|
"leaf strong = {}, weak = {}",
|
|
Rc::strong_count(&leaf),
|
|
Rc::weak_count(&leaf),
|
|
);
|
|
}
|
|
```
|
|
Now after creating `leaf` is created its `Rc<Node>` has a strong count of 1 and a weak count of 0.
|
|
|
|
In the inner scope we create `branch` and associate it with `leaf`, at which point when we print the counts, the `Rc<Node>` in `branch` will have a strong count of 1 and a weak count of 1.
|
|
|
|
For `leaf.parent` pointing to `branch` with a `Weak<Node>`.
|
|
|
|
When we print the counts in `leaf` we will see it have a strong count of 2, because `branch` now has a clone of the `Rc<Node>` of `leaf` stored in `branch.children`, built will still have a weak count of 0.
|
|
|
|
When the inner scope ends, `branch` goes out of scope and the strong count of the `Rc<node>` decreases to 0. The `Node` will be dropped.
|
|
|
|
The weak count of 1 form `leaf.parent` has no bearing on whether or not `Node` is dropped, so now we will not het any memory leaks.
|
|
|
|
Now if we try to access the parent of `leaf` after the inner scope we will get `None` again.
|
|
|
|
At the end of the program the `Rc<Node>` in `leaf` has a strong count of 1 and a weak count of 0.
|
|
|
|
This is because the variable `leaf` is now the only reference to the `Rc<Node>`.
|
|
|
|
All of this logic manages the counts and value dropping is built into `Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait.
|
|
|
|
By specifying that the relationship form a child to its parent should be a `Weak<T>` reference in the definition of `Node`.
|
|
|
|
You are able to have parent nodes point to child nodes and vice versa without creating a reference cycle and memory leaks. |