RustBrock/Leaky Reference Cycles.md

# Reference Cycles Can Leak Memory
It is not impossible but difficult to accidentally create memory that is never cleaned up (known as a *memory leak*). This is despite Rust's memory safety guarantees.

Preventing memory leaks is entirely is not one of Rust's guarantees.

This means that memory leaks are memory safe in Rust.


Rusts allows for memory leaks by using `Rc<T>` and `RefCell<T>`.

These two make it possible to create references where items refer to each other other in aa cycle.

This will create a memory leak because the reference count of e4ach item in the cycle will never get to 0 and the values will not be dropped.

## Creating a Reference Cycle
Lets look into how a reference cycle could happen and how to prevent it.

We will start with the definition of the `List` enum and a `tail` method.
```rust
use crate::List::{Cons, Nil};
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug)]
enum List {
    Cons(i32, RefCell<Rc<List>>),
    Nil,
}

impl List {
    fn tail(&self) -> Option<&RefCell<Rc<List>>> {
        match self {
            Cons(_, item) => Some(item),
            Nil => None,
        }
    }
}

fn main() {}
```
Here the second element in the `Cons` variant is now `RefCell<Rc<List>>` means that instead of having the ability to modify the `i32` value as we did before.

We instead want to modify the `List` value a `Cons` variant is pointing to.

Here we also add a `tail` method to make it convenient for us to access the second item if we have a `Cons` variant.

In the next example we add to `main` function's body.
This code create a list in `a` and a list in `b` that points to the list in `a`

It then modifies the list in `a` to point to `b`, this creates a reference cycle.

We show what is happening via `println!` statements to show what the reference count is at various points in the process.
```rust
fn main() {
    let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil))));

    println!("a initial rc count = {}", Rc::strong_count(&a));
    println!("a next item = {:?}", a.tail());

    let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a))));

    println!("a rc count after b creation = {}", Rc::strong_count(&a));
    println!("b initial rc count = {}", Rc::strong_count(&b));
    println!("b next item = {:?}", b.tail());

    if let Some(link) = a.tail() {
        *link.borrow_mut() = Rc::clone(&b);
    }

    println!("b rc count after changing a = {}", Rc::strong_count(&b));
    println!("a rc count after changing a = {}", Rc::strong_count(&a));

    // Uncomment the next line to see that we have a cycle;
    // it will overflow the stack
    // println!("a next item = {:?}", a.tail());
}
```
Here we create a `Rc<List>`  instance holding a `List` value in the variable `a` with an initial list of `5, Nil`.

Next we create another `Rc<List>` instance holding another `List` value in the variable `b` that contains the value 10 and points to the list in `a`.

We then modify `a` so that it points a `b` instead of `Nil`.

This creates a reference cycle.

We do this by using the `tail` method to get a reference to the `RefCell<Rc<List>>` in `a` which we put in the variable `link`.

Then we use the `borrow_mut` method on the `RefCell<Rc<LIST>>` to change the value inside form a `Rc<List>` that holds a `Nil` value to the `Rc<List>` in `b`.

When we run this, while keeping the last `println!` commented out for the moment, we get this output
```
$ cargo run
   Compiling cons-list v0.1.0 (file:///projects/cons-list)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.53s
     Running `target/debug/cons-list`
a initial rc count = 1
a next item = Some(RefCell { value: Nil })
a rc count after b creation = 2
b initial rc count = 1
b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) })
b rc count after changing a = 2
a rc count after changing a = 2
```
We can see the reference count of the `Rc<List>` instances in both `a` and `b` are 2 after we change the list in `a` to point to `b`.

At the end of `main` Rust will drop the variable `b`, this decreases the reference count of the `b` `Rc<List>` instance from 2 to 1.

The memory of that `Rc<List>` has on heap won't be dropped at this point because the reference count is still not 0, it is 1 currently.

Next Rust drops `a`, this also decreases the reference count from the `a` `Rc<List>` instance from 2 to 1.

This instance's memory cant be dropped either because of the reference form the other `Rc<List>` instance still referes to it.

This memory allocated to the list will remain uncollected forever.

Here is a visual aid to show this.
<img src="https://doc.rust-lang.org/book/img/trpl15-04.svg" />
If you now uncomment the last `println!` and run the program.

Rust will try to print this cycle with `a` pointing to `a` and so forth until it overflows the stack.

Compared to a real-world program, the consequences of creating a reference cycle in this example aren't very dire.

Right after we create the reference cycle, the program ends.


If this were a more complex program allocating lots of memory in a cycle and held onto it for a long time, the program will use more memory than it needs and might overwhelm the system, causing it to run out of available memory.

If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested combinations of types with interior mutability and reference counting, you must ensure that you don't create these cycles.

You are unable to rely on Rust to catch them.

Creating a reference cycle would be a logic bug in your program that you should use automated tests, code reviews and other software development practices to minimize.

Another solution for avoiding this is reorganizing your  data structs so that some references express ownership and some references don't.

This results in you can have cycles made up of some ownership relationships and some non-ownership relationships, and only the ownership relationships affect whether or not a value can be dropped.

In the previous example we always want `Cons` variants to own their list, so reorganizing the data struct isn't possible.

Now we will look at an example using graphs made up of parent nodes and child nodes to see when non-ownership relationships are an appropriate way to prevent reference cycles.

## Preventing Reference Cycles: Turning a `Rc<T>` into a `Weak<T>`

So far we have demonstrated that calling `Rc::clone` increases the `strong_count` of a `Rc<T>` instance, and a `Rc<T>` instance is only cleaned up if its `strong_count` is 0.

You can also create a *weak reference* to the value within a `Rc<T>` instance by calling `Rc::downgrade` and passing a reference to the `Rc<T>`.

Strong references don't express an ownership relationship.

Weak references don't express an ownership relationship, and their count doesn't affect when a `Rc<T>` instance is cleaned up.

They will no cause a reference cycle because any cycle involving weak references will be broken once the strong reference count of values involved is 0.

When you call `Rc::downgrade`, you get a smart point of type `Weak<T>`.

Instead of increasing the `strong_count` in the `Rc<T>` by 1, using `Rc::downgrade` increases the weak count by `1`.

The `Rc<T>` type uses `weak_count` to keep track of how many `Weak<T>` references exist, this is similar to `strong_count`.

The differences is that `weak_count` doesn't need to be 0 for the `Rc<T>` instance to be cleaned up.

Due to the value that `Weak<T>` references could be referred to might have been dropped, to do anything with a value that a `Weak` is pointing to, you must ensure that the value will still exist.

You can do this by calling the `upgrade` method on a `Weak<T>` instance, this will return an `Option<Rc<T>>`.

You will get a result of `Some` if the `Rc<T>` value has not been dropped yet and a results of `None` if the `Rc<T>` value has been dropped.

Due to `upgrade` returning an `Option<Rc<T>>`, Rust will ensure that the `Some` case and the `None` cases are handled and there will no be an invalid pointer.

Next we will show an example where rather than using a list whose items know only about the next item, we will create a tree whose items know about their children items *and* their parent items.

### Creating a Tree Data Structure a Node with Child Nodes
We will start with building a tree with nodes that know about their child nodes.

We will create a struct called `Node` that holds its own `i32` value as well as references to its children `Node` values.
```rust
use std::cell::RefCell;
use std::rc::Rc;

#[derive(Debug)]
struct Node {
    value: i32,
    children: RefCell<Vec<Rc<Node>>>,
}
```
Here we want a `Node` to own its children, and we want to share that ownership with variables so we can access each `Node` in the tree directly.

We do this by defining the `Vec<T>` items to be values of type `Rc<Node>`.

We also want to be able to modify which nodes are children of another node.

Now we have a `RefCell<T>` in children around the `Vec<Rc<Node>>`.

Next we will use the struct definition and create one `Node` instance named `leaf` with the value 3 and no children and another instance named `branch` with the vale 5 and `leaf` as one of its children.
```rust
fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        children: RefCell::new(vec![]),
    });

    let branch = Rc::new(Node {
        value: 5,
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });
}
```
Here we clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the `Node` in `leaf` now has two owners.

These are `leaf` and `branch`.

We can get from `branch` to `leaf` through `branch.children`.

There is no way to get from `leaf` to `branch`.

The reason that `leaf` has no reference to `branch` and doesn't know they are related.

The reason that `leaf` has no reference to `branch` is that it doesn't know they are related.

We want `leaf` to know that `branch` is its parent.

### Adding a Reference form a `child` to Its Parent
To make the child node aware of the relationship to its parent, we need to add a `parent` field to our `Node` struct definition.

The trouble is in deciding what the type of `parent` should be.

We know that it cannot contain a `Rc<T>`, because that would create a reference cycle with `leaf.parent` pointing to `branch` and `branch.children` pointing to `leaf`.

This would cause their `strong_count` values to never be 0.


Thinking about the relationships another way, a parent node should own its children.

If a parent node is dropped, then its child nodes should be dropped as well.

A child should not own its parent.

If a child node, the parent should still exist.

We should use weak references in this case.

So we instead of `Rc<T>` we will make the type of `parent` use `Weak<T>`.

Specifically a `RefCell<Weak<Node>>`. Now the `Node` definition will look like this.
```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

#[derive(Debug)]
struct Node {
    value: i32,
    parent: RefCell<Weak<Node>>,
    children: RefCell<Vec<Rc<Node>>>,
}
```
Now a node will be able to refer to its parent node but doesn't own its parent.

In the next example we will update `main` to use this new definition so the `leaf` node will have a way to refer to its parent `branch`.
```rust
fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });

    println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());

    let branch = Rc::new(Node {
        value: 5,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![Rc::clone(&leaf)]),
    });

    *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

    println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
}
```
Creating the `leaf` node is similar to the example above with the exception of the `parent` field.

`leaf` starts out without a parent, so we create a new , empty `Weak<node>` reference instance.

At this point when we try to get a reference to the parent of `leaf` by using the `upgrade` method, we get a `None` value.

We will see this in the output from the first `println!` statement.
```
leaf parent = None
```

Here when we create the `branch` node, it will also have a new `Weak<Node>` reference in the `parent` field, this is because the `branch` doesn't have a parent node.

We will still have `leaf` as one of the children of `branch`.

Once we have the `Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>` reference to its parent.

We will use the `borrow_mut` method on the `RefCell<Weak<Node>>` in the `parent` field of `leaf`.

Then we use the `Rc::downgrade` function to create a `Weak<Node>` reference to `branch` form the `Rc<Node>` in `branch`.

When we then print the parent of `leaf` again, this time we will get a `Some` variant holding `branch`

Now `leaf` will know the relationship to the parent and will have access to its parent.

When we print `leaf`, we need to avoid the cycle that eventually end in a stack overflow, like what we had before.

The `Weak<Node>` references are printed as `(Weak)`
```
leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) },
children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) },
children: RefCell { value: [] } }] } })
```
Th lack of infinite output indicates that this code didn't create a reference cycle.

We can also tell this by looking at the values we get form calling `Rc::strong_count` and `Rc::weak_count`.

### Visualizing Changes to `strong_count` and `weak_count`
Now we will look at how the `strong_count` and `weak_count` values of the `Rc<Node>` instances change by creating a new inner scope and moving the creation of `branch` into that scope`.

Doing this we can then see what happens when `branch` and then dropped when it goes out of scope.

Here we can see the modifications to `main`
```rust
fn main() {
    let leaf = Rc::new(Node {
        value: 3,
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(vec![]),
    });

    println!(
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );

    {
        let branch = Rc::new(Node {
            value: 5,
            parent: RefCell::new(Weak::new()),
            children: RefCell::new(vec![Rc::clone(&leaf)]),
        });

        *leaf.parent.borrow_mut() = Rc::downgrade(&branch);

        println!(
            "branch strong = {}, weak = {}",
            Rc::strong_count(&branch),
            Rc::weak_count(&branch),
        );

        println!(
            "leaf strong = {}, weak = {}",
            Rc::strong_count(&leaf),
            Rc::weak_count(&leaf),
        );
    }

    println!("leaf parent = {:?}", leaf.parent.borrow().upgrade());
    println!(
        "leaf strong = {}, weak = {}",
        Rc::strong_count(&leaf),
        Rc::weak_count(&leaf),
    );
}
```
Now after creating `leaf` is created its `Rc<Node>` has a strong count of 1 and a weak count of 0.

In the inner scope we create `branch` and associate it with `leaf`, at which point when we print the counts, the `Rc<Node>` in `branch` will have a strong count of 1 and a weak count of 1.

For `leaf.parent` pointing to `branch` with a `Weak<Node>`.

When we print the counts in `leaf` we will see it have a strong count of 2, because `branch` now has a clone of the `Rc<Node>` of `leaf` stored in `branch.children`, built will still have a weak count of 0.

When the inner scope ends, `branch` goes out of scope and the strong count of the `Rc<node>` decreases to 0. The `Node` will be dropped.

The weak count of 1 form `leaf.parent` has no bearing on whether or not `Node` is dropped, so now we will not het any memory leaks.

Now if we try to access the parent of `leaf` after the inner scope we will get `None` again.

At the end of the program the `Rc<Node>` in `leaf` has a strong count of 1 and a weak count of 0.

This is because the variable `leaf` is now the only reference to the `Rc<Node>`.

All of this logic manages the counts and value dropping is built into `Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait.

By specifying that the relationship form a child to its parent should be a `Weak<T>` reference in the definition of `Node`.

You are able to have parent nodes point to child nodes and vice versa without creating a reference cycle and memory leaks.