# Using `Box` to Point ot Data on the Heap This is the most straightforward smart inter is a *box*, whose tpye is written `Box`. Boxes allow you to store data on the heap rather than the stack. What remains on the stack is the pointer the heap data. Boxes don't have performance overhead, other than storing thier data on the heap instead of on the stack. They don't have many extra capabilities either. You often use them in these situation: - When you have a type whose size can't be known at compile time and you want to use a value of that type in a context that requires an exact size. - When you have a large amount of data and you want to transfer ownership but ensure the data won't be copied when you do so. - When you want to own a value and you only care that it's a type that implements a particular trait rather than being of a specific type. We will demonstrate the first point in the [Enabling Recursive Types with Boxes](#enabling-recursive-types-with-boxes) section. The second point, transferring ownership of large amount of data can take a long time because the data is copied around on the stack. To improve performance in this situation, we can store the large amount of data on the heap in a box. Then only the small amount of pointer data is copied around on the stack, while the data it references stays in one place on the heap. The third case iis known as a *trait object* and another chapter is devoted entirely to this. ## Using a `Box` to Store Data on the Heap Before discussing the heap storage use case of `Box`. We will first cover the syntax and how to interact with values stored within a `Box` Here us an example of how to use a box to store an `i32` value on the heap ```rust fn main() { let b = Box::new(5); println!("b = {b}"); } ``` Here we define the variable `b` to have the value of a `Box` that points to the value `5`. This is allocated on the heap. This program will output `b = 5`. In this case we can access the data in the box similar to how it would be stored on the stack. Just like any owned value, when a box goes out of scope, as `b` does at the end of `main`, it will be deallocated. The deallocation happens both for the box (stored on the stack) and the data it points to (stored on the heap). Putting a single value on the heap isnt very useful, so you won't use boxes by themselves in this way often. Having a value like a single `i32` on the stack, where they are stored by default is more appropriate in the majority of situations. Now lets look at a case where oxe allow us to define types that we wouldn't be allowed to if we didn't have boxes ## Enabling Recursive Types with Boxes A value of *recursive type* can have another value of the same tpye as part of itself. These types pose an issue because at compile time Rust needs to know how much space a type takes up. However, the nesting of values of recursive types could theoretically continue infinitelym so Rust cant know how much spcae the value needs. Due to boxes having a knwon size, we can enable recursive types by inserting a box in the recursive type definition. As an example of a recursive type, lets explore the *cons list*. This data type is commonly found in functional programming languages. The cons list type we will define is straightforward except for the recursion, this means the cencepts in the example we will work with will be useful any time you get into more complex situations involving recursive types. ### More Information About the Cons List This data structure comes from the Lisp programming language and its dialects and is made up of nested pairs, and is the Lisp version of a linked list. The name comes form the `cons` function (short for "construct function") in Lisp that constructs a new pair form its two arguments. By calling `cons` on a pair consisting of a value and another pai, we can construct cons lists made up of recursive pairs. Here is a pseudocode representation of a cons list containing the list 1, 2, 3 with each pair in parentheses ``` (1, (2, (3, Nil))) ``` Each item is a cons list contains tow elements 1. The value of the current item 2. The next item The last item in the list only contains only a value called `Nil` without a next item. A cons list is produced by recursively calling the `cons` function. The canonical name to denote the base case of the recursion is `Nil` This is ot the same as the "null" or "nil" concept, which is an invalid or absent value. The cons list isn't a commonly ued data struct in Rust. Most of the time when you have a list of items in Rust, `Vec` is a better choice to use. Other, more complex recursive data types are useful in various situations but by starting with a basic type like the cons list. We can now explore how boxes let us define a recursive data type without much distraction. In this example, it contains an enum definition ofr a cons list. This will not compile yet because the `List` type doesn't have a known size ```rust enum List { Cons(i32, List), Nil, } ``` Note: We are implementing a cons list that holds only `i32` values for the purpose of this example. We could have done this with generics to define a cons list type that could store values of any type. Using the `List` type to store the list `1, 2, 3` would look like this next example ```rust use crate::List::{Cons, Nil}; fn main() { let list = Cons(1, Cons(2, Cons(3, Nil))); } ``` The first `Cons` value holds `1` and another `List` value. This `List` value is another `Cons` value tha holds `2` and naother `List` value. This `List` value also contains one more `Cons` value that hoolds `3` and a `List` value, which is finally `Nil`, the non-recursive variant that signals the end of the list If we try to compile this we would get an error like this ``` $ cargo run Compiling cons-list v0.1.0 (file:///projects/cons-list) error[E0072]: recursive type `List` has infinite size --> src/main.rs:1:1 | 1 | enum List { | ^^^^^^^^^ 2 | Cons(i32, List), | ---- recursive without indirection | help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle | 2 | Cons(i32, Box), | ++++ + error[E0391]: cycle detected when computing when `List` needs drop --> src/main.rs:1:1 | 1 | enum List { | ^^^^^^^^^ | = note: ...which immediately requires computing when `List` needs drop again = note: cycle used when computing whether `List` needs drop = note: see https://rustc-dev-guide.rust-lang.org/overview.html#queries and https://rustc-dev-guide.rust-lang.org/query.html for more information Some errors have detailed explanations: E0072, E0391. For more information about an error, try `rustc --explain E0072`. error: could not compile `cons-list` (bin "cons-list") due to 2 previous errors ``` The errors states that this type "has infinite size". The reason is that we defined `List` with a variant that is recursive. It holds another value of itself directly. This results in Rust being unable to figure out how much space t needs to store a `List` value. Lets look in detail at why we get this error, first we will look at how Rust decides how much space it needs to store a value of a non-recursive type. ### Computing the Size of a Non-Recursive Type If we recall the `Message` enum that we defined previously ```rust enum Message { Quit, Move { x: i32, y: i32 }, Write(String), ChangeColor(i32, i32, i32), } ``` In order to determine how much space to allocate for a `Message` value, RUst goes through each of the variants to see which variantneeds the msot space. Rust see that: - `Message::Quit` doesnt need any space - `Message::Move` needs enough space to store two `i32` values - etc. Because only one variant will be used, the most space a `Message` value will need is the space it would take to store the largest of its variants. Contrast this with what happens when Rust attempts to determine how much space a recursive type like `List` enum needs. The compiler starts by looking at the `Cons` variant, which holds a value if type `32` and a value of a type `List`. Therefore `Cons` needs an amount of space equal to the size of an `i32` plus the size of a `List` To figure out how much memory the `List` type needs the compiler looks at the variants, starting with the `Cons` variant. The `Cons` variant holds a value of type `i32` and a value of type `List` and this process continues infinitely Here is a picture that demonstrates what it looks like ### Using `Box` to Get a Recursive Type with a Known Size Due to Rust not being able to figure out how much space to allocate for recursively defined tpyes, the compiler gives an error with this great suggestion ``` help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle | 2 | Cons(i32, Box), | ++++ + ``` In this suggestion "indirection" means that instead of storeing a value directly, we should change the data struct to store the value indirectly by storeing a pointer to the value instead. Due to `Box` being a pointer, Rust will always know how much space a `Box` needs. A pointer's size doesnt change based on the kind of data it is pointing to. This means that we can put a `Box` inside the `Cons` variant instead of directly anthoer `List` value directly. The `Box` will point ot the next `List` value that will be on the heap rather than inside the `Cons` variant. Conceptually we still have a list, created with lists holding other lists. Now with this implementaion it is now more like placing the items next to one another rather than inside one another. We can now change the defintiion of the `List` enum and the usage of the `List`. This will now compile. ```rust enum List { Cons(i32, Box), Nil, } use crate::List::{Cons, Nil}; fn main() { let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil)))))); } ``` The `Cons` varienat needs the size of an `i32` plus the sapce to store the box's pointer data. The `Nil` variant stores no values so it needs less space than the `Cons` variant. By using boxes we have borken the infinite, recursive chain. Now the compiler can figure out the size it needs to store a `List` value. Here is what the `Cons` variant looks like now Boxes provide only the indirection and heap allocation. They do not have any other special capabilities unlike those we will see with the other smart pointer types. They also dont have the performance overhead that these special capabilities incur so they can be usefil in cases like the cons list where the indirection is the only feature we need. Boxes uses cases will also be covered later as well in more detail. The `Box` tpye is a smart pointer because it implements the `Deref` trait which allows `Box values to be treated like refs. When a `Box` value goes out of scope, the heap data that the box is pointing to is cleaned up as well, due to the `Drop` trait implementation. These two traits will be even more important to the functionality provided by the other smart pointer tpyes. Now we will go into these two traits in more detail. - [`Deref` Trait Quick Link](./Deref%20Trait.md)