RustBrock/Iterators.md
2025-02-24 13:11:28 -07:00

Processing a Series of Items with Iterators

Iterator patterns allow you to perform some task on a sequence of items in turn.

An iterator is responsible for the logic of iterating over each item and determining when the sequence has finished.

When you use iterators, you don't have to reimplement that logic yourself.

Rust's iterators are lazy, meaning they have no effect until you call methods that consume the iterator to use it up.

In this example, the code creates an iterator over the items in the vector v1 by calling the iter method defined on Vec<T>.

This code does nothing useful.

    let v1 = vec![1, 2, 3];

    let v1_iter = v1.iter();

The iterator is stored in the v1_iter variable.

Once an iterator is created, we can use it in a variety of ways.

Earlier, we iterated over an array using a for loop to execute some code on each of its items.

Under the hood this implicitly created and then consumed an iterator, but we glossed over how exactly that works until now.

In this example, we separate the creation of the iterator from the use of the iterator in the for loop.

When the for loop is called using the iterator in v1_iter, each element in the iterator is used in one iteration of the loop, which prints out each value.

let v1 = vec![1, 2, 3];

let v1_iter = v1.iter();

for val in v1_iter {
    println!("Got: {val}");
}

In languages that don't have iterators provided by their standard libraries, you would likely write this same functionality by starting a variable at index 0, using that variable to index into the vector to get a value, and incrementing the variable in a loop until it reached the total number of items in the vector.

Iterators handle all that logic for you, removing repetitive code that you could mess up.
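
As a rough side-by-side sketch of the two approaches described above, the helpers below visit the same elements once with a hand-managed index and once with an iterator. The function names collect_by_index and collect_by_iterator are made up for illustration and are not part of the standard library.

```rust
// Hypothetical helpers for illustration; neither name comes from std.

/// Visits every element by manual indexing, the way you might in a
/// language without iterators.
fn collect_by_index(v: &[i32]) -> Vec<i32> {
    let mut seen = Vec::new();
    let mut index = 0; // start at index 0
    while index < v.len() {
        seen.push(v[index]); // index into the slice to get a value
        index += 1; // increment until we reach the total number of items
    }
    seen
}

/// Visits the same elements with an iterator; there is no index to manage.
fn collect_by_iterator(v: &[i32]) -> Vec<i32> {
    let mut seen = Vec::new();
    for val in v.iter() {
        seen.push(*val); // val is a &i32, so dereference to copy the value
    }
    seen
}
```

Both functions return the same vector; the iterator version simply has fewer places where an off-by-one mistake could hide.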

Iterators also give you more flexibility to use the same logic with many different kinds of sequences, not just data structures you can index into, such as vectors.

The Iterator Trait and the next Method

All iterators implement a trait named Iterator that is defined in the standard library.

The definition of the Iterator trait looks like this:

pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;

    // methods with default implementations elided
}

Note that this definition uses some new syntax: type Item and Self::Item, which define an associated type with this trait.

Associated types will be discussed in ch20 (Advanced Features).

For now, all you need to know is that this code says implementing the Iterator trait requires that you also define an Item type.

This Item type is used in the return type of the next method.

In other words, the Item type will be the type returned from the iterator.

The Iterator trait only requires implementors to define one method: the next method, which returns one item of the iterator at a time, wrapped in Some, and, when iteration is over, returns None.
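
To see what "define an Item type and a next method" might look like in practice, here is a minimal sketch of a hand-written iterator. The Countdown struct and its remaining field are hypothetical names invented for this example.

```rust
/// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The associated type: each call to next yields a u32.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            // The sequence is finished.
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1; // advance the internal state
            Some(current)
        }
    }
}
```

Because next is the only required method, a Countdown value also gets the trait's default methods for free, so something like Countdown { remaining: 3 }.sum::<u32>() should work without any extra code.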

We can call the next method on iterators directly.

This example demonstrates what values are returned from repeated calls to next on the iterator created from the vector.

#[test]
fn iterator_demonstration() {
    let v1 = vec![1, 2, 3];

    let mut v1_iter = v1.iter();

    assert_eq!(v1_iter.next(), Some(&1));
    assert_eq!(v1_iter.next(), Some(&2));
    assert_eq!(v1_iter.next(), Some(&3));
    assert_eq!(v1_iter.next(), None);
}

Notice that we need to make v1_iter mutable.

Calling the next method on an iterator changes internal state that the iterator uses to keep track of where it is in the sequence.

It could also be said that the code consumes, or uses up, the iterator.

Each call to next consumes an item from the iterator.

We didn't need to make v1_iter mutable when we used a for loop because the loop took ownership of v1_iter and made it mutable under the hood.

Also note that the values we get from the calls to next are immutable references to the values in the vector.

The iter method produces an iterator over immutable references.

If we want to create an iterator that takes ownership of v1 and returns owned values, we can call into_iter.

If you want to iterate over mutable references, you can call iter_mut instead of iter.
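
The three flavors can be compared in a small sketch. The helper names below (sum_by_reference, double_in_place, longest) are made up for illustration; only iter, iter_mut, and into_iter come from the standard library.

```rust
/// iter: borrows the data and yields immutable references (&i32).
fn sum_by_reference(v: &[i32]) -> i32 {
    v.iter().sum()
}

/// iter_mut: yields mutable references (&mut i32), so items can be
/// changed in place.
fn double_in_place(v: &mut [i32]) {
    for x in v.iter_mut() {
        *x *= 2;
    }
}

/// into_iter: takes ownership of the vector and yields owned Strings,
/// so the longest one can be returned without cloning.
fn longest(words: Vec<String>) -> Option<String> {
    words.into_iter().max_by_key(|w| w.len())
}
```

After calling sum_by_reference or double_in_place the caller can keep using the vector, while the vector passed to longest has been moved and is no longer available.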

Methods that Consume the Iterator

The Iterator trait has a number of different methods with default implementations provided by the std library.

You can find out about these by looking in the std library API documentation for the Iterator trait.

Some of these methods call the next method in their definition, which is why you are required to implement the next method when implementing the Iterator trait.

Methods that call next are called consuming adapters, because calling them uses up the iterator.

One example is the sum method, which takes ownership of the iterator and iterates through the items by repeatedly calling next, thus consuming the iterator.

As it iterates through, it adds each item to a running total.

When iteration is complete, it returns the total.

Here is a test illustrating a use of the sum method:

    #[test]
    fn iterator_sum() {
        let v1 = vec![1, 2, 3];

        let v1_iter = v1.iter();

        let total: i32 = v1_iter.sum();

        assert_eq!(total, 6);
    }

We aren't allowed to use v1_iter after the call to sum because sum takes ownership of the iterator we call it on.
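
To make the "repeatedly calling next" description concrete, here is a hand-written stand-in for sum. This is only a sketch of the idea, not how the standard library actually implements sum, and the name manual_sum is hypothetical.

```rust
/// Hypothetical stand-in for sum over an iterator of &i32 items.
/// Taking `iter` by value means the caller gives up ownership of it.
fn manual_sum<'a, I>(mut iter: I) -> i32
where
    I: Iterator<Item = &'a i32>,
{
    let mut total = 0;
    // Keep calling next until the iterator reports it is finished.
    while let Some(item) = iter.next() {
        total += *item; // add each item to the running total
    }
    total
}
```

Calling manual_sum(v1_iter) moves v1_iter into the function, which is the same reason the iterator can't be used again after a call to sum.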

Methods that Produce Other Iterators

Iterator adapters are methods defined on the Iterator trait that don't consume the iterator.

Instead, they produce different iterators by changing some aspect of the original iterator.

This example shows calling an iterator adapter method map, which takes a closure to call on each item as the items are iterated through.

The map method returns a new iterator that produces the modified items.

The closure here creates a new iterator in which each item from the vector will be incremented by 1.

    let v1: Vec<i32> = vec![1, 2, 3];

    v1.iter().map(|x| x + 1);

But this code produces a warning:

$ cargo run
   Compiling iterators v0.1.0 (file:///projects/iterators)
warning: unused `Map` that must be used
 --> src/main.rs:4:5
  |
4 |     v1.iter().map(|x| x + 1);
  |     ^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: iterators are lazy and do nothing unless consumed
  = note: `#[warn(unused_must_use)]` on by default
help: use `let _ = ...` to ignore the resulting value
  |
4 |     let _ = v1.iter().map(|x| x + 1);
  |     +++++++

warning: `iterators` (bin "iterators") generated 1 warning
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.47s
     Running `target/debug/iterators`

This code doesn't do anything; the closure we specified never gets called. The warning reminds us why: iterator adapters are lazy, and we need to consume the iterator here.

To fix this warning and consume the iterator, we will use the collect method, which we used in Ch 12 with env::args.

This method consumes the iterator and collects the resulting values into a collection data type.

In this next example, we collect the results of iterating over the iterator that's returned from the call to map into a vector.

This vector will end up containing each item from the original vector incremented by 1.

    let v1: Vec<i32> = vec![1, 2, 3];

    let v2: Vec<_> = v1.iter().map(|x| x + 1).collect();

    assert_eq!(v2, vec![2, 3, 4]);

Because map takes a closure, we can specify any operation we want to perform on each item.

This is a good example of how closures let you customize some behavior while reusing the iteration behavior that the Iterator trait provides.

You can chain multiple calls to iterator adapters to perform complex actions in a readable way.
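
As a small sketch of such a chain, the function below keeps only the even numbers, squares them, and adds them up. The name sum_of_even_squares is made up for this example; filter, map, and sum are the standard Iterator methods.

```rust
/// Hypothetical example: sum of the squares of the even values.
fn sum_of_even_squares(values: &[i32]) -> i32 {
    values
        .iter()
        // filter receives a reference to each item (&&i32 here)
        .filter(|x| **x % 2 == 0)
        // map receives each remaining item (&i32) and yields an i32
        .map(|x| x * x)
        // sum is the consuming adapter that drives the whole chain
        .sum()
}
```

For the input [1, 2, 3, 4, 5, 6] this gives 4 + 16 + 36 = 56.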

Because all iterators are lazy, you have to call one of the consuming adapter methods to get results from calls to iterator adapters.
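
One way to observe this laziness is to count how many times the closure runs. The sketch below uses a Cell as a call counter; the function name lazy_map_demo and the counter are assumptions made for illustration.

```rust
use std::cell::Cell;

/// Hypothetical demo: returns (calls before collect, calls after collect,
/// collected values).
fn lazy_map_demo() -> (u32, u32, Vec<i32>) {
    let v1 = vec![1, 2, 3];
    let calls = Cell::new(0u32);

    // Building the adapter does not run the closure yet.
    let mapped = v1.iter().map(|x| {
        calls.set(calls.get() + 1); // record that the closure ran
        x + 1
    });
    let calls_before = calls.get();

    // collect is a consuming adapter, so now the closure runs per item.
    let v2: Vec<i32> = mapped.collect();
    let calls_after = calls.get();

    (calls_before, calls_after, v2)
}
```

The counter stays at 0 until collect is called, and reaches 3 once the three items have been pulled through map.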

Using Closures that Capture Their Environment

Many iterator adapters take closures as arguments, and commonly the closures we'll specify as arguments to iterator adapters will be closures that capture their environment.

In this example we use the filter method that takes a closure.

The closure gets an item from the iterator and returns a bool.

If the closure returns true, the value will be included in the iteration produced by filter.

If the closure returns false, the value won't be included.

Here we use filter with a closure that captures the shoe_size variable from its environment to iterate over a collection of Shoe struct instances.

It will return only shoes that are the specified size.

#[derive(PartialEq, Debug)]
struct Shoe {
    size: u32,
    style: String,
}

fn shoes_in_size(shoes: Vec<Shoe>, shoe_size: u32) -> Vec<Shoe> {
    shoes.into_iter().filter(|s| s.size == shoe_size).collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn filters_by_size() {
        let shoes = vec![
            Shoe {
                size: 10,
                style: String::from("sneaker"),
            },
            Shoe {
                size: 13,
                style: String::from("sandal"),
            },
            Shoe {
                size: 10,
                style: String::from("boot"),
            },
        ];

        let in_my_size = shoes_in_size(shoes, 10);

        assert_eq!(
            in_my_size,
            vec![
                Shoe {
                    size: 10,
                    style: String::from("sneaker")
                },
                Shoe {
                    size: 10,
                    style: String::from("boot")
                },
            ]
        );
    }
}

The shoes_in_size function takes ownership of a vector of shoes and a shoe size as parameters.

It returns a vector containing only shoes of the specified size.

In the body of shoes_in_size we call into_iter to create an iterator that takes ownership of the vector.

We then call filter to adapt that iterator into a new iterator that only contains elements for which the closure returns true.

The closure captures the shoe_size parameter from the environment and compares the value with each shoe's size, keeping only shoes of the size specified.

Calling collect finally gathers the values returned by the adapted iterator into a vector that is returned by the function.

This test shows that when we call shoes_in_size, we get back only shoes that have the same size as the value specified.
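
If the caller needs to keep the original vector, one possible variation is to borrow the shoes and return references instead of owned values. This is a sketch of an alternative, not part of the example above, and the name shoes_in_size_borrowed is hypothetical.

```rust
#[derive(PartialEq, Debug)]
struct Shoe {
    size: u32,
    style: String,
}

/// Hypothetical variant: borrows the shoes and returns references to
/// the matching ones, leaving the caller's collection intact.
fn shoes_in_size_borrowed(shoes: &[Shoe], shoe_size: u32) -> Vec<&Shoe> {
    // The closure still captures shoe_size from its environment;
    // iter yields &Shoe, so nothing is moved out of the slice.
    shoes.iter().filter(|s| s.size == shoe_size).collect()
}
```

The trade-off is that the returned references can't outlive the borrowed shoes, whereas the into_iter version hands back fully owned Shoe values.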