Improving the I/O Project

With this new knowledge about iterators, we can improve the minigrep project by using iterators to make parts of the code clearer and more concise.

Let's see how iterators can improve our implementation of the Config::build function and the search function.

Removing a clone Using an Iterator

Earlier, we added code that took a slice of String values and created an instance of the Config struct by indexing into the slice and cloning the values, allowing the Config struct to own those values.

Here we have reproduced the implementation of the Config::build function as it was at the end of Chapter 12:

impl Config {
    pub fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        let ignore_case = env::var("IGNORE_CASE").is_ok();

        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}

At the time, we said not to worry about the inefficient clone calls because we would remove them later.

Now we will fix that.

We needed clone here because the parameter args is a slice with String elements, and the build function doesn't own args.

To return ownership of a Config instance, we had to clone the values for the query and file_path fields of Config so that the Config instance could own its values.

Now, with iterators, we can change the build function to take ownership of an iterator as its argument instead of borrowing a slice.

We will use the iterator functionality instead of the code that checks the length of the slice and indexes into specific locations.

This will clarify what the Config::build function is doing, because the iterator will access the values.

Once Config::build takes ownership of the iterator and stops using indexing operations that borrow, we can move the String values from the iterator into Config rather than calling clone and making a new allocation.
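To make the difference concrete, here is a minimal sketch contrasting the two approaches. The helper names second_by_clone and second_by_move are hypothetical and not part of minigrep; the pointer comparison is only there to illustrate that a moved String keeps its original heap buffer, while a clone gets a new one.

```rust
// Hypothetical helper: borrows the slice, so it must clone to hand back an owned String.
fn second_by_clone(args: &[String]) -> String {
    args[1].clone()
}

// Hypothetical helper: owns the vector, so it can move the String out of the iterator.
fn second_by_move(args: Vec<String>) -> Option<String> {
    args.into_iter().nth(1)
}

fn main() {
    let args = vec![String::from("minigrep"), String::from("needle")];
    let original_ptr = args[1].as_ptr();

    let cloned = second_by_clone(&args);
    // The clone lives in a freshly allocated buffer.
    assert_ne!(cloned.as_ptr(), original_ptr);

    let moved = second_by_move(args).expect("two arguments were provided");
    // The moved String still points at the original buffer: no new allocation.
    assert_eq!(moved.as_ptr(), original_ptr);

    println!("cloned = {cloned}, moved = {moved}");
}
```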

Using the Returned Iterator Directly

At this point, main.rs should look like this:

fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    // --snip--
}

First we will change the start of main to use an iterator instead.

This won't compile until we update Config::build as well.

fn main() {
    let config = Config::build(env::args()).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    // --snip--
}

The env::args function returns an iterator.

Rather than collecting the iterator values into a vector and then passing a slice to Config::build, we now pass ownership of the iterator returned from env::args to Config::build directly.

Now we need to update the definition of Config::build.

Here is how we update the signature of Config::build.

Note that this still won't compile, because we need to update the function body.

impl Config {
    pub fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        // --snip--

The standard library documentation for the env::args function shows that the type of the iterator it returns is std::env::Args, and that this type implements the Iterator trait and returns String values.

We have updated the Config::build signature so that the parameter args has a generic type with the trait bound impl Iterator<Item = String> instead of &[String].

This usage of the impl Trait syntax was discussed in the section on traits as parameters.

It means that args can be any type that implements the Iterator trait and returns String items.

Because we take ownership of args and will be mutating it by iterating over it, we add the mut keyword to the specification of the args parameter to make it mutable.
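As a rough illustration of what the impl Iterator<Item = String> bound allows, the sketch below passes three different iterator types to the same function. The helper first_real_arg is a hypothetical name introduced only for this example.

```rust
use std::env;

// Hypothetical helper: skips the program name and returns the next argument, if any.
// `next` takes `&mut self`, so the parameter must be declared `mut`.
fn first_real_arg(mut args: impl Iterator<Item = String>) -> Option<String> {
    args.next(); // discard the program name
    args.next()
}

fn main() {
    // std::env::Args implements Iterator<Item = String>.
    println!("{:?}", first_real_arg(env::args()));

    // So does the owning iterator of a Vec<String>.
    let from_vec = vec![String::from("minigrep"), String::from("needle")];
    println!("{:?}", first_real_arg(from_vec.into_iter()));

    // And so does a mapped iterator over string literals.
    let from_strs = ["minigrep", "haystack"].iter().map(|s| s.to_string());
    println!("{:?}", first_real_arg(from_strs));
}
```

Accepting any such iterator also makes the function easy to exercise without real command line arguments.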

Using Iterator Trait Methods Instead of Indexing

Now we will fix the body of Config::build.

Because args implements the Iterator trait, we know we can call the next method on it.

Here is the updated body:

impl Config {
    pub fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        args.next();

        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };

        let file_path = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file path"),
        };

        let ignore_case = env::var("IGNORE_CASE").is_ok();

        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}

Remember that the first value in the return value of env::args is the name of the program.

We want to ignore that, so we first call next and do nothing with the return value, which consumes it from the iterator.

Then we call next again to get the value we want to put in the query field of Config.

If next returns a Some, we use a match to extract the value.

If it returns None, it means not enough arguments were given and we return early with an Err value.

We do the same thing for the file_path value.
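For comparison, here is an alternative sketch of the same logic, not the version shown above. It swaps each match for Option::ok_or plus the ? operator, which turns a None into an early Err return. The fixed argument lists in main are hypothetical stand-ins for env::args().

```rust
use std::env;

#[derive(Debug)]
pub struct Config {
    pub query: String,
    pub file_path: String,
    pub ignore_case: bool,
}

impl Config {
    // Alternative sketch: `ok_or` converts the Option from `next` into a
    // Result, and `?` returns early when that Result is an Err.
    pub fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        args.next(); // skip the program name

        let query = args.next().ok_or("Didn't get a query string")?;
        let file_path = args.next().ok_or("Didn't get a file path")?;
        let ignore_case = env::var("IGNORE_CASE").is_ok();

        Ok(Config { query, file_path, ignore_case })
    }
}

fn main() {
    // Hypothetical argument list standing in for env::args().
    let fake_args = ["minigrep", "needle", "poem.txt"].iter().map(|s| s.to_string());
    println!("{:?}", Config::build(fake_args));

    // Only the program name: build reports the missing query.
    let too_few = ["minigrep"].iter().map(|s| s.to_string());
    println!("{:?}", Config::build(too_few));
}
```

Whether this reads better than the explicit match is a matter of taste; the behavior and the error messages are the same.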

Making Code Clearer with Iterator Adapters

We can also take advantage of iterators in the search function in the I/O project.

Here is the old version of the search function:

pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();

    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }

    results
}

We can rewrite this code more concisely by using iterator adapter methods.

This helps us avoid having a mutable intermediate results vector.

The functional programming style prefers to minimize the amount of mutable state to make code clearer.

Removing the mutable state might enable a future enhancement to make searching happen in parallel, because we wouldn't have to manage concurrent access to the results vector.

Here is the new version:

pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}

The purpose of the search function is to return all lines in contents that contain the query.

Similar to the filter example from before, this code uses the filter adapter to keep only the lines for which line.contains(query) returns true.

We collect the matching lines into another vector with collect.

This is much simpler.

You can make the same change to use iterator methods in the search_case_insensitive function as well.
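One possible shape for that change is sketched below. It assumes the loop version of search_case_insensitive lowercased both the query and each line before comparing; the sample text in main is only for illustration.

```rust
// Sketch of search_case_insensitive in the iterator style. Assumes the loop
// version lowercased both the query and each line before comparing.
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    // Lowercase the query once, outside the closure.
    let query = query.to_lowercase();

    contents
        .lines()
        .filter(|line| line.to_lowercase().contains(&query))
        .collect()
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\nPick three.\nTrust me.";
    for line in search_case_insensitive("rUsT", contents) {
        println!("{line}");
    }
}
```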

Choosing Between Loops or Iterators

The next question is which style you should choose in your own code and why: the original implementation of minigrep or the version using iterators.

Most Rust programmers prefer to use the iterator style.

It is a bit tougher to get the hang of at first, but once you get a feel for the various