RustBrock/Improving The IO Project.md
darkicewolf50 90ffa5d8a5
finished ch15.4
2025-03-05 17:07:24 -07:00


# Improving the I/O Project
Using what we now know about iterators, we can improve the minigrep project by making places in the code clearer and more concise.
Let's see how iterators can improve our implementation of the `Config::build` function and the `search` function.
## Removing a `clone` Using an Iterator
Earlier, we added code that took a slice of `String` values and created an instance of the `Config` struct by indexing into the slice and cloning the values, allowing the `Config` struct to own those values.
Here we have reproduced the implementation of the `Config::build` function as it was at the end of ch12:
```rust
impl Config {
    pub fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }
        let query = args[1].clone();
        let file_path = args[2].clone();
        let ignore_case = env::var("IGNORE_CASE").is_ok();
        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}
```
At the time, we said not to worry about the inefficient `clone` calls because we would remove them later.
Now we will fix that.
We needed `clone` here because the parameter `args` is a slice of `String` elements, but the `build` function doesn't own `args`.
To return ownership of a `Config` instance, we had to clone the values for the `query` and `file_path` fields so the `Config` instance can own its values.
Now, with iterators, we can change the `build` function to take ownership of an iterator as its argument instead of borrowing a slice.
We will use the iterator functionality instead of the code that checks the length of the slice and indexes into specific locations.
This will clarify what the `Config::build` function is doing because the iterator will access the values.
Once `Config::build` takes ownership of the iterator and stops using indexing operations that borrow, we can move the `String` values from the iterator into `Config` rather than calling `clone` and making a new allocation.
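A minimal sketch of the difference between cloning out of a borrowed slice and moving out of an owned iterator. The helper names `second_by_clone` and `second_by_move` are hypothetical and not part of minigrep.
```rust
// Hypothetical helpers for illustration; not part of minigrep.

// Borrowing a slice: the only way to hand back an owned String is to clone it.
fn second_by_clone(args: &[String]) -> Option<String> {
    args.get(1).cloned()
}

// Owning an iterator: `nth` consumes items and moves the String out,
// so no extra allocation is needed.
fn second_by_move(mut args: impl Iterator<Item = String>) -> Option<String> {
    args.nth(1)
}

fn main() {
    let args = vec![String::from("minigrep"), String::from("needle")];

    // `args` is only borrowed here, so it is still usable afterwards.
    let cloned = second_by_clone(&args);

    // `into_iter` gives up ownership of the vector and its strings.
    let moved = second_by_move(args.into_iter());

    assert_eq!(cloned, moved);
    println!("{moved:?}");
}
```
Both helpers return the same value; the second one just avoids copying the string data.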
### Using the Returned Iterator Directly
Before the change, `main.rs` should look like this:
```rust
fn main() {
    let args: Vec<String> = env::args().collect();
    let config = Config::build(&args).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });
    // --snip--
}
```
First we will change the start of `main` to pass an iterator instead.
This won't compile until we update `Config::build` as well:
```rust
fn main() {
    let config = Config::build(env::args()).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });
    // --snip--
}
```
The `env::args` function returns an iterator.
Rather than collecting the iterator values into a vector and then passing a slice to `Config::build`, we now pass ownership of the iterator returned from `env::args` to `Config::build` directly.
Next we need to update the definition of `Config::build`.
Here is how we update the signature of `Config::build`.
Note this still won't compile because we need to update the function body.
```rust
impl Config {
    pub fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        // --snip--
```
The standard library documentation for the `env::args` function shows that the type of the iterator it returns is `std::env::Args`, and that type implements the `Iterator` trait and returns `String` values.
We have updated the `Config::build` signature so the parameter `args` has a generic type with the trait bounds `impl Iterator<Item = String>` instead of `&[String]`.
This usage of the `impl Trait` syntax was discussed in [Traits as Parameters](./Traits.md#traits-as-parameters).
This means that `args` can be any type that implements the `Iterator` trait and returns `String` items.
Because we take ownership of `args` and will be mutating it by iterating over it, we add the `mut` keyword to the specification of the `args` parameter to make it mutable.
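As a rough illustration of what the `impl Iterator<Item = String>` bound allows, the sketch below passes three different iterator types to one function. The function `first_two` is hypothetical and not part of minigrep.
```rust
// Hypothetical function for illustration; not part of minigrep.
// `mut` is required because `Iterator::next` takes `&mut self`.
fn first_two(mut args: impl Iterator<Item = String>) -> (Option<String>, Option<String>) {
    let first = args.next();
    let second = args.next();
    (first, second)
}

fn main() {
    // A vector's owning iterator yields `String` values...
    let from_vec = first_two(vec![String::from("a"), String::from("b")].into_iter());

    // ...and so does an adapter chain that maps `&str` to `String`.
    let from_map = first_two("a b".split_whitespace().map(String::from));

    assert_eq!(from_vec, from_map);

    // `std::env::args()` returns `std::env::Args`, which also satisfies the bound.
    let (program, _) = first_two(std::env::args());
    println!("program name: {program:?}");
}
```
Removing `mut` from the parameter makes the calls to `next` fail to compile, because `next` needs a mutable borrow of the iterator.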
### Using `Iterator` Trait Methods Instead of Indexing
Now we will fix the body of `Config::build`.
Because `args` implements the `Iterator` trait, we know we can call the `next` method on it.
Here is the updated body:
```rust
impl Config {
    pub fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        args.next();
        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };
        let file_path = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file path"),
        };
        let ignore_case = env::var("IGNORE_CASE").is_ok();
        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}
```
Remember that the first value in the return value of `env::args` is the name of the program.
We want to ignore that, so we first call `next` and do nothing with the return value to consume it from the iterator.
Next we call `next` to get the value we want to put in the `query` field of `Config`.
If `next` returns a `Some`, we use a `match` to extract the value.
If it returns `None`, it means not enough arguments were given and we return early with an `Err` value.
We do the same thing for the `file_path` value.
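One side effect of accepting any `String` iterator is that `build` can be exercised without a real command line. The sketch below assumes the `Config` struct has the three public fields used in `build`. The helper `fake_args` is hypothetical and only exists for this example.
```rust
use std::env;

// Assumed shape of the struct, based on the fields that `build` fills in.
pub struct Config {
    pub query: String,
    pub file_path: String,
    pub ignore_case: bool,
}

impl Config {
    // Same logic as the updated `build` in these notes.
    pub fn build(
        mut args: impl Iterator<Item = String>,
    ) -> Result<Config, &'static str> {
        args.next(); // skip the program name
        let query = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a query string"),
        };
        let file_path = match args.next() {
            Some(arg) => arg,
            None => return Err("Didn't get a file path"),
        };
        let ignore_case = env::var("IGNORE_CASE").is_ok();
        Ok(Config { query, file_path, ignore_case })
    }
}

// Hypothetical helper: turns string literals into an owned `String` iterator.
fn fake_args(words: &[&str]) -> std::vec::IntoIter<String> {
    let owned: Vec<String> = words.iter().map(|word| word.to_string()).collect();
    owned.into_iter()
}

fn main() {
    let config = Config::build(fake_args(&["minigrep", "needle", "poem.txt"]))
        .expect("three arguments should be enough");
    assert_eq!(config.query, "needle");
    assert_eq!(config.file_path, "poem.txt");

    // Only the program name and a query: the second `match` returns early.
    let missing = Config::build(fake_args(&["minigrep", "needle"])).err();
    assert_eq!(missing, Some("Didn't get a file path"));
}
```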
## Making Code Clearer with Iterator Adapters
We can also take advantage of iterators in the `search` function in the I/O project.
Here is the old version of the `search` function:
```rust
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut results = Vec::new();
    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }
    results
}
```
We can rewrite this code in a more concise way by using iterator adapter methods.
Doing so also lets us avoid having a mutable intermediate `results` vector.
The functional programming style prefers to minimize the amount of mutable state to make code clearer.
Removing the mutable state might enable a future enhancement to make searching happen in parallel, because we wouldn't have to manage concurrent access to the `results` vector.
Here is the new version:
```rust
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}
```
The purpose of the `search` function is to return all lines in `contents` that contain the `query`.
Similar to the `filter` example from before, this code uses the `filter` adapter to keep only the lines for which `line.contains(query)` returns `true`.
We then collect the matching lines into another vector with `collect`.
This is much simpler.
You can make the same change to use iterator methods in the `search_case_insensitive` function as well.
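A possible iterator version of `search_case_insensitive` is sketched below. It assumes the original function lowercases both the query and each line before comparing, as in the ch12 loop version; treat it as one way to write it, not the only one.
```rust
// Sketch only: assumes the ch12 behavior of lowercasing the query and each line.
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    // Lowercase the query once, outside the closure.
    let query = query.to_lowercase();
    contents
        .lines()
        // Compare a lowercased copy, but keep the original line in the result.
        .filter(|line| line.to_lowercase().contains(&query))
        .collect()
}

fn main() {
    let contents = "Rust:\nsafe, fast, productive.\nPick three.\nTrust me.";
    let matches = search_case_insensitive("rUsT", contents);
    assert_eq!(matches, vec!["Rust:", "Trust me."]);
    println!("{matches:?}");
}
```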
## Choosing Between Loops or Iterators
The next question is which style you should choose in your own code and why: the original implementation of minigrep or the version using iterators.
Most Rust programmers prefer to use the iterator style.
It is a bit tougher to get the hang of at first, but once you get a feel for the various iterator adapters and what they do, iterators can be easier to understand.
Instead of fiddling with the various bits of looping and building new vectors, the code focuses on the high-level objective of the loop.
This abstraction takes away some of the commonplace code so it is easier to see the concepts that are unique to this code, such as the filtering condition each element in the iterator must pass.
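To make the comparison concrete, here is a small sketch of the same task written in both styles. The functions `match_numbers_loop` and `match_numbers_iter` are hypothetical and not part of minigrep; they return the 1-based line numbers of matching lines.
```rust
// Hypothetical functions for illustration; not part of minigrep.

// Loop style: manual counter, mutable vector, explicit push.
fn match_numbers_loop(query: &str, contents: &str) -> Vec<usize> {
    let mut numbers = Vec::new();
    let mut number = 1;
    for line in contents.lines() {
        if line.contains(query) {
            numbers.push(number);
        }
        number += 1;
    }
    numbers
}

// Iterator style: each adapter names one step of the objective.
fn match_numbers_iter(query: &str, contents: &str) -> Vec<usize> {
    contents
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains(query))
        .map(|(index, _)| index + 1) // convert 0-based index to line number
        .collect()
}

fn main() {
    let contents = "one\ntwo\nthree two\nfour";
    assert_eq!(match_numbers_loop("two", contents), vec![2, 3]);
    assert_eq!(match_numbers_iter("two", contents), vec![2, 3]);
}
```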
You may think that the lower-level loop will be faster, but let's talk about performance [here](./The%20Performance%20Closures%20and%20Iterators.md).