I/O Project: Building a Command Line Program
In this module I will be recreating the classic command line search tool grep (globally search a regular expression and print).
Rust's speed, safety, single binary output, and cross-platform support make it an ideal language for creating command line tools.
In the simplest use case, grep searches a specified file for a specified string.
To do this, grep takes a file path and a string as its arguments. It then reads the file, finds the lines in that file that contain the string argument, and prints those lines.
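The read, filter, and print flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation built in this project: the helper name matching_lines and the sample text are made up for the example, and the text lives in a string literal instead of being read from a file.

```rust
/// Returns every line of `contents` that contains `query`.
/// (`matching_lines` is a hypothetical helper name used only in this sketch.)
fn matching_lines<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    contents
        .lines()
        .filter(|line| line.contains(query))
        .collect()
}

fn main() {
    // Sample text standing in for the contents of a file.
    let contents = "first line\nsecond needle line\nthird line";

    // Print only the lines that contain the search string.
    for line in matching_lines("needle", contents) {
        println!("{line}");
    }
}
```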
Along the way, this project will also show how to use the terminal features that many other command line tools use.
It will include reading the value of an environment variable to allow the user to configure the behavior of our tool.
This project will also cover printing error messages to the standard error console stream (stderr) instead of standard output (stdout).
We do that so that the user can, for example, redirect successful output to a file while still seeing error messages onscreen.
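To make the split between the two streams concrete, here is a minimal sketch. The helper report and both messages are hypothetical. It writes through generic writers so the two destinations are easy to tell apart; in everyday code the println! and eprintln! macros write to stdout and stderr respectively.

```rust
use std::io::{self, Write};

/// Writes a normal result line to `out` and a diagnostic to `err`.
/// (`report` and both messages are hypothetical, for illustration only.)
fn report<O: Write, E: Write>(out: &mut O, err: &mut E) -> io::Result<()> {
    writeln!(out, "matched line")?;
    writeln!(err, "warning: something went wrong")?;
    Ok(())
}

fn main() -> io::Result<()> {
    // With the real streams, redirecting stdout to a file captures only the
    // first line; the warning still appears in the terminal.
    report(&mut io::stdout(), &mut io::stderr())
}
```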
One Rust community member, Andrew Gallant, has already created a fully featured, very fast version of grep called ripgrep.
By comparison, this version will be fairly simple.
Initial Goal: Accept Command Line Arguments
We want to be able to run our program with cargo run, followed by two hyphens to indicate that the following arguments are for our program rather than for cargo, and then:
- A string to search for
- A path to a file to search in
Here is an example invocation:
$ cargo run -- searchstring example-filename.txt
The program generated by cargo new cannot yet process the arguments we give it.
There are some existing libraries on crates.io that can help with writing a program that accepts command line arguments.
But since it's a learning opportunity, I (with the help of The Rust Programming Language) will be implementing this capability myself.
Reading the Argument Values
We will need the std::env::args function provided in Rust's standard library.
This function returns an iterator of the command line arguments passed to the program.
Iterators will be covered fully in the chapter after this one.
For now, there are two important details about iterators:
- Iterators produce a series of values.
- We can call the collect method on an iterator to turn it into a collection, such as a vector, that contains all the elements the iterator produces.
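Both details can be seen in a small sketch that has nothing to do with command line arguments yet. The helper name to_vec is made up for this example; a numeric range is used because it is one of the simplest iterators available.

```rust
/// Gathers everything an iterator of integers produces into a vector.
/// (`to_vec` is a hypothetical helper for this sketch.)
fn to_vec(iter: impl Iterator<Item = i32>) -> Vec<i32> {
    // The return type tells `collect` which kind of collection to build.
    iter.collect()
}

fn main() {
    // A range is an iterator: it produces the values 1, 2, and 3 in turn.
    let numbers = to_vec(1..=3);
    println!("{numbers:?}"); // prints "[1, 2, 3]"
}
```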
We bring the std::env module into scope with a use statement so we can use its args function.
Note that the std::env::args function is nested in two levels of modules.
In cases where the desired function is nested in more than one module, we choose to bring the parent module into scope rather than the function.
By doing this, we can also use other functions from std::env.
It is also less ambiguous than adding use std::env::args and then calling the function with just args, because args might easily be mistaken for a function that is defined in the current module.
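As a rough illustration of this convention, the sketch below imports only the parent module and reaches every function through the env:: prefix. The helper argument_count is hypothetical, and env::current_dir is used purely as an example of another function from the same module.

```rust
use std::env;

/// Counts the command line arguments, program name included.
/// (`argument_count` is a hypothetical helper for this sketch.)
fn argument_count() -> usize {
    // The `env::` prefix makes it clear that `args` comes from `std::env`
    // and is not a function defined in this file.
    env::args().count()
}

fn main() {
    println!("received {} argument(s)", argument_count());

    // Other functions from the same module are reachable through the same prefix.
    match env::current_dir() {
        Ok(dir) => println!("running in {}", dir.display()),
        Err(e) => eprintln!("could not determine the current directory: {e}"),
    }
}
```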
The args Function and Invalid Unicode
Note that std::env::args will panic if any argument contains invalid Unicode.
If your program needs to accept arguments containing invalid Unicode, use std::env::args_os instead.
That function returns an iterator that produces OsString values instead of String values.
We chose to use std::env::args for simplicity because OsString values differ per platform and are more complex to work with than String values.
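For comparison, here is one possible way to work with args_os. It is a sketch under the assumption that replacing invalid sequences with the Unicode replacement character is acceptable; the helper to_lossy_strings is a made-up name and is not part of this project.

```rust
use std::env;
use std::ffi::OsString;

/// Converts OS strings to `String`s, replacing invalid Unicode with U+FFFD.
/// (`to_lossy_strings` is a hypothetical helper for this sketch.)
fn to_lossy_strings(args: impl Iterator<Item = OsString>) -> Vec<String> {
    args.map(|arg| arg.to_string_lossy().into_owned()).collect()
}

fn main() {
    // Unlike `env::args`, `env::args_os` does not panic on invalid Unicode.
    let args = to_lossy_strings(env::args_os());
    println!("{args:?}");
}
```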
On the first line of main, we call env::args and immediately use collect to turn the iterator into a vector containing all the values it produces.
We can use the collect method to create many kinds of collections, so we explicitly annotate the type of args to specify that we want a vector of strings.
When using collect and similar methods, we often need to annotate the type because Rust isn't able to infer the kind of collection desired.
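These notes do not include the listing itself. A reconstruction consistent with the description above, and with the [src/main.rs:5:5] location reported by dbg! in the output, might look like this; treat it as a sketch, not a verbatim copy of the project file.

```rust
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    dbg!(args);
}
```

Without the Vec<String> annotation the compiler would reject the program, because collect alone does not say which collection to build.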
Here is the output of cargo run, first without and then with arguments:
$ cargo run
Compiling minigrep v0.1.0 (file:///projects/minigrep)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.61s
Running `target/debug/minigrep`
[src/main.rs:5:5] args = [
"target/debug/minigrep",
]
$ cargo run -- needle haystack
Compiling minigrep v0.1.0 (file:///projects/minigrep)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.57s
Running `target/debug/minigrep needle haystack`
[src/main.rs:5:5] args = [
"target/debug/minigrep",
"needle",
"haystack",
]
Notice that the first value in the vector is "target/debug/minigrep", which is the name of our binary.
This matches the behavior of the arguments list in C, letting programs use the name by which they were invoked in their execution.
It's often convenient to have access to the program name in case you want to print it in messages or change the behavior of the program based on what command line alias was used to invoke the program.
For this program, we will ignore it and save only the two arguments we need.
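Although minigrep ignores the program name, a short sketch shows how it could be printed in a message. The usage function and the message format are hypothetical and are not part of this project.

```rust
use std::env;

/// Builds a usage line that mentions the name the program was invoked as.
/// (`usage` and the message format are hypothetical, for illustration only.)
fn usage(program: &str) -> String {
    format!("Usage: {program} <query> <file_path>")
}

fn main() {
    let args: Vec<String> = env::args().collect();

    // The first element is normally the program name; fall back to a fixed
    // name if the platform did not provide one.
    let program = args.first().map(String::as_str).unwrap_or("minigrep");
    println!("{}", usage(program));
}
```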
Saving the Argument Values in Variables
The program is currently able to access the values specified as command line arguments.
Now we should save the two arguments in variables so that we can use them throughout the rest of the program.
We do this by borrowing elements of the args vector by index, for example &args[1].
The first argument that minigrep takes is the string we are searching for, so we put a reference to the first argument, &args[1], in the variable query.
The second argument is the file path, so we put a reference to the second argument, &args[2], in the variable file_path.
We will temporarily print the values of these variables to prove that the code is working as intended.
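A sketch of this step is given below. Two parts are assumptions added for illustration and are not described in these notes: the describe helper, which exists only so the message format can be checked in isolation, and the length guard, which is there because indexing past the end of a vector panics.

```rust
use std::env;

/// Formats the two temporary progress lines.
/// (`describe` is a hypothetical helper for this sketch.)
fn describe(query: &str, file_path: &str) -> String {
    format!("Searching for {query}\nIn file {file_path}")
}

fn main() {
    let args: Vec<String> = env::args().collect();

    // Guard added for this sketch only: without it, running the program
    // with fewer than two arguments would panic on the indexing below.
    if args.len() < 3 {
        eprintln!("expected a query and a file path");
        return;
    }

    // Index 0 holds the program name, so our arguments start at index 1.
    let query = &args[1];
    let file_path = &args[2];

    println!("{}", describe(query, file_path));
}
```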
Here is what the output looks like at this point:
$ cargo run -- test sample.txt
Compiling minigrep v0.1.0 (file:///projects/minigrep)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.0s
Running `target/debug/minigrep test sample.txt`
Searching for test
In file sample.txt