7.5 KiB
Hash Map
Hash maps are defined as HashMap<K, V>
, where it stores a mapping of keys of type K
to values of type V
using a hashing function.
This determines how it places these keys and values into memory
Other programming support this kind of data structure its often called a different name, such as hash, map, object, hash table, dictionary, or associative array, these are just some of the names
Hash maps are useful when you want to look up data not by using an index, as you can with vectors, but instead you use a key that can be any type
For example it could be different teams scores where the name of the team is the key and the score is the value
Check the std library documentation for more info
Creating a New Hash Map
One way to create an empty hash map is to use new
and to add elements with insert
Lets say we are keeping track of the score of two teams whose names are Blue and Yellow
Blue team has 10 points Yellow team has 50 points
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
Note we must bring HashMap into scope because it is not included by default
Hash Maps also has less support from the std library, therefore there is no macros to construct them and there are less methods associated with them
They are store on the heap The keys must be all be the same type as each other The values must be all be the same type as each other
Keys and Values do not need to be the same type as each other
Accessing Values in a Hash Map
We can get a value out of the hash map by using its key to the get
method
Here is an example
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
let team_name = String::from("Blue");
let score = scores.get(&team_name).copied().unwrap_or(0);
This would result in score being equal to 10
The get
method returns an Option<&V>
, if there is no value for that key in the hash map, then the Option will be None
This program uses copied
to get an Option<i32>
rather than a Option<&i32>
then unwrap_or
to set score to zero if there is no entry for the key
You can iterate over each key-value pair in the hash map in a similar way as vectors using a for
loop
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
for (key, value) in &scores {
println!("{key}: {value}");
}
This would output each pair in an arbitrary order
Yellow: 50
Blue: 10
Hash Maps and Ownership
If a type implements the Copy
trait values are copied into the hash map
For owned values like String
the values with be moved and the hash map will be the owner of those values
Example
use std::collections::HashMap;
let field_name = String::from("Favorite color");
let field_value = String::from("Blue");
let mut map = HashMap::new();
map.insert(field_name, field_value);
// field_name and field_value are invalid at this point, try using them and
// see what compiler error you get!
You aren't able to use field_name
nor field_value
because ownership of the associated value from those two variables has changed
This change happened after the insert
method was called
If references are inserting into the hash map, then the values wont be moved but the reference MUST be valid for as long as the hash map is also valid
Updating a Hash Map
While a Hash Map is growable each key MUST by unique and can only be associated with a single value at a time. The reverse is not true
When you want to change the data in a hash map, you have to decided how to handle the case when a key already has a value assigned
Overwriting a Value
If you insert a key and value into a hash map then insert the same key with a different value then the value associated with the key will be overwritten
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Blue"), 25);
println!("{scores:?}");
This only contains 1 key: value
This will output {"Blue": 25}
The original 10 is overwritten
Adding a Key and Value Only If a Key Isn't Present
This is common do do if there is no key associated with a desired value, then add the key and value otherwise do nothing.
Hash Maps have a special API for this entry
that takes the key you want to check as a parameter
The return value of the entry
method is an enum called Entry that represents a value that might or might not exist
This can be used if it doesn't have a value associated with the key then insert the value
Example of use
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.entry(String::from("Yellow")).or_insert(50);
scores.entry(String::from("Blue")).or_insert(50);
println!("{scores:?}");
The or_insert
method on Entry
is defined to return a mutable reference to the value for the corresponding Entry
key if that key exists and if not it inserts the parameter as the new value for this key and returns a mutable refer to the new value
This is more clear then writing our own definition that complies with the borrow checker
This code will output {"Yellow": 50, "Blue": 10}
This means that this or_insert
will not change it if it already exists
Updating a Value Based on the Old Value
Another common use case for hash maps is to look up a key's value then update it based on the old value
Example
use std::collections::HashMap;
let text = "hello world wonderful world";
let mut map = HashMap::new();
for word in text.split_whitespace() {
let count = map.entry(word).or_insert(0);
*count += 1;
}
println!("{map:?}");
This counts how many times each word appears in some text
If its the first time we see that word we initialize it to zero, then we add one using a reference to that key's value
This will output {"world": 2, "hello": 1, "wonderful": 1}
Output order is arbitrary when iterating over a hash map
split_whitespace
method returns an iterator over sub slices, separated by whitespace, of the value in text
.
or_insert
method returns a mutable reference (&mut V
) to the value for the specified key
In this case this reference is stored in count
variable, then it is dereferenced count
using *
to assign a value to it. The reference then goes out of scope which ensures that the reference is freed and allows for the new assignment of another mutable reference to count
. This makes all of the changes by the mutable reference safe because of the use of a for
loop.
Hashing Functions
By default HashMap
uses a hashing function called SipHash that can provided resistance to DoS (denial-of-service) attacks involving hash tables
This is not the fastest hashing algorithm available, but it the trade-off is for better security but with a drop in performance. This is worth it.
If your use case requires a different or faster hash you can switch by specifying a different hasher
A hasher is a type that implements the BuildHasher
trait
Many other types of hashers can be provided in library crates which can be found on crates.io which implement many common hashing algorithms.