mirror of
https://github.com/darkicewolf50/RustBrock.git
synced 2025-06-14 20:44:17 -06:00
finished ch8
This commit is contained in:
parent
37ec4daf52
commit
d5364089e9
26
.obsidian/workspace.json
vendored
@ -27,12 +27,26 @@
|
||||
"state": {
|
||||
"type": "markdown",
|
||||
"state": {
|
||||
"file": "String.md",
|
||||
"file": "Collection of Common Data Structs.md",
|
||||
"mode": "source",
|
||||
"source": false
|
||||
},
|
||||
"icon": "lucide-file",
|
||||
"title": "String"
|
||||
"title": "Collection of Common Data Structs"
|
||||
}
|
||||
},
|
||||
{
|
||||
"id": "3a5f95a8df68eb56",
|
||||
"type": "leaf",
|
||||
"state": {
|
||||
"type": "markdown",
|
||||
"state": {
|
||||
"file": "Error Handling.md",
|
||||
"mode": "source",
|
||||
"source": false
|
||||
},
|
||||
"icon": "lucide-file",
|
||||
"title": "Error Handling"
|
||||
}
|
||||
},
|
||||
{
|
||||
@ -50,7 +64,7 @@
|
||||
}
|
||||
}
|
||||
],
|
||||
"currentTab": 1
|
||||
"currentTab": 2
|
||||
}
|
||||
],
|
||||
"direction": "vertical"
|
||||
@ -192,10 +206,13 @@
|
||||
"command-palette:Open command palette": false
|
||||
}
|
||||
},
|
||||
"active": "b80f5219fa24358f",
|
||||
"active": "3a5f95a8df68eb56",
|
||||
"lastOpenFiles": [
|
||||
"Collection of Common Data Structs.md",
|
||||
"Error Handling.md",
|
||||
"Hash.md",
|
||||
"String.md",
|
||||
"README.md",
|
||||
"Vector.md",
|
||||
"data_types.md",
|
||||
"Modules and Use.md",
|
||||
@ -215,7 +232,6 @@
|
||||
"Good and Bad Code",
|
||||
"Data Types.md",
|
||||
"Variables.md",
|
||||
"README.md",
|
||||
"Constants.md"
|
||||
]
|
||||
}
|
@ -9,7 +9,7 @@ choosing the right one is a skill that is developed over time
|
||||
## Common Collections
|
||||
- [*Vector*](Vector.md) - allows for storing a variable number of values next to each other
|
||||
- [*String*](String.md) - a collection of characters
|
||||
- *Hash Map* - allows you to associate a value with a specific key
|
||||
- [*Hash Map*](Hash.md) - allows you to associate a value with a specific key
|
||||
- It is a particular implementation of a more general data structure called a map
|
||||
- {
|
||||
- 1: data,
|
||||
@ -17,3 +17,9 @@ choosing the right one is a skill that is developed over time
|
||||
- 3: this is a map
|
||||
- }
|
||||
|
||||
|
||||
# Summary
|
||||
Here are some exercises that you should be able to solve after reading through the common collections in the std library
|
||||
- Given a list of integers, use a vector and return the median (when sorted, the value in the middle position) and mode (the value that occurs most often; a hash map will be helpful here) of the list.
|
||||
- Convert strings to pig latin. The first consonant of each word is moved to the end of the word and ay is added, so first becomes irst-fay. Words that start with a vowel have hay added to the end instead (apple becomes apple-hay). Keep in mind the details about UTF-8 encoding!
|
||||
- Using a hash map and vectors, create a text interface to allow a user to add employee names to a department in a company; for example, “Add Sally to Engineering” or “Add Amir to Sales.” Then let the user retrieve a list of all people in a department or all people in the company by department, sorted alphabetically.
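One possible starting point for the first exercise is sketched below. It is only a sketch under a few assumptions of my own: the function names `median` and `mode` are made up for illustration, the median of an even-length list is taken to be the upper of the two middle values, and ties for the mode are broken in favor of the smaller value.

```rust
use std::collections::HashMap;

/// Returns the value in the middle position of the sorted list,
/// or `None` when the list is empty.
fn median(values: &[i32]) -> Option<i32> {
    let mut sorted = values.to_vec();
    sorted.sort();
    // For an even number of values this picks the upper middle value.
    sorted.get(sorted.len() / 2).copied()
}

/// Returns the value that occurs most often, or `None` when the list is empty.
fn mode(values: &[i32]) -> Option<i32> {
    let mut counts: HashMap<i32, usize> = HashMap::new();
    for &value in values {
        *counts.entry(value).or_insert(0) += 1;
    }
    counts
        .into_iter()
        // Compare by count first; on a tie, prefer the smaller value so the
        // result does not depend on the hash map's arbitrary order.
        .max_by(|a, b| a.1.cmp(&b.1).then(b.0.cmp(&a.0)))
        .map(|(value, _count)| value)
}

fn main() {
    let numbers = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5];
    println!("median: {:?}", median(&numbers));
    println!("mode: {:?}", mode(&numbers));
}
```

Returning `Option` avoids having to invent a result for an empty list.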
|
1
Error Handling.md
Normal file
@ -0,0 +1 @@
|
||||
# Error Handling
|
208
Hash.md
Normal file
@ -0,0 +1,208 @@
|
||||
# Hash Map
|
||||
The type `HashMap<K, V>` stores a mapping of keys of type `K` to values of type `V` using a hashing function.
|
||||
|
||||
The hashing function determines how the map places these keys and values into memory
|
||||
|
||||
Many other programming languages support this kind of data structure, but it often goes by a different name, such as *hash*, *map*, *object*, *hash table*, *dictionary*, or *associative array*
|
||||
|
||||
Hash maps are useful when you want to look up data not by using an index, as you can with vectors, but by using a key that can be of any type (as long as that type implements `Eq` and `Hash`)
|
||||
|
||||
For example, a hash map could store teams' scores, where the name of each team is the key and its score is the value
|
||||
|
||||
Check the std library documentation for more info
|
||||
|
||||
## Creating a New Hash Map
|
||||
One way to create an empty hash map is to use `new` and to add elements with `insert`
|
||||
|
||||
Let's say we are keeping track of the scores of two teams whose names are *Blue* and *Yellow*
|
||||
|
||||
Blue team has 10 points
|
||||
Yellow team has 50 points
|
||||
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let mut scores = HashMap::new();
|
||||
|
||||
scores.insert(String::from("Blue"), 10);
|
||||
scores.insert(String::from("Yellow"), 50);
|
||||
```
|
||||
|
||||
Note that we must bring `HashMap` into scope with `use` because it is not included in the prelude
|
||||
|
||||
Hash maps also have less support from the std library than vectors and strings; for example, there is no built-in macro to construct them
|
||||
|
||||
Like vectors, hash maps store their data on the heap
|
||||
All of the keys must have the same type as each other
|
||||
All of the values must have the same type as each other
|
||||
|
||||
Keys and Values do not need to be the same type as each other
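Although there is no construction macro, a hash map can still be built in one expression. The sketch below shows two ways that I am confident exist in the std library: `HashMap::from` with an array of tuples, and `collect` on an iterator of tuples. The function names `build_scores` and `build_scores_from_vectors` are made up for illustration.

```rust
use std::collections::HashMap;

// Build the map directly from an array of (key, value) tuples.
fn build_scores() -> HashMap<String, i32> {
    HashMap::from([
        (String::from("Blue"), 10),
        (String::from("Yellow"), 50),
    ])
}

// Build the same map by zipping a vector of keys with a vector of values.
fn build_scores_from_vectors() -> HashMap<String, i32> {
    let teams = vec![String::from("Blue"), String::from("Yellow")];
    let points = vec![10, 50];
    teams.into_iter().zip(points).collect()
}

fn main() {
    println!("{:?}", build_scores());
    println!("{:?}", build_scores_from_vectors());
}
```

In both cases the keys are all `String` and the values are all `i32`, which matches the rule that keys share one type and values share one type.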
|
||||
|
||||
## Accessing Values in a Hash Map
|
||||
|
||||
We can get a value out of the hash map by passing its key to the `get` method
|
||||
|
||||
Here is an example
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let mut scores = HashMap::new();
|
||||
|
||||
scores.insert(String::from("Blue"), 10);
|
||||
scores.insert(String::from("Yellow"), 50);
|
||||
|
||||
let team_name = String::from("Blue");
|
||||
let score = scores.get(&team_name).copied().unwrap_or(0);
|
||||
```
|
||||
|
||||
This would result in score being equal to 10
|
||||
|
||||
The `get` method returns an `Option<&V>`; if there is no value for that key in the hash map, `get` returns `None`
|
||||
|
||||
This program uses `copied` to get an `Option<i32>` rather than an `Option<&i32>`, then `unwrap_or` to set `score` to zero if there is no entry for the key
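When a default of zero would hide a missing team, the `Option` returned by `get` can be matched on instead. This is a minimal sketch; `describe_score` is a hypothetical helper name, not something from the std library.

```rust
use std::collections::HashMap;

// Turn the Option<&i32> returned by `get` into a message,
// handling the missing-key case explicitly.
fn describe_score(scores: &HashMap<String, i32>, team: &str) -> String {
    match scores.get(team) {
        Some(score) => format!("{team} has {score} points"),
        None => format!("{team} has no recorded score"),
    }
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);

    println!("{}", describe_score(&scores, "Blue"));
    println!("{}", describe_score(&scores, "Green"));
}
```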
|
||||
|
||||
You can iterate over each key-value pair in the hash map in a similar way as vectors using a `for` loop
|
||||
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let mut scores = HashMap::new();
|
||||
|
||||
scores.insert(String::from("Blue"), 10);
|
||||
scores.insert(String::from("Yellow"), 50);
|
||||
|
||||
for (key, value) in &scores {
|
||||
println!("{key}: {value}");
|
||||
}
|
||||
```
|
||||
|
||||
|
||||
This would output each pair in an arbitrary order
|
||||
```
|
||||
Yellow: 50
|
||||
Blue: 10
|
||||
```
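If a stable order matters, for example when printing a report, one option is to collect the pairs into a vector and sort them before printing. The sketch below assumes sorting by team name is what is wanted; `sorted_lines` is a made-up helper name.

```rust
use std::collections::HashMap;

// Collect the borrowed pairs into a vector, sort them, and format each one.
fn sorted_lines(scores: &HashMap<String, i32>) -> Vec<String> {
    let mut entries: Vec<(&String, &i32)> = scores.iter().collect();
    // Tuples sort by their first field, so this orders the pairs by key.
    entries.sort();
    entries
        .into_iter()
        .map(|(team, score)| format!("{team}: {score}"))
        .collect()
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Yellow"), 50);
    scores.insert(String::from("Blue"), 10);

    for line in sorted_lines(&scores) {
        println!("{line}");
    }
}
```

The std library also has `BTreeMap`, which keeps its keys sorted at all times, if sorted iteration is needed often.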
|
||||
|
||||
## Hash Maps and Ownership
|
||||
|
||||
For types that implement the `Copy` trait, like `i32`, the values are copied into the hash map
|
||||
For owned values like `String`, the values will be moved and the hash map will be the owner of those values
|
||||
|
||||
Example
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let field_name = String::from("Favorite color");
|
||||
let field_value = String::from("Blue");
|
||||
|
||||
let mut map = HashMap::new();
|
||||
map.insert(field_name, field_value);
|
||||
// field_name and field_value are invalid at this point, try using them and
|
||||
// see what compiler error you get!
|
||||
```
|
||||
|
||||
You can't use `field_name` or `field_value` after this point because ownership of their values has moved into the hash map
|
||||
|
||||
The move happened when the `insert` method was called
|
||||
|
||||
If references are inserted into the hash map, the values won't be moved, but the values that the references point to MUST remain valid for at least as long as the hash map is valid
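A small sketch of the borrowing case is shown below. The helper `borrowed_map` is hypothetical; its lifetime parameter `'a` is how the compiler is told that the map must not outlive the strings it borrows from.

```rust
use std::collections::HashMap;

// Hypothetical helper: builds a map that only borrows its key and value.
// The lifetime 'a ties the map to the borrowed strings, so the compiler
// rejects any use of the map after those strings are gone.
fn borrowed_map<'a>(name: &'a str, value: &'a str) -> HashMap<&'a str, &'a str> {
    let mut map = HashMap::new();
    map.insert(name, value);
    map
}

fn main() {
    let field_name = String::from("Favorite color");
    let field_value = String::from("Blue");

    let map = borrowed_map(&field_name, &field_value);

    // Nothing was moved, so the original strings are still usable here.
    println!("{field_name} -> {field_value}");
    println!("{map:?}");
}
```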
|
||||
|
||||
## Updating a Hash Map
|
||||
|
||||
While a hash map is growable, each key MUST be unique and can only be associated with a single value at a time.
|
||||
The reverse is not true: the same value can be stored under several different keys
|
||||
|
||||
When you want to change the data in a hash map, you have to decide how to handle the case when a key already has a value assigned
|
||||
|
||||
### Overwriting a Value
|
||||
|
||||
If you insert a key and a value into a hash map and then insert the same key with a different value, the value associated with that key will be replaced
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let mut scores = HashMap::new();
|
||||
|
||||
scores.insert(String::from("Blue"), 10);
|
||||
scores.insert(String::from("Blue"), 25);
|
||||
|
||||
println!("{scores:?}");
|
||||
```
|
||||
|
||||
The hash map contains only one key-value pair
|
||||
This will output `{"Blue": 25}`
|
||||
The original 10 is overwritten
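The overwritten value is not lost silently: `insert` returns an `Option` holding the previous value for the key, if there was one. The sketch below wraps that in a helper named `record_score`, which is a name made up for illustration.

```rust
use std::collections::HashMap;

// Insert a score and hand back whatever value the key had before, if any.
fn record_score(scores: &mut HashMap<String, i32>, team: &str, score: i32) -> Option<i32> {
    scores.insert(team.to_string(), score)
}

fn main() {
    let mut scores = HashMap::new();

    // First insert: the key was not present yet.
    let before_first = record_score(&mut scores, "Blue", 10);
    // Second insert: the key was present, so the old value comes back.
    let before_second = record_score(&mut scores, "Blue", 25);

    println!("{before_first:?} {before_second:?} {scores:?}");
}
```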
|
||||
|
||||
### Adding a Key and Value Only If a Key Isn't Present
|
||||
It is common to check whether a key already exists in the hash map: if it does, leave the existing value alone; if it doesn't, insert the key along with a value.
|
||||
|
||||
Hash maps have a special API for this called `entry`, which takes the key you want to check as a parameter
|
||||
|
||||
The return value of the `entry` method is an enum called `Entry` that represents a value that might or might not exist
|
||||
|
||||
With it, you can insert a value only when the key does not already have one
|
||||
|
||||
Example of use
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let mut scores = HashMap::new();
|
||||
scores.insert(String::from("Blue"), 10);
|
||||
|
||||
scores.entry(String::from("Yellow")).or_insert(50);
|
||||
scores.entry(String::from("Blue")).or_insert(50);
|
||||
|
||||
println!("{scores:?}");
|
||||
```
|
||||
|
||||
The `or_insert` method on `Entry` is defined to return a mutable reference to the value for the corresponding `Entry` key if that key exists, and if not, it inserts the parameter as the new value for this key and returns a mutable reference to the new value
|
||||
|
||||
This is much cleaner than writing the logic ourselves, and it plays more nicely with the borrow checker
|
||||
|
||||
This code will output `{"Yellow": 50, "Blue": 10}` (the order of the pairs may differ)
|
||||
|
||||
The second call to `entry` does not change the hash map, because the Blue team already has the value `10`
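To see the `Entry` enum itself, the sketch below matches on its two variants, `Occupied` and `Vacant`, instead of calling `or_insert`. The helper name `insert_if_absent` and its `bool` return value are my own choices for illustration.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

// Insert `score` only when `team` has no value yet.
// Returns true when a new pair was inserted.
fn insert_if_absent(scores: &mut HashMap<String, i32>, team: &str, score: i32) -> bool {
    match scores.entry(team.to_string()) {
        // The key already has a value: leave it untouched.
        Entry::Occupied(_) => false,
        // The key is missing: fill the vacant slot.
        Entry::Vacant(slot) => {
            slot.insert(score);
            true
        }
    }
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);

    println!("{}", insert_if_absent(&mut scores, "Yellow", 50));
    println!("{}", insert_if_absent(&mut scores, "Blue", 50));
    println!("{scores:?}");
}
```

In everyday code `or_insert` is shorter; the explicit match is mainly useful when the two cases need different follow-up work.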
|
||||
|
||||
### Updating a Value Based on the Old Value
|
||||
Another common use case for hash maps is to look up a key's value then update it based on the old value
|
||||
|
||||
Example
|
||||
```rust
|
||||
use std::collections::HashMap;
|
||||
|
||||
let text = "hello world wonderful world";
|
||||
|
||||
let mut map = HashMap::new();
|
||||
|
||||
for word in text.split_whitespace() {
|
||||
let count = map.entry(word).or_insert(0);
|
||||
*count += 1;
|
||||
}
|
||||
|
||||
println!("{map:?}");
|
||||
```
|
||||
|
||||
This counts how many times each word appears in some text
|
||||
|
||||
If it's the first time we see a word, we first insert the value `0`, then we add one through the mutable reference to that key's value
|
||||
|
||||
This will output `{"world": 2, "hello": 1, "wonderful": 1}`
|
||||
|
||||
Output order is arbitrary when iterating over a hash map
|
||||
|
||||
The `split_whitespace` method returns an iterator over subslices, separated by whitespace, of the value in `text`.
|
||||
|
||||
The `or_insert` method returns a mutable reference (`&mut V`) to the value for the specified key
|
||||
|
||||
Here the mutable reference is stored in the `count` variable, so to assign to the value we must first dereference `count` using `*`. The mutable reference goes out of scope at the end of each iteration of the `for` loop, so a new one can be created on the next iteration, and all of these changes are safe and allowed by the borrowing rules.
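The same word count can also be written without holding on to the reference, by chaining `and_modify` and `or_insert` on the entry. This is a sketch of an alternative, not a replacement for the version above; `word_counts` is a made-up function name.

```rust
use std::collections::HashMap;

// Count how many times each whitespace-separated word appears in `text`.
fn word_counts(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        counts
            .entry(word)
            // Key already present: bump the existing count.
            .and_modify(|count| *count += 1)
            // Key missing: start the count at one.
            .or_insert(1);
    }
    counts
}

fn main() {
    let counts = word_counts("hello world wonderful world");
    println!("{counts:?}");
}
```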
|
||||
|
||||
## Hashing Functions
|
||||
|
||||
By default `HashMap` uses a hashing function called [*SipHash*](https://en.wikipedia.org/wiki/SipHash) that provides resistance to denial-of-service (DoS) attacks involving hash tables
|
||||
|
||||
This is not the fastest hashing algorithm available, but the trade-off of better security for a drop in performance is usually worth it.
|
||||
|
||||
If your use case requires a different or faster hash function, you can switch by specifying a different hasher
|
||||
|
||||
A *hasher* is a type that implements the `BuildHasher` trait
|
||||
|
||||
You don't have to implement your own hasher from scratch: [crates.io](https://crates.io) has libraries that provide hashers implementing many common hashing algorithms.
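To show how a different hasher is plugged in, the sketch below uses only the std library. The `Fnv1a` type, the `FnvBuildHasher` alias, and the `fnv_scores` function are written here purely for illustration. FNV-1a was picked only because it fits in a few lines; it does not have SipHash's resistance to DoS attacks, so treat this as a demonstration of the mechanism rather than a recommendation.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

/// A tiny 64-bit FNV-1a hasher, written here only for illustration.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

// `BuildHasherDefault` implements `BuildHasher` for any hasher
// that also implements `Default`.
type FnvBuildHasher = BuildHasherDefault<Fnv1a>;

fn fnv_scores() -> HashMap<String, i32, FnvBuildHasher> {
    let mut scores = HashMap::with_hasher(FnvBuildHasher::default());
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Yellow"), 50);
    scores
}

fn main() {
    let scores = fnv_scores();
    println!("{:?}", scores.get("Blue"));
}
```

The hasher becomes the third type parameter of `HashMap`; the rest of the API (`insert`, `get`, `entry`) is used exactly as before.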
|
205
String.md
@ -13,8 +13,6 @@ There is only one string type in the core language which is the string slice ``s
|
||||
This is special when not referenced because it is a constant written into the binary.
|
||||
This is only the case for string literals
|
||||
|
||||
The ``String`` type, which is provided by Rust's standard library rather than coded into the core language, is a growable, mutable, owned, UTF-8 encoded string type.
|
||||
|
||||
When "strings" are referred to in Rust, this can mean either the ``&str`` (string slice) type or the ``String`` type that is included in the std library
|
||||
|
||||
String slices are also UTF-8 encoded
|
||||
@ -97,3 +95,206 @@ s.push('l');
|
||||
|
||||
### Concatenation with the ``+`` Operator or the ``format!`` Macro
|
||||
|
||||
If you want to combine two existing strings, one way is to use the ``+`` operator
|
||||
|
||||
```rust
|
||||
let s1 = String::from("Hello, ");
|
||||
let s2 = String::from("world!");
|
||||
let s3 = s1 + &s2; // note s1 has been moved here and can no longer be used
|
||||
```
|
||||
|
||||
s3 will contain the string ``Hello, world!``
|
||||
|
||||
``s1`` is no longer valid because of the signature of the ``add`` method that the ``+`` operator calls
|
||||
|
||||
```rust
|
||||
fn add(self, s: &str) -> String {
|
||||
```
|
||||
|
||||
In this signature, ``self`` is taken by value, so the first string changes ownership, and the second argument must be a reference
|
||||
|
||||
``add`` is normally defined with generics and associated types; here they are replaced with concrete types to illustrate what happens when it is used with a ``String``
|
||||
|
||||
Even though a ``&String`` is not a ``&str``, the program still compiles because the compiler can coerce the ``&String`` argument into a ``&str``.
|
||||
|
||||
You cannot add two ``String`` values together directly; you can only add a ``&str`` to a ``String``
|
||||
|
||||
Rust uses a *deref coercion*, which turns ``&s2`` into ``&s2[..]``. Because ``add`` only takes a reference to ``s2``, ownership of ``s2`` does not transfer and it is still valid afterwards
|
||||
|
||||
The statement takes ownership of ``s1``, appends a copy of the contents of ``s2``, and then returns ownership of the result
|
||||
|
||||
It may look like it is making a lot of copies, but it isn't; the implementation is more efficient than copying
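If the first string is still needed afterwards, one option is to make an owned copy of it and let ``+`` consume the copy. This is a minimal sketch; ``concat_keep`` is a hypothetical helper name.

```rust
// Concatenate two strings without giving up ownership of either input.
// `to_owned` makes a new String from `first`, and `+` then consumes that copy.
fn concat_keep(first: &str, second: &str) -> String {
    first.to_owned() + second
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");

    let s3 = concat_keep(&s1, &s2);

    // Both inputs were only borrowed, so they are still valid here.
    println!("{s1}{s2}");
    println!("{s3}");
}
```

The cost is one extra allocation for the copy, which is the price of keeping ``s1`` usable.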
|
||||
|
||||
|
||||
If we need to concatenate multiple strings, the behavior of the ``+`` operator gets unwieldy
|
||||
|
||||
For combining strings in more complex ways, we can instead use the ``format!`` macro
|
||||
|
||||
```rust
|
||||
let s1 = String::from("tic");
|
||||
let s2 = String::from("tac");
|
||||
let s3 = String::from("toe");
|
||||
|
||||
let s = s1 + "-" + &s2 + "-" + &s3;
|
||||
|
||||
// updated method
|
||||
|
||||
let s1 = String::from("tic");
|
||||
let s2 = String::from("tac");
|
||||
let s3 = String::from("toe");
|
||||
|
||||
let s = format!("{s1}-{s2}-{s3}");
|
||||
|
||||
```
|
||||
|
||||
The ``format!`` macro works just like ``println!``, but instead of printing the output to the screen, it returns a ``String`` with the contents.
|
||||
|
||||
Using ``format!`` is much easier to read, and the code generated by the ``format!`` macro uses references, so it doesn't take ownership of any of its parameters
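The ownership point can be checked with a short sketch: after ``format!`` runs, all three inputs can still be used. The function name ``tic_tac_toe`` is made up for illustration.

```rust
// Join three string slices with dashes; `format!` only borrows its arguments.
fn tic_tac_toe(s1: &str, s2: &str, s3: &str) -> String {
    format!("{s1}-{s2}-{s3}")
}

fn main() {
    let s1 = String::from("tic");
    let s2 = String::from("tac");
    let s3 = String::from("toe");

    let s = tic_tac_toe(&s1, &s2, &s3);
    println!("{s}");

    // Unlike with `+`, s1 has not been moved and can still be used.
    println!("{s1} {s2} {s3}");
}
```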
|
||||
|
||||
## Indexing into Strings
|
||||
You cannot access individual characters of a ``String`` using normal index syntax; code like this does not compile
|
||||
|
||||
```rust
|
||||
let s1 = String::from("hello");
|
||||
let h = s1[0]; // does not compile: a String cannot be indexed with an integer
|
||||
```
|
||||
|
||||
The reason is how strings are stored in memory in Rust
|
||||
|
||||
### Internal Representation
|
||||
A string is a wrapper over a ``Vec<u8>``
|
||||
|
||||
Let's look at some UTF-8 example strings
|
||||
```rust
|
||||
let hello = String::from("Hola");
|
||||
```
|
||||
In this case ``len`` will be `4`, which means the vector storing the string `"Hola"` is 4 bytes long. Each of these letters takes one byte when encoded in UTF-8, but this is not always true
|
||||
|
||||
for example
|
||||
```rust
|
||||
let hello = String::from("Здравствуйте");
|
||||
```
|
||||
If you were asked how long the string is, you might say 12, but in fact the answer in Rust is 24, which is the number of bytes it takes to encode the string in UTF-8
|
||||
|
||||
The `String` type, which is provided by Rust's standard library rather than coded into the core language, is a growable, mutable, owned, UTF-8 encoded string type.
|
||||
|
||||
This is because every Unicode scalar value in that string takes 2 bytes of storage. Therefore indexing into the string's bytes will not always correlate to a valid Unicode scalar value
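The difference between the two lengths can be checked directly: `len` counts bytes, while `chars().count()` counts Unicode scalar values. In this small sketch, `byte_and_char_counts` is a made-up helper name.

```rust
// Return (number of bytes, number of Unicode scalar values) for a string.
fn byte_and_char_counts(text: &str) -> (usize, usize) {
    (text.len(), text.chars().count())
}

fn main() {
    for text in ["Hola", "Здравствуйте"] {
        let (bytes, chars) = byte_and_char_counts(text);
        println!("{text}: {bytes} bytes, {chars} chars");
    }
}
```

Note that `chars().count()` has to walk the whole string, so it is not a constant-time operation the way `len` is.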
|
||||
|
||||
To break this down, consider this invalid Rust code
|
||||
|
||||
```rust
|
||||
let hello = "Здравствуйте";
|
||||
let answer = &hello[0]; // does not compile: a str cannot be indexed with an integer
|
||||
```
|
||||
|
||||
The value inside `answer` is not `З`, the first letter
|
||||
|
||||
When encoded into UTF-8, the first byte of `З` is `208` and the second is `151` so it would seem that `answer` should contain `208`, but `208` by itself is not a valid character.
|
||||
Callers generally do not want a raw byte value when they index into a string
|
||||
|
||||
To avoid returning an unexpected value and causing bugs that might not be discovered immediately, Rust refuses to compile this code at all. This prevents misunderstandings early in the dev process.
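Code that really wants "the first something" of a string has to say which something it means. The sketch below shows two explicit alternatives from the std library: `chars().next()` for the first Unicode scalar value and `as_bytes().first()` for the first raw byte. The helper names `first_char` and `first_byte` are made up for illustration.

```rust
// The first Unicode scalar value, or None for an empty string.
fn first_char(text: &str) -> Option<char> {
    text.chars().next()
}

// The first raw byte of the UTF-8 encoding, or None for an empty string.
fn first_byte(text: &str) -> Option<u8> {
    text.as_bytes().first().copied()
}

fn main() {
    let hello = "Здравствуйте";
    println!("{:?}", first_char(hello));
    println!("{:?}", first_byte(hello));
}
```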
|
||||
|
||||
### Bytes and Scalar Values and Grapheme Clusters
|
||||
|
||||
There are three ways to look at a UTF-8 string from Rust's perspective:
|
||||
- Bytes
|
||||
- Scalar Values
|
||||
- Grapheme Clusters (the closest thing to what we would call letters)
|
||||
|
||||
Consider the Hindi word “नमस्ते” written in the Devanagari script. The vector of `u8` values that stores that string looks like this
|
||||
```
|
||||
[224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164, 224, 165, 135]
|
||||
```
|
||||
|
||||
This is 18 bytes and is how computers store the data
|
||||
|
||||
If we look at them as Unicode scalar values, which is what the Rust `char` type is, those bytes would look like this
|
||||
```
|
||||
['न', 'म', 'स', '्', 'त', 'े']
|
||||
```
|
||||
The problem with this is that the 4th and 6th values are not letters, they are diacritics that don't make sense on their own
|
||||
|
||||
If we look at them as Grapheme Clusters then we would get the four characters that make up the Hindi word
|
||||
|
||||
```
|
||||
["न", "म", "स्", "ते"]
|
||||
```
|
||||
|
||||
Rust provides these ways of interpreting the raw string so that each program can choose the interpretation it needs, no matter what human language the data is in.
|
||||
|
||||
The final reason Rust doesn't allow us to index into a `String` to get a character is that indexing operations are expected to always take constant time O(1), but that cannot be guaranteed with a `String`, because Rust would have to walk through the contents from the beginning up to the index to determine how many valid characters there were.
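The first two of these views are available directly in the std library, as the sketch below shows with the `bytes` and `chars` methods. The helper names `bytes_of` and `scalars_of` are made up for illustration. Grapheme clusters are left out because the std library does not provide them.

```rust
// View a string as its raw UTF-8 bytes.
fn bytes_of(text: &str) -> Vec<u8> {
    text.bytes().collect()
}

// View a string as its Unicode scalar values (Rust `char`s).
fn scalars_of(text: &str) -> Vec<char> {
    text.chars().collect()
}

fn main() {
    let word = "नमस्ते";
    let bytes = bytes_of(word);
    let scalars = scalars_of(word);

    println!("{} bytes: {:?}", bytes.len(), bytes);
    println!("{} scalar values: {:?}", scalars.len(), scalars);
}
```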
|
||||
|
||||
## Slicing Strings
|
||||
Indexing into a string is often a bad idea because it is not clear what the return type of the indexing operation should be: a byte value, a character, a grapheme cluster, or a string slice.
|
||||
|
||||
If you really need to use indices to create string slices, Rust asks you to be more specific
|
||||
|
||||
Rather than indexing using a single number, you can use `[]` with a range to create a string slice that contains particular bytes
|
||||
|
||||
```rust
|
||||
let hello = "Здравствуйте";
|
||||
|
||||
let s = &hello[0..4];
|
||||
```
|
||||
|
||||
`s` is a `&str` that contains the first 4 bytes of the string
|
||||
Since each of these characters takes two bytes, this means that `s` contains `Зд`
|
||||
|
||||
If we were to slice only part of a character's bytes, with something like `&hello[0..1]`, Rust would panic at runtime in the same way as if an invalid index were accessed in a vector
|
||||
|
||||
Be careful when creating string slices with ranges, because doing so can crash your program
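When the range comes from user input or a calculation, the panic can be avoided with the `get` method on string slices, which returns `None` instead of panicking when a range is out of bounds or does not fall on character boundaries. This is a minimal sketch; `safe_prefix` is a hypothetical helper name.

```rust
// Return the first `end` bytes of `text` as a string slice, or None when
// `end` is out of bounds or would cut a character in half.
fn safe_prefix(text: &str, end: usize) -> Option<&str> {
    text.get(0..end)
}

fn main() {
    let hello = "Здравствуйте";

    println!("{:?}", safe_prefix(hello, 4));
    println!("{:?}", safe_prefix(hello, 1));

    // `is_char_boundary` reports whether a byte offset is a valid place to slice.
    println!("{}", hello.is_char_boundary(1));
    println!("{}", hello.is_char_boundary(2));
}
```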
|
||||
|
||||
## Methods for Iterating over Strings
|
||||
The best way to operate on pieces of strings is to be explicit about whether you want characters or bytes
|
||||
|
||||
For individual Unicode scalar values, use the `chars` method
|
||||
|
||||
Calling `chars` on “Зд” separates out and returns two values of type `char`
|
||||
You can also iterate over the result to access each element
|
||||
|
||||
```rust
|
||||
for c in "Зд".chars() {
|
||||
println!("{c}");
|
||||
}
|
||||
```
|
||||
|
||||
this will output
|
||||
```
|
||||
З
|
||||
д
|
||||
```
|
||||
|
||||
You can also use the `bytes` method to return each raw byte
|
||||
```rust
|
||||
for b in "Зд".bytes() {
|
||||
println!("{b}");
|
||||
}
|
||||
```
|
||||
|
||||
this will output
|
||||
```
|
||||
208
|
||||
151
|
||||
208
|
||||
180
|
||||
```
|
||||
This may be appropriate for your use case
|
||||
|
||||
Remember that valid Unicode scalar values may be made up of more than one byte
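When both views are needed at once, the std library's `char_indices` method yields each `char` together with the byte offset where it starts. In this small sketch, `char_offsets` is a made-up helper name.

```rust
// Pair every Unicode scalar value with the byte offset where it begins.
fn char_offsets(text: &str) -> Vec<(usize, char)> {
    text.char_indices().collect()
}

fn main() {
    for (offset, c) in char_offsets("Зд") {
        println!("byte {offset}: {c}");
    }
}
```

The offsets returned this way always fall on character boundaries, so they are safe to use as the ends of a slicing range.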
|
||||
|
||||
Getting grapheme clusters from strings such as with the Devanagari script is complex
|
||||
Therefore this functionality is not provided in the std library
|
||||
|
||||
Crates are available on [crates.io](https://crates.io) if you need this functionality
|
||||
|
||||
## Strings Are Not So Simple
|
||||
|
||||
Rust chooses to make correct handling of `String` data the default behavior for all Rust programs, which means that programmers have to put more thought into handling UTF-8 data up front
|
||||
|
||||
While this exposes more of the complexity of strings, it prevents errors involving non-ASCII characters later in development
|
||||
|
||||
The std library offers a lot of functionality built off the `String` and `&str` types
|
||||
|
||||
Some other useful methods include `contains` for searching in a string and `replace` for substituting parts of a string with another string
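A short sketch of those two methods working together is shown below. The helper `replace_if_present` is a made-up name; it returns `None` when there is nothing to replace so the caller can tell the two cases apart.

```rust
// Replace every occurrence of `from` with `to`, but only when `from`
// actually appears in `text`.
fn replace_if_present(text: &str, from: &str, to: &str) -> Option<String> {
    if text.contains(from) {
        // `replace` returns a new String and leaves `text` untouched.
        Some(text.replace(from, to))
    } else {
        None
    }
}

fn main() {
    let greeting = "Hello, world!";
    println!("{:?}", replace_if_present(greeting, "world", "Rust"));
    println!("{:?}", replace_if_present(greeting, "moon", "Rust"));
}
```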
|
||||
|
||||
Check the documentation for other useful methods
|