Advanced Types
The Rust type system has some features that we've mentioned so far but haven't yet discussed in detail.
We'll start by discussing newtypes in general as we examine why they are useful as types.
Then we'll move on to type aliases, a feature similar to newtypes but with slightly different semantics.
We'll also discuss the ! type and dynamically sized types.
Using the Newtype Pattern for Type Safety and Abstraction
The newtype pattern is also useful for tasks beyond those we've discussed so far, including statically enforcing that values are never confused and indicating the units of a value.
We saw an example of using newtypes to indicate units earlier: recall that the Millimeters and Meters structs wrapped u32 values in a newtype.
If we wrote a function with a parameter of type Millimeters, we couldn't compile a program that accidentally tried to call that function with a value of type Meters or a plain u32.
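The following is a minimal sketch of that guarantee.
The struct definitions mirror the Millimeters and Meters newtypes described above; the double_length function and the From conversion are hypothetical names added only for illustration.

```rust
// Hypothetical definitions that mirror the Millimeters and Meters newtypes
// described in the text; each one wraps a plain u32.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Millimeters(u32);

#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(u32);

// An explicit, opt-in conversion (an assumption added for illustration).
impl From<Meters> for Millimeters {
    fn from(m: Meters) -> Self {
        Millimeters(m.0 * 1000)
    }
}

// Hypothetical function that accepts only Millimeters.
fn double_length(len: Millimeters) -> Millimeters {
    Millimeters(len.0 * 2)
}

fn main() {
    let width = Millimeters(250);
    println!("{:?}", double_length(width));

    // Converting on purpose is fine.
    println!("{:?}", double_length(Meters(1).into()));

    // Neither of these lines compiles if uncommented, because the argument
    // is not a Millimeters value:
    // double_length(Meters(1));
    // double_length(250_u32);
}
```

The unit mix-up is rejected at compile time, and the only way to pass meters is to convert them deliberately.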
We can also use the newtype pattern to abstract away some implementation details of a type: the new type can expose a public API that is different from the API of the private inner type.
Newtypes can also hide internal implementation.
For example, we could provide a People type to wrap a HashMap<i32, String> that stores a person's ID associated with their name.
Code using People would only interact with the public API we provide, such as a method to add a name string to the People collection; that code wouldn't need to know that we assign an i32 ID to names internally.
The newtype pattern is a lightweight way to achieve encapsulation to hide implementation details, which we discussed before in Ch18.
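As a rough sketch, such a People wrapper might look like the following.
The method names (new, add, len, contains) and the ID scheme (the current map length) are assumptions made for this example, not an API defined anywhere in the text.

```rust
use std::collections::HashMap;

// The inner HashMap is a private field, so code outside this module can only
// go through the methods below.
pub struct People(HashMap<i32, String>);

impl People {
    pub fn new() -> Self {
        People(HashMap::new())
    }

    // Adds a name and assigns an ID internally. Using the current length as
    // the ID is an assumption that only holds because nothing is ever removed.
    pub fn add(&mut self, name: &str) {
        let id = self.0.len() as i32;
        self.0.insert(id, name.to_string());
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }

    pub fn contains(&self, name: &str) -> bool {
        self.0.values().any(|n| n.as_str() == name)
    }
}

fn main() {
    let mut people = People::new();
    people.add("Ferris");
    people.add("Corro");
    println!("{} people, has Ferris: {}", people.len(), people.contains("Ferris"));
}
```

Callers never see an i32 ID or the HashMap, so the storage could later change without breaking them.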
Creating Type Synonyms with Type Aliases
Rust provides the ability to declare a type alias to give an existing type another name.
For this we use the type keyword.
For example, we can create the alias Kilometers to i32 like so:
type Kilometers = i32;
Now the alias Kilometers is a synonym for i32.
Unlike the Millimeters and Meters types we created before, Kilometers is not a separate, new type.
Values that have the type Kilometers will be treated the same as values of type i32:
type Kilometers = i32;
let x: i32 = 5;
let y: Kilometers = 5;
println!("x + y = {}", x + y);
Because Kilometers and i32 are the same type, we can add values of both types and we can pass Kilometers values to functions that take i32 parameters.
However, using this method, we don't get the type-checking benefits that we get from the newtype pattern discussed earlier.
In other words, if we mix up Kilometers and i32 values somewhere, the compiler will not give us an error.
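To make that trade-off concrete, here is a small sketch in which a value that is not a distance slips through unnoticed.
The trip_total function and the passengers variable are hypothetical, chosen only to show the mix-up.

```rust
type Kilometers = i32;

// Hypothetical function: the alias documents intent but enforces nothing.
fn trip_total(leg_one: Kilometers, leg_two: Kilometers) -> Kilometers {
    leg_one + leg_two
}

fn main() {
    let distance: Kilometers = 40;
    let passengers: i32 = 3; // not a distance at all

    // This compiles without complaint, even though adding a passenger count
    // to a distance is almost certainly a bug.
    println!("{}", trip_total(distance, passengers));
}
```

With a newtype in place of the alias, the second argument would be a compile error.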
The main use for type synonyms is to reduce repetition.
For example, we might have a lengthy type like this:
Box<dyn Fn() + Send + 'static>
Writing this lengthy type in function signatures and as type annotations all over the code can be tiresome and error prone.
Imagine having a project full of code like this:
let f: Box<dyn Fn() + Send + 'static> = Box::new(|| println!("hi"));
fn takes_long_type(f: Box<dyn Fn() + Send + 'static>) {
    // --snip--
}
fn returns_long_type() -> Box<dyn Fn() + Send + 'static> {
    // --snip--
}
A type alias makes this code more manageable by reducing the repetition.
Here we have introduced an alias named Thunk for the verbose type and can replace all uses of the type with the shorter alias Thunk:
type Thunk = Box<dyn Fn() + Send + 'static>;
let f: Thunk = Box::new(|| println!("hi"));
fn takes_long_type(f: Thunk) {
    // --snip--
}
fn returns_long_type() -> Thunk {
    // --snip--
}
This is much easier to read and write.
Choosing a meaningful name for a type alias can help communicate your intent as well.
Thunk is a word for code to be evaluated at a later time, so it's an appropriate name for a closure that gets stored.
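For a version that actually runs, here is one possible sketch of storing a Thunk and evaluating it later.
The functions make_counter_thunk and run_on_thread and the shared counter are hypothetical; running the closure on another thread is just one way to show why the Send bound is part of the alias.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

type Thunk = Box<dyn Fn() + Send + 'static>;

// Hypothetical constructor: builds a thunk that bumps a shared counter
// each time it is called.
fn make_counter_thunk(counter: Arc<AtomicUsize>) -> Thunk {
    Box::new(move || {
        counter.fetch_add(1, Ordering::SeqCst);
    })
}

// Hypothetical consumer: evaluates the stored thunk on another thread,
// which the Send + 'static bounds in the alias make possible.
fn run_on_thread(f: Thunk) {
    thread::spawn(move || f()).join().expect("thunk panicked");
}

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));
    let f: Thunk = make_counter_thunk(Arc::clone(&counter));
    run_on_thread(f);
    println!("thunk ran {} time(s)", counter.load(Ordering::SeqCst));
}
```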
Type aliases are also commonly used with the Result<T, E> type to reduce repetition.
Consider the std::io module in the standard library.
I/O operations often return a Result<T, E> to handle situations when operations fail to work.
This library has a std::io::Error struct that represents all possible I/O errors.
Many of the functions in std::io return Result<T, E> where the E is std::io::Error, such as these functions in the Write trait:
use std::fmt;
use std::io::Error;
pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error>;
    fn flush(&mut self) -> Result<(), Error>;
    fn write_all(&mut self, buf: &[u8]) -> Result<(), Error>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<(), Error>;
}
The Result<..., Error> is repeated a lot.
As such, std::io has this type alias declaration:
type Result<T> = std::result::Result<T, std::io::Error>;
Because this declaration is in the std::io module, we can use the fully qualified alias std::io::Result<T>; that is, a Result<T, E> with the E filled in as std::io::Error.
The Write trait function signatures end up looking like this:
pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;
    fn write_all(&mut self, buf: &[u8]) -> Result<()>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<()>;
}
This type alias helps in two ways:
- It makes code easier to write.
- It gives us a consistent interface across all of std::io.
Because it's an alias, it's just another Result<T, E>, which means we can use any methods that work on Result<T, E> with it, as well as special syntax like the ? operator.
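A short sketch of the alias in use follows.
The write_greeting helper is hypothetical; std::io::Result, the Write trait methods, and the Write implementation for Vec<u8> are the standard library's own.

```rust
use std::io::{self, Write};

// Hypothetical helper that writes a greeting to any writer and reports
// how many bytes it wrote. io::Result<usize> is Result<usize, io::Error>.
fn write_greeting<W: Write>(out: &mut W, name: &str) -> io::Result<usize> {
    let line = format!("Hello, {}!\n", name);
    // The ? operator works on the alias exactly as on any other Result.
    out.write_all(line.as_bytes())?;
    out.flush()?;
    Ok(line.len())
}

fn main() -> io::Result<()> {
    // Vec<u8> implements Write, so it can stand in for a file or socket.
    let mut buffer: Vec<u8> = Vec::new();
    let written = write_greeting(&mut buffer, "Ferris")?;

    // Ordinary Result methods such as is_ok are available too.
    let second_ok = write_greeting(&mut buffer, "again").is_ok();
    println!("{} bytes written, second write ok: {}", written, second_ok);
    Ok(())
}
```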
The Never Type that Never Returns
Rust has a special type named ! that's known in type theory lingo as the empty type because it has no values.
We prefer to call it the never type because it stands in the place of the return type when a function will never return.
Here is an example:
fn bar() -> ! {
    // --snip--
}
This code is read as "the function bar returns never."
Functions that return never are called diverging functions.
We can't create values of the type !, so bar can never possibly return.
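As an illustration, a diverging function with a concrete body could look like this sketch.
The names fail and checked_div are hypothetical; panicking is only one way for a function never to return.

```rust
// Hypothetical diverging function: the body panics, so control never
// comes back to the caller and the return type can be !.
fn fail(reason: &str) -> ! {
    panic!("fatal error: {}", reason);
}

// Hypothetical caller: after the call to fail, the compiler knows that the
// rest of the function is only reached when b is nonzero.
fn checked_div(a: i32, b: i32) -> i32 {
    if b == 0 {
        fail("division by zero");
    }
    a / b
}

fn main() {
    println!("{}", checked_div(10, 2));
}
```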
What is the use of a type you can never create values for?
Recall the code from Ch2, part of the number-guessing game.
Here is a sample of that code:
let guess: u32 = match guess.trim().parse() {
    Ok(num) => num,
    Err(_) => continue,
};
At the time, we skipped over some details in this code.
In Ch6, we discussed that match arms must all return the same type.
So, for example, the following code doesn't compile:
let guess = match guess.trim().parse() {
    Ok(_) => 5,
    Err(_) => "hello",
};
The type of guess in this code would have to be an integer and a string, and Rust requires that guess have only one type.
So what does continue return?
How are we allowed to return a u32 from one arm and have another arm that ends with continue?
continue has a ! value.
That is, when Rust computes the type of guess, it looks at both match arms, the former with a value of u32 and the latter with a ! value.
Because ! can never have a value, Rust decides that the type of guess is u32.
The formal way to describe this behavior is that expressions of type ! can be coerced into any other type.
We're allowed to end this match arm with continue because continue doesn't return a value; instead, it moves control back to the top of the loop, so in the Err case, we never assign a value to guess.
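The same pattern can be tried without reading from standard input.
In this sketch, parse_all is a hypothetical helper that keeps the inputs that parse as u32 and skips the rest.

```rust
// Hypothetical helper: collects every input that parses as a u32.
fn parse_all(inputs: &[&str]) -> Vec<u32> {
    let mut numbers = Vec::new();
    for input in inputs {
        let n: u32 = match input.trim().parse() {
            Ok(num) => num,     // this arm has type u32
            Err(_) => continue, // this arm has type !, which coerces to u32
        };
        // Only reached when parsing succeeded.
        numbers.push(n);
    }
    numbers
}

fn main() {
    println!("{:?}", parse_all(&["7", "abc", " 42 ", "-1"]));
}
```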
The never type is useful with the panic! macro as well.
Recall the unwrap function that we call on Option<T> values to produce a value or panic; here is its definition:
impl<T> Option<T> {
    pub fn unwrap(self) -> T {
        match self {
            Some(val) => val,
            None => panic!("called `Option::unwrap()` on a `None` value"),
        }
    }
}
In this code, the same thing happens as in the match from before: Rust sees that val has the type T and panic! has the type !, so the result of the overall match expression is T.
This code works because panic! doesn't produce a value; it ends the program.
In the None case, we won't be returning a value from unwrap, so this code is valid.
One final expression that has the type ! is a loop:
print!("forever ");
loop {
    print!("and ever ");
}
Here, the loop never ends, so ! is the value of the expression.
However, this wouldn't be true if we included a break, because the loop would terminate when it got to the break.
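The contrast can be sketched as follows.
Both function names are hypothetical, and first_power_of_two_above assumes a small limit so that the doubling does not overflow.

```rust
use std::thread;
use std::time::Duration;

// No break: the loop expression has type !, so the function can be
// declared as diverging. It is defined here but never called.
#[allow(dead_code)]
fn spin_forever() -> ! {
    loop {
        thread::sleep(Duration::from_secs(1));
    }
}

// With `break n`, the loop expression has the type of n (u32) instead of !.
// Assumes limit is small enough that doubling n does not overflow.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut n: u32 = 1;
    loop {
        if n > limit {
            break n;
        }
        n *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_above(100));
}
```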
Dynamically Sized Types and the Sized Trait
Rust must know certain details about its types, such as how much space to allocate for a value of a particular type.
This leaves one corner of its type system a little confusing at first: the concept of dynamically sized types.
Sometimes referred to as DSTs or unsized types, these types let us write code using values whose size we can know only at runtime.
Let's dig into the details of a dynamically sized type called str, which we've been using throughout.
That's right: not &str, but str on its own, is a DST.
We can't know how long the string is until runtime, meaning we can't create a variable of type str, nor can we take an argument of type str.
Consider the following code, which does not compile:
let s1: str = "Hello there!";
let s2: str = "How's it going?";
Rust needs to know how much memory to allocate for any value of a particular type, and all values of a type must use the same amount of memory.
If Rust allowed us to write this code, these two str values would need to take up the same amount of memory.
But they have different lengths: s1 needs 12 bytes of storage and s2 needs 15.
This is why it's not possible to create a variable holding a dynamically sized type.
So what should we do?
In this case, we make the types of s1 and s2 a &str rather than a str.
Recall from the "String Slices" section in Ch4 that the slice data structure just stores the starting position and the length of the slice.
So, although a &T is a single value that stores the memory address of where the T is located, a &str is two values: the address of the str and its length.
As such, we can know the size of a &str value at compile time: it's twice the length of a usize.
That is, we always know the size of a &str, no matter how long the string it refers to is.
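These sizes can be checked directly with std::mem::size_of and std::mem::size_of_val.
The sketch below only uses standard library functions; the variable names are arbitrary.

```rust
use std::mem::{size_of, size_of_val};

fn main() {
    // A reference to a sized type is a single pointer.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());

    // A &str carries an address and a length, so it is two usizes wide.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    let s1: &str = "Hello there!";
    let s2: &str = "How's it going?";

    // The references have the same size...
    assert_eq!(size_of_val(&s1), size_of_val(&s2));

    // ...even though the str values they point to do not.
    assert_eq!(size_of_val(s1), 12);
    assert_eq!(size_of_val(s2), 15);

    println!("all size checks passed");
}
```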
In general, this is the way in which dynamically sized types are used in Rust: they have an extra bit of metadata that stores the size of the dynamic information.
The golden rule of dynamically sized types is that we must always put values of dynamically sized types behind a pointer of some kind.
We can combine str with all kinds of pointers: for example, Box<str> or Rc<str>.
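A brief sketch of str behind owning pointers follows.
The byte_len helper is hypothetical; the From<&str> conversions for Box<str> and Rc<str> come from the standard library.

```rust
use std::rc::Rc;

// Hypothetical helper taking a plain &str; both pointer types below can be
// borrowed as &str through deref coercion.
fn byte_len(text: &str) -> usize {
    text.len()
}

fn main() {
    // An owned, heap-allocated str with a single owner.
    let boxed: Box<str> = Box::from("Hello there!");

    // A reference-counted str that can have several owners.
    let shared: Rc<str> = Rc::from("How's it going?");
    let also_shared = Rc::clone(&shared);

    println!("{} is {} bytes", boxed, byte_len(&boxed));
    println!("{} is {} bytes", also_shared, byte_len(&shared));
}
```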
In fact, we've seen this before but with a different dynamically sized type: traits.
Every trait is a dynamically sized type we can refer to by using the name of the trait.
In Ch18 in "Using Trait Objects That Allow for Values of Different Types", we mentioned that to use traits as trait objects, we must put them behind a pointer, such as &dyn Trait or Box<dyn Trait> (Rc<dyn Trait> would work as well).
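Here is a small sketch of the three pointer kinds mentioned above.
The Shape trait, the Square type, and the total_area function are hypothetical stand-ins for any trait and implementor.

```rust
use std::rc::Rc;

// Hypothetical trait and implementor used only for this illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

// dyn Shape is unsized, so the slice stores pointers to it, not the values.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    let square = Square(2.0);
    let by_ref: &dyn Shape = &square;
    let boxed: Box<dyn Shape> = Box::new(Square(3.0));
    let counted: Rc<dyn Shape> = Rc::new(Square(1.0));

    println!("{} {} {}", by_ref.area(), boxed.area(), counted.area());
    println!("{}", total_area(&[boxed]));
}
```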
To work with DSTs, Rust provides the Sized trait to determine whether or not a type's size is known at compile time.
This trait is automatically implemented for everything whose size is known at compile time.
In addition, Rust implicitly adds a bound on Sized to every generic function.
That is, a generic function definition like this:
fn generic<T>(t: T) {
    // --snip--
}
is actually treated as though we had written this:
fn generic<T: Sized>(t: T) {
    // --snip--
}
By default, generic functions will work only on types that have a known size at compile time.
However, you can use the following special syntax to relax this restriction:
fn generic<T: ?Sized>(t: &T) {
    // --snip--
}
A trait bound on ?Sized means "T may or may not be Sized," and this notation overrides the default that generic types must have a known size at compile time.
The ?Trait syntax with this meaning is only available for Sized, not any other traits.
Also note that we switched the type of the t parameter from T to &T.
Because the type might not be Sized, we need to use it behind some kind of pointer.
In this case, we've chosen a reference.
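To see the relaxed bound at work, consider this sketch.
The describe function and its extra Display bound are assumptions added so that the example has something to print; the ?Sized syntax itself is as described above.

```rust
use std::fmt::Display;
use std::mem::size_of_val;

// Hypothetical function: because of ?Sized, T can be an unsized type such
// as str, as long as we only handle it through the reference t.
fn describe<T: ?Sized + Display>(t: &T) -> String {
    format!("{} ({} bytes)", t, size_of_val(t))
}

fn main() {
    // Here T is str, a dynamically sized type.
    println!("{}", describe("hello"));

    // Sized types still work; here T is i32.
    println!("{}", describe(&42_i32));
}
```

Without the ?Sized bound, the first call would be rejected because T would be inferred as str, which is not Sized.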