RustBrock/Macros.md
darkicewolf50 ff4c4fccee
finished ch20.5 and ch20 WHOOH
2025-04-17 16:09:28 -06:00

# Macros
The term *macro* refers to a family of features in Rust: *declarative* macros with `macro_rules!` and three kinds of *procedural* macros:
- Custom `#[derive]` macros that specify code added with the `derive` attribute used on structs and enums.
- Attribute-like macros that define custom attributes usable on any item.
- Function-like macros that look like function calls but operate on the tokens specified as their argument.

Each will be discussed, but let's first look at why we even need macros when we already have functions.
## The Difference Between Macros and Functions
Fundamentally, macros are a way of writing code that writes other code, which is known as *metaprogramming*.
Appendix C of the Rust book discusses the `derive` attribute, which generates an implementation of various traits for you.
The `println!` and `vec!` macros are further examples.
All of these macros *expand* to produce more code than the code you have written manually.
Metaprogramming is useful for reducing the amount of code you have to write and maintain, which is also one of the roles of functions.
Macros have some additional powers that functions don't have.
A function signature must declare the number and type of parameters the function has.
Macros can take a variable number of parameters.
To show this: we can call `println!("hello")` with one argument or `println!("hello {}", name)` with two arguments.
Macros are also expanded before the compiler interprets the meaning of the code, so a macro can, for example, implement a trait on a given type.
A function cannot, because it gets called at runtime and a trait needs to be implemented at compile time.
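As a minimal sketch of that last point, the declarative macro below implements a trait for several types at compile time. The `Describe` trait and the `impl_describe!` macro are hypothetical names invented for this illustration.

```rust
// Hypothetical trait, used only for this illustration.
trait Describe {
    fn describe() -> String;
}

// Generates one `impl Describe for T` block per type passed in.
macro_rules! impl_describe {
    ( $( $t:ty ),* ) => {
        $(
            impl Describe for $t {
                fn describe() -> String {
                    // `stringify!` turns the type tokens into a string literal.
                    format!("I am {}", stringify!($t))
                }
            }
        )*
    };
}

// Expands to two separate trait implementations at compile time.
impl_describe!(u8, String);
```

With this in place, `<u8 as Describe>::describe()` returns `"I am u8"`. A function could not do the same job, since trait implementations must exist before the program runs.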
The downside to implementing a macro instead of a function is that macro definitions are more complex than function definitions because you are writing Rust code that writes Rust code.
Due to this indirection, macro definitions are generally more difficult to read, understand, and maintain than function definitions.
Another important difference between macros and functions is that you must define macros or bring them into scope *before* you call them in a file.
Functions, by contrast, can be defined anywhere and called anywhere.
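The ordering rule can be sketched as follows. The names `square!`, `area_of_two_squares`, and `double` are hypothetical and exist only for this illustration.

```rust
// A `macro_rules!` macro must appear textually before its first use in the file.
// Note: `$x` is evaluated twice here, which is fine for this small sketch.
macro_rules! square {
    ( $x:expr ) => {
        $x * $x
    };
}

// Functions have no such restriction: `double` is defined further down
// but can still be called here.
fn area_of_two_squares(side: u32) -> u32 {
    double(square!(side))
}

fn double(n: u32) -> u32 {
    n * 2
}
```

Moving the `macro_rules! square` definition below `area_of_two_squares` would make the compiler report that it cannot find the macro in scope, while the position of `double` does not matter.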
## Declarative Macros with `macro_rules!` for General Metaprogramming
The most widely used form of macros in Rust is the *declarative macro*.
These are sometimes also referred to as "macros by example," "`macro_rules!` macros," or just plain "macros."
At their core, declarative macros allow you to write something similar to a Rust `match` expression.
`match` expressions are control structures that take an expression, compare the resulting value of the expression to patterns, and then run the code associated with the matching pattern.
Macros also compare a value to patterns that are associated with particular code.
In this situation, the value is the literal Rust source code passed to the macro, the patterns are compared with the structure of that source code and the code associated with each pattern, when matched, replaces the code passed to the macro.
This happens during compilation.
In order to define a macro, you use the `macro_rules!` construct.
Now let's explore how to use `macro_rules!` by looking at how the `vec!` macro is defined.
Ch8 covered how we can use the `vec!` macro to create a new vector with particular values.
For example, the following macro call creates a new vector containing three integers:
```rust
let v: Vec<u32> = vec![1, 2, 3];
```
We could also use the `vec!` macro to make a vector of two integers or a vector of five string slices.
We wouldn't be able to use a function to do the same because we wouldn't know the number or type of values up front.
Here is a slightly simplified definition of the `vec!` macro.
```rust
#[macro_export]
macro_rules! vec {
    ( $( $x:expr ),* ) => {
        {
            let mut temp_vec = Vec::new();
            $(
                temp_vec.push($x);
            )*
            temp_vec
        }
    };
}
```
Note: The actual definition of the `vec!` macro in the standard library includes code to preallocate the correct amount of memory up front.
That code is an optimization that we don't include here to make the example simpler.
The `#[macro_export]` annotation indicates that this macro should be made available whenever the crate in which the macro is defined is brought into scope.
We then start the macro definition with `macro_rules!` and the name of the macro we are defining *without* the exclamation mark.
The name here, `vec`, is followed by curly brackets denoting the body of the macro definition.
The structure in the `vec!` body is similar to the structure of a `match` expression.
Here we have one arm with the pattern `( $( $x:expr ),* )`, followed by `=>` and the block of code associated with this pattern.
If the pattern matches, the associated block of code will be emitted.
Given this is the only pattern in this macro, there is only one valid way to match; any other pattern will result in an error.
More complex macros will have more than one arm.
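As a small sketch of a macro with more than one arm, the hypothetical `max_of!` macro below (not part of the standard library) has one arm for a single expression and one recursive arm for two or more. The arms are tried in order, just like the arms of a `match`.

```rust
// Hypothetical macro for illustration: returns the largest of its arguments.
macro_rules! max_of {
    // Arm 1: a single expression is its own maximum.
    ( $x:expr ) => {
        $x
    };
    // Arm 2: compare the first expression with the maximum of the rest.
    ( $x:expr, $( $rest:expr ),+ ) => {
        {
            let a = $x;
            let b = max_of!( $( $rest ),+ );
            if a > b { a } else { b }
        }
    };
}
```

A call such as `max_of!(3, 9, 4)` fails to match the first arm because of the extra tokens after `3`, so the second arm is used and the macro recurses on `9, 4`.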
Valid pattern syntax in macro definitions is different than the pattern syntax covered in Ch19.
This is because macro patterns are matched against Rust code structure rather than values.
Now let's go over what the pieces of the pattern in the previous example mean.
The full macro pattern syntax can be seen in the [Rust Reference](https://doc.rust-lang.org/reference/macros-by-example.html).
First, we use a set of parentheses to encompass the whole pattern.
We use a dollar sign (`$`) to declare a variable in the macro system that will contain the Rust code matching the pattern.
The dollar sign makes it clear this is a macro variable as opposed to a regular Rust variable.
Next comes a set of parentheses that captures values that match the pattern within the parentheses for use in the replacement code.
Within `$()` is `$x:expr`, which matches any Rust expression and gives the expression the name `$x`.
The comma following `$()` indicates that a literal comma separator character must appear between each instance of the code that matches the code within `$()`.
The `*` specifies that the pattern matches zero or more of whatever precedes the `*`.
When we call this macro with `vec![1, 2, 3];`, the `$x` pattern matches three times with the three expressions `1`, `2`, and `3`.
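Besides `*`, the macro pattern syntax also offers `+` (one or more) and `?` (zero or one) as repetition operators. The sketch below uses both. The `total!` macro is a hypothetical name chosen for this illustration: it requires at least one argument and tolerates an optional trailing comma.

```rust
// Hypothetical macro for illustration: adds up one or more expressions.
macro_rules! total {
    // `$( $x:expr ),+` : one or more expressions separated by commas.
    // `$(,)?`          : an optional trailing comma.
    ( $( $x:expr ),+ $(,)? ) => {
        {
            let mut acc = 0;
            // This statement is emitted once per matched expression.
            $(
                acc += $x;
            )+
            acc
        }
    };
}
```

Here `total!(1, 2, 3,)` is accepted thanks to `$(,)?`, whereas `total!()` is rejected at compile time because `+` demands at least one match.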
Now let's look at the body of the code associated with this arm: `temp_vec.push()` within `$()*` is generated for each part that matches `$()` in the pattern, zero or more times depending on how many times the pattern matches.
The `$x` is replaced with each expression matched.
When we call this macro with `vec![1, 2, 3];` the code generated that replaces this macro call will be this:
```rust
{
    let mut temp_vec = Vec::new();
    temp_vec.push(1);
    temp_vec.push(2);
    temp_vec.push(3);
    temp_vec
}
```
Here we defined a macro that can take any number of arguments of any type and can generate code to create a vector containing the specified elements.
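To see that flexibility in use, the sketch below repeats the simplified definition under the hypothetical name `my_vec!` (renamed so it does not shadow the standard `vec!`) and calls it with different numbers and types of arguments. The `#[allow(unused_mut)]` attribute is an addition for this sketch; it only silences a warning in the zero-argument case.

```rust
// Same shape as the simplified `vec!` above, under a hypothetical name.
macro_rules! my_vec {
    ( $( $x:expr ),* ) => {
        {
            #[allow(unused_mut)]
            let mut temp_vec = Vec::new();
            $(
                temp_vec.push($x);
            )*
            temp_vec
        }
    };
}

// Hypothetical helper that exercises the macro with 2, 5, and 0 arguments.
fn build_examples() -> (Vec<u32>, Vec<&'static str>, Vec<f64>) {
    let two_ints: Vec<u32> = my_vec![1, 2];
    let five_strs = my_vec!["a", "b", "c", "d", "e"];
    let no_floats: Vec<f64> = my_vec![];
    (two_ints, five_strs, no_floats)
}
```

Each call expands to its own block with the matching number of `push` calls, so the element type is inferred separately per call site.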
To learn more about how to write macros, read online documentation or other resources like ["The Little Book of Rust Macros"](https://veykril.github.io/tlborm/) started by Daniel Keep and continued by Lukas Wirth.
## Procedural Macros for Generating Code from Attributes
The second form of macros is the *procedural macro*, which acts more like a function (and is a type of procedure).
Procedural macros accept some code as an input, operate on this code, and produce some code as an output rather than matching against patterns and replacing the code with other code as declarative macros do.
The three kinds of procedural macros are custom derive, attribute-like, and function-like, and all work in a similar way.
When creating procedural macros, the definitions must reside in their own crate with a special crate type.
This is for complex technical reasons that the Rust team hopes to eliminate in the future.
Here we show how to define a procedural macro, where `some_attribute` is a placeholder for using a specific macro variety.
```rust
use proc_macro::TokenStream;

// `some_attribute` is a placeholder, so this skeleton does not compile as written.
#[some_attribute]
pub fn some_name(input: TokenStream) -> TokenStream {
}
```
This function that defines a procedural macro takes `TokenStream` as an input and produces a `TokenStream` as an output.
The `TokenStream` type is defined by the `proc_macro` crate that is included with Rust and represents a sequence of tokens.
This is the core of the macro: the source code that the macro is operating on makes up the input `TokenStream` and the code the macro produces is the output `TokenStream`.
This function also has an attribute attached to it which specifies what kind of procedural macro we are creating.
We can have multiple kinds of procedural macros in the same crate.
Now we will look at the different kinds of procedural macros.
First we will start with a custom derive macro and then go on to explain the small dissimilarities that make the other forms different.
## How to Write a Custom `derive` Macro
Let's start with a crate named `hello_macro`, which defines a trait named `HelloMacro` with one associated function named `hello_macro`.
Rather than forcing our users to implement the `HelloMacro` trait for each of their types, we will provide a procedural macro so users can annotate their type with `#[derive(HelloMacro)]` to get a default implementation of the `hello_macro` function.
The default implementation will print `Hello, Macro! My name is TypeName!`, where `TypeName` is the name of the type on which this trait has been defined.
In other words, we will write a crate that enables another programmer to write code like the following using our crate.
```rust
use hello_macro::HelloMacro;
use hello_macro_derive::HelloMacro;

#[derive(HelloMacro)]
struct Pancakes;

fn main() {
    Pancakes::hello_macro();
}
```
This will print `Hello, Macro! My name is Pancakes!` when we are done.
The first step will be to make a new library crate using this:
```
$ cargo new hello_macro --lib
```
Then we will define the `HelloMacro` trait and its associated function.
```rust
pub trait HelloMacro {
    fn hello_macro();
}
```
*src/lib.rs*
We have a trait and its function.
Now at this point our crate user could implement the trait to achieve the desired functionality, like this:
```rust
use hello_macro::HelloMacro;

struct Pancakes;

impl HelloMacro for Pancakes {
    fn hello_macro() {
        println!("Hello, Macro! My name is Pancakes!");
    }
}

fn main() {
    Pancakes::hello_macro();
}
```
However, they would need to write the implementation block for each type they wanted to use with `hello_macro`.
We want to spare them from having to do this.
Additionally, we can't yet provide the `hello_macro` function with a default implementation that will print the name of the type the trait is implemented on.
Rust doesn't have reflection capabilities, so it cannot look up the type's name at runtime.
Instead we need a macro to generate code at compile time.
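To illustrate what "generating code at compile time" buys us, the sketch below uses a declarative macro as a rough stand-in. It is not the derive macro built in this section. The `impl_hello_macro!` macro and the extra `greeting` function are hypothetical additions for this sketch; `greeting` exists only so that the generated text can be inspected.

```rust
// Same idea as the `HelloMacro` trait above, plus a hypothetical
// `greeting` function that returns the text instead of printing it.
pub trait HelloMacro {
    fn greeting() -> String;

    fn hello_macro() {
        println!("{}", Self::greeting());
    }
}

// Hypothetical declarative stand-in for the derive macro: the type name
// is captured as tokens and turned into a string literal at compile time.
macro_rules! impl_hello_macro {
    ( $name:ident ) => {
        impl HelloMacro for $name {
            fn greeting() -> String {
                format!("Hello, Macro! My name is {}!", stringify!($name))
            }
        }
    };
}

struct Pancakes;
impl_hello_macro!(Pancakes);
```

The type name never has to be looked up at runtime: the macro sees the identifier `Pancakes` in the source code and bakes it into the generated implementation. The derive macro developed below does the same job, but is triggered by `#[derive(HelloMacro)]` instead of an explicit macro call.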
Next is to define the procedural macro.
At the time of writing this, procedural macros need to be in their own crate.
The convention for structuring crates and macro crates is as follows.
For a crate named `foo`, a custom derive procedural macro crate is called `foo_derive`.
Now let's start a new crate called `hello_macro_derive` inside our `hello_macro` project.
```
$ cargo new hello_macro_derive --lib
```
The two crates are tightly related, so we create the procedural macro crate within the directory of our `hello_macro` crate.
If we change the trait definition in `hello_macro`, we will have to change the implementation of the procedural macro in `hello_macro_derive` as well.
The two crates will need to be published separately, and programmers using these crates will need to add both as dependencies and bring them both into scope.
We instead could have the `hello_macro` crate use `hello_macro_derive` as a dependency and re-export the procedural macro code.
The way we have structured the project makes it possible for programmers to use `hello_macro` even if they don't want the `derive` functionality.
We need to declare the `hello_macro_derive` crate as a procedural macro crate.
We also need functionality from the `syn` and `quote` crates, so we need to add them as dependencies.
Add this to the *Cargo.toml* file for `hello_macro_derive`.
```toml
[lib]
proc-macro = true
[dependencies]
syn = "2.0"
quote = "1.0"
```
To start defining the procedural macro, place the following code into the *src/lib.rs* file of the `hello_macro_derive` crate.
Note that this code will not compile unless we add a definition for the `impl_hello_macro` function.
```rust
use proc_macro::TokenStream;
use quote::quote;

#[proc_macro_derive(HelloMacro)]
pub fn hello_macro_derive(input: TokenStream) -> TokenStream {
    // Construct a representation of Rust code as a syntax tree
    // that we can manipulate
    let ast = syn::parse(input).unwrap();

    // Build the trait implementation
    impl_hello_macro(&ast)
}
```
*hello_macro_derive/src/lib.rs*
Note that we have split the code into the `hello_macro_derive` function, which is responsible for parsing the `TokenStream` and the `impl_hello_macro` function, which is responsible for transforming the syntax tree.
This makes writing a procedural macro more convenient.
The code in the outer function (`hello_macro_derive` in this case) will be the same for almost every procedural macro crate you see or create.
The code specified in the body of the inner function (`impl_hello_macro` in this case) will be different depending on your procedural macro's purpose.
We have introduced three new crates: `proc_macro`, [`syn`](https://crates.io/crates/syn), and [`quote`](https://crates.io/crates/quote).
The `proc_macro` crate comes with Rust, so we don't need to add that to the dependencies in *Cargo.toml*.
The `proc_macro` crate is the compiler's API that allows us to read and manipulate Rust code from our code.
The `syn` crate parses Rust code from a string into a data structure that we can perform operations on.
The `quote` crate turns `syn` data structures back into Rust code.
These crates make it much simpler to parse any sort of Rust code we might want to handle: writing a full parser for Rust code is no simple task.
The `hello_macro_derive` function will be called when a user of our library specifies `#[derive(HelloMacro)]` on a type.
This is possible because we have annotated the `hello_macro_derive` function here with `proc_macro_derive` and specified the name `HelloMacro`.
This matches our trait name, which is the convention most procedural macros follow.
The `hello_macro_derive` function first converts the `input` from a `TokenStream` to a data structure that we can then interpret and perform operations on.
This is where `syn` comes into action.
The `parse` function in `syn` takes a `TokenStream` and returns a `DeriveInput` struct representing the parsed Rust code.
The following shows the relevant parts of the `DeriveInput` struct we get from parsing the `struct Pancakes;` string.
```rust
DeriveInput {
    // --snip--

    ident: Ident {
        ident: "Pancakes",
        span: #0 bytes(95..103)
    },
    data: Struct(
        DataStruct {
            struct_token: Struct,
            fields: Unit,
            semi_token: Some(
                Semi
            )
        }
    )
}
```
The fields of this struct show that the Rust code we have parsed is a unit struct with the `ident` (*identifier*, meaning the name) of `Pancakes`.
There are more fields on this struct for describing all kinds of Rust code.
Check the [`syn` documentation for `DeriveInput`](https://docs.rs/syn/2.0/syn/struct.DeriveInput.html) for more info.
Soon the `impl_hello_macro` function will be defined, which is where we will build the new Rust code we want to include.
Before we do, note that the output for our derive macro is also a `TokenStream`.
The returned `TokenStream` is added to the code that our crate users write, so when they compile their own crate, they will get the extra functionality that we will provide in the modified `TokenStream`.
You may have noticed that we are calling `unwrap` to cause the `hello_macro_derive` function to panic if the call to the `syn::parse` function fails.
It is necessary for our procedural macro to panic on errors because `proc_macro_derive` functions must return `TokenStream` rather than `Result` to conform to the procedural macro API.
Here we have simplified this example by using `unwrap`; in production you should provide more specific error messages about what went wrong by using `panic!` or `expect`.
Now that we have the code to turn the annotated Rust code from a `TokenStream` into a `DeriveInput` instance, let's generate the code that implements the `HelloMacro` trait on the annotated type.
This is shown here.
```rust
fn impl_hello_macro(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    // Named `generated` because `gen` is a reserved keyword in the 2024 edition.
    let generated = quote! {
        impl HelloMacro for #name {
            fn hello_macro() {
                println!("Hello, Macro! My name is {}!", stringify!(#name));
            }
        }
    };
    generated.into()
}
```
We get an `Ident` struct instance containing the name (identifier) of the annotated type using `ast.ident`.
The `DeriveInput` struct shown earlier indicates that when we run the `impl_hello_macro` function on the `Pancakes` code from before, the `ident` we get will have the `ident` field with a value of `"Pancakes"`.
The `name` variable here will contain an `Ident` struct instance that, when printed, will be the string `"Pancakes"`, the name of the struct from the earlier example.
The `quote!` macro lets us define the Rust code that we want to return.
The compiler expects something different to the direct result of the `quote!` macro's execution, so we need to convert it to a `TokenStream`.
We do this by calling the `into` method, which consumes this intermediate representation and returns a value of the required `TokenStream` type.
The `quote!` macro also provides some very interesting templating mechanics.
We can enter `#name` and `quote!` will replace it with the value in the variable `name`.
You can even do some repetition similar to the way regular macros work.
Check the [`quote` crate's docs](https://docs.rs/quote) for a thorough introduction.
We want our procedural macro to generate an implementation of our `HelloMacro` trait for the type the user annotated, which we can get by using `#name`.
The trait implementation has the one function `hello_macro`, whose body contains the functionality we want to provide: printing `Hello, Macro! My name is` and then the name of the annotated type.
The `stringify!` macro used here is built into Rust.
This takes a Rust expression, such as `1 + 2`, and at compile time turns the expression into a string literal, such as `"1 + 2"`.
This is different from `format!` or `println!`, macros that evaluate the expression and then turn the result into a `String`.
It is possible that the `#name` input may be an expression to print literally, so we use `stringify!`.
Using `stringify!` also saves an allocation by converting `#name` to a string literal at compile time.
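The difference between the two can be sketched in a few lines. The `compare` function is a hypothetical helper written for this illustration.

```rust
// Hypothetical helper contrasting `stringify!` with `format!`.
fn compare() -> (&'static str, String) {
    // `stringify!` never evaluates its input; it yields the source tokens
    // as a string literal at compile time.
    let literal: &'static str = stringify!(1 + 2);

    // `format!` evaluates the expression and formats the result
    // into a newly allocated `String`.
    let evaluated: String = format!("{}", 1 + 2);

    (literal, evaluated)
}
```

Here `literal` holds the text `1 + 2`, while `evaluated` holds `3`. In the derive macro, `stringify!(#name)` likewise produces the type's name as a literal without any runtime work.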
Now at this point, `cargo build` should complete successfully in both `hello_macro` and `hello_macro_derive`.
Now we will hook these crates to the code from before to see the procedural macro in action.
Create a new binary project in your *projects* directory using `cargo new pancakes`.
We need to add `hello_macro` and `hello_macro_derive` as dependencies in the `pancakes` crate's *Cargo.toml*.
If you are publishing your versions of `hello_macro` and `hello_macro_derive` to [crates.io](https://crates.io), they would be regular dependencies.
If not, you can specify them as `path` dependencies as follows:
```toml
hello_macro = { path = "../hello_macro" }
hello_macro_derive = { path = "../hello_macro/hello_macro_derive" }
```
Now you can put the `Pancakes` code from the start of this section into *src/main.rs* and run `cargo run`.
It should print `Hello, Macro! My name is Pancakes!`.
The implementation of the `HelloMacro` trait from the procedural macro was included without the `pancakes` crate needing to implement it.
The `#[derive(HelloMacro)]` added the trait implementation.
## Attribute-like macros
These kinds of macros are similar to custom derive macros, but instead of generating code for the `derive` attribute, they allow you to create new attributes.
They are also more flexible: `derive` only works for structs and enums.
Attributes can be applied to other items as well, such as functions.
Here is an example of using an attribute-like macro: say you have an attribute named `route` that annotates functions when using a web application framework.
```rust
#[route(GET, "/")]
fn index() {
    // --snip--
}
```
This `#[route]` attribute would be defined by the framework as a procedural macro.
The signature of the macro definition function would look like this:
```rust
#[proc_macro_attribute]
pub fn route(attr: TokenStream, item: TokenStream) -> TokenStream {
    // --snip--
}
```
Here, we have two parameters of type `TokenStream`.
The first is for the contents of the attribute: the `GET, "/"` part.
The second is the body of the item the attribute is attached to: here it is `fn index() {}` and the rest of the function's body.
Other than that, attribute-like macros work the same way as custom derive macros.
You create a crate with the `proc-macro` crate type and implement a function that generates the code you want.
## Function-like macros
Function-like macros define macros that look like function calls.
Similarly to `macro_rules!` macros, they are more flexible than functions; for example, they can take an unknown number of arguments.
However, `macro_rules!` macros can be defined only using the match-like syntax discussed earlier in the ["Declarative Macros with `macro_rules!` for General Metaprogramming"](./Macros.md#declarative-macros-with-macro_rules-for-general-metaprogramming) section.
Function-like macros take a `TokenStream` parameter, and their definition manipulates that `TokenStream` using Rust code as the other two types of procedural macros do.
An example of a function-like macro is an `sql!` macro that might be called like this:
```rust
let sql = sql!(SELECT * FROM posts WHERE id=1);
```
This macro would parse the SQL statement inside it and check that it is syntactically correct.
This is much more complex processing than a `macro_rules!` macro can do.
The `sql!` macro would be defined like this:
```rust
#[proc_macro]
pub fn sql(input: TokenStream) -> TokenStream {
    // --snip--
}
```
The definition is similar to the custom derive macro's signature: we receive the tokens that are inside the parentheses and return the code we wanted to generate.