Language Core¶
This very long page will go over the whole language specification explaining each part in a short form. As already said, for more detailed information read the documents already mentioned at the start of the Rust part.
Local Variables¶
They can be defined using let
which will immediately allocate memory but they have to be initialized with a value first to be used. Both can also be done in one step:
let x = 5;
Reserved names
Rust has some keywords which are not possible to be used as variables. If you want to use such, use a raw variable like fn r#match ...
starting with r#
on definition and every call. This is mainly used while transferring files to newer editions. It is always better to use another name.
Constants¶
All hard coded values, which won't change in any circumstance, should be declared as constant. They are defined using const
, have to be initialized on definition and can't be changed later.
const NUMBER: u32 = 5;
Mutability¶
To ensure memory safety all variables are immutable by default. So you have to specify if you want to change them later. To set a variable as mutable precede it with the mut
keyword.
let x = 5; // immutable variable
let mut x = 5; // mutable variable
Shadowing¶
You can declare a new variable with the same name as a previous variable. Meaning the first variable is shadowed by the second so that the second variable’s value is what appears when the variable is used. You do so by redefining the variable again.
This is often used to change the type of the value but reuse the same name which is not possible with mutable variables.
Scope¶
The scope is the range within the program for which the variable is valid. It is valid from it's definition till the end of scope. The scope of the variable is the block in which it is defined.
A new scope can be created by using curly braces also directly within the function.
Copying¶
Data on the stack (primary data types) are copied by value, others on the heap are moved, if you assign one variable to another. That means a moved variable can't be used any longer.
let a = 1;
let b = a; // copy data on the stack
let s1 = String::from("hello");
let s2 = s1; // move data to s2, s1 can no longer be used
let s3 = s2.clone(); // clone data on the heap
To really copy heap data you need to clone
them.
The same goes for the copying/moving variables into functions as parameters or by returning them. So they are no longer valid in the parent function after the call is made.
Ownership¶
Rust’s central feature is ownership with the following rules:
- Each value in Rust has a variable that’s called its owner.
- There can only be one owner at a time.
- When the owner goes out of scope, the value will be dropped.
- The owner may borrow the variable to another function.
Like shown above ownership is managed by copying/moving variables between functions.
Alternatively the variable owned in the parent function can be borrowed to sub functions as reference indicated by &
. The owner stays and can further work with the variable. Keep in mind that the reference also have to be mutable if it's value should be changeable.
fn main() {
let s1 = String::from("hello");
let len = calculate_length(&s1);
println!("The length of '{}' is {}.", s1, len);
}
fn calculate_length(s: &String) -> usize {
s.len()
}
You can have only one mutable reference to a particular piece of data in a particular scope. If you need more you have to create separate scopes to end validity for one mutable reference before the other is created. The same goes for mixed mutable and immutable references to prevent you from references which change while using them.
References can't be borrowed out of the definition scope.
References¶
Referencing is done by &
before the variable and dereferencing with *
before the variable containing a reference.
Primary Data Types¶
Rust is a statically typed language, which means that it must know the types of all variables at compile time. The compiler can usually infer what type we want to use based on the value and how we use it. But if not we have to annotate it.
let value: u32 = "42".parse().expect("Not a number!");
Integer¶
Length | Type | Min | Max |
---|---|---|---|
8-bit | i8 | -128 | 127 |
8-bit | u8 | 0 | 255 |
16-bit | i16 | -32768 | 32767 |
16-bit | u16 | 0 | 65535 |
32-bit | i32 | -2147483648 | 2147483647 |
32-bit | u32 | 0 | 4294967295 |
64-bit | i64 | -9223372036854775808 | 9223372036854775807 |
64-bit | u64 | 0 | 18446744073709551615 |
arch | isize | ? | ? |
arch | usize | 0 | ? |
The isize
and usize
types depend on the kind of computer your program is running on: 64 bits if you’re on a 64-bit architecture and 32 bits if you’re on a 32-bit architecture.
Integer values can be written as:
- Decimal:
98_222
- Hex:
0xff
- Octal:
0o77
- Binary:
0b1111_0000
- Byte (u8 only):
b'A'
Integer types default is i32
- this type is generally the fastest, even on 64-bit systems.
If you set a variable you can define the type by also adding the type to the numeric value like 5u8
or with an optional underscore for better readability 5_u32
.
Float¶
Rust’s floating-point types are f32
and f64
, which are 32 bits and 64 bits in size, respectively. The default type is f64
because on modern CPUs it’s roughly the same speed as f32
but is capable of more precision.
Like for integers the _
may be used as visual separator and the type may be appended like 13_f32
.
Boolean¶
Boolean type in Rust are defined as bool
and have two possible values: true
and false
.
Character¶
Rust’s char
type is specified with single quotes and represents a Unicode character.
Tuple¶
A tuple
is an ordered list of multiple other types. To access parts of it you can destructure it or directly access sub parts.
// define
let coordinate: (i32, i32) = (25, 40);
// destructure
let (x, y) = coordinate;
println!("The value of y is: {}", y);
// direct access
println!("The value of y is: {}", coordinate.1);
As far as they contain only primary data types they are completely stored on the stack.
Array¶
The array
gives you a list of values, all with the same type. Arrays in Rust have a fixed length at compile time and cannot grow or shrink.
let a = [1, 2, 3, 4, 5];
let second = a[1]; // access specific element
As far as they contain primary data types they are completely stored on the stack.
You can define the type with the number of elements in square brackets.
let a: [i32; 5] = [1, 2, 3, 4, 5];
Also you can initialize the array with the same value in each element:
let a = [3; 5]; // [3, 3, 3, 3, 3]
Slices¶
Slices let you reference a contiguous sequence of elements in a collection like array or string (characters) rather than the whole collection.
A string slice is a reference to part of a String which is created from a String with a range that begins at start and continues up to, but not including the end position, both are optional. The type is written as &str
.
let s = String::from("hello world");
let hello = &s[0..5];
String literals are also slices pointing to a position in the binary.
Slices are also possible on arrays:
let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];
Here the type is &[i32]
.
Literals¶
Numeric literals can be type annotated by adding the type as a suffix. As an example, to specify that the literal 42
should have the type i32
, write 42i32
.
String literals are string slices pointing to the area in the binary: let s = "hello"
is the same type &str
like a literal from a String
: let s = &my_string[..]
.
Custom Types¶
Structs¶
Structs are one possibility to create objects. They are similar to tuples but with named elements, so that it is clear, what each element is. As a result the order of the elements isn't of matter.
struct User {
name: String,
email: String,
online: bool,
}
let mut user1 = User {
email: String::from("someone@example.com"),
name: String::from("someone"),
online: true,
};
user1.email = String::from("anotheremail@example.com");
// if variable name and struct field is the same the creation can be simplified
fn build_user(email: String, name: String) -> User {
User {
email,
name,
online: true,
}
}
To create a new instance partly of the old one the ..
before the variable name specifies that the undefined fields should have the same field as the specified object.
let user2 = User {
email: String::from("another@example.com"),
username: String::from("anotherusername567"),
..user1
};
Also you can define structs like tuples without naming the data pieces.
struct Color(i32, i32, i32);
let black = Color(0, 0, 0);
And at last unit like structs may be created to be used if you need a trait without data on the element.
struct UnitTest;
let test = UnitTest;
Enum¶
The enum
can be used to present different exclusive states which may also contain an associated value.
enum IpAddrKind { V4, V6 }
enum IpAddr {
V4(u8, u8, u8, u8),
V6(String),
}
let localhost = IpAddr::V4(127, 0, 0, 1);
As shown the second example defines an enum with assigned values. Similar to struct an enum may contain methods.
Rust didn't have the null
value but the Option
enum will be used. It is so common that you may use its values Some
and None
directly without the Option::
prefix.
Methods¶
Methods are functions contained within a struct or enum, which are bound to the struct object data.
An implementation block impl
holds method functions for a struct. The method always has a reference to the struct object itself as &self
as first parameter:
struct Rectangle {
width: u32,
height: u32,
}
impl Rectangle {
fn area(&self) -> u32 {
self.width * self.height
}
}
fn main() {
let rect1 = Rectangle { width: 30, height: 50 };
let area = rect1.area();
}
The self reference can also be made mutable. By calling such methods Rust will do automatic dereferencing so there is no need to make references in the calling. As with functions additional parameters are possible.
Associated Functions¶
This is like static methods in object oriented languages. They don't get a self reference and are called using ::
#[derive(Debug)]
struct Rectangle {
width: u32,
height: u32,
}
impl Rectangle {
fn square(size: u32) -> Rectangle {
Rectangle { width: size, height: size }
}
fn area(&self) -> u32 {
self.width * self.height
}
}
fn main() {
let sq = Rectangle::square(3);
let area = sq.area();
}
You can also put the methods and associated functions in multiple implementation blocks.
Working with Types¶
Type Aliasing¶
The main use of aliases is to reduce boilerplate; for example the IoResult<T>
type is an alias for the Result<T, IoError>
type. This is done by defining the type like: type NanoSecond = u64;
Type Conversion¶
Explicit type conversion (casting) can be performed using the as
keyword:
let decimal = 65.4321_f32;
let integer = decimal as u8;
let character = integer as char;
Conversions between custom types are implemented using the traits From
and Into
or for Strings ToString
and FromString
. They are used through the from()
and parse()
functions like: String::from(my_str)
or "10".parse::<i32>().unwrap()
Functions¶
Functions will be declared using fn
before it's name. If parameters are possible, they need to be defined with their types.
Within the function you may use statements and expressions. While expressions evaluate to an value, statements only perform some actions. Because assignments are statements, something like x = y = 6
is not allowed. While function calls are expressions, macros are statements and each expression ending with a ;
is also turned into a statement.
Return values are declared with an arrow ->
before the function body:
fn add(a: i32, b: i32) -> i32 {
a + b
}
// calling it using
let res = add(5, 7);
You can return early from a function by using the return keyword and specifying a value, but most functions return the last expression implicitly. That's why no ;
is set after the addition expression to keep it an expression and return it's result.
Closures¶
Sometimes you need anonymous functions given as a closure to another function or variable. They can capture values from the defining scope. The arguments here are given between pipes |err|
, instead of round brackets used in normal functions, before a code block:
let add = |a, b| {
let x = add(5, 7);
x
});
// calling it using
let res = add(5, 7);
Closures are usually short and relevant only within a narrow context they mostly don't need type annotations. It can also consist of a single expression without the curly braces:
let add_one_v4 = |x| x + 1 ;
Each variable in the definition scope of the closure is directly accessible:
let x = 4;
let equal_to_x = |z| z == x;
let res = equal_to_x(4); // that's true
Using the move
keyword before the parameters, the closure will take ownership of the used variables.
A struct can be made to hold the closure and it's result value. This allows to call it multiple times but only execute it once. When taking a closure as an input parameter, the closure's complete type must be annotated using one of a few traits. In order of decreasing restriction, they are:
Fn
: the closure captures by reference (&T
)FnMut
: the closure captures by mutable reference (&mut T
)FnOnce
: the closure captures by value (T
)
See the description of structs and traits later in this book. See the Caching using Struct as an example of how this is used.
Diverging functions¶
Functions which will never return are called diverging functions. They are marked using !, which is an empty type.
fn foo() -> ! {
panic!("This call never returns.");
}
Macros¶
Macros are used for meta programming where the macro is expanded with its code implementation before compiling. With Macros you write Rust code which produce other Rust code.
Declarative Macros¶
This is the most used form. They compare the given value to patterns and then run the associated code to generate rust code out of it.
As an example the Vector macro is used as:
let v: Vec<u32> = vec![1, 2, 3];
And the implemention looks like:
#[macro_export]
macro_rules! vec {
( $( $x:expr ),* ) => {
{
let mut temp_vec = Vec::new();
$(
temp_vec.push($x);
)*
temp_vec
}
};
}
Valid pattern syntax in macro definitions is different than the pattern syntax because macro patterns are matched against Rust code structure rather than values.
The generated code will be:
{
let mut temp_vec = Vec::new();
temp_vec.push(1);
temp_vec.push(2);
temp_vec.push(3);
temp_vec
}
Procedural Macros¶
Procedural macros accept some code as an input, operate on that code, and produce some code as an output rather than matching against patterns. The three kinds of procedural macros (custom derive, attribute-like, and function-like) all work in a similar fashion.
use proc_macro;
#[some_attribute]
pub fn some_name(input: TokenStream) -> TokenStream {
}
The function that defines a procedural macro takes a TokenStream as an input and produces a TokenStream as an output. It represents a sequence of tokens which can be changed into an output TokenStream.
Comments¶
As in most C-like languages you can have line and block comments:
// this is a line comment which goes to the end of this line
/* this is a block comment ending on the first closing characters */
Additionally there are document comments to add text to the API documentation.
/// Generate library docs for the following item.
//! Generate library docs for the enclosing item.
Documentation Comments¶
Documentation comments use three slashes, ///
, instead of two and support Markdown notation for formatting the text. They have to be placed just before the item they’re documenting.
The documentation should explain the element. But don't describe the concrete API because this is added automatically by Rust. Useful sections may be:
Examples
- short code parts to explain typical usePanics
- scenarios in which the function could panicErrors
- the kinds of errors that might occur within theResult
Safety
- for unsafe functions there should be an explanation why the function is unsafe and covering the invariants that the function expects callers to uphold
Running cargo test
will run the code examples in your documentation as tests this will keep your examples always functional and up to date.
To document the file itself you can use the alternative marker //!
which is used to document the element this comment is contained in.
Attention: Documentation in main.rs
will not be exported. Only files which could be loaded as module can be documented, so put the main documentation in lib.rs
.
Control Flow¶
if¶
An if
expression allows you to branch your code depending on conditions. If the condition is met the following block is executed but if not the optional else
block is executed. The result of the condition has to be a bool
.
let number = 3;
if number < 5 {
println!("a small number");
} else if number < 10 == 0 {
println!("a little bigger number");
} else {
println!("big number");
}
As shown multiple alternative conditions may be joined. But if you need multiple alternatives consider using match
.
if
is an expression so it can be used in a statement to set the returning value of the evaluated block to a variable. This means the values that have the potential to be results from each arm of the if
must be the same type.
let condition = true;
let number = if condition {
5
} else {
6
};
loop¶
The loop
keyword tells Rust to execute a block of code over and over again forever or until you explicitly tell it to stop using break
. With continue
the current loop iteration is stopped and the next iteration will follow.
Also the loop can be named to break or continue out of a specific loop:
fn main() {
'outer: loop {
println!("Entered the outer loop");
'inner: loop {
println!("Entered the inner loop");
break 'outer;
}
println!("This point will never be reached");
}
println!("Exited the outer loop");
}
You have to use a label like 'outer:
before the loop statement and also behind the break
or continue
,
Also you might return a value to the rest of the code by putting it after the break
, and it will be returned by the loop expression.
while¶
If you want to check a condition to decide if the loop should continue you may use the while
loop. Before each round of the loop the condition is evaluated and while it is true
the loop goes on.
let mut x = 5;
while x < 10000 {
println!("{}", x);
x = x * x
}
Alternatively you can do the same using loop
and an if
check with break
.
for¶
To iterate a defined number of times use for n in 1..101
to go from 1 to 100. To also include the second value use for n in 1..=100
which is the same.
A for
loop is also used to execute some code for each item in a collection:
for element in a.iter() {
println!("the value is: {}", element);
}
Alternatively the following iterators may be used in collections:
iter()
- borrows each element of the collection through each iteration, leaving the collection untouched and available for reuse after the loopinto_iter()
- consumes the collection and moves each element into the loop, they are no longer available in the collectioniter_mut()
- mutably borrow each element of the collection, allowing for the collection to be modified in place
Pattern Matching¶
Patterns are a special syntax in Rust for matching against the structure of types, both complex and simple.
Within the patterns you may use:
- value - only this concrete value
1 | 2
- multiple matches are possible with or1 ... 5
- ranges(a, b)
- destructuring vector{ x: a, y: b }
- destructuring structs{ x, y }
- destructuring structs, using property name as variable names_
- is used to ignore values..
- ignoring remaining or previous partsref
orref mut
- creating a reference@
- create variable (before) and also test it (after the @)
Match¶
The match
command allows you to compare a value against a series of patterns and then execute code based on which pattern matches.
match x {
None => None,
Some(i) => Some(i + 1),
}
All possible values have to be checked. But you can use the _
placeholder as pattern to match the rest of the possibilities.
If it gets more complex use match guards with an additional if
:
let num = Some(4);
match num {
Some(x) if x < 5 => println!("less than five: {}", x),
Some(x) => println!("{}", x),
None => (),
}
If Let¶
The if let
syntax lets you write a singular pattern match in an easier form:
if let Some(3) = some_u8_value {
println!("three");
}
Here the pattern comes before the equal sign and the value behind. As with a normal if you can also use an else part which is executed if the pattern doesn't match. It is used like if
and can also be combined with it using else if let
and else
.
While let¶
Similar to the if let
the while let
will run a loop while the match succeeds.
Error Handling¶
Rust knows two types of errors:
- unrecoverable errors - will force the code to stop
- recoverable errors - will be handled within the code
Use the unrecoverable errors sparsely. If a library use a recoverable error the caller always has the option to make it unrecoverable but not the other way around.
Unrecoverable errors¶
Then some bad situations occur which should not be there the panic!
macro will print a failure message, unwind and clean up the stack and quit the program.
The debug build will have included symbols, so that also a backtrace is possible which shows you the files and lines which brought you to the problem. If the program is called with the RUST_BACKTRACE=1
environment setting this backtrace will be displayed.
Use such errors in prototyping to be later replaced or in tests. Or use it than it can't really fail.
Recoverable errors¶
Functions which may have an error are returning a Result
enum which is defined as having two variants, Ok
with an associated value and Err
with an associated message.
This can be checked like:
let f = File::open("hello.txt");
let f = match f {
Ok(file) => file,
Err(error) => {
panic!("There was a problem opening the file: {:?}", error)
},
};
At first this is an recoverable error in the file::open
method but then it was decided to make it in this situation unrecoverable using the panic!
macro.
The type of error
within the Err
handler here is io::Error
, which is a struct provided by the standard library. This struct has a method kind
that holds an io::ErrorKind
value. The enum io::ErrorKind
is also from the standard library and has variants representing the different kinds of errors that might result from an IO operation. The relevant variant in this example may be ErrorKind::NotFound
, which indicates the file doesn’t exist yet. It can be checked like:
if error.kind() == ErrorKind::NotFound { ... }
The Result<T, E>
has some useful shortcut helpers:
unwrap
- ifOK
return the value and ifErr
then callpanic!
unwrap_or_else
- likeunwrap
but onErr
call the code in the given closureexpect
- do the same but use the given error message in panic message
Propagation¶
This will let the error bubble up through it's call stack. The following example will check for errors and immediately propagate them:
let mut f = match f {
Ok(file) => file,
Err(e) => return Err(e),
};
As a shortcut here the ?
operator may be used:
let mut f = File::open("hello.txt")?; // immediately return on Result::Err with it
File::open("hello.txt")?.read_to_string(&mut s)?; // also possibly in between calls
This will also automatically convert the error type to the function's defined returning type. But it can only be used in functions returning a Result
.
Generics¶
Generics are abstract stand-ins to allow different concrete types to be used. Therefore the generics has to be defined in the function signature:
fn largest<T>(list: &[T]) -> T { ... }
This defines that the same type which is given ad reference will be returned. So this can be used with ì32
or float
...
Generics can also be used in struct
or enum
like:
enum Result<T, E> {
Ok(T),
Err(E),
}
On implementations you have to define the generics directly after imp<T> enum<T>{ ... }
.
The performance of generics is not slower because the compiler will transform the generics to different concrete functions which are used.
WHERE clauses¶
A bound can also be expressed using a where clause immediately before the opening {
, rather than at the type's first mention.
Instead of:
fn apply<F: FnOnce()>(f: F) {
f();
}
It can be written as:
fn apply<F>(f: F)
where F: FnOnce()
{
f();
}
Traits¶
Traits are similar to interfaces. They define behavior for multiple types. A trait
contains the function signatures which has to be met:
pub trait Summary {
fn summarize(&self) -> String;
}
This can be used to ensure that types marked with this trait are only allowed on types with the defined signature. To implement a trait on a type:
impl Summary for NewsArticle {
fn summarize(&self) -> String {
format!("{}, by {} ({})", self.headline, self.author, self.location)
}
}
Traits can only be implemented on local types. If a trait should have a default behavior this is defined as functions with body to the trait definition.
Boundaries¶
A trait can be used to limit generics to only some types.
pub fn notify<T: Summary>(item: T) {
println!("Breaking news! {}", item.summarize());
}
Multiple trait bounds can also be set like T: Summary + Display
but to make this better readable it can also put as where
condition before the function body:
fn some_function<T, U>(t: T, u: U) -> i32
where T: Display + Clone,
U: Clone + Debug
{
You can also use traits to conditional implement methods:
impl<T: Display + PartialOrd> Pair<T> {
fn cmp_display(&self) { ... }
}
Or directly implement a trait on types matching another trait:
impl<T: Display> ToString for T { ... }
Phantom Types¶
A phantom type parameter is one that doesn't show up at runtime, but is checked statically (and only) at compile time. In combination with generics this can help to ensure type safety.
Lifetime¶
References have lifetimes, which are the scope for which this references are valid and help to prevent dangling references. The default lifetime can be adjusted using annotations.
Problems occur if the reference of an inner variable is borrowed to an outer variable with longer lifetime. To solve this references are annotated with relations to lifetime spaces:
&i32 // a reference
&'a i32 // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime
Each lifetime for it's own didn't give any help but the relation of different references with the same lifetime defined in the function signature:
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str { .. }
This defines that all references in the parameters and the return value must have the same lifetime. Because we’ve annotated the returned reference with the same lifetime parameter `a
, the returned reference will also be valid for the length of the smaller of the lifetimes ofx
andy
.
Alternatively you can return an owned data type rather than a reference so the calling function is then responsible for cleaning up the value.
Lifetimes are also possible in method definitions or in a struct
that holds references But you won't need to add lifetime annotations anywhere because the compiler will add them automatically for known patterns.
Static¶
The 'static
lifetime is special, it keeps the references for the whole lifetime of the program. This is used on string literals which are stored in the binary.
Iterator¶
You can create an iterator by calling iter
, into_iter
, or iter_mut
on a vector. You can create iterators from the other collection types in the standard library, such as hash map. You can also create iterators that do anything you want by implementing the Iterator trait on your own types.
On iterators you can also use high level operations like:
filter
- to only select elements where the given closure is truemap
- call a closure on each element and replace it with the closure`s resultzip
- combine two iterators together as tuplescollect
- return a vector of valuescount
- count the number of elements
And some more.
Own Iterator¶
Each iterator has to implement the next
method:
impl Iterator for Counter {
type Item = u32;
fn next(&mut self) -> Option<Self::Item> {
self.count += 1;
if self.count < 6 {
Some(self.count)
} else {
None
}
}
}
Attributes¶
An attribute is metadata applied to some module, crate or item. This metadata can be used to/for:
- conditional compilation of code
- set crate name, version and type (binary or library)
- disable lints (warnings)
- enable compiler features (macros, glob imports, etc.)
- link to a foreign library
- mark functions as unit tests
- mark functions that will be part of a benchmark
They are written as #[attribute]
if they apply for the current module or following item and #![attribute]
if they apply for the whole crate. Just like with doc comments. They also may get values...
Compiler Warnings¶
#[allow(dead_code)]
- to don't warn about unused functions
Conditional Compiling¶
This allows to compile parts only on specific settings like OS:
#[cfg(target_os = "linux")]
fn are_you_on_linux() {
println!("You are running linux!");
}
#[cfg(not(target_os = "linux"))]
fn are_you_on_linux() {
println!("You are *not* running linux!");
}
The same may be achieved using the cfg!
macro.
The condition may be build using nested parts using the following helper:
any
like#[cfg(any(unix, windows))]
all
like#[cfg(all(unix, target_pointer_width = "32"))]
not
like#[cfg(not(foo))]
You can also set another attribute based on a cfg
variable with #[cfg_attr(a, b)]
. This will set attribute b
, but only if attribute a
is set.
Custom Conditions¶
Using Cargo, they get set in the [features]
section of your Cargo.toml:
[features]
# no features by default
default = []
# Add feature "foo" here, then you can use it.
# Our "foo" feature depends on nothing else.
foo = []
#[cfg(feature = "foo")]
mod foo {
}
Compile this using cargo build
, no additional flag will be send to the rustc
compiler (default). But using cargo build --features "foo"
it will send the foo
flag to rustc
and the output will have the mod foo
in it.