Wonderfull Rust

The Type System

Where Rust's Type System Stands

Rust's type system is a fusion of the theoretical foundations of ML-family languages (OCaml, Haskell) and the practical needs of C++ systems programming. It combines static typing, type inference, parametric polymorphism (generics), and ad hoc polymorphism (traits), plus the unique concepts of ownership and lifetimes expressed at the type level.

Hindley-Milner Type Inference

Rust's type inference is based on Hindley-Milner (HM) type inference. HM type inference is an algorithm widely used in ML-family languages that minimizes type annotations in programs while determining the type of every expression at compile time.

fn main() {
    let x = 42;                    // x: inferred as i32
    let y = 3.14;                  // y: inferred as f64
    let v = vec![1, 2, 3];         // v: inferred as Vec<i32>
    let s = String::from("hello"); // s: inferred as String

    // Types are inferred backwards from usage
    let mut nums = Vec::new();     // At this point, Vec<_> (type undetermined)
    nums.push(42);                 // <- Here, it's determined to be Vec<i32>
}

The last example is particularly important. At the point Vec::new() is called, the element type is unknown, but the subsequent push(42) causes i32 to be inferred backwards as the element type. This is due to HM type inference's constraint-based unification.

Local Type Inference

Rust's type inference is in principle performed locally. Function signatures (argument types and return types) must always be explicit, and type inference is confined within function bodies:

// Type annotations are required in function signatures
fn add(a: i32, b: i32) -> i32 {
    a + b  // Types within the body are inferred
}

This differs from Haskell, which can infer top-level types too. Rust chose local type inference for these reasons:

  1. Compile speed: Type checking can be parallelized per function
  2. Readability: Function interfaces are always explicit; you can understand types by looking at the signature
  3. Error message quality: The limited scope of inference makes it easier to pinpoint the cause of type errors

Turbofish Syntax

When types can't be determined by inference alone, you need to specify them explicitly:

// When type inference alone can't determine the type
let x = "42".parse::<i32>().unwrap();   // turbofish ::<>
let y: i32 = "42".parse().unwrap();     // type annotation on the variable

// collect often requires type annotation
let v: Vec<i32> = (0..10).collect();
let v = (0..10).collect::<Vec<i32>>();

The ::<Type> syntax is called the turbofish and is used to explicitly specify type parameters of generic methods.

Generics

Parametric Polymorphism

Generics (parametric polymorphism) is a mechanism for abstracting over types as parameters:

// T is a type parameter representing any type
fn first<T>(items: &[T]) -> Option<&T> {
    items.first()
}

struct Pair<A, B> {
    first: A,
    second: B,
}

enum Either<L, R> {
    Left(L),
    Right(R),
}

Trait Bounds

You can constrain type parameters to require that "this type must implement this trait":

// T must implement Display
fn print_twice<T: std::fmt::Display>(value: T) {
    println!("{}", value);
    println!("{}", value);
}

// Multiple trait bounds
fn process<T: Clone + std::fmt::Debug + PartialEq>(value: T) {
    let cloned = value.clone();
    println!("{:?}", cloned);
}

// More readable with where clauses
fn complex_function<T, U>(t: T, u: U) -> String
where
    T: Display + Clone,
    U: Debug + Into<String>,
{
    format!("{}: {:?}", t, u)
}

Trait bounds greatly enhance the expressiveness of generics. By expressing "any type with specific capabilities" rather than "any type at all," you can define type-safe yet generic functions and data structures.

impl Trait

impl Trait is used as syntactic sugar for generics, or to hide return types:

// Argument position: syntactic sugar for generics
fn print_item(item: &impl Display) {
    println!("{}", item);
}
// The above is equivalent to:
fn print_item_expanded<T: Display>(item: &T) {
    println!("{}", item);
}

// Return position: hides the concrete type
fn make_greeting(name: &str) -> impl Display {
    format!("Hello, {}!", name)
}

impl Trait in return position is especially important. You can expose "some type that implements Display" without revealing the concrete type (here, String). This means internal implementation changes won't break API compatibility.

Associated Types

Traits can have associated types instead of type parameters:

trait Iterator {
    type Item;  // Associated type

    fn next(&mut self) -> Option<Self::Item>;
}

The distinction between associated types and generic type parameters is important:

  • Associated types: When the implementor determines the type uniquely. There's only one implementation of the trait for a given type
  • Type parameters: When there can be multiple implementations for the same type
// Associated type: Iterator impl for Vec<T> is uniquely Item = T
impl<T> Iterator for VecIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> { /* ... */ }
}

// Type parameter: Can implement From multiple times for the same type
impl From<i32> for MyType { /* ... */ }
impl From<String> for MyType { /* ... */ }

GATs (Generic Associated Types)

GATs (stabilized in Rust 1.65) let associated types have generic parameters:

trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

To understand the problem GATs solve, consider what happened without them:

// Without GATs: couldn't write an iterator that returns references to itself
trait StreamingIterator {
    type Item;  // <- Can't have a lifetime parameter

    // Can't express that Item borrows from &self
    fn next(&mut self) -> Option<Self::Item>;
}

// With GATs: the lifetime is parameterized
trait LendingIterator {
    type Item<'a> where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

struct WindowIter<'s> {
    data: &'s [i32],
    pos: usize,
}

impl<'s> LendingIterator for WindowIter<'s> {
    type Item<'a> = &'a [i32] where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<&'a [i32]> {
        if self.pos + 2 <= self.data.len() {
            let window = &self.data[self.pos..self.pos + 2];
            self.pos += 1;
            Some(window)
        } else {
            None
        }
    }
}

GATs make it possible to express "associated types with different lifetimes per method call." This serves as the foundation for many advanced patterns including streaming iterators, async traits, and collection abstractions.

Type Parameters in GATs

GATs can have type parameters, not just lifetimes:

trait Container {
    type Item<T>;

    fn wrap<T>(&self, value: T) -> Self::Item<T>;
}

Type-Level Programming

Rust's type system lets you represent information at the type level and verify it at compile time:

The Newtype Pattern

struct Meters(f64);
struct Kilometers(f64);

// Confusing Meters and Kilometers causes a compile error
fn distance_in_meters(d: Meters) -> f64 {
    d.0
}

// distance_in_meters(Kilometers(5.0));  // Compile error!

Even though both are f64, semantically different values can be distinguished at the type level. Runtime cost is zero (monomorphization erases the Newtype wrapper).

Type-Level Markers with PhantomData

use std::marker::PhantomData;

struct Authenticated;
struct Unauthenticated;

struct Session<State> {
    token: String,
    _state: PhantomData<State>,
}

impl Session<Unauthenticated> {
    fn login(token: String) -> Session<Authenticated> {
        Session {
            token,
            _state: PhantomData,
        }
    }
}

impl Session<Authenticated> {
    fn get_secret_data(&self) -> &str {
        "secret"
    }
}

// Accessing secret data from an unauthenticated session is impossible at the type level

PhantomData has zero size at runtime, but represents state at the type level through type parameters. This technique, called the Typestate Pattern, makes it possible to "eliminate invalid states at compile time."

Summary

Rust's type system achieves its power through the combination of these characteristics:

Feature Description
HM type inference Minimizes type annotations within function bodies
Generics Type-safe generic programming
Trait bounds Constrain capabilities of type parameters
Associated types Unique determination of type parameters
GATs Genericization of associated types
Monomorphization Zero-cost execution of generics
Newtype Semantic distinction at the type level

These features are powerful individually, but combined they achieve Rust's design goals: "eliminate invalid states at the type level," "catch bugs at compile time," and "abstract with zero runtime cost."

Back to book