Wonderfull Rust

Memory-Level Programming

Safe Rust and Its Limits

All of the safety guarantees we've seen so far -- ownership, the borrow checker, lifetimes -- exist within the realm of "safe Rust." However, as a systems programming language, Rust anticipates situations where you need to step beyond this safe boundary.

You want to access a memory-mapped hardware register directly. You want to make an OS system call. You want to interface with a C library. You want to implement a lock-free data structure. None of these can be achieved with "safe Rust" alone.

unsafe Rust

The unsafe keyword marks code in which the programmer, not the compiler, takes responsibility for upholding certain safety guarantees.

The unsafe keyword unlocks exactly 5 additional operations:

  1. Dereferencing raw pointers (*const T / *mut T)
  2. Calling unsafe functions
  3. Implementing unsafe traits (written as unsafe impl rather than inside a block)
  4. Accessing mutable static variables (static mut)
  5. Reading union fields

fn main() {
    let mut num = 5;

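    // Creating raw pointers is safe (dereferencing is the unsafe part).
    // The raw borrow operators (Rust 1.82+) create the pointers directly,
    // without first creating a shared and a mutable reference to num.
    let r1 = &raw const num;
    let r2 = &raw mut num;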

    unsafe {
        // Dereferencing raw pointers is only possible inside unsafe blocks
        println!("r1: {}", *r1);
        println!("r2: {}", *r2);
    }
}

Important: unsafe doesn't mean "anything goes." The borrow checker, type checking, and other safety checks remain active inside unsafe blocks. unsafe only lifts the restrictions on the 5 operations above.
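
The example above exercises only the first operation. As a minimal sketch of two of the others, the following implements an unsafe trait and reads a union field. ZeroValid, zeroed_value, IntOrFloat, and float_bits are hypothetical names chosen for illustration; none of them is a standard library item.

```rust
/// Hypothetical marker trait: an implementor promises that the all-zero
/// bit pattern is a valid value of the type.
unsafe trait ZeroValid {}

// `unsafe impl` is where the programmer vouches for that promise.
unsafe impl ZeroValid for u32 {}

/// Safe to call, because the trait bound carries the guarantee.
fn zeroed_value<T: ZeroValid>() -> T {
    // SAFETY: every `ZeroValid` type accepts an all-zero bit pattern.
    unsafe { std::mem::zeroed() }
}

/// Two views of the same four bytes.
#[repr(C)]
union IntOrFloat {
    int: u32,
    float: f32,
}

/// Returns the raw bit pattern of `x` by reading the other union field.
fn float_bits(x: f32) -> u32 {
    let value = IntOrFloat { float: x };
    // SAFETY: both fields are four bytes wide and every bit pattern is a valid u32.
    unsafe { value.int }
}
```

float_bits(1.0) returns the same value as the safe 1.0f32.to_bits(), which is what real code should use. The union appears here only to show why reading a field needs unsafe: the compiler cannot know which field was written last.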

unsafe Functions

// This function requires the caller to guarantee safety
unsafe fn dangerous() {
    // ...
}

fn main() {
    unsafe {
        dangerous();
    }
}

unsafe fn indicates "there are preconditions for using this function correctly." Meeting those preconditions is the caller's responsibility.
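
By convention, those preconditions are written down in a "# Safety" section of the function's documentation. The following is a small sketch of that convention; read_unchecked and first_or_zero are hypothetical helpers, not standard library functions.

```rust
/// Reads the element at `index` without a bounds check.
///
/// # Safety
///
/// `index` must be less than `values.len()`.
unsafe fn read_unchecked(values: &[i32], index: usize) -> i32 {
    // SAFETY: the caller guarantees that `index` is in bounds.
    unsafe { *values.as_ptr().add(index) }
}

/// A safe caller: it establishes the precondition before the call.
fn first_or_zero(values: &[i32]) -> i32 {
    if values.is_empty() {
        0
    } else {
        // SAFETY: the slice is non-empty, so index 0 is in bounds.
        unsafe { read_unchecked(values, 0) }
    }
}
```

The caller checks the precondition once and records why the call is sound in a SAFETY comment next to the unsafe block.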

Safe Abstractions

The most important use of unsafe is as the internal implementation of a safe API:

// Safe public API
pub fn split_at_mut(values: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = values.len();
    assert!(mid <= len);

    let ptr = values.as_mut_ptr();

    // Uses unsafe internally, but the API itself is safe
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

This function splits one mutable slice into two non-overlapping mutable slices. The borrow checker doesn't allow "creating two mutable references from one mutable reference," but when the programmer can guarantee the two slices don't overlap, unsafe makes this possible.

The public API is safe, and the unsafe details are hidden internally. Safe Rust is built on top of this pattern. The standard library's Vec, String, HashMap -- all of them use unsafe internally, but their public APIs are safe.
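
From the caller's side, nothing about such an API looks unsafe. As a short sketch, the function below uses the standard library's own split_at_mut method on slices; add_second_half_into_first is a hypothetical name for illustration.

```rust
/// Adds each element of the second half of `values` onto the matching
/// element of the first half, using two live mutable sub-slices.
fn add_second_half_into_first(values: &mut [i32]) {
    let mid = values.len() / 2;
    // Two non-overlapping mutable slices, obtained without writing `unsafe`.
    let (left, right) = values.split_at_mut(mid);
    for (l, r) in left.iter_mut().zip(right.iter()) {
        *l += *r;
    }
}
```

Both halves are borrowed mutably at the same time, and the borrow checker accepts it because the function signature ties each returned slice to the original borrow.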

FFI (Foreign Function Interface)

Rust provides first-class support for interoperability with C:

Calling C Functions

extern "C" {
    fn abs(input: i32) -> i32;
    fn strlen(s: *const std::ffi::c_char) -> usize;
}

fn main() {
    unsafe {
        println!("abs(-5) = {}", abs(-5));
    }
}

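Functions declared in an extern "C" block are called using C's ABI (Application Binary Interface). Calling them is unsafe by default, because the Rust compiler cannot verify what the external code does. In the 2024 edition the block itself must be written as unsafe extern "C", and individual declarations inside it can then be marked safe.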
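
Passing a string to C takes one more step, because C expects a NUL-terminated buffer. The sketch below wraps strlen in a safe function using CString; c_strlen is a hypothetical name. The block is written as unsafe extern "C", the form accepted since Rust 1.82; on older toolchains a plain extern "C" block would be used instead.

```rust
use std::ffi::{c_char, CString};

unsafe extern "C" {
    // Declaration of the C library's strlen.
    fn strlen(s: *const c_char) -> usize;
}

/// Returns the length C sees for `text`, or None if `text` contains an
/// interior NUL byte and so cannot be represented as a C string.
fn c_strlen(text: &str) -> Option<usize> {
    // CString appends the terminating NUL that C expects.
    let c_text = CString::new(text).ok()?;
    // SAFETY: `c_text` is a valid NUL-terminated string that stays alive
    // for the duration of the call.
    Some(unsafe { strlen(c_text.as_ptr()) })
}
```

Keeping the CString in a local variable matters: the pointer returned by as_ptr is only valid while that value is alive.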

Exposing Rust Functions to C

#[no_mangle]
pub extern "C" fn rust_function(x: i32) -> i32 {
    x * 2
}

#[no_mangle] disables Rust's symbol name mangling, allowing C to link to the function by the name rust_function. extern "C" selects the C calling convention. In the 2024 edition the attribute is written #[unsafe(no_mangle)].
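
Exported functions often receive raw pointers from the C side. The following sketch shows one way to handle a pointer-plus-length pair; rust_sum_i32 is a hypothetical function, and the attribute uses the #[unsafe(no_mangle)] spelling, which assumes Rust 1.82 or later.

```rust
/// Sums `len` integers starting at `values`. Returns 0 for a null pointer.
///
/// # Safety
///
/// If `values` is non-null, it must point to `len` readable, properly
/// aligned `i32` values.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn rust_sum_i32(values: *const i32, len: usize) -> i32 {
    if values.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: the caller guarantees `values` points to `len` valid i32s.
    let slice = unsafe { std::slice::from_raw_parts(values, len) };
    // Wrapping addition avoids a panic crossing the FFI boundary on overflow.
    slice.iter().fold(0i32, |acc, v| acc.wrapping_add(*v))
}
```

On the C side this would be declared as something like int32_t rust_sum_i32(const int32_t *values, size_t len). The function is marked unsafe because Rust cannot check the pointer the caller supplies.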

#![no_std]: A World Without the Standard Library

Rust's standard library (std) depends on OS features (heap memory allocation, file I/O, networking, threads, etc.). In environments where no OS exists (embedded systems, kernel development, etc.), std can't be used.

The #![no_std] attribute removes the dependency on the standard library:

#![no_std]

// The core library is still available (basic features independent of the OS)
use core::fmt;

// The alloc library is available if an allocator exists
// extern crate alloc;
// use alloc::vec::Vec;

Rust's library hierarchy:

Library   Dependencies   Provides
core      None           Primitive types, basic traits, Option, Result
alloc     Allocator      Vec, String, Box, Arc
std       OS             File I/O, networking, threads, standard I/O

core depends on neither an OS nor an allocator, so it remains usable even in bare-metal environments. A few items, such as some atomic types, depend on what the target hardware supports.
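
To give a feel for what core alone can do, here is a minimal sketch of text formatting into a fixed-size buffer with no heap allocation. FixedBuf and format_reading are hypothetical names. The code touches only core paths, so it should also compile in a #![no_std] crate; the attribute is left out so the snippet can be tried on a normal host.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-capacity text buffer that never allocates.
pub struct FixedBuf<const N: usize> {
    bytes: [u8; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    pub const fn new() -> Self {
        Self { bytes: [0; N], len: 0 }
    }

    pub fn as_str(&self) -> &str {
        // Only whole &str values are ever appended, so the bytes are valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl<const N: usize> Write for FixedBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > N {
            // Not enough room: report an error instead of truncating.
            return Err(fmt::Error);
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats a sensor-style reading such as "ch3=1024" into the buffer.
pub fn format_reading<const N: usize>(
    buf: &mut FixedBuf<N>,
    channel: u8,
    value: u16,
) -> fmt::Result {
    write!(buf, "ch{}={}", channel, value)
}
```

The write! macro and the fmt::Write trait both live in core, which is why formatted output is possible even where String is not.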

Bare-Metal Programming

Running Rust in an OS-free environment requires some additional setup:

#![no_std]
#![no_main]

use core::panic::PanicInfo;

// Panic handler (must be defined manually, since std normally provides it)
#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

// Entry point (no runtime calls main here; the linker or linker script names this symbol)
#[no_mangle]
pub extern "C" fn _start() -> ! {
    // The program starts here
    loop {}
}

  • #![no_std]: Don't use the standard library
  • #![no_main]: Don't use Rust's normal entry point (fn main())
  • #[panic_handler]: Define panic handling yourself (std, which normally provides it, is absent)
  • _start: The entry symbol the linker looks for (a linker script can choose another name)

Embedded Programming Example

Code example for an ARM Cortex-M microcontroller:

#![no_std]
#![no_main]

use cortex_m_rt::entry;
use panic_halt as _;

#[entry]
fn main() -> ! {
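    // GPIO pin control, timer setup, etc.
    // Sketch only: the device module depends on the chip feature enabled for
    // the stm32f4 crate (stm32f401 is assumed here), the register accessor
    // syntax varies between PAC versions, and clock and pin-mode setup are omitted.
    let peripherals = stm32f4::stm32f401::Peripherals::take().unwrap();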

    // Blink an LED
    loop {
        // LED ON
        peripherals.GPIOA.odr.modify(|_, w| w.odr5().set_bit());
        cortex_m::asm::delay(8_000_000);

        // LED OFF
        peripherals.GPIOA.odr.modify(|_, w| w.odr5().clear_bit());
        cortex_m::asm::delay(8_000_000);
    }
}

Rust's type safety and ownership system work just as well in bare-metal environments. "Peripheral acquisition happens only once" (singleton pattern) is enforced at the type level, and register access bit fields can be manipulated in a type-safe manner.
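
Underneath such an API, register access comes down to volatile reads and writes at a fixed address. The following is a simplified sketch of that idea, not the code a real peripheral access crate generates; Register and its methods are hypothetical names.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Hypothetical wrapper around one 32-bit memory-mapped register.
pub struct Register {
    addr: *mut u32,
}

impl Register {
    /// # Safety
    ///
    /// `addr` must be valid and aligned for volatile `u32` reads and writes
    /// for as long as the returned value is used.
    pub unsafe fn new(addr: *mut u32) -> Self {
        Self { addr }
    }

    pub fn read(&self) -> u32 {
        // SAFETY: validity of `addr` was promised to `new`.
        // Volatile access keeps the compiler from eliding or merging the read.
        unsafe { read_volatile(self.addr) }
    }

    pub fn write(&mut self, value: u32) {
        // SAFETY: validity of `addr` was promised to `new`.
        unsafe { write_volatile(self.addr, value) }
    }

    /// Read-modify-write that sets one bit.
    pub fn set_bit(&mut self, bit: u32) {
        assert!(bit < 32);
        let value = self.read();
        self.write(value | (1 << bit));
    }

    /// Read-modify-write that clears one bit.
    pub fn clear_bit(&mut self, bit: u32) {
        assert!(bit < 32);
        let value = self.read();
        self.write(value & !(1 << bit));
    }
}
```

On real hardware the address would come from the chip's reference manual. For experimenting on a host machine, the wrapper can point at an ordinary u32 variable instead. The only unsafe obligation sits in new, and every other method is safe to call.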

Direct Memory Layout Control

The repr Attribute

// C-compatible memory layout
#[repr(C)]
struct CStruct {
    a: u8,
    b: u32,
    c: u8,
}

// Store the enum discriminant as a specific integer type (here u8)
#[repr(u8)]
enum Color {
    Red = 0,
    Green = 1,
    Blue = 2,
}

// No padding (used for network protocol packet definitions, etc.)
// repr(C) is added so that the field order is also guaranteed
#[repr(C, packed)]
struct Packet {
    header: u8,
    length: u16,
    data: u32,
}

// Transparent representation (guarantees the same layout and ABI as the single field, for the Newtype pattern)
#[repr(transparent)]
struct Wrapper(u32);
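
The effect of these attributes can be observed with size_of and align_of. The sketch below redeclares three of the structs, with Packet assumed to be #[repr(C, packed)]; layout_of is a hypothetical helper. The numbers in the comments assume a typical target where u32 is 4-byte aligned.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
struct CStruct {
    a: u8,  // offset 0, followed by 3 bytes of padding
    b: u32, // offset 4
    c: u8,  // offset 8, followed by 3 bytes of tail padding
}

#[allow(dead_code)]
#[repr(C, packed)]
struct Packet {
    header: u8,  // offset 0
    length: u16, // offset 1 (unaligned)
    data: u32,   // offset 3 (unaligned)
}

#[allow(dead_code)]
#[repr(transparent)]
struct Wrapper(u32);

/// Returns (size, alignment) in bytes for a type.
fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

// layout_of::<CStruct>() is (12, 4): padding keeps `b` aligned.
// layout_of::<Packet>()  is (7, 1):  no padding at all.
// layout_of::<Wrapper>() equals layout_of::<u32>().
```

Because packed fields may be unaligned, Rust does not allow taking an ordinary reference to them; they have to be copied out by value instead.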

Pointer Arithmetic

fn main() {
    let data = [10u8, 20, 30, 40, 50];
    let ptr = data.as_ptr();

    unsafe {
        // Access memory directly through pointer arithmetic
        for i in 0..data.len() {
            let value = *ptr.add(i);
            println!("data[{}] = {}", i, value);
        }
    }
}
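
One detail worth spelling out is that ptr.add(i) advances by i elements, not i bytes. The following sketch makes that visible and also walks a slice with a moving pointer; byte_offset_of and sum_via_pointers are hypothetical helpers.

```rust
/// Returns how many bytes `ptr.add(index)` lies beyond the start of the slice.
fn byte_offset_of<T>(values: &[T], index: usize) -> usize {
    assert!(index <= values.len());
    let start = values.as_ptr();
    // SAFETY: index <= len, so the result is inside the slice or one past its end.
    let element = unsafe { start.add(index) };
    element as usize - start as usize
}

/// Sums a slice by moving a raw pointer from the start to one past the end.
fn sum_via_pointers(values: &[u32]) -> u32 {
    let mut cursor = values.as_ptr();
    // SAFETY: computing the one-past-the-end pointer of a slice is allowed.
    let end = unsafe { cursor.add(values.len()) };
    let mut total = 0u32;
    while cursor != end {
        // SAFETY: `cursor` is strictly before `end`, so it points at an element.
        unsafe {
            total = total.wrapping_add(*cursor);
            cursor = cursor.add(1);
        }
    }
    total
}
```

For a slice of u32, index 3 is 12 bytes from the start, while for a slice of u8 it is 3 bytes. In ordinary code, slice indexing or iterators express the same loops without unsafe.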

Why Rust Is Chosen

In the world of bare-metal and systems programming, C and C++ have traditionally been dominant. Rust is being chosen in this space for the following reasons:

  1. Both safety and control: Use unsafe locally, while the majority of code benefits from safe Rust
  2. Zero-cost abstractions: High-level constructs (iterators, generics, traits) are usable even on bare metal
  3. Resource management through ownership: Safe memory management (RAII) without GC
  4. The Cargo ecosystem: Embedded crates (HAL, PAC, BSP) are managed through Cargo
  5. Cross-compilation: As seen in the Cross-Building chapter, adding targets is easy

The Linux kernel's adoption of Rust as its second language, with initial support merged in version 6.1 in late 2022, is the most symbolic example of these characteristics being proven in practice.
