Part 1: Bootup to LED
Introduction
This post explains how to get a minimal application booting on a Teensy 3.2. Further blog posts will explain how to access the various hardware features of this chip. The goal of this tutorial series is to explore building safe hardware abstractions in Rust. As such, existing libraries for embedded devices will not be used.
The Teensy family is a set of inexpensive embedded development boards, originally designed to be programmed using the Arduino environment. The Teensy 3.2 that we’ll be targeting is based on a Freescale (NXP) MK20DX256 ARM Cortex-M4 microcontroller.
The Teensy 3.x boards use microcontrollers in the same family. Much of what’s in this series will be applicable to all of them, and you can probably follow along if you’re willing to read the datasheet for your board. I’ll try to call out where things might be different for other chips, but you’ll have the best luck using the Teensy 3.1 or 3.2 that the tutorials are written for.
This tutorial is written mostly for Linux, specifically Arch. You may have to adjust commands for other OSes, or even for other Linux distros. If anything is broken for you, please feel free to file an issue on GitHub.
Target Audience
This series of posts is aimed at someone who has done “lightweight” embedded development on an Arduino (or similar). It does not rely on you having existing knowledge of how microcontrollers are programmed at the “low level”. You will be expected to already have a basic knowledge of Rust, although I’ll cover some language details when we get into places where embedded work differs from desktop development.
A Short Introduction to Embedded Programming
Unlike with typical desktop or server applications, embedded programs do not have an operating system to provide them with hardware control. Instead, they must access the hardware directly. The exact process for hardware control varies depending on the type of processor in use. For the ARM microcontroller that we’re using, we access the hardware through memory mapped registers.
Memory mapping is assigning a special memory address which, when read from or written to, interacts with a hardware device instead of RAM. For example, address 0x4006A007 is the UART Data Register. Writing a byte to this address will cause that data to be sent across the serial port.
Writing to arbitrary memory addresses requires unsafe Rust. One of our goals through this series will be to use Rust’s language features to create safe interfaces for these unsafe memory accesses.
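As a minimal sketch of what such an access looks like, here are two small helpers built on the volatile pointer functions from the standard library. The names write_register and read_register are made up for this illustration. The sketch uses std::ptr so that it can be tried on a desktop machine; a no_std build would reach the same functions through core::ptr. Pointing the helpers at 0x4006A007 only makes sense on the microcontroller itself, so any desktop experiment should aim them at an ordinary variable instead.

```rust
use std::ptr;

/// Hypothetical helper: write one byte to a memory-mapped register.
///
/// # Safety
/// `addr` must be valid for a one-byte write. On the Teensy this could be a
/// register address such as 0x4006A007; on a desktop it has to point at
/// ordinary memory.
pub unsafe fn write_register(addr: *mut u8, value: u8) {
    // Volatile: the compiler may not remove or reorder this write, even
    // though nothing in the program ever reads the value back.
    unsafe { ptr::write_volatile(addr, value) }
}

/// Hypothetical helper: read one byte from a memory-mapped register.
///
/// # Safety
/// `addr` must be valid for a one-byte read.
pub unsafe fn read_register(addr: *const u8) -> u8 {
    unsafe { ptr::read_volatile(addr) }
}
```

Both helpers are unsafe to call because the compiler has no way to check that the address is meaningful. That is exactly the kind of obligation we will try to hide behind safe interfaces.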
Development Environment
Currently, embedded development requires the use of nightly Rust to be practical. While many things can now be done with stable Rust, we will still need a nightly version to access some specific hardware instructions. We’ll use Rustup to install nightly Rust.
We need to add the appropriate stdlib for the architecture we’re targeting. For the Teensy 3.2, this is thumbv7em-none-eabi. This provides the core crate that our embedded application will be linked against.
Modern nightly versions of Rust provide lld, the LLVM linker. However, we still require binutils in order to convert our binary to a format which can be loaded onto the Teensy. For Arch, we install binutils like so:
Finally, you’ll want to get the Teensy Loader. This is a small command line tool that handles flashing a program to the Teensy. If you are on Linux, it may also be available through your package manager.
Code Overview
For this first post, we’ll be focused on the bootup procedure of the MK20DX256. We’ll start by building up the skeleton of an embedded application. Next, we’ll handle some basic hardware initialization tasks. Lastly, we will add some code to turn on the Teensy’s LED. This will let us see that our code is executing on the device.
Bootup Sequence
The MK20DX256 starts up by loading an initial stack pointer and reset vector from the beginning of flash memory. The reset vector is the equivalent of main in a normal desktop application - it is the first bit of our code that will execute.
Once our main function has control, it will have to perform some basic hardware setup - disabling the watchdog and enabling the clock gate for any peripherals that the application needs.
The watchdog is a piece of hardware which will reset the microcontroller unless the running application “checks in” in a certain interval. It’s designed to restart crashed or hung programs. For our needs in this tutorial it just adds complexity, so we will disable it.
The other part of hardware initialization is clock gating. This term comes from implementation details of how microcontrollers are constructed. You should think of a clock gate as an on/off switch for a piece of functionality. As we progress, we will need to enable the clocks for a number of hardware features.
Application Setup
We’ll start by creating a new application with cargo, and setting it to use nightly Rust.
The first thing to do is make our program embedded-friendly. There are a few major changes to src/main.rs that we’ll need to make. Here’s the new code, with explanations below:
The first line enables the use of intrinsics, and is the reason we need nightly Rust. The next two lines actually disable features of the Rust environment - the standard library, and the main wrapper. The Rust standard library relies on a full operating system, and can’t typically be used for embedded development. Instead, we will have access to libcore, which is the subset of std that is available without an OS. Similarly, the main wrapper is used for application setup tasks that aren’t necessary in embedded programs.
Lastly, we’ve marked main as an extern function, and added an infinite loop to it. Extern tells the Rust compiler that this function follows the C calling convention. The details of what this does vary by target, and are beyond the scope of this post. The important effect of the change is that it’s now safe to use main as our reset vector. Adding the infinite loop ensures that main will never return. There’s no code for main to return to in this embedded environment.
Language Items
The Rust compiler relies on certain functionality to be defined by the standard library. Unfortunately for us, we just disabled it. This means that we are responsible for providing these features.
For now, the only language feature we’re responsible for is the panic handler. This is the function that gets called to display a message when our code panics. We will eventually want to pass these messages along to the user, but initially we will ignore them and hang the program.
Static Data
There are two arrays of data that the hardware expects. The first is the interrupt table. This contains the initial stack pointer and reset vector that were mentioned earlier. The second is the flash configuration. This is a block of 16 bytes which control how the flash can be read and written. The Teensy bootloader makes assumptions about these values, so we will use the same set of bytes as the Teensy Arduino tooling. Specifically, we disable all flash security through the FSEC field, and tell the processor to boot into high-power mode with FOPT.
We will use the link_section attributes in a minute to control where in the flash memory these arrays end up. The no_mangle attribute is needed to tell Rust that these arrays have special meaning at link time. Without it, the data will not appear in our final executable.
_stack_top is not really a function. It is a memory address representing the initial stack pointer. We pretend that it is a function so that our _VECTORS array is easier to write. Fortunately, calling it from our own code is unsafe, so we can be pretty sure that only the hardware will read these values.
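To make the flash configuration a little more concrete, here is a hedged sketch of the 16-byte block together with two checks on the FSEC and FOPT bytes. The byte values (0xDE and 0xF9), their positions in the block, and the meaning of the individual bits are assumptions based on my reading of the Teensy Arduino core and the K20 reference manual. Verify them against the manual before relying on them. The helper names are invented for this illustration, and the link_section and no_mangle attributes are left out so the sketch can be tried on a desktop machine.

```rust
/// The 16-byte flash configuration field, using the assumed Teensy values.
pub const FLASH_CONFIG: [u8; 16] = [
    0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, // backdoor comparison key
    0xFF, 0xFF, 0xFF, 0xFF, // program flash protection
    0xDE, // FSEC
    0xF9, // FOPT
    0xFF, // EEPROM protection
    0xFF, // data flash protection
];

/// Assumed positions of the FSEC and FOPT bytes within the field.
pub const FSEC_INDEX: usize = 12;
pub const FOPT_INDEX: usize = 13;

/// Assumption: the low two bits of FSEC (the SEC field) read 0b10 when the
/// chip is unsecured.
pub fn is_unsecured(fsec: u8) -> bool {
    (fsec & 0b11) == 0b10
}

/// Assumption: bit 0 of FOPT (LPBOOT) is 1 for the normal, high-power boot.
pub fn boots_high_power(fopt: u8) -> bool {
    (fopt & 0b1) == 0b1
}
```

Note that an erased byte (0xFF) in the FSEC position would not count as unsecured under these assumptions, which is one reason the exact values matter.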
Compiling and Linking
Our program now contains the important data tables, as well as a main that can be called by the microcontroller. We will now turn our attention to building the project for the Teensy. We’ll use a Makefile to handle the build process. Laying out the code and data in the Teensy’s flash memory is done with a linker script.
Linker Scripts
The linker script includes information on available memory regions, and how program code and data are organized within those regions. Our linker script will start out very simply, as we don’t have a lot going on in our program. We’ll put our linker script in a new file called layout.ld:
MEMORY
{
    FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 256K
    RAM (rwx) : ORIGIN = 0x1FFF8000, LENGTH = 64K
}

SECTIONS
{
    .text : {
        . = 0;
        KEEP(*(.vectors))
        . = 0x400;
        KEEP(*(.flashconfig*))
        . = ALIGN(4);
        *(.text*)
    } > FLASH = 0xFF

    .rodata : {
        *(.rodata*)
    } > FLASH

    _stack_top = ORIGIN(RAM) + LENGTH(RAM);

    /DISCARD/ : {
        *(.ARM.*)
    }
}
We begin by specifying the address ranges of both FLASH and RAM. Next we lay out our program code at the beginning of the flash memory. We start with the interrupt table in the first 1024 bytes. After that comes the flash configuration at address 0x0400. The rest of flash can be laid out however we want. For now, we add our program code and read-only data right after the flash configuration block.
This file also specifies the address of _stack_top as the highest available memory address. Since we currently have no other data in RAM, this means that our stack can grow to fill up the entire memory space. We’ll eventually constrain it by adding additional data to RAM.
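The arithmetic behind the linker script can be checked with a few constants. This is only a sketch that mirrors the numbers in layout.ld. The constant and function names are invented for the illustration and are not part of the project.

```rust
// Memory regions, copied from the MEMORY block of layout.ld.
pub const FLASH_ORIGIN: u32 = 0x0000_0000;
pub const FLASH_LENGTH: u32 = 256 * 1024;
pub const RAM_ORIGIN: u32 = 0x1FFF_8000;
pub const RAM_LENGTH: u32 = 64 * 1024;

/// Address where the linker script places the flash configuration block.
pub const FLASH_CONFIG_ADDR: u32 = 0x400;

/// Mirrors `_stack_top = ORIGIN(RAM) + LENGTH(RAM);`
pub const fn stack_top() -> u32 {
    RAM_ORIGIN + RAM_LENGTH
}

/// Number of 32-bit entries that fit in front of the flash configuration.
pub const fn vector_slots() -> u32 {
    (FLASH_CONFIG_ADDR - FLASH_ORIGIN) / 4
}
```

With these numbers, stack_top() works out to 0x20008000 and the interrupt table has room for 256 entries. The stack top is one byte past the end of RAM. That is fine on a Cortex-M, because the stack pointer is decremented before each value is pushed.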
Finally, we discard any sections in the executable that match the pattern .ARM.*. This is all metadata that we don’t need in our binary, and would waste space in our constrained environment.
Rust Linker Configuration
Rust will not use our linker script until we tell it to. This is done with a cargo configuration file, which must be named .cargo/config. While we’re here, we’ll also tell Cargo to target our microcontroller.
The Makefile
Building our project is a three-step process:
- Use cargo to compile the Rust code for the target processor
- Convert the built program to the right format with objcopy
- Flash our code to the Teensy with teensy_loader_cli
This sort of repetitive sequenced work is exactly what Makefiles were designed for. Ours isn’t particularly complicated:
The targets marked .PHONY will always be built by make, even if it thinks they are up-to-date. This is needed for the ELF target, since its dependencies are managed by cargo. We also add it to the flash target so that make will still flash our image even if someone creates a file called “flash”.
Running make will build our project. make flash will install it to a Teensy.
At this point, we can compile our project and flash it to a Teensy. Sadly, it does nothing interesting - we wouldn’t even be able to tell if it was running. In the next section we’ll expand our code to do something more useful.
Accessing The Hardware
Our first steps here will be some basic hardware initialization tasks. We’ll build accessors for the watchdog and for the System Integration Module, or SIM. The SIM handles clock gating as well as most other global configuration of the microcontroller. Once we have those in place, we’ll turn to the I/O functions necessary to turn on the LED.
Disabling the Watchdog
The first bit of hardware setup we’ll do is disabling the watchdog. The watchdog is controlled through a series of twelve 16-bit registers starting at address 0x40052000. This can be represented in Rust as a packed structure.
We’ll add this struct to a new file - src/watchdog.rs. The fields of this struct use the same names that the manufacturer does for these registers. They’re hard to read here, but being consistent makes searching for their documentation much easier.
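As a rough sketch of what such a register block can look like, here is a struct with twelve 16-bit fields and a helper that measures where one of them lands. The field names follow what I believe the manufacturer calls these registers, so treat them as an assumption and check them against the reference manual. This sketch uses plain repr(C) rather than a packed struct. Twelve u16 fields have no padding between them either way, and leaving out packed keeps ordinary references to the fields legal. The unlock_offset helper is made up for this illustration.

```rust
use std::mem::size_of;

/// Watchdog register block: twelve consecutive 16-bit registers.
/// Field names are assumed to match the manufacturer's register names.
#[repr(C)]
#[derive(Default)]
pub struct Watchdog {
    pub stctrlh: u16,
    pub stctrll: u16,
    pub tovalh: u16,
    pub tovall: u16,
    pub winh: u16,
    pub winl: u16,
    pub refresh: u16,
    pub unlock: u16,
    pub tmrouth: u16,
    pub tmroutl: u16,
    pub rstcnt: u16,
    pub presc: u16,
}

/// Hypothetical helper: byte offset of the `unlock` register in the block.
pub fn unlock_offset() -> usize {
    let w = Watchdog::default();
    let base = &w as *const Watchdog as usize;
    let field = &w.unlock as *const u16 as usize;
    field - base
}

/// Total size of the block in bytes: twelve registers of two bytes each.
pub fn block_size() -> usize {
    size_of::<Watchdog>()
}
```

If the block starts at 0x40052000, an offset of 14 bytes puts the unlock register at 0x4005200E. A quick check like this is a cheap way to catch a missing or misordered field.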
Once we have a struct representing the hardware, we need to build our functions to access it safely. To design this abstraction, we need to think about the invariants of accessing these registers. An invariant is any rule or condition that our unsafe code must take into account, in order for it to be safely callable by safe code. Fortunately the watchdog is pretty simple - it looks just like a struct in memory, and can be treated as such. The biggest invariants here are Rust’s rules about reference aliasing. There can only be one mutable reference to the watchdog struct.
For now, we will say that acquiring a reference to the watchdog is an unsafe operation. This puts the responsibility on the calling code to verify there is only one mutable reference. Once we have that reference, all the functions to update the watchdog will be safe - after all, we’re just changing some fields in memory.
In reality, using the watchdog to its full potential could introduce additional invariants. For example, requiring that a certain value be written to a watchdog register during your main loop. This is not a memory safety issue, and thus strictly falls outside of Rust’s idea of safety. It could cause correctness issues, though, and good API design will try to minimize correctness errors - even if they’re technically “safe”.
The watchdog’s implementation looks like this. Note that new is unsafe, but disable is safe.
The disable function is following the procedure set forth in the manufacturer’s data sheet. The watchdog is protected against being accidentally disabled by a random write to memory, so our code must “unlock” it first, by writing special values to the unlock register. Once that’s done, we need to wait for the watchdog to actually unlock itself. The __NOP intrinsic tells the processor to briefly do nothing. This introduces our necessary 2-cycle delay. Finally, we read the control register and un-set the “enable” bit.
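The sequence described above can be sketched in a form that runs on a desktop machine, with the register block living in ordinary memory instead of at its hardware address. Several things here are assumptions: the two unlock values (0xC520 followed by 0xD928), the position of the enable flag in bit 0 of the high control register, and the field names. Check all of them against the data sheet. This sketch also uses plain repr(C) instead of a packed struct, which gives the same layout for twelve u16 fields, and it leaves the two NOP instructions as a comment because they only matter on the real hardware.

```rust
use std::ptr::{read_volatile, write_volatile};

/// Watchdog register block (field names assumed from the manufacturer).
#[repr(C)]
#[derive(Default)]
pub struct Watchdog {
    pub stctrlh: u16,
    pub stctrll: u16,
    pub tovalh: u16,
    pub tovall: u16,
    pub winh: u16,
    pub winl: u16,
    pub refresh: u16,
    pub unlock: u16,
    pub tmrouth: u16,
    pub tmroutl: u16,
    pub rstcnt: u16,
    pub presc: u16,
}

impl Watchdog {
    /// Safe to call: it only touches fields of a struct we hold `&mut` to.
    pub fn disable(&mut self) {
        unsafe {
            // Step 1: write the (assumed) unlock sequence.
            write_volatile(&mut self.unlock, 0xC520);
            write_volatile(&mut self.unlock, 0xD928);
            // Step 2: on the target, two NOP instructions would go here to
            // give the watchdog time to unlock.
            // Step 3: clear the (assumed) enable flag in bit 0.
            let ctrl = read_volatile(&self.stctrlh);
            write_volatile(&mut self.stctrlh, ctrl & !0x0001);
        }
    }
}
```

On a desktop, calling disable on a Watchdog whose stctrlh holds 0x01D3 leaves 0x01D2 behind: only the lowest bit changes. On the Teensy, the same function would be called on the reference handed out by the unsafe constructor.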
All of our memory accesses are volatile. This tells the Rust compiler that the read (or write) has an effect that it can’t see from our program code. In this case, that effect is a hardware access. Without marking our memory accesses volatile, the Rust compiler would be free to say “You never read from unlock, so I will optimize away the unneeded write to it”. This would, naturally, cause our code to fail.
This disable process shows why we must have only one mutable reference to the watchdog. If an interrupt were to occur partway through this function and write to the watchdog, our attempt to disable it would fail. Knowing that an interrupt cannot change watchdog settings gives us confidence that this code will execute as we expect.
Clock Gating
The other piece of hardware involved in the microcontroller setup is the System Integration Module. We’ll use this to enable the appropriate clock gate for our I/O port. Just like the watchdog, the SIM is controlled through a block of memory, which will also be represented as a struct. It has the same basic memory safety rules as the watchdog does, and for now has no extra memory-safety invariants.
There is a potential correctness issue involved with the SIM - it’s possible to use a mutable reference to the SIM to disable a hardware function that another section of code relies on. We can design an API that keeps better track of which functional units are needed, but we will save that for a future post. For now, we’ll just have to trust ourselves.
The complete code for src/sim.rs is here:
The simple match-based clock management we have here would get unwieldy pretty quickly if we intended to use it to manage a large number of hardware functions. We’ll get rid of it when we look into more robust ways to manage clock gates.
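Here is a hedged, desktop-runnable sketch of that match-based approach. The struct is deliberately cut down to the one register that matters for this post, so it is not the full SIM layout. The register name scgc5 and the choice of bit 11 for Port C are assumptions taken from my reading of the reference manual. Double-check both before using them on hardware.

```rust
use std::ptr::{read_volatile, write_volatile};

/// Clock gates this sketch knows about. Only Port C is needed for the LED.
#[derive(Clone, Copy)]
pub enum Clock {
    PortC,
}

/// Simplified stand-in for the SIM register block: just one gating register.
#[derive(Default)]
pub struct Sim {
    pub scgc5: u32,
}

impl Sim {
    /// Turn on one clock gate with a read-modify-write of the register.
    pub fn enable_clock(&mut self, clock: Clock) {
        // Assumed bit position of each gate within scgc5.
        let mask: u32 = match clock {
            Clock::PortC => 1 << 11,
        };
        unsafe {
            let scgc = read_volatile(&self.scgc5);
            write_volatile(&mut self.scgc5, scgc | mask);
        }
    }
}
```

Because the function ORs the new bit into the current value, gates that were already enabled stay enabled. Every new clock needs another enum variant and another match arm, which is where the unwieldiness comes from.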
I/O Ports
With the initial hardware setup out of the way, we can turn our attention to achieving that bright orange[1] glow that we’ve been working towards. We will put a pin into GPIO mode, and use it to turn on the LED. GPIO stands for “General Purpose I/O”. When a pin is in GPIO mode, software has control over the high/low state of an output pin and direct read access to the state of an input pin. This is in contrast to the pin being controlled by a dedicated hardware function, such as a serial port.
Pins are grouped into ports, and all of a pin’s settings are controlled from the port’s register block. This poses a bit of a challenge for us. We’d like each pin to be a self-contained struct, so that ownership of it can be passed from one software module to another, and only the owning module can mutate its pins. This follows Rust’s one-owner rule for pins, but would require that each pin be able to mutate its settings in the Port register block. We all know how Rust feels about shared mutable state.
Fortunately, each pin has a separate control register in the port’s block. That means there’s no actual overlap of memory locations that might be written. We’ll take advantage of this to write some very, very careful unsafe code that allows each pin instance to modify its own control settings.
We’ll start out with a port implementation in src/port.rs.
Note the array of 32 words called pcr; each of these is an individual pin control register. The set_pin_mode function is responsible for switching a single pin into GPIO (or any other) mode. The only memory it touches is the PCR associated with a single pin, and it is unsafe to call. It is unsafe because calling it for a pin that you do not own could cause a race condition. An interrupt that changes a PCR between the read and write in this function could have its changes overwritten.
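A minimal sketch of such a function follows, again with the register block in ordinary memory so that it can be tried on a desktop. The struct only contains the pcr array, not the rest of the port's registers. Placing the mode in bits 8 through 10 of each PCR, and using mode 1 for GPIO, are assumptions based on my reading of the reference manual.

```rust
use std::ptr::{read_volatile, write_volatile};

/// Assumed position of the mode (mux) field inside each pin control register.
const MUX_SHIFT: u32 = 8;
const MUX_MASK: u32 = 0b111 << MUX_SHIFT;

/// Simplified port block: only the 32 pin control registers.
pub struct Port {
    pub pcr: [u32; 32],
}

impl Port {
    /// Switch pin `p` to the given mode, leaving its other settings alone.
    ///
    /// # Safety
    /// The caller must own pin `p`. Otherwise another piece of code could
    /// change this PCR between our read and our write.
    pub unsafe fn set_pin_mode(&mut self, p: usize, mode: u32) {
        unsafe {
            let mut pcr = read_volatile(&self.pcr[p]);
            pcr &= !MUX_MASK; // clear the old mode
            pcr |= (mode << MUX_SHIFT) & MUX_MASK; // insert the new one
            write_volatile(&mut self.pcr[p], pcr);
        }
    }
}
```

The read and the write are separate steps, and that gap is the window for the race described above.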
The pin struct is next on our list. A pin is not a reference to any particular register. Instead, it is a concept in our code that represents a piece of a port. It will have a mutable reference to its containing port, as well as an integer representing which index in the PCR array it is associated with.
In order for this mutable port reference to be safe, Pin instances must only call methods of Port that affect the correct PCR. We can’t really enforce this, but to encourage it, Pin’s Port reference will actually be a pointer. This makes it impossible to call Port methods without an unsafe block, and reinforces the peculiarity of this arrangement.
GPIO and the Bit-Band
There are two ways to access the GPIO registers. The first is through a block of 32-bit registers, associated with a port. It looks something like this:
This is very convenient to work with, but has an unfortunate flaw. Each of the fields represents all 32 pins in a Port. This means that any pin changes are subject to a race condition during our read/modify/write process. Pins that are owned by a separate piece of code can have an impact on how our pin behaves.
Fortunately, ARM has a solution to this. We will take advantage of the bit-band alias. Bit-banding is a feature of certain ARM processors that maps a memory region to one 32 times as large. Each 32-bit word of this larger region maps to a single bit of the original region. This gives us the capability to set or clear a single bit at a time, without risk of race conditions. If we visualized this as a Rust struct, the bit-band alias for the GPIO would look like this:
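One way to picture it is sketched below, along with the address arithmetic that maps a register bit to its alias word. The field names and their order are my assumption about how the GPIO registers are laid out, and the helper bitband_alias is invented for this illustration. The two base addresses are the usual Cortex-M peripheral region and its bit-band alias. It is worth confirming both in ARM's documentation for your part.

```rust
use std::mem::size_of;

/// Bit-band view of one GPIO port: every bit of each 32-bit register
/// becomes a whole 32-bit word. Field names and order are assumed.
#[repr(C)]
pub struct GpioBitband {
    pub pdor: [u32; 32],
    pub psor: [u32; 32],
    pub pcor: [u32; 32],
    pub ptor: [u32; 32],
    pub pdir: [u32; 32],
    pub pddr: [u32; 32],
}

/// Start of the peripheral region and of its bit-band alias.
pub const PERIPH_BASE: u32 = 0x4000_0000;
pub const PERIPH_BITBAND_BASE: u32 = 0x4200_0000;

/// Hypothetical helper: alias address for `bit` of the register at `register`.
/// Each register byte expands to 32 alias bytes, and each bit to 4.
pub fn bitband_alias(register: u32, bit: u32) -> u32 {
    PERIPH_BITBAND_BASE + (register - PERIPH_BASE) * 32 + bit * 4
}

/// Size of the alias block for six registers, in bytes.
pub fn alias_block_size() -> usize {
    size_of::<GpioBitband>()
}
```

As a worked example, assume a port whose first GPIO register sits at 0x400FF080. Bit 0 of that register then aliases to 0x43FE1000, and bit 5 of the register four bytes later aliases to 0x43FE1094. Six 32-bit registers expand to 768 bytes of alias space.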
This is what we will use to control the GPIO. Just like with Pins and the PCR registers, we will have individual GPIO structures that represent a single GPIO pin. They will ensure safety by only writing to the register words associated with their pin index. Let’s look at all that code now, then walk through it.
The Gpio struct, just like the Pin struct, holds a pointer to the shared data block, as well as an index of its pin number. It has two functions: one to set itself as an output, and one to set its output value to high. Thanks to the bit-band, these functions can be implemented with a single write, eliminating the potential race condition that a read-modify-write of a shared memory address would create.
Converting a Pin into a Gpio consumes the Pin. This prevents having more than one reference to a single hardware pin. Getting another copy of a pin from the port is unsafe, so we can be confident that safe code will never make a second copy of a pin that is in use as a GPIO.
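The ownership side of this can be sketched in a form that runs on a desktop. This is a simplified illustration under stated assumptions, not the project's code. The Pin here skips the Port pointer and the mode switch, and carries the bit-band pointer directly. The bit-band block is an ordinary struct in memory, and the field names are assumed. The point is the shape of the API: an unsafe constructor, a conversion that takes self by value, and safe single-write methods afterwards.

```rust
use std::ptr::{addr_of_mut, write_volatile};

/// Stand-in for the bit-band alias block (field names assumed).
#[repr(C)]
#[derive(Default)]
pub struct GpioBitband {
    pub pdor: [u32; 32],
    pub psor: [u32; 32],
    pub pcor: [u32; 32],
    pub ptor: [u32; 32],
    pub pdir: [u32; 32],
    pub pddr: [u32; 32],
}

pub struct Pin {
    gpio: *mut GpioBitband,
    pin: usize,
}

pub struct Gpio {
    gpio: *mut GpioBitband,
    pin: usize,
}

impl Pin {
    /// # Safety
    /// `gpio` must stay valid while the pin is in use, and no other Pin or
    /// Gpio may exist for the same index.
    pub unsafe fn new(gpio: *mut GpioBitband, pin: usize) -> Pin {
        assert!(pin < 32);
        Pin { gpio, pin }
    }

    /// Consumes the Pin, so it cannot be used again after the conversion.
    pub fn make_gpio(self) -> Gpio {
        Gpio { gpio: self.gpio, pin: self.pin }
    }
}

impl Gpio {
    /// Make this pin an output with one write to its own word.
    pub fn output(&mut self) {
        unsafe {
            let reg = addr_of_mut!((*self.gpio).pddr) as *mut u32;
            write_volatile(reg.add(self.pin), 1);
        }
    }

    /// Drive this pin high with one write to its own word.
    pub fn high(&mut self) {
        unsafe {
            let reg = addr_of_mut!((*self.gpio).psor) as *mut u32;
            write_volatile(reg.add(self.pin), 1);
        }
    }
}
```

Each method goes straight from the raw pointer to the one word it owns, and never creates a reference to the shared block as a whole. Trying to use the Pin after make_gpio is a compile error, which is the guarantee we are after.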
Putting it Together
We now have all the pieces for our first program. Going back to the beginning, our application will do the following:
- disable the watchdog
- turn on the clock gate for Port C
- grab pin 5 from that port, and make it a GPIO
- set that GPIO high to light the LED
This all ends up being surprisingly short in main:
Our only unsafe code in main is creating the mutable references to the various register blocks. Creating these is always unsafe, since more than one would violate Rust’s memory safety rules. The rest of the code is 100% safe.
It’s finally time to send our first pure-Rust embedded program to the Teensy! Connect your Teensy to a USB port, then run make flash. You should see the LED on the Teensy light up once the process is complete. If it doesn’t, double-check your linker script, and the link sections of the _VECTORS and _FLASHCONFIG arrays. You might also double-check the addresses of the register blocks.
Our next post will look at enabling a UART for serial communication. This will give us access to real panic messages. We’ll take advantage of panics to enforce some of our rules about duplicate pins, similar to how Rust’s RefCell panics on duplicate mutable accesses.
[1] All genuine Teensys have an orange LED. If yours has a different color, I’m sad to say it’s a knockoff.