Part 2: Sending a Message
Overview
In this post, we’re going to set up the Teensy for serial communication with the outside world. We’ll be able to use this serial connection to send debug information back to our computers.
Serial communication is very timing-sensitive. Because of this, we can’t use a serial connection until we have a very stable clock source configured.
A serial interface
To connect the serial pins of the Teensy to our computer, we will need some sort of USB serial interface. I’ll be using an Arduino Leonardo for this. Any Arduino-compatible device with at least one hardware serial port will work, including a second Teensy. You may also be able to find an FTDI breakout board or cable - but they’re pricey enough that a second development board is probably a better investment. Trust me, you can never have too many development boards!
The Teensy 3.2 and 3.5 are 5V tolerant, meaning they can be safely used with just about any USB to serial converter (including the Arduino Leonardo that I’m using). The 3.1 and 3.6 require an adapter running at 3.3V. For them, you’ll likely want to use either another Teensy or a dedicated 3.3V serial adapter. Be careful with USB to RS232 adapters. They run at much higher voltages than the microcontroller, and will fry your Teensy if you try to use one. You want an adapter that has pin headers, not one with a DE-9 plug.
If you’re using an Arduino or Teensy as your adapter, check out the serial_forward Arduino sketch in the code repository. Running this sketch will cause any output from our Teensy to be forwarded through our Arduino to the IDE’s console.
Cleaner Register Handling
In the first post, we hand-coded all of our volatile register accesses and bit manipulations. This is useful for understanding how the MCU works, but has a few problems:
- Complex bit manipulations become “write-only” - it is hard to read the code and understand what’s happening.
- We skipped some types of bounds checking, because it would have made the code even harder to read. It would be nice to have this.
- Volatile accesses introduce lots of extra “unsafe” blocks, making it hard for us to see when we have code (like the Port and Gpio) that actually does weird things with memory.
To solve these, we will use a pair of crates from Philipp Oppermann: volatile and bit_field. The volatile crate provides safe wrappers for accessing volatile integers, like our hardware registers. This is paired with bit_field to streamline our manipulation of those registers.
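To give a rough picture of what the two crates buy us, here is a std-only stand-in that can run on a desktop machine. The names VolatileCell and BitOps are invented for this sketch and are not the crates’ real APIs; the real crates are more complete, and this is only meant to show the shape of “safe volatile wrapper plus bounds-checked bit access”.

```rust
use core::ptr;

/// Hypothetical stand-in for a volatile wrapper. `repr(transparent)` keeps it
/// exactly the same size as the value it wraps.
#[repr(transparent)]
pub struct VolatileCell<T: Copy>(T);

impl<T: Copy> VolatileCell<T> {
    pub fn new(value: T) -> Self {
        VolatileCell(value)
    }

    /// Volatile read: the compiler may not cache or remove this access.
    pub fn read(&self) -> T {
        unsafe { ptr::read_volatile(&self.0) }
    }

    /// Volatile write.
    pub fn write(&mut self, value: T) {
        unsafe { ptr::write_volatile(&mut self.0, value) }
    }

    /// Read-modify-write through a closure.
    pub fn update<F: FnOnce(&mut T)>(&mut self, f: F) {
        let mut value = self.read();
        f(&mut value);
        self.write(value);
    }
}

/// Hypothetical bit helpers for 32-bit registers, with the bounds check that
/// hand-written shifts and masks tend to skip.
pub trait BitOps {
    fn with_bit(self, bit: u32, on: bool) -> Self;
    fn bit(self, bit: u32) -> bool;
}

impl BitOps for u32 {
    fn with_bit(self, bit: u32, on: bool) -> u32 {
        assert!(bit < 32, "bit index out of range");
        if on {
            self | (1 << bit)
        } else {
            self & !(1 << bit)
        }
    }

    fn bit(self, bit: u32) -> bool {
        assert!(bit < 32, "bit index out of range");
        (self >> bit) & 1 == 1
    }
}
```

The unsafe code is confined to the two volatile accessors, so callers can read and update a register without writing unsafe themselves.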
Rewriting the Watchdog
The Watchdog provides a good demonstration of how these crates are used. The updated code is shown below:
All of the struct fields are now Volatile. This type is always the same size as the integer it wraps, so it can be dropped in like this without requiring any changes to our struct layout. The new function is unchanged, and is still unsafe. Most of the change comes in the disable method.
Even though updating these Volatile values is no longer unsafe, we still need to keep this function marked as unsafe. Rust considers certain interactions with packed structs to be unsafe since they can violate platform alignment constraints. This should never happen when doing register accesses like this, but unfortunately the compiler does not know that.
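As a host-runnable sketch of the disable sequence, the following uses an invented Reg16 type in place of the crate’s Volatile, and an in-memory struct in place of the real hardware address. The field order, the unlock values (0xC520 then 0xD928) and the enable bit (bit 0 of the first control register) are taken from my reading of the chip’s reference manual, so treat them as assumptions to verify. The sketch also uses plain repr(C) rather than a packed struct: recent compilers reject references to misaligned fields of packed structs outright, and since every field here is a u16 the layout comes out the same either way.

```rust
use core::ptr;

/// Hypothetical stand-in for a volatile 16-bit register.
#[derive(Default)]
#[repr(transparent)]
pub struct Reg16(u16);

impl Reg16 {
    pub fn read(&self) -> u16 {
        unsafe { ptr::read_volatile(&self.0) }
    }

    pub fn write(&mut self, value: u16) {
        unsafe { ptr::write_volatile(&mut self.0, value) }
    }

    pub fn update<F: FnOnce(&mut u16)>(&mut self, f: F) {
        let mut value = self.read();
        f(&mut value);
        self.write(value);
    }
}

/// Watchdog register block (assumed field order, twelve 16-bit registers).
#[derive(Default)]
#[repr(C)]
pub struct Watchdog {
    pub stctrlh: Reg16,
    pub stctrll: Reg16,
    pub tovalh: Reg16,
    pub tovall: Reg16,
    pub winh: Reg16,
    pub winl: Reg16,
    pub refresh: Reg16,
    pub unlock: Reg16,
    pub tmrouth: Reg16,
    pub tmroutl: Reg16,
    pub rstcnt: Reg16,
    pub presc: Reg16,
}

impl Watchdog {
    /// Unlock the watchdog, then clear its enable bit.
    pub fn disable(&mut self) {
        // Two-word unlock sequence (assumed values).
        self.unlock.write(0xC520);
        self.unlock.write(0xD928);
        // On real hardware a couple of NOPs belong here so the unlock can
        // take effect; the in-memory fake needs no delay.
        self.stctrlh.update(|ctrl| *ctrl &= !1);
    }
}
```

On the MCU, the struct would be created from the hardware address by the unsafe new function rather than with Default.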
Other Structs
The changes to the SIM are very similar to those for the watchdog - switching to update and set_bit, instead of raw volatile operations and bit manipulations. The Port and Gpio structs have some special considerations:
- Since Port::set_pin_mode can affect other objects (Pin or Gpio instances) it needs to stay unsafe, even though it now only uses safe functions.
- The GPIO methods all still need unsafe blocks to deal with the raw pointer to the GPIO struct.
Both of these are fine, though - we’re actually doing weird things with memory in the Port and Gpio structs, so they should be flagged as unsafe. Now that we aren’t forced to use unsafe for all register accesses, it becomes more obvious that this code requires special care.
All of these changes can be seen on GitHub.
A more stable clock
When the MK20DX256 boots, it runs off an internal reference clock. This is more than fast enough for our needs (the processor runs somewhere between 20MHz and 25MHz), but it is not accurate enough. In order to connect to another device over a serial interface, we need a clock error below about 3%.
Fortunately, the Teensy is equipped with an accurate 16MHz crystal oscillator. We will use this crystal as a reference to establish a very stable 72MHz frequency for the MCU.
If you’re using a Teensy 3.6, you’re on your own here - the clock generator in the MK66FX1M0 chip has very different parameters than the one in the earlier chips. If you’re well-versed in reading data sheets, you should be able to sort out the differences. Feel free to email me if you need help here.
If you’re using a Teensy 3.0, the maximum rated clock speed is slower. You’ll need to update some of the parameters used to stay within the manufacturer’s rated speeds, but the overall register layout is the same.
Clock Modes
There are a number of modes that the MK20DX256’s clock system can be in. They control which basic oscillator (the very accurate external crystal or a less-accurate internal reference) is used, and how that reference is multiplied to produce the final clock speed.
The processor starts in FEI mode - “FLL enabled, internal (reference)”. This means that the clock is generated by the “Frequency Locked Loop”, based on the internal 32kHz reference. Our goal is to transition the clock generator to “PEE” mode, which means the clock is generated by the “Phase Locked Loop”, based on the external 16MHz crystal reference. The difference between frequency- and phase-locked-loops is outside the scope of this post, but in general a PLL creates a more stable output frequency.
From FEI mode, we will transition the processor through FBE and PBE modes, before arriving in PEE mode. We will rely on Rust’s type system to ensure that our transitions between these modes are safe and correct.
The Clock Registers
Before we dive into the clock’s state machine, we’ll look at the basic clock registers. Like all the other functional units of the MK20DX256, the MCG (Multipurpose Clock Generator) is represented by a block of registers at a known address:
This is the familiar pattern of defining the memory layout as a packed struct, with our new method being an unsafe way to create that struct at the appropriate hardware address for these registers.
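For reference, here is a sketch of what such a register block can look like, written so the layout can be checked on a desktop machine. The field order, the two padding bytes and the 0x40064000 base address are from my reading of the reference manual and should be verified against it. The fields are plain u8 here rather than the Volatile wrapper, purely to keep the sketch free of external crates.

```rust
use core::ptr;

/// MCG register block (assumed layout: 14 bytes, two reserved gaps).
#[derive(Default)]
#[repr(C, packed)]
pub struct McgLayout {
    pub c1: u8,
    pub c2: u8,
    pub c3: u8,
    pub c4: u8,
    pub c5: u8,
    pub c6: u8,
    pub s: u8,
    _pad0: u8,
    pub sc: u8,
    _pad1: u8,
    pub atcvh: u8,
    pub atcvl: u8,
    pub c7: u8,
    pub c8: u8,
}

/// Assumed base address of the MCG on the MK20DX256.
pub const MCG_BASE: usize = 0x4006_4000;

/// Byte offsets of the status (`s`), status/control (`sc`) and `c8`
/// registers, measured on a local instance rather than real hardware.
pub fn offsets() -> (usize, usize, usize) {
    let block = McgLayout::default();
    let base = ptr::addr_of!(block) as usize;
    (
        ptr::addr_of!(block.s) as usize - base,
        ptr::addr_of!(block.sc) as usize - base,
        ptr::addr_of!(block.c8) as usize - base,
    )
}
```

Checking the size and a few offsets like this is a cheap way to catch a missing padding byte before it turns into a write to the wrong register.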
The Clock State Machine
Each of the clock mode transitions represents small modifications to the MCG - typically just a few bits changed in a register. Unfortunately, the set of valid modifications changes with the mode. Invalid modifications can cause the processor itself to fail in unexpected ways (all CPUs rely on a stable clock, after all). We could just rely on ourselves to only program the MCG properly, but we would be better served by building tools to ensure we do it safely and correctly.
We’ll start by defining each of our possible states as a struct holding a mutable reference to the MCG. Each of these structs will have methods impl’d on them that allow only the changes to the MCG which are safe in that state. There is no struct for PEE mode, since we have no need to modify the MCG at that point. A more complete implementation would include PEE mode, as well as the other states that we aren’t using.
Now, for each struct, we will need to define the functions that transition the clock to the next state. We will start with the FEI to FBE transition. For this, we need to enable the crystal oscillator, and then switch our output to it. This second operation will also consume the Fei instance, returning an Fbe instance. This represents the transition of the MCG from using the FLL based on the internal reference (FEI mode), to being clocked directly from the external crystal (FBE mode).
Even though we’re transitioning away from using the FLL, we want to be sure that we continue to operate it within its normal parameters. This is why we must pass in a clock divider for it - our 16MHz crystal is far too fast to be used as a reference for the FLL without this divider. Making sure that we do things right will also make it easier to expand this code to support FEE and other modes in the future.
In both functions, we must wait for our changes to take effect before we continue. We read the status register in a loop until the MCG reports that it is in the expected state.
From FBE mode, we transition to PBE mode by enabling the PLL. This will be done as a single function, which takes the PLL divider parameters. Our output frequency will be the crystal frequency (16MHz) multiplied by the fraction numerator/denominator.
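The frequency arithmetic is easy to sanity-check on the host. The sketch below assumes the hardware accepts a numerator of 24 to 55 and a denominator of 1 to 25, and that the divided-down crystal must land between 2MHz and 4MHz; those limits are my reading of the reference manual, not something stated in this post. The function name pll_output_hz is invented for the example.

```rust
/// Returns the PLL output in Hz for `crystal_hz * numerator / denominator`,
/// or None if the parameters fall outside the (assumed) hardware limits.
pub fn pll_output_hz(crystal_hz: u32, numerator: u32, denominator: u32) -> Option<u32> {
    // Assumed register ranges for the multiplier and the divider.
    if !(24..=55).contains(&numerator) || !(1..=25).contains(&denominator) {
        return None;
    }

    // Assumed constraint: the PLL reference (crystal / denominator)
    // must be between 2MHz and 4MHz.
    let reference_hz = crystal_hz / denominator;
    if !(2_000_000..=4_000_000).contains(&reference_hz) {
        return None;
    }

    // Multiply before dividing, in 64 bits, to avoid rounding and overflow.
    let output = crystal_hz as u64 * numerator as u64 / denominator as u64;
    u32::try_from(output).ok()
}
```

With the 16MHz crystal, a numerator of 27 and a denominator of 6 is one pair that works out to exactly 72MHz, with a 2.67MHz reference.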
This gets us to PBE mode, or “PLL Bypassed, based on the External oscillator”. The PLL will be running at our desired frequency here, but not selected as the main clock source. The last step is to engage the PLL, making it our actual clock source. There’s a little bit of weirdness in our wait loop here, due to how the MCG reports which of the FLL or PLL is in use.
This function consumes the Pbe instance, but does not return anything. This is because we are “done” with clock setup here. If you wanted to potentially transition out of PEE mode, you would return some sort of value here to allow continued modification of the clock state.
That’s it for the clock mode transitions! When we update main, we’ll work through each of these clock modes, starting from FEI, to move the MCG to the state we want it in. At each step, we know that we can only make the modifications that are safe.
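The shape of this state machine can be tried out without any hardware. In the sketch below, FakeMcg is an invented stand-in that simply remembers its mode, so there are no status registers to poll. The method names, the FLL divider of 512 and the 27/6 PLL ratio are assumptions made for illustration. What the sketch does show faithfully is the ownership trick: each transition consumes the current state struct, so an out-of-order call will not compile.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Mode {
    Fei,
    Fbe,
    Pbe,
    Pee,
}

/// Invented stand-in for the clock generator hardware.
pub struct FakeMcg {
    pub mode: Mode,
    pub crystal_enabled: bool,
    pub pll_ratio: Option<(u8, u8)>,
}

pub struct Fei<'a> { mcg: &'a mut FakeMcg }
pub struct Fbe<'a> { mcg: &'a mut FakeMcg }
pub struct Pbe<'a> { mcg: &'a mut FakeMcg }

impl<'a> Fei<'a> {
    /// Only hands out an Fei if the clock really is in FEI mode.
    pub fn new(mcg: &'a mut FakeMcg) -> Option<Fei<'a>> {
        if mcg.mode == Mode::Fei { Some(Fei { mcg }) } else { None }
    }

    pub fn enable_xtal(&mut self) {
        self.mcg.crystal_enabled = true;
    }

    /// Consumes the Fei state; real code would also program the FLL divider
    /// and wait for the status register to confirm the switch.
    pub fn use_external(self, _fll_divider: u32) -> Fbe<'a> {
        assert!(self.mcg.crystal_enabled, "select the crystal first");
        self.mcg.mode = Mode::Fbe;
        Fbe { mcg: self.mcg }
    }
}

impl<'a> Fbe<'a> {
    pub fn enable_pll(self, numerator: u8, denominator: u8) -> Pbe<'a> {
        self.mcg.pll_ratio = Some((numerator, denominator));
        self.mcg.mode = Mode::Pbe;
        Pbe { mcg: self.mcg }
    }
}

impl<'a> Pbe<'a> {
    /// Consumes the Pbe state and returns nothing: clock setup is done.
    pub fn use_pll(self) {
        self.mcg.mode = Mode::Pee;
    }
}

/// Walks FEI -> FBE -> PBE -> PEE and reports the final mode.
pub fn bring_up(mcg: &mut FakeMcg) -> Mode {
    let mut fei = Fei::new(mcg).expect("clock should start in FEI mode");
    fei.enable_xtal();
    let fbe = fei.use_external(512);
    let pbe = fbe.enable_pll(27, 6);
    pbe.use_pll();
    mcg.mode
}
```

Trying to call enable_pll on an Fei, or to reuse fei after use_external, is a compile error rather than a misbehaving clock.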
The last part of clock setup is getting our initial Fei instance from the MCG. The MCG, however, doesn’t know that it’s in FEI mode. We need to query the various control registers to determine which state it is actually in, and return the appropriate struct. For the purposes of this blog post, we’ll panic if the clock is in an unknown state. In production code, you’d want to implement the full set of clock states so that this function would always return a valid value.
There is a bit of unsafety here, to coerce the oscillator selection field to our enum type. We otherwise are simply comparing the set of control registers to the expected values for each known mode.
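The decoding step can be sketched as a pure function over the raw register values. The bit positions used here (clock source in bits 6 and 7 of C1, internal reference select in bit 2 of C1, PLL select in bit 6 of C6) are my reading of the reference manual and should be checked. This version also swaps the unsafe coercion for an explicit match, and returns None instead of panicking so it is easy to test.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum OscSource {
    LockedLoop,
    Internal,
    External,
}

#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ClockMode {
    Fei,
    Fbe,
    Pbe,
}

/// Decode the two-bit clock source field (assumed to be C1 bits 7:6).
pub fn osc_source(c1: u8) -> Option<OscSource> {
    match (c1 >> 6) & 0b11 {
        0 => Some(OscSource::LockedLoop),
        1 => Some(OscSource::Internal),
        2 => Some(OscSource::External),
        _ => None, // reserved encoding
    }
}

/// Work out which of the modelled modes the registers describe.
/// Returns None for any mode this sketch does not model.
pub fn decode_mode(c1: u8, c6: u8) -> Option<ClockMode> {
    let source = osc_source(c1)?;
    let fll_internal = (c1 & (1 << 2)) != 0; // assumed internal-reference bit
    let pll_selected = (c6 & (1 << 6)) != 0; // assumed PLL-select bit

    match (fll_internal, pll_selected, source) {
        (true, false, OscSource::LockedLoop) => Some(ClockMode::Fei),
        (false, false, OscSource::External) => Some(ClockMode::Fbe),
        (_, true, OscSource::External) => Some(ClockMode::Pbe),
        _ => None,
    }
}
```

In the real function each arm would wrap the MCG reference in the matching state struct, and the final arm would panic.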
The Oscillator Unit
I wasn’t entirely truthful when I said we were “enabling the crystal” earlier. What we actually did was select the crystal as the clock source for the PLL. The crystal oscillator itself needs to be enabled and configured through its own set of registers. We’ll put the code for this in a file called osc.rs.
The oscillator is very simple compared to some of the functional units we’ve had to deal with. Most of the body of this function is setting the capacitive load for the crystal (between 2pF and 30pF). It also sets the enable bit, to turn on the oscillator.
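Here is a sketch of how the control register value might be computed. It assumes the capacitance bits sit in reverse order in the low nibble (bit 3 adds 2pF, bit 2 adds 4pF, bit 1 adds 8pF, bit 0 adds 16pF) and that bit 7 is the enable bit; that is how I read the reference manual, so double-check it. The function name osc_cr_value is made up, and it returns None where the real code might panic.

```rust
/// Compute the oscillator control register value for a given load
/// capacitance in pF. Only even values up to 30 can be represented.
pub fn osc_cr_value(capacitance_pf: u8) -> Option<u8> {
    if capacitance_pf % 2 != 0 || capacitance_pf > 30 {
        return None;
    }

    let mut cr = 0u8;
    // The capacitor-select bits run "backwards" relative to the binary
    // value of the capacitance (assumed layout).
    if capacitance_pf & 2 != 0 {
        cr |= 1 << 3; // 2pF
    }
    if capacitance_pf & 4 != 0 {
        cr |= 1 << 2; // 4pF
    }
    if capacitance_pf & 8 != 0 {
        cr |= 1 << 1; // 8pF
    }
    if capacitance_pf & 16 != 0 {
        cr |= 1 << 0; // 16pF
    }

    // Assumed enable bit for the external reference clock.
    cr |= 1 << 7;
    Some(cr)
}
```

For a 10pF load this selects the 8pF and 2pF capacitors, giving 0x8A under these assumptions.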
Putting it Together & Updating main
We’ve created two new files for managing two new functional units of the MCU: the MCG and the OSC. The OSC has a single function to enable the unit, and set the capacitance needed for the external crystal. The MCG has a series of structs that define a state machine for bringing the clock machinery to the correct mode. We’ll update main to take advantage of these. At the end we still just turn on the LED, but in between we now set up the clock machinery to run the MCU at its rated 72MHz.
This looks a lot more complicated, but is really only a few changes. From the top:
- We add the OSC and MCG to the list of functional units we want references for.
- After the watchdog is disabled, we enable the external oscillator, with 10pF of capacitance. This is the right amount for the crystal on the Teensy. You might need a different number if you have a different board.
- We set the “clock dividers” of the SIM. More on this below.
- If the clock is in FEI mode, we transition it to PEE mode using the set of functions we defined above. If it’s not in FEI mode we panic, since it should always be in FEI mode at boot.
The only thing we haven’t already covered is the new set_dividers method of the SIM. We want to run the main CPU core at 72MHz, but not all the parts of the chip can run that fast. We need to keep the peripheral bus below 50MHz, and the Flash below 25MHz. This function sets up the “clock dividers”, so that we can keep the bus and flash running at the slower speeds that they are rated for.
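To make the arithmetic concrete, the sketch below checks a set of dividers against those limits and packs them into a single register value. The field positions (core in bits 31:28, bus in bits 27:24, flash in bits 19:16, each stored as divider minus one) are my reading of the reference manual, and the function name is invented. It treats the 50MHz and 25MHz limits as inclusive, which is also an assumption.

```rust
pub const BUS_MAX_HZ: u32 = 50_000_000;
pub const FLASH_MAX_HZ: u32 = 25_000_000;

/// Pack core/bus/flash dividers into a clock-divider register value, or
/// return None if a divider is out of range or a clock would run too fast.
pub fn clkdiv1_value(pll_hz: u32, core: u32, bus: u32, flash: u32) -> Option<u32> {
    // Each field is four bits wide and stores (divider - 1).
    for divider in [core, bus, flash] {
        if !(1..=16).contains(&divider) {
            return None;
        }
    }

    if pll_hz / bus > BUS_MAX_HZ || pll_hz / flash > FLASH_MAX_HZ {
        return None;
    }

    // Assumed field positions within the register.
    Some(((core - 1) << 28) | ((bus - 1) << 24) | ((flash - 1) << 16))
}
```

Dividers of 1, 2 and 3 from a 72MHz source give a 72MHz core, a 36MHz bus and 24MHz flash, all within the limits above.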
The Serial Port
Now that we have a stable clock, we can move on to configuring the serial port, or UART (Universal Asynchronous Receiver/Transmitter). For our needs the UART itself is fairly easy to program. Most of the challenge here will be in expanding our pin handling to support serial functionality.
More About Pins
When we first set up a GPIO pin, we had a couple of advantages:
- All pins can act as GPIOs, so we didn’t need any validation logic.
- All pins use the same mux value when configured as a GPIO.
For the UART, neither of these shortcuts holds. Our setup code will need to know both which pins can be used for serial communication, and which mux value will appropriately connect those pins to the UART.
Just as for the GPIO, we will add functions to the Pin struct to convert a pin to a serial-specific struct.
Each Tx or Rx instance includes which UART it is valid for. We could choose to encode this in the type system, but that sends us down a road of a lot of complexity (possibly including separate Structs for each UART). We still potentially panic when converting an invalid pin to an Rx or Tx instance, so the tradeoff of avoiding another panic when passing that pin to the wrong UART doesn’t seem worth it. Your needs might be different.
This code also references a new PortName::B. This is easy to add to the existing Port and GPIO code. The Port lives at 0x4004A000, and the GpioBitband is at 0x43FE0800.
We now have the code in place to set up pins B16 and B17 as UART pins. These map to pins 0 and 1 on the Teensy, the same as Serial1 when programming the Teensy with the Arduino IDE.
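The pin lookup can be sketched as a plain match over (port, pin) pairs. In the version below the Pin struct is a simplified stand-in with no hardware pointer, the conversion returns an Option rather than panicking, and the mux value of 3 for B16/B17 is my reading of the chip’s pin table. All of these are assumptions made so the example can run on the host.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum PortName {
    B,
    C,
}

/// Simplified pin: just a port name and a pin number.
pub struct Pin {
    pub port: PortName,
    pub pin: usize,
}

/// Receive and transmit pins, each tagged with the UART it belongs to.
#[derive(Debug, PartialEq)]
pub struct Rx(pub u8);
#[derive(Debug, PartialEq)]
pub struct Tx(pub u8);

/// Assumed mux setting that connects B16/B17 to UART0.
pub const UART0_MUX: u32 = 3;

impl Pin {
    /// Returns the Rx handle and the mux value to program, if this pin
    /// can act as a UART receive pin.
    pub fn make_rx(self) -> Option<(Rx, u32)> {
        match (self.port, self.pin) {
            (PortName::B, 16) => Some((Rx(0), UART0_MUX)),
            _ => None,
        }
    }

    /// Same idea for the transmit side.
    pub fn make_tx(self) -> Option<(Tx, u32)> {
        match (self.port, self.pin) {
            (PortName::B, 17) => Some((Tx(0), UART0_MUX)),
            _ => None,
        }
    }
}
```

Supporting more UARTs is then a matter of adding arms to the two match statements, each with its own UART number and mux value.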
The UART
The UART is our first struct that will require a complex ::new implementation. In addition to selecting which UART unit we will use, the method must handle several other parameters:
- An optional Rx pin, to enable receive functionality
- An optional Tx pin, to enable transmit functionality
- The clock divider, as an A,B pair. This is interpreted as A + B/32.
On the Teensy 3.5 or 3.6, you could choose to use a float for the third parameter. This would require you to then convert it to the integers used by the hardware module itself. It’s up to you if this extra work is worth the simpler interface.
This starts with validating the parameters: The Rx and Tx pins must be usable for this UART, and the clock dividers need to be in the acceptable range. The clock dividers are then passed to the hardware. Lastly, receive and transmit are enabled if appropriate pins were passed in, and the Uart reference is returned.
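The validation and the register split can be sketched without touching hardware. The version below assumes that A is a 13-bit value split across a high and a low register, and that B is a 5-bit fraction, which is how I read the reference manual. The UartConfig struct and the configure function are invented for the example, pins are represented only by the UART number they belong to, and errors are returned rather than raised as panics.

```rust
/// The values that would be written to the hardware (hypothetical struct).
#[derive(Debug, PartialEq)]
pub struct UartConfig {
    pub bdh: u8,  // high 5 bits of A
    pub bdl: u8,  // low 8 bits of A
    pub brfa: u8, // B, the fraction in 1/32 steps
    pub rx_enabled: bool,
    pub tx_enabled: bool,
}

/// Validate the parameters for UART `uart_id`. `rx` and `tx` carry the UART
/// number that each optional pin was set up for.
pub fn configure(
    uart_id: u8,
    rx: Option<u8>,
    tx: Option<u8>,
    clkdiv: (u16, u8),
) -> Result<UartConfig, &'static str> {
    if rx.map_or(false, |id| id != uart_id) {
        return Err("Rx pin belongs to a different UART");
    }
    if tx.map_or(false, |id| id != uart_id) {
        return Err("Tx pin belongs to a different UART");
    }
    // Assumed ranges: A fits in 13 bits and is non-zero, B fits in 5 bits.
    if clkdiv.0 == 0 || clkdiv.0 >= 8192 {
        return Err("clock divider A out of range");
    }
    if clkdiv.1 >= 32 {
        return Err("clock divider B out of range");
    }

    Ok(UartConfig {
        bdh: (clkdiv.0 >> 8) as u8,
        bdl: (clkdiv.0 & 0xFF) as u8,
        brfa: clkdiv.1,
        rx_enabled: rx.is_some(),
        tx_enabled: tx.is_some(),
    })
}
```

A divider pair of (468, 24) splits into a high part of 0x01, a low part of 0xD4 and a fraction of 24.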
String Output
To use the UART as an output device, we will implement the core::fmt::Write trait for it. This will enable the write! macro, and make it easy to use a UART for output from our panic handler.
Most of the functions in the Write trait have default implementations, which we will rely on. The only one we must implement is write_str. Our implementation writes each byte of the string in sequence, waiting in between for the hardware to indicate that it is ready for another byte. Once all the bytes are written, we wait for the Transmit Complete (TC) flag to be set, indicating that the UART is finished sending all the data.
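The trait side of this can be tried on the host with a fake UART that collects bytes in a Vec. FakeUart and its two always-true status methods are invented stand-ins for the real status flags; only the core::fmt::Write usage is meant to carry over to the real implementation.

```rust
use core::fmt::{self, Write};

/// Host-side stand-in for the UART: bytes land in a Vec instead of going
/// out over the wire.
pub struct FakeUart {
    pub sent: Vec<u8>,
}

impl FakeUart {
    /// On hardware this would read the "ready for another byte" flag.
    fn transmit_ready(&self) -> bool {
        true
    }

    /// On hardware this would read the Transmit Complete (TC) flag.
    fn transmit_complete(&self) -> bool {
        true
    }
}

impl Write for FakeUart {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        for byte in s.bytes() {
            // Wait until the transmitter can accept another byte.
            while !self.transmit_ready() {}
            self.sent.push(byte);
        }
        // Wait until everything has actually been sent.
        while !self.transmit_complete() {}
        Ok(())
    }
}

/// write! works once write_str exists; formatting comes for free.
pub fn greet(uart: &mut FakeUart, count: u32) -> fmt::Result {
    write!(uart, "Hello, World! {}\n", count)
}
```

Because write_fmt has a default implementation built on write_str, formatted arguments such as the counter above need no extra code.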
Hello, World!
Adding serial output to our main is now just a few lines of code. First, enable the Port B and UART clocks:
With those clocks enabled, we can grab a uart instance, and write a message:
This sets the UART clock divider as 468.75 (468 + 24/32). The baud rate will be 72MHz/(16*468.75), or 9600. This is a pretty standard rate, and will be easy to use with any serial adapter.
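The same calculation as a small helper, for checking other divider pairs. The function name is made up for this example. It scales the divider by 32 so everything stays in integers: clock / (16 * (A + B/32)) is the same as 2 * clock / (32*A + B).

```rust
/// Baud rate produced by an (A, B) divider pair, where the divider is
/// A + B/32 and the UART oversamples by 16. Returns None for a zero divider
/// or a fraction that does not fit in 1/32 steps.
pub fn baud_rate(clock_hz: u32, a: u32, b: u32) -> Option<u32> {
    if b >= 32 {
        return None;
    }

    // Divider scaled by 32, computed in 64 bits to avoid overflow.
    let scaled = 32 * a as u64 + b as u64;
    if scaled == 0 {
        return None;
    }

    // clock / (16 * scaled / 32) == 2 * clock / scaled
    u32::try_from(2 * clock_hz as u64 / scaled).ok()
}
```

With a 72MHz clock, (468, 24) gives 9600, and (39, 2) gives 115200 if a faster rate is wanted.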
Finally, we send a message across the serial port. How you receive this message will depend on the specific adapter you’re using. If you’re using an Arduino, open the serial console in the IDE and watch for the message to appear when you reset the Teensy.
Next Time
Now that we have good output capabilities, we will use them to make our panic handler useful. With good panics, we can then focus on cleaning up our hardware setup to be safer and more robust. Our goal will be to minimize the amount of unsafe code in main.