Well! I’ve been a busy bee recently. I took a small holiday (not holiday) to France a couple of days back for the 24 hour endurance Le Mans skate and lo and behold, the team I was taking part in, Nottingham BladeSoc Fast, won our category! We got to take home some nice ol’ sponsored t-shirts and celebrate our success with a nice cool Kronenbourg, well worth the minimal sleep, searing heat and blistered feet if you ask me.
On other notes, I start work for this summer on Monday (6-7-15) back at Imagination Technologies so I’ve been moving into my current digs for that. All of this however means minimal electronics progress these past few weeks.
Regardless of all these time consumers, I’ve managed to get down to learning some basic VHDL. What’s a better way of learning than writing a microcontroller, right? I’d love to say that it was a breeze and I managed it first time but that would be a complete lie. It took me a good eight attempts or so to get anything remotely useful though I feel I’m getting there now!
I’ve decided to go for a register based approach (as opposed to a stack based approach) featuring 4 CPU registers. The system itself works on 8bit unsigned data and each instruction word is 16 bits long (inefficient, I know!). I’ve allowed for up to 16 separate instructions, though at the moment only 14 are implemented, which leaves me two more for future improvements. I’m designing the microcontroller to be implemented on my Altera Cyclone IV FPGA so I’m not particularly strapped for logic space, though I can’t imagine my design is particularly efficient. Current specification:
- Maximum clock speed (calculated, 85degC worst case): 128.4MHz, potentially up to 136.6MHz at 0degC
- 256B ROM, 128B RAM with 128B of memory mapped peripheral (MMP) space
- 8bit GPIO port
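To make the memory split a bit more concrete, here is a minimal Python sketch of how an 8bit data address could be decoded. It assumes RAM sits at addresses 0 to 127 and the MMP space directly above it at 128 to 255. That layout is an assumption, though it fits the GPIO addresses (129 and 131) used in the tests later in this post. The names `decode_address` and `MMP_BASE` are made up for illustration.

```python
RAM_SIZE = 128        # bytes of RAM, from the specification
MMP_SIZE = 128        # bytes of memory mapped peripheral space, from the specification
MMP_BASE = RAM_SIZE   # assumption: peripherals start right after RAM


def decode_address(addr):
    """Return (region, offset) for an 8bit data address."""
    if not 0 <= addr < RAM_SIZE + MMP_SIZE:
        raise ValueError("address must fit in 8 bits")
    if addr < MMP_BASE:
        return ("RAM", addr)
    return ("MMP", addr - MMP_BASE)
```

With this layout, `decode_address(131)` lands in the peripheral space at offset 3, while anything below 128 stays in RAM.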
Obviously, it can only be programmed in assembly as of yet so the instructions are:
- NOP: No operation
- STR: Store value in register. This instruction can store a constant stored in the instruction word into a register, or it can store one register in another register e.g. RA -> RB
- MOV: Move a value from memory (RAM/ROM/MMP) to a register
- WRR: Write the value of a register to memory (RAM/MMP)
- IDC: Increment/Decrement a register. A flag in the instruction word declares whether the register is incremented or decremented
- NND: NAND two registers together. The registers which are NAND’d are declared in the instruction word
- NOR: Exactly the same as above, just NOR’d instead of NAND’d
- SWP: Swap two registers
- SHF: Shift a register left or right by X. Direction and amount are defined in the instruction word
- EQU: Check if two registers are equal. If two registers are equal, set the first register to 255
- LTH: Same as EQU, but checks whether the first register is less than the second
- MTH: Same again, but checks whether the first register is more than the second
- JMP: Immediate jump to ROM location. This instruction sets the PC (program counter) to the value defined in the instruction word
- JPS: Conditional jump to ROM location. If a register is completely set (0xFF, 255d, 11111111b), the PC is set to the value defined in the instruction word
And that’s all! It might not seem like a lot of instructions but I’ve written a couple of programs and tested them on the real thing.
Each instruction (opcode) is accompanied by a 12bit vector of parameters (opdata). This 12bit vector specifies what is involved with the instruction.
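As a rough illustration of the 4bit opcode plus 12bit opdata split, here is a small Python sketch that packs the two fields into a 16 bit word. Only the three opcode values that are visible in the listings in this post (STR, WRR and JMP) are included. How registers and flags are packed inside opdata isn’t spelled out here, so the sketch treats opdata as one opaque 12bit number. `encode` and `as_bits` are hypothetical helper names.

```python
OPCODE_BITS = 4
OPDATA_BITS = 12

# Only the opcode values that appear in the listings in this post.
OPCODES = {"STR": 0b0001, "WRR": 0b0011, "JMP": 0b1110}


def encode(mnemonic, opdata):
    """Pack a 4bit opcode and a 12bit opdata field into a 16 bit word."""
    if not 0 <= opdata < (1 << OPDATA_BITS):
        raise ValueError("opdata must fit in 12 bits")
    return (OPCODES[mnemonic] << OPDATA_BITS) | opdata


def as_bits(word):
    """Format a word as 'oooo dddddddddddd', matching the listings."""
    bits = format(word, "016b")
    return bits[:OPCODE_BITS] + " " + bits[OPCODE_BITS:]
```

For example, `as_bits(encode("JMP", 2))` gives `1110 000000000010`, the same bit pattern as the jump in the GPIO toggle program.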
A favourite test: GPIO Toggle speed!
A quite fun and relatively easy program to test is seeing how fast the GPIO can toggle. I normally do this as my first test on proper microcontrollers so I can look at the GPIO speed vs the clock speed of the microcontroller. This gives me a brief look at how efficient the architecture is. In this instance, I’ll be using the memory mapped GPIO and writing to it to toggle a GPIO pin.
So! Toggling a GPIO pin requires 7 instructions. Initially, the GPIO pin needs to be set as an output. This is done by loading a register with the value 1 (GP0 will be used as the output). This value is then written to the memory mapped GPIO. For some reason however, the GPIO section doesn’t work properly yet and I need to invert the data bus for the correct data to be stored in the registers. I have absolutely no idea why this happens. Nonetheless, these two instructions look like:
STR A 1 – 0001 000000000001
WRR A 129 – 0011 000010000001
Now the pin has been initialized. The next section of code is literally just toggling the pin. To toggle the pin, a 0 is loaded into register A then written to the memory mapped GPIO output register. A 1 is then loaded into register A and this is written to the same GPIO output register. An immediate jump is then executed to jump back to loading the 0. Due to the jump instruction, the duty cycle won’t be 50%: the pin will be on for longer (load + store + jump) than it will be off (load + store).
STR A 0 – 0001 000000000000
WRR A 131 – 0011 000010000011
STR A 1 – 0001 000000000001
WRR A 131 – 0011 000010000011
JMP Loc 2 – 1110 000000000010
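The uneven duty cycle can be checked with a tiny Python model of the seven instructions above. This is only a sketch: it counts instructions rather than clock cycles, it ignores the data bus inversion quirk, and it assumes address 131 is the GPIO output register as described. `run` and `durations` are hypothetical names.

```python
PROGRAM = [
    ("STR", "A", 1), ("WRR", "A", 129),   # set GP0 as an output
    ("STR", "A", 0), ("WRR", "A", 131),   # pin low
    ("STR", "A", 1), ("WRR", "A", 131),   # pin high
    ("JMP", 2),                           # back to loading the 0
]


def run(program, steps, gpio_out_addr=131):
    """Execute `steps` instructions; return [(step, value)] for GPIO output writes."""
    regs = {"A": 0}
    pc = 0
    writes = []
    for step in range(steps):
        op = program[pc]
        pc += 1
        if op[0] == "STR":        # load a constant into a register
            regs[op[1]] = op[2] & 0xFF
        elif op[0] == "WRR":      # write a register to a memory address
            if op[2] == gpio_out_addr:
                writes.append((step, regs[op[1]]))
        elif op[0] == "JMP":      # immediate jump
            pc = op[1]
    return writes


def durations(writes):
    """(level, instruction count) held between consecutive output writes."""
    return [(value, nxt - step)
            for (step, value), (nxt, _) in zip(writes, writes[1:])]
```

Running `durations(run(PROGRAM, 12))` shows the pin held low for 2 instructions and high for 3, which is the load + store vs load + store + jump imbalance described above.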
As with writing the instruction spec, it’s a lot easier to use Excel to write the programs too! The programs are stored in the ROM of the FPGA and this ROM is used for execution.
To make sure I never have to wait too long for compilation and code loading, I do all my simulations in Quartus using the vector waveform method. While it’s not the most efficient, it’s an easy way of setting up simulations.
Having seen that it works in simulation, it’s time to test it on the real thing!
From the above image, it can be calculated that toggling a pin takes an average of 12.5 clock cycles. The reason for this is that not every instruction is single cycle. Memory writes take 2 cycles whereas register writes take just 1.
The next test: Software PWM!
Normally, a dedicated PWM peripheral is used, for example a timer linked to a GPIO output, like in the STM32 series of microcontrollers. PWM can also be generated (slowly) in software using a counting variable, a comparison variable and GPIO writes. As I’ve not yet implemented a timer peripheral, I’ll be using the second method to generate some PWM. As my microcontroller is an 8 bit system, any variable will wrap at 255 back to 0. This allows us to essentially generate a free running counter that will count from 0 to 255, overflow and return back to 0. I will be using one of the registers as a software counter and another register to contain a constant for comparison. In pseudo code, a software PWM function would look like so:
Comp = 110; //PWM comparison, output duty cycle should be 110*100/255 = 43.14%
while(1) {
  if(Cntr<Comp) GPIO = 1;
  else GPIO = 0;
  Cntr++; //8 bit counter, wraps from 255 back to 0
}
This is relatively easy to implement on my microcontroller and can be done in 13 instructions. The instruction count would increase however if you wanted to reduce the PWM resolution (e.g. overflowing at a lower value instead of 255). Some care needs to be taken when doing register comparisons (equal, more than and less than) because if the check is true, the whole register gets set, overwriting what was previously in the register. This is a problem here: whenever the counter (Register A) is less than the compare value (Register D), A would be set to 255 when the less than instruction is executed, which is bad as that would lose our position in the PWM count! This is solved by copying register A to register C and doing the comparison between register C and D. This is fine as register C isn’t used for any of the PWM generation and was essentially a spare register.
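Here is a small Python sketch of that register juggling. It is a model under assumptions, not the actual 13 instruction program: it assumes the loop copies A into C, runs LTH on C and D, drives the pin high when C reads 255 (a JPS style test), and that every pass through the loop takes the same time. `lth`, `pwm_levels` and `duty` are hypothetical names.

```python
def lth(first, second):
    """LTH as described above: if first < second, first becomes 255."""
    return 0xFF if first < second else first


def pwm_levels(comp):
    """GPIO level for each counter value 0..255, using the copy-to-C workaround."""
    levels = []
    a, d = 0, comp                  # A: free running counter, D: compare value
    for _ in range(256):
        c = a                       # copy A into the spare register C
        c = lth(c, d)               # the comparison clobbers C, A is untouched
        levels.append(1 if c == 0xFF else 0)   # JPS style test: is C fully set?
        a = (a + 1) & 0xFF          # 8 bit counter wraps from 255 back to 0
    return levels


def duty(comp):
    """Fraction of one counter period the pin spends high in this model."""
    return sum(pwm_levels(comp)) / 256
```

Calling `lth(5, 160)` directly on the counter returns 255, which is exactly the lost count problem. One side effect in this model is that a counter value of 255 looks the same as a true comparison, so `duty(160)` comes out as 161/256 (about 62.89%) and `duty(255)` as 100%. Whether the real program behaves this way depends on how it is actually written.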
From the above, it can be seen that we’re setting the comparison value to 160 (in register D). This should theoretically give us a duty cycle of 160*100/255 = 62.7%. The microcontroller is currently running off a 50MHz clock. My good ol’ logic analyzer tells us that this is…
…62.917%! Not bad compared to what we wanted! As expected, setting D to 255 gives us a duty cycle of 100%.
Finally, the smallest achievable pulse width is obtained by setting the comparison value to 1. This measures as a pulse of width 0.625us, though that figure probably says more about aliasing in my logic analyzer than about the real pulse.
I’m going to continue improving this processor as it really is a great learning exercise for figuring out the many niggles I’ve come across with VHDL. I’ve managed to prove that it can actually do something useful to us as humans too which is always nice! While it might not have been efficiently coded, at least it fit on my FPGA and has a somewhat understandable instruction set… BTW, my VHDL sucks along with my assembly!