What do you get when you combine an FPGA with some SDRAM and an OV7670 camera, transferring uncompressed frames through UART? Really poor performance, reliability and a pretty useless end product…
Just because it sucks doesn’t mean it isn’t worth doing! I’ve had my first exam and I’ve wanted to crack on with this one for a while – I had a go at this kind of thing during the summer and was able to store frames in BRAM, allowing for images of 160×120 (8bpp) to be stored (160*120*8 = 153600 bits of the 276480 that my Cyclone IV has). It was pretty cool seeing pictures streamed from the camera through UART to my laptop but kinda poor knowing the maximum resolution I could use was a quarter of the camera’s 640×480 in each dimension (1/16th of the pixels). At this point, I hadn’t written an optimised SDRAM controller (I’d written a poor one but that’s about it).
Now, however, I’ve written a nice SDRAM controller which happily runs at 96MHz with a pretty reasonable throughput (2 cycles/write on open rows, more [3 for CAS2] for reads). Combining this with my new love for throwing FIFOs everywhere and a bit of VHDL to capture signals leads to a camera which can store frames in an SDRAM buffer. Add in a UART handler and my final year project UART and you have a UART-connectable webcam!
As can be expected, UART is slow. The maximum that most USB-based UART converters can support is generally 12MBaud. 12MBaud translates to 1.95fps ([12M/10]/[640*480*2]) for 16bpp images, shocking! Even worse is the fact that the PL2303HX on my FPGA development board is obviously a fake and doesn’t support 12MBaud transfer rates. The maximum I can get out of it reliably seems to be 1228800 Baud, though if I don’t mind the odd frame dying, I can run it at 6MBaud. 6MBaud can display 640x480x16bpp frames at just under 1fps in ideal conditions – I was only able to achieve around 0.5fps in practice.
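To make the arithmetic above easy to replay for other baud rates and formats, here is a minimal Python sketch. The function name `uart_fps` is made up for illustration, and the divide-by-10 assumes 8N1 framing (one start bit, eight data bits, one stop bit), which is what the 12M/10 in the calculation above implies. It gives best-case numbers only and ignores any protocol or host-side overhead.

```python
def uart_fps(baud, width, height, bytes_per_pixel, bits_per_byte_on_wire=10):
    """Best-case frames per second for uncompressed frames sent over UART.

    bits_per_byte_on_wire=10 assumes 8N1 framing (start + 8 data + stop).
    """
    bytes_per_second = baud / bits_per_byte_on_wire
    frame_bytes = width * height * bytes_per_pixel
    return bytes_per_second / frame_bytes


# 640x480 at 16bpp (2 bytes per pixel) for the three baud rates mentioned above
for baud in (1_228_800, 6_000_000, 12_000_000):
    print(f"{baud:>10} baud: {uart_fps(baud, 640, 480, 2):.2f} fps")
```

Dropping to one byte per pixel (RGB332) or to 320×240 scales the result up by the same factor the frame size shrinks by.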
This issue obviously comes from the fact that I’m trying to send uncompressed frames over a slow interface (6MBaud isn’t THAT slow considering early internet was 57.6kBaud). I’m currently looking into implementing some DCT-based compression to speed up transfers.
To receive frames, I’ve written a C# program which sends a single byte through UART and waits to receive a full frame. Upon receiving the frame, the program draws the bytes to a picture box or prints them to a rich text box, depending on which tab is selected. Either I suck at C# (exceptionally likely) or C# is a really really reallllllllly poor language. Running at 6MBaud (serial reception in a separate thread) with no scaling (i.e. receiving RGB565 and setting pixels on the picturebox) consumes around 28% of CPU resources, which makes my computer melt! Running at RGB332 decreases this load thanks to the reduced data throughput, as does running unscaled at small resolutions.
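For anyone wanting to decode the same pixel formats on the receiving end, here is a rough Python sketch of unpacking RGB565 and RGB332 into 8-bit-per-channel values. This is not the C# program described above; the function names are invented for illustration, and the byte order (high byte first for RGB565) is an assumption that depends on how the camera and the UART handler are configured.

```python
def rgb565_to_rgb888(hi, lo):
    """Unpack one RGB565 pixel (two bytes, high byte first: an assumption)."""
    value = (hi << 8) | lo
    r5 = (value >> 11) & 0x1F
    g6 = (value >> 5) & 0x3F
    b5 = value & 0x1F
    # Replicate the top bits into the low bits so full scale maps to 255
    r = (r5 << 3) | (r5 >> 2)
    g = (g6 << 2) | (g6 >> 4)
    b = (b5 << 3) | (b5 >> 2)
    return r, g, b


def rgb332_to_rgb888(byte):
    """Unpack one RGB332 pixel (3 bits red, 3 bits green, 2 bits blue)."""
    r3 = (byte >> 5) & 0x07
    g3 = (byte >> 2) & 0x07
    b2 = byte & 0x03
    r = (r3 << 5) | (r3 << 2) | (r3 >> 1)
    g = (g3 << 5) | (g3 << 2) | (g3 >> 1)
    b = b2 * 0x55  # 2 bits expanded to 8: 0, 85, 170, 255
    return r, g, b


print(rgb565_to_rgb888(0xF8, 0x00))  # pure red in RGB565
print(rgb332_to_rgb888(0b11100000))  # pure red in RGB332
```

RGB332 only has 8 levels each of red and green and 4 of blue, which is where the colour banding in the RGB332 captures comes from.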
160×120 RGB332 scaled and unscaled – hello university room
320×240 RGB332 unscaled and scaled
640×480 RGB332 and 640×480 RGB565, note the colour banding!
320×240 RGB565 unscaled and scaled
160×120 RGB565 unscaled and scaled
All scaling is performed through pixel skipping, so it isn’t particularly good, though it’s efficient and easy to implement on the FPGA side. Resources remain at ~1000/6272 regardless of scaling.
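Pixel skipping just means keeping every Nth pixel of every Nth row and throwing the rest away. Here is a minimal software sketch of the idea in Python. The name `skip_scale` and the list-of-rows frame layout are assumptions for illustration; the FPGA version works on the pixel stream instead, and this is not a description of the actual VHDL.

```python
def skip_scale(frame, step):
    """Downscale by keeping every `step`-th pixel of every `step`-th row.

    `frame` is a list of rows, each row a list of pixel values.
    No filtering or averaging is done, so expect aliasing.
    """
    return [row[::step] for row in frame[::step]]


# A dummy 640x480 frame where each pixel records its own (row, column)
full = [[(y, x) for x in range(640)] for y in range(480)]

half = skip_scale(full, 2)     # 320 x 240
quarter = skip_scale(full, 4)  # 160 x 120
print(len(half[0]), len(half))
print(len(quarter[0]), len(quarter))
```

Since no neighbouring pixels are blended, the hardware needs no line buffers, which fits with the resource usage not changing with the scaling factor.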
To ensure that maximum throughput is achieved with minimal pixel dropping, a FIFO sits between the OV7670 capture module and the SDRAM handler. The OV7670 capture module pushes pixels into the FIFO whenever a new one is ready and the SDRAM handler tries its hardest to ensure this FIFO is always empty. I was originally aiming to keep the FIFO less than half full but this introduces issues with emptying the FIFO at the end of the frame – basically more effort than I was willing to put in at the time. By trying to keep it empty, the SDRAM handler does more work, though it generally means pixels get written the second they enter the FIFO. When the SDRAM handler isn’t trying to write pixels, it looks for read requests from the UART handler. These read requests grab pixels from the SDRAM and send them to the C# program. The main components are therefore the OV7670 capture module, the SDRAM handler, the SDRAM controller (which converts read/write requests into SDRAM-happy signals), the UART handler and the UART module.
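The write-first priority described above can be modelled in a few lines of Python. This is only a behavioural sketch: the names (`sdram_handler_step`, `pixel_fifo`, `read_requests`, `write_addr`) are hypothetical, the SDRAM is a plain dict, and none of the real timing (row activation, CAS latency, refresh) is modelled.

```python
from collections import deque


def sdram_handler_step(pixel_fifo, read_requests, sdram, state):
    """One arbitration decision: drain the camera FIFO first, else serve a read."""
    if pixel_fifo:
        # Camera pixels always win, so the FIFO is kept as empty as possible
        sdram[state["write_addr"]] = pixel_fifo.popleft()
        state["write_addr"] += 1
        return ("write", None)
    if read_requests:
        # Only when the FIFO is empty does the UART side get a turn
        addr = read_requests.popleft()
        return ("read", sdram.get(addr))
    return ("idle", None)


pixel_fifo = deque([0x12, 0x34])  # two pixels waiting from the capture module
read_requests = deque([0])        # UART handler wants the pixel at address 0
sdram = {}
state = {"write_addr": 0}

for _ in range(4):
    print(sdram_handler_step(pixel_fifo, read_requests, sdram, state))
```

In this run both pending pixels are written before the read request is served, and the handler goes idle once both queues are empty.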
System block diagram
This is all in preparation for my soon-to-be stereo camera project. The code for this is integrated with a lot of Altera components (FIFO and PLL) but it’ll be on my GitHub anyway!