Sunday, December 23, 2012

ZPUino - The "was", "is" and "will be"

Ok, guys, I'm not a big fan of copy/paste, but here's a copy of what I just posted to the ZPU mailing list:

Hi guys,

Since it's almost Christmas it's perhaps time to get you all updated about ZPUino: what has been done and accomplished so far, what is being done right now, and
what the future holds.

The ZPUino project started back in 2010 and published its first alpha release in December of the same year. The objective of the project was to implement an Arduino
(Wiring) compatible platform, but running on a ZPU core with devices similar to those present on Arduino AVR devices. The project developed in several phases,
with several hardware versions for each phase. It started as a simple SoC using the traditional ZPU core, with some basic devices like UART and SPI. A
software bootloader/programmer was also implemented, using the standard serial port and a variant (very variant) of the HDLC protocol for communication with
programmer devices. ZPUino was designed to bootstrap its "sketches" from an external SPI flash, and the logic for programming those flash devices was split between
the host programmer (which is now known to run on the major operating systems, like Microsoft Windows, Linux and Mac OS) and the device programmer.
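
To give an idea of what HDLC-style framing over a serial port looks like, here is a minimal sketch of the classic byte-stuffing scheme (flag 0x7E, escape 0x7D, XOR 0x20). This is the textbook asynchronous framing, not the actual ZPUino bootloader protocol, which is described above only as a "very variant" of HDLC. The function name hdlc_frame is hypothetical, and the checksum a real link would carry is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define HDLC_FLAG   0x7E  /* frame delimiter */
#define HDLC_ESCAPE 0x7D  /* escape marker */
#define HDLC_XOR    0x20  /* XORed into a byte that had to be escaped */

/* Hypothetical helper: writes a framed copy of `payload` into `out`.
 * Returns the number of bytes written, or 0 if `out` is too small.
 * No checksum is added; a real protocol would append one. */
size_t hdlc_frame(const uint8_t *payload, size_t len,
                  uint8_t *out, size_t out_cap)
{
    size_t n = 0;

    if (out_cap < 2)
        return 0;
    out[n++] = HDLC_FLAG;                     /* opening flag */

    for (size_t i = 0; i < len; i++) {
        uint8_t b = payload[i];
        if (b == HDLC_FLAG || b == HDLC_ESCAPE) {
            /* need two bytes now plus one for the closing flag */
            if (n + 2 >= out_cap)
                return 0;
            out[n++] = HDLC_ESCAPE;
            out[n++] = (uint8_t)(b ^ HDLC_XOR);
        } else {
            /* need one byte now plus one for the closing flag */
            if (n + 1 >= out_cap)
                return 0;
            out[n++] = b;
        }
    }

    out[n++] = HDLC_FLAG;                     /* closing flag */
    return n;
}
```

The receiver does the reverse: it waits for a flag, un-escapes bytes until the next flag, and then hands the payload to the programmer logic.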

Everything was set up to allow almost seamless migration of Arduino code into ZPUino code.

During this first phase the Arduino IDE/Wiring library was adapted to support ZPUino, and a new compiler mode was implemented, since the IDE did not support
multiple platforms at the time (it does now, but I still keep the "make" approach I designed back then).

The second phase relied on hardware design. A new core was implemented (ZPUino Premium), which had a full 3-stage pipeline and was able to execute most basic
instructions in one clock.
Some new core devices were also added, like Audio (sigma-delta), and complex PWM-able timers. The main IO interface is wishbone compliant, so any wishbone
compliant device should work with the design (I've tested a few, like the OpenCores I2C, and they work like a charm). A few design variants were written, like memory
mapped VGA, DMA VGA (such as the ZX Spectrum version), audio synthesis, and many more. But only internal RAM (BRAM) was supported.

There was a singular variant of this design, one which actually implemented a new instruction (which I called FMUL16), which could perform a 16.16 fixed point
multiplication, and speed up some operations. This variant was used in the SoundPuddle project.
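
For those not used to fixed point, here is a small C sketch of what a 16.16 multiplication computes. It is only a software model: the exact behaviour of the FMUL16 instruction (sign handling, rounding) is not described here, so the signed, truncating version below is an assumption, and the names fix16_t and fix16_mul are hypothetical.

```c
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits are the integer part,
 * lower 16 bits are the fraction. 1.0 is 0x00010000. */
typedef int32_t fix16_t;

#define FIX16_ONE ((fix16_t)1 << 16)

/* Assumed semantics: signed multiply, result truncated.
 * The product of two 16.16 values is a 32.32 value, so it is
 * shifted right by 16 to get back to 16.16. Note that >> on a
 * negative value is implementation-defined in C (arithmetic
 * shift on common compilers). */
fix16_t fix16_mul(fix16_t a, fix16_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (fix16_t)(wide >> 16);
}
```

In plain C on a 32-bit core this needs a 64-bit intermediate product, which is exactly the kind of work a dedicated instruction can save.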

Let me now tell you about the SoundPuddle project.

Back in April this year (2012), I was contacted by John English from Colorado, US, asking if ZPUino could do real-time signal analysis for a project he wanted
to show at Apogaea 2012.
After some initial analysis I said it was feasible, and so we moved to implement the thing on ZPUino on an S3E500 board (Papilio One), from Gadget Factory. It
was indeed feasible, and it was a huge success. It was improved and shown at the Burning Man festival the same year. Feedback was awesome.

For some low-level details on this one:

A 1024-point FFT was implemented in software, whose inputs came from an external ADC. The FFT code was entirely done in assembly code (a whopping 177 bytes!),
using the FMUL16 instruction. This was fast enough for what the project needed (actually, it ended up being too fast, and we had to add some delays). The real
constraint here was the amount of memory available on the device. The system ran with around 40KB. Tough, but possible.

Intro video for Kickstarter is here:

Almost at the same time, Jack Gassett (from Gadget Factory) started the Retrocade Synth project. It now uses the Extreme core, as described below.

Both projects were successfully funded, and are now shipping to their supporters.

Back to the design:

The core, due to its pipelined design, required fast memory since it needed to simultaneously read the instruction stream, read stack values and write back
the stack. And we were very limited on block RAM, so it was time to move to another design.

ZPUino Extreme was then born.

ZPUino Extreme took another approach - it used block RAM for the stack (which was fixed, 4KB or 8KB), and used external memory for the program area and data. In
order to do so, we designed memory interfaces (SRAM, SDRAM and DDR-SDRAM), all working in wishbone pipelined mode, and added a simple, direct-mapped instruction
cache. This allowed us to run larger codebases, and access more memory than usual. This is still the fastest core if you need large code/data, and can live with
the limited, non-switchable stack. For most single-task applications, this is indeed the core you need.

But for complex designs this was still not enough. The fixed, limited stack prevented us from running more complex applications. At first a simple
write-back-stack, read-new-stack approach was tried, but was somewhat complex, and very slow.

So, ZCoreV3 was born :)

Yes, I decided to change the name for the core. I was running out of acronyms :P - now, seriously, I thought a lot about the naming of ZPUino cores, and they
wouldn't cope with further development improvements, so I went radical.

First of all, ZCoreV3 is not yet in production, although it's considered (by me) stable. Its stability will be proven during the next months, although I'm feeling
confident. A few improvements are also being thought of, so it might take a while before a first stable version is available to you all.

So, what's so different about ZCoreV3 ? Well, something simple, but something very complex: the stack is no longer fixed.

Although this might look like a simple thing, it's indeed the most complex thing I did in hardware!!!

ZCoreV3 shares the same pipeline and instruction cache as ZPUino Extreme, and adds a data cache, direct-mapped, one-way associative, dual-ported, write-back,
which can in "hit" scenarios attain a 1-clock read delay, and 0-clock write delay. Only one of the ports is writeable, though. Conflicts (r/w) are handled by
the cache itself, so the core does not need to address that. The core is also slightly different, featuring not only TOS caching, but also NOS caching (but
TOS is always written back for stack push operations).  Further improvements are to identify "hot" cache lines (those being accessed as stack) and perform
write-through for some memory accesses (or eventually convert it to a two-way associative cache).
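
To make the "direct-mapped, write-back" part a bit more concrete, here is a small software model of the lookup in C. This is not the HDL: the geometry (32-byte lines, 64 lines) and all names are hypothetical, and the dual ports, fill and timing are ignored. It only shows how an address is split into tag and index, and when a dirty line has to be written back.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_BITS  5u                  /* hypothetical: 32-byte lines */
#define INDEX_BITS 6u                  /* hypothetical: 64 lines */
#define NUM_LINES  (1u << INDEX_BITS)

struct cache_line {
    bool     valid;
    bool     dirty;   /* written since it was loaded */
    uint32_t tag;
};

struct dcache {
    struct cache_line lines[NUM_LINES];
    unsigned writebacks;   /* dirty lines evicted so far */
};

/* Direct-mapped: each address can live in exactly one line. */
static uint32_t line_index(uint32_t addr)
{
    return (addr >> LINE_BITS) & (NUM_LINES - 1u);
}

static uint32_t line_tag(uint32_t addr)
{
    return addr >> (LINE_BITS + INDEX_BITS);
}

/* Returns true on a hit. On a miss the line is replaced, and a
 * dirty victim is counted as one write-back to external memory. */
bool dcache_access(struct dcache *c, uint32_t addr, bool is_write)
{
    struct cache_line *l = &c->lines[line_index(addr)];
    bool hit = l->valid && l->tag == line_tag(addr);

    if (!hit) {
        if (l->valid && l->dirty)
            c->writebacks++;
        l->valid = true;
        l->dirty = false;
        l->tag = line_tag(addr);
    }
    if (is_write)
        l->dirty = true;   /* write-back: memory is updated only on eviction */
    return hit;
}
```

In this model two addresses that share an index but differ in tag keep evicting each other. That is the kind of conflict a two-way associative cache, or special handling of "hot" stack lines, would reduce.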

So, since the ZCoreV3 design is able to address a lot of memory, with not many restrictions on its use (if any), we can probably put it to some real work....

... and it now runs Linux (MMU-less version)!

There are still some things needing implementation on Linux side (and uClibc), and a few stability issues, but things now look very promising.

I'm uploading a small video of it running on a Gadget Factory Papilio Pro board (S6LX9), with 8MB SDRAM, and a real SD card. You can see it here:

A few things still to address. Some stability issues need to be fixed (all of those are in software, possibly related to the kernel stack switch), some functions
(memcpy, memset, string functions) need some optimizations (i.e., assembler versions; memcpy already has one), the SPI controller is limited to 8-bit, which
makes it very slow (as you can see from the video, it takes some time to exec the first application), and some more, which I'll address. First, make it run
stable, then optimize.

I'm hoping to get this to run on S3ESK soon, at the same speed (96MHz), so you guys can also help (I know some of you have this board at home).

Plans for the future: oh, well, first, get Linux and other operating systems running stable, get DMA to work properly with the dcache, some new VGA
adaptors, what else....

Let's hope 2013 is a good year for ZPU and ZPUino.

A few thank-yous:

- To all ZPU and ZPUino users, we're doing this for you, thank you !
- My family, for their support (although they don't know what I'm doing! :P )
- Jack Gassett, and Gadget Factory, for their support with hardware and ideas! Thanks Jack!
- John English, the SoundPuddle Engineer, for the real-world use of ZPUino and a lot more!
- All those who helped with ZPUino, they are so many I won't risk forgetting anyone, so you're all included!
- All ZPU fans!

As always, any doubts, questions, opinions, so on, are very very welcome!

And have a merry Christmas!


PS: I'm not explaining something here - it's a challenge to your intellect and HDL knowledge :P I'll just say "data cache", hopefully someone will question how
it is possible. lol!

And merry Christmas to you all :)

Sunday, October 21, 2012

Small demo video of Linux and ZPUino

A small demo video of Linux running inside ZPUino emulator:

Some small tech details:

Linux 3.4.0 (uclinux/Linux MMU-less) with uClibc 0.9.29 and busybox 1.20.2.

Wednesday, October 17, 2012

Judge for yourselves

Hi guys, just leaving this here, so you can judge for yourselves what's going on:
console [ttySZ0] enabled, bootconsole disabled
ZPUINO: UART at 0x8800000, irq 1
brd: module loaded
loop: module loaded
Registering ZPUino SPI driver
ZPUino: probing for SPI controller
zpuino_spi zpuino_spi.0: master is unqueued, this is deprecated
zpuino_spi zpuino_spi.0: at 0x0A800000
ZPUino. SPI controller initialized 00281000
mousedev: PS/2 mouse device common for all mice
mmc_spi spi0.0: SD/MMC host mmc0, no DMA, cd polling
Waiting for root device /dev/mmcblk0p1...
mmc_spi spi0.0: setup: unsupported mode bits 4
mmc_spi spi0.0: can't change chip-select polarity
mmc0: card lacks mandatory switch function, performance might suffer.
mmc0: new SDHC card on SPI
mmcblk0: mmc0:0000 SD    30.8 GiB 
 mmcblk0: p1
VFS: Mounted root (ext2 filesystem) readonly on device 179:1.
Freeing init memory: 64K (1000 - 11000)

BusyBox v1.20.2 (2012-10-17 09:44:15 WEST) hush - the humble shell

/ # uname -a
Linux (none) 3.4.0-uc0 #533 PREEMPT Wed Oct 17 19:04:33 WEST 2012 zpu GNU/Linux
/ #

Wednesday, July 18, 2012

SoundPuddle, now also on KickStarter

Some of you might have already heard of SoundPuddle. The SoundPuddle is an interactive space of visual-acoustic synesthesia. This spectrographically colorful dome creates color and light from sound, illuminating every noise you make on an immersive canopy of light. You will laugh, shout, and sing as thousands of solar powered LEDs unify your ears and eyes.

This unique art piece was first created for the Apogaea Festival -- a collaborative outdoor arts and music event that is held in the beautiful mountains of Colorado. Its primary structure is a 24 foot (7.25m) wide dome - large enough to fit a band and their gear, yet small enough for a single person to play and explore.

At SoundPuddle's heart is a Papilio One 500 FPGA, running ZPUino and performing real-time signal analysis, using 10 SPI controllers to handle the almost 1700 RGB LEDs present. All this was possible due to close collaboration between John English, the project head, Jack Gassett, from Gadget Factory, and myself. The display at Apogaea was very successful, and a lot of positive feedback was received.

The project wants to grow, and for that it needs more funding. A Kickstarter project just started for this same reason:

You can find general information about the project at the website.

Again, a fantastic project, based on ZPUino, and with a lot of room for improvements! If you like the project, consider donating - we never know if SoundPuddle will someday be near you, so you can enjoy it too.


UPDATE: check also the Photo Stream! Some nice stuff there to see.

Thursday, June 7, 2012

RetrocadeSynth project - now on Kickstarter

A new ZPUino-based project on Kickstarter !

The RetroCade Synth boasts the capability to play the built-in Commodore 64 SID chip, the Yamaha YM-2149 chip, .mod files, and MIDI files - all at the same time! The RetroCade Synth can be played via any external MIDI control interface or via your favorite audio/sequencing software. We have built a custom VST software dashboard which gives you visual control over all the various parameters the synth has to offer.

An excellent initiative from Gadget Factory. Go and support Open Source Hardware!


Friday, May 11, 2012

1.0.1 slightly delayed


Just to let you know that the 1.0.1 release is slightly delayed - it was scheduled for May 13th, but due to a few (non-technical) reasons it's now due on May 15th.

This is what is expected:

  • The multiplier and shifter are now merged together. Smaller implementation.
  • Added a new ZPU instruction, FMUL16, opcode (01h), which performs a fixed-point 16:16 multiplication. Useful for FFT and other math computations.
  • Multiplier is now signed instead of unsigned
  • SPI controller is now able to do 16, 24 and 32-bit transfers
  • UART now has a bit indicating whether all transmission has ended. Needed to fix a bug when your sketch changed baudrate while it was still transmitting.
  • Updated UARTSTATUS register index on IDE
  • IDE now reports a better size for sketches, including program and code size.
  • Removed misinclusion of <new> in some headers
  • IDE can now build and include assembly (.S) files in the sketch folder
  • SPI interface is now directly connected on most boards, to speed up sampling.
  • Added new board (Papilio One 250 with extra 2Kb RAM). All P1 250 are compatible.
  • SmallFS was updated to use the new SPI multibyte transfer sizes.
  • A new HDL core: a 16:16 fixed point square root IP
As always, expecting your feedback.


Thursday, April 12, 2012

New ZPUino cores available

A few new ZPUino cores are now available. These are variants of the main design, and can be found in the "variants" folder of the relevant board. Additionally, you can find there the prebuilt bitfiles and some documentation, inside the "release/latest" subfolders.

New cores are:

P1 500 "Apollo" variant:
- Includes a YM2149 audio synthesizer

P1 500 "sid" variant:
- Includes an SID audio synthesizer

P1 250 "Apollo" variant:
- Includes a YM2149 audio synthesizer, and an extra serial port.

Other cores to follow in the next few days.


HDL Repository has moved

Due to some recent issues with repo.or.cz, I moved the HDL repository to GitHub.

To change your current checkout, do:

git remote add github git://
git fetch github
git branch --set-upstream-to=github/master master

All should work, I hope.


Tuesday, April 10, 2012

ZPUino 1.0 Released

Yes, that is true.

ZPUino 1.0 is now available for you to use and enjoy.

Expect some updates in the next few days - release cycles are about to change, we're switching to release early, release often.

As always, direct any questions to

All information in the usual place

Friday, March 16, 2012

Preparing release 1.0

I think it's time now to wrap up all bits and release a 1.0 version, which hopefully will be before the end of March.

What to expect from this release? Well, basically:
  • The new improved extreme core (things will go faster)
  • Smaller generated code for sketches
  • Upload-to-RAM feature
  • New timer infrastructure (means, PWM on steroids)
  • Easier to tailor to your own needs
  • New VGA interfaces
  • Optional I2C (integration from OpenCores)
  • Faster SmallFS implementation
  • Arduino 1.0 IDE
  • Eventually more memory... this depends on the board itself.
  • Per-board variants (like I2C, VGA)
  • Audio chips!! YM2149, Pokey and SID. Note that due to these "chip" sizes, it might not fit your FPGA.
  • Papilio Plus support (S6LX4 for now)
For the hardcore details:
  • New DMA interface (not controller). It's used for VGA, so it can use regular memory as framebuffer.
  • Better external memory integration. This one is still under test, but might make it.
  • PWMs now have 2 comparators (low and high). Also you can add as many PWMs as you wish per timer.
  • Decreased serial FIFO: means, more block ram memory available.
  • Other goodies. See official repo branch for more details.
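
One way to picture a PWM with two comparators is the small C model below. This is an assumption about how the "low" and "high" values could be used (output high while the timer counter sits between them), not a description of the actual timer registers. The struct, field and function names are all hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one PWM output attached to a timer. */
struct pwm_channel {
    uint16_t low;    /* assumed: count at which the output goes high */
    uint16_t high;   /* assumed: count at which the output goes low */
};

/* Output level for a given timer counter value. */
bool pwm_output(const struct pwm_channel *ch, uint16_t counter)
{
    return counter >= ch->low && counter < ch->high;
}

/* Number of counts per timer period during which the output is high. */
uint32_t pwm_high_counts(const struct pwm_channel *ch, uint16_t period)
{
    uint32_t n = 0;
    for (uint32_t t = 0; t < period; t++)
        if (pwm_output(ch, (uint16_t)t))
            n++;
    return n;
}
```

Under this model, two comparators let you place the pulse anywhere inside the period rather than always starting it at count zero, and several channels on the same timer can have different phases.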
Other stuff is in the forge, but might not make it for this release. I think I've delayed it for too long now. Release early, release often.

As always, feel free to send in your comments to

PS: Did I tell you I will also publish a ZX-Spectrum VGA compatible output, and that it will be accompanied by an "arduino sketch" Jet Set Willy game engine implementation (not a 100% implementation, but close)? Yes, it's true. And you can have fun making your own maps, sprites and variants! I'm just finishing some documentation, so you can go and make it a 100% implementation if you wish.


Thursday, January 12, 2012

How complex ?

How complex can a voltmeter be, with ZPUino, my yet-to-be-released SerPro3 software and a Gadget Factory ADC wing ?

Well, judge for yourselves:

This is ZPUino code (29 lines):

#include <SerPro3.h>
#include <SPI.h>
#include <Papilio.h>


Papilio::ADC8_Wing myadc;

void setup()
{
    myadc.begin( Papilio::Wing_A_Low );
}

uint16_t sample(int channel)
{
    return myadc.sample(channel);
}

void loop()
{
    if (Serial.available()) {
        // ...
    }
}



And this is PC code (23 lines, Linux):

#include <SerPro/SerPro-glib.h>


void go()
{
    while(1) {
        printf("Voltage: %.02f\n", ((double)sample(0)*3.3)/255.0 );
    }
}

int main(int argc,char **argv)
{
    // ...
}



And this is how it runs, with ADC connected (channel 0) to 1.2V power rail:

$ ~/sketchbook/remote/pc/remote /dev/ttyUSB1
Channel set up OK
Voltage: 1.20
Voltage: 1.20
Voltage: 1.20
Voltage: 1.20
Voltage: 1.20

Not much of a hassle, is it ?
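
For reference, the conversion used in the PC code can be isolated as a tiny C helper. It assumes what the formula above implies: an 8-bit sample (0 to 255) and a 3.3V full scale. The function and macro names are hypothetical.

```c
#include <stdint.h>

#define ADC_VREF      3.3     /* full-scale voltage used in the formula above */
#define ADC_FULLSCALE 255.0   /* largest 8-bit ADC code */

/* Converts a raw 8-bit ADC code into volts. */
double adc_to_volts(uint8_t code)
{
    return ((double)code * ADC_VREF) / ADC_FULLSCALE;
}
```

With these numbers one step of the ADC is about 13mV, and a raw code of 93 gives about 1.2035V, which prints as 1.20 with the "%.02f" format used above.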