ZPUino extreme (the smallest version of it) has been successfully implemented and tested on an FPGA. This core executes many instructions in a single cycle, which makes it a lot faster than traditional cores. It also keeps the stack separate from main memory, so main memory is now mostly free and available for a DMA engine.
It might need a small redesign, however, because the internal Block RAM is slow. Timing aside, things went very well. It still needs interrupt support, but adding that should not be very intrusive.
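
To give a rough feel for why a separate stack leaves main memory mostly idle, here is a toy Python model of a ZPU-style stack machine. It is only a sketch and not the actual ZPUino design: the `run` function, the access-counting rules, and the sample program are all assumptions made for illustration, and instruction fetches are ignored. It counts how often a small program touches main memory when the stack lives in main memory versus in its own RAM.

```python
# Toy model (hypothetical, not the real ZPUino): count main-memory accesses
# for a small stack program, with the stack either shared or separate.

def run(program, separate_stack):
    """Execute `program`; return (top_of_stack, main_memory_accesses)."""
    memory = {}        # main data memory: address -> value
    stack = []         # operand stack
    bus_accesses = 0   # accesses that occupy main memory

    def stack_access(count=1):
        # Stack traffic only hits main memory when the stack is stored there.
        nonlocal bus_accesses
        if not separate_stack:
            bus_accesses += count

    for op, arg in program:
        if op == "IM":        # push an immediate value
            stack.append(arg)
            stack_access()
        elif op == "ADD":     # pop two operands, push their sum
            b = stack.pop()
            a = stack.pop()
            stack_access(2)
            stack.append(a + b)
            stack_access()
        elif op == "STORE":   # pop address, pop value, write to main memory
            addr = stack.pop()
            value = stack.pop()
            stack_access(2)
            memory[addr] = value
            bus_accesses += 1
        elif op == "LOAD":    # pop address, push the value read from main memory
            addr = stack.pop()
            stack_access()
            stack.append(memory[addr])
            bus_accesses += 1
            stack_access()
        else:
            raise ValueError(f"unknown opcode: {op}")

    return (stack[-1] if stack else None), bus_accesses


# Sample program: compute 2 + 3, store it at 0x10, then load it back.
PROGRAM = [
    ("IM", 2), ("IM", 3), ("ADD", None),
    ("IM", 0x10), ("STORE", None),
    ("IM", 0x10), ("LOAD", None),
]

if __name__ == "__main__":
    print("shared stack:  ", run(PROGRAM, separate_stack=False))
    print("separate stack:", run(PROGRAM, separate_stack=True))
```

In this toy model the sample program makes 13 main-memory accesses with a shared stack and only 2 with a separate one (the explicit STORE and LOAD). The remaining slots are what a DMA engine could use; the real numbers on hardware will of course depend on the actual design.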