Lantana

ZX Spectrum retro game programming

If you are a child of the 1980s, you may remember the Sinclair ZX Spectrum. It was an affordable home computer that connected to a color TV set and used compact cassettes as mass storage.

My first computer was a Sinclair ZX-81. I learned BASIC and also Z80 assembler on it. Soon the ZX-81 was replaced by a ZX Spectrum. I programmed a lot, wrote all kinds of tools and a few demos. I always wanted to write a game together with my friends, but as teenagers we lacked the persistence to see such a project through to the end. Then, on the day I got my Amiga 500, I quickly lost all interest in my good old Spectrum.

But it’s never too late... I just started a tiny little game project called Coredump, written in Z80 assembler for the good old ZX Spectrum. Why? Just because I can. Because I always wanted to. And because retro programming is also a lot of fun!

This first article is about the tool chain I am using. I will add more articles as the game grows and is (hopefully) completed some day.

Back in the 80s, programming in assembler on the ZX Spectrum was a very tedious task. I had to deal with cassette tapes (and their very slow access) and an assembler that itself consumed some of the scarce RAM, and I had no tools that simplified the development process. When I made a mistake and crashed the Spectrum, I had to reload the assembler, the source code and the resources from tape. I also often lost some of my work: with tapes, saving source code is much more work than just pressing Ctrl-S, so I would rather risk having to retype the changes after a crash.

Today it’s much easier to write retro software. I can develop it on my Linux machine, which is very fast and has a lot of storage space. I use a modern text editor and a lot of powerful tools. For testing, I just need to assemble a snapshot file and run it on an emulator, which takes less than a second. If the emulator crashes, no work is lost.

These are the tools I use for programming. All of them are available for Linux and macOS, some also for Windows.

  • Fuse is an excellent ZX Spectrum emulator with very precise timing.
  • A decent editor. I started with Atom, but now I am using Eclipse because it fits my workflow better. Just use your favorite editor.
  • zasm is a nice Z80 cross assembler that is also able to generate SNA files that run on the emulator.
  • Multipaint is an open source drawing tool that handles the limitations of the ZX Spectrum graphics (and believe me, there are limitations). It turned out not to be so useful for sprite and tile generation, because it does not offer precise control of the paper and ink colors used in the generated screen file.
  • So I also use GIMP for pixeling sprites and tiles. Maybe I will also use Inkscape later.
  • Tiled is an excellent map editor. I use it to design the world of my game.
  • Some self-made helper tools convert the graphics and the world into the binary format that is used in the game. I use Java for these tools, just because I am most proficient with Java. There is no technical reason for that; just use the language you feel most comfortable with.
  • Finally, I use Ant to stitch all the parts together and run the snapshot file.

The ZX Spectrum hardware is very simple and easy to understand (which also means that you have to do a lot of things without hardware aid). The Z80 processor has a simple instruction set. So retro programming is not just for old-timers, but also for younger people who are interested in a first look at the hardware level of computers. It is also fun to get the most out of limited and slow hardware.

There is a lot of documentation available on the net:

  • World of Spectrum has a lot of hardware documentation in the references section.
  • A quick overview of all Z80 instructions and their timings.
  • A commented ROM disassembly gives a first look at Z80 assembler, and also offers some useful functions (such as multiplication; the Z80 itself has no multiply or divide instructions).

When I started looking for resources on the ZX Spectrum, I was surprised at how active the retro scene is. There are a lot of blogs offering tutorials that explain hardware tricks, and there is even a demo scene showing you things you’d never have thought possible on that machine. After all, the Speccy is almost 35 years old now, and wasn’t famous for powerful hardware even in its day.

Optimizations

On a slow processor like the Z80, it is essential to think about execution time. Often a clean approach is too slow, and you need to optimize the code to make it a lot faster.

The ZX Spectrum screen bitmap is not linear. The 192 pixel rows are divided into three sections of 64 pixel rows, each covering 8 character rows. Within a section, the first pixel rows of all 8 character rows come first, followed by their second pixel rows, and so on. The advantage is that when writing characters to the bitmap, you only need to increment the H register to reach the next pixel row of the character. The disadvantage is that pixel-precise address calculation is hell.

This is how the coordinates of a pixel are mapped to the address:

Register  H                       | L
Bit       15 14 13 12 11 10  9  8 |  7  6  5  4  3  2  1  0
Value      0  1  0 Y7 Y6 Y2 Y1 Y0 | Y5 Y4 Y3 X7 X6 X5 X4 X3

X2, X1 and X0 give the position of the pixel within the byte at that address. This value can be used as a counter for right-shift operations.
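To make the bit layout concrete, the mapping can be written down as a small Python reference model. This is only a sketch for checking results on the development machine, not code from the game; the function name `pixel_address` is hypothetical, and it assumes 0 <= x <= 255 and 0 <= y <= 191.

```python
def pixel_address(x, y):
    """Return (address, shift) for pixel (x, y) in the Spectrum bitmap.

    Reference model only; pixel_address is a hypothetical helper name.
    'shift' is X2-X0: how many right shifts move the leftmost pixel
    mask (%10000000) onto the wanted pixel.
    """
    if not (0 <= x <= 255 and 0 <= y <= 191):
        raise ValueError("coordinates outside the 256x192 bitmap")
    # H = 0 1 0 Y7 Y6 Y2 Y1 Y0
    h = 0b01000000 | ((y & 0b11000000) >> 3) | (y & 0b00000111)
    # L = Y5 Y4 Y3 X7 X6 X5 X4 X3
    l = ((y & 0b00111000) << 2) | (x >> 3)
    return (h << 8) | l, x & 0b00000111
```

With this model, pixel (0, 0) maps to address 0x4000 and pixel (0, 1) to 0x4100, so moving down one pixel row inside a character only increments H. Pixel (0, 8), the top of the next character row, lands at 0x4020 instead.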

My first attempt was straightforward code that shifted, masked and moved the bit groups into the correct places. It took 117 cycles. That is not bad, but we can do better.

We need a lot of rotation operations to shift the bits to the right position. Rotation is a rather expensive operation on a Z80, because there are no instructions that rotate by more than one bit at a time. My idea was to divide the X coordinate by 8 (by rotating it three times to the right) and simultaneously shift Y3 to Y5 into the L register. With a similar trick, I could set bit 14 while rotating, which saved me an extra OR operation with a constant.

This is the final optimized code. It takes the X coordinate in the C register, and the Y coordinate in the B register. The screen address is returned in the HL register pair. BC and DE are unchanged, so there is no need for expensive push and pop operations.

pixelAddress:   ld      a, b
                and     %00000111
                ld      h, a    ; h contains Y2-Y0
                ld      a, b
                rra
                scf             ; set bit 14
                rra
                rra
                ld      l, a    ; l contains Y5-Y3
                and     %01011000
                or      h
                ld      h, a    ; h is complete now
                ld      a, c    ; divide X by 8
                rr      l       ; and rotate Y5-Y3 in
                rra
                rr      l
                rra
                rr      l
                rra
                ld      l, a    ; l is complete now
                ret

It only takes 108 cycles, including the ret. Optimizing saved me 9 cycles (or about 8%). This doesn’t sound like much, but if the code is invoked in a loop, those 9 cycles are multiplied by the number of loop iterations.
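One way to double-check both the result and the cycle count without an emulator is to step through the routine in a small Python model. The sketch below mirrors the listing above instruction by instruction, using the documented Z80 timings (4 T-states for register loads, rra, scf and or h, 7 for and with a constant, 8 for rr l, 10 for ret). All function names are hypothetical, and the model only tracks the carry flag, which is the only flag the routine depends on.

```python
def reference_address(x, y):
    """Bit layout written out directly: H = 010 Y7 Y6 Y2 Y1 Y0, L = Y5 Y4 Y3 X7-X3."""
    h = 0x40 | ((y & 0xC0) >> 3) | (y & 0x07)
    l = ((y & 0x38) << 2) | (x >> 3)
    return (h << 8) | l


def simulate_pixel_address(x, y):
    """Model of the pixelAddress routine; returns (HL, T-states used)."""
    carry = 0
    t_states = 0

    def rotate(value):
        """rra / rr r: rotate right through the carry flag."""
        nonlocal carry
        result = (carry << 7) | (value >> 1)
        carry = value & 1
        return result

    b, c = y, x
    a = b                    # ld a, b
    t_states += 4
    a &= 0b00000111          # and %00000111 (and resets carry)
    carry = 0
    t_states += 7
    h = a                    # ld h, a
    t_states += 4
    a = b                    # ld a, b
    t_states += 4
    a = rotate(a)            # rra
    t_states += 4
    carry = 1                # scf
    t_states += 4
    a = rotate(a)            # rra
    t_states += 4
    a = rotate(a)            # rra
    t_states += 4
    l = a                    # ld l, a
    t_states += 4
    a &= 0b01011000          # and %01011000
    carry = 0
    t_states += 7
    a |= h                   # or h (or resets carry as well)
    carry = 0
    t_states += 4
    h = a                    # ld h, a
    t_states += 4
    a = c                    # ld a, c
    t_states += 4
    for _ in range(3):
        l = rotate(l)        # rr l
        t_states += 8
        a = rotate(a)        # rra
        t_states += 4
    l = a                    # ld l, a
    t_states += 4
    t_states += 10           # ret
    return (h << 8) | l, t_states


def check_all():
    """Compare the model with the bit layout for every screen coordinate."""
    for y in range(192):
        for x in range(256):
            address, t_states = simulate_pixel_address(x, y)
            assert address == reference_address(x, y)
            assert t_states == 108
    return True
```

Calling `check_all()` walks all 256 × 192 coordinates. Because the routine has no branches, the cycle count is the same for every input.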

I claim this is the fastest solution without resorting to a lookup table. Try to beat me! 😁