You may remember Epic Comeback, the small but punchy Commodore 64 demo I made last summer. There is a utility called C64 Debugger which, among other things, can visualize what's happening in the C64's memory while a program is running. This can reveal some interesting stuff, even if you are not that much into programming in assembly. I made a video of it, check it out in full screen!
Some of the observations that you can make...
‐ The memory activity colors are: Blue for reading, Red for writing, Green for mixed reading and writing.
‐ Upon a hard reset the Commodore 64's memory is initialized with a 00 00 00 00... FF FF FF FF... byte pattern, so you can see stripes in the visualized memory.
‐ When the demo is first loaded from disk, it takes up about 1/3rd of the 64K RAM. It's compressed though. First the decompressor copies the whole compressed data to the top of the memory, then starts working on it and fills up the memory with decompressed data from the bottom up. At one point the decompressed data starts overwriting the original compressed data (red line) while the rest of the data is still being decompressed (blue line). Red catches up at the end, just as the decompression ends, filling up nearly the whole 64K RAM of the Commodore 64. This is so much data that the C64 kernel couldn't load it from disk in a decompressed state by default.
‐ The chessboard animation is not too interesting in memory activity, it's mostly a character set based animation with a bunch of video matrices and video bank switching. On the bottom of the screen you can see the constant update of those zero page variables that the animation is using.
‐ When the animation goes away, the demo generates a bunch of wave tables for the plotter, this takes up about 1/4th of the RAM in the middle of the visualizer. The rest of the memory is filled up with two generated executable code blocks, one for each video buffer. The effect is redrawn on each screen refresh, but uses so much CPU time (well, all of it) that double buffering is necessary to avoid screen tearing.
‐ In the middle you can see the X and Y double-sine wave of each of the 128 plots (this looks awesome in the memory visualizer) applied to the lookup wave tables, to figure out where to draw the plots in the video memory. This is all reading, so it's shown with Blue color. The plotter drawing is pure writing, so it's shown with Red color above the waves.
‐ The generated plotter drawing code gets executed and self-modified all the time, as you can see it from the activity on the bottom part of the memory visualizer. As the plotter slides out of the visible screen area, less and less plots need to be drawn, so the memory darkens a bit until the next plotter configuration slides up into view.
‐ I did a hard reset at the end, so you can see how the memory is cleared up before the kernel initializes the BASIC interpreter, which is the default user interface of the Commodore 64.
As I mentioned before, when I started out developing Retro Assembler, I had a demo in mind that I wanted to make with it as a starter project. I've done it, took some time, I worked on it between July 25 and August 10 – finished it just in time for the 19th Árok Party, a cozy 8-bit gathering held in Ajka, Hungary. I couldn't be there but some of my friends were, the demo was sent in for the Commodore 64 demo competition and to my surprise it won 1st place. I couldn't be happier with how things turned out with this little passion project, it was quite the confirmation that it was worth the time I spent on it.
I uploaded a video capture to Youtube (and this way I have my own Youtube channel, yay!), check it out! The imperfections are due to the 50Hz system vs 60Hz video capture differences.
Now if you're interested, I can tell you about some technical details. As you may know, the Commodore 64 has 64 kilobytes of RAM, that's about the size of a blank Microsoft Word document before compression. That's a challenge on its own. The demo, without loading from disk, uses way more RAM than that, about 90K to be exact, by generating most code and data for the plotter effect.
The demo runs on both PAL and NTSC systems alike, by detecting the system type and making the music play on its intended 50Hz speed on NTSC systems that use 60Hz screen refresh. This can only be achieved by skipping the music player call in every 6th frame. That can cause some audio artifacts, but this is all we can do to make it sound right. The home of Commodore 64 demos (and better games) is mainly Europe, despite the computer's American roots.
The main block of the demo takes up the first 14K of the RAM, it contains the framework that runs the demo itself, the scroller and the chessboard zoomer's code, music, logo, scroller font and text. Then from $3800 the effects use basically the rest of the available memory. I didn't bother using the register area between $d000-$e000, that would come with too many complications that I tried to avoid.
It's an animation, but that doesn't make it less cool. It would be hard to make this run in real time on this hardware, in 320*128 pixels. The raw data for it is 36K, with 22 animation phases (with 640 bytes of character matrix for each, taking up 14K in total), squeezed into 11 character sets (2K each, 22K in total).
22 phases of this animation would give us only 180 degrees of rotation. Due to the properties of the chessboard and the loop I went for, I can fake the other 22 animation phases by swapping the foreground and background colors halfway. Originally I wanted to go for a smoother animation with twice the phases, but that turned out to be impossible to hold in the RAM. Since the animation would run way too fast if it got refreshed in every frame, I only step to the next phase in every 2nd screen refresh. That still makes it smooth enough to be enjoyable. Every 4th loop of the animation the color palette changes to make the effect more interesting, while I'm babbling about in the scroll text.
The animation was made by using this data generator I wrote back in 2010 (see, I did plan to make this demo for a while). I can step the animation manually, and it calculates how many unique characters (8*8 pixel blocks) the phase contains, and how many unique characters we need for the entire animation so far. This way I could experiment with the optimal size, rotation speed and zoom properties without wasting time on implementing the entire effect on the Commodore 64 itself, only to realize halfway that something doesn't add up.
I love this effect, it's 128 plots forming cool blobs. I wrote something similar in a demo in 1997, only that couldn't refresh in each frame, unlike this one. Remember how the demo's framework and the chessboard animation took up 50K of the 64K RAM? When the chessboard disappears, I start the execution of a small program (2K) that generates 14K sine wave data and 13K unrolled code that can draw the plotter on the screen. With the effect's execution code, video memory buffers and all this generated stuff, it takes up about 40K RAM above the 14K for the framework.
The plotter refreshes during each screen refresh and takes up so much CPU time that it barely fits into the "raster time" as we call it. Due to this, to avoid screen tearing, it has to use double buffering. That means, we see the previously drawn animation phase, while we draw the next animation phase in the other buffer. Then switch. The code for it is highly optimized, with this flow:
Cleaner code, that deletes the previously drawn screen.
Plotter code modifier, that sets up the X and Y coordinates for each of the 128 plots in the plot drawing code itself.
Plotter code that draws the screen and modifies the cleaner code.
This way it's self-sustained, the plotter cleans up after itself in the video memory after each frame. Each buffer requires its own code because they draw to different memory locations, each requires 6.5K of code.
For the curious, here is a sample of the code that draws one plot out of 128. It would be a bit too complicated to explain what exactly it does, but in short, it draws a plot with a dynamic vertical shift applied to it (the blobs swing up and down, then leave the entire screen to switch animation style) after figuring out where to draw in the video memory, and updates the cleaner code for this buffer to clean up after itself the next time the code is called.
bcs (to Next plot code)
(Next plot code)
The plotter's execution takes up so much CPU time that I had to unroll even the scroll routine to make it all fit into a single frame. But as you may know, the NTSC systems work with 60Hz refresh and basically a smaller screen border, which results in smoother animation with less CPU time to use. The plotter uses two kinds of blob shapes, simpler ones and more complex ones with additional position calculations. The shapes are controlled by a playlist that modify the values of the plot position calculator and on NTSC systems I was forced to skip the complex ones. This way the demo runs smoothly on both video types, but the PAL version is more colorful, so to speak.
The demo takes up so much initial memory space, that the Commodore 64 kernel can't even load it into the memory without serious compression, for which I used Pucrunch. But this isn't a surprise, most C64 executables are compressed some way. Only in this case I couldn't even test the demo without compressing it first, once I put it all together into one file.
I spent the majority of my childhood and my early adult life coding a lot of demos and a few games on Commodore 64 and Plus/4, in assembly. Later I also worked on demos for the Amiga and Gameboy Color. I miss those days a lot, and in the recent years I had this recurring urge to write a new C64 demo. Or at least do something in assembly because it's just fun. After spending quite some time reading up on the ins and outs of ARM V7 that I could code for in my Raspberry Pi, I realized that it's just silly to go down that route. The Pi hardware is surprisingly poorly documented, people are just guessing or reverse engineering Linux distributions to figure out how the GPU's screen buffer can be accessed, mail slots and messaging, ummm no thanks. I can spend my time a bit more wisely, and my heart is captured by the Commodore 64 forever anyway.
My friends and I used to code in an awesome utility called Turbo Assembler, it's still a fascinating piece of marvel. It loaded to the memory address $9000 and upwards, and the tokenized source code was building downwards from there, so it left us plenty of work space in the 64 kilobytes RAM that the Commodore 64 and Plus/4 were built with.
Back in those days I wished I could write my own assembler, but I couldn't even imagine how I could do that effectively, with at least the same features I used in Turbo Assembler. Then decades passed, I worked on a bunch of serious software, and suddenly it didn't sound impossible anymore, just a bit challenging. I decided that if I'm going to code a new demo (or twenty), it must be done in my own assembler. I got to it, started working on the project on a Sunday morning on June 18th, and by the next Saturday it compiled the first test source code, a music player.
Retro Assembler was born.
Retro Assembler – Commodore 64 source code in Notepad++
After a month of work on it (not full time, just evenings and weekends), it's ready to be published. It's so advanced that it could easily be called a version 2.0 but I'll just go with a solid 1.0 for good measure. I'm afraid I got a bit carried away with it (in the meantime my wife Leslie binge-watched all five seasons of Orange Is The New Black), now it's a macro assembler with a lot of serious features. I made it to support the MOS6502 microprocessor family, so most Commodore computers and some early game consoles (Nintendo NES, Atari 2600 etc) can be targeted with it. But it's not all it will ever do. The application was designed in a way that adding other CPU types is doable with moderate amount of work, so in a future release the Gameboy CPU will be supported too, and shortly after the Zilog Z80 as well. I might even code something for the ZX Spectrum I never had. Motorola 68000 and perhaps ARM may be supported as well, but those can wait.
Versatile memory management using relocatable Memory Segments
Global, Local and Regional labels
Complex expressions and operators
Various control Directives
Nested source code include files, binary includes, custom include paths
Powerful Macro capabilities with parameters and default values
Loop, If and While directives for advanced code building
Byte, Word and String management in memory
Data generators (Need a Sine Wave in your demo? Got it!)
Bin, H6X, T64 and D64 output formats, that work great with the WinVICE Commodore 64 etc Emulator
Syntax highlighting in Notepad++ by custom User Defined Languages
The assembler is a command line tool that can be wired up in Notepad++ (and other editors) to turn it into a Development Environment with Build & Execute
I guess that's good enough for an initial release. I don't expect too many people to start using it, but even if nobody ever downloads it, I don't mind. I'm really proud of this assembler and now it's time for me to start working on that demo. Fellow sceners, I'm back!