Retro Assembler

Version 2.3  Latest update: 11/13/2018
Created by Peter Tihanyi @ Engine Designs

Table of Contents

About the Assembler

Assembler mode
Disassembler mode
Supported CPU Types
Updates and Version numbers
Settings (General, Txt Output, Gameboy, NES, SNES, SMS/GG, TAP)

Value Types

Numbers
Strings
Characters

Expressions and Operators

Labels

Global Labels
Local Labels
Regional Labels
Current Memory Address

Comments

Directives

.target
.org
.equ
.var
.random
.setting
.breakpoint
.closelabels
.debug
.end
.include
.incbin
.segment
.code, .data, .bss
.bank
.region
.endregion
.function
.endfunction
.macro
.endmacro
.loop
.endloop
.if
.endif
.while
.endwhile
.break
.align
.storage
.byte
.word
.dword
.lobyte
.hibyte
.loword
.hiword
.text
.stext
.generate
.memory
.memorydump

Instructions

6502 Family
65C02 / 65SC02
65816
Nintendo Gameboy
Z80

Output File Formats

BIN – Binary file
PRG – Binary file with load address header
T64 – Tape image format for Commodore computers
D64 – Disk image format for Commodore computers
TXT – Configurable text file format
GB – Nintendo Gameboy ROM format
NES – Nintendo Entertainment System ROM format
SNES – Super Nintendo Entertainment System ROM format
SMS – Sega Master System ROM format
GG – Sega Game Gear ROM format
TAP – ZX Spectrum 48K Tape format

Integration with IDEs and Text Editors

Visual Studio Code
Notepad++

Change Log


About the Assembler

Retro Assembler was created as a hobby project to work with source code targeting microcomputers and classic gaming consoles, hence the name Retro. I grew up coding a lot of demos and a few games on the Commodore 64 and Plus/4, so the main target was the 6502 CPU family, which is the closest to my heart. I happen to know I'm not alone in that feeling. But I also worked on the Amiga and Gameboy Color, so the assembler was built in a way that it can support multiple platforms with little effort on my part. Ultimately the goal is to support numerous CPUs and output formats, and to make software development easier with neat assembler features.

The application is being developed on Windows 10 in C#, in two versions: one for the .Net Framework and one for .Net Core.

Disclaimer: As the .Net Framework versions improve, future versions of the assembler will be compiled for the latest frameworks available through Windows Update. I will try to keep it compatible with Mono, but if you're not on Windows, you're better off using the .Net Core version. There is no functional difference between the assembler's two versions; they are built from a shared code project.

If you use Retro Assembler, I would be happy to hear from you. You can tweet me at @Peter_Tihanyi – I'd love to see what you do with it!

Enjoy!

Assembler mode

This is the default mode, where the assembler loads the main source file (with optional source file includes in it) and compiles it according to the rules of the target platform. The default target CPU is 6502 and the full 32 bit address space can be used for code and data.

The assembler works with separate memory banks and segments (using unique names). When the end result is saved, it saves a file made of all segments in each separate bank. If multiple banks are in use, the merged file names get a notation with the bank number. In case of a single bank (the default Bank 0), this info is omitted from the filename.

If enabled in settings, the assembler saves each individual segment as well, separately, so if you use overlapping segments, you can link them up manually as you please. It also saves a *-Info.txt report file which lists each bank and their segments, with the minimum and maximum memory addresses used by each segment.

If the optional Output file path is not set, the assembler uses the input file's directory and name to build the output file. For example, the file MyCode.asm would be compiled and saved as MyCode.bin.

Usage

Command line options

Option Description
-c Turns on Case Sensitive mode for Labels, Functions and Macros.
-u Allows using undocumented instructions in certain CPU types.
-x Prints out Global Labels and their values after a successful code compiling.
-L Sets LaunchEnabled to True for IDEs like VS Code to Build & Start the compiled code automatically.
-C=<type> Sets the target platform's CPU type. Default: 6502. See Supported CPU Types for accepted values.
-O=<type> Sets the output file's type. See Output File Formats for accepted values.

Disassembler mode

Optionally the assembler can load a binary file (software compiled for a supported CPU) and disassemble it into a text file according to the rules of the target platform that the binary file was made for. The disassembled code goes either to the standard output, or into the optional Output file as text.

When disassembling 65816 code, the disassembler does its best to follow the state of the M and X status flags as they are being changed in the sequentially read code. Unless you set those from subroutines, the disassembler will likely figure out whether the Accumulator and the Index Registers are in 8 or 16 bit mode, decoding the instructions accordingly. This unfortunate ambiguity of the 65816 instruction set makes things a bit risky, especially since all 256 byte values are assigned to individual instructions. Once it goes off track, it can decode completely false instructions for a while before it accidentally finds the address of an actual instruction, and not just half of one. But in the right circumstances, it will work okay.

Usage

Command line options

Option Description
-d Required: Turns on the Disassembler mode.
-C=<type> Sets the input file's CPU type. Default: 6502. See Supported CPU Types for accepted values.
-D=<number>  Sets the load address for the input file. It will be determined for Prg files automatically from the load address header. The entered number can be in either decimal format (4096) or in hexadecimal format ($1000 or 0x1000).

There is an advanced mode for the load address option. You can also specify the start address in the memory and the length of the code chunk that will be disassembled. Please note that these optional values are not handled as Offset and Length for the binary file loader. The whole file will be loaded into the target memory, these only control what the disassembler should turn into readable code.

-D=<LoadAddress>,<StartFrom>,<Length> (Must be without spaces!)

Examples

//The file is loaded to $2000, but only the code between $3800-$3880 will be disassembled.
-D=$2000,$3800,$80

//A C64 file is loaded to $0801, but skip the BASIC RUN header and disassemble the whole file from $0810
-D=$0801,$0810

-D=0,$0810  //Does the same, since this example file was a PRG file with a known load address.

//The file is loaded to $0000, but only the code between $0740-$07c0 will be disassembled.
//The first $100 bytes of the file are ignored, for example they may be a header or something
//that's not part of the actual code file. (Note the negative number!)
-D=-$0100,$0740,$80
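The shape of the -D option can be illustrated with a small Python sketch. This is not the assembler's own parser; the function names parse_number and parse_load_option are made up for this illustration, and the sketch only splits the option into its three parts using the number formats described above.

```python
def parse_number(text):
    """Parse decimal (4096) or hexadecimal ($1000 or 0x1000), with an optional minus sign."""
    text = text.strip()
    sign = 1
    if text.startswith("-"):
        sign = -1
        text = text[1:]
    if text.startswith("$"):
        return sign * int(text[1:], 16)
    if text.lower().startswith("0x"):
        return sign * int(text[2:], 16)
    return sign * int(text, 10)


def parse_load_option(option):
    """Split '-D=<LoadAddress>[,<StartFrom>[,<Length>]]' into a 3-tuple.

    Parts that were not given are returned as None.
    """
    if not option.startswith("-D="):
        raise ValueError("not a -D option: " + option)
    parts = option[3:].split(",")
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected 1 to 3 comma-separated values")
    numbers = [parse_number(part) for part in parts]
    numbers += [None] * (3 - len(numbers))
    return tuple(numbers)


# The first example above: load at $2000, disassemble $3800 up to $3880.
load, start, length = parse_load_option("-D=$2000,$3800,$80")
end = start + length
```

With the first example, start + length gives $3880, which matches the range mentioned in the comment of that example.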

Supported CPU Types

CPU Name Description
6502 The well known MOS 6502 CPU and its standardized variants.
Aliases: 6502C, 6510
65C02 An extension over the standard 6502 instruction set, with additional vendor specific WDC/Rockwell instructions.
Alias: 65SC02
65816 A serious extension over the standard 6502 instruction set, with 16 bit registers, 24 bit address size, bank switching etc.
Aliases: 65C816, 65C816S, 65802
Gameboy Nintendo's Gameboy CPU based on the Z80, but with a modified set of instructions and registers.
Z80 The well known Zilog Z80 CPU.

Automatic CPU detection from file names

Figuring out what CPU the source code file was made for is not trivial. You should always have a .target directive in the main source code file to select the correct CPU, but it's useful to have a fallback method for some special cases.

You can tag your source code files with a special sub-extension, like this: MyFile.6502.asm

The assembler (and even the disassembler) will choose the correct CPU type for your file, automatically. It was "invented" for the Visual Studio Code Extension support, which really needs some help to choose the correct language syntax, but it was easy to add it to the assembler itself as well. Use the main CPU type strings above, not aliases.
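As an illustration of the idea, the sub-extension lookup could look like the following Python sketch. The function name cpu_from_filename is hypothetical, and the case-insensitive comparison is an assumption; the document does not say how the assembler compares the tag.

```python
import os

# The main CPU type names from the table above. Aliases are deliberately left out.
CPU_TYPES = ["6502", "65C02", "65816", "Gameboy", "Z80"]


def cpu_from_filename(path):
    """Return the CPU type named by a sub-extension like MyFile.6502.asm, or None."""
    name = os.path.basename(path)
    stem, _extension = os.path.splitext(name)   # "MyFile.6502", ".asm"
    _base, tag = os.path.splitext(stem)         # "MyFile", ".6502"
    tag = tag[1:]                               # drop the leading dot, if any
    for cpu in CPU_TYPES:
        # Assumption: the tag is matched without regard to letter case.
        if tag.lower() == cpu.lower():
            return cpu
    return None
```

A file without a recognized tag returns None here, which is where a real tool would fall back to the .target directive or the default CPU type.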


Future roadmap: 6809, Motorola 68000 (and family) support is planned. ARM V7 and others are to be decided.

Updates and Version numbers

The assembler looks for updated versions periodically, unless it's disabled in Settings by setting UpdateCheck to False. It's recommended to install the latest version when available, to get the newest improvements and bug fixes.

The version number consists of 3 numbers, and the last one is omitted when its value is 0. For example, in "2.2.1" the numbers mean:

  1. The major version of the application. This gets updated only when the application code is seriously changed, refactored or restructured. Maintaining compatibility with prior versions is paramount, but in some cases this may break existing projects.
  2. The minor version of the application. This gets updated when something bigger has been added, like new directives or support for a new CPU type. This resets the revision number.
  3. The revision of the current major-minor version. This gets updated when smaller changes or bug fixes get implemented in the application.
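The numbering rules above can be written down as a short Python sketch. The function names are made up for this illustration; only the two rules stated above are modeled (a zero revision is omitted, and a minor update resets the revision).

```python
def format_version(major, minor, revision=0):
    """Format 'major.minor[.revision]', leaving out a revision of 0."""
    text = f"{major}.{minor}"
    if revision != 0:
        text += f".{revision}"
    return text


def bump_minor(major, minor, revision):
    """A minor update increases the minor number and resets the revision."""
    return (major, minor + 1, 0)
```

For example, a minor update of 2.2.1 is displayed as 2.3 with these rules.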

Settings

The application has various command line switches that can change the assembler's behavior. You can set up your own defaults in the retroassembler-settings.xml file for most of these, along with your preferred include paths. Just be careful with future updates that would overwrite this customized file of yours.

To avoid such accidents, create the retroassembler-usersettings.xml file based on the normal settings Xml file, where you can keep the settings you changed from their defaults. The assembler package will never contain (thus overwrite) this file, but the assembler will always load this for your final settings. You should delete all <Setting /> lines from it, except for the ones you actually changed with your custom values, just to make it clear for you to read.

Most settings can be set or changed in the source code files, using the .setting directive.

The flow by which settings get their values:

  1. Each known setting gets initialized to a "good default" by the assembler itself.
  2. These settings are also listed in the retroassembler-settings.xml file with the "good default" values. They are loaded and overwrite the defaults set in Point 1. You can edit these to set your own defaults, as long as you follow the rules of accepted value types and use sane values. However, it's better to leave this file untouched and use the file described in Point 3 instead.
  3. Some of these settings may be listed in the custom retroassembler-usersettings.xml file with your "custom good default" values. They are loaded and overwrite the defaults set in Point 2. You can create this file by duplicating and renaming retroassembler-settings.xml. It's best to keep only the customized settings in this file for clarity. This file is optional; if it's missing, you will not get an error.
  4. Certain settings are exposed as command line options, they can be updated that way for the loaded source code or binary file.
  5. Most settings can be updated from the source code file itself, using the .setting directive. This allows for maximum customization for certain projects, while the rest use the defaults loaded from the Xml file.
  6. A few special settings can only be managed from the source code file, they are not present in the Xml file.
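The layering described in this list can be pictured with a small Python sketch, where each later layer overwrites the earlier ones. This only models the precedence order; the function name resolve_settings and the sample values are hypothetical, and the sketch ignores the exceptions noted in the tables (for example settings that can't be updated with .setting).

```python
def resolve_settings(builtin, settings_xml, user_xml, command_line, source_code):
    """Merge the layers in the order listed above; later layers win.

    Setting names are folded to lowercase, since they are handled
    case-insensitively.
    """
    merged = {}
    for layer in (builtin, settings_xml, user_xml, command_line, source_code):
        for name, value in layer.items():
            merged[name.lower()] = value
    return merged


final = resolve_settings(
    builtin={"OutputFileType": "bin", "CaseSensitiveMode": False},
    settings_xml={"OutputFileType": "bin"},
    user_xml={"outputfiletype": "prg"},        # hypothetical user override
    command_line={"CaseSensitiveMode": True},  # for example the -c switch
    source_code={"RandomSeed": 1234},          # for example a .setting line
)
```

Here the user settings file changes the output type, the command line turns on case sensitive mode, and the source code sets the random seed, while everything else keeps its default.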

Perhaps best practice would be changing the Xml file's defaults only for the values you can only edit in that file (or you positively want to be the default in all your projects), and set up the rest from the main source code file's header, with comments about why it changes those defaults. Make sure you have a backup of your good Xml file, because an updated version of the assembler will overwrite your file if you're not careful. New versions may come with more or different Xml values.

The known settings and their values are described below.

Setting names are handled case-insensitively, e.g. "RandomSeed" is the same as "randomseed".

General settings for the assembler

Name Type Description Default
CpuType String The default one of the Supported CPU Types that the assembler uses to identify instructions, when it's not set in the source code by the .target directive.
Can't be updated with .setting, use .target to do so.
6502
OutputFileType String The default one of the Output File Formats that the assembler uses to save the compiled source code. bin
OutputFormat String It's an alias of OutputFileType because with hindsight, I probably should have named it that way.
IncludePath String Directory paths separated by ';' where the assembler looks for files that are referenced by the .include and .incbin directives. Example: "C:\MyFiles;C:\MyFiles\Macros"
Can't be updated with .setting because it's loaded only on application launch.
VicePath String The directory path where the VICE emulator is, to access the program "c1541" for T64->D64 output file conversion. Example: "C:\Emulators\C64\WinVICE-3.2-x86"
Can't be updated with .setting because it's loaded only on application launch.
CaseSensitiveMode Boolean Enables case sensitive handling of Labels, Functions and Macros. false
ShowLabelsAfterCompiling Boolean Enables printing out Global Labels and Variables after compiling. false
TreatWarningsAsErrors Boolean The assembler may show some warnings after compiling. This setting turns those warnings into errors for additional strictness. false
OutputSaveEntireBanks Boolean Normally the output files are only as long as the number of bytes they use in memory, with gaps included. This setting forces saving entire memory banks, using each bank's SizeKB setting and mapping address. false
OutputSaveIndividualSegments Boolean Normally only merged memory segments get saved for each memory bank. This setting enables saving individual segments beside the merged memory banks for manual linking. false
OutputSaveInfoFile Boolean Enables saving an Info file for the compiled source code with start-end addresses of each segment, in case it's needed. true
OutputSaveTimeStamp Boolean Enables appending a time stamp to each saved file's name, so all compiled versions are kept alongside in the output directory. false
RandomSeed Integer Sets the Seed value for the random number generator. 0 is for "use a truly random seed". Other values will make the random number generator predictable. It's recommended to change it only from the source code, if you really need to. 0
BeforeBuild String A command that should be executed before code compiling. When it's set in the Settings XML file, it truly gets executed before the build starts. If one or multiple of these are placed into the source code (as well), the assembler executes them when these .setting lines are processed.
This setting can be defined multiple times for multiple commands. See LaunchCommand for the command style details.
AfterBuild String A command that should be executed after a successful code compiling. Whether it comes from the Settings XML file, and/or one or multiple of these are set in the source code, they get collected up, and they all get executed (in order) after a successful code compiling.
This setting can be defined multiple times for multiple commands. See LaunchCommand for the command style details.
LaunchCommand String The command that should be used to launch the successfully compiled source code, if LaunchEnabled is True. This may be something like "C:\\Emulators\\MyEmu\\emulator.exe -run {0}" where the optional {0} is replaced by the compiled code file's filename with full path. The same rules apply as for the terminal of your choice: if you have paths or parameters with space in them, you have to put those into "quotes"! Additionally, a plain "{0}" may also work just fine, if your system associates the file's extension to a specific emulator or utility, like for .nes files. If this string is empty, the launcher will never run.
LaunchEnabled Boolean Enables the automatic launch of the successfully compiled source code, by using the LaunchCommand setting for process information. The assembler's -L command line switch overrides this setting to True, so you can just build your code without launching, and launch it with a specific IDE command, such as VS Code's Retro Assembler: Build & Start false
UpdateCheck Boolean Enables checking for assembler updates periodically. It checks the enginedesigns.net website about once a day and creates an updatecheck.txt file to keep track of things, so it doesn't need to go online that often. It's on by default, but it can be turned off. true

Settings for the configurable Txt output format (byte dump with memory addresses)

Name Type Description Default
OutputTxtAddressFormatFirst String String formatting for the memory address in the first line of a new area.
Example: "8fc0 " (with space).
{0:X04} with uppercase X would make it print an uppercase hexadecimal number.
"{0:x04} "
OutputTxtAddressFormatNext String String formatting for the memory address in subsequent lines of the area, in case you want to drop the address display there. Example: "8fc0 " (with space) "{0:x04} "
OutputTxtValueFormat String String formatting for each displayed byte value. Example: "7f"
{0:X02} with uppercase X would make it print an uppercase hexadecimal number.
"{0:x02}"
OutputTxtValueSeparator String Separator placed between each displayed byte value in a line of text. " "
OutputTxtAreaSeparator String Separator placed between each area. Areas are continuous bytes, if there is a gap in the memory, a new area opens. "\r\n"
OutputTxtLineSeparator String Separator placed between each displayed line. It's basically the newline character of your choice. "\r\n"
OutputTxtValuesInLine Integer The maximum number of displayed byte values in each line. 8
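To show how these settings work together, here is a rough Python sketch of such a byte dump. It is not the assembler's implementation: the function names are made up, the format strings use Python's syntax ("{:04x} ") instead of the .Net style shown in the table, and it assumes the area separator is written in addition to the line separator between two areas.

```python
def find_areas(memory):
    """Group an {address: byte} mapping into runs of consecutive addresses."""
    areas = []
    for address in sorted(memory):
        if areas and address == areas[-1][0] + len(areas[-1][1]):
            areas[-1][1].append(memory[address])
        else:
            areas.append((address, [memory[address]]))
    return areas


def dump_txt(memory, address_first="{:04x} ", address_next="{:04x} ",
             value_format="{:02x}", value_sep=" ", area_sep="\r\n",
             line_sep="\r\n", values_in_line=8):
    """Render memory as text; the parameters mirror the OutputTxt* settings."""
    rendered_areas = []
    for start, values in find_areas(memory):
        lines = []
        for offset in range(0, len(values), values_in_line):
            chunk = values[offset:offset + values_in_line]
            # The first line of an area may use a different address format.
            address_format = address_first if offset == 0 else address_next
            lines.append(address_format.format(start + offset)
                         + value_sep.join(value_format.format(v) for v in chunk))
        rendered_areas.append(line_sep.join(lines))
    # Assumption: areas are separated by a line break plus the area separator.
    return (line_sep + area_sep).join(rendered_areas)
```

Passing an empty string as address_next drops the address from the subsequent lines of an area, which is the use case mentioned for OutputTxtAddressFormatNext.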

Nintendo Gameboy (GB) ROM format and CPU settings

Name Type Description Default
GameboyStart Integer The code in a Gameboy ROM usually starts at the memory address $0150. The CPU jumps to $0100 on reset, which contains a user-set "nop ; jr StartAddress" instruction pair. The linker puts these instructions to $0100, using this address value. Set this to 0 if you really want to control those 4 bytes yourself. 0x0150
GameboyTitle String A maximum 15 characters ASCII string with the ROM's name. It will be saved as uppercase text. "ROM Image Title"
GameboyLicenseeCode String A 2 characters ASCII string with the licensee code. You can set it to anything. "GB"
GameboyLicenseeCodeOld Integer Old, deprecated licensee code. It's recommended to leave it at the default $33 and use the GameboyLicenseeCode string instead. 0x33
GameboyManufacturerCode String A 4 characters ASCII string with the manufacturer code, which will overwrite the last 4 characters of a 15 character GameboyTitle, making it max 11 characters long. Not recommended to set.
GameboyCartridgeType Integer The Memory Bank Controller (MBC) type ID of the cartridge. Unless you're doing something special, leave it as 0 and the linker will figure out which MBC5 ID you need, based on the cartridge settings below. 0
GameboyCartridgeRamKB Integer Sets how much external RAM the cartridge has (if any). Set it to 0, 2, 8 or 32 KB. 0
GameboyCartridgeBattery Boolean Sets whether there is battery backed external RAM on the cartridge. false
GameboyCartridgeRumble Boolean Sets whether there is a rumble motor on the cartridge. false
GameboyRomVersion Integer ROM version number. 0
GameboyRomJapanese Boolean Sets whether the ROM is for the Japanese market. Otherwise International. false
GameboyMonochromeEnabled Boolean Sets whether the ROM runs on monochrome devices (DMG), or perhaps only on Gameboy Color (CGB). true
GameboyColorEnabled Boolean Sets whether the ROM has Gameboy Color (CGB) functions enabled. true
GameboySuperGBEnabled Boolean Sets whether the ROM has Super Gameboy (SGB) functions enabled. false
GameboyPutNopAfterHalt Boolean Sets whether the assembler should put a "nop" instruction after "halt" instructions automatically to avoid hardware bugs.
Can't be updated with .setting because it's loaded only on application launch.
true
GameboyPutNopAfterStop Boolean Sets whether the assembler should put a "nop" instruction after "stop" instructions automatically to avoid hardware bugs.
Can't be updated with .setting because it's loaded only on application launch.
true

Nintendo Entertainment System (NES) ROM format settings

Name Type Description Default
NESMapper Integer Sets the cartridge board type (usually with a Memory Manager Controller) and capabilities for the ROM. By default it's 0 for a small ROM. You have to manage this value for your project.
Normally 0-255. When over 255, the extra 4 bits go into Byte 8 in the ROM header.
0
NESSubMapper Integer Sets the selected cartridge board type's extended capabilities.
0-15, these 4 bits go into Byte 8 in the ROM header as bits 5-8.
0
NESVerticalMirroring Boolean Enables vertical mirroring mode, instead of the default horizontal mirroring mode. false
NESBatteryBackedWRAM Boolean Sets whether there is battery backed external RAM on the cartridge. false
NESFourScreenMode Boolean Enables the Four Screen mode. false
NESPlayChoice10 Boolean Indicates that it's a PC-10 game. false
NESVsUnisystem Boolean Indicates that it's a Vs. game. false
NESPal Boolean Sets whether the ROM is compatible with the PAL video standard (in Byte 12). false
NESNtsc Boolean Sets whether the ROM is compatible with the NTSC video standard (in Byte 12). true
NESByte10 Integer Sets the NES 2.0 standard's extended flags in Byte 10 of the ROM header, raw. 0
NESByte11 Integer Sets the NES 2.0 standard's extended flags in Byte 11 of the ROM header, raw. 0
NESByte13 Integer Sets the NES 2.0 standard's extended flags in Byte 13 of the ROM header, raw. 0
NESByte14 Integer Sets the NES 2.0 standard's extended flags in Byte 14 of the ROM header, raw. 0
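The way NESMapper and NESSubMapper share Byte 8 of the header can be sketched in Python as follows. The function name nes2_byte8 is made up, and the bit layout is an assumption based on the NES 2.0 header format: the extra mapper bits (8-11) are taken to occupy the low 4 bits, and the submapper the upper 4 bits (bits 5-8 when counting from 1).

```python
def nes2_byte8(mapper, submapper):
    """Pack the upper mapper bits and the submapper into header Byte 8."""
    if not 0 <= mapper <= 0xFFF:
        raise ValueError("mapper must fit into 12 bits")
    if not 0 <= submapper <= 15:
        raise ValueError("submapper must be between 0 and 15")
    mapper_extra = (mapper >> 8) & 0x0F   # only non-zero for mappers over 255
    return (submapper << 4) | mapper_extra
```

For mapper numbers in the normal 0-255 range with submapper 0, this byte stays 0.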

Super Nintendo Entertainment System (SNES) ROM format settings

Name Type Description Default
SNESTitle String A maximum 21 characters ASCII string with the ROM's name. It will be saved as uppercase text. "ROM Image Title"
SNESPadding Boolean Enables ROM padding with empty banks, up to the next standard (or accepted) ROM size. It's turned On by default and it's highly recommended to use this setting, because the checksum calculator sets a correct result only for standard ROM sizes and a few exceptions listed below. The standard (or accepted) ROM sizes are the following:

256KB (2 MBit), 512KB (4 MBit), 1MB (8 MBit), 1.5MB (12 MBit), 2MB (16 MBit), 3MB (24 MBit), 4MB (32 MBit)

You can turn it off if you want to manually manage the ROM's size, especially if you want to use a non-standard size, but then it's up to you to use enough empty banks at the end if needed, and you'll have to repair the checksum in the SNES header. For most users, this should be turned On, and a 4MB ROM should be enough for most projects.
true
SNESHiROM Boolean Sets whether the ROM uses 64KB banks, instead of the standard 32KB banks used by the LoROM format. false
SNESExLoROM Boolean Sets whether the ROM uses the Extended LoROM format for non-standard ROM sizes. false
SNESExHiROM Boolean Sets whether the ROM uses the Extended HiROM format for non-standard ROM sizes. false
SNESFastROM Boolean Sets whether the ROM uses FastROM, which is a setting for physical cartridges. It may matter timing-wise for a system that emulates the ROM access speed (120ns vs 200ns). false
SNESCartridgeType Integer Sets the cartridge type. Only standard ROMs are supported without special chips, so you likely want to use these values: 0=ROM only, 1=ROM and RAM, 2=ROM and Save RAM (SRAM). Of course other values may be set here, but then you have to make sure the ROM is linked up correctly. 0
SNESSRAMSize Integer Sets the SRAM size as enumerator value (not in KB) because there may be other settings I don't know of. The typical settings are: 0=None, 1=2KB, 2=4KB, 3=8KB 0
SNESCountry Integer Sets the country code which also has an effect on the system type. The most used ones are: 0=Japan (NTSC), 1=USA (NTSC), 2=Europe, Australia, Oceania and Asia (PAL) 1
SNESLicenseeCode Integer Sets the licensee code of the publisher. You can use 1 for Nintendo, or anything else you pick. 1
SNESVersion Integer Sets the version number of the ROM. It can be a number between 0 and 127, and is usually understood as Version 1.N. The SNES doesn't check this, it's only for the developer. 0

Sega Master System (SMS) & Sega Game Gear (GG) ROM format settings

Name Type Description Default
SMSCountryCode Integer Sets the country code which could also have an effect on the system type.
3=SMS Japan
4=SMS Export
5=GG Japan
6=GG Export
7=GG International
4
SMSProductCode Integer Sets the product code for the ROM. It can be a number between 0 and 159999. It doesn't make any functional difference. 0
SMSVersion Integer Sets the version number of the ROM. It can be a number between 0 and 15. The systems don't check this, it's only for the developer. 0

ZX Spectrum 48K Tape (TAP) format settings

Name Type Description Default
TapStart Integer Sets the start address of the program. 0x8000
TapClear Integer Sets the memory clearing start address, executed before the program would be loaded from the tape file. 0x5fff

Settings managed only from source code files

Name Type Description Default
RegA16 Boolean Sets whether immediate values in instructions using the Accumulator should be saved as 16 bit numbers, even with 8 bit values. (65816 only)
Example: lda #$12 --> lda #$0012
false
RegXY16 Boolean Sets whether immediate values in instructions using the Index registers (X and Y) should be saved as 16 bit numbers, even with 8 bit values. (65816 only)
Example: ldx #$12 --> ldx #$0012
false

Value Types

The following value types can be used in directives and instructions:

Numbers

Numbers can be decimal, hexadecimal, binary or octal values.

123         //Decimal value.

$12         //Hexadecimal value.
0x12        //Hexadecimal value (alternative).

$1234       //Hexadecimal value, 16 bit.
0x1234      //Hexadecimal value, 16 bit (alternative).

$12_34      //Hexadecimal value with optional value separator(s).

%100101     //Binary value.
0b100101    //Binary value (alternative).
%1111_0000  //Binary value with optional value separator(s).

0o17        //Octal value with "zero + letter o"

Number values can go up to 32 bit unsigned values, or 31 bit signed values for negative numbers.
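The number formats listed above can be summarized with a small Python sketch that turns a literal into its value. The function name parse_literal is hypothetical, range checks are left out, and the sketch assumes the "_" separator may appear in any of the formats.

```python
def parse_literal(text):
    """Return the integer value of a number literal in one of the styles listed above."""
    text = text.replace("_", "")   # optional value separators
    lowered = text.lower()
    if text.startswith("$"):
        return int(text[1:], 16)   # $12
    if lowered.startswith("0x"):
        return int(text[2:], 16)   # 0x12
    if text.startswith("%"):
        return int(text[1:], 2)    # %100101
    if lowered.startswith("0b"):
        return int(text[2:], 2)    # 0b100101
    if lowered.startswith("0o"):
        return int(text[2:], 8)    # 0o17
    return int(text, 10)           # 123
```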

Preferred Number Size

In certain situations you may need to force a value into a different (always bigger) number size. For example, in self-modifying code you may want to ensure you start out with a 16 bit value, but $0000 would normally be normalized to $00, which would generate an instruction with a different addressing format and a different byte count.

You can enforce a different number size by prefixing it with extra "0" characters (only in hexadecimal mode!), and if there is an addressing format for the chosen instruction that can handle it in the bigger size, that specific addressing format will be chosen.

sta $12,x        -->  sta $12,x   ($95 $12)
sta $0012,x      -->  sta $0012,x ($9d $12 $00)
sta 8 + $0012,x  -->  sta $001a,x ($9d $1a $00)  (Expressions remember the preferred size.)

This works with 16, 24 and 32 bit values, and you may prefix it with even just a single $0. As long as the number can be stepped up to the next size, it will be done. Then it's up to the CPU's instruction set, whether it has what you need. If the instruction doesn't exist with the 16 bit value for example, but your value was 8 bit to begin with, it will be tried as an 8 bit value as well. Working code has higher priority than possibly mangled code.

When it comes to expressions, the "preferred number size" is remembered after each operation made on two (or more) numbers, but the 2nd number's preferred size is kept only in case of Addition or Subtraction. This means that sta $12 & $00ff will not make the assembler store it as if you had entered sta $0012.
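One way to picture the preferred number size is to derive it from the number of hexadecimal digits that were typed. The Python sketch below is only an interpretation of the rules described above, not the assembler's code; both function names are made up, and taking the larger of the two sizes for Addition and Subtraction is an assumption.

```python
def preferred_bits(hex_literal):
    """Guess the preferred size (8, 16, 24 or 32) from the digits after '$' or '0x'."""
    digits = hex_literal.replace("_", "")
    digits = digits[1:] if digits.startswith("$") else digits[2:]
    for bits in (8, 16, 24, 32):
        if len(digits) <= bits // 4:   # 4 bits per hexadecimal digit
            return bits
    raise ValueError("literal is wider than 32 bits")


def combined_bits(left_bits, right_bits, operator):
    """Preferred size after 'left <operator> right'."""
    if operator in ("+", "-"):
        # Assumption: the larger preferred size of the two operands is kept.
        return max(left_bits, right_bits)
    return left_bits   # the 2nd number's preferred size is not kept
```

With these rules, 8 + $0012 keeps the 16 bit preference, while $12 & $00ff stays at 8 bits.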

Strings

Strings are one or more characters in double quotes, translating to ASCII bytes.

"Hello world!"   //Normal text

"Hello\nWorld!"  //Normal text with an escaped newline character.

Characters

A character is one character in single quotes, translating to an ASCII byte.

'X'       //X character

'\r'      //An escaped carriage return character.

lda #'X'  //The X character used as a byte value in an instruction.

Expressions and Operators

Expressions constructed of operators, values and Labels can be used in virtually any directive or instruction, which allows for some clever code building mechanics.

Brackets (parentheses) can be used to form more complex expressions in directives, instructions and macro calls. Be careful how you form them. It may be better to evaluate complex expressions into .var variables and use them in the instructions, but it's up to you.

The available operators, in the order of evaluation:

Symbol          Type of operation
( )             Expression
~               Bitwise-NOT
* /             Multiplicative
+ -             Additive
<N  >N          Low and High Byte
<<  >>          Bitwise shift
&               Bitwise-AND
^               Bitwise-exclusive-OR
|               Bitwise-inclusive-OR
==  !=          Equality
<  >  <=  >=    Relational
&&              Logical-AND
||              Logical-OR
,               Sequential evaluation

Labels (including the Current Memory Address pointer) get replaced by their number value during expression evaluation, and in directives it's required to use labels with prior definitions in order to build reliable code.

The comparison operators (==  !=  <= etc) and logical operators (&&  ||) work best in .if and .while directives, because their end result is either 1 for true, or 0 for false. If you use them with .equ, you'll just end up with a 0/1 number that you can use as a flag.

Examples


MyValue    .equ (MyConstant << 4) + 5

           .if MyValue >= 8 || OtherValue != 13
           (code lines)
           .endif

           //Assuming IrqRoutine is at $087c
           lda #<IrqRoutine  //Low byte : $7c
           sta $0314
           lda #>IrqRoutine  //High byte: $08
           sta $0315
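For readers who prefer to see the arithmetic, the Low and High Byte operators and the 0/1 results of comparisons can be mirrored in Python like this. The function names are made up for this illustration, and it assumes the High Byte operator takes bits 8-15 of the value.

```python
def low_byte(value):
    """Equivalent of the <N operator: the lowest 8 bits."""
    return value & 0xFF


def high_byte(value):
    """Equivalent of the >N operator (assumed to take bits 8-15)."""
    return (value >> 8) & 0xFF


def as_flag(condition):
    """Comparison and logical operators yield 1 for true and 0 for false."""
    return 1 if condition else 0


irq_routine = 0x087C   # the address assumed in the example above
```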

Labels

Labels (also known as Symbols) are constants or memory addresses that can be used in directives and instructions as parts of expressions and operands.

A label's name can only contain letters, numbers and the "_" character, and it can't begin with a number. Single character labels can't match a register name in the selected CPU type, for example "x" is not going to work as a label in 6502 mode. It's good practice to avoid using single character labels if you can help it.

White spaces are usually ignored in the source code file. The only place where you must use a space or tab (or a colon) as separator is between the main label of the code line and the directive/instruction after it.

To maintain compatibility with other assemblers, if the label is followed by a ":" (colon) character, this character gets processed as white space. So "MyLabel:" is the same as "MyLabel", you can format them either way.

Global Labels

The scope of a Global Label is the entire source code, so they must be named uniquely. Labels can be defined the following ways:

MyLabel      .equ $73  //MyLabel gets the constant value $73

MemAddress   lda #$00  //MemAddress gets the value of the current memory address, eg $0813

Then the value of these labels can be used in directives and instructions with ease. Keep in mind that only those labels can be used in directives, that have been defined before the directive's code line.

MyLabel      .equ $73         //MyLabel gets the constant value $73

             lda #MyLabel+2   //lda #$75
             sta WhoKnows     //sta $0000 because WhoKnows is unknown
                              //and therefore the assembler assumes a 16 bit value.

MyNewLabel  .equ MyLabel+4    //$77

MyNewLabel2 .equ WhoKnows+4   //ERROR: The label WhoKnows is not defined yet.

MyLabel     .equ 55           //ERROR: The label "MyLabel" is already defined.

MyVariable  .var 10           //Create a variable with the initial value 10

MyVariable  .equ MyVariable+1 //The variable is updated to 10 + 1 = 11
MyVariable = MyVariable+1     //The variable is updated to 11 + 1 = 12

You can use Labels almost anywhere. They will get replaced by the number value they hold (constant or memory address) and this value will be used in calculations, or in determining the addressing type of certain instructions.

If the label's value is not known at the time the assembler gets to an instruction that uses it (in the 1st Pass), the assembler assumes that it's a 16 bit memory address for those instruction addressing types that work with memory addresses.

For example, an "sta MyLabel,x" instruction placed before MyLabel's definition will be handled as "sta $0000,x" (3 bytes) instead of "sta $73,x" (2 bytes) in the 1st Pass, because the assembler has to assume the worst case to get the number of bytes that the instruction will take in memory. Then in the 2nd Pass the code will be entered as "sta $0073,x" (3 bytes).

So if you want to use zero page values for a faster code execution, define those labels before the code lines that try to use them.
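
As a minimal sketch of this advice (the label names ZpCounter and LateValue are made up for illustration), the first label below is defined before its use, while the second one is defined too late to get zero page addressing:

```asm
ZpCounter   .equ $73          //Defined before its first use.

            lda ZpCounter     //Known in the 1st Pass: "lda $73" (2 bytes).
            sta LateValue     //Unknown in the 1st Pass: 3 bytes are assumed.

LateValue   .equ $74          //Defined after its first use, so the line
                              //above is entered as "sta $0074" (3 bytes).
```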

Local Labels

There is a Local Label type, that can be used with a limited scope. Defining a new Global Label and using certain directives close down the currently "open" Local labels, by setting a range of start and end line numbers in the merged source code, where the Local label can be addressed.

By having this automatic closure, the Local label names can be redefined in various sections of the code. For example, small loop branches can simply reuse the same @Loop label in most places, without running into any "label exists" errors.

The name must start with the "@" character, then any letter, number or the "_" character can be used in any combination. This means that even the really simple "@1" is accepted as a local label name.

Example

FillMem     ldx #$00
            lda #$ff
@Loop       sta MemAddress,x  //Define a new Local label to "use it, then forget about it".
            inx
            cpx #$28
            bne @Loop  //Branches back to "sta MemAddress,x" as it should.
NewLabel    lda #$00
            beq @Loop  //ERROR: The @Loop label can't be found, because the
                       //definition of NewLabel closed its range. 

The Local label ranges are closed by the directives .function, .loop, .if, .while and their .end* counterparts, as well as Function and Macro references.

It's recommended to use local labels inside Macro code lines, loops and whiles, and wherever else you need labels with a short life span.
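
The reuse of a Local label name could look something like the sketch below. The routine names ClearRow1 and ClearRow2 and the screen addresses are only assumptions for the sake of the example:

```asm
ClearRow1   ldx #$00
            lda #$20
@Loop       sta $0400,x
            inx
            cpx #$28
            bne @Loop      //Branches to the @Loop defined under ClearRow1.

ClearRow2   ldx #$00       //A new Global label closes the previous @Loop.
@Loop       sta $0428,x    //So the same name can be defined again.
            inx
            cpx #$28
            bne @Loop      //Branches to the @Loop defined under ClearRow2.
```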

Regional Labels

If you understand how Local labels like "@Loop" work, you will understand Regional labels too. They are almost the same, labels with a limited scope. But unlike Local labels, these don't get closed down by the definition of a Global label, or by the directives .loop, .if and .while.

They were designed to be used in Macros and Functions, so they stay open and available until an .endmacro or .endfunction directive is used. This allows the programmer to avoid using Global labels mainly in Macros (and also in Functions), which would lead to a "Global label exists" error when a Macro is referenced in the source code more than once.

The name must start with "@@" characters, then any letter, number or the "_" character can be used in any combination. This means that even the really simple "@@1" is accepted as a Regional label name.

Example

.macro MyMacro()

FillMem     ldx #$00
            lda #$ff
@@Loop      sta MemAddress,x  //@@Loop is now a Regional label.
            inx
            cpx #$28
            bne @@Loop  //Branches back to "sta MemAddress,x" as it should.
NewLabel    lda #$00    //Defining a new Global label here, that would close down Local labels.
            beq @@Loop  //It's OK! Compared to the Local label example above, this works fine.

.endmacro

            //Let's use this macro in our code.
            //It will inject the macro's code lines in place, with modifications.
            MyMacro()

            jmp @@Loop  //ERROR: The @@Loop label can't be found, because its range has been closed.

The Regional label ranges are closed by the directives .function, .endfunction, .macro and .endmacro.

Known issue: There is a limitation with Regional (and Local) labels. If you make a Macro function call inside a Macro, the inner function closes the Regional labels when it exits. So jumping over an inner Macro function call using a Regional label is currently not possible. Fixing this would require rewriting the parser and introducing namespaces, so this will remain an issue until further notice.

Current Memory Address

Another kind of label that's worth mentioning is the "*" (asterisk) character.

"*" gets replaced by the current memory address, like $0813 during expression evaluation.

Please note that the "*" character is also used in multiplications, so the assembler tries to determine the context in which the "*" character is used, and acts accordingly.

Examples

ldx #$07
dex
bne *-3   //Branches back to "ldx #$07"

jmp *     //Infinite loop to the memory address where the "jmp" instruction is.

MyLabel = *+$20   //The current memory address +$20 will be set as value for MyLabel.

MyLabel2 = 5 * 6  //MyLabel2 will get the value 30 due to the multiplication.
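
Another possible use is measuring the size of a data block. This is only a sketch with made up label names, assuming that "*" holds the address right after the last data byte when the second line is evaluated:

```asm
Table       .byte 1, 2, 3, 4   //Four data bytes starting at Table.
TableSize   = *-Table          //Current memory address minus Table, which is 4.
```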

Comments

Comments can be placed at the end of any directive or instruction, or they can be the only content in a code line. They are ignored by the assembler, so a code line that has been commented out is not processed at all. The comment markers are // and ; and they work the same way.

Block comments are also supported, where multiple code lines can be commented out, or the source code can contain a bigger block of text without prefixing each line with the comment marker. The block comment markers are /* for opening and */ for closing the block.

Examples

MyLabel   lda #$0e   //This is a comment for the instruction.

MyLabel2  ldx #$06   ;This is also a comment, with the alternative comment marker ";".

MyLabel3  //ldy #$00 //Now this is just a line with "MyLabel3" in it, the instruction is ignored.

/* Some optional comment text, and the encapsulated code lines are ignored.
          sta $d020  //Ignored.
          stx $d021  //Ignored.
*/        nop        //The "nop" instruction is actually processed as valid code content.


Directives

Directives are control commands for the assembler. The generally accepted format is:

[label] .directive parameter(s) [comment]

Specific parameters and formatting exceptions will be explained in the description of each directive. Certain directives have alternative names (aliases) that are interchangeable with the official name.

Please note that Labels used as directive parameters must have prior definitions, meaning their value (usually a constant or a memory address) must be defined before the directive code line in the source code.


.target

Sets the target architecture by specifying the CPU type. It's good practice to put this directive at the top of every source file you have, so the assembler will know what kind of assembly language you used.

In theory one could make a project which compiles different source files, or even different sections of the same file to different CPU types, but that should be avoided, and in the context of this assembler it wouldn't make much sense.

See the supported CPU types listed above. They are case insensitive.

Change! This directive used to also set the memory size (max 128 MB) of the virtual memory buffer where the project's output bytes are generated. This is no longer necessary, because a project can now use the full 32 bit address space. The directive still takes a second parameter to preserve compatibility with existing source code files, but this memory size parameter is ignored.

Format

.target "CPU type"

Examples

.target "6502"

.target "65C02"

.target "Gameboy"

Alternative

.cpu


.org

Instructions and data bytes are always placed in the currently selected memory segment, right at the Current Memory Address, which is also called the Program Counter. The Originate directive sets this pointer to a defined memory address, to control where the program will be compiled in memory.

Format

[label] .org MemoryAddress

Optionally you can put a label in front of .org, then this label will get the selected address as value.

Examples

.org $2000

.pc $2000

*= $2000
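
The optional label in front of .org can be used like any other label afterwards. A small sketch, where the label name Start is just an assumption:

```asm
Start       .org $2000     //Start gets the value $2000.
            lda #$00       //This instruction is placed at $2000.
            sta $d020
            jmp Start      //Same as "jmp $2000".
```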

Alternatives

.pc
*=


.equ

Assigns a constant value to a Label, which later can be used as directive parameter, instruction operand, part of an expression etc. The value may be a number, another Label with previous definition, or an expression that evaluates to a number. Label values can be assigned only once, unless you use the .var directive.

If the Label is an existing entry marked as Variable, then this directive updates its value.

Format

Label .equ Value

The = character can also be used instead of .equ to make programming easier.

Examples

MyValue .equ 123         //Works only for the first time, unless it's a Variable.

MyValue = Start + $0200  //Works only for the first time, unless it's a Variable.

Counter = Counter + 1    //This is a Variable in our example, so this works anytime.

Alternative

=


.var

Creates a Label marked as Variable. It has to be a Label that doesn't exist yet (neither as a Constant nor as a Variable). Afterwards it can be updated in the code with the .equ directive, which has the shortcut "=", so it can be updated simply as VariableName = NewValue.

Format

Label .var Constant

Example

Counter .var 10         //Create a Variable with the initial value 10.

Counter = Counter - 1   //Update the Variable's value, even by using expressions.

.random

Creates a Variable (see the .var directive) with a random value, between 0 and 255 by default, but the limits can be customized. Afterwards this value can be used in the code, or can be modified the same way Variables can be modified.

The "seed value" for the random number generator is an important thing to mention. Each time the assembler runs, the random number generator gets a new seed value, calculated from the current time of the computer's clock. This value gets saved and restored between compiler passes, so the same random numbers will be generated in both passes for consistency.

If you need more control over the seed value before a random number is generated, it can be set by .setting using "RandomSeed". For most projects the default seed would be just fine. But if you need your code to generate the same random numbers every time it gets compiled, you need to set your own Random Seed value.

Format

Label .random [Min], [Max]
Label .random [Max]

Example

Choice1 .random 1, 10        //Create a Variable with a random value between 1 and 10.

.setting "RandomSeed", 1933  //Initialize the random number generator with a custom seed.

Choice2 .random 55           //Create a Variable with a random value between 0 and 55.

Alternative

.rnd


.setting

Updates a setting value in the default/current settings of Retro Assembler. The defaults can be changed in retroassembler-settings.xml and then some of them can be overridden by command line arguments when the assembler is launched. A few of these settings can't be changed using this directive because they are needed during initialization.

The setting has to exist by name, and the value type has to match the type that the chosen setting expects. The value types may be string, integer or boolean.

See the configurable setting values for details.

Format

.setting "SettingName", Value

Alternative format

.setting "SettingName" = Value

Examples

.setting "SomethingWithString", "Example value"
.setting "SomethingWithInteger", 123
.setting "SomethingWithInteger", myLabel
.setting "SomethingWithBoolean", true
.setting "SomethingWithBoolean", false
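
The placeholder names above are not real settings. As a sketch with two setting names that are mentioned elsewhere in this documentation, the alternative format would look like this:

```asm
.setting "RandomSeed" = 1933
.setting "OutputSaveIndividualSegments" = true
```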

.breakpoint

Creates a breakpoint, that will be saved into "moncommands.txt" for debugging in the VICE Monitor.

If there is at least one breakpoint in the source code, the monitor commands list gets saved and the list of labels is also included. But if there are no breakpoints set, the "moncommands.txt" file doesn't get created, and an existing one simply gets deleted. So if all you need is the labels, just set a breakpoint somewhere in the source code, even after the last instruction, to force the file's creation.

Note: This function doesn't work well with source codes using multiple memory banks, for obvious reasons.

Format

.breakpoint [IF Condition]

Examples

nop

.breakpoint

lda CurrentColor

.breakpoint "A == $01"

sta $d021

The output in "moncommands.txt" will be something like this:

break 0821
break 0824 if A == $01
al 0070 .NtscSystemFlag
al 0071 .NtscPlayCounter
al 0855 .irq
al 08b1 .CurrentColor
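
If you only need the label list, a sketch of the trick mentioned above could be this, with the breakpoint placed behind the last instruction:

```asm
            lda #$00
            sta $d020
            rts

.breakpoint   //Placed after the last instruction, only to get the label list saved.
```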

.closelabels

This directive forcibly closes the range of currently opened (still addressable) Local and Regional labels. It's mainly for unconventional use cases of Regional labels, should you decide to utilize them outside of a Macro or Function for some reason. Then by using .closelabels you can reset these labels for reusability reasons.

Format

.closelabels
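
A possible use case is sketched below, assuming that Regional labels are used outside of a Macro or Function. The label name @@Wait is only an example:

```asm
@@Wait      lda $d012
            cmp #$80
            bne @@Wait     //Branches to the first @@Wait.

            .closelabels   //The first @@Wait is no longer addressable.

@@Wait      lda $d012      //The name is free to be defined again.
            cmp #$90
            bne @@Wait     //Branches to the second @@Wait.
```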

.debug

Prints the debug text on the console while compiling the code in the 2nd pass. The parameters can be combinations of strings, numbers, labels and expressions.

Format

.debug parameter(s)

Example

.debug "The current memory address is " * ", how cool is that!"

Alternative

.out


.end

The loading, and therefore the processing, of source code lines stops at the line where this directive is used.

This can't be used in conditional code like in an .if block because it gets processed in the source code loading stage, so it will just leave an open .if as the last line of code.

Format

.end

.include

Includes the content of a source code file at the directive's code line, as if it was part of the main source code file's contents. Include files may use the .include directive to load other files, up to 16 levels of depth.

If the file is without a full path, the assembler tries to find it in known include directories. The defaults are the input source code file's directory, the assembler application's base directory and the include directory under these two. The rest of the lookup directories can be set up in the retroassembler-settings.xml file.

Format

.include "filename.ext"

Example

.include "C64_Registers.inc"

.incbin

Loads a binary file into the current memory segment, either at the current memory address (that you can control with a prior .org command), or at the memory location specified by the file's 2-byte load address header in auto mode. It's good for loading graphics assets, music and other data content, if you want to use the assembler as a linker.

If the file is without a full path, the assembler tries to find it in known Include directories. The defaults are the input source code file's directory, the assembler application's base directory and the include directory under these two. The rest of the lookup directories can be set up in the retroassembler-settings.xml file.

Unless you use the auto property, you can optionally set an Offset and even a Length, to control what sections to load from a more complex binary file. But in this mode the file is loaded at the current memory address, so make sure you set that up correctly beforehand.

Format

.incbin "filename.ext", [Offset], [Length]

.incbin "filename.ext", auto

Example

.incbin "music.bin", auto  //Load the file at the address set in its header, eg $1000.
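
Loading only a section of a file could be sketched like this. The file name is hypothetical, and the example assumes that the Offset is counted in bytes from the beginning of the file:

```asm
            .org $2000
            .incbin "charset.bin", 2, $0800   //Skip 2 bytes, then load $0800 bytes to $2000.
```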

.segment

Segments are handled as separate virtual memory buffers within the assembler. They can be used to separate parts of the code, data blocks, memory banks on systems where they can be paged in, etc. Each segment belongs to a chosen memory bank.

When a segment is first mentioned, it gets created. In further cases the assembler simply switches to the existing segment by the same name and continues to put instructions and data to the selected segment's current memory address.

If the bank number is missing from the first mention of a new segment, it gets created under the current bank. The current bank is where the last used segment is hosted, or if a new bank was just created, then it's the new, empty bank.

The referenced bank number must exist already, set up by a previous .bank directive, otherwise the assembler returns with an error message.

When a new segment is created in a memory Bank, the first segment of the Bank always gets a specific start address set, to the Bank's mapping address. Subsequent segments in the same bank are created as relocatable segments without a specific start address.

If the source code file doesn't specify a memory address (with .org) in a relocatable segment before the first instruction, the segment's contents will be relocated during compilation time, right behind the previous segment's last used memory address.

If you put at least one instruction into a relocatable segment, you are no longer allowed to set a specific memory address for the rest of the data in the segment. You have to create a new segment for those instructions or data bytes.

The assembler creates three standard segments by default: Code, Data and BSS, in this order, all in the default Bank 0. These can be accessed with directive shortcuts. The Code segment has a default start address of $0800 (or $0150 in a Gameboy project), the others are created without a specific start address, meaning they can be relocated during code compilation.

At the end of code compilation, if there are multiple segments in actual use, the assembler saves a merged file for each Bank. Optionally it can save each segment into individual files, by the setting "OutputSaveIndividualSegments".

The assembler also saves a "[CodeFileName]-Info.txt" file with information about each Bank and Segment used, with lowest and highest memory addresses, so the saved individual segments can be loaded to the correct memory address in your choice of linking method.

Format

.segment "Name", [BankNumber]

Examples

.segment "Scroll", 0  //New segment "Scroll" into Bank 0.

.segment "Code"       //Choose existing segment "Code"


.bank 1, 16, $4000    //Create the new Bank 1

.segment "Tiles"      //Create a new segment "Tiles" in Bank 1, because Bank 1 is the current bank.

//Save individual segments in Bin/Prg/Txt formats
.setting "OutputSaveIndividualSegments", true

You may do rapid switching between segments to separate certain data types. For example:

.segment "Code"  //".code" would do the same, as shortcut.

(Scroller subroutine code lines)

.segment "Data"  //".data" would do the same, as shortcut.

.stext "hello, this is my scroll text!"
.byte ' ', $ff //End of the scroll text with an additional space before repeat.

.segment "Code"  //".code" would do the same, as shortcut.

(other subroutines)

This way the scroller code and the scroll text can be kept near each other in the source code itself (they may even come from an include file), but the Code and Data would still be separated in memory during code compilation. In this example the scroll text bytes get placed after the (other subroutines) instructions in the output binary file.
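
The rules about relocatable segments can be sketched as follows. The segment names are made up, and the example assumes that the current bank is Bank 0 with its default segments, so both new segments are created as relocatable:

```asm
.segment "Helpers"     //New segment, created as relocatable.
            rts        //First instruction without a prior .org: the segment
                       //is relocated behind the previous segment.
                       //Setting an address with .org is no longer allowed here.

.segment "Tables"      //Another new relocatable segment.
            .org $3000 //Allowed, because nothing was placed in "Tables" yet.
            .byte 1, 2, 3
```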


.code, .data, .bss

Shortcuts to the default Code, Data and BSS segments, respectively. It's the same as using the .segment directive with the selected segment's name, such as:

.segment "Data"

Format

.code
.data
.bss

Note that these segments are created in memory Bank 0.


.bank

Creates a new memory bank, or updates an existing bank's properties.

Memory banks act as a storage container for one or multiple memory segments. They don't do much for the typical project for a computer using the 6502 CPU, because usually the maximum it can address is 64 KB. But for example the Commodore 128 has 128 KB RAM and switchable memory banks, or a cartridge ROM can use switchable banks as well.

The real use for banks is with Gameboy projects, where anything larger than 32 KB needs to utilize ROM bank switching on the cartridge.

Each project gets a Bank 0 created by default, with 64KB size, mapped to $0000. The Code, Data and BSS segments are all created for Bank 0. If necessary, you can update this Bank 0's properties with the .bank directive, to change its size and mapping address for special projects.

Optionally an Info string can be set for each bank. This is used in special cases, like the NES ROM format builder identifies bank types and numbers by this value. You can read more about this in the description of the NES output format.

It's mainly up to the developer to stay within the Bank's limits. Saving full banks can be enabled by the setting "OutputSaveEntireBanks" (always on for the Gameboy ROM output format) if that's what the project needs. Since the size and mapping address are set for each Bank, the assembler saves the banks clipped down to the specific bank size after code compilation. Code and data entered in the bank's segments outside the bank's boundaries are thrown away. So if you have a bank that's 16KB in size and maps to $4000, then even if you have code or data compiled to $8000+ in the virtual memory, those bytes will be ignored on save.

If necessary, you can opt in to save individual segments beside the banks, without clipping applied.

Bank numbers are zero based, and you may create gaps in the number of banks if you need to. The missing banks will be filled in with $00 in a Gameboy ROM and in other possible file formats that rely on saving multiple banks. The NES ROM builder doesn't use the bank numbers, only the Info strings.

If there is only Bank 0 created (which is likely for most assembly projects), the assembler doesn't deal with banks, unless it's enforced in settings, or the output file format requires it. The output file names will not be marked with the bank's number like "MyCode-Bank0.bin", it will be saved as "MyCode.bin".

Format

.bank BankNumber, SizeKB, MapAddress, [Info]

Examples

.bank 0, 16, $0000  //Update the existing Bank 0 for a Gameboy ROM

.bank 1, 16, $4000  //Create Bank 1 for a Gameboy ROM

.bank 0, 16, $c000, "NES_PRG0"  //Create PRG Bank 0 for a NES ROM

//Save full banks with clipping in Bin/Prg/Txt formats
.setting "OutputSaveEntireBanks", true

//Save individual segments (without clipping) in Bin/Prg/Txt formats
.setting "OutputSaveIndividualSegments", true
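
The clipping described above can be sketched like this. The segment name "Level1" is made up, and the example assumes that the banks are saved with clipping applied:

```asm
.bank 1, 16, $4000         //Bank 1 is 16 KB, so it covers $4000-$7fff.
.segment "Level1", 1       //First segment of Bank 1, it starts at $4000.

            lda #$01       //Placed at $4000, inside the bank: saved.

            .org $8000
            .byte $ff      //Placed at $8000, outside the bank: ignored on save.
```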


.region

Regions are logical blocks that encapsulate one or more source code lines. This directive is ignored by the assembler, but can be used in certain text editors to fold regions on demand. It's just like the #region directive in Visual Studio, purely a visual element.

This directive must be closed by using .endregion

Format

.region [Region name as free text]

(Code lines)

.endregion

.endregion

Closes the previously opened .region directive, so the encapsulated source code lines can be folded in certain text editors. It's just like the #endregion directive in Visual Studio, purely a visual element.

Format

.endregion

.function

Functions are logical blocks that encapsulate one or more source code lines, that are meant to be called as a subroutine. They don't have calling parameters, and internally this directive just converts the function's name into a Label.

Functions can be called in the code using "FunctionName()". The assembler replaces this with the target CPU's "call subroutine" instruction eg "jsr FunctionName" for 6502 code, or with the appropriate instruction for other CPU types.

This directive must be closed by using .endfunction

Format

.function FunctionName()

//Or as a friendly alternative...
.function FunctionName

Example

.function Scroller()

lda #$07
sta $d016
(other code lines)

.endfunction  //Serves as the "return from subroutine" instruction, eg "rts".

Calling example

lda #$0f
sta $d020

Scroller()  //Same as "jsr Scroller".

lda #$00
sta $d020

Please note that you can't open a new .function or .macro inside a function, but you are free to use .loop, .if and .while directives.


.endfunction

Closes the previously opened .function directive. The assembler replaces this with the target CPU's "return from subroutine" instruction, eg the "rts" instruction for 6502 code, or with the appropriate instruction for other CPU types, so the subroutine can return automatically after the last code line. You may put your own "rts" instruction into the function at the point of your chosen return, but at the end the assembler will always add an "rts" as closure.

Format

[label] .endfunction

Examples

.endfunction

Return .endfunction  //Marks the "return from subroutine" instruction with the global label "Return".

Alternative

.endf


.macro

Macros are logical blocks that encapsulate one or more source code lines, that are compiled into the segment memory at the place of the macro call, using the arguments set by the macro call.

A macro can have zero, one or more parameters, and each parameter can optionally get a default value, in case the parameter is not specified in the macro call itself. If you don't set a default value for a parameter, it will be handled as the number 0, so you should set a value for that parameter in the actual macro call, unless you happen to need 0 there.

The parameter names don't actually get created as global labels, so you can reuse parameter names, or use names that you defined as a label elsewhere. Macros can of course use global labels and local labels inside the code block, but if you need to define a label there, you should use a local or regional label.

This might be a bit complicated, so pay attention to the example below, which highlights these features.

This directive must be closed by using .endmacro

Format

.macro MacroName( [Parameters=[DefaultValues]] )

Example

.macro SetColors(BackgroundColor=$06, BorderColor=$0e, MemAddress)

lda #BackgroundColor
sta $d021
lda #BorderColor
sta $d020
ldx MemAddress
stx $3300

.endmacro

Calling examples

MyColor .equ $09

SetColors($0b, $0f)    //BackgroundColor is $0b, BorderColor is $0f.

SetColors(, MyColor+1) //BackgroundColor is the default $06, BorderColor is $0a ($09 + 1).

SetColors()            //Use the default values $06 and $0e for the colors.

Note that we never set a value for MemAddress, so it is handled as 0 and "ldx MemAddress" reads the value from memory address $0000 in each of these calls.

A more complex macro example using regional labels (such as @@Red) as variables. These are necessary because local labels like @Red would be closed by the .if directives.

//SetColor() macro for SNES which enters an RGB value into the selected color.
//The color palette index is selected in $2121 ($00 is for the background)
//and the color value is entered through $2122 as a 16 bit value, meaning
//it has to be entered as a series of two 8 bit values.

        //Sets the currently selected SNES color using a 24 bit RGB value.
        .macro SetColor(RGB, Address=$2122)

        //Get the separated RGB colors out of the 24 bit RGB value.
        //Example: RGB = $000f1f (R=$00, G=$0f, B=$1f) for a nice Blue
        //Each color value can be between $00-$1f (5 bits), but keep
        //all 8 bits for correction below.

@@Red   .var (RGB & $00ff0000) >> 16
@@Green .var (RGB & $0000ff00) >> 8
@@Blue  .var (RGB & $000000ff) >> 0

        //Repair the color values if they are outside the 5 bit limit.
        //Instead of just cropping the bits, use the highest intensity.

        //NOTE: This macro easily could be converted to something that
        //takes real 24 bit color values with $00-$ff intensity and
        //converts them to 5-bit intensity by bit shifting ($ff >> 3 = $1f)

        .if @@Red > $1f
        @@Red = $1f
        .endif

        .if @@Green > $1f
        @@Green = $1f
        .endif

        .if @@Blue > $1f
        @@Blue = $1f
        .endif

        //Combine these potentially repaired values into
        //15 bit RGB colors for the SNES (the highest bit is unused)

@@Color .var (@@Blue << 10) | (@@Green << 5) | (@@Red << 0)

        //Set the lower byte of the 16 bit color value...
        lda #<@@Color
        sta Address

        //Then set the higher byte of the value to enter the full color.
        lda #>@@Color
        sta Address

        .endmacro

Please note that you can't open a new .macro or .function inside a macro, but you are free to use .loop, .if and .while directives, that may be controlled by the calling arguments of the macro.

Known issue: There is a limitation with Regional (and Local) labels. If you make a Macro function call inside a Macro, the inner function closes the Regional labels when it exits. So jumping over an inner Macro function call using a Regional label is currently not possible. Fixing this would require rewriting the parser and introducing namespaces, so this will remain an issue until further notice.


.endmacro

Closes the previously opened .macro directive.

Format

.endmacro

Alternative

.endm


.loop

Loop blocks are logical blocks that encapsulate one or more source code lines, that get compiled into the segment memory LoopCount times in a row. The maximum limit is 100,000.

This directive must be closed by using .endloop

Further .loop directives can be nested under a .loop, just indent the code to make it easier to follow.

Format

.loop LoopCount

Example

.loop 8

nop

.endloop

This example is intentionally simplistic, but you can do some clever things with loops, especially if you keep modifying a variable value inside the loop, and use that value as a code modifier.
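
A sketch of this idea is shown below, where a variable is modified in every iteration and used as an address offset. The variable name Offset and the addresses are only assumptions:

```asm
Offset      .var 0             //Variable that changes in every iteration.

            lda #$20
            .loop 4
            sta $0400 + Offset //sta $0400, sta $0428, sta $0450, sta $0478
Offset      = Offset + $28
            .endloop
```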


.endloop

Closes the previously opened .loop directive.

Format

.endloop

Alternative

.endl


.if

If blocks are logical blocks that encapsulate one or more source code lines, that get compiled into the segment memory only if the conditional value or expression evaluates to 1 (true).

Since it works with expressions, arithmetic and logical comparisons etc, it can be a rather powerful tool.

This directive must be closed by using .endif

Further .if directives can be nested under an .if, just indent the code to make it easier to follow.

Format

.if Condition

Example

.if (SomeValue >= 2) || (OtherValue == 13)

lda #$00
sta $d020

.endif
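
This section doesn't describe an "else" branch, so an either-or choice can be sketched with two .if blocks that use opposite conditions. UseNtsc is a hypothetical build switch and the values are only examples:

```asm
UseNtsc     .equ 0         //Hypothetical build switch: 1 = NTSC, 0 = PAL.

            .if UseNtsc == 1
            lda #$3c       //Compiled only when UseNtsc is 1.
            .endif

            .if UseNtsc != 1
            lda #$32       //Compiled only when UseNtsc is not 1.
            .endif

            sta $02
```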

.endif

Closes the previously opened .if directive.

Format

.endif

.while

While blocks are logical blocks that encapsulate one or more source code lines, that get compiled into the segment memory in a loop, as long as the conditional value or expression keeps evaluating to 1 (true) during each iteration.

Since it works with expressions, arithmetic and logical comparisons etc, it can be a rather powerful tool.

Given how easy it is to get into an endless loop with a badly set condition, you must be careful with this. Having some variable like a counter or other value that is constantly (or just sometimes) updated inside the block is key.

If the loop cycle reaches 100,000, the parser treats it as an infinite loop and returns with an error message. If you need more loop cycles than that, you will need to find a different solution.

You may use the .break directive to terminate the loop on a chosen condition set by the .if directive.

This directive must be closed by using .endwhile

Further .while directives can be nested under a .while, just indent the code to make it easier to follow.

Format

.while Condition

Example

MyCounter .var 0

.while MyCounter != 20

sta $3200 + MyCounter

MyCounter = MyCounter+1

.endwhile

.endwhile

Closes the previously opened .while directive.

Format

.endwhile

Alternative

.endw


.break

Terminates the .while loop that it's placed in. It can only be used in .while blocks.

Format

.break

Example

MyCounter .var 0

.while MyCounter != 20   //Run for 20 loop cycles!

.if MyCounter == 5       //Actually I changed my mind, 5 is enough.
                         //(Silly example but you get it.)
.break

.endif

sta $3200 + MyCounter

MyCounter = MyCounter+1

.endwhile
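To make the expansion concrete, here is a small Python model of the .while and .break example above. It is only a sketch of the described behavior, not the assembler's implementation: the function names are made up for illustration, returning None stands in for .break, and the exact cycle at which the infinite loop error triggers is an assumption.

```python
MAX_LOOP_CYCLES = 100000  # loop limit quoted in the .while description


def expand_while(condition, body, state):
    """Repeat body(state) while condition(state) is true and collect the emitted lines."""
    lines = []
    cycles = 0
    while condition(state):
        cycles += 1
        if cycles >= MAX_LOOP_CYCLES:
            # Assumption: the error fires when the counter reaches the limit.
            raise RuntimeError("infinite loop detected")
        emitted = body(state)
        if emitted is None:  # None models the .break directive
            break
        lines.extend(emitted)
    return lines


def example_body(state):
    """Models the block between .while and .endwhile in the .break example."""
    if state["MyCounter"] == 5:
        return None  # .break
    line = "sta ${:04x}".format(0x3200 + state["MyCounter"])
    state["MyCounter"] += 1
    return [line]


unrolled = expand_while(lambda s: s["MyCounter"] != 20, example_body, {"MyCounter": 0})
```

In this model the loop emits five sta instructions, for $3200 through $3204, and then stops at the .break.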

.align

Aligns the upcoming instructions or data to the next "round" memory address. The alignment value must be a power of 2, such as 2, 4, 8, 16, 32, 64 etc, up to 1 megabyte.

If the filler byte is not set, the bytes only get allocated in memory without setting a byte value for them. This makes it possible to align purely allocated bytes too.

Alignment works in relocatable memory segments as well.

Format

[label] .align Alignment, [Filler byte]

Examples

//Align code/data to the next round $xx00 memory address, only allocate the skipped bytes.
.align $100

//Align code with 6502 NOP instructions
.align $80, $ea
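The address arithmetic behind an alignment like this can be sketched in Python. The helper names are hypothetical, and the sketch assumes that an address which is already aligned is left where it is, which the text above does not state explicitly.

```python
def align_up(address, alignment):
    """Return the next address that is a multiple of alignment (a power of 2)."""
    if alignment < 2 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of 2")
    # Assumption: an already aligned address is returned unchanged.
    return (address + alignment - 1) & ~(alignment - 1)


def padding_bytes(address, alignment):
    """Number of bytes skipped (or filled) to reach the aligned address."""
    return align_up(address, alignment) - address
```

For example, with the current address at $2001, an alignment of $100 moves it to $2100 and skips $ff bytes.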

.storage

Reserves the following Length number of bytes in memory to be used as storage bytes.

If the filler byte is not set, the bytes only get allocated in memory without setting a byte value for them. This can be used to mark an array in memory, outside the compiled binary file's saved bytes, to be used by the program. For example in a Gameboy game the compiled code is in ROM banks, but the RAM from $c000 can be reserved using allocated bytes. It's better to use this to allocate an array, instead of just making a label for it, especially if the array is followed by similarly allocated bytes and words, such as ButtonState .byte ?

Format

[label] .storage Length, [Filler byte]

Examples

//Allocate $20 bytes in memory, without setting a byte value for those bytes.
.storage $20

//Set $20 bytes in memory, filled with byte value $ff
.storage $20, $ff

Alternatives: .ds, .fill


.byte

Puts one or more bytes at the current memory address.

The accepted, comma separated value types are 8-bit numbers ($00-$ff), characters and strings. Characters and strings are converted to ASCII bytes, just like the .text directive does it.

Negative numbers are also accepted, with the limitation of max 7 bits.

The ? value can be used for memory allocation, without setting any value at the current memory address.

Format

[label] .byte Value, [Values]

Examples

.byte $12, %1001, <MyLabel, '\t', "My string value"

//Byte allocation
MyAllocatedValue  .byte ?

MyAllocatedValues .byte ?, ?, ?

Alternative: .b


.word

Puts one or more words (16 bit values) at the current memory address.

The accepted, comma separated value types are 16-bit numbers ($00-$ffff).

Negative numbers are also accepted, with the limitation of max 15 bits.

The word's two bytes are put into the memory buffer in the order of the target CPU's endianness. For the 6502 family it means that $1234 is entered as "$34, $12".

The ? value can be used for memory allocation, without setting any value at the current memory address.

Format

[label] .word Value, [Values]

Examples

.word $1234, $12, %1001, MyLabel

//Word allocation
MyAllocatedValue  .word ?
MyAllocatedValues .word ?, ?, ?

Alternative: .w


.dword

Puts one or more double words (32 bit values) at the current memory address.

The accepted, comma separated value types are 32-bit numbers ($00-$ffffffff).

Negative numbers are also accepted, with the limitation of max 31 bits.

The double word's four bytes are put into the memory buffer in the order of the target CPU's endianness. For the 6502 family it means that $12345678 is entered as "$78, $56, $34, $12".

The ? value can be used for memory allocation, without setting any value at the current memory address.

Format

[label] .dword Value, [Values]

Examples

.dword $12345678, $1234, $12, %1001, MyLabel

//Double word allocation
MyAllocatedValue  .dword ?
MyAllocatedValues .dword ?, ?, ?

Alternative: .dw
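The byte order described for .word and .dword on the 6502 family is little-endian. As a quick illustration, Python's standard struct module produces the same byte sequences. The function names are made up for this sketch, and it covers the little-endian case only.

```python
import struct


def encode_word(value):
    """16-bit value as little-endian bytes, the order used by the 6502 family."""
    return struct.pack("<H", value & 0xFFFF)


def encode_dword(value):
    """32-bit value as little-endian bytes."""
    return struct.pack("<I", value & 0xFFFFFFFF)
```

Here encode_word(0x1234) gives the bytes $34, $12 and encode_dword(0x12345678) gives $78, $56, $34, $12, matching the examples in the text. A big-endian target would use ">H" and ">I" instead.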


.lobyte

Puts one or more bytes at the current memory address, using the entered value's Low byte (bits 1-8).

The accepted, comma separated value types are 16-bit numbers (0-65535), that also include 8-bit numbers (0-255).

Negative numbers are also accepted, with the limitation of max 15 bits.

Format

[label] .lobyte Value, [Values]

Example

.lobyte $1234, MyLabel

This is the same as

.byte <$1234, <MyLabel

.hibyte

Puts one or more bytes at the current memory address, using the entered value's High byte (bits 9-16).

The accepted, comma separated value types are 16-bit numbers (0-65535), that also include 8-bit numbers (0-255).

Negative numbers are also accepted, with the limitation of max 15 bits.

Format

[label] .hibyte Value, [Values]

Example

.hibyte $1234, MyLabel

This is the same as

.byte >$1234, >MyLabel

.loword

Puts two or more bytes at the current memory address, using the entered 32 bit value's Low word (bits 1-16).

The accepted, comma separated value types are 32-bit numbers ($00-$ffffffff).

Negative numbers are also accepted, with the limitation of max 31 bits.

Format

[label] .loword Value, [Values]

Example

.loword $12345678, MyLabel

.hiword

Puts two or more bytes at the current memory address, using the entered value's High word (bits 17-32).

The accepted, comma separated value types are 32-bit numbers ($00-$ffffffff).

Negative numbers are also accepted, with the limitation of max 31 bits.

Format

[label] .hiword Value, [Values]

Example

.hiword $12345678, MyLabel
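The four directives .lobyte, .hibyte, .loword and .hiword only differ in which part of the value they pick. A minimal Python sketch of that selection, using plain masks and shifts (the function names simply mirror the directives and are not part of the assembler):

```python
def lobyte(value):
    """Low byte of a 16-bit value, same as the < operator."""
    return value & 0xFF


def hibyte(value):
    """High byte of a 16-bit value, same as the > operator."""
    return (value >> 8) & 0xFF


def loword(value):
    """Low word of a 32-bit value."""
    return value & 0xFFFF


def hiword(value):
    """High word of a 32-bit value."""
    return (value >> 16) & 0xFFFF
```

With $1234 the low and high bytes are $34 and $12. With $12345678 the low and high words are $5678 and $1234, which are then stored as two bytes each.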

.text

Puts one or more bytes at the current memory address, by converting strings to ASCII text bytes.

The accepted, comma separated value types are characters and strings.

Format

[label] .text Value, [Values]

Example

.text "My String Value", 'c'

Alternatives: .t, .txt


.stext

Puts one or more bytes at the current memory address, by converting strings to simplified ASCII bytes used in scroll texts and other, more compact text displays.

The strings are first converted to lowercase, then the ASCII bytes are rearranged, so $60-$7f (@abc...) are at $00-$1f, while the symbols and numbers are left intact in the $20-$3f range. The end result is text that fits into a character set of 64 characters, which is typical in demos and games.

The accepted, comma separated value types are characters and strings.

Format

[label] .stext Value, [Values]

Example

.stext "my scroll text!", 'c'

Alternatives: .st, .stxt


.generate

This directive generates byte values at the current memory address, depending on the selected Mode, and on the Parameter(s) that the Mode expects. Usually these are data tables that can be utilized for demo effects.

Each Mode has default parameters that they can fall back to if a parameter is not provided, but it wouldn't hurt to set each parameter to your liking to avoid strange results.

Depending on what the generator does, if the resulting data values can't be saved in a single byte when an 8-bit CPU is targeted (for example making a sinwave between $00 and $03ff values with the 6502 CPU selected), the generator creates two identically sized data outputs in a row. First the Low Bytes, then the High Bytes of each corresponding data value.

Format

[label] .generate "Mode", [Parameter(s)]

Modes with examples

//Generates a Sine Wave data table.

.generate "sinwave", MinValue, MaxValue, Length, RotationDegrees

.generate "sinwave", $00, $7f, $100, 270  //Make a wave starting with $00 by rotating it 270 degrees.

.generate "sinwave", $00, $7f, $100       //No rotation requested.


//Generates a Cosine Wave data table.

.generate "coswave", MinValue, MaxValue, Length, RotationDegrees

.generate "coswave", $20, $bf, $80, 180  //Start from the middle of the wave by rotating it 180 degrees.

.generate "coswave", $20, $bf, $80       //No rotation requested.


//Generates a Bounce Wave data table, an arch between Min-Max-Min.

.generate "bouncewave", MinValue, MaxValue, Length, Flip

.generate "bouncewave", $00, $7f, $100     //Bounce through $00-$7f-$00

.generate "bouncewave", $00, $7f, $100, 1  //Flip the data to be through $7f-$00-$7f
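For readers who want to see what such a table contains, here is a Python sketch of a sine table generator. It is an approximation under stated assumptions, not the assembler's code: the formula and the rounding (floor) were chosen because they reproduce the first bytes of the sample dump shown in the .memorydump section ($3f, $41, $42, $44...), and the function names are hypothetical. The second helper models the Low Bytes then High Bytes layout described above for values that don't fit into a single byte.

```python
import math


def sinwave(min_value, max_value, length, rotation_degrees=0):
    """Sine table between min_value and max_value (floor rounding is an assumption)."""
    center = (min_value + max_value) / 2
    amplitude = (max_value - min_value) / 2
    table = []
    for i in range(length):
        angle = 2 * math.pi * i / length + math.radians(rotation_degrees)
        table.append(int(math.floor(center + amplitude * math.sin(angle))))
    return table


def split_lo_hi(table):
    """All low bytes first, then all high bytes, as two equally sized runs."""
    return [v & 0xFF for v in table] + [(v >> 8) & 0xFF for v in table]
```

Calling sinwave(0x00, 0x7f, 0x100) starts in the middle of the wave at $3f, while a rotation of 270 degrees makes the table start at $00, as in the example above.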


.memory

Executes various memory operations in the current memory segment.

The current memory location for new code/data doesn't get changed by this directive.

Format

.memory "Mode", StartAddress, Length, [Parameter(s)]
Mode               Parameter 1               Parameter 2              Description
fill               Byte value to fill with.  -                        Fills the selected memory fragment with the selected byte.
copy               Destination address.      -                        Copies the selected memory fragment to the selected destination address.
move               Destination address.      -                        Moves the selected memory fragment to the selected destination address, then zeroes out the bytes at the original memory location.
replace            Original byte value.      Replacement byte value.  Replaces a selected byte value with another in the selected memory fragment.
add                Byte value to add.        -                        Adds the selected byte to each byte in the selected memory fragment.
subtract, sub      Byte value to subtract.   -                        Subtracts the selected byte from each byte in the selected memory fragment.
shiftleft, left    Number of bits to shift.  -                        Shifts each byte in the selected memory fragment to the left by the selected number of bits.
shiftright, right  Number of bits to shift.  -                        Shifts each byte in the selected memory fragment to the right by the selected number of bits.
negate, neg        -                         -                        Negates (inverts) the bits in each byte in the selected memory fragment.
xor, eor           Byte value to use.        -                        Performs a bit-wise XOR on each byte in the selected memory fragment.
or                 Byte value to use.        -                        Performs a bit-wise OR on each byte in the selected memory fragment.
and                Byte value to use.        -                        Performs a bit-wise AND on each byte in the selected memory fragment.

Example

.org $2000

.generate "sinwave", $00, $7f, $100

.memory "copy", $2000, $100, $2100

.memory "add", $2100, $100, $80
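The copy and add steps of this example can be modeled on a plain byte buffer in Python. This is a sketch with hypothetical helper names, and it assumes that "add" wraps around at 8 bits (for example $90 + $80 becomes $10), which the table above does not spell out.

```python
memory = bytearray(0x10000)  # stand-in for one 64 KB memory segment


def mem_copy(start, length, destination):
    """Model of the "copy" mode: the source bytes stay in place."""
    memory[destination:destination + length] = memory[start:start + length]


def mem_add(start, length, value):
    """Model of the "add" mode, assuming 8-bit wrap-around on overflow."""
    for address in range(start, start + length):
        memory[address] = (memory[address] + value) & 0xFF


# Four sample bytes stand in for the generated wave data at $2000.
memory[0x2000:0x2004] = bytes([0x00, 0x3F, 0x7F, 0x90])
mem_copy(0x2000, 4, 0x2100)
mem_add(0x2100, 4, 0x80)
```

After these calls the bytes at $2100 are $80, $bf, $ff, $10, while the original bytes at $2000 are unchanged.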

.memorydump

Saves the selected number of bytes from the target's virtual memory buffer (from the chosen memory bank) into a text file, as .byte source code lines. There will be up to 8 bytes in each line and a separator after every 256 bytes in the generated source code text file.

This can be useful if you need to convert an existing binary file into a source code insert, or if you generate something inside the assembler that you want to save and then manage manually.

Optionally the bytes can be saved into a binary file, just in case that's a better way for the user.

Format

.memorydump "filename.ext", BankNumber, StartAddress, Length, [Binary mode]

Example

.org $2000

.generate "sinwave", $00, $7f, $100

.memorydump "MyWaveData.s", 0, $2000, $100

.memorydump "MyWaveData.bin", 0, $2000, $100, true


//And the output in the file is like this:

//Memory Dump - Start Address $2000

.byte $3f, $41, $42, $44, $45, $47, $48, $4a
.byte $4b, $4d, $4e, $50, $51, $53, $54, $56
(...)
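The text layout of the dump is simple enough to model in a few lines of Python. This sketch (with a made-up function name) only covers the header comment and the .byte lines with 8 bytes each. The separator after every 256 bytes is left out because its exact form isn't shown here.

```python
def dump_as_source(data, start_address, bytes_per_line=8):
    """Format bytes as .byte source lines, modeled on the sample output above."""
    lines = ["//Memory Dump - Start Address ${:04x}".format(start_address), ""]
    for offset in range(0, len(data), bytes_per_line):
        chunk = data[offset:offset + bytes_per_line]
        lines.append(".byte " + ", ".join("${:02x}".format(b) for b in chunk))
    return "\n".join(lines)


sample = bytes([0x3F, 0x41, 0x42, 0x44, 0x45, 0x47, 0x48, 0x4A,
                0x4B, 0x4D, 0x4E, 0x50, 0x51, 0x53, 0x54, 0x56])
dump_text = dump_as_source(sample, 0x2000)
```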

Instructions

Instructions are the actual assembly code that will be converted to an instruction opcode and operand bytes. The generally accepted format is:

[label] mnemonic [operand with chosen addressing type] [comment]

Example

BackgroundColor   sta $d020  //In memory this looks like $8d $20 $d0

6502 Family

On the 6502 family the assembler uses the standard instructions and addressing types, as described in 6502 opcodes on 6502.org.

Instructions, including aliases:

adc, and, asl, bcc, bcs, beq, bge, bit, blt, bmi
bne, bpl, brk, bvc, bvs, clc, cld, cli, clv, cmp
cpx, cpy, dec, dex, dey, eor, inc, inx, iny, jmp
jsr, lda, ldx, ldy, lsr, nop, ora, pha, php, pla
plp, rol, ror, rti, rts, sbc, sec, sed, sei, sta
stx, sty, tax, tay, tsx, txa, txs, tya, xor

The accepted aliases:

Instruction Alias
bcc blt
bcs bge
eor xor

Some undocumented (also called illegal) instructions are also supported using the optional -u switch. They are described on oxyron.de (which is an awesome demo scene group, check out their work) but as a brief recap, here are the supported instructions and their aliases:

ahx, alr, anc, arr, aso, asr, axs, dcp, hlt, isb
isc, jam, kil, lar, las, lax, lse, nop, rla, rra
sax, shx, shy, slo, sre, tas, xaa

Instead of using dnp or tnp or other variants for double and triple nop, the assembler uses the actual nop mnemonic with various addressing types. These are actually useful in demos, the other undocumented instructions not so much, and those can be unstable on different CPUs.

nop #$nn     //$80, uses 2 CPU cycles
nop $nn      //$04, uses 3 CPU cycles
nop $nn,x    //$14, uses 4 CPU cycles
nop $nnnn    //$0c, uses 4 CPU cycles
nop $nnnn,x  //$1c, uses 4 CPU cycles, +1 for crossing a page boundary

These nops can be useful in precise timing, or the nop $nnnn variant is a good way to temporarily comment out a jsr or jmp instruction, by replacing their opcode with $0c in a self-modifying code. Then the memory address they refer to remains intact, the instruction gets ignored by the CPU, and it can be restored when needed.

Mind you, 65C02, 65SC02 and 65C816 use these opcodes to implement new instructions, so if you want 100% compatibility with those CPUs, just refrain from using illegal instructions, including the above mentioned NOPs. You can just use "bit $1234" to temporarily disable a jsr or jmp instruction.

65C02 / 65SC02

The 65C02 / 65SC02 CPU is an extension over the standard 6502, that came with three new addressing modes and several new instructions. Undocumented (illegal) instructions are not allowed for this CPU type. For information about the changes see 65C02 opcodes on 6502.org.

Instructions, including aliases:

adc, and, asl, bcc, bcs, beq, bge, bit, blt, bmi
bne, bpl, bra, brk, bvc, bvs, clc, cld, cli, clv
cmp, cpx, cpy, dec, dex, dey, eor, inc, inx, iny
jmp, jsr, lda, ldx, ldy, lsr, nop, ora, pha, php
phx, phy, pla, plp, plx, ply, rol, ror, rti, rts
sbc, sec, sed, sei, sta, stx, sty, stz, tax, tay
trb, tsb, tsx, txa, txs, tya, xor

(Vendor-specific instructions)
bbr0, bbr1, bbr2, bbr3, bbr4, bbr5, bbr6, bbr7
bbs0, bbs1, bbs2, bbs3, bbs4, bbs5, bbs6, bbs7
rmb0, rmb1, rmb2, rmb3, rmb4, rmb5, rmb6, rmb7
smb0, smb1, smb2, smb3, smb4, smb5, smb6, smb7
stp, wai

The accepted aliases:

Instruction Alias
bcc blt
bcs bge
eor xor

65816

The 65816 CPU is an extension over the standard 6502 (and partially over 65C02), that came with several new addressing modes and instructions, 16 bit registers etc. Undocumented (illegal) instructions are not allowed for this CPU type. For information about the changes see 65816 opcodes on 6502.org.

Instructions, including aliases:

adc, and, asl, bcc, bcs, beq, bge, bit, blt, bmi
bne, bpl, bra, brk, brl, bvc, bvs, clc, cld, cli
clv, cmp, cop, cpx, cpy, dec, dex, dey, eor, inc
inx, iny, jml, jmp, jsl, jsr, lda, ldx, ldy, lsr
mvn, mvp, nop, ora, pea, pei, per, pha, phb, phd
phk, php, phx, phy, pla, plb, pld, plp, plx, ply
rep, rol, ror, rti, rtl, rts, sbc, sec, sed, sei
sep, sta, stp, stx, sty, stz, tax, tay, tcd, tcs
tdc, trb, tsb, tsc, tsx, txa, txs, txy, tya, tyx
wai, xba, xce, xor

The accepted aliases:

Instruction Alias
bcc blt
bcs bge
eor xor
jmp jml (for long addressing)
jsr jsl (for long addressing)

Due to the changes in this CPU, some instructions may be saved in a way you don't expect:

brk $12  -->  $00 $12
brk      -->  $00 $00

cop $12  -->  $02 $12
cop      -->  $02 $00

The 65816 deals with some unfortunate ambiguity when it comes to instruction opcodes. For example the instructions lda #$12 and lda #$1234 use the same $a9 opcode, which makes it hard to figure out what's going on just by reading the bytes in a file. How they should be handled is decided by CPU flags for the Accumulator (M) and for the Index Registers (X) that you need to set up in your code by the instructions "sep" and "rep". These take a bitfield value, where each 1 bit sets or resets the corresponding CPU flag.

The disassembler attempts to deal with this by following the changes in the M and X flags in sequential code.

The assembler gives the control to you. If you enter lda #$12, that will be saved as $a9 $12. If you enter lda #$1234, that will be saved as $a9 $34 $12, even if you didn't actually change the M flag yet.

If the CPU is set to 16 bit Accumulator mode and your value happens to be an 8-bit number (or anything smaller than what you need), you may enter it with a Preferred Number Size, like lda #$0012. This tells the compiler that the value should be stored on 16 bits. It works with 24 bit values too, for example sta $000012 or sta $001234,x if you must do that.

If you don't want to manage this so closely, the 65816 target has two setting values that you can modify via the .setting directive. RegA16 and RegXY16 tell the compiler that from now on, until the setting is set to false again, all immediate addressing should be handled in 16 bit mode. For example lda #$12 will be handled as if you typed in lda #$0012.

You may set up macros that turn the CPU flags and these settings on and off to make the code less error-prone.
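The sizing rules above for lda with an immediate value can be summed up in a short Python model. It is a sketch of the described behavior only: the function name and its parameters are hypothetical, the operand is passed as the hex digits typed after #$, and the reg_a16 flag stands in for the RegA16 setting.

```python
LDA_IMMEDIATE = 0xA9  # opcode shared by the 8-bit and 16-bit forms


def encode_lda_immediate(hex_digits, reg_a16=False):
    """Encode lda #$<hex_digits>, picking the operand size as described above."""
    value = int(hex_digits, 16)
    # More than two hex digits act as a Preferred Number Size of 16 bits.
    wide = reg_a16 or len(hex_digits) > 2
    if wide:
        return bytes([LDA_IMMEDIATE, value & 0xFF, (value >> 8) & 0xFF])
    return bytes([LDA_IMMEDIATE, value & 0xFF])
```

In this model "12" gives $a9 $12, "1234" gives $a9 $34 $12, and both "0012" and "12" with reg_a16 enabled give $a9 $12 $00.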

Nintendo Gameboy

The Gameboy has a custom CPU which is based on the Z80, but has some different instructions and registers.

Instructions, including aliases:

adc, add, and, bit, call, ccf, cp, cpl, daa, dec
di, ei, halt, inc, jp, jr, ld, ldd, ldh, ldhl
ldi, nop, or, pop, push, res, ret, reti, rl, rla
rlc, rlca, rr, rra, rrc, rrca, rst, sbc, scf, set
sla, sra, srl, stop, sub, swap, xor

If you're familiar with Gameboy assembly programming, these are alternative ways to do certain things:

//Alternatives for dealing with "HL, decreased"
ld a,(hl-)
ld a,(hld)
ldd a,(hl)

ld (hl-),a
ld (hld),a
ldd (hl),a

//Alternatives for dealing with "HL, increased"
ld a,(hl+)
ld a,(hli)
ldi a,(hl)

ld (hl+),a
ld (hli),a
ldi (hl),a

//Alternatives for dealing with the High RAM where the hardware registers are
ld ($ff00+c),a
ld (c),a

ld a,($ff00+c)
ld a,(c)

ld ($ff12),a
ldh ($12),a

ld a,($ff12)
ldh a,($12)

Most instructions that normally just imply working with the "a" register can be written as showing the "a" register, just in case that's how you like it. Some examples:

daa    =  daa a

cpl    =  cpl a

and a  =  and a,a

and b  =  and a,b

and c  =  and a,c

The assembler tries to be compatible with existing RGBDS source code files, so the brackets in the addressing modes that use them are accepted with square brackets "[ ]" as well.

A "nop" is placed behind each "halt" and "stop" instruction during code compilation to avoid hardware issues. This behavior can be changed in the Settings Xml file, but it's recommended to keep it this way.

Z80

The Z80 CPU is a classic with a wide range of instructions and addressing modes, as described in Z80 opcodes on clrhome.org.

Instructions:

adc, add, and, bit, call, ccf, cp, cpd, cpdr, cpi
cpir, cpl, daa, dec, di, djnz, ei, ex, exx, halt
im, in, inc, ind, indr, ini, inir, jp, jr, ld
ldd, lddr, ldi, ldir, neg, nop, or, otdr, otir, out
outd, outi, pop, push, res, ret, reti, retn, rl, rla
rlc, rlca, rld, rr, rra, rrc, rrca, rrd, rst, sbc
scf, set, sla, sra, srl, sub, xor

Most instructions that normally just imply working with the "a" register can be written as showing the "a" register, just in case that's how you like it. Some examples:

daa    =  daa a

cpl    =  cpl a

and a  =  and a,a

and b  =  and a,b

and c  =  and a,c

Another notable alias is for the ex af,af' instruction, which swaps the contents of the af register pair with its background counterpart. The apostrophe (') character at the end may cause your text editor to go haywire due to an unclosed character literal, so it's recommended to write this instruction without it. The assembler handles it both ways.

//This instruction...
ex af,af'

//Is the same as...
ex af,af

Some undocumented (also called illegal) instructions are also supported using the optional -u switch. The only unique instruction by mnemonic is sll, but there are a bunch of otherwise standard instructions with undocumented addressing modes that you can utilize, if needed.

This usually may not apply to pure Z80 source code, but similarly to the Nintendo Gameboy CPU mentioned above, the assembler tries to be compatible with existing RGBDS source code files, so the brackets in the addressing modes that use them are accepted with square brackets "[ ]" as well.


Output File Formats

The assembler is capable of saving the compiled bytes from the target memory banks in various formats. Some are generic, others are system specific. For example it makes no sense to save 65c02 code in Gameboy ROM format, or NES code in a T64 format, even though the NES uses the 6502 CPU similar to Commodore computers.

The output file format can be selected from the command line eg -O=txt, or using the .setting directive eg .setting "OutputFileType", "nes"

BIN – Binary file

Binary file with .bin extension. It contains the compiled memory bytes, from the first used memory address to the last used one. If it's chosen in Settings, it may be an entire memory bank. The unset bytes and memory gaps are always filled with $00.

This file contains no meta information about the load address, it's always up to the user to load the file to the correct memory location.

PRG – Binary file with load address header

Program file with .prg extension. It's the same as a Binary file, but the load address is saved as the first few bytes of the file to identify the correct memory location. This is the format that Commodore computers use.

The load address is stored using the target CPU's endianness and address width. Meaning, 8-bit CPUs like the 6502 in Commodore computers store the address on the first 2 bytes, where an address like $0801 is stored as bytes $01, $08.

Projects for other CPUs may use this format as well, it's not limited to the 6502. For a 16 or 32 bit CPU the address header takes up 4 bytes because the load address is stored on 32 bits.
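As an illustration of the header layout, this Python sketch prepends a load address to a block of compiled bytes. The function name and the wide_address flag are made up for the example, and it assumes a little-endian target, while the assembler follows the endianness of the selected CPU.

```python
import struct


def build_prg(load_address, payload, wide_address=False):
    """Prepend a 2-byte (or 4-byte) little-endian load address to the payload."""
    header = struct.pack("<I" if wide_address else "<H", load_address)
    return header + payload


prg = build_prg(0x0801, bytes([0x0B, 0x08]))
```

The resulting file starts with $01, $08 for the load address $0801, followed by the program bytes. With wide_address enabled the header takes up 4 bytes instead.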

T64 – Tape image format for Commodore computers

Tape image file with .t64 extension. Used by Commodore computers (emulators), it's basically a tape with one or more files on it, up to a maximum of 30 files. The files are saved as PRG files with a load address header on each of them. Use this only for 6502 and perhaps for 65C02 / 65816 projects.

D64 – Disk image format for Commodore computers

Disk image file with .d64 extension. Initially it's created as a T64 file, so it can hold up to a maximum of 30 files. Then it's converted to D64 using the VICE emulator package's program "c1541". The path to this VICE directory has to be set up in the retroassembler-settings.xml file, in the VicePath value.

TXT – Configurable text file format

ASCII text file with .txt extension. It contains the compiled memory bytes from the first used memory address to the last used one. If there are gaps in the memory, the unused bytes are skipped and the continuous bytes are organized into Areas. The output for a file may look like this:

c000 78 d8 a2 40 8e 17 40 a2
c008 ff 9a e8 8e 00 20 8e 01
c010 20 8e 10 40 2c 02 20 10
c018 fb a9 00 95 00 9d 00 01
c020 9d 00 02 9d 00 04 9d 00
c028 05 9d 00 06 9d 00 07 a9
c030 fe 9d 00 03 e8 d0 e2 2c
c038 02 20 10 fb a9 80 8d 01
c040 20 4c 41 c0 40 40

fffa 44 c0 00 c0 45 c0

Nearly every aspect of the text file format can be customized via the OutputTxt Settings. The values are printed using the .Net string.Format() function, but if you're not familiar with it, here is a primer.

A string value like "{0:x04} " means that the Parameter 0 (which is the number being printed) should be in hexadecimal format, padded up to 4 characters with 0s, in lower case, with a space after it. For example $1ab would be shown as "01ab ".

"{0:X02}" would mean the number should be in hexadecimal format, padded up to 2 characters with 0s, in upper case (mind the upper case X in the string). For example $0d would be shown as "0D".

"{0}" would mean the number should be in decimal format, without any padding. For example $0d would be shown as "13".

The rest of the settings are fairly obvious, they are about where to put spacing, where to put newline characters and what they should be like. I chose the Windows standard "\r\n" for default values. You may also control how many bytes should be printed into a single line.
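If you know Python better than .Net, the three format strings from the primer have close equivalents in Python's format() function. The sketch below is only an analogy to show what the output looks like, the assembler itself uses the .Net function, and the helper names are made up.

```python
def address_field(value):
    """Python analog of "{0:x04} ": 4 lower case hex digits and a space."""
    return format(value, "04x") + " "


def byte_field_upper(value):
    """Python analog of "{0:X02}": 2 upper case hex digits."""
    return format(value, "02X")


def decimal_field(value):
    """Python analog of "{0}": plain decimal, no padding."""
    return str(value)


def txt_line(address, data):
    """One output line in the style of the sample above."""
    return address_field(address) + " ".join(format(b, "02x") for b in data)
```

For example txt_line(0xc000, bytes([0x78, 0xd8, 0xa2, 0x40, 0x8e, 0x17, 0x40, 0xa2])) produces the first line of the sample output above.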

GB – Nintendo Gameboy ROM format

Nintendo Gameboy ROM file with .gb extension. This format uses a special header and memory banks. The scope of this document doesn't allow for a Gameboy development tutorial, but you may consult the attached example.

In a nutshell, the Gameboy uses a custom CPU with some RAM and switchable memory banks that are 16KB each. Bank 0 is always mapped at $0000, other banks are mapped at $4000.

The Gameboy ROM builder identifies the banks by their bank number, the Info string is not in use (unlike for NES ROMs). If all you want to build is a 32KB ROM where you don't use bank switching, just use the default Bank 0 and the linker will handle it as a 32KB bank between $0000-$7fff.

Multiple banks should be used like this:

//The Main bank 0, mapped to $0000
.bank 0, 16, $0000

//Other banks, mapped to $4000
.bank 1, 16, $4000
.bank 2, 16, $4000
.bank 3, 16, $4000

The ROM format requires having a special header between $0100-$014f, which includes the Nintendo logo, creator info, cartridge descriptors and calculated checksum values. The ROM builder handles all that for you. The number of banks is detected and missing banks are padded up for a valid ROM size. You will need to set up the Reset and Interrupt vectors on the zero page, but the jump at $100 to the starting point of your program is generated by the linker. A typical Gameboy program starts at $0150, the assembler automatically sets this address for the entered code in Bank 0 by default.

You may choose your own Memory Bank Controller (MBC) type, but realistically, look at the calendar... Unless you have some super special needs, just set the other GameboyCartridge... setting values and the linker will figure out which MBC5 cartridge type you need. Just write your bank switching code according to the MBC5 specification and call it a day.

See the Gameboy ROM format Settings that you can set up for your project, it covers everything you need to build a valid Gameboy ROM file with header. Set these with the .setting directive from your main source code file.

NES – Nintendo Entertainment System ROM format

Nintendo Entertainment System ROM file with .nes extension. This format uses a special header and memory banks. The scope of this document doesn't allow for a NES development tutorial, but you may consult the attached example.

In a nutshell, the NES uses a 6502 CPU with minimal RAM and switchable Program (PRG) and Character (CHR) banks. The latter is specifically for graphics, tiles and sprites. PRG banks are 16KB each, CHR banks are 8KB each.

The NES ROM builder identifies the bank types and their sequential numbers solely from the Info string parameter of the .bank directive.

Examples:

//The Main PRG bank, mapped to $c000
.bank 0, 16, $c000, "NES_PRG0"

//Another PRG bank, mapped to $8000
.bank 1, 16, $8000, "NES_PRG1"

//This PRG bank will be ignored by the linker because NES_PRG2 is missing,
//so it stops collecting PRG banks and goes ahead to collect CHR banks.
.bank 2, 16, $8000, "NES_PRG3"

//The first CHR bank, mapped to $0000 (doesn't matter, it's not code)
.bank 3, 8, 0, "NES_CHR0"

//The second CHR bank, mapped to $0000 (doesn't matter, it's not code)
.bank 4, 8, 0, "NES_CHR1"

//Special, completely optional "Trainer" bank, mapped between $7000-$71ff
//Just set 1 KB size, only the first 512 bytes will be taken from it.
.bank 5, 1, $7000, "NES_Trainer"

The assembler bank numbers don't matter for NES ROMs, only the numbers in the Info string. So you can even make PRG banks use 0-99, CHR banks use 100-199, Trainer use 200... It's all up to you. But when it comes to the bank numbers in the Info strings, they have to start from 0 and be continuous. The linker will start looking for "Does NES_PRG[number] exist?" and increment the number. If it doesn't find the PRG bank by the currently checked number, it goes ahead to look for CHR banks and stops at a missing bank number there as well.
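The search described above can be modeled in a few lines of Python. This is a sketch of the lookup logic as the text explains it, with a hypothetical function name, not the linker's actual code.

```python
def collect_banks(info_strings, prefix):
    """Collect prefix0, prefix1, ... and stop at the first missing number."""
    found = []
    number = 0
    while prefix + str(number) in info_strings:
        found.append(prefix + str(number))
        number += 1
    return found


# Info strings from the example above. NES_PRG2 is missing on purpose.
infos = {"NES_PRG0", "NES_PRG1", "NES_PRG3", "NES_CHR0", "NES_CHR1", "NES_Trainer"}
prg_banks = collect_banks(infos, "NES_PRG")
chr_banks = collect_banks(infos, "NES_CHR")
```

Here prg_banks ends up with NES_PRG0 and NES_PRG1 only, because the search stops at the missing NES_PRG2 and never reaches NES_PRG3.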

You can use a lot of PRG and CHR banks, even more than 255 each if you need to. The linker utilizes some of the NES 2.0 ROM format extensions so you can build mega ROMs.

See the NES ROM format Settings that you can set up for your project, it covers everything you need to build a valid NES ROM file with header. Set these with the .setting directive from your main source code file.

SNES – Super Nintendo Entertainment System ROM format

Super Nintendo Entertainment System ROM file with .sfc extension. This format uses a special header and memory banks. The scope of this document doesn't allow for a SNES development tutorial, but you may consult the attached example.

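In a nutshell, the SNES uses a 65816 CPU with minimal RAM and switchable memory banks that may contain code, graphics or other data. There are two versions of the memory bank layout:

LoROM, with 32 KB banks mapped between $8000-$ffff
HiROM, with 64 KB banks mapped between $0000-$ffff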

The ROM builder supports all possible SNES ROM settings in theory, they can be set up in the header, but the non-standard ROM types and sizes will require additional work on the user's side to make the ROM file valid. The standard (or accepted) ROM sizes are handled well and the checksum values are calculated correctly for those. The SNESPadding setting is True by default, meaning it will pad the ROM with extra empty banks at the end of the file to reach the (next) valid ROM size, which are the following:

256KB (2 MBit), 512KB (4 MBit), 1MB (8 MBit), 1.5MB (12 MBit), 2MB (16 MBit), 3MB (24 MBit), 4MB (32 MBit)

The memory banks are used by their bank number (not by the Info string parameter, like in the NES ROM format), starting from 0 and collected sequentially. Gaps between banks are filled with empty banks. The bank size is determined by the SNESHiROM setting, which is False by default, meaning the LoROM format is in use with 32 KB banks by default.
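The padding rule can be sketched in Python like this. It is an illustrative model with made-up names: it takes the number of banks in use (the highest bank number plus one, since gaps are filled with empty banks) and picks the next valid ROM size from the list above. What the ROM builder does beyond 4MB is not covered here.

```python
VALID_ROM_SIZES_KB = [256, 512, 1024, 1536, 2048, 3072, 4096]


def padded_rom_size_kb(bank_count, hirom=False):
    """Smallest valid ROM size (in KB) that holds bank_count banks."""
    bank_size_kb = 64 if hirom else 32  # HiROM banks are 64 KB, LoROM banks 32 KB
    used_kb = bank_count * bank_size_kb
    for size_kb in VALID_ROM_SIZES_KB:
        if used_kb <= size_kb:
            return size_kb
    raise ValueError("more banks than the largest standard ROM size holds")


def empty_banks_added(bank_count, hirom=False):
    """Number of empty banks appended to reach the padded size."""
    bank_size_kb = 64 if hirom else 32
    return padded_rom_size_kb(bank_count, hirom) // bank_size_kb - bank_count
```

For example four LoROM banks (128KB) are padded up to 256KB with four empty banks.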

The ROM header is generated in Bank 0, between $ffc0-$ffdf, and (as far as I know) the interrupt vectors are always in Bank 0, from $ffe0 (technically from $ffe4).

LoROM setup example (32 KB banks)

.setting "SNESPadding", true         //Create a valid ROM size by padding with extra, empty banks
.setting "SNESHiROM", false          //32 KB banks between $8000-$ffff (LoROM format)
.setting "SNESCartridgeType", 2      //ROM and SRAM
.setting "SNESSRAMSize", 3           //8 KB SRAM
.setting "SNESFastROM", true         //Fast ROM with 120ns access speed
.setting "SNESCountry", 1            //USA (NTSC)
.setting "SNESLicenseeCode", 1       //Nintendo's licensee code
.setting "SNESTitle", "My test ROM"  //ROM title in max 21 characters

//The Main bank 0, mapped to $8000
.bank 0, 32, $8000

//Other banks, mapped to $8000
.bank 1, 32, $8000
.bank 2, 32, $8000
.bank 3, 32, $8000

HiROM setup example (64 KB banks)

.setting "SNESPadding", true         //Create a valid ROM size by padding with extra, empty banks
.setting "SNESHiROM", true           //64 KB banks between $0000-$ffff (note the TRUE value!)
.setting "SNESCartridgeType", 2      //ROM and SRAM
.setting "SNESSRAMSize", 3           //8 KB SRAM
.setting "SNESFastROM", true         //Fast ROM with 120ns access speed
.setting "SNESCountry", 1            //USA (NTSC)
.setting "SNESLicenseeCode", 1       //Nintendo's licensee code
.setting "SNESTitle", "My test ROM"  //ROM title in max 21 characters

//The Main bank 0, mapped to $0000
.bank 0, 64, $0000

//Other banks, mapped to $0000
.bank 1, 64, $0000
.bank 2, 64, $0000
.bank 3, 64, $0000

Of course these banks need .segment entries and everything else, and the ROM settings are just examples. Substitute your own needs for a cartridge type, SRAM quantity (may be 0) and so on. See the SNES settings and the SNES Kart document for more info.

SMS – Sega Master System ROM format

Sega Master System ROM file with .sms extension. This format uses a special header and memory banks. This is the same as the Sega Game Gear format. The scope of this document doesn't allow for a Sega development tutorial, but you may consult the attached example.

In a nutshell, the SMS and GG use a Z80 CPU with some RAM and switchable memory banks that are 16KB each. Bank 0 is always mapped at $0000, other banks are mapped at $4000 or $8000, mostly the latter. By convention, Bank 0 and Bank 1 (or just a 32KB Bank 0) are mapped in by default in the first 32KB, and the other banks, if available, should be mapped at $8000.

The SMS/GG ROM builder identifies the banks by their bank number; the Info string is not in use (unlike for NES ROMs). If all you want to build is a 32KB ROM where you don't use bank switching, just use the default Bank 0 and set it to 32KB in size.

Multiple banks should be used like this:

//The Main bank 0, mapped to $0000
.bank 0, 16, $0000

//The (other) Main bank 1, mapped to $4000
.bank 1, 16, $4000

//Other banks that will be mapped to $8000
.bank 2, 16, $8000
.bank 3, 16, $8000

Or like this, which is essentially the same:

//The Main bank 0 in 32KB size, mapped to $0000
.bank 0, 32, $0000

//Other banks that will be mapped to $8000
.bank 1, 16, $8000
.bank 2, 16, $8000
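
To see why the two layouts above are essentially the same, here is a minimal Python sketch that converts a bank number and CPU address into a byte offset in the ROM file. It assumes the banks are written into the file back to back in bank-number order, which is my assumption and not something taken from the ROM builder itself. The function and variable names are made up for this illustration.

```python
KB = 1024

# (bank number, size in KB, mapped base address), mirroring the .bank lines above.
FOUR_BANKS = [(0, 16, 0x0000), (1, 16, 0x4000), (2, 16, 0x8000), (3, 16, 0x8000)]
WIDE_BANK0 = [(0, 32, 0x0000), (1, 16, 0x8000), (2, 16, 0x8000)]


def bank_file_offsets(banks):
    """Map each bank number to its starting byte offset in the ROM file.

    Assumption: banks are stored consecutively, ordered by bank number.
    """
    offsets = {}
    position = 0
    for number, size_kb, _base in sorted(banks):
        offsets[number] = position
        position += size_kb * KB
    return offsets


def rom_offset(banks, bank_number, address):
    """Convert a CPU address inside a bank into a ROM file offset."""
    for number, size_kb, base in banks:
        if number == bank_number:
            if not base <= address < base + size_kb * KB:
                raise ValueError("address is outside the bank")
            return bank_file_offsets(banks)[number] + (address - base)
    raise KeyError(bank_number)
```

With these definitions, address $4000 lands on the same file offset whether it belongs to Bank 1 of the first layout or to the 32KB Bank 0 of the second one. The same is true for the first bank mapped at $8000 in each layout.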

The ROM format requires having a special header between $7ff0-$7fff, which includes an identifier string, product code, country code, ROM size and a calculated checksum value. The ROM builder handles all that for you. The number of banks is detected and missing banks are padded up for a valid ROM size. The valid ROM sizes are 32KB, 64KB, 128KB, 256KB and 512KB.
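
The padding rule can be illustrated with a short Python sketch. It only uses the list of valid ROM sizes given above; the function names are hypothetical and this is not the ROM builder's actual code, just a way to work out in advance how large the output file should be.

```python
# Valid SMS/GG ROM sizes in KB, as listed in the paragraph above.
VALID_ROM_SIZES_KB = (32, 64, 128, 256, 512)


def padded_rom_size_kb(bank_sizes_kb):
    """Return the smallest valid ROM size (in KB) that holds all banks."""
    total_kb = sum(bank_sizes_kb)
    for size_kb in VALID_ROM_SIZES_KB:
        if total_kb <= size_kb:
            return size_kb
    raise ValueError(f"{total_kb} KB exceeds the largest valid ROM size")


def padding_kb(bank_sizes_kb):
    """Return how many KB of empty padding would be needed."""
    return padded_rom_size_kb(bank_sizes_kb) - sum(bank_sizes_kb)
```

For example, four 16KB banks fill a 64KB ROM exactly, while five 16KB banks (80KB) would need 48KB of padding to reach the next valid size of 128KB.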

See the SMS & GG ROM format Settings that you can set up for your project. They cover everything you need to build a valid SMS/GG ROM file with a header. Set these with the .setting directive from your main source code file.

There is another semi-standard header called the SDSC Header, placed between $7fe0-$7fef, with information about the ROM and its developer. The ROM builder doesn't create this, but you can easily add it manually in the source code if you want to support it in your ROM file.

GG – Sega Game Gear ROM format

Sega Game Gear ROM file with .gg extension.

This is exactly the same as the Sega Master System ROM format, but with a .gg extension that helps emulators differentiate between system types. Both use the Z80 CPU, ROM banks and the same header format.

See the SMS ROM format above and use the SMS & GG ROM format Settings in your source code.

TAP – ZX Spectrum 48K Tape format

ZX Spectrum 48K Tape file with .tap extension. This format is limited to 48 KB of data, between $4000-$ffff. The scope of this document doesn't allow for a ZX Spectrum development tutorial, but you may consult the attached example.

The ZX Spectrum uses the Z80 CPU and its 48K version has 48 KB of RAM, mapped between $4000-$ffff. The first 16 KB is occupied by the ROM, which contains a BASIC interpreter and system routines. While there is a 128K version of this computer, the TAP file format is restricted to 48 KB in this case. Future expansions are possible, if there is demand for it.

The Tape file has multiple headers and data blocks. To make it work in an emulator (or on a real ZX Spectrum), it has to start with a BASIC loader. First this program gets loaded into the memory and when it's executed, it loads and runs the compiled binary file.
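
For the curious, the header and data block structure can be sketched in Python as shown here. This follows the commonly documented TAP layout (a 2-byte little-endian length, a flag byte, the payload, and an XOR checksum) and is not taken from the assembler's own TAP builder. The function names are made up for this illustration, and the BASIC loader program block is left out.

```python
import struct
from functools import reduce


def tap_block(flag, payload):
    """Wrap a payload into one TAP block.

    flag is 0x00 for a header block and 0xFF for a data block.
    The checksum is the XOR of the flag byte and every payload byte.
    """
    body = bytes([flag]) + payload
    checksum = reduce(lambda a, b: a ^ b, body, 0)
    # The length field counts the flag, the payload and the checksum.
    return struct.pack("<H", len(body) + 1) + body + bytes([checksum])


def code_header(name, length, start):
    """Build the header block that describes a CODE (type 3) file."""
    filename = name.encode("ascii")[:10].ljust(10, b" ")
    # Type, 10-character name, data length, start address, unused (32768).
    payload = bytes([3]) + filename + struct.pack("<HHH", length, start, 32768)
    return tap_block(0x00, payload)


def code_blocks(name, start, data):
    """Return the header block followed by the data block for a binary."""
    return code_header(name, len(data), start) + tap_block(0xFF, data)
```

A header block built this way is always 21 bytes long (2 length bytes plus 19 bytes of content), and the XOR of everything after the length field comes out as zero, which is a quick way to check a block's integrity.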

BASIC loader example

10 REM Retro Assembler
20 BORDER VAL "0": PAPER VAL "0": INK VAL "7"
30 CLEAR VAL "24575"
50 LOAD "code"CODE 
60 RANDOMIZE USR VAL "32768"

The assembler enters the TapClear setting value into line 30 (defaults to $5fff), and the TapStart setting value into line 60 (defaults to $8000) to customize the BASIC loader for the compiled code's needs.
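
As a small illustration of how the two settings end up in the listing, the Python sketch below rebuilds the loader text from a pair of values. It only produces the readable listing shown above, not the tokenized BASIC program stored in the TAP file, and the function name is hypothetical.

```python
def basic_loader(tap_clear=0x5FFF, tap_start=0x8000):
    """Return the BASIC loader listing for the given setting values.

    The defaults match the TapClear and TapStart defaults mentioned above.
    BASIC needs decimal numbers, so the addresses are formatted as such.
    """
    return [
        '10 REM Retro Assembler',
        '20 BORDER VAL "0": PAPER VAL "0": INK VAL "7"',
        f'30 CLEAR VAL "{tap_clear}"',
        '50 LOAD "code"CODE',
        f'60 RANDOMIZE USR VAL "{tap_start}"',
    ]
```

With the default values, $5fff becomes 24575 in line 30 and $8000 becomes 32768 in line 60, matching the example listing.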

The ZX Spectrum's graphics memory is allocated between $4000-$5aff, and the system variables and the BASIC program area follow it, so programs should be loaded from $6000 or higher, and in many cases programs start at $8000. This is up to the developer. According to my tests, even programs loaded to $4000 start up fine in the Speccy emulator, but the screen gets messy due to writing into the graphics memory. The Spectrum loads the BASIC program from about $5c00 (just outside the graphics memory), so loading files from $6000 is safe, without the risk of overwriting this small BASIC loader's program. The TAP file builder ignores data you may put below $4000 and above $ffff.

If you are wondering how Spectrum games load a picture first and then the game itself, it's done with multiple data blocks in the TAP file. First a BASIC loader is loaded, which loads an image file to $4000, then the code/data file to $6000 or higher. The TAP file builder doesn't support this because it's meant for testing code, so if you are planning to write a game with a hero image, you will need some other utility to link up your final code.


Integration with IDEs and Text Editors

Source code for Retro Assembler can be written in any text editor, but it's best to use one that supports user defined languages and syntax highlighting. I made support for two, which might be the most popular code editors today.

If you create a syntax highlighter for your favorite editor, I would be happy to hear from you. It can be added to the assembler package and its documentation with proper crediting.

Visual Studio Code

Visual Studio Code is a powerful, multi-platform IDE from Microsoft. If you don't have a chosen IDE for assembly development, just go with this one.

I created an Extension for it which makes VS Code work with Retro Assembler quite well. It can be downloaded from the Marketplace inside VS Code, just search for "retro assembler" or visit this page: VS Code Extension for Retro Assembler

Make sure you read the Readme of the extension itself in VS Code about how to set it up.

You can set up keyboard shortcuts and the assembler path to compile (and even auto-start) your code directly from the IDE.

VS Code has a bit of an issue with identifying the correct CPU type for the viewed source code file, but you can use the MyFile.6502.asm file name hack for automatic CPU detection explained above, or you can set a default language syntax for the assembly files. This is also covered in the Readme file.

You should consider using the Retro Assembler Light or Dark theme included with the extension, because it will do correct colorization of normal and undocumented instructions, directives etc. I've done what I could to make it look OK with other themes, but the mess of VS Code theme styles is a story for another day. The extension will be updated with new features and coverage for future assembler versions.

Notepad++

Notepad++ is a small but powerful text editor. It was the first editor I made support for to turn it into an IDE. It can use User Defined Languages with syntax highlighting by colorization, folding of comments, functions and macros, and also regions.

You can set this all up by clicking the menu item Language -> Define your language... and using the Import... button to import one or more of these files that came with the assembler's package.

File name                    Description
RetroAssembler_6502.xml      Syntax highlighting for the 6502 CPU family's instructions and registers.
RetroAssembler_65C02.xml     Syntax highlighting for the 65C02 CPU family's instructions and registers.
RetroAssembler_65816.xml     Syntax highlighting for the 65816 CPU family's instructions and registers.
RetroAssembler_Gameboy.xml   Syntax highlighting for the Gameboy CPU's instructions and registers.
RetroAssembler_Z80.xml       Syntax highlighting for the Z80 CPU's instructions and registers.

After this you must restart Notepad++ so it will utilize the imported file(s) correctly.

The automatically recognized source code file extensions within Notepad++ will be ".asm, .s, .inc", like "musicplayer.asm", but you can change these or add more by going back to the language editor. Select a Retro Asm CPU language and edit the Ext. field on the top.

Whether Notepad++ can recognize the file by extension is questionable, especially if you install language files for more than one CPU. You may have to choose the edited file's language manually in the Language menu. There is no support for the MyFile.6502.asm file name hack for automatic CPU detection explained above. I tried to add it to the extension list and it doesn't work. Maybe in a future version of Notepad++.


Change Log

11/13/2018

10/15/2018

10/1/2018

8/23/2018

8/20/2018

7/24/2018

2/5/2018

12/12/2017

12/7/2017

12/6/2017

8/3/2017

7/21/2017

6/18/2017