Chapter 3 — Assembly Language

Chapter 2 showed you what a program looks like as raw bytes in memory. It also showed the problem: once a program is more than a toy, raw bytes are miserable to maintain. There are no names, every address has to be tracked by hand and the code stops looking like something a human being can reason about.

Assembly solves exactly that problem. The CPU still runs the same machine code, but you get readable instruction names, names for addresses and a source file you can actually inspect without decoding hex in your head. This chapter introduces that surface: ld, constants, named storage and the first file layout rules you need in AZM.


A First Program

Here is the same add-5-and-3 program from Chapter 2, rewritten in assembly:

.org $0000
main:
  ld a, 5
  ld b, a
  ld a, 3
  add a, b
  ld (result), a
  halt

.org $8000
result: .db 0

The five instructions in the body of main, from ld a, 5 to ld (result), a, are the same five operations you already saw in Chapter 2. Three new constructs frame them.

.org $0000 tells the assembler: everything from here assembles starting at address $0000. main: is a label — the assembler records it as the current address, so main refers to $0000. halt stops the CPU. .org $8000 starts a new block at $8000. result: is another label, and .db 0 places one byte with value 0 at the current address — so result refers to $8000.

Now read the body. ld a, 5 loads 5 into A. ld b, a copies A into B. ld a, 3 replaces A with 3. add a, b adds B (still 5) to A (now 3), leaving 8 in A.

ld (result), a stores A into the byte named result. The parentheses mean “memory at the address of result.” The assembler works out that address ($8000) and substitutes it into the instruction. The generated bytes are the same kind of bytes you saw in Chapter 2: assembly gives you names for instructions and addresses, but the CPU still runs machine code.
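To make the link between source and bytes concrete, here is the same program again with each line annotated by the address it lands at and the bytes a Z80 assembler generates for it. The opcodes are the standard Z80 encodings; the comment layout is only an illustration and is not AZM's listing format.

```asm
.org $0000
main:
  ld a, 5          ; $0000: 3E 05     load immediate 5 into A
  ld b, a          ; $0002: 47        copy A into B
  ld a, 3          ; $0003: 3E 03     load immediate 3 into A
  add a, b         ; $0005: 80        A = A + B
  ld (result), a   ; $0006: 32 00 80  store A at $8000 (low byte first)
  halt             ; $0009: 76        stop the CPU

.org $8000
result: .db 0      ; $8000: 00        one byte, initial value 0
```

Notice that the name result never appears in the output. Only its address does, as the two bytes 00 80 inside the store instruction.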


What AZM adds — and what it doesn’t

You just saw two constructs in that program that are not Z80 instructions: .org and .db. These are assembler directives. Before going further, it helps to be clear about what AZM adds on top of raw Z80 assembly.

Raw Z80 assembly gives you instruction mnemonics (ld, add, jp, call …), labels for addresses and directives for placing data (db, dw). That is all. Every classic Z80 assembler provides roughly this surface.

AZM adds the following on top:

  • Directives: .org places code and data at specific addresses; .equ names a compile-time constant; .db, .dw, .ds define storage; .include splits programs across files; .cstr, .pstr, .istr handle string types
  • op — defines an inline instruction sequence that expands at each call site, with no call overhead
  • type / union — named record layouts with scalar types (byte, word, addr); sizeof and offset compute byte sizes and field positions as compile-time constants; .ds accepts type expressions such as .ds Sprite[16]
  • enum — named sets of values with no memory allocated
  • AZMDoc — formal ;! register contracts on subroutines, verified by the assembler

AZM does not add function declarations, local variables, structured control-flow keywords or typed assignment operators. Other languages call a named block of reusable code a function; in AZM it is a subroutine built from call and ret. Every program is flat Z80 instructions with labels.

If you look up .org or .equ in the manual for a classic Z80 assembler you will usually find equivalents, often written ORG and EQU: they are standard assembler directives, not AZM inventions. The Z80 mnemonics (ld, add, cp, djnz, call, ret) are always Z80 instructions and any Z80 reference covers them.


Placing code and data with .org

A Z80 program still has to respect the memory map from Chapter 1. Code has to land somewhere executable. Variables have to live somewhere writable. .org tells the assembler where each block begins.

.org $0000
main:
  ; ... code here ...
  halt

.org $8000
count:   .db 0
scratch: .dw 0

.org $0000 tells the assembler: everything from here assembles starting at address $0000. .org $8000 starts a new block at $8000. The assembler emits each block at its specified address in the output. Execution starts at main, because $0000 is where the CPU begins after reset.

The source-file order is for readability. The assembler emits each block where .org directs, not in the order the blocks appear in the file.
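To see that source order and address order are independent, here is a minimal sketch that declares the data block first. It assumes AZM accepts .org blocks in any order, as described above; count is the same illustrative variable used earlier.

```asm
.org $8000          ; data block written first in the file...
count: .db 0        ; ...but still placed at $8000

.org $0000          ; code block written second...
main:               ; ...but still placed at $0000, where the CPU starts
  ld a, (count)     ; read the byte at $8000 into A
  halt
```

Most programs still put code first because that is the order people read in, not because the assembler requires it.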


The ld Instruction

ld is the most frequently used instruction in Z80 assembly. It copies a value from a source to a destination:

ld destination, source

ld is a pure copy. The destination receives the value, the source stays as it was and the flags register is untouched.

Every legal ld has a source and a destination. A source can be a register, an immediate constant encoded directly in the instruction or a byte in memory. A destination can be a register or a byte in memory.


The Parentheses Rule — memorise this before reading further

Parentheses always mean “go to this address in memory.”

ld a, b copies register B into A — no memory involved. ld a, (hl) reads the byte at the address held in HL from memory.

Omitting or adding parentheses produces a completely different instruction, one the assembler will happily accept, so the program silently does the wrong thing. Every beginner gets bitten by this. Now you know to watch for it.
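A short sketch of how little it takes to change the meaning. The values $40 and $8000 are arbitrary illustrations; the byte encodings in the comments are the standard Z80 ones.

```asm
.org $0000
main:
  ld a, $40      ; immediate:   A = the number $40            (bytes 3E 40)
  ld a, ($40)    ; memory read: A = byte at address $0040     (bytes 3A 40 00)
  ld hl, $8000   ; immediate:   HL = the number $8000
  ld a, (hl)     ; memory read: A = byte at the address held in HL
  halt
```

The first two lines differ only by a pair of parentheses, yet they assemble to different opcodes and do different things.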


The Z80 implements specific pairings of source and destination types — not all combinations are legal. This chapter covers the two forms used in the examples here. Chapter 4 covers the memory access forms and the complete LD forms table.

8-bit register to register

Any of A, B, C, D, E, H, L can be copied to any other:

ld a, b     ; A = B
ld d, h     ; D = H
ld l, c     ; L = C
ld a, a     ; legal, pointless

Immediate constant into register

Any 8-bit register takes an immediate byte (0–255). Any 16-bit register pair takes a 16-bit constant:

ld a, 42        ; A = 42
ld b, $FF       ; B = 255
ld hl, $8000    ; HL = $8000
ld ix, $4000    ; IX = $4000

Constants

A constant is a name for a fixed value that has no address of its own:

MaxCount .equ 10
BaseAddr .equ $8000

Wherever you write the name, the assembler substitutes the value. ld a, MaxCount becomes ld a, 10. ld hl, BaseAddr becomes ld hl, $8000. Constants produce no bytes in the output and occupy no memory at run time.

The difference between a constant and a label: a constant is a value you write down — 10, $8000. A label is an address the assembler computes from where things end up in the output.
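The distinction is easier to see with both in one file. This is a small sketch, with MaxCount and count as illustrative names:

```asm
MaxCount .equ 10      ; constant: a value you chose; emits no bytes

.org $0000
main:
  ld a, MaxCount      ; assembles exactly as ld a, 10
  ld (count), a       ; assembles as ld ($8000), a
  halt

.org $8000
count: .db 0          ; label: an address the assembler computed ($8000)
```

Move the data block to a different .org and count changes with it. MaxCount stays 10 no matter where anything is placed.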


Named Storage

You have already seen one named byte, result. More generally, named storage looks like this:

.org $8000
count:   .db 0
scratch: .dw 0

count starts at $8000. scratch follows immediately at $8001, because count is one byte wide. Since scratch is a word, it occupies two bytes: $8001 and $8002. The assembler computes all of this: you declare the variables in order and it assigns the addresses. Change count to a word later and every label after it shifts up by one byte, while the code that refers to those labels by name needs no edits.

.db (define byte) places one byte at the current address. .dw (define word) places two bytes in little-endian order. The number that follows is the initial value.
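Little-endian order is easiest to see with non-zero initial values. In this sketch the values 7 and $1234 are arbitrary; the comments show the byte that ends up at each address.

```asm
.org $8000
count:   .db 7        ; $8000: 07
scratch: .dw $1234    ; $8001: 34  (low byte first)
                      ; $8002: 12  (high byte second)
```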

You access named storage with parentheses — the same notation you use for any memory address:

ld a, (count)         ; A = byte at address of count
ld (count), a         ; byte at address of count = A

The parentheses mean the same thing everywhere — this is the Parentheses Rule from earlier applied to named storage:

Notation         Meaning
ld a, (hl)       Read byte at the address in HL
ld a, (count)    Read byte at the address of count
ld a, ($8000)    Read byte at address $8000

Same rule, three different ways of specifying the address. The parentheses are always the signal: do not use the register, name or number itself as the value; treat it as an address and go to that location in memory.

Word-size access (ld hl, (scratch)) and the full set of memory addressing forms are covered in Chapter 4.


ADD, INC and DEC

add a, b adds B to A and writes the result back into A. The original value of A is gone after the instruction; the next instruction sees A’s new value. If you need A’s original value later, copy it to another register before the add.
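A minimal sketch of the save-before-add pattern. Using C as the spare register is an arbitrary choice for illustration; any free 8-bit register works.

```asm
.org $0000
main:
  ld a, 5
  ld b, 3
  ld c, a      ; keep a copy of A's original value (5) in C
  add a, b     ; A = 8; the 5 that was in A is overwritten
  halt         ; at this point A = 8, B = 3, C = 5
```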

inc r adds 1 to register r; dec r subtracts 1. Both modify the register in place and update the flags. dec sets the Zero flag when the result reaches zero, which Chapters 5 and 6 put to use.
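The following trace shows the Zero flag tracking the result of each inc and dec. The starting value 2 is arbitrary; step through it in the emulator and compare the flag display with the comments.

```asm
.org $0000
main:
  ld b, 2      ; B = 2 (ld does not touch the flags)
  dec b        ; B = 1, Zero flag clear
  dec b        ; B = 0, Zero flag set
  inc b        ; B = 1, Zero flag clear again
  halt
```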


The Examples

Two example files accompany this chapter. Each one is a complete program you can assemble and run.

00_first_program.asm

The addition program from the beginning of this chapter: load two values, add them, store the result to a named variable. This is the smallest complete AZM program.

01_register_moves.asm

.org $0000
main:
  ld a, $FF
  ld b, $10
  ld c, $20
  ld d, a
  ld e, b
  ld hl, $1234
  ld de, $5678
  ld bc, $0064
  ld d, h
  ld e, l
  halt

ld a, $FF loads 255 into A — an immediate load, the value encoded directly in the instruction bytes. ld d, a copies A into D — a register-to-register move, no memory involved.

ld hl, $1234 loads a 16-bit immediate into HL: H gets $12, L gets $34. The instruction encodes as three bytes — the opcode, then the value in little-endian order ($34 then $12).

ld de, $5678 overwrites both D and E — the $FF that was in D from the earlier copy is gone. Every instruction replaces its destination entirely.

The final two instructions, ld d, h and ld e, l, copy HL into DE one byte at a time. After both, DE holds $1234. The Z80 has no ld de, hl instruction, so the usual way to copy one register pair into another is two 8-bit moves.

Example 02_constants_and_labels.asm demonstrates word-size memory access and is walked through in Chapter 4.


When Your Program Does the Wrong Thing

You have code that assembles now. At some point, probably soon, a program will produce the wrong result, produce no result or simply crash the emulator. Assembly gives you no runtime errors, no stack traces and no error messages. The CPU silently executes whatever bytes are in memory. Here is how to find out what went wrong.

Step 1: Read the assembler listing

Produce a listing before you run the program. From a terminal, run azm your-file.asm; AZM writes a .lst by default unless you pass --nolist. In VS Code with Debug80, start a debug session (F5); the target’s outputDir receives a .lst (and related artifacts) you can open alongside the source. The listing shows each source line alongside the hex bytes it generated and the address where they were placed. Before running a program, glance at the listing and confirm:

  • Did every instruction assemble without an error or warning?
  • Is the data section placed where you intended? (count at $8000, scratch at $8001?)
  • Does the entry point (main) start at $0000, or wherever your memory map expects it?

A misplaced .org is one of the most common reasons a program assembles cleanly and then does nothing sensible at all.

Step 2: Use the emulator’s step mode

Every Z80 emulator has a way to single-step: execute one instruction and pause. Use it for the first several programs in this course. Before each step, ask yourself: what should this instruction do to which register? After the step, check whether the register holds what you expected.

If a register has the wrong value after an instruction, you have found the exact point of failure. Now ask why: was the source register already wrong before this instruction? Work backward from the wrong value to the instruction that produced it.

Step 3: Watch the flags

After any instruction that modifies flags — add, sub, cp, and, or, xor, inc, dec — check what the flags register actually contains in the emulator’s register display. Compare it to what you expected. A jump that branches the wrong way almost always traces back to a flag that was set differently than you thought.

Apply the flag-before-branch check from Chapter 5 when this happens: identify which instruction set the flag, then verify nothing between that instruction and the jump changed it.
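Here is a small sketch of the kind of bug this check catches. It uses jp z, a standard Z80 conditional jump that this chapter has not introduced yet, and the label done is illustrative. The intent is to skip the last load when B reaches zero.

```asm
.org $0000
main:
  ld b, 1
  ld c, 5
  dec b          ; B = 0, Zero flag set
  inc c          ; C = 6, Zero flag cleared: the result of dec b is lost
  jp z, done     ; not taken, because the flag now reflects inc c
  ld a, $FF      ; runs even though B reached zero
done:
  halt
```

Moving inc c above dec b, so that nothing sits between the instruction that sets the flag and the jump that tests it, restores the intended behaviour.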

Step 4: Check memory after the program halts

Most Z80 emulators let you inspect any memory address after execution. When a program stores a result to a named variable, halt execution and look at the address where that variable lives. If the value is wrong, you know the computation failed somewhere. If the value is correct but the program still behaves unexpectedly, the problem may be in how the result is being used later.

For the examples in this chapter: after 00_first_program.asm runs, address $8000 should hold $08. If it doesn’t, step through the program one instruction at a time until you see where the value diverges from what you expected.
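As a reference while stepping, this is the state you should see after each instruction of the first program. The comments give the expected register and memory contents once that line has executed.

```asm
.org $0000
main:
  ld a, 5          ; A = $05
  ld b, a          ; A = $05, B = $05
  ld a, 3          ; A = $03, B = $05
  add a, b         ; A = $08, B = $05
  ld (result), a   ; byte at $8000 = $08
  halt

.org $8000
result: .db 0      ; starts as $00, holds $08 after the program runs
```

The first line where the emulator disagrees with a comment is the line to investigate.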

That four-step process — listing check, step mode, flag watch, memory inspection — is everything you need to debug the programs in this course. More advanced techniques (breakpoints, watchpoints, memory maps) build on these same fundamentals.


Summary

  • .org $XXXX tells the assembler where a block of code or data begins in memory; .org $0000 places the entry point where the CPU starts after reset
  • ld copies a value from source to destination without affecting flags; the two forms used here are register-to-register and immediate — Chapter 4 covers the memory access forms and the complete forms table
  • Parentheses always mean “memory at this address” — whether in (hl), (count) or ($8000)
  • .equ names a fixed value substituted at assembly time; it produces no output bytes
  • .db places one byte at the current address; .dw places two bytes (a 16-bit word) in little-endian order
  • ex de, hl swaps the two register pairs in one instruction — introduced in Chapter 7 when both HL and DE are in use as pointers
  • add a, b writes the result back into A, destroying its previous value; copy A to another register first if you need it later
  • inc r and dec r add or subtract 1 in place and update the flags; dec sets the Zero flag when the result is zero, which makes a register useful as a loop counter

Exercises

1. Register trace. Step through this sequence in your head and write down the value in each register after every instruction executes:

ld a, $10
ld b, a
ld a, $06
add a, b
ld c, a

When you reach the end: what is in A? B? C? Has anything changed in HL? Now assemble the snippet (add .org $0000, a main: label, a halt and a .org $8000 data block for any storage you need) and confirm in the emulator.

2. Copy HL into DE — without using ld de, hl. There is no single Z80 instruction that copies one 16-bit register pair directly into another. Write the two ld instructions needed to move the value in HL into DE using only 8-bit register moves. Then write a second version that achieves the same result using the stack (push / pop) — a technique you will meet formally in Chapter 8.

3. Constants versus labels. Given this program fragment:

BASE .equ $8000

.org $8000
count: .db 0

Explain the difference between BASE and count in terms of what each name means to the assembler and what code they produce. Which one occupies a byte in the output binary? Which one is zero bytes in the output?

4. dec and the Zero flag. Starting with ld b, 3, execute dec b three times in a row. Write down the value in B after each dec. After which dec instruction is the Zero flag set? (Chapter 6 will use this exact mechanism to build counted loops.)