
BRIEF 6502 ASSEMBLY TUTORIAL



About this page

The majority of the information included here came from The Incredible 6502 (http://net-24-42.dhcp.mcw.edu/6502/index.html), which I edited and revised for a Portuguese version. When I was translating my 6502 site to English, I found there were some significant differences between this compact version and the original 6502 tutorial, so it was worth translating it back to English. Forgive me for any odd expressions arising from the "double translation" process; I think it's still pretty understandable.

Introduction

Programming the 6502 isn't complicated. A machine language program is a sequence of hexadecimal numbers: each instruction is an opcode followed by its argument. The opcodes are listed in the opcodes page and, depending on the addressing mode used, the argument takes zero, one or two bytes.

In real-life situations, the programmer will hardly ever have to enter the hexadecimal values: this only applies if you don't have an assembler (as I didn't :( ). The best way to learn is through examples, and since some addressing modes aren't so clear, such as the Indexed Indirect and Indirect Indexed modes, I'll explain them here.

As a convention, I'll specify the hexadecimal numbers with a dollar sign, so $1234 means 4660 in decimal base.

Registers

Registers are storage positions inside the processor, so they are directly accessed by the 6502 instructions and are a lot faster than regular RAM.

A - Accumulator

The accumulator is the heart and soul of the 6502. The logical and arithmetic operations and the majority of the data transfers are carried out on it. A machine language program is largely a list of commands to move data to and from the accumulator and to perform calculations on it.

X and Y - Indexes

The X and Y registers are indexes: added to a base address, they select successive positions of an array in memory, as in $0010+X for X=1, X=2, X=3... When they are not used as indexes, X and Y can hold data for fast manipulation, and specific instructions transfer data between them and the accumulator.
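
A minimal sketch of indexing, assuming an 8-byte array at $0010 to $0017 (the array size and the label "clear" are made up for this illustration). It uses instructions that are described further down this page:

```asm
; store zero in the 8 bytes at $0010-$0017
        LDA #$00        ; value to store
        LDX #$07        ; start at the last element
clear   STA $10,X       ; effective address is $0010 + X
        DEX             ; X goes 7, 6, ... 0, then wraps to $FF
        BPL clear       ; loop while X is still 0-7 (N flag clear)
```

Counting down lets the flags set by DEX end the loop, so no separate compare is needed.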

S - Stack pointer

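A memory segment is used to manage a data stack: on the 6502 it is page 1 ($0100 to $01FF), and S holds the low byte of the address of the next free position. One of the main uses of the stack is to store the return addresses of subroutines, since subroutines return in stack order (last call, first return).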

P - Processor status

Contains a series of bits indicating the current state. Check the meaning of each one of them in the flags page.

Program counter

The program counter is the only 16-bit register. It indicates the address of the next instruction to be executed. This register can't be accessed directly by the programmer, but the jump and branch instructions, such as JMP, change its value.

Instructions

The instructions are the machine language commands. An instruction may have different opcodes, one for each addressing mode. Using an assembler, you write your program with mnemonic codes (the three-letter acronyms) to indicate the commands and a specific syntax to indicate each addressing mode, so the individual opcodes won't be analyzed separately here. The instructions are discussed next, grouped by function.

Load and Store instructions
  • LDA, LDX, LDY - The Load instructions copy values from the memory to a register (A, X or Y).
  • STA, STX, STY - The Store instructions copy values from a register (A, X or Y) to the memory.
Arithmetic instructions
  • ADC - ADC adds a memory value to the accumulator, plus the carry bit (C) from the status register (P). The carry out of the addition is left in C.
  • SBC - SBC subtracts a memory value from the accumulator using the carry bit as an inverted "borrow": a clear carry subtracts one more.

Using these two basic operations it is possible to make any mathematical calculation, through appropriate algorithms. The Apple II, for instance, implements floating-point arithmetic in software.
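
As a small illustration of how the carry chains multi-byte arithmetic, here is a sketch of a 16-bit addition. The zero page locations $20 to $25 and the label "add16" are arbitrary choices for this example:

```asm
; 16-bit addition: ($24,$25) = ($20,$21) + ($22,$23)
; Each number is stored low byte first.
add16   CLC             ; no carry into the low byte
        LDA $20         ; low byte of the first number
        ADC $22         ; plus low byte of the second number
        STA $24         ; store low byte of the result
        LDA $21         ; high byte of the first number
        ADC $23         ; plus high byte, plus carry from the low bytes
        STA $25         ; store high byte of the result
        RTS             ; C is set here if the sum exceeded $FFFF
```

The same pattern extends to any number of bytes: clear the carry once, then add byte by byte from the lowest to the highest.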


Increment and decrement instructions
  • INC, INX, INY - These instructions add 1 to a register (X or Y) or a memory position.
  • DEC, DEX, DEY - Analogously, these ones subtract 1 from registers or memory.

These instructions provide a faster way to do the most common calculations, used frequently, for example, when "running through" an array. Notice that there is no increment or decrement of the accumulator on the original 6502. The 65C02 corrected this flaw with INA and DEA.
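
On a 6502 without INA and DEA, a common workaround is to use the arithmetic instructions with an argument of 1. This is only a sketch, and it assumes decimal mode is off (D flag clear):

```asm
; add 1 to the accumulator
        CLC             ; make sure the carry does not add an extra 1
        ADC #$01        ; A = A + 1
; subtract 1 from the accumulator
        SEC             ; a set carry means "no borrow"
        SBC #$01        ; A = A - 1
```

Unlike INX or DEX, these two-instruction sequences also change the C and V flags.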

Logic Instructions

The logic instructions AND, ORA and EOR (exclusive OR) combine the accumulator with a memory value bit by bit, following these truth tables:


 a  b | AND  ORA  EOR
 -----+--------------
 0  0 |  0    0    0
 0  1 |  0    1    1
 1  0 |  0    1    1
 1  1 |  1    1    0
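
In practice the three operations are typically used to clear, set and invert selected bits. The sketch below assumes a value kept at $20, an arbitrary zero page address chosen for this illustration:

```asm
        LDA $20         ; value to work on
        AND #$0F        ; keep the low 4 bits, clear the high 4
        ORA #$80        ; force bit 7 to 1
        EOR #$01        ; invert bit 0
        STA $20         ; store the result back
```

For example, if $20 held $3C, the accumulator goes through $0C, $8C and $8D, and $8D is stored back.
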
Flow instructions
  • JMP - JMP is the low-level equivalent to GOTO. The program counter is set to the indicated position.
  • JSR, RTS - JSR Jumps to a SubRoutine: it moves the program counter to a new position but first saves the return address on the stack. RTS does the inverse: it ReTurns from the Subroutine by pulling that address from the stack, returning the program to its previous course. JSR is also commonly used to call ROM routines.

The "compare" commands compare (!) the value of a register with the given argument. They work like a subtraction whose result is discarded: only the processor flags N, Z and C are updated.

  • CMP - Compares memory and accumulator.
  • CPX, CPY - Compares memory and register (X or Y).

The "branch" commands perform conditional branches following the processor status bits (previously set by a calculation or a "compare" instruction):

  • BCC, BCS - Branches if C is, respectively, 0 or 1.
  • BNE, BEQ - Branches if Z is, respectively, 0 or 1.
  • BPL, BMI - Branches if N is, respectively, 0 or 1.
  • BVC, BVS - Branches if V is, respectively, 0 or 1.

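The BIT command tests the bits of a memory value without changing the accumulator: Z is set if the accumulator AND the memory value is zero, while N and V are copied from bits 7 and 6 of the memory value. Check it in more detail in the opcodes page.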
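
A compare followed by a branch is the 6502 way of writing an "if". After CMP, C is set when the accumulator is greater than or equal to the argument (as unsigned numbers) and Z is set when they are equal. The sketch below limits the accumulator to a maximum of 99; the limit and the label "ok" are arbitrary choices for this example:

```asm
; limit the accumulator to a maximum of $63 (99)
        CMP #$64        ; compare A with 100
        BCC ok          ; C = 0 means A < 100: leave it alone
        LDA #$63        ; otherwise force A to 99
ok      RTS
```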

Shift and rotation instructions
  • ASL, LSR - The shift instructions move the bits of the accumulator or of a memory position one place in the indicated direction (ASL left, LSR right). The bit pushed out goes into the carry bit and the vacated bit is filled with a 0. Notice that ASL is equivalent to multiplying an unsigned value by two, and LSR to dividing it by two.
  • ROL, ROR - The rotate instructions move the bits in the indicated direction (ROL left, ROR right), but the vacated bit is filled with the old carry while the bit pushed out becomes the new carry, so the value and the carry rotate together as 9 bits.
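
Because the rotate instructions take in the carry left by a previous shift, the two combine naturally for numbers larger than one byte. A sketch, assuming a 16-bit number stored at $20 (low byte) and $21 (high byte), addresses picked only for this example:

```asm
; multiply the 16-bit number in ($20,$21) by two
        ASL $20         ; low byte shifts left, its bit 7 goes to carry
        ROL $21         ; high byte shifts left, carry enters bit 0
```

With $20 = $80 and $21 = $01 (the number $0180), the result is $20 = $00 and $21 = $03, that is, $0300.
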
Transfer instructions
  • TXA, TAX - TXA copies the X register to the accumulator; TAX copies the accumulator to X.
  • TYA, TAY - The same applies to the Y register.
Stack instructions
  • TSX, TXS - TSX copies the stack pointer to X; TXS copies X to the stack pointer, which defines the stack location.
  • PHA, PLA - Push and pull (PusH and PulL), respectively, the contents of the accumulator.
  • PHP, PLP - Push and pull (PusH and PulL) the contents of the P register. These make it possible to restore the processor state after returning from a subroutine.
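
A common use of the stack and transfer instructions together is a subroutine that preserves the registers it uses. The original 6502 can only push A and P, so X has to travel through the accumulator. This is a sketch: the label "sub" and the choice of registers to save are assumptions for the example:

```asm
sub     PHA             ; save A
        TXA
        PHA             ; save X (via A)
        ; ... body that is free to change A and X ...
        PLA             ; restore in reverse order: X first
        TAX
        PLA             ; then A
        RTS
```

The pulls must mirror the pushes exactly, otherwise RTS would take the wrong bytes from the stack as its return address.
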
Flag definition instructions
  • CLC, CLD, CLI, CLV - Set to 0 (CLear) the bits C, D, I or V.
  • SEC, SED, SEI - Set to 1 (SEt) the bits C, D or I.

CLC is used especially to clear the carry before an addition. SEC is used before a subtraction, where a set carry means "no borrow". CLI and SEI have the "special power" of, respectively, enabling and disabling interrupts.

No Operation

The NOP doesn't do a thing. It is pretty useful, though. Some applications are: reserving space for future code somewhere in the program, making hardware-specific delays (the instruction takes 2 clock cycles), or separating areas for later data insertion.

Break

BRK forces an interrupt, and program execution branches to the address stored in memory positions $FFFE and $FFFF.

The argument

In the 6502, 16-bit arguments are entered starting with the least significant byte, so the command JMP $03D2 is entered as:


4C D2 03

Where 4C is JMP's opcode in the absolute mode. In other words, the argument bytes are swapped.
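
The same rule applies to every instruction with a 16-bit argument, while 8-bit arguments are entered as they are. A short sketch with the assembled bytes shown in the comments (the three instructions are unrelated examples, not a meaningful program):

```asm
        LDA #$30        ; A9 30     one-byte argument, nothing to swap
        STA $2A35       ; 8D 35 2A  address $2A35 is entered as 35 2A
        JMP $03D2       ; 4C D2 03  address $03D2 is entered as D2 03
```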

Examples on using the addressing modes

The addressing modes are specified in more detail here. Some examples on how to use these modes follow:

  • Immediate Mode -
    LDA #$30 - The value $30 is loaded into the accumulator.
  • Accumulator -
    ASL A - ASL can shift memory values as well as the accumulator. If the accumulator contains the value $20, after this command it will contain $40.
  • Relative -
    BCC $10 - The branch is relative to the program counter, which already points to the instruction after the branch. If it is pointing to $1234 and the carry is clear, after this command it will be pointing to $1244. The offset is a signed byte, so arguments from $80 to $FF branch backwards.
  • Implied -
    INX - The addressing is implied, i.e., it doesn't need an argument. INX can only increment the X register, just as SEC can only set the C flag.
  • Absolute Mode -
    LDA $2A35 - If the address $2A35 contains the value $12, so will the accumulator.
  • Absolute Indexed Mode -
    LDA $1234,Y - If Y is 4, the contents of the address $1238 will be loaded into the accumulator.
  • Zero Page Mode -
    STA $20 - Similar to the absolute mode, but the most significant byte is implied as $00 (not to be confused with the implied mode!). The contents of the accumulator will be stored in the address $0020.
  • Zero Page Indexed Mode -
    LDA $20,X - Similar to the absolute indexed mode. If X is 2, the contents of the address $0022 will be loaded into the accumulator.
  • Indexed Indirect Mode -
    LDA ($B4,X) - If X is 6, the pointer is at $B4 + 6 = $BA. $00BA and $00BB hold the effective address, say $12 and $EE. The address $EE12 then has the value to be loaded into the accumulator. You can imagine a list of addresses starting at $00B4. You'd make a loop where X is incremented twice each iteration, so the addresses would be retrieved one by one.
  • Indirect Indexed Mode -
    LDA ($B4),Y - $00B4 and $00B5 contain the address, say $EE and $12. If Y is 6, the address $12EE + 6 = $12F4 has the value to be loaded into the accumulator. You can imagine a vector starting at the address pointed to by $00B4. You'd make a loop where Y is incremented once each iteration, so the values would be retrieved one by one.
  • Indirect Mode -
    JMP ($1234) - If the addresses $1234 and $1235 contain the values $23 and $45 respectively, the program counter will jump to $4523. This is indeed the only use for this mode, as it occurs only in the JMP command.
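
To see several of these modes working together, here is a sketch that copies 16 bytes from $2A35 to $0400 through a zero page pointer. The addresses, the byte count and the label "copy" are arbitrary choices for this illustration:

```asm
; copy the 16 bytes starting at $2A35 to $0400-$040F
        LDA #$35        ; immediate: low byte of the source address
        STA $B4         ; zero page
        LDA #$2A        ; high byte of the source address
        STA $B5         ; $B4 and $B5 now point to $2A35
        LDY #$00
copy    LDA ($B4),Y     ; indirect indexed: read from $2A35 + Y
        STA $0400,Y     ; absolute indexed: write to $0400 + Y
        INY             ; implied
        CPY #$10        ; 16 bytes copied?
        BNE copy        ; relative: no, go back for the next byte
        RTS
```

Since the source address lives in memory rather than in the instruction, the same loop can copy from anywhere just by changing the two pointer bytes.
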
Practical Examples

Following are some practical examples, with assembler listings. When we program using an assembler, it is not necessary to specify hexadecimal addresses, because the assembler will calculate them and substitute them for the labels (such as "start" and "loop" in the first example).

Countdown loops (from The Incredible 6502, http://net-24-42.dhcp.mcw.edu/6502/index.html):

;
; 8-bit countdown
;
start   LDX #$FF        ; loads X with $FF = 255
loop    DEX             ; X = X - 1
        BNE loop        ; if X isn't zero goes to "loop"
        RTS             ; returns
; How does the BNE instruction know that X is zero? It doesn't.
; All it knows is whether the Z flag is set.
; As the instruction list specifies, the DEX instruction
; updates the Z flag.
;
; 16-bit countdown
;
start   LDY #$FF        ; loads Y with $FF
loop1   LDX #$FF        ; loads X with $FF
loop2   DEX             ; X = X - 1
        BNE loop2       ; if X isn't zero goes to loop2
        DEY             ; Y = Y - 1
        BNE loop1       ; if Y isn't zero goes to loop1
        RTS             ; returns
; There are two loops here. In the internal loop, X is decremented and
; when it reaches zero, Y is decremented and the X loop restarts. It's
; the principle of the mileage counter in cars: when each digit turns
; a full lap, the following digit changes. Since X is reloaded with
; $FF on each pass, in practical terms DEX is executed
; 255 x 255 = 65025 times before the routine returns.
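
For comparison, here is a sketch that counts down a single 16-bit value held in memory instead of using two nested register loops. The locations $20 (low byte) and $21 (high byte) and the labels are assumptions made for this example:

```asm
; count the 16-bit value in ($20,$21) down to zero
count   LDA $20
        BNE declo       ; low byte not zero: no borrow needed
        LDA $21
        BEQ done        ; both bytes are zero: finished
        DEC $21         ; borrow from the high byte
declo   DEC $20         ; $00 wraps to $FF after a borrow
        JMP count
done    RTS
```

The high byte is only decremented when the low byte is about to wrap from $00 to $FF, which is exactly the "borrow" of a pencil-and-paper subtraction.
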
Removing an element from an unordered list (from Leo Scanlon's "6502 Software Design"):

; Remove the contents of $2F from a list whose initial address
; is pointed to by $30 and $31. The first byte is the list's size
; (assumed here to be at least 1).
deluel  LDY #$00        ; get number of elements
        LDA ($30),Y
        TAX             ; transfer size to X
        LDA $2F         ; item to remove
nextel  INY             ; index to next element
        CMP ($30),Y     ; element and item to remove match?
        BEQ delete      ; yes. go to removal
        DEX             ; no. decrement the number of elements to compare
        BNE nextel      ; more elements to compare?
        RTS             ; no. element not in list. end.
; remove an element by moving the following elements one byte back
delete  DEX             ; decrement the element counter
        BEQ deccnt      ; end of list?
        INY             ; no. move next element back
        LDA ($30),Y
        DEY
        STA ($30),Y
        INY
        JMP delete
deccnt  LDA ($30,X)     ; X is 0 here, so this reads the size byte
        SBC #$01        ; carry is still set from the matching CMP
        STA ($30,X)     ; update number of elements
        RTS
16-bit unsigned multiplication (from Leo Scanlon's "6502 Software Design"):

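; Multiply the 16-bit number in $20, $21 by the one in $22, $23.
; The 32-bit product ends up in $24 to $27 (low byte first).
mlt16   LDA #$00        ; clear the high half of the product
        STA $26
        STA $27
        LDX #$10        ; 16 multiplier bits to process
nxtbt   LSR $21         ; shift the multiplier right,
        ROR $20         ; its lowest bit goes to carry
        BCC align       ; bit is 0: nothing to add
        LDA $26         ; bit is 1: add the multiplicand
        CLC             ; to the high half of the product
        ADC $22
        STA $26
        LDA $27
        ADC $23
align   ROR A           ; shift the product right, taking in the carry
        STA $27
        ROR $26
        ROR $25
        ROR $24
        DEX             ; count the bit
        BNE nxtbt       ; more bits to process?
        RTS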
Simple square root (again from Leo Scanlon's "6502 Software Design"):

This smart algorithm is based on the fact that the integer square root of an integer number is the number of times an increasing odd number can be subtracted from the original number without it becoming negative. For example: 25 - 1 = 24, 24 - 3 = 21, 21 - 5 = 16, 16 - 7 = 9, 9 - 9 = 0. Five odd numbers (1, 3, 5, 7, 9): the square root of 25 is 5!


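; Return the 8-bit root in $20 of the 16-bit number in
; $20 and $21. The remainder ends up in $21.
sqrt16  LDY #$01        ; first odd number (1) in $22, $23
        STY $22
        DEY             ; Y = 0 counts the subtractions
        STY $23
again   SEC
        LDA $20         ; keep the low byte before subtracting:
        TAX             ; it is the remainder if this round fails
        SBC $22         ; subtract the odd number (16 bits)
        STA $20
        LDA $21
        SBC $23
        STA $21
        BCC nomore      ; result went negative: finished
        INY             ; one more successful subtraction
        LDA $22         ; next odd number: the carry is set here,
        ADC #$01        ; so this adds 2
        STA $22
        BCC again       ; no overflow into the high byte
        INC $23
        JMP again
nomore  STY $20         ; root
        STX $21         ; remainder
        RTS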