Since Inccubus is too busy to learn 6502 ASM (Nintendo Entertainment System programming), I'll write up this quick guide to ease the learning process. Bear in mind that the Super Nintendo Entertainment System uses a related processor (the 65816), which keeps the same A, X, and Y registers but can widen them to 16 bits and adds a few others.
First and foremost, you need to know the structure of 6502 ASM and the NES. There are three registers: A, X, and Y. Each of these registers stores a single byte (8 bits). The A register, known as the Accumulator, is the primary register and is used in most operations. If you want to perform arithmetic operations, typically you use the Accumulator. The X and Y registers are often used for addressing, similar to indices in arrays in modern programming languages. You can also use the X and Y registers for temporarily storing values. The 6502 microprocessor also has a stack (it lives in the RAM at $0100 thru $01FF), which you can push the Accumulator onto if you need to change some values and then access the old value again.
The most important feature of the 6502 microprocessor is the status register. This is a single byte with each bit representing a specific status. The bits reflect the value most recently loaded or modified, whether that was in a register or in the RAM. Bit 7 (1000 0000) is the Negative Bit, and it is set whenever a loaded or modified value is $80 or greater. The reason for this is that in a signed byte, bit 7 signifies negatives, so $80 thru $FF are the negative values -128 thru -1. Bit 6 is the Overflow Bit, and in most cases you can ignore it. It handles overflows in two's complement maths, that is, it keeps track of when an addition or subtraction gives a result with the wrong sign because it didn't fit in -128 thru 127. Bits 2 thru 5 can be ignored for now. However, bit 1 keeps track of when a value is set to 0 (hence it's called the Zero Bit), while bit 0 keeps track of when a value exceeds 255 or goes below 0, hence it is called the Carry Bit (since you'd carry a 1).
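To make the Negative and Zero bits a bit more concrete, here is a minimal Python sketch. Python is only used for illustration, and the helper names `nz_flags` and `as_signed` are made up for this guide (they aren't part of FCEUX or any other tool). It assumes the two bits are derived from an 8-bit value the way the paragraph above describes.

```python
def nz_flags(value):
    """Return (negative, zero) for an 8-bit value, as the Status register would."""
    value &= 0xFF                      # registers only hold one byte
    negative = (value & 0x80) != 0     # bit 7 set means $80 thru $FF
    zero = value == 0
    return negative, zero

def as_signed(value):
    """Read a byte as a two's complement number (-128 thru 127)."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value

print(nz_flags(0x7F))   # (False, False)
print(nz_flags(0x80))   # (True, False)
print(nz_flags(0x00))   # (False, True)
print(as_signed(0xFF))  # -1
```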
The most important link I can offer is this site which I use all the time myself:
http://www.obelisk.demon.co.uk/6502/reference.html
This will give you a detailed description of what each 6502 ASM command does. Note that if you just opened an NES ROM in a hex editor, you'd only see a series of byte values. The microprocessor reads each of those values and translates them into specific commands. That site will tell you what each of those commands is. Open a ROM in FCEUX and run the Debugger. It will translate the byte values into readable ASM.
Below is a list of what I feel are the most important commands. You will see these and work with these more than any other command, from my experience with Castlevania 3.
JSR: Jump to Sub-Routine
You will see this command often. It's similar to executing a script in Game Maker. It tells the processor to go run the code at a different address and then come back. It has the byte value of $20. The bytes $20 10 80 would tell the processor to execute the ASM at address $8010. Note that addresses are stored lowest byte first, then highest byte.
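The low-byte-first ordering is easy to check with a couple lines of Python. This is just a sketch; the helper name `read_address` is hypothetical, and all it does is combine the two bytes that follow the command byte.

```python
def read_address(low, high):
    """Combine two address bytes (stored low byte first) into a 16-bit address."""
    return (high << 8) | low

# The address bytes 10 80 from the example above:
print(hex(read_address(0x10, 0x80)))  # 0x8010
```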
RTS: Return from Sub-Routine
Think of this as the end of a script. This tells the processor to go back to the command right after the most recent JSR. When viewing ROM data, the RTS command is like your girlfriend: you'll miss it when you don't see it and feel so much joy whenever you do see it; I will explain why later.
JMP: Jump To Address
Often a JMP command signifies you are at the end of a certain series of code. This can be a good thing in most cases, but many times it leads you off on wild tangents in coding, leaving you even more confused by the time you finally reach an RTS to the point you don't remember what you were reading.
LDA | LDX | LDY: Load Accumulator | X | Y
These functions set the Accumulator, X or Y to a specific value. Sometimes it's a direct value, written with a # as in LDA #$00. Other times it's the value stored at a byte address. If address $0000 is set to $C0, then LDA $0000 would set A to $C0. Keep in mind these functions also update the Status register: the Negative bit is set if the value loaded is greater than 127, and the Zero bit is set if the value loaded is 0.
STA | STX | STY: Store Accumulator | X | Y
While the previous functions loaded values, these functions save them. You will be seeing a lot of these functions whenever you look through ASM.
PHA | PLA: Push Accumulator to Stack | Pull Accumulator from Stack
The 6502 has a stack. You can dump values into it and then retrieve them. This is very useful if you, say, want to load $0542 to the Accumulator, change $0542, do some other stuff with the Accumulator, and then go back to working with that original value of $0542.
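Here is a rough Python sketch of that exact scenario. The names `ram`, `sp`, `pha` and `pla` are made up for illustration, and the starting stack pointer of $FF is an assumption of the sketch (a real game sets it up itself). It also leaves out the Status register changes that PLA makes.

```python
ram = bytearray(0x0800)  # 2 KB of RAM, all zeroes
sp = 0xFF                # stack pointer; the starting value is an assumption

def pha(a):
    """PHA: store A at $0100+SP, then move the stack pointer down."""
    global sp
    ram[0x0100 + sp] = a
    sp = (sp - 1) & 0xFF

def pla():
    """PLA: move the stack pointer up, then read the byte at $0100+SP."""
    global sp
    sp = (sp + 1) & 0xFF
    return ram[0x0100 + sp]

ram[0x0542] = 0x2A
a = ram[0x0542]     # LDA $0542
pha(a)              # PHA
ram[0x0542] = 0x00  # change $0542
a = 0x99            # do some other stuff with the Accumulator
a = pla()           # PLA brings the original value back
print(hex(a))       # 0x2a
```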
PHP: Push Processor Status
In most cases you won't need to worry about this. I haven't seen it. All it does is store the Status register to the stack.
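TXS | TSX: Transfer X to Stack Pointer | Transfer Stack Pointer to X
Despite the names, these don't push or pull anything. They copy the value in the X register to the stack pointer (the byte that keeps track of where the top of the stack is), or copy the stack pointer to the X register. I haven't seen this outside of PPU handling.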
TAX | TAY: Transfer Accumulator to X | Y
Similar to the Store functions, these functions save the value in the Accumulator to either the X or Y register. Very useful function.
TXA | TYA: Transfer X | Y to Accumulator
Similar to dumping a value in the stack, these functions let you not only retrieve the results of your meddling with the X or Y registers, they also let you use X and Y registers as little mini-stacks. Konami did that quite a bit in CV3.
SEC | CLC: Set Carry | Clear Carry
These are very important commands. The first sets the Carry bit of the Status register to 1, whilst the other sets it to 0. Setting the Carry before an SBC lets you subtract two numbers directly. Clearing the Carry before an ADC lets you add two numbers directly. Note that this only applies to the next operation that uses the Carry. So for example, the CLC before an ADC will only apply to that ADC, since the ADC will clear or set the Carry based on its own rules.
ADC: Add with Carry
This is how the 6502 handles addition. Addition is handled by the formula A+M+c, where A is the Accumulator, M is either a direct value or the value at the referenced byte address, and c is the Carry bit of the status register. When the sum is over 255, the Carry bit is set. The Overflow bit is set when the result has the wrong sign for signed maths, for example when two positive values add up to more than 127. Because the Carry bit is part of the formula, unless it is cleared first it will be added to the next summation if there is another ADC operation right afterward. That is how values larger than one byte get added. Since the 6502 microprocessor uses bytes instead of words or Dwords, anything larger than $7F (127) can be treated as a negative value. This means $01+$FF is the same as $01-$01. Sometimes a game programmer will do subtraction this way, but there is a function in 6502 ASM for that.
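If you want to play with the A+M+c formula, here is a small Python sketch of it. The function name `adc` and the tuple it returns are my own invention for this guide. The Overflow bit is worked out with the usual signed rule (both inputs have the same sign and the result has the other sign), and the sketch ignores the Negative and Zero bits.

```python
def adc(a, m, carry):
    """Sketch of ADC: returns (result byte, carry out, overflow)."""
    total = a + m + carry
    result = total & 0xFF
    carry_out = 1 if total > 0xFF else 0
    # Overflow: a and m share a sign bit that differs from the result's.
    overflow = ((a ^ result) & (m ^ result) & 0x80) != 0
    return result, carry_out, overflow

print(adc(0x01, 0xFF, 0))  # (0, 1, False)   $01 + $FF wraps around to $00
print(adc(0x50, 0x50, 0))  # (160, 0, True)  two positives gave a "negative" $A0

# Adding $01FF + $0001 one byte at a time: CLC, ADC the low bytes,
# then ADC the high bytes with whatever Carry was left over.
lo, c, _ = adc(0xFF, 0x01, 0)
hi, c, _ = adc(0x01, 0x00, c)
print(hex((hi << 8) | lo))  # 0x200
```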
SBC: Subtract with Carry
This is how the 6502 handles subtraction. Be aware that here the Carry bit isn't subtracted from the difference, but rather the negation of the Carry bit. In other words, the formula is A-M-!c, or A-M-(1-c). This operation can be a tad confusing to work with. This time around, the Carry bit is set if the difference is logically 0 or greater (nothing had to be borrowed), otherwise it is cleared. In the case of $01-$02, the result would be $FF in byte format, but logically it's -1, so the Carry bit would be cleared.
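The A-M-(1-c) formula is easier to trust once you've run a few numbers through it. Below is a minimal Python sketch; the name `sbc` is just what I'm calling it here, and it only tracks the result byte and the Carry bit (the other Status bits are left out).

```python
def sbc(a, m, carry):
    """Sketch of SBC: returns (result byte, carry out)."""
    diff = a - m - (1 - carry)
    result = diff & 0xFF
    carry_out = 1 if diff >= 0 else 0   # Carry stays set when nothing was borrowed
    return result, carry_out

print(sbc(0x01, 0x02, 1))  # (255, 0)  $01 - $02 is $FF, Carry cleared
print(sbc(0x05, 0x03, 1))  # (2, 1)    SEC first gives the plain difference
print(sbc(0x05, 0x03, 0))  # (1, 1)    with Carry clear, one extra is subtracted
```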
INC | DEC: Increment | Decrement
You can directly alter byte addresses in the RAM. These functions will either add 1 to the byte value or subtract 1. In other words, INC $0043 will add 1 to whatever value is stored in the RAM at address $0043. Most games do this to count steps, which they then use to randomize actions.
INX | DEX: Increment X | Decrement X
Same as above. These functions let you change the values of the X register. This is frequently used in what are essentially FOR loops.
INY | DEY: Increment Y | Decrement Y
Same as above, but for the Y register. These are typically used for step counters similarly to the GM equivalent of timelines.
AND | ORA | EOR: Bitwise AND | Bitwise OR | Bitwise Exclusive OR
These are just your typical bitwise operations. They combine the Accumulator with another value and leave the result in the Accumulator.
ASL | ROL: Arithmetic Shift Left | Rotate Left
I put these together because they are essentially the same. They can be performed on either the Accumulator or directly to an address in the RAM. An arithmetic shift left, as opposed to a normal bitwise shift to the left, transfers the setting of bit 7 to the Carry bit. For example, if the Accumulator has a value of $E0 and undergoes an arithmetic shift left, the Carry bit will be set, but if the Accumulator was $20 the Carry bit would be cleared. A rotation would replace bit 0 of the Accumulator or address with the value of the Carry bit before setting the Carry bit. Going back to the previous example, $E0 would shift to become $C0, but if the Carry bit was already set prior to that, then the value would become $C1; since $E0 was larger than $7F, its bit 7 was set and therefore the Carry bit would be set as well. You can visualize the result of the Carry bit if you open Windows Calculator and run in Scientific mode's HEX setting. Leave it set to Qword or Dword and use the LSH button to perform a left shift on whatever you enter into the calculator. If bit 7 was set, the value would exceed $FF and the next highest nybble would be set to 1. That $E0 shifted to $C0 was really shifted to $1C0, but the 1 gets transferred to the Carry bit instead.
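If you don't have Windows Calculator handy, the same thing can be sketched in a few lines of Python. The function names `asl` and `rol` are simply borrowed from the commands for illustration; each one returns the new byte and the new Carry bit.

```python
def asl(value):
    """Shift left: bit 7 goes to the Carry, bit 0 becomes 0."""
    carry_out = (value >> 7) & 1
    return (value << 1) & 0xFF, carry_out

def rol(value, carry_in):
    """Rotate left: bit 7 goes to the Carry, the old Carry becomes bit 0."""
    carry_out = (value >> 7) & 1
    return ((value << 1) & 0xFF) | carry_in, carry_out

print(asl(0xE0))     # (192, 1)  192 is $C0, Carry set
print(asl(0x20))     # (64, 0)   64 is $40, Carry cleared
print(rol(0xE0, 1))  # (193, 1)  193 is $C1, Carry set
```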
LSR | ROR: Logical Shift Right | Rotate Right
This is very similar to the left shift. This time, bit 0 is transferred to the Carry bit. In the rotation, the Carry bit's previous setting is transferred to bit 7 of the resulting value.
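A quick Python sketch of the right-hand versions, with the function names `lsr` and `ror` again just borrowed from the commands for illustration. Each one returns the new byte and the new Carry bit.

```python
def lsr(value):
    """Shift right: bit 0 goes to the Carry, bit 7 becomes 0."""
    carry_out = value & 1
    return (value & 0xFF) >> 1, carry_out

def ror(value, carry_in):
    """Rotate right: bit 0 goes to the Carry, the old Carry becomes bit 7."""
    carry_out = value & 1
    return ((value & 0xFF) >> 1) | (carry_in << 7), carry_out

print(lsr(0x05))     # (2, 1)
print(lsr(0x04))     # (2, 0)
print(ror(0x05, 1))  # (130, 1)  130 is $82
```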
CMP | CPX | CPY: Compare Accumulator | Compare X | Compare Y
These are some of the most important functions in 6502 ASM. You use these functions to check the value of the specific register against either the value referenced in the RAM or a specific number. The Status register is then set accordingly. If you ever see a CPX call, there's a strong possibility that you are inside a loop. If you see INX CPX in that order, you are most assuredly looking at the end value of a loop. The equivalent of INX CPX #$11 (followed by a branch back to the top) in a for loop is for(x=0;x<$11;x+=1). Needless to say, that pairing will be a sight for sore eyes. Or a source of frustration, because oftentimes, if you see that while trying to hack a ROM, it means you missed the code you were actually looking for. These commands are basically A-M (or X-M or Y-M), so if the values are the same, the difference would be 0 and therefore the Zero bit would be set. This also means, just like in SBC calls, if the difference is logically 0 or greater, the Carry bit is set. In other words, if A>=M (or X>=M or Y>=M), the Carry bit is set. Likewise, if the difference in byte format is greater than 127 (its bit 7 is set), the Negative bit is set. Unlike SBC, the CMP operation does not keep the difference anywhere; only the Status register changes.
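Here's a small Python sketch of what a compare does to the Status register. The name `compare` and the order of the returned tuple are my own choices for this guide. It treats the compare as reg-M on single bytes, as described above, and throws the difference away.

```python
def compare(reg, m):
    """Sketch of CMP/CPX/CPY: returns (carry, zero, negative)."""
    diff = (reg - m) & 0xFF            # the difference is never stored
    carry = reg >= m
    zero = reg == m
    negative = (diff & 0x80) != 0      # bit 7 of the byte-sized difference
    return carry, zero, negative

print(compare(0x03, 0x03))  # (True, True, False)   equal, so BEQ would pass
print(compare(0x02, 0x03))  # (False, False, True)  smaller, so BCC would pass
print(compare(0x90, 0x01))  # (True, False, True)   bigger, but $8F still has bit 7 set
```

The last line is the one to watch out for: the Negative bit only looks at bit 7 of the difference, so for unsigned values the Carry bit is the one to trust.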
BEQ | BNE: Branch if Equal | Branch if Not Equal
Branches are at the heart of games. These branch commands check the Zero bit in the Status register. If the Zero bit is set, a BEQ check passes; if it is clear, a BNE check passes. This is often used with CMP or DEX calls, since CMP is basically A-M.
BCS | BCC: Branch if Carry Set | Branch if Carry Clear
These check if the Carry bit was set or not.
BPL | BMI: Branch if Plus | Branch if Minus
These branches check the Negative bit of the Status register. BPL passes when it is clear and BMI passes when it is set.
So those are the most useful commands you'll come across (and also the majority of the commands). There are a few more, but you'll rarely encounter them. So now I'll discuss a couple nuances I didn't go into detail about above.
It is important to understand how JSR, JMP, BEQ, BNE, BCS, BCC, BPL, BMI, and so on, relate to each other. This can be very mind-boggling. Fortunately, FCEUX makes it hard to get lost, although if you do get lost you may have to play through the same part of the game again and again as you try to follow the code. A typical routine might look something like this:
0000 JSR $0006
0003 JMP $33C9
0006 LDX $034A
0009 INX
000A CPX #$03
000C BCC $0009
000E TXA
000F RTS
0010 LDA #$00
In case you can't figure out what the code does at a glance (which is nothing), I will explain it. Address $0000 tells the processor to jump to the subroutine at address $0006. That subroutine tells it to load the value of address $034A of the RAM into the X register and then increase the X register. The CPX then compares the X register to 3. If the Carry bit hasn't been set (meaning X is still less than 3), the code tells it to branch back to address $0009, telling the processor to increase the X register again. If X gets increased to 3, or if it was already 3 or larger after the first increase, the value of the X register would then be transferred to the Accumulator and the code would leave that subroutine and go back to address $0003. Address $0010 is completely ignored in this situation, as address $0003 tells the processor to skip way ahead to another address. What this means for us is that basically this routine is over.
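For anyone who thinks better in a higher-level language, here is roughly what that subroutine does, sketched in Python. The function name `subroutine` and the `ram` array are hypothetical stand-ins; the comments show which ASM command each line is imitating.

```python
def subroutine(ram):
    """Rough equivalent of the LDX / INX / CPX / BCC / TXA / RTS subroutine."""
    x = ram[0x034A]              # LDX $034A
    while True:
        x = (x + 1) & 0xFF       # INX (wraps from $FF back to $00)
        if x >= 0x03:            # CPX #$03 sets the Carry when X >= 3
            break                # Carry set, so the BCC does not branch
    a = x                        # TXA
    return a                     # RTS

ram = bytearray(0x0800)
ram[0x034A] = 0x00
print(subroutine(ram))  # 3
```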
That's just what it means most of the time, but that's not always the case. Consider the following situation:
0000 JSR $000F
0003 JSR $0014
0006 BPL $0060
0008 JSR $0018
000B BCC $001C
000D BCS $0060
000F LDX $0224
0012 DEX
0013 RTS
0014 TXA
0015 JMP $001B
0018 ROL
0019 ROL
001A RTS
001B TAY
001C RTS
Now what's going on? First, we enter the subroutine at address $000F. This subroutine tells the processor to load the value of address $0224 in the RAM into the X register and then decrease the X register by 1. It then exits that subroutine and reads address $0003. That tells the processor to go to address $0014, which transfers the X register's contents into the accumulator. We encounter a JMP command next. It would seem the routine is over. However, if we follow the JMP call to address $001B like it tells the processor to, we see that the Accumulator gets transferred to the Y register and then the subroutine is exited.
Wait, what subroutine? Didn't we just end the current routine with the JMP command? Not quite. In most cases, we can think of a JMP command as ending the current routine or subroutine as far as we're concerned. You will almost always be inside a routine, but most of that routine's subroutines will be meaningless for you. A JMP command typically signifies the end of a meaningful subroutine. However in this case, the RTS call at $001C will send the processor all the way back to $0006, which checks if bit 7 of the value just transferred to Y is set (the TAY updated the Negative bit). We are now technically out of the current subroutine, but if the Negative bit was set and that branch check failed, the processor would read $0008 and enter another subroutine at $0018. After rotating the Accumulator twice to the left, the subroutine is exited and the Carry bit is checked. The BCC and BCS cover both settings of the Carry bit, so one of them always branches and we know the routine is finished.
You may have noticed the addresses weren't incremented evenly. That's because not every command is the same length. Technically 6502 ASM is nothing but a series of bytes. For example, the function JSR $F35A is actually the three bytes $20 $5A $F3. No joke! Okay, so that took up three bytes, thus the next address is 3 higher. That was obvious enough, but what about the branches that only take up two bytes, or commands like TXA that only take up one? It depends on how the command refers to its value. There are numerous types of addressing in 6502 ASM:
Implicit: This is the simplest form. One byte determines the entire process. Calls like RTS, TAY or SEC use implicit addressing. An offshoot of this is Accumulator addressing. For functions like ROL and ASL, there are special byte values for when the Accumulator is the target.
Immediate: This form requires two bytes. The first byte is the call and the second byte is a literal value. CMP #$03 uses immediate addressing.
Zero Page: The 6502 processor has what's known as the Zero Page. All the addresses in the RAM at $00FF or lower belong to it. These can be referenced with a single byte, so the whole call takes two bytes. LDA $001C could be called using zero page addressing as LDA $1C. Note that JMP and JSR have no zero page form.
Relative: Branches use this. The call takes two bytes, and the second byte is added to the address of the next command. Values of $00 thru $7F move forward 0 to 127 bytes, while $80 thru $FF function as -128 to -1.
Absolute: Not always a good idea to use this, as it takes up three bytes. This refers to a full 16-bit address.
Indirect: Only the JMP command uses this. It takes three bytes: the call followed by a 16-bit address. The byte at that address is read along with the byte at the next highest address to retrieve the target address. In other words, if $0000=#$04 and $0001=#$F0, then JMP ($0000) would send the processor to address $F004.
Indexed: Zero page and absolute addressing can be indexed. This means the value of the X register or Y register is added to the address. For example, if the X register was set to $05, then LDA $0300,X would read address $0305. A pointer in the Zero Page can be indexed as well: using the pointer from the previous example, if the Y register was set to $05, then LDA ($00),Y would read address $F009 instead of $F004.
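Relative and indirect addressing are the two that trip people up, so here is a minimal Python sketch of both. The helper names `branch_target` and `read_pointer` are hypothetical, and the memory is just a dictionary holding the two bytes from the indirect example. It assumes a branch is two bytes long and that pointers are stored low byte first.

```python
def branch_target(branch_address, offset):
    """Where a branch at branch_address goes, given its offset byte."""
    signed = offset - 0x100 if offset & 0x80 else offset  # $80-$FF are -128..-1
    return (branch_address + 2 + signed) & 0xFFFF         # +2 skips the branch itself

def read_pointer(memory, pointer):
    """Read a 16-bit address stored low byte first at pointer and pointer+1."""
    return memory[pointer] | (memory[(pointer + 1) & 0xFFFF] << 8)

# A branch at $000C with offset byte $FB goes back to $0009.
print(hex(branch_target(0x000C, 0xFB)))  # 0x9

# The indirect example: $0000=#$04 and $0001=#$F0.
memory = {0x0000: 0x04, 0x0001: 0xF0}
target = read_pointer(memory, 0x0000)
print(hex(target))                       # 0xf004

# Indexing the fetched address by Y = $05, as LDA ($00),Y would.
print(hex((target + 0x05) & 0xFFFF))     # 0xf009
```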
On the next installment of Theou Aegis' Rookie Guide to 6502 ASM Reading, I will give you a brief tour of FCEUX's Debugger and get started on actual ASM reading lessons. You'll be hacking game code in no time!
omg it's so late at night...