Assembly Tutorial
#1


Chapter 1: Introduction, What is an Assembly Language?

This tutorial will teach you how to read/write basic PowerPC Assembly Language for the purpose of making cheat codes for Wii games. This tutorial is a supplement to my other tutorial, 'How to Make your own Cheat Codes'.

As a prerequisite, you will already need to know the Basics of Wii Codes and the Code Handler (Gecko). Here is the tutorial link for that - https://mariokartwii.com/showthread.php?tid=434

Let's begin.

What is an Assembly Language? Well, before we can answer that, we need to know a few basic things. The Wii (like any other console or computer device) has a CPU (Central Processing Unit). The name of the Wii's CPU is Broadway. No CPU is capable of understanding Human Languages. However, CPUs understand two elementary things: 0's and 1's. 0 means voltage off. 1 means voltage on.

These 0's and 1's are standard Binary Numbers. A CPU will execute basic tasks depending on the arrangement or combination of these Binary Numbers. A single fixed-length block of these numbers is an Instruction (on Broadway, every Instruction is 32 bits long). When Broadway "reads" a block (Instruction), it will perform a basic task, then "read" the next block (Instruction), perform another basic task, and so on. Combine billions of these blocks within seconds and you have a modern CPU that is running.

Assembly Language is a Human Readable Form of these Instructions (Blocks). We can write out an Assembly Language that will "instruct" or "program" a CPU to perform specified tasks. Therefore, Assembly Language is a Computer Programming Language. A person would type out the Assembly Language in a text file. Then a tool, known as the Assembler, will translate the text to the correct combination of 0's and 1's, which is shown in Hexadecimal form in a newly created file known as the executable.

However, for Cheat Codes, there are some differences. A program that is specifically designed to generate Wii Cheat Codes for you will usually contain a field where you will type your Assembly Instructions. After typing in your Instructions, the program will generate the correct representative Hexadecimal values (plus a few extra values) aka the Instructions that Broadway understands. This output of Hexadecimal values is your finalized Cheat Code.
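To give a rough idea of what the Assembler does behind the scenes, here is a small Python sketch that packs one instruction (addi, which is covered in Chapter 5) into its 32-bit form. It assumes the standard PowerPC layout for addi: a 6-bit primary opcode of 14, then the destination register, the source register, and a 16-bit immediate. The function name encode_addi is made up for this example, and a finished Gecko code would also contain the extra code handler values mentioned above.

```python
def encode_addi(rd, ra, simm):
    """Pack 'addi rD, rA, SIMM' into a single 32-bit instruction word."""
    if not (0 <= rd <= 31 and 0 <= ra <= 31):
        raise ValueError("register numbers must be 0 thru 31")
    if not (-0x8000 <= simm <= 0x7FFF):
        raise ValueError("SIMM must fit in 16 signed bits")
    opcode = 14  # primary opcode for addi (assumed from the PowerPC manuals)
    # Layout, left to right: opcode (6 bits), rD (5), rA (5), immediate (16)
    return (opcode << 26) | (rd << 21) | (ra << 16) | (simm & 0xFFFF)


word = encode_addi(4, 30, 12)   # addi r4, r30, 12
print(f"{word:08X}")            # 389E000C (Hexadecimal form)
print(f"{word:032b}")           # the same Instruction as 0's and 1's
```

The hex line is what you would see in a cheat code; the binary line is what Broadway actually reads.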


Chapter 2: Registers

What are Registers? They are a set of data holding places within the CPU. Up to this point, you have only been familiar with Memory as a data holding place. There are several types of Registers. First things first. There are 32 normal integer registers. These registers are referred to as the General Purpose Registers (GPR for short).

There are also 32 Floating Point Registers (FPR for short). They obviously use floating point values instead of normal integer values. The Count Register (CTR) is used to help make loops and the Link Register (LR) holds the address that is used to navigate to/from a subroutine.

Most Wii Cheat Codes only use the GPRs. Therefore, these Registers are the only ones that will be discussed in further detail for this tutorial.

Data within Registers:

Each GPR holds a 32 bit (word) length of data. For the Dolphin Emulator, every register is displayed in Hexadecimal form and every register has its entire length of data shown. Here is a picture of the GPRs with some values in them, taken when I did a random emulation pause of a Wii game.

[Image: GPRs.png]


Chapter 3: Assembler Basics

For your ASM cheat codes, you will have an Assembler (specifically designed for Cheat Code generation) to write instructions into (the CodeWrite program mentioned in the 'How to Make your own Cheat Codes' thread).

Characters/symbol set:

When you write out instructions in the Assembler, various symbols are required for proper formatting. This will allow the Assembler to interpret your instructions and assemble them correctly into a finished cheat code.

List of symbols:
. (period)
: (colon)
, (comma)
() (parentheses)
+ (plus)
- (minus)
_ (underscore)
# (hash tag)
x (not multiply, this is for writing Hex values)

Hex vs Decimal:

For writing instructions, there are certain elements of an instruction that you can write in Hex. However, the downside is all known PowerPC Cheat Code Assemblers will disassemble an already made cheat code using decimal representation. If you are not sure what to use, then I recommend using decimal for byte data and using Hex for all other data.

When you write Hex values in the Assembler, you must prepend those values with '0x'. FYI: Dolphin displays all Register values in Hex but they are *NOT* prepended with '0x', as it's already assumed the user knows those values are displayed in Hex form.


Chapter 4: Format for Writing ASM Instructions

Any General Purpose Register is written as rX. X = the register's number. The register number is in decimal form. The first register is Register 0, aka r0. The last is Register 31, aka r31. Fyi: Dolphin may display r1 as sp, and r2 as rtoc.

Most instructions have a Destination Register. The Destination Register is the Register that holds the result of an executed instruction, while the Source Register is the Register that is used to compute the result for the Destination Register. Some instructions will have one source register, while others will have two. An instruction can only have one Destination Register.

There are essentially 4 formats:

Format 1:
rD, rA, rB

rD = Destination Register
rA = 1st Source Register
rB = 2nd Source Register

Keep in mind this is not an actual instruction, or an exact correct format. This is just to show you a very general view of any instruction that uses two source registers to compute a value for the destination register. Now let's look at the other 3 formats.

Format 2:
rD, rA

rD = Destination Register
rA = Source Register

Format 3:
rD, rA, VALUE

rD = Destination Register
rA = Source Register
VALUE = Immediate Value

Format 4:
rD, VALUE

rD = Destination Register
VALUE = Immediate Value

--

Immediate Value is a 16-bit numerical value that is **not** representative of what's in a Register. You can think of it like writing a value from "Scratch". The use of Immediate Values allows Broadway to have instructions that can provide more flexibility with less register usage.

Before continuing further it's critical that you understand signed vs logical (unsigned) values.

What is signed & logical?
Signed values mean negative numbers are possible, while Logical values mean negative numbers are impossible.

The entire number range in a register is 0x00000000 thru 0xFFFFFFFF.

---

Signed Range of Numbers in a GPR:

0x80000000 thru 0xFFFFFFFF = Negative Numbers.
0x00000001 thru 0x7FFFFFFF = Positive Numbers (if you don't include zero)

0xFFFFFFFF is -1 in decimal representation.
0xFFFFFFFE = -2
0xFFFFFFFD = -3
etc etc until you reach 0x80000000, which is the most negative number possible (-2,147,483,648).

So a left to right visual would look like this...
0x80000000 --> 0x00000000 --> 0x7FFFFFFF

Logical (Unsigned) Range of Numbers in a GPR:

0x00000001 thru 0xFFFFFFFF = All Positive Numbers (if you don't include zero)

---

The above ranges present a problem. How do we know if a value is being used as Signed or being used as Logical? For example, is 0xFFFFFFFF being used as -1 or being used as 4,294,967,295? Well, for a majority of instructions, there is no Signed vs Logical treatment because it doesn't make a difference to the result/output of said instructions. However, there are certain instructions (like Multiply and Divide) where this does indeed matter, and we will address those Signed vs Logical issues in the next Chapters. For now, just understand how a number in a GPR can be two different values.
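If it helps to see the two readings side by side, here is a minimal Python sketch. The helper names as_signed and as_unsigned are my own for illustration; they only model how the same 32 bits can be read in two ways and are not part of any Assembler.

```python
def as_unsigned(value):
    """Read a 32-bit register value as a Logical (unsigned) number."""
    return value & 0xFFFFFFFF


def as_signed(value):
    """Read a 32-bit register value as a Signed number."""
    value &= 0xFFFFFFFF
    # If the far left bit is set, the Signed reading is negative.
    return value - 0x100000000 if value & 0x80000000 else value


for raw in (0xFFFFFFFF, 0xFFFFFFFE, 0x80000000, 0x7FFFFFFF):
    print(f"0x{raw:08X}  signed: {as_signed(raw)}  logical: {as_unsigned(raw)}")
```

The bits in the register never change; only the way an instruction chooses to read them does.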

Now we need to move onto Signed Vs Logical Numbers for Immediate Values. Since Immediate Values are 16-bits in size instead of 32-bits, their range of numbers will differ.

Immediate Value 16-bit Signed Range (known as SIMM):
0xFFFF8000 thru 0xFFFFFFFF = Negative Immediate Values (-32768 thru -1)
0x0001 thru 0x7FFF = Positive Immediate Values; not including zero (1 thru 32767)

Left to Right visual:
0xFFFF8000 --> 0x0000 --> 0x7FFF

Immediate Value 16-Bit Logical/Unsigned Range (known as UIMM):
0x0001 thru 0xFFFF = All Positive Immediate Values; not including zero (1 thru 65535)

---

You will notice right away that negative Immediate Values are not written as 16-bit values. The instruction itself only holds 16 bits, but when Broadway uses a Signed Immediate Value it 'sign-extends' it to 32 bits, so a negative value ends up with its upper 16 bits set to 0xFFFF. This is the 'trick' that allows Broadway to have negative 16-bit values displayed inside a 32-bit register. When writing these Immediate Values in the Gecko Code Assembler, you must follow the ranges shown above or else an assembling error will occur. Keep in mind you can write the Immediate Values in decimal form within the Assembler if desired.

Certain instructions will use the Signed range while other instructions will use the Logical range; it all depends on the instruction in question. It's impossible for an instruction to allow the use of both Ranges. It will be one or the other.
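Here is a short Python sketch of that sign extension, in case a worked example helps. The function name sign_extend_16 is an assumption made up for this illustration.

```python
def sign_extend_16(imm16):
    """Widen a 16-bit Signed Immediate Value to the 32-bit form seen in a register."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:               # far left bit of the 16 is set, so negative
        return imm16 | 0xFFFF0000    # fill the upper 16 bits with 0xFFFF
    return imm16                     # positive values keep upper bits of 0x0000


print(f"0x{sign_extend_16(0xFFFC):08X}")   # 0xFFFFFFFC (-4)
print(f"0x{sign_extend_16(0x8000):08X}")   # 0xFFFF8000 (-32768)
print(f"0x{sign_extend_16(0x7FFF):08X}")   # 0x00007FFF (32767)
```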

Signed Immediate Values are known as SIMM. Logical/Unsigned Immediate Values are known as UIMM. The terms SIMM and UIMM are important, so remember what they mean!


Chapter 5: Integer ASM Instructions

At this point you should have a good understanding of the...
  • Registers
  • Symbols that can be used in instructions
  • General Format/Layout of instructions

Let's go over actual real world instructions that a person would use to make codes. Here is one of the most basic ASM instructions....

Add (adds two source registers, place result in destination register)

add rD, rA, rB

The value of rA is added with the value of rB. rD will hold the result of the two values added together. Whatever value was in rD beforehand gets erased and replaced with the new value after the instruction has executed.

Let's say we add the values of r4, and r25. The result of this value will be stored in r20. Our 'add' instruction would be this...

add r20, r4, r25

For a majority of instructions that use two source registers, you can swap them. So you can also write this as...

add r20, r25, r4

Imagine this as a basic math equation of 2 + 3 = 5. It doesn't matter if you swap the positions of 2 and 3, the result is always 5. You obviously can't change the spot where the destination register is within the instruction. Keep in mind certain instructions won't allow the swapping of source registers.

Let's revisit the instruction of 'add r20, r4, r25'. Register 4 (r4) will be 3, and Register 25 (r25) will be 2. The picture below shows you an instance of this instruction right before it is executed. Both Source registers are circled in blue, and the Destination Register is circled in red. The add instruction is highlighted in green.

[Image: addBEFORE.png]

Do not concern yourself with the value of 1 in the Destination Register (r20) in the above picture. This is because once the CPU executes the add instruction, that value of '00000001' will be erased and replaced with the result of a r4+r25. Now view the following picture. It shows what happens once the add instruction gets executed. Take a look at the Destination Register circled in red.

[Image: addAFTER.png]

Once the add instruction has executed, r20 now holds the value of 5.

Back in the previous Chapter, I mentioned Signed vs Logical/Unsigned issues. Well, for the Add instruction, there is no Signed or Unsigned "treatment" of values when the addition is performed. For example, if we add 0xFFFFFFFF + 0xFFFFFFFF, the result is always 0xFFFFFFFE.

If we pretend to "treat" the values as Signed, it is easy to see why the result ends up as 0xFFFFFFFE (-2). Because -1 + -1 = -2. Simple. If we pretend to "treat" the values as Unsigned, the result is still 0xFFFFFFFE. What occurs is that since the GPR cannot exceed 32 bits in width/size, an event known as a "Carry" occurs. As a beginner, you do not need to know about Carries. Just understand that there is no difference to the result placed in the Destination Register when it comes to Signed vs Unsigned for the Add instruction. Therefore, there's no such thing as Signed vs Unsigned number treatment for the actual addition within the Add instruction. A majority of instructions follow this same concept.
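The 0xFFFFFFFF + 0xFFFFFFFF example can be checked with a few lines of Python. This is only a model of the arithmetic; the name add32 and the 32-bit mask are my own way of imitating a register that cannot hold more than 32 bits.

```python
MASK32 = 0xFFFFFFFF  # a GPR can only keep the lowest 32 bits of a result


def add32(a, b):
    """Model of 'add rD, rA, rB': add, then throw away anything past 32 bits."""
    return (a + b) & MASK32


# Unsigned view: 4294967295 + 4294967295, the extra bit (the Carry) is dropped.
unsigned_result = add32(0xFFFFFFFF, 0xFFFFFFFF)

# Signed view: -1 + -1 = -2, then shown as a 32-bit register value.
signed_result = (-1 + -1) & MASK32

print(f"0x{unsigned_result:08X}")   # 0xFFFFFFFE
print(f"0x{signed_result:08X}")     # 0xFFFFFFFE
```

Both views land on the same bits, which is why the Add instruction does not need a Signed or Unsigned version.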

Let's move onto another basic ASM instruction...

Add Immediate

addi rD, rA, SIMM

Notice how the addi instruction omits the use of a 2nd Source register. We now are allowed to write in a value "from Scratch". The addi instruction requires you to fill in a Signed Immediate Value (SIMM). This means any number from 0xFFFF8000 thru 0x00007FFF. If you use a number outside of this range, the Assembler will reject it. 

The use of SIMM in the addi instruction does *NOT* mean the treatment of values for the Addition are Signed. Just like with the regular add instruction, there is no Signed Vs Unsigned treatment for the actual addition operation. Let's go over an actual addi instruction in detail...

addi r4, r30, 12

Notice the number 12. It doesn't have the letter 'r' before it. So we know 12 represents the SIMM instead of a source register. This instruction adds together the value of r30 and the value of 12. The result will be stored in r4. For the addi instruction, you CANNOT swap the positions of 12 and r30! If you wanted to write this same instruction in Hex form in the Assembler, it would be like this..

addi r4, r30, 0xC

The '0x' must be put before any hex value, or the Assembler will assemble it as decimal or not assemble it at all (throw an error). You can of course throw a minus (-) before your value to designate a negative number. So if we did.....

addi r4, r30, -12

This would be adding the value of r30 and negative 12. Thus we are actually subtracting 12 from the value in r30. For simplicity, you can use what are called simplified mnemonics. A simplified mnemonic is a 'shortcut'/'simplified' version of an ASM instruction.

The simplified mnemonic for addi r4, r30, -12 is...

subi r4, r30, 12

Subi stands for Subtract Immediate. View the picture below. It shows you an instance of the 'subi r4, r30, 12' instruction right before it gets executed by the CPU. r30 (Source register) is circled in blue. r4 (Destination Register) is circled in red. The instruction itself is highlighted in green. Remember that there is no secondary Source Register, an Immediate Value is used instead.

[Image: subiBEFORE.png]

As stated earlier in the tutorial, registers are in hexadecimal representation. r30 being 0x0000000b is 11 in decimal representation. The subi instruction will perform 11 minus 12, aka the value in the Source Register minus the Immediate Value of 12. Now view the picture below to see the result in the Destination Register (r4) once the subi instruction has been executed. Destination Register is circled in red.

[Image: subiAFTER.png]

r4 contains the result of 0xFFFFFFFF (-1 when treated as Signed).
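Here is the same subi example worked through in Python, as a rough sketch. The functions addi and subi below are stand-ins I wrote for illustration (they take the register's value, not a register name), and the range check mirrors the SIMM range described earlier.

```python
def addi(ra_value, simm):
    """Model of 'addi rD, rA, SIMM': returns the value that rD would hold."""
    if not (-0x8000 <= simm <= 0x7FFF):
        raise ValueError("SIMM out of range")
    return (ra_value + simm) & 0xFFFFFFFF


def subi(ra_value, value):
    """'subi rD, rA, value' is just 'addi rD, rA, -value'."""
    return addi(ra_value, -value)


r30 = 0x0000000B          # 11 in decimal
r4 = subi(r30, 12)        # subi r4, r30, 12
print(f"{r4:08X}")        # FFFFFFFF
```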

Let's now discuss the most commonly used simplified mnemonic of all...

Load Immediate

li rD, SIMM

li r6, 0xFF

As you can see, there are no source registers in this simplified mnemonic. It is a shortcut for the addi instruction addi r6, 0, 0xFF. You will notice the 0 in the middle doesn't have an r in front of it...

li r6, 0xFF = addi r6, 0, 0xFF

Special note about r0:
In certain ASM instructions (such as addi), if r0 is used as the first source register, then it is treated as a literal 0 (the value inside r0 is ignored). Therefore, to avoid confusion, it's best to write out "0" instead of "r0" in such cases.

Full list of special r0 rules: https://mariokartwii.com/showthread.php?tid=1853

The 'li' instruction simply sets a register to the designated immediate value, which is 0xFF in our case. Therefore, after that instruction is executed, register 6 now has the value of 0x000000FF.

Example of li to load a register with a negative SIMM~
li r7, 0xFFFFFFFC

This will set r7 to 0xFFFFFFFC. You can also write this as...

li r7, -4

Add Instruction using a register for both a Source Register and the Destination Register:

add r4, r4, r30

In the above instruction, the value of r4 (before execution of the instruction) is added to the value of r30, and the result is placed back into r4. Thus, after the instruction has executed, whatever old value was in r4 is replaced by the new value.

Discussing CPU execution order and writing multiple instructions in the Assembler.

Great, you understand what occurs on a basic math-based instruction. But it's important to understand how the CPU executes multiple instructions. The picture below explains this and provides some sample ASM instructions in a basic diagram.

[Image: cpuexec.png]

Writing multiple instructions in the Assembler is what you would expect...

add r4, r4, r30
li r31, 1
addi r12, r31, 0xA

Each instruction takes up one 'line/row' in the Assembler. You cannot put multiple instructions on one line. Once you have typed out an instruction, you must enter into a new line to write your next instruction.


Chapter 6: Store, Load ASM Instructions

This chapter will demonstrate how to take register values and write them to memory, and how to take values from memory and write them to the registers. Let's take a look at one of the most basic store-type (write a register's value to memory) instructions...

Store Word

stw rD, SIMM (rA)

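This instruction will copy the word (entire value) of rD and write it to a memory location that is referenced by the value in rA + SIMM. SIMM is the Signed 16-bit Immediate Value. With any store instruction, both rD & rA will not lose their data. (Official PowerPC manuals label this first register rS instead of rD, since it is the source of the data being stored. This tutorial keeps calling it rD for consistency.)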

stw r3, 0x0020 (r28)

The word of r3 will be stored at the memory location (address) that is the value in r28 + 0x0020. The 0x0020 value is usually referred to as the term 'offset'.

Please also note that the memory location of rA + SIMM is usually referred to as the Effective Address.

Let's say our value in r3 is 0x0000200A, and r28 is 0x80001500. Add the offset value to 0x80001500.

0x00000020 + 0x80001500 = 0x80001520.

View the picture below to see an instance of this particular instruction right before it gets executed. The Destination Register (what will get written to memory) is circled in blue. The spot in memory where the write will occur is circled in red. Source Register is circled in magenta. The instruction itself is highlighted in green.

[Image: storeBEFORE.png]

Now view the next picture to see what happens once the stw instruction is executed.

[Image: storeAFTER.png]

The blue arrow shows that the value in the Destination Register (r3) has been copied and pasted to the memory address of 0x80001520.


There are also sth (Store Halfword) and stb (Store Byte) instructions. The sth instruction will only store the lower 16 bits of a register to memory, while the stb instruction will only store the 4th (far right) byte of a register to memory.

Load Word & Zero

lwz rD, SIMM (rA)

This is simply the 'reverse' of stw. The word at memory location rA+SIMM will be copied into rD. Whatever was in rD beforehand is now completely erased.

lwz r31, 0 (r15)

For this lwz instruction, the offset is 0 (no offset). Therefore, nothing (zero) is added to r15 so the effective address is simply r15's value. View the picture below to see an instance of this particular instruction right before it gets executed. Source register is circled in blue. The word value that will be copied from memory and written to the Destination Register is circled in red. The Destination Register itself is circled in magenta.

[Image: loadBEFORE.png]

The value of 7FE5FB78 is what will be written to r31. Now here's a picture of once the instruction has been executed.

[Image: loadAFTER.png]

The red arrow shows you that the word value in memory gets copied then pasted into r31.

There are also lhz (Load Halfword & Zero) and lbz (Load Byte & Zero) instructions. The lhz instruction loads a halfword from memory into a register. Whatever value was in the register beforehand gets erased. This means every time a lhz instruction gets executed, the rD for the instruction will always result with a value of 0x0000XXXX (XXXX being the halfword value that was loaded from memory).

The lbz instruction loads a byte from memory into a register. Whatever value was in the register beforehand gets erased. Thus, every time a lbz instruction gets executed, the rD for the instruction will always result with a value of 0x000000XX (XX being the byte value that was loaded from memory).
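To tie the store and load instructions together, here is a small Python model of memory. Everything in it (the Memory class, effective_address, the dictionary of bytes) is a made-up teaching aid, not how Dolphin or the Wii actually implements memory. It does assume the Wii stores words in big-endian order, meaning the far left byte of a word sits at the lowest address.

```python
class Memory:
    """Toy byte-addressed memory; addresses that were never written read as 0."""

    def __init__(self):
        self.cells = {}

    def store(self, address, value, size):
        # Keep only the lowest 'size' bytes of the register (stw=4, sth=2, stb=1).
        data = (value & ((1 << (size * 8)) - 1)).to_bytes(size, "big")
        for i, byte in enumerate(data):
            self.cells[(address + i) & 0xFFFFFFFF] = byte

    def load(self, address, size):
        # Upper bits of the result are zero, like lwz/lhz/lbz (the '& Zero' part).
        data = bytes(self.cells.get((address + i) & 0xFFFFFFFF, 0) for i in range(size))
        return int.from_bytes(data, "big")


def effective_address(ra_value, simm):
    """rA + SIMM, wrapped to 32 bits."""
    return (ra_value + simm) & 0xFFFFFFFF


mem = Memory()
r3, r28 = 0x0000200A, 0x80001500

ea = effective_address(r28, 0x0020)
mem.store(ea, r3, 4)                       # stw r3, 0x0020 (r28)
print(f"{ea:08X}")                         # 80001520

print(f"{mem.load(ea, 4):08X}")            # lwz from 0x80001520 gives 0000200A
print(f"{mem.load(ea + 2, 2):08X}")        # lhz from 0x80001522 gives 0000200A
print(f"{mem.load(ea + 3, 1):08X}")        # lbz from 0x80001523 gives 0000000A
```

Notice that the halfword and byte loads read from higher addresses to reach the right-hand side of the stored word.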


Chapter 7: Writing an Entire Word Value to a Register from Scratch

You are probably wondering at this point how to write a whole word value from scratch to a Register. This is useful for establishing memory locations to later use for store-type and load-type ASM instructions. So let's say we want to write the value of 0x80E6FF30 to Register 22, how do we do this? Simple, with just two ASM instructions like this...

First we write the upper 16 bits. For example:

Load Immediate Shifted

lis rD, UIMM

Take note. This is the first instruction in this tutorial where it uses UIMM (Unsigned 16-bit Immediate Value). Therefore, the number range for this Instruction's Immediate Value is 0x0000 thru 0xFFFF. If you use a number outside of that range, the Assembler will reject it. The use of UIMM does NOT mean the treatment of values within the Instruction Itself is Unsigned.

Let's cover an actual lis instruction..

lis r22, 0x80E6

Load Immediate Shifted (lis) is similar to the Load Immediate (li) instruction but you are setting the upper 16 bits of a register instead of the lower 16 bits. Whenever any lis instruction is executed the lower 16 bits are always CLEARED (set to 0000)!

So at this point, r22 has a value of 0x80E60000. To write in the lower 16 bits without affecting the upper 16 bits, we do this with an instruction called Or Immediate (ori). Before explaining the ori instruction, here's a picture of the above lis instruction (lis r22, 0x80E6) being executed by the CPU (you will see an ori instruction highlighted in green but it has NOT YET executed).

[Image: lisori1.png]

As you can see in r22 (circled in blue), the lower 16 bits have been cleared and it now contains the value of 0x80E60000. Let's now discuss the ori instruction...

Or Immediate

ori rD, rA, UIMM

ori r22, r22, 0xFF30

When writing out an ori instruction for the purpose covered in the tutorial, be sure the ori instruction's Destination and Source Register is the same register that was used in the lis instruction. If you are wondering what exactly happens with the Or Immediate instruction and you are not familiar with Logical Operations (And, Or, Xor), don't concern yourself with it for now. The short version is that since lis left the lower 16 bits as 0000, the ori simply fills them in with your value. Just remember to use the lis and ori instructions as a template if you need to set an entire word value into a register from scratch.

Here's a picture of the above ori instruction (ori r22, r22, 0xFF30) executed by the CPU.

[Image: lisori2.png]

As you can plainly see, r22 now has the full value of 0x80E6FF30. In conclusion, if you were to write the lis and ori instructions in the Assembler, it would look like this...

lis r22, 0x80E6
ori r22, r22, 0xFF30
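For anyone who wants to check the lis + ori template with numbers, here is a minimal Python sketch. The functions lis, ori, and split_word are stand-ins written for this example; they return the value the register would hold rather than touching a real register.

```python
def lis(uimm):
    """Model of 'lis rD, UIMM': upper 16 bits set, lower 16 bits cleared."""
    return (uimm & 0xFFFF) << 16


def ori(ra_value, uimm):
    """Model of 'ori rD, rA, UIMM': OR the 16-bit value into the lower half."""
    return (ra_value | (uimm & 0xFFFF)) & 0xFFFFFFFF


def split_word(word):
    """Split a full word into the two halves needed for lis and ori."""
    return (word >> 16) & 0xFFFF, word & 0xFFFF


upper, lower = split_word(0x80E6FF30)
print(f"lis r22, 0x{upper:04X}")          # lis r22, 0x80E6
print(f"ori r22, r22, 0x{lower:04X}")     # ori r22, r22, 0xFF30

r22 = lis(upper)
print(f"{r22:08X}")                       # 80E60000
r22 = ori(r22, lower)
print(f"{r22:08X}")                       # 80E6FF30
```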


Chapter 8: Branch, Compare ASM Instructions

Branch instructions are used as 'jumps' to skip over certain other instructions. Let's take a look at the most simple branch instruction...

Branch

b SIMM

To understand the branch instruction better, let's go over a small snippet of code that includes a basic branch instruction

b 0x8
li r3, 1
stw r3, 0 (r31)

The letter b is used for what is known as an unconditional branch. Unconditional meaning the branch is executed no matter what the conditions are. Think of it like a jump. The branch will skip/jump over a certain amount of instructions below, thus not executing said instructions. In the provided example, the 'li r3, 1' instruction would be skipped.

Now, the '0x8' next to branch is the amount to 'jump/skip'. This 'jumping' value is a Signed Value by the way, meaning you can have branches that jump backwards. Since each instruction is 4 bytes in assembled length, a jump of 0x4 would be pointless as this would simply just go down to the next instruction below. Obviously, the larger the jump, the harder it would be to correctly calculate the amount to write for the branch instruction. Therefore, we use a trick called 'labels'.

Labels are just that, they are labels.

To allow the Assembler to know you are using labels, you designate labels with two symbols. The underscore symbol and the colon symbol. To first establish a branch label name, you must implement an underscore somewhere in the name. Like this...

b the_label

You can name labels whatever you want as long as you do NOT use special characters like percent signs or dollar signs. You can implement the underscore symbol if you want like the example provided. Okay, you have set the label name, now all you need to do is put that same label name right before the first instruction that you want executed after the jump has occurred. Put in the label name and append a colon afterwards like this...

b the_label
li r3, 1

the_label:
stw r8, 0 (r31)

Here is a picture showing the direction of the CPU execution plus some more notes to give you a better 'visual' look.

[Image: simplebranch.png]
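To show what the Assembler is calculating for you when you use a label, here is a rough Python sketch. It is not how any real Assembler is written; the function label_offsets is made up for this example, it only understands the plain 'b' instruction, and it assumes every instruction takes 4 bytes while labels and blank lines take none.

```python
def label_offsets(source):
    """Return (label, jump amount in bytes) for each 'b label' in the source."""
    addresses = {}   # label name -> byte position of the instruction after it
    branches = []    # (byte position of the branch, label it targets)
    position = 0
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue                          # blank lines take up no space
        if line.endswith(":"):
            addresses[line[:-1]] = position   # a label marks a spot, nothing more
            continue
        parts = line.split()
        if parts[0] == "b":
            branches.append((position, parts[1]))
        position += 4                         # every instruction is 4 bytes
    return [(label, addresses[label] - at) for at, label in branches]


source = """
b the_label
li r3, 1

the_label:
stw r8, 0 (r31)
"""
print(label_offsets(source))   # [('the_label', 8)]
```

So 'b the_label' in this snippet is the same as writing 'b 0x8' by hand.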


Now the branch instruction in the provided example above would be useless. Why would you randomly skip over ASM instructions? Well, branches are needed if you want to create a subroutine. Think of your list of instructions like a road. When the game is performing the list of instructions one after another, think of that like traffic driving on the road. However, you can now put a fork in the road, and tell the traffic which route to take. The two routes will then later merge back together.

Let's dive into Conditional Branches. We need to create that 'fork' in the road. Conditional branches are branches that only execute based on an 'if'. For example, let's look at the 'branch if not equal' instruction...

Branch If Not Equal

bne the_label

li r8, 1

the_label:
stw r8, 0 (r31)

the_label will only be 'jumped to' if the condition of the branch is true. In order to set up this 'if' for a conditional branch, we need to make a comparison. The most common instruction to establish a comparison is Compare Word Immediate.

Compare Word Immediate Signed

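cmpwi rA, SIMM

IMPORTANT NOTE: This is an instruction where the treatment of values is indeed factored into the operation of the Instruction. Values within this instruction are treated as Signed! Also note that a compare instruction has no Destination Register; the register is only read, which is why it is written as rA.

Value in rA is compared to SIMM as Signed values.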

cmpwi r10, 0xA

The signed value in r10 will be compared to the signed value of 0xA. We have thus created our 'if statement'. So now add in the rest of the instructions from earlier....

cmpwi r10, 0xA
bne the_label

li r8, 1

the_label:
stw r8, 0 (r31)

The value in r10 is compared to the value of 0xA. Then, if the value in r10 is NOT equal to 0xA, you will 'jump' to the_label, thus skipping the 'li r8, 1' ASM instruction. Here's a picture giving you a better visual of what is occurring.

[Image: cbranch.png]

Now let's move onto a different conditional branch instruction...

Branch If Equal

cmpwi r10, 0xA
beq the_label

li r8, 1

b the_end

the_label:
stw r8, 0 (r31)

the_end:
stw r3, 0x0010 (r24)

As you can see not only are we using 'beq' now, we are adding an unconditional branch and a second label called the_end. You may quickly notice why I've added the unconditional branch. Remember the road analogy I've used earlier... Let's follow the first route of the fork in the road (if r10 does equal 0xA)

If r10 equals 0xA, we jump to the_label. We then execute the first 'stw' instruction. Now remember the traffic/road analogy. After executing the first 'stw' instruction, we proceed directly to the next ASM instruction below, which is the second 'stw' instruction. The label name itself is NOT a barrier in our 'road' in any way shape or form. The labels are just label names to calculate the branch offsets for the Assembler so you don't have to do the calculations by hand.

Now, let's instead take the second route of the fork in the road. If r10 is NOT equal to 0xA, we do NOT jump to the_label. We instead proceed straight down our road to the 'li' instruction. After that, we encounter our unconditional branch. This obviously means we take the branch/jump no matter what. We do this because why would we go to the_label when our r10 value was NOT equal to 0xA? That would make no sense. Therefore, we jump to the_end, thus skipping the first 'stw' instruction.

Still confused? Here is a picture giving you a better visual.

[Image: cbranch2.png]
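If you know any higher-level language, the beq example is just an if/else. Here is a Python sketch of the same fork in the road. The function run_snippet is made up for this illustration, and instead of writing to real memory it returns a list of what would have been stored and where.

```python
def run_snippet(r10, r8, r3):
    """Follow the beq example and report the final r8 plus every store made."""
    stores = []
    if r10 == 0xA:                          # cmpwi r10, 0xA / beq the_label
        stores.append(("0(r31)", r8))       # the_label: stw r8, 0 (r31)
    else:
        r8 = 1                              # li r8, 1
        # b the_end: skips the first stw
    stores.append(("0x0010(r24)", r3))      # the_end: stw r3, 0x0010 (r24)
    return r8, stores


print(run_snippet(0xA, 7, 3))   # first route: both stores happen
print(run_snippet(0x5, 7, 3))   # second route: r8 becomes 1, only the last store
```

Both routes end at the second stw, which is the 'merge' in the road analogy.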

Here is a list of commonly used conditional branch instructions.
  • beq - Branch If Equal
  • bne - Branch If Not Equal
  • bgt - Branch If Greater Than
  • blt - Branch If Less Than
  • bge - Branch If Greater Than Or Equal To
  • ble - Branch If Less Than Or Equal To

Let's go over another compare instruction really quick... 

Compare Word Signed

cmpw rA, rB

IMPORTANT: Values within this instruction are treated as Signed values.

This will simply compare the signed values of two registers.

cmpw r4, r8
bgt the_label

In this example, if the value in r4 is greater than the value in r8, then the jump to the_label will be taken.
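Here is a quick Python sketch showing why the Signed treatment matters for a compare. The helper names to_signed32 and bgt_taken_after_cmpw are my own; they only model the decision and are not real instructions.

```python
def to_signed32(value):
    """Read a 32-bit register value as a Signed number."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def bgt_taken_after_cmpw(ra_value, rb_value):
    """Would 'bgt' jump after 'cmpw rA, rB'? The compare treats both as Signed."""
    return to_signed32(ra_value) > to_signed32(rb_value)


r4, r8 = 0xFFFFFFFF, 0x00000001
print(bgt_taken_after_cmpw(r4, r8))   # False, because -1 is not greater than 1
print(r4 > r8)                        # True if the same bits are read as Logical
```

So with r4 at 0xFFFFFFFF, the jump to the_label in the example above would not be taken.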


Chapter 9: Overall Illustration

Here's a picture I made to give you a general visual guide of what ASM instructions do..

[Image: ASMpic.png]

Math-based, comparison, and branch instructions only modify the registers. If you need to have the registers affect what's in memory and vice versa, you would use load and/or store instructions.

Instructions executed by the CPU are in Static Memory, while the store and load instructions being executed will affect data that resides in Dynamic Memory.


Chapter 10: Extra Stuff

Let's go over some more symbols that we haven't covered yet.

Period (.):

You can use the period to give a value its own unique label name. Btw, this has nothing to do with branch labels. Think of these like making definitions, or having 'macros'. The period is followed by the word 'set'. For example:

.set ITEM_MUSHROOM, 0x4

...some ASM here....

li r31, ITEM_MUSHROOM

This now allows the ASM writer to put ITEM_MUSHROOM any time they want to use the value of 0x4. It's a very basic 'macro', per se. It can come in handy if you are writing lengthy ASM.

Plus & Minus (+ and -):

The plus and minus symbols are used for conditional branches. Whenever a branch is done, you can help Broadway by supplying a 'hint'. The plus symbol stands for more-likely, while the minus symbol stands for less-likely. For example....

cmpwi r8, 0xC
bne+ the_label

The plus symbol next to the 'bne' will tell Broadway that the branch is more-likely to occur.

Hash Tag (#):

Whenever someone is writing very lengthy ASM, it can be handy to add notes that will let that someone know why he/she wrote those instructions. Here's an example of using hash tags to add notes/comments:

#Start assembly source

lis r4, 0x8000 #Set 1st half of the address to store the word to
stw r30, 0x157C (r4) #Store word to memory location 0x8000157C, the offset amount is used to complete 2nd half of address

#End assembly source


Chapter 11: Conclusion, ASM Reference Page,  & Credits

Alright, this should help get you started writing PowerPC ASM for your cheat codes. I've also created a beginner-friendly ASM Reference page. This page contains many "beginner" instructions plus examples. It's easier to read than a full-blown PPC Assembly Programmer's Manual. Link - https://mariokartwii.com/showthread.php?tid=863

Credits:
IBM, Apple, and Motorola (creators of PowerPC ASM)
WiiBrew (a lot of information was gathered from there)
Star (taught me ASM)
#3
Wow, this guide is actually quite comprehensive and well-written. It helped me a lot in getting started on ASM and led me into learning Hex (which I didn't know beforehand, although it's actually pretty simple). Kudos
#4
Thank you for the kind words. Assembly by itself isn't that tough to learn; it's just very difficult coming up with code ideas from scratch and applying your ASM knowledge to make an actual cheat code.

I would suggest going through the codes forum and looking at the Source of basic ASM codes, especially ones written by me or Star. We put plenty of good comments in our Source to help others understand how the code(s) work.

EDIT:

Here is essentially the most basic ASM you can do. Writing a value in a register before that value gets stored.

http://mkwii.com/showthread.php?tid=848

I was looking at a value in the RAM Viewer. I noticed it would get written to whenever I did a certain action with my item. Therefore, I set a Write BP. I used my item, the value in memory got overwritten with a new value, the Write BP was hit, and the game paused. I saw in the Code view that the value in Register 31 was getting stored to a spot in memory.

This is easy to manipulate. As you can see in the Source, I simply load a custom value into Register 31 (replacing the legit value), and then include the game's default ASM to allow the game to store the new value to memory. Very simple.
#6
(06-18-2019, 12:52 PM)Cameron_MKW Wrote: Quick question, with signed values why does negative 1 display as 0xFFFFFFFF on Dolphin?

If a value is 0xFFFFFFFF, it CAN be -1 or 4294967295.

If the value is signed (not logical; most values are signed, by the way), this will represent -1 in decimal form.

If the value is not signed (logical), it is 4294967295 in decimal form.

Another example:
Let's say a register has the value of 0xFFFFFFFE. If it's signed the value (in decimal) is -2
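
Here is a rough Python sketch of the two readings of the same 32-bit value (the function names as_signed32 and as_unsigned32 are just made up for illustration, they are not part of any tool):

```python
def as_unsigned32(value: int) -> int:
    """Read a 32-bit register value as unsigned (logical)."""
    return value & 0xFFFFFFFF


def as_signed32(value: int) -> int:
    """Read a 32-bit register value as signed (two's complement)."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the signed reading is the value minus 2^32
    return value - 0x100000000 if value & 0x80000000 else value


print(as_signed32(0xFFFFFFFF), as_unsigned32(0xFFFFFFFF))  # -1 4294967295
print(as_signed32(0xFFFFFFFE))                             # -2
```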

If you're working with an instruction that uses a 16-bit signed value (such as Load Immediate), then 0xFFFF8000 (-32768) is the most negative number that can be written. When 16-bit signed values are used, they are 'sign extended', meaning the upper 16 bits (left-hand side) of the register are automatically filled with copies of the sign bit: 0xFFFF for a negative value, 0x0000 for a positive one.

li r5, -2
li r5, -0x0002
li r5, 0xFFFFFFFE


All the above instructions are the same thing.
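
If it helps, sign extension can be sketched in Python like this. This is only a model of what ends up in the register, assuming a 16-bit signed immediate; the function name load_immediate is made up for the example.

```python
def load_immediate(immediate: int) -> int:
    """Return the 32-bit register contents after a `li rX, immediate`."""
    imm16 = immediate & 0xFFFF       # only the low 16 bits fit in the instruction
    if imm16 & 0x8000:               # sign bit set: the value is negative
        return 0xFFFF0000 | imm16    # upper 16 bits are filled with 1's
    return imm16                     # otherwise upper 16 bits stay 0


print(hex(load_immediate(-2)))       # 0xfffffffe
print(hex(load_immediate(-0x8000)))  # 0xffff8000
print(hex(load_immediate(0x7FFF)))   # 0x7fff
```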

--

If this is still confusing, the knowledge of signed vs logical isn't really needed until you start working with complicated comparison-type instructions (like blt/bgt on a logical value).
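
To show why it matters for comparisons, here is a small Python sketch (helper names are made up for illustration). It models a signed compare like cmpw versus a logical compare like cmplw on the same two register values, and shows that a following blt would go different ways.

```python
def as_signed32(value: int) -> int:
    """Read a 32-bit value as signed (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def less_than_signed(a: int, b: int) -> bool:
    """Models the 'less than' result of a signed compare."""
    return as_signed32(a) < as_signed32(b)


def less_than_logical(a: int, b: int) -> bool:
    """Models the 'less than' result of a logical (unsigned) compare."""
    return (a & 0xFFFFFFFF) < (b & 0xFFFFFFFF)


a, b = 0xFFFFFFFF, 0x00000001
print(less_than_signed(a, b))   # True  (-1 is less than 1)
print(less_than_logical(a, b))  # False (4294967295 is not less than 1)
```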
#8
(06-19-2019, 08:34 AM)Cameron_MKW Wrote: Oh OK I think I get it now. Thank you

Forgot to mention this...

For signed values on a register... 0x00000000 thru 0x7FFFFFFF is zero or positive, and 0x80000000 thru 0xFFFFFFFF is negative.
#9
Signed vs Unsigned really tripped me up, but I finally got it.  Writing it out like this helped me understand:

For signed, starting at 0, the numbers keep increasing until the largest possible decimal number signed hex can represent (2,147,483,647), then suddenly flip to the biggest negative decimal number it can represent (-2,147,483,648).  However, looking at it this way, the crucial part to understand is that the value in hex keeps increasing, even after the decimal number it represents makes the flip.

Signed Hex to Decimal
|  00000000 =  0
|  00000001 =  1
|  00000002 =  2
|  (etc...)
|  7FFFFFFD =  2,147,483,645
|  7FFFFFFE =  2,147,483,646
|  7FFFFFFF =  2,147,483,647 <--- biggest possible decimal number in signed hex
|  80000000 = -2,147,483,648 <--- Sudden flip to the biggest negative!
|  80000001 = -2,147,483,647
|  80000002 = -2,147,483,646
|  (etc...)
|  FFFFFFFD = -3
|  FFFFFFFE = -2
|  FFFFFFFF = -1
v
hex value keeps
increasing

But if it's unsigned (logical), we don't make the flip at 80000000.  We just keep increasing, which means the upper half of the hex range now represents bigger positive numbers instead of negative ones.  Meaning, instead of the max being 2,147,483,647, the max is now 4,294,967,295.
(if you just did that math yourself and are wondering why it came out to be one more than double, it's because 0 takes up one of the slots in the lower half: 00000000 thru 7FFFFFFF is 2,147,483,648 values, and doubling that count gives 4,294,967,296 values, numbered 0 thru 4,294,967,295.)

Unsigned (Logical) Hex to Decimal
|  00000000 =  0
|  00000001 =  1
|  00000002 =  2
|  (etc...)
|  7FFFFFFD =  2,147,483,645
|  7FFFFFFE =  2,147,483,646
|  7FFFFFFF =  2,147,483,647 
|  80000000 = 2,147,483,648 <--- No flip, number keeps increasing!!!!!
|  80000001 = 2,147,483,649
|  80000002 = 2,147,483,650
|  (etc...)
|  FFFFFFFD = 4,294,967,293
|  FFFFFFFE = 4,294,967,294
|  FFFFFFFF = 4,294,967,295 <--- biggest possible decimal number in unsigned hex
v
hex value keeps
increasing

Hope this helps anyone else that was confused like me!  This website helped me a lot, you can put in any hex value and it will show both the signed and unsigned equivalent decimal value.