  Tutorial on the 'BL-Trick' + Pseudo-Ops
Posted by: Vega - 12-05-2018, 02:52 AM - Forum: PowerPC Assembly - No Replies

Requirements:
Be at least a Beginner Level coder that already knows the basics and has made a few simple codes
Understand basic Loops - https://mariokartwii.com/showthread.php?tid=975



Chapter 1: Introduction

Beginner and Intermediate Coders may run into a situation where they need to overwrite a string of data in dynamic memory. The Coder may have found a Hook address that, when executed, points to the start of an important String of Data that can be edited. Once the Data has been edited, the effects are seen on the user's game.

The beginner or intermediate coder might try a method like this....

Code:
lis rX, 0xXXXX
ori rX, rX, 0xXXXX

He/she would probably use a repetitive series of lis+ori instructions to write out the string of data and then use a basic Loop to transfer the data from the registers to dynamic memory.

Another method would still use lis+ori, but instead of a loop for the transfer, a stmw instruction (store multiple words) is used. The downside is that this method requires the address of the start of the Data to be divisible by 4, and it requires the 'Push/Pop the Stack' method.

There are even a couple more similar methods (stswi and stswx), but at the end of the day, there's a method superior to all of them.

What is this method called? It's known as the BL-Trick.



Chapter 2: Overview of Re-Creating a Code

The BL-Trick is a great tool to use for transferring strings of data. It can be somewhat difficult to understand, so it's best if we recreate a code I have personally made before. That way you will know in the future when to apply the BL-Trick for your own codes.

We will create my "Friend Roster Plus Your Mii; Name Changer & Extender" code. It incorporates a simple BL-Trick.

Let's pretend on Mario Kart Wii that you have applied a Memory Read Breakpoint at your Mii Name in dynamic memory while switching into the Friend Roster while online. The BP was hit and you end up with the following instruction + address (PAL)...


Code:
PAL:
8075144C lwz r5, 0x0068 (r4)


After some manual edits directly within the Dolphin-memory-engine, you also come to the conclusion that the Mii Name can be extended to a max of 29 characters. Now that we have an instruction address, we can use Instruction BP's for further debugging of this address if needed.

If you want, feel free to boot your MKWii game if you have one available and connect to Wifi. Once you are at the Wifi Main Menu, place an Instruction BP at one of the following addresses, depending on the region of your game.

NTSC-U 8074BF0C
PAL 8075144C
NTSC-J 80750AB8
NTSC-K 8073F80C

Now select 'Friends' and the emulation should pause itself due to the BP hit. Let's take a look at Code View, Registers, and Memory all at once.

[Image: bl01.png]

r4 (source register of our instruction) is outlined in red. r4 + 0x0068 points to the start of the Mii Name. The Mii Name (day1test in the supplied picture) is outlined in magenta. Mii Names are always in 16-bit ASCII (strictly speaking, big-endian UTF-16) in every Wii game. 16-bit ASCII is just like standard (8-bit) ASCII, but each character is a halfword to accommodate special characters such as Japanese letters, picture-like symbols, etc.

Examples of 8 bit vs 16 bit ASCII:
  • 20 = space in ASCII
  • 0020 = space in 16-bit ASCII
  • 30 = zero in ASCII
  • 0030 = zero in 16 bit ASCII
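
As a rough illustration (not part of the original code), the Python sketch below converts a name into the halfwords listed above. It assumes the '16-bit ASCII' layout matches big-endian UTF-16, which holds for the plain characters in these examples. The helper name to_halfwords is made up for this sketch.

```python
def to_halfwords(text):
    """Return each character of text as a 4-digit uppercase hex halfword."""
    raw = text.encode("utf-16-be")  # assumption: 16-bit ASCII == big-endian UTF-16
    return [raw[i:i + 2].hex().upper() for i in range(0, len(raw), 2)]


print(to_halfwords(" "))         # ['0020']
print(to_halfwords("0"))         # ['0030']
print(to_halfwords("day1test"))
```

Each printed entry is one halfword as it would appear in a memory viewer.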

Referring back to our code at hand, we need an efficient way to write out our new custom Mii Name and then overwrite 'day1test' with it. We also want to extend the Mii Name to the max of 29 characters (29 halfwords).



Chapter 3: Overview of a simple BL-Trick

The BL-Trick allows you to write a string of data in your source without using any PowerPC instructions. You may be wondering, "How is this possible?" Let's look at a simple example of a BL-Trick that will write out the word value of 0x12345678.

Code:
#Branch to the Label Name & Link
bl the_label
.long 0x12345678
the_label:
mflr r12

And here is that source in compiled form (C2 code with blank address)

Code:
C2000000 00000002
48000009 12345678
7D8802A6 00000000


48000009 = bl 0x8 (bl the_label)
12345678 = Our value written from scratch (.long 0x12345678)
7D8802A6 = mflr r12


As you can see the '.long' is not a PowerPC instruction. Before we go over this '.long', the bl and mflr instructions need to be explained first.

Branch & Link (bl)
It's similar to a standard branch instruction (b), but when the bl instruction executes, the address of the instruction immediately AFTER the bl is placed into the Link Register.
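
To see where the compiled word 48000009 comes from, here is a small Python sketch of how a relative bl is encoded (primary opcode 18, a displacement field, and the Link bit set). The function name encode_bl is hypothetical and only exists for this illustration. It is not something the assembler exposes.

```python
def encode_bl(displacement):
    """Encode a relative 'bl' for a byte displacement (must be a multiple of 4)."""
    if displacement % 4 != 0:
        raise ValueError("branch displacement must be word-aligned")
    if not -0x2000000 <= displacement < 0x2000000:
        raise ValueError("displacement out of range for a bl")
    opcode = 18 << 26                      # primary opcode 18 = branch
    li_field = displacement & 0x03FFFFFC   # displacement bits, low 2 bits always 0
    return opcode | li_field | 1           # AA = 0 (relative), LK = 1 (link)


print(hex(encode_bl(0x8)))   # 0x48000009
```

A displacement of 0x8 skips the bl itself (4 bytes) plus the one data word (4 bytes), which is why the example compiles to 48000009.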

Confused? Let's go over some pictures using the above example.

Here's a picture of right before the bl instruction gets executed~

[Image: bl02.png]

You will notice there's a 'ps_msub' instruction underneath the bl instruction. This is actually the '.long 0x12345678' part. Dolphin's Code View tries to disassemble every word value present in Memory. So certain contents within a BL-Trick may appear as 'legit' instructions in Code View.

Here's a picture of once the bl instruction has executed (take notice the Link Register outlined in red)~

[Image: bl03.png]

The Link Register aka the LR has been modified and now contains the address that points to the start of the BL-Trick.

Here's a picture of once the 'mflr r12' instruction has executed. Blue arrow is indicating what exactly occurred in the instruction.

[Image: bl04.png]

The value in the LR was copied over to r12. r12 now points to the start of the BL-Trick. Since our BL-Trick is only one word value, r12's value is the address that points to 0x12345678.

The following two instructions are responsible for copying data to/from the LR~
  • mflr rD #Value in Link Register is copied to rD
  • mtlr rD #Value in rD is copied to the Link Register



Chapter 4: Pseudo-Ops

Code:
.long 0x12345678

What is this .long? It's called a Pseudo-Op. In non-coding terms, pseudo ops are 'keywords' for your ASM Code Assembler to add in numerical values to a code without requiring the use of a PowerPC instruction.

List of Pseudo-Ops:
  • .byte = byte (example: .byte 0xFF)
  • .short = halfword (example: .short 0x0102)
  • .long = word (example: .long 0x80456C04)
  • .llong = doubleword (example: .llong 0x8000150090323C7C)
  • .float = float value in its 32-bit single precision form (example: .float 1 will use a value of 0x3F800000)
  • .string = ASCII write that auto appends a null byte at the end (example: .string "Hello!")
  • .ascii = ASCII write without the appended null byte
  • .string16 = 16-bit ASCII that auto appends a null halfword at the end. This Pseudo-Op only works in PyiiASMH!
  • .space X = X bytes of zero (example: .space 8)
  • .align 2 = Use this for alignment when needed because the BL trick 'amount' must be word divisible. Just place it at the very end of your BL Trick. The Assembler will auto calculate how many extra zero bytes to add to the end of your BL-Trick so the source can be word-aligned. If no alignment is necessary, you can still add this, as it will NOT add any extra unnecessary zero bytes.
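
If it helps to picture the bytes each Pseudo-Op produces, the Python sketch below mimics a few of them with the standard struct module. This is only a model of the output (big-endian, as on the Wii). The emit_* and align_to_word names are invented for the illustration and are not assembler features.

```python
import struct


def emit_short(value):
    return struct.pack(">H", value)          # .short  -> 2 bytes


def emit_long(value):
    return struct.pack(">I", value)          # .long   -> 4 bytes


def emit_llong(value):
    return struct.pack(">Q", value)          # .llong  -> 8 bytes


def emit_float(value):
    return struct.pack(">f", value)          # .float  -> 32-bit single precision


def emit_string(text):
    return text.encode("ascii") + b"\x00"    # .string -> ASCII plus a null byte


def align_to_word(data):
    """Mimic '.align 2': pad with zero bytes up to a multiple of 4."""
    return data + b"\x00" * (-len(data) % 4)


print(emit_float(1.0).hex())                        # 3f800000
print(align_to_word(emit_string("Hello!")).hex())   # 48656c6c6f210000
```

"Hello!" plus its null byte is 7 bytes, so the alignment step adds a single zero byte. Data that is already word-aligned gets nothing added.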



Chapter 5: Making the code pt. 1/2

We will create a BL-Trick that contains our new extended 29-character Mii Name, and then we can use a basic Loop to copy the Mii Name from the space within our code to Dynamic Memory. With any code that involves the BL-Trick, you must be aware of the safety of the Link Register. You may have to back up its value to a GPR (via mflr) and then move that value back to the Link Register at the end of your code (via mtlr). Let's find out if we can use the LR freely in this code. The obvious method is to look at the code's instruction address in Code View and scroll down until we find an instruction that modifies the LR.

At address 8075151C is this...

[Image: bl05.png]

The instruction at the address (highlighted in blue) is 'mtlr r0'. At this point we need to know the closest previous instruction that modifies r0. We can see that just 4 instructions above, at address 0x8075150C, is a 'lwz r0, 0x0024 (sp)' instruction. It is highlighted in blue in the picture below.

[Image: bl06.png]

We see that a value is loaded into r0, and that value is then moved to the LR. In conclusion, we can freely write to the LR without backing it up since the CPU will write a new value to the LR anyway.

Referring back to the default instruction...

Code:
PAL:
8075144C lwz r5, 0x0068 (r4)

Since it's a load instruction, we want it as the last instruction of our source. That way we can freely use r5. r11 and r12 are safe to use 99% of the time. So that's 3 free registers, which should be enough. So we won't need to use the 'Push/Pop the Stack' method.

We need an instruction that will set a register to point to start of the Mii Name that is currently in dynamic memory. However, we actually want to point 2 bytes BEFORE the start of the Mii Name. Why is this? Well recall back in the Creating Loops tutorial, load and store 'updating' instructions are used. They are usually stwu (store word & update), and lwzu (load word & update). These updating type of instructions constantly increment the loading and storing addresses after each loop iteration.

Since each Mii Name character is a halfword, it makes sense later on in our Source to implement a loop that loads & stores a halfword for each iteration (lhzu & sthu). The offsets for the lhzu and sthu instructions will be 0x2, so the loop can always continue its load+store for each individual halfword-sized Mii Name character.

We want to have a register point to 2 bytes before the Mii Name so when this loop goes thru its first iteration, the first Mii Name character (the 'd' in 'day1test') will be exactly loaded.

With all of that being said here's the first instruction of our source. We will use r5.

Code:
#Need a register to point to 2 bytes before the original Mii Name
addi r5, r4, 0x66

Now we need to create our new 29-character Mii Name using a BL-Trick. It will just be a bunch of arbitrary digits (01230123012301230123456789905). Here's the BL-Trick portion....

Code:
#Use BL-Trick to write out Mii Name 01230123012301230123456789905
bl mii_name

.short 0x0000
.llong 0x0030003100320033 #Start of Mii Name
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0034003500360037
.llong 0x0038003900390030
.short 0x0035 #Last Mii Name character
.short 0x0000
.short 0x0000

mii_name:
mflr r12

Okay so r12 now points to the start of the BL-Trick data, which is the null halfword sitting 2 bytes before the custom Mii Name. I used the .llong and .short Pseudo-Ops to give you a 'hex' view of the Mii Name.

Explaining all 4 .short's~
  • The first .short (0x0000) is there because of the loop that we will address soon. As you should already know, loops use updating load/store instructions. This first 0x0000 of space allows us to load the first Mii Name character on the first iteration of the loop without any extra instructions.
  • The second .short (0x0035) is simply the final Mii Name character.
  • The third .short (0x0000) will be for a 'check' in the upcoming loop. We need to know when to stop the loop. There are no Mii Characters that are a 0x0000. So once we hit this null halfword, we know to end the loop.
  • The fourth .short (0x0000) is simply for alignment. The BL-Trick as a whole needs to be word-aligned.
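
As a sanity check on the layout described above, this Python sketch rebuilds the data portion of the BL-Trick and confirms its size. It assumes the name is stored as big-endian UTF-16. build_name_block is a hypothetical helper written only for this check.

```python
def build_name_block(name):
    """Return the bytes that sit between the 'bl' and the 'mii_name' label."""
    data = b"\x00\x00"                     # leading halfword skipped by the first lhzu
    data += name.encode("utf-16-be")       # assumption: big-endian UTF-16 characters
    data += b"\x00\x00"                    # null halfword that ends the loop
    data += b"\x00" * (-len(data) % 4)     # padding that '.align 2' would add
    return data


block = build_name_block("01230123012301230123456789905")
print(len(block))             # 64
print(hex(len(block) + 4))    # 0x44
```

The 64 bytes of data plus the 4 bytes of the bl itself give a branch displacement of 0x44, so the bl in this source would assemble to 48000045.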



Chapter 6: Making the code pt. 2/2

Now it's time to make that Loop I mentioned earlier~

Code:
the_loop:
lhzu r11, 0x2 (r12) #Load Mii Name Character from BL Trick
sthu r11, 0x2 (r5) #Store Mii Name Character to dynamic memory
cmpwi r11, 0 #Check for null halfword (is Mii Name transfer done?)
bne+ the_loop #If NOT null, keep loop going

This loop is a tiny bit different than the ones described in the Creating Loops tutorial. Since the custom Mii Name can vary in length, we need a way to know when any custom Mii Name has been completely transferred over to dynamic memory.
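
For readers who prefer to trace the logic outside of a debugger, here is a loose Python model of that loop. Memory is modelled with plain bytes objects and the pointers are simple indexes, so this is only a sketch of the control flow, not of real hardware. copy_until_null is a made-up name.

```python
def copy_until_null(source, dest, src_ptr, dst_ptr):
    """Model of the lhzu/sthu loop: both pointers advance by 2 on every pass,
    and the loop stops once the null halfword has been copied."""
    while True:
        src_ptr += 2                               # lhzu leaves r12 at the loaded address
        halfword = source[src_ptr:src_ptr + 2]
        dst_ptr += 2                               # sthu leaves r5 at the stored address
        dest[dst_ptr:dst_ptr + 2] = halfword
        if halfword == b"\x00\x00":                # cmpwi r11, 0 followed by bne+
            return dst_ptr


# Leading pad halfword, the name "0123", then the null terminator
source = b"\x00\x00" + "0123".encode("utf-16-be") + b"\x00\x00"
dest = bytearray(b"\xFF" * 16)                     # stand-in for dynamic memory
copy_until_null(source, dest, 0, 0)
print(dest.hex())   # ffff00300031003200330000ffffffff
```

Note that the null halfword itself is also stored, which terminates the new name in memory, and that the first two bytes of dest are never touched because both pointers start 2 bytes early.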

There are no 16-bit ASCII characters that are a null halfword (0x0000). So a basic check against the value of 0 will work. Once the loop has been completed, the custom Mii Name will be in dynamic memory. All we need now is the code's default instruction and we're done. Let's add that in and look at the entire source.

Code:
#Need a register to point to 2 bytes before the original Mii Name
addi r5, r4, 0x66

#Use BL-Trick to write out Mii Name 01230123012301230123456789905
bl mii_name

.short 0x0000
.llong 0x0030003100320033 #Start of Mii Name
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0034003500360037
.llong 0x0038003900390030
.short 0x0035 #Last Mii Name character
.short 0x0000
.short 0x0000

mii_name:
mflr r12

the_loop:
lhzu r11, 0x2 (r12) #Load Mii Name Character from BL Trick
sthu r11, 0x2 (r5) #Store Mii Name Character to dynamic memory
cmpwi r11, 0 #Check for null halfword (is Mii Name transfer done?)
bne+ the_loop #If NOT null, keep loop going

#Default Instruction
lwz r5, 0x0068 (r4)



Chapter 7: Conclusion

And that's the BL Trick! Key notes about BL Tricks to wrap up this tutorial...
  • Be sure of Link Register safety
  • Slap on a '.align 2' at the end of the BL-Trick to align it if necessary
  • And Happy Coding!

  Creating Loops
Posted by: Vega - 12-04-2018, 03:54 AM - Forum: PowerPC Assembly - No Replies



This thread will teach a beginner ASM coder how to write basic loops in PowerPC ASM. The loops covered here are pieces of code with the task of copy-pasting a chunk/string of data from one place in memory to another. For this tutorial, we have a chunk/string of data starting at memory address 0x80002008.

The string of data is this..
  •     Address         Data
  • 0x80002008 0x11223344
  • 0x8000200C 0xAABBCCDD
  • 0x80002010 0x12345678
  • 0x80002014 0xABCDEF01
  • 0x80002018 0x12AB34CD

We want to copy this data to memory starting at address 0x81450000. The string of data is a total of 5 words in length (or 10 halfwords, or 20 bytes).

In other words, we start with this...
  •     Address         Data
  • 0x80002008 0x11223344
  • 0x8000200C 0xAABBCCDD
  • 0x80002010 0x12345678
  • 0x80002014 0xABCDEF01
  • 0x80002018 0x12AB34CD

And we want to end up with this...
  •     Address         Data
  • 0x81450000 0x11223344
  • 0x81450004 0xAABBCCDD
  • 0x81450008 0x12345678
  • 0x8145000C 0xABCDEF01
  • 0x81450010 0x12AB34CD

What a beginner coder might do is write a source of multiple uses of lwz+stw like this...

Code:
lis r11, 0x8000
lis r12, 0x8145
lwz r10, 0x2008 (r11)
stw r10, 0 (r12)
lwz r10, 0x200C (r11)
stw r10, 0x4 (r12)
lwz r10, 0x2010 (r11)
stw r10, 0x8 (r12)
lwz r10, 0x2014 (r11)
stw r10, 0xC (r12)
lwz r10, 0x2018 (r11)
stw r10, 0x10 (r12)

Instead of using a stream of lwz+stw's, we can use Loops.

There are 2 types of loops:
CTR Loop
Subic. Loop



CTR Loop

The CTR loop uses the Count Register. The Count Register (CTR) is used to keep track of how many times the loop will execute. The amount of times that a loop will need to be executed depends on these two factors.

  1. How much total Data is being copy-pasted (transferred)
  2. How you want to transfer the Data

Loops can transfer data via bytes, halfwords, or words. For our data shown above, we have 5 words, and we will transfer it one word at a time. Therefore, we need our loop to execute a total of 5 times. If we were to transfer a byte at a time, we would need the loop to execute 20 times. If transferring a halfword at a time, we would need the loop to execute 10 times.

First, let's set the CTR to have the value of 5.

Code:
li r12, 5
mtctr r12

The mtctr instruction stands for Move to CTR. The value of r12 is copied to the CTR. Now we need to set our first loop loading address...

Code:
lis r12, 0x8000
ori r12, r12, 0x2008

And we're good to continue on, right? No, we're not. Why is this incorrect?

Our 1st loop loading address needs to be 0x4 before 0x80002008, which is 0x80002004. Why is this required? Well, we will be using what are called 'updating' instructions for our loop (will explain more on this shortly). Since we are transferring one word at a time, we need one word of space (or -0x4) before the first loading address. If we were transferring halfwords, this would be -0x2; if bytes, then -0x1.

Now, let's correctly set the first loading address of the loop~

Code:
lis r12, 0x8000 #0x80002008 - 0x4 = 0x80002004
ori r12, r12, 0x2004

We must apply that same logic to the first storing address of the loop. 0x81450000 - 0x4 = 0x8144FFFC.

Code:
lis r11, 0x8144 #0x81450000 - 0x4 = 0x8144FFFC
ori r11, r11, 0xFFFC
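
The arithmetic so far (iteration count plus the two starting addresses) can be double-checked with a short Python sketch. loop_setup is a hypothetical helper used only to show the calculation.

```python
def loop_setup(source, dest, total_bytes, unit_size):
    """Return (iterations, first load address, first store address)."""
    if total_bytes % unit_size != 0:
        raise ValueError("data length must be a multiple of the transfer size")
    # Updating instructions add the offset BEFORE the access, so both
    # starting addresses sit one unit below the real data.
    return total_bytes // unit_size, source - unit_size, dest - unit_size


count, load_addr, store_addr = loop_setup(0x80002008, 0x81450000, 20, 4)
print(count, hex(load_addr), hex(store_addr))   # 5 0x80002004 0x8144fffc
```

Passing a unit size of 2 or 1 gives the halfword and byte variants (10 and 20 iterations respectively).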

We got our initial loading & storing addresses set, let's make the loop...

Code:
the_loop:
lwzu r10, 0x4 (r12)
stwu r10, 0x4 (r11)
bdnz+ the_loop

A lot to unpack here. First, all loops need a label name. The lwzu and stwu instructions are those 'updating' instructions I mentioned earlier. Let's figure out what they do....


Load Word and Zero with Update
lwzu rD, SIMM (rA)


SIMM + rA = The effective address. The word located at the effective address is loaded into rD. Afterwards, the Effective Address becomes the new rA. Therefore, if rA is used in a future instruction, it has a new incremented/decremented value. Note that rA cannot be r0 or the same register as rD. Use lhzu for halfwords, and lbzu for bytes.


Example:
#r4 = 0x80456CF4
lwzu r0, 0x24 (r4)
#After lwzu has executed r4 is NOW 0x80456D18. (0x80456CF4 + 0x24 = 0x80456D18)


Store Word & Update
stwu rD, SIMM (rA)


Same concept as lwzu but this is storing rD's value to memory instead of loading a value from memory into it. Use sthu for halfwords, use stbu for bytes.


These updating instructions can cut down the amount of instructions your source contains. Let's say we have this lwzu instruction...
lwzu r0, 0x24 (r4)

If we were to mimic this withOUT lwzu, we would have to use two instructions...
lwz r0, 0x24 (r4)
addi r4, r4, 0x24

Okay, you now know what lwzu and stwu do. Let's talk about the bdnz+ instruction. This stands for Branch Decrement Not Zero. The instruction does the following...
  • Decrement the value in the Count Register by 1
  • If Count Register does not equal 0, take the branch
  • If Count Register equals 0, skip the branch.

By placing a bdnz+ instruction at the end of our loop with its branch label going back to the top of the loop, this allows us to decrement our Loop Tracker (CTR) and at the same time, stop executing the loop once the Loop Tracker (CTR) hits Zero.

Here's the entire source~

Code:
li r12, 5
mtctr r12

lis r12, 0x8000 #0x80002008 - 0x4 = 0x80002004
ori r12, r12, 0x2004

lis r11, 0x8144 #0x81450000 - 0x4 = 0x8144FFFC
ori r11, r11, 0xFFFC

the_loop:
lwzu r10, 0x4 (r12)
stwu r10, 0x4 (r11)
bdnz+ the_loop
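
If you would like to trace the source above without an emulator, the following Python sketch models it with a dictionary standing in for memory. It only mirrors the register and CTR bookkeeping. ctr_loop is a name made up for this illustration.

```python
def ctr_loop(memory, r12, r11, ctr):
    """Model of lwzu/stwu/bdnz on a dict that maps word addresses to values."""
    while True:
        r12 += 4                 # lwzu: effective address becomes the new r12
        r10 = memory[r12]
        r11 += 4                 # stwu: effective address becomes the new r11
        memory[r11] = r10
        ctr -= 1                 # bdnz: decrement CTR, branch back while not zero
        if ctr == 0:
            return r12, r11


memory = {
    0x80002008: 0x11223344,
    0x8000200C: 0xAABBCCDD,
    0x80002010: 0x12345678,
    0x80002014: 0xABCDEF01,
    0x80002018: 0x12AB34CD,
}
ctr_loop(memory, 0x80002004, 0x8144FFFC, 5)
print(hex(memory[0x81450000]))   # 0x11223344
print(hex(memory[0x81450010]))   # 0x12ab34cd
```

After the call, both pointers rest on the last word that was transferred (0x80002018 and 0x81450010).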

For a better idea of what's going on visually speaking, here is a series of 4 pictures. The 1st picture is right before the loop is first executed. I've manually placed in the values for r11 and r12 beforehand. The next 3 pictures show the execution of one iteration of the loop, so the CPU ends up back at the lwzu instruction.

[Image: loop1.png]

[Image: loop2.png]

[Image: loop3.png]

[Image: loop4.png]

That is what one iteration of the loop looks like. A word gets loaded into r10 and 'transferred' to the spot designated by stwu instruction using r11.

Here are two more pictures showing the final stages of the loop. First pic is right before the loop is completed. You will notice the CTR has a value of 1 and the bdnz+ instruction is about to be executed. Then 2nd pic is the bdnz+ instruction getting executed, you will see the CTR is now 0 and the loop has fully completed.

[Image: loop5.png]

[Image: loop6.png]



Subic. Loop

With the subic. loop, instead of using the CTR for the amount of times the loop needs to execute, we use a normal general purpose register instead. Here's what our source would look like using the subic. loop...

Code:
li r9, 5 #r9 will be used to mimic our 'CTR'

lis r12, 0x8000 #0x80002008 - 0x4 = 0x80002004
ori r12, r12, 0x2004

lis r11, 0x8144 #0x81450000 - 0x4 = 0x8144FFFC
ori r11, r11, 0xFFFC

the_loop:
lwzu r10, 0x4 (r12)
stwu r10, 0x4 (r11)
subic. r9, r9, 1
bne+ the_loop

The subic. instruction stands for Subtract Immediate Carrying (carrying deals with the carry flag, you don't need to worry what this flag is about). The small dot you see appended to subic is called the Record feature. It's a free use of 'cmpwi rD, 0', which is cmpwi r9, 0 for this source. The "subic." instruction for our loop will subtract one from the value of r9 and store the result back into r9 every time the loop executes. Then it compares the value of r9 against Zero. The bne+ instruction will branch to the_loop whenever r9 is NOT zero. Once r9 is zero, the loop is over and the instructions beneath the loop will be executed.



CTR vs Subic.

While both loop types result in the same length of assembled code, the CTR loop is better because it executes fewer total instructions and thus takes less execution time. The CTR loop also uses one less GPR (general purpose register) than the Subic. loop.
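
To put rough numbers on that comparison, the Python sketch below counts executed instructions for the two sources in this thread (5 iterations each). It simply counts instructions and ignores real CPU timing, so treat it as a back-of-the-envelope estimate. executed_instructions is an invented helper.

```python
def executed_instructions(setup, body, iterations):
    """Total instructions run: one-time setup plus the loop body per iteration."""
    return setup + body * iterations


# CTR loop: li, mtctr, lis, ori, lis, ori (6 setup) + lwzu, stwu, bdnz (3 per pass)
ctr_total = executed_instructions(setup=6, body=3, iterations=5)

# Subic. loop: li, lis, ori, lis, ori (5 setup) + lwzu, stwu, subic., bne (4 per pass)
subic_total = executed_instructions(setup=5, body=4, iterations=5)

print(ctr_total, subic_total)   # 21 25
```

Both sources assemble to 9 instructions, but the gap in executed instructions grows by one for every extra iteration.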

The subic. loop is needed when, let's say, your code's default instruction resides at an address that is inside a CTR loop. Obviously, the CTR wouldn't be safe for use, and you will have to use the subic. loop. Happy coding!

  ASM Tips n Trix
Posted by: Vega - 12-04-2018, 02:44 AM - Forum: PowerPC Assembly - No Replies

This thread will be a list of mini-guides/tips to help shorten or optimize your ASM codes. This is tailored towards a Coder who has recently started learning ASM.



I. Using Offset Values to complete Memory Addresses

Let's say we want to load the word from memory address 0x80001650. A beginner might write the following instructions....

Code:
lis r12, 0x8000
ori r12, r12, 0x1650
lwz r11, 0 (r12)

This is not completely optimized. The use of the ori instruction is unnecessary. We can shorten this...

Code:
lis r12, 0x8000
lwz r11, 0x1650 (r12)

As you can see we have shortened the source. Now let's go over a case where you need to write a load/store instruction, but your Offset Value (SIMM) would exceed the 16-bit signed range (-0x8000 thru 0x7FFF, which is 0xFFFF8000 thru 0x00007FFF once sign-extended). We have the following source...

Code:
#Load word value from 0x8028CF08
lis r12, 0x8028 #Set the upper bits
ori r12, r12, 0xCF08 #Set lower bits too or else we will exceed the 16-bit signed range
lwz r11, 0 (r12) #Load word into r11

Here's a simple trick to do if your offset value needs to be 0x8000 or higher:

Code:
#Load word value from 0x8028CF08
lis r12, 0x8029 #Add one to your upper 16 bit original value (0x8028 + 1)
lwz r11, 0xFFFFCF08 (r12) #Simply pre-pend the offset value with 0xFFFF. This is known as 'sign-extending'.
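
The 'add one to the upper half' rule can be checked with a few lines of Python. This sketch splits any 32-bit address into the lis value and the signed 16-bit offset. split_address is a hypothetical helper for illustration only.

```python
def split_address(address):
    """Return (lis value, signed 16-bit offset) that together rebuild address."""
    lower = address & 0xFFFF
    upper = address >> 16
    if lower >= 0x8000:              # offset turns negative once sign-extended
        lower -= 0x10000
        upper = (upper + 1) & 0xFFFF # compensate by adding one to the upper half
    return upper, lower


upper, offset = split_address(0x8028CF08)
print(hex(upper), hex(offset))       # 0x8029 -0x30f8
```

The offset of -0x30F8 is the same 16-bit pattern as 0xCF08, and 0x80290000 - 0x30F8 lands back on 0x8028CF08.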



II. 'Register into a Register'

Let's say we have the following instruction...

Code:
lwz r11, 0x00AC (r12)

However, after this instruction, let's pretend we are no longer obligated to use r12. Well then there's no need to waste the use of r11. Especially, if we need that register for a different instruction later. Therefore you should do this instead...

Code:
lwz r12, 0x00AC (r12)



III. Using a singular lis instruction for multiple loading/storing

Let's say we have the following instructions...

Code:
lis r12, 0x8000
lwz r11, 0x1500 (r12)
lis r10, 0x8000
lwz r9, 0x1800 (r10)

We have a redundant instruction. We are executing essentially the same lis instruction for two different registers. Do this instead...

Code:
lis r12, 0x8000
lwz r11, 0x1500 (r12)
lwz r9, 0x1800 (r12)

Now we've saved the use of r10.



IV. Optimizing Codes made by Read Breakpoints

Let's say you did a Memory Read Breakpoint and you end up with the following default instruction...

Code:
lwz r5, 0x1778 (r30)

And you want to change the value of r5. A beginner coder might write something like this...

Code:
li r5, 0xC #Custom r5 value
stw r5, 0x1778 (r30) #Make sure new r5 value is in memory
lwz r5, 0x1778 (r30) #Default Instruction

This is redundant. There's no need to take our new r5 value, store it to memory, and then immediately load it back from memory. Remove both the stw and lwz instructions. You are left with this...

Code:
li r5, 0xC

In some cases, a code may require that the value in the register must also be in memory. If that is the case, you will write the source like this..

Code:
li r5, 0xC
stw r5, 0x1778 (r30)

There's still no need to have the default instruction.



V. Optimizing Branch Routes

We have the following list of instructions...

Code:
cmpwi r21, 0x1
beq- the_label
b finish_code

the_label:
li r28, 0x14

finish_code:
stb r28, 0x2 (r30)

This is not fully optimized branch routing. There's no need to have two label names, you can do this instead...

Code:
cmpwi r21, 0x1
bne+ finish_code

li r28, 0x14

finish_code:
stb r28, 0x2 (r30)

As you can see, if r21 is equal to one, it will continue down to the li instruction. This is more efficient than making two whole separate branch labels/routes.



VI. Avoiding Pushing/Popping the Stack

What some beginner coders will do (when needing extra registers in a code) is use the method of 'pushing/popping' the stack. Info for this is HERE. This will cause any code to naturally have more lines of compiled code. It is nice to have free registers, but if you are wanting to cut down the length of code, you should avoid the push/pop stack method.

We know r11 and r12 are always free for use without restoration (99% of the time). You can also use a volatile register (r3 thru r10), and restore its original value at the end of your code. However, finding a volatile register that has the same value every time the ASM instruction is executed (test this via a breakpoint over and over again) is actually rare.

Instead, you can use more registers (without restoring their original values), by looking ahead at further ASM instructions in comparison to your code's address. For example...let's say we have a code address of 0x80456000, and we have the following addresses plus ASM instructions.

Code:
0x80456000 lwz r4, 0 (r5) #Default Instruction, Address of Code
0x80456004 add r23, r6, r9
0x80456008 mflr r0
0x8045600C cmpwi r31, 0x1

If you have an address that has a loading type instruction (lwz, lhz etc) as the default instruction, and you are able to have the default instruction at the end of the source, you can use r4 (for our example). r4 is free w/o restoration because it will get written to anyway.

r23 is also free, because it will get written to later. Same with r0. Obviously, we can't use r5, r6, r9, r31, because they are being used as variables for the other instructions. So using them even with restoring their original values is really not safe.

So with the instructions listed above, our list of free registers would be r0, r4, r11, r12, and r23. Which will most likely be enough to not have to push/pop the stack.



VII. Optimizing conditions with the Record (dot) Shortcut

We have the following source...

Code:
lwz r5, 0x1AA8 (r31)
add r6, r6, r5
cmpwi r6, 0x0
bne+ some_label

Certain ASM instructions can have a dot (.) added to them. This is known as 'Record'. Record is a shortcut for cmpwi rD, 0. D = whatever register you are using for the comparison. Please note that there's no way I can list all the instructions that do or do not have the Record shortcut option. Refer to an actual ASM handbook/reference for assistance.

The add instruction has the ability to equip this Record feature. Like this...

Code:
lwz r5, 0x1AA8 (r31)
add. r6, r6, r5 #Notice the dot appended to add
bne+ some_label

  Mii Name Extender [Vega]
Posted by: Vega - 12-03-2018, 05:19 PM - Forum: Incomplete & Outdated Codes - No Replies

NOTE: Outdated by Star's version. Star's version is shorter and it's region free.

This code will allow you to put in a custom Mii name when online. Only you can see it. You also have the ability to extend the max length of the Mii name from 10 to 29 characters. If you don't want the max length, just fill in the unused values with 0's.

NTSC-U
C25DA7B0 0000000E
7C0802A6 38830066
48000045 0000WXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
00000000 7D8802A6
A56C0002 B5640002
2C0B0000 4082FFF4
7C0803A6 8003006C
60000000 00000000

PAL
C25FB094 0000000E
7C0802A6 38830066
48000045 0000WXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
00000000 7D8802A6
A56C0002 B5640002
2C0B0000 4082FFF4
7C0803A6 8003006C
60000000 00000000

NTSC-J
C25FA970 0000000E
7C0802A6 38830066
48000045 0000WXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
00000000 7D8802A6
A56C0002 B5640002
2C0B0000 4082FFF4
7C0803A6 8003006C
60000000 00000000

NTSC-K
C25E94B4 0000000E
7C0802A6 38830066
48000045 0000WXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
WXYZWXYZ WXYZWXYZ
00000000 7D8802A6
A56C0002 B5640002
2C0B0000 4082FFF4
7C0803A6 8003006C
60000000 00000000

WXYZ = Mii Character ASCII Value

Example ASCII Values:
0020 = Space
0041 = A
0061 = a
E017 = DSi Heart

Use 0000 for unfilled values if you don't want to use the full length of 29 characters.
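
If filling in the WXYZ values by hand gets tedious, a small script can produce the eight data lines of the code (from the '48000045 0000WXYZ' line down through the seven 'WXYZWXYZ WXYZWXYZ' lines). The Python sketch below is an unofficial helper, assuming the characters are big-endian UTF-16. mii_code_lines is a made-up name.

```python
def mii_code_lines(name, max_chars=29):
    """Return the 8 code lines that hold the Mii Name, padded with 0000."""
    raw = name.encode("utf-16-be")   # assumption: 16-bit ASCII == big-endian UTF-16
    halfwords = [raw[i:i + 2].hex().upper() for i in range(0, len(raw), 2)]
    if len(halfwords) > max_chars:
        raise ValueError("name is longer than the code allows")
    halfwords += ["0000"] * (max_chars - len(halfwords))

    # First character shares a line with the bl and the leading 0000
    lines = ["48000045 0000" + halfwords[0]]
    # Remaining 28 characters fill seven lines of four halfwords each
    for i in range(1, max_chars, 4):
        a, b, c, d = halfwords[i:i + 4]
        lines.append(a + b + " " + c + d)
    return lines


for line in mii_code_lines("Vega"):
    print(line)
```

The first two printed lines for "Vega" are 48000045 00000056 and 00650067 00610000. The remaining six lines are all zeros.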



Source (using 01230123012301230123456789905 as the Mii Name):

#Address ports
# 805DA7B0 = NTSC-U
# 805FB094 = PAL
# 805FA970 = NTSC-J
# 805E94B4 = NTSC-K

#Safe registers
#r0, r4, r11, r12

#Save LR, fyi: r0 good to use for this instruction
mflr r0

#Start of Mii Name is at r3+0x68 (where loop writing starts at)
#Mii Name characters are a halfword apiece, loop will transfer a halfword at a time
#Thus use r4 to point to r3+0x66
addi r4, r3, 0x66

#Use BL Trick to write out Mii Name
bl mii_name

.short 0x0000
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0030003100320033
.llong 0x0034003500360037
.llong 0x0038003900390030
.short 0x0035
.short 0x0000
.short 0x0000

mii_name:
mflr r12

the_loop:
lhzu r11, 0x2 (r12) #Load Mii Data from BL Trick
sthu r11, 0x2 (r4) #Store Mii Data to dynamic memory
cmpwi r11, 0 #Check for null halfword (end of Mii Data)
bne+ the_loop #If NOT null, keep loop going

mtlr r0 #Move to Link Register, this copies r0's value (original LR) to the Link Register
lwz r0, 0x006C (r3) #Default Instruction



Code creator: Vega
Code credits: Star (used his Mii Extender code to set up a Breakpoint)


  Friend Roster Friend Code Modifier [Vega]
Posted by: Vega - 12-02-2018, 01:28 AM - Forum: Online Non-Item - No Replies

Friend Roster Friend Code Modifier [Vega]

This code lets you set a fully customized FC value for every friend on your friend roster. If you use the code without a full list of 30 friends, the rest of your list is filled with entries that use the FC from the code and show "?" for their Mii images.

NTSC-U
C25C57BC 00000004
3D80XXXX 618CWWWW
91840000 3D80YYYY
618CZZZZ 91840004
7C03002E 00000000

PAL
C25D28D8 00000004
3D80XXXX 618CWWWW
91840000 3D80YYYY
618CZZZZ 91840004
7C03002E 00000000

NTSC-J
C25D21B4 00000004
3D80XXXX 618CWWWW
91840000 3D80YYYY
618CZZZZ 91840004
7C03002E 00000000

NTSC-K
C25C0A74 00000004
3D80XXXX 618CWWWW
91840000 3D80YYYY
618CZZZZ 91840004
7C03002E 00000000

XXXXWWWWYYYYZZZZ = Desired FC value in Hex

Example: You want 0123-4567-8901 for the FC. Drop the dashes to get the decimal value 012345678901, then put that number into a Dec to Hex converter. The Hex value is 2DFDC1C35. Now pad it with leading zeros to get the 64-bit (16-digit) Hex value to put in the code. The final XXXXWWWWYYYYZZZZ Hex value is 00000002DFDC1C35 (XXXX = 0000, WWWW = 0002, YYYY = DFDC, ZZZZ = 1C35).
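
The same conversion can be sketched in a few lines of Python. The function name fc_fields is made up for this example; it only does the dec-to-hex step and the split into the four fields described above.

```python
def fc_fields(fc):
    """Split a friend code like '0123-4567-8901' into (XXXX, WWWW, YYYY, ZZZZ).

    Hypothetical helper: treats the FC as one decimal number and
    zero-pads its hex form to 16 digits (64 bits).
    """
    value = int(fc.replace("-", ""))  # leading zeros are fine for int()
    if value >= 1 << 64:
        raise ValueError("value does not fit in 64 bits")
    hex64 = f"{value:016X}"
    return hex64[0:4], hex64[4:8], hex64[8:12], hex64[12:16]


if __name__ == "__main__":
    xxxx, wwww, yyyy, zzzz = fc_fields("0123-4567-8901")
    print(f"3D80{xxxx} 618C{wwww}")
    print(f"91840000 3D80{yyyy}")
    print(f"618C{zzzz} 91840004")
```

The three printed lines are the middle lines of the code with the fields filled in; the first line (C2 address) and the last line stay as listed for your region.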

Source:
lis r12, 0xXXXX #Load XXXX value into upper 16 bits of r12, lower 16 bits are cleared
ori r12, r12, 0xWWWW #Load WWWW value into lower 16 bits of r12
stw r12, 0 (r4) #Store the word of r12 to address of r4
lis r12, 0xYYYY #Load YYYY value into upper 16 bits of r12, lower 16 bits are cleared
ori r12, r12, 0xZZZZ #Load ZZZZ value into lower 16 bits of r12
stw r12, 0x0004 (r4) #Store the word of r12 to address of r4 plus offset of 0x4
lwzx r0, r3, r0 #Default Instruction

Code creator: Vega


  Friend Roster Globe Location Modifier [Vega]
Posted by: Vega - 12-02-2018, 01:18 AM - Forum: Online Non-Item - No Replies

Friend Roster Globe Location Modifier [Vega]

This code will allow you to put in any globe location value you want for all friends of your friend roster online.

NTSC-U
C25C57C4 00000003
7C600379 3D80ZZZZ
618Czzzz 9184007C
60000000 00000000

PAL
C25D28E0 00000003
7C600379 3D80ZZZZ
618Czzzz 9184007C
60000000 00000000

NTSC-J
C25D21BC 00000003
7C600379 3D80ZZZZ
618Czzzz 9184007C
60000000 00000000

NTSC-K
C25C0A7C 00000003
7C600379 3D80ZZZZ
618Czzzz 9184007C
60000000 00000000

ZZZZzzzz = Globe Location Value

Source:
or. r0, r3, r0 #Default Instruction
lis r12, 0xZZZZ #Load ZZZZ value into upper 16 bits of Register 12, lower 16 bits are cleared
ori r12, r12, 0xzzzz #Load zzzz value into lower 16 bits of Register 12
stw r12, 0x007C (r4) #Store the word of Register 12 to address of Register 4 plus offset 0x7C

Code creator: Vega


  Friend Roster Country Flag Modifier [Vega]
Posted by: Vega - 12-02-2018, 01:12 AM - Forum: Online Non-Item - No Replies

Friend Roster Country Flag Modifier [Vega]

This code will allow you to put in any country code (flag) value you want for all friends of your friend roster online. For example, setting the code to country code value 31 (hex), will give the USA flag for every person of your friend roster.

NTSC-U
C25C57C0 00000002
80640004 3D80XX00
91840078 00000000

PAL
C25D28DC 00000002
80640004 3D80XX00
91840078 00000000

NTSC-J
C25D21B8 00000002
80640004 3D80XX00
91840078 00000000

NTSC-K
C25C0A78 00000002
80640004 3D80XX00
91840078 00000000

XX = Country Code (in Hex)

Source:
lwz r3, 0x0004 (r4) #Default Instruction
lis r12, 0xXX00 #Load XX00 (XX - country code value) into upper 16 bits of Register 12, lower 16 bits are cleared
stw r12, 0x0078 (r4) #Store the word of Register 12 to address of Register 4 plus offset of 0x78
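
As a quick check of where XX goes, here is a small Python sketch that fills the country code into the lis word (3D80XX00) and prints the two data lines of the code. The function name country_flag_lines is made up for this example.

```python
def country_flag_lines(country_code):
    """Return the two data lines of the code for a given country code byte.

    Hypothetical helper: 0x3D800000 is 'lis r12, 0' and the country code
    lands in the upper byte of the 16-bit immediate (0xXX00).
    """
    if not 0 <= country_code <= 0xFF:
        raise ValueError("country code must be a single byte")
    lis_word = 0x3D800000 | (country_code << 8)
    return [f"80640004 {lis_word:08X}", "91840078 00000000"]


if __name__ == "__main__":
    # 0x31 is the USA example from the description above.
    for line in country_flag_lines(0x31):
        print(line)
```

Put the C2 line for your region in front of the two printed lines.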

Code creator: Vega


  Speed-O-Meter [Vega]
Posted by: Vega - 11-30-2018, 10:32 PM - Forum: Incomplete & Outdated Codes - No Replies

Speed-O-Meter [Vega]

NOTE: Outdated by mdmwii's version, which also works in Grand Prix and ghost TTs.

Works for TTs and any type of Online VS. If using for TTs, it only works for Solo Racing.

This code puts your speed in the milliseconds section of your timer. It works with any vehicle/character combo, and reverse speed also works. All speed values are rounded to the nearest whole number before being shown on the timer. This gets rid of the '96/97' issue with Funky Kong/Flame Runner (codes that don't fix this issue show 96 for both Daisy/Mach and Funky Kong/Flame Runner at max wheelie speed).

NTSC-U
C2701160 00000002
3D808000 D02C1660
D03F00EC 00000000
04531090 3D808000
04531094 C00C1660
0053109B 0000001C

PAL
C2707B04 00000002
3D808000 D02C1660
D03F00EC 00000000
04535BD8 3D808000
04535BDC C00C1660
00535BE3 0000001C

NTSC-J
C2707170 00000002
3D808000 D02C1660
D03F00EC 00000000
04535558 3D808000
0453555C C00C1660
00535563 0000001C

NTSC-K
C26F5EAC 00000002
3D808000 D02C1660
D03F00EC 00000000
04523C30 3D808000
04523C34 C00C1660
00523C3B 0000001C

Source (For ASM code):
lis r12, 0x8000 #Set first half address in r12 to store the floating value to in memory
stfs f1, 0x1660 (r12) #Store the floating-single value of FPR 1 at address 0x80001660, offset used to complete 2nd half of address
stfs f1, 0x00EC (r31) #Default ASM, store the floating-single value to address of r31 plus offset of 0xEC

Source and explanation of how the 04 lines work (major props to mdmwii for figuring this out on his own in 2009):
First 04 line: 3D808000 # (lis r12, 0x8000) Normally the game at this address subtracts FPR 2 from FPR 0. That isn't needed, since at the next address we load our floating value manually. Therefore we use this address to set the 1st half of the address in r12, which is then used to load our floating single from memory.

Second 04 line: C00C1660 # (lfs f0, 0x1660 (r12)) Normally the game adds FPR 1 to FPR 0 here, storing the result back into FPR 0. With this write, FPR 0 instead holds our stored speed value before the next instruction (fctiwz) is executed; that instruction converts the f0 value to an integer. The instructions after the fctiwz store the integer-converted float to the stack, then load the integer word back off the stack (0x4 offset added) into r5. r5 then contains the integer value to display in the milliseconds field.

Only 00 line: 0000001C # This changes the instruction in memory from fctiwz f0, f0 (FC00001E) to fctiw f0, f0 (FC00001C). fctiwz always rounds toward zero (truncates), while fctiw rounds using the current FPSCR rounding mode (round-to-nearest by default). The conversion from float to integer is therefore no longer truncated, which lets the code show 97 instead of 96 on the timer whenever Funky Kong w/ Flame Runner is at max wheelie speed.
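
To see the difference between the two conversions, here is a rough Python sketch. The speed value 96.7 is hypothetical (not read from the game), the function names are made up for this example, and fctiw is assumed to run in the default round-to-nearest mode.

```python
import math


def fctiwz_like(x):
    """Round toward zero (truncate), like fctiwz."""
    return math.trunc(x)


def fctiw_like(x):
    """Round to nearest, ties to even (assumed default FPSCR mode), like fctiw."""
    return round(x)


def patch_last_byte(word, new_byte):
    """Mimic the 00 line: overwrite only the last byte of a 32-bit instruction."""
    return (word & 0xFFFFFF00) | (new_byte & 0xFF)


if __name__ == "__main__":
    speed = 96.7  # hypothetical speed value
    print(fctiwz_like(speed), fctiw_like(speed))
    # fctiwz f0, f0 -> fctiw f0, f0
    print(f"{patch_last_byte(0xFC00001E, 0x1C):08X}")
```

The last print shows why a single byte write is enough: the two instructions differ only in their final byte.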

Code creator: Vega
Code contributor(s): mdmwii (address founder for the 04 lines and 00 line)


  Show Everyone's True Region ID [Vega]
Posted by: Vega - 11-27-2018, 10:07 PM - Forum: Online Non-Item - No Replies

Show Everyone's True Region ID [Vega]

I made this code because I got tired of seeing White Line (CTGP region IDs) everywhere on Wiimmfi. This code will put the correct line color (on your screen only) for all players in a race. The line color is based first on which version of the MKW Disc/ISO they are using, and second on their country code + globe position.

Some notes about using this code:
- This only affects the opponent's line color during the actual race. Live View is not affected by this code.
- This doesn't affect any opponent's geo location or country flag when you see them on the globe

NTSC-U
C261E0CC 00000015
88830184 2C040045
41820024 2C04004A
4182002C 2C04004B
4182001C 2C040050
41820038 3880000F
48000078 38800001
48000070 38800005
48000068 89830178
2C0C0080 41A2FFE0
2C0C00FF 41820030
38800000 4800004C
89830178 2C0C0041
4182003C 2C0C005F
41820034 2C0C00FF
4182001C 38800002
48000028 8983017C
2C0C0011 41A2FFA0
4BFFFFC8 8983017C
2C0C00E6 41820008
4BFFFFDC 38800003
60000000 00000000

PAL
C26513E0 00000015
88830184 2C040045
41820024 2C04004A
4182002C 2C04004B
4182001C 2C040050
41820038 3880000F
48000078 38800001
48000070 38800005
48000068 89830178
2C0C0080 41A2FFE0
2C0C00FF 41820030
38800000 4800004C
89830178 2C0C0041
4182003C 2C0C005F
41820034 2C0C00FF
4182001C 38800002
48000028 8983017C
2C0C0011 41A2FFA0
4BFFFFC8 8983017C
2C0C00E6 41820008
4BFFFFDC 38800003
60000000 00000000

NTSC-J
C2650A4C 00000015
88830184 2C040045
41820024 2C04004A
4182002C 2C04004B
4182001C 2C040050
41820038 3880000F
48000078 38800001
48000070 38800005
48000068 89830178
2C0C0080 41A2FFE0
2C0C00FF 41820030
38800000 4800004C
89830178 2C0C0041
4182003C 2C0C005F
41820034 2C0C00FF
4182001C 38800002
48000028 8983017C
2C0C0011 41A2FFA0
4BFFFFC8 8983017C
2C0C00E6 41820008
4BFFFFDC 38800003
60000000 00000000

NTSC-K
C263F6F8 00000015
88830184 2C040045
41820024 2C04004A
4182002C 2C04004B
4182001C 2C040050
41820038 3880000F
48000078 38800001
48000070 38800005
48000068 89830178
2C0C0080 41A2FFE0
2C0C00FF 41820030
38800000 4800004C
89830178 2C0C0041
4182003C 2C0C005F
41820034 2C0C00FF
4182001C 38800002
48000028 8983017C
2C0C0011 41A2FFA0
4BFFFFC8 8983017C
2C0C00E6 41820008
4BFFFFDC 38800003
60000000 00000000



Source:
####################
###START ASSEMBLY###
####################

#

#########################
##Grab Disc/ISO Version##
#########################

lbz r4, 0x0184 (r3) #Grab the byte that determines a user's Disc/ISO

########################
##Disc/ISO Byte Checks##
########################

cmpwi r4, 0x45 #Compare byte to 0x45 (USA game)
beq- usa_regid #If equal to 0x45, jump to usa_regid label

cmpwi r4, 0x4A #Compare byte to 0x4A (JAPAN game)
beq- jpn_ortwn #If equal to 0x4A, jump to jpn_ortwn label

cmpwi r4, 0x4B #Compare byte to 0x4B (KOREA game)
beq- kor_regid #If equal to 0x4B, jump to kor_regid label

cmpwi r4, 0x50 #Compare byte to 0x50 (PAL game)
beq- euro_oraus #If equal to 0x50, jump to euro_oraus label

####################
##White Line Label##
####################

##This label will apply white line as a safety net if the disc/iso byte is not recognized.##
##If somebody has a Taiwan white line, they will also get navigated to this label.##

white_line:
li r4, 0xF #Load 0xF into r4 (this is what the game uses as a safety net, if a region ID value is unreadable)
b the_end #White line applied, jump to the_end label

###################
##Blue Line Label##
###################

usa_regid:
li r4, 0x1 #Load 0x1 into r4
b the_end #Blue line applied, jump to the_end label

#####################
##Purple Line Label##
#####################

kor_regid:
li r4, 0x5 #Load 0x5 into r4
b the_end #Purple line applied, jump to the_end label

#################################
##JAPAN Game Country Code Check##
#################################

##Since Taiwan region is part of the Japan disc/ISO, we need to check the person's country code to see if they have the Taiwan country code##
##That is how Taiwan region is reached legitimately.##
##If the user hasn't set their flag, their country code will be 0xFF. Meaning we will later need to check their globe position##

jpn_ortwn:
lbz r12, 0x0178 (r3) #Load country code from user's USER RECORD

cmpwi r12, 0x80 #Compare country code to Taiwan's country code (0x80)
beq- white_line #If player has Taiwan country code, jump to white_line label

cmpwi r12, 0xFF #See if player hasn't set their flag
beq- check_globe1 #If country code value is 0xFF, user hasn't set flag. Jump to check_globe1 label

##If user has flag set with a country code other than Taiwan, we know to give that user a red line.##
##Proceed down to Red Line Label##

##################
##Red Line Label##
##################

its_jpn:
li r4, 0x0 #Load 0x0 into r4
b the_end #Red line applied, jump to the_end label

###############################
##PAL Game Country Code Check##
###############################

##Since AUS/NZ region is part of the PAL disc/ISO, we need to check the person's country code to see if they have the AUS or NZ country code##
##That is how AUS/NZ region is reached legitimately##
##If the user hasn't set their flag, their country code will be 0xFF. Meaning we will later need to check their globe position##

euro_oraus:
lbz r12, 0x178 (r3) #Load country code from user's USER RECORD

cmpwi r12, 0x41 #Compare country code to Australia's country code (0x41)
beq- its_aus #If player has Australia country code, jump to its_aus label

cmpwi r12, 0x5F #Compare country code to New Zealand's country code (0x5F)
beq- its_aus #If player has New Zealand country code, jump to its_aus label

cmpwi r12, 0xFF #See if player hasn't set their flag
beq- check_globe2 #If country code value is 0xFF, user hasn't set flag. Jump to check_globe2 label

####################
##Green Line Label##
####################

its_eur:
li r4, 0x2 #Load 0x2 into r4
b the_end #Green line applied, jump to the_end label

########################################
##Globe Position Checks for JAPAN Game##
########################################

check_globe1:
lbz r12, 0x017C (r3) #Load 1st byte value of word of Globe Position from user's USER RECORD

cmpwi r12, 0x11 #Compare byte to 0x11 (Taiwan Region ID default geo-location of Taipei City)
beq- white_line #If equal to 0x11, we know to give the user a white line. Jump to white_line label

b its_jpn #If user doesn't have the Taiwan Region ID default geo-location, we know to give them red line. Jump to its_jpn label.

######################################
##Globe Position Checks for PAL Game##
######################################

check_globe2:
lbz r12, 0x17C (r3) #Load 1st byte value of word of Globe Position from user's USER RECORD

cmpwi r12, 0xE6 #Compare byte to 0xE6 (AUS/NZ Region ID default geo-location of Australian Capital Territory)
beq- its_aus #If equal to 0xE6, we know to give the user a yellow line. Jump to its_aus label

b its_eur #If user doesn't have the AUS/NZ Region ID default geo-location, we know to give them green line. Jump to its_eur label

#####################
##Yellow Line Label##
#####################

its_aus:
li r4, 0x3 #Load 0x3 into r4, no need to add a branch instruction, since the next step below is the the_end label.

#################
##The End Label##
#################

the_end: #Default Instruction not needed

#

##################
###END ASSEMBLY###
##################
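
For anyone who finds the branching hard to follow, the same decision tree can be sketched in Python. This is only a readable restatement of the assembly above, not something the code uses; the function and constant names are made up for this example.

```python
# Region ID values that the assembly loads into r4
WHITE, BLUE, PURPLE, RED, GREEN, YELLOW = 0xF, 0x1, 0x5, 0x0, 0x2, 0x3


def line_color(disc, country, globe_byte):
    """Return the region ID for a player.

    disc       - byte at 0x184 of the USER RECORD (Disc/ISO version)
    country    - byte at 0x178 (country code, 0xFF if flag not set)
    globe_byte - byte at 0x17C (1st byte of the globe position word)
    """
    if disc == 0x45:  # USA game
        return BLUE
    if disc == 0x4B:  # KOREA game
        return PURPLE
    if disc == 0x4A:  # JAPAN game, may be Taiwan
        if country == 0x80:
            return WHITE
        if country == 0xFF and globe_byte == 0x11:
            return WHITE
        return RED
    if disc == 0x50:  # PAL game, may be AUS/NZ
        if country in (0x41, 0x5F):
            return YELLOW
        if country == 0xFF and globe_byte == 0xE6:
            return YELLOW
        return GREEN
    return WHITE  # safety net for an unrecognized disc/ISO byte


if __name__ == "__main__":
    # PAL disc, flag not set, default AUS/NZ globe position
    print(line_color(0x50, 0xFF, 0xE6))
```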



Code creator: Vega
Code contributor(s): Star (subroutine founder)


  Hi
Posted by: fall - 11-26-2018, 11:50 PM - Forum: Introductions - Replies (1)

Hello, I'm a junior in high school and competitive Mario Kart Wii player. Just continuing my everlasting journey to get Korean text working on my disc  At
