Get Square Root of Any Hex Value: PPC ASM
#1
https://github.com/VegaASM/SquareRootHex-PPCASM

Another small function I wrote up in my free time. It takes any 32-bit hex value in r3 and returns its square root in r3. Set r4 to 1 to round to the nearest whole number; any other value truncates (rounds toward zero). Licensed under the Apache License 2.0.

Code:
/*
Copyright 2020 VegaASM

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0
    
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

#~~~~~~~~~~~~~~~~#
# START ASSEMBLY #
#~~~~~~~~~~~~~~~~#

hex_square_root:

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# If 0/1, return 0/1 respectively. If not, backup r3 #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

cmplwi r3, 1
blelr-
mr r0, r3

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Calculate Word/24bit/Halfword/Byte #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

cntlzw r3, r0

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Calc How Many times Secondary Loop is Done #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

cmplwi r3, 8
li r5, 4
li r6, 24
blt- start_with_msbyte #Word value found

cmplwi r3, 16
li r5, 3
li r6, 16
blt- start_with_msbyte #24-bit value found

cmplwi r3, 24
li r5, 2
li r6, 8
blt- start_with_msbyte #Halfword value found

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Byte value only option left #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

li r5, 1
li r6, 0

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Start with the Most Significant Byte #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

start_with_msbyte:
srw r7, r0, r6

#~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Set One Above Digit Limit #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~#

li r8, 0x10

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# First type of Loop, No matter what Value is, this 1st type is only done once #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Square the r8 value, Proceed once result is not above r7 #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

first_type_loop:
subi r8, r8, 1
mullw r9, r8, r8
cmplw r9, r7
bgt- first_type_loop

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Loop Done, Subtract r9 from r7 to get Remainder, Build r3 Result w/ r8's value #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

sub r7, r7, r9
mr r3, r8

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Take r8's result from the above loop, and add it to itself (aka times 2) #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

add r8, r8, r8

#~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Big (Secondary Type) Loop #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Do Certain Rotate Instruction Based on How Many Loops are Left #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

big_secondary_loop:
cmpwi r5, 4
rlwinm r6, r0, 16, 0x000000FF
beq- build_into_leftovers

cmpwi r5, 3
rlwinm r6, r0, 24, 0x000000FF
beq- build_into_leftovers

cmpwi r5, 2
clrlwi r6, r0, 24
beq- build_into_leftovers

li r6, 0 #Fake decimal value

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Shift r7 leftover 8 bits to bring 'down' the next 'group' (byte) #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

build_into_leftovers:
slwi r7, r7, 8
or r7, r7, r6

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Shift r8 left by 4 bits to create the "Space" #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

slwi r8, r8, 4

#~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Set One Above Digit Limit #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~#

li r6, 0x10

small_secondary_loop:
subi r6, r6, 1

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Place r6 into r8's "space", Then Multiply. Proceed on first iteration when r10 is not above r7 #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

or r9, r8, r6
mullw r10, r9, r6
cmplw r10, r7 #Once we reach the first r6 where r10 <= r7, we can proceed
bgt- small_secondary_loop

#~~~~~~~~~~~~~~~~~~~~#
# Get Next Leftovers #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Adjust the Increasing Number that will be "Spaced" #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

sub r7, r7, r10
add r8, r9, r6

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Find out if we are doing Remainder of Whole Number #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# If so, we don't want to do the upcoming slwi and or instruction #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

cmpwi r5, 1
beq- decrement_loop

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Got Next Digit Result. Move r3 over by 4 bits left. OR in Digit (r6) #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

slwi r3, r3, 4
or r3, r3, r6

#~~~~~~~~~~~~~~~~~~~~#
# Decrement Big Loop #
#~~~~~~~~~~~~~~~~~~~~#

decrement_loop:
subic. r5, r5, 1
bne+ big_secondary_loop

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Check User Option For Rounding #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

cmpwi r4, 1
bnelr- #If user did not ask for rounding (r4 != 1), the truncated result is final, END function

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#
# Check Remainder Value of Whole Number #
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~#

cmpwi r6, 8
bltlr- #If remainder has most significant digit of less than 8 then nothing needs to be done, END function

addi r3, r3, 1 #If digit is 8 or more, we need to round up

#~~~~~~~~~~~~~~#
# END Function #
#~~~~~~~~~~~~~~#

blr

#~~~~~~~~~~~~~~#
# END ASSEMBLY #
#~~~~~~~~~~~~~~#
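
For anyone who finds the register shuffling hard to follow, here is a rough C model of what the assembly above appears to do: a longhand (digit-by-digit) square root worked in base 16, one byte group per pass, with one extra pass for the first fractional digit. The names (hex_sqrt_model, round_nearest, rem, root) are made up for this sketch and are not part of the original source.

```c
#include <stdint.h>

/* Sketch only: a C model of the base-16 longhand square root.
   round_nearest == 1 rounds to nearest, anything else truncates
   (this mirrors how r4 is tested in the assembly). */
uint32_t hex_sqrt_model(uint32_t n, int round_nearest) {
    if (n <= 1) return n;                 /* 0 and 1 are their own roots */

    /* Count how many bytes the value occupies (4, 3, 2 or 1). */
    int bytes = 4;
    while (bytes > 1 && (n >> (8 * (bytes - 1))) == 0) bytes--;

    uint32_t rem = 0;    /* running remainder (r7 in the assembly)   */
    uint32_t root = 0;   /* whole-number result so far (r3)          */
    uint32_t digit = 0;  /* hex digit found in the current pass (r6) */

    /* One pass per byte, then pass == -1 for the first fractional digit. */
    for (int pass = bytes - 1; pass >= -1; pass--) {
        uint32_t group = (pass >= 0) ? (n >> (8 * pass)) & 0xFF : 0;
        rem = (rem << 8) | group;          /* bring down the next byte  */
        uint32_t base = (2 * root) << 4;   /* doubled root with a free low digit */
        uint32_t prod;
        digit = 16;
        do {                               /* largest digit whose product fits */
            digit--;
            prod = (base | digit) * digit;
        } while (prod > rem);
        rem -= prod;
        if (pass >= 0) root = (root << 4) | digit;
    }

    /* digit now holds the first fractional hex digit; 8 or more means >= 0.5 */
    if (round_nearest == 1 && digit >= 8) root++;
    return root;
}
```

For example, hex_sqrt_model(0x8042A5B0, 0) gives 0xB534, and hex_sqrt_model(3, 1) gives 2 because the first fractional hex digit of the root is 0xB.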
#2
Here's my shorter (but not as fancy) take on this. It should be called as a function:

r3 = Input value
r4 = Rounding Type, 0 = Round down towards zero, 1 = Standard rounding

Raw code (Region Free)
28030001 4C810020
39800001 39600000
7CAA2B78 7D806378
398C0001 7CAC61D6
7D632851 4180FFEC
41820018 2C040000
41820014 7D4A1850
7D2A5851 41810008
7D806378 7C0C0378
7D836378 4E800020

Code:
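cmplwi r3, 1
blelr- #0 and 1 are their own square roots
li r12, 1 #r12 = candidate root
li r11, 0
loop:
mr r10, r5 #r10 = previous square (r5 is not yet set on the first pass)
mr r0, r12 #r0 = previous candidate
addi r12, r12, 1
mullw r5, r12, r12 #r5 = candidate squared
sub. r11, r5, r3 #r11 = square - input; cr0 is set by a signed compare against 0
blt+ loop #Keep going while square < input
beq- root_found #Exact square root
cmpwi r4, 0
beq round_down #r4 = 0, round down
sub r10, r3, r10 #Distance from input down to previous square
sub. r9, r11, r10 #Compare with distance up to current square
bgt round_down #Closer to previous candidate
root_found:
mr r0, r12
round_down:
mr r12, r0
mr r3, r12 #Return result in r3
blr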
Super Mario Eclipse, what Super Mario Sunshine could've been.
#3
Sweet. I'll run some test values on it soon.

EDIT: It appears word values greater than 0x80000000 won't work.

0x8042A5B0, 0xB1C4E390, and 0xFFFE0001 give wrong results.
#4
Interesting, I will look into that! Clearly it doesn't like negative values lol
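
A likely cause, sketched in C under the assumption that the problem is the `sub.` / `blt` pair in the loop: `sub.` sets cr0 by comparing the 32-bit difference against zero as a signed number, so for an input like 0x8042A5B0 the difference can look positive even though the square is still far smaller. The helper names below are hypothetical and only illustrate the two kinds of comparison.

```c
#include <stdint.h>

/* Mirrors `sub.` followed by `blt`: looks at the sign of the
   32-bit difference. The cast assumes the usual two's-complement
   wrap when converting to int32_t. */
int below_signed(uint32_t square, uint32_t n) {
    return (int32_t)(square - n) < 0;
}

/* Mirrors `cmplw` followed by `blt`: a plain unsigned compare. */
int below_unsigned(uint32_t square, uint32_t n) {
    return square < n;
}
```

With square = 4 and n = 0x8042A5B0, below_signed returns 0 (so the loop would stop on its first pass) while below_unsigned returns 1.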
#5
Either way, I am impressed at how small your source is compared to mine.

The other day I was consulting with a friend who has a PhD in mathematics about creating his own non-cryptographic hashing algorithm. I would then write his algorithm in PPC ASM. I don't know how good you are with hashes and complex math, but you could write your own hashing algorithm or rewrite a famous one as a good exercise to kill free time.

MKWii codes bore me to death now, lol. Everything I ever wanted has been made and released.
#6
I've been able to tell lol, I've slowly been losing interest in just simply "making codes". As such, I've learned Python, and I'm also branching out to other game projects.

Also, I've already found the issue, and will fix it soon.
#7
Yeah, I've been pondering the idea of partially leaving the MKWii community. I'll most likely go to Black Ops 4 on the PS4; I just want a game I can play by myself from time to time and camp/troll others for fun. Something to do with my spare time.

I have a few other MKWii codes in mind to do, but they require Starlet permission, and I don't want to go into the realm of learning ARM Assembly + IOS modifications for these codes to work by any loading method.

Other Wii games don't interest me, so making codes for other games is basically a no go for me. I have a feeling my PPC journey is coming to an end.
#8
Whew. That's gonna be weird. Are you still gonna stay active on this site?
#9
My activity will drop a bit, but I will still be around. Also, I will still be updating times for the MKWPP.
#10
Here is my new version of this. Call it as a function: r3 = number, r4 = rounding type (0 = Standard Rounding, 1 = Round Towards Zero).

(Region-Free)
7C651B78 3860FFFF
38630001 7CC319D6
7C062840 4180FFF4
28040001 4082000C
3883FFFF 4800001C
3883FFFF 7CE421D6
7CC53050 7CA72850
7C062840 4C810020
7C832378 4E800020


Code:
typedef int BOOL;
enum { FALSE, TRUE };

unsigned int squareRoot(unsigned int num, BOOL roundDown) {
    unsigned int i = 0;
    while (i * i < num) i = i + 1;
    if (roundDown == TRUE || (i * i) - num > num - (i - 1) * (i - 1)) return i - 1;
    else return i;
}
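
Two edge cases in the C above are worth checking: with roundDown set, an exact square such as 4 returns i - 1 (so 1 instead of 2), and for inputs above 0xFFFE0001 the product i * i wraps around if unsigned int is 32 bits. Below is a hedged variant that handles both. The name square_root_checked and the 64-bit counter are my own choices for this sketch, not the poster's, and the raw code above does not correspond to it.

```c
#include <stdint.h>

/* Sketch only: same linear search, with two guards added.
   round_down != 0 rounds toward zero, 0 rounds to nearest. */
uint32_t square_root_checked(uint32_t num, int round_down) {
    uint64_t i = 0;                    /* 64-bit so i * i cannot wrap */
    while (i * i < num) i++;           /* smallest i with i * i >= num */

    if (i * i == num) return (uint32_t)i;   /* exact root, nothing to round */

    uint64_t above = i * i - num;              /* distance up to i squared       */
    uint64_t below = num - (i - 1) * (i - 1);  /* distance down to (i-1) squared */
    if (round_down || above > below) return (uint32_t)(i - 1);
    return (uint32_t)i;
}
```

For example, square_root_checked(4, 1) returns 2, and square_root_checked(0xFFFFFFFF, 1) returns 0xFFFF.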