Mistake in the Broadway Manual regarding frsqrte
Most coders here are familiar with the Broadway Manual. If you have actually read through it, you will have found it littered with diagram mistakes and typos. Considering it's a preliminary document that was never meant for the public, this is more or less expected.

However, I came across a notable mistake (or rather, missing information) in the description of the frsqrte instruction. I've been working on a Broadway PPC Instruction Simulator recently and have to make sure it's as accurate as possible, so I've been combing through the manual.

According to the Broadway Manual (page 426), the frsqrte instruction does the following:
1. frB is used as input (64-bit input)
2. A reciprocal square root estimate is computed (accurate to within 1 part in 4096)
3. frD gets the result as 64-bit

What's odd is that there is no mention of how the instruction behaves with respect to the HID2 PSE bit. I've tried reading other chapters, and there's nothing explicit. FYI, for every single-float instruction (except for fabs, fneg, and fnabs), the behavior of the instruction varies depending on whether the PSE bit is low or high.

For example, the fmr instruction:
When HID2 PSE is low
1. Entire value of frB is copied to frD
When HID2 PSE is high
1. ps0 of frB is copied to ps0 of frD. ps1 of frD is left UNCHANGED.

Another example (fdivs):
When HID2 PSE is low
1. Entire value of frA is divided by entire value of frB
2. Result is placed in frD in 64-bit form, but with single precision of course
When HID2 PSE is high
1. ps0 of frA is divided by ps0 of frB
2. Result is placed into BOTH ps0 and ps1 of frD

Basically, in all the single-float math-type instructions, when HID2 PSE is high, ps0 is the input and both ps0 and ps1 receive the output. As you can see above, the fmr instruction is an odd ball, with frD's ps1 left unchanged. The frsp instruction is another odd ball, where frD's ps1 is left undefined (junk).
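
For anyone modeling this in a simulator, the fmr and fdivs behavior described above could be sketched roughly as follows. This is only an illustration in Python, not anything taken from the manual: the FPR class, the to_single helper, and the pse argument are hypothetical names chosen for the example. It assumes that with PSE low the whole 64-bit register value is represented by the ps0 field, and it ignores special cases (division by zero, NaNs, FPSCR flags).

```python
import struct
from dataclasses import dataclass


def to_single(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value.

    Raises OverflowError for values outside the single-precision range;
    a real simulator would have to produce infinities instead.
    """
    return struct.unpack(">f", struct.pack(">f", x))[0]


@dataclass
class FPR:
    # Assumption for this sketch: with PSE low, ps0 stands for the whole
    # 64-bit register and ps1 is simply not observable.
    ps0: float = 0.0
    ps1: float = 0.0


def fmr(frd: FPR, frb: FPR, pse: bool) -> None:
    # PSE low: entire value copied. PSE high: ps0 copied, ps1 of frD untouched.
    # In this representation both cases reduce to the same assignment,
    # so pse is accepted only to keep the signatures uniform.
    frd.ps0 = frb.ps0


def fdivs(frd: FPR, fra: FPR, frb: FPR, pse: bool) -> None:
    # The quotient is rounded to single precision in both modes.
    result = to_single(fra.ps0 / frb.ps0)
    frd.ps0 = result
    if pse:
        # PSE high: the result is duplicated into ps1 as well.
        frd.ps1 = result
```

With f1 = FPR(6.0, 9.0) and f2 = FPR(3.0, 7.0), calling fdivs on a destination of FPR(1.25, 1.25) with pse=True leaves 2.0 in both halves, while fmr from f1 leaves the destination's ps1 at 1.25.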

The Broadway Manual has zero information about what frsqrte does with respect to HID2 PSE. Nothing. So I ran some tests on real hardware, and this is what happens:

When HID2 PSE is low
1. Entire 64-bit frB used for input
2. frD gets result as 64-bits

When HID2 PSE is high
1. ps0 of frB is used for input
2. ps0 of frD gets result. ps1 of frD is left UNCHANGED
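
In simulator terms, the observed behavior above could be modeled along these lines. This is a rough Python sketch under stated assumptions: FPR and the pse argument are hypothetical names, the whole 64-bit register is represented by ps0 when PSE is low, and the exact value of 1/sqrt(x) stands in for the hardware's estimate (real hardware returns an approximation, so the bit pattern will differ). Zero, negative, and NaN inputs are not handled.

```python
import math
from dataclasses import dataclass


@dataclass
class FPR:
    # Assumption for this sketch: with PSE low, ps0 stands for the whole
    # 64-bit register and ps1 is not observable.
    ps0: float = 0.0
    ps1: float = 0.0


def frsqrte(frd: FPR, frb: FPR, pse: bool) -> None:
    # Stand-in for the hardware estimate: the exact reciprocal square root.
    estimate = 1.0 / math.sqrt(frb.ps0)
    frd.ps0 = estimate
    # ps1 of frD is deliberately NOT written, even when pse is True.
    # That matches the fmr pattern rather than the fdivs pattern.


# Same register setup as the hardware test: frB = (4.0, 0.0), frD = (1.25, 1.25)
f1 = FPR(4.0, 0.0)
f2 = FPR(1.25, 1.25)
frsqrte(f2, f1, pse=True)
print(f2)
```

A model that instead followed the fdivs pattern would overwrite ps1 of f2 here, which is not what the hardware test showed.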


If anyone cares, the code I used for confirmation is posted below. f1 is loaded with 4.0, 0.0 and f2 is loaded with 1.25, 1.25. I then perform an frsqrte (with HID2 PSE high) where f1 is frB and f2 is frD.

f2 resulted as 0x3EFFF400 3FA00000 (~0.4999, 1.25). ps0 of f2 isn't exactly 0.5; this is expected given the 1/4096 accuracy of the estimate.
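
To double-check those two words, they can be decoded as big-endian single-precision floats and compared against the exact answer. The snippet below is just a quick Python check (the helper name bits_to_single is made up for the example), not part of the test code itself.

```python
import struct


def bits_to_single(word: int) -> float:
    """Interpret a 32-bit integer as a big-endian IEEE 754 single."""
    return struct.unpack(">f", word.to_bytes(4, "big"))[0]


ps0 = bits_to_single(0x3EFFF400)  # frsqrte result in ps0 of f2
ps1 = bits_to_single(0x3FA00000)  # untouched ps1 of f2

exact = 1.0 / 4.0 ** 0.5          # exact reciprocal square root of 4.0
rel_error = abs(ps0 - exact) / exact

print(ps0, ps1)
print(rel_error, rel_error <= 1 / 4096)
```

ps0 decodes to roughly 0.49991, a relative error of 3/16384, which is inside the 1/4096 bound, and ps1 decodes to exactly 1.25.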

In conclusion, this information is missing from version 1.0 of the Broadway Manual, which is the latest version AFAIK.

Code:
#C2 Address 807BA164
#Pick up box, see result on screen

.set HID2, 920
.set PSE, 0x2000

#Check PSE bit of HID2
mfspr r3, HID2
andis. r0, r3, PSE
bne+ good_to_go

#Set r5 arg for OSFatal
bl setbadfatal
.asciz "NOTE! PSE of HID2 is low. Try again or try a diff hook address."
.align 2
setbadfatal:
mflr r5
b goodfatal

#Set our single float constants (r3 = 4.0, r4 = 0.0, r5 = 1.25)
good_to_go:
lis r3, 0x4080
ori r3, r3, 0x0000
li r4, 0
lis r5, 0x3FA0

#Store constants to memory, then load f1 = (4.0, 0.0) and f2 = (1.25, 1.25)
lis r31, 0x8000
ori r31, r31, 0x1500
stw r3, 0 (r31)
stw r4, 4 (r31)
stw r5, 8 (r31)
stw r5, 0xC (r31)
psq_l f1, 0 (r31), 0, 0
psq_l f2, 0x8 (r31), 0, 0

#Do reciprocal square root estimate (frD = f2, frB = f1)
frsqrte f2, f1

#Store results
psq_st f2, 0x0 (r31), 0, 0

#Load the fpr into GPRs r5 and r6 for sprintf
lwz r5, 0 (r31)
lwz r6, 4 (r31)

#Set r4 arg for sprintf
bl setsprintf
.asciz "%08X %08X"
.align 2
setsprintf:
mflr r4

#Set r3 arg for sprintf
addi r3, r31, 0x40

#Clear cr1 eq bit cuz no floats for sprintf
crclr 6

#Call sprintf
lis r12, 0x8001
ori r12, r12, 0x1A2C
mtctr r12
bctrl

#Set r5 arg for OSFatal (the string sprintf just wrote)
addi r5, r31, 0x40

#Setup OSFatal args
goodfatal:
bl setupfatal
.long 0xFFFFFFFF
.long 0
setupfatal:
mflr r3
addi r4, r3, 4

#Call OSFatal
lis r12, 0x801A
ori r12, r12, 0x4EC4
mtctr r12
bctr