Dolphin FPU Inaccuracies

Below is a code that displays output on your screen. Because Dolphin doesn't emulate the FPRs and certain floating-point instructions 100% correctly, the code produces different results on real hardware than on Dolphin.

The code is PAL only. To trigger it, start a race/battle and pick up a box.

C27BA164 00000021
7D98E2A6 758C2000
418200F8 3FE08000
63FF1500 38600000
907F0008 3C00C020
3C604084 3C804080
60008000 6063C000
60840000 901F0000
907F0004 909F000C
C01F0000 D81F0010
C83F0004 D03F0018
C05F000C FC001034
F01F001C 80BF0010
80DF0014 80FF0018
811F001C 813F0020
4800004D 52657375
6C74730A 0A6C6673
2B737466 64206653
3A202530 38582025
3038580A 6C66642B
73746673 2066533A
20253038 580A6672
73717274 65206644
3A202530 38582025
30385800 7C8802A6
387F0030 4CC63182
3D808001 618C1A2C
7D8903A6 4E800421
4800000D FFFFFFFF
000000FF 7C6802A6
38830004 38BF0030
3D80801A 618C4EC4
7D8903A6 4E800420
90770020 00000000
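As a sanity check on the hex above, here is a minimal Python sketch that decodes the first line of the code and recovers the text embedded in it. It assumes the usual Gecko C2 (insert ASM) layout, where the first word carries the codetype and a 25-bit offset from a base address of 0x80000000 and the second word is the number of 8-byte lines that follow. The function names and the `STRING_WORDS` constant are made up for illustration; the words themselves are copied from the code above.

```python
# Hypothetical helpers for illustration; not part of the Gecko code itself.

HEADER = "C27BA164 00000021"

# The words between the "bl" (4800004D) and the "mflr r4" (7C8802A6).
STRING_WORDS = """
52657375 6C74730A 0A6C6673 2B737466 64206653 3A202530
38582025 3038580A 6C66642B 73746673 2066533A 20253038
580A6672 73717274 65206644 3A202530 38582025 30385800
"""

def decode_c2_header(line):
    """Split a C2 header line into (codetype, hook address, line count)."""
    first, second = (int(word, 16) for word in line.split())
    codetype = (first >> 24) & 0xFE                   # 0xC2 = insert ASM
    address = 0x80000000 + (first & 0x01FFFFFF)       # assumed base address
    return codetype, address, second

def words_to_cstring(words):
    """Join hex words into bytes and cut at the first null terminator."""
    raw = bytes.fromhex("".join(words.split()))
    return raw.split(b"\x00", 1)[0].decode("ascii")

codetype, address, line_count = decode_c2_header(HEADER)
print(f"codetype {codetype:02X}, hook at {address:08X}, {line_count} lines")
print(words_to_cstring(STRING_WORDS))
```

The decoded address matches the PAL hook address given in the source (807BA164), and the decoded text is the sprintf format string from the source.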

The code exploits three Dolphin inaccuracies: Dolphin doesn't correctly emulate the stfd, lfd, and frsqrte instructions when the PSE bit is high in the HID2 special purpose register.
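The source checks the PSE bit with `andis.`, which ANDs the register with its 16-bit immediate shifted left by 16. A small Python sketch of that test follows. The mask comes from the `PSE` constant in the source (0x2000 in the upper 16 bits); the sample HID2 values are hypothetical and only there to exercise the check.

```python
# Mask used by "andis. r12, r12, 0x2000": the immediate lands in the upper half.
HID2_PSE = 0x2000 << 16

def pse_enabled(hid2):
    """Return True when the PSE bit is set in a 32-bit HID2 value."""
    return (hid2 & HID2_PSE) != 0

# Hypothetical HID2 values, chosen only to show both outcomes.
for sample in (0xA0000000, 0x80000000):
    print(f"HID2 {sample:08X}: PSE {'high' if pse_enabled(sample) else 'low'}")
```

When the result of the AND is zero, the source's `beq-` branches to skip_code and the experiment does not run.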

This is nothing to be concerned about, as the bug only surfaces as a result of poorly handwritten assembly. It therefore shouldn't affect how Wii games run on Dolphin, or at least I assume so.

Source:
Code:
#PAL = 807BA164

.set float1, 0xC0208000 #-2.5078125
.set float2, 0x4084C000 #4.1484375
.set float3, 0x40800000 #4.0
.set HID2, 920
.set PSE, 0x2000 #Upper 16-bits

#Just in case, make sure HID2 PSE is high; skip the code if it isn't
mfspr r12, HID2
andis. r12, r12, PSE
beq- skip_code

#Set r31 as 0x80001500 for EVA usage
lis r31, 0x8000
ori r31, r31, 0x1500

#FYI, the word at 0x80001508 must be and remain null for this experiment
li r3, 0
stw r3, 0x8 (r31)

#Place float1 at 0x80001500
#Place float2 at 0x80001504 #word at 0x80001508 must remain null
#Place float3 at 0x8000150C
lis r0, float1@h
lis r3, float2@h
lis r4, float3@h
ori r0, r0, float1@l
ori r3, r3, float2@l
ori r4, r4, float3@l
stw r0, 0 (r31)
stw r3, 0x4 (r31)
stw r4, 0xC (r31)

#Load float1 into f0 ps0 and ps1
lfs f0, 0 (r31)

#Store FPR using stfd
#On real hardware, ps0 (0xC0208000) is converted to its 64-bit form (0xC0041000 00000000), and that 64-bit float is stored.
#On Dolphin, this conversion doesn't occur
stfd f0, 0x10 (r31)

#Load float2 in f1 using lfd
#On real hardware, the 64-bit value 0x4084C000 00000000 (664.0) at the EVA is used for the load. It is converted to a 32-bit single float (0x44260000) and then placed into ps0, with ps1 left undefined.
#On Dolphin, this conversion doesn't occur
lfd f1, 0x4 (r31)

#Store ps0 only, don't worry about undefined ps1
stfs f1, 0x18 (r31)

#Load float3 single into f2
lfs f2, 0xC (r31)

#Now execute frsqrte, put result into f0. f0's original ps1 from earlier (0xC0208000) will be left alone.
#On real hardware, f0 will result as 0x3EFFF400 C0208000. ps0 isn't exactly 0.5 due to the 1/4096 accuracy of frsqrte
#On Dolphin, this doesn't occur
frsqrte f0, f2

#Store entire FPR. No conversion occurs for either ps, since both hold normalized values. At least that's the case on real hardware; the same can't be assumed for Dolphin.
psq_st f0, 0x1C (r31), 0, 0

#Load all the float results into r5 thru r9 for sprintf int args
lwz r5, 0x10 (r31)
lwz r6, 0x14 (r31)
lwz r7, 0x18 (r31)
lwz r8, 0x1C (r31)
lwz r9, 0x20 (r31)

#Set r4 src arg of sprintf
bl setupsprintf
.string "Results\n\nlfs+stfd fS: %08X %08X\nlfd+stfs fS: %08X\nfrsqrte fD: %08X %08X"
.align 2
setupsprintf:
mflr r4

#Set r3 dest arg of sprintf
addi r3, r31, 0x30

#Clear cr1's eq bit to tell sprintf there are no args in the fprs to use
crclr 6

#Call sprintf
lis r12, 0x8001
ori r12, r12, 0x1A2C
mtctr r12
bctrl

#Setup OSFatal args
bl setupfatal
.long 0xFFFFFFFF #text color; white
.long 0x000000FF #bg color; black
setupfatal:
mflr r3
addi r4, r3, 4
addi r5, r31, 0x30

#Call OSFatal
lis r12, 0x801A
ori r12, r12, 0x4EC4
mtctr r12
bctr

#Skip code; run the original instruction from the hook address
skip_code:
stw r3, 0x0020 (r23)
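
The real-hardware values quoted in the comments above can be checked with ordinary IEEE 754 arithmetic. The following Python sketch only reproduces the conversions the comments describe; it does not emulate Broadway or Dolphin, and the helper names are made up for illustration.

```python
import struct

def single_bits_to_double_bits(bits32):
    """Widen a 32-bit float pattern to the 64-bit pattern of the same value."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits32))
    (bits64,) = struct.unpack(">Q", struct.pack(">d", value))
    return bits64

def double_bits_to_single_bits(bits64):
    """Narrow a 64-bit float pattern to the 32-bit pattern of the same value."""
    (value,) = struct.unpack(">d", struct.pack(">Q", bits64))
    (bits32,) = struct.unpack(">I", struct.pack(">f", value))
    return bits32

def single_bits_to_value(bits32):
    """Interpret a 32-bit pattern as a single-precision float."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits32))
    return value

# lfs + stfd: float1 widened to a double before the store
stfd_expected = single_bits_to_double_bits(0xC0208000)

# lfd + stfs: float2 followed by the null word, read as a double, then narrowed
lfd_expected = double_bits_to_single_bits(0x4084C00000000000)

# frsqrte: the ps0 estimate quoted in the source, compared with the exact 0.5
frsqrte_estimate = single_bits_to_value(0x3EFFF400)
frsqrte_rel_error = abs(frsqrte_estimate - 0.5) / 0.5

print(f"lfs+stfd: {stfd_expected >> 32:08X} {stfd_expected & 0xFFFFFFFF:08X}")
print(f"lfd+stfs: {lfd_expected:08X}")
print(f"frsqrte : {frsqrte_estimate} (relative error {frsqrte_rel_error})")
```

The first two lines print C0041000 00000000 and 44260000, matching the comments, and the relative error of the quoted frsqrte estimate stays below 1/4096.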

In conclusion, while it may be easy to blame the devs and/or contributors of Dolphin for these inaccuracies, you have to take into account that the behavior of stfd, lfd, and frsqrte (when HID2 PSE is high) is completely absent from the Broadway Manual (at least, it is absent from both leaked versions, 0.8 and 1.0).
Dolphin FPU Inaccuracies - by Vega - 08-13-2021, 12:16 AM
