Pure C isn't

If you have an Atari ST which has been upgraded with a 68882 maths 
co-processor (FPU) you may have run into problems using programs 
compiled with Pure C or Pure Pascal.
The Pure C/Pascal compilers insert code in a program to detect the 
presence of an FPU and to use it if it is there.
On 68020/68030 equipped machines this code calls subroutines written in 
FPU machine code, but on a 68000 equipped machine it has to use 
subroutines which emulate the CPU/FPU communication protocol built into 
the 68020 and later chips.

The problem is that the code in Pure C/Pascal is not a true emulation 
of that protocol. The CPU should read a response from the FPU and react 
as requested, but the emulation code waits for an expected response and 
does nothing until it gets it. The 68882 in many cases gives a 
different response from that given by the 68881. This is because the 
68882 is 'pipelined'. It divides the execution of a floating point 
instruction into an interpretation/addressing phase and a calculation 
phase and is prepared to release the CPU to process in parallel with it 
at the end of the first phase.

The following programs have been patched and tested, and appear to work 
OK:

- Pre v1.5 versions of CAB. CAB was doing floating point calculations 
  (although Alexander denies it).
- Kandinsky: Contains 5 subroutines to be patched.
- HDDriver Utility program (partitioning won't work without the patch)
- DAVector (demo version)
- Zorg registered version (the un-registered version appears to unpack
  itself at run-time - in any case this patching program won't fix it)
- Five-to-Five
- Gemview v2.24 (and maybe other versions).
- Atari CD Master from Homa Systems House (INFOPEDI.PRG and 
  PLAYSND.PRG)

The patch program may well work on other problem programs but neither I 
nor Atari Computing accept any liability or responsibility for any 
direct or indirect damage that may arise, either financial, material or 
any other kind from either the use or misuse of this software and 
associated documentation. All trademarks used are recognised and 
acknowledged.

The following is a disassembly of the standard Pure C FPU subroutine 
for operations between two floating point inputs. The entry to the code 
is at line 3. On entry a0 and a1 point to two floating point operands 
in memory and d2 contains the opcode for the FPU instruction to be 
executed.

1   l0  move.w #3,-14(a2)   ; reset FPU (doesn't clear interrupts on
                            ; 68882)
2       movea.l d1,a2       ; restore saved a2 - see line 4

3       moveq #$7F,d0       ; set time-out value
                            ; **** entry point of subroutine ****
4       move.l a2,d1        ; save a2
5       lea -$5B0,a2        ; address of FPU operand register
6   l1  cmpi.w #$802,-16(a2) ; read FPU status - $802 means FPU idle
7       dbeq d0,l1          ; wait for idle
8       move.w #$4800,-6(a2) ; opcode - CPU tells FPU to accept an 
                             ; floating point number
9   l2  cmpi.w #$960C,-16(a2) ; FPU asks CPU to pass it 12 bytes
10      dbeq d0,l2          ; wait for response
11      move.l (a0),(a2)    ; a0 points to first floating point number 
                            ; in memory
12      move.l 2(a0),(a2)   ; Pure C packs floating point number into 
                            ; 10 bytes
13      move.l 6(a0),(a2)   ; last 4 bytes of floating point number
14  l3  cmpi.w #$802,-16(a2) ; read FPU status again
15      dbeq d0,l3          ; wait for idle
16      move.w d2,-6(a2)    ; pass opcode to FPU
17  l4  cmpi.w #$960C,-16(a2) ; FPU asks CPU to pass it 12 bytes
18      dbeq d0,l4          ; wait for response
19      move.l (a1),(a2)    ; a1 points to second floating point number 
                            ; in memory
20      move.l 2(a1),(a2)
21      move.l 6(a1),(a2)   ; last 4 bytes of floating point number
22  l5  cmpi.w #$802,-16(a2) ; read FPU status again
23      dbeq d0,l5          ; wait for idle
24      move.w #$6800,-6(a2) ; CPU asks FPU to hand over result
25  l6  cmpi.w #$B20C,-16(a2) ; FPU asks CPU to store 12 bytes
26      dbeq d0,l6          ; wait for response
27      BNE.S l0            ; try again if failed
28      move.l (a2),(a0)    ; store result
29      move.l (a2),2(a0)
30      move.l (a2),6(a0)
31      movea.l d1,a2       ; restore a2
32      rts                 ; and exit

The responses looked for at lines 9, 17 and 25 have meanings bit by 
bit. For example, the C at the end of $960C does ask for 12 bytes. 
However, the bit of interest is the msb, called the 'come again' bit.
The 68881 wants the CPU to check with it again so it has that bit set, 
giving $960C, whereas the 68882 responds with $160C at both lines 9 and 
17.  At line 25 the 68882 will respond with $320C EXCEPT when the 
operation creates an exception, such as division by zero, when it 
responds with $B20C, as does the 68881 at all times.
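
The bit layout described above can be checked with a short sketch. It 
is written in Python purely as an illustration and is not part of the 
patch program. The names COME_AGAIN, LENGTH_MASK and split_response are 
invented for this example, and only the two fields mentioned in the 
text are given a meaning; the remaining bits are passed through 
untouched.

```python
# Illustrative only: split an FPU response word into the fields the
# text discusses. All names here are invented for this sketch.
COME_AGAIN = 0x8000    # msb: the 'come again' bit
LENGTH_MASK = 0x00FF   # low byte: number of bytes to transfer


def split_response(word):
    """Return (come_again, other_bits, length) for a 16-bit response."""
    come_again = bool(word & COME_AGAIN)
    other_bits = word & 0x7F00          # high byte without the msb
    length = word & LENGTH_MASK
    return come_again, other_bits, length


def describe(word):
    """Format a response word as a one-line description."""
    come_again, other_bits, length = split_response(word)
    return "$%04X: come again=%s, other bits=$%04X, length=%d" % (
        word, come_again, other_bits, length)
```

Calling describe on $960C and $160C, or on $B20C and $320C, gives 
descriptions that differ only in the 'come again' field, which is the 
whole of the difference the emulation code trips over.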

Note that checking for the incorrect response does not of itself 
prevent the program from working on the 68882. The DBEQ loops at l2, l4 
and l6 will time out and the program will run, although slowly. It is 
the test at line 27 that is the killer. DBEQ does not alter the 
condition codes, so when the loop at l6 times out the zero bit is still 
clear from the last failed comparison. The branch is therefore taken and 
the code goes into an infinite loop, which is how the program dies.
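
The behaviour of the loop at l6 and the branch at line 27 can be 
modelled in a few lines. This is a simplified sketch in Python, not a 
68000 emulator: it assumes the FPU response stays the same for as long 
as it is polled, and the function names dbeq_wait and bne_taken are 
invented for the example.

```python
def dbeq_wait(response, expected, d0):
    """Model 'cmpi.w #expected,-16(a2)' followed by 'dbeq d0,loop'.

    Returns (zero_flag, d0) as they stand when the loop is left.
    Assumes the response word never changes while it is polled.
    """
    while True:
        zero = (response == expected)   # cmpi.w sets Z on a match
        if zero:
            return True, d0             # condition true: fall through
        d0 = (d0 - 1) & 0xFFFF          # otherwise decrement d0.w
        if d0 == 0xFFFF:                # counter reached -1: time-out
            return False, d0


def bne_taken(zero_flag):
    """BNE branches when the zero flag is clear."""
    return not zero_flag
```

With the 68881's $B20C the loop exits at once with the zero flag set and 
the branch is not taken. With the 68882's $320C, 
dbeq_wait(0x320C, 0xB20C, 0x7F) returns (False, 0xFFFF), so bne_taken 
is true and control goes back to l0 every time.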

The code can be made to work on both a 68881 and 68882 simply by 
replacing the BNE.S l0 ($668E) with an NOP ($4E71) but it will run 
considerably faster on a 68881. If the two occurrences of $960C are 
changed to $160C and the $B20C to $320C the 68882 will have the 
advantage.

Most Pure C or Pascal programs contain several similar subroutines, one 
for each type of floating point calculation the program performs. Each 
requires patching.
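
As a rough illustration of what patching every such subroutine 
involves, here is a sketch in Python that applies the changes described 
above to a program image held in memory. It is not the patch program 
itself. The function name patch_for_68882 is invented, and the sketch 
assumes each site looks like the disassembly above: a cmpi.w on 
-16(a1) or -16(a2), with the BNE.S following the 4-byte dbeq after the 
$B20C test.

```python
def patch_for_68882(image):
    """Patch a bytearray in place; return the number of cmpi.w changed.

    Looks for: 0C 69/6A <imm hi> <imm lo> FF F0
    i.e. cmpi.w #imm,-16(a1) or cmpi.w #imm,-16(a2).
    """
    patched = 0
    i = 0
    while i + 6 <= len(image):
        is_cmpi = image[i] == 0x0C and image[i + 1] in (0x69, 0x6A)
        polls_fpu = image[i + 4] == 0xFF and image[i + 5] == 0xF0
        if is_cmpi and polls_fpu:
            imm = (image[i + 2] << 8) | image[i + 3]
            if imm == 0x960C:
                image[i + 2] = 0x16            # $960C -> $160C
                patched += 1
            elif imm == 0xB20C:
                image[i + 2] = 0x32            # $B20C -> $320C
                patched += 1
                # Assumed layout: 6-byte cmpi, 4-byte dbeq, then BNE.S
                if i + 12 <= len(image) and image[i + 10] == 0x66:
                    image[i + 10:i + 12] = b"\x4E\x71"   # NOP
        i += 1
    return patched
```

The return value gives a quick check on how many comparisons were 
found. As with any patch of this kind, the sketch should only ever be 
pointed at a copy of the program.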

Ideally, programmers doing floating point calculations in Pure C or
Pure Pascal should modify the library code so that the FPU response is 
read into a CPU register and the msb masked off or set before testing.
Then the code would run reasonably quickly on either chip.
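
The idea of masking the msb before testing can be sketched as follows. 
This is an illustration in Python rather than library code for either 
compiler. The names response_matches and wait_for are invented, and 
read_response stands in for whatever reads the FPU response register.

```python
COME_AGAIN = 0x8000   # msb of the response word


def response_matches(response, expected):
    """Compare two response words, ignoring the 'come again' bit."""
    return ((response ^ expected) & ~COME_AGAIN & 0xFFFF) == 0


def wait_for(read_response, expected, tries=128):
    """Poll until the response matches or the tries run out.

    read_response is a hypothetical callable returning the current
    16-bit response word.
    """
    for _ in range(tries):
        if response_matches(read_response(), expected):
            return True
    return False
```

With this test $960C and $160C are treated as the same answer, as are 
$B20C and $320C, so one comparison serves both chips and neither has to 
wait for a time-out.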

Fix FPU
-------
Here's a HiSoft BASIC listing for FIX_FPU.PRG, a program which patches 
another program compiled using either PURE C or PURE PASCAL so it will 
run on an Atari equipped with a 68000 and 68882 FPU.
The FPU support code appears to be identical in both compilers.

No guarantee is offered that the information in this distribution is 
correct but if you do find any mistakes let me know and I'll do my best 
to put them right!

Contact
-------
David Leaver
10 Goodparla St,
Hawker, ACT 2614
Australia

Email: leaver@netinfo.com.au

Notes concerning the program listing
------------------------------------
The odd bit of code around the call to dgetpath is there because, as 
far as I can tell, the library routine doesn't work as documented.  It 
wants the buffer address and returns a TOS/C style string rather than a 
BASIC string.

---
LIBRARY "gemdos", "gemaes"


? "PurePatch patches programs compiled using either"
? "PURE C or PURE PASCAL so that it will run on an"
? "Atari equipped with a 68000 and 68882"
? : ?
? "     ******** WARNING *********"
?
? "The input file is overwritten - if you have not"
? "backed it up, choose CANCEL at the file selector"
? : ?
? "Press a key to continue"
while inkey$ = ""
wend
drive$ = CHR$(FNdgetdrv% + 65)
path$ = "" : filename$ = ""
for i% = 1 to 128 : path$ = path$ + CHR$(0) : next i%
dgetpath sadd(path$) , 0
fullpath$ = drive$ + ":"
i% = 1
temp$ = mid$(path$,i%,1)
while ASC(temp$) <> 0
  fullpath$ = fullpath$ + temp$
  i% = i% +1
  temp$ = MID$(path$,i%,1)
wend
fullpath$ = fullpath$ + "\*.*"
fsel_input  fullpath$,  filename$,  ok%
if ok% = 0 then end
clearw 2
? "Patching ";filename$
? : ? "could take a minute or two........"
drive$ = LEFT$(fullpath$,1)
fullpath$ = MID$(fullpath$,3,128)
i% = LEN(fullpath$)
while RIGHT$(fullpath$,1) <> "\"
   i% = i%-1
   fullpath$ = LEFT$(fullpath$, i%)
wend
drivemap% = FNdsetdrv%(ASC(drive$)-65)
if FNdsetpath%(fullpath$)<>0 THEN
   ? " Error in setting path : "; fullpath$
   end
end if
handle% = FNfopen%(filename$,2)
if handle% < 0 then
   ? " Error opening file - # "; handle%
   end
end if
buffer$ = CHR$(0)
repeat readloop
  readbyte read&, byte%
  if read& = 0 then exit readloop
  if byte% = 12 then call patch
end repeat readloop
ok% = FNfclose%(handle%)
end

SUB readbyte(n&, byte%)
SHARED handle%
buffer$ = CHR$(0)
n& = FNfread&(handle%,1,sadd(buffer$))
byte% = ASC(buffer$)
end SUB

SUB writebyte(byte%)
SHARED handle%
buffer$ = CHR$(byte%)
n& = FNfwrite&(handle%,1,sadd(buffer$))
end SUB

SUB patch
SHARED handle%
readbyte n&, byte%
if n& = 1 then
   if byte% = 106 or byte% = 105 then
      readbyte n&, byte1%
      if n& = 1 then readbyte n&, byte2%
      if n& = 1 then readbyte n&, byte3%
      if n& = 1 then readbyte n&, byte4%
      if n& = 1 and byte3% = 255 and byte4% = 240 then
         if byte1% = 150 and byte2% = 12 then
            ok& = FNfseek&(-4, handle%, 1)
            writebyte 22
         elseif byte1% = 178 and byte2% = 12 then
            ok& = FNfseek&(-4, handle%, 1)
            writebyte 50
            ok& = FNfseek&(7, handle%, 1)
            if ok& >= 0 then
               readbyte n&, byte%
               if byte% = 102 then
                  ok& = FNfseek&(-1, handle%, 1)
                  writebyte 78
                  writebyte 113
               end if
            end if
         end if
      end if
   end if
end if
end sub
---
EOF