lib/libc/mips/string/bzero.S -- problem in 64-bit mode.
Jayachandran C.
c.jayachandran at gmail.com
Tue Feb 22 05:04:20 UTC 2011
On Tue, Feb 22, 2011 at 6:37 AM, Artem Belevich <fbsdlist at src.cx> wrote:
> Hi,
>
> I think there's a problem with the bzero implementation for 64-bit mips (SZREG==8).
>
> http://svn.freebsd.org/viewvc/base/head/lib/libc/mips/string/bzero.S?revision=209231&view=markup
>
> LEAF(bzero)
> .set noreorder
> blt a1, 3*SZREG, smallclr # small amount to clear?
> PTR_SUBU a3, zero, a0 # compute # bytes to word align address
> and a3, a3, SZREG-1
> beq a3, zero, 1f # skip if word aligned
> #if SZREG == 4
> PTR_SUBU a1, a1, a3 # subtract from remaining count
> SWHI zero, 0(a0) # clear 1, 2, or 3 bytes to align
> PTR_ADDU a0, a0, a3
> #endif
>
> #if SZREG == 8
> PTR_SUBU a1, a1, a3 # subtract from remaining count
> PTR_ADDU a0, a0, a3 # align dst to next word
> sll a3, a3, 3 # bits to bytes
> li a2, -1 # make a mask
> #if _BYTE_ORDER == _BIG_ENDIAN
> (a) REG_SRLV a2, a2, a3 # we want to keep the MSB bytes
> #endif
> #if _BYTE_ORDER == _LITTLE_ENDIAN
> (b) REG_SLLV a2, a2, a3 # we want to keep the LSB bytes
> #endif
> (c) nor a2, zero, a2 # complement the mask
> REG_L v0, -SZREG(a0) # load the word to partially clear
> and v0, v0, a2 # clear the bytes
> REG_S v0, -SZREG(a0) # store it back
> #endif
>
> Let's suppose we're trying to bzero something at 0x1234567. A3 will
> contain the number of bytes *remaining* until the next register-aligned
> address, i.e. 1 in this case.
> When we make it to (c), A2=0x00FFFFFF_FFFFFFFF on big-endian platforms
> and A2=0xFFFFFFFF_FFFFFF00 on little-endian. After (c) it's
> 0xFF000000_00000000 and 0x00000000_000000FF respectively, unless I've
> got NOR semantics wrong.
>
> Now we load the register, AND it with a2 and write the result back. It
> does not look right -- we're clearing *7* bytes instead of only one
> and clobbering the data that precedes the start address.
>
> I believe correct code should look like this:
>
> #if SZREG == 8
> PTR_SUBU a1, a1, a3 # subtract from remaining count
> PTR_ADDU a0, a0, a3 # align dst to next word
> sll a3, a3, 3 # bytes to bits
> li a2, -1 # make a mask
> #if _BYTE_ORDER == _BIG_ENDIAN
> REG_SLLV a2, a2, a3 # we want to keep the MSB bytes
> #endif
> #if _BYTE_ORDER == _LITTLE_ENDIAN
> REG_SRLV a2, a2, a3 # we want to keep the LSB bytes
> #endif
> REG_L v0, -SZREG(a0) # load the word to partially clear
> and v0, v0, a2 # clear the bytes
> REG_S v0, -SZREG(a0) # store it back
> #endif
>
> One thing I don't quite understand is -- why do we bother with all
> this manual masking at all? Why not just use REG_SHI for both SZREG==4
> and SZREG==8 cases? If I read the code I quoted above correctly, it
> attempts to emulate SDL/SDR instructions. Using the REG_SHI macro would
> pick the correct swl/swr/sdl/sdr variant based on register size and
> endianness and would clear the unaligned bytes.
>
> PTR_SUBU a1, a1, a3 # subtract from remaining count
> REG_SHI zero, 0(a0) # clear 1..SZREG-1 bytes to align
> PTR_ADDU a0, a0, a3
I just tested this with a simple program - and there is certainly an
issue here. If you can send me a patch, I can check that in after
testing.
The kernel version of bzero() does not seem to have the SZREG==8 case,
or this bug.
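For reference, the mask arithmetic discussed in the quoted message can
be modelled in plain C. This is only a minimal sketch, not the test
program mentioned above and not code from the FreeBSD tree: the
function names mask_current, mask_proposed and cleared_offsets are
hypothetical, and it covers only the SZREG == 8 path with a3 in the
range 1..7.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; they model only the 64-bit mask computation. */

/* Mask as built by the quoted revision: shift, then complement at (c). */
static uint64_t mask_current(unsigned nbytes, int big_endian)
{
    assert(nbytes >= 1 && nbytes <= 7);      /* a3 after the alignment test */
    unsigned bits = nbytes * 8;              /* sll a3, a3, 3 */
    uint64_t m = ~(uint64_t)0;               /* li a2, -1 */
    m = big_endian ? m >> bits : m << bits;  /* (a) REG_SRLV / (b) REG_SLLV */
    return ~m;                               /* (c) nor a2, zero, a2 */
}

/* Mask as built by the proposed replacement: opposite shift, no NOR. */
static uint64_t mask_proposed(unsigned nbytes, int big_endian)
{
    assert(nbytes >= 1 && nbytes <= 7);
    unsigned bits = nbytes * 8;
    uint64_t m = ~(uint64_t)0;
    return big_endian ? m << bits : m >> bits;
}

/*
 * Bitmap of memory offsets (bit k = byte k of the aligned doubleword)
 * that "and v0, v0, a2" followed by the store would zero.
 */
static unsigned cleared_offsets(uint64_t mask, int big_endian)
{
    unsigned map = 0;
    for (int i = 0; i < 8; i++) {
        if (((mask >> (8 * i)) & 0xff) != 0)
            continue;                        /* register byte i is kept */
        int off = big_endian ? 7 - i : i;    /* where byte i lands in memory */
        map |= 1u << off;
    }
    return map;
}
```

With a3 = 1 (the 0x1234567 example), mask_current() gives
0xFF000000_00000000 on big-endian and 0x00000000_000000FF on
little-endian, and cleared_offsets() reports offsets 1..7 of the
doubleword containing the start address (bitmap 0xFE) in both cases.
mask_proposed() clears only offset 7 (bitmap 0x80). In this model the
current code clears the right bytes only when a3 == 4.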
Thanks,
JC.