Re: sparc implementation

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Sep 02 2008 - 19:46:17 PDT

    My best guess is that the -I options are not quite right to include the 
    libcr/arch/sparc directory.  Take a look at the full command line that 
    make executes for a file in libcr.  For example, on an i686 I see:
    
    /bin/sh ../libtool --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. 
    -I.. -I../../libcr -D_GNU_SOURCE -D_REENTRANT -I../include 
    -I../../include -I../../libcr/arch/i386/   -Wall -Wno-unused-function 
    -fno-stack-protector  -g -O2 -MT libcr_la-cr_async.lo -MD -MP -MF 
    .deps/libcr_la-cr_async.Tpo -c -o libcr_la-cr_async.lo `test -f 
    'cr_async.c' || echo '../../libcr/'`cr_async.c
    
    Notice the -I../../libcr/arch/i386/ part.  My best guess is that yours 
    says -I../../libcr/arch/sparc64/, rather than -I../../libcr/arch/sparc/ 
    as you are expecting (note the ../.. is based on where my build dir is 
    located in relation to my source dir, your -I's may differ).  If this is 
    the case, then you should consider setting up sparc and sparc64 dirs with 
    the same relation as the existing ppc and ppc64 dirs (the 64-bit dir 
    includes the 32-bit code).
    
    As for the inc and dec-and-test, you can implement them using the following:
    
    CR_INLINE unsigned int
    __cri_atomic_add_fetch(cri_atomic_t *p, unsigned int op)
    {
        unsigned long oldval, newval;
        do {
            oldval = cri_atomic_read(p);
            newval = oldval + op;
        } while (!cri_cmp_swap(p, oldval, newval));
        return newval;
    }
    
    CR_INLINE void
    cri_atomic_inc(cri_atomic_t *p)
    {
        (void)__cri_atomic_add_fetch(p, 1);
    }
    
    CR_INLINE int
    cri_atomic_dec_and_test(cri_atomic_t *p)
    {
        return (__cri_atomic_add_fetch(p, -1) == 0);
    }
    
    
    These should be considered the "reference" implementations and would 
    appear in a porting guide if I had written one.
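    
    For trying the reference pattern outside of BLCR, here is a minimal 
    standalone sketch.  The demo_* names are hypothetical stand-ins, not the 
    real BLCR definitions, and the GCC builtin __sync_bool_compare_and_swap() 
    is only assumed to behave like a working cri_cmp_swap():

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cri_atomic_t; the real BLCR type may differ. */
typedef volatile unsigned int demo_atomic_t;

static inline unsigned int demo_atomic_read(demo_atomic_t *p)
{
    return *p;
}

/* Assumed contract: returns nonzero iff *p equaled oldval and was
 * atomically replaced by newval.  Implemented with a GCC builtin here. */
static inline int demo_cmp_swap(demo_atomic_t *p, unsigned int oldval,
                                unsigned int newval)
{
    return __sync_bool_compare_and_swap(p, oldval, newval);
}

/* Compare-and-swap retry loop: returns the value *after* the addition. */
static inline unsigned int demo_atomic_add_fetch(demo_atomic_t *p,
                                                 unsigned int op)
{
    unsigned int oldval, newval;
    assert(p != NULL);
    do {
        oldval = demo_atomic_read(p);
        newval = oldval + op;   /* unsigned wraparound: op == -1 decrements */
    } while (!demo_cmp_swap(p, oldval, newval));
    return newval;
}

static inline void demo_atomic_inc(demo_atomic_t *p)
{
    (void)demo_atomic_add_fetch(p, 1u);
}

/* Returns nonzero iff the decrement brought the counter to zero. */
static inline int demo_atomic_dec_and_test(demo_atomic_t *p)
{
    return demo_atomic_add_fetch(p, (unsigned int)-1) == 0;
}
```

    Because the arithmetic is unsigned, adding (unsigned int)-1 wraps around 
    and acts as a decrement, so a single add-fetch primitive covers both inc 
    and dec-and-test.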
    
    However, I think the following would be the (nearly) "optimal" 
    __cri_atomic_add_fetch() for UltraSPARC and newer:
    
    CR_INLINE unsigned int
    __cri_atomic_add_fetch(cri_atomic_t *p, unsigned int op)
    {
        register unsigned int oldval, newval;
        __asm__ __volatile__ (
            "ld       [%4],%0    \n\t" /* oldval = *addr; */
            "0:                  \t"
            "add      %0,%3,%1   \n\t" /* newval = oldval + op; */
            "cas      [%4],%0,%1 \n\t" /* if (*addr == oldval) SWAP(*addr,newval); newval = previous *addr either way */
            "cmp      %0, %1     \n\t" /* check if newval == oldval (swap succeeded) */
            "bne,a,pn %%icc, 0b  \n\t" /* otherwise, retry (,pn == predict not taken; ,a == annul) */
            "  mov    %1, %0     "     /* oldval = newval; (branch delay slot, annulled if not taken) */
            : "=&r"(oldval), "=&r"(newval), "=m"(*p)
            : "rn"(op), "r"(p), "m"(*p)
            : "cc" );                  /* cmp modifies the integer condition codes */
        return newval + op; /* after a successful cas, newval holds the value from *before* the add */
    }
    
    
    If I got that right (based on a different SPARC atomics project I worked 
    on), the generated asm for atomic inc and dec-and-test will use 
    immediate +1 and -1 arguments to the add instruction.
    
    I'll need to think about whether you need any memory barriers to make 
    this 100% correct on the SPARC architecture.  Have you considered that 
    for the cri_cmp_swap()?
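    
    As a portable point of comparison for the barrier question, the same 
    retry loop can be written with C11 <stdatomic.h>, where the ordering is 
    requested explicitly and the compiler emits whatever fences the target 
    needs.  This is only a sketch under assumptions: demo_add_fetch_ordered 
    is a hypothetical name, sequentially consistent ordering is assumed to be 
    what the callers want, and it does not settle which membar instructions 
    (if any) the hand-written SPARC asm requires.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: add-and-fetch built on compare-exchange, with the
 * memory ordering spelled out instead of implied. */
static inline unsigned int demo_add_fetch_ordered(atomic_uint *p,
                                                  unsigned int op)
{
    unsigned int oldval;
    assert(p != NULL);
    oldval = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure, compare_exchange stores the current value of *p into
     * oldval, so the loop retries with fresh data and no extra load. */
    while (!atomic_compare_exchange_weak_explicit(
               p, &oldval, oldval + op,
               memory_order_seq_cst,    /* ordering if the swap succeeds */
               memory_order_relaxed)) { /* ordering if it fails */
        /* retry */
    }
    return oldval + op;
}
```

    Comparing the compiler's output for this function on the target against 
    the hand-written version is one way to see which barriers the toolchain 
    considers necessary.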
    
    The syscall code is found in glibc.  For instance in 
    glibc-2.6/sysdeps/unix/sysv/linux/sparc/sysdep.h, where you'll want to 
    use the guts of the inline_syscall0() through inline_syscall5() macros, 
    combined with the errno handling as seen in 
    blcr/libcr/arch/i386/cr_arch.h:cri_syscall_cleanup().
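    
    To illustrate the kind of errno handling meant here, the following is a 
    hedged sketch rather than a copy of cri_syscall_cleanup().  The name 
    demo_syscall_cleanup is hypothetical, and the sketch assumes the Linux 
    i386 convention in which a raw return value in the range -4095..-1 
    encodes a negated errno.  The SPARC syscall path may report errors 
    differently, so check the glibc macros before reusing this test as-is.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw kernel return value into the usual
 * libc convention (-1 with errno set on failure, the value otherwise).
 * ASSUMPTION: raw values in [-4095, -1] mean "-errno", as on Linux i386. */
static inline long demo_syscall_cleanup(long raw)
{
    if ((unsigned long)raw >= (unsigned long)-4095L) {
        errno = (int)-raw;  /* recover the positive errno code */
        return -1;
    }
    return raw;             /* success: pass the result through unchanged */
}
```

    The unsigned comparison is a compact way to test for the error range 
    without also catching large negative values that are legitimate results.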
    
    Just to be 100% proper about this code I've included here:
    Signed-off-by: Paul H. Hargrove <PHHargrove_at_lbl_dot_gov>
    
    -Paul
    
    Vincentius Robby wrote:
    > Thank you Paul,
    >
    > Now even after I put the cr_arch.h and cr_atomic.h under 
    > libcr/arch/sparc, the following errors appear:
    > In file included from ../../libcr/cr_async.c:37:
    > ../../libcr/cr_private.h:63:23: error: cr_atomic.h: No such file or 
    > directory
    > In file included from ../../libcr/cr_private.h:65,
    >                  from ../../libcr/cr_async.c:37:
    > [some more errors]
    > ../../libcr/cr_private.h:66:21: error: cr_arch.h: No such file or 
    > directory
    >
    > Do I have to change something else for blcr to realize the files' 
    > existence?
    > Also, would you be able to point me to some other resources for the 
    > assembly code? For cr_atomic.h, I understood how to implement 
    > compare_and_swap, but I am not able to infer the atomic increment and 
    > decrement-and-test from the glibc source code or from the source 
    > for other architectures. For the syscall functions, do you know 
    > where I can look? Does glibc have these?
    >
    > Thank you very much for the help. I've been slowly looking into these 
    > for a while, but my inexperience keeps me from advancing as quickly.
    >
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
