Re: cr_restart: ->cri_syscall(CR_OP_RSTRT_REAP): Invalid argument

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Wed Nov 09 2005 - 09:54:09 PST

    Great.  Thanks for the update.
    I'll include this fix in the release I will be making for the SC|05
    conference next week.
    
    -Paul
    
    Michael Klemm wrote:
    > Hi,
    >
    > Paul H. Hargrove wrote:
    >> I may have a solution.  The attached patch should cause BLCR to store
    >> the actual contents of any deleted mmap()ed file, rather than storing
    >> just the filename.  This should solve the problem if the file is not
    >> still open within NSCD (and thus potentially changing).  However, if
    >> NSCD is also attached to the file (via open() or mmap()) and expects
    >> to communicate with the application through this file, then there is
    >> no good way for BLCR to save and restore this "communication channel"
    >> - the best we could hope for in that case would be to "undelete" the
    >> file by linking it back into the filesystem with its original name.
    >> That is likely to create a "leak" of such files and so I'd not
    >> consider it a general-purpose solution. Let me know whether this patch
    >> works so I can include it in the next release (which I am hoping
    >> to put out next week).
    >
    > We have tested your patch against our set of applications.  It looks
    > like everything is OK now.  At least Christian told me that the
    > applications started up correctly and the error caused by NSCD is
    > gone now.
    >
    > Regards
    >     -michael
    >
    > -- 
    > Computer Science Department 2, University of Erlangen-Nuremberg
    > Martensstrasse 3, D-91058 Erlangen, Germany
    > phone: ++49 (0)9131 85-28995, fax: ++49 (0)9131 85-28809
    > web: http://www2.informatik.uni-erlangen.de/~klemm
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
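
For readers following the thread, the case discussed above (a file that is still mmap()ed but has already been unlinked) can be spotted from user space on Linux, because /proc/<pid>/maps appends " (deleted)" to the pathname of such a mapping. The sketch below is only an illustration of that detection step. It is not taken from BLCR or from the patch mentioned in this message; the function name deleted_mappings and the sample NSCD path in the usage note are hypothetical.

```python
DELETED_SUFFIX = " (deleted)"


def deleted_mappings(maps_text):
    """Return (start, end, path) for each file-backed mapping whose file was unlinked.

    maps_text is the text of a Linux /proc/<pid>/maps file. Note that a file
    whose real name ends in " (deleted)" is indistinguishable here.
    """
    found = []
    for line in maps_text.splitlines():
        # Columns: address range, perms, offset, dev, inode, optional pathname.
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping: no pathname column
        addr_range, _perms, _offset, _dev, inode, path = fields
        path = path.rstrip()
        # Inode 0 marks pseudo-entries such as [heap] or [stack].
        if inode == "0" or not path.endswith(DELETED_SUFFIX):
            continue
        start, end = (int(x, 16) for x in addr_range.split("-"))
        found.append((start, end, path[: -len(DELETED_SUFFIX)]))
    return found
```

On a Linux system, deleted_mappings(open("/proc/self/maps").read()) lists the current process's own deleted mappings. A checkpointer that only recorded the returned path would fail at restart, since the name no longer exists; that is why saving the mapped contents, as described above, is needed for this case.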
    
