Re: BLCR 0.4.1 Beta5 now available

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Oct 11 2005 - 12:19:29 PDT

  • Next message: Neal Becker: "Please add extern "C""
    A process which wishes to checkpoint itself can call cr_request(),
    cr_request_file() or cr_request_fd(), all of which are described briefly
    in the libcr.h header.
    In all cases the call only initiates the checkpoint, which then proceeds
    asynchronously.  On restart the process will not "pick up just before
    the checkpoint call (and re-checkpoint)".
    To wait for the checkpoint to complete before continuing, the following
    should be used:
        
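        cr_client_id_t my_id = cr_init();
        ...
        cr_request_file(filename);  /* start the checkpoint (asynchronous) */
        cr_enter_cs(my_id);         /* critical section: checkpoints excluded */
        cr_leave_cs(my_id);         /* checkpoint is complete once this returns */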
    
    Because cr_enter_cs() and cr_leave_cs() together define a critical
    section in which checkpoints are excluded, one is certain that the
    checkpoint is complete before cr_leave_cs() returns (but not in the
    interval between enter/leave for reasons not worth explaining).
    
    This behavior should probably be a boolean "block" option to the
    cr_request_*() calls, but that will have to wait for a future API revision.
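    
    In the meantime, a caller could get the same effect with a small
    wrapper.  What follows is only a minimal sketch: the helper name
    checkpoint_self(), its "block" flag and the HAVE_LIBCR macro are
    hypothetical and not part of libcr.  The definitions under #else are
    stand-ins that merely record the call order so the sketch compiles
    without BLCR installed; they are assumptions about the call shapes
    used above, not the real prototypes from libcr.h.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_LIBCR            /* hypothetical macro: build against real BLCR */
#include <libcr.h>
#else
/* Stand-ins (NOT the real API): each one appends a letter to call_log. */
typedef int cr_client_id_t;
static char call_log[32];
static size_t call_len;

static void log_call(char c)
{
    if (call_len + 1 < sizeof call_log) {
        call_log[call_len++] = c;
        call_log[call_len] = '\0';
    }
}

static cr_client_id_t cr_init(void)            { log_call('I'); return 0; }
static void cr_request_file(const char *name)  { (void)name; log_call('R'); }
static void cr_enter_cs(cr_client_id_t id)     { (void)id;   log_call('E'); }
static void cr_leave_cs(cr_client_id_t id)     { (void)id;   log_call('L'); }
#endif

/*
 * Hypothetical helper: request a checkpoint of the calling process into
 * `filename`.  If `block` is nonzero, pass through an empty critical
 * section so that the checkpoint has completed by the time we return.
 */
static void checkpoint_self(cr_client_id_t id, const char *filename, int block)
{
    assert(filename != NULL);

    cr_request_file(filename);   /* only starts the checkpoint */
    if (block) {
        cr_enter_cs(id);         /* checkpoints are excluded in here... */
        cr_leave_cs(id);         /* ...so it is complete once this returns */
    }
}
```

    A caller would obtain its id once with cr_init() and then call, for
    example, checkpoint_self(my_id, filename, 1) wherever it wants a
    checkpoint that has finished before execution continues.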
    
    -Paul
    
    Neal Becker wrote:
    >On Tuesday 11 October 2005 2:52 pm, jcduell_at_lbl_dot_gov wrote:
    >  
    >>On Tue, Oct 11, 2005 at 02:40:39PM -0400, Neal Becker wrote:
    >>    
    >>>I'm interested in periodically checkpointing my executable.  While it's
    >>>not difficult to do this from the shell using e.g., cron, I wonder if
    >>>blcr-devel has a programmatic interface I could use?  Maybe I could write
    >>>a little python wrapper for it.
    >>>      
    >>Yes, for the moment, you'd need to use cron or write your own wrapper,
    >>etc.  It's too hard to write something like this in a way that will
    >>please everyone (so everyone gets to write it themselves ;)
    >>    
    >
    >Yes, I'm happy to write a wrapper - I'm just asking if there is any doc on the 
    >API.  I guess I can follow the example of the cr_checkpoint source.
    >
    >Another question, is it OK for a process to checkpoint itself?  What happens 
    >when it restarts?  Does it pick up just before the checkpoint call (and 
    >re-checkpoint)?
    >  
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
    
