Re: query

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Tue Dec 01 2009 - 13:07:54 PST

  • Next message: Eric Roman: "Re: OpenMPI + BLCR: Second time checkpoint hangs for MPI application."
    The function cr_checkpoint() returns 0 when execution continues after 
    taking a checkpoint.  If your application is later restarted from a 
    checkpoint file, then execution will appear to begin with a positive 
    return value from cr_checkpoint().  This distinction between 
    continuation and restart allows a callback to perform any extra work 
    required at restart to restore resources or settings not saved by BLCR 
    (example: TCP sockets).
    
    -Paul
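
The three outcomes described above (0 on continue, positive on restart, negative on error) can be sketched as a small C fragment. This is only an illustration, not BLCR's own code: `stub_cr_checkpoint()`, `fake_ret`, `reopen_connections()`, `last_status` and the `ckpt_status` enum are hypothetical names introduced here so the fragment compiles without BLCR installed. In a real program the stub would be replaced by the actual cr_checkpoint() call.

```c
#include <stddef.h>

/* Outcomes of a checkpoint request as seen from inside the callback. */
enum ckpt_status { CKPT_CONTINUE, CKPT_RESTART, CKPT_ERROR };

/* Map the return value of cr_checkpoint() onto the three cases:
 * 0 = continuing after the checkpoint, > 0 = restarted, < 0 = error. */
static enum ckpt_status classify_checkpoint_result(int ret)
{
    if (ret > 0)
        return CKPT_RESTART;
    if (ret == 0)
        return CKPT_CONTINUE;
    return CKPT_ERROR;
}

/* Hypothetical stand-in for cr_checkpoint(); it simply returns fake_ret
 * so that each branch can be exercised without BLCR. */
static int fake_ret = 0;
static int stub_cr_checkpoint(int flags)
{
    (void)flags;
    return fake_ret;
}

/* Hypothetical restart hook: a real application would re-create
 * resources not saved in the checkpoint (e.g. TCP sockets) here. */
static int reopen_count = 0;
static void reopen_connections(void)
{
    reopen_count++;
}

/* Status recorded by the most recent callback invocation. */
static enum ckpt_status last_status = CKPT_CONTINUE;

static int my_callback(void *arg)
{
    (void)arg;
    int ret = stub_cr_checkpoint(0);

    last_status = classify_checkpoint_result(ret);
    if (last_status == CKPT_RESTART)
        reopen_connections();   /* extra work only on the restart path */
    return 0;
}
```

The restart-only work sits behind the `ret > 0` branch, so a process that merely continues after the checkpoint does not repeat it.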
    
    luyang dong wrote:
    > Dear teachers,
    >
    > The following program is the callback described in the Berkeley Lab
    > Checkpoint/Restart documentation, but I am confused by
    > cr_checkpoint(0): why does its return value come in two kinds (0 and
    > positive)? According to my understanding, my_callback is called when
    > the user sends the application a signal that tells it to checkpoint,
    > or when the checkpoint is completed.
    >
    > Thanks a lot and best wishes,
    > Luyang Dong
    >
    > my_callback {
    >     /* cr_checkpoint() returns twice. */
    >     ret = cr_checkpoint(0);
    >     if (ret > 0) {
    >         checkpoint_status = restart;
    >     } else if (ret == 0) {
    >         checkpoint_status = continue;
    >     } else {
    >         checkpoint_status = error;
    >     }
    >     return 0;
    > }
    >  
    >
    >
    
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 Tel: +1-510-495-2352
    HPC Research Department                   Fax: +1-510-486-6900
    Lawrence Berkeley National Laboratory     
    
