Re: Checkpoint failed: support missing from application

From: Adolfo J. Banchio (banchio_at_famaf_dot_unc_dot_edu.ar)
Date: Fri Sep 23 2005 - 16:00:15 PDT

  • Next message: Paul H. Hargrove: "Re: Checkpoint failed: support missing from application"
    Paul,
    
    thanks for your prompt reply.
    
    The program was started with cr_run, but when I checked
    /proc/<PID>/maps there was no line for
    the BLCR libraries. So this is the reason.
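
    The check can be sketched in Python as below. This is only an
    illustration: the helper names are hypothetical (not part of BLCR),
    and the library name pattern is an assumption based on the "libcr.so"
    name mentioned in the quoted reply.

```python
import re

# Assumption: the BLCR library shows up in the maps file as "libcr.so..."
# (optionally with a suffix such as "libcr_run.so..."). The pattern is kept
# strict so that unrelated libraries like libcrypto.so are not matched.
_BLCR_LIB = re.compile(r"/libcr(_[a-z]+)?\.so[^/]*$")

def blcr_mappings(maps_text):
    """Return the distinct BLCR library paths found in /proc/<pid>/maps text."""
    found = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if _BLCR_LIB.search(path) and path not in found:
                found.append(path)
    return found

def has_blcr(pid):
    """True if the live process `pid` has a BLCR library mapped (Linux only)."""
    with open("/proc/%d/maps" % pid) as f:
        return bool(blcr_mappings(f.read()))
```

    An empty result from blcr_mappings() corresponds to the situation
    described here: no BLCR line in the maps file.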
    
    The only difference from other codes (compiled with the
    same compiler) is that this one (and another one
    I recompiled for testing) is compiled with the -static flag.
    
    Is it as simple as that: no statically compiled program will
    work with cr_run for checkpointing? In other words, for
    statically linked codes you have to include the libraries
    at link time. Is this true?
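
    For context: LD_PRELOAD is processed by the dynamic loader, so it
    has no effect on a fully static executable. A rough way to tell the
    two cases apart is to look for a PT_INTERP program header in the
    binary. The sketch below is a heuristic only, and has_interpreter()
    is a hypothetical helper, not something BLCR provides.

```python
import struct

PT_INTERP = 3  # program header type that names the dynamic loader

def has_interpreter(elf_bytes):
    """True if the ELF image has a PT_INTERP header (i.e. uses the dynamic loader)."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    is64 = elf_bytes[4] == 2                    # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if elf_bytes[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is64:
        (e_phoff,) = struct.unpack_from(endian + "Q", elf_bytes, 0x20)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 0x36)
    else:
        (e_phoff,) = struct.unpack_from(endian + "I", elf_bytes, 0x1C)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 0x2A)
    # p_type is the first 32-bit word of every program header entry.
    for i in range(e_phnum):
        (p_type,) = struct.unpack_from(endian + "I", elf_bytes, e_phoff + i * e_phentsize)
        if p_type == PT_INTERP:
            return True
    return False
```

    Usage would be along the lines of
    has_interpreter(open("./a.out", "rb").read()); a False result suggests
    the preload done by cr_run cannot take effect for that binary.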
    
    
    thanks for your help
    
    
    best regards,
    
    adolfo
    
    
    
    
    
    On Fri, 2005-09-23 at 14:08, Paul H. Hargrove wrote:
    >   Checkpointing with BLCR requires that a small stub library be linked 
    > into an application.  The message you are seeing is the one generated 
    > when a checkpoint request is issued for an application that does not 
    > include this support.
    > 
    >   A LAM/MPI built with BLCR support will automatically link this 
    > library into applications it compiles.  Other applications may do so 
    > explicitly when they are built, or more typically via an LD_PRELOAD done 
    > by the "cr_run" utility we provide.  For instance, "cr_run ./a.out" 
    > would run a.out with the BLCR library loaded.
    > 
    >   It is also possible that the application is correctly linked with the 
    > library, but is somehow disabling the BLCR hook.  One can look for 
    > "libcr.so" in /proc/<pid>/maps to determine if the process with the 
    > given pid has the BLCR library loaded.  If it is loaded and you still 
    > get the "support missing from application" messages, then we can discuss 
    > how to determine the cause of the interference.
    > 
    > -Paul
    > 
    > Adolfo J. Banchio wrote:
    > 
    > >Hello,
    > >
    > >first of all, my apologies if this question was already answered
    > >(in that case just point me to the answer), since I cannot
    > >get access to the search page of the archive.
    > >
    > >Now, the problem,
    > >
    > >I have a process running (started with cr_run)
    > >
    > >which gives this error message when checkpointed:
    > >
    > >    "Checkpoint failed: support missing from application"
    > >
    > >and the exit status of cr_checkpoint is 52.
    > >
    > >What could be the reason for this?
    > >
    > >By the way, I have BLCR working with SGE, and apart from this
    > >user, it is working very well for process migration.
    > >
    > >best regards,
    > >
    > >adolfo
    > >
    > >
    > >  
    > >
    
