Re: Document for Friday's mtg

From: Jason Duell (jcduell_at_lbl.gov)
Date: Mon Jun 03 2002 - 11:16:09 PDT


On Mon, Jun 03, 2002 at 09:43:44AM -0700, Paul H. Hargrove wrote:
> Jeff Squyres wrote:
> > Is there any way to avoid this signal context issue?  It seems extremely
> > limiting...  I have one idea that may or may not be workable.
> > 
> > What if we have a checkpoint thread that is launched during MPI_INIT.
> > During normal operations, it sits blocking on something (perhaps a
> > semaphore) and taking no cycles.  When the checkpoint signal is received,
> > the signal handler:
> > 
> > - stops all other threads
> > - wakes up the checkpoint thread
> > - returns
> > 
> > This would allow the checkpoint thread to do whatever it wants, and not
> > have to worry about signal handler safety.
> > 
> > Is that feasible?
> 
> It can't go quite as shown above, because as designed now, the
> checkpoint will be taken when the signal handler returns.  Thus if a
> separate thread did the work, the signal handler must block on its
> completion:

We could always write it so that the signal handler is allowed to tell
the kernel--via yet another ioctl()--that while it's about to return, it
will not be OK to checkpoint until a 2nd system call is done later (by
some other thread or the regular, non-signal handler context).  This
ought to make the separate thread approach work w/o deadlock.
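
Roughly what I have in mind, as a user-space sketch only.  The two
ioctl() calls don't exist yet, so ckpt_defer() and ckpt_allow() below
are hypothetical stand-ins that just flip a flag, and all the other
names are made up for illustration.  It also assumes unnamed POSIX
semaphores (sem_init) are available, and it leaves out the "stop all
other threads" step.  The handler only calls sem_post(), which POSIX
lists as async-signal-safe, and then returns:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <string.h>

static sem_t ckpt_wakeup;          /* checkpoint thread blocks here */
static atomic_int ckpt_deferred;   /* stand-in for kernel-side state */
static atomic_int ckpt_work_done;  /* counts completed requests */

/* Hypothetical stand-ins for the two proposed ioctl() calls. */
static void ckpt_defer(void) { atomic_store(&ckpt_deferred, 1); }
static void ckpt_allow(void) { atomic_store(&ckpt_deferred, 0); }

/* Signal context: mark the checkpoint as deferred, wake the thread,
 * and return without blocking. */
static void ckpt_handler(int signo)
{
    (void)signo;
    ckpt_defer();
    sem_post(&ckpt_wakeup);
}

/* Normal thread context: free to call malloc(), take locks, etc.
 * For brevity this sketch services one request and exits. */
static void *ckpt_thread(void *arg)
{
    (void)arg;
    while (sem_wait(&ckpt_wakeup) == -1 && errno == EINTR)
        continue;                  /* retry if interrupted */
    atomic_fetch_add(&ckpt_work_done, 1);  /* the real work goes here */
    ckpt_allow();                  /* the "2nd system call" */
    return NULL;
}

/* Drive one request end to end; returns the number of requests
 * serviced, or -1 on setup failure. */
static int ckpt_demo(void)
{
    pthread_t tid;
    struct sigaction sa;

    if (sem_init(&ckpt_wakeup, 0, 0) != 0)
        return -1;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = ckpt_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, ckpt_thread, NULL) != 0)
        return -1;
    if (raise(SIGUSR1) != 0)       /* handler runs in this thread */
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return atomic_load(&ckpt_work_done);
}
```

Since the handler never waits on the checkpoint thread, it doesn't
matter what locks the interrupted thread happened to hold.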


>   - Stop other threads
>   - Wake up blocked checkpoint thread
>   - Block until checkpoint thread completes its work
>   - Return
> But that would likely deadlock - imagine that the signal arrived while a
> mutex internal to malloc() was held.
> 
> It looks like the signal handler context is going to be very limiting. 

If the signal handler context is limiting just because 1) malloc is not
signal safe and 2) pthread locks around data structures aren't safe
either, it's worth noting that you can make both of these safe by using
'signal safe locks' (a mutex that you guarantee is never held when a
signal handler is run), plus a separate malloc library for LAM's
internal use.
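
One common way to get such a lock, sketched below with made-up names
(sigsafe_lock_t and friends are not from any existing library), is to
block the checkpoint signal in the calling thread before taking the
mutex and restore the old mask after releasing it.  Note that this only
guarantees the handler can't interrupt the thread that holds the lock;
to get the full guarantee you'd also have to make sure the signal is
only ever delivered to threads that use these locks:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>

typedef struct {
    pthread_mutex_t mutex;
    sigset_t blocked;  /* signals kept out while the lock is held */
    sigset_t saved;    /* holder's previous mask; valid only while held */
} sigsafe_lock_t;

static int sigsafe_init(sigsafe_lock_t *l, int signo)
{
    sigemptyset(&l->blocked);
    sigaddset(&l->blocked, signo);
    return pthread_mutex_init(&l->mutex, NULL);
}

/* Block the signal first, then take the mutex, so the handler can never
 * run in this thread while the mutex is held. */
static int sigsafe_lock(sigsafe_lock_t *l)
{
    sigset_t old;
    int rc = pthread_sigmask(SIG_BLOCK, &l->blocked, &old);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_lock(&l->mutex);
    if (rc != 0) {
        pthread_sigmask(SIG_SETMASK, &old, NULL);
        return rc;
    }
    l->saved = old;    /* safe to store: we own the lock now */
    return 0;
}

/* Release in the opposite order: drop the mutex, then restore the mask,
 * at which point a pending signal may be delivered. */
static int sigsafe_unlock(sigsafe_lock_t *l)
{
    sigset_t old = l->saved;  /* copy out before giving up the lock */
    int rc = pthread_mutex_unlock(&l->mutex);
    if (rc != 0)
        return rc;
    return pthread_sigmask(SIG_SETMASK, &old, NULL);
}
```

A private malloc would wrap its internal state in the same kind of
lock, so the handler never finds the allocator half-updated.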


> I will look at ways to cleanly decouple things so that the handler can
> return with some promise to call back again to complete the checkpoint
> (like a checkpoint-request answered later by a checkpoint-response). 

This has the problem that we've got no guarantee as to how long it will
be before the callback runs, no?  If we didn't have this problem, we
wouldn't be desperately trying to kludge our logic into signal handler
context in the first place...

-- 
Jason Duell                               jcduell_at_lbl_dot_gov
NERSC Future Technologies Group           Tel: +1-510-495-2354
Lawrence Berkeley National Laboratory     Fax: +1-510-495-2998