Re: Question about "fd" token

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Sun May 17 2009 - 11:30:32 PDT

    There are certain functions in libcr that are for use of a "target": a
    process that may be checkpointed. These all use the one "local_fd" in
    syscall.c, which is opened the first time one of these functions is
    called, and then reused as it is needed to avoid open/close on every
    call. You ask "Why can not we use only cri_syscall_token" - the answer
    is that we do use a single __cri_syscall_token(), but cri_syscall() is
    a wrapper that manages this single reused fd and errno, while
    cri_syscall_token() manages only errno.
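
    To make the layering concrete, here is a simplified sketch. It is not
    the libcr source: every demo_* name is hypothetical, the "kernel call"
    is a stub, and thread safety is deliberately ignored. It only shows the
    division of labor: the token variant translates the result into errno,
    and the plain variant adds the lazily opened, reused fd on top of it.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the raw ioctl-style primitive: returns a
 * negative errno-style code on failure, as a raw kernel call would. */
static int demo_raw_call(int fd, int op)
{
    if (fd < 0)
        return -EBADF;
    return op;                    /* echo the op back as a fake result */
}

/* Token variant: the caller supplies the fd; only errno is managed here. */
static int demo_syscall_token(int fd, int op)
{
    int rc = demo_raw_call(fd, op);
    if (rc < 0) {
        errno = -rc;
        return -1;
    }
    return rc;
}

static int demo_local_fd = -1;    /* cached fd, opened on first use */
static int demo_open_count = 0;   /* counts how often we "opened" it */

/* Hypothetical stand-in for opening the control file. */
static int demo_open_ctrl(void)
{
    ++demo_open_count;
    return 42;                    /* arbitrary fake descriptor number */
}

/* Plain variant: manages the cached fd, then defers to the token variant. */
static int demo_syscall(int op)
{
    if (demo_local_fd < 0)
        demo_local_fd = demo_open_ctrl();
    return demo_syscall_token(demo_local_fd, op);
}
```

    Calling demo_syscall() twice opens the fake fd only once, while
    demo_syscall_token() can be handed any fd the caller already holds.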
    
    The "thread access" code is to ensure that we get exactly one fd even if
    multiple threads call at the same time. The code might look more natural
    if we had used pthread mutexes, but we cannot for the same reason we
    cannot make certain syscalls through the normal paths: because we may
    need to do this when the pthread environment is not available.
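
    As an illustration of getting exactly one fd without pthread mutexes,
    here is a minimal sketch using C11 atomics and a compare-and-swap. This
    is an assumption-laden example, not the libcr code: libcr has its own
    cri_atomic_* helpers, and this sketch calls the ordinary libc open()
    and close() for simplicity. The names demo_fd and demo_get_fd are made
    up for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <unistd.h>

static atomic_int demo_fd = -1;   /* -1 means "not opened yet" */

/* Return the shared fd, opening it on first use without any pthread calls.
 * Returns -1 if the open fails. */
static int demo_get_fd(const char *path)
{
    int fd = atomic_load(&demo_fd);
    if (fd >= 0)
        return fd;                /* fast path: already published */

    int mine = open(path, O_RDONLY);
    if (mine < 0)
        return -1;

    int expected = -1;
    if (atomic_compare_exchange_strong(&demo_fd, &expected, mine))
        return mine;              /* we won the race and published our fd */

    close(mine);                  /* another thread won; drop our duplicate */
    return expected;              /* the failed CAS stored the winner's fd */
}
```

    Two threads may both call open(), but only one fd is ever published;
    the loser closes its copy and uses the winner's.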
    
    There are also functions for use inside the callback code that runs when
    a checkpoint is taken. These include cr_checkpoint() and
    abort_checkpoint(). For these, the fd passed to cri_syscall_token() must
    be the specific one that the kernel associates with the *specific*
    in-progress checkpoint request. The kernel passes this fd to libcr when
    the signal handler is invoked, in siginfo->si_pid (see
    libcr/cr_core.c:cri_sig_handler()).
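
    For readers unfamiliar with the mechanism, the sketch below shows how an
    SA_SIGINFO handler reads a value out of siginfo->si_pid. It is only an
    analogy, not the libcr handler: with an ordinary kill() from user space
    the field holds the sender's pid, whereas in BLCR the kernel places the
    request's fd there. All demo_* names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Value captured by the handler; -1 until a signal has been handled. */
static volatile sig_atomic_t demo_token = -1;

/* Three-argument handler: the second argument carries the siginfo. */
static void demo_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)signo;
    (void)ctx;
    demo_token = (sig_atomic_t)info->si_pid;
}

/* Install the handler with SA_SIGINFO so that siginfo is delivered. */
static int demo_install(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = demo_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Send the signal to ourselves; here si_pid will be our own pid. */
static int demo_trigger(int signo)
{
    return kill(getpid(), signo);
}
```

    In a single-threaded process the handler has run by the time
    demo_trigger() returns, so demo_token can be read right afterwards.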
    
    Finally, there is a third group of calls including
    cr_request_checkpoint() and cr_request_restart() that open an fd that is
    used for all operations for that request. These also use
    cri_syscall_token().
    
    I know I didn't address your questions in order, but I think I've
    explained what you wanted to know. If you still need help, let us know.
    
    -Paul
    
    The original poster wrote:
    > Hello, Professor:
    >
    > Thank you very much for the previous answer.
    >
    > I have a question about issuing a checkpoint request. While reading
    > "/util/cr_checkpoint.c" I found this code:
    >
    > /* issue the request */
    > err = cr_request_checkpoint(&cr_args, &cr_handle);
    >
    > This is how BLCR issues a checkpoint request. I find that the function
    > "cr_request_checkpoint" finally calls
    > "cri_syscall_token(*handle, CR_OP_CHKPT_REQ, (uintptr_t)&req)", and the
    > first argument is actually a file descriptor opened on
    > "/proc/checkpoint/ctrl".
    >
    > At the same time, there is another function, "cri_syscall()". The
    > difference between it and "cri_syscall_token()" is that it does not
    > accept an "fd" as an argument. However, it calls
    > "__cri_ioctl((int)cri_atomic_read(&local_fd), op, (void *)arg, errno_p);"
    >
    > My question is about the variable "local_fd", which I see in
    > "/libcr/syscall.c". I find that some other functions in this file use
    > it to control the "thread access", but I still do not know its other
    > uses here.
    >
    > Q2:
    > Many functions such as "abort_checkpoint", "cr_checkpoint", and
    > "cr_forward_checkpoint" finally call "cri_syscall()". Which "fd" exactly
    > are they using? The same as the fd opened on "/proc/checkpoint/ctrl"?
    >
    > Q3:
    > Why can not we use only "cri_syscall_token"?
    >
    >
    
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 Tel: +1-510-495-2352
    HPC Research Department                   Fax: +1-510-486-6900
    Lawrence Berkeley National Laboratory     
    
