Re: File & socket handling speeds after cr_restart

From: Paul H. Hargrove (PHHargrove_at_lbl_dot_gov)
Date: Mon Jan 08 2007 - 09:20:38 PST

  • Next message: Adams, Vaughn M: "Status of BLCR"
    I do not know of any explanation for "slow" I/O in restarted processes.  
    There is absolutely nothing in BLCR that intercepts or otherwise 
    modifies the handling of sockets (or any other file descriptor) either 
    before a checkpoint or after a restart.
    
    -Paul
    
    Dai MIKURUBE wrote:
    
    > Hi,
    >
    > Sorry, I think the following problem is caused by my extension
    > for migrating sockets. I'll check my code.
    >
    >
    > But I'd still like to know
    > "Do restarted processes have any bottleneck in I/O?"
    >
    > I'm waiting for a reply. Thanks.
    >
    > Dai MIKURUBE wrote:
    >
    >> Hi,
    >>
    >> I asked a few questions about BLCR some days ago.
    >> Today, I have another question.
    >>
    >> I'm trying to migrate some networked applications.
    >> At first, I tried creating sockets after restart as follows:
    >>
    >> 1) start Apache 1.3
    >>     # cr_run /usr/sbin/apache
    >>
    >> 2) checkpoint the root process of Apache
    >>     # cr_checkpoint --term (process number of Apache root process)
    >>
    >> 3) restart the checkpoint file
    >>     # cr_restart context.(process number of Apache root process)
    >>
    >> 4) send HUP signal to the process
    >>     # kill -HUP (process number of Apache root process)
    >>
    >> 5) access the Apache process with httperf
    >>     # httperf ....
    >>
    >> I have succeeded in accessing the Apache web server,
    >> but the access is too slow...
    >>
    >>
    >> My question is :
    >>
    >> "Are there some bottlenecks in handling file descriptors, files,
    >>   and sockets of *restarted* processes?"
    >>
    >> or
    >>
    >> "Can restarted processes generate many new sockets?"
    >>
    >>
    >> The connections created by restarted processes look strange.
    >> Though the client "httperf" is waiting for Apache's reply,
    >> the result of "netstat -a" usually says no connection is established.
    >>
    >> "netstat -a" shows that some connections are established for a moment,
    >> but the established connections vanish almost immediately.
    >>
    >> But, in the end, httperf reports that almost all connections were
    >> answered correctly. (The reply status of all connections is 2xx.)
    >>
    >>
    >> Do you know of such behavior in BLCR,
    >> and do you know of any way to avoid these problems?
    >>
    >
    
    -- 
    Paul H. Hargrove                          PHHargrove_at_lbl_dot_gov
    Future Technologies Group                 
    HPC Research Department                   Tel: +1-510-495-2352
    Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
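
One way to put a number on the "too slow" access described in the quoted message is to time a series of HTTP requests against the server, once before the checkpoint and once after cr_restart, and compare the two sets of figures. The following is only a minimal sketch under assumptions: the function names time_requests and summarize are hypothetical and are not part of BLCR or httperf, and the host, port and path are placeholders. Each request opens a fresh TCP connection, so the timing includes socket creation on the server side.

```python
import http.client
import statistics
import time


def time_requests(host, port, count=20, path="/", timeout=5.0):
    """Return per-request latencies in seconds, one new TCP connection each."""
    latencies = []
    for _ in range(count):
        start = time.perf_counter()
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body so the whole reply is timed
        finally:
            conn.close()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Return (min, median, max) of the measured latencies."""
    return min(latencies), statistics.median(latencies), max(latencies)


# Example use (placeholder address, assumed to be the Apache instance):
#   lo, mid, hi = summarize(time_requests("127.0.0.1", 80))
#   print(f"min={lo:.4f}s median={mid:.4f}s max={hi:.4f}s")
```

Running the same measurement against the original process and against the restarted one gives directly comparable values, which helps separate an effect of the restart from an effect of any local modification to socket handling.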
    
