Resuming from a crashdump

Matthew Dillon dillon at apollo.backplane.com
Mon Jan 24 14:40:23 PST 2005


:Well booting the kernel generally takes little time, but if all the
:processes could be restored this would be a step in the right direction.
:As John said, restoring the state of some programs will have to rely on
:the program, but perhaps this could lead to an API of some sort that would
:make this less painful on the program author. Eventually most programs
:would support this over time.
:
:So what would it take to get the system to boot the kernel, then rebuild
:the processes from VM?
:
:Chris

    I think this is doable but not universal.  A kernel core dump can't be
    used for that sort of thing (it overwrites swap and swap might contain
    portions of the user processes in it).  

    The kernel would have to write out a special save-to-disk file containing
    the VM image, file handles, signal state, register set, and so forth for
    each process in the system.  Basically it would need to take DragonFly's
    checkpointing code (as a basis), extend it suitably, and use it to dump
    each process.
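
    As a rough illustration only, a per-process record in such a
    save-to-disk file might begin with a header that locates each piece
    of state listed above.  Every name here (save_proc_header, SAVE_MAGIC,
    the section enum, the version number) is hypothetical and is not taken
    from DragonFly's checkpointing code; it is a sketch of the layout
    idea, not an actual on-disk format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic number and version, for illustration only. */
#define SAVE_MAGIC   0x53564450u
#define SAVE_VERSION 1u

/* One section per kind of per-process state named in the text. */
enum save_section {
    SEC_VM_IMAGE,      /* process VM image */
    SEC_FILE_HANDLES,  /* open file handles */
    SEC_SIGNAL_STATE,  /* signal masks and handlers */
    SEC_REGISTERS,     /* saved register set */
    SEC_COUNT
};

struct save_section_desc {
    uint64_t offset;   /* byte offset from the start of the record */
    uint64_t length;   /* section length in bytes */
};

struct save_proc_header {
    uint32_t magic;
    uint32_t version;
    int32_t  pid;
    uint32_t nsections;
    struct save_section_desc sections[SEC_COUNT];
};

/* Zero the header and fill in the fixed fields. */
void save_header_init(struct save_proc_header *h, int32_t pid)
{
    assert(h != NULL);
    memset(h, 0, sizeof(*h));
    h->magic = SAVE_MAGIC;
    h->version = SAVE_VERSION;
    h->pid = pid;
    h->nsections = SEC_COUNT;
}

/*
 * Place the sections back to back after the header, in enum order.
 * Returns the total size of the record in bytes.
 */
uint64_t save_header_layout(struct save_proc_header *h,
                            const uint64_t lengths[SEC_COUNT])
{
    uint64_t off = sizeof(*h);

    assert(h != NULL && lengths != NULL);
    for (int i = 0; i < SEC_COUNT; i++) {
        h->sections[i].offset = off;
        h->sections[i].length = lengths[i];
        off += lengths[i];
    }
    return off;
}

/* Basic sanity check a restore path could run before trusting a record. */
int save_header_valid(const struct save_proc_header *h)
{
    return h != NULL && h->magic == SAVE_MAGIC &&
           h->version == SAVE_VERSION && h->nsections == SEC_COUNT;
}
```

    A dumper would call save_header_init() and save_header_layout() once
    per process and then write the header followed by the four sections;
    a restore path would check save_header_valid() before reading any of
    them.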

    Additional state would also have to be saved and restored, such as
    bound UDP sockets and sockets in a LISTEN state.  This is doable.

    But the following is far more difficult:

    * tty associations                      - restore
    * tty state                             - restore
    * job control and process group state   - restore
    * open pipes                            - restore
    * open fifos                            - restore
    * open socketpairs                      - restore
    * established connections               - throw away
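
    The restore-or-discard decisions above, together with the bound UDP
    and LISTEN sockets mentioned earlier, could be written down as a
    small policy function that a restore path consults for each saved
    object.  This is only a sketch: the enum names and functions are
    made up for illustration, and treating unknown object kinds as
    "discard" is my assumption, not something any existing code does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags for the kinds of saved state discussed in the text. */
enum obj_kind {
    OBJ_TTY_ASSOC,        /* tty associations */
    OBJ_TTY_STATE,        /* tty state */
    OBJ_JOB_CONTROL,      /* job control and process group state */
    OBJ_PIPE,             /* open pipes */
    OBJ_FIFO,             /* open fifos */
    OBJ_SOCKETPAIR,       /* open socketpairs */
    OBJ_UDP_BOUND,        /* bound UDP sockets */
    OBJ_TCP_LISTEN,       /* sockets in a LISTEN state */
    OBJ_TCP_ESTABLISHED   /* established connections */
};

enum restore_action { ACT_RESTORE, ACT_DISCARD };

/* Map each kind of object to the action listed for it in the text. */
enum restore_action restore_policy(enum obj_kind kind)
{
    switch (kind) {
    case OBJ_TTY_ASSOC:
    case OBJ_TTY_STATE:
    case OBJ_JOB_CONTROL:
    case OBJ_PIPE:
    case OBJ_FIFO:
    case OBJ_SOCKETPAIR:
    case OBJ_UDP_BOUND:
    case OBJ_TCP_LISTEN:
        return ACT_RESTORE;
    case OBJ_TCP_ESTABLISHED:
        return ACT_DISCARD;   /* the peer will not still be there */
    }
    return ACT_DISCARD;       /* assumption: unknown kinds are dropped */
}

/* Count how many of the n saved objects would be restored. */
size_t count_restorable(const enum obj_kind *kinds, size_t n)
{
    size_t count = 0;

    assert(kinds != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (restore_policy(kinds[i]) == ACT_RESTORE)
            count++;
    }
    return count;
}
```

    A table like this only says what to attempt; it does nothing to make
    the tty, job control, and pipe cases any easier to actually restore.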

    And this is very difficult:

    * X windows state, established connections to X from applications, and
      so forth.  However, it might be possible to quiesce X out of its video
      mode, remap the framebuffer, and then switch back into it.  But it
      would still be a nasty problem.

    -

    Also, it could take just as long to restore the mess as it would just
    to reboot normally and restart your applications.  After all, the system
    is likely to be disk-bound either way.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>


More information about the freebsd-hackers mailing list