Tuesday, 21 December 2010

Less known Solaris features: Getting rid of Zombies

Once in a while you will see strange processes listed as <defunct> instead of a process name. This happens when a child process terminates but the parent process has not collected the outcome, because it didn't wait for the child. Almost all resources of the child process are freed at that moment, with the exception of its entry in the process table. The parent process needs that entry to get the exit code of its child, so the system can't simply delete it when the child terminates. The remaining process table entry is deleted when the parent process reaps the child by collecting the exit code. But when the parent forgets to reap the child, the child is undead, it's defunct. Or, to stay in the terminology: you've produced a zombie process.
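The life cycle described above can be sketched in a few lines of Python. This is only an illustration, not code from any particular application: the function name zombie_lifecycle and the exit code 7 are made up for the example, and it assumes a Unix-like system where os.fork() and os.waitid() are available.

```python
import os


def zombie_lifecycle(exit_code=7):
    """Fork a child, let it become a zombie, then reap it.

    Returns the exit code the parent collected from the child.
    """
    pid = os.fork()
    if pid == 0:
        # Child: terminate immediately, skipping Python's cleanup handlers.
        os._exit(exit_code)

    # Parent: block until the child has exited, but leave it unreaped
    # (WNOWAIT). From here on the child is a zombie: it no longer runs,
    # yet its process table entry still holds the exit status.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # Reaping: waitpid() collects the status and the entry is released.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("child exit code:", zombie_lifecycle())
```

Between the waitid() call and the waitpid() call the child would show up as <defunct> in ps. A parent that never reaches the waitpid() step keeps the zombie around for as long as the parent itself lives.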

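Let's create such a process. It's really easy: we just need a long-running process that forks a child but never uses the wait() system call to collect the child's exit status. In the parent, fork() returns the PID of the child, so the parent loops forever; in the child it returns 0, so the child exits immediately.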

bash-3.2$ nohup perl -e "if (fork()>0) {while (1) {sleep 100*100;};};"&
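Okay, let's check for our processes. In the output of ps -ecl, zombie processes are marked with a Z in the S (state) column. The header line shows up in the grep output as well, because the column name SZ contains a Z: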

bash-3.2$ ps -ecl |grep "Z"
F S UID PID PPID CLS PRI ADDR SZ WCHAN TTY TIME CMD
0 Z 100 27841 27840 - 0 - 0 - ? 0:00 <defunct>
bash-3.2$
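Because grep "Z" matches a Z anywhere in the line, a stricter check looks only at the S column. The following is a minimal sketch under stated assumptions: the function name find_zombies is hypothetical, the first sample row is copied from the output above, and the second row is invented to show a non-zombie line. It relies only on the leading columns (S, PID, PPID), since later columns may be empty for a defunct process.

```python
def find_zombies(ps_output):
    """Return (pid, ppid) pairs for every row whose S column is Z."""
    lines = ps_output.strip().splitlines()
    if not lines:
        return []

    # Locate the columns by name in the header instead of hard-coding them.
    header = lines[0].split()
    s_col = header.index("S")
    pid_col = header.index("PID")
    ppid_col = header.index("PPID")

    zombies = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) > ppid_col and fields[s_col] == "Z":
            zombies.append((int(fields[pid_col]), int(fields[ppid_col])))
    return zombies


# First row taken from the article; the second row is made up.
SAMPLE = """\
F S UID PID PPID CLS PRI ADDR SZ WCHAN TTY TIME CMD
0 Z 100 27841 27840 - 0 - 0 - ? 0:00 <defunct>
0 S 100 27840 27001 TS 59 ? 1204 ? pts/2 0:00 perl
"""

if __name__ == "__main__":
    for pid, ppid in find_zombies(SAMPLE):
        print("zombie", pid, "parent", ppid)
```

The PPID is worth keeping, because it tells you which application forgot to reap its child.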
Sending a kill -9 to this process has no effect, because the process has already terminated. Obviously, a zombie will go away when you terminate the parent process, but that isn't always an option. How can you get rid of these zombies? With Solaris you can reap such processes manually: the preap command forces the parent to reap the child by calling the wait() system call on the child.

bash-3.2$ preap 27841
27841: exited with status 0
And when you look at the process table again, you will see that the zombie has found its peace...

bash-3.2$ ps -ecl |grep "Z"
F S UID PID PPID CLS PRI ADDR SZ WCHAN TTY TIME CMD
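The difference between killing and reaping a zombie can be reproduced in a small experiment. This is a hedged sketch, not a model of how preap works internally: here the parent reaps its own child, the function name kill_then_reap is invented for the example, and it assumes a system that provides os.waitid().

```python
import os
import signal


def kill_then_reap(exit_code=3):
    """Show that SIGKILL does not remove a zombie, but wait() does.

    Returns (still_present_after_kill, collected_exit_code).
    """
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child terminates at once

    # Wait until the child is a zombie, without reaping it.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)

    # The equivalent of "kill -9": the call is accepted, but there is
    # no running process left that could be killed.
    os.kill(pid, signal.SIGKILL)

    # Signal 0 only checks for existence; no exception means the
    # process table entry is still there.
    try:
        os.kill(pid, 0)
        still_present = True
    except ProcessLookupError:
        still_present = False

    # Only reaping releases the entry. The status is still the
    # original exit code, not a death by SIGKILL.
    _, status = os.waitpid(pid, 0)
    return still_present, os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(kill_then_reap())
```

The zombie survives the signal because only the wait() call of its parent can release the process table entry.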
Obviously, when reaping zombies manually becomes a frequent task, you should ask yourself why an application leaves such zombie processes behind. Often it's because of bad programming style.
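One common way for a parent to avoid leaving zombies behind is to reap its children from a SIGCHLD handler. The sketch below is just one possible pattern, shown under assumptions: the names run_workers and reap_children are hypothetical, the children do no real work, and the polling loop stands in for whatever the parent would normally be doing.

```python
import os
import signal
import time


def run_workers(count=3):
    """Fork `count` short-lived children and reap them via SIGCHLD.

    Returns the collected exit codes in the order the children were forked.
    """
    reaped = {}  # pid -> exit code, filled in by the handler

    def reap_children(signum, frame):
        # Several children may exit before the handler runs once,
        # so keep collecting until nothing is left to collect.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return  # no children left at all
            if pid == 0:
                return  # children exist, but none has exited yet
            reaped[pid] = os.WEXITSTATUS(status)

    signal.signal(signal.SIGCHLD, reap_children)
    pids = []
    for i in range(count):
        pid = os.fork()
        if pid == 0:
            os._exit(i)  # child: exit with its index as the exit code
        pids.append(pid)

    # Stand-in for the parent's real work: poll until all are reaped,
    # with a deadline so the sketch can never hang.
    deadline = time.monotonic() + 5
    while len(reaped) < count and time.monotonic() < deadline:
        time.sleep(0.05)

    signal.signal(signal.SIGCHLD, signal.SIG_DFL)  # back to the default
    return [reaped.get(pid) for pid in pids]


if __name__ == "__main__":
    print(run_workers())
```

The handler uses WNOHANG so that it never blocks the parent, and it loops because a single SIGCHLD can stand for more than one terminated child.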
