Kubernetes Pre-Stop Hook Reflections
I have an application which runs as a Kubernetes job. The application is implemented as PHP Symfony console command, which is specified in…
Kubernetes PreStop Hook Reflections
When the image is built and the container starts, the PHP process runs a console command
app:run. When the command terminates, the container terminates and the Kubernetes job finishes. So far, nothing special.
Terminating the job
The next thing on my list was to implement termination of the job. This is actually trivial to do through kubectl (or the corresponding API):
> kubectl delete job job-674602133 job.batch "job-674602133" deleted
That’s all. However, here comes the interesting part. The
app:run command process spawns some other processes which need to be cleaned manually if the command does not finish gracefully. In fact, these are other running docker containers, but that’s for another long story.
If it were an API, I would send some termination request to it and asynchronously and gracefully terminate the application. But since the job is a console command, all I can do is send a SIGTERM signal to the process. I quite hate working with process signals (especially inside PHP), so I was trying to avoid this a bit.
Luckily, there is another option, which is related to the Kubernetes pod termination. Kubernetes offers to configure a PreStop hook which runs before the pod termination starts. Kubernetes even allows to run a CLI command (together with the obvious HTTP request option), which is great for my console application. It is worth noting that it does not behave exactly the same way as sending and reacting to a SIGTERM/SIGKILL signal to the main container process. The PreStop hook starts another command in the same container (using
docker exec). That means I end up with two unrelated processes — one is running the container and the other one is “terminating” the container.
This behavior is as designed and quite logical considering the fact that the PreStop hook is intended, for example, to save the state of the running pod. It has to run before the main process starts terminating. In my case, I end up having the
app:cleanup commands running simultaneously. This is sort of weird, but not a problem, fortunately. It is the price for abusing the hook.
The official Kubernetes docs explains, in a lot of detail, how PreStop hooks work. But the information on how to actually configure them seems to be missing :]. Luckily, someone else has already found out and also detailed how to configure the grace period. With that help, I ended up with the following template (abbreviated):
template: spec: containers: - name: job-runner image: my-image:my-tag resources: ... env: ... lifecycle: preStop: exec: command: ['php', '/code/bin/console', 'app:cleanup'] terminationGracePeriodSeconds: 60
Piece of cake and it works great. Except that it doesn’t log anything. This is even mentioned in the docs.
The logs for a Hook handler are not exposed in Pod events.
Which, for me, means that it does not work. At Keboola, we are generally quite obsessed with traceability, and this is no exception. The cleanup command removes some running processes and we absolutely have to know that this has happened. We trace all outputs of all pods in Papertrail, so I wanted to see in the pod output that the pod was terminated and what the cleanup command did before this happened.
Now comes the hard part that took me a bit of digging and experimenting with some dead ends. I was finally led down the right track when reading this largely unrelated post on StackOverflow.
When Kubernetes is running a container, it captures the pod outputs using a logging driver. That means that the process stdout is redirected to the socket of the logging driver. This can be easily verified — open a terminal within the pod and run
Let’s ignore the garbage now and concentrate on the important part which is the process with PID 1. It’s a shell command because I am using the shell form of the CMD in Dockerfile, but that’s not important. In either case this is the main container process and the one I’m interested in. Now let’s run
ls -l /proc/1/fd which lists the details of the main process file descriptors:
Indeed — both standard output and error output are redirected. When the cleanup process is run using the PreStop hook (using docker exec), its output is not redirected. That could be verified using the same commands.
The solution is “easy” — redirect the output of the cleanup process to the same pipe used by the main process. The pipe is unknown, so it’s easier to redirect the cleanup process output of the output of the process with PID 1. To test this, I can run the following inside the pod:
echo "test" > /proc/1/fd/1
Indeed, this shows
test in the logs of the main process and the container and the pod. The
/proc/1/fd/1 means that I am writing to the stdout of the process with PID 1. Redirecting to
/proc/1/fd/2 would write to stderr.
Now the only remaining task is force the PHP Symfony command to redirect its output. I’m also using Monolog configured so that every log output gets sent to standard output. So I was trying to avoid some hara-kiri interfering with the logger.
One option was to reconfigure the logger to open the
/proc/1/fd/1 file stream and write to it. This failed consistently — looks like PHP is unhappy that it’s already open.
So I resorted to doing the redirection outside of the application itself. But passing the redirection in the command doesn’t work because the value gets escaped:
lifecycle: preStop: exec: command: ['php', '/code/bin/console', 'app:cleanup', '1>&/proc/1/fd/1']
The shell form does not seem to work either, because then I’d have to escape the redirection:
lifecycle: preStop: exec: command: 'php /code/bin/console app:cleanup 1>&/proc/1/fd/1'
I got tired and used the most reliable solution — wrap the entrypoint in a shell script:
#!/usr/bin/env bash set -e php /code/bin/console app:cleanup 1>&/proc/1/fd/1
which can be run via the PreStop hook without problems:
lifecycle: preStop: exec: command: ['/code/cleanup.sh']
Now when the pod is deleted, Kubernetes fires the PreStop hook. This executes the
app:cleanup Symfony command, which can use the standard Monolog methods such as this one:
$logger->info(sprintf('Terminating process "%s".', $myProcess));
And that message appears in the log feed of the POD as if it were coming from the main process. This gives the perfect impression that when the pod terminates, the main process terminates gracefully.