How we build and operate the Keboola data platform
Ondřej Popelka 3 min read

How Docker Changed Our World

If you follow Keboola, you have probably noticed that we now do a lot of things in Docker. But it was not always this way. We now have 248 components in Keboola Connection (including some testing garbage); of those, 177 run in Docker containers and 65 are developed by 3rd party developers. Not too long ago the picture was completely different.

At that time, we had about 70 components in Keboola Connection (KBC) and we were developing our own framework, Syrup, to make component development easier. Since we did most of our stuff in PHP, this framework was also written in PHP and based on Symfony. It simplified component development so much that we only had to do 10 steps to get started.

It was more like 10 steps, each with 10 sub-steps. And then you were almost done: just set up the MySQL database, Elastic and, yes, an SQS queue, oh, and credentials to S3 (for uploading logs), oh, and the mailer, and, yep, I almost forgot, access to the shared project for uploading exception stack traces, and yes, there was also a Bitly account, which reminds me that you need to create a record in the service database for your queue. Or you could skip all that and create an SSH tunnel to our development server, where hopefully everything was already set up (unless someone had removed the queue record in the service database or something like that).

When development of the Syrup framework started, it was meant to simplify our own component development. We secretly hoped that it could also be used by 3rd party developers, but guess how many 3rd party components we had in Syrup?

Yep, that’s right. It took a day-long workshop to get through the above mess. That may sound awful, but it was not really that bad: once you had everything set up, making a new component was a piece of cake. Unless of course you had a really bad day:

Then in December 2014 we started adopting Docker (version 1.3 at that time) and throughout 2015 we built some components in it. Then, on the 21st of July 2015, Radek Tomasek (the greatest and most modest developer) came along as our first 3rd party developer and our guinea pig. That was when we realized that our dream had come true: getting a 3rd party developer familiar with KBC to create their own component became a matter of hours and required no setup except Docker. Today we feel that the main blocker is that 3rd party developers are not familiar with KBC and don’t understand some of its concepts, so they don’t know what to do. (We know that the KBC UX is a PITA and we are working on it!)

The dockerization also brought in other languages, so now some components are in PHP5, some in PHP7, and others in Python and R. Some experiments are even written in Bash :) And then there are some 3rd party components where we don’t even know what they are written in (the images are private). This is great because we don’t have to use only PHP for creating components, not that there is anything wrong with it. Yeah, and no one can say anymore “Oh, you’re those suckers doing ETL in PHP? Good luck with that, you morons”.

So from the articles on this blog it might seem that all we know is Docker. We’re not Docker guys in disguise (except Vlado, who has Docker tattooed on his butt); it is just that Docker is really huge for us. Of course nothing is just unicorns and rainbows, so we have to admit that sometimes Docker gives us fucking hard times.

So that was the big game changer of the last year and a half. Over the last year, we rewrote most of our components as Docker containers. This leaves us with only about 10 components running in the Syrup framework. Debugging those is still a huge PITA, but that’s another story.
