Kubernetes Operators for beginners
Foreword
Ok so I know this isn’t new tech (as such) but it is something that seems to be gathering a lot of momentum in the K8s space right now, and looking around when I first started, there wasn’t a lot of “low level” documentation out there.
After a lot of head scratching, reading, more head scratching, YouTube videos, even more head scratching… I finally had my lightbulb eureka moment when it all made sense to me. So I thought I’d write this blog article in an attempt to save people the pain I went through.
What is an Operator?
Right, let’s start at the beginning… I’m guessing you’ve arrived at this page because:
- You’re working on a project that involves Kubernetes
- Someone has mentioned one of the many Kubernetes buzzwords (Controller, Operator, CRD, Group or Kind)
- Someone has randomly said “Oh wouldn’t it be great if Kubernetes could automate this thing for me?”
- You’ve read the material out there about operators & sat there like “WTF?!?!?!”
Well worry not, help is at hand!
We can think of an Operator as a bundled unit of logic that performs some complicated business task within Kubernetes.
Ok… great… but that’s very abstract…
Fact is we have probably all already dealt with Kubernetes Operators, we just didn’t know it:
kubectl get pods
At a very abstract level, the command above uses one of the default K8s “Operators”, the “Pod” one, and the operation we are performing is “get”. (Strictly speaking, Pods are a built-in resource looked after by Kubernetes’ own controllers rather than an Operator in the formal sense, but the idea is the same.)
And if we want to create or delete a pod, there are operations for those tasks too. We just need to know how and when to use them. We don’t need to know how Kubernetes creates pods, or where it goes to get the info about pods; that is business logic at the K8s level that we don’t care about. What we do care about is that once we have a pod and a service running on that pod, Kubernetes will monitor it and re-create it if it has issues and terminates.
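To make that “monitor it and re-create it” idea a little more concrete, here is a toy sketch of a reconcile loop in Python. It is purely illustrative: the `reconcile` function and the pod names are made up for this example, everything lives in memory, and no real Kubernetes API is involved.

```python
def reconcile(desired_replicas, running_pods):
    """One reconcile pass: compare desired state with observed state
    and return the pod list after closing the gap.

    Hypothetical helper for illustration only; not a Kubernetes API.
    """
    pods = list(running_pods)
    next_id = 0
    # Too few pods: create replacements until we reach the desired count.
    while len(pods) < desired_replicas:
        name = f"pod-{next_id}"
        if name not in pods:
            pods.append(name)
        next_id += 1
    # Too many pods: drop the surplus from the end of the list.
    return pods[:desired_replicas]


pods = reconcile(3, [])      # initial rollout creates three pods
pods.remove("pod-1")         # simulate one pod crashing
pods = reconcile(3, pods)    # the next pass notices the gap and fills it
print(sorted(pods))          # ['pod-0', 'pod-1', 'pod-2']
```

The point is only the shape of the logic: we declare how many pods we want, and something else keeps comparing that wish against reality and fixing the difference.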
But Kubernetes already does all that for me right??
Okay, so we have our definition, but now you’re sat there thinking to yourself “Kubernetes already does all this under the hood for me anyway, right? It automatically scales things and spins up new things when pods die without me doing anything. Why should I care?”
You would be correct, it does indeed do those things, and 9/10 times you can get by without even having to think of using operators. But where things do get interesting is when you have to start thinking about state!
Stateless is easy; Stateful is hard
Let’s take a very, very simple non-technical example:
You are sitting at a restaurant, your waiter comes over, takes your order and decides in their infinite wisdom not to write your order down and instead commit it to memory before walking off to give your order to the kitchen.
Q. What happens if that waiter should suddenly drop down dead on their way to the kitchen???
A. Well in a Kubernetes powered restaurant we don’t have to worry because the control plane will just spin up another waiter who will look exactly like the last one… but uh oh… wait… we have a new waiter but what happened to our order? It died along with the first waiter!
Now this is a very contrived example but… This is where I had my eureka moment:
Operators are essentially a way to provide Kubernetes with a more domain-specific way of performing complex tasks, rather than just its default “replace the thing that died” approach.
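Sticking with the restaurant, here is a small Python sketch of the difference between the default approach and an Operator-style one. All of the names (`Waiter`, `default_replace`, `operator_replace`, `order_book`) are invented for this illustration, and `order_book` is just a dictionary standing in for whatever durable storage a real application would use.

```python
class Waiter:
    """A 'pod' that keeps its state (the order) only in memory."""

    def __init__(self, table, order=None):
        self.table = table
        self.order = order


def default_replace(dead_waiter):
    """Default behaviour: an identical-looking waiter, but the order is lost."""
    return Waiter(dead_waiter.table)


def operator_replace(dead_waiter, order_book):
    """Operator-style behaviour: replace the waiter AND restore the order
    from a durable record (order_book is a hypothetical stand-in)."""
    return Waiter(dead_waiter.table, order=order_book.get(dead_waiter.table))


order_book = {4: "steak and chips"}               # the order, written down
first = Waiter(table=4, order=order_book[4])      # the waiter who won't make it

print(default_replace(first).order)               # None
print(operator_replace(first, order_book).order)  # steak and chips
```

Both functions give us a new waiter. Only the second one knows enough about the restaurant business to bring the order back too.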
DevOps means DevOps
Containers promised us a nicer, simpler way of rolling out, deploying and running our apps; and for the most part they succeeded in that goal. We can create a container with our web application, a container with our NGINX and a container with our Redis instance, throw them all together in a K8s service, deploy it via a nice YAML file and have something up and running in no time. If we lose our cache or our NGINX or our app, we are safe in the knowledge that it will come back very shortly and will scale etc. as needed.
But what about more complicated pieces of our application? What about our databases?
Well… we can of course create a container with our database image inside it ready to deploy that into Kubernetes, but how do we:
- Configure multiple db instances into a master-slave configuration?
- Set up our replication jobs?
- Schedule our backups?
Ok so this may be a little bit facetious but let’s take another approach. What about if we wanted to spin up some cloud infrastructure as part of our service? Something like an Azure ServiceBus?
- How would we even possibly containerise a service bus?
- How could we configure and scale a ServiceBus?
This is where the promise of containers solving all our DevOps problems really starts to show some cracks.
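The database questions above are exactly the kind of domain knowledge an Operator is meant to capture. As a rough sketch, assuming a made-up `DatabaseSpec` (what we want) and `DatabaseStatus` (what currently exists), the decision logic might look something like this in Python. None of these names come from a real Operator or library; they are only here to show the idea.

```python
from dataclasses import dataclass


@dataclass
class DatabaseSpec:
    """Hypothetical desired state a user might declare for a database."""
    replicas: int
    backup_schedule: str  # cron expression, e.g. "0 2 * * *"


@dataclass
class DatabaseStatus:
    """Hypothetical observed state of the database right now."""
    instances: int = 0
    replication_configured: bool = False
    backup_schedule: str = ""


def plan(spec, status):
    """Return the ordered list of actions needed to move status toward spec."""
    actions = []
    if status.instances < spec.replicas:
        actions.append(f"create {spec.replicas - status.instances} instance(s)")
    if spec.replicas > 1 and not status.replication_configured:
        actions.append("configure replication between instances")
    if status.backup_schedule != spec.backup_schedule:
        actions.append(f"schedule backups at '{spec.backup_schedule}'")
    return actions


wanted = DatabaseSpec(replicas=3, backup_schedule="0 2 * * *")
for action in plan(wanted, DatabaseStatus()):
    print(action)
```

Run against an empty status this lists all three jobs; run against a status that already matches the spec it returns an empty list, which is the “nothing to do” case.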
To Operate or not to Operate… that is the question
Now at this point I’d like to take a step back and admit that personally I still have mixed feelings about Operators, and there is one question that still seems to loom in my mind:
Just because I could write an Operator to automate and manage something; does that mean that I should?
Disclaimer time: I am a huge fan of Terraform. Honestly, I think it may be the best thing to happen to DevOps in a very long time. Infrastructure as code that lives and evolves, and that can be collaborated upon and reviewed, makes perfect sense to me.
I can spin up, modify and tear down a complex infrastructure project simply by running a script.
Now I could also convert this all into Operators: say I write an Operator for CosmosDB, an Operator for ServiceBus and so on. But that is a lot of maintenance overhead, whereas Terraform has a thriving community that allows me to leverage great battle-hardened providers to do that work for me.
On the other hand though, imagine a world where every Operator has already been written and works flawlessly with all the same support and configuration possibilities as Terraform… One .yaml file, one kubectl command, and there we have it, our hugely complex application architecture deployed and never given a second thought. Our on-call phone never rings, there are never any alerts about scale or capacity because we feel safe in the knowledge that Kubernetes is doing all the hard work for us:
… now wouldn’t that be a lovely DevOps world to live in?