Why Containerization Exists!!

How did containerization come into existence?

Basically, the operating systems we are used to, such as Windows, Linux, and macOS, are built with the sole purpose of serving an individual and their computational needs. They host multiple applications, which run multiple processes each, which contain multiple threads, which talk to multiple memory locations. The transistor circuitry picks up signals from those locations, processes them, makes to-and-fro trips between the processing chip and storage, and hands the resulting information over either to the display (where the user sees that the work is done) or to the Ethernet outputs via the computer's peripheral cabling.

But in cloud terms this is a luxury, and a lot of extravagance or wastage of resources (power, hardware, software and TIME). A cloud resource is designed to cater, in a dedicated fashion, to an application: say, an API endpoint that returns the temperature value it holds to the client that initiated the request, and that's it; but it must be able to do that for millions of users (multiple instances of the application being served at once). Most of the time no GUI is required, and computation power becomes the only real need (provided the maintenance of the resources is taken off your hands, or is assured as part of the pay-per-use deal signed by the user and the provider, which is exactly what the cloud offers!).

This being the purpose of cloud computation, it makes more sense to take the four fundamental and quintessential pillars of computational architecture (the file system, the network, the process tree, and memory management) in essential, precomputed chunks and call the result a container, where our actual piece of code runs and is served to the clients requesting it. This is Docker for us: grab all the required resources in precomputed quantities and create a runtime for the application. I won't even start on the serviceability offered to maintain a desired state and add resiliency to the cluster, the rolling updates and so on. That's a whole new story. Getting back to the main course of our buffet.
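To make the "precomputed chunks of the four pillars" idea a little more concrete, here is a minimal sketch of what a container runtime does under the hood on a Linux host: it asks the kernel for fresh namespaces (a new process tree, a new hostname view, a new mount view) before running the application. This is not Docker's actual code, just an illustrative Go program; the choice of /bin/sh and the particular namespace flags are assumptions, and it needs Linux and root privileges to run.

```go
// minimal_container.go: an illustrative sketch (not Docker itself) of how a
// container runtime carves out "chunks" of the host: a fresh process tree,
// a fresh hostname (UTS) namespace, and a fresh mount namespace.
// Assumes a Linux host; run as root: go run minimal_container.go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

func main() {
	fmt.Printf("parent pid: %d\n", os.Getpid())

	// Launch a shell in new UTS, PID and mount namespaces. Inside, the shell
	// sees itself as PID 1 and can change the hostname without touching the host.
	cmd := exec.Command("/bin/sh")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
	}
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr

	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "failed to start contained shell:", err)
		os.Exit(1)
	}
}
```

Real runtimes add cgroups on top of this to cap the "precomputed quantity" of CPU and memory, plus an isolated filesystem image; the namespaces above are only the skeleton.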

This surely reduced the amount of wasted resources, since applications were otherwise allocated far more than necessary (as on personal computers).
It boosts utilization, of course. Consider two applications, each running on an allocated 2 GiB of memory that can balloon further due to leaks, while actually consuming only about 180 MiB of the dedicated allocation. That drags utilization down to a mere (180 / 2048) × 100 ≈ 8.8%, roughly 9%, which is terribly low.
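As a back-of-the-envelope check on that figure (the 2 GiB allocation, the 180 MiB footprint, and the 256 MiB right-sized container below are all illustrative numbers, not measurements), here is the same arithmetic spelled out:

```go
package main

import "fmt"

func main() {
	const usedMiB = 180.0 // what each application actually touches (illustrative)

	// Personal-computer style: each app gets a generous 2 GiB slice to itself.
	vmAllocMiB := 2.0 * 1024
	fmt.Printf("over-allocated: %.1f%% utilization\n", usedMiB/vmAllocMiB*100) // ~8.8%

	// Containerized: the runtime hands out a right-sized chunk, say 256 MiB.
	containerAllocMiB := 256.0
	fmt.Printf("right-sized:    %.1f%% utilization\n", usedMiB/containerAllocMiB*100) // ~70%
}
```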

Therefore, containerization definitely carries the utilization factor up to a significantly higher value. What about hardware downsizing?

– me, myself and the other folks who think alike or are open to discussions leading to discoveries.

The hardware can shrink down to about the size of a degree college's maths textbook (A4 size), but not much smaller than that. Although the analogy I have chosen might not be the absolute limit, there is a limit: the shrinking clearly halts at some point and cannot continue in proportion to the resource-utilization improvements described above.

Therefore, since downsizing is not practically possible in hardware terms, as it would have to go against or beyond "Moore's law", some intelligent thinkers came up with the idea of calling it containerization on the cloud: they claim to provide dedicated computing resources to cloud subscribers, while in reality the container a cloud user gets lives on hardware that may be sharing its resources with another cloud user, who also thinks the container and its hardware are dedicated to them. This was viable because of two things (apart from all the zillion other factors in this universe):
1. We don’t care where our hardware sits as long as we are assured computation power or access to it.
2. Most importantly, the connectivity offered by the INTERNET, which makes all of it possible. Not to forget all the natural help we get in carrying our data packets over the network, haha.

On the flip side, imagine trying this at home: inviting a few people over to use and share your hardware resources, emulating containerization in person. It would be an intrusion into your own space, and you wouldn't buy into it.

Is that not the best of both worlds for the cloud providers (AWS, Azure, Google Cloud, etc.)? What they do would not make sense if done at home by us, but them doing it buys both sides more time to focus on other essential aspects that could bring more value.

– Me again with the same group

Thus came into existence the wrapper around containerization that extends containers across geographies by connecting them over the internet –
DOCKER wrapped in KUBERNETES.
Docker – No wings. Cannot fly. But can work right where it is.
Kubernetes (wraps Docker) – DockerNetes, or Kubes of Docker containers. Now the container can fly. This way, resource utilization, once lifted to the cloud, increases significantly and takes off.

Kubernetes – Like Redbull to Docker. Docker got wings!

– Someone in my head, said that.

Wait. I have an idea. Let's take a moment and ponder: is it an absolute 100%? What about the time that slips away while the dedicated cloud resource waits for a response that is still on its way, cooking somewhere, due to be sent back over the internet? It has no other app-process threads waiting in its scheduling queue. Although this is negligible at the level of an individual machine, when compounded over time and multiplied by the number of users being catered to, it blows up and leaves a significant mark. If we kept a closer watch on the processes that go into wait states (sometimes waiting on data from a hard drive a few centimetres to a few thousand miles away, sometimes waiting for a fellow server instance's response before processing can continue), we could multiplex those wait times into simple computational requests that fit within the time window before the current thread resumes. That would not just increase utilization; in effect it would squeeze more work out of the hardware than its nominal capacity suggests.

Here, we are not talking about scheduling the computational resources to other waiting application process-threads in the queue, but rather to other cloud users sitting at their machines anywhere in the internet-connected world.

Imagine a use case where your current thread is waiting on an API call to receive information from another server that is time zones away, and the response could take around 20 ms to return. Now imagine there is another process that would take under 20 ms to finish its computation if given the processor and the other necessary resources. But the currently waiting thread, once it resumes, has a compute-intensive task to complete before it can call the current transaction done and hand the processor over to the next contending task or process, and it needs the compute power dedicated to itself for at least 200 ms until then.

This makes the second thread wait in the queue for about 220 ms, instead of effectively 0 ms if it were given the processor for its sub-20 ms of work during the wait window, with the processor's context switch taking about 2-3 ms. BAMMMM!!! We see the spike in utilization already.
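Here is a small, hedged simulation of that exact scenario (the 20 ms, 200 ms and under-20 ms figures are the illustrative numbers from above, and sleeps stand in for the real API call and compute work): run serially, the second task sits in the queue for roughly 220 ms before it gets a turn; slotted into the API-call wait window, it finishes almost immediately.

```go
package main

import (
	"fmt"
	"time"
)

// Stand-ins for the work described above. Sleeps simulate the remote API call
// and the CPU-bound stretch; the durations are the illustrative figures from the text.
func apiCall()      { time.Sleep(20 * time.Millisecond) }  // response travelling time zones away
func heavyCompute() { time.Sleep(200 * time.Millisecond) } // the 200 ms compute-intensive finish
func smallTask()    { time.Sleep(15 * time.Millisecond) }  // the under-20 ms contender

func main() {
	// Serial: the small task waits behind the API call and the heavy compute.
	start := time.Now()
	apiCall()
	heavyCompute()
	smallTask()
	fmt.Printf("serial:      small task done after %v\n", time.Since(start)) // ~235 ms, i.e. ~220 ms of waiting

	// Multiplexed: the small task is slotted into the API-call wait window.
	start = time.Now()
	done := make(chan time.Duration, 1)
	go func() {
		smallTask()
		done <- time.Since(start)
	}()
	apiCall()
	heavyCompute()
	fmt.Printf("multiplexed: small task done after %v\n", <-done) // ~15 ms
}
```

In a real scheduler the gain is smaller, because the 2-3 ms context switches mentioned above and cache effects eat into the window, but the shape of the saving is the same.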

If such waiting gaps in dedicated processor time were utilized by a closer monitoring mechanism, without introducing any cross-cutting concerns, we could achieve close to 100% utilization of resources in real time. That would of course mean a reduced carbon footprint, economic rewards, and more work done in less time (saving TIME, the most precious thing we know of so far, the one thing that cannot be brought back once gone, no matter what you are ready to bet on it).

