
Docker series: how things actually work from the inside

If you landed on this page, you are probably interested in what Docker is and how it works. I’m going to explain how it works internally with some examples, so we don’t get lost in too many theoretical discussions.

Non-isolated processes:

Every program running on a machine is composed of one or more processes. Imagine that one of these processes is running malware. This process can badly affect the other, normal processes. It can do the following:

Overuse the available resources of the machine (for example memory, network bandwidth, or CPU)
Corrupt the files that other processes use, or even change their code files

For these security reasons, there’s a need to isolate the resources of each process running on the machine. But is it only a security requirement?

The answer is no. Most running processes are normal applications, say a Java or a Python application. These applications have a set of requirements that must be available in the process environment before the application runs. You may even have two apps that need different versions of the same requirement. So the question pops up: how can we isolate processes so that each one runs within its own environment, separate from the others? Should we use virtual machines? No, running a VM for each process would be too heavy. The solution is what Docker calls containers, which are built on an underlying Linux technology called namespaces.

What are Containers?

A container is a lightweight and isolated execution environment that packages software and its dependencies together. Containers provide a consistent and portable way to run applications across different computing environments, such as development machines, servers, or cloud platforms. These containers are built on a Linux kernel feature called namespaces. This means a Linux kernel is required to get this type of isolation.

What are Namespaces?

Wikipedia

Namespaces are a feature of the Linux kernel that partitions kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources. The feature works by having the same namespace for a set of resources and processes, but those namespaces refer to distinct resources. Resources may exist in multiple spaces. Examples of such resources are process IDs, host-names, user IDs, file names, some names associated with network access, and Inter-process communication.

Imagine that you have some memory, CPU, and network bandwidth, and you want to partition all of them between the running processes so that none of them can interfere with the resources of the others or, in some cases, even know that another process exists ;)
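To make this a bit more concrete, here is a minimal Python sketch that lists the namespaces the current process belongs to. It assumes a Linux machine with /proc mounted, where the kernel exposes one symbolic link per namespace under /proc/<pid>/ns. The function name list_namespaces is my own choice for illustration and not part of any library.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_name: identifier} for a process.

    Assumption: we are on Linux and /proc is mounted. On any other
    system (or for a PID that does not exist) an empty dict is returned.
    """
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for name in sorted(os.listdir(ns_dir)):
        try:
            # Each entry is a symlink whose target names the namespace
            # type and an inode number that identifies the namespace.
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry not readable for this process; skip it.
            continue
    return result


if __name__ == "__main__":
    for name, ident in list_namespaces().items():
        print(f"{name:20} {ident}")
```

Running it inside a container and directly on the host should show different identifiers for the namespaces that the container runtime isolated.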

Linux Namespaces:

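pid (isolate process IDs)
cgroup (isolate the view of the control group hierarchy)
mnt (isolate mount points, and so the files a process can see)
net (isolate network interfaces)
ipc (isolate inter-process communication resources)
time (isolate the boot and monotonic clocks)
user (isolate user and group IDs)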

These are the most common and well-known namespaces available on Linux distributions that support namespaces. We will go through each of them, with examples, to explain how they work.
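Before diving into each one, here is a rough sketch of how you could check which namespaces two processes share. It relies on the fact that two processes are in the same namespace when their /proc/<pid>/ns/<name> entries resolve to the same device and inode numbers. The helper names namespace_ids and shared_namespaces are hypothetical, and the sketch assumes Linux plus enough permissions to inspect both processes.

```python
import os


def namespace_ids(pid):
    """Map namespace name -> (device, inode) for one process.

    Returns an empty dict when /proc/<pid>/ns cannot be read
    (not Linux, no such process, or not enough permissions).
    """
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    try:
        names = os.listdir(ns_dir)
    except OSError:
        return ids
    for name in names:
        try:
            st = os.stat(os.path.join(ns_dir, name))
        except OSError:
            continue
        ids[name] = (st.st_dev, st.st_ino)
    return ids


def shared_namespaces(pid_a, pid_b):
    """Return the sorted names of the namespaces both processes are in."""
    a = namespace_ids(pid_a)
    b = namespace_ids(pid_b)
    return sorted(name for name in a if name in b and a[name] == b[name])


if __name__ == "__main__":
    me = os.getpid()
    # A process trivially shares every namespace with itself.
    print(shared_namespaces(me, me))
```

Comparing a process on the host with one running inside a container would typically return a shorter list, since the container gets its own instance of most of these namespaces.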
