The concept of namespace

A namespace is a mechanism the Linux kernel uses to isolate kernel resources. With namespaces, one group of processes sees only the resources associated with it, another group sees only its own resources, and neither group is aware that the other exists. This is implemented by assigning the relevant resources of one or more processes to the same namespace.

Linux namespaces wrap and isolate global system resources so that processes in different namespaces each appear to have their own independent copy of those resources. Changing a system resource inside one namespace affects only the processes in that namespace and has no effect on processes in other namespaces.

Uses of namespace

Perhaps most users are like me and only learned about Linux namespace technology after starting to use docker. In fact, one of the main goals of implementing namespaces in the Linux kernel was to support lightweight virtualization (container) services. Processes in the same namespace can perceive each other's changes but know nothing about processes outside it. This gives the processes in a container the illusion of running on a separate system, which is how isolation is achieved. In other words, the namespace technology provided by the Linux kernel laid the groundwork for the emergence and development of container technologies such as docker.
We can look at how to build a resource-isolated container from the point of view of a docker implementer. For example, could the chroot command be used to switch the root directory and thereby isolate the file system? To communicate and be located in a distributed environment, a container must have its own IP address, routes, ports and so on, which requires network isolation. The container also needs its own host name to identify itself on the network. Next come the isolation of inter-process communication, of user permissions, and so on. Finally, applications running in the container need process IDs (PIDs), so PIDs have to be isolated as well. These six kinds of isolation are the foundation of a container. Let's see what isolation capabilities the Linux kernel's namespace feature provides:

The first six namespaces in the table above are the isolation technologies needed to implement a container. The newest one, the Cgroup namespace, had not yet been adopted by docker when this was written. I expect the various container implementations to add support for the Cgroup namespace in the near future.

The history of namespace development

Some namespaces were implemented in very early versions of Linux; the mount namespace, for example, appeared in kernel 2.4. Support for most namespaces, such as IPC, Network, PID and UTS, was completed in kernel 2.6. Others are special cases: work on the User namespace started in kernel 2.6 but was only declared complete in kernel 3.8. As Linux itself and container technology keep evolving and bringing new requirements, new namespaces will be supported as well; the Cgroup namespace, for example, was added in kernel 4.6.

Linux provides several APIs for operating on namespaces: the clone(), setns() and unshare() functions. To state which namespaces are to be isolated, calls to these APIs usually specify one or more of the following flags: CLONE_NEWIPC, CLONE_NEWNET, CLONE_NEWNS, CLONE_NEWPID, CLONE_NEWUSER, CLONE_NEWUTS and CLONE_NEWCGROUP. To isolate several namespaces at once, combine the flags with | (bitwise OR). Namespaces can also be manipulated through some files under /proc. Let's take a brief look at how these interfaces are used.
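As a minimal sketch of combining the flags with bitwise OR, the helper below builds one mask from several CLONE_NEW* constants. The function names container_ns_flags() and has_ns_flag() are hypothetical, and the particular selection of flags is only an illustration, not what docker actually passes.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical helper: a flag set a simple container runtime might request.
 * The selection is illustrative only. */
int container_ns_flags(void)
{
    return CLONE_NEWNS    /* mount points */
         | CLONE_NEWUTS   /* host name    */
         | CLONE_NEWIPC   /* IPC objects  */
         | CLONE_NEWPID   /* process IDs  */
         | CLONE_NEWNET;  /* network      */
}

/* Returns 1 if every bit of `flag` is present in `mask`, 0 otherwise. */
int has_ns_flag(int mask, int flag)
{
    return (mask & flag) == flag;
}
```

The resulting mask is the kind of value that would be passed as the flags argument of clone() or unshare().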

Viewing the namespaces a process belongs to

Starting with kernel version 3.8, the /proc/[pid]/ns directory contains the namespace information of the process. Use the following command to view the namespaces of the current process:

$ ll /proc/$$/ns
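The entries that this command lists can also be read from a program. The sketch below is one possible way to do it with readlink(); the helper names read_ns_link() and parse_ns_inode() are made up for this example, and it assumes /proc is mounted.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads the target of /proc/self/ns/<ns> into buf as a NUL-terminated string.
 * Returns 0 on success, -1 on error. */
int read_ns_link(const char *ns, char *buf, size_t len)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/self/ns/%s", ns);
    if (n < 0 || (size_t)n >= sizeof path || len == 0)
        return -1;

    ssize_t got = readlink(path, buf, len - 1);  /* readlink() adds no '\0' */
    if (got < 0)
        return -1;
    buf[got] = '\0';
    return 0;
}

/* Extracts the number from a string such as "uts:[4026531838]".
 * Returns 0 and stores the number in *ino, or -1 if the format is wrong. */
int parse_ns_inode(const char *link, unsigned long *ino)
{
    const char *sep = strstr(link, ":[");
    if (sep == NULL)
        return -1;
    return sscanf(sep, ":[%lu]", ino) == 1 ? 0 : -1;
}
```

Calling read_ns_link("uts", buf, sizeof buf) and then parse_ns_inode(buf, &ino) yields the number shown in brackets for the uts entry.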

First, these namespace files are all symbolic links. The content of each link has the form xxx:[inode number], where xxx is the namespace type and the inode number identifies a namespace; we can think of it as the namespace ID. If the same namespace file of two processes points to the same link target, the corresponding resources of the two processes are in the same namespace.
Second, placing these links under /proc/[pid]/ns serves another purpose: once one of these files has been opened, the namespace continues to exist for as long as the file descriptor (fd) stays open, even if all processes in the namespace have exited, and later processes can still join it.
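Holding a namespace open this way could look like the following sketch. The names pin_namespace() and unpin_namespace() are hypothetical; the only real calls involved are open() and close().

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens /proc/<pid>/ns/<ns> read-only. While the returned descriptor stays
 * open, the namespace is kept alive. Returns the descriptor, or -1 on error. */
int pin_namespace(pid_t pid, const char *ns)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/%ld/ns/%s", (long)pid, ns);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Drops the reference taken by pin_namespace(). Returns 0 or -1. */
int unpin_namespace(int fd)
{
    return close(fd);
}
```

A process that wants a namespace to outlive its members would call pin_namespace() and keep the descriptor until no further process needs to join.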
Besides keeping the file open, we can also prevent a namespace from being deleted by bind-mounting the file. For example, we can mount the uts namespace of the current process on the file ~/uts:

$ touch ~/uts
$ sudo mount --bind /proc/$$/ns/uts ~/uts

Use the stat command to check the result:

Interestingly, the inode of ~/uts is the same as the inode number in the link: they are the same file.
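The same comparison can be made in C. The sketch below assumes the approach of comparing the device and inode numbers returned by stat(); the function name same_namespace_file() is hypothetical.

```c
#define _GNU_SOURCE
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if both paths refer to the same namespace, 0 if they do not,
 * and -1 if either path cannot be examined. */
int same_namespace_file(const char *a, const char *b)
{
    struct stat sa, sb;

    /* stat() follows the symbolic link, so st_ino is the namespace inode. */
    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

Passing /proc/[pid]/ns/uts and the path of the bind-mounted file should return 1 as long as the mount is in place, and the same function can compare the namespace files of two different processes.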

clone() function

We can use clone() to create namespaces at the same time as creating a new process. clone() is declared in the C library as follows:

/* Prototype for the glibc wrapper function */
#define _GNU_SOURCE
#include <sched.h>
int clone(int (*fn)(void *), void *child_stack, int flags, void *arg);

In fact, clone() is a wrapper function defined in the C library. It sets up the stack of the new process and invokes the clone() system call. clone() is really a more general form of the Linux fork() system call, and its flags argument controls which features are used. More than 20 flags beginning with CLONE_ control every aspect of the clone operation (for example, whether virtual memory is shared with the parent process). For now we only describe the 4 parameters as they relate to namespaces:

  • fn: The function that the new process executes. When this function returns, the child process terminates. The integer it returns is the exit code of the child process.
  • child_stack: The stack space for the child process, that is, the user-mode stack pointer that is assigned to the child's esp register. The calling process (the one that calls clone()) should always allocate a new stack for the child.
  • flags: Specifies which flags beginning with CLONE_ are used. Those relevant to namespaces are CLONE_NEWIPC, CLONE_NEWNET, CLONE_NEWNS, CLONE_NEWPID, CLONE_NEWUSER, CLONE_NEWUTS and CLONE_NEWCGROUP.
  • arg: The argument passed to the fn() function.

In later articles we will mainly use the clone() function to create and demonstrate the various types of namespace.
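To make the four parameters concrete, here is a minimal sketch that starts a child in a new UTS namespace and changes the host name there. The names child_main() and run_in_new_uts() and the 1 MiB stack size are assumptions made for this example, and the call typically needs root (CAP_SYS_ADMIN) to succeed.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)  /* assumed size of the child stack */

/* fn: runs in the child. Sets the host name inside the new UTS namespace and
 * reads it back. The return value becomes the child's exit code. */
static int child_main(void *arg)
{
    const char *name = arg;
    char buf[65];

    if (sethostname(name, strlen(name)) != 0)
        return 1;
    if (gethostname(buf, sizeof buf) != 0)
        return 1;
    buf[sizeof buf - 1] = '\0';
    return strcmp(buf, name) == 0 ? 0 : 1;
}

/* Creates a child in a new UTS namespace and waits for it.
 * Returns the child's exit code, or -1 if clone() or waitpid() failed
 * (for example with EPERM when the caller lacks CAP_SYS_ADMIN). */
int run_in_new_uts(const char *hostname)
{
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward on most architectures, so pass its top. */
    pid_t pid = clone(child_main, stack + STACK_SIZE,
                      CLONE_NEWUTS | SIGCHLD, (void *)hostname);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);
    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the child lives in its own UTS namespace, the host name seen by the parent is unchanged after run_in_new_uts() returns.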

setns() function

The setns() function adds the current process to an existing namespace. setns() is declared in the C library as follows:

#define _GNU_SOURCE
#include <sched.h>
int setns(int fd, int nstype);

Like clone(), the setns() function in the C library is a wrapper around the setns() system call:

  • fd: The file descriptor of the namespace to join. It refers to a file in the /proc/[pid]/ns directory and can be obtained either by opening one of the links in that directory directly or by opening a file on which one of those links has been mounted.
  • nstype: Lets the caller check whether the namespace that fd refers to is of the expected type. Setting this parameter to 0 means no check is made.

As mentioned earlier, a namespace can be kept alive by mounting it. The purpose of keeping a namespace is to prepare for adding processes to it later. In docker, running a new command in an already running container with docker exec relies on the setns() function. To make use of the newly joined namespace, the execve() family of functions is also needed (the author introduced them in the article 《Linux Create subprocesses to perform tasks》; interested readers can have a look). These functions execute the user's command; a common usage is to call /bin/bash with arguments to run a shell.
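A minimal sketch of joining a namespace with setns() is shown below. The function name join_namespace() is hypothetical, the path layout follows the /proc/[pid]/ns directory described above, and the call usually requires CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Moves the calling process into the <ns> namespace of process <pid>.
 * nstype is a CLONE_NEW* constant such as CLONE_NEWUTS, or 0 to skip the
 * type check. Returns 0 on success, -1 on error. */
int join_namespace(pid_t pid, const char *ns, int nstype)
{
    char path[64];
    int n = snprintf(path, sizeof path, "/proc/%ld/ns/%s", (long)pid, ns);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;

    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd == -1)
        return -1;

    int rc = setns(fd, nstype);
    int saved = errno;   /* keep the setns() error across close() */
    close(fd);           /* the descriptor is no longer needed after joining */
    errno = saved;
    return rc;
}
```

After a successful join_namespace(pid, "uts", CLONE_NEWUTS), a caller would typically run the user's command with one of the execve() functions, for example to start /bin/bash inside the joined namespace.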

unshare() function and unshare command

The unshare() function isolates namespaces for the calling process itself; that is, it creates new namespaces and moves the caller into them. unshare() is declared in the C library as follows:

#define _GNU_SOURCE
#include <sched.h>
int unshare(int flags);

Like the previous two functions, the unshare() function in the C library is a wrapper around the unshare() system call. Its main purpose is to isolate resources without starting a new process, which amounts to stepping out of the original namespace to operate.
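As a minimal sketch of this, the function below detaches the calling process from its UTS namespace and then sets a host name that only it sees. The name isolate_hostname() is hypothetical, and the call usually fails with EPERM unless the caller has CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>
#include <unistd.h>

/* Moves the caller into a new UTS namespace, then sets a private host name.
 * No new process is created. Returns 0 on success, -1 on error. */
int isolate_hostname(const char *name)
{
    if (unshare(CLONE_NEWUTS) != 0)
        return -1;
    return sethostname(name, strlen(name));
}
```

Unlike the clone() approach, the change applies to the calling process itself, so a later gethostname() in the same process returns the new name while the rest of the system keeps the old one.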

The system also ships a command named unshare by default, which actually calls the unshare() system call. The following demo uses the unshare command to make the current process's user root inside a new user namespace:

Summary

Namespaces are a feature provided by the Linux kernel, born for virtualization. With the rise of docker and container technology, namespace technology, which had long worked quietly behind the scenes, has come into the spotlight. The author hopes to deepen his understanding of container technology by studying namespaces, so the following articles will go through namespaces bit by bit. I hope to make progress together with fellow readers.