SIRAJ CHAUDHARY: Docker Swarm

Docker Swarm is Docker's native orchestration tool. It turns a pool of Docker hosts (machines running Docker) into a single coordinated cluster, so you can deploy and manage containerized services (applications) across many hosts in a simple, reliable way. It provides features such as scaling, load balancing, and service discovery for your applications (Docker containers).

Key concepts

Swarm Cluster: A group of Docker hosts (machines) that work together in a coordinated way. These machines can be physical servers or virtual machines, and each one runs Docker. Once grouped, these machines can be managed as a single system.

Nodes: These are the individual Docker engines that participate in the Swarm. There are two types:

Manager Nodes: Responsible for managing the Swarm, assigning tasks to worker nodes, and maintaining the desired state of the cluster.
Worker Nodes: Execute tasks (run containers) that are assigned by the manager nodes. Worker nodes don't make scheduling decisions. By default, manager nodes also run tasks.

Services: A service is the definition of what should run in the Swarm: which image to use, which ports to publish, and how many replicas (copies) to keep running. For example, if you want to run a web application, you can define a service with a specific image and tell Swarm how many replicas of that service you want running.

Tasks: A task is a single running container instance of a service, and it is the unit that Swarm schedules. Each task is assigned to a node in the Swarm.

Overlay Networks: Docker Swarm uses overlay networks, which span the nodes of the cluster and allow containers running on different hosts to communicate with each other as if they were on the same network.

Load Balancing: Swarm's ingress routing mesh accepts requests for a published port on any node and balances them across the running containers of the service, even if the node that received the request runs none of them.

Scaling: You can scale services up or down with a single command. For example, if you want to run 10 replicas of your web server, Docker Swarm can easily distribute these replicas across the nodes in the cluster.

Service Discovery: Swarm provides automatic service discovery, so containers can find each other by service name.

State Management: Swarm ensures that the desired state of the services is maintained, including handling failures and re-scheduling tasks as necessary.
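As a rough illustration of how overlay networks and service discovery fit together, the sketch below creates a user-defined overlay network and attaches two services to it. The names app-net, web and cache, and the published port 8080, are made up for this example. It assumes a swarm is already initialized and that the function is called on a manager node.

```shell
# Sketch only: the network name, service names and port are hypothetical.
create_demo_network() {
  # A user-defined overlay network that spans the swarm nodes.
  docker network create --driver overlay app-net

  # Both services join the same overlay network.
  docker service create --name cache --network app-net redis
  docker service create --name web --network app-net \
    --publish published=8080,target=80 nginx
}

# Usage (on a manager node):
#   create_demo_network
# Containers of "web" can then reach the other service by its name, e.g. cache:6379.
```

Using the service name instead of an IP address means the services keep finding each other when tasks are rescheduled onto other nodes.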

To get started with Docker Swarm, you would initialize a Swarm on a manager node using docker swarm init, then add worker nodes using docker swarm join. You can then deploy services using docker service create and manage them with various Docker CLI commands.

Example-1: Setting up a docker swarm cluster (e.g. 3 AWS EC2 nodes) and deploying a service (e.g. nginx docker container)

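Step1: We create three AWS EC2 nodes for Docker Swarm; one will be the manager node and the other two will be worker nodes. Make sure each node's security group allows the ports Swarm needs between the nodes: TCP 2377 (cluster management), TCP and UDP 7946 (node-to-node communication), and UDP 4789 (overlay network traffic). Also allow port 80 if you want to reach the nginx service from outside.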
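If you prefer the AWS CLI over the console, the security group rules can be sketched roughly as below. Besides TCP 2377, Swarm nodes talk to each other on TCP/UDP 7946 and UDP 4789. This is only a sketch: it assumes all three nodes share one security group, that the AWS CLI is installed and configured, and the group ID in the usage line is a placeholder.

```shell
# Ports Swarm uses between nodes, one "protocol port" pair per line.
swarm_port_rules() {
  printf '%s\n' "tcp 2377" "tcp 7946" "udp 7946" "udp 4789"
}

# Allow the Swarm ports between members of the same security group.
# Assumption: every swarm node is attached to the group passed as $1.
open_swarm_ports() {
  sg_id="$1"
  swarm_port_rules | while read -r proto port; do
    aws ec2 authorize-security-group-ingress \
      --group-id "$sg_id" \
      --protocol "$proto" \
      --port "$port" \
      --source-group "$sg_id" < /dev/null
  done
}

# Usage (placeholder group ID):
#   open_swarm_ports <SECURITY-GROUP-ID>
```

Restricting the source to the group itself keeps the Swarm ports closed to the public internet.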

Step2: Initialize docker swarm cluster on your manager node

docker swarm init

This command prints a ready-to-use docker swarm join command that contains the swarm join token. Save it; you'll need it to add worker nodes.
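On a host with more than one network interface, docker swarm init may ask you to choose the address to advertise, and you may also want to look up the token again later. The sketch below shows one way to handle both; the function names are made up for this example, and the IP address in the usage lines is a placeholder for the manager's private IP.

```shell
# Initialize the swarm and advertise an explicit address to the other nodes.
init_swarm() {
  docker swarm init --advertise-addr "$1"
}

# Rebuild the worker join command at any time (run on a manager node).
worker_join_command() {
  token="$(docker swarm join-token -q worker)"   # -q prints only the token
  echo "docker swarm join --token $token $1:2377"
}

# Usage:
#   init_swarm <MANAGER-IP>
#   worker_join_command <MANAGER-IP>
```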

Step3: Add worker nodes to the swarm cluster. You can also add more than one manager node to the swarm cluster.

To add a worker node to this swarm, run the following command on the worker node

docker swarm join --token <SWARM_JOIN_TOKEN> <MANAGER-IP>:2377

👉 To add a manager node to this swarm, run 'docker swarm join-token manager' on an existing manager node. It prints a swarm join command; run that command on the node you want to add as a manager.

Step4: List all nodes in the swarm cluster; run this command on the manager node. It shows details of the swarm nodes, such as which node is the manager (leader) and which are worker nodes

docker node ls

Step5: Deploy a service (e.g. nginx) to the swarm cluster, run this command on manager node.

docker service create --name my-nginx -p 80:80 nginx

Note: With no --replicas flag, the service starts with a single replica. The manager node schedules it on one of the nodes in the swarm cluster, which may be the manager itself.
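If you want several replicas from the start, the replica count can be given when the service is created. The helper below is a small sketch; the function name, the service name and the port in the usage line are made up for this example.

```shell
# Create an nginx service with a chosen name, replica count and published port.
create_web_service() {
  name="$1"
  replicas="$2"
  port="$3"
  docker service create \
    --name "$name" \
    --replicas "$replicas" \
    --publish "published=${port},target=80" \
    nginx
}

# Usage (on a manager node):
#   create_web_service my-nginx-demo 3 8080
```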

Step6: Verify the service (e.g. nginx) is running and the replicas are spread across nodes, run this command on manager node

docker service ls

To get more detailed information, including the status of each replica

docker service ps my-nginx
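In scripts it can be handy to wait until all replicas are actually running. The sketch below polls docker service ls and compares the two numbers in the REPLICAS column (for example 2/2). It assumes a replicated service whose REPLICAS value is a plain running/desired pair; the function names, the retry count and the 2-second pause are arbitrary choices for this example.

```shell
# Succeeds when a "running/desired" string such as "3/3" has converged.
replicas_converged() {
  case "$1" in
    */*) ;;            # looks like running/desired
    *) return 1 ;;     # empty or unexpected value
  esac
  running="${1%%/*}"
  desired="${1##*/}"
  [ "$running" = "$desired" ] && [ "$desired" != "0" ]
}

# Poll the service until its replicas are running, or give up after 30 tries.
wait_for_service() {
  name="$1"
  attempts=30
  while [ "$attempts" -gt 0 ]; do
    state="$(docker service ls --filter "name=$name" --format '{{.Replicas}}' | head -n 1)"
    if replicas_converged "$state"; then
      echo "$name is ready ($state)"
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  echo "$name did not converge" >&2
  return 1
}

# Usage (on a manager node):
#   wait_for_service my-nginx
```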

Step7: Scale the service, run this command on manager node

You can scale the number of replicas up or down

docker service scale my-nginx=2

👉 To check which nodes of the swarm cluster run these two replicas, run the following command on the manager node. It shows the state of each replica, the name of the node it is assigned to, and its status.

docker service ps <SERVICE-NAME>

docker service ps my-nginx

Note: Let's say we scale a service to 5 replicas (containers) while the swarm cluster has only three nodes (VMs). The scheduler then spreads the replicas as evenly as it can, for example two containers (of the nginx service) on each of two nodes and one container on the third. By default the manager node also runs containers. Try it,

docker service scale my-nginx=5

docker service ps my-nginx

docker ps (run this command on each node to check the number of running containers)
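Instead of logging in to every node, the spread of replicas can also be counted from the manager. The sketch below groups the running tasks of a service by node name; the function name is made up for this example.

```shell
# Count the running tasks of a service per node (run on a manager node).
tasks_per_node() {
  docker service ps --filter desired-state=running --format '{{.Node}}' "$1" \
    | sort | uniq -c
}

# Usage:
#   tasks_per_node my-nginx
```

Each output line is a task count followed by the node's hostname.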

Step8: List services (containers) running on a specific node, run this command on worker and manager nodes

docker ps

Step9: Update the service, run this command on manager node

To update the service (e.g., change the image version)

docker service update --image nginx:latest my-nginx
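By default Swarm replaces the tasks of a service one after another. If you want to control the pace of a rolling update, or undo it, something like the sketch below may help. The function names, the 10-second delay and the image tag in the usage line are choices made for this example.

```shell
# Roll out a new image one task at a time, pausing between tasks.
rolling_update() {
  service="$1"
  image="$2"
  docker service update \
    --update-parallelism 1 \
    --update-delay 10s \
    --image "$image" \
    "$service"
}

# Return the service to its previous definition.
rollback_service() {
  docker service update --rollback "$1"
}

# Usage (on a manager node):
#   rolling_update my-nginx nginx:stable
#   rollback_service my-nginx
```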

Step10: Promote a worker node to a manager node

Once the new node is added as a worker, you can promote it to a manager from the existing manager node using the following command,

docker node promote <NODE-ID>

To get the NODE-ID, run 'docker node ls'.

You can always demote a manager node back to a worker node using the following command,

docker node demote <NODE-ID>

Step11: Set node's availability (Drain, Pause, or Active)

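A manager node can change the availability of any node. For example, you can set a worker node to drain: no new tasks will be scheduled on it, and its running tasks are shut down and rescheduled on other active nodes. With pause, no new tasks are scheduled but the existing tasks keep running.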

docker node update --availability drain <NODE-ID>

Similarly, you can switch the node back to active

docker node update --availability active <NODE-ID>
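For a maintenance window it is useful to confirm what state a node is in and what is still assigned to it. The sketch below wraps the commands above; the function names are made up for this example.

```shell
# Print the current availability of a node: active, pause or drain.
node_availability() {
  docker node inspect --format '{{ .Spec.Availability }}' "$1"
}

# Drain a node, then list the tasks that are (or were) assigned to it.
drain_for_maintenance() {
  node="$1"
  docker node update --availability drain "$node"
  docker node ps "$node"
}

# Usage (on a manager node):
#   drain_for_maintenance <NODE-ID>
#   node_availability <NODE-ID>
```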

Step12: Check all tasks (containers) running on a specific worker node, from a manager node

docker node ps <NODE-ID>

Step13: Inspect a particular node

docker node inspect --pretty <NODE-ID>

Inspect a particular service

docker service inspect --pretty <SERVICE_NAME>

Get information of your docker swarm cluster

docker info
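docker info prints a lot of output. If you only need the swarm part, for example in a script, a Go template can pick out single fields. The sketch below is one way to do that; the function names are made up for this example.

```shell
# Print the swarm state of the local engine, e.g. "active" or "inactive".
swarm_state() {
  docker info --format '{{.Swarm.LocalNodeState}}'
}

# Succeeds if the local engine is a swarm manager.
is_manager() {
  [ "$(docker info --format '{{.Swarm.ControlAvailable}}')" = "true" ]
}

# Usage:
#   swarm_state
#   if is_manager; then echo "this node is a manager"; fi
```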

Step14: Removing the service from swarm nodes, run this command on manager node

docker service rm my-nginx

Step15: Remove a worker node from the swarm, run this command on the worker node that should leave

docker swarm leave

After both worker nodes have left, 'docker node ls' on the manager node shows their status as Down.

A manager node can then remove the worker node from its node list using the following command. Before removing it, ensure that the worker node has left the swarm (docker swarm leave).

docker node rm <NODE-ID>

Step16: Remove a manager node from the swarm, run this command on the manager node that should leave

docker swarm leave --force

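If the manager node that leaves is the leader, the remaining manager nodes elect a new leader, provided a majority of managers is still available. The safer route is to demote the manager to a worker first (docker node demote) and then let it leave. If it is the only manager node in the swarm cluster, the swarm cluster effectively ceases to exist with it.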
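To avoid removing the last manager by accident, a script can count the managers first. The sketch below refuses to demote a node if it is the only manager; the function names are made up for this example, and it is meant to run on a manager node.

```shell
# Number of manager nodes currently in the swarm.
manager_count() {
  docker node ls --filter role=manager -q | wc -l | tr -d ' '
}

# Demote a manager only if at least one other manager remains.
demote_manager_safely() {
  node="$1"
  if [ "$(manager_count)" -le 1 ]; then
    echo "refusing: $node is the only manager" >&2
    return 1
  fi
  docker node demote "$node"
  # Next steps, done by hand:
  #   on the demoted node:       docker swarm leave
  #   on a remaining manager:    docker node rm <NODE-ID>
}

# Usage:
#   demote_manager_safely <NODE-ID>
```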

Step17: Get help for any command or sub-command

docker <COMMAND> --help

e.g.

docker node --help

docker node update --help

Example-2: Deploy a stack (multiple services) using docker compose on docker swarm

👉 In this basic example we will see how to use docker compose tool for docker swarm cluster.

mkdir my_workspace

cd my_workspace

Step1: Initialize docker swarm cluster on your manager node

docker swarm init

Step2: Add worker nodes to swarm cluster.

docker swarm join --token <SWARM_JOIN_TOKEN> <MANAGER-IP>:2377

Step3: Create a docker-compose.yml file

version: '3'
services:
  web:
    image: nginx
    ports:
      - "80:80"
  mydb:
    image: postgres
    environment:
      POSTGRES_PASSWORD: test123

Step4: Deploy a stack (multiple services) using a docker-compose.yml file in swarm cluster

docker stack deploy --compose-file <FILE> <STACK-NAME>
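For a concrete run, the sketch below writes a compose file and deploys it as a stack. It uses the same two services as above and adds a deploy section with two replicas for web, which is the swarm-specific part of a compose file. The function names, the replica count and the stack name my_stack are choices made for this example.

```shell
# Write a compose file for the stack to the given path.
write_compose_file() {
  cat > "$1" <<'EOF'
version: '3'
services:
  web:
    image: nginx
    ports:
      - "80:80"
    deploy:
      replicas: 2
  mydb:
    image: postgres
    environment:
      POSTGRES_PASSWORD: test123
EOF
}

# Deploy (or update) a stack from a compose file; run on a manager node.
deploy_stack() {
  docker stack deploy --compose-file "$1" "$2"
}

# Usage:
#   write_compose_file docker-compose.yml
#   deploy_stack docker-compose.yml my_stack
```

Running deploy_stack again after editing the file updates the existing stack in place.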

Step5: List all stacks deployed in the swarm

docker stack ls

Step6: List all services of a stack

docker stack services <STACK_NAME>

Step7: Detailed information about a stack

E.g. which service of the stack is deployed to which node of the swarm cluster.

docker stack ps <STACK-NAME>

Step8: Remove a stack and its services

docker stack rm <STACK-NAME>

Docker Swarm uses the Raft consensus algorithm among its manager nodes to manage the global state of a cluster:

Leader node: The Raft algorithm chooses a leader node to manage the swarm and orchestrate tasks.
Leader election: If the leader node fails, the Raft algorithm elects a new leader node.
Cluster state: The Raft algorithm ensures that all manager nodes have the same state.
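Fault tolerance: With N manager nodes, the Raft algorithm can tolerate the loss of up to (N-1)/2 of them (rounded down), because a majority (quorum) of managers must stay available. For example, 3 managers tolerate 1 failure and 5 managers tolerate 2.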
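Node roles: In Raft, each manager node is a leader, a follower, or a candidate. A follower that stops hearing from the leader becomes a candidate and asks the other managers for their votes; the candidate that collects a majority of votes becomes the new leader.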
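The quorum arithmetic is easy to check with a few lines of shell. The sketch below computes, for N manager nodes, the majority needed (N/2 + 1, using integer division) and the number of manager failures that can be tolerated ((N-1)/2). The function names are made up for this example.

```shell
# Majority of managers needed for the swarm to keep working.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of managers that can fail while a majority remains.
fault_tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

for n in 1 3 4 5 7; do
  echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
# prints:
#   managers=1 quorum=1 tolerated_failures=0
#   managers=3 quorum=2 tolerated_failures=1
#   managers=4 quorum=3 tolerated_failures=1
#   managers=5 quorum=3 tolerated_failures=2
#   managers=7 quorum=4 tolerated_failures=3
```

Note that 4 managers tolerate no more failures than 3, which is why an odd number of manager nodes is the usual choice.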