Docker Swarm is Docker's native orchestration tool. It turns a pool of Docker hosts (machines running Docker) into a single coordinated cluster, so you can deploy and manage containerized services (applications) across many hosts in a simple, reliable way. It provides features such as scaling, load balancing, and service discovery for the containers that make up your applications.
Key concepts
- Swarm Cluster: A group of Docker hosts (machines) that work together in a coordinated way. These machines can be physical servers or virtual machines, and each one runs Docker. Once grouped, these machines can be managed as a single system.
- Nodes: These are the individual Docker engines that participate in the Swarm. There are two types:
- Manager Nodes: Responsible for managing the Swarm, assigning tasks to worker nodes, and maintaining the desired state of the cluster.
- Worker Nodes: Execute the tasks (run the containers) that the manager nodes assign to them. Worker nodes don’t make scheduling decisions.
- Services: A service is the definition of what should run in the Swarm: which image to use, which ports to publish, and how many replicas (copies) to keep running. For example, to run a web application you define a service with a specific image and tell Swarm how many replicas of it you want.
- Tasks: A task is the running container instance of a service. Each task is assigned to a node in the Swarm.
- Overlay Networks: Docker Swarm uses overlay networks that span the hosts in the cluster, allowing containers running on different hosts to communicate with each other as if they were on the same network.
- Load Balancing: Swarm automatically balances incoming requests across the available containers in the service, ensuring even distribution of workloads.
- Scaling: You can scale services up or down with a single command. For example, if you want to run 10 replicas of your web server, Docker Swarm can easily distribute these replicas across the nodes in the cluster.
- Service Discovery: Swarm provides automatic service discovery, so containers can find each other by service name.
- State Management: Swarm ensures that the desired state of the services is maintained, including handling failures and re-scheduling tasks as necessary.
To get started with Docker Swarm, you initialize a swarm on a manager node with 'docker swarm init', then add worker nodes with 'docker swarm join'. You can then deploy services with 'docker service create' and manage them with the other Docker CLI commands.
Example-1: Setting up a docker swarm cluster (e.g. 3 AWS EC2 nodes) and deploying a service (e.g. nginx docker container)
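Step1: Create three AWS EC2 instances for the swarm: one will be the manager node and the other two will be worker nodes. In each node's security group, allow inbound TCP traffic on port 2377 (cluster management) from the other nodes, along with TCP/UDP 7946 and UDP 4789, which Swarm uses for node-to-node communication and overlay network traffic.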
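The security group rules from Step1 can be scripted. The sketch below only prints the AWS CLI commands rather than running them, so you can review them first. The function name print_swarm_sg_rules and the security group ID are placeholders, and it assumes all three nodes share one security group.

```shell
#!/bin/sh
# Print (do not run) AWS CLI commands that open the Swarm ports between
# instances that share one security group.
# print_swarm_sg_rules and the group ID passed to it are illustrative names.
print_swarm_sg_rules() (
  sg_id="$1"
  # 2377/tcp: cluster management, 7946/tcp+udp: node communication,
  # 4789/udp: overlay network traffic
  for rule in tcp:2377 tcp:7946 udp:7946 udp:4789; do
    proto="${rule%%:*}"
    port="${rule##*:}"
    echo "aws ec2 authorize-security-group-ingress --group-id $sg_id --protocol $proto --port $port --source-group $sg_id"
  done
)

# Example: print_swarm_sg_rules sg-0123456789abcdef0
```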
Step2: Initialize docker swarm cluster on your manager node
docker swarm init
This command prints a ready-to-use 'docker swarm join' command that contains the worker join token. Save it; you’ll need it to add worker nodes.
Step3: Add worker nodes to the swarm cluster. You can also add more than one manager node.
To add a worker node to this swarm, run the following command on the worker node
docker swarm join --token <SWARM_JOIN_TOKEN> <MANAGER-IP>:2377
👉 To add a manager node to this swarm, run 'docker swarm join-token manager' on an existing manager node. It prints a 'docker swarm join' command; run that command on the node you want to add as a manager.
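If the join token was not saved, it can be read again on a manager at any time. As a small sketch, the helper below uses 'docker swarm join-token -q', which prints only the token, and builds the full join command for a given role. The function name build_join_command and the example IP are made up for illustration.

```shell
#!/bin/sh
# Build the join command for a role ("worker" or "manager").
# Run on a manager node. build_join_command is an illustrative name.
build_join_command() (
  role="$1"
  manager_addr="$2"   # the manager's address as reachable from the new node
  # -q prints only the token instead of the full instructions
  token="$(docker swarm join-token -q "$role")" || exit 1
  echo "docker swarm join --token $token $manager_addr:2377"
)

# Example: build_join_command worker 10.0.0.5
```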
Step4: List all nodes in the swarm cluster. Run this command on the manager node. It shows details such as each node's status and availability, which nodes are managers (and which manager is the leader), and which are workers.
docker node ls
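For scripting, the same listing can be trimmed down with '--format' and '--filter'. The two helpers below are a minimal sketch (their names are illustrative): one prints a compact view of every node, the other counts the manager nodes.

```shell
#!/bin/sh
# Compact listing: hostname, status, availability, manager status.
list_node_roles() {
  docker node ls --format '{{.Hostname}} {{.Status}} {{.Availability}} {{.ManagerStatus}}'
}

# Count manager nodes: -q prints one node ID per line.
count_managers() {
  docker node ls --filter role=manager -q | wc -l | tr -d ' '
}

# Example (on a manager node): list_node_roles; count_managers
```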
Step5 : Deploy a service (e.g. nginx) to the swarm cluster, run this command on manager node.
docker service create --name my-nginx -p 80:80 nginx
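Note: The manager does not necessarily run the container itself. It schedules the service's task on one of the nodes in the swarm (a single replica by default), and because port 80 is published, Swarm's routing mesh makes the service reachable on port 80 of every node.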
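Running 'docker service create' twice with the same name fails, and the replica count can be set at creation time with '--replicas'. As a rough sketch, the helper below creates the service only if it does not exist yet; ensure_service is a hypothetical name and the published port is fixed to 80 to match the example.

```shell
#!/bin/sh
# Create a service only when no service with that name exists.
# ensure_service is an illustrative helper, not a Docker command.
ensure_service() (
  name="$1"
  image="$2"
  replicas="$3"
  if docker service inspect "$name" >/dev/null 2>&1; then
    echo "service $name already exists"
  else
    docker service create --name "$name" --replicas "$replicas" -p 80:80 "$image"
  fi
)

# Example (on a manager node): ensure_service my-nginx nginx 3
```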
Step6: Verify the service (e.g. nginx) is running and the replicas are spread across nodes, run this command on manager node
docker service ls
To get more detailed information, including the status of each replica
docker service ps my-nginx
Step7: Scale the service, run this command on manager node
You can scale the number of replicas up or down
docker service scale my-nginx=2
👉 To check which nodes the two replicas were placed on, run the following command on the manager node. For each replica (task) it shows the node it was assigned to (by hostname), its desired state, and its current state.
docker service ps <SERVICE-NAME>
docker service ps my-nginx
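After scaling, the REPLICAS column of 'docker service ls' shows running/desired counts such as 1/2 until the new tasks are up. A minimal sketch of waiting for that to settle is below. The function names, the retry count, and the 2-second pause are arbitrary choices, and it assumes the plain "running/desired" form of the column.

```shell
#!/bin/sh
# Succeeds when a "running/desired" string such as "2/2" shows all replicas up.
replicas_converged() (
  running="${1%%/*}"
  desired="${1##*/}"
  [ -n "$running" ] && [ "$running" = "$desired" ]
)

# Poll the service until its replicas converge or the tries run out.
wait_for_service() (
  name="$1"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # --filter name= matches by prefix, so this assumes the name is unique
    state="$(docker service ls --filter "name=$name" --format '{{.Replicas}}' | head -n 1)"
    if replicas_converged "$state"; then
      echo "$name is at $state"
      exit 0
    fi
    tries=$((tries - 1))
    sleep 2
  done
  echo "$name did not converge (last state: $state)" >&2
  exit 1
)

# Example: docker service scale my-nginx=2 && wait_for_service my-nginx
```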
Step8: List the containers running on a specific node. Run this command on that node itself (worker or manager); it only shows the containers on the local machine.
docker ps
Step9: Update the service, run this command on manager node
To update the service (e.g., change the image version)
docker service update --image nginx:latest my-nginx
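By default the update replaces tasks one at a time. The pace and failure behaviour can be tuned with the '--update-*' flags, and a bad update can be undone with 'docker service rollback'. The wrapper below is only a sketch: the function names, the 10s delay, and the nginx:stable tag are example choices.

```shell
#!/bin/sh
# Rolling update: one task at a time, 10s between tasks,
# automatic rollback if the update fails.
rolling_update() (
  name="$1"
  image="$2"
  docker service update \
    --update-parallelism 1 \
    --update-delay 10s \
    --update-failure-action rollback \
    --image "$image" "$name"
)

# Return the service to its previous definition.
rollback_service() {
  docker service rollback "$1"
}

# Example: rolling_update my-nginx nginx:stable
```

Pinning a specific tag instead of latest makes it clearer which version each update actually rolls out.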
Step10: Promote a worker node to a manager node
Once a node has joined as a worker, you can promote it to a manager from an existing manager node using the following command:
docker node promote <NODE-ID>
To get the NODE-ID, run 'docker node ls'.
You can always demote a manager node back to a worker node using the following command:
docker node demote <NODE-ID>
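Step11: Set a node's availability (Active, Pause, or Drain)
A manager can change a node's availability. Drain stops the service tasks already running on the node, reschedules them on other nodes, and prevents new tasks from being placed there, which is useful before maintenance. Pause only prevents new tasks from being scheduled; existing tasks keep running.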
docker node update --availability drain <NODE-ID>
Similarly, you can switch the node back to active
docker node update --availability active <NODE-ID>
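Since only three availability values are accepted, a thin wrapper can catch typos before calling Docker. This is a minimal sketch; set_node_availability is a hypothetical helper name.

```shell
#!/bin/sh
# Validate the requested availability, then apply it.
# set_node_availability is an illustrative helper, not a Docker command.
set_node_availability() (
  node="$1"
  state="$2"
  case "$state" in
    active|pause|drain)
      docker node update --availability "$state" "$node"
      ;;
    *)
      echo "availability must be active, pause, or drain" >&2
      exit 2
      ;;
  esac
)

# Example (on a manager node): set_node_availability <NODE-ID> drain
```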
Step12: Check all tasks (containers) running on a specific worker node, from a manager node
docker node ps <NODE-ID>
Step13: Inspect a particular node
docker node inspect --pretty <NODE-ID>
Inspect a particular service
docker service inspect --pretty <SERVICE_NAME>
Get information of your docker swarm cluster
docker info
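The inspect commands also accept a Go template through '--format', which is handy when only one field is needed. The helpers below are a sketch with illustrative names: one prints a node's IP address, the other the configured replica count of a replicated service.

```shell
#!/bin/sh
# IP address that the swarm has recorded for a node.
node_ip() {
  docker node inspect --format '{{ .Status.Addr }}' "$1"
}

# Desired replica count of a replicated service.
service_replicas() {
  docker service inspect --format '{{ .Spec.Mode.Replicated.Replicas }}' "$1"
}

# Example (on a manager node): node_ip <NODE-ID>; service_replicas my-nginx
```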
Step14: Remove the service from the swarm. Run this command on the manager node
docker service rm my-nginx
Step15: Remove a worker node from the swarm. Run this command on the worker node that should leave
docker swarm leave
After a worker has left, 'docker node ls' on the manager still lists it, with status Down.
A manager node can then remove the entry from the node list using the following command. Make sure the worker has left the swarm first (docker swarm leave).
docker node rm <NODE-ID>
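When several workers have left, their Down entries can be cleaned up in one pass. The sketch below assumes the status is reported as the word Down, as in the 'docker node ls' table; remove_down_nodes is a made-up name. A manager that is Down would first have to be demoted before it can be removed.

```shell
#!/bin/sh
# Remove every node whose status is Down from the node list.
# Run on a manager node. remove_down_nodes is an illustrative helper.
remove_down_nodes() (
  docker node ls --format '{{.ID}} {{.Status}}' |
  while read -r id status; do
    if [ "$status" = "Down" ]; then
      # </dev/null keeps the command from consuming the loop's input
      docker node rm "$id" </dev/null
    fi
  done
)

# Example: remove_down_nodes
```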
Step16: Remove a manager node from the swarm, run this command on manager node
docker swarm leave --force
If a manager node leaves the swarm and other managers remain, one of them takes over as leader, provided a majority of the managers is still available. If there is no other manager node, the swarm cluster itself is removed.
Step17: Get help of any command or sub-command
docker <COMMAND> --help
e.g.
docker node --help
docker node update --help
Example-2: Deploy a stack (multiple services) using docker compose on docker swarm
👉 This basic example shows how to use a Docker Compose file with a Docker Swarm cluster.
mkdir my_workspace
cd my_workspace
Step1: Initialize docker swarm cluster on your manager node
docker swarm init
Step2: Add worker nodes to swarm cluster.
docker swarm join --token <SWARM_JOIN_TOKEN> <MANAGER-IP>:2377
Step3: Create a docker-compose.yml file
version: '3'
services:
  web:
    image: nginx
    ports:
      - "80:80"
  mydb:
    image: postgres
    environment:
      POSTGRES_PASSWORD: test123
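In swarm mode, a Compose file can also carry a 'deploy' section per service, which 'docker stack deploy' uses for settings such as the replica count. The sketch below writes a variant of the file above with such a section for the web service. The function name write_stack_file, the replica count, and the restart policy are illustrative choices, not part of the original example.

```shell
#!/bin/sh
# Write a Compose file with a swarm-specific deploy section.
# write_stack_file is an illustrative helper; the deploy values are examples.
write_stack_file() (
  target="${1:-docker-compose.yml}"
  cat > "$target" <<'EOF'
version: '3'
services:
  web:
    image: nginx
    ports:
      - "80:80"
    deploy:
      replicas: 2
      restart_policy:
        condition: on-failure
  mydb:
    image: postgres
    environment:
      POSTGRES_PASSWORD: test123
EOF
)

# Example: write_stack_file docker-compose.yml
```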
Step4: Deploy a stack (multiple services) using a docker-compose.yml file in swarm cluster
docker stack deploy --compose-file <FILE> <STACK-NAME>
Step5: List all stacks deployed in the swarm
docker stack ls
Step6: List all services of a stack
docker stack services <STACK_NAME>
Step7: Show detailed information about a stack
E.g. which service of the stack is deployed to which node of the swarm cluster.
docker stack ps <STACK-NAME>
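Services created by a stack are ordinary swarm services named <STACK-NAME>_<SERVICE>, so the 'docker service' commands from Example-1 work on them too. As a small sketch, the helper below scales one service of a stack; scale_stack_service and the stack name mystack are hypothetical.

```shell
#!/bin/sh
# Scale one service of a stack, using the <stack>_<service> naming scheme.
# scale_stack_service is an illustrative helper.
scale_stack_service() (
  stack="$1"
  service="$2"
  replicas="$3"
  docker service scale "${stack}_${service}=${replicas}"
)

# Example (on a manager node): scale_stack_service mystack web 3
```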
Step8: Remove a stack and its services
docker stack rm <STACK-NAME>
Docker Swarm uses the Raft consensus algorithm among its manager nodes to manage the global state of the cluster:
- Leader node: The Raft algorithm chooses a leader node to manage the swarm and orchestrate tasks.
- Leader election: If the leader node fails, the Raft algorithm elects a new leader node.
- Cluster state: The Raft algorithm ensures that all manager nodes have the same state.
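- Fault tolerance: With N manager nodes, the swarm tolerates the loss of at most (N-1)/2 of them (rounded down), for example 1 of 3 or 2 of 5, because a majority of managers must remain available.
- Node roles: At any moment each manager is a follower, a candidate, or the leader. A follower that stops hearing from the leader becomes a candidate and asks the other managers for votes; the candidate that collects a majority becomes the new leader.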
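The fault-tolerance rule is plain integer arithmetic, sketched below for 1 to 7 managers. The function names are illustrative. The table it prints shows why odd manager counts are usually chosen: going from 3 to 4 managers raises the quorum without tolerating any extra failure.

```shell
#!/bin/sh
# Majority needed among N managers: floor(N/2) + 1
raft_quorum() { echo $(( $1 / 2 + 1 )); }

# Manager failures that can be tolerated: floor((N-1)/2)
raft_fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

# Print quorum and tolerance for 1..7 managers.
print_raft_table() {
  for n in 1 2 3 4 5 6 7; do
    echo "$n managers: quorum $(raft_quorum "$n"), tolerates $(raft_fault_tolerance "$n") failure(s)"
  done
}

# Example: print_raft_table
```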