Cluster Hat setup - Part 3 - Ansible

Submitted by code_admin on Wed, 07/25/2018 - 15:23

Cluster Hat setup - Part 1
Cluster Hat setup - Part 2
Cluster Hat setup - Part 3 - Ansible
Cluster Hat setup - Part 4 - Docker Registry
Cluster Hat setup - Part 5 - Access Point Setup
Cluster Hat - Other Processes
Set up a new Raspberry Pi 3 to join the cluster

Step 11 - Set up ssh keys to prepare for Ansible

Note: The first time I did the setup I set up the ssh keys here. The second time I set them up at an earlier stage. If you have already set up ssh keys you can skip this step.

I chose to use ssh keys to let Ansible connect to the servers.
So first I set up a public/private keypair on the controller:

  ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa
  cat ~/.ssh/id_rsa.pub

Copy the output public key into the clipboard

Next I ssh to each of the Pis in the cluster and install the ssh key:

(Repeat the following for each Pi, replacing pX with p1-p4.)
This will:
log into the Pi over ssh
change the password away from raspberry
log out of the Pi's ssh
set up the authorized_keys file
log in again and confirm that this time we don't need a password
log out of the Pi's ssh again

  ssh pX
  passwd
  *Change password to something better than raspberry - can be long and complicated as we should never need it again*
  exit
  ssh-copy-id -i ~/.ssh/id_rsa.pub pX
  **Enter password when prompted**
  ssh pX
  exit
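
The ssh-copy-id part of the steps above can be sketched as a loop on the controller. This is my own illustration, not from the original article, and it only covers the key-copying step (the interactive passwd step still has to be done by hand). It builds and prints each per-node command rather than running it, so nothing here touches your Pis:

```shell
# Dry-run sketch: print the ssh-copy-id command for each cluster node.
# Remove the echo and run the commands directly once you are happy with them.
for n in 1 2 3 4; do
  cmd="ssh-copy-id -i $HOME/.ssh/id_rsa.pub p$n"
  echo "$cmd"
done
```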

Step 12 - Setup Ansible

I recommend watching this video for an introduction.

Rather than manage each Pi Zero in the cluster individually I want to use Ansible.
I will install Ansible on the controller machine, then Ansible will perform tasks on the Pi Zeros in the cluster.

First log into the controller and install Ansible

  sudo apt-get install python-pip git python-dev sshpass
  sudo pip install markupsafe
  sudo pip install ansible

Step 13 - Test out a few Ansible commands

Before testing out these commands don't forget to turn the cluster servers on!
I had some problems with host key checking. Rather than turn off host key checking in ansible I decided to connect once to the host with the full domain name and add the key to the known_hosts file.

  ssh p1.<full domain name>
  ssh p2.<full domain name>
  ssh p3.<full domain name>
  ssh p4.<full domain name>
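
An alternative (my suggestion, not from the original article) is ssh-keyscan, which collects host keys non-interactively. Note that unlike a manual first connection it gives you no chance to verify the keys' authenticity, so only use it on a network you trust. Shown in dry-run form:

```shell
# Build the ssh-keyscan command for all four nodes; it is printed rather
# than executed so you can inspect it before running it yourself.
hosts="p1 p2 p3 p4"
cmd="ssh-keyscan $hosts >> $HOME/.ssh/known_hosts"
echo "$cmd"
```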

First I created an ansible directory and setup a hosts file on the controller (~/ansible/hosts):

  [clusternodes]
  p[1:4]

  [clusternodes:vars]
  ansible_ssh_user=pi
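
Optionally, you can save typing the -i flag on every command: Ansible reads an ansible.cfg from the current working directory, so a minimal one next to the inventory (my addition, not part of the original setup) could look like:

```ini
# ~/ansible/ansible.cfg - picked up automatically when running from ~/ansible
[defaults]
inventory = ~/ansible/hosts
```

With this in place, running `ansible clusternodes -m ping` from ~/ansible works without -i.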

Then I played with a few one off commands being run on the servers:

  ansible -i ~/ansible/hosts clusternodes --list-hosts
  ansible -i ~/ansible/hosts clusternodes -m ping
  ansible -i ~/ansible/hosts clusternodes -m shell -a "date"
  ansible -i ~/ansible/hosts clusternodes -m shell -a "cat /var/log/syslog | grep ntp"
  ansible -i ~/ansible/hosts clusternodes -m setup

sudo with ansible

  ansible -i ~/ansible/hosts clusternodes -m shell -a "ls /" -s

one at a time with ansible

(Default is 5 at a time)

  ansible -i ~/ansible/hosts clusternodes -m shell -a "ls /" -s --forks=1


Step 14 - Configuration as code

I would like to use GitHub to store the configuration code for the cluster. I will put a link to the public repo for reference, but if you are following this tutorial you will not need to clone it.

I added the public key for the pi user on the controller to the ssh keys in my github account.

I will use the ~/ansible directory on the controller for the repo. I ran the following on the controller:

  git config --global user.email ""
  git config --global user.name "Your Name"

  cd ~/ansible
  echo "# metcarob-local_cluster" >> README.md
  git init
  git add README.md
  git add hosts
  git commit -m "first commit"
  git remote add origin
  git push -u origin master

As I continue through this tutorial I will keep updating this repo, but I won't write out the commit commands as I go.

Step 15 - Playbook 001 - Run command

As I mentioned before, I want to use Ansible to do an apt update and upgrade on each node in the cluster.

I created the following playbook:
~/ansible/apt_upgrade.yml:

  ---
  - hosts: clusternodes
    remote_user: pi
    become: true

    tasks:
    - name: Update and upgrade apt packages
      become: true
      apt:
        upgrade: yes
        update_cache: yes
        cache_valid_time: 86400 # one day

This can be run with the command:

  cd ~/ansible
  ansible-playbook -i ~/ansible/hosts apt_upgrade.yml

I also would like to build a playbook to install docker onto each Pi in the cluster. I am learning Ansible, so I will do this one step at a time. My first step will be to build a simple playbook (~/ansible/install_docker.yml) that does nothing.


  ---
  - hosts: clusternodes
    tasks:
      - name: install docker
        shell: echo "Hello World"
        changed_when: false

This can be run with the command:

  cd ~/ansible
  ansible-playbook -i ~/ansible/hosts install_docker.yml

Step 16 - Playbook 002 - Install and Run shell script

First create a script in ~/ansible/:

  #!/bin/bash

  echo "Test script ${0} - Params ${1} ${2}"

  exit 0
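
One thing to watch in scripts like this: positional parameters need a dollar sign, so {1} and {2} print literally while ${1} and ${2} expand. A quick local check using a throwaway copy of the test script (the temp path is illustrative):

```shell
# Write a throwaway copy of the test script and run it with two parameters.
script="$(mktemp)"
cat > "$script" <<'EOF'
#!/bin/bash
echo "Test script ${0} - Params ${1} ${2}"
exit 0
EOF
result="$(bash "$script" PARAM001 PARAM002)"
echo "$result"
rm -f "$script"
```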

Then I changed install_docker.yml to

  ---
  - hosts: clusternodes
    tasks:
      - name: install docker
        script: "PARAM001" "PARAM002"
        changed_when: false

I changed the script to exit 1 as a test and confirmed that ansible reported an error.

Step 17 - Playbook 003 - Create a conditional step that is only called if a certain file exists

Change the playbook so it will only try to install docker if a particular file doesn't exist. (Later I will change this to a file I know is created by the docker install.)

file ~/ansible/install_docker.yml:

  ---
  - hosts: clusternodes
    tasks:
      - name: Check that the somefile.conf exists
        stat:
          path: ~/some_file_i_dont_know_yet
        register: stat_result

      - name: install docker
        script: "PARAM001" "PARAM002"
        when: stat_result.stat.exists == False
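
The stat-plus-when pattern is Ansible's equivalent of a shell existence check. A plain-shell illustration of the same logic (the file and directory names are made up for the demo):

```shell
# Only "install" when the marker file does not exist yet.
tmpdir="$(mktemp -d)"
marker="$tmpdir/some_file_i_dont_know_yet"
if [ ! -e "$marker" ]; then
  action="would install docker"
else
  action="skip - already installed"
fi
echo "$action"
rm -rf "$tmpdir"
```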

Step 18 - Playbook 004 - Installing Docker

I want to test installing Docker, but I will test it on one Pi in the cluster rather than running on all 4, so I have created a group in the hosts file called "gunieapig".

  [clusternodes]
  p[1:4]

  [clusternodes:vars]
  ansible_ssh_user=pi

  [gunieapig]

I also changed the hosts line of the install_docker.yml playbook to gunieapig. This was for testing only but I then switched it back when it was working.

The script to install docker is ~/ansible/

  #!/bin/bash

  curl -sSL | sh

  if [[ $? -ne 0 ]]
  then
      exit 1
  fi

  exit 0
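
A note on that exit-status check: in a pipeline like `curl ... | sh`, `$?` is the exit status of the last command (sh), which is what we want here. The same pattern can also be written without `$?` by testing the command directly in the if. A self-contained sketch with a stand-in function in place of the real install line:

```shell
# run_install is a stand-in for the real "curl -sSL <url> | sh" line;
# it always fails here so we can see the error branch being taken.
run_install() { return 1; }

if run_install; then
  status="install ok"
else
  status="install failed"
fi
echo "$status"
```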

We will need to run our docker servers against an insecure registry, so we need to change the command line used to start the docker daemon. With systemd this is done with a drop-in file that first clears ExecStart= (an empty ExecStart= resets any previous value) and then sets the new command. The template file is ~/ansible/overlay.conf:

  [Service]
  ExecStart=
  ExecStart=/usr/bin/dockerd --storage-driver overlay -H fd:// --insecure-registry


The full install_docker.yml playbook is now:

  ---
  - hosts: clusternodes
    remote_user: pi
    become: true

    tasks:
      - name: Check that the somefile.conf exists
        stat:
          path: /etc/docker
        register: stat_result

      - name: install docker
        script:
        when: stat_result.stat.exists == False

      - group:
          name: docker
          state: present
        notify:
        - restart docker

      - user:
          name: pi
          groups: docker
          append: yes
        notify:
        - restart docker

      - name: Create docker service directory
        file:
          path: /etc/systemd/system/docker.service.d
          state: directory
          mode: "u=rw,g=r,o=r"

      - template:
          src: overlay.conf
          dest: /etc/systemd/system/docker.service.d/overlay.conf
          owner: root
          group: root
          mode: "u=rw,g=r,o=r"
        notify:
        - restart docker

  #    - debug:
  #        msg: "FORCING restart of docker for testing"
  #      changed_when: true
  #      notify:
  #      - restart docker

    handlers:
      - name: restart docker
        systemd:
          state: restarted
          daemon_reload: yes
          name: docker

The above playbook is not as efficient as it could be: it installs docker, starts it, adds the group and then restarts docker. It would be better to create the group before the install and eliminate the restart, but I wanted to experiment with Ansible handlers and this is a learning project for me. I may optimise it later if I feel the need.

I successfully installed docker on all 4 Pi Zeros in my cluster. One of them initially failed due to a problem with apt; rather than debug it I wiped the SD card and restored it back to its original settings. Being able to take any machine out of a cluster and re-image it is one of the advantages of having a cluster!

I also noticed .retry files in the directory so I added a .gitignore file.
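
A one-line .gitignore covering Ansible's .retry files looks like this (I don't know the exact contents of the original, but this covers what's described above):

```
*.retry
```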

Step 19 - Create a webserver image on controller

(I got some information for this part from an external guide.)

Install docker on the controller:

  curl -sSL | sh
  sudo groupadd docker
  sudo gpasswd -a ${USER} docker
  sudo service docker restart

Note: You will need to exit and restart the ssh session for the group to take effect.

You can check it is working by running:

  docker version

If you get nothing back then docker didn't install.
If you get client version info but then "Cannot connect to the Docker daemon. Is the docker daemon running on this host?" it means the docker group hasn't taken effect.
Otherwise you will get both client and server version info.

Create a directory where we can build our docker images

  mkdir ~/dockerbuild
  mkdir ~/dockerbuild/apachepi
  cd ~/dockerbuild/apachepi

When it starts, Apache checks for a pid file and refuses to run when it is present. This means that when Apache is not stopped properly it will refuse to start again. To get around this we create a script that deletes this file before running Apache. This script will run inside our docker container.
Put the file in ~/dockerbuild/apachepi/apache2-foreground:

  #!/bin/bash
  set -e

  # Apache gets grumpy about PID files pre-existing
  rm -f /var/run/apache2/

  exec /usr/sbin/apache2ctl -DFOREGROUND

Make sure this is executable:

  chmod +x ~/dockerbuild/apachepi/apache2-foreground

Create a docker file to build an apache image: (~/dockerbuild/apachepi/Dockerfile)

  FROM resin/rpi-raspbian
  MAINTAINER Robert Metcalf

  # Update
  RUN apt-get update

  # Install apache2
  RUN apt-get install -y apache2

  COPY apache2-foreground /usr/local/bin/

  EXPOSE 80
  CMD ["/usr/local/bin/apache2-foreground"]

Build the image

  docker build -t .

Once it's complete you can check the image has been created:

  docker images

The command to start the image running is:

  docker run --name apachepi_container -p 8080:80 -d

This will run the webserver on the controller on port 8080. You can change 8080 to any port. I name the instance apachepi_container.

The command to check it's running is:

  docker ps -a

We can also go to any computer connected to the wifi network, browse to the controller on port 8080 and see the Apache start page.

Docker will run the server in its own area and we can get a shell into this area with the command:

  docker exec -i -t apachepi_container /bin/bash

(exit will bring us back to the controller)

Once the instance exists it can be started and stopped with

  docker stop apachepi_container
  docker start apachepi_container

When the instance is stopped it still exists and can be deleted using:

  docker rm -f apachepi_container

This gets us a webserver running on the cluster controller. There are two problems with what we have built so far:

  1. We want the webserver to run on the cluster machines not the controller
  2. We can't access the cluster machines from outside the cluster

I will address these problems next.

Next Part

In the next part I will set up a private docker registry on the controller and get the cluster nodes to pull and run docker images from it.
Cluster Hat setup - Part 4 - Docker Registry
