Cluster Hat setup - Part 2

Submitted by code_admin on Wed, 07/25/2018 - 15:22

Step 6

I have reworked this tutorial a second time, and step 6 is no longer needed.

Step 7 - Set up Pi Zeros 2, 3 and 4

Get 3 fresh SD cards and perform the following steps 3 times, replacing pX with p2, p3, or p4. These are the same steps we went through for p1.

Write the controller image to the SD card.
Add an empty file named "ssh" to the root of the boot partition of the SD card.

To convert the controller image into an image for pX, append " quiet init=/sbin/reconfig-clusterhat pX" to the end of the existing line in the cmdline.txt file in the boot (first) partition, changing pX to p2, p3, or p4 as appropriate.
Insert each card into a Pi Zero and plug the Pi Zeros into positions 2, 3, and 4.
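
The arguments in cmdline.txt normally sit on a single line, so appending with >> would typically put the new arguments on a second line. The following is a minimal sketch of one way to append in place, assuming GNU sed. The function name add_reconfig_arg is hypothetical and not part of the Cluster HAT tooling.

```shell
# Hypothetical helper: append the Cluster HAT reconfig arguments to the
# single line of kernel arguments in cmdline.txt.
# Usage: add_reconfig_arg <path-to-cmdline.txt> <p2|p3|p4>
add_reconfig_arg() {
    cmdline_file="$1"
    node="$2"
    # Edit line 1 in place so the arguments stay on one line (GNU sed -i).
    sed -i "1s|\$| quiet init=/sbin/reconfig-clusterhat ${node}|" "$cmdline_file"
}
```

With the boot partition mounted somewhere such as /mnt/boot (an assumed mount point), this would be called as "add_reconfig_arg /mnt/boot/cmdline.txt p2".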

Turn it on by sshing into the controller and typing:

    clusterhat on p1
    clusterhat on p2
    clusterhat on p3
    clusterhat on p4

Note the lights on the pi hat indicating the result. Wait after each command to ensure the dhcp server gives corresponding ip addresses.
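
The power-on sequence above could be scripted as a loop with a pause after each command. This is only a sketch: the 30 second default delay is a guess, and the CLUSTERHAT and BOOT_DELAY variables are hypothetical overrides added here so the loop can be dry-run.

```shell
# Power on p1..p4 one at a time, pausing so each Pi Zero can boot and
# request its DHCP lease before the next one starts.
# CLUSTERHAT and BOOT_DELAY are hypothetical overrides, not Cluster HAT settings.
power_on_all() {
    for n in 1 2 3 4; do
        ${CLUSTERHAT:-clusterhat} on "p$n"
        sleep "${BOOT_DELAY:-30}"
    done
}
```

Running "CLUSTERHAT=echo BOOT_DELAY=0 power_on_all" prints the commands instead of executing them.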

Wait a while for the Pi Zero to boot, then ssh into it:

    ssh pi@192.168.2.202

(Use 192.168.2.203 or 192.168.2.204 for the other Pi Zeros.)

If you are not sure which IP address each Pi Zero was given, use the command "cat /var/lib/dhcp/dhcpd.leases" on the controller to check.

While logged into pX as user pi:
Create the directory /home/pi/.ssh
Add an authorized_keys file containing the controller's public key:

    echo XXX > /home/pi/.ssh/authorized_keys

(XXX is the contents of /home/pi/id_rsa.pub on the controller)
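
sshd is usually strict about the permissions on these files, so it can help to set them explicitly. The following sketch wraps the key installation above in a function. The name install_key is hypothetical, and the function appends to authorized_keys rather than overwriting it.

```shell
# Hypothetical helper: install a public key for a user.
# Usage: install_key <home-directory> "<public key text>"
install_key() {
    ssh_dir="$1/.ssh"
    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"                        # directory: owner only
    printf '%s\n' "$2" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"        # key file: owner read/write
}
```

On the Pi Zero this would be called as install_key /home/pi "XXX", with XXX replaced by the controller's public key as before.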

I logged out and back into the Pi Zero to confirm the key took.
Then I used the passwd command to change the password for the pi user to a long string, which I discarded as I don't plan to use it.

Run raspi-config to:
- Expand the file system
- Change timezone to Europe - London

When prompted, reboot and wait for the Pi Zero to come back up.

Further setup, like applying updates, will be done on all the Pis together using Ansible.

Once done you can turn off the Pi Zero using:

    clusterhat off pX

(Again replace X with 1-4)

Step 8 - Install and configure a DNS server on the controller (bind)

My source for this is https://www.theurbanpenguin.com/raspberry-pi-dns-server/

We can check which DNS server each pi zero is using by logging into it and running

    cat /etc/resolv.conf

At the moment it is showing

    # Generated by resolvconf
    domain example.org
    nameserver 192.168.1.1

I want to put a DNS server on the controller and get the Pi Zeros to use that one. This way I can then add host configuration to the bind server and let each Pi Zero talk to the other Pi Zeros using domain names.

First I installed bind on the controller by running the following commands:

    sudo apt-get update
    sudo apt-get install bind9 dnsutils

Check it's running:

    service bind9 status
    sudo rndc status

Then I updated the dhcp server config (/etc/dhcp/dhcpd.conf) to give the controller as the DNS server rather than the standard one:

    option domain-name-servers      192.168.2.1;

(Changed from 192.168.1.1 to 192.168.2.1)

The clients will pick up the DNS settings from the local DHCP server, but the controller doesn't use the local DHCP server. It will pick up the DNS settings from the internet connection. To prevent this we need to change /etc/dhcp/dhclient.conf:

    prepend domain-name-servers 127.0.0.1;
    prepend domain-search "metcarob-local.com";
    request subnet-mask, broadcast-address, time-offset, routers,
    #   domain-name, domain-name-servers, domain-search, host-name,
        domain-name, host-name,
    #   dhcp6.name-servers, dhcp6.domain-search,
        netbios-name-servers, netbios-scope, interface-mtu,
        rfc3442-classless-static-routes, ntp-servers;

This adds 127.0.0.1 as the designated local name server and adds a domain search.
It also removes these two fields (domain-name-servers and domain-search) from what is requested from DHCP.

This makes the controller use the local server. The domain-search makes the controller append a default domain of metcarob-local.com to host names, so we can refer to p1.metcarob-local.com as simply p1.

After a restart of the cluster I checked /etc/resolv.conf on one Pi and confirmed the new setting was there. I also checked that I could ping www.google.com and that the new bind server was working.

You can check which DNS server is used with the dig command (part of the dnsutils package installed above), for example "dig www.google.com".

We get this at the bottom of the output:

    ;; Query time: 2 msec
    ;; SERVER: 127.0.0.1#53(127.0.0.1)
    ;; WHEN: Tue Dec 27 12:21:33 UTC 2016
    ;; MSG SIZE  rcvd: 195

This shows us we are using 127.0.0.1 on the controller as the DNS.
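
To pull out just the server address from output like the above, a small filter can be used. This is a sketch that assumes the ";; SERVER:" line has the format shown; the function name dig_server is hypothetical.

```shell
# Hypothetical filter: read dig output on stdin and print only the address
# of the DNS server that answered (the part before '#port').
dig_server() {
    sed -n 's/^;; SERVER: \([^#]*\)#.*/\1/p'
}
```

For example, "dig www.google.com | dig_server" should print 127.0.0.1 on the controller once the local bind server is in use.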

You may notice that DNS queries are now slower, because bind resolves every query itself starting from the root name servers. It might be better to forward them to the local caching router instead. To achieve this, change /etc/bind/named.conf.options (un-comment the forwarders block and add the DNS server for the network):

            forwarders {
                    192.168.1.1;
            };

Note: When I tested "ping raspberrypi.org" my DNS server failed. I had to use Google's DNS server (8.8.8.8) to make it work.
Note 2: I continued to have problems after this, so I changed /etc/bind/named.conf.options to set:

    dnssec-validation no;

I will need to come back to this DNSSEC issue later, but for now this workaround works.

Step 9 - Add a DNS zone for the cluster

I need static IP addresses for the servers in order to assign them domain names. While I was experimenting, it didn't matter in which order I turned on the Pi Zeros in the cluster: each one always seemed to end up with the same IP address. I don't think I can rely on this, so I need to add reserved IPs in the DHCP server.

I collected the host names and MAC addresses of the Pi Zeros by looking at the leases ("cat /var/lib/dhcp/dhcpd.leases"):

client-hostname   hardware ethernet    IP I want to assign
p1                00:22:82:ff:ff:01    192.168.2.101
p2                00:22:82:ff:ff:02    192.168.2.102
p3                00:22:82:ff:ff:03    192.168.2.103
p4                00:22:82:ff:ff:04    192.168.2.104
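
Reading the leases file by eye works for four hosts, but the same information can be extracted with awk. The sketch below assumes the usual dhcpd.leases layout of "lease <ip> { ... }" blocks containing "hardware ethernet" and "client-hostname" lines. The function name list_leases is hypothetical.

```shell
# Hypothetical helper: print "hostname mac ip" for each lease block.
# Usage: list_leases [leases-file]   (defaults to /var/lib/dhcp/dhcpd.leases)
list_leases() {
    awk '
        $1 == "lease"           { ip = $2 }
        $1 == "hardware"        { mac = $3; sub(/;$/, "", mac) }
        $1 == "client-hostname" { name = $2; gsub(/[";]/, "", name) }
        $1 == "}" && ip != ""   { print name, mac, ip; ip = mac = name = "" }
    ' "${1:-/var/lib/dhcp/dhcpd.leases}"
}
```

The leases file can hold several entries for the same address, so piping the result through "sort -u" may be useful.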

I noticed a pattern in the mac addresses. I think this is because I am using the USB interfaces.

Once I had collected this information, I added the following to /etc/dhcp/dhcpd.conf at the end of the "subnet 192.168.2.0 netmask 255.255.255.0 {" section:

        host p1 {
            hardware ethernet 00:22:82:FF:FF:01;
            fixed-address 192.168.2.101;
        }
        host p2 {
            hardware ethernet 00:22:82:FF:FF:02;
            fixed-address 192.168.2.102;
        }
        host p3 {
            hardware ethernet 00:22:82:FF:FF:03;
            fixed-address 192.168.2.103;
        }
        host p4 {
            hardware ethernet 00:22:82:FF:FF:04;
            fixed-address 192.168.2.104;
        }
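
Since the four stanzas differ only in the node number, they could also be generated rather than typed. This sketch assumes the MAC address pattern from the table above (00:22:82:FF:FF:0N for pN) holds on your hardware. The function name gen_hosts is hypothetical.

```shell
# Hypothetical helper: print dhcpd.conf host stanzas for p1..p4, assuming
# MAC 00:22:82:FF:FF:0N and address 192.168.2.(100+N) for node pN.
gen_hosts() {
    for n in 1 2 3 4; do
        printf '    host p%d {\n' "$n"
        printf '        hardware ethernet 00:22:82:FF:FF:%02X;\n' "$n"
        printf '        fixed-address 192.168.2.%d;\n' "$((100 + n))"
        printf '    }\n'
    done
}
```

The output is meant to be reviewed and pasted into the subnet section by hand, not appended to the file blindly.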

While I was editing this file I also changed the domain name to:

    option domain-name "metcarob-local.com";

Restart the DHCP server:

    sudo service isc-dhcp-server restart

I then restarted the Pi Zeros and checked that each was being assigned the correct address by pinging it:

    ping 192.168.2.101
    ping 192.168.2.102
    ping 192.168.2.103
    ping 192.168.2.104

You can no longer use the leases file to confirm the IP addresses, because hosts with a fixed-address are not recorded in it.

I added the bind configuration by editing "/etc/bind/named.conf.local" so it contains:

    //
    // Do any local configuration here
    //

    // Consider adding the 1918 zones here, if they are not used in your
    // organization
    //include "/etc/bind/zones.rfc1918";

    zone "metcarob-local.com" {
        type master;
        file "/etc/bind/db.metcarob-local.com.zone";
    };

    zone "2.168.192.in-addr.arpa" {
        type master;
        notify no;
        file "/etc/bind/db.192.168.2.zone";
    };

I then set up the referenced zone file (/etc/bind/db.metcarob-local.com.zone):

    ;
    ; BIND data file for local metcarob-local.com
    ;
    $TTL    7200
    @   IN  SOA metcarob-local.com. root.metcarob-local.com. (
                      3     ; Serial
                 604800     ; Refresh
                  86400     ; Retry
                2419200     ; Expire
                 604800 )   ; Negative Cache TTL
    ;
    @   IN  NS  ns.metcarob-local.com.
    @   IN  A   192.168.2.1
    @   IN  AAAA    ::1
    ns  IN  A   192.168.2.1
    controller  IN  A   192.168.2.1
    p1  IN  A   192.168.2.101
    p2  IN  A   192.168.2.102
    p3  IN  A   192.168.2.103
    p4  IN  A   192.168.2.104

Also create the reverse lookup file (/etc/bind/db.192.168.2.zone):

    ;
    ; BIND reverse data file for local 192.168.2.* interface
    ;
    $TTL    604800
    @       IN      SOA     metcarob-local.com. root.metcarob-local.com. (
                                  1         ; Serial
                             604800         ; Refresh
                              86400         ; Retry
                            2419200         ; Expire
                             604800 )       ; Negative Cache TTL
    ;
    @       IN      NS      localhost.
    ; PTR targets are fully qualified, so each one ends with a trailing dot
    1       IN      PTR     controller.metcarob-local.com.
    101     IN      PTR     p1.metcarob-local.com.
    102     IN      PTR     p2.metcarob-local.com.
    103     IN      PTR     p3.metcarob-local.com.
    104     IN      PTR     p4.metcarob-local.com.
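
In a reverse zone the record names (1, 101, and so on) are relative to the zone origin 2.168.192.in-addr.arpa, and any name written without a trailing dot has that origin appended. The sketch below shows how an address maps to the full name that a reverse lookup queries. The function name reverse_name is hypothetical.

```shell
# Hypothetical helper: print the in-addr.arpa name for an IPv4 address by
# reversing its four octets.
reverse_name() {
    echo "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa\n", $4, $3, $2, $1 }'
}
```

For example, reverse_name 192.168.2.101 gives 101.2.168.192.in-addr.arpa, which is the "101" record in the zone file above.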

Run the commands to reload these zone files

    sudo service bind9 reload
    sudo rndc reload metcarob-local.com.
    sudo rndc reload 2.168.192.in-addr.arpa.

NOTE: When editing these in the future you must increment the serial each time.
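
Forgetting the serial is an easy mistake, so the increment can be scripted. This is a sketch that assumes the serial is the first number on the first line containing "; Serial", as in the zone files above. The function name bump_serial is hypothetical.

```shell
# Hypothetical helper: increment the serial number in a zone file in place.
# Usage: bump_serial <zone-file>
bump_serial() {
    zone_file="$1"
    # Only the first "; Serial" line is changed; all other lines pass through.
    awk '/; Serial/ && !bumped { sub(/[0-9]+/, $1 + 1); bumped = 1 } { print }' \
        "$zone_file" > "$zone_file.tmp" &&
        mv "$zone_file.tmp" "$zone_file"
}
```

After editing, the zone can be checked with named-checkzone (for example "named-checkzone metcarob-local.com /etc/bind/db.metcarob-local.com.zone") before reloading.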

Finally, edit /etc/network/interfaces and enable the last two lines we put there in part one:

    # added these lines because settings in /etc/dhcp/dhclient.conf were being ignored
    # for some reason
    dns-nameservers 127.0.0.1
    dns-search metcarob-local.com

Step 10 - Checkpoint

Now we should have all the Pis in the cluster set up with DHCP and DNS working, so they can all use hostnames to communicate.
We can test this by rebooting the entire cluster, then logging into the controller and starting all the Pi Zeros.

Once we have logged into the controller we should be able to ssh to a Pi in the cluster using its hostname:

    ssh p1

We should be able to successfully get responses from each of the following servers:

    ping p1
    ping p1.metcarob-local.com
    ping controller
    ping p2
    ping p3
    ping p4

If all the above pings work then the DNS server is working.
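
The ping checks above could be run as a single loop that reports each host. This is a sketch only: the -c and -W options are those of the Linux iputils ping, and the PING variable is a hypothetical override added so the loop can be exercised without a network.

```shell
# Hypothetical helper: ping each cluster host name once and report the result.
# PING is a hypothetical override for the ping command, used for dry runs.
check_hosts() {
    for host in controller p1 p2 p3 p4 p1.metcarob-local.com; do
        if ${PING:-ping -c 1 -W 2} "$host" >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAILED"
        fi
    done
}
```

Any line ending in FAILED points at either a host that is down or a name that does not resolve.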

We can also test reverse lookups are working:

    nslookup 192.168.2.1
    nslookup 192.168.2.101
    nslookup 192.168.2.102
    nslookup 192.168.2.103
    nslookup 192.168.2.104

Next Part

Next I will start to look at using Ansible to control all the Pis together:
Cluster Hat setup - Part 3 - Ansible

Problems

Dates on Pis in the cluster were wrong

When I first configured my DHCP server, I sent out an NTP server of 192.168.1.1, which was invalid. Each Pi in the cluster makes a copy of these settings, so when I fixed the DHCP server by no longer sending out an NTP server, each Pi kept the old configuration. I discovered this when testing Ansible with the date command.
To fix this, as well as changing the DHCP configuration, I had to delete the file /var/lib/ntp/ntp.conf.dhcp (you need sudo to delete it).
Then I used raspi-config to set each time zone to London.
This worked, but on some of the servers it took "sudo /etc/init.d/ntp restart" for the change to take effect.

(The command "ntpq -pn" was also useful.)
