Tunneling Elixir cluster network traffic over Wireguard

Introduction

The other day I was supporting a customer with an Elixir-based platform that uses libcluster, so that messages on one host can be passed to other hosts. This can, for example, enable live updates for all users, even if they are not connected to the same application server.

Encryption

Elixir's libcluster does support encrypted communication using TLS certificates. However, even with the help of an application developer, I struggled to make it work.

"severity":"warn","message":"[libcluster:example] unable to connect to :\"app@Host-B\"

I'm absolutely open to the idea that we did something wrong and certificate-based encryption will work, but we were time-constrained and we decided to opt for another solution that seemed simpler and easier to maintain.

Wireguard as the encrypted transport

I deployed a Wireguard mesh network between all application servers using Ansible, which was straightforward. We simply added all hosts to the /etc/hosts file on each server to keep things simple.

In the table below, we show a simplified example of the setup.

Hostname	IP-address	Wireguard Hostname	Wireguard IP-address
Host-A	10.0.10.123	Host-A-wg	192.168.0.1
Host-B	10.0.11.231	Host-B-wg	192.168.0.2

The Elixir applications would only know about the Host-A|B-wg hostnames and thus communicate over the encrypted VPN tunnel.

The problem with Wireguard and libcluster

The key issue with libcluster is that when Host-A connects to Host-B, it uses the hostname Host-B-wg. But the actual hostname of Host-B is, you guessed it, 'Host-B'. This means there is a mismatch and, for reasons unknown to me, the libcluster connection fails.

So the target hostname as configured in libcluster must match the hostname of the actual host! Since libcluster seems to make the use of domain names mandatory, using IP addresses was not an option.
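
To make the mismatch concrete, here is a minimal Python sketch of the rule as we observed it. It is only a model of the behaviour described above, not libcluster's actual check; the function names are hypothetical, and the node name app@Host-B-wg is my assumption of what the first attempt looked like (only app@Host-B appears in the log).

```python
def host_part(node_name: str) -> str:
    """Return the part after '@' in a node name such as 'app@Host-B'."""
    _, sep, host = node_name.partition("@")
    if not sep or not host:
        raise ValueError(f"not a name@host node name: {node_name!r}")
    return host


def target_matches_host(target_node: str, actual_hostname: str) -> bool:
    """Model of the observed rule: the host part of the configured
    target must equal the hostname the remote machine really has."""
    return host_part(target_node) == actual_hostname


# First attempt (assumed node name): target uses the -wg alias.
print(target_matches_host("app@Host-B-wg", "Host-B"))  # False
# Working setup: target uses the real hostname.
print(target_matches_host("app@Host-B", "Host-B"))     # True
```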

If we pointed Host-B to its Wireguard IP address (192.168.0.2), the problem would be solved. However, in that case Wireguard doesn't know about the external 10.0.11.231 IP address and tries to connect to the 192.168.0.2 address, which does not exist yet. So the Wireguard tunnel would never be created.
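
The chicken-and-egg problem can be sketched in a few lines of Python. This is an illustration only: the /24 prefix for the tunnel network is an assumption (the table only lists individual addresses) and the function name is hypothetical.

```python
import ipaddress

# Assumed tunnel network; the prefix length is not given in the setup above.
TUNNEL_NET = ipaddress.ip_network("192.168.0.0/24")


def endpoint_usable_before_tunnel(endpoint_ip: str) -> bool:
    """An endpoint inside the tunnel network only becomes reachable once
    the tunnel is up, so it cannot be used to bring the tunnel up."""
    return ipaddress.ip_address(endpoint_ip) not in TUNNEL_NET


print(endpoint_usable_before_tunnel("192.168.0.2"))   # False
print(endpoint_usable_before_tunnel("10.0.11.231"))   # True
```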

The solution

The solution is not that elegant, but it works. We still point the Host-B domain name to the Wireguard IP address of 192.168.0.2, but we create an additional DNS record specifically for Wireguard (Host-B-wg), pointing to 10.0.11.231, so it can set up the VPN tunnel.

This is what /etc/hosts looks like on Host-A:

10.0.10.123 Host-A
192.168.0.2 Host-B
10.0.11.231 Host-B-wg

And this is what /etc/hosts looks like on Host-B:

10.0.11.231 Host-B
192.168.0.1 Host-A
10.0.10.123 Host-A-wg
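
As a sanity check, the two files above can be fed to a small Python sketch that mimics a hosts-file lookup. It only models /etc/hosts parsing, not real name resolution, and the helper name parse_hosts is hypothetical.

```python
HOSTS_ON_A = """\
10.0.10.123 Host-A
192.168.0.2 Host-B
10.0.11.231 Host-B-wg
"""

HOSTS_ON_B = """\
10.0.11.231 Host-B
192.168.0.1 Host-A
10.0.10.123 Host-A-wg
"""


def parse_hosts(text: str) -> dict:
    """Map each hostname to its IP address, skipping comments and blanks."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name, ip)  # keep the first entry for a name
    return table


on_a = parse_hosts(HOSTS_ON_A)
on_b = parse_hosts(HOSTS_ON_B)

# The name the application uses resolves to the tunnel address.
print(on_a["Host-B"], on_b["Host-A"])        # 192.168.0.2 192.168.0.1
# The -wg name Wireguard uses resolves to the real address.
print(on_a["Host-B-wg"], on_b["Host-A-wg"])  # 10.0.11.231 10.0.10.123
```

So application traffic addressed to Host-B goes through the tunnel, while Wireguard itself can still find the peer on its real address.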

Evaluation

Although all choices are a tradeoff, for us the Wireguard-based solution makes the most sense, especially now that we have an encrypted tunnel between all hosts: any future communication between hosts can be encrypted without additional effort.

Louwrentius