Some time ago, I was working on a complex greenfield project.
We had to design a secure infrastructure, make sure that all data was encrypted at rest and in transit, and deploy a large number of services in AWS. While the dev teams were building the applications, I focused on those requirements.
One of the requirements was to design and implement a flat mesh network with encrypted traffic. All servers deployed in AWS should have a point-to-point connection to every other peer in the network. On top of that, some servers hosted on Azure and GCP should be able to join the mesh. And to add more complexity, external clients like laptops or mobile phones should be able to securely access specific servers/services in the mesh.
After gathering all the information and speaking with a number of people to make sure that all requirements were properly documented and added to the backlog, it was time to start working on the PoC (Proof of Concept).
One of the tools that seemed like the right candidate was WireGuard.
What is WireGuard
“WireGuard is a secure network tunnel, operating at layer 3, implemented as a kernel virtual network interface for Linux. It utilizes state-of-the-art cryptography and it aims to be faster, simpler, leaner, and more useful than IPsec. WireGuard is currently under heavy development, but already it might be regarded as the most secure, easiest to use, and simplest VPN solution in the industry. That’s why a lot of VPN service providers have started using it.”
Source: wireguard.com
After reading the documentation and running some tests, I decided to proceed with it. Setting up WireGuard as a VPN is straightforward: you install it, create the required keys, create a wg0.conf file for each server, and configure the relevant security groups on AWS, and your VPN is up and running quickly.
But in our case, we had to build a mesh consisting of hundreds of servers, most of them part of Auto Scaling groups. As a result, we didn’t have the option of configuring wg0.conf manually every time a server had to join the mesh.
But what is wg0.conf?
WireGuard encrypts the connection using a pair of cryptographic keys. Each server needs to have its own private and public key, and then exchange public keys with the rest.
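As a minimal sketch of the key creation step, the function below wraps the standard `wg genkey` and `wg pubkey` commands from wireguard-tools. The function name `generate_wg_keys` and the default directory are assumptions made for illustration, not part of any tool.

```shell
#!/bin/bash
# Generate a WireGuard key pair in the given directory (default: /etc/wireguard).
# generate_wg_keys is a hypothetical helper name used only in this sketch.
generate_wg_keys() {
  local key_dir="${1:-/etc/wireguard}"

  # Fail early if wireguard-tools is not installed.
  if ! command -v wg >/dev/null 2>&1; then
    echo "wg not found; install wireguard-tools first" >&2
    return 1
  fi

  mkdir -p "$key_dir" || return 1

  # Run in a subshell so the restrictive umask does not leak to the caller.
  (
    umask 077
    wg genkey | tee "$key_dir/privatekey" | wg pubkey > "$key_dir/publickey"
  )
}
```

Calling `generate_wg_keys /etc/wireguard` leaves a `privatekey` and a `publickey` file in that directory. Only the public key is meant to be shared with the other peers.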
The wg0.conf file contains all the necessary configuration parameters for the WireGuard interface.
Here are some of the main parameters that can be configured in wg0.conf:
- PrivateKey: The private key of the WireGuard interface. It is used to authenticate and encrypt traffic between peers.
- ListenPort: The port that WireGuard listens on for incoming connections (UDP 51820 is the conventional choice).
- Address: The IP address and subnet mask of the WireGuard interface.
- Peer: The configuration for one peer on the WireGuard network. It includes the public key of the peer, its allowed IPs (the IP ranges that are routed to that peer), and other options such as its endpoint.
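To make the parameters above concrete, here is a small sketch that renders a wg0.conf for one interface with a single peer. The function name `render_wg_conf`, the argument order, and all keys and addresses in the usage example are placeholders chosen for illustration.

```shell
#!/bin/bash
# Print a minimal wg0.conf with one [Interface] and one [Peer] section.
# Arguments: private_key address listen_port peer_public_key peer_endpoint peer_allowed_ips
render_wg_conf() {
  local private_key="$1" address="$2" listen_port="$3"
  local peer_key="$4" peer_endpoint="$5" peer_allowed_ips="$6"

  cat <<EOF
[Interface]
PrivateKey = ${private_key}
Address = ${address}
ListenPort = ${listen_port}

[Peer]
PublicKey = ${peer_key}
Endpoint = ${peer_endpoint}
AllowedIPs = ${peer_allowed_ips}
EOF
}
```

For example, `render_wg_conf "$(cat /etc/wireguard/privatekey)" 10.141.0.1/24 51820 "PEER_PUBLIC_KEY" 203.0.113.10:51820 10.141.0.2/32` prints a config that could be written to /etc/wireguard/wg0.conf. In a full mesh, every node needs one [Peer] section per other node, which is what makes manual maintenance impractical.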
Mesh Solution Overview
I started working on my spike, and finding the best way to implement such a mesh topology was a challenge. First I used Terraform to deploy a number of EC2 instances in multiple AWS regions. Then I went through my list of candidate approaches and started building and testing:
- Terraform and Ansible: I successfully created a mesh, but it was really difficult to manage new peers and automatically update wg0.conf when they joined. I concluded that this approach was fine for static setups but not for dynamic ones.
- Terraform, HashiCorp Vault and a ton of bash scripts: This looked promising. When connecting nodes via WireGuard, each node has to know the public key and endpoint IP of all peers. In this scenario, nodes with proper authentication in Vault were allowed to publish their own data and to read connection data from other peers. They could all read the meeting point data for our mesh (a data structure containing basic information about the mesh network), publish their own configuration to Vault, query Vault for other nodes known to the meeting point, and add a WireGuard peer for each of them. Although it worked, it was really complex to support and troubleshoot, especially after the handover.
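The core of the second approach, turning a shared list of peers into WireGuard peer entries, can be sketched roughly as follows. This is not the original set of scripts: Vault is swapped for plain text on standard input, and the line format and the function name `render_peer_sections` are assumptions made for this example. In a real setup the lines would come from reads against the secret store.

```shell
#!/bin/bash
# Build [Peer] sections for every registered node except this one.
# Input on stdin, one peer per line (format assumed for this sketch):
#   <name> <public_key> <endpoint> <mesh_ip>
render_peer_sections() {
  local self="$1"
  local name pubkey endpoint mesh_ip

  while read -r name pubkey endpoint mesh_ip; do
    [ -z "$name" ] && continue          # skip blank lines
    [ "$name" = "$self" ] && continue   # a node never peers with itself
    printf '[Peer]\n# %s\nPublicKey = %s\nEndpoint = %s\nAllowedIPs = %s/32\n\n' \
      "$name" "$pubkey" "$endpoint" "$mesh_ip"
  done
}
```

Each node would run this against the full peer list with its own name as the argument and append the output to its wg0.conf. Every join or leave means regenerating the file on every node, which hints at why this approach was hard to support.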
Then I came across a tool called Netmaker. It was at an early stage of development but looked really promising. (Since then, I have tested all versions, including the current one, 0.18.5, which was released a few days ago with big improvements and fixes.)
What is Netmaker
“Netmaker is a platform for creating fast and secure virtual networks with WireGuard. It is a tool for creating and managing virtual overlay networks. If you have at least two machines with internet access that you need to connect with a secure tunnel or thousands of servers spread across multiple locations or cloud providers, Netmaker is the perfect “tool”. It connects machines securely, wherever they are.”
Source: Netmaker.org
I spent some time going through the documentation and soon started testing it.
Now, after this intro, let’s see how we can create a secure mesh network using Netmaker and WireGuard.
How to Install Netmaker
- Start by launching a VM with Ubuntu 20.04 or later and a public IP. (Ubuntu is the distribution currently supported.)
- Open ports 443, 80, and 51821-51830 (UDP) on the security group. You can make this range smaller, but keep in mind that you need a port for each network you create. (I am going to explain more about networks later.)
- Run the following script:
sudo wget -qO /root/nm-quick-interactive.sh https://raw.githubusercontent.com/gravitl/netmaker/master/scripts/nm-quick-interactive.sh && sudo chmod +x /root/nm-quick-interactive.sh && sudo /root/nm-quick-interactive.sh
You have to answer a number of simple questions, and at the end you are presented with the login URL.
You are then asked to create a username and password, and after you log in you land on the Netmaker dashboard.
Create a network
The first thing we have to do afterwards is create a network and enter the IP range that our servers will use for secure cross-communication. (The WireGuard interface, wg0, gets assigned an IP address from that range.)
Click the ‘Networks’ tile on the dashboard, or in the left navigation panel click ‘Networks’.
On the Networks screen, click on the ‘Create Network’ button.
Give your network a name, and then enter a CIDR for the network. Alternatively, click on the ‘Autofill’ button and then change the generated name and CIDR.
Create the Keys
Then proceed by creating the required keys. When done, we can see that there are multiple ways to add a peer to our mesh network.
Create and configure the Nodes
Most of the hard work is done. Now it’s time to launch a few instances in AWS in multiple regions and spread them across public and private subnets. In our case almost all instances are in private subnets, with the exception of the Netmaker server and the Azure instance.
I like to use Terraform with GitLab Runners for my test deployments, and for this demo I had about 10 EC2 instances up and running really fast (using spot instances to minimise costs). Just remember that you need to deploy a standalone (on-demand) EC2 instance for Netmaker.
All the security groups for the nodes were configured to allow incoming UDP traffic on ports 51820–51830.
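The security group rule for the nodes could be added with the AWS CLI, sketched below. `aws ec2 authorize-security-group-ingress` and its flags are the standard CLI call, while the function name `allow_wireguard_udp`, the `DRY_RUN` switch, and the security group ID in the usage example are hypothetical. The default source CIDR of 0.0.0.0/0 is also an assumption and should be narrowed where possible.

```shell
#!/bin/bash
# Allow inbound UDP 51820-51830 on one security group.
# Set DRY_RUN=1 to print the command instead of executing it
# (DRY_RUN is a convention of this sketch, not an AWS CLI feature).
allow_wireguard_udp() {
  local sg_id="$1"
  local cidr="${2:-0.0.0.0/0}"
  local run="${DRY_RUN:+echo}"

  $run aws ec2 authorize-security-group-ingress \
    --group-id "$sg_id" \
    --protocol udp \
    --port 51820-51830 \
    --cidr "$cidr"
}
```

Running `DRY_RUN=1 allow_wireguard_udp sg-0123456789abcdef0 10.0.0.0/8` prints the command for review. Without `DRY_RUN` it is executed with whatever AWS credentials are configured in the shell.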
With the help of User Data and the commands shown below, we can configure the nodes to join the mesh during the launch process.
Here is an example of User Data that can be used (replace eyJzZXJxxxyxxxxxxxxxxxxxxxxcccccccccccvvvvvvv0000000== with your token):
#!/bin/bash
# Add the WireGuard COPR repo (EPEL 7 build) and install WireGuard
sudo curl -Lo /etc/yum.repos.d/wireguard.repo https://copr.fedorainfracloud.org/coprs/jdoss/wireguard/repo/epel-7/jdoss-wireguard-epel-7.repo
# -y is required: User Data runs non-interactively, so a prompt would abort the install
sudo yum install -y epel-release
sudo amazon-linux-extras install -y epel && sudo yum install -y wireguard-dkms wireguard-tools
# Add the Netmaker repo and its signing key, then install netclient
curl -sL 'https://rpm.netmaker.org/gpg.key' | sudo tee /tmp/gpg.key
curl -sL 'https://rpm.netmaker.org/netclient-repo' | sudo tee /etc/yum.repos.d/netclient.repo
sudo rpm --import /tmp/gpg.key
sudo yum check-update
sudo yum install -y netclient
# Join the mesh with the token generated in Netmaker
netclient register -t eyJzZXJxxxyxxxxxxxxxxxxxxxxcccccccccccvvvvvvv0000000==
After a few minutes we have our instances up and running, fully configured with WireGuard and netclient. All of them have automatically joined our mesh network.
Now let’s launch one more server, but this time in… Azure.
Time to check the Netmaker GUI and make sure that all nodes have joined. If they don’t show up immediately, there is no need to worry: it can take up to 5 minutes. In our case all nodes are now visible with a Healthy status.
At this point we have successfully deployed and configured a flat mesh network, not only between AWS instances but also with a server in a different cloud provider. All traffic between them is encrypted in transit by WireGuard.
Mesh Graph / Visualisation
Let’s see what our mesh network looks like at this stage.
Test our Mesh Network
How about running some tests to confirm that everything is working as expected?
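One simple check, sketched here under a few assumptions, is to probe every peer’s mesh IP from one of the nodes. The function name `check_mesh_peers` and the `PROBE_CMD` override are hypothetical, the ping flags are the Linux (iputils) ones, and the IPs in the usage example are placeholders from the 10.141.x.x range.

```shell
#!/bin/bash
# Probe each mesh IP once and report which peers answer.
# PROBE_CMD lets you swap the probe (hypothetical hook, useful for dry runs).
check_mesh_peers() {
  local probe="${PROBE_CMD:-ping -c 1 -W 2}"
  local failed=0
  local ip

  for ip in "$@"; do
    # $probe is intentionally unquoted so its flags are split into words.
    if $probe "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      failed=$((failed + 1))
    fi
  done

  # Succeed only if every peer answered.
  [ "$failed" -eq 0 ]
}
```

For example, `check_mesh_peers 10.141.0.1 10.141.0.2 10.141.0.3` prints one line per peer and returns a non-zero status if any of them did not answer. On a node, `sudo wg show` additionally lists each peer with its latest handshake.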
Access Control Lists
By default, Netmaker creates a “full mesh,” meaning every node in our network can talk to every other node. But there is a nice feature that you can use in order to enable/disable any peer-to-peer connection in the network.
The ACL feature can be accessed by either clicking on “ACLs” in the sidebar, or by clicking on a Node in the Node List.
Add External Clients
There are cases where external clients need to access services running on the nodes. The client can be a mobile phone, a laptop/tablet or an IoT device.
We can achieve that by creating an Ingress. Once connected to the Ingress, a client can reach all servers in the network.
Then we can generate the client configs. Clients can join either by scanning a QR code or by importing the WireGuard config. (Please note that WireGuard must be installed on the mobile phone, laptop, etc.)
In our case I downloaded the config to my laptop and connected using the WireGuard client.
For this demo I also installed Apache on an AWS EC2 instance and on the Azure instance, and I can access both from my laptop through a secure tunnel, using the 10.141.x.x IPs (the mesh network CIDR).
Conclusion
Creating a secure mesh network is just one use case for Netmaker and WireGuard. There are more, which we are going to cover in depth in future posts:
- Automate the creation of a large WireGuard-based (Mesh) network
- Secure access to a home or office network
- Provide remote access to resources like an AWS VPC, or K8S cluster
- Create clusters that span environments
- Remotely access a cluster from an external source
- Remotely access an external source from a cluster
- Manage a secure mesh of IoT devices
I hope you found this post useful. Feel free to reach out to me with any questions by using the contact form.