Multihost setup + Replication

Hello,

I’d like to understand the following: if we have 2 EC2 instances in a multi-host setup with replication, will this speed up failover when one of the two hosts fails, and will any manual steps be required on the client side?

Thank you!

The client will automatically reconnect. Reducing the ping timeout in the server settings will reduce the time it takes for the client to reconnect.

I didn’t quite understand whether this setup has any additional benefit over a multi-host setup without replication. Thank you.

Also, could you be more specific about the ping settings you are referring to?
These are the only ping options I can see on the server side:

Also, if I change these settings while users are connected to this server, will they need to re-download their client certificates? I can see that these settings are also part of the certificate used to connect to the server.

Another thing I noticed recently is that in a multi-host setup without replication, whenever you restart the VPN server there is no guarantee that it will start again on the initial host. This breaks the extra logic we have with iptables rules on top of Pritunl VPN to restrict access between peers. Is there a way to bind the server to a host, so that it moves to the healthy host only when the bound host fails?
Maybe the correct way to avoid this behavior is to have multi-host + replication in place. Are there any downsides we can expect if we follow this approach?

On stable connections, the ping timeout can be reduced to 20, or the ping interval to 3 with a timeout of 10. The options are pushed on connection, so the client does not need to update its configuration. There is also configuration sync with the Pritunl Client for changes that do require updates; this occurs on connection.
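To make the timing concrete, here is a rough Python sketch of how the two values from the reply above relate. It assumes the settings behave like an OpenVPN-style keepalive pair (a ping is sent after `interval` seconds of silence, and the peer is treated as dead after `timeout` seconds without any packet). The function name is made up for illustration and is not part of Pritunl.

```python
def missed_pings_before_timeout(interval: int, timeout: int) -> int:
    """Number of whole ping intervals that fit inside the timeout window."""
    if interval <= 0 or timeout <= 0:
        raise ValueError("interval and timeout must be positive")
    return timeout // interval


# Values suggested in the reply: ping interval 3 with a timeout of 10.
pings = missed_pings_before_timeout(3, 10)
print(f"about {pings} pings can go unanswered before the client reconnects")
```

Under that assumption the timeout is roughly the upper bound on how long a client takes to notice a dead host, so a lower value shortens failover but leaves fewer pings of slack on a lossy link.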

You would need to increase the replication count to equal the host count to get the server to always be running on a specific host.

Another thing I noticed while testing this setup is that even though the VPN server is replicated on 2 hosts, the server route populated in the AWS route table can point to only a single instance ENI. This breaks communication with AWS resources when using the Cloud Advertise approach if a peer is connected to the replica host whose ENI is not the target of the route in the subnet route table. (Is this because you can’t have more than one route for the same destination within a single route table?)
If a better example is needed, I can try to explain further.
Do you have any suggestions on how this can be handled?

A VXLAN overlay network is created to handle that. At minimum, the firewalls should allow UDP port 4789 between the hosts, although all traffic between the Pritunl hosts should be allowed. If you have non-NAT routes, the source/dest checking option on the AWS instance must be disabled.

If the route advertisement is configured it will automatically update the VPC routing table when a host fails.


I’m referring to a completely different scenario. Here’s a more detailed description:

We have a Pritunl VPN setup with a multi-host configuration (two EC2 instances) and replication enabled, running in a dedicated AWS VPC. Additionally, we have several other AWS VPCs connected via a Transit Gateway, with the appropriate routing table configurations in place to enable communication between the Pritunl VPN VPC and the rest of the VPCs.

When the “Cloud Advertisement” option is enabled in Pritunl, it automatically adds a route to the VPN server subnet in the subnet’s route table, pointing to the ENI of the EC2 instance hosting the VPN server.

The issue arises when both EC2 instances are running the VPN server simultaneously. If a VPN client connects to the second EC2 instance, but the route for the VPN CIDR already exists in the route table and points to the ENI of the first EC2 instance, then the second instance cannot advertise its own ENI for that CIDR. As a result, clients connected to the second instance experience communication issues with AWS services due to incorrect routing.
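The constraint described above can be modeled with a small Python sketch. It is a toy model, not the AWS API: the `RouteTable` class and the ENI names `eni-host1` and `eni-host2` are hypothetical, and the CIDR is the VPN virtual network listed later in this thread. It only illustrates that a route table holds one target per destination CIDR.

```python
import ipaddress


class RouteTable:
    """Toy model: one target per destination CIDR, as in a VPC route table."""

    def __init__(self):
        self._routes = {}

    def create_route(self, cidr, target):
        net = ipaddress.ip_network(cidr)
        if net in self._routes:
            # A second route for the same destination is rejected.
            raise ValueError(f"route for {net} already points to {self._routes[net]}")
        self._routes[net] = target

    def target_for(self, cidr):
        return self._routes[ipaddress.ip_network(cidr)]


table = RouteTable()
table.create_route("172.16.64.0/24", "eni-host1")  # first host advertises

try:
    table.create_route("172.16.64.0/24", "eni-host2")  # second host tries too
    second_route_added = True
except ValueError:
    second_route_added = False

print(second_route_added, table.target_for("172.16.64.0/24"))
```

In this model the second host can never become a target for the same CIDR, so any traffic for clients on that host still arrives at the first host.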

Hope this helps to understand the issue better.

Also, what about the question below?

In a multi-host setup with multiple VPN servers (e.g., three), all initially running on the same host (let’s say host1), I’ve noticed an issue. If we restart one of the VPN servers—say, VPN server1—it may come back up on a different host (e.g., host2). Since there’s no replication in place, this breaks the iptables-based rules we’ve implemented on top of Pritunl to control peer-to-peer access.
As a result, VPN server2 and VPN server3, which are still running on host1, can no longer communicate with the peers behind VPN server1, now running on host2.
Is there a way to bind each VPN server instance to a specific host, and only allow it to move to another healthy host if the original one fails?

Here is the full picture of my setup as this may help to understand further the type of issues I’m facing/describing:

  • 2 EC2 instances, currently with a multi-host setup only
  • 3 VPN servers: two of them (peer servers) with the Cloud Advertisement option enabled and one more (admin) without it
  • on the Pritunl hosts there is additional logic, implemented with extra iptables rules, that restricts communication between peers but allows the admin server to access any peer on the cloud-advertised VPN servers
  • the cloud-advertised VPN servers have routes to AWS networks, which allows two-way communication with AWS resources
  • the VPN server without Cloud Advertisement has no AWS network routes, as they are not needed
  • the public IPs of the hosts are used in the client file to connect to the VPN servers

No, it is designed to handle a replicated setup where the routing table has a static route to the other host. This is accomplished by the host that has the route forwarding that traffic over a VXLAN to the correct host. If you are making changes to iptables to add restrictions, that may be causing the issue.

This also needs to be done with a non-NAT configuration, as documented in the AWS route advertisement documentation.
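The forwarding described in this reply can be sketched as follows. This is only an illustration of the idea, not Pritunl's implementation: `delivery_path` and the host names are hypothetical, and the sketch assumes the VPC route table points at exactly one host.

```python
def delivery_path(client_host: str, route_target_host: str) -> list[str]:
    """Hosts a packet from the VPC crosses on its way to a VPN client."""
    path = [route_target_host]  # the VPC route table always delivers here first
    if client_host != route_target_host:
        path.append(client_host)  # extra hop over the VXLAN overlay
    return path


# The route table points at host1; one client is on host1, another on host2.
print(delivery_path("host1", "host1"))
print(delivery_path("host2", "host1"))
```

The second case is the one that depends on UDP port 4789 being open between the hosts and on source/dest checking being disabled.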

Ok, here are my findings:

Here is the current VPN setup and route configuration:

VPN Server1(multi-host: host1, host2 + replication + vxlan) routes:
172.16.64.0/24   - Cloud Advertise Virtual Network
10.110.20.0/24   - AWS route1
10.120.21.0/24   - AWS route2
10.130.22.0/24   - AWS route3
10.110.253.0/24  - AWS route4
10.120.254.0/24  - AWS route5
10.130.255.0/24  - AWS route6
172.20.0.0/27    - Admin Server

VPN Admin Server(multi host: host1, host2 + replication + vxlan) routes:
172.20.0.0/27   - Virtual Network
172.16.64.0/19  - VPN Server1
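One detail in the listing above is that the Admin Server's route to VPN Server1 uses a /19 prefix while VPN Server1's own virtual network is a /24. The Python sketch below, using only the standard `ipaddress` module and the CIDRs from the listing, checks how the two relate. The variable names are mine, and the sketch makes no claim about whether the wider prefix is intentional.

```python
import ipaddress

server1_network = ipaddress.ip_network("172.16.64.0/24")
admin_route_to_server1 = ipaddress.ip_network("172.16.64.0/19")
admin_network = ipaddress.ip_network("172.20.0.0/27")

# Is the Server1 virtual network fully inside the route the Admin Server pushes?
covers = server1_network.subnet_of(admin_route_to_server1)

# Does that wider route also swallow the Admin Server's own network?
overlaps_admin = admin_route_to_server1.overlaps(admin_network)

first, last = admin_route_to_server1[0], admin_route_to_server1[-1]
print(f"covers={covers} range={first}-{last} overlaps_admin={overlaps_admin}")
```

The same check can be repeated for each AWS route to confirm that none of them falls inside the /19.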

Observations:

  1. Route Propagation Limitation
  • With replication and VXLAN enabled in a multi-host setup, the AWS CIDRs defined in VPN Server1’s route table are not propagated to connected peers.
  • On the peer side, only routes to the VPN Server1 and Admin Server networks are visible. This limits peer access to AWS networks.
  2. Cross-Host Peer Isolation
  • If a peer on VPN Server1 connects to host1, and a peer on the Admin Server connects to host2, the Admin peer cannot reach the peer from VPN Server1.
  3. Replication Without VXLAN
  Even without VXLAN but with replication enabled, the same issue occurs:
  • A VPN Server1 peer connected to host1
  • An Admin Server peer connected to host2 → Admin peer cannot reach the VPN Server1 peer

iptables extra configuration we set

To restrict peer-to-peer traffic while allowing the admin network to reach all other networks, we’ve applied the following iptables rules:

  • Peer isolation: Peers on VPN Server1 are blocked from reaching each other and the admin network.
  • Admin access: Admin network peers can access all other networks, including VPN Server1.

FORWARD chain (relevant rules):

DROP  tun4 → tun6  from 172.16.64.0/24 to 172.20.0.1        # Block VPN Server1 → Admin Server
DROP  tun4 → tun4  from 172.16.64.0/24 to 172.16.64.1:22/80/443
DROP  tun4 → tun4  from 172.16.64.0/24 to 172.16.64.0/24    # Peer-to-peer block
DROP  tun4 → tun6  from 172.16.64.0/24 to 172.20.0.0/27 ctstate NEW

ACCEPT  tun6 → tun4  from 172.20.0.0/27 to 172.16.64.0/24   # Admin → VPN Server1
ACCEPT  tun4 → tun6  ctstate RELATED,ESTABLISHED

INPUT chain (relevant rules):

DROP  tun4 → *  to 172.20.0.1                              # Block direct admin access
DROP  tun4 → *  to 172.16.64.1:22/80/443                   # Block peer → VPN server itself

ACCEPT  tun6 → *  to admin and other networks              # Admin access allowed
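To reason about which of the FORWARD rules above a given packet hits, here is a minimal first-match evaluator in Python. It is a sketch under stated assumptions: the rules are evaluated in the order listed, anything not shown (other rules, the chain policy, Pritunl's own rules) is ignored, and the `eth0` interface in the last example is hypothetical. The `Rule`, `Packet`, and `verdict` names are made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str
    in_if: str
    out_if: str
    src: Optional[str] = None
    dst: Optional[str] = None
    dports: tuple = ()
    states: tuple = ()


@dataclass(frozen=True)
class Packet:
    in_if: str
    out_if: str
    src: str
    dst: str
    dport: Optional[int] = None
    state: str = "NEW"


# The FORWARD rules from the listing, in the order given there.
FORWARD = [
    Rule("DROP", "tun4", "tun6", "172.16.64.0/24", "172.20.0.1/32"),
    Rule("DROP", "tun4", "tun4", "172.16.64.0/24", "172.16.64.1/32", dports=(22, 80, 443)),
    Rule("DROP", "tun4", "tun4", "172.16.64.0/24", "172.16.64.0/24"),
    Rule("DROP", "tun4", "tun6", "172.16.64.0/24", "172.20.0.0/27", states=("NEW",)),
    Rule("ACCEPT", "tun6", "tun4", "172.20.0.0/27", "172.16.64.0/24"),
    Rule("ACCEPT", "tun4", "tun6", states=("RELATED", "ESTABLISHED")),
]


def matches(rule: Rule, pkt: Packet) -> bool:
    if rule.in_if != pkt.in_if or rule.out_if != pkt.out_if:
        return False
    if rule.src and ip_address(pkt.src) not in ip_network(rule.src):
        return False
    if rule.dst and ip_address(pkt.dst) not in ip_network(rule.dst):
        return False
    if rule.dports and pkt.dport not in rule.dports:
        return False
    if rule.states and pkt.state not in rule.states:
        return False
    return True


def verdict(pkt: Packet, chain=FORWARD) -> Optional[str]:
    """Return the action of the first matching rule, or None if none match."""
    for rule in chain:
        if matches(rule, pkt):
            return rule.action
    return None


peer_to_peer = verdict(Packet("tun4", "tun4", "172.16.64.10", "172.16.64.11"))
admin_to_peer = verdict(Packet("tun6", "tun4", "172.20.0.5", "172.16.64.10"))
peer_reply = verdict(Packet("tun4", "tun6", "172.16.64.10", "172.20.0.5", state="ESTABLISHED"))
other_interface = verdict(Packet("eth0", "tun4", "172.20.0.5", "172.16.64.10"))
print(peer_to_peer, admin_to_peer, peer_reply, other_interface)
```

The last case shows that rules pinned to tun4 and tun6 only match traffic entering and leaving on exactly those interfaces. In this model, a packet from the admin network that reaches the host on any other interface falls through all of the listed rules, and its fate depends on rules not shown here.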

All traffic between the hosts is allowed using a self-referencing ingress rule, and the ENI source/dest check is disabled on both hosts.

Conclusion:

Everything functions as expected only when:

  • Multi-host setup is enabled
  • Both replication and VXLAN are disabled
  • All VPN servers run on the same host

Enabling either replication or VXLAN (or both) causes routing and communication issues between peers and across hosts.

Am I missing something in my multi-host + replication + VXLAN setup that would make it work no matter which host my VPN server is running on or which host a peer is connected to?

I can’t look through and debug all that information. Other than some rules to block access to ports like SSH, the iptables rules should not be modified. Any access control should be handled with security groups outside the instance.

There’s no need to block access between server virtual networks. Pritunl will already create iptables rules to restrict network access to only the specific routes included in that server. The only exception is if the Restrict Routing option is disabled or 0.0.0.0/0 is included in the routes. Subnets can also be blocked from the Pritunl server configuration by adding routes with the Block Route and Net Gateway options.

The web ports 80 and 443 should never be blocked. With WireGuard connections the Pritunl Client will send HTTPS requests to the web server on the virtual server IP to indicate to the Pritunl server that the connection is still active.

This configuration should be done with the NAT option disabled for the AWS subnets and source/dest checking disabled in the AWS instance settings. Additionally all traffic should be allowed between the Pritunl hosts in AWS. Verify the Local IP shown in the hosts tab is correct and that the hosts can reach other hosts using this IP.

I have looked through the documentation a couple of times, but I didn’t find any out-of-the-box way to restrict traffic between peers within the same VPN network.

Once you add a route to a different VPN server in the origin VPN server and vice versa, any peer can reach the peers in the other VPN server. This is something we would like to avoid, so we use extra iptables rules to grant access from the admin VPN server to all other VPN servers but not the other way around.

We are not using WireGuard currently, and we only block gateway ports 22, 80, and 443 for peers because we don’t want them to be able to reach the gateway and open the web UI.

Is there any way to organize a call or meeting to discuss our intentions further?