We’re encountering stability issues with our Pritunl instance. At random intervals, the server becomes unresponsive, disconnects all users, and goes offline. This happens even when there’s no significant load or activity.
Unresponsiveness & Disconnects: The instance becomes unreachable, disconnecting all active VPN sessions and rendering it unusable for 1 to 5 minutes.
Load Conditions: The issue occurs sporadically, even during periods of low user activity. We’re not pushing high throughput or running heavy tasks when it happens.
CPU usage: Also sporadic. Usage is sometimes low, then at times it spikes sharply even though no actions appear to have taken place.
Network activity: Doesn’t seem to show any strange or overloading activity as of yet.
Memory usage: Stable, with no spikes or signs of running out of memory that we’ve seen.
Any help on where to focus our troubleshooting, or pointers to known issues that match these symptoms, would be appreciated.
Due to limitations on screenshot uploads, I’ve uploaded and shared the screenshots through our WorkDrive.
Check the logs in the top right of the web console and in /var/log/pritunl.log. If there’s an issue with the database, it will only be logged to the file. Also check sudo journalctl -u mongodb -n 500 for errors.
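As a rough sketch of the checks above, a small helper can filter log text for common failure keywords. The function name recent_errors and the keyword list are illustrative assumptions, not part of Pritunl or MongoDB, so adjust the pattern to whatever your logs actually contain.

```shell
# Hypothetical helper: keep only lines that look like failures.
# Reads log text on stdin; the optional first argument caps the number
# of lines printed (default 50). The keyword list is an assumption.
recent_errors() {
  grep -iE 'error|exception|traceback|fatal' | tail -n "${1:-50}"
}

# Example usage (file path and unit name as mentioned above):
#   sudo cat /var/log/pritunl.log | recent_errors
#   sudo journalctl -u mongodb -n 500 --no-pager | recent_errors 20
```

Reading from stdin keeps the same filter usable for both the log file and the journal output.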
The two issues that come up repeatedly in the logs are:
There is a discrepancy between the link-mtu - the server is using 1558 and the client is using 1560. Which should be used for best reliability? How do I change this on both the server and the client?
There is a discrepancy between the server using comp-lzo compression and the client using no compression. How do I stop the server from attempting to use comp-lzo?
We have 62 servers and about 100 users, so I am guessing that these small inconsistencies add up to a large amount of load with that many servers and users. Would you agree?
The VPNs are only used on routers so that we can remotely access our clients’ routers, so there’s rarely much traffic. The exception is one client where we are trying to set up a VPN connection via Pritunl between an Azure virtual machine and their local office, so that they can run an rsync backup and keep a local copy of their Azure VM. When we start the rsync, it crashes the server. However, the server was also crashing before this rsync setup, and we were never able to figure out the cause. We thought the Pritunl VM was underpowered and using a shared CPU, so we increased the compute and RAM and moved it to a dedicated CPU. That helped, but now it’s happening again.
This second screenshot is from when it crashes. You can see a few spikes, but it’s not pinned at 100%. This graph is from Pritunl (not the cloud VM host):
The MTU and comp-lzo warnings won’t cause the server to crash or increase load on the server. If there’s nothing in the output indicating what caused the process to exit, run sudo dmesg -THk and look for out-of-memory errors or other errors. Also run sudo journalctl --lines=5000 and check for errors.
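To narrow down the kernel log, something like the following sketch may help. The function name find_oom and the match patterns are assumptions based on typical Linux OOM-killer messages; the exact wording can vary between kernel versions.

```shell
# Hypothetical helper: keep only lines that look like OOM-killer activity.
# Reads kernel log text on stdin. The patterns are assumptions based on
# typical messages such as "Out of memory: Killed process ...".
find_oom() {
  grep -iE 'out of memory|oom-killer|killed process'
}

# Example usage (-T prints readable timestamps, -k limits to kernel messages):
#   sudo dmesg -T -k | find_oom
```

If this prints nothing, the process was probably not killed for memory, and the broader journalctl output is the next place to look.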