Random issues: timeout awaiting response headers

Hi everybody!

We’re experiencing weird and unpredictable behavior on some of our services:

Cockpit:

  • There seem to be random issues with the WebSocket connection. Most of the time the application fails; hitting F5 eventually fixes it, but only after multiple tries.
  • It appears that pritunl-zero is swallowing some of the WebSocket messages after the first (micro)seconds of the connection, making the page unusable and unpredictable.

Cockpit works fine:

  • inside the local network
  • outside the local network when running via nginx proxy manager
  • outside the local network when running via Cloudflare Tunnel/Cloudflare Zero Trust

The Pritunl-Zero logs do not show anything at all.

We’re running some OctoPrint instances:

  • Once again, accessing it locally, via Cloudflare, or via nginx proxy manager works perfectly fine.
  • Through pritunl-zero, the page sometimes simply works, most of the time does not load at all (502), and sometimes shows the login screen but does not proceed after login.
  • If one of these issues arises, hitting refresh does not help.
  • If the login screen shows but the page does not proceed, the browser dev tools show the WebSocket being terminated before the connection was established.
  • If the page doesn’t load at all, we get “http: proxy error: net/http: timeout awaiting response headers\n” in the pritunl-zero logs → the webserver itself does not see the request at all.
  • If the login fails, we get a Proxy Serve error with the following stack:

Screenshot 2024-03-07 17.22.08

Is this a known issue and is there a workaround available?
Note that this is not a long request timeout issue.

Check the Chrome developer tools console and network tab for more errors.

Hi Zach!

As I’ve mentioned, there are no direct errors in the dev tools. Only examining the logs of the WebSocket connection itself shows that the connection is missing some messages the underlying protocol expects. As also mentioned, other zero-trust solutions do not have this issue.

For the OctoPrint issue: there are no logs. The server (Pritunl-Zero) times out and afterwards responds with a 5xx. As also mentioned, most of the time the page does not load at all; the request does not even reach the underlying webserver.

It appears to me that there might be an issue with the synchronisation of your goroutines for WS. However, I haven’t had time to dig into your codebase yet.

This may be related to a library update. I have a similar recent issue with the WebSocket code in Pritunl Cloud, which uses the same code to proxy instance VNC connections between nodes. Once the issue is fixed in Pritunl Cloud, both code bases will be updated. The WebSocket proxy code is in pritunl/pritunl-zero/proxy/ws.go.