Proxying my Homelab (Part 1) - Options, Options, Options

2024-06-13

When I began writing this post, it was supposed to be a quick run-through of how I proxy external traffic into my homelab. Along the way, though, I've been answering questions from other homelabbers on the best way to achieve this for their setups, and benchmarking my own implementation to optimize it for my particular network. I'll do my best to be brief while laying out all the options and testing that I've been going through for the past year.

In part one, I'll discuss why it might be desirable to allow the public internet to reach parts of a homelab, talk about several options for doing so, and discuss the tradeoffs of each option. Future parts will cover my chosen implementation, along with configuration details and benchmarking.

With that bit of housekeeping out of the way, let's get started.

Let the Internet In? Are You Mad?

Do I really need to let the whole internet walk through my front door? It sounds a bit hyperbolic, but in broad terms that's fundamentally what we're discussing today.

This isn't the case for everyone, but many homelabbers like myself run a variety of selfhosted services either for themselves, their family and friends, or the internet at large. Whether it's to reduce dependence on cloud providers for services critical to modern life (file storage, email, web hosting, etc.) or to host something more fun (like a game server), these services require one critical component: users. The user pool may contain humans and/or robots (read: other computers), but these users will all need to access the service from somewhere.

That "from somewhere" portion is critical. While most selfhosted services only need to be accessible to those on a particular network, some services may need to be accessible from the internet at large. There is an inherent security risk in exposing a service to the internet, but that trade-off can be worth it.

Here are a few of the services I run, along with the type of access each requires:

  • Publicly Accessible: Nextcloud
    • I host a Nextcloud instance that allows me to both store files and share them with others. To facilitate sharing, I must be able to send links to those who aren't on my VPN or local network.
  • Accessible over VPN: Home Assistant
    • Home Assistant logs environment data from several temperature/humidity probes for historical purposes. I may want to check the temperature of my lab's closet remotely, but the world does not.
  • Local Network Access Only: dhcpd
    • I run a dhcpd server at each of my sites, and this service should only ever be accessed by devices on the networks it controls the scopes for. It offers no features to a client on another network.
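The three access tiers above can be sketched as a small lookup. This is only an illustration: the subnets, service names, and the assumption that a VPN-only service is also reachable from the local network are all hypothetical and not taken from my actual setup.

```python
import ipaddress

# Hypothetical address ranges, chosen only for illustration.
LOCAL_NET = ipaddress.ip_network("192.168.1.0/24")  # home LAN
VPN_NET = ipaddress.ip_network("10.8.0.0/24")       # VPN overlay network

# Which tiers may reach each service (mirrors the list above; the
# assumption that VPN-only services also allow local clients is mine).
SERVICE_ACCESS = {
    "nextcloud": {"local", "vpn", "public"},
    "home-assistant": {"local", "vpn"},
    "dhcpd": {"local"},
}


def access_tier(client_ip: str) -> str:
    """Classify a client address as 'local', 'vpn', or 'public'."""
    addr = ipaddress.ip_address(client_ip)
    if addr in LOCAL_NET:
        return "local"
    if addr in VPN_NET:
        return "vpn"
    return "public"


def is_allowed(service: str, client_ip: str) -> bool:
    """Return True if a client at client_ip may reach the named service."""
    return access_tier(client_ip) in SERVICE_ACCESS[service]
```

In practice this kind of decision lives in firewall rules or reverse proxy ACLs rather than application code, but the logic is the same: where the client is coming from determines what it may reach.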

We're going to tackle the Publicly Accessible use-case in this series.

Comparison of Options

Now that we've established why something running in a homelab might need to be accessible from outside the home network, let's review some options for getting traffic into the network.

  • Don't!
  • Run a VPN, and connect to the service securely wherever I am.
  • Run a reverse proxy (like nginx or HAProxy) inside my network & port-forward on my router
  • Use Cloudflare Tunnels to proxy traffic into my network via the Cloudflare network
  • Run the service on a cloud provider, duh!
  • Run a reverse proxy on a cloud provider, and tunnel back to my network from there

I'll break down each of these in turn, but I personally chose to run a reverse proxy on a cloud provider's VPS and tunnel the traffic back to my network.

Don't!

This one's pretty self-explanatory. Selfhosted services don't have to be exposed to the internet! In fact, most services probably shouldn't be exposed to the internet at all. There are serious security risks involved in letting public, untrusted traffic hit your services. If you only ever need access from home, then feel free to keep it that way.

Host a VPN

If you (or a small group of trusted users) need access to your lab from anywhere, a VPN should be your first stop. While many of the public, third party "VPN" providers like Mullvad or NordVPN offer some kind of remote access solution, I'm referring to the classical definition of a Virtual Private Network that your "remote" roving clients (like a laptop or phone) and your servers at home are connected to. VPNs create a secure overlay network across the internet where clients can talk with each other as if they were directly connected.

There are many options for this, and I won't go in-depth here, but a few are:

  • Create a WireGuard network
  • Host an OpenVPN server
  • Use a third party mesh network manager like Tailscale
  • Create some fancy network topology using Nebula (cool project, mostly geared towards businesses)

Use Cloudflare Tunnels

If you really need to expose a service to the public internet, a popular option among homelabbers is Cloudflare Tunnel. Cloudflare Tunnel is a reverse-proxy and tunneling product from Cloudflare for exposing services to the internet via their network.

This has some inherent benefits: Cloudflare has PoPs (points of presence) in datacenters all over the world, offers a seemingly robust CDN (content delivery network) with DDoS protection, and tunnels are easy to configure. Plus, any attack against you is now an attack against Cloudflare, and they have a vested interest in mitigating that for you.

However, I have a few reasons against using them:

  1. If you're not an existing Cloudflare customer (like me), there's no automatic tie-in with their other services like DNS
  2. To use Cloudflare's rule engine and WAF (web application firewall), you must use their TLS certificate between them and your clients. That means they can see all the unencrypted traffic bound for your tunnel. See this reddit comment on r/homelab for a nice explanation.
    • Yes, you can use Cloudflare Tunnels without their WAF/proxy and host your own proxy at home with your own certificates. At that point, CF is just sending you the raw TCP connection, and you don't benefit from their cloud-side attack mitigations.
  3. Up until 2024/05/16 it was actually against the Cloudflare ToS to:

[...] use the Services for serving video or a disproportionate percentage of pictures, audio files, or other non-HTML content [...], unless purchased separately as part of a Paid Service or expressly allowed under our Supplemental Terms for a specific Service.

The ToS restriction was the main reason I didn't go with Cloudflare Tunnel. I'm planning to deploy some video and photo based services that need public access in the future, so if I were to breach the ToS I'd be opening myself up to unexpected downtime/shut-offs of my tunneling system. However, with that stipulation now removed, it may be a good option for you if you're comfortable letting Cloudflare inspect your traffic for their WAF and proxy services.

Run the Service on a Cloud Provider/VPS

Instead of running the service at home, running it on a cloud provider with no tie-back to home is a totally valid option for some projects. It removes the potential security risk of opening up a home network to the internet, and the service won't go offline in the case of a local blackout or internet outage. However, it can be cost-prohibitive, especially if you need more CPU power or large bulk storage (for something like a media server), so it may or may not be viable depending on the specific use-case.

Run a Reverse Proxy at Home and Port Forward

Running a reverse proxy like HAProxy or Nginx is a common and good practice in the selfhosted world, as it provides a single convenient place for HTTPS certificate storage, TLS termination, and access logging/blocking of potentially malicious clients. Even if you're not running a reverse proxy for the express purpose of handling external traffic, it's a good idea. In this context, though, allowing external traffic to reach an internal service requires creating a port-forward on your router that sends traffic from a specified port (like 80/443 for HTTP/HTTPS traffic) to your reverse proxy for evaluation and routing. This can be a good option if you have a static IP assigned by your ISP, or if you use a dynamic DNS service to keep your DNS records pointed at your current IP.

However, common ports left open to the internet will be noticed by crawlers from companies like Google or Microsoft and by internet search engines like shodan.io. If you don't want your home's public IP associated directly with the fact that you're running web services there, that may be an issue. Another common issue is that ISPs may unilaterally block ports (like 25 for SMTP) on residential connections, which may prevent this from being a viable option in the first place.
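One rough way to find out whether a forwarded port is actually reachable (or is being blocked somewhere along the way) is to attempt a TCP connection to it from a machine outside your home network. The sketch below uses only the Python standard library; the helper name `port_is_open` is my own for illustration, and a failed connection can't tell you whether the block is at the ISP, the router, or the service itself.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling `port_is_open("203.0.113.10", 443)` from a VPS or a phone hotspot, with the placeholder address swapped for your own public IP, gives a quick yes/no on whether the forward works.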

Run a Reverse Proxy on a VPS, and Tunnel Back

As with the at-home setup, this involves running a reverse proxy like Nginx or HAProxy, but this time on a cloud VPS. Traffic is then tunneled back to the home network over a tunneling protocol such as OpenVPN, WireGuard or ssh. The reverse proxy could connect to a second reverse proxy on the other side of the tunnel, or directly to the services that need to be exposed.

This partially mitigates the danger of having ports wide-open on a home router, depending on the chosen configuration. This option also has the most complexity, as it requires securing both the VPS and the proxied services/network manually.

Several options for transport layers/implementation include:

  • boringproxy - a Cloudflare Tunnel-like combination of nginx and ssh, with automatic certificate provisioning
  • Reverse proxy with a point-to-point WireGuard tunnel
  • Reverse proxy with an ssh tunnel
  • Reverse proxy connected to OpenVPN
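To make the VPS-side piece concrete, here is a minimal sketch of a raw TCP relay: it accepts connections on the VPS and copies bytes to a backend address that is reachable over the tunnel. This is not the implementation I run, and it is far simpler than Nginx or HAProxy (no TLS termination, logging, or access control). The function names and the addresses in `main` are hypothetical.

```python
import asyncio


async def pipe(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass  # peer reset or broken pipe: just stop relaying
    finally:
        # Simplification: a full close rather than a TCP half-close.
        writer.close()


def make_handler(backend_host: str, backend_port: int):
    """Build a connection handler that relays each client to the backend."""

    async def handle(client_reader, client_writer):
        try:
            backend_reader, backend_writer = await asyncio.open_connection(
                backend_host, backend_port
            )
        except OSError:
            client_writer.close()  # backend (or the tunnel) is down
            return
        # Relay both directions concurrently until both sides finish.
        await asyncio.gather(
            pipe(client_reader, backend_writer),
            pipe(backend_reader, client_writer),
        )

    return handle


async def main() -> None:
    # Hypothetical addresses: listen publicly on 8443 and forward to a
    # service at 10.0.0.2:8080 on the far side of the tunnel.
    server = await asyncio.start_server(
        make_handler("10.0.0.2", 8080), "0.0.0.0", 8443
    )
    async with server:
        await server.serve_forever()

# To run the relay: asyncio.run(main())
```

The tunnel itself is invisible to this code. Whatever transport is chosen just has to make the backend address routable from the VPS.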

What's Next?

There are many options for bringing external traffic into a homelab or private network environment, and I've barely scratched the surface here. It's important to evaluate your particular situation and select a method that makes sense for you.

In the next post, I'll talk about the implementation that I've been running for the last year, and then dive into optimizations, changes, and upgrades.