Architecture: Critical Sequences

Firezone is a distributed system with many moving parts, but some parts are especially critical to the integrity of the entire system:

Authentication: User authentication, usually with an identity provider.
Policy evaluation: Deciding whether to allow or deny a connection request.
DNS resolution: Resolving DNS-based Resources.
High availability: How Firezone achieves high availability through load balancing and automatic failover.

These will be explained in more detail below.

Authentication

Firezone authenticates users using two primary methods:

Email (OTP): Users receive a one-time password (OTP) via email.
OpenID Connect: Users authenticate with an identity provider that supports OpenID Connect (OIDC).

The authentication process for each is similar. Both methods begin the authentication process at your Firezone account's sign in page: https://app.firezone.dev/<your-account>.

However, the OIDC flow redirects the user to the identity provider for authentication before the final redirect back to Firezone.

Here's how the authentication flow works:

Firezone authentication sequence diagram

User clicks Sign in from the Client.
The Client generates random 32-byte state and nonce values. These are used to prevent certain kinds of forgery and injection attacks.
A browser window opens to your account's sign in page, https://app.firezone.dev/<your-account> containing the nonce and state parameters.
The user chooses which authentication method to use. If OIDC, the user is redirected out to the identity provider.
After successfully authenticating, the user is redirected back to the admin portal.
The admin portal mints a Firezone token created from the nonce parameter and other information.
The admin portal issues a final redirect to firezone-fd0020211111://handle_client_sign_in_callback with the token and state parameters from the initial request.
The Client receives this callback URL and validates the state parameter matches what it originally sent. This prevents other applications from injecting tokens into the Client's callback handler.
The Client saves this token in a platform-specific secure storage mechanism, for example Keychain on macOS and iOS.
The Client now has a valid token and uses it to authenticate with the control plane API.
The authentication process is complete.

Policy evaluation

Policy evaluation is the process the Policy Engine uses to decide whether to allow or deny a connection request from a Client to a Resource.

If the request is allowed, connection setup information is sent to the Client and the appropriate Gateway. If the request is denied, it's logged and then dropped. This ensures that Clients are only connected to Gateways that are serving Resources the User is allowed to access.

Connections in Firezone are always default-deny. Policies must be created to allow access.

Here's how the process works:

Firezone policy evaluation sequence diagram

The User attempts to access a Resource, e.g. 10.10.10.10.
The Client sees the request and opens a connection request to the Policy Engine.
The Policy Engine evaluates the request against the configured Policies in your account based on factors such as the Groups the user is a part of, which Resource is being accessed, and so forth. If a match is found, the connection is allowed. If no match is found, the connection is dropped.
If the connection is allowed, the Policy Engine sends the Client the WireGuard keys and NAT traversal information for the Gateway that will serve the Resource.
The Policy Engine sends similar details to the Gateway.
The Client and Gateway establish a WireGuard tunnel, and the Gateway sets up a forwarding rule to the Resource.
Connection setup is complete. The User can now access the Resource.

Since the Client only receives WireGuard keys and NAT traversal information when a connection is allowed, it's not possible for a Client to exchange packets with the Gateway until explicitly allowed by the Policy Engine.

This means Gateways remain invisible to the outside world, helping to protect against classes of attacks that perimeter-based models may be susceptible to, such as DDoS attacks.

DNS resolution

Secure DNS resolution is a critical function in most organizations.

Firezone employs a unique, granular approach to split DNS to ensure traffic intended only for DNS-based Resources is routed through Firezone, leaving other traffic untouched -- even when resolved IP addresses overlap.

To achieve this, Firezone embeds a tiny, in-memory DNS resolver in each Client that intercepts all DNS queries on the system.

When the resolver sees a query that doesn't match a known Resource, it operates in pass-through mode, forwarding the the query to the system's default resolvers or configured upstream resolvers in your account.

If the query matches a Resource, however, the following happens:

The resolver generates a special, internal IP from the range 100.96.0.0/11 or fd00:2021:1111:8000::/107 and stores an internal mapping of this IP to the DNS name originally queried. The IP is returned to the application that made the query.
When the Client sees traffic for the IP, it triggers connection setup. The Client sends a request to the Policy Engine for evaluation. If the request is allowed, the Policy Engine finds an appropriate Gateway to route the traffic.
If the Policy Engine approves the Client's request, it sends the Client the WireGuard keys and NAT traversal information for the Gateway that will serve the Resource.
Once the connection is established (and every time an application re-queries a DNS name), the Client sends a message using Firezone's p2p control protocol to the Gateway, triggering a DNS query using the Gateway's system resolver for the DNS name. The result of this DNS query is stored on the Gateway in a lookup table and cached for 30 seconds.
Traffic from the application flows, and the Gateway translates the internal IP back to the actual IP address of the Resource, forwarding the traffic accordingly.

This is why you'll see DNS-based Resources resolve to IPs such as 100.96.0.1 while the Client is signed in:

> nslookup github.com

Server: 100.100.111.1
Address: 100.100.111.1#53

Non-authoritative answer:

Name: github.com
Address: 100.96.0.1

Notice in the above process that at no point does the Client's system resolver see the actual IP address of the Resource. This ensures that your DNS data remains private and secure.

For a deeper dive into how (and why) DNS works this way in Firezone, see the How DNS works in Firezone article.

Why Firezone uses a mapped address for DNS Resources

This is a common source of confusion among new Firezone users, so it's helpful to explain why Firezone uses mapped IPs for DNS Resources instead of simply using the actual resolved IP.

Consider the case where two DNS Resources resolve to the same IP address, such as when Name-based virtual hosting is used to host two web applications on the same server:

gitlab.company.com resolves to IP 172.16.0.5
jenkins.company.com also resolves to IP 172.16.0.5

Remember that routing happens at the IP level. We can't independently route packets for the same IP to two different places. If Firezone used the Resource's actual IP address to route packets, the User would be able to access jenkins.company.com if they were granted access only to gitlab.company.com.

Using mapped IPs allows Firezone to securely route DNS Resources no matter how many other services share the same IP address.

High availability

Firezone was designed from the ground up to support high availability requirements. This is achieved through a combination of load balancing and automatic failover, described below.

Load balancing

When a Client wants to connect to a Resource, Firezone randomly selects a healthy Gateway in the Site to handle the request. The Client maintains the connection to that Gateway until either the Client disconnects or the Gateway becomes unhealthy.

This effectively shards Client connections across all Gateways in a Site, achieving higher overall throughput than otherwise possible with a single Gateway.

Automatic failover

Two or more Gateways deployed within a Site provide automatic failover in the event of a Gateway failure.

Here's how it works:

When the admin portal detects a particular Gateway is unhealthy, it will stop using it for new connection requests to Resources in the Site.
Existing Clients will remain connected to the Gateway until they themselves detect it to be unhealthy.
Clients identify unhealthy gateways using keepalive timers. If the timer expires, the Client will disconnect from the unhealthy Gateway and request a new, healthy one from the portal.
The Client keepalive timer expires after 10 seconds. This is the maximum time it takes for existing Client connections to be rerouted to a healthy Gateway in the event of a Gateway failure.

By using two independent health checks in the portal and the Client, Firezone ensures that temporary network issues between the Client and portal do not interrupt existing connections to healthy Gateways.

Need additional help?

See all support options or try asking on one of our community-powered support channels:

Discussion forums: Ask questions, report bugs, and suggest features.
Discord server: Join discussions, meet other users, and chat with the Firezone team
Email us: We read every message.

Or try searching the docs: