Load balancers and reverse proxies

Load balancers and reverse proxies are both common in modern web-based applications and architectures, and can be deployed independently or together. This section provides an overview of these technologies, their relevant use cases in ArcGIS systems, and the capabilities they provide.

  • A load balancer is generally a hardware or software component that distributes client requests or workloads across multiple computing resources, such as physical servers, virtual servers, or clusters. Load balancing helps to balance system utilization, reduce risk, simplify service delivery and growth, and improve backend server security.
  • A reverse proxy is a software or hardware component that accepts HTTPS requests, such as those from an external client, and routes those requests to the correct internal resources, URLs or services. Reverse proxies are often used to route external requests that arrive at one URL to a different internal URL or system which would not otherwise be accessible to that external client. Reverse proxies are also frequently used to pass traffic from known frontend network ports, such as HTTPS requests to port 443, to other private ports on a backend system. Reverse proxies ensure that there is a layer of security, obfuscation, and potentially request filtering logic between the client making the request and the backend server that is responding to the request.

Implementation patterns

Many ArcGIS systems implement load balancing in combination with reverse proxy usage. Common implementation patterns include:

  • Load balancing the client requests sent to user-facing components of ArcGIS Enterprise, such as the ArcGIS Enterprise portal and federated ArcGIS Server sites, including ArcGIS Image Server, Workflow Manager Server, or others
  • Load balancing user requests for static application content such as web application HTML, JavaScript, and CSS resources across multiple web servers
  • Load balancing requests that are received over HTTPS on port 443 and forwarding them to a backend service that listens for HTTPS requests on port 6443, such as ArcGIS Server
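
The port-mapping pattern in the last item can be sketched in a few lines of Python. This is only an illustration of the idea, not how the ArcGIS Web Adaptor or any particular proxy implements it. The host names, the `ROUTES` table, and the `to_backend_url` helper are all hypothetical. Port 6443 for ArcGIS Server comes from the text above, while port 7443 for the portal is an assumption added for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical routing table: public web context -> (internal host:port, internal context).
# None of these host names come from a real deployment.
ROUTES = {
    "server": ("gisserver.internal.example:6443", "arcgis"),
    "portal": ("portal.internal.example:7443", "arcgis"),  # 7443 is an assumption
}


def to_backend_url(public_url: str) -> str:
    """Translate a public HTTPS URL (port 443) into the internal URL it maps to."""
    parts = urlsplit(public_url)
    # The first path segment is the web context that selects the backend.
    segments = parts.path.lstrip("/").split("/", 1)
    context = segments[0]
    rest = segments[1] if len(segments) > 1 else ""
    if context not in ROUTES:
        raise ValueError(f"no backend configured for context {context!r}")
    backend_host, backend_context = ROUTES[context]
    backend_path = f"/{backend_context}/{rest}" if rest else f"/{backend_context}"
    # Keep the query string; the client never sees the internal host or port.
    return urlunsplit(("https", backend_host, backend_path, parts.query, ""))
```

With this table, a request for `https://gis.example.com/server/rest/services?f=json` would be forwarded to `https://gisserver.internal.example:6443/arcgis/rest/services?f=json`.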

Esri provides an installable ArcGIS software component, the ArcGIS Web Adaptor, to provide load balancing and reverse proxy functionality in ArcGIS Enterprise deployments. The Web Adaptor is not the only option for architecting ArcGIS Enterprise systems using a load balancer and reverse proxy, but it is a common pattern and provides a solution that is supported by Esri Technical Support. An external hardware or software load balancer may also be used, depending on preference and technical requirements. The type of workload and frequency of demand will help to inform which type of load balancer may be appropriate for your needs. See Multiple-machine deployment with third party load balancer for guidance related to your specific ArcGIS version and operating system.

To achieve a high availability configuration, the Web Adaptor can be configured with one machine in a highly available site. It contains logic to detect the other machines in the site, monitor their status, and forward requests to them when they are healthy. This logic is a significant advantage of including the ArcGIS Web Adaptor in a system design, as it can handle changes to an ArcGIS Enterprise deployment more easily than a manually configured health check or load balancer setting.

External implementations

While the Web Adaptor can provide load balancing and reverse proxy functionality in many scenarios, many enterprise systems built with ArcGIS also implement additional, external software or hardware components to achieve the functional goals of these two technologies. Examples include:

  • Web server software, such as Apache Tomcat, Apache httpd, or Nginx, can be used to reverse proxy requests from clients to ArcGIS endpoints.
  • Cloud-native load balancers such as the Azure Application Gateway, Amazon Elastic Load Balancer or Application Load Balancer, or Google Cloud Load Balancer can be used.
  • Other dedicated load balancing and proxy systems, such as F5 BIG-IP, Citrix NetScaler, or Barracuda, can be used. These are often hardware appliances that provide load balancing and reverse proxy services for many different applications, while also supporting authentication, intrusion detection, web application firewall-like functionality, and other use cases.

As organizations have moved from traditional on-premises architectures, where a demilitarized zone (DMZ) most often provided internet-based clients with access to services and endpoints, to cloud architectures, where cloud-native services usually provide this functionality, the considerations and architecture practices around reverse proxies and load balancing have also changed.

Advantages

These technologies provide advantages that include enabling scalability, high availability, and security.

Scalability

The ArcGIS system can scale to support both small and large deployments. To accommodate growing deployment sizes, ArcGIS can leverage many load balancing techniques and technologies. Load balancing algorithms, used to dispatch client requests, can vary from simple round robin approaches to more complex algorithms that consider factors such as current connection counts, host utilization, or real-world response times.
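
As a rough illustration of how these dispatch algorithms differ, the following Python sketch contrasts a round robin selector with a least-connections selector. The class names and backend labels are hypothetical, and real load balancers implement these strategies with far more state and safeguards.

```python
from itertools import cycle


class RoundRobin:
    """Hands out backends in a fixed, rotating order."""

    def __init__(self, backends):
        self._rotation = cycle(list(backends))

    def pick(self):
        return next(self._rotation)


class LeastConnections:
    """Picks the backend that currently has the fewest open connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        # Ties are resolved in favor of the backend listed first.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call when a request finishes so the count reflects open connections.
        self.active[backend] -= 1
```

Round robin ignores how busy each machine is, which is adequate when requests cost roughly the same. Least connections reacts to long-running requests, at the price of tracking per-backend state.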

A load balancer contributes to performance by distributing load across multiple machines and appropriate resources. In the high availability example, the ArcGIS Web Adaptors act as load balancers and distribute GIS server requests between two ArcGIS Enterprise hosting servers. If your system has many users conducting similar workflows that require significant processing, such as an editing workflow that includes long, processor-intensive transactions, a load balancer can ensure these requests are balanced across available hardware. Workflows with highly variable processing requirements may benefit from workload separation instead of load balancing.

load-balancing-1.png

High availability

Load balancing also enables high availability configurations of ArcGIS by distributing requests across a site or cluster of machines that share the same role. For example, when high availability is configured in a multi-machine ArcGIS Enterprise deployment, the load balancer (ArcGIS Web Adaptor) alternates sending requests to the primary and secondary ArcGIS hosting servers as shown.

The reverse proxy in this case provides the single point of entry – a single IP address that obfuscates the system from the end user. The reverse proxy directs traffic to the ArcGIS Web Adaptors following its algorithm, which is typically round robin, but could be more robust. Like ArcGIS Web Adaptor, many load balancers monitor server health or availability and remove unhealthy or unavailable machines from the distribution list, providing resiliency to your system.

The method and efficiency of this capability varies by load balancer. Some hardware load balancers have rich algorithms that enable them to adjust on the fly based on load or response time. Others work on a simple round robin list, where the list may be updated periodically to reflect only healthy machines.
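
The simpler of these two approaches, a round robin list that is trimmed to healthy machines, might look like the following Python sketch. The `HealthAwarePool` class, the failure threshold, and the machine names are assumptions made for illustration. They do not describe the behavior of ArcGIS Web Adaptor or of any specific load balancer.

```python
class HealthAwarePool:
    """Round robin over the backends that have passed their recent health probes."""

    def __init__(self, backends, failures_before_removal=2):
        self.backends = list(backends)
        self.failures = {b: 0 for b in self.backends}
        self.threshold = failures_before_removal  # hypothetical tuning knob
        self._next = 0

    def record_probe(self, backend, ok):
        """Store the result of one health probe; a success resets the failure count."""
        self.failures[backend] = 0 if ok else self.failures[backend] + 1

    def healthy(self):
        """Return the backends still eligible to receive traffic."""
        return [b for b in self.backends if self.failures[b] < self.threshold]

    def pick(self):
        candidates = self.healthy()
        if not candidates:
            raise RuntimeError("no healthy backends available")
        backend = candidates[self._next % len(candidates)]
        self._next += 1
        return backend
```

Requiring more than one failed probe before removing a machine avoids dropping a backend because of a single slow response, while a successful probe returns it to the rotation immediately.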

load-balancing-2.png

Increased security

A reverse proxy will generally expose a single IP address to the internet or intranet for a particular system and distribute requests to the right resource. This greatly reduces security risks because the internal topology of the network and systems is hidden, and the number of breach points is reduced in case of attack. This method also simplifies service delivery and consumption by providing a single access point to the system. Most organizations choose to set up a proxy server, so the site is not exposed directly to clients.

A reverse proxy server can be configured to communicate either directly with ArcGIS Server or through the ArcGIS Web Adaptor by adding the corresponding URLs to the proxy directives.

For more information, see Configure a reverse proxy with ArcGIS Server.

Considerations and recommendations

General recommendations for implementing load balancing and reverse proxies include:

  • Implement load balancing to distribute client workload traffic across multiple computing resources to enable scalability and high availability with ArcGIS.
  • In most ArcGIS Enterprise deployments, the ArcGIS Web Adaptor is a recommended and often required architecture component that provides both load balancer and reverse proxy capabilities in a single, easy-to-deploy application. The ArcGIS Web Adaptor also provides a straightforward method for implementing web-tier authentication for ArcGIS Enterprise.
  • If you have advanced load balancing requirements, consider implementing an external load balancer that provides the capabilities you need, in addition to ArcGIS Web Adaptor.
  • When deploying a public-facing system, use a capable reverse proxy to provide additional security to your ArcGIS deployment.
Note:

For many other applications or systems, terminating TLS at the load balancer is a common architecture. For ArcGIS systems, TLS termination is generally not recommended and, in the case of Portal for ArcGIS, is not supported. Maintaining end-to-end TLS connections is a stronger security posture and helps ensure that user requests and data are not visible to an attacker within the network.

Load balancing recommendations

When implementing any load balancing technology with an ArcGIS system, consider the following strategies:

  • ArcGIS software components include various “health check” URLs that can be used for the health check requests that most load balancers send to decide whether to keep a server in the “target pool” of that load balancer. A health check endpoint should return a positive response only if the system is running properly and able to receive requests.
  • Most load balancers can operate in several different modes, including a round robin mode, which sends requests to each server in the pool of available backends in turn. Other, more “intelligent” load balancing approaches may take into account request type or size, server load based on compute or network traffic, or other factors.
  • Understanding how user requests translate to server load can be a complex and subjective topic. Consider the benefits of more intelligent load balancing approaches carefully when departing from the normal round robin approach.
  • Session stickiness is a common load balancer setting that uses either the source IP address or a client cookie to ensure that requests from the same user (as perceived by the load balancer) are sent to the same server. This can be convenient when multiple, siloed ArcGIS Server sites or functions are placed behind the same load balancer, where returning a user's requests to the same server can have functional and performance benefits.
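
Source-IP stickiness, as described in the last item, can be approximated by hashing the client address onto the list of backends. The sketch below is a minimal Python illustration. The `sticky_backend` function and the backend names are hypothetical, and production load balancers typically use more robust schemes such as consistent hashing or cookie-based affinity.

```python
import hashlib


def sticky_backend(client_ip: str, backends: list[str]) -> str:
    """Map a client IP address to one backend, always the same one for the same list."""
    if not backends:
        raise ValueError("at least one backend is required")
    # A cryptographic hash is not required here; it is used only because it
    # spreads addresses evenly and gives the same result on every run.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Note that with this simple modulo scheme, adding or removing a backend changes the mapping for many clients. Clients behind a shared NAT address would also all land on the same server, which is one reason cookie-based stickiness is often preferred.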

Reverse proxy recommendations

Reverse proxies are often implemented as part of a load balancer, as many load balancers send requests to a different backend port or service than the one on which the frontend request arrives. When implementing a reverse proxy (along with or separate from a load balancer), consider the following design recommendations:

  • Use consistent URLs. ArcGIS systems are highly self-referential, with content like applications, maps or services that reference parts of the system using URLs to items, services, or other components. These URLs are saved into the configurations, so using consistent URLs contributes to system stability and the portability of content. Changing the external URL of a system is a disruptive process that is not supported in the current software.
  • Some reverse proxies may default to using multiple URL parts to reference the ArcGIS system, such as:

    • https://centralhost.domain.com/systems/gis/rest/services

Note that multi-part URLs like this are not supported with ArcGIS Enterprise components. The “web context”, or Web Adaptor name, must be the first URL component following the hostname in the URL.

  • ArcGIS works best with transparent reverse proxy configurations, where requests are sent to the backend without inspection or a layer of authentication or validation of requests. Other reverse proxy methods that intercept, adjust, or affect traffic may introduce functional and performance issues with ArcGIS systems and should be carefully considered before implementation.
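
The URL rule described above, that the web context or Web Adaptor name must be the first path component after the hostname, can be checked mechanically when reviewing a proxy configuration. The following Python sketch shows one way to do so. The `context_is_first_segment` helper is hypothetical, and the web context name `gis` is assumed for the example.

```python
from urllib.parse import urlsplit


def context_is_first_segment(url: str, web_context: str) -> bool:
    """Return True if web_context is the first path component after the hostname."""
    path_segments = [s for s in urlsplit(url).path.split("/") if s]
    return bool(path_segments) and path_segments[0] == web_context
```

Under this check, `https://centralhost.domain.com/gis/rest/services` passes for the context `gis`, while the multi-part form `https://centralhost.domain.com/systems/gis/rest/services` does not.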