Securing public data

When an application, map, or web service is made available through a public website or application (internet-facing and anonymously accessible), a common question is how this access can be properly secured. Organizations want to provide easy access to users such as the general public, without requiring authentication or an established account. Understandably, they also want to be sure that data is not misused, improperly accessed, or retained by these end users.

In most web-based workflows, a client (often a web browser) requests information from a web server, generally as a map image, a set of spatial features and attributes, or a raster or vector map tile. Each request results in a response that moves a partial or full representation of the data to the user’s client device, where it is used to render an indicator or image or to provide the desired map experience. Data access considerations for the common service types are described below.

Map image layers

When a client accesses a map image layer, also known as a dynamic map service, the response contains only pixels: a representation of the data with a specific cartographic configuration, generated on the server and returned to the client as an image. A map service can be configured to allow only this type of dynamic map image by disabling its Query capability, which prevents individual data records from being shared with the client. This configuration reduces functionality, however, and should only be considered when simple map display is sufficient.
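
As a rough illustration of an image-only request, the sketch below builds the URL for a map image export. The host and service name are hypothetical, and the parameter names (bbox, size, format, f) follow the ArcGIS REST API export operation as commonly documented; they should be checked against the deployed version.

```python
from urllib.parse import urlencode

# Hypothetical service URL, used only for illustration.
SERVICE_URL = "https://gis.example.com/arcgis/rest/services/Parcels/MapServer"


def build_export_url(bbox, width, height, image_format="png"):
    """Build a map image (export) request URL for the given extent.

    bbox is (xmin, ymin, xmax, ymax) in the service's coordinate system.
    """
    params = {
        "bbox": ",".join(str(v) for v in bbox),
        "size": f"{width},{height}",
        "format": image_format,
        "f": "image",  # ask for the rendered image itself, not JSON metadata
    }
    return f"{SERVICE_URL}/export?{urlencode(params)}"


url = build_export_url((-118.5, 33.7, -118.1, 34.1), 800, 600)
print(url)
```

The response to such a request is a rendered picture of the requested extent; no attribute values or vertex coordinates are part of it.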

Feature layers

Feature layers can be created from map or feature services. They differ from map image layers in that the feature data, including its geometry and attributes, is returned from the server to the client and rendered in the client browser. A client using a publicly accessible feature layer continually requests chunks of data from the server, typically for a certain geographic extent or set of features at a time. When an application makes use of a feature layer, the client downloads the data in order to render it. Because feature layers rely on this transfer of data to the client, requirements such as “I want to ensure users don’t download data” are functionally incompatible with an app or workflow that relies on feature layers.
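
To make the point concrete, the sketch below parses a response shaped like an ArcGIS feature query result (a features array whose items carry attributes and geometry). The sample body, field names, and values are invented for illustration; the point is only that anything the client can render, it can also read and store.

```python
import json

# Illustrative response body, shaped like a feature query result.
# The field names and values are made up for this example.
SAMPLE_RESPONSE = """
{
  "features": [
    {"attributes": {"OBJECTID": 1, "OWNER": "A. Example"},
     "geometry": {"x": -118.24, "y": 34.05}},
    {"attributes": {"OBJECTID": 2, "OWNER": "B. Example"},
     "geometry": {"x": -118.30, "y": 34.10}}
  ]
}
"""


def extract_records(response_text):
    """Return (attributes, geometry) pairs from a feature query response."""
    payload = json.loads(response_text)
    return [(f["attributes"], f.get("geometry")) for f in payload["features"]]


records = extract_records(SAMPLE_RESPONSE)
for attributes, geometry in records:
    print(attributes, geometry)
```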

Image services

Image services allow users to dynamically request an image for a specific extent. At the original image resolution, a request can return an actual raster representation of the data, up to the maximum image size that the service allows. These images are also downloaded to the client, whether web, desktop, or mobile, which in effect shares that subset of the data directly with the client. Image services also support a download capability, not enabled by default, that allows download of the original image or a converted format. This capability should generally remain disabled unless it is specifically required for functionality in an app.

Of the service types above, only the map image layer scenario (with the Query capability disabled) truly restricts user access to the underlying data. Because the data is transmitted as a rendered image, no attributes or specific geometries are available to the client.

Methods to secure public access

Even when data and services are made public, several methods can apply some level of restriction to the service, reducing the number of users or the degree of public exposure.

Referrer-based limiting

Referrer-based limiting is one common pattern for restricting data usage. It requires a reverse proxy, or another type of proxy, that can inspect individual HTTP requests to the service and reject those whose Referer request header (spelled with a single r in the HTTP specification) does not match an allow list. Browsers generally send this header by default when an application requests resources from another domain, so it is a reasonably reliable way to control browser-based traffic.
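
The check a proxy performs can be sketched roughly as follows. This is a minimal illustration, not a specific proxy's configuration: the allow list, the function name, and the choice to reject requests with no header at all are assumptions.

```python
from urllib.parse import urlsplit

# Hypothetical allow list of hostnames permitted to embed the service.
ALLOWED_REFERRER_HOSTS = {"maps.example.org", "www.example.org"}


def is_referrer_allowed(headers, allowed_hosts=ALLOWED_REFERRER_HOSTS):
    """Return True if the request's Referer header names an allowed host."""
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    referrer = normalized.get("referer")
    if not referrer:
        return False  # assumption: requests without the header are rejected
    # Compare the parsed hostname exactly; a prefix or substring match
    # would also accept hosts such as maps.example.org.evil.test.
    host = urlsplit(referrer).hostname
    return host is not None and host in allowed_hosts


print(is_referrer_allowed({"Referer": "https://maps.example.org/viewer"}))
print(is_referrer_allowed({"Referer": "https://maps.example.org.evil.test/"}))
```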

This can be an effective way to discourage some misuse, but it does not secure the data completely: a motivated party can easily spoof the referrer header, or add it to a scripted or programmatic request, so that the request passes through the proxy. ArcGIS Location Platform API keys can likewise be scoped to specific domains in the referrer header, which limits their use to applications on those domains, or to a motivated party who spoofs that header value.
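
How little effort spoofing takes can be seen in the sketch below, which builds (but does not send) a scripted request carrying a referrer value of the caller's choosing. The service URL and the referrer value are hypothetical.

```python
import urllib.request

# Hypothetical service endpoint and allowed application origin.
SERVICE_URL = "https://gis.example.com/arcgis/rest/services/Parcels/MapServer/0/query"
SPOOFED_REFERRER = "https://maps.example.org/viewer"

# Outside a browser, the header is ordinary request data set by the caller.
# The request is only constructed here; nothing is sent over the network.
request = urllib.request.Request(SERVICE_URL, headers={"Referer": SPOOFED_REFERRER})

print(request.get_header("Referer"))
```

A proxy that checks only this header cannot distinguish such a request from one issued by the allowed application.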

Source IP range

Restricting requests based on a source IP range is another option to secure access. It can allow a resource to be requested anonymously, but only when the originating IP for the request is within an expected range like a local area network or NAT-based external IP for an organization. These restrictions can be applied at a web server, load balancer, or reverse-proxy level. IP-based filters can also be useful for filtering access from certain countries or geographies, or for allowing access from a particular system with a fixed external IP address.
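
A minimal sketch of such a filter, using Python's standard ipaddress module, is shown below. The ranges are placeholders (203.0.113.0/24 is reserved for documentation), and the sketch assumes the caller already has a trustworthy source address; behind a load balancer or proxy, that usually means the address reported by that component rather than a client-supplied header.

```python
import ipaddress

# Hypothetical allowed ranges: an internal LAN and an organization's
# NAT-based external block.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_source_allowed(source_ip, networks=ALLOWED_NETWORKS):
    """Return True if source_ip falls inside one of the allowed networks."""
    try:
        address = ipaddress.ip_address(source_ip)
    except ValueError:
        return False  # malformed addresses are rejected outright
    return any(address in network for network in networks)


print(is_source_allowed("203.0.113.77"))
print(is_source_allowed("198.51.100.5"))
```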

API key

Using an API key is another common method of restricting reuse of a public layer. An API key is added to each request and validated by an API management or reverse proxy layer, which rejects requests that do not include a valid key. While this can prevent simple misuse, it is relatively easy for a motivated user to identify the API key being used and reuse it to make requests from their own application or service. API keys are sent from the browser or client application to the server, so they are always client-visible and can potentially be extracted and reused for other purposes. The best way to mitigate this risk is to enforce allowed referrers on a per-key basis and to monitor keys for unexpected or malicious use.
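
The per-key referrer enforcement described above might look roughly like the following sketch. The key value, the registry structure, and the function name are hypothetical, and a real API management layer would add logging and monitoring on top of this check.

```python
import hmac
from urllib.parse import urlsplit

# Hypothetical key registry: each key maps to the referrer hosts allowed to use it.
KEY_REGISTRY = {
    "demo-key-123": {"maps.example.org"},
}


def validate_request(api_key, referrer, registry=KEY_REGISTRY):
    """Return True only if the key is known and the referrer host is allowed for it."""
    if not api_key:
        return False
    for known_key, allowed_hosts in registry.items():
        # compare_digest avoids leaking key contents through timing differences.
        if hmac.compare_digest(api_key.encode(), known_key.encode()):
            host = urlsplit(referrer or "").hostname
            return host in allowed_hosts
    return False


print(validate_request("demo-key-123", "https://maps.example.org/app"))
print(validate_request("demo-key-123", "https://other.example.net/"))
```

Because both the key and the referrer are supplied by the client, this check raises the effort required for reuse rather than preventing it.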

Summary

In summary, when the intended audience of an application or service is the broader public, it is best to consider all data accessible in that application to be fully public. Most of the restrictions suggested above can be worked around, and a motivated party can generally find a way to extract a copy of the data for reuse. Carefully managing access, publishing disclaimers, and making data available through public data sites can partially mitigate concerns about improper use of publicly shared data.