Imagery data management system (Kubernetes)
The imagery data management system pattern is typically deployed to Kubernetes using the ArcGIS Enterprise on Kubernetes software.
ArcGIS Enterprise on Kubernetes uses microservices and containerization to provide a cloud native architecture, running either on your organization’s Kubernetes platform or in your cloud provider’s Kubernetes service. It uses containers to split GIS processes into microservices, each of which performs a discrete, focused function. Each microservice runs in a container that packages everything necessary to run an application. One or more containers are housed in a pod that includes storage resources, a network identity, and a set of rules for how the container is to be run. The Kubernetes cluster orchestrates and manages the ArcGIS Enterprise on Kubernetes containers.
ArcGIS Enterprise on Kubernetes is best suited for organizations that have invested in Kubernetes to orchestrate and manage their containerized applications.
Base Architecture
The following is a typical base architecture for an imagery data management system deployed on Kubernetes.
This diagram should not be taken as is and used as the design for your system. There are many important factors and design choices that should be considered when designing your system. Review the using system patterns topic for more information. Additionally, the diagram depicted below delivers only the base capabilities of the system; additional system components may be required when delivering extended capabilities.
Key components of this architecture include:
- Imagery data (of many types) stored in object storage, file servers and other sources. The definition of imagery data in this pattern is broad, including traditional raster imagery collected from aerial, drone and satellite platforms, Lidar and other 3D data types, oriented imagery, and multidimensional and scientific datasets. Object storage is especially relevant to this system as it is frequently used as a cost-effective storage system for very large imagery assets, which can perform well when optimized for this storage type.
- A foundational deployment of ArcGIS Enterprise on Kubernetes containers to the Kubernetes cluster. This includes four categories of pods that represent various system functions. For more information, see the ArcGIS Enterprise on Kubernetes documentation.
- A load balancer is required to direct traffic across each worker node. For more information, see the ArcGIS Enterprise on Kubernetes system network requirements.
- The object store provides ArcGIS-managed storage for uploaded and saved content, hosted tile and image layer caches, and geoprocessing output. As of ArcGIS Enterprise 11.2 the object store can be configured to use cloud-native storage from several supported service providers.
- ArcGIS Online, Esri’s SaaS infrastructure, typically provides basemaps (such as imagery basemaps), reference data (such as places), as well as other location services (including geocoding and search) for this system. Alternatively, it is possible for the organization to host and manage their own location services instead of using Esri’s SaaS system. See the location services system pattern for more information.
- There are several applications commonly used in this pattern for a variety of purposes. These include ArcGIS Pro (desktop), ArcGIS Experience Builder (web), and ArcGIS Field Maps (native). The ArcGIS Portal Website, Map Viewer and Scene Viewer are all used for interacting with, configuring, and analyzing imagery data. Python Notebooks, which are run in a user-hosted environment, ArcGIS Notebook Server, or ArcGIS Pro, are used to interact with, visualize and analyze imagery data, and ArcGIS Reality provides a comprehensive set of tools for processing reality capture data. Learn more about the applications used in an imagery data management system.
Key interactions in this architecture include:
- Client applications communicate with imagery data services as well as location services over HTTPS, typically via stateless REST APIs. This pattern makes heavy use of tiled and dynamic image services, scene services, and various analytical services.
- In this deployment pattern, users generally manage imagery assets on file shares or in cloud object storage in various formats. Image services are published directly from this storage with no copying of data or egress from the storage system. Mosaic datasets and other data models are frequently created in either file geodatabases or enterprise geodatabases. The image hosting services or pods within ArcGIS Enterprise on Kubernetes can host image services published from mosaic datasets along with hosted image services published from user uploads to the portal or raster analytics jobs. Raster analysis jobs (ArcGIS Enterprise 11.2 and later) are used to analyze large sets of imagery and save and publish results back to the system as image services.
Additional information on interactions between ArcGIS Enterprise components can be found in the ArcGIS Enterprise on Kubernetes product documentation.
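As a minimal sketch of the stateless REST interaction described above, the following builds a request URL for an image service's exportImage operation. The host, service name, and the helper name build_export_image_url are hypothetical. The parameter names (bbox, bboxSR, size, format, f, renderingRule) follow the ArcGIS REST API Export Image operation as commonly documented, and should be verified against the REST documentation for your release.

```python
import json
from urllib.parse import urlencode, urlsplit, parse_qs


def build_export_image_url(service_url, bbox, size, bbox_sr=4326,
                           image_format="jpgpng", rendering_rule=None):
    """Return a GET URL for an image service's exportImage operation.

    Every request carries all of the state the server needs (extent,
    output size, format), which is what makes the interaction stateless.
    """
    params = {
        "bbox": ",".join(str(v) for v in bbox),   # xmin,ymin,xmax,ymax
        "bboxSR": bbox_sr,                        # spatial reference of bbox
        "size": f"{size[0]},{size[1]}",           # output width,height in pixels
        "format": image_format,
        "f": "image",                             # return the image itself
    }
    if rendering_rule is not None:
        # Raster function definitions are passed as a JSON string.
        params["renderingRule"] = json.dumps(rendering_rule, separators=(",", ":"))
    return f"{service_url.rstrip('/')}/exportImage?{urlencode(params)}"


# Hypothetical service URL, for illustration only.
url = build_export_image_url(
    "https://gis.example.com/arcgis/rest/services/Hypothetical/Imagery/ImageServer",
    bbox=(-117.3, 33.9, -117.1, 34.1),
    size=(512, 512),
)
query = parse_qs(urlsplit(url).query)
print(url)
```

Because the URL is self-describing, a client can issue it with any HTTPS library and add a token or other credentials as the deployment requires.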
Capabilities
The capabilities of the imagery data management system on Kubernetes are described below. See the capability overview and comparison of capability support across deployment patterns for more information.
Some capabilities used in an imagery data management system are typically provided by other systems and are not listed below. Examples include basemaps, geocoding, and other location services provided by a location services system. Learn more about related system patterns.
Base capabilities
Base capabilities represent the most common capabilities delivered by imagery data management systems and that are enabled by the base architecture presented above.
- Imagery visualization and analysis allow users to interact with imagery data as a basemap in an application, through dynamic image overlays, by navigating through collections of historic imagery, or by collecting observations based on a recent drone flight. Enhance imagery through dynamic adjustments, stretching, and changing band combinations. Imagery rendering is optimized to show the requested area of interest and re-apply rendering rules on each pan and zoom. Use geoprocessing tools, algorithms, and functions to analyze imagery data, to assess land use, monitor activity and change, measure damage, and assess environmental factors.
- Data modeling and structuring create standardized approaches to add large sets of data into common data models such as mosaic and LAS datasets, raster products and sensor models, oriented imagery catalogs, or other industry-specific or use case-specific models such as trajectory data. Create catalog datasets and interact with catalog layers of assets in local or networked storage or add items and services from an ArcGIS Online or ArcGIS Enterprise organization. These models help to organize, provide metadata about, and enable the usage of these detailed datasets.
- Imagery data publishing allows users of all types to create and host collections of imagery and other remotely sensed data sources. Publish imagery collections and products as dynamic or tiled services at local or global scale, which can be visualized and interacted with using web, mobile, and desktop applications.
- On-the-fly raster analysis relies on using raster functions and combining a set of those functions into raster function templates to quickly combine bands, compare imagery, and analyze values through collections of images to create a dynamic output image. Raster functions are applied at request-time, are only applied to the requested pixel area, and represent an efficient way to dynamically render imagery without reprocessing an entire dataset.
- Elevation analysis provides capabilities to generate contours, run hydrological models, view and delineate watersheds, and view terrain, slope and aspect renderings of detailed datasets. Complete volumetric analyses by cutting and filling or comparing 3D surfaces and datasets. Combine elevation from different sources, at different resolutions, and prepare a seamless elevation service that can be used for direct display or as the basis for a 3D rendering of a city or regional area. Esri also provides ready-to-use elevation services for visualization and analysis requests.
- Image extraction capabilities allow dynamic and programmatic export and download of source and mosaicked imagery data for use in other applications or as image chips in deep learning workflows. Extraction can provide access directly to the source pixels or create a new, resampled image at a specific resolution for a requested extent. ArcGIS also supports extraction of areas of the World Imagery basemap for use offline in disconnected data access and editing workflows.
- Deep learning and AI are embedded throughout ArcGIS and imagery data management systems. Users can train and run inferencing on deep learning models using imagery assets and local compute resources or scaled across large systems including cloud resources and services. ArcGIS Living Atlas also contains a gallery of pre-trained models that are available for direct use or can be adapted to an organization’s specific workflows, data or geography.
- Multidimensional data can be explored using standard scientific formats such as NetCDF, GRIB, HDF, and Zarr. These data display variables such as change over time or measurements at different atmospheric altitudes or depths. ArcGIS includes dedicated user interfaces in ArcGIS Pro and the ArcGIS Map Viewer to quickly display time slices, build and display complex multivariate symbols, or identify available variables, build a new calculation of your own, or predict variables outside the time extent of the dataset.
- Work in image space and perform image mensuration tasks and visualize imagery as it was captured from the sensor, along with traditional ortho views and stereo viewing. Image space analysis can also be used to collect features, view details without resampling, and prevent distortion.
- Use stereo viewing capabilities to visualize imagery in 2.5D, conduct image mensuration tasks, manually digitize and extract features with high precision and 3D object potential. Used frequently in photogrammetry workflows, stereo editing is primarily available in ArcGIS Pro.
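The on-the-fly raster analysis capability listed above can be illustrated with a toy sketch. This is a conceptual model only, not how ArcGIS implements raster functions: the class OnTheFlyLayer, the ndvi helper, and the 4 x 4 sample bands are all hypothetical. It shows the key idea that a function is evaluated at request time and only over the requested pixel window, leaving the rest of the source data untouched.

```python
def ndvi(red, nir):
    """Normalized difference of two band values; 0.0 when both are zero."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


class OnTheFlyLayer:
    """Toy layer that applies a per-pixel function only when a window is read."""

    def __init__(self, red_band, nir_band, pixel_function):
        self.red_band = red_band
        self.nir_band = nir_band
        self.pixel_function = pixel_function
        self.pixels_processed = 0  # counts work actually performed

    def read_window(self, row_start, row_stop, col_start, col_stop):
        """Compute output values for the requested rows and columns only."""
        window = []
        for r in range(row_start, row_stop):
            row = []
            for c in range(col_start, col_stop):
                row.append(self.pixel_function(self.red_band[r][c],
                                               self.nir_band[r][c]))
                self.pixels_processed += 1
            window.append(row)
        return window


# Hypothetical 4 x 4 source bands (16 pixels each).
red = [[10, 10, 20, 20],
       [10, 10, 20, 20],
       [30, 30, 40, 40],
       [30, 30, 40, 40]]
nir = [[30, 30, 20, 20],
       [30, 30, 20, 20],
       [30, 30, 40, 40],
       [30, 30, 40, 40]]

layer = OnTheFlyLayer(red, nir, ndvi)
result = layer.read_window(0, 2, 0, 2)   # request the top-left 2 x 2 area
print(result, layer.pixels_processed)
```

Only 4 of the 16 source pixels are processed for this request, which is why this style of rendering avoids reprocessing an entire dataset.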
Extended capabilities
Extended capabilities are typically added to meet specific needs or support industry-specific data models and solutions, and may require additional software components or architectural considerations.
- Distributed raster analytics jobs can be authored to run raster function calculations across massive imagery holdings in a distributed computing model. These operations may also include inferencing using trained deep learning models, or creating new output data products based on predefined renderers or calculations. Data from raster analytics jobs is persisted through the image hosting capability of ArcGIS Enterprise.
- Work with oriented imagery of various types, including oblique, bubble or spherical imagery, 360-degree panoramas, street-side, and inspection imagery. These datasets are not traditional nadir images but can have significant value to organizations through workflows like security investigation, asset inspection or data collection. Oriented imagery capabilities in ArcGIS include a structured data model, a dedicated viewer application and support for serving and working with oriented imagery in a variety of applications.
- Support drone operations from fleet management to specific mission planning and on the ground data processing, using an array of web, desktop and mobile apps and tools. ArcGIS Dashboards can be used to monitor collection progress, identify operational issues, and manage reporting to data processing and quality teams.
- Reality mapping incorporates extensive ortho mapping capabilities for high-fidelity product generation. Use drone and other aerial imagery to create full-resolution digital surface models, True Orthos, oriented imagery catalogs, 2D surface meshes, dense 3D point clouds, and photo-realistic 3D meshes. Reality mapping capabilities are available in web, desktop and server-based processing patterns.
- Manage, visualize, and analyze Lidar datasets including a variety of data formats, to understand surface conditions, identify different levels of intensity, layers of return points, extract features, classify point clouds, work with photo-realistic colorization, and create derivative products. Manage large sets of Lidar files as one continuous layer using a LAS dataset.
- Work with synthetic aperture radar by accessing collections of imagery from SAR sensors and platforms. ArcGIS includes SAR-specific raster types, raster functions and visualization approaches that support this unique and powerful data type.
- Work with Spatio-temporal Asset Catalogs (STAC) to connect to existing catalogs of imagery and search, filter, and parse records to identify the proper data for a project. Use the STAC connection and search experience in ArcGIS Pro, the arcpy Python module, and the ArcGIS API for Python to query public and private STAC catalogs and directly access assets through cloud data connections.
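To make the STAC search-and-filter step above more concrete, the following sketch builds the JSON body for a STAC API item search request. It deliberately uses the generic STAC API request shape (collections, bbox, datetime, limit) rather than the ArcGIS Pro, arcpy, or ArcGIS API for Python interfaces, whose exact calls are not shown here. The function name build_stac_search and the collection identifier are hypothetical.

```python
import json


def build_stac_search(collections, bbox, start, end, limit=10):
    """Build a STAC API item search body for an area and a time range.

    bbox is (west, south, east, north) in longitude/latitude degrees.
    start and end are RFC 3339 timestamps joined into a closed interval.
    """
    if len(bbox) != 4:
        raise ValueError("bbox must have four values: west, south, east, north")
    west, south, east, north = bbox
    if south > north:
        raise ValueError("south must not be greater than north")
    return {
        "collections": list(collections),
        "bbox": [west, south, east, north],
        "datetime": f"{start}/{end}",
        "limit": limit,
    }


# Hypothetical collection identifier and area of interest.
body = build_stac_search(
    ["hypothetical-collection"],
    (-117.3, 33.9, -117.1, 34.1),
    "2023-01-01T00:00:00Z",
    "2023-12-31T23:59:59Z",
)
payload = json.dumps(body)   # would be sent as the POST body to <catalog>/search
print(payload)
```

A catalog that implements item search would return matching items as GeoJSON features, each listing assets that can then be accessed through a cloud data connection.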
Considerations
The considerations below apply the pillars of the ArcGIS Well-Architected Framework to the imagery data management system pattern on Kubernetes. The information presented here is not meant to be exhaustive, but rather highlights key considerations for designing and/or implementing this specific combination of system and deployment pattern. For more information on the pillars of the ArcGIS Well-Architected Framework, see architecture practices.
Reliability
Reliability ensures your system provides the level of service required by the business, as well as your customers and stakeholders. For more information, see the reliability pillar overview.
- SLAs requiring high levels of availability are common.
- Architecture profiles are predefined deployment profiles that correlate to varying levels of redundancy across pods and provide flexibility across several known variables such as requirements for hardware, redundancy, and organizational use.
- Consider the Enhanced availability architecture profile when increased and expanded redundancy across critical pods is required.
- System-level backup and restore is also supported.
Security
Security protects your systems and information. For more information, see the security pillar overview.
- Authentication and authorization are required for managing imagery. It is also common for imagery outputs to be secured, requiring authentication and authorization for access.
- User access and data collaboration are governed by role-based access controls and modern authorization and authentication models, including OAuth, SAML, and multifactor authentication.
- Privileges are carefully managed to ensure that only properly trained and provisioned users are granted image hosting capabilities (the ability to create services). These privileges can be enabled or disabled by assigning a custom role to users or changing their role to another existing role.
- Imagery data in cloud storage can be accessed using cloud-native security concepts like AWS IAM and Azure Managed Identities.
Learn more about ArcGIS Enterprise security best practices and implementation guidance.
Performance and scalability
Performance and scalability aim to optimize the overall experience users have with the system, as well as ensure the system scales to meet evolving workload demands. For more information, see the performance and scalability pillar overview.
- Scaling in this pattern is not handled automatically and should be carefully planned.
- Scalability is an important design consideration, as imagery data management systems are typically used heavily within an organization. Additionally, usage may increase quickly and unexpectedly as the overall adoption of GIS grows across an organization. ArcGIS Enterprise on Kubernetes deployments can be scaled horizontally by adjusting the number of pods as well as vertically by adjusting the memory and CPU. ArcGIS Enterprise on Kubernetes also provides robust, flexible scaling options for services. Learn more about service scaling.
- Scaling of image hosting is usually not required, as tiled imagery requests are lightweight in nature. Scaling of raster analytics sites depends on the available compute resources and on the type and frequency of analysis that is needed.
- The source definition and structure of the imagery need to be carefully considered in this system pattern, as the image services connect directly to source imagery. An inefficient file type or storage pattern can degrade service performance.
Automation
Automation aims to reduce effort spent on manual deployment and operational tasks, leading to increased operational efficiency as well as a reduction in human-introduced system anomalies. For more information, see the automation pillar overview.
- Imagery data management often involves automation, typically using Python. This is most commonly done using the ArcGIS API for Python, which can work with imagery services and data, and can be used to automate updates, changes to imagery, or other workflows.
- System administration automation is handled in large part by Kubernetes.
- ArcGIS Enterprise on Kubernetes includes support for Helm-based deployment and configuration.
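As a hedged illustration of the Python automation mentioned above, the sketch below lists image services in a portal as a first step toward scripted updates. It assumes the ArcGIS API for Python pattern of calling content.search with an item_type filter on a GIS object; verify the call against the API reference for your version. The helper names, the stub classes, and the sample items are hypothetical, and the stub exists only so the example runs without a portal connection.

```python
from types import SimpleNamespace


def find_image_services(gis, owner=None, max_items=50):
    """Return portal items of type 'Image Service', optionally for one owner."""
    query = f"owner:{owner}" if owner else ""
    return gis.content.search(query=query, item_type="Image Service",
                              max_items=max_items)


def summarize(items):
    """Format each item as 'title (owner)' for a simple report."""
    return [f"{item.title} ({item.owner})" for item in items]


class StubContentManager:
    """Offline stand-in that mimics only the search call used above."""

    def __init__(self, items):
        self._items = items

    def search(self, query="", item_type=None, max_items=10):
        owner = query.split(":", 1)[1] if query.startswith("owner:") else None
        matches = [
            item for item in self._items
            if item.type == item_type and (owner is None or item.owner == owner)
        ]
        return matches[:max_items]


# Hypothetical portal content used in place of a live connection.
stub_gis = SimpleNamespace(content=StubContentManager([
    SimpleNamespace(title="Ortho 2023", type="Image Service", owner="imagery_admin"),
    SimpleNamespace(title="Parcels", type="Feature Service", owner="imagery_admin"),
    SimpleNamespace(title="Drone flight 12", type="Image Service", owner="field_crew"),
]))

all_services = summarize(find_image_services(stub_gis))
admin_services = summarize(find_image_services(stub_gis, owner="imagery_admin"))
print(all_services)
print(admin_services)
```

In a real script, a GIS object created with the ArcGIS API for Python would be passed in place of stub_gis, and the returned items could drive scheduled checks or updates.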
Integration
Integration connects this system with other systems for delivering enterprise services and amplifying organizational productivity. For more information, see the integration pillar overview.
- ArcGIS Enterprise-based imagery data management systems often integrate with other imagery management systems, including content providers who share image services that can be registered in ArcGIS Enterprise.
- The image service outputs from imagery data management systems are also commonly integrated into other systems across an organization’s enterprise, and therefore may also support business operations that are unknown or unavailable to systems administrators.
Observability
Observability provides visibility into the system, enabling operations staff and other technical roles to keep the system running in a healthy, steady state. For more information, see the observability pillar overview.
- ArcGIS Enterprise on Kubernetes can be observed in a variety of ways including system logs and health monitoring through ArcGIS Enterprise Manager. Monitoring of system availability, performance, and usage is most critical to this system pattern. In addition to monitoring the ArcGIS Enterprise software, it is important to monitor all supporting components and infrastructure such as the Kubernetes environment, databases and other data stores, as well as compute, network, security, and other infrastructure. Learn more about monitoring system health and reliability.
- The delivery of imagery services to the whole organization (and possibly beyond) may lead to usage patterns and growth not anticipated by the system designers or operators. Monitoring helps people make decisions about when to scale and evolve to meet demand while continuing to operate properly (and in accordance with SLAs).
- Additional observation of user logins and account changes may be possible through the configured identity provider when using SAML and/or OpenID Connect logins.
Other
Additional considerations for designing and implementing imagery data management systems include:
- Successful operation requires strong understanding of GIS, IT, and database concepts as well as technology. This includes knowledge and skills specific to the selected storage system(s), imagery data types and formats, as well as Kubernetes.
- For organizations that have the resources and staff to deploy and maintain enterprise software on Kubernetes, the ArcGIS Enterprise on Kubernetes deployment option separates IT administration and maintenance from GIS administration.
- Data governance and alignment with IT policies and roles, such as data steward and database administrator, should be strongly considered when implementing this system pattern.