To successfully build and operate enterprise systems, an organization must have an enterprise IT monitoring and response framework that applies to those systems. Proactive monitoring is as important as reactive problem-solving, and effective telemetry capture is key to both point-in-time awareness and the identification of system performance trends. ArcGIS-specific and system-level telemetry should be integrated into existing enterprise IT monitoring patterns wherever possible, so that IT staff have holistic awareness of the system.
Health, reliability and performance monitoring strategies for ArcGIS systems vary depending on whether your system is deployed with a SaaS foundation, with ArcGIS Enterprise on Windows or Linux, on Kubernetes, with ArcGIS Location Platform, or as a hybrid of these options. Some recommendations and options for monitoring in each scenario are described below.
The level of monitoring and telemetry that is available for the ArcGIS software components of a system first depends on the deployment architecture:
Regardless of the deployment type, it’s crucial to define clear monitoring objectives, set up relevant performance metrics, and establish meaningful alert thresholds to support effective and proactive management and optimization of your ArcGIS Enterprise environment.
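As a minimal sketch of what defining metrics and alert thresholds can look like in practice, the Python example below pairs each metric with a warning and a critical limit and evaluates a sample against them. The metric names, the limits, and the `Threshold` and `evaluate_all` helpers are all hypothetical and chosen for illustration only; appropriate values depend on your own monitoring objectives and observed system behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Warning and critical limits for one metric (hypothetical values)."""
    metric: str
    warning: float
    critical: float

    def evaluate(self, value: float) -> str:
        """Return 'critical', 'warning', or 'ok' for a measured value."""
        if value >= self.critical:
            return "critical"
        if value >= self.warning:
            return "warning"
        return "ok"


# Illustrative limits only; tune these against your own system.
THRESHOLDS = [
    Threshold("cpu_percent", warning=75.0, critical=90.0),
    Threshold("memory_percent", warning=80.0, critical=95.0),
    Threshold("response_time_ms", warning=1500.0, critical=5000.0),
]


def evaluate_all(sample: dict[str, float]) -> dict[str, str]:
    """Evaluate every threshold that has a matching value in the sample."""
    return {
        t.metric: t.evaluate(sample[t.metric])
        for t in THRESHOLDS
        if t.metric in sample
    }
```

Keeping thresholds in one declarative list, separate from the collection logic, makes it easier to review and adjust them as monitoring objectives change.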
Alongside monitoring the ArcGIS components, it is important to be aware of, and consistently monitor, the hardware signals that are available for the system. These signals can vary based on the deployment pattern, system hosting configuration, and operating system.
All operating systems provide a method for monitoring key system indicators, such as CPU utilization (overall and by process), memory usage (physical and virtual), disk utilization (storage available, disk I/O), and network usage (bandwidth and total transfer). These statistics can be used to establish a regular baseline for a system, to monitor for deviations from that baseline, and to correlate system issues or outages with measured changes in hardware utilization. Some organizations already have an enterprise monitoring solution or software component that collects these metrics. Where that solution is broadly used, it is generally a best practice to continue using it, as there will be economies of scale in user training, alerting and experience.
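The baseline-and-deviation approach described above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not a recommended implementation: the function names `build_baseline` and `find_deviations`, the sample values, and the three-standard-deviation cutoff are all hypothetical, and a production system would normally rely on an existing monitoring tool to collect and evaluate these metrics.

```python
from statistics import mean, stdev


def build_baseline(samples: list[float]) -> tuple[float, float]:
    """Return (mean, standard deviation) for a series of historical samples."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def find_deviations(
    baseline: tuple[float, float],
    observations: list[tuple[str, float]],
    max_sigma: float = 3.0,  # assumed cutoff; adjust to your tolerance
) -> list[tuple[str, float]]:
    """Return the (timestamp, value) pairs that fall outside the baseline.

    A value is flagged when it differs from the baseline mean by more
    than max_sigma standard deviations.
    """
    center, spread = baseline
    if spread == 0:
        # A perfectly flat baseline: any different value is a deviation.
        return [(ts, v) for ts, v in observations if v != center]
    return [
        (ts, v) for ts, v in observations
        if abs(v - center) / spread > max_sigma
    ]


# Hypothetical CPU utilization history (percent) and new observations.
history = [40.0, 42.0, 38.0, 41.0, 39.0]
recent = [("09:00", 41.0), ("09:05", 88.0), ("09:10", 39.5)]
flagged = find_deviations(build_baseline(history), recent)
```

The timestamps of flagged values can then be compared against the times of reported issues or outages to look for a correlation.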
ArcGIS Monitor is an enterprise-grade monitoring solution that works with ArcGIS Enterprise by providing information about system health, usage, and performance. It captures key metrics and attributes to quantify system health over time, offering real-time and historical performance metrics for various components, including ArcGIS Server, Portal for ArcGIS, and ArcGIS Data Store. ArcGIS Monitor can help identify performance bottlenecks and issues, allowing for proactive management and optimization of your on-premises deployment, and can notify system administrators when a component stops responding or reports an unexpected response to a standard query. This process can assist not only with rapid response to system issues but also with root cause analysis, identifying the conditions that contributed to an outage or failure once it has been resolved. ArcGIS Monitor is the only monitoring solution that effectively combines ArcGIS metrics such as instance usage with hardware and performance monitoring metrics such as memory pressure or network saturation, providing a comprehensive view of ArcGIS-specific performance or stability issues.
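To illustrate the general idea of detecting a component that stops responding or returns an unexpected response to a standard query, the sketch below classifies the result of a simple HTTP check. It is not how ArcGIS Monitor is implemented. The function names, the three state labels, and the example URL are hypothetical, and the expectation that a healthy endpoint returns a JSON object without an `error` member is an assumption to verify against the documentation for the endpoint you query.

```python
import json
import urllib.error
import urllib.request


def http_fetch(url: str, timeout: float = 10.0) -> tuple[int, str]:
    """Fetch a URL and return (status code, body text).

    HTTP error statuses are returned rather than raised; connection
    failures and timeouts still raise OSError subclasses.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", errors="replace")


def classify(status: int, body: str) -> str:
    """Classify a response as 'healthy' or 'unexpected_response'."""
    if status != 200:
        return "unexpected_response"
    try:
        payload = json.loads(body)
    except ValueError:
        return "unexpected_response"
    # Assumption: a JSON body carrying an "error" object signals a failure
    # even when the HTTP status is 200.
    if not isinstance(payload, dict) or "error" in payload:
        return "unexpected_response"
    return "healthy"


def check_component(url: str, fetch=http_fetch) -> str:
    """Return 'healthy', 'unexpected_response', or 'unresponsive'."""
    try:
        status, body = fetch(url)
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return "unresponsive"
    return classify(status, body)
```

A scheduled job could call `check_component("https://gis.example.com/health?f=json")` (a placeholder URL) and raise an alert whenever the result is not `healthy`. Passing the `fetch` function as a parameter keeps the classification logic testable without network access.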
ArcGIS systems deployed on-premises to a virtual machine host or private network, using ArcGIS Enterprise for Windows, Linux or Kubernetes, have additional monitoring considerations and options, including:
Systems deployed in AWS can make use of Amazon-specific monitoring tools, which are robust and designed to assist with the monitoring of critical systems built on AWS. These tools include:
Systems deployed in Azure can take advantage of existing monitoring tools or approaches provided by Microsoft directly to Azure users and customers, including:
GCP-specific monitoring tools and options include:
In all cloud environments, it is essential to configure monitoring and alerting based on your specific needs and the resources in use. Cloud-native monitoring services typically integrate with alerting mechanisms that notify you of performance issues in real time.