Monitor system health and reliability

Successfully building and operating enterprise systems requires an enterprise IT monitoring and response framework. Proactive monitoring of systems is as important as reactive problem-solving: effective capture of telemetry provides awareness of the system at any given time and reveals trends in system behavior. ArcGIS should be viewed as a standard IT system, and its telemetry should be integrated with existing enterprise IT monitoring. This integration is also essential to giving IT staff holistic awareness of the system.

Health, reliability, and performance monitoring strategies for ArcGIS systems vary depending on whether your system is built on a SaaS foundation, ArcGIS Enterprise on Windows or Linux, ArcGIS Enterprise on Kubernetes, ArcGIS Platform, or a hybrid of these options. Some recommendations and options for monitoring in each scenario are described below.

Monitor ArcGIS deployments

The level of monitoring and telemetry that is available for a system first depends on the deployment architecture:

  • ArcGIS Online, as a SaaS offering, does not support observation of its underlying infrastructure and software internals. It does, however, offer ways to observe system utilization and health. Organizations can monitor the health and availability of ArcGIS Online services and key components at the ArcGIS Online Health Dashboard, as well as ArcGIS Living Atlas of the World live feed status. You can view and report on the usage status of an ArcGIS Online subscription, including the organization’s credit usage, member status and activity, content usage, apps, and groups. Also consider monitoring usage of specific items, including maps, layers, and other content. Learn more about best practices for organization maintenance in ArcGIS Online. ArcGIS Hub also provides a Dashboard for each Initiative or Site, which can be used to see activity and usage of the site and content, along with a Hub-specific status page.
  • ArcGIS Enterprise on Windows/Linux can be observed in a variety of ways including server logs and server statistics. In addition to monitoring the ArcGIS Enterprise software, it is important to monitor all supporting components and infrastructure such as the Windows or Linux operating system, databases and other data stores, as well as compute, network, security, and other infrastructure.
  • ArcGIS Enterprise on Kubernetes can be observed in a variety of ways including system logs and health monitoring through ArcGIS Enterprise Manager. In addition to monitoring the ArcGIS Enterprise software, it is important to monitor all supporting components and infrastructure such as the Kubernetes environment, databases and other data stores, as well as compute, network, security, and other infrastructure.
  • ArcGIS Platform, as a PaaS offering, does not support observation of its underlying infrastructure and software internals. It does, however, offer ways to observe system utilization and health. This includes monitoring usage of apps as well as layers through the ArcGIS.com website, which can be accessed through the ArcGIS Developer dashboard.

Regardless of the deployment type, it’s crucial to define clear monitoring objectives, set up relevant performance metrics, and establish alert thresholds to ensure proactive management and optimization of your ArcGIS Enterprise environment.
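As an illustration of what defining metrics and alert thresholds can look like in practice, here is a minimal Python sketch. It is not tied to any Esri or vendor API: the metric names (`cpu_percent`, `response_ms`), the `Threshold` class, and the `evaluate` function are hypothetical names chosen for this example, and the threshold values are placeholders to be replaced with figures that fit your own objectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Warning and critical limits for one metric (hypothetical structure)."""
    metric: str
    warning: float
    critical: float


def evaluate(samples: dict[str, float], thresholds: list[Threshold]) -> dict[str, str]:
    """Classify each monitored metric as ok, warning, critical, or unknown."""
    results: dict[str, str] = {}
    for t in thresholds:
        value = samples.get(t.metric)
        if value is None:
            # A metric that stops reporting is itself worth surfacing.
            results[t.metric] = "unknown"
        elif value >= t.critical:
            results[t.metric] = "critical"
        elif value >= t.warning:
            results[t.metric] = "warning"
        else:
            results[t.metric] = "ok"
    return results


# Placeholder limits; tune these to your own service-level objectives.
THRESHOLDS = [
    Threshold("cpu_percent", warning=75.0, critical=90.0),
    Threshold("response_ms", warning=500.0, critical=2000.0),
]

if __name__ == "__main__":
    print(evaluate({"cpu_percent": 82.0, "response_ms": 120.0}, THRESHOLDS))
```

Keeping thresholds as data rather than hard-coding them in the checks makes it easier to review and adjust them as the system's normal behavior becomes better understood.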

ArcGIS Monitor

ArcGIS Monitor is a tool provided by Esri that is specifically designed to monitor and optimize ArcGIS Enterprise performance. It offers real-time and historical performance metrics for various components, including ArcGIS Server, Portal for ArcGIS, and ArcGIS Data Store. ArcGIS Monitor can help identify performance bottlenecks and issues, which supports proactive management and optimization of your on-premises deployment, and it can notify system administrators when a component stops responding or reports an unexpected response to a standard query. This can assist not only with rapid response to system issues but also with root cause analysis, identifying the conditions that contributed to an outage or failure once it has been resolved.
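The general pattern of alerting when a component stops responding or returns an unexpected response can be sketched in a few lines of Python. This is a generic illustration and does not represent how ArcGIS Monitor is implemented or configured. The `ProbeWatcher` class and its parameters are hypothetical, and the probe is passed in as a plain function so that the sketch makes no assumptions about any particular endpoint or response format.

```python
from typing import Callable


class ProbeWatcher:
    """Tracks consecutive failures of a health probe and signals when to alert."""

    def __init__(
        self,
        probe: Callable[[], object],
        expected: object,
        failures_before_alert: int = 3,
    ) -> None:
        self.probe = probe
        self.expected = expected
        self.failures_before_alert = failures_before_alert
        self.consecutive_failures = 0

    def check(self) -> bool:
        """Run the probe once.

        Returns True only on the check where the number of consecutive
        failures reaches the alert limit, so one outage yields one alert.
        """
        try:
            healthy = self.probe() == self.expected
        except Exception:
            # No response at all (timeout, refused connection) is a failure.
            healthy = False

        if healthy:
            self.consecutive_failures = 0
            return False

        self.consecutive_failures += 1
        return self.consecutive_failures == self.failures_before_alert
```

In use, `probe` might be a function that sends a standard query to a service and returns the parsed response, with `check()` called on a fixed schedule. Requiring several consecutive failures before alerting is a design choice that trades a slightly slower notification for fewer false alarms from transient network errors.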

On-premises

ArcGIS systems deployed on-premises, using ArcGIS Enterprise for Windows, Linux or Kubernetes, have additional monitoring considerations and options, including:

  • Third-party monitoring tools – You can use third-party monitoring tools like Nagios, Zabbix, Prometheus, and Grafana to monitor the performance of hardware, servers, and network resources in your on-premises infrastructure.
  • Database monitoring tools – If you’re using relational databases with your ArcGIS Enterprise deployment, database-specific monitoring tools such as Oracle Enterprise Manager and Microsoft SQL Server Management Studio can help you monitor and optimize database performance.

Amazon Web Services

Systems deployed in AWS can make use of Amazon-specific monitoring tools, which are robust and designed to assist with the monitoring of critical systems built on AWS. These tools include:

  • CloudWatch – AWS provides Amazon CloudWatch, a native monitoring and observability service. You can use CloudWatch to monitor the performance of EC2 instances, RDS databases, and other AWS resources that are part of your ArcGIS Enterprise deployment in the AWS cloud.
  • Third-Party Tools – You can also integrate third-party monitoring solutions like New Relic, Datadog, or AppDynamics with your AWS-hosted ArcGIS Enterprise to gain comprehensive performance insights.
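As one example of CloudWatch monitoring for an EC2 instance, the following Python sketch builds the arguments for a CPU utilization alarm. It assumes the boto3 library's CloudWatch client and its `put_metric_alarm` call. The function name `cpu_alarm_params`, the alarm naming scheme, the instance ID, the SNS topic ARN, and the 80 percent threshold are all hypothetical values chosen for illustration. The function only builds a dictionary, so it can be reviewed and tested without AWS credentials.

```python
def cpu_alarm_params(
    instance_id: str,
    sns_topic_arn: str,
    threshold_percent: float = 80.0,
) -> dict:
    """Build keyword arguments for a CloudWatch CPU alarm on one EC2 instance."""
    return {
        "AlarmName": f"arcgis-high-cpu-{instance_id}",  # hypothetical naming scheme
        "AlarmDescription": "Average CPU above threshold on an ArcGIS Enterprise host",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per evaluation period
        "EvaluationPeriods": 2,  # two consecutive periods must breach
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # notify this SNS topic on alarm
    }


if __name__ == "__main__":
    # Placeholder identifiers; substitute values from your own AWS account.
    params = cpu_alarm_params(
        "i-0123456789abcdef0",
        "arn:aws:sns:us-east-1:123456789012:ops-alerts",
    )
    print(params)
    # With boto3 installed and credentials configured, the alarm could be
    # created with:
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Separating parameter construction from the API call keeps the alarm definition easy to version and to apply consistently across every instance in a deployment.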

Microsoft Azure

Systems deployed in Azure can take advantage of existing monitoring tools or approaches provided by Microsoft directly to Azure users and customers, including:

  • Azure Monitor – Microsoft Azure offers Azure Monitor, a native monitoring and diagnostics service. Azure Monitor allows you to collect and analyze performance data from various Azure resources, including virtual machines, databases, and Azure Kubernetes Service (AKS) clusters if used in your ArcGIS Enterprise deployment.
  • Third-Party Integrations – As with AWS, you can integrate third-party monitoring solutions like Dynatrace or SolarWinds with your Azure-hosted ArcGIS Enterprise for enhanced performance monitoring.

Google Cloud Platform (GCP)

GCP-specific monitoring tools and options include:

  • Cloud Monitoring (formerly Stackdriver) – Google Cloud provides Cloud Monitoring, which offers monitoring and observability capabilities for GCP resources. You can monitor virtual machines, databases, and other components used in your ArcGIS Enterprise deployment on GCP.
  • Third-Party Solutions – GCP allows integration with third-party monitoring tools like Prometheus and Grafana.

In all cloud environments, it’s essential to configure monitoring and alerting based on your specific needs and the resources you’re using. Cloud-native monitoring services typically integrate with alerting mechanisms that notify you of performance issues in real time.
