IT infrastructure monitoring

Build confidence that your ArcGIS systems will be available and performant through proactive IT infrastructure monitoring (ITIM).

Modern enterprise systems are expected to be performant and reliable at all levels of use. Any disruption in service can result in significant financial and operational loss. To reduce the possibility of service disruption, even the most well-architected IT infrastructure requires continuous monitoring to detect unexpected service demands and component failures. This practice enables IT professionals to minimize negative impacts on business operations by automatically detecting and reporting potential issues with the performance, availability, and resource utilization of your enterprise GIS system components.

Recommendations

  1. Monitor all components related to your ArcGIS Enterprise systems
  2. Establish monitoring practices in all environments (development, test, staging, and production)
  3. Analyze historical metrics to discover utilization trends and causes for spikes and failures
  4. Consider the benefits of ArcGIS Monitor, which provides specialized monitoring capabilities to detect, diagnose, and report issues and to recommend remediation

Monitor all components

ArcGIS Enterprise comprises four components that are configured to work together:

  • Portal for ArcGIS
  • ArcGIS Server
  • ArcGIS Data Store
  • ArcGIS Web Adaptor

These components can be hosted on multiple servers, on-premises or with a cloud provider, and depend on several computing resources, such as servers, database management systems (DBMS), networks, and file systems. The unavailability or performance degradation of any of these resources can disrupt business operations.

With this in mind, establish proactive measures that automatically detect overuse or unavailability of each resource. Early notification of potential problems enables IT professionals to respond quickly and to provide guidance on how to remediate the issue.
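As a rough illustration of automatic detection, the sketch below compares resource readings against fixed limits and reports anything that is over its limit or missing. The resource names, metric names, and threshold values are all hypothetical; a real deployment would take readings from its monitoring agent and tune the limits to its own baseline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Reading:
    resource: str            # hypothetical label, e.g. "gis-server-01"
    metric: str              # e.g. "cpu_percent", "disk_percent"
    value: Optional[float]   # None means the resource did not report


# Illustrative limits only (assumptions, not recommended values).
THRESHOLDS = {"cpu_percent": 85.0, "memory_percent": 90.0, "disk_percent": 80.0}


def evaluate(readings, thresholds=THRESHOLDS):
    """Return one alert string per reading that is missing or over its limit."""
    alerts = []
    for r in readings:
        if r.value is None:
            # A missing reading is treated as the resource being unavailable.
            alerts.append(f"UNAVAILABLE {r.resource}: no {r.metric} reading")
            continue
        limit = thresholds.get(r.metric)
        if limit is not None and r.value > limit:
            alerts.append(
                f"OVERUSE {r.resource}: {r.metric}={r.value:.1f} exceeds {limit:.1f}"
            )
    return alerts


if __name__ == "__main__":
    sample = [
        Reading("gis-server-01", "cpu_percent", 92.5),
        Reading("gis-server-01", "disk_percent", 41.0),
        Reading("datastore-01", "memory_percent", None),
    ]
    for alert in evaluate(sample):
        print(alert)
```

Treating a missing reading as its own alert keeps an unreachable server from being mistaken for a quiet one.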

Establishing automated monitoring and remediation steps for each component is required to build confidence that business operations will remain available. Establishing and maintaining a complete system architecture diagram is critical for identifying all dependent components and computing resources.

This guidance aligns with the need to provide full redundancy in highly available (HA) configurations, enabling your system to function well in the event of a failure of any dependent resource or ArcGIS Enterprise component. While HA configurations prevent business processes from being immediately disrupted, a failure leaves the system running without redundancy for that component, so the failed component must be restored quickly to maintain your redundant environment. Monitoring both primary and redundant components is important for maintaining confidence throughout your organization that the system can meet its service level agreement (SLA).
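The difference between an outage and a loss of redundancy can be made explicit in a monitoring rule. The following is a minimal sketch that assumes each redundant component reports a simple up or down flag per node; the function name, node names, and message format are hypothetical.

```python
def redundancy_state(component, node_up):
    """Classify a redundant component from its per-node up/down flags.

    node_up maps a node name to True (responding) or False (not responding).
    """
    if not node_up:
        raise ValueError("no nodes listed for " + component)
    up = [name for name, ok in node_up.items() if ok]
    down = [name for name, ok in node_up.items() if not ok]
    if not up:
        return f"OUTAGE {component}: all nodes down"
    if down:
        # Still serving requests, but a second failure would cause an outage.
        return (
            f"DEGRADED {component}: serving from {', '.join(up)}; "
            f"restore {', '.join(down)}"
        )
    return f"HEALTHY {component}: {len(up)} of {len(node_up)} nodes up"


if __name__ == "__main__":
    print(redundancy_state("portal", {"portal-a": True, "portal-b": False}))
```

Reporting the degraded state separately means a failed standby prompts action even though users have not noticed a problem.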

Monitor all environments

Establishing monitoring practices for all environments (production, staging, test, and development) instills confidence that the system will be available for all users. Full monitoring of the production environment establishes confidence that business operations will remain available and performant, and it typically needs little justification. Monitoring test and staging environments provides IT professionals with two advantages:

  1. Access to solid metrics on the computing resource consumption of new capabilities or upgraded systems, to assist in capacity planning
  2. An environment in which to practice detecting and reporting resource usage and availability anomalies, such as a server not being available or a web service not responding
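Detecting a web service that is not responding can be rehearsed in a test environment with a simple probe. The sketch below uses only the Python standard library; the URL, the timeout, and the two-second "slow" cutoff are placeholder assumptions, and the `fetch` parameter exists so the probe can be exercised without a live server.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout):
    """Return the HTTP status code for a GET request (raises on failure)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def probe(url, fetch=http_status, timeout=5.0, slow_after=2.0):
    """Probe one endpoint and report whether it is up, slow, or down."""
    started = time.monotonic()
    try:
        status = fetch(url, timeout)
    except (urllib.error.URLError, OSError) as exc:
        # Covers refused connections, DNS failures, timeouts, and HTTP errors.
        return {"url": url, "state": "down", "detail": str(exc)}
    elapsed = time.monotonic() - started
    if status != 200:
        return {"url": url, "state": "down", "detail": f"HTTP {status}"}
    state = "slow" if elapsed > slow_after else "up"
    return {"url": url, "state": state, "detail": f"{elapsed:.2f}s"}


if __name__ == "__main__":
    # Placeholder address; substitute an endpoint from your own test environment.
    print(probe("https://gis.example.com/service", timeout=3.0))
```

Stopping a service in the test environment and confirming that the probe reports it as down is a low-risk way to validate alerting before relying on it in production.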

Monitoring the development environment not only provides a reliable system for your developers and GIS analysts but can also provide valuable insight into the system resource utilization of experimental processes, such as long-running geoprocessing tools.

Analyze historical metrics

Modern monitoring tools not only detect potential issues in real time but also collect metrics for the various components. Analyzing these metrics over time can provide valuable insight into the system’s activity, usage, and performance, leading to the discovery of utilization trends and the causes of spikes and failures. Such analysis can highlight which enterprise GIS component has the most activity, which web services are most or least active, or which services are more active during specific time periods. This insight can then be used to justify and design changes to system resources and to anticipate spikes in usage.
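The kinds of questions described above can be sketched with a few lines of analysis over collected samples. This example assumes each sample is a hypothetical `(service, hour_of_day, request_count)` tuple and flags spikes with a simple rule (more than `k` standard deviations above the mean); monitoring products may use different data models and detection methods.

```python
from collections import defaultdict
from statistics import mean, pstdev


def busiest_services(samples, top=3):
    """Rank services by total request count across all samples."""
    totals = defaultdict(int)
    for service, _hour, requests in samples:
        totals[service] += requests
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


def hourly_profile(samples):
    """Average request count per hour of day (0-23) across all services."""
    by_hour = defaultdict(list)
    for _service, hour, requests in samples:
        by_hour[hour].append(requests)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def spikes(values, k=2.0):
    """Return indexes of values more than k standard deviations above the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if v > mu + k * sigma]


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    history = [
        ("Parcels", 9, 120), ("Parcels", 14, 300),
        ("Basemap", 9, 80), ("Basemap", 14, 90),
    ]
    print(busiest_services(history, top=1))
    print(hourly_profile(history))
    print(spikes([10, 11, 9, 10, 12, 10, 11, 50]))
```

The service ranking and hourly profile support capacity decisions, while the spike indexes point to the time windows worth investigating for a cause.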

Consider Esri-provided monitoring tools

Each ArcGIS Enterprise component delivers essential services with different system dependencies. For example, GIS-based applications often consume web services hosted by ArcGIS Server sites to deliver business information and capabilities. Diagnosing performance and outage issues with these services can be difficult because of the number of dependencies. Esri has developed a specialized monitoring product, ArcGIS Monitor, designed to detect, diagnose, and report issues and to provide remediation guidance specific to the ArcGIS system. ArcGIS Monitor complements ArcGIS Enterprise by providing tools to monitor the health of your enterprise GIS implementations.

Conclusion

Implementing effective IT infrastructure monitoring is essential for maintaining optimal performance and reliability of your ArcGIS deployment. By following these best practices, you can proactively identify issues, optimize resource allocation, and ensure a seamless experience for ArcGIS users.
