Environment isolation

Many organizations build IT systems using a multi-environment approach. Whether they are called development, staging, pre-production, QA, acceptance, or production, these terms designate environments that have different characteristics and serve different purposes. There is no standard definition of environments or of how they are used, other than a general spectrum that extends from “lower” development environments to a final “production” environment. In every case, the organization defines these environments and their constraints, and those definitions should align with other business processes or standards and support the organization’s business requirements. While there is no single best practice in this area (different systems have different requirements), the following sections provide some guidance on how to approach environment isolation in the architecture process.

Users of ArcGIS systems expect the system to be available when they need to do their work. However, significant changes to system configurations can cause downtime if these changes are not safely developed and tested in environments separate from production. Isolating computing environments is an approach to maintaining system reliability and availability by creating separate systems for production, testing, and development activities. While not all changes (such as app configuration) need to be tested in each environment, significant updates and new functionality can benefit from a structured approach to this topic.

In some cases, users’ expectations may be documented in a Service Level Agreement (SLA), or it might just be an expectation of when the system needs to be available. Consider your users’ expectations and business needs when deciding on the level of environment isolation and governance required to manage system changes.

Recommendations

To best maintain system reliability and availability for your users, these best practices apply:

If it is feasible for the organization, implement isolated production, staging, and development environments.
Test system changes such as patches, upgrades, or OS setting changes in a staging environment before making changes to the production environment.
Use a development environment to develop and test out new capabilities without impacting users in other environments.
Follow standard governance practices when promoting content and capabilities across environments.
Have a plan for how content promotion is managed: who reviews content, who approves it, and how it is moved.

Environment purposes

For the purposes of these best practice recommendations, the following three environment definitions are broadly used by Esri and our customers.

The Production environment is the live system that supports end users. Uptime requirements are usually defined by an SLA and met through effective change management and governance. Software, application, configuration, or network changes should generally not be made to a production system without first being tested in a staging environment. Staff working in the production environment are typically viewing apps to make decisions, collecting and editing information in or out of the office, and coordinating to get mission-critical work done. This environment often has increased monitoring, may have dedicated hardware to ensure good performance, and will have the most IT support effort attached to its maintenance.
A Staging environment is generally designed to be a representative mirror of the production environment that lets you vet system changes before deploying those changes to production. The term “representative” is used because maintaining an entire environment identical to production is costly both in resources and staff support. While a staging environment typically reflects production in terms of the tiers, servers, components and roles, you may or may not require the same individual server capacity as production, so a scaled-down deployment may be considered and cost savings may be possible. You can perform user acceptance testing, performance testing, load testing, and training in a staging environment to avoid risk to your production system. If needed, you can even implement multiple staging environments for different testing and training activities. Staff participating in a staging environment are those who are typically running load tests, performance tests, integration tests, QA/QC, or even possibly training.
A Development system is a workspace where developers and analysts can create and manage content or make changes without impacting a large audience. This dedicated environment is typically used for creating new capabilities such as applications, services, data models, or geoprocessing models, for unit testing, and for designing and constructing business workflows. The size and complexity of the environment will depend on the level of risk generated by changes, the number of creators, and the potential impact of system outages and downtime. Generally, only staff who are actively working with content in the development environment have access to this system; it is not broadly accessible within the organization.
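
The idea of a “representative” staging environment can be sketched as a simple check: staging should deploy the same component roles as production, while its capacity may differ. The following is a minimal illustration only. The Environment class, its fields, and the role names are hypothetical and are not an Esri data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical description of one environment (illustration only)."""
    name: str
    roles: frozenset        # component roles deployed, e.g. "portal"
    machines_per_role: int  # capacity; staging may legitimately be smaller


def mirror_gaps(staging, production):
    """List the role differences that make staging unrepresentative.

    Capacity (machines_per_role) is deliberately ignored, because a
    scaled-down staging deployment can still mirror production.
    """
    gaps = []
    missing = production.roles - staging.roles
    if missing:
        gaps.append("staging is missing roles: " + ", ".join(sorted(missing)))
    extra = staging.roles - production.roles
    if extra:
        gaps.append("staging has extra roles: " + ", ".join(sorted(extra)))
    return gaps


# Hypothetical role names, chosen for the example.
production = Environment(
    "production", frozenset({"portal", "hosting-server", "data-store"}), 4
)
staging = Environment("staging", frozenset({"portal", "hosting-server"}), 1)

for gap in mirror_gaps(staging, production):
    print(gap)
```

A check like this flags a missing tier in staging but does not flag the smaller machine count, which matches the cost trade-off described above.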


Depending on the organization’s tolerance for risk and IT policies, there may be a need to further separate certain kinds of activities outside of the production, staging, and development constructs. If needed, you can implement multiple environments for different testing and training activities, such as:

QA/QC
Integration or acceptance testing
Performance testing
Load testing
Training

Benefits and costs of environment isolation

Environment isolation insulates your production environment from risks and changes that can negatively impact your business, such as upgrades, new software, or unexpected changes, helping you better maintain its functionality, stability, and performance. Unintentional system changes can cause operational systems to fail to deliver the capabilities and performance that users expect. Implementing these isolated computing environments helps you deliver a stable, extensible, and high-performing system.

Environment isolation also has costs in IT resources (keeping multiple systems running), software licensing, and human capital, because a larger number of environments needs a larger support network and more staff involved in change control and deployment. Generally, larger, more business-critical systems deploy more complex environment isolation approaches, but even smaller organizations may choose to deploy a version of this approach to help isolate changes and protect their systems. It is important to investigate the costs of these choices and communicate them to stakeholders so that an informed decision is made, rather than a default choice to have multiple environments just because “we have always done it that way.”

Implementing environment isolation

Governance plays a critical role in successfully implementing environment isolation. It is the method by which risks are mitigated, resources are optimized, and business benefits are delivered. Governance should define what policies, procedures, and techniques teams will leverage to maintain these environments and promote changes across them.

There is no one-size-fits-all set of considerations or standard path for managing the breadth of your software, applications, services, and data across environments. However, there are some resources to help deploy environments consistently, such as Chef Cookbooks, Enterprise Cloud Builder, ArcGIS Enterprise Builder, and database replication tools and asset packages. See ArcGIS Enterprise deployment tools for details. Additionally, it is recommended to avoid manual configuration whenever possible to reduce the likelihood of human error. Consider using PowerShell DSC for ArcGIS, the ArcGIS REST API, and the ArcGIS API for Python to automate some of these tasks. Keep in mind that the creation of these scripts is an activity that is appropriate for a development environment.

Every choice made in development inherently leads to something someone needs to know or know how to do in staging and production. Employ good deployment practices by ensuring proper knowledge and skill transfer to production staff so they can operate the system as intended.

Working with multiple ArcGIS Enterprise deployments

Some organizations use multiple ArcGIS Enterprise deployments to separate these different tiers. It can be challenging to move and manage content across environments consistently and successfully, but there are tools that help automate these tasks. For example, the ArcGIS REST API provides two operations, Export Group Content and Import Group Content, that make it easier to move layers, maps, and apps across environments.

For example, consider a scenario where you have developed a customized Experience Builder application which references a web map and a set of feature layers within a group in your development environment, and it is now ready to be moved to a staging system for a structured review. To do so using these export/import group migration operations, you would conduct the following steps:

Export the group contents from development into a package.
Add the package as an item into your staging environment.
Import the contents of the package into a group within the staging environment.
Deploy your custom app to your staging hosting environment, and point the configuration to the staging environment URLs.
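
The last step, pointing the app configuration at the staging URLs, can be sketched as a small helper that rewrites URL prefixes in a JSON-style configuration. This is a minimal sketch under stated assumptions: the host names and configuration keys below are hypothetical and do not reflect the actual Experience Builder configuration schema, and the helper only rewrites URL prefixes.

```python
import json


def retarget_urls(config, source_base, target_base):
    """Return a copy of config in which every string value that starts
    with source_base starts with target_base instead."""
    if isinstance(config, dict):
        return {
            key: retarget_urls(value, source_base, target_base)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [retarget_urls(value, source_base, target_base) for value in config]
    if isinstance(config, str) and config.startswith(source_base):
        return target_base + config[len(source_base):]
    # Numbers, booleans, None, and unrelated strings pass through unchanged.
    return config


# Hypothetical development configuration, for illustration only.
dev_config = {
    "portalUrl": "https://dev.example.com/portal",
    "layerUrls": [
        "https://dev.example.com/server/rest/services/Parcels/FeatureServer/0"
    ],
    "title": "Parcel Viewer",
}

staging_config = retarget_urls(
    dev_config, "https://dev.example.com", "https://staging.example.com"
)
print(json.dumps(staging_config, indent=2))
```

Because the helper returns a new structure rather than editing in place, the development configuration stays intact and the same function can be reused for the later promotion to production.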


At this point, the items in the package can be discovered, shared, edited, and used in the staging environment, as determined by the staging group’s settings. The same workflow can be used to promote the items to production when they’re ready. This workflow can also be scripted using the GroupMigrationManager class in the ArcGIS API for Python.
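
A scripted version of the first three steps might look like the sketch below. It assumes that each group object exposes a GroupMigrationManager through a migration property with create and load methods that accept a future flag; that is how the ArcGIS API for Python documentation describes it at the time of writing, but the names and parameters should be verified against the installed version. The transfer_package argument is a hypothetical caller-supplied function that adds the exported package as an item in the target environment.

```python
def promote_group_content(source_group, target_group, transfer_package):
    """Export the content of source_group and import it into target_group.

    source_group, target_group: group objects from the two environments
        (for example, development and staging).
    transfer_package: hypothetical function supplied by the caller. It takes
        the exported package item and returns the corresponding item after
        adding it to the target environment.
    """
    # Step 1: export the group contents from the source into a package item.
    # future=False is assumed to make the call wait for the export to finish.
    package_item = source_group.migration.create(future=False)

    # Step 2: add the package as an item in the target environment.
    staged_package = transfer_package(package_item)

    # Step 3: import the package contents into the target group.
    return target_group.migration.load(staged_package, future=False)
```

Keeping the transfer step as a parameter leaves the choice of mechanism (download and upload, shared storage, or another approach) to the organization’s own governance. The same function could then be called again with the staging and production groups when the content is ready for production.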

Blue-green deployments

Deploying some types of changes, such as a system upgrade or a significant configuration change, can be disruptive. Some organizations use a strategy called blue-green deployment to seamlessly deploy new changes for users. A blue-green deployment is a deployment strategy in which you create two separate, but identical environments. One environment (blue) is running the current application version and one environment (green) is running the new application version or set of configurations. Traffic is directed to either environment using standard mechanisms such as routers, load balancers, reverse proxies, or web servers.

Blue and green take turns playing the role of production, and only one of the environments is live at any given time. For example, when it’s time to upgrade ArcGIS Enterprise, the upgrade would first be performed on the green system. Once the testing team is satisfied that everything is fully operational and ready for production use, the only thing that changes is the direction of traffic flowing from the proxy or load balancer, which now goes to green instead of blue with no perceptible change for production end users. At this point, new content and capabilities would be developed in blue until sufficient testing has been completed to warrant switching traffic again.
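
The switching logic described above can be modeled in a few lines. This is a toy sketch, not a load balancer configuration: the BlueGreenRouter class and the is_ready callback are hypothetical names, and in practice the switch would be made in the proxy, load balancer, or DNS layer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenRouter:
    """Toy model of the proxy or load balancer in front of both environments."""
    live: str = "blue"

    @property
    def idle(self) -> str:
        """The environment that is not currently serving production traffic."""
        return "green" if self.live == "blue" else "blue"

    def route(self) -> str:
        """Name of the environment that receives production traffic."""
        return self.live

    def switch(self, is_ready: Callable[[str], bool]) -> bool:
        """Send traffic to the idle environment only if it passes its checks."""
        candidate = self.idle
        if not is_ready(candidate):
            return False  # testing is not signed off, so nothing changes
        self.live = candidate
        return True


router = BlueGreenRouter()
router.switch(lambda env: False)  # upgrade on green not yet verified
print(router.route())             # blue
router.switch(lambda env: True)   # testing team signs off on green
print(router.route(), router.idle)  # green blue
```

The only state that changes during a release is which environment is live, which is why the switch is imperceptible to end users and easy to reverse.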

Keeping two sets of environments running all the time can get expensive. Fortunately, the cloud makes blue-green deployments more feasible. Every major cloud platform provider has tools that let you bring up and tear down infrastructure on demand. For example, you can start and stop servers with infrastructure as code and automate the uptime and configuration of the system down to specific details.

Final recommendations

Implementing these isolated computing environments helps you deliver a stable, extensible, and high-performing system. By leveraging these environments to support effective change management, you can shield your system from unexpected failure and avoid disruptions to business operations. Most organizations should have at least two computing environments: production and staging. A development environment may be unnecessary for organizations that do no custom development or that primarily use low-code and no-code applications. However, you may have more environments depending on your organization’s appetite for risk.

Consider how you will implement and govern environment isolation (and the activities isolated within each environment) as early as possible. While there is no one-size-fits-all approach to these choices, there are many tools and common practices you can refer to for guidance.

An additional resource that was recently published covers the concept of content promotion between environments, with an example of a scripted approach to this topic: Esri Community Post: ArcGIS Enterprise Content Promotion.