Test results on database resource impacts

In our first set of tests, we looked at the impact of database resources on the system’s ability to handle load, even with ample GIS server resources. The goal is to show how strongly the database affects the entire system. In other words, if your users are experiencing long wait times, the cause may not lie solely within the ArcGIS Server tier.

Test methods and results

We conducted two tests to compare how database resources impact system performance, even with sufficient server resources:

  • first using a small database virtual machine instance with 8 vCPU
  • then with a larger database virtual machine instance with 16 vCPU

We held all other aspects of the system constant, with a 1:1 ArcSOC configuration: one ArcSOC configured per vCPU on the ArcGIS Server instance, so that the number of ArcSOCs equals the number of vCPUs. We captured and monitored performance metrics such as ArcSOC use and availability, service wait times, system resource utilization, and error rates to evaluate each configuration. Tests were performed at 8 times (8x) the design load of the original system test study, and the ArcGIS Enterprise server resources were cut in half (compared to our previous test studies) to ensure there was enough load to impact the system.
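As a rough illustration of the 1:1 rule, the sketch below derives an ArcSOC count from a vCPU count. The function name `arcsoc_count` and the `arcsocs_per_vcpu` parameter are hypothetical and are not ArcGIS Server settings; they only make the arithmetic explicit.

```python
def arcsoc_count(server_vcpus: int, arcsocs_per_vcpu: int = 1) -> int:
    """Return how many ArcSOCs to configure for a given vCPU count.

    Hypothetical helper: with the default ratio of 1, the result
    equals the vCPU count (the 1:1 configuration used in these tests).
    """
    if server_vcpus <= 0 or arcsocs_per_vcpu <= 0:
        raise ValueError("vCPU count and ratio must be positive")
    return server_vcpus * arcsocs_per_vcpu


# An ArcGIS Server instance with 8 vCPUs gets 8 ArcSOCs under the 1:1 rule.
print(arcsoc_count(8))
```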

Because ArcGIS is a multi-tier system, tests were conducted across client, service, and data storage tiers, as well as the underlying infrastructure itself. In this test study, JMeter was used to simulate the user workflows and measure system performance under different loads.

Load test 1: small database instance – 8 vCPU

This run was performed with 8 vCPUs available on the PostgreSQL instance to observe the impact of a scaled-down database server on the system. The ArcGIS GIS Server (UN Server) was also provisioned with 8 vCPUs. There are six instance resource charts below showing resource utilization across the system and one chart showing concurrent requests. In each of the resource charts, the orange lines represent % CPU utilization, the gold lines represent % disk usage, and the purple lines represent % memory utilization.

In the diagram below, you can see that the PostgreSQL database CPU is running at 100% and the hosting server CPU frequently spikes to 100%. This is caused by the high number of view requests associated with the workflows at the 8x design load.

The bottom chart shows concurrent requests, as measured from JMeter logs. Notice the red line, which represents concurrent view requests, trends up to 538 as the test runs. This indicates that requests are not closing. In later charts, you will see this line moving steadily up and down, indicating that the system is responding and requests are closing quickly enough to handle the load.

System resource utilization with a small database instance size
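The concurrent request count described above can be approximated from per-request timing records. The following is a minimal sketch, not the method used in this study: it assumes each record is a `(start_ms, elapsed_ms)` pair, for example taken from the `timeStamp` and `elapsed` columns of a JMeter results file, and it assumes `timeStamp` marks the start of the request. The function names are hypothetical.

```python
from typing import Iterable, List, Tuple


def concurrent_requests(samples: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return (time_ms, open_request_count) after every start or end event.

    Each sample is (start_ms, elapsed_ms). Assumption: start_ms is the
    moment the request was sent and elapsed_ms is its full duration.
    """
    events = []
    for start_ms, elapsed_ms in samples:
        events.append((start_ms, 1))                 # request opens
        events.append((start_ms + elapsed_ms, -1))   # request closes
    # Sort by time; on ties, count starts before ends so the running
    # total never drops below zero.
    events.sort(key=lambda event: (event[0], -event[1]))

    timeline = []
    open_count = 0
    for time_ms, delta in events:
        open_count += delta
        timeline.append((time_ms, open_count))
    return timeline


def peak_concurrency(samples: Iterable[Tuple[int, int]]) -> int:
    """Return the highest number of simultaneously open requests."""
    return max((count for _, count in concurrent_requests(samples)), default=0)


# Made-up records: two overlapping requests followed by a short one.
example = [(0, 100), (50, 100), (120, 10)]
print(peak_concurrency(example))
```

If requests close as fast as they open, the count in the timeline oscillates around a steady level. If they do not, it climbs for the duration of the run, as in the small database test.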

This configuration did not support the load because the database server was under-resourced, as seen by the amount of orange (CPU utilization) in the PostgreSQL database chart, the spikes in the hosting server CPU, and the growing number of concurrent requests.

To further reinforce this claim, the chart below represents ArcSOC utilization on the hosting server as captured by the Soccer (ArcSOC Monitoring) utility, which is running on a remote machine. The red line shows busy ArcSOCs at 100% (8), likely attributed to the overloaded database. ArcSOCs are held (busy) while they wait for the database to respond. In fact, the ArcSOCs were so busy that Soccer could not track their state, as illustrated by the incomplete red line and the sudden drops in maximum (green line).

ArcSOC utilization with a small database instance size

Load test 2: large database instance – 16 vCPU

In the second test, we doubled the PostgreSQL database instance to 16 vCPUs to observe possible differences from the first test. The ArcGIS GIS Servers remained provisioned with 8 vCPUs each. As in the previous diagram, percent utilization for CPU is orange, disk is gold, and memory is purple. Notice that, apart from a few spikes, all servers are generally running below 60% CPU, disk, and memory.

The concurrent request chart shows concurrent view requests averaging 36, with a few spikes. The open requests are not trending up as in the previous chart, indicating this system is handling the load.

System resource utilization with a large database instance size
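One way to check numerically whether open requests are "trending up" is to fit a straight line to the concurrent request counts and inspect its slope. This is an illustrative sketch only, using an ordinary least-squares fit; the function name `trend_slope` and the sample values are assumptions, not data from this study.

```python
from typing import Sequence, Tuple


def trend_slope(points: Sequence[Tuple[float, float]]) -> float:
    """Return the least-squares slope of y over x for (x, y) points.

    Here x is elapsed test time and y is the concurrent request count.
    A slope near zero suggests requests are closing as fast as they open.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


# Made-up series: one oscillating around a steady level, one climbing.
steady = [(0, 30), (1, 40), (2, 30), (3, 40), (4, 30)]
climbing = [(0, 0), (1, 2), (2, 4)]
print(trend_slope(steady), trend_slope(climbing))
```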

The ArcSOC chart below shows that ArcSOCs on the hosting server are busy, but the overall system response is good. Although usage at the 99th percentile (p99) is 8 ArcSOCs or fewer, the average is only 4.81. Later we’ll look at user experience to see if the system enables people to work efficiently.

ArcSOC utilization with a large database instance size
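To make the difference between the p99 and the average concrete, the sketch below computes both from a list of busy-ArcSOC samples. The sample values are invented for illustration and are not the measurements from this test; the nearest-rank percentile method is one common definition and may differ from what the monitoring utility reports.

```python
import math
from statistics import mean
from typing import Sequence


def nearest_rank_percentile(values: Sequence[float], pct: float) -> float:
    """Return the nearest-rank percentile: the smallest value such that
    at least pct percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical busy-ArcSOC counts sampled over a run (maximum of 8 ArcSOCs).
busy_samples = [2, 3, 4, 5, 5, 6, 8, 8, 4, 5]

print("average:", mean(busy_samples))
print("p99:", nearest_rank_percentile(busy_samples, 99))
```

A p99 at the maximum with a much lower average indicates that all ArcSOCs are occasionally busy, but most of the time there is spare capacity.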

User experience

In addition to improving overall system utilization and performance, the increased resources available on the database instance significantly improved end users’ ability to complete their work efficiently. This test study evaluated end-user efficiency by observing workflow execution times (how long it takes a user to complete all of a workflow’s steps) and workflow step execution times (how long it takes to complete a key step within a workflow).
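The relationship between the two measures can be sketched as follows: a workflow execution time is the sum of its step execution times. The record layout, the function name, and every duration below are placeholders chosen for illustration. They are not results from this test study.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def workflow_totals(step_times: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """Sum step durations (seconds) into one total per workflow.

    Each record is (workflow_name, step_name, seconds).
    """
    totals: Dict[str, float] = defaultdict(float)
    for workflow, _step, seconds in step_times:
        totals[workflow] += seconds
    return dict(totals)


# Placeholder durations only; not measured values.
records = [
    ("Update Asset", "open project", 10.0),
    ("Update Asset", "locate asset", 5.0),
    ("View Assets", "open project", 8.0),
]
print(workflow_totals(records))
```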

Note:

The user experience is the ultimate measurement in these test studies. We have seen throughout testing that even when the system seems to be performing within normal parameters, aspects like network latency, GPU implementation, map instance misconfiguration, etc. can negatively impact end users. Focus on the end users to improve your return on investment.

Workflow execution times

In the chart above, you can see how increased system resource saturation in the small database test run results in longer workflow execution times and therefore a worse end-user experience. The overall execution time for all workflows increased significantly with an under-resourced database as compared to a properly resourced one, even when ArcGIS Server was well-resourced. In particular, the View Assets workflow saw a dramatic decrease in workflow execution time with a properly sized enterprise geodatabase.

In addition to workflow execution time, we can look more deeply at how database resources impact the duration of specific workflow steps. The chart shows a vast difference in the time needed to open a project and locate an asset in the Update Asset workflow. Similarly, Electric Tracing showed a significant increase in duration across all workflow steps with the small database. This pattern continued for all workflows captured.

Key workflow step execution times
