Real-time data streaming and analytics system (SaaS)
A system following the real-time data streaming and analytics system pattern is available as a software as a service (SaaS) based deployment using ArcGIS Velocity and ArcGIS Online.
ArcGIS Online is a cloud-based GIS managed and delivered as SaaS by Esri. ArcGIS Online provides capabilities that span the data, services/logic, and presentation tiers, working together to provide a complete system. Built on world-class cloud architecture and managed by IT and geographic information system (GIS) experts, ArcGIS Online offers reliable and comprehensive web-based GIS capabilities.
ArcGIS Velocity is a cloud-native add-on capability for ArcGIS Online. It enables users to ingest data from Internet of Things (IoT) platforms, message brokers, or third-party APIs. It also helps users process, visualize, and analyze real-time data feeds; store those feeds as big data; and perform fast queries and analysis.
Base architecture
The following is a typical base architecture for a real-time data streaming and analytics system deployed as SaaS.
This diagram should not be taken as is and used as the design for your system. There are many important factors and design choices that should be considered when designing your system. Review the using system patterns topic for more information. Additionally, the diagram depicted below delivers only the base capabilities of the system; additional system components may be required when delivering extended capabilities.
Key components of this architecture include:
- ArcGIS Online, including standard portal components such as users, groups, and items, as well as location services such as basemaps and geocoding services. The location services powering the real-time data streaming and analytics system may also come in part or in full from another location services system.
- ArcGIS Velocity provides real-time and big data capabilities as an add-on for ArcGIS Online. It includes spatial analysis tools for both real-time and big data analytics. Results from ArcGIS Velocity can be published as ArcGIS stream services and feature services, or sent to other systems as messages (for example, email, SMS, and Kafka). Results can also be written to external storage (for example, Amazon S3 and Azure Blob Storage) or stored in the ArcGIS Velocity managed data store, where they can be used for big data analytics. Learn more about ArcGIS Velocity.
- ArcGIS Velocity includes a comprehensive website for managing the real-time data streaming and analytics system, as well as designing and performing both real-time and big data analytics. The website is available at velocity.arcgis.com. ArcGIS Velocity exposes tools and APIs, and is typically consumed by a wide range of applications and systems. Learn more about the applications used in a real-time data streaming and analytics system.
Key interactions in this architecture include:
- Client applications communicate with data services as well as location services over HTTPS, typically via stateless REST APIs.
- ArcGIS Velocity ingests data from real-time, streaming sources through feeds. ArcGIS Velocity works with a wide variety of ArcGIS, web and messaging, cloud, and data provider sources, covering both polling and streaming feeds. Learn more about feeds in ArcGIS Velocity.
- ArcGIS Velocity ingests data for big data analytics through data sources. ArcGIS Velocity supports ArcGIS, web and messaging, and cloud-based data sources. It also provides a variety of standard geographies sources to aid in the filtering and enrichment of data. These standard geographies are provided in two groupings, World and United States. Learn more about data sources in ArcGIS Velocity.
Additional information on using and administering the foundational software used in real-time data streaming and analytics systems can be found in the ArcGIS Online and ArcGIS Velocity product documentation.
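The stateless REST interaction described above can be sketched as follows. This is a minimal illustration, not a definitive client: the layer URL is a hypothetical placeholder, and `build_query_url` is a helper name introduced here for illustration. The `where`, `outFields`, `resultRecordCount`, and `f` parameters follow the ArcGIS REST API feature layer query operation. A secured layer would also require a token, which is omitted here.

```python
from urllib.parse import urlencode


def build_query_url(layer_url: str, where: str = "1=1",
                    out_fields: str = "*", max_records: int = 100) -> str:
    """Build a query URL for a feature layer endpoint.

    Every request carries all of its parameters, so no session state
    is needed between calls (the "stateless" part of the interaction).
    """
    if not layer_url.lower().startswith("https://"):
        raise ValueError("layer_url must use HTTPS")
    params = {
        "where": where,                    # attribute filter
        "outFields": out_fields,           # fields to return
        "resultRecordCount": max_records,  # cap on returned records
        "f": "json",                       # response format
    }
    return f"{layer_url.rstrip('/')}/query?{urlencode(params)}"


# Hypothetical output feature layer URL (an assumption, not a real service).
LAYER_URL = "https://services.example.com/arcgis/rest/services/vehicles/FeatureServer/0"

url = build_query_url(LAYER_URL, where="speed > 50")
print(url)
```

Because the URL fully describes the request, any HTTP client can issue it, and repeated calls can be load balanced without server-side session affinity.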
Capabilities
The capabilities of the real-time data streaming and analytics system on SaaS are described below. See the capability overview and the comparison of capability support across deployment patterns for more information.
Capabilities that are used in a real-time data streaming and analytics system but are typically provided by other systems are not listed below. Examples include basemaps, geocoding, and other location services provided by a location services system. Learn more about related system patterns.
Base capabilities
Base capabilities represent the most common capabilities delivered by real-time data streaming and analytics systems and are enabled by the base architecture presented above.
- Feed ingest connects the system to external sources of real-time, observational data such as Internet of Things (IoT) platforms, message brokers, and third-party APIs. These external sources are referred to as feeds and can be configured as input to the real-time data streaming and analytics system. ArcGIS Velocity supports both polling and streaming feeds, including ArcGIS, cloud, web, messaging, and data provider feeds. Learn more about feeds in ArcGIS Velocity.
- Data ingest enables data to be loaded into the system for batch analysis and processing. This system pattern does not support performing big data analytics directly at data source locations external to ArcGIS. ArcGIS Velocity supports batch analysis and processing on data stored in the system. Data can be ingested into ArcGIS Velocity from standard geographies, ArcGIS, cloud, web, and messaging data sources. Learn more about data sources and data formats in ArcGIS Velocity.
- Spatial joins and relationships enable rows from two feeds or datasets to be combined based on a spatial relationship. A variety of spatial relationships, including intersect, erase, union, identity, and symmetrical difference may be applied. Tools that perform spatial joins and relationships include, but are not limited to, join features, merge, and overlay layers. Note that some tools only support real-time analysis where others only support big data analysis. Learn more about real-time analysis and big data analysis in ArcGIS Velocity.
- Pattern analysis identifies spatial and temporal patterns in data. Tools that perform pattern analysis include, but are not limited to, find hot spots, find point clusters, and generalized linear regression. Pattern analysis is typically performed on big data, not real-time feeds. Learn more about real-time analysis and big data analysis in ArcGIS Velocity.
- Proximity analysis looks at the proximity of spatial data to other spatial data. Tools that perform proximity analysis include, but are not limited to, create buffers and calculate distance. Note that some tools only support real-time analysis where others only support big data analysis. Learn more about real-time analysis and big data analysis in ArcGIS Velocity.
- Summarization analysis aggregates or summarizes data into higher order data structures. Tools that perform summarization analysis include, but are not limited to, aggregate points, calculate density, and summarize within. Summarization analysis is typically performed on big data, not real-time feeds. Learn more about real-time analysis and big data analysis in ArcGIS Velocity.
- Track analysis works with time-enabled points correlated to moving objects. Tools that perform track analysis include, but are not limited to, reconstruct tracks and snap to network. Note that some tools only support real-time analysis where others only support big data analysis. Learn more about real-time analysis and big data analysis in ArcGIS Velocity.
- Geofence analysis is a form of real-time spatial analysis in which features (often track points) are assessed using areas of interest (often polygon areas). Most commonly, point-based observations are analyzed to determine if they have entered or exited a virtual perimeter. ArcGIS Velocity supports geofencing and dynamic geofencing in several of the real-time and big data analytic tools. Learn more about geofence analysis in ArcGIS Velocity.
- Data management supports operating on geometries and other fields in real-time feeds and big data. Tools that perform data management include, but are not limited to, calculate field and map fields. Note that some tools only support real-time analysis where others only support big data analysis. Learn more about real-time analysis and big data analysis in ArcGIS Velocity.
- Mapping and visualization of analysis results is a powerful step to provide context and to help uncover patterns, trends, and relationships in data. Visualizing and mapping is analogous to charting and plotting with nonspatial data; it enables analysts to verify their analysis, iterate, and create shareable and engaging results. Learn more about visualizing data in ArcGIS Velocity.
- Data publishing and hosting provides for secure storage, management, and access of data as a service for data ingested into the system or persisted from real-time feeds. Data hosted in the system is typically published for consumption as feature layers.
- Feed publishing and hosting provides for new feeds to be published to and hosted by the system. Feeds hosted by the system are typically published as stream layers.
- Send and store messages is an output of real-time streaming and analytics that sends or stores processed feed data (messages) to external systems including message brokers, object stores, and other messaging systems like email and SMS. Supported output types for real-time analytics include ArcGIS feature and stream layers, Azure Event and IoT Hubs, as well as various web and messaging outputs such as text messages and Kafka. Supported output types for big data analytics include ArcGIS feature and stream layers, AWS and Azure based object stores, Azure Event Hub, as well as various web and messaging outputs such as text messages and Kafka. Learn more about the fundamentals of analytic outputs and working with outputs in ArcGIS Velocity.
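To make the geofence capability in the list above concrete, the following is a minimal sketch of enter/exit detection for a single track against one polygon. It is not how ArcGIS Velocity implements geofencing; it uses a standard ray-casting point-in-polygon test on planar coordinates, and the function names and sample coordinates are assumptions introduced for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Point = Tuple[float, float]


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def geofence_events(track: Iterable[Point],
                    fence: List[Point]) -> Iterator[Tuple[int, str]]:
    """Yield (index, "enter" | "exit") each time the track crosses the fence.

    The first observation only establishes the initial state, so it never
    produces an event by itself.
    """
    was_inside: Optional[bool] = None
    for i, pt in enumerate(track):
        is_inside = point_in_polygon(pt, fence)
        if was_inside is not None and is_inside != was_inside:
            yield (i, "enter" if is_inside else "exit")
        was_inside = is_inside


# Hypothetical square area of interest and a track that passes through it.
fence = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
track = [(-5.0, 5.0), (2.0, 5.0), (8.0, 5.0), (12.0, 5.0)]

events = list(geofence_events(track, fence))
print(events)  # [(1, 'enter'), (3, 'exit')]
```

Keeping only the previous inside/outside state per track is what makes this style of check suitable for streaming data: each new observation is evaluated once, without revisiting history.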
Extended capabilities
Extended capabilities are typically added to meet specific needs or to support industry-specific data models and solutions, and they may require additional software components or architectural considerations.
- Sharing of analysis results is supported by ArcGIS but is considered outside of the scope of the real-time data streaming and analytics system. See related system patterns for more information.
Considerations
The considerations below apply the pillars of the ArcGIS Well-Architected Framework to the real-time data streaming and analytics system pattern on SaaS. The information presented here is not meant to be exhaustive, but rather highlights key considerations for designing and implementing this specific combination of system and deployment pattern. Learn more about the architecture pillars of the ArcGIS Well-Architected Framework.
Reliability
Reliability ensures your system provides the level of service required by the business, as well as your customers and stakeholders. For more information, see the reliability pillar overview.
Security
Security protects your systems and information. For more information, see the security pillar overview.
- Authentication and authorization are required for designing and running analytics as well as managing the real-time data streaming and analytics system. It is also common for outputs like ArcGIS feature and stream layers to be secured, requiring authentication and authorization for access.
- User access and data collaboration are governed by role-based access controls and modern authorization and authentication models including OAuth, SAML, and multifactor authentication.
- Systems are subject to vulnerability assessments including system, web application, and database scans.
Learn more about ArcGIS Online security best practices and implementation guidance.
Performance and scalability
Performance and scalability aim to optimize the overall experience users have with the system as well as ensure the system scales to meet evolving workload demands. For more information, see the performance and scalability pillar overview.
- In a real-time, IoT system, spikes in the quantity, complexity, or velocity of data occur. Additionally, analytics can be configured that process millions or even billions of records with processing pipelines of varying complexity. To address this, feeds, real-time analytics, and big data analytics in ArcGIS Velocity can dynamically allocate additional resources to maintain collection and real-time analysis velocity as well as achieve fast processing for big data analytics. Autoscaling scales resources up or down according to load. Learn more about autoscaling in ArcGIS Velocity.
Learn more about ArcGIS Velocity best practices for organization administrators such as planning for capacity and managing data retention times.
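As a rough aid to capacity planning, the relationship between event rate, record size, and retention time can be worked through with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: all input numbers are hypothetical, and it ignores indexing, compression, and any storage overhead specific to ArcGIS Velocity.

```python
SECONDS_PER_DAY = 86_400


def estimate_storage_gb(events_per_second: float, bytes_per_event: int,
                        retention_days: int) -> float:
    """Estimate raw storage (in GB, 10^9 bytes) needed to retain a feed."""
    total_bytes = events_per_second * bytes_per_event * retention_days * SECONDS_PER_DAY
    return total_bytes / 1e9


# Hypothetical feed: 200 events per second at roughly 500 bytes per event.
thirty_days = estimate_storage_gb(200, 500, retention_days=30)
seven_days = estimate_storage_gb(200, 500, retention_days=7)

print(f"30-day retention: {thirty_days:.2f} GB")  # 259.20 GB
print(f"7-day retention:  {seven_days:.2f} GB")   # 60.48 GB
```

Because the estimate is linear in every input, shortening retention or thinning the feed reduces storage proportionally, which is why retention time is a useful lever for administrators.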
Automation
Automation aims to reduce effort spent on manual deployment and operational tasks, leading to increased operational efficiency and fewer human-introduced system anomalies. For more information, see the automation pillar overview.
- Design and setup of real-time analysis is typically performed interactively, though outputs from real-time analytics are often used in automated workflows. Big data analysis is often iterative, requiring human review and intervention between analysis executions; however, big data analytics can also be automated using scripting.
Integration
Integration connects this system with other systems for delivering enterprise services and amplifying organizational productivity. For more information, see the integration pillar overview.
- Integration with other systems can take the form of real-time feed and big data ingest into the real-time data streaming and analytics system. The outputs from real-time data streaming and analytics systems are also commonly integrated into other systems across an organization’s enterprise, and therefore may also support business operations that are unknown or unavailable to systems administrators.
Observability
Observability provides visibility into the system, enabling operations staff and other technical roles to keep the system running in a healthy, steady state. For more information, see the observability pillar overview.
- Real-time data, which is typically moving at high velocity, comes with some unique observability considerations. This is especially true when the velocity and/or stability of incoming feeds is inconsistent.
- The delivery of real-time services to the whole organization (and possibly beyond) may lead to usage patterns and growth not anticipated by the system designers or operators. Monitoring helps people make decisions about when to scale and evolve to meet demand while continuing to operate properly (and in accordance with SLAs).
- ArcGIS Online and ArcGIS Velocity, as SaaS offerings, do not support observation of their underlying infrastructure and software internals. They do, however, offer ways to observe system utilization and health.
- Monitor feeds, analytics, and output layers in ArcGIS Velocity.
- Monitor compute and storage utilization in ArcGIS Velocity.
- Monitor the health and availability of ArcGIS Online services and key components at the ArcGIS Online Health Dashboard, as well as ArcGIS Living Atlas of the World live feed status.
- View and report on the usage status of an ArcGIS Online subscription, including the organization’s credit usage, member status and activity, content usage, apps, and groups. Also consider monitoring usage of specific items, including maps, layers, and other content published through the location services system. Learn more about best practices for organization maintenance in ArcGIS Online.
- Additional observation of user logins and account changes may be possible through the configured identity provider when using SAML and/or OpenID Connect logins.
Other
Additional considerations for designing and implementing a real-time data streaming and analytics system as SaaS include:
- Successful operation requires a strong understanding of GIS and IT concepts as well as the underlying technology. The organization should also understand the implications of SaaS from a data access, security, and management perspective.
- Data governance and alignment with IT policies and roles should be strongly considered when implementing this system pattern.