Migrating Geospatial Applications to Cloud

Authors: Jay Iyer, with Alpha Omega Cloud Centers of Excellence

July 8, 2024

1. Introduction

NOAA’s mission is to be the Nation’s authoritative environmental intelligence agency by accurately predicting the climate and environmental events that can potentially exact a toll on the economy, human life, and ecosystems. Our support within NOAA has contributed to this mission by delivering new capabilities and expanding current services in a highly cost-effective manner.

The NOAA Environmental Visualization Lab (VizLab) creates satellite data visualizations for a variety of applications and customers, including Science on a Sphere, Data in the Classroom, the NESDIS website, and commercial customers. Alpha Omega has been entrusted with leading the migration of the geospatial visualization infrastructure, currently spread across four data centers, into the AWS Cloud. This migration emphasizes the adoption of modern, cloud-native technologies with a strong focus on reliability, scalability, and security. Key drivers include implementing effective cost controls while managing a substantial volume of data (approximately 60 TB and growing), frequent data ingestion (almost every minute), demanding processing and dissemination timelines, and a diverse end-user customer profile.

This article summarizes the lessons learned, best practices, and proven architectural considerations we applied in migrating ArcGIS-based GIS services to the AWS Cloud. The challenges posed by the scale, real-time data requirements, and user diversity underscore the complexity of this endeavor, making it one of the more distinctive migrations our Cloud Center of Excellence (COE) has supported to date.

2. ArcGIS Migration Strategy Components

The Cloud offers a rich set of technologies and tools that may be leveraged to enhance ArcGIS implementations. Reliability, redundancy, high availability, cost efficiency, elasticity, and the ability to pivot to new technologies such as serverless services are some of the benefits of moving to a Cloud-based ArcGIS implementation. As technology continues to evolve, the value proposition offered by Cloud technologies will only increase.

As with any technology, adopting a Cloud-based implementation of ArcGIS presents unique challenges, such as integration with on-premises systems, that need to be identified and addressed early in the migration cycle. Key factors to consider are data migration strategies, data storage options and costs, and a smooth cutover from on-premises ArcGIS sites to ones hosted in the Cloud.

The following are common, repeatable strategy components for migrating geospatial applications to the Cloud, derived from NOAA and other federal experience combined with our Cloud COE knowledge base:

  1. Evaluate Cloud Readiness: Assess the current ArcGIS application’s architecture, dependencies, and performance requirements to determine its suitability for migration to AWS. Consider factors like scalability, latency, and data residency requirements.
  2. Cloud Architecture Design: Design a cloud-native architecture that leverages AWS services such as EC2, S3, RDS, and ECS/EKS for compute, storage, databases, and container orchestration respectively. Utilize managed services like Amazon Aurora for databases to reduce operational overhead.
  3. Data Migration Strategy: Develop a comprehensive strategy for migrating ArcGIS data (spatial datasets, maps, layers) to AWS. Consider tools like AWS DataSync, AWS Snowball, or direct database replication methods depending on data volume and transfer speeds.
  4. Performance Optimization: Optimize ArcGIS application performance on AWS by leveraging Auto Scaling groups, Elastic Load Balancing (ELB), and caching mechanisms (e.g., Amazon ElastiCache) to handle varying workloads and improve response times.
  5. Security and Compliance: Implement AWS security best practices including IAM roles, VPC configurations, encryption (at rest and in transit), and compliance with relevant standards (e.g., FISMA, HIPAA) if applicable to ensure data protection and regulatory compliance.
  6. Automated Deployments: Utilize AWS CloudFormation or AWS CDK (Cloud Development Kit) for infrastructure as code (IaC) to automate the provisioning and deployment of ArcGIS application components. This ensures consistency and repeatability across environments.
  7. Monitoring and Alerting: Set up comprehensive monitoring using AWS CloudWatch to track application performance, resource utilization, and operational metrics. Configure alerts to proactively respond to issues and maintain high availability.
  8. Cost Optimization: Implement cost management strategies such as Reserved Instances and Spot Instances, and use AWS Cost Explorer to monitor and optimize expenses associated with compute, storage, and data transfer.
  9. Disaster Recovery and Backup: Design and implement a robust disaster recovery (DR) plan leveraging AWS services like AWS Backup, S3 cross-region replication, and multi-AZ deployments for high availability and data durability.
  10. Training and Documentation: Provide training to the team on AWS services and best practices specific to ArcGIS applications in the cloud. Maintain detailed documentation of the migration process, architecture diagrams, and operational procedures for future reference and knowledge sharing.
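
As a rough illustration of component 6 (Automated Deployments), the sketch below builds a minimal CloudFormation template as a Python dictionary and serializes it to JSON. The logical IDs (GeoDataBucket, SharedFileSystem) and the choice of resources are assumptions made for this example and are not taken from the NOAA deployment; a real stack would also define networking, IAM, and compute.

```python
import json


def build_template(bucket_logical_id: str = "GeoDataBucket") -> dict:
    """Return a minimal CloudFormation template as a plain dictionary.

    All logical IDs here are hypothetical placeholders.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Illustrative storage resources for GIS data",
        "Resources": {
            # Versioned, encrypted bucket for spatial datasets.
            bucket_logical_id: {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "VersioningConfiguration": {"Status": "Enabled"},
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {
                                "ServerSideEncryptionByDefault": {
                                    "SSEAlgorithm": "AES256"
                                }
                            }
                        ]
                    },
                },
            },
            # Encrypted shared file system for configuration files.
            "SharedFileSystem": {
                "Type": "AWS::EFS::FileSystem",
                "Properties": {"Encrypted": True},
            },
        },
    }


# Serialize deterministically so the template can be kept in version control.
template_json = json.dumps(build_template(), indent=2, sort_keys=True)
print(template_json)
```

Keeping the template in code means the same definition can be reviewed, versioned, and applied to each environment, which is the consistency and repeatability that component 6 calls for.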

In addition to these components, each migration has stakeholder and data access considerations of its own. Section 4.3 provides examples of such considerations.

3. AWS Architecture and Considerations

Alpha Omega recommended a hybrid approach to NOAA where:

  • Cloud-conducive resources and architectural components are migrated with minor configuration updates
  • The remaining resources, services, and workflows that require refactoring are updated and configured using cloud-first design practices to reduce operational burden

Alpha Omega’s Cloud architects, working with our Cloud COE team and NOAA stakeholders, crafted a design compliant with the AWS Well-Architected Framework that was reviewed and vetted by ArcGIS solution SMEs at AWS.

The main considerations for the architecture were:

  1. Redundancy in the application stack. This represents a complete overhaul of the current application stack that does not have redundancy built in.
  2. Consolidation of similar resources. The current application stack grew organically, so many resources were commissioned ad hoc and are inefficient in terms of O&M. Our design alleviates this concern.
  3. Autoscaling of critical resources. This ensures that the application stack is right-sized and the customer pays only for what is used. For instance, the architecture enables the system to scale up and down depending on demand.
  4. Leverage AWS managed services such as Lambda layers, EventBridge, Elastic Beanstalk, Transfer Family, and other services to the extent possible; this also reduces the compute footprint that needs to be managed externally.
  5. Ensure secure connectivity so that the client data and communications never leave the AWS network.
  6. Leverage CloudWatch and CloudTrail services to develop a solution for real-time auditing and monitoring of the cloud resources.
  7. Amazon Elastic File System (EFS) offers an NFS-compatible file system, which was created and mounted on all Amazon EC2 compute instances to share configuration files and graphics content for ArcGIS Server and the NOAAView (PHP) application. This gives the Cloud solution a directory structure similar to that of the on-premises systems, easing the migration burden for this use case.
  8. Amazon Aurora Serverless was used because it offers an on-demand, autoscaling configuration for Amazon Aurora. It automatically starts up, shuts down, and scales capacity up or down based on the application’s needs, obviating the need to manage database instances. It also supports PostGIS, which extends the capabilities of PostgreSQL.
  9. An AWS Application Load Balancer (ALB) with a public-facing IP address serves as the front end for traffic meant for the application stack, which is deployed across 2 Availability Zones.
Figure 1: Notional ArcGIS Architecture in AWS

This AWS architecture provides the most common cloud, data, and workflow components applicable to most geospatial implementations with ArcGIS at their core. Its repeatability and idempotent automation across multiple environments and use cases have been established through multiple iterations and evaluations performed by the Alpha Omega Cloud COE.
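
To make the autoscaling consideration (item 3 above) more concrete, the following sketch shows a simplified proportional scaling calculation. It is an illustration only, not the algorithm AWS uses and not the NOAA configuration; the metric, the target value, and the instance limits are all hypothetical.

```python
import math


def desired_capacity(current: int, observed: float, target: float,
                     minimum: int = 2, maximum: int = 8) -> int:
    """Scale the instance count in proportion to observed vs. target load.

    The default bounds are assumptions for this example.
    """
    if current < 1 or target <= 0 or observed < 0:
        raise ValueError("current must be >= 1, target > 0, observed >= 0")
    # Proportional estimate: keep the per-instance load near the target.
    proposed = math.ceil(current * observed / target)
    # Clamp to the group's bounds so the stack neither scales to zero
    # nor grows without limit.
    return max(minimum, min(maximum, proposed))


# Hypothetical example: 4 servers at 90% average CPU against a 60% target.
print(desired_capacity(current=4, observed=90.0, target=60.0))
```

The lower bound keeps redundancy in place during quiet periods, while the upper bound acts as a cost guardrail during demand spikes.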

3.1 Best Practices – Geospatial Data Migration to AWS

  1. Create an extensive catalog of services that are currently supported.
  2. Create a playbook for the steps required to publish new services.
    • Take advantage of the web interfaces provided by ArcGIS Server Manager and the ArcGIS portal rather than logging directly into the AWS console. This helps contextualize and abstract the data migration and storage process accurately.
  3. As part of moving workloads and services to the AWS Cloud, data from local sources may be replicated by leveraging geodata services.
  4. Data may be stored on EBS volumes, S3 buckets, RDS databases, or EC2 instances, depending on the scenario or use case.
  5. Transferring data from on-premises to AWS is limited by the available bandwidth. It also requires careful planning between the technical staff and the IT/security professionals within the organization.
  6. Some options that are available for data transfer are:
    • Copy data when publishing a service.
    • Create an enterprise geodatabase on AWS and register it as the managed database for a stand-alone or federated ArcGIS Server site. Data is copied to the managed database when services are published.
    • Copy and paste using a Remote Desktop Connection.
    • Use S3 client utilities to move data to S3 buckets, from where it can be moved to EBS volumes.
    • Use AWS Snowball to transfer large volumes of data into S3. This approach is quite cost-effective but requires significant planning and is likely to take more time to complete.
  7. A combination of techniques, such as using AWS Snowball along with near-real-time synchronization of data between on-premises and AWS, might be the most logical solution for moving large volumes of data to AWS.
  8. Factors that affect data transfer times include network bandwidth, proximity to the AWS Region, and time of day.
  9. Techniques such as zipping GIS datasets before transfer can significantly reduce the bytes that need to be transferred, although they incur the overhead of zipping at the source and unzipping at the destination.
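
Because bandwidth is the limiting factor for online transfers, a back-of-the-envelope estimate helps when choosing between network transfer and a device such as AWS Snowball. The sketch below uses the approximately 60 TB figure from the introduction; the link speeds, the 80% utilization factor, and the compression ratio are assumptions for illustration, not measured values.

```python
SECONDS_PER_DAY = 86_400


def transfer_days(data_tb: float, link_mbps: float,
                  utilization: float = 0.8,
                  compression_ratio: float = 1.0) -> float:
    """Estimate the days needed to move data_tb terabytes over a link.

    utilization is the share of the link actually available (assumed);
    compression_ratio is original size divided by compressed size.
    """
    if data_tb < 0 or link_mbps <= 0:
        raise ValueError("data_tb must be >= 0 and link_mbps > 0")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if compression_ratio < 1:
        raise ValueError("compression_ratio must be >= 1")
    # Decimal terabytes to bits, reduced by any compression gain.
    bits_to_send = data_tb * 1e12 * 8 / compression_ratio
    usable_bits_per_second = link_mbps * 1e6 * utilization
    return bits_to_send / usable_bits_per_second / SECONDS_PER_DAY


# Compare a few hypothetical link speeds for the 60 TB archive.
for mbps in (100, 1_000, 10_000):
    print(f"{mbps:>6} Mbps: {transfer_days(60, mbps):6.1f} days")
```

Under these assumptions, 60 TB over a 1 Gbps link takes roughly a week, which is why a bulk transfer combined with ongoing synchronization of new data can be attractive for archives of this size.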

Another important consideration when moving data to AWS is the integrity of the data paths themselves. Map documents reference dozens of data layers located in different paths, and these layers must be moved appropriately to preserve service integrity in AWS. Using relative paths rather than fixed, hard-coded paths is strongly recommended because it makes this process easier.
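
The benefit of relative paths can be shown with plain path arithmetic. The sketch below is not ArcGIS code; it only illustrates why a relative reference survives a move, and all directory and file names in it are hypothetical.

```python
import posixpath


def to_relative(map_document: str, layer_path: str) -> str:
    """Express layer_path relative to the folder holding map_document."""
    folder = posixpath.dirname(map_document)
    return posixpath.relpath(layer_path, start=folder)


def resolve(map_document: str, relative_layer: str) -> str:
    """Resolve a relative layer reference against a map document location."""
    folder = posixpath.dirname(map_document)
    return posixpath.normpath(posixpath.join(folder, relative_layer))


# Hypothetical on-premises layout.
rel = to_relative("/data/gis/maps/sst.mxd", "/data/gis/rasters/sst/latest.mrf")
print(rel)

# The same relative reference still resolves after the whole tree moves
# to a hypothetical shared mount in the cloud.
print(resolve("/mnt/efs/gis/maps/sst.mxd", rel))
```

As long as the map document and its data keep the same layout relative to each other, the reference remains valid wherever the tree is mounted, whereas an absolute path would have to be rewritten for every layer.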

4. The Migration Approach

Our recommended migration approach emphasized thorough discovery and data integrity during migration, yielding a migration process tailored to our customers at NOAA.

4.1 Discovery Phase

Our Cloud team embarked on an extensive discovery process to comprehensively understand the operational components of the current Geospatial and visualization platform. This involved studying data ingestion patterns, complex image processing and transformation workflows, integration with ArcGIS, and methods for dissemination. Our approach included conducting in-depth interviews with developers, system administrators, and stakeholders across NESDIS, ensuring a thorough grasp of all relevant aspects.

Our migration goals were clearly defined: transition to the AWS Cloud with a heightened emphasis on automation, availability, and security. These include implementing automated deployments, enhancing system availability and performance, ensuring data integrity and security, optimizing resources and workflows for improved performance and cost-efficiency, and accommodating future scalability requirements through scalable cloud solutions.

Presently, VizLab disseminates about 5 TB a month to various customers. Data is disseminated on the platform in several ways:

  • Data Exploration Tool provides over 100 environmental variables from a vast array of NOAA satellites, climate models and other observation devices. 
  • Environmental variables are available for download via FTP at full resolution.
  • Geostationary satellite datasets and Polar satellite datasets are available via dedicated web pages.
  • GIS datasets are used in social media outreach for NESDIS, Educational tools (Science on a Sphere, Data in the Classroom), NESDIS website etc.

4.2 Environment Specific Migration Steps

The following steps are specific to the NOAA environment and are heavily influenced by the environment size, visualization SLAs, and workflows for all environment and migration scenarios:

  1. Create an extensive service catalog of all the services
  2. Inventory all existing software packages and services used to prepare data for ingestion into ArcGIS, and establish a clear plan to migrate or implement them in the AWS Cloud
  3. Apply IaC automation and provisioning to all the requisite AWS services that are listed in the implementation solution
  4. Develop a policy and guardrails to govern the unbounded storage growth in the Cloud due to the high-resolution satellite imagery
  5. Apply autoscaling to ArcGIS Enterprise servers
  6. Configure the geodatabase on Amazon Aurora Serverless for PostgreSQL. The geodatabase is a fundamental component of any GIS installation and can be installed on a relational database system
  7. Rebuild the enterprise geodatabase in the Cloud; the enterprise geodatabase comprises ‘mosaics’, which provide ArcGIS with the location on disk of each of the files being served
  8. Publish the entire service catalog that was migrated
  9. Plan for and support side-by-side environments (i.e., the Cloud version and the on-premises version) to comprehensively test and verify all the functionality of the Cloud version and ensure accuracy
  10. Once verification is complete and a nominal period has passed, complete the DNS cutover and point customers to the newly commissioned Cloud version of the ArcGIS installation
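
One simple way to support the side-by-side verification in step 9 is to compare the service catalog built in step 1 against what is published in the Cloud. The sketch below is a minimal illustration; representing a catalog as a mapping of service name to layer count, and the sample service names, are assumptions made for this example.

```python
from typing import Dict, List


def compare_catalogs(on_prem: Dict[str, int],
                     cloud: Dict[str, int]) -> Dict[str, List[str]]:
    """Compare two catalogs that map service name to layer count."""
    shared = set(on_prem) & set(cloud)
    return {
        # Services that have not yet been published in the Cloud.
        "missing": sorted(set(on_prem) - set(cloud)),
        # Services that exist only in the Cloud.
        "unexpected": sorted(set(cloud) - set(on_prem)),
        # Services present in both but with different layer counts.
        "mismatched": sorted(n for n in shared if on_prem[n] != cloud[n]),
    }


def ready_for_cutover(report: Dict[str, List[str]]) -> bool:
    """Treat any discrepancy as a reason to hold the DNS cutover."""
    return not any(report.values())


# Hypothetical catalogs for illustration.
on_prem = {"sst_daily": 3, "goes_east_ir": 1, "snow_cover": 2}
cloud = {"sst_daily": 3, "goes_east_ir": 2}

report = compare_catalogs(on_prem, cloud)
print(report)
print(ready_for_cutover(report))
```

A check of this kind only confirms that the catalogs line up; functional testing of each service is still needed before the cutover in step 10.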

4.3 Additional Customer Specific Considerations

Quite often there may be cases where specialized software is being used in various parts of a given application workflow. These applications may not be readily available in the AWS Cloud platform and may be critical to the success of the migration itself. In such cases, steps need to be undertaken to build or install these applications in the Cloud environment. 

To illustrate, the team encountered this situation with the Geospatial Data Abstraction Library (GDAL), a free and open-source translator library for raster and vector geospatial data formats. One challenge in transitioning from on-premises to AWS was that no readily packaged GDAL was available for Amazon Linux 2023, the client-approved operating system for virtual machines in the AWS Cloud. To tackle this challenge, the team built and packaged a specialized version of GDAL that VizLab required to generate MRF-formatted files with LERC compression for ingestion into ArcGIS.

As part of this process, GDAL was packaged for easier installation and distribution as new hosts are provisioned in AWS. The version of GDAL that was built was demonstrated to create MRF files with LERC compression faster and to produce smaller files. This work drew on our in-depth understanding of how image processing works in ArcGIS, which optimizations are required for the migration to the AWS Cloud, and which precursory steps were essential to ensure success. A key consideration for this build was understanding the specific file transformation requirements of the VizLab development team and customizing the build to suit their needs.

MRF (Meta Raster Format) with LERC compression is a cloud-optimized file format recommended by ArcGIS. The Geospatial Data Abstraction Library (GDAL) is a translator library for vector and raster geospatial data formats.
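
For readers unfamiliar with this conversion, the sketch below assembles a standard gdal_translate command that writes an MRF file with LERC compression. It shows the stock GDAL command line, not the customized build described above, and it assumes a GDAL installation whose MRF driver was built with LERC support. The file names and block size are hypothetical.

```python
import shlex
from typing import List, Optional


def mrf_lerc_command(source: str, destination: str,
                     block_size: Optional[int] = None) -> List[str]:
    """Build a gdal_translate command that writes MRF with LERC compression."""
    command = ["gdal_translate", "-of", "MRF", "-co", "COMPRESS=LERC"]
    if block_size is not None:
        # Optional tile size, passed as an MRF creation option.
        command += ["-co", f"BLOCKSIZE={block_size}"]
    command += [source, destination]
    return command


# Hypothetical input and output file names.
cmd = mrf_lerc_command("goes_band13.tif", "goes_band13.mrf", block_size=512)
print(shlex.join(cmd))
```

The command is returned as a list so that it can be handed to subprocess.run(cmd, check=True) on a host where GDAL is installed, without shell quoting concerns.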

5. Summary

Migrating GIS to the cloud is a mammoth undertaking that requires an extensive understanding of the environment, data flows, stakeholder needs, and visualization requirements. Our Cloud COE supports such complex undertakings and delivers robust cloud architectures built on Cloud-First principles, zero trust practices, and long-term maintainability. In addition to our support for VizLab, our broad cloud work at NOAA has facilitated development of the NESDIS Common Cloud Framework (NCCF), which provides a Cloud-agnostic common enterprise architecture to empower environmental science and weather business functions through secure infrastructure instrumentation. Within NOAA/NESDIS/NCEI, our team has enabled the scalability and resilience of the data archive infrastructure to ensure secure sharing and seamless data availability for environmental research.

Alpha Omega’s Cloud Center of Excellence (COE) was incubated with Cloud Engineering patterns, practices, and techniques gathered from our NOAA experience. We have been an integral partner in NOAA’s journey to cloud modernization since 2018, with strong support from public cloud providers, especially AWS.