AWS Public Sector Blog

A deeper look into the 2022 ASDI Global Hackathon’s first place winner

In 2022, Amazon Web Services (AWS) launched the Amazon Sustainability Data Initiative (ASDI) Global Hackathon, part of a new collaboration with the International Research Centre in Artificial Intelligence, under the auspices of UNESCO. Participants were asked to use their creativity, intelligence, and technical skills to build solutions, using ASDI data and any AWS Cloud services, that support one or more of the United Nations (UN) Sustainable Development Goals (SDGs).

The 2022 ASDI Global Hackathon offered $100K in prizes and was a call to action to the community to explore how open datasets available in the AWS Cloud can contribute to solutions that support the SDGs. The 11 hackathon judges identified six projects that maximize value; are scalable, simple to use, and broadly accessible; show a robust implementation using ASDI data; and demonstrate creativity and strong alignment with the SDGs.

We connected with Jeff McWhirter, the first place winner, to learn more about his Repository for Archiving and MAnaging Diverse Data (RAMADDA).

ASDI: Jeff, tell us about your background and project.

Jeff McWhirter (JW): Through my company Geode Systems, I developed RAMADDA, an open-source data and content platform. This work originated in the Earth sciences and brings together ideas from scientific data repositories, content management systems, wikis, and visualization tools. While focused on science data, RAMADDA can serve as a place for all your digital assets, like content, documents, and data. My goal with RAMADDA is to provide a simple-to-use tool that allows a user to effectively create, organize, and share information and data.

ASDI: What problem did you set out to solve during the hackathon?

JW: ASDI provides complimentary access to numerous science data sets via Amazon Simple Storage Service (Amazon S3). However, navigating these large data spaces, understanding the data and its many formats, and making effective use of this data can be a daunting task.

For example, the NEXRAD on AWS collection provides both real-time and archival data for the Next Generation Weather Radar (NEXRAD) network. However, making use of this data requires navigating through multiple levels of year, month, day, and radar station Amazon S3 bucket nodes. The user ends up with a long list of files, in a format that is difficult to understand and utilize.
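To make the nesting concrete, here is a minimal Python sketch of the levels a user has to walk through to reach one station's files for one day. It is an illustration only, not code from RAMADDA or from the NEXRAD on AWS documentation: the zero-padded `year/month/day/station/` layout, the function names, and the example station ID are assumptions made for this example.

```python
from datetime import date
from typing import List


def nexrad_prefix(day: date, station: str) -> str:
    """Build a key prefix for one station on one day.

    Assumes a zero-padded <year>/<month>/<day>/<station>/ layout.
    """
    return f"{day:%Y/%m/%d}/{station.upper()}/"


def prefix_levels(prefix: str) -> List[str]:
    """Return every intermediate level a user would click through."""
    parts = prefix.strip("/").split("/")
    return ["/".join(parts[: i + 1]) + "/" for i in range(len(parts))]


if __name__ == "__main__":
    # "KFTG" is used here only as an example station identifier.
    target = nexrad_prefix(date(2022, 7, 4), "kftg")
    for level in prefix_levels(target):
        print(level)
```

Four levels separate the bucket root from the first data file, and that is before the file format itself comes into play.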

This access pattern repeats itself across many of the ASDI Amazon S3 data collections. There’s an unwieldy hierarchy yielding difficult-to-use data file formats.

ASDI: How did you create a scalable solution via RAMADDA?

JW: As my submission to the ASDI Global Hackathon, I developed an Amazon S3 RAMADDA Plugin that allows end-users to import an Amazon S3 bucket store into their RAMADDA. This allows for customization of how the data is presented to the user and provides a range of data access mechanisms and services to facilitate the discovery and use of these valuable data sets. This is all executed in a no-code fashion.

In order to showcase the capabilities of RAMADDA, I incorporated 15 ASDI data collections. This collection emphasized a diverse array of earth and social science datasets, encompassing everything from worldwide fisheries to weather and climate data, as well as flood risk, poverty, and diversity data. These sample integrations are designed to illustrate the potential applications of RAMADDA for ASDI datasets, but other groups and individuals can install and run their own version of RAMADDA and integrate any number of ASDI or Amazon S3 data collections.

Here are a few examples of RAMADDA-S3 in action:

NEXRAD Radar

Figure 1. A visual representation of NEXRAD radar stations through the RAMADDA plugin.

For the NEXRAD Radar data collection, RAMADDA shows dates in the navigation tree and geocodes radar stations so they can be displayed on maps. RAMADDA also integrates with data access services like Really Simple Syndication (RSS) feeds, Thematic Real-time Environmental Distributed Data Services (THREDDS) catalogs, and Open-source Project for a Network Data Access Protocol (OPeNDAP) to support programmatic access to the data.

National Oceanic and Atmospheric Administration (NOAA) global surface summary of day

Figure 2. An example of the visualization capabilities of RAMADDA, like interactive charts, here pictured using the NOAA global surface summary of the day dataset.

The NOAA Global Surface Summary of the Day dataset is a large collection of global meteorological data. When it is integrated with RAMADDA, researchers can explore the data through a rich interface and use the built-in visualization capabilities to create interactive charts and maps.

First Street Foundation flood risk data

Figure 3. An example of a county’s flood risk data depicted on a map with RAMADDA.

The First Street Foundation flood risk dataset features flood risk data available at the census tract, zip code, county, congressional district, and state levels. With RAMADDA’s front-end mapping framework, users can see the data visually depicted on a map.

Africa Soil Information Service (AfSIS) soil chemistry

Figure 4. A map interface generated from RAMADDA with the AfSIS Soil Chemistry dataset, which plots locations of the data sources on a geographical map.

The Africa Soil Information Service (AfSIS) Soil Chemistry dataset contains soil infrared spectral data and paired soil property reference measurements for georeferenced soil samples that were collected through the AfSIS project. RAMADDA presents a map interface that allows the user to search the listing and drill down to the selected data sets.

ASDI: How did AWS help in building the solution?

JW: I made use of the AWS SDK for Java to support querying and accessing the data within an Amazon S3 bucket store. This well-designed toolkit helped me rapidly integrate these capabilities within RAMADDA.
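The kind of query involved can be sketched without any network access. Amazon S3 stores a flat set of keys, and a listing request with a prefix and a delimiter groups them into folder-like "common prefixes" plus the objects sitting directly at that level. The Python sketch below simulates that grouping over an in-memory list. It is an assumption-laden illustration, not RAMADDA's implementation and not the AWS SDK for Java API: `list_level` is a hypothetical helper, and the sample keys and file names are invented for the example.

```python
from typing import Iterable, List, Tuple


def list_level(
    keys: Iterable[str], prefix: str = "", delimiter: str = "/"
) -> Tuple[List[str], List[str]]:
    """Group flat keys under `prefix` into (sub-folders, direct files).

    Mimics the folder-like view of a prefix/delimiter listing, using
    only an in-memory list of keys.
    """
    folders = set()
    files = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Something follows the delimiter, so this key lives deeper.
            folders.add(prefix + head + delimiter)
        elif rest:
            files.append(key)
    return sorted(folders), sorted(files)


# Hypothetical keys, invented for illustration only.
SAMPLE_KEYS = [
    "2022/07/04/KFTG/KFTG20220704_000512_V06",
    "2022/07/04/KFTG/KFTG20220704_001034_V06",
    "2022/07/04/KTLX/KTLX20220704_000231_V06",
    "2022/07/05/KFTG/KFTG20220705_000102_V06",
]

if __name__ == "__main__":
    stations, _ = list_level(SAMPLE_KEYS, "2022/07/04/")
    print(stations)
```

Repeating the call one level at a time yields the navigation tree that a browsing interface can present to the user.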

With the support of the ASDI Global Hackathon, I plan on continuing these efforts around data management and providing access to important data sets provided by the ASDI.

ASDI: Who are you aiming to help with this project?

JW: The intended audience for this work is two-fold. First, for end-user communities, this work makes it simpler to explore, understand, and access the rich trove of science data provided by ASDI. Second, this work provides enabling infrastructure for the science community to develop these types of interfaces for their respective data collections. Anyone can set up a RAMADDA repository of their own and build out these enriched interfaces into other data sets.

On the technical front, I intend to implement a two-way integration with Amazon S3, to both read data from Amazon S3 and write data to Amazon S3. RAMADDA has rich support for creating, uploading, and ingesting data. This helps provide a seamless interface for end-users to upload their data into Amazon S3 via the RAMADDA interface.

ASDI: How can people learn more about RAMADDA?

JW: There are currently 15 ASDI datasets available on the RAMADDA site, and I am working with groups at the University Corporation for Atmospheric Research (UCAR) and NOAA to integrate these services into their respective RAMADDA repositories. I hope that organizations working in the area of sustainable data can become aware of how tools like RAMADDA can enable them to provide much richer interfaces into their data.

If you want to explore RAMADDA’s Amazon S3 integration, it is simple to run RAMADDA on your own servers or even on your laptop. RAMADDA is built with Java, so the only requirement is having JDK 8 or higher installed. Get started with these step-by-step instructions.

Visit the ASDI main page to learn more. Explore other datasets hosted in the ASDI Data Catalog. If you are interested in hosting your data on AWS, consider exploring the AWS Open Data Sponsorship Program.

Dr. Jeff McWhirter

Dr. Jeff McWhirter received his PhD in computer science in 1995 from the University of Colorado and has been building software for 40 years. His primary focus is on software frameworks that support data rich interactive visual environments. Since 2001, he has been actively engaged in the Earth science community working with a number of academic and research organizations.

Angela Wu

Angela Wu is the content manager on the Amazon Web Services (AWS) worldwide public sector grants team. She loves telling stories about how technology connects communities and inspires social change.

Ilan Gleiser

Ilan Gleiser is a principal machine learning specialist on the Amazon Web Services (AWS) Global Impact Computing team focusing on circular economy, responsible artificial intelligence (AI), and environmental, social, and governance (ESG). He is an expert advisor of digital technologies for circular economy with the United Nations. Prior to AWS, he led AI enterprise solutions at Wells Fargo. Ilan’s background is in quantitative finance. He spent 10 years as head of Morgan Stanley’s Algorithmic Trading Division in San Francisco.