AWS Public Sector Blog

Analyzing vehicle fleet location data from a data lake with AWS

In a previous blog post, we discussed how to visualize data lake address datasets on a map with Amazon Athena and Amazon Location Service geocoding. This is a companion post that shows how to apply the same concepts in reverse, using reverse geocoding instead of geocoding to visualize vehicle fleet location history data.

At Amazon Web Services (AWS), many public sector customers operate fleets of vehicles (e.g. emergency response, public transportation) that generate location data, which is ultimately stored in a data lake. These customers frequently ask how they can quickly visualize this data and extract insights that can help them optimize how they operate their vehicle fleets. Amazon Location Service provides reverse geocoding functionality, which can convert geographic coordinates (latitude and longitude points) to addresses (street, city, postal code). Common questions are:

  1. What is the simplest way to use Amazon Location’s reverse geocoding functionality in an ad hoc fashion on location data in a data lake without having to reverse geocode the entire dataset?
  2. Once the location data is reverse geocoded, what is the simplest way to quickly visualize the results to get insights from the data?

In this post, learn how to use Amazon Athena and Amazon Location to perform ad hoc reverse geocoding on a notional dataset of vehicle location history, and visualize the results on an Amazon QuickSight map.

Solution overview: Visualizing vehicle location history data with Amazon Athena and Amazon Location Service reverse geocoding

Figure 1. Architecture diagram showing the solution described in this blog post. The main components are an Amazon S3 bucket, Amazon Athena, AWS Glue, AWS Lambda, Amazon Location Service, and Amazon QuickSight.

Figure 1 depicts the solution architecture and resources deployed by the AWS CloudFormation template, described in more detail in the Walkthrough section. In the solution workflow:

  1. The CloudFormation template deploys an AWS-hosted Amazon Simple Storage Service (Amazon S3) bucket, which includes a .csv file of the example dataset. In this use case, the dataset contains coordinate data for four different vehicles in the Kansas City Metro area.
  2. The CloudFormation template deploys an AWS Glue database and table, which define the data format of the .csv file in Amazon S3 so that it can be queried from Athena.
  3. Athena launches an SQL query against the data in Amazon S3. An Athena User Defined Function (UDF) is used to launch an AWS Lambda function, which in turn calls Amazon Location to reverse geocode coordinate data into address data (City, Zip Code, County, State, and Country). The result of the SQL query is written back to Amazon S3 and an associated AWS Glue table is created by Athena.
  4. The new reverse geocoded address dataset is loaded into QuickSight and visualized for end users.
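The reverse geocoding call in step 3 can be sketched as follows. This is a minimal illustration, not the code of the deployed Lambda function: the helper name `reverse_geocode`, the output keys, and the mapping of `SubRegion` to county and `Region` to state are assumptions based on the Amazon Location place fields. The `client` argument is expected to behave like `boto3.client("location")`.

```python
from typing import Any, Dict, Optional


def reverse_geocode(
    client: Any, index_name: str, lat: float, lon: float
) -> Optional[Dict[str, Optional[str]]]:
    """Return address fields for one coordinate pair, or None if nothing matched.

    `client` is assumed to expose search_place_index_for_position, as the
    boto3 "location" client does. `index_name` is the place index to query.
    """
    response = client.search_place_index_for_position(
        IndexName=index_name,
        Position=[lon, lat],  # Amazon Location expects [longitude, latitude]
        MaxResults=1,
    )
    results = response.get("Results", [])
    if not results:
        return None
    place = results[0]["Place"]
    # Output key names are illustrative; missing fields come back as None.
    return {
        "city": place.get("Municipality"),
        "postal_code": place.get("PostalCode"),
        "county": place.get("SubRegion"),
        "state": place.get("Region"),
        "country": place.get("Country"),
    }
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real place index.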

Prerequisites

To complete this walkthrough, you will need access to an AWS account with sufficient AWS Identity and Access Management (IAM) permissions to create these AWS resources, and access to the AWS Management Console.

You will also need to have QuickSight enabled within your AWS account. If you do not already have QuickSight enabled, sign up for QuickSight here.

Walkthrough

In this walkthrough, you will create AWS resources using AWS CloudFormation and additional AWS resources using the AWS Management Console.

Request a service quota increase for Amazon Location Service

To enable Athena to call the Amazon Location reverse geocoding API many times in rapid succession, as this walkthrough requires, you need to request a service quota increase on the Amazon Location SearchPlaceIndexForPosition API endpoint. To do this, navigate to the request page in the AWS Management Console. Make sure you are in the desired AWS Region before making the request.

Select the Request quota increase button (Figure 2), which opens a request screen (Figure 3). From this screen, set the Change quota value to 100 and select the Request button. Your request should be granted automatically after a short period of time. Refresh the page to see the status of your service quota increase request.
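If you prefer to script this request rather than use the console, the Service Quotas API offers the same operation. The sketch below is an illustration under assumptions: the helper names are hypothetical, the quota is located by searching quota names for "SearchPlaceIndexForPosition" rather than by a hard-coded quota code, and the Amazon Location service code (believed to be "geo") should be confirmed in your account. The `client` argument is expected to behave like `boto3.client("service-quotas")`.

```python
from typing import Any, Dict, Optional


def find_quota_code(client: Any, service_code: str, name_fragment: str) -> Optional[str]:
    """Page through a service's quotas and return the first code whose name matches."""
    token: Optional[str] = None
    while True:
        kwargs: Dict[str, Any] = {"ServiceCode": service_code}
        if token:
            kwargs["NextToken"] = token
        page = client.list_service_quotas(**kwargs)
        for quota in page.get("Quotas", []):
            if name_fragment in quota.get("QuotaName", ""):
                return quota["QuotaCode"]
        token = page.get("NextToken")
        if not token:
            return None


def request_increase(
    client: Any, service_code: str, name_fragment: str, desired_value: float
) -> Any:
    """Request a new value for the matching quota; raise if no quota matches."""
    code = find_quota_code(client, service_code, name_fragment)
    if code is None:
        raise LookupError(f"No quota matching {name_fragment!r} for {service_code!r}")
    return client.request_service_quota_increase(
        ServiceCode=service_code, QuotaCode=code, DesiredValue=desired_value
    )
```

For this walkthrough, the call would look like `request_increase(client, "geo", "SearchPlaceIndexForPosition", 100.0)`, with the service code treated as an assumption to verify.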

Figure 2. Service quota request page for the rate of SearchPlaceIndexForPosition API requests.

Figure 3. How to set the quota value for the Rate of SearchPlaceIndexForPosition service quota request.

Deploy the Amazon Athena User Defined Function

Athena UDFs can be used to call AWS Lambda functions when performing SQL queries from Athena. From an AWS Lambda function, you can call Amazon Location to perform geocoding operations. In the aws-samples GitHub repository, there is a sample project that uses the AWS Serverless Application Repository to deploy this AWS Lambda function, along with the required Amazon Location place index that enables you to perform geocoding operations.

To deploy this in your own AWS account, first sign in to your AWS Management Console. Make sure you are in your desired AWS Region, then select this link to open an AWS Management Console page where you can edit application settings (Figure 4).

Figure 4. Application settings that can be adjusted before deploying the Athena UDF. The figure highlights the required settings change to ReservedConcurrentExecutions.

As highlighted in Figure 4, change the value of ReservedConcurrentExecutions to “4.” Check the box that acknowledges the creation of IAM roles, and select the Deploy button. Setting ReservedConcurrentExecutions to 4 prevents too many concurrent Lambda function executions from exceeding the rate limit that you set up earlier for the Amazon Location SearchPlaceIndexForPosition API.
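The relationship between the concurrency setting and the rate quota can be approximated with simple arithmetic. The sketch below assumes each Lambda execution issues its Amazon Location calls one after another, so the peak request rate is roughly the concurrency divided by the time per call. The function names and the latency figures in the usage note are illustrative, not measurements.

```python
def peak_request_rate(concurrency: int, seconds_per_call: float) -> float:
    """Approximate peak requests per second for sequential callers."""
    if seconds_per_call <= 0:
        raise ValueError("seconds_per_call must be positive")
    return concurrency / seconds_per_call


def stays_under_quota(
    concurrency: int, seconds_per_call: float, quota_per_second: float
) -> bool:
    """True if the approximate peak rate does not exceed the quota."""
    return peak_request_rate(concurrency, seconds_per_call) <= quota_per_second
```

Under this model, four concurrent executions would have to average under 40 ms per call before they could exceed a quota of 100 requests per second. For example, `stays_under_quota(4, 0.05, 100)` holds for a hypothetical 50 ms call, while a much higher concurrency at the same latency would not.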

Deploy the solution CloudFormation template

An AWS CloudFormation template is provided as part of this blog post solution that deploys a number of AWS resources, including:

  • An Amazon S3 bucket, which is populated with the example dataset of vehicle history data in the Kansas City metro area.
  • An AWS Glue database and table that have the example dataset’s structure predefined, which in turn allows you to start querying it from Athena with no additional configuration.
  • An example Athena SQL query that you will use to perform reverse geocoding operations.

To deploy this template into your AWS account, select this link which opens the AWS Management Console and initiates the solution’s CloudFormation template deployment process. Check the I acknowledge that AWS CloudFormation might create IAM resources box (Figure 5); then select the Create stack button.

Figure 5. Shows how to deploy the CloudFormation template to your AWS account that creates the required resources for this solution.

Reverse geocode coordinates using Amazon Athena

With the solution deployed, you can now perform some reverse geocoding. Navigate to the Athena section of the AWS Management Console and open the Query Editor. Change your workgroup to LocationServiceExampleWorkgroup (Figure 6), and choose the Acknowledge button in the resulting screen (Figure 7).

Figure 6. Shows how to change your Amazon Athena workgroup in the AWS Management Console.

Figure 7. Shows how to acknowledge the changes to your Amazon Athena workgroup.

In the Query Editor page, navigate to the Saved queries tab and choose the saved query named VehicleHistoryExample (Figure 8).

Figure 8. Shows how to open an Amazon Athena saved query to perform the subsequent reverse geocoding operation.

Select the Run button (Figure 9) to launch the saved SQL query. If you look at the contents of this SQL query, you will see that it passes coordinate values into a function called “search_place_index_for_position,” which in turn uses AWS Lambda to call Amazon Location to reverse geocode these coordinates.
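To give a sense of what such a query looks like, the sketch below builds a statement using the Athena external function syntax and submits it through an Athena client. It is not the saved VehicleHistoryExample query: the UDF signature (longitude and latitude as DOUBLE, returning VARCHAR), the Lambda function name, and the source table name are all assumptions, and the helper names are hypothetical. The `client` argument is expected to behave like `boto3.client("athena")`.

```python
from typing import Any

UDF_NAME = "search_place_index_for_position"  # function name mentioned in this post


def build_reverse_geocode_query(source_table: str, lambda_name: str) -> str:
    """Build a SELECT that reverse geocodes each row through the Athena UDF.

    Assumptions: the UDF takes (longitude, latitude) as DOUBLE and returns a
    VARCHAR, and the source table has deviceid, lat, and lon columns.
    """
    return (
        f"USING EXTERNAL FUNCTION {UDF_NAME}(longitude DOUBLE, latitude DOUBLE)\n"
        f"RETURNS VARCHAR\n"
        f"LAMBDA '{lambda_name}'\n"
        f"SELECT deviceid, lat, lon,\n"
        f"       {UDF_NAME}(lon, lat) AS place\n"
        f"FROM {source_table}"
    )


def start_query(client: Any, query: str, workgroup: str) -> str:
    """Submit the query to the given workgroup and return its execution ID."""
    response = client.start_query_execution(QueryString=query, WorkGroup=workgroup)
    return response["QueryExecutionId"]
```

In this walkthrough the workgroup would be LocationServiceExampleWorkgroup. The saved query also writes its result to a new table, which this sketch leaves out.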

Figure 9. Shows how to launch the Amazon Athena SQL query that performs the reverse geocoding operation.

After a few minutes, the query will complete. You should see a new AWS Glue table called “vehicle_locations_geocoded” that contains new values from our reverse geocoding operation (Figure 10).

Figure 10. Shows the results of the Amazon Athena query, which is a new AWS Glue table that includes new values from the reverse geocoding operation.

Integrate the data with Amazon QuickSight

Now that you have performed the required reverse geocoding and have produced a new AWS Glue table, you can visualize the results in QuickSight.

Start by logging into QuickSight through the AWS Management Console. Once you are logged in, navigate to the “QuickSight access to AWS services” settings page. Under Allow access and autodiscovery for these resources, make sure that both Amazon S3 and Athena are checked (Figure 11). Then choose the Select S3 buckets link and check the vehicle-locations-* Amazon S3 bucket that was created earlier, along with the Write permission for Athena Workgroup checkbox to the right. Select Finish to exit this window, then select Save.

Figure 11. Shows the Amazon QuickSight permissions page that allows you to grant permissions for Amazon QuickSight to access data in Amazon S3 via Amazon Athena.

Next, you will add a dataset in QuickSight so that you can import the data via Athena.

Navigate to the QuickSight Datasets section and select the New dataset button. As seen in Figure 12, select the Athena option, then enter “vehicle_location_history” in the Data source name field before selecting the Create data source button.

Figure 12. Shows the Amazon Athena add data source workflow.

On the next screen, Choose your table (Figure 13), select the vehicle_locations database and the vehicle_locations_geocoded table before choosing the Select button.

Figure 13. Shows the Amazon Athena table selection workflow.

On the next screen, Finish dataset creation (Figure 14), leave the defaults and select the Visualize button.

Figure 14. Shows the final step in setting up the Amazon QuickSight dataset creation process.

After a short time, the data will be imported into QuickSight SPICE, which is an in-memory engine that increases performance. Now let’s create some visuals using QuickSight.

Create visualizations in Amazon QuickSight

The first visualization you will create uses the Filled Map visual type. This visualization gives you insight about where the vehicles spend the most time by counting the times a vehicle was recorded within each postal code. As seen in Figure 15, select the Visual type of Filled Map, then drag and drop postal_code into the Location field well.
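The aggregation behind this filled map is a count of records per postal code. As a rough illustration of that calculation outside QuickSight, the sketch below counts rows by their `postal_code` value. The function name and the row shape (a list of dictionaries) are assumptions for the example.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def count_by_postal_code(rows: Iterable[Dict[str, Optional[str]]]) -> Counter:
    """Count how many location records fall within each postal code.

    Rows without a postal code (missing or empty) are skipped.
    """
    return Counter(row["postal_code"] for row in rows if row.get("postal_code"))
```

Calling `count_by_postal_code(rows).most_common(5)` would list the five postal codes with the most recorded vehicle positions.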

Figure 15. Shows an Amazon QuickSight map visualization showing how many times vehicles were in different postal codes.

For the next visualization, we visualize the distribution of vehicle locations by city name using a pie chart. As seen in Figure 16, select the Visual type of Pie chart, then drag and drop Municipality into the Group/Color field well. Now you can see which cities were most frequently visited by the vehicles.

Figure 16. Shows an Amazon QuickSight pie chart that shows which cities the vehicles visit most frequently.

To check for any differences between the cities that the vehicles visit, you can next create a pie chart for each vehicle. To do this, keep the same visualization as the previous step, and drag and drop deviceid into the Small multiples field well (Figure 17). Note that they are fairly similar, but drilling down into each vehicle produces different counts for the different cities.
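The small multiples view amounts to grouping the records by vehicle and then counting cities within each group. A minimal sketch of that grouping follows. The function name and row shape are assumptions, and the field names `deviceid` and `Municipality` follow the names used in this walkthrough.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable


def city_counts_by_vehicle(rows: Iterable[Dict[str, str]]) -> Dict[str, Counter]:
    """Return, for each deviceid, a Counter of how often each city appears."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for row in rows:
        counts[row["deviceid"]][row["Municipality"]] += 1
    return dict(counts)
```

Each inner counter corresponds to one pie chart in the small multiples layout.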

Figure 17. Shows an Amazon QuickSight pie chart for each vehicle showing which cities each vehicle visits most frequently.

Finally, you can visualize the actual coordinates of the vehicle locations on a map. Choose the Points on map visual type (Figure 18), then drag and drop the lat and lon fields into the Geospatial field well. To differentiate by vehicle, drag and drop deviceid into the Color field well.

Figure 18. Shows an Amazon QuickSight map visualization with all the vehicle locations color coded by vehicle.

Clean up

If you would like to remove all the resources created in this walkthrough, perform the following steps:

1. Delete the QuickSight analysis and dataset from the QuickSight user interface.

2. If you were not already using QuickSight and want to end your trial subscription, follow these instructions.

3. From the Athena section of the AWS Management Console, delete the “vehicle_locations_geocoded” table that was created by our SQL query as seen in Figure 19.

4. Delete all objects from the Amazon S3 bucket “vehicle-locations-*” that was created by CloudFormation using the instructions in the Amazon S3 User Guide.

5. Delete the CloudFormation stacks called “serverlessrepo-AmazonLocationUDFs” and “location-service-athena-example.”
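The stack deletion in step 5 can also be scripted. The sketch below assumes the earlier steps are already done, in particular that the Amazon S3 bucket has been emptied, since CloudFormation generally cannot delete a bucket that still contains objects. The helper name is hypothetical, and the `client` argument is expected to behave like `boto3.client("cloudformation")`.

```python
from typing import Any, List, Sequence

# Stack names as given in the clean-up steps of this post.
STACK_NAMES = (
    "serverlessrepo-AmazonLocationUDFs",
    "location-service-athena-example",
)


def delete_stacks(client: Any, names: Sequence[str] = STACK_NAMES) -> List[str]:
    """Request deletion of each stack and return the names that were submitted.

    delete_stack only starts the deletion; it does not wait for completion.
    """
    submitted = []
    for name in names:
        client.delete_stack(StackName=name)
        submitted.append(name)
    return submitted
```

Deletion runs asynchronously, so check the CloudFormation console afterward to confirm that both stacks are gone.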

Figure 19. Shows how to delete the “vehicle_locations_geocoded” table from the Athena user interface.

Troubleshooting

Note that QuickSight can show up to 10,000 points on a map at once. If you experiment with this solution with your own data and have more than 10,000 points to show, QuickSight will render the first 10,000 points.
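One way to work within this limit is to thin the dataset before visualizing it, so that the points shown are spread across the whole history instead of being only the first 10,000. The sketch below is one possible approach, not a QuickSight feature: it keeps evenly spaced records from an ordered list. The function name is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def thin_points(points: Sequence[T], limit: int = 10_000) -> List[T]:
    """Return at most `limit` items, evenly spaced across the input order."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    n = len(points)
    if n <= limit:
        return list(points)
    # n > limit, so the integer indices below are strictly increasing.
    return [points[(i * n) // limit] for i in range(limit)]
```

Even spacing preserves the overall shape of each route reasonably well when the records are ordered by time. If some vehicles report far more often than others, thinning per vehicle may give a fairer picture.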

If you are attempting to do this walkthrough on a new AWS account, your service quota request for Amazon Location may not be automatically accepted, and you may have to work with AWS support to get the service quota adjusted.

Similarly, if you are using a new AWS account, you may have an AWS Lambda “Concurrent Executions” service quota limit that is set to 10. If this is the case, then deploying the Athena UDF solution will fail. If this happens, you may request a service quota increase for AWS Lambda Concurrent Executions with a value of at least 20.

Conclusion

In this post, we explored visualization options for location history data from a notional fleet of vehicles, using serverless services on AWS to provide an interactive, intuitive interface for understanding this dataset. This solution can be applied to various use cases, including vehicle tracking and IoT device asset tracking, to help organizations better manage and measure their fleet networks.

Contact us if you have questions about this solution or are interested in learning more about AWS in the Public Sector.

If you have addresses stored in a data lake and need to geocode them, read the related blog post, “Visualize data lake address datasets on a map with Amazon Athena and Amazon Location Service geocoding.”

If you have addresses stored in a relational database instead of a data lake, read about a similar solution in the blog post, “Access Amazon Location Service from Amazon Aurora.”

