Streamlining Analytics with Amazon Selling Partner API Data Ingestion into a Data Lake

Using SP-API with AWS serverless and managed services to easily process all your data

by Hina V., Sr Solutions Architect, Selling Partner Developer Services; Manikanta G., Data & ML Engineer, AWS WWCO ProServe; Daman O., Solution Architect, AWS SA – CPG Solutions; Jani S., Principal Solutions Architect, AWS BDSI; Norman K., CPG BDM, AWS BDSI; and Anuj R., Global Leader Solutions Architecture, Selling Partner Developer Services | May 24, 2023

Update October 2, 2023 - This post has been updated to remove outdated AWS IAM ARN and/or AWS Signature Version 4 information.

As an Amazon Selling Partner, you are likely dealing with a vast amount of data related to your products, sales, and traffic. This data can be overwhelming and difficult to manage, especially if you don't have the right tools and processes. However, using Amazon Selling Partner APIs (SP-API) with Amazon Web Services (AWS) serverless and managed services, you can easily ingest and process all of your data, providing valuable insights that can help improve sales and grow your business.

This post provides AWS architecture guidance for building a modern data lake, a useful tool for handling large amounts of data. We'll explore the high-level design of the architecture so that you understand how its parts fit together.


Authentication and authorization

To establish secure data access, it is necessary to implement authentication and authorization for SP-API. The following diagram provides an overview of this workflow within SP-API.

Authentication and authorization workflow in SP-API

The authorization process for SP-API relies on Login with Amazon (LWA), which is Amazon's implementation of OAuth 2.0. During this process, your Selling Partner application interacts with both Amazon's pages and your website to gain authorization. The web browser passes parameters back and forth between your website and Amazon for each selling partner action.

To implement OAuth 2.0 authorization, you must set up your website to:

  1. Accept and handle the parameters passed by Amazon, and
  2. Redirect the web browser while passing parameters to Amazon. Refer to Website authorization workflow for more information.
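
The two website responsibilities above can be sketched as follows. This is a minimal illustration, not the official implementation: the consent URL and the parameter names `application_id`, `state`, `selling_partner_id`, and `spapi_oauth_code` are assumptions that should be checked against the Website authorization workflow documentation, and the function names are hypothetical.

```python
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Assumption: Seller Central consent page; other regions use other hosts.
CONSENT_URL = "https://sellercentral.amazon.com/apps/authorize/consent"


def build_authorization_url(application_id: str, state: str) -> str:
    """Build the URL your site redirects the browser to (parameters to Amazon)."""
    query = urlencode({"application_id": application_id, "state": state})
    return f"{CONSENT_URL}?{query}"


def handle_redirect(callback_url: str, expected_state: str) -> dict:
    """Accept the parameters Amazon passes back and verify the state value."""
    params = {k: v[0] for k, v in parse_qs(urlparse(callback_url).query).items()}
    returned_state = params.get("state", "")
    # Constant-time comparison; a mismatch suggests a forged request.
    if not hmac.compare_digest(returned_state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: rejecting authorization response")
    return {
        "selling_partner_id": params["selling_partner_id"],
        "oauth_code": params["spapi_oauth_code"],
    }
```

The `state` value should be an unpredictable, per-session string (for example from `secrets.token_urlsafe()`), so that a response can be tied to the browser session that started the workflow.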

Once authorized, you obtain a Login with Amazon (LWA) refresh token. This token should be re-authorized every 365 days. It is essential to rotate your LWA credentials for all applications every 180 days.

In this AWS architecture guidance, AWS Secrets Manager stores your application's LWA credentials and the LWA refresh token. At run time, the credentials and refresh token are exchanged for an LWA access token, which is included in API calls.
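
As a rough sketch of how a Lambda function might read these values, the helper below fetches one secret and checks that it holds the expected fields. The JSON layout of the secret (keys `client_id`, `client_secret`, `refresh_token`) and the function name are assumptions made for illustration; `secrets_client` is expected to be a boto3 Secrets Manager client.

```python
import json

# Assumption: the secret is stored as a JSON document with these keys.
REQUIRED_KEYS = {"client_id", "client_secret", "refresh_token"}


def load_lwa_secret(secrets_client, secret_id: str) -> dict:
    """Fetch the LWA credentials and refresh token from AWS Secrets Manager."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    missing = REQUIRED_KEYS - secret.keys()
    if missing:
        raise KeyError(f"secret {secret_id!r} is missing keys: {sorted(missing)}")
    return secret
```

In a Lambda function the client would typically be created once, outside the handler, with `boto3.client("secretsmanager")`.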

SP-API integration

To connect with Selling Partner APIs, you first call the Login with Amazon (LWA) endpoint with your application's LWA credentials and the LWA refresh token you obtained. In return, you receive an LWA access token that expires after one hour. Pass this token in the request header of each Selling Partner API HTTP request.
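
The token exchange could look roughly like the sketch below, which uses only the Python standard library. The helper names are hypothetical, the 60-second safety margin is an arbitrary choice, and the sketch assumes the access token is sent in the `x-amz-access-token` request header.

```python
import json
import urllib.parse
import urllib.request

LWA_TOKEN_URL = "https://api.amazon.com/auth/o2/token"


def build_token_request(client_id: str, client_secret: str,
                        refresh_token: str) -> urllib.request.Request:
    """Build the POST request that exchanges a refresh token for an access token."""
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        LWA_TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def parse_token_response(raw: bytes, now: float) -> dict:
    """Extract the access token and compute when it should be refreshed."""
    payload = json.loads(raw)
    lifetime = int(payload.get("expires_in", 3600))
    # Refresh 60 seconds early so a token never expires mid-request.
    return {"access_token": payload["access_token"], "expires_at": now + lifetime - 60}


def sp_api_headers(access_token: str) -> dict:
    """Headers for an SP-API call (assumed header name: x-amz-access-token)."""
    return {"x-amz-access-token": access_token, "Content-Type": "application/json"}
```

To send the request, pass the result of `build_token_request` to `urllib.request.urlopen` and feed the response body to `parse_token_response` together with `time.time()`. Caching the token until `expires_at` avoids calling the LWA endpoint on every API call.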

Serverless reports application reference architecture

In this AWS architecture, you use AWS Step Functions to create a serverless application that interacts with the Amazon Selling Partner Reports API. The Reports API has different regional endpoints, marketplace IDs, and report-type configurations, which are stored in AWS Systems Manager Parameter Store. To simplify the process, dedicate an AWS Lambda function to creating reports. This function reads the appropriate report type configuration, marketplace IDs, and other necessary HTTP headers to complete the API call. The following diagram depicts the architecture behind this workflow.

Serverless Reports application reference architecture
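
To make the report-creation step more concrete, here is a minimal sketch of how such a Lambda function might assemble a createReport request from stored configuration. The `REPORT_CONFIG` dictionary is a hypothetical stand-in for values read from Parameter Store, the function name is illustrative, and the request is returned as a plain dictionary rather than sent.

```python
import json
from datetime import datetime, timezone

# Hypothetical configuration, standing in for values read from Parameter Store.
REPORT_CONFIG = {
    "endpoint": "https://sellingpartnerapi-na.amazon.com",
    "marketplace_ids": ["ATVPDKIKX0DER"],
    "report_type": "GET_SALES_AND_TRAFFIC_REPORT",
}


def build_create_report_request(config: dict, start: datetime, end: datetime,
                                access_token: str) -> dict:
    """Assemble the pieces of a Reports API createReport call."""
    body = {
        "reportType": config["report_type"],
        "marketplaceIds": config["marketplace_ids"],
        # Timestamps are normalized to UTC and sent in ISO 8601 format.
        "dataStartTime": start.astimezone(timezone.utc).isoformat(),
        "dataEndTime": end.astimezone(timezone.utc).isoformat(),
    }
    return {
        "method": "POST",
        "url": f"{config['endpoint']}/reports/2021-06-30/reports",
        "headers": {
            "x-amz-access-token": access_token,
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }
```

Keeping the endpoint, marketplace IDs, and report type in configuration means the same function can serve several regions and report types without code changes.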

In accordance with Reports API best practices, use a Lambda function to subscribe to the SP-API REPORT_PROCESSING_FINISHED notification via the Notifications API (Note: this notification is not yet available for vendors). This allows the architecture to be event-driven and receive notifications when the report processing is complete.
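
A notification-handling Lambda function might filter incoming messages along these lines. This sketch assumes the notifications are delivered through an Amazon SQS queue that triggers the function, and the payload field names (`notificationType`, `reportProcessingFinishedNotification`, `processingStatus`, `reportDocumentId`) are assumptions that should be verified against the published notification schema.

```python
import json


def extract_finished_reports(sqs_event: dict) -> list:
    """Return the reports that finished successfully from an SQS-triggered event."""
    finished = []
    for record in sqs_event.get("Records", []):
        message = json.loads(record["body"])
        if message.get("notificationType") != "REPORT_PROCESSING_FINISHED":
            continue  # Ignore other notification types on the same queue.
        detail = message["payload"]["reportProcessingFinishedNotification"]
        # Only reports with status DONE have a document to download.
        if detail.get("processingStatus") == "DONE":
            finished.append({
                "report_id": detail["reportId"],
                "report_type": detail["reportType"],
                "report_document_id": detail["reportDocumentId"],
            })
    return finished
```

The returned document IDs can then be handed to the next step of the workflow, which downloads and stores each report.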

Another Lambda function is then used to retrieve the report from the SP-API endpoint, process the data, and store the report in an Amazon Simple Storage Service (Amazon S3) bucket. To ensure data protection, you should use AWS Key Management Service (AWS KMS) to encrypt data.
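
The retrieval-and-store step could be sketched as below. The function names are hypothetical, `s3_client` is expected to be a boto3 S3 client, and the sketch assumes that a downloaded report document may be GZIP-compressed, as indicated by the `compressionAlgorithm` field returned with the report document details.

```python
import gzip
from typing import Optional


def decode_report_document(raw: bytes, compression_algorithm: Optional[str] = None) -> bytes:
    """Decompress a downloaded report document when it is marked as GZIP."""
    if compression_algorithm == "GZIP":
        return gzip.decompress(raw)
    return raw


def store_report(s3_client, bucket: str, key: str, data: bytes, kms_key_id: str) -> None:
    """Write the report to Amazon S3 with server-side encryption under an AWS KMS key."""
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=data,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
    )
```

Setting default encryption on the bucket is an alternative to passing the key on every write; passing it explicitly, as here, makes the intent visible in the code.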

Use an API client to handle rate limiting. The Selling Partner APIs return the x-amzn-RateLimit-Limit response header, which reports the current rate limit for the operation you are calling. An API client that tracks this value and paces its requests accordingly avoids exceeding the limits and being throttled.
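
One common way for a client to pace its requests is a token bucket, sketched below. This is an illustrative technique rather than the method prescribed by the post: the class and method names are hypothetical, and the sketch assumes the rate limit arrives in a response header named `x-amzn-RateLimit-Limit`, expressed in requests per second.

```python
import time


class TokenBucket:
    """Client-side rate limiter: `rate` tokens per second, up to `burst` tokens."""

    def __init__(self, rate: float, burst: float, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.updated = clock()

    def _refill(self) -> None:
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.updated) * self.rate)
        self.updated = now

    def try_acquire(self) -> float:
        """Consume a token and return 0.0, or return the seconds to wait."""
        self._refill()
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.rate

    def update_from_headers(self, headers: dict) -> None:
        """Adopt the rate reported by the API, if the header is present."""
        lowered = {k.lower(): v for k, v in headers.items()}  # names are case-insensitive
        value = lowered.get("x-amzn-ratelimit-limit")
        if value and float(value) > 0:
            self._refill()  # Settle tokens at the old rate before switching.
            self.rate = float(value)
```

A caller would sleep for the returned number of seconds whenever `try_acquire` is non-zero, then try again, and call `update_from_headers` after each response.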

Serverless Catalog and Listings Items reference architecture

Next, use AWS Step Functions to streamline the development and management of a distributed application that integrates with the Amazon Selling Partner APIs. This allows for the creation and execution of workflows that manage the various components of the application. The following diagram depicts the architecture behind this workflow.

Serverless Catalog and Listings Items reference architecture

One Lambda function can search Amazon's catalog items and associated information, using identifiers such as ASIN/SKU/EAN or by keywords like brand-specific search terms. The results can be stored in an Amazon S3 bucket for further processing.
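
As an illustration, the catalog search request might be built as follows. The function name is hypothetical, and the sketch assumes the Catalog Items API version 2022-04-01, in which a search uses either identifiers (with an identifier type) or keywords, but not both.

```python
from urllib.parse import urlencode


def build_catalog_search_url(endpoint: str, marketplace_ids: list,
                             identifiers: list = None, identifiers_type: str = None,
                             keywords: list = None) -> str:
    """Build a catalog search URL using either identifiers or keywords."""
    if bool(identifiers) == bool(keywords):
        raise ValueError("provide exactly one of identifiers or keywords")
    params = {"marketplaceIds": ",".join(marketplace_ids)}
    if identifiers:
        if not identifiers_type:
            raise ValueError("identifiers_type is required with identifiers")
        params["identifiers"] = ",".join(identifiers)
        params["identifiersType"] = identifiers_type  # e.g. ASIN, SKU, EAN
    else:
        params["keywords"] = ",".join(keywords)
    return f"{endpoint}/catalog/2022-04-01/items?{urlencode(params)}"
```

The response body can be written to Amazon S3 unchanged, which keeps the raw data available for later reprocessing.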

A second Lambda function calls Amazon's Listings Items API. This lets you retrieve a selling partner's listings item, which is uniquely identified by a selling partner-provided SKU and represents the product facts and sales terms for an item sold on or fulfilled by Amazon.
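
A lookup by SKU could be addressed as in the short sketch below. The function name is hypothetical, and the path assumes the Listings Items API version 2021-08-01. Because SKUs are chosen by the selling partner and may contain spaces or slashes, the SKU is percent-encoded before it is placed in the path.

```python
from urllib.parse import quote, urlencode


def build_listings_item_url(endpoint: str, seller_id: str, sku: str,
                            marketplace_ids: list) -> str:
    """Build the URL for retrieving one listings item by seller ID and SKU."""
    # safe="" also encodes "/", which would otherwise split the path.
    path = f"/listings/2021-08-01/items/{quote(seller_id, safe='')}/{quote(sku, safe='')}"
    query = urlencode({"marketplaceIds": ",".join(marketplace_ids)})
    return f"{endpoint}{path}?{query}"
```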

Furthermore, by subscribing to the LISTINGS_ITEM_STATUS_CHANGE, LISTINGS_ITEM_ISSUES_CHANGE, and other notifications available through the Selling Partner APIs, you can build event-driven workflows that react to changes in real time. This can provide valuable insights and opportunities for optimization in your selling strategies on Amazon. If necessary, you can also integrate other Selling Partner APIs into the application to extend its functionality, such as the Orders, Merchant Fulfillment, or Fulfillment by Amazon APIs.

Data ingestion, processing, and storage

Data collected from the Selling Partner Catalog and Listings APIs, retail analytics, and brand analytics provides valuable insights to selling partners regarding their products, customers, and overall performance, such as:

  • Sales performance: Analyzing sales data can help selling partners identify trends and patterns in customer behavior, allowing them to optimize their sales strategy.
  • Product performance: Analyzing product data can help selling partners identify which products are selling well and which ones are not, allowing them to optimize their product offerings.
  • Customer behavior: Analyzing customer data can help selling partners understand their customers’ behavior, preferences, and needs, allowing them to personalize their marketing and sales strategies.
  • Competitive analysis: Analyzing data from competitors can help selling partners gain insights into their competitors’ strengths and weaknesses, allowing them to optimize their own strategies.

A modern data lake architecture on AWS helps selling partners manage and analyze large datasets from multiple sources. With a data lake, Amazon selling partners can store and process their data in a centralized location, making it easier to analyze and gain insights from the data. It also provides scalable data integration, along with security features such as encryption and access control to protect your data.

Data storage and insights reference architecture

As your data is ingested and processed, it is stored in Amazon S3, which serves as the storage layer for your modern data architecture. From there, you can use a range of AWS services, such as AWS Lake Formation, AWS Glue, and Amazon Athena, to catalog, transform, enrich, and analyze your data, enabling you to derive valuable insights that can help you make informed business decisions. The following diagram depicts this architecture.

Data storage and insights reference architecture
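
How objects are laid out in Amazon S3 affects how easily they can be cataloged and queried. One possible convention, sketched below, is Hive-style `key=value` partitioning, which AWS Glue crawlers and Amazon Athena recognize as partition columns. The prefix, partition names, and function name are assumptions made for illustration, not a layout prescribed by this architecture.

```python
from datetime import date


def partitioned_key(dataset: str, marketplace_id: str, report_date: date,
                    filename: str) -> str:
    """Build an S3 object key with Hive-style partitions for marketplace and date."""
    return (
        f"raw/{dataset}/"
        f"marketplace_id={marketplace_id}/"
        f"year={report_date.year:04d}/"
        f"month={report_date.month:02d}/"
        f"day={report_date.day:02d}/"
        f"{filename}"
    )
```

Partitioning by date lets queries that filter on a date range scan only the matching prefixes, which reduces both query time and the amount of data scanned.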


The architecture described in this post gives you a powerful and flexible way to ingest and process data, enabling you to gain deeper insights into your business and drive growth and profitability. By using Selling Partner APIs with AWS serverless and managed services, you can scale data operations as your business grows while keeping costs under control.

