Streamlining Analytics with Amazon Selling Partner API Data Ingestion into a Data Lake
Using SP-API with AWS serverless and managed services to easily process all your data
by Hina V. and Anuj R., Selling Partner Developer Services; Manikanta G., Daman O., Jani S., and Norman K., Amazon Web Services | May 24, 2023
As an Amazon Selling Partner, you are likely dealing with a vast amount of data related to your products, sales, and traffic. This data can be overwhelming and difficult to manage, especially if you don't have the right tools and processes. However, using Amazon Selling Partner APIs (SP-API) with Amazon Web Services (AWS) serverless and managed services, you can easily ingest and process all of your data, providing valuable insights that can help improve sales and grow your business.
This blog post provides AWS architecture guidance for building a modern data lake, a useful tool for handling large amounts of data. We explore the high-level design of the architecture to show how its components work together.
Prerequisites:
- You are a registered developer for SP-API.
- You have created IAM policies and entities for SP-API.
- You have registered a Selling Partner API application.
Authentication and authorization
SP-API Authentication and Authorization reference architecture
To establish secure data access, it is necessary to implement authentication and authorization for SP-API. The following diagram provides an overview of this workflow within SP-API.
Figure 1: Authentication and authorization workflow in SP-API
The authorization process for SP-API relies on Login with Amazon (LWA), which is Amazon's implementation of OAuth 2.0. During this process, your Selling Partner application interacts with both Amazon's pages and your website to gain authorization. The web browser passes parameters back and forth between your website and Amazon for each selling partner action.
To implement OAuth 2.0 authorization, you must set up your website to:
- Accept and handle the parameters passed by Amazon, and
- Redirect the web browser while passing parameters to Amazon. Refer to Website authorization workflow for more information.
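The redirect-handling side of this workflow can be sketched as follows. This is a minimal illustration, not the official implementation: it assumes Amazon redirects back to your site with the `state`, `selling_partner_id`, and `spapi_oauth_code` query parameters, and the function names and callback URL are hypothetical.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlparse


def new_state() -> str:
    """Generate an unguessable state value to keep in the user's session."""
    return secrets.token_urlsafe(32)


def parse_authorization_redirect(redirect_url: str, expected_state: str) -> dict:
    """Check the state parameter, then extract the values passed back by Amazon."""
    query = parse_qs(urlparse(redirect_url).query)
    state = query.get("state", [""])[0]
    # Constant-time comparison; a mismatch suggests a forged redirect.
    if not hmac.compare_digest(state.encode(), expected_state.encode()):
        raise ValueError("state mismatch: rejecting redirect")
    try:
        return {
            "selling_partner_id": query["selling_partner_id"][0],
            "spapi_oauth_code": query["spapi_oauth_code"][0],
        }
    except KeyError as missing:
        raise ValueError(f"redirect is missing parameter {missing}") from None
```

The returned `spapi_oauth_code` is what your application would then exchange for the LWA refresh token.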
Once authorized, you obtain a Login with Amazon (LWA) refresh token. The selling partner must re-authorize your application every 365 days to keep this token valid. You must also rotate the LWA credentials of every application at least every 180 days.
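To keep track of these two deadlines, a scheduled job could compute how many days remain. The sketch below is only an illustration; the 30-day warning window and the function names are assumptions, not SP-API requirements.

```python
from datetime import date, timedelta

REFRESH_TOKEN_LIFETIME = timedelta(days=365)      # re-authorization interval
CREDENTIAL_ROTATION_PERIOD = timedelta(days=180)  # LWA credential rotation interval


def days_remaining(issued_on: date, lifetime: timedelta, today: date) -> int:
    """Days left before the deadline; negative once it has passed."""
    return (issued_on + lifetime - today).days


def needs_attention(issued_on: date, lifetime: timedelta, today: date,
                    warn_days: int = 30) -> bool:
    """True when the deadline is within the (assumed) warning window."""
    return days_remaining(issued_on, lifetime, today) <= warn_days
```

For example, `needs_attention(date(2023, 1, 1), CREDENTIAL_ROTATION_PERIOD, date.today())` could drive an alert before the credentials fall due.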
In this AWS architecture guidance, AWS Secrets Manager stores your application's LWA credentials and the LWA refresh token. The credentials and refresh token are exchanged for an LWA access token, which you include in API calls.
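Reading these values back from Secrets Manager might look like the sketch below. It assumes the secret is stored as a JSON string with the hypothetical keys `client_id`, `client_secret`, and `refresh_token`; the client is passed in so that the function works with a boto3 Secrets Manager client or a test double.

```python
import json

# Hypothetical key names; use whatever layout your secret actually has.
REQUIRED_KEYS = ("client_id", "client_secret", "refresh_token")


def load_lwa_secret(secrets_client, secret_id: str) -> dict:
    """Fetch and validate the LWA credentials stored under secret_id."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    missing = [key for key in REQUIRED_KEYS if key not in secret]
    if missing:
        raise KeyError(f"secret {secret_id!r} is missing keys: {missing}")
    return secret
```

In a Lambda function this could be called as `load_lwa_secret(boto3.client("secretsmanager"), "spapi/lwa")`, where the secret name is a placeholder.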
AWS Signature Version 4 (SigV4) is another critical aspect of making an SP-API call: it signs each API request, providing a secure way to interact with SP-API. You sign requests using AWS access keys, which consist of an access key ID, a secret access key, and, for temporary credentials, a session token. Amazon recommends using the AWS Security Token Service (AWS STS) to request temporary AWS access keys for signing.
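The core of SigV4 is a signing key derived from the secret access key through a chain of HMAC-SHA256 operations. The sketch below shows only that derivation step, assuming the `execute-api` service name; building the canonical request and the string to sign is omitted, and in practice an AWS SDK or signing library would normally do all of this for you.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key: str, date_stamp: str, region: str,
                       service: str = "execute-api") -> bytes:
    """Derive the SigV4 signing key; date_stamp is formatted as YYYYMMDD."""
    k_date = _hmac_sha256(("AWS4" + secret_access_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Return the hex signature that goes into the Authorization header."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()
```

When the keys come from AWS STS, the session token is sent alongside the signature in the `x-amz-security-token` header.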
SP-API integration
To connect with Selling Partner APIs, you first call the Login with Amazon (LWA) endpoint with your application's LWA credentials and the LWA refresh token you obtained. In return, you receive an LWA access token, which is passed in the request header of your Selling Partner API HTTP request and expires after one hour. You also sign each HTTP request to the Selling Partner API with the temporary AWS access keys requested from AWS STS.
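The token exchange and the one-hour expiry can be sketched as follows. The request builder targets the LWA token endpoint with the `refresh_token` grant; the `AccessTokenCache` class, its 60-second safety margin, and the `fetch_token` callable are illustrative assumptions rather than part of any official client.

```python
import time
from urllib.parse import urlencode
from urllib.request import Request

LWA_TOKEN_URL = "https://api.amazon.com/auth/o2/token"


def build_token_request(client_id: str, client_secret: str,
                        refresh_token: str) -> Request:
    """Build (but do not send) the refresh-token exchange request."""
    body = urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded;charset=UTF-8"}
    return Request(LWA_TOKEN_URL, data=body, headers=headers, method="POST")


class AccessTokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60):
        self._fetch = fetch_token  # callable returning (token, expires_in_seconds)
        self._clock = clock
        self._margin = safety_margin
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in - self._margin
        return self._token
```

Caching the token this way avoids calling the LWA endpoint on every SP-API request while still refreshing before the hour is up.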
Serverless reports application reference architecture
In this AWS architecture, you use AWS Step Functions to create a serverless application that interacts with the Amazon Selling Partner API for Reports. The Reports API has different regional endpoints, marketplace IDs, and report-type configurations, which are stored in AWS Systems Manager Parameter Store. To simplify the process, dedicate an AWS Lambda function to creating reports. This function reads the appropriate report-type configuration, marketplace IDs, and other required HTTP headers, then uses them to complete the API call. The following diagram depicts the architecture behind this workflow.
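As a rough sketch of what the report-creation function assembles, the code below turns a configuration dictionary into a `createReport` request for the Reports API v2021-06-30. The shape of `config` (as it might be loaded from Parameter Store) is a hypothetical layout, and request signing and sending are left out.

```python
import json

# SP-API regional endpoints.
REGION_ENDPOINTS = {
    "na": "https://sellingpartnerapi-na.amazon.com",
    "eu": "https://sellingpartnerapi-eu.amazon.com",
    "fe": "https://sellingpartnerapi-fe.amazon.com",
}


def build_create_report_request(config: dict, access_token: str) -> dict:
    """Assemble method, URL, headers, and JSON body for createReport."""
    body = {
        "reportType": config["report_type"],
        "marketplaceIds": config["marketplace_ids"],
    }
    # Optional reporting window, passed through only when configured.
    for optional in ("dataStartTime", "dataEndTime"):
        if optional in config:
            body[optional] = config[optional]
    return {
        "method": "POST",
        "url": REGION_ENDPOINTS[config["region"]] + "/reports/2021-06-30/reports",
        "headers": {
            "x-amz-access-token": access_token,
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }
```

A configuration such as `{"region": "na", "report_type": "GET_MERCHANT_LISTINGS_ALL_DATA", "marketplace_ids": ["ATVPDKIKX0DER"]}` would produce a request against the North America endpoint for the US marketplace.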
Figure 2: Serverless Reports application reference architecture
In accordance with Reports API best practices, use a Lambda function to subscribe to the SP-API REPORT_PROCESSING_FINISHED notification via the Notifications API (Note: this notification is not yet available for vendors). This allows the architecture to be event-driven and receive notifications when the report processing is complete.
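A Lambda function that reacts to these notifications might filter them as sketched below. The sketch assumes the notifications arrive through an Amazon SQS queue (so each record carries the JSON message in `body`) and that the payload uses the `reportProcessingFinishedNotification` field names shown; verify both against the notification schema before relying on them.

```python
import json


def completed_reports(event: dict) -> list:
    """Return the reports in an SQS-triggered Lambda event that finished as DONE."""
    done = []
    for record in event.get("Records", []):
        message = json.loads(record["body"])
        if message.get("notificationType") != "REPORT_PROCESSING_FINISHED":
            continue
        detail = message["payload"]["reportProcessingFinishedNotification"]
        # Cancelled or failed reports have no document to download.
        if detail.get("processingStatus") == "DONE":
            done.append({
                "report_id": detail["reportId"],
                "report_type": detail["reportType"],
                "report_document_id": detail["reportDocumentId"],
            })
    return done
```

Each returned `report_document_id` can then be handed to the function that downloads the report.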
Another Lambda function is then used to retrieve the report from the SP-API endpoint, process the data, and store the report in an Amazon Simple Storage Service (Amazon S3) bucket. To ensure data protection, you should use AWS Key Management Service (AWS KMS) to encrypt data.
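The processing and storage step could look like the following sketch. It assumes a tab-delimited, UTF-8 report that may be GZIP-compressed (some report types use other formats or encodings), and it takes the S3 client as a parameter so that a boto3 client or a test double can be supplied. The bucket, key, and KMS key identifiers are placeholders.

```python
import csv
import gzip
import io


def decode_report(raw: bytes, compression=None) -> list:
    """Decompress if needed and parse a tab-delimited report into row dicts."""
    if compression == "GZIP":
        raw = gzip.decompress(raw)
    text = raw.decode("utf-8")  # assumption: the report is UTF-8 encoded
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def store_report(s3_client, bucket: str, key: str, raw: bytes,
                 kms_key_id: str) -> None:
    """Write the report to S3 with server-side encryption under a KMS key."""
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=raw,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=kms_key_id,
    )
```

Storing the raw bytes unchanged and parsing them separately keeps the original report available for reprocessing later.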
Use an API client to handle rate limiting. The Selling Partner APIs return the x-amzn-RateLimit-Limit header in the API response, which gives the most up-to-date rate limit for the operation you are calling. By using an API client to handle rate limiting, you can avoid exceeding these limits and ensure that your application functions smoothly.
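One simple way a client could respect these limits is to pace its calls, as in the sketch below. This is a basic pacing limiter rather than a full token bucket with burst support, and it assumes the limit is reported in requests per second in the `x-amzn-RateLimit-Limit` response header; the class name and injected clock are illustrative.

```python
import time


class PacedLimiter:
    """Space out calls so they do not exceed rate_per_second."""

    def __init__(self, rate_per_second: float, clock=time.monotonic,
                 sleep=time.sleep):
        self.rate = rate_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def update_from_headers(self, headers: dict) -> None:
        """Adopt the rate reported by the API, ignoring unusable values."""
        lowered = {name.lower(): value for name, value in headers.items()}
        value = lowered.get("x-amzn-ratelimit-limit")
        try:
            rate = float(value)
        except (TypeError, ValueError):
            return
        if rate > 0:
            self.rate = rate

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        self._next_allowed = max(now, self._next_allowed) + 1.0 / self.rate
        return delay
```

A caller would invoke `wait()` before each request and `update_from_headers(response_headers)` after it, so the pacing follows whatever limit the API currently reports.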
Serverless Catalog and Listings Items reference architecture
Next, use AWS Step Functions to streamline the development and management of a distributed application that integrates with the Amazon Selling Partner APIs. This allows for the creation and execution of workflows that manage the various components of the application. The following diagram depicts the architecture behind this workflow.
Figure 3: Serverless Catalog and Listings Items reference architecture
One Lambda function can search Amazon's catalog items and associated information using identifiers such as ASIN/SKU/EAN or by keywords like brand-specific search terms. The results can be stored in an Amazon S3 bucket for further processing.
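The two search modes can be sketched as a URL builder for the `searchCatalogItems` operation of the Catalog Items API v2022-04-01. The function name is hypothetical, and token handling, signing, and paging are omitted.

```python
from urllib.parse import urlencode


def build_catalog_search_url(endpoint: str, marketplace_ids: list,
                             identifiers=None, identifiers_type=None,
                             keywords=None) -> str:
    """Build a searchCatalogItems URL for an identifier or a keyword search."""
    # The two search modes are mutually exclusive.
    if bool(identifiers) == bool(keywords):
        raise ValueError("provide exactly one of identifiers or keywords")
    params = {"marketplaceIds": ",".join(marketplace_ids)}
    if identifiers:
        if not identifiers_type:
            raise ValueError("identifiers_type is required with identifiers")
        params["identifiers"] = ",".join(identifiers)
        params["identifiersType"] = identifiers_type  # e.g. ASIN, SKU, EAN
    else:
        params["keywords"] = ",".join(keywords)
    return f"{endpoint}/catalog/2022-04-01/items?{urlencode(params)}"
```

The JSON response from this call is what the Lambda function would write to the Amazon S3 bucket.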
Another Lambda function makes calls to Amazon's Listings Items API. This enables you to retrieve a selling partner's listings item, which is uniquely identified by a selling partner-provided SKU and represents the product facts and sales terms for an item sold on or fulfilled by Amazon.
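Because the SKU is chosen by the selling partner, it can contain characters that are not safe in a URL path. The sketch below builds a `getListingsItem` URL for the Listings Items API v2021-08-01 with the path segments percent-encoded; the function name is hypothetical.

```python
from urllib.parse import quote, urlencode


def build_listings_item_url(endpoint: str, seller_id: str, sku: str,
                            marketplace_ids: list) -> str:
    """Build a getListingsItem URL, percent-encoding the seller ID and SKU."""
    path = "/listings/2021-08-01/items/{}/{}".format(
        quote(seller_id, safe=""),
        quote(sku, safe=""),  # SKUs may contain '/', spaces, and similar
    )
    query = urlencode({"marketplaceIds": ",".join(marketplace_ids)})
    return f"{endpoint}{path}?{query}"
```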
Furthermore, by subscribing to and monitoring the LISTINGS_ITEM_STATUS_CHANGE, LISTINGS_ITEM_ISSUES_CHANGE, and other notifications available through the Selling Partner APIs, you can build event-driven workflows that react in real-time to changes and stay informed. This can provide valuable insights and opportunities for optimization in your selling strategies on Amazon. If necessary, you can also integrate other Selling Partner APIs into the application to enhance its functionality and performance, such as the Orders, Merchant Fulfillment, or Fulfillment by Amazon APIs.
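An event-driven workflow of this kind usually starts by routing each notification to a handler for its type. The sketch below shows one way to do that; the handler bodies are placeholders, and because the casing of the envelope keys can differ between notification types, it checks both `notificationType` and `NotificationType` as an assumption to verify against the notification schemas.

```python
import json

HANDLERS = {}


def on(notification_type: str):
    """Register a handler function for one notification type."""
    def register(func):
        HANDLERS[notification_type] = func
        return func
    return register


@on("LISTINGS_ITEM_STATUS_CHANGE")
def handle_status_change(payload: dict):
    return ("status_change", payload)  # placeholder for real processing


@on("LISTINGS_ITEM_ISSUES_CHANGE")
def handle_issues_change(payload: dict):
    return ("issues_change", payload)  # placeholder for real processing


def dispatch(raw_message: str):
    """Route a notification to its handler; return None for unknown types."""
    message = json.loads(raw_message)
    kind = message.get("notificationType") or message.get("NotificationType")
    handler = HANDLERS.get(kind)
    if handler is None:
        return None
    payload = message.get("payload") or message.get("Payload") or {}
    return handler(payload)
```

Supporting a further notification type then only requires registering one more handler.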
Data ingestion, processing, and storage
Data collected from the Selling Partner Catalog and Listings APIs, retail analytics, and brand analytics provides valuable insights to selling partners regarding their products, customers, and overall performance, such as:
- Sales performance: Analyzing sales data can help selling partners identify trends and patterns in customer behavior, allowing them to optimize their sales strategy.
- Product performance: Analyzing product data can help selling partners identify which products are selling well and which ones are not, allowing them to optimize their product offerings.
- Customer behavior: Analyzing customer data can help selling partners understand their customers’ behavior, preferences, and needs, allowing them to personalize their marketing and sales strategies.
- Competitive analysis: Analyzing data from competitors can help selling partners gain insights into their competitors’ strengths and weaknesses, allowing them to optimize their own strategies.
A modern data lake architecture on AWS is a powerful solution that can help selling partners effectively manage and analyze large datasets from multiple sources. With a data lake on AWS, Amazon selling partners can store and process their data in a centralized location, making it easier to analyze and gain insights from the data. It also provides scalable data integration, along with data security features such as encryption and access control to ensure data is protected.
Data storage and insights reference architecture
As your data is ingested and processed, it is stored in Amazon S3, which serves as the storage layer for your modern data architecture. From there, you can use a range of AWS services, such as AWS Lake Formation, AWS Glue, and Amazon Athena, to catalog, transform, enrich, and analyze your data, enabling you to derive valuable insights that can help you make informed business decisions. The following diagram depicts this architecture.
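How objects are laid out in Amazon S3 affects how easily they can be cataloged and queried. The sketch below writes reports under Hive-style `name=value` prefixes, which AWS Glue crawlers and Amazon Athena can treat as partitions, and builds an example query. The prefix names, the table name, and the query itself are illustrative assumptions, not part of the reference architecture.

```python
from datetime import date


def partitioned_key(report_type: str, marketplace_id: str,
                    report_date: date, report_id: str) -> str:
    """Build an S3 object key with Hive-style partition prefixes."""
    return (
        f"raw/report_type={report_type}"
        f"/marketplace_id={marketplace_id}"
        f"/dt={report_date.isoformat()}"
        f"/{report_id}.tsv"
    )


def daily_row_count_query(table: str, report_date: date) -> str:
    """Example Athena SQL that reads only one day's partition."""
    return (
        f"SELECT marketplace_id, COUNT(*) AS row_count "
        f"FROM {table} "
        f"WHERE dt = '{report_date.isoformat()}' "
        f"GROUP BY marketplace_id"
    )
```

Filtering on the `dt` partition column lets Athena scan only the matching prefixes instead of the whole bucket.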
Figure 4: Data storage and insights reference architecture
Conclusion
The architecture guidance in this blog post gives you a powerful and flexible way to ingest and process data, enabling you to gain deeper insights into your business and drive growth and profitability. By leveraging Selling Partner APIs and AWS serverless and managed services, you can scale data operations as your business grows, while also keeping costs under control.