Optimize Rate Limits for Application Workloads

Manage API throttling and optimize SP-API usage within your application.

When you design your Selling Partner API (SP-API) application, you must account for per-API resource rate limits. SP-API enforces a per-API resource quota for each selling partner to preserve availability and prevent individual APIs from being overloaded.

If you exceed these rate limits, SP-API returns a 429 Too Many Requests error and throttles the call. Excessive API throttling can result in job failure, delays, and operational inefficiencies that ultimately cost your organization time and money. If you receive these error responses, you can resubmit the failed requests in a way that complies with rate limits.

This guide outlines the following strategies to help you effectively manage API throttling and optimize the performance and reliability of your SP-API applications:

Check and adhere to rate limits
Avoid spiky traffic
Use SP-API pre-built SDKs
Implement retry and back-off techniques
Reduce the number of API requests

For comprehensive guidance on best practices across various aspects of SP-API integration, refer to the SP-API Well-Architected Guidance playlist on the Amazon SP-API Developer University channel.

Check and adhere to rate limits

Review the following guidance on how to check and adhere to rate limits.

Check rate limits

Review the usage plan for each SP-API operation in the documentation. To learn how to find the usage plan, refer to How to find your usage plan.

Compare the documented limits against the rate limit header (x-amzn-RateLimit-Limit) in the API responses. SP-API returns this header for HTTP status codes 20x, 400, and 404. To avoid throttling, design your application to stay within these limits.
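As a minimal sketch of the comparison described above, the following Python helper reads the rate limit from a response's headers and converts it to a minimum spacing between calls. It assumes the header is named x-amzn-RateLimit-Limit and carries a requests-per-second value; the function names are illustrative and not part of any SP-API library.

```python
from typing import Mapping, Optional

# Assumed header name; matched case-insensitively because HTTP header names are not case-sensitive.
RATE_LIMIT_HEADER = "x-amzn-ratelimit-limit"


def read_rate_limit(headers: Mapping[str, str]) -> Optional[float]:
    """Return the requests-per-second rate from the headers, or None if absent or unusable."""
    for name, value in headers.items():
        if name.lower() == RATE_LIMIT_HEADER:
            try:
                rate = float(value)
            except ValueError:
                return None
            # Reject zero, negative, and NaN values.
            return rate if rate > 0 else None
    return None


def min_interval_seconds(rate_per_second: float) -> float:
    """Smallest spacing between calls that keeps the sustained rate at or below the limit."""
    return 1.0 / rate_per_second
```

For example, a header value of 0.5 corresponds to one request every two seconds. Logging this value next to the documented limit makes it easier to notice when the two differ.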

To learn more about usage plans and how the SP-API rate limiting algorithm works, refer to Usage Plans and Rate Limits.

Understand application rate limits

Some APIs, like the Listings API, have multiple rate limits:

A rate limit that depends on the selling partner account.
A rate limit that depends on both the selling partner account and on the application that calls the Selling Partner API on behalf of the selling partner account.

Requests are rate limited by whichever threshold you reach first. Rate limits for the selling partner account are specific to the account, whereas application rate limits serve as a universal threshold across all selling partners to ensure balanced API usage. Each API includes a dedicated documentation page with specific rate limits for every operation. For an example, refer to Listings API Rate Limits.

To understand these thresholds and implement appropriate handling mechanisms, review the rate limits documentation for each API that you use.

Set up an error monitoring and alerting system

To adhere to API rate limits, it’s crucial to set up an effective system to monitor and alert when errors occur. This process typically involves the following steps:

Log API responses: Capture and store the complete API response data, including status codes, headers, and error messages, to enable analysis and categorization of errors.
Categorize errors: Organize the logged errors into buckets based on HTTP status codes. For example, you can categorize 400-level client errors into the following buckets: 400 (invalid input), 403 (authentication issues), 404 (resource not found), 429 (rate limit breaches), and so on.
Create an error dashboard: Visualize the error rates for each API operation and error type on a centralized dashboard to quickly identify problematic areas.
Set alerting thresholds: Define appropriate thresholds for each error type and set up alerts to proactively notify you when error rates exceed those thresholds.
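The categorize and alert steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a monitoring system: the bucket names, the (operation, status code) log format, and the threshold values are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Hypothetical bucket names that mirror the categories listed above.
ERROR_BUCKETS = {
    400: "invalid_input",
    403: "authentication",
    404: "not_found",
    429: "rate_limited",
}

Key = Tuple[str, str]  # (operation name, error bucket)


def categorize(status_code: int) -> str:
    """Map an HTTP status code to an error bucket, or "ok" for non-error codes."""
    if status_code < 400:
        return "ok"
    if status_code in ERROR_BUCKETS:
        return ERROR_BUCKETS[status_code]
    return "server_error" if status_code >= 500 else "other_client_error"


def error_rates(log: Iterable[Tuple[str, int]]) -> Dict[Key, float]:
    """Return, per (operation, bucket), the fraction of that operation's calls in the bucket."""
    totals: Counter = Counter()
    errors: Counter = Counter()
    for operation, status_code in log:
        totals[operation] += 1
        bucket = categorize(status_code)
        if bucket != "ok":
            errors[(operation, bucket)] += 1
    return {key: count / totals[key[0]] for key, count in errors.items()}


def breached(rates: Dict[Key, float], thresholds: Dict[str, float]) -> List[Key]:
    """List the (operation, bucket) pairs whose rate exceeds the bucket's threshold."""
    # Buckets without a configured threshold never alert (default 1.0).
    return sorted(key for key, rate in rates.items() if rate > thresholds.get(key[1], 1.0))
```

Feeding this a window of recent responses and alerting on the output of breached gives a simple stand-in for the dashboard and alarm steps.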

If you use AWS services, you can implement this best practice by using Amazon CloudWatch:

CloudWatch Logs: Capture and store the detailed API response data.
CloudWatch metric filters: Create custom metrics to count the different error types based on status codes.
CloudWatch alarms: Monitor the error metrics and trigger notifications (for example, through Amazon Simple Notification Service) when thresholds are breached.

Avoid spiky traffic

Distribute API requests uniformly across time to avoid concentrated bursts of calls to specific operations followed by periods of minimal activity. These uneven spikes cause additional 429 errors, which you can avoid by spreading out the traffic over time.

You can implement a rate limiter to manage a high volume of traffic, and allow N requests per second based on per-API resource limits. The rate limiter ensures a consistent calling pattern over time, to mitigate traffic peaks and promote uniform API usage. Use the per-API rate limit as the guideline for each API in the rate limiter.
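One common way to build such a rate limiter is a token bucket. The sketch below is a minimal, single-threaded Python version that assumes each operation's usage plan can be expressed as a rate (requests per second) and a burst size; the class name and parameters are illustrative. The clock and sleep functions are injectable so that the behavior can be tested without waiting.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `burst` immediate calls, then `rate` calls per second. Not thread-safe."""

    def __init__(
        self,
        rate: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rate <= 0 or burst < 1:
            raise ValueError("rate must be positive and burst must be at least 1")
        self.rate = rate
        self.capacity = float(burst)
        self._tokens = float(burst)
        self._clock = clock
        self._sleep = sleep
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last check, up to the bucket capacity.
        now = self._clock()
        self._tokens = min(self.capacity, self._tokens + (now - self._last) * self.rate)
        self._last = now

    def try_acquire(self) -> bool:
        """Take one token if available; return False without waiting otherwise."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False

    def acquire(self) -> None:
        """Block until one token is available, then take it."""
        while not self.try_acquire():
            # Sleep just long enough for the missing fraction of a token to accumulate.
            self._sleep((1.0 - self._tokens) / self.rate)
```

Calling acquire() before each request to a given operation spaces calls out to the configured rate. A multi-threaded application would need a lock around the token arithmetic.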

For a step-by-step code example that uses the Selling Partner API Authentication/Authorization Library to implement a rate limiter, refer to the following sample code.

Recipe: Implement a rate limiter

Use SP-API pre-built SDKs

SP-API pre-built SDKs include a built-in rate limiter to help manage your API requests effectively. When you exceed a rate limit, the operation throws a rate limit exceeded error. Implement appropriate exception handling for these cases. For implementation details, refer to the pre-built SDKs in GitHub.

Implement retry and back-off techniques

Proactively implement the following techniques to avoid impact on your workloads and increase the reliability of your application:

Retry: Implement automatic retry logic. Configure the retry settings to add a delay between requests based on the rate limit, and queue requests that are waiting to be retried.
Exponential back-off: Use an exponential back-off algorithm for better flow control, with progressively longer waits between retries for consecutive error responses. Because exponential functions grow quickly, back-off times can become very long. Implement a maximum delay interval and a maximum number of retries, which you can adjust based on the operation and other local factors.
Jitter: Retries can be ineffective if all clients retry at the same time. To avoid this problem, use jitter, which adds a random amount of wait time before making or retrying a request. Jitter spreads out the arrival rate and helps prevent large bursts. Most exponential back-off algorithms use jitter to prevent successive collisions. For more information, refer to Exponential Backoff and Jitter.
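The three techniques above can be combined in one small helper. The following Python sketch uses capped exponential back-off with "full jitter" (a uniformly random delay between zero and the capped exponential value). ThrottledError is a hypothetical exception that stands in for however your HTTP client or SDK reports a 429 response, and the default base, cap, and retry count are placeholder values to tune per operation.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottledError(Exception):
    """Hypothetical exception raised by the caller's code when SP-API returns 429."""


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Full jitter: a random delay in [0, min(cap, base * 2**attempt))."""
    exponent = min(attempt, 30)  # Bound the exponent so the product stays a sane float.
    return rng() * min(cap, base * (2 ** exponent))


def call_with_retries(
    call: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Run `call`, retrying on ThrottledError up to `max_retries` times."""
    if max_retries < 0:
        raise ValueError("max_retries must not be negative")
    attempt = 0
    while True:
        try:
            return call()
        except ThrottledError:
            if attempt >= max_retries:
                raise  # Retries exhausted: surface the error to the caller.
            sleep(backoff_delay(attempt, base, cap, rng))
            attempt += 1
```

The cap bounds the longest wait and max_retries bounds the total number of attempts, which addresses the runaway delays mentioned above.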

Reduce the number of API requests

To maintain efficient SP-API integrations, you can optimize your call patterns and reduce API requests. For detailed strategies on effective API use and call management, refer to Optimize Calls to the Selling Partner API.

Other best practices

Keep in mind the following other best practices:

Optimize your code to eliminate unnecessary API calls.
Cache frequently used data to reduce the need for repeated API requests. You can cache data on your servers or in object storage such as Amazon S3. You can also save relatively static information in a database or serialize it to a file.
Stagger SP-API requests in a queue and do other processing tasks while waiting for the next queued job to run.
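To illustrate the caching practice above, here is a minimal in-memory cache with a time-to-live, written in Python. It is a sketch only: the class name, the key format, and the choice of TTL are assumptions, and a production system might back the same interface with Amazon S3 or a database instead of a dictionary.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Cache fetched values in memory and refetch only after `ttl_seconds` have passed."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # Maps each key to (expiry time, cached value).
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: Hashable, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for `key`, calling `fetch` only on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = fetch()  # This is where the API request would be made.
        self._entries[key] = (now + self._ttl, value)
        return value
```

Wrapping a read-only API call in get_or_fetch means that repeated lookups of the same key within the TTL cost no additional requests.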
