Strategies to optimize rate limits for your application workloads

How to use the defined rate limits for each SP-API operation in an optimized manner, and strategies for managing API throttling in your application workloads

by Hina V., Solutions Architect, Selling Partner Developer Services | March 16, 2022

When architecting your application for Selling Partner API (SP-API), you need to keep the per-resource rate limits in mind, particularly the types of operations and the frequency with which they are called. When the allotted rate limit for an API operation is exceeded, you'll receive an error response and the call will be throttled. Excessive API throttling can result in job failures, delays, and operational inefficiencies that ultimately cost your organization time and money. Selling Partner API maintains a per-resource quota for each selling partner to help keep Selling Partner APIs available and to prevent accidental overburdening of the APIs.

When request submissions exceed the steady-state request rate and burst limits, SP-API fails the limit-exceeding requests and returns 429 Too Many Requests error responses to the client. If you see these error responses, you can resubmit the failed requests in a way that complies with the rate limits.

The following sections describe how to use the defined rate limits for each SP-API operation in an optimized manner and provide strategies for managing API throttling in your application workloads.

Implement a rate limiter

A rate limiter manages a high volume of traffic by allowing N requests per second, based on per-resource limits. Implementing a rate limiter keeps the calling pattern consistent (smooth) over time instead of producing traffic peaks. You can use the per-resource rate limit as the guideline for each API in the rate limiter.

In the Selling Partner API Authentication/Authorization Library, an interface allows you to set and get the rate limit configuration that is used with RateLimiter. RateLimiter is used on the client side to restrict the rate at which requests are made. The RateLimiter configuration takes a permit (the rate at which requests are made) and a timeout.

The following code snippet provides an example configuration for a rate limiter.

import java.io.IOException;
import java.util.concurrent.TimeUnit;

import com.google.common.util.concurrent.RateLimiter;
import com.amazon.SellingPartnerAPIAA.RateLimitConfiguration;

import okhttp3.Interceptor;
import okhttp3.Response;
.
.
.
/**
* Sets the RateLimiter
* A rate limiter is used to manage a high volume of traffic allowing 
* N requests per second
* @return Api client
**/
public ApiClient setRateLimiter(RateLimitConfiguration rateLimitConfiguration) {
    if (rateLimitConfiguration != null) {
        rateLimiter = RateLimiter.create(rateLimitConfiguration.getRateLimitPermit());
        // Add the rate limiter to the HTTP client's interceptors for execution
        RateLimitInterceptor rateLimiterInterceptor =
            new RateLimitInterceptor(rateLimiter, rateLimitConfiguration);
        httpClient.interceptors().add(rateLimiterInterceptor);
    }
    return this;
}
.
.
.
class RateLimitInterceptor implements Interceptor {
    RateLimiter rateLimiter;
    RateLimitConfiguration rateLimitConfiguration;

    public RateLimitInterceptor(RateLimiter rateLimiter, 
    RateLimitConfiguration rateLimitConfiguration) {
        this.rateLimiter = rateLimiter;
        this.rateLimitConfiguration = rateLimitConfiguration;
    }

    @Override
    public Response intercept(Chain chain) throws IOException {
        if (rateLimitConfiguration.getTimeOut() == Long.MAX_VALUE) {
            rateLimiter.acquire();
        } else if (!rateLimiter.tryAcquire(rateLimitConfiguration.getTimeOut(),
                TimeUnit.MILLISECONDS)) {
            // Fail the request instead of proceeding when the permit wait times out
            throw new IOException("Throttled as per the rate limiter on the client");
        }
        return chain.proceed(chain.request());
    }
}  

Workload-based mechanisms to manage throttles

The following additional mechanisms can help you manage throttles depending on how your workload is architected.

Event-based workload

Monitor with Notifications API: With event-based tasks, you can monitor notifications using the Notifications API and perform actions based on certain conditions.

The Selling Partner API for Notifications lets you subscribe to notifications that are relevant to a selling partner's business. You can create a destination to receive notifications, subscribe to notifications, delete notification subscriptions, and more. Instead of polling for information, your application can receive information directly from Amazon when an event triggers a notification to which you are subscribed. This approach creates a holistic workflow that coordinates and makes the most of the resources you already have in place.

As a best practice to avoid polling, you should use notifications when developing applications for the Orders API and Reports API. Notification subscriptions can provide insight into changes, and there are many notifications available for your application to leverage. For more information, see the Notifications API v1 Use Case Guide.

Implement retry techniques: If a Selling Partner API method rate limit is exceeded, you will often receive a RequestLimitExceeded response and the API call will be throttled. To avoid impact to your workloads, you should proactively implement retry techniques. The following techniques increase the reliability of your application and can reduce operational costs for your organization.

  • Retry: Implement automatic retry logic. You can configure the retry settings by adding a small delay and queuing between your requests.
  • Exponential backoff: In addition to simple retries, implement an exponential backoff algorithm for better flow control. The idea behind exponential backoff is to use progressively longer waits between retries for consecutive error responses. Exponential backoff can lead to very long backoff times, because exponential functions grow quickly. You should implement a maximum delay interval and a maximum number of retries. The maximum delay interval and maximum number of retries are not necessarily fixed values. They should be set based on the operation being performed and other local factors, including network latency.
  • Jitter: Retries can be ineffective if all clients retry at the same time. To avoid this problem, we employ jitter, a random amount of time before making or retrying a request to help prevent large bursts by spreading out the arrival rate. Most exponential backoff algorithms use jitter to prevent successive collisions. For more information, see Exponential Backoff and Jitter.
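The backoff-and-jitter techniques above can be sketched as a small delay calculator. This is an illustrative helper, not part of the SP-API SDK; the class name and parameters are assumptions, and it uses "full jitter" (a uniform random delay between zero and the capped exponential delay).

```java
import java.util.concurrent.ThreadLocalRandom;

/** Computes retry delays using exponential backoff with full jitter. */
public class BackoffCalculator {
    /**
     * Returns a randomized delay for the given retry attempt. The uncapped
     * delay doubles each attempt (base * 2^attempt) and is capped at
     * maxDelayMillis; full jitter then picks a uniform value between 0 and
     * that cap so clients do not all retry at the same instant.
     */
    public static long computeDelayMillis(int attempt, long baseDelayMillis, long maxDelayMillis) {
        long exponential = baseDelayMillis * (1L << Math.min(attempt, 30));
        long capped = Math.min(maxDelayMillis, exponential);
        return ThreadLocalRandom.current().nextLong(capped + 1);
    }

    public static void main(String[] args) {
        // Cap both the delay interval (30 s) and the number of retries (5),
        // as recommended above.
        for (int attempt = 0; attempt < 5; attempt++) {
            long delay = computeDelayMillis(attempt, 500, 30_000);
            System.out.println("Attempt " + attempt + ": waiting " + delay + " ms before retry");
        }
    }
}
```

In a real client you would sleep for the computed delay before resubmitting the throttled request.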

User input-based workload

Monitor and manage the user input-based workload. Based on the seller-level throughput, you can decide how to distribute the throughput across inputs from users. We recommend having an indicator of when the throughput is at 90% capacity, so that you can add backoff time until throughput becomes available for the API resource. Generally, you should be able to allocate resources to meet your customers' demands as long as they're not generating programmatic traffic.
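One way to implement the 90% capacity indicator described above is a simple fixed-window counter. This is a minimal sketch; the class name and window handling are illustrative assumptions, not part of the SP-API SDK.

```java
/**
 * Tracks request usage against a per-second rate limit and flags when
 * consumption in the current one-second window crosses 90% of the limit.
 */
public class ThroughputMonitor {
    private final int limitPerSecond;
    private long windowStartMillis;
    private int requestsInWindow;

    public ThroughputMonitor(int limitPerSecond) {
        this.limitPerSecond = limitPerSecond;
    }

    /** Records a request at the given timestamp and returns the count in the current window. */
    public synchronized int record(long nowMillis) {
        if (nowMillis - windowStartMillis >= 1000) {
            windowStartMillis = nowMillis;   // start a new one-second window
            requestsInWindow = 0;
        }
        return ++requestsInWindow;
    }

    /** True when the current window has consumed 90% or more of the limit. */
    public synchronized boolean nearCapacity() {
        return requestsInWindow >= 0.9 * limitPerSecond;
    }

    public static void main(String[] args) {
        ThroughputMonitor monitor = new ThroughputMonitor(10);
        for (int i = 0; i < 9; i++) monitor.record(0);   // 9 of 10 permits used
        System.out.println("Near capacity: " + monitor.nearCapacity()); // prints "Near capacity: true"
    }
}
```

When `nearCapacity()` returns true, your application can start adding backoff time before accepting further user-driven requests.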

Use batch operations

Use bulk GET endpoints, such as reports/2020-09-04/reports, which let you get data in bulk in a single API request. SP-API offers several report types that return aggregated data.

Use bulk upload resources when applicable. The Feeds API (feeds/2021-06-30/feeds) allows you to carry out bulk tasks in a single API request. SP-API offers the following options for bulk uploads:

  • Listings Feed to create an Amazon listing, create a new record (ASIN), or update inventory levels.
  • Orders Feed to acknowledge, cancel, or issue a refund for an Amazon Order.
  • Fulfillment by Amazon feeds to submit or cancel an Amazon FBA order, or to create or cancel bulk shipment plans.

Use generic paths to get a Restricted Data Token (RDT). When calling the Tokens API for restricted resources, use generic paths to get the RDT. In the following example, you get a single RDT for the specified data elements for all orders for a given seller.

POST https://sellingpartnerapi-na.amazon.com/tokens/2021-03-01/restrictedDataToken 
    { 
        "restrictedResources": 
            [ 
                { 
                    "method": "GET", 
                    "path": "/orders/v0/orders", 
                    "dataElements": ["buyerInfo", "shippingAddress"]
                } 
            ] 
    }

General best practices to manage throttling

Monitor API activity against your rate limit. You can implement an API dashboard to monitor your API activity against your rate limit over the last 24 hours. You can also use the x-amzn-RateLimit-Limit response header to confirm the rate limit currently applied to an operation for your application. In the following example, the current rate limit is 5 requests per second.

x-amzn-RateLimit-Limit: 5
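A small helper can parse the header's numeric value so your client can adjust its rate limiter at runtime (for example, via Guava's RateLimiter.setRate). This is an illustrative sketch; the class and method names are assumptions.

```java
import java.util.OptionalDouble;

/** Parses the numeric value of a rate-limit response header. */
public class RateLimitHeaderParser {
    /**
     * Returns the advertised requests-per-second value, or empty when the
     * header is missing or malformed, so callers can fall back to a default.
     */
    public static OptionalDouble parse(String headerValue) {
        if (headerValue == null) {
            return OptionalDouble.empty();
        }
        try {
            return OptionalDouble.of(Double.parseDouble(headerValue.trim()));
        } catch (NumberFormatException e) {
            return OptionalDouble.empty();   // ignore a malformed header value
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("5"));    // header present: 5 requests per second
        System.out.println(parse(null));   // header absent: empty
    }
}
```

In an OkHttp client, the header value would come from `response.header(...)` on each SP-API response.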

Finally, if you are using AWS services, you can monitor API activity with Amazon CloudWatch metrics.

Catch errors caused by rate limiting. For each request, you can check to see if you've bumped into the rate limit. If you get a response code of 429, "Too Many Requests", you've hit the rate limit. It's best practice to include code in your script that catches 429 responses. If your script ignores the "Too Many Requests" error and keeps trying to make requests, you might start getting null errors. At that point, the error information won't be useful in diagnosing the problem.
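A minimal sketch of catching 429 responses and retrying: the request is modeled as a status-code supplier so the logic runs without a network, and in a real client it would wrap the SP-API call. The names here are illustrative, not part of the SP-API SDK.

```java
import java.util.function.IntSupplier;

/** Retries a request while it returns HTTP 429, up to a maximum number of attempts. */
public class ThrottleAwareCaller {
    public static int callWithRetry(IntSupplier request, int maxAttempts, long delayMillis) {
        int status = request.getAsInt();
        for (int attempt = 1; status == 429 && attempt < maxAttempts; attempt++) {
            try {
                Thread.sleep(delayMillis);   // back off before resubmitting; add jitter in practice
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt and stop retrying
                return status;
            }
            status = request.getAsInt();
        }
        return status;   // the caller handles any final non-2xx status
    }

    public static void main(String[] args) {
        // Simulated endpoint: throttled twice, then succeeds.
        int[] responses = {429, 429, 200};
        int[] next = {0};
        int status = callWithRetry(() -> responses[next[0]++], 5, 10);
        System.out.println("Final status: " + status); // prints "Final status: 200"
    }
}
```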

Reduce the number of API requests. Make sure you make only the requests that you need. Try these tips for reducing the number of requests:

  • Use throughput incrementally as you grow, scaling with your usage.
  • Optimize your code to eliminate any unnecessary API calls. For example, are some requests getting data items that aren't used in your application?
  • Cache frequently used data. You can cache data on your servers using Object level storage like Amazon S3. You can also save relatively static information in a database or serialize it in a file.
  • Stagger SP-API requests in a queue and do other processing tasks while waiting for the next queued job to run.
  • Configure your code to stop making additional API requests when throttled until enough time has passed to retry.
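The caching tip above can be sketched as a small time-to-live map. This is an illustrative helper with an injected clock so the expiry logic is testable; a production application might instead use a caching library, a database, or Amazon S3, as noted in the list.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Minimal time-to-live cache for frequently used, relatively static data. */
public class TtlCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long expiresAt;
        Entry(V value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<K, Entry<V>> store = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;   // injected clock, e.g. System::currentTimeMillis

    public TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    public synchronized void put(K key, V value) {
        store.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the cached value, or null when absent or expired. */
    public synchronized V get(K key) {
        Entry<V> entry = store.get(key);
        if (entry == null || clock.getAsLong() >= entry.expiresAt) {
            store.remove(key);   // evict stale entries lazily
            return null;
        }
        return entry.value;
    }

    public static void main(String[] args) {
        long[] now = {0};
        TtlCache<String, String> cache = new TtlCache<>(60_000, () -> now[0]);
        cache.put("GET /orders/v0/orders", "cached-response");
        System.out.println(cache.get("GET /orders/v0/orders")); // prints "cached-response"
        now[0] = 61_000;                                        // advance the clock past the TTL
        System.out.println(cache.get("GET /orders/v0/orders")); // prints "null"
    }
}
```

A cache hit avoids an SP-API request entirely, which directly reduces pressure on your rate limits.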

Considerations for application design

Consider these tips for managing throttling and rate limits when you design your app:

  • Regulate the rate of your requests for smoother distribution when using multiple SP-APIs.
  • Remember, some Selling Partner APIs may have functional limits on the frequency that data can be requested or written for a given seller, so we don't suggest using all of your throughput on one seller at a time.

Conclusion

This post described how to use the defined rate limits for each SP-API operation in an optimized manner and provided strategies for managing API throttling in your application workloads.


Have feedback on this post?

If you have questions or feedback on this post, we'd like to hear from you! Please vote and leave a comment using the tools at the bottom of this page.
