Usage Plans and Rate Limits

Learn about usage plans and how rate limits are used in SP-API.

API reliability depends on sizing your capacity and resources to meet the changing needs of your application over time. For this reason, the Selling Partner API (SP-API) defines usage plans to control the number of requests that you can make to an operation within a certain time period.

Usage plans depend on several factors. These factors can include the selling partner account, the application that calls the SP-API on behalf of the selling partner account, the marketplace, and so on.

This document describes the rate-limiting algorithm of the SP-API, the factors that determine usage plans, how to find your usage plan, and frequently asked questions about working within your rate limits.

For guidance on how to optimize your application workloads to leverage the rate limits for each SP-API operation efficiently, refer to Optimize rate limits for application workloads.

To learn more about the terms that are used on this page, refer to Terminology.

Rate-limiting algorithm

The SP-API uses the token bucket algorithm to rate-limit requests to the API. The algorithm is based on the analogy of a bucket that contains tokens, where you can exchange each token for a request to the API.

In this analogy, the SP-API automatically adds tokens to your bucket at a set rate per second until the maximum limit of the bucket is reached. The maximum limit is also called the burst limit.

When you call the SP-API, the SP-API retrieves your usage plan (rate limit and burst limit) for that operation based on the access token that identifies the caller identity in the request header. Each request that you make to the SP-API subtracts a token from the bucket. Throttling occurs when you make a request for which no token is available because your bucket is empty. A throttled request results in an error response.

The following illustration shows how rate limiting works based on the token bucket algorithm and the bucket allocation criteria. The example uses an operation that has a rate limit of one and a burst limit of two.

The token bucket algorithm.

The bucket is initially full and holds two tokens (burst limit). At 01:00:100, an application calls an operation of the SP-API on behalf of a selling partner who operates in the EU marketplace. The request is successful. The bucket now holds a single token because the request depleted one token from the bucket.
After 100 milliseconds, at 01:00:200, the same application calls the same operation on behalf of the same selling partner in the same region. The request is successful and uses the last token in the bucket. The bucket is now empty.
After another 100 milliseconds, at 01:00:300, the same application calls the same operation on behalf of the same selling partner in the same region. This time, due to the depleted limits (empty bucket), the request is throttled.
One second from the beginning of this example, at 01:01:000, the SP-API puts a token in the bucket, because the rate limit is one token per second. The application can now call the operation successfully without being throttled.
After one more second, at 01:02:000, the SP-API adds another token to the bucket.
At 01:03:000, the SP-API doesn't add another token to the bucket because the bucket has reached its maximum limit (burst limit).
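
The timeline above can be reproduced with a small simulation. The following Python sketch is illustrative only and is not the SP-API's implementation: the TokenBucket class, its method names, and the assumption that whole tokens arrive on one-second ticks are hypothetical. Times are milliseconds counted from 01:00:000, and no request is made after the throttled one so that the bucket refills as described.

```python
class TokenBucket:
    """Illustrative token bucket (hypothetical, not SP-API's implementation)."""

    def __init__(self, rate_per_second: int, burst: int) -> None:
        self.rate = rate_per_second
        self.burst = burst
        self.tokens = burst      # the bucket starts full
        self.last_tick_ms = 0    # time of the most recent refill tick

    def available(self, now_ms: int) -> int:
        """Apply any refill ticks that are due, then return the token count."""
        ticks = (now_ms - self.last_tick_ms) // 1000
        if ticks > 0:
            # Assumption: whole tokens arrive once per second, capped at burst.
            self.tokens = min(self.burst, self.tokens + ticks * self.rate)
            self.last_tick_ms += ticks * 1000
        return self.tokens

    def try_request(self, now_ms: int) -> bool:
        """Spend one token if possible; False means the request is throttled."""
        if self.available(now_ms) == 0:
            return False
        self.tokens -= 1
        return True


bucket = TokenBucket(rate_per_second=1, burst=2)
results = [bucket.try_request(t) for t in (100, 200, 300)]
levels = [bucket.available(t) for t in (1000, 2000, 3000)]
print(results)  # [True, True, False]
print(levels)   # [1, 2, 2]
```

The first list matches the two successful requests and the throttled third request. The second list shows the bucket refilling to one token, then two, and then staying at the burst limit.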

Because rate and burst limits depend on several factors, some requests draw from a separate token bucket. In this example, two tokens are available at 01:00:200 in the following cases:

The same application triggers the same operation but on behalf of a different selling partner.
The same application calls the same operation on behalf of the same selling partner but using the credentials from a different regional account.
A different application triggers the same operation on behalf of the same selling partner.

The following illustration shows these alternate scenarios.

Alternate scenarios.
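
One way to picture these scenarios is a separate bucket for each combination of caller attributes. The sketch below is an assumption-laden illustration: the key layout (application, selling partner, region, operation), the identifiers such as "app-A" and "seller-1", and the operation name are all hypothetical. Refill is left out because every call happens within the same second.

```python
from collections import defaultdict

BURST_LIMIT = 2  # burst limit from the example above

# One token count per (application, selling partner, region, operation).
# A key that has not been seen yet starts with a full bucket.
buckets = defaultdict(lambda: BURST_LIMIT)


def call(application: str, partner: str, region: str, operation: str) -> bool:
    """Spend a token from the caller's own bucket; False means throttled."""
    key = (application, partner, region, operation)
    if buckets[key] == 0:
        return False
    buckets[key] -= 1
    return True


# Original scenario: two calls drain the bucket, the third is throttled.
original = [call("app-A", "seller-1", "EU", "someOperation") for _ in range(3)]

# Alternate scenarios: each one draws from a different, still-full bucket.
other_partner = call("app-A", "seller-2", "EU", "someOperation")
other_region = call("app-A", "seller-1", "NA", "someOperation")
other_app = call("app-B", "seller-1", "EU", "someOperation")

print(original)                                 # [True, True, False]
print(other_partner, other_region, other_app)   # True True True
```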

Factors that determine usage plans

Rate and burst limits for each SP-API operation depend on multiple factors, and multiple usage plans can apply to an individual operation. Requests are rate limited by whichever threshold you reach first.

The SP-API relies on the following factors to assign usage plans:

API operation: Each operation of the SP-API has its own default rate and burst limits. You can find these limits, or a link to documentation that lists the limits, in the API reference documentation for each operation.
Selling partner (account) and application pair: Most operations' rate limits are per the selling partner and application pair. The application calls the SP-API on behalf of the selling partner. Grantless operations are exceptions to this rule. Applications can call grantless operations without authorization from a selling partner. For grantless operations, usage plans are defined based on the other criteria in this section.
Regions and marketplaces: Usage plans are implicitly tied to the marketplace groups in which a selling partner operates and has authorized an application to call the SP-API on their behalf. The implicit nature of this rate-limiting factor comes down to the different marketplace-specific selling partner accounts. Different marketplace-specific selling partner accounts have different seller IDs.

Usage plan types

Each SP-API operation is associated with a usage plan, which includes a rate limit and a burst limit. SP-API has two types of usage plans: standard and dynamic.

Standard usage plans: Most SP-APIs are governed by standard usage plans. With a standard usage plan, rate limits have static values for all callers based on the expected call patterns for each operation. You can find the usage plan, or a link to documentation that lists the usage plan, in the API reference documentation for each operation.
Dynamic usage plans: Some SP-APIs and operations have dynamic usage plans. A dynamic usage plan is automatically adjusted for each selling partner based on that business's current and historical needs. Because the purpose of dynamic usage plans is to right-size those limits over time, the rates can change. A variety of selling partner business metrics influence rate adjustments. These metrics are business metrics only and do not include a historical number of API requests. Rates do not dynamically increase because an application makes API requests more frequently.

How to find your usage plan

You can find your usage plan (rate and burst limits) in the following ways:

SP-API rate limit pages: Each API section has a dedicated page that lists the limits for each operation. For example, refer to A+ Content API Rate Limits.
Response header: When you call an SP-API operation, the x-amzn-RateLimit-Limit response header, if available, specifies the operation's rate limits per account-application pair. However, you must not depend on this header being present, for the following reasons:
- In some cases, despite a best-effort attempt, the SP-API can't retrieve the rate limits. In this case, the SP-API doesn't fail an otherwise valid call to the operation. Instead, the SP-API returns the response without the x-amzn-RateLimit-Limit header.
- The x-amzn-RateLimit-Limit response header is returned only for HTTP status codes 20x, 400, and 404. Responses to unauthorized or unauthenticated requests don't include this header.
- The x-amzn-RateLimit-Limit header doesn't include other usage plan rate limits.
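
Because the header can be absent, code that reads it should treat a missing or unusable value as "unknown" rather than as an error. The following Python sketch shows one defensive way to do that. The function name read_rate_limit is hypothetical, and the sketch assumes the header value is a decimal number of requests per second.

```python
from typing import Mapping, Optional

RATE_LIMIT_HEADER = "x-amzn-RateLimit-Limit"


def read_rate_limit(headers: Mapping[str, str]) -> Optional[float]:
    """Return the rate from the response headers, or None if unusable."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == RATE_LIMIT_HEADER.lower():
            try:
                rate = float(value)
            except ValueError:
                return None
            return rate if rate > 0 else None
    return None


print(read_rate_limit({"x-amzn-RateLimit-Limit": "0.5"}))     # 0.5
print(read_rate_limit({"X-Amzn-Ratelimit-Limit": "2.0"}))     # 2.0
print(read_rate_limit({"Content-Type": "application/json"}))  # None
```

Returning None lets the caller fall back to the documented limits for the operation.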

Frequently asked questions

The rate limits for one operation are too low for my use case. Can the limit be increased?

We aim for right-sized limits, with the goal that efficient call patterns should, ideally, never be throttled. If your application is consistently throttled, it might mean that you can further optimize your call patterns. For more information, refer to What should I do if my application is consistently throttled?. If you find that default rates aren't sufficient, refer to Strategies to optimize rate limits for your application workloads.

How should my application handle a 429 response?

A 429 is a retryable status code. You can try again, but repeated throttled requests require a back-off strategy. Use the x-amzn-RateLimit-Limit response header, when available, to determine if the rate limits differ from your expectations. For details about when this header is available, refer to How to find your usage plan.
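
A common back-off strategy is exponential backoff with jitter. The source does not prescribe a specific strategy, so the Python sketch below is only one possible approach. The call_with_backoff function, the send callable (which stands in for your HTTP client and returns a status code and headers), and the default delays are all hypothetical.

```python
import random
import time
from typing import Callable, Tuple

Response = Tuple[int, dict]  # (HTTP status code, response headers)


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        response = send()
        if response[0] != 429 or attempt == max_attempts - 1:
            return response
        # Exponential backoff with full jitter: wait a random time in
        # [0, min(max_delay, base_delay * 2**attempt)] seconds.
        sleep(random.uniform(0, min(max_delay, base_delay * 2 ** attempt)))
    raise ValueError("max_attempts must be at least 1")


# Usage with a fake transport that is throttled twice and then succeeds.
replies = iter([(429, {}), (429, {}), (200, {})])
waits = []
status, _ = call_with_backoff(lambda: next(replies), sleep=waits.append)
print(status, len(waits))  # 200 2
```

Passing sleep as a parameter keeps the example testable without real delays. If every attempt is throttled, the function returns the last 429 response so the caller can decide what to do next.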

How can I test my application with respect to its usage plans?

You can test 429 error handling by using the SP-API sandbox. However, you can't test rate limits with the sandbox because while operations in production can have various rates, all sandbox operations share the same rate. You can find your assigned usage rates in the x-amzn-RateLimit-Limit response header for each operation, when available. For details about when this header is available, refer to How to find your usage plan.

Can my application completely avoid getting throttled?

No. Any number of factors outside your control can result in a few transient 429s. Your application code should account for this possibility.

What should I do if my application is consistently throttled?

If your application is consistently throttled, you might need to further optimize your call patterns. For example:

Call less frequently and keep within your rate limits.
Rely on push notifications instead of polling mechanisms.
Use batch APIs where available or try to do more with fewer calls. For example, with the Feeds and Reports APIs, you can send or retrieve more information per call. Generally, examine your call patterns against the operations in an API to determine if you can get the same work done in fewer calls.
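
To show how batching reduces call volume, the following Python sketch groups items so that each group can go into one request. It is a generic illustration: the batches helper, the SKU identifiers, and the batch size of 20 are hypothetical, and the actual maximum batch size depends on the operation you call.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


skus = [f"SKU-{n}" for n in range(45)]   # hypothetical identifiers
groups = list(batches(skus, size=20))    # hypothetical batch size
print(len(skus), "items ->", len(groups), "calls")  # 45 items -> 3 calls
```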

Will my application be throttled more often as I obtain more authorizations?

No. Usage plans are specific to the application and selling partner pair, so your total throughput grows as more selling partners authorize your application.

Will rate limits change?

We can raise rate limits at any time. If we ever lower the rate limits posted in the API reference documentation, we communicate the change in advance to give you time to update and test your applications before the change goes live.

Rate limits for dynamic usage plans auto-adjust higher or lower depending on business context.

Will my rate limits increase if my application is constantly throttled?

Rates are based on a selling partner's business metrics. If your application is consistently throttled, your call patterns are likely not aligned with the rate limits assigned to that selling partner. Refer to What should I do if my application is consistently throttled? and The rate limits for one operation are too low for my use case. Can the limit be increased?.

What is the overall goal of dynamic usage plans?

Historically, we have observed that homogeneous usage plans are over-sized for some situations and, worse, under-sized for others. The goal of dynamic usage plans is to leverage the known business context of a given call to put the right limits in place for any situation.

What factors influence dynamic usage plans?

In general, rate limits are shaped by the type, size, and behavior of the selling partner business.

How often will the limits associated with a given usage plan change?

We aim to prevent frequent, disruptive changes to limits. Generally, limits are changed as soon as we detect meaningful changes in the business metrics in a selling partner account.

How should I code my application to respect dynamic limits?

The following suggestions can help your application handle dynamic rate limits:

Read the x-amzn-RateLimit-Limit header, when available. For details about when this header is available, refer to How to find your usage plan.
Do not hardcode timers.
Code naturally against events rather than running on a loop. If you use events, you don't need a timer. For example, update prices in response to price notifications rather than every certain number of seconds.
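
Where some pacing is unavoidable, the gap between calls can be derived from the most recently observed limit instead of a hardcoded timer. The Python sketch below is one way to do this. The AdaptivePacer class and its fallback rate are hypothetical, and the sketch assumes the header value is a decimal number of requests per second. It ignores the burst limit, so it is more conservative than the token bucket allows.

```python
import time
from typing import Callable, Mapping


class AdaptivePacer:
    """Derives the gap between calls from the latest observed rate limit."""

    def __init__(self, fallback_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = fallback_rate   # requests per second, used until observed
        self.clock = clock
        self.next_allowed = clock()

    def observe(self, headers: Mapping[str, str]) -> None:
        """Update the rate from x-amzn-RateLimit-Limit when it is usable."""
        for name, value in headers.items():
            if name.lower() == "x-amzn-ratelimit-limit":
                try:
                    rate = float(value)
                except ValueError:
                    return
                if rate > 0:
                    self.rate = rate
                return

    def delay(self) -> float:
        """Reserve the next slot and return the seconds to wait before it."""
        now = self.clock()
        wait = max(0.0, self.next_allowed - now)
        self.next_allowed = max(now, self.next_allowed) + 1.0 / self.rate
        return wait


# A frozen clock keeps the demonstration deterministic.
pacer = AdaptivePacer(fallback_rate=1.0, clock=lambda: 0.0)
first = [pacer.delay(), pacer.delay()]             # slots 1 second apart
pacer.observe({"x-amzn-RateLimit-Limit": "0.5"})   # limit is now 0.5 per second
second = pacer.delay()                             # slot reserved under the old rate
third = pacer.delay()                              # slots are now 2 seconds apart
print(first, second, third)  # [0.0, 1.0] 2.0 4.0
```

A response without the header leaves the current rate unchanged, which matches the guidance not to depend on the header being present.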
