Rate limits in Amazon Business APIs

Amazon Business defines usage plans to control the number of requests an application can make to a given Amazon Business API operation within a certain time period. These usage plans help ensure consistent API access and prevent the APIs from being overwhelmed.

Terminology

Note the following key terminology to understand Amazon Business API usage plans:

Rate limit: The maximum number of requests that you can make to a specific API operation per second, also called transactions per second (TPS). To avoid throttling, stay below this limit. Each API operation has its own rate limit.
Burst limit: The maximum number of requests an operation can handle concurrently before it returns a 429 error. This error means that you exceeded the rate limit for this operation.
Token bucket algorithm: The algorithm that Amazon Business uses to rate-limit requests, based on an analogy of exchanging tokens for API requests. For more information, see Rate limiting algorithm.
Throttling: The case in which Amazon Business temporarily rejects your requests when you exceed your rate limit. A throttled request results in a 429 HTTP error response. You can retry throttled requests later.
Application: The developer application that calls the Amazon Business API on behalf of a consented business customer. In Solution Provider Portal (SPP), you can find the application ID after the application name.
Party: The pairing of 1) an application and 2) a consented customer account. Rate limits are applied at the party level. For example, if your application simultaneously receives requests from two consented customers to call an API operation that has a rate limit of 1 TPS, both calls will succeed because the rate limit is allocated per party and not per application.
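The party-level example above can be pictured with a minimal sketch. It is an illustration only, not the service's implementation: the application ID, customer IDs, the `BURST` value, and the `call` function are all hypothetical, and the sketch models a single instant in time with no token refill.

```python
from collections import defaultdict

BURST = 1  # hypothetical burst limit for one operation

# One independent token count per party, keyed by (application_id, customer_id).
tokens = defaultdict(lambda: BURST)


def call(application_id, customer_id):
    """Return a simulated HTTP status for one request made by this party."""
    party = (application_id, customer_id)
    if tokens[party] >= 1:
        tokens[party] -= 1
        return 200
    return 429  # this party's bucket is empty; other parties are unaffected


# Two different customers of the same application both succeed.
first = call("app-1", "customer-A")
second = call("app-1", "customer-B")
# A second simultaneous request for customer A is throttled.
third = call("app-1", "customer-A")
```

Because each party has its own key, exhausting one customer's allowance does not affect requests made for another customer through the same application.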

Rate limiting algorithm

Amazon Business uses the token bucket algorithm to rate-limit requests to the API. The algorithm is based on the analogy of a bucket that contains tokens, where you can exchange each token for a request to the API.

In this analogy, the Amazon Business API automatically adds tokens to your bucket at a set rate per second (the rate limit) until the bucket reaches its maximum capacity. This maximum is the burst limit.

When you call an Amazon Business API operation, the API retrieves your usage plan (rate limit and burst limit) for that operation based on the access token in the request header, which identifies the caller. Each request that you make to the API subtracts a token from the bucket. Throttling occurs when you make a request for which no token is available because your bucket is empty. A throttled request results in a 429 error response. You can call the API again when another token is added to your bucket.
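The bucket analogy can be sketched in a few lines of Python. This is a generic token bucket written for illustration, not Amazon Business's actual implementation; the `TokenBucket` class, its parameter names, and the assumption that the bucket starts full are all hypothetical.

```python
import time


class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/second up to `burst`."""

    def __init__(self, rate, burst, now=time.monotonic):
        self.rate = rate            # tokens added per second (the rate limit)
        self.burst = burst          # bucket capacity (the burst limit)
        self.now = now              # injectable clock, useful for testing
        self.tokens = float(burst)  # assumption: the bucket starts full
        self.last = now()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, capped at burst.
        current = self.now()
        elapsed = current - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = current

    def try_acquire(self):
        """Spend one token if available; False models a throttled (429) request."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With `rate=1` and `burst=2`, two back-to-back requests succeed, a third is throttled, and one more request becomes possible after about one second.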

How to find your usage plan

You can find the applicable usage plan for an Amazon Business API operation in two places:

Amazon Business API documentation: Each API reference and model document includes a “Usage plan” section that defines the default rate and burst limits for this operation.
Response header: When you call an Amazon Business API operation, the x-amzn-RateLimit-Limit response header specifies the operation's rate limit for this account-application pair (the party).
- In some cases, despite a best-effort attempt, the API can't retrieve the rate limits. In this case, the API doesn't fail an otherwise valid call to the operation. Instead, the API returns the response without the x-amzn-RateLimit-Limit header.
- The x-amzn-RateLimit-Limit response header is returned only with HTTP status codes 20x, 400, and 404. Responses to unauthorized or unauthenticated requests don't include this header.
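Because the header can be absent, client code should treat it as optional. The following is a minimal sketch, assuming the response headers are available as a plain mapping of names to string values and that the header value parses as a decimal number; the function name `rate_limit_from_headers` is hypothetical.

```python
def rate_limit_from_headers(headers):
    """Return the rate limit from x-amzn-RateLimit-Limit, or None if unavailable.

    `headers` is assumed to be a mapping of header names to string values.
    Header names are matched case-insensitively.
    """
    for name, value in headers.items():
        if name.lower() == "x-amzn-ratelimit-limit":
            try:
                return float(value)  # assumption: the value is a decimal number
            except (TypeError, ValueError):
                return None
    return None  # header missing: keep using the documented default limit
```

When the function returns None, a reasonable fallback is the default rate limit from the operation's "Usage plan" section.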

Best practices

To effectively call Amazon Business APIs within their allotted rate limits, note the following best practices:

Make API calls on behalf of multiple customers in parallel rather than sequentially, because the limit applies per consented customer and not per application.
If an API call fails with a 429 error, implement a retry mechanism with exponential backoff and a slight jitter. The jitter spreads retries out so that they aren't all sent at the same time. For information about this workflow, see Retry with backoff pattern.
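The retry practice above might look like the following sketch. It is one possible approach under stated assumptions, not a prescribed implementation: `send` is a hypothetical callable that performs the request and returns a `(status_code, body)` tuple, and the attempt count, delays, and 10% jitter are illustrative defaults.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      jitter_fraction=0.1, sleep=time.sleep, rng=random.random):
    """Call send() and retry on HTTP 429 with exponential backoff plus jitter.

    send() is a hypothetical callable returning (status_code, body).
    Returns the last (status_code, body) received.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body          # success or a non-throttling error
        if attempt == max_attempts - 1:
            break                        # out of attempts; give up
        # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Slight jitter: add up to jitter_fraction of the delay at random.
        sleep(delay * (1 + jitter_fraction * rng()))
    return status, body
```

The `sleep` and `rng` parameters are injectable so that the backoff schedule can be checked without real waiting. Only 429 responses are retried here; other errors are returned to the caller unchanged.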