API rate limits


Attentive has a rate limit for the number of requests that you can send to each endpoint. This is to ensure that all our partners can reliably use our APIs and to maintain our platform’s stability.

Your access token allows you to make up to 150 requests per second for most endpoints. Some endpoints have lower rate limits:

  • Identity: 10 requests per second
  • Subscribe: 10 requests per second
  • Unsubscribe: 5 requests per second

These limits are the same for both private and public applications.

If you exceed the rate limit, you’ll receive a 429 response. To recover from 429 responses without making the overload worse, consider implementing a retry mechanism with an exponential backoff schedule. Building randomness (jitter) into your backoff schedule helps prevent your retried request from being sent at the same moment as other retried requests.
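The retry approach described above can be sketched roughly as follows. This is a minimal illustration, not Attentive's recommended implementation: `send` is a hypothetical callable that performs one request with your HTTP client of choice and returns `(status_code, headers)`, and the base delay, cap, and attempt count are assumed values that you should tune for your own traffic.

```python
import random
import time
from typing import Callable, Tuple


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Return a randomized delay in seconds for a zero-based retry attempt.

    The upper bound doubles with each attempt (exponential backoff) up to
    `cap`, and the actual delay is drawn uniformly between 0 and that bound
    ("full jitter") so that clients do not retry in lockstep.
    """
    upper_bound = min(cap, base * (2 ** attempt))
    return random.uniform(0, upper_bound)


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],  # hypothetical: one request -> (status, headers)
    max_attempts: int = 5,                 # assumed value, not an Attentive recommendation
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call `send`, retrying with jittered exponential backoff on 429 responses."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(backoff_delay(attempt))
    # Still rate limited after all attempts; let the caller decide what to do.
    return status, headers
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. In production code you would typically leave the default in place.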

Track usage

To help you monitor your API usage, every response (including responses to requests that exceeded the rate limit) contains the following headers with information on your current usage:

  • x-ratelimit-remaining: the number of requests that you can send in the next second without exceeding the current limit
  • x-ratelimit-limit: the total number of requests that you can send per second (i.e., the current rate limit)
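As a rough illustration of how these headers might be used, the sketch below reads both values from a response's headers and flags when the remaining budget is running low. It assumes the header values are plain integers, which the text above implies but does not state. The helper names and the 10% threshold are hypothetical choices.

```python
from typing import Mapping, Optional, Tuple


def read_rate_limit(headers: Mapping[str, str]) -> Optional[Tuple[int, int]]:
    """Return (remaining, limit) from response headers, or None if unavailable.

    Header names are matched case-insensitively. Values are assumed to be
    plain integers; anything else is treated as missing.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        remaining = int(lowered["x-ratelimit-remaining"])
        limit = int(lowered["x-ratelimit-limit"])
    except (KeyError, ValueError):
        return None
    return remaining, limit


def should_pause(headers: Mapping[str, str], threshold: float = 0.1) -> bool:
    """Return True when the remaining budget is at or below `threshold` of the limit."""
    usage = read_rate_limit(headers)
    if usage is None:
        return False  # no usable information, so do not throttle on a guess
    remaining, limit = usage
    return limit > 0 and remaining <= limit * threshold
```

A client could call `should_pause` after each response and briefly delay its next request when it returns `True`, rather than waiting to receive a 429.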

Testing API latency

API latency is the round-trip time for a request from the client to the server and back, including the server’s processing time. Attentive’s Product and Engineering team developed our API infrastructure using industry best practices to ensure high network deliverability and fast API response times. To test Attentive’s API latency, we recommend using Postman or another API testing tool.

We suggest testing API calls to a dummy phone number. Although a request that uses a dummy number returns an error code, its response time still lets you measure API latency.

We recommend the following testing scenarios:

  • Baseline latency: This scenario establishes a baseline for API latency under normal operating conditions. By sending a single request at a time to the API endpoint and measuring the time it takes for the response to return, you can determine the typical latency experienced by users.
  • Concurrent requests: This scenario simulates real-world conditions where multiple users or systems access the API simultaneously. By sending multiple requests concurrently and measuring their latency, you can evaluate how well the API performs under load and whether latency increases with higher concurrency.
  • Variable payload size: This scenario simulates requests of varying payload sizes to evaluate how the amount of data transmitted affects the API's performance. By sending requests with small, medium, and large payloads, you can evaluate if there's any correlation between payload size and API latency.
  • Network latency simulation: This scenario introduces artificial network latency, allowing you to evaluate how the API performs under less-than-ideal network conditions. By delaying the transmission of packets between the client and server, you can measure the effect of network latency on API latency and identify any potential performance issues.
  • Error handling: This scenario tests the API's error handling capabilities by intentionally triggering error responses. This can include authentication failures, sending requests with invalid parameters, or sending requests to non-existent endpoints. By measuring the latency of error responses, you can confirm that the API handles errors without a significant impact on overall latency.
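If you prefer a script to a GUI tool, the baseline and concurrent scenarios above could be measured along these lines. This is only a sketch under stated assumptions: `send` is a hypothetical zero-argument callable that performs one API request with your own HTTP client and credentials, and the sample counts, concurrency level, and summary statistics are illustrative choices.

```python
import math
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def time_call(send: Callable[[], object]) -> float:
    """Return the round-trip time of one call to `send`, in seconds."""
    start = time.perf_counter()
    send()
    return time.perf_counter() - start


def baseline_latency(send: Callable[[], object], samples: int = 20) -> List[float]:
    """Baseline scenario: send one request at a time and time each one."""
    return [time_call(send) for _ in range(samples)]


def concurrent_latency(
    send: Callable[[], object], concurrency: int = 10, samples: int = 50
) -> List[float]:
    """Concurrent scenario: keep up to `concurrency` requests in flight at once."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(lambda _: time_call(send), range(samples)))


def summarize(latencies: List[float]) -> Dict[str, float]:
    """Return the median, 95th percentile (nearest rank), and maximum latency."""
    ordered = sorted(latencies)
    p95_index = max(0, math.ceil(0.95 * len(ordered)) - 1)
    return {
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "max": ordered[-1],
    }
```

Comparing `summarize(baseline_latency(send))` with `summarize(concurrent_latency(send))` shows whether latency grows under load. Keep the request rate below the limits listed earlier so that 429 responses do not distort the measurements.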