How to navigate the maze of security API data limits

June 7, 2023
Santiago Castineira
Shane Morton

In the wide world of security data lakes, no two data sources are quite alike. Integrating a new tool into a security data lake requires working within the constraints of that tool’s API. Each API comes with its own set of rules and limits for data requests, and navigating these limits can be frustrating. Any team that sets out to build their own security data lake will quickly become familiar with the cumbersome process of building, testing, scaling up, breaking, rebuilding, and retesting each connector.

In a perfect world, there would be perfect documentation of each of these limits for every API, explaining exactly how each works. Unfortunately, API documentation rarely maps in a straightforward way to a security data lake use case, and it can be hard to understand how those limitations will get in your way. At Monad, we’ve identified and built connectors capable of scaling through, over, and around a wide variety of API restrictions, and have found that most fall into categories that can be conquered with some specific workarounds.

Rate Limits

Rate limits on APIs are designed to prevent any one user from consuming too much server capacity and impeding the experience of other API users. They slow data-hungry callers so everyone else calling the API can get the data they need. If you make a request to an API and receive a 429 Too Many Requests status code, you’ve hit a rate limit. The response will likely include a header that tells you what the API’s rate limit is.

It is standard practice for a 429 response to include a header specifying how long your client should back off before retrying. Unfortunately, this header typically lists a static value that doesn’t reflect how long you really need to back off (often because 429s are based on server load, which the server-side API code can’t predict either).

A common way to handle this is to start by backing off for the time suggested by the API, and then, if the next request also returns a 429, switch to backing off exponentially.

You can also build an error-handling mechanism into your connector in advance, to ensure that if you eventually encounter a 429 status code, you collect the data necessary to update your connector to accommodate it.
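Here’s a minimal Python sketch combining both ideas; the retry cap and default delay are illustrative placeholders, and a production connector would also handle Retry-After values sent as HTTP dates:

```python
import random
import time

import requests


def get_with_backoff(url, max_retries=5, default_delay=1.0):
    """Fetch a URL, honoring Retry-After on the first 429 and
    backing off exponentially (with jitter) on repeated 429s."""
    delay = default_delay
    for attempt in range(max_retries):
        response = requests.get(url)
        if response.status_code != 429:
            response.raise_for_status()
            return response

        # Record the 429 details so the connector can be tuned later.
        print(f"429 on attempt {attempt + 1}: headers={dict(response.headers)}")

        if attempt == 0 and "Retry-After" in response.headers:
            # First failure: trust the server's suggested backoff period.
            delay = float(response.headers["Retry-After"])
        else:
            # The suggested value wasn't enough; back off exponentially,
            # with jitter so parallel workers don't retry in lockstep.
            delay = delay * 2 + random.uniform(0, 1)
        time.sleep(delay)

    raise RuntimeError(f"Still rate limited after {max_retries} attempts")
```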

Concurrency Limits

Concurrency limits refer to the maximum number of simultaneous or concurrent requests an API can handle from a client. Like rate limits, concurrency limits are put in place to protect the API server from being overwhelmed by too many requests.

Concurrency limits are typically spelled out in the API’s documentation. If not, they can be tricky to identify: running up against one will likely produce a 429 error similar to a rate limit, and it takes a more careful examination of the error data to determine that concurrent requests triggered the limit, rather than requests over time.

The most common solution for a concurrency limit is a throttled job queue. Throttle the maximum number of requests your connector makes to the API to the exact concurrency limit (it may be a single request, or an arbitrary number, like five). Then stack requests so that the completion of one triggers the next, keeping each batch under the concurrency limit. If you also know the cooldown time of the API’s rate limit, add a wait-and-retry mechanism that waits until the next cooldown after a failed request, as a backup in case you hit either the rate or the concurrency limit.
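Here’s a minimal sketch of such a queue in Python, using asyncio and the third-party aiohttp library; the concurrency limit of five and the 60-second cooldown are illustrative values, not taken from any particular API:

```python
import asyncio

import aiohttp

CONCURRENCY_LIMIT = 5   # illustrative; match your API's documented limit
COOLDOWN_SECONDS = 60   # illustrative; match the API's rate-limit cooldown


async def fetch(session, semaphore, url):
    # The semaphore keeps at most CONCURRENCY_LIMIT requests in flight;
    # each completed request releases a slot, which starts the next one.
    async with semaphore:
        async with session.get(url) as response:
            if response.status == 429:
                # Backup plan: wait out the cooldown, then retry once.
                await asyncio.sleep(COOLDOWN_SECONDS)
                async with session.get(url) as retry:
                    retry.raise_for_status()
                    return await retry.json()
            response.raise_for_status()
            return await response.json()


async def run(urls):
    semaphore = asyncio.Semaphore(CONCURRENCY_LIMIT)
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(
            *(fetch(session, semaphore, url) for url in urls)
        )
```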

Request Signature Over Time Limits

A limit by Request Signature Over Time is less common than a limit by rate or concurrency, but it can be harder to identify and navigate. The "signature" of a request refers to the characteristics of the request itself, like the endpoint it’s requesting from, the parameters it’s sending, or the headers it includes. A limitation on the request signature over time means that the API provider doesn’t want to serve the exact same request too frequently.

API documentation rarely mentions a Request Signature Over Time limitation. You may discover this limit when you start seeing error responses to requests that were previously successful. The error message might say something like "Duplicate request" or "This request has been made recently", or the API may simply return a 403 (Forbidden). You may even receive a 429 (Too Many Requests) despite knowing that you’re well within rate and concurrency limits. Always read error responses closely, as they will likely include valuable detail that can help you identify the path forward.

Navigating a Request Signature Over Time limitation can be difficult. You might need to vary the structure of the request to get around it: rotate the requesting IP address, add random intervals between requests, or experiment with modifying non-functional parts of the request, like the referrer and user-agent headers. If your request structure cannot vary, you will need to identify the minimum interval at which duplicate requests are allowed, and sequence your requests to that interval.
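Once you’ve worked out that interval empirically, a small per-signature tracker can enforce it. In this Python sketch, the 60-second interval and the choice of method, URL, and parameters as the signature are assumptions; fingerprint whatever the API actually keys on:

```python
import time

# Minimum seconds between identical requests; a placeholder value,
# discovered empirically, since this limit is rarely documented.
DUPLICATE_INTERVAL = 60.0

_last_sent: dict[tuple, float] = {}


def wait_for_signature(method, url, params=None):
    """Block until it's safe to repeat a request with this signature."""
    # Here the signature is the method, endpoint, and sorted parameters;
    # adjust to match whatever the API fingerprints.
    signature = (method, url, tuple(sorted((params or {}).items())))
    last = _last_sent.get(signature)
    if last is not None:
        remaining = DUPLICATE_INTERVAL - (time.monotonic() - last)
        if remaining > 0:
            time.sleep(remaining)
    _last_sent[signature] = time.monotonic()
```

Call wait_for_signature just before each request so duplicate requests are automatically spaced out to the interval.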

Export Size Limits

An export size limit is a cap on the maximum amount of data (often expressed as a count of items, entries, or records, like 100k) that an API allows to be exported in a single request or operation. These limits are designed to prevent the transmission of overly large data sets in a single request, which can strain the server or degrade performance.

Export size limits are typically noted in the API documentation, but if they aren’t, you’ll discover them when the API responds with an error message indicating that your request was too large. The specific status code might vary, but it’s often a 413 (Request Entity Too Large) or a 400 (Bad Request) with a header indicating that your request exceeds the allowable limit.

There are a few ways to get around an export size limit. You can try pagination: instead of exporting all the data you need in a single request, break it down into several smaller ones. Request a portion of the data (a "page"), then make additional requests for the next pages until you’ve retrieved all the data. If the API allows for it, you can filter the columns in the request, pulling only what you need. And some APIs allow you to export the data in different formats, so you can also try pulling it in a more compact format, like CSV rather than JSON.
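To make the pagination approach concrete, here’s a Python sketch assuming cursor-style pagination; the limit, cursor, next_cursor, and items names are hypothetical field names, and offset-based APIs follow the same loop shape:

```python
import requests


def fetch_all(url, page_size=1000):
    """Pull a large export as a series of smaller pages."""
    items, cursor = [], None
    while True:
        # Hypothetical parameter names; substitute whatever the API uses.
        params = {"limit": page_size}
        if cursor:
            params["cursor"] = cursor
        response = requests.get(url, params=params)
        response.raise_for_status()
        payload = response.json()
        items.extend(payload["items"])
        cursor = payload.get("next_cursor")
        if not cursor:  # no token means this was the last page
            return items
```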

Specific Sequence of Actions Limits

Some APIs require requesters to perform certain actions in a specific order to successfully extract data. This limit is typically implemented to ensure a logical flow of data, to enhance security, or to maintain the consistency and integrity of the underlying application’s operation. Identifying these sequences typically requires a careful reading of the API’s documentation. It may specify a certain order of operations, like "First, create a report; then, wait for report generation to complete before retrieving the report data." If you don’t follow the specified sequence, the API will return an error message.

The error code could be a 400 (Bad Request), or it may be a more specific error code, depending on the API. The response message might indicate something like "Report not ready" or "You must create a report before retrieving data."

Specific Action Sequence limits can be a pain to build around, and the sequential steps behind each request mean the data can take some time to arrive, but it’s worth the effort to get the information you need. To navigate a specific action sequence limitation, you’ll need to design your connector to follow the sequence precisely: make a request, wait for the response, make the next request, and so on. For processes that take a while to complete (like generating a report), the API may provide an endpoint to check the status of the operation. You may be able to poll this endpoint and build a mechanism into the connector that only proceeds to the next operation when the status indicates it’s okay to do so. Some APIs also support asynchronous operations, where you can start a long-running operation (like generating a report), then do other things and come back to check the operation’s result later. This can help you work around some sequence limitations while optimizing your data extraction process.
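Here’s a Python sketch of the create-poll-retrieve pattern against a hypothetical report API; the endpoints, status values, and 10-second poll interval are all placeholders:

```python
import time

import requests

BASE = "https://api.example.com"  # placeholder endpoints throughout


def export_report(session):
    # Step 1: create the report; the API returns an ID to poll.
    report_id = session.post(f"{BASE}/reports").json()["id"]

    # Step 2: poll the status endpoint until generation completes.
    while True:
        status = session.get(f"{BASE}/reports/{report_id}/status").json()["status"]
        if status == "completed":
            break
        if status == "failed":
            raise RuntimeError(f"Report {report_id} failed to generate")
        time.sleep(10)  # poll interval; tune to the API's guidance

    # Step 3: only now is it valid to retrieve the report data.
    return session.get(f"{BASE}/reports/{report_id}/data").json()
```

In practice you’d also cap the total polling time, so a stuck report doesn’t hang the connector.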

When integrating with APIs that have Specific Action Sequence limitations, it can also be worth implementing robust error handling in your connector. If you receive an error indicating that you’ve performed an action out of sequence, have your connector read the error and perform the required preceding action.

Software Development Kit (SDK) Limitations

An API provider may strongly recommend (or even require) the use of their own SDK to gather security data; the idea behind this requirement is to streamline the integration process and provide a standard, optimized way to interact with the product. However, these libraries might not support every use case you have in mind or the scale you need, or they may only be provided for languages outside your technology stack. When a specific use case isn’t supported by the SDK, there might not be an explicit error message; you might simply find that there’s no way to do what you need with the methods the SDK provides.

You may be able to extend the SDK with your own code to support your specific use case. This could involve creating a subclass, adding a function, or even modifying the SDK source code. You might use the SDK for most of your product interactions but fall back to direct HTTP API calls for any use cases not covered by the SDK.
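For example, a subclass might wrap the SDK client and drop down to raw HTTP for one missing capability. Everything in this sketch is hypothetical: the vendor_sdk package, its Client class and api_token attribute, and the /v1/findings endpoint are stand-ins for whatever your vendor actually ships:

```python
import requests

# Hypothetical SDK; substitute the package your vendor provides.
from vendor_sdk import Client


class ExtendedClient(Client):
    """Use the SDK where it works, raw HTTP where it doesn't."""

    def get_findings(self, since):
        # The (hypothetical) SDK has no method for this endpoint, so
        # reuse its credentials to make the HTTP call directly.
        response = requests.get(
            "https://api.vendor.example/v1/findings",
            headers={"Authorization": f"Bearer {self.api_token}"},
            params={"since": since},
        )
        response.raise_for_status()
        return response.json()
```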

The work of building robust security data pipelines is not for the faint of heart. Nor is it for the impatient. If all of the above methods fail, you may need to work directly with the vendor to get the functionality you need. This is typically a last resort; SDK and API functionality is rarely a priority for security tool providers, so there’s no guarantee you can get what you need if their API is too difficult to work with. And managing close technical relationships with your security vendors is rarely a desirable workflow when your goal is efficiency. Even if you do scoot under the rate, concurrency, and export size limits, follow the specific sequence of actions, vary your request signatures, and push the library to its limits, all you’ve got is raw data; you still need to transform it into a coherent format before you can start working with it.

But let’s say you don’t want to wait.

Let’s say you want to get key data from tools like Wiz, Snyk, Tenable, or CrowdStrike transformed into an ops-ready format and delivered to your data warehouse, and you want it in hours, rather than months.

If that’s the case, Monad is for you. We’ve already built these connectors, and we push them to the limit with hundreds of millions of events every day. We also manage and maintain relationships with the providers of the security tooling you’re already using, so that when their APIs don’t work as described or simply don’t export the data you need, we work with them to get it done.

Reach out to hello@monad.com, and let’s get started.