The "happy path" is the way things should work in an ideal world, where nothing goes wrong. It would be like me thinking "I can just ride my bike to work in New York City without two taxi drivers, a truck, a family unit of tourists, cops on horses, a dog walker, and a raccoon all trying to kill me."
When API client developers start out they are full of optimism, and spend most of their time on the happy path as they race to complete all the functionality they’ve been asked to build.
Over time API client developers step on enough rakes of their own making that they become hardened, almost paranoid, and it’s not unreasonable to feel this way.
There are so many things that can go wrong whenever you "go over the wire" that it’s best to assume an API interaction is not going to work, then be pleasantly surprised when it does. Your code should reflect this mindset.
Let’s look at a simple bit of code that’s likely to be in a codebase near you already.
link:code/ch06-connection-problems/01-happy-path-only.js[role=include]
This code assumes the server is up, that the request is going to get a response in a reasonable time-frame (with no definition of what that is), that it’s actually talking to the server you thought it was, and that the server is definitely returning JSON!
Even assuming the server exists, responds in decent time, and the data is in fact JSON, there are a few more issues: we’ve presumed the pokemon was definitely found (no typos or missing data), and confidently tied our code to an exact JSON structure, with no checks that the data exists, or that it is of the types we expect.
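To make that last point concrete, here is a rough sketch of the sort of structure checking that is missing. The "name" and "types" properties are hypothetical stand-ins for whatever your API actually returns.

// Inside whatever async function makes the request. The "name" and "types"
// properties are hypothetical stand-ins for your API's actual payload.
const data = await response.json();

if (typeof data?.name !== "string" || !Array.isArray(data?.types)) {
  console.error("Response was missing the fields we expected", data);
  return null;
}

console.log(`Found ${data.name}`);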
Coding API interactions the happy path way may work for long enough to close out your tickets and go home, but you’ll end up with a lot more tickets piling up when various things go wrong.
For a client-server interaction to work, clearly both the client and the server will need to have an Internet connection at some point, but it might not always be there.
Even in situations where there is a strong degree of confidence, things could cut out momentarily, or for a while. How will your application cope if it does?
Mobile users in the city might hop on a subway, or go into an old building with a Superman-proof lead roof.
Mobile users in the countryside might be out in the woods trying to use your app to get directions home, or maybe they’re just trying to send selfies to make their friends jealous.
Laptop users might be in the middle of an important process when their laptop decides to hop from the coffee shop WiFi to the bar next door’s WiFi, dropping the connection mid-request and trashing any in-flight API calls. That might take a second, or it might take longer if the new network requires a login screen.
* Maybe the API has "rate limiting" and decided your client is too chatty.
* The API implemented a redirect and the HTTP client doesn’t know to follow it.
* The API’s errors are coming out in HTML because nobody remembered to catch them and return JSON.
* HTTP status code 500 is popping up from the server, or some other network component like a cache server, even though the documentation said a 500 error would never happen.
* The API deployment wasn’t set up with zero-downtime deployments, and a code push has caused some downtime.
* The API is relying on an upstream API which is deploying, or generally having a wobble, and that’s caused some temporary instability.
* AWS is down again.
* An ISP (Internet Service Provider) blocked a domain, or a keyword in the request/response.
* Squirrels attacked the data center?
"A frying squirrel took out half of our Santa Clara data center two years back," Christian said, noting squirrels' propensity to interact with electrical equipment, with unfortunate results. If you enter “squirrel outage” in either Google News or Google web search, you’ll find a lengthy record of both recent and historic incidents of squirrels causing local power outages.
-- Surviving Electric Squirrels and UPS Failures, Data Center Knowledge, 2012
* Ships trash undersea cables by dropping anchor right on The Internet?!
All of these problems are going to cause the happy path to get messy.
Let’s harden our code one step at a time.
link:code/ch06-connection-problems/02-catch-fetch-errors.js[role=include]
This little change solves some of these problems. If there is any sort of connection failure, such as a refused connection (no internet, server is down, etc.), a dropped connection (failed part way through), or a certificate error, it will all be caught by that first exception handler.
Whatever happens, it will log something to the console and return early. You could imagine this code doing something clever to update the user interface, but for now we’re keeping it simple.
This is a step in the right direction, but once we’ve eventually got a response there are a lot of other things that can go wrong. What if the response is randomly HTML instead of JSON? Or it’s weirdly invalid JSON?
link:code/ch06-connection-problems/03-catch-json-errors.js[role=include]
Great! Now when the API randomly squirts some unexpected HTML error at you, the function will just return an empty array, and an error is logged for the developers to go digging into.
Another step in the right direction, but this still assumes we actually get a response in a reasonable timeframe.
What if you’ve been waiting for thirty seconds?
What if you’ve been waiting for two minutes?
We will deep dive into timeouts later on in the book, but a really helpful bit of quick defensive coding is to make sure your application isn’t spending two minutes doing absolutely nothing for a request that normally takes less than half a second.
link:code/ch06-connection-problems/04-timeout.js[role=include]
Most of the time you are developing against an API that works just fine, which means you cannot easily test these complicated unhappy paths.
To simulate the sort of nonsense you are coding to defend against, take a look at Toxiproxy by Shopify.
Another common situation to run into is rate limiting, which is basically the API telling your API client to calm down a bit, and cut down how many requests are being made. The most basic rate limiting strategy is often "clients can only send X requests per second."
Many APIs implement rate limiting to ensure relative stability when unexpected things happen. If for some reason one client causes a spike in traffic, the API has to continue running smoothly for other users instead of crashing. A misbehaving (or malicious) script could be hogging resources, or the API systems could be struggling and need to cut down the rate limit for "lower priority" traffic. Sometimes it is just because the company providing the API has grown beyond their wildest dreams, and wants to charge money for increasing the rate limit of high-capacity users.
Often the rate limit will be associated with an API key or access token, so it can be tied to a specific account. Our friends over at Nordic APIs very nicely explain some other rate limiting strategies:
Server rate limits are a good choice as well. By setting rates on specific servers, developers can make sure that common use servers, such as those used to log in, can handle a lot more requests than specialized or seldom used servers, such as data conversion devices.
Finally, the API developer can implement regional data limits, which limit calls by region. This is especially useful when implementing behavior-based limiting; for instance, a developer would expect the number of requests during midnight in North America to be lower than the baseline daytime rate, and any behavior contrary to this without a good reason would suggest questionable activity. By limiting the region for a period of time, this can be prevented.
-- https://nordicapis.com/stemming-the-flood-how-to-rate-limit-an-api/
All fair reasons, but for the client it can be a little pesky.
There are a lot of ways to go about throttling your API calls, and it very much depends on where the calls are being made from. One of the hardest things to limit is API calls to a third party being made directly from the client. For example, if your iOS/web/etc. clients are making Google Maps API calls directly from the application, there is very little you can do to throttle that. You’re just gonna have to pay for the appropriate usage tier for however many users you have.
Other setups can be a little easier. If the rate limited API is being spoken to via some sort of backend process, and you control how many of those processes there are, you can limit how often that function is called in the backend code.
For example, if you are hitting an API that allows only 20 requests per second, you could have 1 process that allows 20 requests per second to pass through. If this process is handling things synchronously that might not quite work out, and you might need to have something like 4 processes handling 5 requests per second each, but you get the idea.
If this process were implemented in NodeJS, you could use Bottleneck.
const Bottleneck = require("bottleneck");

// Never more than 5 requests running at a time.
// Wait at least 1000ms between each request.
const limiter = new Bottleneck({
  maxConcurrent: 5,
  minTime: 1000
});

// pokedex here is whatever API client you are already using
const fetchPokemon = id => {
  return pokedex.getPokemon(id);
};

limiter.schedule(fetchPokemon, id).then(result => {
  /* ... */
});
Ruby users who are already using tools like Sidekiq can add plugins like Sidekiq::Throttled, or pay for Sidekiq Enterprise, to get rate limiting functionality. Worth every penny in my book.
Every language will have some sort of throttling or job queue limiting tooling, but you will need to go a step further. Doing your best to avoid hitting rate limits is a good start, but nothing is perfect, and the API might lower its limits for some reason.
The appropriate HTTP status code for rate limiting has been argued over about as much as "tabs" versus "spaces", but there is a clear winner now: RFC 6585 defines it as HTTP 429.
Some APIs, like Twitter’s old API, existed for a few years before this standard, and they chose "420 - Enhance Your Calm". Twitter has since dropped 420 and got on board with the standard 429. Unfortunately some APIs replicated the 420 approach and have not yet switched over to the standard, so you might see either a 429 or a lingering 420.
Google also got a little "creative" with their status code utilization: for a long time they were using 403 for their rate limiting, and I don’t know if they still are. Bitbucket is still using 403 in its Server REST API.
Actions are usually "forbidden" if they involve breaching the licensed user limit of the server, or degrading the authenticated user’s permission level. See the individual resource documentation for more details.
-- https://docs.atlassian.com/bitbucket-server/rest/5.12.3/bitbucket-rest.html
GitHub v3 API has a 403 rate limit too:
HTTP/1.1 403 Forbidden
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1377013266
{
  "message": "API rate limit exceeded for xxx.xxx.xxx.xxx. (But here's the good news: Authenticated requests get a higher rate limit. Check out the documentation for more details.)",
  "documentation_url": "https://developer.github.com/v3/#rate-limiting"
}
Getting a 429 (or a 420) is a clear indication that a rate limit has been hit, but a 403 combined with an error code, or some telltale HTTP headers, is also worth checking for.
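Pulled together, a detection sketch might look something like this. The 403 branch is GitHub-flavored, and other APIs will need their own error code or header check.

// A sketch of spotting rate limit responses with fetch(). The 403 branch is
// GitHub-flavored; adapt it to whatever error format your particular API uses.
function isRateLimited(response) {
  if (response.status === 429 || response.status === 420) {
    return true;
  }
  if (response.status === 403 &&
      response.headers.get("X-RateLimit-Remaining") === "0") {
    return true;
  }
  return false;
}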
Either way, when you’re sure it’s a rate limit error, you can move onto the next step: figuring out how long to wait before trying again.
There are three main ways a server might communicate retry logic to you.
The Retry-After header is a handy standard way to communicate "this didn’t work now, but it might work if you retry in <the future>".
Retry-After: <http-date>
Retry-After: <delay-seconds>
The logic for how it works is defined in RFC 7231 (the HTTP/1.1 semantics spec), and RFC 6585 (the RFC that introduced HTTP 429) suggests including it in rate limiting responses. Basically it might look a bit like this:
HTTP/1.1 429 Too Many Requests
Retry-After: 60
Content-Type: application/json
{
  "error": {
    "message": "API rate limit exceeded for xxx.xxx.xxx.xxx.",
    "link": "https://developer.example.com/#rate-limiting"
  }
}
You might also see a Retry-After showing you an HTTP date:
Retry-After: Sat, 15 Apr 2023 07:28:00 GMT
Same idea, it’s just saying "please don’t come back before this time".
By checking for these errors, you can catch and retry (or re-queue) requests that have failed. If that is not an option, try sleeping for a bit to calm your workers down.
WARNING: Make sure your sleep does not block your background processes from processing other jobs. This can happen in languages where sleep puts the whole process to sleep, and that process is running multiple types of job on the same thread. Don’t back up your whole system with an overzealous sleep!
Some HTTP clients, like Ruby’s Faraday, are aware of Retry-After and use it to power their built-in retry logic, but other HTTP clients might need some training.
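If your client needs that training, a minimal sketch might look like this, using fetch with a single retry. Real code would cap the wait time and the number of retries.

// A minimal sketch of honoring Retry-After with fetch(). Real code would cap
// the wait time and the number of retries instead of retrying exactly once.
async function fetchWithRetry(url, options = {}) {
  const response = await fetch(url, options);
  if (response.status !== 429) {
    return response;
  }
  // Retry-After is either delay-seconds or an HTTP date; handle both.
  const retryAfter = response.headers.get("Retry-After") || "1";
  const seconds = /^\d+$/.test(retryAfter)
    ? parseInt(retryAfter, 10)
    : Math.max(0, (new Date(retryAfter) - Date.now()) / 1000);
  await new Promise(resolve => setTimeout(resolve, seconds * 1000));
  return fetch(url, options);
}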
Some APIs, like GitHub v3, use proprietary headers, all beginning with `X-RateLimit-`. These are not at all standard (you can tell by the `X-`), and could be very different from whatever API you are working with.
Successful requests to GitHub will show how many requests are remaining, so keep an eye on those, and try to avoid making requests if the remaining amount on the last response was 0.
$ curl -i https://api.github.com/users/octocat
HTTP/1.1 200 OK
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 56
X-RateLimit-Reset: 1372700873
You can use a shared key (maybe in Redis or similar) to track that, and have it expire at the reset time provided in `X-RateLimit-Reset` (a UTC epoch timestamp).
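As a sketch of that idea, assuming the node-redis client and the GitHub-style headers above (the key name is made up):

// A sketch of sharing rate limit state across workers via Redis, assuming
// the node-redis client. Call `await redis.connect()` once at startup.
const { createClient } = require("redis");
const redis = createClient();

async function recordRateLimit(response) {
  const remaining = response.headers.get("X-RateLimit-Remaining");
  const reset = parseInt(response.headers.get("X-RateLimit-Reset"), 10);
  if (remaining !== null && !Number.isNaN(reset)) {
    // EXAT expires the key at the reset time, so stale counts clean themselves up
    await redis.set("github:ratelimit:remaining", remaining, { EXAT: reset });
  }
}

async function shouldMakeRequest() {
  const remaining = await redis.get("github:ratelimit:remaining");
  return remaining === null || parseInt(remaining, 10) > 0;
}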
The benefit of the proprietary headers is that you get a lot more information to work with, letting you know you’re approaching a limit so you can pre-emptively back off, instead of standing on the rake and only reacting after it whacks you round the head.
There’s an IETF RFC draft called "RateLimit header fields for HTTP" that aims to give you the best of both worlds, and maybe you’ll run into something that resembles this in the distant future of 2024 or 2025.
RateLimit-Limit: 100
RateLimit-Remaining: 50
RateLimit-Reset: 50
This says there is a limit of 100 requests in the quota, the client has 50 remaining, and it will reset in 50 seconds. Handy!
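If that draft lands, client code could stop caring which API it is talking to. A speculative sketch (the header names come from the draft and may still change):

// A speculative sketch of reading the draft RateLimit headers. The draft is
// not final, so treat these names as subject to change.
function parseRateLimit(response) {
  return {
    limit: parseInt(response.headers.get("RateLimit-Limit"), 10),
    remaining: parseInt(response.headers.get("RateLimit-Remaining"), 10),
    resetSeconds: parseInt(response.headers.get("RateLimit-Reset"), 10),
  };
}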