Updated examples
darrelmiller committed Sep 29, 2024
1 parent 7a1c8d1 commit 82e1a15
Showing 1 changed file with 43 additions and 89 deletions.
132 changes: 43 additions & 89 deletions draft-ietf-httpapi-ratelimit-headers.md
Expand Up @@ -360,7 +360,7 @@ If a response contains both the RateLimit and Retry-After fields, the Retry-Afte

This specification does not mandate a specific throttling behavior and implementers can adopt their preferred policies, including:

- slowing down or preemptively back-off their request rate when
- slowing down or pre-emptively backing off their request rate when
approaching quota limits;
- consuming all the quota according to the exposed limits and then waiting.
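
The two policies listed above can be sketched on the client side as follows. This is only an illustration under stated assumptions, not something the draft defines: the function name `delay_before_next_request` and the `strategy` values `"pace"` and `"burst"` are hypothetical, `remaining` and `reset` stand for the `r` and `t` parameter values of a RateLimit field, and each request is assumed to cost one quota unit.

```python
def delay_before_next_request(remaining: int, reset: int, strategy: str = "pace") -> float:
    """Return how many seconds to wait before sending the next request.

    remaining: quota units left (the `r` parameter).
    reset: seconds until the quota resets (the `t` parameter).
    strategy: "pace" spreads the remaining quota over the time left;
              "burst" spends the quota immediately and waits only once it is gone.
    Both strategy names are hypothetical labels for this sketch.
    """
    if remaining <= 0:
        # Quota exhausted: under either strategy, wait for the reset.
        return float(reset)
    if strategy == "burst":
        return 0.0
    if strategy == "pace":
        # Spread the remaining units evenly across the time left.
        return reset / remaining
    raise ValueError(f"unknown strategy: {strategy}")
```

With `remaining=10` and `reset=50`, the pacing strategy waits 5 seconds between requests, while the burst strategy sends immediately until `remaining` reaches 0.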

Expand Down Expand Up @@ -562,11 +562,13 @@ may reply with different throttling headers.

# Examples

## Unparameterized responses
## Responses without defining policies

Some servers may not expose the policy limits in the RateLimit-Policy header field. Clients can still use the RateLimit header field to throttle their requests.

### Throttling information in responses

The client exhausted its service-limit for the next 50 seconds.
The client exhausted its quota for the next 50 seconds.
The limit and time window are communicated out-of-band.

Request:
Expand All @@ -592,7 +594,6 @@ the response status code,
a subsequent request is not required to fail.
The example below shows that the server decided to serve the request
even if remaining keyword value is 0.
even if remaining keyword value is 0.
Another server, or the same server under other load conditions, could have decided to throttle the request instead.

Request:
Expand All @@ -613,11 +614,9 @@ RateLimit: default;r=0;t=48
{"still": "successful"}
~~~
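
A client that wants to act on a value such as `default;r=0;t=48` first has to split it into a policy name and its parameters. The sketch below is a deliberately simplified parser, assuming only comma-separated items with `key=value` parameters as in the examples here; it does not implement the full Structured Fields grammar, and the function name `parse_ratelimit` is hypothetical.

```python
def parse_ratelimit(value: str) -> dict:
    """Parse a RateLimit field value into {policy name: {parameter: value}}.

    Simplified sketch: items are separated by commas, parameters by
    semicolons. Integer parameter values are converted to int; anything
    else is kept as the raw string.
    """
    policies = {}
    for item in value.split(","):
        name, *params = [part.strip() for part in item.split(";")]
        name = name.strip('"')  # tolerate a quoted policy name
        if not name:
            continue  # skip empty items
        parsed = {}
        for param in params:
            key, _, raw = param.partition("=")
            try:
                parsed[key] = int(raw)
            except ValueError:
                parsed[key] = raw
        policies[name] = parsed
    return policies
```

For the response above, `parse_ratelimit("default;r=0;t=48")` yields a single `default` entry with `r` equal to 0 and `t` equal to 48, which the client can treat as advisory because the server may still serve the next request.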

### Use in conjunction with custom fields {#use-with-custom-fields}
### Multiple policies in response

The server uses two custom fields,
namely `acme-RateLimit-DayLimit` and `acme-RateLimit-HourLimit`
to expose the following policy:
The server uses two different policies to limit the client's requests:

- 5000 daily quota units;
- 1000 hourly quota units.
Expand All @@ -630,8 +629,9 @@ the closest limit to reach is the daily one.
The server then exposes the RateLimit header fields to
inform the client that:

- it has only 100 quota units left;
- the window will reset in 10 hours.
- it has only 100 quota units left in the daily quota and the window will reset in 10 hours;

The server MAY choose to omit returning the hourly policy as it uses the same quota units as the daily policy and the daily policy is the one that is closest to being exhausted.

Request:

Expand All @@ -646,10 +646,7 @@ Response:
~~~ http-message
HTTP/1.1 200 Ok
Content-Type: application/json
acme-RateLimit-DayLimit: 5000
acme-RateLimit-HourLimit: 1000
RateLimit: dayLimit;r=100;t=36000
RateLimit: hourLimit;r=25;t=700

{"hello": "world"}
~~~
Expand All @@ -663,7 +660,6 @@ in case of saturation, thus increasing availability.
The server adopted a basic policy of 100 quota units per minute,
and in case of resource exhaustion adapts the returned values
reducing both limit and remaining keyword values.
reducing both limit and remaining keyword values.

After 2 seconds the client consumed 40 quota units

Expand All @@ -687,8 +683,7 @@ RateLimit: basic;r=60;t=58
~~~

At the subsequent request - due to resource exhaustion -
the server advertises only `remaining=20`.
the server advertises only `remaining=20`.
the server advertises only `r=20`.

Request:

Expand Down Expand Up @@ -743,7 +738,7 @@ RateLimit: default;r=0;t=5
}
~~~

## Parameterized responses
## Responses with defined policies

### Throttling window specified via parameter

Expand Down Expand Up @@ -812,8 +807,7 @@ RateLimit: dynamic;r=9;t=50

Continuing the previous example, let's say the client waits 10 seconds and
performs a new request which, due to resource exhaustion, the server rejects
and pushes back, advertising `remaining=0` for the next 20 seconds.
and pushes back, advertising `remaining=0` for the next 20 seconds.
and pushes back, advertising `r=0` for the next 20 seconds.

The server advertises a smaller window with a lower limit to slow
down the client for the rest of its original window after the 20 seconds elapse.
Expand Down Expand Up @@ -881,7 +875,7 @@ query again the server even if it is likely to have the request rejected.

### Missing Remaining information

The server does not expose remaining keyword values
The server does not expose remaining values
(for example, because the underlying counters are not available).
Instead, it resets the limit counter every second.

Expand All @@ -901,7 +895,7 @@ Response:
~~~ http-message
HTTP/1.1 200 Ok
Content-Type: application/json
RateLimit-Policy: quota;l=100
RateLimit-Policy: quota;l=100;w=1
RateLimit: quota;t=1

{"first": "request"}
Expand Down Expand Up @@ -969,7 +963,7 @@ RateLimit: day;r=100;t=36000

1. Why defining standard fields for throttling?

To simplify enforcement of throttling policies.
To simplify enforcement of throttling policies and enable clients to constrain their requests to avoid being throttled.

2. Can I use RateLimit header fields in throttled responses (eg with status code 429)?

Expand All @@ -979,27 +973,13 @@ RateLimit: day;r=100;t=36000

No. {{?RFC6585}} defines the `429` status code and we use it just as an example of a throttled request,
that could instead use even `403` or whatever status code.
The goal of this specification is to standardize the name and semantic of three RateLimit header fields
widely used on the internet. Stricter relations with status codes or error response payloads
would impose behaviors to all the existing implementations making the adoption more complex.

4. Why is the partition key necessary?

4. Why don't pass the throttling scope as a parameter?

The word "scope" can have different meanings:
for example it can be an URL, or an authorization scope.
Since authorization is out of the scope of this document (see {{goals}}),
and that we rely only on {{HTTP}}, in {{goals}} we defined "scope" in terms of
URL.

Since clients are not required to process quota policies (see {{receiving-fields}}),
we could add a new "RateLimit-Scope" field to this spec.
See this discussion on a [similar thread](https://github.com/httpwg/http-core/pull/317#issuecomment-585868767)

Specific ecosystems can still bake their own prefixed parameters,
such as `acme-auth-scope` or `acme-url-scope` and ensure that clients process them.
This behavior cannot be relied upon when communicating between different ecosystems.

We are open to suggestions: comment on [this issue](https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/70)
Without a partition key, a server can effectively have only one scope (aka partition), which is impractical for most services, or it needs to communicate the scopes out-of-band.
This prevents the development of generic connector code that can be used to prevent requests from being throttled.
Many APIs rely on API keys, user identity or client identity to allocate quota.
As soon as a single client processes requests for more than one partition, the client needs to know the corresponding partition key to properly track requests against allocated quota.

5. Why using delay-seconds instead of a UNIX Timestamp?
Why not using subsecond precision?
Expand All @@ -1022,14 +1002,11 @@ RateLimit: day;r=100;t=36000
on the [httpwg ml](https://lists.w3.org/Archives/Public/ietf-http-wg/2019JulSep/0202.html)
- almost all rate-limit headers implementations do not use it.

6.

7. Shouldn't I limit concurrency instead of request rate?
6. Shouldn't I limit concurrency instead of request rate?

You can use this specification to limit concurrency
at the HTTP level (see {#use-for-limiting-concurrency})
and help clients to
shape their requests avoiding being throttled out.
and help clients to shape their requests avoiding being throttled out.

A problematic way to limit concurrency is connection dropping,
especially when connections are multiplexed (e.g. HTTP/2)
Expand All @@ -1042,12 +1019,12 @@ RateLimit: day;r=100;t=36000
Saturation conditions can be either dynamic or static: all this is out of
the scope for the current document.

8. Do a positive value of remaining keyword imply any service guarantee for my
7. Does a positive value of the remaining parameter imply any service guarantee for my
future requests to be served?

No. FAQ integrated in {{ratelimit-remaining-parameter}}.

9. Is the quota-policy definition {{quota-policy}} too complex?
8. Is the quota-policy definition {{quota-policy}} too complex?

You can always return the simplest form

Expand All @@ -1065,43 +1042,15 @@ RateLimit: sliding;r=50;t=44

the value "sliding" identifies the policy being reported.

11. Can we use shorter names? Why don't put everything in one field?

The most common syntax we found on the web is `X-RateLimit-*` and
when starting this I-D [we opted for it](https://github.com/ioggstream/draft-polli-ratelimit-headers/issues/34#issuecomment-519366481)

The basic form of those fields is easily parseable, even by
implementers processing responses using technologies like
dynamic interpreter with limited syntax.

Using a single field complicates parsing and takes
a significantly different approach from the existing
ones: this can limit adoption.

12. Why don't mention connections?

Beware of the term "connection":
 - it is just *one* possible saturation cause. Once you go that path
 you will expose other infrastructural details (bandwidth, CPU, .. see {{sec-information-disclosure}})
 and complicate client compliance;
 - it is an infrastructural detail defined in terms of server and network
 rather than the consumed service.
This specification protects the services first,
and then the infrastructures through client cooperation (see {{sec-throttling-does-not-prevent}}).
 RateLimit header fields enable sending *on the same connection* different limit values
 on each response, depending on the policy scope (e.g. per-user, per-custom-key, ..)
13. Can intermediaries alter RateLimit header fields?
9. Can intermediaries alter RateLimit header fields?

Generally, they should not because it might result in unserviced requests.
There are reasonable use cases for intermediaries mangling RateLimit header fields though,
e.g. when they enforce stricter quota-policies,
or when they are an active component of the service.
In those cases we will consider them as part of the originating infrastructure.

14. Why the `w` parameter is just informative?
10. Why the `w` parameter is just informative?
Could it be used by a client to determine the request rate?

A non-informative `w` parameter might be fine in an environment
Expand All @@ -1112,7 +1061,7 @@ RateLimit: sliding;r=50;t=44
for defining the throttling
behavior.

15. Can I use RateLimit fields in trailers?
11. Can I use RateLimit fields in trailers?
Servers usually establish whether the request is in-quota before creating a response, so the RateLimit field values should be already available in that moment.
The only advantage of supporting trailers is that the server can provide more up-to-date information to the client in case of slow responses.
However, this complicates client implementations with respect to combining fields from headers and accounting for intermediaries that drop trailers.
Expand Down Expand Up @@ -1153,19 +1102,16 @@ A sliding window policy for example, may result in having a remaining keyword va
e.g.

~~~
RateLimit-Limit: 12
RateLimit-Policy: 12;w=1
RateLimit-Remaining: 6 ; using 50% of throughput, that is 6 units/s
RateLimit-Reset: 1
RateLimit-Policy: sliding;l=12;w=1
RateLimit: sliding;l=12;r=6;t=1 ; using 50% of throughput, that is 6 units/s

~~~

If this is the case, the optimal solution is to achieve

~~~
RateLimit-Limit: 12
RateLimit-Policy: 12;w=1
RateLimit-Remaining: 1 ; using 100% of throughput, that is 12 units/s
RateLimit-Reset: 1
RateLimit-Policy: sliding;l=12;w=1
RateLimit: sliding;l=12;r=1;t=1 ; using 100% of throughput, that is 12 units/s
~~~

At this point you should stop increasing your request rate.
Expand All @@ -1191,10 +1137,18 @@ and Julian Reschke.
# Changes
{:numbered="false" removeinrfc="true"}

## Since draft-ietf-httpapi-ratelimit-headers-07
{:numbered="false" removeinrfc="true"}

* Refactored both fields to lists of Items that identify policy and use parameters
* Added quota unit parameter
* Added partition key parameter


## Since draft-ietf-httpapi-ratelimit-headers-03
{:numbered="false" removeinrfc="true"}

* Split policy informatio in RateLimit-Policy #81
* Split policy information in RateLimit-Policy #81


## Since draft-ietf-httpapi-ratelimit-headers-02