In this article you'll explore some rate limiting algorithms using Python and Redis, starting from a naive approach.

Rate limiting is a mechanism that many developers may have to deal with at some point in their life. It’s useful for a variety of purposes like sharing access to limited resources or limit the number of requests made to an API endpoint and respond with a 429 status code.

Here we’ll explore some rate limiting algorithms using Python and Redis, starting from a naive approach and culminate at an advanced one called Generic Cell Rate Algorithm (GCRA).

In the following article we'll use redis-py to communicate with Redis (`pip install redis`). I suggest you to clone my repository from GitHub to make some experiments with rate limiting.

## #Time Bucketed

A first approach for rate limiting a number of requests within a `period` is to use a time-bucketed algorithm, where a counter (initially set to `limit`) and expiry time (`period`) are stored for each rate-key (something unique like username or IP address) and decremented on subsequent requests.

Using Python and Redis we can implement time-bucket logic in this way:

• check if the rate-key `key` exists
• if `key` doesn't exist, initialize it to the `limit` value (Redis SETNX) and an expiration time `period` (Redis EXPIRE)
• decrement this value on each subsequent requests (Redis DECRBY)
• the request is limited only when `key` value drops below zero
• after the given `period` the key will automatically be deleted
``````from datetime import timedelta
from redis import Redis

def request_is_limited(r: Redis, key: str, limit: int, period: timedelta):
if r.setnx(key, limit):
r.expire(key, int(period.total_seconds()))
bucket_val = r.get(key)
if bucket_val and int(bucket_val) > 0:
r.decrby(key, 1)
return False
return True``````

You can see this code in action simulating requests with a limit of `20 requests / 30 seconds` (to keep things clear I wrapped the function in a module as you can see in my repository).

``````import redis
from datetime import timedelta
from ratelimit import time_bucketed

r = redis.Redis(host='localhost', port=6379, db=0)
requests = 25

for i in range(requests):
print ('🛑 Request is limited')
else:
print ('✅ Request is allowed')``````

Only the first 20 requests won't be limited, you have to wait 30 seconds before being allowed to make a new request again.

The downside to this approach is it actually enforces a limit to the number of requests a user can do within a period rather than a rate. This means that the entire quota can be exhausted immediately, resetting only after the period has expired.

## #Leaky bucket

There's an algorithm that can take care of rate limiting and produces a very smooth rate limiting effect: the leaky-bucket approach. The algorithm works similarly to an actual leaky bucket: you can pour water (requests) in the bucket up to a maximum capacity while some amount of water is escaping at a constant rate (usually implemented using a background process). If the amount of water entering the bucket is greater than the amount leaving through the leak, the bucket starts to fill and when the bucket is full new requests are limited.

We can skip over this for a more elegant approach that doesn't require a separate process to simulate a leak: the Generic Cell Rate Algorithm.

## #Generic Cell Rate Algorithm

Generic Cell Rate Algorithm (GCRA) comes from a communications technology called `Asynchronous Transfer Mode (ATM)` and was used in ATM network’s schedulers to delay or drop cells, small fixed size packets of data, that came in over their rate limit.

GCRA works by tracking remaining limit through a time called the Theoretical Arrival Time (TAT) on each request

``````tat = current_time + period
``````

and limit the next request if the arrival time is less than the stored TAT. This works fine if `rate = 1 request / period` where each request is separated by `period`, anyway in the real world we usually have `rate = limit / period`. For example with `rate = 10 requests / 60 seconds` a user would be allowed to make 10 requests in the first 6 seconds, with `rate = 1 request / 6 seconds `the user would have to wait 6 seconds between requests.

To be able to bunch requests in a short period of time and support `limit` requests within `period` with `limit > 1`, each request should be separated by `period / limit` and the next Theoretical Arrival Time (`new_tat`) calculated in a different way than before. Indicating with `t` the time of the request arrival

• `new_tat = tat + period / limit` if requests bunch (`t <= tat`)
• `new_tat = t + period / limit` if requests don’t bunch (`t > tat`)

hence

``new_tat = max(tat, t) + period / limit.``

The request is limited if `new_tat` exceeds the current time plus the period, `new_tat > t + period`. With `new_tat = tat + period / limit` we have

``tat + period / limit > t + period``

Hence, we should limit requests only when

``````tat — t > period — period / limit
``````
``````       period — period / limit
<----------------------->
--|----------------------------|--->
t                           TAT``````

In this case TAT should not be updated, because limited requests don’t have a theoretical arrival time.

Now it's time to unveil the final code!

``````from datetime import timedelta
from redis import Redis

def request_is_limited(r: Redis, key: str, limit: int, period: timedelta):
period_in_seconds = int(period.total_seconds())
t = r.time()
separation = round(period_in_seconds / limit)
r.setnx(key, 0)
tat = max(int(r.get(key)), t)
if tat - t <= period_in_seconds - separation:
new_tat = max(tat, t) + separation
r.set(key, new_tat)
return False
return True``````

In this implementation we use Redis TIME because GCRA is dependent on time and it’s crucial to make sure that the current time is consistent between multiple deployments (clock drift between machines could lead to false positives).

In the following script you can see GCRA in action with `rate = 10 requests / 60 seconds`

``````import redis
from datetime import timedelta
from ratelimit import gcra

r = redis.Redis(host='localhost', port=6379, db=0)
requests = 10

for i in range(requests):
print ('🛑 Request is limited')
else:
print ('✅ Request is allowed')``````

The first 10 requests are allowed by GCRA, but you need to wait at least 6 seconds to make another request. Try to run the script after some time to see how it works and try to change `limit` and `period` (e.g. `limit = 1` and `period=timedelta(seconds=6)`  😉).

To keep GCRA implementation clear, I avoided to add a lock between `get` and `set` calls, but it's needed in case of multiple requests with the same key at the same time. Using `Lua-based lock` we can simply add a lock as a context manager.

``````def request_is_limited(r: Redis, key: str, limit: int, period: timedelta):
period_in_seconds = int(period.total_seconds())
t = r.time()
separation = round(period_in_seconds / limit)
r.setnx(key, 0)
try:
with r.lock('lock:' + key, blocking_timeout=5) as lock:
tat = max(int(r.get(key)), t)
if tat - t <= period_in_seconds - separation:
new_tat = max(tat, t) + separation
r.set(key, new_tat)
return False
return True
except LockError:
return True``````

Check the complete code on GitHub. 