How to set up rate limiting in .NET

Aaah, rate limiting. One of those essential features everyone forgets to set up until it’s too late. And by too late, I mean they’re getting DDoSed by one of their own clients because someone accidentally wrote an infinite loop.

“Never trust your clients”: wiser words were never spoken.

In this article, let’s take a quick look at the ways you can implement rate limiting in your web API and the different algorithms provided in the default rate-limiting middleware.

Rate limiting in .NET

One of the reasons I love .NET is the abundance of first-party solutions for common problems. Rate limiting is no different: Microsoft has been kind enough to provide us with rate-limiting middleware, which you can use by referencing Microsoft.AspNetCore.RateLimiting.

The way this middleware works is that you set up one or more rate-limiting policies and then assign those to the endpoints of your choosing. Pretty straightforward.

Let’s move on to some example code and the first of the rate-limiting algorithms.

1. Fixed window rate limiting

Of all the rate-limiting algorithms, this is probably the simplest. All it does is divide time into fixed windows and let you cap the number of requests allowed in each one.

Here’s a bit of example code, with some added comments to make things easier to understand:

using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Add the rate limiter middleware to your services
// We are using the fixed window rate limiter here and are naming our policy "FixedRateLimiter"
builder.Services.AddRateLimiter(_ => _
    .AddFixedWindowLimiter(policyName: "FixedRateLimiter", options =>
    {
        // We're creating a new fixed window every 12 seconds
        options.Window = TimeSpan.FromSeconds(12);
        // During those 12 seconds, only 4 requests are allowed
        options.PermitLimit = 4;
        // We start by processing the oldest first
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        // If we already have 2 requests in our queue, and a third arrives, it will be rejected :(
        options.QueueLimit = 2;
    }));

var app = builder.Build();

// Don't forget to use the rate limiter middleware!
app.UseRateLimiter();

// A simple GET endpoint
// To add rate limiting, we call the RequireRateLimiting method and pass the policy name
app.MapGet("/", () => Results.Ok("Hello World!"))
    .RequireRateLimiting("FixedRateLimiter");

app.Run();

Pretty straightforward, right? And that’s the biggest advantage of fixed window rate limiting: it’s simple and quick to set up.
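
One thing worth mentioning: when a request is rejected (like that unlucky third queued request in the comments above), the middleware responds with 503 Service Unavailable by default. If you’d rather return the more conventional 429 Too Many Requests, you can set RejectionStatusCode on the limiter options. Here’s a minimal sketch of the same fixed window policy with that tweak:

builder.Services.AddRateLimiter(options =>
{
    // Respond with 429 instead of the default 503 when a request is rejected
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // The same fixed window policy as above
    options.AddFixedWindowLimiter(policyName: "FixedRateLimiter", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromSeconds(12);
        limiterOptions.PermitLimit = 4;
    });
});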

There is a downside, though: a burst of requests right at the edge of a window can consume the budget of both the current and the next window. With a limit of 4 requests per 12-second window, a client could send 4 requests at second 11 and 4 more at second 13: 8 requests in 2 seconds, starving other requests for both windows.

2. Sliding window rate limiting

Although seemingly more complex, sliding window rate limiting isn’t all that hard to understand.

In this case, we have a window of a fixed duration, divided into a number of segments. Each time a segment interval passes, the window slides forward by one segment, and the requests you made during the segment that just dropped out of the window get re-added to the total number of requests you can make.

Let’s look at an example to make things a bit clearer:

We want to set up a sliding window rate limiter with the following parameters:

  • Sliding window of 12 seconds, with a limit of 100 requests
  • Divided into 3 segments (4 seconds for each segment)

Here’s an example of what that would look like:

Time | Available Req | Used Req | Recycled Req | Carry-over Req
0    | 100           | 20       | 0            | 80
4    | 80            | 40       | 0            | 40
8    | 40            | 20       | 0            | 20
12   | 20            | 0        | 20           | 40
16   | 40            | 80       | 40           | 0
20   | 0             | 10       | 20           | 10

Looking at the above, we can see that:

  • In every row, the carry-over equals the available requests plus the recycled requests, minus the used requests.
  • Until the 12-second mark, no recycled requests get added to our available requests, since 12 seconds is the length of our sliding window.
  • Once we move past 12 seconds, we recuperate the 20 requests we spent in the 0-4 second segment and add them back to our total.
  • Once we move past 16 seconds, we recuperate the 40 requests we spent in the 4-8 second segment.

Not all that complex, right? And neither is the code to set it all up:

using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Add the rate limiter middleware to your services
// We are using the sliding window rate limiter here and are naming our policy "SlidingRateLimiter"
builder.Services.AddRateLimiter(_ => _
    .AddSlidingWindowLimiter(policyName: "SlidingRateLimiter", options =>
    {
        // We want a limit of 100 requests during our sliding window
        options.PermitLimit = 100;
        // Our window should be 12 seconds
        options.Window = TimeSpan.FromSeconds(12);
        // We want 3 segments of 4 seconds each
        options.SegmentsPerWindow = 3;
        // We use the first-in-first-out approach to our queue
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        // We'll set our queue limit to 10
        options.QueueLimit = 10;
    }));

var app = builder.Build();

// Don't forget to use the rate limiter middleware!
app.UseRateLimiter();

// A simple GET endpoint
// To add rate limiting, we call the RequireRateLimiting method and pass the policy name
app.MapGet("/", () => Results.Ok("Hello World!"))
    .RequireRateLimiting("SlidingRateLimiter");

app.Run();

3. Token bucket limiter

The more straightforward sibling of the sliding window, the token bucket adds a fixed number of request tokens each replenishment period. It doesn’t care when you make a request: every time a period passes, it adds the configured number of tokens, up to the bucket’s maximum capacity.

Let’s look at another example with the following configuration:

  • A token limit of 100
  • A token replenishment period of 4 seconds, with 20 tokens added per period

Here’s what that would look like:

Time | Available Req | Used Req | Added Req | Carry-over Req
0    | 100           | 20       | 0         | 80
4    | 80            | 10       | 20        | 90
8    | 90            | 0        | 10        | 100
12   | 100           | 80       | 20        | 40
16   | 40            | 60       | 20        | 0
20   | 0             | 10       | 20        | 10

Pretty straightforward, right? Every 4 seconds, 20 tokens get added to the bucket, unless that would push it over the maximum number of tokens; notice the 8-second row, where only 10 of the 20 tokens fit.

Here’s what that would look like in code:

using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Add the rate limiter middleware to your services
// We are using the token bucket rate limiter here and are naming our policy "TokenBucketRateLimiter"
builder.Services.AddRateLimiter(_ => _
    .AddTokenBucketLimiter(policyName: "TokenBucketRateLimiter", options =>
    {
        // The max amount of tokens the bucket can hold
        options.TokenLimit = 100;
        // Our tokens get topped up every 4 seconds
        options.ReplenishmentPeriod = TimeSpan.FromSeconds(4);
        // We add 20 tokens to the bucket every period
        options.TokensPerPeriod = 20;
        // Replenishment can happen automatically on a timer; set this to false if you'd rather call TryReplenish on the limiter yourself
        options.AutoReplenishment = true;
        // We use the first-in-first-out approach to our queue
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        // We'll set our queue limit to 10
        options.QueueLimit = 10;
    }));

var app = builder.Build();

// Don't forget to use the rate limiter middleware!
app.UseRateLimiter();

// A simple GET endpoint
// To add rate limiting, we call the RequireRateLimiting method and pass the policy name
app.MapGet("/", () => Results.Ok("Hello World!"))
    .RequireRateLimiting("TokenBucketRateLimiter");

app.Run();
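
As the comment in the snippet above hints, you can handle replenishment yourself instead of letting it run on a timer. Here’s a minimal, hedged sketch using the standalone TokenBucketRateLimiter from System.Threading.RateLimiting (outside the ASP.NET Core middleware) with AutoReplenishment turned off:

using System.Threading.RateLimiting;

// The same bucket configuration as above, but we replenish manually
var limiter = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 100,
    ReplenishmentPeriod = TimeSpan.FromSeconds(4),
    TokensPerPeriod = 20,
    AutoReplenishment = false,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    QueueLimit = 10
});

// Try to take a token from the bucket
using RateLimitLease lease = limiter.AttemptAcquire(permitCount: 1);
if (lease.IsAcquired)
{
    // Do the rate-limited work here
}

// Top up the bucket on your own schedule, e.g. from a timer you control
limiter.TryReplenish();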

4. Concurrency limiter

Finally, the fourth algorithm we can use is the concurrency limiter. All it does is limit the number of requests that can be processed at the same time. Unlike the previous rate limiters, it puts no cap on the total number of requests over a period of time, only on how many run concurrently.

Let’s look at a code example:

using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Add the rate limiter middleware to your services
// We are using the concurrency rate limiter here and are naming our policy "ConcurrencyRateLimiter"
builder.Services.AddRateLimiter(_ => _
    .AddConcurrencyLimiter(policyName: "ConcurrencyRateLimiter", options =>
    {
        // A max of 10 requests at the same time are allowed
        options.PermitLimit = 10;
        // We use the first-in-first-out approach to our queue
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        // Up to 4 requests can wait in the queue for a spot to free up
        options.QueueLimit = 4;
    }));

var app = builder.Build();

// Don't forget to use the rate limiter middleware!
app.UseRateLimiter();

// A simple GET endpoint
// To add rate limiting, we call the RequireRateLimiting method and pass the policy name
app.MapGet("/", () => Results.Ok("Hello World!"))
    .RequireRateLimiting("ConcurrencyRateLimiter");

app.Run();

Concurrency limiters are often combined with one of the limiters we discussed earlier: the concurrency limiter caps how many requests are in flight at once, while the other limiter caps the total number of requests over time. A sketch of that combination follows below.
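
Here’s a hedged sketch of one way to combine them: a concurrency limiter applied to every request through the GlobalLimiter option, with the fixed window policy from earlier layered on top for a specific endpoint. The single shared partition key "global" is just an assumption for illustration:

using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // A global concurrency limiter: every request shares one partition,
    // so at most 100 requests are processed at the same time
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
        RateLimitPartition.GetConcurrencyLimiter("global", key => new ConcurrencyLimiterOptions
        {
            PermitLimit = 100,
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
            QueueLimit = 10
        }));

    // On top of that, a fixed window policy for specific endpoints
    options.AddFixedWindowLimiter(policyName: "FixedRateLimiter", limiterOptions =>
    {
        limiterOptions.Window = TimeSpan.FromSeconds(12);
        limiterOptions.PermitLimit = 4;
    });
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/", () => Results.Ok("Hello World!"))
    .RequireRateLimiting("FixedRateLimiter");

app.Run();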

Final thoughts

Rate limiting is an important part of building scalable systems. We went through all four of the rate-limiting algorithms built into the rate-limiting middleware, and as you saw, they’re all easy to set up with minimal effort.

What we looked at today is how to make your application responsible for rate limiting. This works great in simple situations where you, for example, have a single API running.

However, if you’re working with more complex architectures and multiple services that interact with each other, it’s often a better idea to look at gateway-level solutions such as Azure API Management or a reverse proxy like YARP. These let you manage and configure rate limiting for multiple destinations in one place.