How to Configure Rate Limiting in Nginx on Ubuntu

Written by: Bagus Facsi Aginsa
Published at: 15 May 2026


Your login endpoint gets hammered by a bot submitting hundreds of credential-stuffing attempts per minute. Your public API, open without authentication, gets scraped by a single client pulling thousands of pages in seconds. Your web server does not crash immediately, but CPU climbs, response times degrade, and legitimate users start seeing slow responses or errors.

Nginx has two built-in modules for this: ngx_http_limit_req_module controls the request rate (how many requests per second a client can make), and ngx_http_limit_conn_module controls concurrent connections (how many simultaneous open connections a client can hold). Together they let you enforce fair use on any endpoint without touching your application code.

In this tutorial you will configure both modules on Ubuntu, apply rate limits to specific paths like /api and /login, understand the burst and nodelay options that separate a usable rate limit from one that breaks your own users, and test that everything works.


How Nginx Rate Limiting Works

Before writing any configuration, it helps to understand the mechanism.

limit_req_zone

limit_req_zone implements the leaky bucket algorithm. Imagine each client has a bucket with a fixed leak rate. Requests fill the bucket; the leak drains it at a steady rate. When the bucket overflows, Nginx rejects the excess requests with a 503 (or whichever status code you configure).

You define the zone globally (in http {}) and apply it to a location (in server {}):

limit_req_zone $binary_remote_addr zone=myzone:10m rate=10r/s;

Breaking this down:

  • $binary_remote_addr is the key used to track each client. The binary form of the remote IP is 4 bytes for IPv4 and 16 bytes for IPv6, much smaller than the string form and therefore more memory-efficient.
  • zone=myzone:10m is a shared memory zone named myzone with 10 MB of storage. 10 MB holds state for roughly 160,000 IPv4 addresses simultaneously.
  • rate=10r/s allows 10 requests per second per IP. Nginx enforces this at millisecond granularity, so 10r/s really means one request every 100 ms; that is why the burst option covered in Step 2 matters. You can also write the rate as 60r/m (60 per minute) for lower-traffic paths.

limit_conn_zone

limit_conn_zone tracks how many connections from a given IP are currently open. This complements the rate limit by capping slow HTTP clients that hold connections open without sending requests fast enough to trigger the rate limit.

limit_conn_zone $binary_remote_addr zone=connzone:10m;
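
There is no rate parameter here: the zone only stores the per-IP counters. You set the actual cap where you apply the zone with the limit_conn directive. A minimal sketch:

server {
    # each client IP may hold at most 10 simultaneous connections
    limit_conn connzone 10;
}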

Prerequisites

  • Ubuntu 20.04, 22.04, or 24.04
  • Nginx installed and running. If you need it: sudo apt install nginx
  • A non-root user with sudo privileges
  • Basic familiarity with Nginx server blocks and the command line

Step 1: Define the Rate Limit Zones

Zones must be declared inside the http {} block, not inside server {} or location {}. The standard place on Ubuntu is /etc/nginx/nginx.conf.

Open the file:

sudo nano /etc/nginx/nginx.conf

Find the http {} block. Add the zone definitions inside it, just before the include lines for your virtual hosts:

http {
    # --- rate limiting zones ---
    limit_req_zone $binary_remote_addr zone=general:10m rate=20r/s;
    limit_req_zone $binary_remote_addr zone=login:10m    rate=5r/m;
    limit_req_zone $binary_remote_addr zone=api:10m      rate=30r/s;

    limit_conn_zone $binary_remote_addr zone=connlimit:10m;
    # ---------------------------

    include /etc/nginx/mime.types;
    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}

Three zones for different use cases:

  • general, a broad zone for regular web traffic. 20 requests per second is generous enough that a human browsing your site never gets throttled.
  • login, a strict zone for authentication endpoints. 5 requests per minute means a real human has plenty of room, but a bot hammering the form gets blocked almost immediately.
  • api, a zone for API endpoints. Higher than general because API clients (mobile apps, SPAs) legitimately issue bursts of requests.

Save and close the file.


Step 2: Apply Rate Limits in Your Server Block

Open the server block for your site. On a typical Ubuntu Nginx setup, this lives in /etc/nginx/sites-available/:

sudo nano /etc/nginx/sites-available/example.com

Apply the general zone to regular traffic, add tighter limits for specific paths, and leave static assets outside the request limit. Because a limit_req set at the server level is inherited by every location that does not define its own, including static ones, the general zone goes inside location / instead:

server {
    listen 443 ssl;
    server_name example.com;
    # ... your existing ssl_certificate directives ...

    # Cap simultaneous connections per IP for the whole site
    limit_conn connlimit 20;

    # General rate limit for regular traffic
    location / {
        limit_req zone=general burst=40 nodelay;

        proxy_pass http://127.0.0.1:3000;
    }

    # Strict limit on the login endpoint
    location /login {
        limit_req zone=login burst=3 nodelay;
        limit_req_status 429;

        proxy_pass http://127.0.0.1:3000;
    }

    # API rate limit
    location /api/ {
        limit_req zone=api burst=60 nodelay;
        limit_req_status 429;

        proxy_pass http://127.0.0.1:3000;
    }

    # Static assets: no limit_req here, so they are never rate limited
    location /static/ {
        root /var/www/example.com;
    }
}

Let’s go through the key directives:

limit_req zone=general burst=40 nodelay;

burst defines a queue size. When a client exceeds the steady-state rate, up to burst extra requests are accepted rather than rejected outright. Without nodelay, those excess requests are delayed so they go out at the configured rate. With nodelay, they are served immediately, but the burst allowance still caps how far a client can run ahead of the rate before requests are dropped. nodelay is almost always what you want: deliberately delaying requests frustrates users and does not protect your backend any better.
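
The three combinations behave quite differently. A quick comparison, using the api zone (rate=30r/s) as the example:

limit_req zone=api;                    # no burst: anything above 30r/s is rejected immediately
limit_req zone=api burst=60;           # up to 60 excess requests are queued and trickled out at 30r/s
limit_req zone=api burst=60 nodelay;   # up to 60 excess requests are served at once, then rejection kicks in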

limit_conn connlimit 20;

No single IP can hold more than 20 simultaneous connections. This mitigates Slowloris-style attacks, where a client opens many connections and sends requests extremely slowly, tying up Nginx worker connections.
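
You can see the connection limit in action with ab at high concurrency. Connections beyond the cap are rejected with 503 (configurable with limit_conn_status); the exact count depends on how quickly your backend answers:

ab -n 200 -c 50 https://example.com/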

limit_req_status 429;

By default, Nginx returns a 503 Service Unavailable when a rate limit is exceeded. 429 Too Many Requests is the correct HTTP status for rate limiting and is more informative for API clients. Set it on any path where you want the correct status code.


Step 3: Test and Reload the Configuration

Before reloading, always test:

sudo nginx -t

Expected output:

nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

If the test passes, reload Nginx:

sudo systemctl reload nginx

Step 4: Verify the Rate Limit Is Enforced

Use curl in a quick loop to trigger the rate limit on the login endpoint (which is set to 5 requests per minute):

for i in $(seq 1 10); do
  curl -s -o /dev/null -w "%{http_code}\n" https://example.com/login
done

The first request passes within the rate and the burst of 3 absorbs three more, so the first four requests return 200; after that, requests return 429 (exact counts depend on timing):

200
200
200
200
429
429
429
429
429
429

For a more realistic load test, install ab (Apache Bench) and fire 100 requests with a concurrency of 20:

sudo apt install apache2-utils -y
ab -n 100 -c 20 https://example.com/api/users

Look at the “Non-2xx responses” line in the output; those are the rate-limited requests.
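
The relevant lines look something like this (your numbers will differ):

Complete requests:      100
Non-2xx responses:      72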


Step 5: Return a Custom Error Page for 429

Instead of the default Nginx error page, return a clean JSON response for API clients:

server {
    # ...

    error_page 429 /rate_limited.json;

    location = /rate_limited.json {
        internal;
        default_type application/json;
        return 429 '{"error": "Too many requests. Please slow down."}';
    }
}

For web pages, you can return an HTML file instead:

error_page 429 /429.html;

location = /429.html {
    internal;
    root /var/www/example.com;
}

Step 6: Whitelist Your Own IPs

Rate limiting your own monitoring tools, internal services, or trusted IPs is annoying. Use the geo module to build a bypass variable:

http {
    geo $limit {
        default         1;
        127.0.0.1       0;
        10.0.0.0/8      0;
        203.0.113.10    0;
    }

    map $limit $limit_key {
        0 "";
        1 $binary_remote_addr;
    }

    limit_req_zone $limit_key zone=general:10m rate=20r/s;
}

The geo block assigns $limit = 0 to trusted IPs and $limit = 1 to everyone else. The map block converts that to an empty string for trusted IPs and the actual IP for everyone else. When the zone key is an empty string, Nginx skips rate limiting for that client entirely. All other zones (login, api) would need the same treatment if you want a blanket whitelist.
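For a blanket whitelist, switch the key on the other zones as well:

limit_req_zone $limit_key zone=login:10m rate=5r/m;
limit_req_zone $limit_key zone=api:10m   rate=30r/s;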


Common Mistakes and Troubleshooting

Rate limit is triggering on your own browser.

You configured too low a rate without enough burst headroom. A modern web page loads dozens of resources (scripts, styles, fonts, images) in parallel from a single IP. If your general zone rate is too tight, the browser hits the limit while loading assets. Either set the general limit high (20+ r/s with burst of 40+), or exclude static asset paths from rate limiting as shown in Step 2.

limit_req_status 429 has no effect.

This directive requires Nginx version 1.3.15 or later. Verify your version:

nginx -v

On Ubuntu 22.04 and 24.04 the default apt package is comfortably newer than that. If you are running an older release, upgrade Nginx, for example from the community nginx/stable PPA:

sudo add-apt-repository ppa:nginx/stable
sudo apt update && sudo apt install nginx

Requests are delayed, not rejected.

You used burst without nodelay, so queued requests are delayed to fit the configured rate (see the comparison sketch in Step 2). If you want queued requests served quickly rather than spread out, add nodelay. If you want excess requests rejected immediately with no queue, remove burst entirely.

nginx -t fails with “could not build server_names_hash”.

This is unrelated to rate limiting; it means server_names_hash_bucket_size is too small for your hostnames. Add this inside http {} in nginx.conf:

server_names_hash_bucket_size 64;

Logs show rate limit hits but the client does not receive 429.

Check which context's limit_req_status actually applies to the request. The directive is inherited from the parent context but can be overridden per location, and the most specific setting wins: if the server {} block sets limit_req_status 503 and a location {} sets 429, requests handled by that location get 429. Make sure the status code is set in the context that handles the rate-limited path.
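
A minimal illustration:

server {
    limit_req_status 503;        # applies wherever it is not overridden

    location /api/ {
        limit_req_status 429;    # requests limited inside /api/ get 429
    }
}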


Best Practices

Set limit_req_log_level to control the severity at which rate limit hits appear in your logs. By default Nginx logs rejections at the error level, which can flood your error log on a busy server; warn or notice is usually enough:

http {
    limit_req_log_level warn;
}

Add a Retry-After header so well-behaved API clients know when to retry. Nginx does not add it automatically, but you can add it in the error response:

location = /rate_limited.json {
    internal;
    add_header Retry-After 60 always;
    default_type application/json;
    return 429 '{"error": "Rate limit exceeded. Retry after 60 seconds."}';
}

Combine request rate and connection limits. limit_req alone does not stop a client from holding a large number of idle connections. limit_conn plugs that gap. Use both on any public-facing endpoint.

Use a separate zone for each endpoint class. Do not share a single zone across /login, /api, and /static. A bot hammering /login should not burn rate limit headroom for an API client hitting /api. Separate zones keep the math clean and the behavior predictable.

Layer rate limiting with Fail2ban. Rate limiting throttles bad actors in real time at the Nginx level. Fail2ban bans persistent offenders at the firewall level before they can even reach Nginx. The two complement each other perfectly: rate limiting absorbs the initial burst, and Fail2ban kicks in when a client keeps probing repeatedly over time. The nginx-botsearch and nginx-http-auth jails covered in How to Protect Your Server from Brute-Force Attacks with Fail2ban on Ubuntu pair well with the /login rate limit zone.

Monitor rate limit hits. Add $limit_req_status to your Nginx log format to track which requests were passed, delayed, or rejected:

log_format main '$remote_addr - $remote_user [$time_local] '
                '"$request" $status $body_bytes_sent '
                '"$http_referer" "$http_user_agent" '
                'rt=$request_time lrs=$limit_req_status';

access_log /var/log/nginx/access.log main;

$limit_req_status (available since Nginx 1.17.6) logs PASSED, DELAYED, or REJECTED for requests checked against a zone, and stays empty for requests no limit applies to. Searching for REJECTED in your access log is the fastest way to see how often the rate limit fires in production.
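
For example, to list the IPs with the most rejected requests, assuming the log format above:

grep 'lrs=REJECTED' /var/log/nginx/access.log | awk '{print $1}' | sort | uniq -c | sort -rn | head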


Conclusion

You have configured Nginx rate limiting on Ubuntu using limit_req_zone for request rate and limit_conn_zone for concurrent connections. You applied strict limits to sensitive endpoints like /login, looser limits to API endpoints, and skipped static assets. You also saw how to return a proper 429 status, whitelist trusted IPs, and combine rate limiting with Fail2ban for layered protection.

Rate limiting at the Nginx level is one of the most cost-effective protections you can add to a public-facing server. It runs before your application code even executes, so rejected requests consume almost no backend resources. The configuration is a few lines of Nginx config, and it works for any application regardless of the language or framework running behind the proxy.

From here, good next steps are pairing the /login limit with Fail2ban (see How to Protect Your Server from Brute-Force Attacks with Fail2ban on Ubuntu) and adding $limit_req_status to your log format so you can watch how often the limits fire in production.