Rate Limiting

The IBM Application Gateway (IAG) allows incoming requests to be limited based on a user-defined set of criteria and policy.

Rate limiting helps protect against the following:

  • Brute force attacks on sensitive information such as passwords or PINs;
  • Denial of Service attacks on a server or the Web Reverse Proxy.

Rate limiting is performed by taking the incoming request and identifying the parts of the request that make it unique to a client. Information such as the IP address that made the connection, a session cookie, other header information, or the URL and HTTP method that were used can all be included to identify a client. Each matching request from the same client increments a counter. When the counter reaches its limit, the IAG returns an error to the client or drops the incoming connection.

Types of users

Rate limiting can be deployed with two types of end users in consideration.

  • The malicious end user who tries to cause damage to the service or steal a credential with brute force.
    • There are two approaches to handle a malicious user:
      • The first approach is to handle them in the most efficient manner by closing the connection without performing any further request processing;
      • The second approach is to mislead the malicious user by returning false information, making it unclear to the user whether the operation was successful or even processed.
  • A valid user who needs to be stopped from making too many requests in order to stop service degradation or overloading a back end server:
    • A page that provides more information is more suitable for an end user who is intended to use the service.

Client Identification

A critical aspect of the rate limiting capability is being able to uniquely identify the client that originated the request. The IAG container that performs the rate limiting might not be the perimeter device in a connection from a client, and thus the incoming IP address might not be the IP address of the end user. In this instance, the device that is at the network perimeter should include the client address information in a header such as X-Forwarded-For, which can then be used by the IAG to identify the source of the request.
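
The idea of falling back to a forwarded header can be sketched in Python as follows. This is only an illustration of the concept, not how IAG is implemented: the function name client_address and its parameters are hypothetical, and IAG itself matches on whatever header value is configured in the rate limiting rule. The sketch assumes the perimeter device is trusted to set the header.

```python
from typing import Mapping


def client_address(peer_ip: str, headers: Mapping[str, str],
                   forwarded_header: str = "X-Forwarded-For") -> str:
    """Return the address that best identifies the end user (hypothetical helper)."""
    # HTTP header names are case insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get(forwarded_header.lower(), "").strip()
    if not forwarded:
        # No forwarded header: the connecting peer is the best information available.
        return peer_ip
    # X-Forwarded-For is a comma-separated list; by convention the left-most
    # entry is the original client and later entries are intermediate proxies.
    return forwarded.split(",")[0].strip()
```

A header of this kind is only meaningful when it is set by a device you control, because a client can otherwise supply any value it likes.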

Rate Limit Policy

Rate limiting is configured by authoring a policy. This policy contains the following items:

  • Requests that the policy applies to. The request is identified by the HTTP method and path;
  • A rate limiting rule, which defines how the rate limiting policy should be applied. The rate limiting rule consists of:
    • The information in the request that identifies the client. This includes cookies, headers, query string parameters, the client IP address and credential attributes, if available;
    • The threshold for how many identical requests a client can make before they need to be rate limited and how long a client remains rate limited;
    • How to respond when a client is rate limited.

A rate limiting policy is defined in the IAG configuration YAML file.

Rate Limit processing

Rate limiting can occur in two places during the IAG processing flow:

  • If the rate limiting policy does not include credential attributes as a value to use in the request, the rate limiting processing occurs very early when processing a request:
    • This is before the authentication and authorization processing;
  • If the rate limiting policy includes credential attributes, the rate limiting processing takes place after the user's session has been verified:
    • This is after HTTP transformations and authentication, but prior to authorization checks.

Please note that if rate limiting is applied early in the processing and the URL is re-written, the rate limiting check is not performed again.

All matching rate limiting policies are applied to each request, such that if a request satisfies multiple policies they are all applied until one results in a reaction. This allows rate limiting policies to be layered. An example of a layered rate limiting policy, which can be used to rate limit a user account on a device, requires two policies, which are described below:

  • The first policy limits a user's ability to request resources from a protected application based on client IP and HTTP method;
  • Then you can use a second policy to rate limit the use of a session cookie, with a reaction which will invoke the logout page.

This way, when a user abuses the session, they are logged out and must log in again. With the first policy in place, authentication is also a limited operation.

Configure IAG to use a rate limiting policy

A rate limiting policy is specified in the IAG configuration YAML file under /policies/rate_limiting. These policies can be applied to gateway resources.

For example, to limit GET requests to 3 requests per second, from each client IP address, for all resources under the path /my_app :

resource_servers:
  - path: "/my_app"
    connection_type: "tcp"
    servers:
      - host: "10.10.10.200"
        port: 1337
    transparent_path: false

policies:
  rate_limiting: 
    - name: "limited_by_ip"
      methods: 
        - "GET"
      paths: 
        - "/my_app*"
      rule: |
        ip: true
        capacity: 3
        interval: 1
        reaction: TEMPLATE

See Rate Limiting Example for an example IAG configuration YAML file with rate limiting policy.

Matching Criteria

The two parts of the matching criteria are HTTP methods and the request path. If a request matches the criteria the rate limiting policy is applied.

  • The request path is specified with the paths key. The path pattern supports wild cards. Only one request path can be specified;
  • The HTTP method is a list of methods to match on, or the value of * to indicate that all HTTP methods will match. Multiple methods can be specified.
  • All matching on path and method is case insensitive.

Each set of matching criteria corresponds to a single rate limiting bucket. In other words, the access rate for each set of matching criteria is tracked separately.

An example which will limit attempts to POST to an application path is:

policies:
  rate_limiting: 
    - name: "limited_by_ip"
      methods: 
        - "POST"
      paths: 
        - "/my_app*"

If you want to match on every request the following can be specified:

policies:
  rate_limiting: 
    - name: "limited_by_ip"
      methods: 
        - "*"
      paths: 
        - "*"

To match all POST and GET requests regardless of URL:

policies:
  rate_limiting: 
    - name: "limited_by_ip"
      methods: 
        - "GET"
        - "POST"
      paths: 
        - "*"

An example, which will apply to all requests to a given application path:

policies:
  rate_limiting: 
    - name: "limited_by_ip"
      methods: 
        - "*"
      paths: 
        - "/my_app"
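
The matching criteria described in this section can be sketched in Python as follows. This is an illustration under stated assumptions, not the IAG implementation: the function name policy_applies is hypothetical, and the standard library's fnmatch wild card syntax is used as a stand-in for the wild card syntax that IAG supports, which may differ.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def policy_applies(methods: Iterable[str], paths: Iterable[str],
                   request_method: str, request_path: str) -> bool:
    """Return True when a request satisfies the method and path criteria."""
    # Matching on method and path is case insensitive, so compare lower-cased values.
    method = request_method.lower()
    path = request_path.lower()
    # A method entry of "*" matches every HTTP method.
    method_ok = any(m == "*" or m.lower() == method for m in methods)
    # fnmatchcase is an assumed stand-in for the gateway's wild card matching.
    path_ok = any(fnmatchcase(path, pattern.lower()) for pattern in paths)
    return method_ok and path_ok
```

With this sketch, a pattern of "/my_app*" matches "/my_app/index.html", whereas a pattern of "/my_app" without a wild card matches only that exact path.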

Configure tracing for rate limiting

Trace records for rate limiting can be sent to a file within the IAG container. The trace component is pdweb.http.ratelimit.

The trace level governs how much detail is logged. A level of 9 provides the most detailed output.

logging:
  tracing:  
    - file_name: /var/tmp/http-ratelimit.log
      component: pdweb.http.ratelimit
      level: 9

See Enable Tracing for more information on configuring tracing.

Rate Limiting Rules

Overview

A rate limiting rule consists of three key portions: the matching attributes, the request limits, and the reaction method.

Matching Attributes

The following attributes can be used to match an incoming request:

  • cookies
  • headers
  • query string parameters
  • client IP
  • credential attributes

Wild card characters can be used to match an attribute. However, when a match occurs the actual value rather than the configured pattern is used to identify the request. All matching is case insensitive.

If the request does not contain the specified header, cookie, query string parameter, or credential attribute, the request is not rate limited.

An example of limiting incoming requests based on user credentials would be:

      rule: |
        header: 'Authorization: "Bearer *"'
        ip: true
        capacity: 3
        interval: 60
        reaction: TEMPLATE

An example, which limits the number of sessions created, involves rate limiting on the PD-S-SESSION-ID cookie:

      rule: |
        cookie: 'PD-S-SESSION-ID: "*"'
        ip: true
        capacity: 3
        interval: 60
        reaction: TEMPLATE

An example of rate limiting based on a query string (for example: /my_app?resource=123) would be:

      rule: |
        query: 'resource: 123'
        ip: true
        capacity: 3
        interval: 60
        reaction: TEMPLATE

An example which rate limits requests for any value of the resource query string parameter would be:

      rule: |
        query: 'resource: "*"'
        ip: true
        capacity: 3
        interval: 60
        reaction: TEMPLATE

An example which uses a credential attribute to limit access of users based on username would be:

      rule: |
        credential: 'AZN_CRED_PRINCIPAL_NAME: "*"'
        ip: true
        capacity: 3
        interval: 60
        reaction: TEMPLATE

A boolean flag, ip, controls whether the client IP address is also used to match the request. For example:

      rule: |
        credential: 'AZN_CRED_PRINCIPAL_NAME: "*"'
        ip: false
        capacity: 3
        interval: 60
        reaction: TEMPLATE
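
The attribute matching behaviour described in this section can be sketched in Python as follows. This is a minimal illustration, not the IAG implementation: the function name identifying_value is hypothetical, fnmatch is an assumed stand-in for the wild card syntax, and returning None when the value does not match the pattern is an assumption (the text above only states this for a missing attribute).

```python
from fnmatch import fnmatchcase
from typing import Mapping, Optional


def identifying_value(name: str, pattern: str,
                      attributes: Mapping[str, str]) -> Optional[str]:
    """Return the value that identifies the client, or None if the rule does not apply."""
    lowered = {key.lower(): value for key, value in attributes.items()}
    actual = lowered.get(name.lower())
    if actual is None:
        # The attribute is absent, so the request is not rate limited by this rule.
        return None
    # Matching is case insensitive; fnmatchcase is an assumed wild card matcher.
    if not fnmatchcase(actual.lower(), pattern.lower()):
        return None
    # The actual value, not the configured pattern, identifies the request.
    return actual
```

Because the actual value is returned, two clients that present different bearer tokens are counted separately even though both match the same "Bearer *" pattern.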

Limiting Requests

Once a request has been matched, the identifying attributes (for example, URL, method, IP address, cookies, headers, and query parameters) are built into a consistent lookup key.

A counter is kept for this key. This counter has two properties, the maximum value and how often the count is reset back to zero. These values are controlled by two configuration properties:

  • A capacity which indicates the number of requests that are allowed until rate limiting occurs;
  • An interval which is the number of seconds that must pass until the capacity is reset.

For example, to allow 100 requests per minute, set the following:

      rule: |
        ip: true
        capacity: 100
        interval: 60
        reaction: TEMPLATE
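
The relationship between capacity and interval can be sketched in Python as follows. This is a minimal sketch that assumes a fixed-window reading of "reset back to zero"; the class name FixedWindowCounter and its allow method are hypothetical, and the counter that IAG uses internally may behave differently at window boundaries.

```python
from typing import Optional


class FixedWindowCounter:
    """Counts requests for one lookup key within a fixed time window."""

    def __init__(self, capacity: int, interval: float) -> None:
        self.capacity = capacity      # requests allowed per window
        self.interval = interval      # window length in seconds
        self.count = 0
        self.window_start: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Record a request at time `now`; return False if it should be rate limited."""
        if self.window_start is None or now - self.window_start >= self.interval:
            # The interval has passed, so the count is reset back to zero.
            self.window_start = now
            self.count = 0
        if self.count >= self.capacity:
            return False
        self.count += 1
        return True
```

The current time is passed in explicitly so that the behaviour is easy to reason about: with a capacity of 3 and an interval of 60, the fourth request inside the same minute is refused and requests are accepted again once 60 seconds have passed since the window started.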

Reaction Method

The final part of the configuration is the reaction method. There are three ways to react:

  • CLOSE
    • Close the connection without sending any information back to the client.
  • TEMPLATE
    • Send the rate limiting template, with the status code 429 'Too Many Requests', back to the client. The default page returned to the client is the management page entitled: ratelimit.html. See Defining Custom Responses for more information on customizing IAG pages if a customized page is required.
  • URL
    • A URL can be specified. This URL rewrites the request URL to the defined value:
      • This can be used to route a rate limited request to a dummy resource which appears to produce the same functionality as the actual resource. The intent of this is to mislead a malicious client into thinking they are performing numerous operations while actually not having any negative impact on the system.
      • This can also be used to log a user out. For example, you can direct them to /pkmslogout to terminate their session.

Please note that providing a reaction is optional. If no reaction is provided the TEMPLATE reaction will be used.

For example, to close a rate limited connection:

      rule: |
        ip: true
        capacity: 100
        interval: 60
        reaction: CLOSE

To return the template response for a rate limited connection:

      rule: |
        ip: true
        capacity: 100
        interval: 60
        reaction: TEMPLATE

To re-write the URL for a rate limited connection:

      rule: |
        ip: true
        capacity: 100
        interval: 60
        reaction: "/dummy-login"
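
The way the reaction value is interpreted can be sketched in Python as follows. This is only an illustration of the three options and the default: the function name resolve_reaction and the tuples it returns are hypothetical, and the sketch assumes the keywords are written exactly as shown in the examples above.

```python
from typing import Optional, Tuple


def resolve_reaction(reaction: Optional[str]) -> Tuple[str, Optional[str]]:
    """Map a configured reaction value to an (action, detail) pair."""
    if reaction is None or reaction == "TEMPLATE":
        # No reaction configured means the TEMPLATE reaction is used.
        return ("template", "ratelimit.html")
    if reaction == "CLOSE":
        # Drop the connection without sending anything to the client.
        return ("close", None)
    # Any other value is treated as the URL that the request is rewritten to.
    return ("rewrite", reaction)
```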

Rate Limiting Cache

When performing rate limiting, some information about the number of requests made for a client and policy must be stored. This information is stored in a cache which has a size limit. When this limit is exceeded, the oldest entry is evicted. This effectively resets the rate limiting counters for that client.

It is important to ensure that you set the configuration entry, cache_size, to a suitably high number so that a malicious client cannot saturate the cache. This number needs to be higher than the number of requests being rate limited across a refresh interval. If this value is not set it defaults to 16384. Refer to the following YAML reference page for details on setting this configuration entry: /server/rate_limiting.
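
The effect of an undersized cache can be sketched in Python as follows. This is a simplified model, not the IAG cache: the class name BoundedCounterCache is hypothetical, and "oldest" is assumed to mean the entry that was inserted first.

```python
from collections import OrderedDict


class BoundedCounterCache:
    """Keeps one hit counter per lookup key, up to a fixed number of keys."""

    def __init__(self, cache_size: int) -> None:
        self.cache_size = cache_size
        self.entries: "OrderedDict[str, int]" = OrderedDict()

    def hit(self, key: str) -> int:
        """Record a request for `key` and return its current count."""
        if key not in self.entries:
            if len(self.entries) >= self.cache_size:
                # The cache is full: evict the entry that was inserted first.
                self.entries.popitem(last=False)
            self.entries[key] = 0
        self.entries[key] += 1
        return self.entries[key]
```

In this model, a client that sends requests under enough distinct lookup keys pushes older counters out of the cache, so a previously counted client starts again from a count of one. A larger cache_size makes this more expensive to achieve.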

Rate limiting with a load balancer

When a load balancer or other network device terminates the connection before forwarding the request to the IAG, the IP address flag is not useful. The IP address is effectively static because it is always the address of the device that terminated the connection. To rate limit on the client IP address in this situation, have that device include the client IP address in a header, and match on this header in the rate limiting rule. For example:

      rule: |
        header: 'X-Forwarded-For: "*"'
        ip: false
        capacity: 3
        interval: 60
        reaction: TEMPLATE

Request log entries for rate limited requests

When rate limiting occurs the username which is found in the corresponding request log entry will always be set to unauthenticated. This is because the rate limiting occurs before the authenticated user is identified. With no username value being present in the request log entry it is often useful to include the Client IP address or the X-Forwarded-For header in the request log to help correlate and identify rate limited entries.

When a connection is closed due to rate limiting, the status code which is found in the corresponding request log entry will be set to -1. This signifies that the request was dropped without a status.

Sharing rate limiting data across instances

By default, IAG uses a local instance specific in-memory cache to track and store rate limiting data.

IAG can also be configured to track and store rate limiting data in a Redis database. When rate limiting data is stored in a Redis database, multiple IAG instances can share this same data and rate limit clients across multiple instances.

Configuring rate limiting to use Redis

To configure IAG to distribute rate limiting data using Redis:

  1. Configure a Redis collection in /services/redis
services:
  redis:
    key_prefix: iag-
    collections:
      - name: test-collection-ratelimiting
        servers:
          - host: redis-b.ibm.com
            name: redis-b
            password: passw0rd
            username: redis-iag-user
            port: 6380
            ssl:
              trust_certificates: "@redis-b-ca.pem"

For more details refer to the services/redis YAML reference.

  2. Specify the Redis specific rate limiting configuration in /server/rate_limiting/redis
server:
  rate_limiting:
    redis:
      collection_name: test-collection-ratelimiting
      sync_window: 10

For more details refer to the server/rate_limiting YAML reference.

No changes are required to existing rate limiting policy files.

Synchronisation Model

When configured to store rate limiting data in Redis, IAG uses the following model to synchronise data.

  • When a client is observed by an instance for the first time, IAG will check to see if any rate limiting data for that client is available in Redis. This data will be retrieved and stored in the local cache for the duration of the synchronisation window (/server/rate_limiting/redis/sync_window).
  • On subsequent requests if the synchronisation window has not elapsed, IAG will continue to use the locally cached rate limiting data for that client.
  • If a subsequent request is observed after the synchronisation window has elapsed, IAG will reconcile its rate limiting data (the number of hits observed by this instance) with the record in Redis. At this point it will also replace the locally cached rate limiting data with the current version from Redis.

Note that in this model there is a small window of time during which a client can exhaust a single instance and continue to make requests to other instances.
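
The synchronisation model above can be sketched in Python as follows. This is a simplified simulation, not the IAG implementation: a plain dictionary stands in for Redis, record expiration is ignored, and the class name Instance and its hit method are hypothetical.

```python
class Instance:
    """One gateway instance sharing hit counts through a common store."""

    def __init__(self, shared: dict, sync_window: float) -> None:
        self.shared = shared            # stand-in for Redis: key -> hits
        self.sync_window = sync_window  # seconds between reconciliations
        self.local = {}                 # key -> base, unsynced, synced_at

    def hit(self, key: str, now: float) -> int:
        """Record a request and return the hit count as seen by this instance."""
        entry = self.local.get(key)
        if entry is None:
            # First sighting on this instance: load whatever the store holds.
            entry = {"base": self.shared.get(key, 0), "unsynced": 0, "synced_at": now}
            self.local[key] = entry
        elif now - entry["synced_at"] >= self.sync_window:
            # Window elapsed: push the local hits, then refresh from the store.
            self.shared[key] = self.shared.get(key, 0) + entry["unsynced"]
            entry.update(base=self.shared[key], unsynced=0, synced_at=now)
        entry["unsynced"] += 1
        return entry["base"] + entry["unsynced"]


shared = {}
a, b = Instance(shared, 10), Instance(shared, 10)
counts_a = [a.hit("client", t) for t in (0, 1, 2)]  # instance A sees 1, 2, 3
count_b = b.hit("client", 3)                        # instance B still sees 1
```

The last two lines illustrate the window noted above: until instance A reconciles with the shared store, instance B is unaware of the hits that A has already recorded.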

Redis data storage

A new record is created for each client as identified by each policy. These records are created with an expiration corresponding to the rate limiting policy's defined interval. The native Redis expiration capability will automatically remove expired records.

Each record consists of a single key which is used to store an integer value. The integer value is the number of observed hits.

The format of the key is:

<key-prefix>-<policy-hash><client-hash>

Field | Description | Example
<key-prefix> | A prefix which is applied to all keys in this database. This is the value of /services/redis/key_prefix. | isva
<policy-hash> | A SHA256 hash of the policy YAML content which is being applied to this client. | d9ae...d5fd
<client-hash> | A SHA256 hash of the criteria which uniquely identifies this client according to the rate limiting policy being applied. | 504f...3ba8

Note that the rate limiting policy content itself or details about clients are not stored in Redis, only a SHA256 hash of this information.
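
The shape of such a key can be sketched in Python as follows. This only illustrates the documented layout: the function name redis_key is hypothetical, and the exact bytes that IAG hashes for the policy and the client, as well as the hexadecimal encoding, are assumptions.

```python
import hashlib


def redis_key(key_prefix: str, policy_yaml: str, client_criteria: str) -> str:
    """Build a key of the form <key-prefix>-<policy-hash><client-hash>."""
    # Assumption: each hash is the hex-encoded SHA256 digest of UTF-8 text.
    policy_hash = hashlib.sha256(policy_yaml.encode("utf-8")).hexdigest()
    client_hash = hashlib.sha256(client_criteria.encode("utf-8")).hexdigest()
    return f"{key_prefix}-{policy_hash}{client_hash}"
```

Because only digests are stored, the same policy and client always map to the same key, while neither the policy content nor the client details can be read back from Redis.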

Required Redis commands

The following Redis commands are used by IAG for storing and maintaining rate limiting data. When using Redis authorization and Redis ACLs, ensure the following commands are permitted for the user which IAG is configured to use:

  • EXISTS
  • INCRBY
  • TTL
  • SET