Caching AWS Lambda with CloudFront

Mountain in the shape of the AWS Lambda logo, rising from clouds
Photo by Nico Siegl on Pexels, with modifications

In this post, I will show how to add CloudFront to the Lambda function created in the previous posts. This allows us to cache responses and reduce the cost of running the Lambda function.

This article is part of the "Terraform - placeruler.knappi.org" series, you may also want to read the other articles, especially the older ones:

The first post of this series showed how to deploy an AWS Lambda function and make it accessible via an API Gateway. The second post showed how to stream the response. Now we will add caching with CloudFront.

Why? Because the Lambda I am writing should always return the same result for the same input. So why call it every time and spend money on it? We could also store the results in S3, but then we would have to manage the cache ourselves: delete old entries and make sure we don't use too much storage there. CloudFront takes care of that for us.

CloudFront and HTTPS

CloudFront is a Content Delivery Network (CDN) by AWS. It runs on AWS's edge locations, which are closer to the users than our Lambda function. We could even use Lambda@Edge to run code on the edge locations, but that is out of scope for this post.

In order to set up CloudFront with Terraform, we need to define a number of resources. We start with the distribution itself. This is already a rather verbose object because of all the required fields:

resource "aws_cloudfront_distribution" "main" {
  enabled = true

  # We can have one or multiple origins (i.e. backends) for the CloudFront distribution.
  origin {
    # Each one needs a unique ID
    origin_id   = "test_lambda_origin"
    # We want to use our Lambda function as origin, so we have to derive the domain name from its function_url
    domain_name = trimsuffix(trimprefix(aws_lambda_function_url.main_lambda_url.function_url, "https://"), "/")

    # The custom_origin_config is required unless the origin is an S3 bucket.
    custom_origin_config {
      http_port                = 80
      https_port               = 443
      origin_protocol_policy   = "https-only"
      origin_ssl_protocols = [
        "TLSv1.2",
      ]
    }
  }

  # A viewer_certificate block is required, and we use a default certificate for now.
  viewer_certificate {
    cloudfront_default_certificate = true
  }

  # Here, you can restrict geo-locations, but we don't want to do this.
  restrictions {
    geo_restriction {
      restriction_type = "none"
      locations = []
    }
  }

  # This block defines the actual routing and caching. By default we route everything to the lambda.
  # We may change this in the future when we get more origins.
  default_cache_behavior {
    # Allow or limit HTTP methods to be forwarded
    allowed_methods = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]
    # Which requests should be cached?
    cached_methods = ["GET", "HEAD"]
    # The target origin as stated above. We should extract this to a variable.
    target_origin_id       = "test_lambda_origin"
    # The cache policy contains more detailed settings for the cache, like ttl, passed headers, etc.
    cache_policy_id        = aws_cloudfront_cache_policy.test_lambda_cache_policy.id
    # Redirect all HTTP requests to HTTPS
    viewer_protocol_policy = "redirect-to-https"
  }
}

This is not enough, though. We need another resource for the cache policy. It requires at least a name and a definition of which parts of the request should be forwarded to the origin (none in our case):

resource "aws_cloudfront_cache_policy" "test_lambda_cache_policy" {
  name = "test_lambda_cache_policy"

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "none"
    }
    query_strings_config {
      query_string_behavior = "none"
    }
    headers_config {
      header_behavior = "none"
    }
  }
}

Finally, we need to know on which domain the CloudFront distribution is available, so we add an output:

output "cloudfront_domain_name" {
  value = "https://${aws_cloudfront_distribution.main.domain_name}"
}
Missing custom_origin_config

One mistake that cost me quite some time, and which ended with me consulting my co-workers, was omitting the custom_origin_config. You don't need to provide one if you use an S3 bucket as origin. But for other URLs, you need it.

Controlling the cache

There are two ways to control the caching behavior of the CloudFront distribution:

  • Return Cache-Control headers in the response of the Lambda.
  • Assign a default_ttl to the default_cache_behavior.

Personally, I prefer the first option, because the Lambda knows best how long its response will be valid. To demonstrate this, I will change the Lambda function code a bit:

export const handler = streamifyResponse(
  async (event, responseStream, context) => {
    responseStream = awslambda.HttpResponseStream.from(responseStream, {
      statusCode: 200,
      headers: {
        "Content-Type": "text/plain",
      },
    });
    responseStream.write(new Date().toString());
    responseStream.end();
  },
);
Debugging

A short word about debugging: when I wrote this post and wanted to try setting the default_ttl, I suddenly encountered Internal Server Error responses from CloudFront. So what did I do to debug this? I don't know if there is a way to remote-debug the Lambda, but what you can do is:

  • Try the CloudFront-URL.
  • Try the Lambda-URL directly.
  • Look into the CloudWatch logs of the Lambda.
  • Test the Lambda in the AWS console.
  • If you commit often, roll back to the last working version and iteratively deploy new versions until you encounter the bug.

In my case, I had called responseStream.end(new Date().toString()) directly, instead of responseStream.write(...); responseStream.end(). This worked fine in the test runner, and it should work with Node.js streams. But it didn't with the function URL.

With the new Lambda function, we can get an impression of how long things are cached. The example repository contains a script multi_curl.sh which fetches the URL five times with a delay of 2 seconds and prints the results. Fetching directly from the function URL yields this output:

  terraform-lambda-example git:(0030-cloudfront-lambda)  bin/multi_curl.sh https://t6joecx3wv7jdr74jmsuc3aptu0dhvzn.lambda-url.eu-west-1.on.aws/
1: Wed Dec 11 2024 14:18:34 GMT+0000 (Coordinated Universal Time)
2: Wed Dec 11 2024 14:18:36 GMT+0000 (Coordinated Universal Time)
3: Wed Dec 11 2024 14:18:38 GMT+0000 (Coordinated Universal Time)
4: Wed Dec 11 2024 14:18:40 GMT+0000 (Coordinated Universal Time)
5: Wed Dec 11 2024 14:18:42 GMT+0000 (Coordinated Universal Time)
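The script itself is not listed in this post; a rough sketch of what such a multi_curl.sh could look like (my guess, not necessarily the repository's exact script):

```shell
# Sketch of multi_curl.sh: fetch the given URL five times with a
# 2-second delay, printing each numbered response.
multi_curl() {
  url="$1"
  for i in 1 2 3 4 5; do
    printf '%s: %s\n' "$i" "$(curl -s "$url")"
    sleep 2
  done
}

# Usage: multi_curl "https://<your-function-url>/"
```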

Setting default_ttl in CloudFront

When we use the CloudFront URL, we get the same result: no caching yet. After some experiments, which you can read about at the end of this article, I found out that the right way is to add a default_ttl to the test_lambda_cache_policy:

resource "aws_cloudfront_cache_policy" "test_lambda_cache_policy" {
  name = "test_lambda_cache_policy"

  default_ttl = 6
  // ...
}

The script now returns:

  terraform-lambda-example git:(0030-cloudfront-lambda)  bin/multi_curl.sh https://d1ev8nsmj7uz99.cloudfront.net
1: Wed Dec 11 2024 14:42:09 GMT+0000 (Coordinated Universal Time)
2: Wed Dec 11 2024 14:42:09 GMT+0000 (Coordinated Universal Time)
3: Wed Dec 11 2024 14:42:09 GMT+0000 (Coordinated Universal Time)
4: Wed Dec 11 2024 14:42:09 GMT+0000 (Coordinated Universal Time)
5: Wed Dec 11 2024 14:42:17 GMT+0000 (Coordinated Universal Time)

Not exactly six seconds, but good enough for me.

Returning Cache-Control headers from the Lambda

Now let's remove the default_ttl again and return a Cache-Control header from the Lambda instead:

export const handler = streamifyResponse(
  async (event, responseStream, context) => {
    responseStream = awslambda.HttpResponseStream.from(responseStream, {
      statusCode: 200,
      headers: {
        "Content-Type": "text/plain",
        "Cache-Control": "max-age=6",
      },
    });
    responseStream.write(new Date().toString());
    responseStream.end();
  },
);

Caching works and a new value is returned after 6 seconds.

  terraform-lambda-example git:(0030-cloudfront-lambda)  bin/multi_curl.sh https://d1ev8nsmj7uz99.cloudfront.net
1: Wed Dec 11 2024 14:42:27 GMT+0000 (Coordinated Universal Time)
2: Wed Dec 11 2024 14:42:27 GMT+0000 (Coordinated Universal Time)
3: Wed Dec 11 2024 14:42:27 GMT+0000 (Coordinated Universal Time)
4: Wed Dec 11 2024 14:42:33 GMT+0000 (Coordinated Universal Time)
5: Wed Dec 11 2024 14:42:33 GMT+0000 (Coordinated Universal Time)

Passed parameters and headers

Currently, our Lambda does not process any input data. If we want it to, we need to specify those parameters in the cache policy. They are also the parameters that are used to compute the cache key: only if all forwarded parameters are the same for two requests is the second request served from the cache.
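To make the cache-key idea concrete, here is a simplified sketch (my own illustration, not CloudFront's actual algorithm):

```javascript
// Simplified illustration: the cache key is derived only from the
// whitelisted request parts, so two requests that agree on those
// parts share a cache entry.
function cacheKey(request, whitelist) {
  const parts = [request.path];
  for (const name of whitelist.queryStrings) {
    parts.push(`q:${name}=${request.query[name] ?? ""}`);
  }
  for (const name of whitelist.headers) {
    parts.push(`h:${name}=${request.headers[name.toLowerCase()] ?? ""}`);
  }
  return parts.join("|");
}

const whitelist = { queryStrings: ["myParam"], headers: ["x-my-header"] };
const first = cacheKey({ path: "/", query: { myParam: "q1" }, headers: {} }, whitelist);
const second = cacheKey({ path: "/", query: { myParam: "q1" }, headers: {} }, whitelist);
const third = cacheKey({ path: "/", query: { myParam: "q2" }, headers: {} }, whitelist);
console.log(first === second); // true: second request hits the cache
console.log(first === third); // false: fresh invocation
```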

For this experiment, let’s return a JSON object of all query parameters and headers in the response:

export const handler = streamifyResponse(
  async (event, responseStream, context) => {
    responseStream = awslambda.HttpResponseStream.from(responseStream, {
      statusCode: 200,
      headers: {
        "Content-Type": "text/plain",
        "Cache-Control": "max-age=6",
      },
    });
    responseStream.write(
      `${new Date().toTimeString()} ${event.queryStringParameters?.myParam} ${event.headers["x-my-header"]}\n`,
    );
    responseStream.end();
  },
);
Case-sensitivity

Header names are converted to lower-case by the Lambda function; this is why I use event.headers["x-my-header"] instead of event.headers["X-My-Header"].
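A small sketch of this normalization (my own illustration of the behavior, not the runtime's actual code):

```javascript
// Normalize header names to lower case, roughly the way they arrive
// in the Lambda event, so lookups are effectively case-insensitive.
function lowerCaseHeaders(headers) {
  const result = {};
  for (const [name, value] of Object.entries(headers)) {
    result[name.toLowerCase()] = value;
  }
  return result;
}

const headers = lowerCaseHeaders({ "X-My-Header": "foo" });
console.log(headers["x-my-header"]); // "foo"
console.log(headers["X-My-Header"]); // undefined
```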

Now, let’s see if the values are returned properly. First the Lambda function directly:

  terraform-lambda-example git:(0030-cloudfront-lambda)  curl  -H "X-My-Header: foo" "https://t6joecx3wv7jdr74jmsuc3aptu0dhvzn.lambda-url.eu-west-1.on.aws/?myParam=bar"
15:52:13 GMT+0000 (Coordinated Universal Time) bar foo

Now via CloudFront:

  terraform-lambda-example git:(0030-cloudfront-lambda)  curl  -H "X-My-Header: foo" "https://d1ev8nsmj7uz99.cloudfront.net/11?myParam=bar"
15:56:06 GMT+0000 (Coordinated Universal Time) undefined undefined

We can see that the parameter myParam and the header x-my-header are not passed to the Lambda function. We have to add them to the cache policy:

resource "aws_cloudfront_cache_policy" "test_lambda_cache_policy" {
  name = "test_lambda_cache_policy"

  parameters_in_cache_key_and_forwarded_to_origin {
    cookies_config {
      cookie_behavior = "none"
    }
    query_strings_config {
      query_string_behavior = "whitelist"
      query_strings { items = ["myParam"] }
    }
    headers_config {
      header_behavior = "whitelist"
      headers { items = ["x-my-header"] }
    }
  }
}

Deploy and try again. Let’s call the function multiple times. Two times with the same query parameter, and two times with different ones:

  terraform-lambda-example git:(0030-cloudfront-lambda)  for i in q1 q1 q2 q3 ; do curl "https://d1ev8nsmj7uz99.cloudfront.net/14?myParam=$i" ; sleep 1 ; done
16:03:56 GMT+0000 (Coordinated Universal Time) q1 undefined
16:03:56 GMT+0000 (Coordinated Universal Time) q1 undefined
16:03:58 GMT+0000 (Coordinated Universal Time) q2 undefined
16:03:59 GMT+0000 (Coordinated Universal Time) q3 undefined

From the timestamps, we can see that the second request came from the cache, while the others were freshly computed. We could also test this for different headers, and I did. But I'll leave that to you as an exercise.
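As a starting point for that exercise, here is a sketched loop that varies the header instead of the query parameter (the URL is a placeholder for your own distribution):

```shell
# Call the function four times: twice with the same header value,
# twice with different ones, to see which responses come from the cache.
try_headers() {
  url="$1"
  for h in h1 h1 h2 h3; do
    curl -s -H "X-My-Header: $h" "$url"
    sleep 1
  done
}

# Usage: try_headers "https://<your-distribution>.cloudfront.net/"
```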

Conclusion

We have seen how to create a CloudFront distribution for a Lambda function with Terraform. We have tried different strategies to control the caching behavior, and we have seen how to pass parameters and headers to the Lambda function.

There are many more options available in the CloudFront configuration, and there is plenty of documentation. I hope I could give you a good starting point and show you some of the pitfalls I stepped into.

If you want to read on, there is still a small section about failed experiments below. And there is much more that I would like to write about: using custom domains and custom certificates, using multiple Terraform providers, adding a homepage to the site, and the image-generation algorithm itself.

So stay tuned and have nice holidays.

The setup for this post can be found in the branch 0030-cloudfront-lambda of the example project.

Bonus: My experiments with default_ttl

When I wrote those lines, things didn’t work for me at first.

I configured default_ttl = 6 in the default_cache_behavior. No caching applied.

I noticed that the Lambda was still returning a Cache-Control: no-cache header in the response, which overrides the default TTL. After removing that, there was still no change.

I added min_ttl = 0 and max_ttl = 300000 to the default_cache_behavior. Now something was cached, but for much longer than 6 seconds. Then I saw the documentation mention that the TTL fields in the default_cache_behavior are a way to specify the TTL without having a cache policy. So I removed the fields from the behavior and added them to the policy instead, as described above.

Still, the old cached value was returned for the initial URL, while other URLs were now cached for about 6 seconds. After a CloudFront invalidation, it worked for the initial URL as well.