Gotchas with Amazon S3 and Amazon CloudFront Static Site Hosting

Learn continually - there's always "one more thing" to learn! - Steve Jobs

While moving this blog from a plain HTML/CSS site hosted on Amazon S3 to a Hugo-generated site hosted on Amazon S3 behind Amazon CloudFront, I learnt some important lessons, primarily around hosting a static site with CloudFront and S3.

  1. Amazon S3 as Origin

    When setting up an origin for CloudFront, you can configure S3 as the origin in two ways:

    1. Using REST API endpoints:

      <bucket-name>.s3.amazonaws.com

      This type of origin does not require public access to be enabled on the bucket. You should use an Origin Access Identity (OAI) to restrict access to the objects.

    2. Using website endpoint/custom origin:

      <bucket-name>.s3-website-<region>.amazonaws.com or <bucket-name>.s3-website.<region>.amazonaws.com

      This type of origin needs the backend to be a publicly available endpoint, which means the S3 bucket must be configured for static website hosting and its objects must be publicly readable. You cannot use an OAI to restrict access to the objects with this configuration. Instead, you secure the interaction using a custom header in the request and a matching condition in the bucket policy.

      We will discuss this configuration later in this blog in detail.
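
    To make the distinction concrete, here is a minimal Python sketch that classifies an origin domain name as one of the two styles above. The names `classify_s3_origin` and `supports_oai` are hypothetical helpers written for illustration, not part of any AWS SDK, and the patterns only cover the hostname shapes listed in this section.

```python
import re

# Website endpoint: <bucket>.s3-website-<region>.amazonaws.com
#               or: <bucket>.s3-website.<region>.amazonaws.com
WEBSITE_ENDPOINT = re.compile(
    r"^(?P<bucket>.+)\.s3-website[.-](?P<region>[a-z0-9-]+)\.amazonaws\.com$"
)
# REST endpoint: <bucket>.s3.amazonaws.com (global)
#            or: <bucket>.s3.<region>.amazonaws.com (regional)
REST_ENDPOINT = re.compile(
    r"^(?P<bucket>.+)\.s3(?:\.(?P<region>[a-z0-9-]+))?\.amazonaws\.com$"
)


def classify_s3_origin(domain: str) -> str:
    """Return 'website', 'rest' or 'unknown' for an S3 origin domain name."""
    if WEBSITE_ENDPOINT.match(domain):
        return "website"
    if REST_ENDPOINT.match(domain):
        return "rest"
    return "unknown"


def supports_oai(domain: str) -> bool:
    """Per the text above, only the REST endpoint style can be used with OAI."""
    return classify_s3_origin(domain) == "rest"


if __name__ == "__main__":
    for name in (
        "my-bucket.s3.amazonaws.com",
        "my-bucket.s3-website-us-east-1.amazonaws.com",
        "my-bucket.s3-website.eu-west-1.amazonaws.com",
    ):
        print(name, classify_s3_origin(name), supports_oai(name))
```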

  2. Compression and Caching Policy

    When using CloudFront to serve compressed objects, ensure the following:

    1. EnableAcceptEncodingGzip and EnableAcceptEncodingBrotli are both set to true.
    2. Gzip and Brotli settings are both enabled in the cache policy.
    3. TTL is greater than zero; otherwise caching is disabled and CloudFront will not compress objects.
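
    The checklist above can be sketched as a small Python check over a cache policy configuration. The dictionary layout is assumed to follow the CloudFront CachePolicyConfig structure; the policy name and the `compression_problems` helper are hypothetical. The sketch reads "TTL is greater than zero" as "MaxTTL is greater than zero", which is an assumption of this example.

```python
# Example cache policy configuration (name and TTL values are illustrative).
EXAMPLE_CACHE_POLICY = {
    "Name": "static-site-compressed",  # hypothetical policy name
    "MinTTL": 1,
    "DefaultTTL": 86400,
    "MaxTTL": 31536000,
    "ParametersInCacheKeyAndForwardedToOrigin": {
        "EnableAcceptEncodingGzip": True,
        "EnableAcceptEncodingBrotli": True,
        "HeadersConfig": {"HeaderBehavior": "none"},
        "CookiesConfig": {"CookieBehavior": "none"},
        "QueryStringsConfig": {"QueryStringBehavior": "none"},
    },
}


def compression_problems(cache_policy_config: dict) -> list:
    """Return the checklist items that this cache policy config fails."""
    problems = []
    params = cache_policy_config.get("ParametersInCacheKeyAndForwardedToOrigin", {})
    if not params.get("EnableAcceptEncodingGzip", False):
        problems.append("EnableAcceptEncodingGzip is not true")
    if not params.get("EnableAcceptEncodingBrotli", False):
        problems.append("EnableAcceptEncodingBrotli is not true")
    # Assumption: a MaxTTL of zero means caching is effectively disabled.
    if cache_policy_config.get("MaxTTL", 0) <= 0:
        problems.append("MaxTTL is zero, so caching is disabled")
    return problems


if __name__ == "__main__":
    print(compression_problems(EXAMPLE_CACHE_POLICY))
```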

  3. S3 Endpoints and DNS Resolution

    DNS records for a newly created S3 bucket take time to be created and propagated. If your CloudFront distribution is configured to use the global S3 domain name, a request made through the CloudFront distribution endpoint can be redirected to the regional S3 endpoint (you will see the URL in the address bar change) and fail with a 403 Access Denied error. This may be less pronounced for buckets created in the US East (N. Virginia) region, since US East is the default region and is used for global resources such as CloudFront.

    To avoid this issue, use the regional S3 domain name in the distribution settings.
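
    As a small illustration, the Python sketch below rewrites a global S3 domain name into its regional form. The helper names and the example bucket and region are hypothetical, and the sketch assumes the regional REST endpoint takes the form `<bucket-name>.s3.<region>.amazonaws.com`.

```python
GLOBAL_SUFFIX = ".s3.amazonaws.com"


def regional_s3_domain(bucket: str, region: str) -> str:
    """Build the regional REST endpoint domain for a bucket."""
    return f"{bucket}.s3.{region}.amazonaws.com"


def to_regional(domain: str, region: str) -> str:
    """Convert a global S3 domain to its regional form; leave others unchanged."""
    if domain.endswith(GLOBAL_SUFFIX):
        bucket = domain[: -len(GLOBAL_SUFFIX)]
        return regional_s3_domain(bucket, region)
    return domain


if __name__ == "__main__":
    # Hypothetical bucket and region.
    print(to_regional("my-blog-bucket.s3.amazonaws.com", "ap-southeast-2"))
```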

  4. Default Root Object vs Multiple Sub-directories

    This is, in my opinion, the most important part of this article. We will discuss two types of static websites here and what configuration of CloudFront and S3 is needed and why.

    1. Static Website with a single root object

      In this use case you would have a single index.html that serves all content. You can use S3 with CloudFront OAI. You do not need to enable static website hosting for the S3 bucket.

      You do need to configure the Default Root Object in the distribution settings.

    2. Static Website with multiple subdirectories and a root object per subdirectory

      If you have a static website such as this blog, where each post has its own root object, you will receive a 403 Access Denied error for requests to those subdirectories, because CloudFront does not append index.html to request URLs within subdirectories. There are three ways to deal with this:

      1. The first option is to define a behaviour in the distribution for each path that your website has, and to keep those behaviours up to date. This approach is not scalable, since you would need to keep track of every behaviour and path within your static site. If you are using a generator, it becomes even harder.

      2. As per this AWS knowledge center article, you could use Lambda@Edge with your distribution.

        The default root object feature for CloudFront supports only the root of the origin that your distribution points to.

        CloudFront doesn’t return default root objects in subdirectories. For more information, see Specifying a Default Root Object.

        If your CloudFront distribution must return the default root object from a subfolder or subdirectory, you can integrate Lambda@Edge with your distribution. For an example configuration, see Implementing Default Directory Indexes in Amazon S3-backed Amazon CloudFront Origins Using Lambda@Edge.

        You would, however, incur costs based on the Lambda usage.

      3. Another approach is to host your website on S3 with static website hosting enabled and configure the S3 website endpoint as a custom origin. You can then secure the contents using the custom header approach described in the next section.
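
    For the Lambda@Edge option (2) above, a minimal sketch of an origin-request handler in Python might look like the following. It assumes the function is attached to the origin request event and that directory URLs end with a trailing slash; it does not handle paths such as /posts/my-post without the slash. The event shape used here is the standard CloudFront Lambda@Edge request event.

```python
def lambda_handler(event, context):
    """Rewrite directory-style URIs so the origin returns index.html."""
    # CloudFront passes the request under Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    uri = request["uri"]

    # Only rewrite URIs that look like directories (trailing slash).
    if uri.endswith("/"):
        request["uri"] = uri + "index.html"

    # Returning the request tells CloudFront to continue to the origin.
    return request


if __name__ == "__main__":
    # Hand-built sample event for local experimentation.
    sample = {"Records": [{"cf": {"request": {"uri": "/posts/my-post/"}}}]}
    print(lambda_handler(sample, None)["uri"])
```

    Rewriting at the origin request stage rather than the viewer request stage means the function runs only on cache misses, which helps keep the Lambda cost mentioned above down.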

  5. Backend Protection

    1. Object Ownership

      The AWS account that owns the bucket must also own the objects in the bucket. This applies to access granted by the bucket policy, not to access granted by the object’s access control list (ACL).

    2. Encrypted Objects

      OAI only supports SSE-S3, which means you cannot serve objects encrypted with SSE-KMS through CloudFront by default. To serve such objects, you would need Lambda@Edge to sign the requests that CloudFront sends to S3, so that S3 can decrypt the objects on behalf of an authorised caller.

    3. Bucket Policy

      I’ve discussed two types of origins that CloudFront can be configured with when used with S3. The bucket policy needed to restrict access to the S3 objects is different in each case.

      1. Using S3 REST API and OAI:

        Configuration in this case is very simple. All you need to do is block public access to the bucket and grant access to the CloudFront OAI.

        {
            "Version": "2012-10-17",
            "Id": "PolicyForCloudFrontPrivateContent",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": "<OAI Amazon Resource Name (ARN)>"
                    },
                    "Action": "s3:GetObject",
                    "Resource": "arn:aws:s3:::<bucket-name>/*"
                }
            ]
        }
        
      2. Using website endpoint/custom origin:

        Configuration in this case is based on custom headers and policy conditions. You need to configure a custom header that CloudFront adds to every request it makes to the S3 website endpoint.

        In your CloudFront distribution, under Origin Custom Headers, set the Header Name to Referer and the Value to a secret of your choice. Why Referer as the Header Name? Because a custom origin header overrides any header of the same name sent by the viewer, the Referer value from an end user’s request is never forwarded to S3, and aws:Referer is a condition key you can test in the bucket policy.

        Now add a condition in the bucket policy to restrict access to the objects.

        {
            "Version": "2012-10-17",
            "Id": "PolicyForCloudFrontPrivateContent",
            "Statement": [
                {
                    "Effect": "Allow",
                    "Principal": {
                        "AWS": "*"
                    },
                    "Action": "s3:GetObject",
                    "Resource": "arn:aws:s3:::<bucket-name>/*",
                    "Condition": {
                        "StringLike": {
                            "aws:Referer": "<*SecretHeaderValue*>"
                        }
                    }
                }
            ]
        }
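
    Both policies above can also be generated programmatically. The Python sketch below builds the OAI policy, the Referer-conditioned policy and the matching origin custom header entry. The function names, bucket name and OAI ID are hypothetical; the OAI principal ARN format and the CustomHeaders layout are assumed to follow what CloudFront expects, so verify them against your own distribution before use.

```python
import json
import secrets


def oai_bucket_policy(bucket: str, oai_id: str) -> dict:
    """Policy that lets a CloudFront OAI read objects (REST endpoint origin)."""
    # Assumed ARN format for an OAI principal.
    principal = f"arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity {oai_id}"
    return {
        "Version": "2012-10-17",
        "Id": "PolicyForCloudFrontPrivateContent",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": principal},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
        }],
    }


def referer_bucket_policy(bucket: str, secret: str) -> dict:
    """Policy that allows reads only when the Referer header carries the secret."""
    return {
        "Version": "2012-10-17",
        "Id": "PolicyForCloudFrontPrivateContent",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": "*"},
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": {"StringLike": {"aws:Referer": secret}},
        }],
    }


def referer_origin_header(secret: str) -> dict:
    """Origin custom header entry carrying the same secret (assumed layout)."""
    return {
        "Quantity": 1,
        "Items": [{"HeaderName": "Referer", "HeaderValue": secret}],
    }


if __name__ == "__main__":
    secret = secrets.token_urlsafe(32)  # random value shared by both sides
    print(json.dumps(referer_bucket_policy("my-blog-bucket", secret), indent=4))
    print(json.dumps(referer_origin_header(secret), indent=4))
```

    Generating the secret once and feeding it to both functions keeps the header value and the policy condition in sync, which is the part that is easy to get wrong when editing them by hand.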