Leave GoDaddy and Go Completely Serverless Using Just Cloudflare and S3 Buckets: Part Two

This is the continuation of Part One. The goal again is:

Goal: Completely leave GoDaddy, move email services to Cloudflare, run WordPress offline and serve static HTML pages from Amazon S3, only pay a fraction of the ever-rising GoDaddy hosting fees, and finally move off GoDaddy’s underpowered, EOL’d shared server.
Overview

In Part Two, I show you how to handle the .htaccess file, use Cloudflare Workers to serve dynamic content, create cron jobs, protect your S3 buckets, cache hard, handle comments, set up a WAF and firewall, and cover more advanced topics.

Results

Here is my serverless WordPress website now:

Ericdraken.com serverless website performance
Checklist: If you would like to skip to the end, here is a checklist of steps I take when I move a site from GoDaddy.

Table of Contents

Part Two


Part One


Serverless: Serve a WordPress Website without a Server

Congratulations. At this point we have a working website and do not even need Workers or Lambdas: HTML and images load fast, virtual subdomains work, the 404 page functions, and the whole site responds over HTTPS.

Think About It: What do we really need from serverless?

Some people throw a lot of services at serverless, overcomplicating it. But why? Before you read an article that talks you into using Route53, API Gateway, Lambdas, CloudFront, CloudWatch, RDS, and Terraform to hold it all together, stop and ask yourself what you need, not what is en vogue on a résumé.

My vision of serverless looks closer to this:

Limit serverless

Where to start? Notice in the production/ folder there is a remnant .htaccess file with such contents as:
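For illustration, a remnant .htaccess of this kind typically contains directives like these (a hedged sample; paths and bot names are placeholders, not my actual rules):

```apache
RewriteEngine On

# Rewrite rule to redirect www to the root domain
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ https://%1%{REQUEST_URI} [R=301,L]

# Rules to block bad bots by user agent
RewriteCond %{HTTP_USER_AGENT} (AhrefsBot|MJ12bot) [NC]
RewriteRule .* - [F,L]

# Different cache timings for different assets
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType image/jpeg "access plus 1 month"
  ExpiresByType text/css "access plus 1 week"
</IfModule>
```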

There are more directives, like different cache timings for different assets, a rewrite rule to redirect www to the root domain, rules to block bad bots, and so forth. These neat rules are Apache rules, but do we even need all of them any longer?

My Identified Serverless Needs

Before following anyone’s “Set up Serverless in 10 Minutes” article, I need to brainstorm what I need in Serverless to get away from GoDaddy without becoming a full-time SRE1.

Do I need a Gateway? Am I just trying to be cool by throwing every service at the problem? Am I fooling myself into thinking that if I have every AWS service touching my Serverless site that it will make me marketable?

The best solution is one that works and keeps working.

Here is what I’ve identified as my needs:

  • Serve a custom 404 page.
  • Honour the accept-encoding: gzip, deflate, br request header.2
  • Have great SEO performance.
  • Redirect moved pages over time.
  • Serve vanity affiliate links.
  • Serve tracking links in emails.
  • Run cron jobs with custom PHP.
  • Collect information with a 1×1 pixel in PHP.
  • Mitigate bot scraping.
  • Publish and forget the site.
  • Keep backups.

Some Htaccess Thought Experiments:

Now that I have my Serverless needs written down, we can brainstorm what to do with the .htaccess directives:

  • If we merely delete the .htaccess directives file, then:
    • No rewrites at all.
    • No 401, 403, 301, 302, 500, … etc. response codes, ever.
    • Only 200 and 404 response codes are supported directly with S3.
    • Deleted pages in the future are 404, not 410 (Gone).
    • It is impossible to 301-redirect pages, only duplicate content.
    • Cloudflare only allows three page rules, but one is already HTTP –> HTTPS.
    • We will have big SEO problems in the future.
  • If we run a thin Apache server, then:
    • We will need a host or an EC2 in AWS.
    • It will be slower than simply loading an index.html from S3.
    • It will make me sad because if I have to use a server, it should be Nginx.
  • If we run an AWS Gateway with an incoming Lambda, then:
    • Problem solved, but…
    • We have the overhead of a cold-start Lambda on each GET request (poor SEO).
    • Every image and asset is either piped through the Lambda or gets a 302-redirect (poor SEO).
    • I’d be deeply tied to AWS and the fragility of glue between its services.
    • I'd have to write a mini server in Python or Node and hardcode and maintain rules, then…
      • I'll cry myself into a coma.
  • If we serve static files from S3 and use the 50 redirection rules allowed, then:
    • 404 errors are trivial.
    • It's easier to maintain a JSON file of URL redirects.
    • There are still no status code headers (just 302).
    • S3 only accepts HTTP requests and drops HTTPS requests.
    • S3 does not use gzip to return objects.
    • Metadata (Last-Modified, ETag) can be set, but with standalone PUT requests.
    • Easy redirects can be configured in JSON, and 410-gone rules can be in a Lambda, but…
      • I'd have to maintain JSON rules in S3, and a coded Lambda: two places, deeper coma.
  • If we use Cloudflare Pages, then:
    • We're tied into another vendor's configuration.
    • You'd notice right away that we're not a Vue, Angular, React, Jekyll, Hugo, etc. app.
    • Those gigabytes of images and assets still need to go somewhere: not GitHub!
  • If we use Cloudflare Workers, then:
    • We're still vendor-locked.
    • The coding languages are only: vanilla JavaScript, TypeScript, and Rust.
    • It's still coding a thin server, just in a simpler lambda-like Worker.
    • We get WAF, caching, redirects, TLS (HTTPS), logging, and much more for free.
Fun Fact: I worked with a company that tried to offload thousands of WordPress images to S3 and keep them in sync (delete, resize, etc.), and it was an unmitigated disaster. I have learned that everything sits in S3, or nothing sits in S3. And, do not shard3 your website across GitHub repos to get past the 1 GiB repo limit.

Top ↩


The Plan for Apache’s Htaccess

So, what am I going to do with that core Apache file?

  • Turn off any Hide My WordPress obfuscation plugins. With static HTML, who cares if anyone sees /wp-uploads/ anymore? A bot can hammer the non-existent /wp-admin/ all day long. Then, most of the rewrite rules elegantly vanish and can all be removed.
  • Turn off local compression. Yes, you read that right. I have an abandoned plugin called BWP Minify that caches content in gzip-compressed files. There are too many places compression can bite us in the rear, plus we won’t have a compression header in S3 assets. By turning off compression for offline assets, we are sure HTML, CSS, and JS are stored and scraped correctly for offline static storage.
  • Remove all the wonderful attack mitigation directives. For a static presentation website hosted in S3, I say go ahead with your automated XSS injection, Bot Farmers. These can all be removed:
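The sort of mitigation directives being deleted look like this (illustrative, not my exact rules):

```apache
# Typical attack-mitigation rules that static hosting makes unnecessary
ServerSignature Off

# Deny access to sensitive files
<FilesMatch "\.(htaccess|htpasswd|ini|log|bak)$">
  Require all denied
</FilesMatch>

# Block query strings used in common injection probes
RewriteCond %{QUERY_STRING} (<|%3C).*script.*(>|%3E) [NC]
RewriteRule .* - [F,L]
```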

  • Set edge-cache and browser-cache expirations in Cloudflare under Caching instead of in .htaccess. The cache directives can all be removed.
  • Remove all the WordPress rules in .htaccess.
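For reference, the stock WordPress rewrite block looks like this:

```apache
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
```

With only static files in S3, there is no index.php to route to, so the whole block is dead weight.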

  • Remove the security rules in .htaccess; these can all go.

  • Remove all the 410-gone redirects because “currently Google treats 410s (Gone) the same as 404s (Not found), so it’s immaterial to [Google] whether you return one or the other. (ref)”

  • What is left is dynamic redirects and important headers concerning affiliate links. Plus, I will sometimes give a custom URL in an email link (e.g. ericdraken.com/linkedin/ –> ericdraken.com/) and can detect if the recipient has clicked on it. For this, there has to be some dynamic processing. The remainder of the .htaccess essentially looks like this:
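A hedged sketch of that remainder (the linkedin vanity path is from the example above; the affiliate pattern is a placeholder):

```apache
# Vanity and email-tracking links (illustrative; the real list is longer)
Redirect 301 /linkedin/ https://ericdraken.com/
RedirectMatch 302 ^/go/(.+)$ https://affiliate-network.example/$1
```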

  • For dynamic URL redirects, I will use S3 redirects and explore vanilla JS with Cloudflare Workers. See the next section.

  • For static redirects, S3 redirects work, as well as using regular HTML redirects like so:
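A hedged example of such an HTML redirect page (URLs are placeholders):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <!-- Tell crawlers the real page, then bounce the visitor immediately -->
  <link rel="canonical" href="https://example.com/new-page/">
  <meta http-equiv="refresh" content="0; url=https://example.com/new-page/">
  <title>Page moved</title>
</head>
<body>
  <p><a href="https://example.com/new-page/">This page has moved here.</a></p>
</body>
</html>
```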

Summary

For most static WordPress sites, you can just delete the .htaccess file. For typical URL redirecting needs, there are easy solutions like those above. We no longer need security rules, so those can go. If you have a lot of dynamic affiliate links or cloaked URLs, that is when you need S3 redirect rules, Worker JavaScript redirects, or paid Cloudflare redirects. For static redirects, you can effectively use HTML redirects.

Top ↩


Step 15. Set up a Cloudflare Worker Dev Environment

Say I need something dynamic triggered by a web page. Cloudflare has a Lambda alternative called Workers that runs JavaScript or Rust and has no cold start (there are severe limitations, however).

Cloudflare workers

This is a simpler solution than breaking out IAM, Roles, Route53, cold-start Lambdas, monitoring, metrics, maybe Terraform, yadda-yadda, because Cloudflare takes care of all of that in your account.

FYI, the Worker needs to be uploaded from the console, but the good news is that it can be kept under version control offline. An example of a vanilla JS worker is this:
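Here is a hedged sketch of such a Worker (the routes and targets are placeholders, not my actual rules):

```javascript
// Map old .htaccess 301s to Worker logic; paths are placeholders.
const redirects = new Map([
  ["/linkedin/", "/"],
  ["/old-page/", "/new-page/"],
]);

async function handleRequest(request) {
  const url = new URL(request.url);
  const target = redirects.get(url.pathname);
  if (target) {
    // Permanent redirect, preserving the origin
    return Response.redirect(url.origin + target, 301);
  }
  // Everything else passes through to the origin (S3 behind Cloudflare)
  return fetch(request);
}

// Register the handler when running inside the Workers runtime
if (typeof addEventListener === "function") {
  addEventListener("fetch", (event) => event.respondWith(handleRequest(event.request)));
}
```

A map like this makes it trivial to port a long list of .htaccess 301s.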

Install the Workers CLI

Cloudflare suggests I install Node.

828 MB just to install Node? Are you kidding?

Let’s use a Node Docker container with Wrangler (Cloudflare’s CLI tool) instead. I’ve made a Dockerfile and updated the docker-compose.yaml and .env (I’ll evolve these later on):
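A minimal sketch of that Dockerfile, assuming the wrangler npm package (the working directory is a placeholder):

```dockerfile
# Hedged sketch: Node + Wrangler in a container instead of a local install
FROM node:18-slim
RUN npm install -g wrangler
WORKDIR /worker
# Pass CLOUDFLARE_API_TOKEN in via docker-compose's env_file (.env)
ENTRYPOINT ["wrangler"]
CMD ["--help"]
```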

Why is this so cool? Now I can run docker-compose run --rm wrangler2 and communicate with Cloudflare Workers from the VM. Also, I can test my Workers offline in dev mode. Awesome. First, I’ll need to create a Workers token in the Cloudflare dashboard.

API Token: As of this writing, a beta of Wrangler2 is available, and to use the API token, you must pass CLOUDFLARE_API_TOKEN as an environment variable to the Docker container.

Cloudflare api token

I’ve created a sample Worker and am running it locally:

Gotchas: The Wrangler2 CLI is wonderful, but there are gotchas. Run wrangler dev --ip :: to bind to all interfaces, not just 0.0.0.0 or 127.0.0.1.
Signs of life from a Worker
Wrangler dev is helpful for debugging

Now we can have fun exploring Workers.

With rich request information and a map function, we can recreate those .htaccess rules.

Top ↩


Step 16. Serve Dynamic Content, Formerly PHP Scripts

I have a tracking URL that looks like this:

https://innisfailapartments.com/pixel/
  ?loc=/contact/form
  &ua=Mozilla/4.0 (MSIE 6.0; Windows NT 5.1)
  &rnd=6242006519216

I made a tracking pixel many years ago and I’m not sure it even does anything anymore. But it is supposed to execute a PHP script that sends visitor logs to Twitter (for some reason). Ah, Slack wasn’t mainstream back then, plus I was overseas at the time and casually wanted to gauge visitor interest in that site.

Additionally, the active WordPress theme renders HTML based on the user-agent header (e.g. iPhone, Edge, Chrome, etc.). For instance:

How would I run PHP in a serverless setup?

Some ideas:

  • Rewrite the PHP into TypeScript to run in a Worker.
    • Easiest to implement and test locally. Cloudflare has a transpiler.
  • Host a baby PHP-FastCGI server on Vultr, DigitalOcean, Linode, etc. just to execute PHP.
    • Too much overhead, but maybe needed in rare cases.
  • Run the PHP in AWS Lambda via a custom PHP runtime.
    • A PHP code rewrite will still be needed, and maybe more.
    • We'll then have both a Lambda and a Worker.
  • Use AWS or Cloudflare metrics and decommission such a script.4
    • Easiest to implement: just use Cloudflare metrics and/or GTM metrics.
  • Drop theme elements that inspect headers and switch to Responsive CSS.
    • Wise to implement, anyway.

Fortunately, I can drop the tracking pixel and scrape detection for this simple website. I found it easy to rewrite some PHP scripts into JavaScript.
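As a hedged sketch of such a rewrite (the /pixel/ route is from the URL above; logging to the console stands in for the old Twitter call), the PHP pixel becomes:

```javascript
// Hedged sketch of the tracking pixel rewritten from PHP to JavaScript.
// A well-known 1x1 transparent GIF, base64-encoded.
const PIXEL_B64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7";

function trackingPixel(request) {
  const { searchParams } = new URL(request.url);
  // Formerly tweeted visitor logs; today this could POST to Slack or KV.
  console.log("pixel:", searchParams.get("loc"), searchParams.get("rnd"));
  const gif = Uint8Array.from(atob(PIXEL_B64), (c) => c.charCodeAt(0));
  return new Response(gif, {
    status: 200,
    headers: {
      "content-type": "image/gif",
      "cache-control": "no-store", // never cache, so every view is counted
    },
  });
}
```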

Saving Files: You may be thrown off when you need persistent storage, which was formerly just the server filesystem. Take a look at Durable Objects (a paid option) and Workers KV – a free, slow, eventually-consistent key-value store with values up to 25 MB.

Top ↩


Step 17. Publish a Cloudflare Worker

Worker Quota: We are limited to 100,000 Worker invocations per day across all sites. If you front-load the website with a Worker, and a typical page with images, scripts, styles, icons, etc. requires, say, 25 requests, then you can serve 4,000 views a day. If you have ten sites, then you can only serve 400 views a day per site. If a bot comes and tries to hack your site, that may be hundreds of brute-force requests. Now, if GoogleBot comes along and indexes your site(s), you are so unbelievably pwnd. Do not front-load your website with Workers.
Worker Duration: We are limited to 10ms of CPU time per request on the free plan. If you proxy a large file from S3, you will go over that limit. Do not front-load your website with Workers.

When the 100,000 invocation limit is exhausted, an error will be returned to the web browser. The good news is that you can make multiple routes for a Worker. For example,

https://innisfailapartments.com/pixel/*

can be a route. Let’s let S3 serve 404s and real pages, and the Worker be invoked deliberately. Here is a production Worker that doesn’t do much except to exist:

Given the following wrangler.toml:
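A hedged wrangler.toml sketch scoped to the pixel route (the name, entry point, and date are placeholders):

```toml
name = "pixel-worker"
main = "index.js"
compatibility_date = "2022-04-01"

# Only these routes invoke the Worker; S3 serves everything else.
routes = [
  { pattern = "innisfailapartments.com/pixel/*", zone_name = "innisfailapartments.com" }
]
```

Scoping the routes this way keeps the Worker off regular page loads, so images and assets never burn the daily invocation quota.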

Publishing is then as simple as running wrangler publish from the container.

Additional Workers Topics

Please explore these concepts if you are excited to use Workers for your dynamic needs.

  • Durable Objects (paid feature)
  • Key-Value Storage (free, 25 MB per-value limit, eventual consistency)
  • HTMLRewriter (because DOMParser is unavailable)

Top ↩


Step 18. Prevent Bots from Hammering your Workers

Someone with an ax to grind may come along, write a multi-threaded URL blaster, and hammer your Worker until you reach your limit for the day (in under 2 minutes5). Say they really don’t like you, so they set a cron job to run every day at 12:01. You’re hooped. Why do I even come up with these scenarios? I have experience with AdWords click fraud.

Mitigation: WAF

You could enable WAF Rate Limiting, but there is a cost of about a nickel for every 10,000 rolling good requests – this can add up quickly.

Cloudflare WAF rate limiting at 5c for every 10,000 good requests

Mitigation: Short Cache

You could add a Page Rule to cache responses from Workers for 30 s and ignore query strings for caching. Combine that with a Transform Rule to strip any no-cache headers from the client – this works, but is advanced and has drawbacks. It works because a bad actor can spin up 16 threads to hammer your Worker, but only one request per cache window does any work. The drawback is that the Worker may cache, say, GeoIP info if you are not careful.

Mitigation: Polymorphic Worker Endpoints

If you are a Champion of Cheap, or just like the mental exercise, you can do something wicked for free: Make a Worker create a (cache) item in S3 (not Cloudflare), and make an S3 Policy to redirect to the Worker if said cache object does not exist. After a few or several seconds, a subsequent Worker request can delete that object and replace it with a fresh version for caching again. S3 does all the heavy lifting, and you can cache for as little as 5 seconds. Feel like being cheeky? Make a Worker polymorphically change its endpoint via the Cloudflare API (also update S3) to keep the b@stards from hitting the Worker endpoint directly.

Or do nothing and be kind to everyone. Or just pay Cloudflare. Whichever.

Top ↩


Step 19. Cron Jobs without a Server

How to schedule cron jobs within this simple serverless design? Cloudflare is awesome and it comes to the rescue again:

cloudflare cron triggers

It is straightforward to create a cron job that periodically calls a Worker. That Worker can do anything from rotating logs in S3 to fetching the latest Cloudflare CIDR blocks to running a batch job on visitor IPs to perform GeoIP lookups to then make neat graphs. The sky is the limit. To leave GoDaddy, I’m not in need of a cron job here.
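A hedged sketch of such a cron-triggered Worker (the schedule and the task are illustrative):

```javascript
// wrangler.toml would carry the schedule, e.g.:
//   [triggers]
//   crons = ["0 3 * * *"]   # every day at 03:00 UTC (illustrative)

// The Workers runtime passes the scheduled timestamp in milliseconds.
async function handleScheduled(scheduledTime) {
  const ranAt = new Date(scheduledTime).toISOString();
  // Here you might rotate S3 logs or refresh Cloudflare CIDR lists.
  return { ranAt };
}

// Register the handler when running inside the Workers runtime
if (typeof addEventListener === "function") {
  addEventListener("scheduled", (event) => {
    event.waitUntil(handleScheduled(event.scheduledTime));
  });
}
```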

Top ↩


Step 20. Move Comments and Schedules to a Third-Party

If you allow comments in your WordPress site: Stop.

There are too many prevalent XSS vulnerabilities whereby a bad actor leaves a malicious message like:

I love your site! This is so helpful. A+ <img src="data-uri:base64 1×1-pixel" onload="alert('Add backdoor user.')"></a>

As an admin, you check the comments which are “held for moderation”, but as soon as you do, it’s too late. While logged in as an admin, you’ve just triggered some JavaScript:

You’d never know. The hacker or bot can modify your WordPress site with the same rights as an admin. Trust me.

When you move to S3, you will no longer have comments and interactive WordPress schedules. I suggest something like Disqus for comments or JaneApp for scheduling to keep that interactivity. For my needs, I prefer no comments at all.

Top ↩


Advanced Serverless Options

Really, I just need to get away from GoDaddy. Here are some impressive features that Cloudflare offers… for free. Feel free to let your mind go wild with the possibilities.

  • Serve compressed HTML and assets.
  • Enable HTTP/3 with QUIC for speed.
  • Use OCSP stapling for a TLS speed-up.
  • Early Hints of more URLs for a given HTML page.
  • Add GTM and scripts via Cloudflare.
  • Create firewall rules.
  • Set up a scrape shield.
  • Reject bots from visiting.
  • Set up DDoS mitigation (save money on S3 and Workers).
  • Rewrite some HTML sections with Workers on the fly.
S3 Compression: Did you know that objects stored in S3 are served uncompressed? If a 100 kB file of spaces is uploaded to S3, all 100 kB are served to the visitor. Some people suggest compressing objects offline and uploading them to S3. Clever, but wait: Cloudflare caches responses from S3. Let’s see how Cloudflare helps us with compression:

Cloudflare compresses objects from S3 in their cache

Cloudflare rocks. Brotli compression rocks too.

Top ↩


Advanced Concepts and Gotchas

SEO Juice

Do you want Google to give SEO juice to example.com.s3-...-amazonaws.com or example.com? Add this to your website on each page:
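The standard fix is a canonical link tag on each page, for example:

```html
<!-- In the <head> of every page; href is that page's own canonical URL -->
<link rel="canonical" href="https://example.com/current-page/">
```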

Protect S3

Want to prevent users from visiting example.com.s3-...-amazonaws.com?

Try a combination of these handy header scripts on every page (e.g. in a header.php); mix and match. I use them all.
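One such script, sketched here as a testable function plus a one-line guard (the canonical host is a placeholder):

```javascript
// Decide whether the page was loaded from the raw S3 website endpoint;
// if so, return the canonical URL to bounce the visitor to.
function canonicalUrlFor(hostname, pathname, canonicalHost = "example.com") {
  if (hostname.endsWith(".amazonaws.com")) {
    return "https://" + canonicalHost + pathname;
  }
  return null; // already on the canonical domain
}

// In header.php on every page, run it against the live location:
if (typeof location !== "undefined") {
  const target = canonicalUrlFor(location.hostname, location.pathname);
  if (target) location.replace(target);
}
```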

Rewrite HTML

Want to edit server-side HTML on the fly with Cloudflare’s HTMLRewriter? You can do some pretty cool server-side rewrites and the client would never know.

However, there is a super gotcha: you get 10ms and scant memory to run a Worker. So, forget using DOMParser or regex if you effectively want to modify HTML in a template. You need to be creative if you want to serve a populated HTML template on the fly without using client-side JavaScript (SEO prefers server-side-rendered content).

Hint: If you can lazy-load an image, you can solve this easily and use Cloudflare’s fast, linear-scanning HTMLRewriter. Have fun with this; it would make an interesting interview question.

Serve Third-Party Fonts ‘Locally’

Let’s serve a cached copy of Google Fonts instead of going to their CDN. Why? Each extra domain serving assets requires its own DNS lookup and TLS handshake, which slows down those assets and can block the site from painting. From DevTools, it’s straightforward to find the CSS and the WOFF files and “host” them locally.

Reverse-engineer the CSS and WOFF files for third-party fonts
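The downloaded files can then be declared locally with @font-face rules, for example (the font and paths are placeholders):

```css
/* Self-hosted copy of a Google Font, served from our own domain */
@font-face {
  font-family: "Open Sans";
  font-style: normal;
  font-weight: 400;
  font-display: swap; /* paint text immediately with a fallback font */
  src: url("/fonts/open-sans-v29-latin-regular.woff2") format("woff2");
}
```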

Top ↩


Checklist: Move Sites from GoDaddy

I operate dozens of sites doing all kinds of things, but I really want to get this site off GoDaddy. Let’s review and automate the steps to break up with GoDaddy.

  1. Create or reuse the ByeDaddy VM.
  2. Convenience symlink: ln -s ~/godaddy/homedir/public_html/example_com ~/example.
  3. Confirm a ‘staging’ folder of the site exists.
  4. Create a docker/ folder; copy the docker-compose.yaml, .env, Dockerfile, and php.ini.
  5. Update the .env file with the wp-config.php values.
  6. Create the logs/ folder; add debug.log and apache_errors.log.
  7. Copy the aws/ folder with the policy and cors JSON files.
  8. Update the JSON files with the new site domain name.
  9. Run aws configure if this is the first time.
  10. Create the two S3 buckets: example.com and static.example.com.

  11. Change the buckets to static websites.

  12. Set the public policy of the buckets.

  13. Allow CORS for the buckets.

  14. Stop any other local sites: docker-compose down or ctrl+c.
  15. Block commercial plugins from phoning home:

  16. Start the WordPress staging site: docker-compose up; confirm SQL import succeeded.
  17. Edit /etc/hosts; add 127.0.0.1 example.com and 127.0.0.1 static.example.com.
  18. Visit https://example.com/wp-admin/; verify site looks great.
  19. (Optional) Keep yourself logged in longer for local WordPress dev work.

  20. Edit .htaccess; remove any .htpasswd references.
  21. Disable any Hide My WP plugins or similar.
  22. Remove any production/* contents (not the folder itself!).
  23. Generate static HTML and assets to a newly-created production/ folder (confirm /var/www/production/).
  24. Repair any static HTML generation problems.

    Repair any static HTML generation problems

  25. Address any 404 or 30X issues, and especially any 50X errors.

    Address any HTML error codes on static generation

  26. Generate static HTML and assets, again.
  27. View static production site offline with a custom vhosts.conf.

    Tip: To view the static HTML files, you will still need a vhost. I recommend modifying the WordPress Docker container to serve the production/ folder. Don’t forget about CORS, even offline.

  28. Spot check for broken images or broken pages.
  29. (Optional) Replace strings or URLs throughout the site:

  30. (Optional) Create a Worker; test offline; deploy; set the DNS entry if needed.
  31. (Optional) Create a redirect.json for S3 for URL redirects and/or Workers. Include again the index and error docs:
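A hedged sketch of such a website configuration with routing rules (the keys and the vanity rule are placeholders):

```json
{
  "IndexDocument": { "Suffix": "index.html" },
  "ErrorDocument": { "Key": "404/index.html" },
  "RoutingRules": [
    {
      "Condition": { "KeyPrefixEquals": "linkedin/" },
      "Redirect": {
        "HostName": "example.com",
        "Protocol": "https",
        "ReplaceKeyWith": "",
        "HttpRedirectCode": "301"
      }
    }
  ]
}
```

It can be pushed with aws s3api put-bucket-website --bucket example.com --website-configuration file://aws/website.json.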

  32. (Optional) Push the bucket policy file to all site buckets:

  33. Sync the production folder to (both of) the S3 buckets:

  34. Enroll in the Cloudflare Email Routing service; enable catch-all.
  35. Add or replace MX and TXT records in DNS settings.
  36. Send a test email to catchme@example.com to verify email forwarding works.
  37. Update Cloudflare DNS CNAME records to point to corresponding S3 buckets:
    • CNAME: example.com –> example.com.s3-website-us-west-2.amazonaws.com
    • CNAME: static –> static.example.com.s3-website-us-west-2.amazonaws.com
  38. Clear Cloudflare cache: Purge Everything. Here is a shell script:
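The purge itself is a single authenticated POST to the Cloudflare v4 API; sketched here in JavaScript with placeholder zone ID and token:

```javascript
// Build the "Purge Everything" call for the Cloudflare v4 API.
function buildPurgeCall(zoneId, apiToken) {
  return {
    url: `https://api.cloudflare.com/client/v4/zones/${zoneId}/purge_cache`,
    init: {
      method: "POST",
      headers: {
        authorization: `Bearer ${apiToken}`,
        "content-type": "application/json",
      },
      body: JSON.stringify({ purge_everything: true }),
    },
  };
}

// Usage: const { url, init } = buildPurgeCall(zoneId, token);
//        const res = await fetch(url, init);
```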

  39. Test URL redirection; test affiliate links, etc.
  40. Check for broken outbound links and fix or remove:

    Check for broken outbound links

  41. Back up the running SQL database. Please see my Stack Overflow post for a detailed explanation.

  42. Cancel your GoDaddy hosting account.
  43. Set Cloudflare Edge Cache rules.

    Set Cloudflare Edge Cache rules

Top ↩


The GoDaddy Experience Simulator

Friends, GoDaddy’s cachet from the dot-com boom has long since waned: the founder sold his majority stake in 2011, and the company transformed into a profit grinder running old hardware, cobbled-together third-party portals, and a mess of upselling. GoDaddy is a remnant of a bygone era in a landscape where AWS, Linode, and Vultr (to name a few) do hosting as their bread and butter.

Here is a video of me struggling, in real time, to cancel my hosting account. You can see that GoDaddy’s pages are broken, so I had to investigate and reverse the JavaScript in the console to skip the “chat with an agent first” blocker and jump to the satisfying denouement. GoDaddy, just… ByeDaddy.

Did you experience how broken and painful the GoDaddy backend is?

Top ↩


Results and Summary

Hello from S3. I moved ericdraken.com to S3 + Cloudflare, and you are reading this from the Cloudflare cache with S3 as the origin. I expect to pay about 12 cents a month to host this site. Let's see how fast the site loads:

Ericdraken.com serverless website performance

The waterfall and timings look beautiful, too.

Excellent load speeds with S3 + Cloudflare

Success

When you realize that a WordPress site with a heavy theme and dozens of plugins takes 5–8 seconds to actually load a page under Apache and PHP, a sub-second load time is nearly miraculous.

Success: I broke up with GoDaddy. I no longer pay $18/mo for one shared vCPU competing with hundreds of tenants and 512 MB of RAM on a tired, maxed-out, near-decade-old, EOL'd 6-core Linux server; I am now serverless. My sites have never been faster - amazingly faster.
Success: I do not have to lose my mind with IAM, ACLs, Roles, Route53, regions, AZs, CloudFront, AMIs, RDS, SES, and the weeks of pain it would take to properly set up hosting in AWS, or to set up and test Terraform and fail and repeat. Nor do I have to maintain AWS infrastructure through its migrations, deprecations, new versions, patches, and on.
Success: My emails work. I'm not hackable via WordPress. I have an industry-hardened Web Firewall (WAF) and DDoS protection through Cloudflare. I have a global CDN, naturally. I get GeoIP information for free. And, the best part is that I only pay a few cents a month. Truly a success.

ByeDaddy featured


Notes:

  1. SRE = Site Reliability Engineer
  2. I bet you didn’t think about how serverless buckets send back contents uncompressed which may actually slow down your site.
  3. Shard = split your website up between several repos.
  4. 404 errors from this tracking pixel still show up in logs.
  5. A nasty technique is to not wait for ACK responses before firing the next salvo.