As I’ve previously written, information that is subject to change has no place in a URL. This is because a URL is an agreement to serve a specific piece of content from a predictable location for as long as possible. As Tim Berners-Lee wrote, “cool URIs don’t change.” With this in mind, designing a human-friendly bulletproof URL structure can be challenging.

Tim Berners-Lee recommends that webmasters avoid using the subject (i.e. post title or topic) in the URL. Although a subject makes the URL human readable, the title has the potential to change over time. Keywords within headers and URLs affect search rankings and a user’s perception of credibility, so content authors and search engine optimizers may want to experiment with title and keyword changes, particularly when the content itself changes.

We want a URL structure that is human readable, that can be changed periodically, and that continues to serve the same content even after the URL changes.

Fortunately, we already have an example of a website that achieves just that—Medium. Medium assigns each post an ID which is appended to the post title.

Medium Post ID adbec59c7cf4

As can be seen here, the “title” portion of the URL is inconsequential—only the post ID matters. By appending the ID to the end of the URL, Medium ensures that keywords are closer to the root domain, and are less likely to be truncated within search results or elsewhere. I find this approach particularly elegant, so I attempted to recreate this URL structure within Jekyll.

The first step was to create a static ID that could be used within the URL. Because Jekyll has no persistent database, almost every value within Jekyll is subject to change. My goal here was to select a post attribute that would (almost) never change to use as the static ID. The attribute that I deemed least likely to change was the post date.

The creation date of the document—the date the URI is created—is one thing which will not change. … That is one thing with which it is good to start a URI. If a document is in any way dated, even though it will be of interest for generations, then the date is a good starter.

Cool URIs don’t change, Tim Berners-Lee

The only time I could see the date becoming an issue is if I ever decided to change a post date, or if I ever published two posts with the exact same post date. Both scenarios are unlikely, but if either situation ever does occur it will require special consideration.

Once I had decided to use the date as the static ID, I had to decide how I wanted to display that date within the URL. I could have used any combination of year-month-day, but the less specific I am with the date, the more likely I am to have a collision (e.g. if two posts are published on the same day). I could easily have added hour-minute-second to the date to be more specific, and that probably would work fine, but I didn’t want the date so blatantly displayed within the URL. After all, these are permalinks and I strive to make my content evergreen. I don’t want users thinking my content is outdated because the URL contains an old date.
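To make the precision trade-off concrete, here is a minimal Ruby sketch. The two posts and their dates are hypothetical, and the `collisions` helper is an illustration of the idea rather than anything Jekyll provides.

```ruby
# Two hypothetical posts published on the same day, a few hours apart.
posts = {
  'first-post'  => Time.utc(2017, 3, 14, 9, 30, 0),
  'second-post' => Time.utc(2017, 3, 14, 16, 5, 0)
}

# Returns the groups of post names whose dates are identical
# once formatted with the given strftime pattern.
def collisions(posts, format)
  posts.group_by { |_name, date| date.strftime(format) }
       .values
       .select { |group| group.size > 1 }
       .map { |group| group.map(&:first) }
end

p collisions(posts, '%Y-%m-%d')           # => [["first-post", "second-post"]]
p collisions(posts, '%Y-%m-%dT%H:%M:%S')  # => []
```

With day precision the two posts would share an ID; with second precision they would not.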

Medium’s ID looks like a short hexadecimal string, so I decided to hash the date using MD5 and then truncate the hash to a limited number of characters. To achieve this, I wrote a small Jekyll extension called jekyll-hashpermalink, which adds a :static_id variable that can be used within the Jekyll permalink.

permalink: '/:categories/:title-:static_id'
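The hashing step might look roughly like the following Ruby sketch. This is not the actual jekyll-hashpermalink source: the helper name `static_id`, the 12-character default (chosen to match the length of the Medium ID above), and the ISO 8601 normalization are all assumptions made for illustration.

```ruby
require 'digest'
require 'time'

# Hypothetical helper: derive a short, stable ID from a post's date.
# The default length of 12 is an assumption, not the plugin's documented value.
def static_id(date, length = 12)
  # Normalize to a UTC ISO 8601 string so the same instant always
  # produces the same hash, whatever the local time zone.
  Digest::MD5.hexdigest(date.getutc.iso8601)[0, length]
end

post_date = Time.utc(2017, 3, 14, 9, 30, 0) # arbitrary example date
puts static_id(post_date)     # 12 hexadecimal characters
puts static_id(post_date, 8)  # shorter ID, higher chance of a collision
```

Because the ID depends only on the date, it stays the same when the title changes, and it changes if the date is ever edited.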

Having added this new variable, I was able to append a static ID to my URLs, but this still did not solve the problem. If I changed the post title, the old URL would become invalid. I want a solution where the title can change without invalidating the URL. Effectively, I need the title to be ignored, such that only the static ID is used to serve content.

To achieve this next step, I had to change where Jekyll writes each post during the build by overriding the Jekyll::Document destination method. Instead of using the full URL, which is Jekyll’s default destination, I use just the static_id.
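The effect of that override can be sketched without loading Jekyll at all. The function below is a hypothetical stand-in for the replacement logic, not the plugin’s real code; in Jekyll itself the equivalent would live inside `Jekyll::Document#destination`, and the `_site` directory and ID used here are only example values.

```ruby
# Hypothetical stand-in for the overridden destination logic:
# the output path depends only on the static ID, never on the title.
def destination_for(dest_dir, static_id)
  File.join(dest_dir, static_id, 'index.html')
end

puts destination_for('_site', 'adbec59c7cf4')
# _site/adbec59c7cf4/index.html
```

Renaming a post therefore leaves the generated file exactly where it was.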

Once the files were being written to the correct location, they could be accessed using the static_id as the URL, but they could no longer be accessed with the full URL (including their title). To resolve this problem, I had to configure my server to serve content based on the static ID suffix rather than the full URL.

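# Capture everything after the last hyphen in the requested path (the static ID)
RewriteCond %{REQUEST_URI} -([^-]*)$
# Serve the page stored under that ID, ignoring the title portion
RewriteRule ^ %1/index.html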

Caution: It may be necessary to exclude specific subdirectories (e.g. the assets directory), or to make the pattern match more specific, to prevent unintended URLs from being rewritten.
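To see what the pattern captures, and why the caution matters, the matching can be simulated in Ruby. This only imitates the regular expression, not Apache itself. The example paths are hypothetical, and the stricter pattern assumes IDs are exactly 12 hexadecimal characters.

```ruby
# Same pattern as the RewriteCond: capture everything after the last hyphen.
ID_SUFFIX = /-([^-]*)$/

# Stricter variant (an assumption): only a 12-character hex ID, optional trailing slash.
STRICT_ID_SUFFIX = %r{-([0-9a-f]{12})/?$}

# Returns the path the rule would serve, or nil when the pattern does not match.
def rewrite_target(request_path, pattern = ID_SUFFIX)
  match = pattern.match(request_path)
  match && "#{match[1]}/index.html"
end

p rewrite_target('/blog/old-title-adbec59c7cf4')  # => "adbec59c7cf4/index.html"
p rewrite_target('/blog/new-title-adbec59c7cf4')  # => "adbec59c7cf4/index.html"
p rewrite_target('/assets/site-theme.css')        # => "theme.css/index.html"
p rewrite_target('/assets/site-theme.css', STRICT_ID_SUFFIX)  # => nil
```

Both titles resolve to the same file, but the loose pattern also rewrites an asset whose name happens to contain a hyphen. The stricter pattern leaves it alone.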

Finally, I could successfully access all of my content using the static ID, and I could change the title at will without needing to manually define a 301 redirect after each change.

Duplicate Content

To prevent duplicate content issues, I made sure each page includes a canonical link that points to the URL with the latest title.

<link rel="canonical" href="{{ page.url | replace:'index.html','' | prepend: site.baseurl | prepend: site.url }}">
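For readers less familiar with Liquid, the filter chain above amounts to roughly the following Ruby. The function name and the example site values are hypothetical.

```ruby
# Hypothetical equivalent of the Liquid filters:
# strip "index.html", then prepend the base URL and the site URL.
def canonical_url(page_url, site_url, baseurl = '')
  site_url + baseurl + page_url.gsub('index.html', '')
end

puts canonical_url('/blog/new-title-adbec59c7cf4/index.html', 'https://example.com')
# https://example.com/blog/new-title-adbec59c7cf4/
```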

Always Accurate

I also added a tiny bit of JavaScript that rewrites the URL in the browser, using history.replaceState, so that the URL the visitor sees always matches the current canonical URL, regardless of the URL they used to access the page.

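// Shareable URLs
// Automatically replace the URL with the canonical URL
// Strip query parameters from the URL (the hash fragment is kept)
(function (l) {
  var link = document.querySelector('link[rel=canonical]');
  // Reduce the canonical href to its path; fall back to the current path if there is no canonical link
  var canonicalPathname = (link && link.href.replace(/^.*\/\/[^\/]+/, '')) || l.pathname;
  window.history.replaceState({}, '', canonicalPathname + l.hash);
})(location);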