Jack's Blog

Removing File Extensions From Pelican Links

I hate having the ending .html's on a website

I spent a decent amount of time on my old website removing the .html extensions from the ends of my page URLs. Initially I used an nginx rule to strip the .html from the URL manually. I got this code block from this Stack Overflow answer and it worked decently.

location / {
        error_page 404 /404.html;
        # Remove html from end of page
        if ($request_uri ~ ^/(.*)\.html$) {
            return 302 /$1;
        }
        try_files $uri $uri.html  $uri/ =404;
}
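To make the redirect rule a bit more concrete, here is a rough Python sketch of what the regex in that block matches. It is only an illustration of the pattern, not how nginx evaluates it internally, and the function name strip_html_extension is made up for this example.

```python
import re

# Same pattern as the nginx `if` above: capture everything between the
# leading slash and a trailing ".html".
HTML_SUFFIX = re.compile(r'^/(.*)\.html$')


def strip_html_extension(request_uri):
    """Return the redirect target for a .html URI, or None if no redirect applies.

    Hypothetical helper for illustration; it mirrors `return 302 /$1;`.
    """
    match = HTML_SUFFIX.match(request_uri)
    if match is None:
        return None
    return '/' + match.group(1)


print(strip_html_extension('/about.html'))      # /about
print(strip_html_extension('/blog/post.html'))  # /blog/post
print(strip_html_extension('/about'))           # None
```

Anything that doesn't end in .html falls through to the try_files line, which is what puts the extension back on when looking for the file on disk.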

I moved over to Caddy in late 2025 and still have a similar block for my main webpage:

jackmcintoshthomson.com {
        root * /srv
        try_files {path}.html
}

This came from an answer by francislavoie on the Caddy Forums. My current setup still allows you to add the .html manually, but I think that's OK for now.

With my new blog

I probably could write some sort of Caddy rule to redirect or strip the file ending manually, but I found it simpler to use a setting from the Pelican docs to create the links without the file ending. The Pelican docs describe settings that let you link to a page with one URL pattern (e.g. ARTICLE_URL) and save the file under another name (e.g. ARTICLE_SAVE_AS). My config looks like this in both pelicanconf.py and publishconf.py.

# Rewrite links so they don't end in .html
# Pelican creates a folder for each post with an index.html inside of it
ARTICLE_URL = '{slug}/'
ARTICLE_SAVE_AS = '{slug}/index.html' # Pelican's default is '{slug}.html'
PAGE_URL = '{slug}/'
PAGE_SAVE_AS = '{slug}/index.html'
# I don't need to generate authors, so just don't save them
AUTHOR_URL = ''
AUTHOR_SAVE_AS = ''
# Categories and tags don't generate separate folders, just link to their html
CATEGORY_URL = 'category/{slug}'
CATEGORY_SAVE_AS = 'category/{slug}.html'
TAG_URL = 'tag/{slug}'
TAG_SAVE_AS = 'tag/{slug}.html'

Now Pelican will output each new article to {slug}/index.html (the slug is just the unique name of the article), but my main index will link to the page as {slug}/. When a URL points at a folder, a web server such as Caddy looks for an index.html inside it by default, so the file name never has to appear in the URL. This gives me much cleaner URLs like https://jackmcintoshthomson.com/blog/removing-file-extensions-from-pelican-links/. I don't love the final forward slash, but it's much nicer than having a .html there.
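As a quick sanity check on how the two settings relate, here is a minimal sketch that fills in the ARTICLE_URL and ARTICLE_SAVE_AS patterns with plain str.format. The expand helper is hypothetical and only models the {slug} placeholder; Pelican itself supports more placeholders and does more work than this.

```python
# The two patterns from the config above.
ARTICLE_URL = '{slug}/'
ARTICLE_SAVE_AS = '{slug}/index.html'


def expand(pattern, slug):
    """Fill a URL or SAVE_AS pattern with a slug (illustrative helper only)."""
    return pattern.format(slug=slug)


slug = 'removing-file-extensions-from-pelican-links'

link = expand(ARTICLE_URL, slug)       # what the index page links to
path = expand(ARTICLE_SAVE_AS, slug)   # where the file lands in the output folder

print(link)  # removing-file-extensions-from-pelican-links/
print(path)  # removing-file-extensions-from-pelican-links/index.html
```

The link and the file path only differ by the trailing index.html, which is the part the web server fills in on its own.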

Misc

The category and tag pages generate as their names directly, e.g. python.html for the Python category. I had to change TAG_URL and CATEGORY_URL to link to their names in the category and tag folders directly. So instead of pointing to category/python/index.html, the link points to category/python and the .html is tried automatically by Caddy.
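Here is a rough model of that lookup in Python, mostly to convince myself that both link styles end up at a real file. This is a simplified sketch and an assumption about the overall behaviour, not Caddy's actual implementation; the resolve function and the site_files set are made up for the example.

```python
def resolve(request_path, site_files):
    """Return the file that would be served for request_path, or None.

    Simplified model: try the path with .html appended, then the path
    as requested, then an index.html inside it.
    """
    path = request_path.strip('/')
    candidates = [
        f'{path}.html',                                  # like try_files {path}.html
        path,                                            # the path exactly as requested
        f'{path}/index.html' if path else 'index.html',  # folder index
    ]
    for candidate in candidates:
        if candidate in site_files:
            return candidate
    return None


# Hypothetical output folder contents, relative to the site root.
site_files = {
    'index.html',
    'category/python.html',
    'tag/pelican.html',
    'removing-file-extensions-from-pelican-links/index.html',
}

print(resolve('/category/python', site_files))
# category/python.html
print(resolve('/removing-file-extensions-from-pelican-links/', site_files))
# removing-file-extensions-from-pelican-links/index.html
print(resolve('/category/python.html', site_files))
# category/python.html
```

The last call shows why typing the .html in manually still works: the plain path is tried after the .html version.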

I also removed the Authors category entirely as there won't be any other authors on this blog.

Things to do going forward

I want to make the category and tag pages much better looking, including showing which category or tag you're looking at.