Beautify Hakyll post URLs: Removing .html extension and timestamp

Standard Hakyll post URLs end in .html and the file name starts with a timestamp. Furthermore, all posts (we’ll call them articles) are saved inside a common directory that we don’t want to show in our URLs because we are using category directories anyway.

On the file system an article path could look like this:

articles/category1/2015-06-30-some-arbitrary-url.md

We want to remove the articles part, the timestamp and the .html extension that the default setup generates. This would produce a URL like this:

category1/some-arbitrary-url/

For that we have to write a new route that rewrites the articles’ FilePath. It uses the Text.Regex module (from the regex-compat package) to match article file names, and it throws an exception if a file name doesn’t match the required format.

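import Hakyll (Routes, customRoute, toFilePath)
import System.FilePath (joinPath, splitPath, takeBaseName, takeDirectory, (</>))
import Text.Regex (Regex, matchRegex, mkRegex)

articleRoute :: Routes
articleRoute = customRoute makeR
    where
        -- Target path: <category>/<name without date>/index.html
        makeR i  = shorten p </> fileName p </> "index.html"
            where p = toFilePath i

        -- Base name without the date prefix; aborts the build on a bad name.
        fileName :: FilePath -> FilePath
        fileName p = case (convertArticleFile . takeBaseName) p of
                         Just np -> np
                         Nothing -> error $ "[ERROR] wrong format: " ++ p
        -- Drop the leading "articles/" directory component.
        shorten    = joinPath . drop 1 . splitPath . takeDirectory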

-- Removes date part from article file name.
convertArticleFile :: String -> Maybe String
convertArticleFile f = fmap last $ matchRegex articleRx f

articleRx :: Regex
articleRx = mkRegex "^([0-9]{4})-([0-9]{2})-([0-9]{2})-(.+)$"
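
To check the path rewriting without running a site build, the same transformation can be sketched as a pure function. This is only an illustration under a few assumptions: it needs nothing but base and filepath, it replaces the regex with a plain pattern match, it assumes POSIX-style path separators, and the names stripDatePrefix and articlePath are hypothetical (they are not part of Hakyll or of the route above).

```haskell
import Data.Char (isDigit)
import System.FilePath (joinPath, splitPath, takeBaseName, takeDirectory, (</>))

-- Hypothetical helper: drop a leading "YYYY-MM-DD-" from a base name.
-- Returns Nothing if the name does not start with such a date prefix
-- or if nothing follows the prefix.
stripDatePrefix :: String -> Maybe String
stripDatePrefix s = case s of
    (y1:y2:y3:y4:'-':m1:m2:'-':d1:d2:'-':rest)
        | all isDigit [y1, y2, y3, y4, m1, m2, d1, d2]
        , not (null rest) -> Just rest
    _ -> Nothing

-- Hypothetical pure counterpart of the route: drop the first directory
-- component, strip the date prefix and append "index.html".
articlePath :: FilePath -> Maybe FilePath
articlePath p = do
    name <- stripDatePrefix (takeBaseName p)
    let dir = joinPath (drop 1 (splitPath (takeDirectory p)))
    pure (dir </> name </> "index.html")
```

Loaded in GHCi, articlePath "articles/category1/2015-06-30-some-arbitrary-url.md" evaluates to Just "category1/some-arbitrary-url/index.html", while a file name without a date prefix gives Nothing instead of raising an error.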

In the match rule inside Hakyll’s Rules monad we use a strict pattern that requires every article to be stored exactly one category directory below articles/.

match "articles/*/*.md" $ do
        route articleRoute
        [...]

To replace all links of the form /bla/index.html with /bla/ we can use the following function written by Yann Esposito:

import Data.List (isInfixOf)
import Hakyll (Compiler, Item, withUrls)
import System.FilePath (splitFileName)

-- Replace local URLs of the form foo/bar/index.html with foo/bar/.
removeIndexHtml :: Item String -> Compiler (Item String)
removeIndexHtml item = return $ fmap (withUrls removeIndexStr) item
    where
        removeIndexStr :: String -> String
        removeIndexStr url = case splitFileName url of
                                (dir, "index.html") | isLocal dir -> dir
                                _                                 -> url
        -- URLs with a scheme (http://, https://, ...) are left untouched.
        isLocal :: String -> Bool
        isLocal uri        = not (isInfixOf "://" uri)
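
To see what this does to individual URLs, the string-level part can be tried on its own. The sketch below lifts removeIndexStr out of the where clause so it can be loaded without Hakyll; the sample URLs and the name examples are made up for illustration.

```haskell
import Data.List (isInfixOf)
import System.FilePath (splitFileName)

-- Same logic as the local helper above, as a top-level function.
removeIndexStr :: String -> String
removeIndexStr url = case splitFileName url of
    (dir, "index.html") | isLocal dir -> dir
    _                                 -> url
  where
    -- A URL containing "://" is treated as external and kept as-is.
    isLocal uri = not ("://" `isInfixOf` uri)

-- Hypothetical sample inputs paired with their rewritten form.
examples :: [(String, String)]
examples =
    [ (u, removeIndexStr u)
    | u <- [ "/bla/index.html"
           , "/bla/page.html"
           , "http://example.com/index.html"
           ]
    ]
```

Here /bla/index.html becomes /bla/, while the other two URLs come back unchanged. One edge case worth knowing: a bare index.html without any directory is rewritten to ./ because splitFileName reports ./ as its directory.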
First published on December 1, 2015