Hakyll, Contexts and Metadata

Published on the

I recently switched my website to using the Hakyll static site generator. Hakyll’s written in Haskell, and consequently you have to write your configuration files in Haskell. I’m going to discuss an aspect of Hakyll I found pretty difficult at first, the Context monoid. Note, I’m using Hakyll 4. The Context monoid lets you pass data to a template. It is passed to the loadAndApplyTemplate function.

Hakyll gives you a defaultContext already. It does things like put the content in the body variable and give you access to YAML metadata. There is a caveat with the YAML metadata though. It doesn’t work with lists or nested data.

Because Context is a monoid, it can be combined using <>. a <> b will create a new Context where a is queried first for a variable, and if it’s not found then b is queried. It’s left-biased.
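
As a small illustration of the left bias, here is a sketch in which a constField shadows whatever title defaultContext would otherwise supply. The name postCtx and the title string are made up for the example.

```haskell
import Hakyll

-- Hypothetical context: "title" is answered by the constField on the left,
-- every other variable falls through to defaultContext on the right.
postCtx :: Context String
postCtx = constField "title" "A fixed title" <> defaultContext
```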

Let’s look a bit more at what a Context is:
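
For reference, the definition in Hakyll.Web.Template.Context looks roughly like this in Hakyll 4 (reproduced from memory, so compare it with the version you have installed):

```haskell
-- Library excerpt, not something you define yourself.
newtype Context a = Context
    { unContext :: String -> [String] -> Item a -> Compiler ContextField
    }
```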

In other words, you can think of Context as a function that takes three parameters and gives you a Compiler ContextField. The first parameter is the name of the variable, the second is a list of arguments and the third is the Item that is being compiled. We’ll get back to what an Item and a Compiler ContextField are in a bit.

When you refer to a variable $a$ in a template, it will pass the string "a" as the first parameter. However, you can also pass arguments, as if you’re calling a function e.g. $a("b", "c")$, in which case the second parameter will be the list ["b", "c"].

Let’s get back to the Item data type and Compiler monad. These are at the heart of Hakyll. Throughout Hakyll, data is passed around as values of the Item type. A file starts off as an Item String, which contains the raw file contents as a String. The item also contains the item identifier, but it doesn’t contain the item metadata. To access the metadata, you need to be inside the Compiler monad. Why? Well, the getMetadata function takes an item identifier and returns the metadata, but it returns m Metadata, where m is an instance of MonadMetadata. Guess what’s an instance of MonadMetadata? Compiler.
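
To make that concrete, here is a minimal sketch of a metadata lookup that has to live in Compiler. The helper name authorOf and the "author" key are invented for the example; getMetadataField is the Hakyll function that reads a single key as a String.

```haskell
import Hakyll

-- Read the "author" key (a made-up field name) from an item's metadata.
-- The lookup runs in Compiler because the Item itself only carries its
-- identifier and body, not its metadata.
authorOf :: Item a -> Compiler (Maybe String)
authorOf item = getMetadataField (itemIdentifier item) "author"
```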

The Compiler monad is also central. pandocCompiler is a Compiler (Item String), and loadAndApplyTemplate takes an Item and returns a Compiler (Item String). We want something that produces data that can be used by the template, so we want a Compiler ContextField.

To give a concrete example, here’s a Context that simply returns b for the variable a:
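
A sketch of such a context, built directly with the Context constructor. The names constantCtx and lookupVar are only illustrative.

```haskell
import Control.Applicative (empty)
import Hakyll

-- Answers only the variable "a", ignoring the arguments and the item.
constantCtx :: Context a
constantCtx = Context lookupVar
  where
    lookupVar "a" _ _ = return (StringField "b")
    lookupVar _   _ _ = empty
```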

empty just means “I have nothing to offer for this variable name”. Let’s do a more interesting example. Let’s say you want to embed YouTube videos easily. YouTube videos have an ID, visible at the end of their URL after /watch?v=. Suppose the ID is dQw4w9WgXcQ. Wouldn’t it be cool to type in $youtube("dQw4w9WgXcQ")$ and have the video embedded? Here’s an example of how that could be achieved.
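
One way to sketch it, again using the Context constructor directly. The helper youtubeEmbed and the exact iframe attributes are assumptions for the example, so adjust the markup to taste.

```haskell
import Control.Applicative (empty)
import Hakyll

-- Build the embed markup for a video ID (the iframe attributes are illustrative).
youtubeEmbed :: String -> String
youtubeEmbed videoId =
    "<iframe width=\"560\" height=\"315\" src=\"https://www.youtube.com/embed/"
        ++ videoId ++ "\" allowfullscreen></iframe>"

-- Responds to $youtube("...")$ when called with exactly one argument,
-- and offers nothing for any other variable or argument count.
youtubeCtx :: Context a
youtubeCtx = Context lookupVar
  where
    lookupVar "youtube" [videoId] _ = return (StringField (youtubeEmbed videoId))
    lookupVar _         _         _ = empty
```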

But actually you don’t even have to do that, since Hakyll provides a convenience function functionField, but it’s good to know what’s going on behind the scenes:
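
A sketch of the same idea with functionField. The names youtubeField, render and youtubeEmbed are illustrative, and the error message for a wrong argument count is my own addition.

```haskell
import Hakyll

-- Build the embed markup for a video ID (illustrative markup).
youtubeEmbed :: String -> String
youtubeEmbed videoId =
    "<iframe src=\"https://www.youtube.com/embed/" ++ videoId
        ++ "\" allowfullscreen></iframe>"

-- functionField matches the variable name for us; we only handle the arguments.
youtubeField :: Context a
youtubeField = functionField "youtube" render
  where
    render [videoId] _ = return (youtubeEmbed videoId)
    render args      _ =
        fail ("youtube: expected one argument, got " ++ show (length args))
```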

Note that with functionField, f no longer returns a Compiler ContextField but a Compiler String (which is put into a StringField). Let’s do something slightly more advanced. Let’s take advantage of the fact we’re in a MonadMetadata and actually use the metadata for something. Let’s have a function len, that returns the length of a string in metadata:
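
A minimal sketch of such a len function. It uses getMetadataField, which already hands back a String, rather than matching on the raw metadata Object, so it is simpler than (and may differ from) the version the following paragraphs describe. The names lenField and render are illustrative.

```haskell
import Hakyll

-- $len("title")$ renders the length of the "title" metadata string.
lenField :: Context a
lenField = functionField "len" render
  where
    render [key] item = do
        value <- getMetadataField (itemIdentifier item) key
        case value of
            Just str -> return (show (length str))
            Nothing  -> fail ("len: no metadata field named " ++ key)
    render _ _ = fail "len: expected exactly one argument"
```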

Note that in Hakyll Metadata is an alias for Object which is provided in the aeson package (and re-exported in the yaml package):
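
The aliases look roughly like this (library excerpts quoted from memory; the second one depends on your aeson version, so treat it as an assumption and check the Haddocks):

```haskell
-- Hakyll.Core.Metadata
type Metadata = Object

-- Data.Aeson, before aeson 2.0 (later releases use KeyMap Value instead)
type Object = HashMap Text Value
```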

Another thing to note is the use of fail. This is better than something like error, as it integrates with Hakyll’s logging system and won’t crash your entire program. Fun fact: fail itself comes from the Monad class (MonadFail in recent GHC versions), and Compiler is additionally an instance of MonadError, which comes from the mtl package. Additionally, Value and Object use Text instead of String, which is why I needed to use unpack and pack to convert between them.

One thing that’s used to form defaultContext is metadataField. The issue with this, as I mentioned above, is that it doesn’t take nested properties into account. Let’s create our own version that does. Note I’m going to use the split package, which provides splitOn, so I can go from a.b.c to ["a", "b", "c"]. This context only works when the YAML value is a scalar (i.e. not an array or object).
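
The path-walking part of such a context can be sketched without Hakyll at all. The Yaml type below is a simplified stand-in for aeson's Value, and splitOnDot is a hand-rolled replacement for splitOn "." so that the block needs nothing outside base. Both are assumptions made for the example, not the actual implementation.

```haskell
-- Simplified stand-in for a parsed YAML value (hypothetical type).
data Yaml
    = Scalar String
    | Mapping [(String, Yaml)]
    | Sequence [Yaml]

-- Split a dotted key into its components: "a.b.c" becomes ["a", "b", "c"].
splitOnDot :: String -> [String]
splitOnDot s = case break (== '.') s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitOnDot rest

-- Walk the path through nested mappings; succeed only if it ends at a scalar.
lookupPath :: [String] -> Yaml -> Maybe String
lookupPath []       (Scalar s)   = Just s
lookupPath (k : ks) (Mapping kv) = lookup k kv >>= lookupPath ks
lookupPath _        _            = Nothing

-- Look up a dotted key such as "a.b.c".
lookupDotted :: String -> Yaml -> Maybe String
lookupDotted = lookupPath . splitOnDot
```

In a real context, lookupDotted would run on the result of getMetadata, with a Just wrapped in a StringField and a Nothing turned into empty.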

So far we have only covered StringField, but there is another kind of field called ListField. This can be used with for-loops in templates, which are written as $for(name)$ ... $endfor$.

Let’s look at how a ListField is defined:
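
Roughly, the definition in Hakyll 4 is the following (a library excerpt quoted from memory; I believe newer releases add a further EmptyField constructor, so check your version):

```haskell
-- Library excerpt; the forall needs the ExistentialQuantification extension.
data ContextField
    = StringField String
    | forall a. ListField (Context a) [Item a]
```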

One thing you might not be familiar with is the forall a. This is different from doing data ContextField a. If you did data ContextField a, then a ContextField String would be a different type from a ContextField [String], but here regardless of the type of the value contained, it’s a single type.

So a ListField is formed from a list of items that are iterated over, and a Context a. This context provides the variables inside the loop. To understand this, consider that the most common use of a ListField is when creating something like an archives page, which lists other pages so people can access them. Therefore, the list of items is the list of pages, which can be obtained using loadAll. The Context provided can then load the relevant data from each page’s metadata, to provide the variables in the body of the loop. To make this easy, there is a convenience function listField:
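
Its type signature is worth reading closely: the String is the name of the list variable, the Context a is used for the items inside the loop, the Compiler produces those items, and the result is a context for the enclosing page, whatever its item type b is.

```haskell
-- Type signature as exported by Hakyll.
listField :: String -> Context a -> Compiler [Item a] -> Context b
```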

To use it you might do something like this:
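
A sketch for an archive page, assuming your posts are matched by the pattern "posts/*" and that you want them newest first. The name archiveCtx and the variable name posts are illustrative.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

-- Exposes every item matching "posts/*" as $posts$, newest first.
-- Inside the loop each post is rendered with defaultContext; outside it,
-- the page falls back to defaultContext as usual.
archiveCtx :: Context String
archiveCtx =
    listField "posts" defaultContext (recentFirst =<< loadAll "posts/*")
        <> defaultContext
```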

But suppose you want to do something more complex. For instance suppose you have a list of scripts that you want to insert into every page. Suppose you want to be able to control them from your Haskell configuration. You could store them in a list:
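
For example (the paths are placeholders, not files from a real site):

```haskell
-- Hypothetical script paths; replace them with your site's own.
scripts :: [String]
scripts = ["/js/jquery.js", "/js/site.js"]
```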

You could then use them as a list field like this:
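
A sketch of that list field. To keep the block compilable on its own, the list of scripts and the per-item context are taken as parameters here; in a real site.hs you would more likely refer to top-level definitions. The names scriptsField and the variable name scripts are illustrative.

```haskell
import Hakyll

-- Wraps each string in an Item with makeItem and exposes the result
-- as the list variable $scripts$.
scriptsField :: [String] -> Context String -> Context a
scriptsField scripts scriptCtx =
    listField "scripts" scriptCtx (mapM makeItem scripts)
```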

makeItem just creates an item whose contents are what’s passed to it. Note that these items don’t have any metadata attached to them. We now need to define a scriptCtx that will take an Item String whose itemBody is the script, and provide it under some key.
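
A minimal sketch, assuming the key is called src (any name would do):

```haskell
import Hakyll

-- Exposes the item's body (the script path) as $src$ inside the loop,
-- e.g.  $for(scripts)$<script src="$src$"></script>$endfor$
scriptCtx :: Context String
scriptCtx = field "src" (return . itemBody)
```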

Now we can do:

That’s all well and good, but what if the list wasn’t stored in our Haskell file, but in metadata? For instance:

In this case we would have to write our own custom context:
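
One possible shape for it, sketched with listFieldWith and lookupStringList from Hakyll. The metadata key scripts, the loop variable src and the name metadataScripts are assumptions for the example, and a page without the key simply gets an empty list.

```haskell
import Data.Maybe (fromMaybe)
import Hakyll

-- Exposes a YAML list stored under the (assumed) "scripts" metadata key, e.g.
--
--   scripts:
--     - /js/jquery.js
--     - /js/site.js
--
-- as a list field whose entries are available as $src$ inside the loop.
metadataScripts :: Context a
metadataScripts =
    listFieldWith "scripts" (field "src" (return . itemBody)) $ \item -> do
        metadata <- getMetadata (itemIdentifier item)
        mapM makeItem (fromMaybe [] (lookupStringList "scripts" metadata))
```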

This seems like an awful lot of effort to go through to deal with a list of strings in metadata. However, it’s possible to generalise it so it works for all lists of strings in your metadata. For my website, I have a context that I use instead of metadataField. It has three advantages:

You can check it out here.