Building A Static Site With ndb(6), Shell Script, And Love
code.driusan.net is a domain I created to host the experimental version control system I've been writing to see if I could write something that I like using better than git. I thought it would be interesting to explain my design in an informal, blog-like fashion, which I'll do in future posts, but this first post is a diversion to explain the code of this blog itself.
The initial version of my vcs (currently called "pq", which stood for "patch queue") was written on Plan 9 (or more specifically, 9front) and that is the platform this blog is hosted on. I am working on a port to Rust called "dsm" (Dave's Source Manager) so that I can use it on Linux, and the initial version will probably be renamed to match in the future, but for now consider the two terms interchangeable.
If you want to write a simple blog on Plan 9, what are the options? There's werc, but a CGI framework is overkill for a relatively static blog without comments. There are various static site generators written in Go or Python, but it would be nice to have something more Plan 9-native. (Something like that might exist; I didn't look too closely. If you know of any, contact me and I'll add them here.)
The obvious choice for a database for a static site generator on Plan 9 would be ndb(6), which can be queried with ndb/query. A database might look something like this:
site=codeblog src=intro.md dst=intro.html
site=codeblog src=other-post.md dst=other-post.html
# etc
We would take the src, run it through some kind of markdown to html converter, and store it in the dst.
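To make the row format concrete, here is a minimal sketch of reading those rows into attribute/value pairs. It is written in Python purely for illustration (the site itself uses ndb/query and rc), and `parse_tuple` is a hypothetical helper that assumes one entry per line and values without whitespace or quotes, which is true of the rows in this post but not of ndb(6) in general.

```python
def parse_tuple(line):
    """Parse one single-line ndb-style entry into a dict.

    Assumption: values contain no whitespace or quotes, and comments
    occupy a whole line. Real ndb(6) files allow more than this.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    attrs = {}
    for pair in line.split():
        attr, _, value = pair.partition("=")
        attrs[attr] = value
    return attrs


db = """\
site=codeblog src=intro.md dst=intro.html
site=codeblog src=other-post.md dst=other-post.html
# etc
"""

rows = [t for t in map(parse_tuple, db.splitlines()) if t is not None]
for row in rows:
    print(row["src"], "->", row["dst"])
```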
But an HTML page isn't just the content; it also has markup and formatting around the content, such as page headers, footers, and navigation links, generated from some kind of template. We might, then, add a template:
site=codeblog src=intro.md dst=intro.html template=codeblog.tpl
site=codeblog src=other-post.md dst=other-post.html template=codeblog.tpl
# etc
We could even theoretically pass the other attributes from the row tuple to the template while we're building the site and allow templates to use arbitrary metadata that we add to the row.
Now, let's say we want an RSS or Atom syndication feed. We do not have a source (or rather, the source is multiple rows from our database) but we still want it to be defined within our database:
site=codeblog src=???? dst=myfeed.xml template=rssfeed.tpl
site=codeblog src=intro.md dst=intro.html template=codeblog.tpl
site=codeblog src=other-post.md dst=other-post.html template=codeblog.tpl
# etc
It would be useful to have a way to run an arbitrary command if there is no src and capture its output to save to dst, so let's do that:
site=codeblog dst=myfeed.xml rc=mkrss
site=codeblog src=intro.md dst=intro.html template=codeblog.tpl
site=codeblog src=other-post.md dst=other-post.html template=codeblog.tpl
# etc
Now, what are we going to use for our templating language? If we can already run arbitrary commands, why not let anything that reads from stdin and writes to stdout be our templating language? Then we can get rid of the template attribute entirely by setting the rc attribute to our templating engine:
site=codeblog dst=myfeed.xml rc=mkrss
site=codeblog src=intro.md dst=intro.html rc=codeblog.rc
site=codeblog src=other-post.md dst=other-post.html rc=codeblog.rc
# etc
Given our example, we now want a script that, for the site "codeblog" runs:
mkrss > myfeed.xml
codeblog.rc > intro.html < intro.md
codeblog.rc > other-post.html < other-post.md
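The mapping from a row to one of those lines is mechanical: the rc attribute is the command, dst is where stdout goes, and src (if present) is where stdin comes from. A rough sketch of that mapping, again in Python only for illustration, with `command_for` as a made-up name and the rows written out as plain dicts:

```python
def command_for(row):
    """Return the command line implied by one row, without running it."""
    cmd = f"{row['rc']} > {row['dst']}"
    if "src" in row:
        # Rows without a src (like the feed) just run the command.
        cmd += f" < {row['src']}"
    return cmd


rows = [
    {"site": "codeblog", "dst": "myfeed.xml", "rc": "mkrss"},
    {"site": "codeblog", "src": "intro.md", "dst": "intro.html", "rc": "codeblog.rc"},
    {"site": "codeblog", "src": "other-post.md", "dst": "other-post.html", "rc": "codeblog.rc"},
]

for row in rows:
    print(command_for(row))
```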
Looking at our database format, it's very close to the shell's syntax for setting environment variables. Maybe what we really want to run is:
site=codeblog dst=myfeed.xml rc=mkrss mkrss > myfeed.xml
site=codeblog src=intro.md dst=intro.html rc=codeblog.rc codeblog.rc > intro.html < intro.md
site=codeblog src=other-post.md dst=other-post.html rc=codeblog.rc codeblog.rc > other-post.html < other-post.md
Then any rc script used as a template to transform src to dst can also use the rest of the tuple's variables as normal variables and all we need to do is set the environment.
So how do we set the environment variables? eval the tuple? Parse it and write to /env? Wait, do we really need to do anything?
In the Plan 9 world, it's a fairly common idiom to have a command print the commands it expects you to run to standard out. You can then either run the command to see what it would do, or run the command piped to rc to perform the action (for instance, kill works this way). If we follow this idiom, since the syntax of ndb(6) is already so close to the syntax of rc(1), all we need to do is:
- Print the database tuple (maybe with some minor escaping)
- Print the command, including the src/dst redirections
This would also make it easier to debug if things go wrong, or allow us to copy and paste only the last line if we want to draft something and see how it looks without affecting the rest of the site.
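As a sketch of what "print the tuple, then print the command" could look like, here is an illustrative Python version (not the rc code that builds this site). The "minor escaping" is assumed to be rc's single-quote rule, where a word is wrapped in single quotes and any embedded quote is doubled; the set of characters treated as safe is deliberately conservative, and `rc_quote`, `render`, and the `title` attribute are names invented for this example.

```python
import re

# Words made only of these characters are assumed safe to leave unquoted.
_SAFE = re.compile(r"[A-Za-z0-9._/+-]+")


def rc_quote(word):
    """Quote a word rc-style: wrap in single quotes, double embedded quotes."""
    if _SAFE.fullmatch(word):
        return word
    return "'" + word.replace("'", "''") + "'"


def render(row):
    """Print-ready line: the row as variable assignments, then the command."""
    env = " ".join(f"{attr}={rc_quote(value)}" for attr, value in row.items())
    cmd = f"{rc_quote(row['rc'])} > {rc_quote(row['dst'])}"
    if "src" in row:
        cmd += f" < {rc_quote(row['src'])}"
    return f"{env} {cmd}"


row = {
    "site": "codeblog",
    "src": "intro.md",
    "dst": "intro.html",
    "rc": "codeblog.rc",
    "title": "Dave's intro",  # hypothetical extra metadata
}
print(render(row))
```

Because the result is only text, nothing happens until it is piped to a shell, which is what makes it possible to inspect or run a single line by hand.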
One last optimization would be to have some metadata (such as the directories that paths are relative to) in a tuple without any dst. Since ndb/query returns the first row by default, we can conventionally make this the first row. With that tuple being used for metadata about the site, we can also add a default rc to avoid redundancy:
site=codeblog srcpath=/lib/something dstpath=/usr/web/something rc=codeblog.rc
site=codeblog dst=myfeed.xml rc=mkrss
site=codeblog src=intro.md dst=intro.html
site=codeblog src=other-post.md dst=other-post.html
# etc
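One way to read that layout is that every later row inherits whatever it doesn't set itself from the first row. The sketch below shows that reading in Python, for illustration only; `resolve` is a hypothetical helper, and joining srcpath and dstpath onto src and dst with a slash is my assumption about what "relative to" means here, not a description of the actual scripts.

```python
def resolve(meta, row):
    """Fill in site-wide defaults and make src/dst absolute."""
    merged = {**meta, **row}  # attributes on the row override the defaults
    merged["dst"] = f"{meta['dstpath']}/{row['dst']}"
    if "src" in row:
        merged["src"] = f"{meta['srcpath']}/{row['src']}"
    return merged


# First row: site metadata and the default rc. It has no dst of its own.
meta = {
    "site": "codeblog",
    "srcpath": "/lib/something",
    "dstpath": "/usr/web/something",
    "rc": "codeblog.rc",
}
pages = [
    {"site": "codeblog", "dst": "myfeed.xml", "rc": "mkrss"},
    {"site": "codeblog", "src": "intro.md", "dst": "intro.html"},
]

for page in pages:
    full = resolve(meta, page)
    print(full["rc"], full.get("src", "(no src)"), full["dst"])
```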
And that is how this static page was generated with a combination of ndb(6), rc shell scripting, and love. The source for the code used to generate this page based on this idea is hosted as a dsm / pq repository here. (If you don't have pq installed, you can click any "Snapshot" link on the commit history to download that commit as a tarball.)