RoryMurphy.com - Old dog, new tricks

As I prepare for a career shift into a new role, I thought I'd take some time to try out some new tools. Since content management is a topic that I've spent more than my fair share of time thinking about, it seemed only natural to use a blog as the use case for development. Blogging is actually a fairly expansive topic, with a variety of frameworks available, ranging from static site generation with tools like Gatsby or Next.js to full-featured content management systems with WYSIWYG editors, like WordPress or Drupal. Over the course of this series, I'm hoping to cover several of these options, along with the concepts behind implementing a framework of your own.

A minimum viable blog

For this first iteration, we'll be looking at the bare minimum features necessary. To have something recognizable as a blog, a reasonable list of these features might be:

A landing page consisting of:
- The blog title
- A list of blog posts in reverse chronological order, with titles, author and timestamp (pagination in Phase 2)
- (Optional) An image representing the theme of the blog
- (Optional) A header menu bar allowing easy navigation
Each blog post should minimally present:
- A title
- A subtitle
- The name of the author
- A timestamp when the post was published
- Content, supporting formatted text, links and images
- (Optional) Tags highlighting keywords and allowing the discovery of related posts

While there are numerous additional features available in the more comprehensive blogging platforms, for now we want to keep things simple. Looking at these requirements, the data consists primarily of the unstructured post content alongside its corresponding structured metadata. Providing a cohesive editing experience for these two types of data is a core challenge of content management. This data is then used in two ways - once to place the post in a reverse chronological list and once to display it in full. While all of this can be achieved fairly easily by statically generating the pages at deployment time, I decided to use a minimal runtime that allows more features to be added as we progress.

Just Go

For this initial blog implementation, I somewhat arbitrarily chose Go along with the Gin framework, but I will be exploring other languages and some of their unique advantages and disadvantages in other posts. I also made some decisions specifically intended to simplify the build. The first is that the editorial workflow would be managed via Git. This removes the need for the application to handle access control, as that can be deferred to the repository. With this file-based construct for blog posts, one option would be to edit the posts in raw HTML. That, however, feels like a skill set the average blogger need not have, so I chose the simpler Markdown format instead. I surveyed some of the options for transforming Markdown to HTML and found that Goldmark suited my needs nicely. The next simplification was to use an existing layout based on Bootstrap, as I am unfortunately no designer. Start Bootstrap was helpful in this department, and had several professional-looking layouts to choose from. At this point my project structure looks like:

- assets (for all my images)
- posts (the actual content)
- css
    - styles.css (from the layout)
- js
    - scripts.js (from the layout)
- templates
    - base.tmpl
    - index.tmpl
    - post.tmpl
- main.go (mostly empty)

The next decision represents one of the core simplifications of this minimal implementation - how the metadata is stored and associated with the post content. Since this metadata is needed to render the landing page, it must be indexed and available before any post content has been accessed. For now, I am storing that metadata in a JSON file, along with a path to its corresponding content. The structure for this data ended up as:

type BlogPost struct {
    Title       string    `json:"title"`
    Subtitle    string    `json:"subtitle"`
    Slug        string    `json:"slug"`
    Author      string    `json:"author"`
    Timestamp   time.Time `json:"timestamp"`
    Tags        []string  `json:"tags"`
    ContentPath string    `json:"contentPath"`
}
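The post does not show how the JSON file is read, so the following is only a sketch of one way the index could be built at start-up. The function name `LoadIndex`, the choice of a JSON array as the file layout, and the duplicate-slug check are assumptions for illustration; only the `BlogPost` fields come from the struct above. Timestamps are assumed to be RFC 3339 strings, which is what `time.Time` expects when decoding JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"sort"
	"time"
)

// BlogPost mirrors the metadata struct described in the post.
type BlogPost struct {
	Title       string    `json:"title"`
	Subtitle    string    `json:"subtitle"`
	Slug        string    `json:"slug"`
	Author      string    `json:"author"`
	Timestamp   time.Time `json:"timestamp"`
	Tags        []string  `json:"tags"`
	ContentPath string    `json:"contentPath"`
}

// LoadIndex (hypothetical helper) decodes a JSON array of posts, sorts it
// newest-first for the landing page, and builds a slug lookup table.
func LoadIndex(data []byte) ([]BlogPost, map[string]BlogPost, error) {
	var posts []BlogPost
	if err := json.Unmarshal(data, &posts); err != nil {
		return nil, nil, fmt.Errorf("decoding post index: %w", err)
	}

	// Reverse chronological order: later timestamps come first.
	sort.SliceStable(posts, func(i, j int) bool {
		return posts[i].Timestamp.After(posts[j].Timestamp)
	})

	bySlug := make(map[string]BlogPost, len(posts))
	for _, p := range posts {
		if _, dup := bySlug[p.Slug]; dup {
			return nil, nil, fmt.Errorf("duplicate slug %q", p.Slug)
		}
		bySlug[p.Slug] = p
	}
	return posts, bySlug, nil
}

func main() {
	// Sample data for illustration only; a real site would read this from a file.
	sample := []byte(`[
		{"title": "First", "slug": "first", "timestamp": "2023-01-01T09:00:00Z", "contentPath": "posts/first.md"},
		{"title": "Second", "slug": "second", "timestamp": "2023-02-01T09:00:00Z", "contentPath": "posts/second.md"}
	]`)

	posts, bySlug, err := LoadIndex(sample)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range posts {
		fmt.Println(p.Timestamp.Format("2006-01-02"), p.Title)
	}
	fmt.Println("lookup:", bySlug["first"].ContentPath)
}
```

Returning both the sorted slice and the map means the landing page and the post page can each read from the structure that suits them, at the cost of holding the metadata in memory twice.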

With that, we're ready to write two very simple handlers. First, an index handler renders the deserialized BlogPosts as a list. The logic for this is trivial. However, it relies on the posts being indexed on startup and held in memory which, as we will see later, affects some of the hosting decisions.

r.GET("/", func(c *gin.Context) {
	c.HTML(http.StatusOK, "index", gin.H{
		"posts": posts,
	})
})
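The contents of index.tmpl are not shown in the post, so as a rough illustration of what the template side of this handler might do, here is a sketch using the standard library's html/template directly, without Gin. The markup, the `RenderIndex` helper and the date format are all assumptions; the only detail taken from the handler is that the data is passed under the "posts" key.

```go
package main

import (
	"fmt"
	"html/template"
	"os"
	"strings"
	"time"
)

// BlogPost holds only the fields the landing page needs in this sketch.
type BlogPost struct {
	Title     string
	Slug      string
	Author    string
	Timestamp time.Time
}

// indexTmpl is a hypothetical stand-in for index.tmpl.
const indexTmpl = `<ul>
{{range .posts}}<li><a href="/posts/{{.Slug}}">{{.Title}}</a> by {{.Author}}, {{.Timestamp.Format "Jan 2, 2006"}}</li>
{{end}}</ul>
`

// RenderIndex (hypothetical helper) executes the template with the same
// "posts" key that the handler passes via gin.H.
func RenderIndex(posts []BlogPost) (string, error) {
	t, err := template.New("index").Parse(indexTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, map[string]any{"posts": posts}); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := RenderIndex([]BlogPost{
		{Title: "Old dog, new tricks", Slug: "old-dog-new-tricks", Author: "A. Author",
			Timestamp: time.Date(2023, 1, 15, 0, 0, 0, 0, time.UTC)},
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(out)
}
```

Because html/template escapes values by context, a title containing markup is rendered as text rather than interpreted as HTML, which is the desired behaviour for metadata fields.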

The second handler is only slightly more sophisticated: it must first look up the metadata for the post, indexed by its slug, then load the Markdown file holding the post content and render it. This can all be accomplished with the following few lines.

r.GET("/posts/:slug", func(c *gin.Context) {
	// Reject malformed slugs before touching the index or the filesystem.
	slug := c.Param("slug")
	if !IsValidSlug(slug) {
		c.String(http.StatusBadRequest, "Invalid post id.")
		return
	}

	// Look up the post's metadata by slug.
	entry, ok := postsBySlug[slug]
	if !ok {
		c.String(http.StatusNotFound, "Blog entry not found")
		return
	}

	// Load the Markdown source (os.ReadFile supersedes the deprecated ioutil.ReadFile).
	source, err := os.ReadFile(entry.ContentPath)
	if err != nil {
		c.String(http.StatusNotFound, "Blog entry not found")
		return
	}

	// Convert the Markdown to HTML.
	var buf bytes.Buffer
	if err := goldmark.Convert(source, &buf); err != nil {
		c.String(http.StatusInternalServerError, "Error while rendering blog entry")
		return
	}

	// Mark the generated HTML as trusted so the template does not escape it.
	content := template.HTML(buf.String())
	c.HTML(http.StatusOK, "post", gin.H{
		"siteRoot": "/",
		"post":     entry,
		"content":  content,
	})
})
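The handler calls IsValidSlug, which the post does not show. A minimal sketch of what such a check might look like follows, assuming slugs are lowercase alphanumeric words separated by single hyphens; the pattern and the 100-character cap are my assumptions, not the author's actual rules.

```go
package main

import (
	"fmt"
	"regexp"
)

// slugPattern accepts lowercase letters and digits in hyphen-separated groups,
// e.g. "old-dog-new-tricks". Slashes, dots and spaces are rejected, so a slug
// can never be mistaken for a file path.
var slugPattern = regexp.MustCompile(`^[a-z0-9]+(?:-[a-z0-9]+)*$`)

// maxSlugLen is an arbitrary cap chosen for this sketch.
const maxSlugLen = 100

// IsValidSlug reports whether slug matches the assumed slug format.
func IsValidSlug(slug string) bool {
	return len(slug) <= maxSlugLen && slugPattern.MatchString(slug)
}

func main() {
	for _, s := range []string{"old-dog-new-tricks", "../etc/passwd", "Mixed-Case", ""} {
		fmt.Printf("%q valid: %v\n", s, IsValidSlug(s))
	}
}
```

Since the handler only ever reads the path stored in the index rather than building one from the slug, this check is a second line of defence; it mainly lets the handler return a 400 for obviously malformed requests instead of a 404.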

And with that, we now have a basic blog ready to publish. For hosting, I've initially chosen Google Cloud's App Engine, as it provides scalable hosting with minimal setup. However, as mentioned earlier, this does have some implications, since the post metadata is currently held in memory. As the number of posts grows, the overhead of indexing rises. This would not be an issue on static infrastructure, where the cost is paid once at start-up. App Engine, however, scales automatically and starts instances on demand, so for a low-traffic site like this blog, a higher proportion of requests will land on a freshly started instance and encounter that start-up latency. There are several solutions to this issue, to be covered in future posts.