A new publishing discipline

Foreword

From a comment on HN in a discussion of md2blog:

It would be nice if as part of the zero config effort, programs (esp. programs that aim to be easy-to-use Web-authoring tools) would focus even more on being as close to zero setup/zero installation as possible, too. To wit:

This program is written in TypeScript, a browser-incompatible dialect of JS that is nonetheless regularly compiled to standard JS, e.g. to transform a given module into a form that can run in the browser—and yet md2blog itself doesn't. You are expected to either download an out-of-band, platform-specific binary or use Deno to run it.

Instead, one should be able to use md2blog and similar programs by using the browser itself to open and run the program, considering it's the universal platform that everyone already has, and it's reasonable to expect that you're going to use it at some point in the pipeline, anyway (e.g. for proofing your work).

Even better quick start instructions would look something like this:

  1. Start with a directory that contains the site sources that you intend to publish.
  2. Save a copy of "md2blog.html" there, too (or anywhere, really).
  3. Open the md2blog program in your browser by double clicking md2blog.html.
  4. Drag and drop the directory with your Markdown sources [...] into the newly opened md2blog tab.

(For good measure, a copy of these quick start instructions would be shown when you open md2blog.html.)

Can it be done? Answer: Yes—this document is proof of it.

This is a general-purpose static site generator. It is adapted from triickl (version 0.11.0) and is made available here in a rudimentary attempt at publishing a triple script as a self-documenting program in a Web-native format.

This document describes the process by which the pages on colbyrussell.com are built.

The contents have been minimally prepared for presentation from the original triickl sources; we lean heavily on the top-down authoring convention emblematic of triple scripts, and out-of-band commentary is scarce. This scarcity is not so much by design as it is a reflection of how little time was available for editing before publication.

User guide: Preparing content for publication

To use this tool, your input needs to follow a certain format. By default:

You'll need to design (or find) your own layouts; this tool does not hardcode any kind of simple fallback to be used in the event that you do not provide layouts of your own.

You can place a file _config.yml at the root to override the defaults. As the comments in the Triickl module indicate, only a drastically simplified YAML-like language is supported; this program does not contain a full-fledged YAML parser. The configuration language is valid YAML, but not every valid YAML document is valid input for this tool. Refer to TriicklConfig.makeDefaults for the list of settings that can be overridden.
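To illustrate what a restricted, YAML-like configuration language can look like in practice, here is a minimal sketch of a parser that accepts only flat "key: value" lines. It is not the parser from the Triickl module; the function names parseSimpleConfig and applyOverrides, and the keys used with them, are hypothetical.

```javascript
// Hypothetical sketch: parse a flat subset of YAML ("key: value" per line).
// This is NOT Triickl's parser; it only shows the kind of restriction involved.
function parseSimpleConfig(text) {
  const result = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank lines, comments, and document markers.
    if (line === "" || line.startsWith("#") || line === "---") continue;
    // Indented lines and list items are valid YAML but rejected here.
    if (/^[\s-]/.test(rawLine)) {
      throw new Error("Nested values and lists are not supported: " + rawLine);
    }
    const idx = line.indexOf(":");
    if (idx <= 0) {
      throw new Error("Unsupported config line: " + rawLine);
    }
    const key = line.slice(0, idx).trim();
    let value = line.slice(idx + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    if (value.length >= 2 &&
        (value[0] === '"' || value[0] === "'") &&
        value[value.length - 1] === value[0]) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}

// Merge user overrides onto a set of defaults (the keys are made up).
function applyOverrides(defaults, overrides) {
  return Object.assign({}, defaults, overrides);
}
```

A parser of this shape accepts a strict subset of YAML, which is one way a tool can stay small without shipping a complete YAML implementation.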

The workflow for using this tool looks a lot like the one described in the foreword. First, you open this tool; then you click the "Open…" button and use the browser's file picker to point at the input directory.

As a concrete example, consider an input directory that contains a single post at _posts/2021-11-30-trying-anpd.markdown:

---
title: Trying a new publishing discipline
layout: noteworthy
---

From now on, I intend to adhere to the following principle: the tooling I use
to publish my writing and other works to the Web will be treated as
first-class content in itself.  I will not store my site sources in a repo
with a README that makes scant (or no) mention of publishing tools that must
be found and understood out-of-band.  Make no mistake; I'm still going to use
version control on the repo where my site sources live, but I'm not going to
leave either my process undocumented or the acquisition of the necessary tools
as a separate exercise.  The tools themselves will constitute part of the
content available on my site.  If I am able to publish something
(anything; take this blog post as an example), then it is because there is
another piece of content somewhere within the collection it belongs to which
specifies in exact detail the process by which preparation for publication
occurs.
    

Based on the YAML frontmatter, this tool expects that the input directory will also contain a file _layouts/noteworthy.html that will be applied to this post. (If no layout were specified, and assuming you haven't specified a different default layout in _config.yml, then this tool would instead apply _layouts/post.html to the post.) The template code in this layout file will determine the final output, i.e. the contents of 2021/11/30/trying-anpd.html. It can make use of any of the scriptable properties, including the post title specified in YAML.
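The mapping from a post's file name to its output path, and the choice of layout, can be sketched as follows. This is an illustration under stated assumptions rather than the code triickl uses: the function names, the acceptance of a ".md" extension, and the config.defaultLayout property are all hypothetical.

```javascript
// Hypothetical sketch: derive an output path from a dated post file name.
// "_posts/2021-11-30-trying-anpd.markdown" -> "2021/11/30/trying-anpd.html"
function outputPathFor(postPath) {
  const name = postPath.split("/").pop();
  // Accepting ".md" as well as ".markdown" is an assumption of this sketch.
  const match = /^(\d{4})-(\d{2})-(\d{2})-(.+)\.(?:markdown|md)$/.exec(name);
  if (match === null) {
    throw new Error("Unrecognized post name: " + postPath);
  }
  const [, year, month, day, slug] = match;
  return `${year}/${month}/${day}/${slug}.html`;
}

// Pick the layout file: the frontmatter wins, then a configured default
// (config.defaultLayout is a made-up name), then "post".
function layoutPathFor(frontmatter, config) {
  const name = frontmatter.layout || config.defaultLayout || "post";
  return `_layouts/${name}.html`;
}
```

With these definitions, outputPathFor("_posts/2021-11-30-trying-anpd.markdown") yields "2021/11/30/trying-anpd.html", and layoutPathFor({ layout: "noteworthy" }, {}) yields "_layouts/noteworthy.html".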

A layout for a post might look like this:

---
layout: default
---
<div class="header">
  <h1 class="title"><<text site.config.name>></h1>
</div>

<h2 class="title"><a href="<<text $in.url>>"><<text $in.title>></a></h2>
<p class="meta">
  by <a href="/">Galileo Madrigal</a>.
  <<text dateFormat($in.date, " yyyy mmmm d")>>.
</p>

<article class="noteworthy post">
  <<html $in.content>>
</article>
    

Note from this example that layouts can themselves rely on other layouts.

Note also the use of the scriptable $in binding, which in template code refers to the content being templated (e.g. the post object in this example). There is a scriptable site object as well.
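A rough sketch of how the text and html template directives and layout chaining might behave is given below. It is a simplification and not triickl's template engine: only dotted property paths are handled (calls such as dateFormat are out of scope), and the shape of the layouts table, along with the idea that a parent layout receives the rendered child as $in.content, are assumptions made for illustration.

```javascript
// Hypothetical sketch of <<text ...>> and <<html ...>> substitution.
// Only dotted property paths are supported; anything else is left as-is.
function escapeText(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Resolve a path like "site.config.name" against the scope object.
function lookup(scope, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    scope
  );
}

function renderTemplate(template, scope) {
  return template.replace(
    /<<(text|html)\s+([$\w.]+)>>/g,
    (whole, kind, path) => {
      const value = lookup(scope, path);
      if (value === undefined) return "";
      // "text" output is escaped; "html" output is inserted verbatim.
      return kind === "text" ? escapeText(value) : String(value);
    }
  );
}

// Layout chaining: render through the named layout, then pass the result
// to its parent layout as $in.content, until no parent remains.
function renderChain(layouts, layoutName, scope) {
  let content = scope.$in.content;
  let name = layoutName;
  while (name) {
    const layout = layouts[name];
    const $in = Object.assign({}, scope.$in, { content });
    content = renderTemplate(layout.body, Object.assign({}, scope, { $in }));
    name = layout.parent;
  }
  return content;
}
```

Escaping in the text directive but not in the html directive mirrors the distinction visible in the sample layout, where the title is inserted as text and the post body as markup.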

A lock file is generated to aid in stable post ordering, e.g. in case you choose to create an archive listing for your posts. The issue this solves arises if you write multiple posts on the same day: post dates are only precise to the day (no timestamps are involved), so chronological ordering alone cannot guarantee a stable order among them.
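One way a lock file can stabilize ordering is to record the order in which posts were first seen and use that record to break ties between posts sharing a date. The sketch below assumes a lock represented as an array of post paths and dates stored as "yyyy-mm-dd" strings; both the format and the function names are assumptions, not a description of triickl's actual lock file.

```javascript
// Hypothetical sketch: a lock (array of post paths) records first-seen order.
// Newly discovered posts are appended; existing entries never move.
function updateLock(lock, posts) {
  const known = new Set(lock);
  const added = posts.map((p) => p.path).filter((path) => !known.has(path));
  return lock.concat(added);
}

// Sort by date first; for posts on the same day, fall back to lock order.
function orderPosts(posts, lock) {
  const full = updateLock(lock, posts);
  const rank = new Map(full.map((path, index) => [path, index]));
  return posts.slice().sort((a, b) => {
    // "yyyy-mm-dd" strings compare correctly as plain strings.
    if (a.date !== b.date) return a.date < b.date ? -1 : 1;
    return rank.get(a.path) - rank.get(b.path);
  });
}
```

Because existing lock entries never move, two posts dated the same day keep their relative order across builds, regardless of the order in which the file system happens to list them.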

Implementation details

The following modules are written in the triple script dialect, presented in the triple script compilation form. Excluded from these "primary" program modules is triickl's shunting block, which is reproduced (and explained in more detail) in Appendix A. In lieu of triickl's standard shunting logic, we include in this adaptation an entry point of our own, which describes (or, to put it another way, determines) how the program's control flow ends up in the static method Triickl.generate.

(Following convention, that entry point is called main. It does not appear as the first piece of code for a few different reasons. Among them is that it is just not essential from the perspective of a reader to understand what goes on in main before getting into the real "heart" of our program—the contents of main are not more important than the high-level, driving factors that compel us to include the machinery of main (or any other lower level procedure) to begin with. This presentation order reflects a preference to work through our program in a logical, top-down order dictated by the global, whole-program concerns that led to the program's creation in the first place. The top-down approach begins here, in the aforementioned heart of our static site generator, with the Triickl class—and its static method generate, in particular.)

Primary program modules

Entry point

As is the case with triple scripts generally, the primary program modules are observant of a principle of delineated power. That is, the program is split between general purpose modules that comprise the program's business logic—and which make no assumptions about any platform bindings available—and then the "system"-level modules that implement an interface targeted by the modules in the business logic layer. Roughly, the modules BrowserSystem and NodeJSSystem (and their dependencies) exist in the "system" layer which can be loosely considered to have special privileges, whereas all other modules lie outside it. The idea with a traditional triple script is that this separation lets the program run in different types of environments, e.g. either in the browser where W3C/WHATWG APIs are available, or from a command-line runtime like NodeJS where browser APIs are not available and the associated runtime-specific APIs for e.g. reading and writing files must be used instead. This leads to a program that is robust and cross-platform.
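The separation described above can be illustrated with a small sketch in which the business logic is handed a "system" object and never touches platform APIs itself. The interface shown here (synchronous read and write methods) and the names InMemorySystem and copyUppercased are hypothetical, chosen for brevity; they are not the interface that BrowserSystem or NodeJSSystem implement, and a real system layer would likely be asynchronous.

```javascript
// Hypothetical sketch of the "system" boundary. Business logic only calls
// read() and write() on whatever system object it is given.
class InMemorySystem {
  constructor(files) {
    this.files = new Map(Object.entries(files));
  }
  read(path) {
    if (!this.files.has(path)) throw new Error("No such file: " + path);
    return this.files.get(path);
  }
  write(path, contents) {
    this.files.set(path, contents);
  }
}

// Platform-neutral logic: it makes no assumption about where files live.
function copyUppercased(system, from, to) {
  const text = system.read(from);
  system.write(to, text.toUpperCase());
}

// Usage: a browser-backed or NodeJS-backed system could be swapped in
// here without touching copyUppercased.
const system = new InMemorySystem({ "in.txt": "hello" });
copyUppercased(system, "in.txt", "out.txt");
```

The point of the design is that only the system object needs to change between environments; everything written against the interface stays the same.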

This document, however, is not a bona fide triple script. Fortunately, the separation discussed makes it very easy to supplement the description of the existing BrowserSystem module so that it can be adapted to work for our purposes. If this document is viewed "live" in a web browser rather than, say, printed out, then we can add some sugar to the experience by affixing a button near the top of the document that allows the reader to run the specification described within this document.

Code for injecting that button and piggy-backing off the behavior of the BrowserSystem module follows.

NB: The preceding program initialization sequence replaces the shunting block found in triickl 0.11.0, from which this tool is adapted. You can refer to the actual text of triickl's shunting block, which is reproduced for completeness in Appendix A.

Appendix A: Shunting

By using a media type as follows, i.e., one that is not recognized by the user agent, the associated block will not be evaluated. This is intentional: the shunting block from triickl 0.11.0 includes an IIFE, and the current document is not a supported environment for it, so evaluating it would cause problems.

Colophon

Document identifier
https://crussell.ichi.city/pager.app.htm
Other versions of this document
https://www.colbyrussell.com/meta/pages.app.htm
Author
Colby Russell <https://colbyrussell.com>
Published
2021 July 29
Revised

As is evident, this is a structured document. The structure is specified as HTML using tags to denote HTML elements, and a styling language called CSS is used to specify rules that use selectors to match elements. The desired styles can be applied to matched elements by specifying the properties which should take effect for each rule.

The information that follows about the structure and style rules used herein should be sufficient to recreate this document in the event that a printed reference of this document is the only aid available. (Refer to the World Wide Web Consortium's HTML and CSS standards for more detailed information about what constitutes a well-formed document as well as the Document Object Model and its APIs which are relied upon in the program itself.)

Document structure

The preceding code block demonstrates the use of a heading element within the document body—at the beginning of the document and which we can target with the h1 selector—and the beginnings of a paragraph element and the elided content of the rest of the document body. (Close tags are denoted by a slash preceding the name of the element.) Subheadings can be specified further still using h2, etc. Viz:

We also use script and style elements to denote code blocks, which should be clear without an example, as we also include rules to adorn a given block with an explicit representation of its type attribute, including the attribute's value. In the preceding example, the text id="appendix-a-shunting" also denotes an attribute—the id attribute and its value. The id attribute is used within this document for linking directly to a given section, using the a element. For example, the code <a href="#appendix-a-shunting">Appendix A</a> in a compatible viewer should mark up the text "Appendix A" at its place in the document as a jump link to the element whose id matches the value of the link's href with the leading octothorpe (#) omitted. A title attribute can also be added to an element, e.g., <abbr title="Compact Disc">CD</abbr> denotes an abbreviation "CD" and associates it with the text "Compact Disc". Alternating dt and dd elements within an outer dl parent element permit a group of items and their descriptions, as in the case of:

Other elements used in this document include code, dfn, em, and p for denoting inline text comprising a snippet of code, a defined term that is not an abbreviation, inline text that should be emphasized, and a paragraph, respectively.

Finally, forcing a symbol to appear in the document without being interpreted by a conformant viewer is known as escaping. This is useful since the less-than and greater-than signs are otherwise meant to denote the boundaries of a tag and its attributes. Escaping them and the ampersand symbol is possible using the sequences &lt;, &gt;, and &amp; (respectively), which are used in every place these characters occur as inline text. There is no need to escape these characters within the code blocks in this document, so long as no sequence can be misinterpreted as the closing tag for the containing block.
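The escaping rule, and the caveat about closing tags inside code blocks, can be sketched in a few lines. The function names below are hypothetical and are not taken from the program described in this document.

```javascript
// Escape the three characters that matter for inline text. The ampersand
// must be replaced first so that the "&" in "&lt;" is not escaped again.
function escapeInlineText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Check whether raw code can sit inside a script or style block without
// a sequence being mistaken for that block's closing tag.
function isSafeToEmbed(code, tagName) {
  return !code.toLowerCase().includes("</" + tagName.toLowerCase());
}
```

For example, escapeInlineText("a < b && c > d") produces "a &lt; b &amp;&amp; c &gt; d", while isSafeToEmbed reports false for any code containing the text of a closing script tag when checked against "script".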

Document styling

We have included in this document style rules that specify the presentation for the main text of the document itself and the minimal application controls, e.g., the button used for loading a directory of site sources that is displayed if this document is opened in a compatible viewer.

The rules are as follows:

In some readers, such as Firefox, monospace fonts are by default configured to render at ~80% of the size of normal body text. The rationale for this decision isn't clear, and the result is annoying all the same. Whether or not it's an apt characterization of the matter, we have to treat this as a deliberate choice by the user, the same as if the user deliberately altered their reader settings. The only true remedy is for users to change their reading settings.

It's certainly possible for us to use a rule like the one below, but by doing so, we'd blow out the text size on readers like Chrome where this problem doesn't exist. This would also have the effect of overriding the wishes of users who have consciously adjusted their reading settings or otherwise made sure that the settings reflect their true preferences. By using such a rule, the benefit (whether real or perceived) would in the best case help readers who haven't made any decisions, or more accurately, those who have made no effort to change the defaults. What's most likely is that more users fall into the latter category (those who've made no conscious effort) than into the former. It's hard, though, to justify throwing the former group under the bus, given that set membership is defined by the deliberateness of their actions. So while it is probably unrealistic to ask for 9 out of 10 readers to verify their reader settings and make corrections as necessary, the casualties of doing otherwise are hard to justify. And as of this writing in 2021, Firefox users do not make up 90% or even anything close to a majority of the world's readers.

If it were so desired, though, this document could easily apply a rule like this:

(Note the use of the text/plain+css media type here; similar to our text/plain+vnd.triplescriptsorg.shunting media type used in Appendix A, this prevents the rule from taking effect, while still allowing it to be directly embedded here. To get it to actually apply, of course, this media type would not be used, and this snippet would need to be between style tags and not in a script element.)

Bibliography

https://triplescripts.org/tools/#triickl
The triickl static site generator
https://triplescripts.org/format
Triple scripts' shunting block explained
https://w3.org/TR/html
The HTML standard for the W3C/WHATWG hypertext system
https://w3.org/TR/css
The CSS standard for the W3C/WHATWG hypertext system

Refer also to past related work on self-documenting programs at <https://colbyrussell.com/LP/debut/>.