Late on a Friday night in early February 2024, I came across Rob Pike’s talk on YouTube, “Lexical Scanning in Go”1. I’d been looking for an excuse to write more Go, so I watched the talk in one sitting. Around that time I’d also been thinking about building a personal markdown notetaking app, so the ideas began to click immediately!
I stayed up into the early hours of the morning, re-watching the talk whilst making notes and trying to understand how I could apply the techniques presented to my own markdown parser. A few weekends of effort later, I had a rough-but-functional implementation that could drive markdown parsing in a notetaking app.
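The central idea from Pike’s talk is a lexer built out of “state functions”: each function scans one kind of token, emits it on a channel, and returns the next state function to run, so the lexer’s control flow lives in ordinary Go code rather than in a switch over an explicit state variable. Here is a minimal sketch of that pattern applied to a toy markdown subset (headings vs. plain text). All names here, such as lexLine and itemHeading, are illustrative and not taken from go-markdown:

```go
package main

import (
	"fmt"
	"strings"
)

type itemType int

const (
	itemText itemType = iota
	itemHeading
	itemEOF
)

// item is one lexed token: its type plus the raw text it covers.
type item struct {
	typ itemType
	val string
}

// stateFn represents a lexer state; it returns the next state, or nil to stop.
type stateFn func(*lexer) stateFn

type lexer struct {
	input string
	start int // start of the current token
	pos   int // current scan position
	items chan item
}

// emit sends the text between start and pos as a token of type t.
func (l *lexer) emit(t itemType) {
	l.items <- item{t, l.input[l.start:l.pos]}
	l.start = l.pos
}

// lex runs the state machine in a goroutine and returns the token channel.
func lex(input string) chan item {
	l := &lexer{input: input, items: make(chan item)}
	go func() {
		for state := lexLine; state != nil; {
			state = state(l)
		}
		close(l.items)
	}()
	return l.items
}

// lexLine inspects the start of the current line and picks the next state.
func lexLine(l *lexer) stateFn {
	if l.pos >= len(l.input) {
		l.emit(itemEOF)
		return nil
	}
	if strings.HasPrefix(l.input[l.pos:], "#") {
		return lexHeading
	}
	return lexText
}

// lexRestOfLine scans to the end of the line, emits t, and skips the newline.
func lexRestOfLine(l *lexer, t itemType) stateFn {
	if i := strings.IndexByte(l.input[l.pos:], '\n'); i >= 0 {
		l.pos += i
		l.emit(t)
		l.pos++ // consume the newline itself
		l.start = l.pos
	} else {
		l.pos = len(l.input)
		l.emit(t)
	}
	return lexLine
}

func lexHeading(l *lexer) stateFn { return lexRestOfLine(l, itemHeading) }
func lexText(l *lexer) stateFn    { return lexRestOfLine(l, itemText) }

func main() {
	for it := range lex("# Title\nsome text") {
		fmt.Printf("%d %q\n", it.typ, it.val)
	}
}
```

Running the lexer concurrently and reading tokens off a channel is what lets a parser consume tokens as they are produced, which is the part of the talk that pairs so naturally with building an AST on top.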
I’ve since made a few improvements, and now go-markdown is capable of tokenising raw markdown text, converting the flat token stream into a hierarchical AST, and serialising both of these structures to JSON, with all of this happening in under 5ms for realistic markdown file sizes.
Unfortunately, it turns out that Markdown is quite complex, with many edge cases, and at the time of writing there are a few bugs that I’ve yet to get around to fixing2. Regardless, it has been a massive source of learning and improvement for me, and I’ve found myself thanking my past self for undertaking this!
I’d encourage the reader to take a look at the source code: it’s incredibly simple once you’ve grasped the main concepts.
Or alternatively, give it a try:
go get github.com/jonlinkens/go-markdown