One day, I’m going to do a writeup of the technical restructuring I just did on probably one of my most important projects right now. Today is not that day, because I want to talk about the reasoning and the history of that rewrite instead, on a meta level.

I’m currently building an open source library published to npm to parse and render a file format we’ve designed for Open Color Tools. We’ve built a first prototype using a YAML parser and doing some preprocessing, but the format quickly evolved into something that was essentially incompatible with YAML, so we needed a new solution.

For whatever reason (mostly because I thought it would be fun), we decided to use jison and write a grammar for the format. It was fun in the beginning, but I quickly learned that writing good grammars (and lexers, too) is hard. Not only is it hard; I also now believe, or rather know (and this is the important part), that you need to know a lot about how parsers work, and specifically about the parser type you’re working with, since there are a lot of different ones, to write good lexers and grammars for them.

As a developer without a CS education, I basically ran into a brick wall. And of course I felt, again, like so often in my life, like a fraud. Here I was, burning money I had earned due to weird dynamics on the job market, on something that simply couldn’t be that hard to understand.

If you have ever been in a situation like that, you probably know how the story continues. I chipped away at the brick wall, with some success and a lot of frustration, but giving up and trying another approach didn’t even enter my mind, because that would have been defeat. At the same time, I spent literally weeks on something that should probably have cost me a day or two. Okay, three with real-life testing and getting feedback in.

And feature-wise, I simply stopped looking at some of the more difficult problems, because at that point I was always afraid of having to rejigger the whole grammar from top to bottom to accommodate that one feature.

And again, the only reason for me to keep attacking that brick wall, with a little plastic fork, was my own pride. In hindsight, what a stupid way of wasting my own time and our project’s budget.

Then, one day, while looking at another bug a colleague had found, it started to dawn on me: this is actually not a hard problem. It is not a complicated file format. It has very simple rules and only a handful of different entities. Maybe throwing a full-blown parser at this problem is actually overkill.

I started from scratch, in a different programming language: Ruby, where I feel a lot more at home than in this JavaScript thing. It took me a day to write a sufficiently good parser, and another half a day to implement all of the missing features I had been dreading on the other side.

Two days ago, I started a branch on the “real” project to port the new parser over. I ran into some issues with language differences, and it was also a bit of work to make the new parser play nicely with the object structure I had created before (I didn’t want to completely change the API), but now I have a version that runs all the tests, implements all of the features that were missing before and, tadaa, is about 10-15 times faster.

Plus, the code is reasonably simple and can be fully understood by my teammates, whereas before I was the lonely guardian of the holy grammar, which was, at that point, really only a burden and nothing else.

There are a couple of extra points I need to make here:

I don’t think I’m too dumb to understand bison/yacc style parsers. I just didn’t have the time to fully understand them, which is a pretty big task. Doable, but definitely not something to do when you’re trying to quickly build an MVP. I simply underestimated the amount of knowledge needed to do that right. But having spoken about this with people who actually do have a CS degree and did learn about lexers and parsers in their compiler class, I am not alone, even among them. Not many people need and use parsers in their everyday job, and so that knowledge starts to rust and crumble.

My biggest mistake was not properly thinking the problem through in the beginning. The new “naive” parser, as I like to call it, is stupidly simple. I did have a better understanding of the problem after toying around with jison for a few weeks, so I might have been a bit slower if I had started with the naive implementation. But even if it had taken me 3 or 4 days, that’s a lot better than the probably more like 20 days I spent on the grammar and lexer. So why did I make this mistake? My guess is that using a parser sounded like a clever idea. And parsers have always fascinated me, even though I haven’t done a lot of work with them. Beware of clever ideas, that’s all I can say…

In the beginning, wrestling with the parser felt adventurous. I was learning stuff (and I’m still thankful for that) and I made slow but steady progress. Sometimes I had breakthrough moments where I had the feeling that something had finally clicked, only to be thrown down into the mud the next day trying to fix the next parser bug.

You would think that a seasoned developer like me should be able to see all these things much earlier and correct mistakes more quickly. But that is hard. It takes a level of self-reflection that I sometimes possess, but sometimes it gets lost in the daily grind.

So what did I take away from all this:

The header image is another photo I shot of “Älvborgsbron” in 2012 (the page header on the homepage is a closeup of the bridge). It is one of my favourite bridges, because it’s a clear sign that the ferry is about to arrive and my vacation is about to start. And also, I find it quite beautiful.