GOBF

Intro

I have always been fascinated by compilers. I think the idea of representing one language’s structure inside of another language is just downright cool. Furthermore, the ability to analyze another language and optimize it really peaks my interest.

Recently, I worked on a project that would allow you to more precisely manage which processor core a set of Goroutines would run on. The idea was that Goroutines that communicate more than they process independently would benefit from sharing that same processor core’s cache. I called them M-Groups. The implementation called for modifying the runtime package and the Go scheduler. While ripping through the Go internals, I stumbled upon quite a bit of the Go compiler code, including the SSA library. As I ripped into it more, I started to form an itch to make my own compiler of some kind.

Fast forward a few months and I found myself staring at some examples of my favorite simple esoteric language, BF. BF is the perfect language to write a compiler / optimizer for because of its extremely minimal syntax. It only has 8 possible syntax items(called commands), <>+-.,[], but those simple syntax items quickly form complex programs (not in the sense of awesome functionality).

Here is Hello World in BF

++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

GOBF Overview

Introducing GOBF, an interpreter/optimizer/compiler/code_generator for the BF language. It has two primary modes of operation, interpreting/running (as-is) and converting BF to optimized Go code.

Interpreting

Interpreting and running was the first mode of operation that really just served to solidify my understanding of the BF language. It simply sets up a virtual environment for the BF program and acts on each syntax item as they come. There is no optimization here.

Optimizing / Compiling

Next, I wanted to focus on how I would be able to generate super fast binaries from my BF programs. I considered lots of approaches, but the two most prominent ones were to directly assemble a binary from the BF commands and convert BF to equivalent Go code.

Directly assemble an x86 binary from the the BF commands seems like a natural extension, since BF forces the programmer to act somewhat like a state machine. The downside to this approach is that it would lack portability and would loose out on further architectural optimizations that are found in modern compilers.

I ultimately chose the ladder solution, where I convert the BF commands into equivalent Go code. Although it seems like a bulky implementation strategy, the program’s functionality is quite limited and we gain the builtin compiler inlining.

I have designed GOBF to parse BF programs and generate a internal intermediate representation(IR) that resembles a tree. The next major observation that BF contains lots of repetitive actions, like ++++++++, which increments the local accumulator eight times, only to save the value 8. These repetitive actions can easily be smashed into a single operation, which can yield massive speedups. The GOBF optimizer traverses the IR tree and coalesces these repetitive actions into a more descriptive single action.

When the optimizer finishes, GOBF generates Go code from the IR and optionally compiles it.

Benchmarking

I ran some of the typical BF programs you can find on the internet.

BF Program Run Time
mandelbrot.b 8.162s
hanoi.b 5.401s
helloworld.b 0.003s

Running on an Intel® Xeon® CPU E3-1505M v5 @ 2.80GHz

Future Work

I would like to create a reverse code generator that takes a high-level language, such as Go, and generates BF.

GOBF on Github

Related

Next
Previous