Introduction

Hi! My name is Mariko, and you can find me on Twitter as @kosamari.

This is a talk I gave at JSConf Colombia in Medellin, Colombia, about building a compiler and the lessons I learned by making one. Let me know if you have any questions or comments!

How I learned English

I first started learning English in 7th grade. In Japan, at least when I took English class, "This is a pen" was an omnipresent sentence that every English textbook used as the first sentence to learn.

It's useless as an English sentence, but it was a simple way to get started. I was really excited to learn a new language, so at first I even took extracurricular classes to learn English.

As you advance in lessons, things get a lot more complicated. You start reading paragraphs instead of single sentences. Questions asked in exams become more about understanding context and what's happening in the reading than about whether you remembered the meaning of a word.

So when I graduated from high school, English was not my strongest subject. In fact, I was so bad at it that I failed every college entrance exam I took.

English was getting in the way of the things I wanted to do, so I took an ESL class at Temple University in Tokyo, and 6 months later I earned a high enough TOEFL score to enroll in the undergraduate program.

One thing that helped me while I was learning English was visualizing how sentences are structured. I'm a visual learner; I tend to perform well when I can draw what is happening.

This is a note I made while I was learning Rust. It helped me a lot. In fact, I took this learning style from when I learned English.

There are certain rules in English. For example, "a" and "the" are always followed by a noun, a verb follows its subject, a word ending with -ly is usually an adverb, which cannot be attached to a noun, and so on. This is how I used these rules to deconstruct an English sentence into something I could understand.

Of course, English is a very diverse language, and new grammar and words are always being invented, so these rules are often broken. For example, "I'm lovin' it" is not grammatically correct. The verb "love" does not have a continuous form. If you are using "loving", then it has to be an adjective.

To be grammatically correct, you would say "I love it." If JavaScript were a human, it would probably throw an error.

How I learned to make a new language.

Last summer I spent a week writing a small programming language that compiles knitting instructions into a pattern image.

In the process, I had to make a compiler. This was a completely new language I made up, so I needed to transform its code into something the browser could understand in order to draw those patterns.

At first, without even knowing what a compiler does, I was scared. It sounded like a lot of computer science.

But once I started making one, I realized it was built from much simpler concepts, things we all do, and it was very close to how I learned English.

Let's try to be a compiler.

Let's see what a compiler does by learning a new language right now. The language we are going to look at is called "Design By Numbers", by John Maeda. It is a programming language made at the MIT Media Lab in the late 90s. It was designed to introduce computer programming concepts in a visual way.

Here is the very first program in the book, which makes the image on the left.

There are 3 commands in this code: Paper defines the color of the paper, Pen defines the color of the pen, and Line draws a line. In this programming environment, the paper is always 100 x 100, and a Line is defined by the x y coordinates of its start point and the x y coordinates of its end point.

So, which image do you think this code will produce?

It will produce the one on the bottom. Did you get it right? Congratulations! You just experienced being a compiler!

How does a compiler work?

Let's look at what just happened in our heads when we acted as a compiler. A compiler goes through a series of steps to transform code into another form.

Tokenization

The first thing we did was to split the code into individual words.

It's easy to spot what's a word in English because we use spaces to separate them. So it may not even seem like a step, but in a language like Japanese it takes effort to separate words.

While we are separating words, we can also assign a primitive type to each token, like "word" or "number".

Parsing

Once a blob of text is separated into tokens, we go through them and try to find relationships between tokens.

In this case, we group the numbers associated with a command keyword into one command. By doing this, you start seeing the structure of the code.

Transformation

Once you have analyzed the syntax by parsing, you can transform the structure into something suitable for the result.

In this case, we are going to draw a picture, so we transform the structure into step-by-step instructions for a human to draw.

Code Generation

Lastly, we make the compiled result, an image. At this point, the instructions were already written in the previous step, so we just follow them to make the compiled image.

And that's what a compiler does!

Let's make a compiler.

Now that you know what a compiler does, let's make one right now.

I created a demo for this Design By Numbers language. Instead of drawing by hand, this compiler produces SVG code so we can see the result in a browser.

The first step is tokenization. We just split the code on white space, then check whether each piece is a number. If it is, we assign it the type "number"; if not, we assign it the type "word".
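A minimal sketch of this step in JavaScript might look like the following. The function name lexer and the { type, value } token shape are assumptions made for illustration, not necessarily what the demo uses.

```javascript
// Tokenizer sketch: split on white space, then tag each piece.
// The name `lexer` and the token shape are illustrative assumptions.
function lexer(code) {
  return code
    .split(/\s+/)
    .filter((piece) => piece.length > 0) // drop empty pieces from leading/trailing space
    .map((piece) =>
      /^\d+$/.test(piece)
        ? { type: "number", value: Number(piece) }
        : { type: "word", value: piece }
    );
}

console.log(lexer("Paper 100"));
// [ { type: 'word', value: 'Paper' }, { type: 'number', value: 100 } ]
```

This sketch only recognizes non-negative whole numbers; anything else is treated as a word.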

Next we go through the tokens and try to extract sentences. For example, when we find the word "Paper", the next token should be a number for the color.

If the next token is not a number, then we throw an error.

The extracted data is represented in an object called an AST (Abstract Syntax Tree).
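The parsing step could be sketched like this. The node names (Drawing, CallExpression, NumberLiteral), the ARITY table, and the hand-written token list are all assumptions for illustration; the demo's real AST may look different.

```javascript
// Parser sketch: group each command word with the numbers that follow it.
// ARITY (how many numbers each command takes) is an illustrative assumption
// based on the three commands described in the talk.
const ARITY = new Map([["Paper", 1], ["Pen", 1], ["Line", 4]]);

function parser(tokens) {
  const ast = { type: "Drawing", body: [] };
  let i = 0;
  while (i < tokens.length) {
    const token = tokens[i++];
    const arity = token.type === "word" ? ARITY.get(token.value) : undefined;
    if (arity === undefined) {
      throw new Error(`"${token.value}" is not a command. Try Paper, Pen, or Line.`);
    }
    const args = tokens.slice(i, i + arity);
    if (args.length < arity || args.some((arg) => arg.type !== "number")) {
      throw new Error(`${token.value} must be followed by ${arity} number(s).`);
    }
    ast.body.push({
      type: "CallExpression",
      name: token.value,
      arguments: args.map((arg) => ({ type: "NumberLiteral", value: arg.value })),
    });
    i += arity;
  }
  return ast;
}

// Hand-written tokens for the code "Paper 100".
const ast = parser([
  { type: "word", value: "Paper" },
  { type: "number", value: 100 },
]);
console.log(JSON.stringify(ast));
// {"type":"Drawing","body":[{"type":"CallExpression","name":"Paper","arguments":[{"type":"NumberLiteral","value":100}]}]}
```

Note that a plain loop is enough here; no recursion is needed for a language this small.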

Now we know what's happening in this code, but this AST is not really suited for making SVG. There is no "Paper" tag in SVG; you would probably replicate one using a rectangle element. So we transform one AST into another AST that is more SVG friendly.
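Here is a rough sketch of such a transformation. Everything about the shapes is an assumption for illustration: the input AST layout, the { tag, attr, body } output layout, and the color rule (this sketch assumes 0 means white and 100 means black, and it keeps SVG's coordinate orientation as-is).

```javascript
// Transformer sketch: turn a command-oriented AST into an SVG-oriented one.
// Assumption: colors run from 0 (white) to 100 (black).
function gray(value) {
  const percent = 100 - value;
  return `rgb(${percent}%, ${percent}%, ${percent}%)`;
}

function transformer(ast) {
  const svgAst = {
    tag: "svg",
    attr: { width: 100, height: 100, viewBox: "0 0 100 100", xmlns: "http://www.w3.org/2000/svg" },
    body: [],
  };
  let penColor = 100; // assumed default: a black pen

  for (const node of ast.body) {
    const values = node.arguments.map((arg) => arg.value);
    if (node.name === "Paper") {
      // There is no "Paper" tag in SVG, so a full-size rectangle stands in for it.
      svgAst.body.push({
        tag: "rect",
        attr: { x: 0, y: 0, width: 100, height: 100, fill: gray(values[0]) },
      });
    } else if (node.name === "Pen") {
      penColor = values[0]; // remembered and applied to later lines
    } else if (node.name === "Line") {
      const [x1, y1, x2, y2] = values;
      svgAst.body.push({ tag: "line", attr: { x1, y1, x2, y2, stroke: gray(penColor) } });
    }
  }
  return svgAst;
}

// Hand-written AST for the code "Paper 0".
const svgAst = transformer({
  type: "Drawing",
  body: [
    { type: "CallExpression", name: "Paper", arguments: [{ type: "NumberLiteral", value: 0 }] },
  ],
});
console.log(svgAst.body[0].attr.fill); // rgb(100%, 100%, 100%)
```

Notice that Pen produces no element of its own. It only changes how later Line commands are drawn, which is exactly the kind of reshaping this step is for.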

Now we can go ahead and generate SVG based on the new SVG-friendly AST we just made.
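Code generation can be as small as string building. The sketch below assumes an SVG-friendly AST shaped like { tag, attr, body } with a flat list of child elements; that shape and the function names are illustrative assumptions.

```javascript
// Generator sketch: turn an SVG-friendly AST into an SVG string.
function attributes(attr) {
  return Object.entries(attr)
    .map(([name, value]) => `${name}="${value}"`)
    .join(" ");
}

function generator(svgAst) {
  const elements = svgAst.body
    .map((node) => `<${node.tag} ${attributes(node.attr)}></${node.tag}>`)
    .join("");
  return `<svg ${attributes(svgAst.attr)}>${elements}</svg>`;
}

// Hand-written SVG-friendly AST with a single rectangle.
console.log(
  generator({
    tag: "svg",
    attr: { width: 100, height: 100 },
    body: [{ tag: "rect", attr: { x: 0, y: 0, width: 100, height: 100 } }],
  })
);
// <svg width="100" height="100"><rect x="0" y="0" width="100" height="100"></rect></svg>
```

This sketch does not escape attribute values, which is fine for numbers and simple color strings but would need care for arbitrary text.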

Now we have our little compiler. I made a demo site that shows you the result of each step in this compiler. It's available on the web; links will be at the end of the slides.

But wouldn't a compiler require recursion, traversal, visitors, etc.?

Yes, those are all wonderful ways to build a compiler, but that doesn't mean you have to take that approach. If we were to expand this little language into a more complex one, it would be a good idea to use those techniques, but they are not a requirement.

Things you can make with a compiler.

Knowing how to make a compiler is really powerful and fun. Do you use Babel? Babel is a type of compiler. Have you ever wanted to type "función" instead of "function"? You can make a compiler for that.
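As a toy illustration of the "función" idea, here is a tiny source-to-source sketch. The function name compileSpanish is made up for this example, and the approach is deliberately naive: it only swaps the standalone word, so it would also rewrite "función" inside a string or comment.

```javascript
// Toy sketch: replace the standalone word "función" with "function".
// Splitting with a capture group keeps the original white space intact.
function compileSpanish(source) {
  return source
    .split(/(\s+)/)
    .map((piece) => (piece === "función" ? "function" : piece))
    .join("");
}

console.log(compileSpanish("función suma(a, b) { return a + b; }"));
// function suma(a, b) { return a + b; }
```

A more careful version would tokenize and parse first, the same way the Design By Numbers compiler does.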

In fact, there is a person who made fika script to write JavaScript code in Swedish.

There is a programming language called Emojicode for programming in emojis.

It does not even have to be text! Piet is a language for writing programs in color.

There are so many things you can make with a compiler, so I recommend you make one. It's fun!

It's a great way to learn about software development.

Making a compiler is fun, but most importantly, it teaches you a lot about software development. Here are a few things I learned while making a compiler.

It's okay to have unfamiliar things.

Much like our lexical analyzer, you don't need to know everything from the beginning. So if you don't really understand a piece of code or technology, it's okay to just say, "There is a thing, I know that much," and pass it on to the next step. Don't stress about it; you'll get there eventually.

Don't be a jerk with bad error messages.

A parser's role is to follow the rules and check whether things are written according to those rules, so errors happen often. When they do, try to send helpful and welcoming messages. It's easy to say "It doesn't work that way", like an "ILLEGAL Token" error or an "undefined is not a function" message. Instead, try to tell users what should happen as much as you can.

When someone is stuck with a question, instead of saying "yeah, that doesn't work", maybe you can start saying "I would google __ and __ keywords" or "I recommend reading this page in the documentation." You don't need to do the work for them, but you can certainly help them do the work better.

Elm is a programming language that embraces this method. They put "Maybe you want to try this?" in their error messages.
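For our little language, a friendlier error could be sketched like this. The function name, the wording, and the case-insensitive matching are all assumptions for illustration; this is not how Elm or the demo builds its messages.

```javascript
// Sketch: build an error message that says what should happen next.
const COMMANDS = ["Paper", "Pen", "Line"];

function unknownCommandMessage(word) {
  // If the word only differs by letter case, suggest the real command.
  const suggestion = COMMANDS.find(
    (command) => command.toLowerCase() === word.toLowerCase()
  );
  const hint = suggestion
    ? `Maybe you want to try "${suggestion}"?`
    : `The commands I know are ${COMMANDS.join(", ")}.`;
  return `I don't know the command "${word}". ${hint}`;
}

console.log(unknownCommandMessage("paper"));
// I don't know the command "paper". Maybe you want to try "Paper"?
console.log(unknownCommandMessage("Circle"));
// I don't know the command "Circle". The commands I know are Paper, Pen, Line.
```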

Context is everything

Finally, just like our transformer turned one type of AST into another that better fits the final result, there is no one perfect way to do things. Everything is context specific.

So don't just do things because they are popular or because you have done them before; think about the context first. Things that work for one user may be a disaster for another user.

Also, appreciate the work those transformers do. You may know good transformers on your team. The work they put in to bridge the gap may not directly create a program, but it is damn important work in our community.

Conclusions

So hopefully I convinced you how awesome it is to be a compiler. It certainly is not scary to make one.

I wrote a program for our compiler, so if you feel like it, be a compiler and draw it, or you can copy it into the demo site.

Here are links to the projects and samples. If you have any comments or questions, hit me up on Twitter @kosamari.

Thanks!

"How to be a Compiler" - Talk given at JSConf Colombia 2016 by Mariko Kosaka