Intro
In this series I am creating a transpiler from Python to Golang called Pogo. In the last post we made the abstract syntax tree, which means we are on our final step on our way of transpiling Python to Go.
The Emitter
To start off, just like the lexer and the parser, we need to make the emitter struct.
type Emitter struct {
}
And that's it! We aren't going to keep any properties on the emitter. Furthermore, the emitter will only be one function!
Direct translation
We are going to make a list of Structure
s that have direct translations from Python to Go.
var directs []int = []int{
// Other
structureCode["IDENTIFIER"],
structureCode["L_PAREN"],
structureCode["R_PAREN"],
structureCode["L_BLOCK"],
structureCode["R_BLOCK"],
structureCode["L_SQUIRLY"],
structureCode["R_SQUIRLY"],
structureCode["SEP"],
structureCode["ASSIGN"],
structureCode["COMMENT_ONE"],
// Keywords
structureCode["K_FOR"],
// Math operands
structureCode["MO_PLUS"],
structureCode["MO_SUB"],
structureCode["MO_MUL"],
structureCode["MO_DIV"],
structureCode["MO_MODULO"],
// Literal
structureCode["L_INT"],
structureCode["L_STRING"],
// Comparison operands
structureCode["CO_EQUALS"],
structureCode["CO_NOT_EQUALS"],
structureCode["CO_GT"],
structureCode["CO_GT_EQUALS"],
structureCode["CO_LT"],
structureCode["CO_LT_EQUALS"],
}
Most single symbols, operators, literals, comments, identifiers, and so forth, stay the same. We will use this when translating.
We will also use a map for Structure
s that don't have direct translations, but have consistent translations.
var translation map[int]string = map[int]string{
// Statements
structureCode["ST_DECLARATION"]: "var",
// Other
structureCode["BLOCK"]: "{",
structureCode["NEWLINE"]: "\n",
structureCode["ANTI_COLON"]: "}",
// Keywords
structureCode["K_IF"]: "\nif",
structureCode["K_ELIF"]: "else if",
structureCode["K_ELSE"]: "else",
structureCode["K_WHILE"]: "\nfor",
// In-built functions
structureCode["IB_PRINT"]: "println",
// Bool operands
structureCode["BO_NOT"]: "!",
structureCode["BO_AND"]: "&&",
structureCode["BO_OR"]: "||",
}
Something important to note is that if
and for
lead with a newline while else
and else if
don't. This is important for formatting in Go. In Go, else
needs to appear next to the }
of the if
, not on a newline, and gofmt
will cry about it if you don't have it there (we will use gofmt
after emitting to make our code look nice). The bool operands also have to get converted, as Python uses the literal words "and" and "or" instead of symbols like everybody else.
Emit!
'''golang
func (e *Emitter) emit(ast Structure) (string, error) {
output := ""
if ast.code == structureCode["ILLEGAL"] {
return output, errors.New("[Emit (emit)] ILLEGAL structure found in final code")
}
// Override for loops
if ast.code == structureCode["ST_FOR"] {
identifier := ast.children[1].text
output += "\n" + ast.children[0].text + " " + identifier + " := " + ast.children[5].text + ";"
output += identifier + " < " + ast.children[7].text + ";"
output += identifier + "++"
temp, err := e.emit(ast.children[10])
if err != nil {
return output, err
}
output += " " + temp
return output, nil
}
if ast.code == structureCode["ST_WHILE"] {
output += "for "
temp, err := e.emit(ast.children[1])
if err != nil {
return output, err
}
output += temp
temp, err = e.emit(ast.children[4])
if err != nil {
return output, err
}
output += temp
return output, nil
}
// Own text
val, exists := translation[ast.code]
if !exists {
found := false
for i := 0; i < len(directs); i++ {
if directs[i] == ast.code {
found = true
output += ast.text
break
}
}
if !found {
// Bools
if ast.code == structureCode["L_BOOL"] {
output += strings.ToLower(ast.text)
}
}
}
output += val
// Children's text
for i := 0; i < len(ast.children); i++ {
temp, err := e.emit(ast.children[i])
if err != nil {
return output, err
}
output += " " + temp
}
return output, nil
}
'''
To explain the code we will start by skipping the overrides on both for
and while
loops.
The first thing we want to look for is whether we have a translation for the Structure
we are looking at. If it does we can simply add it from translation
.
If we don't find it we should start looking whether this Structure
has a direct translation.
If both of those fails we check for the special case of a Boolean, which simply needs to be lowercase, unlike everybody else, Python capitalizes their Booleans.
Children
Every Structure
has children to make a tree, so now we need to emit this Structure
's children. Just like the parser, if we get an error from the children, we bubble the error.
Overrides
For loops and while loops end up looking very different between Python and Go. While loops are simpler, just making sure we replace the word "while" with the word "for," because Go uses for
for both loops, unlike everybody else.
For loops are a bit more complicated, because you have to translate the syntax of for i in range(0, 10)
to for i := 0; i < 10; i++
.
We start by saving the identifier, in this case, i
.
We then have to make the declaration of i := 0;
.
Now the next two parts on the loop are very fixed, which may be annoying, but later we can make it more generalized. The issue with how we have implemented it is that we don't support loops that start at a high number and decrement.
Next is the block of code within the for loop, which we just get the emission from to add to the output.
That's It
We're done! If you've followed along (and have looked at the GitHub to fill in the gaps where there was just too much code) then you should have a working compiler.
The Code is Ugly
There is one more thing I want to fix before I say we have our first completed version, and that is help you format your code correctly. If you have Go installed you should also have gofmt
. Type gofmt --version
to find the version of gofmt
, or to check whether it is installed. gofmt
doesn't support --version
, but you will get an error and if you get an error from gofmt
then you have it installed.
In the error message you probably came across it will tell you about which flags you can use, and we are interested in -s
and -w
.
We should be able to run a command similar to gofmt -w -s test.go
, and this will fix the formatting of our code.
Next
Now before you get excited about adding new features through these blogs I have to explain the plan for this year.
This project was the start of 12 projects to really help me have a good showcase of my understanding of programming, and this project was created to exemplify my (limited) knowledge of how compilers work.
These 12 projects should each take about a month which does mean the end of this project is coming soon, but it isn't here yet so I will be trying my best to add the semantic analyzer and maybe even an optimizer. Another feature that I would love to end up implementing would be classes and functions, even if the solution ends up being crude. This is because so much code contains classes and functions that it would be a major oversight to not have them as part of the compiler.
I thought I would share my future project plans to see if any of it sounds interesting, or if I should change some projects, so here is the list of projects along with what they aim to improve and showcase.
- Python to Golang compiler (Language Creation)
- Wolfenstein-like game (Game Creation) (These projects are focused on learning to create 3D graphics especially)
- 3D racing game (Game Creation)
- Python Interpreter (Language Creation)
- Generic 3D Game Engine (Game Creation)
- 2D AI Shooting Game (AI & Game Creation)
- My Own Compiled Language (Language Creation)
- Server with Websockets (Networking)
- Implement graphics into own language (Game Creation & Language Creation)
- 3D Game on own language (Game Creation & Language Creation)
You may notice that a lot of these projects build off of each other, for instance, I start with 3D graphics in a language I know, then make my own language, and I'm able to put them together and implement 3D graphics into my own language.
Furthermore, you may notice that 2 spots are missing, these 2 spots are dedicated to learning new programming languages.
I do realize that's a lot of different topics and not everybody wants to follow every single one, but hopefully a few of those are interesting to you, if not, do tell me so I can alter them.