Making a Language
This is a series that teaches you how to make a programming language. It was motivated by a gap between compilers as taught and compilers as seen in production. Production compilers are deeply concerned with being interactive, accommodating IDE usage and incremental compilation. In trying to learn how these work, I encountered a dearth of material. The language we implement here will be modest, but it will focus on what's required for an interactive compiler ready to support an IDE.
Our language will have a healthy reverence for the theory, but won't be partaking. We're happy to rely on the results of theory, but we won't be picking apart how it achieves those results. This series' focus will be on the practical side of things. Runnable code can be found accompanying each section.
Our language's implementation can be divided up into passes: Typechecking, Lowering, etc. Each pass has a section below that is subdivided into features for the language. We can see "Types" (our first pass about type inference) is subdivided into three features: Base, Rows, Items.
Each feature layers new capabilities atop the previous feature. Base is the initial feature, Rows layers support for row types on top of the Base language, Items layers on top of Rows, and so on. Pass/feature combos connect together. Lowering/Base depends on Types/Base, and at the end we can combine all our passes' Base features to create a compiler for our Base language.
You can read each feature set for a pass, or you can read each pass for a feature. Implementation code is available in the accompanying repo.
Types
Types introduces the Abstract Syntax Tree (AST) for our language and starts us off by inferring types for our AST.
Base
Rows
Items
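To give a feel for what "inferring types for our AST" can look like, here is a minimal sketch in Python. It assumes a tiny lambda calculus and unification-based inference. The node names (`Int`, `Var`, `Fun`, `App`), the type names, and the `Infer` class are all hypothetical, chosen for illustration rather than taken from the series' code.

```python
from dataclasses import dataclass

# Hypothetical AST nodes for a tiny lambda calculus.
@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fun:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

# Hypothetical types: integers, unknowns to be solved, and functions.
@dataclass(frozen=True)
class IntTy:
    pass

@dataclass(frozen=True)
class TyVar:
    id: int

@dataclass(frozen=True)
class FunTy:
    arg: object
    ret: object

class Infer:
    def __init__(self):
        self.subst = {}   # solved type variables: id -> type
        self.next_id = 0

    def fresh(self):
        self.next_id += 1
        return TyVar(self.next_id - 1)

    def resolve(self, ty):
        # Replace every solved type variable with its solution.
        if isinstance(ty, TyVar) and ty.id in self.subst:
            return self.resolve(self.subst[ty.id])
        if isinstance(ty, FunTy):
            return FunTy(self.resolve(ty.arg), self.resolve(ty.ret))
        return ty

    def occurs(self, var_id, ty):
        if isinstance(ty, TyVar):
            return ty.id == var_id
        if isinstance(ty, FunTy):
            return self.occurs(var_id, ty.arg) or self.occurs(var_id, ty.ret)
        return False

    def unify(self, a, b):
        a, b = self.resolve(a), self.resolve(b)
        if a == b:
            return
        if isinstance(a, TyVar):
            if self.occurs(a.id, b):
                raise TypeError("infinite type")
            self.subst[a.id] = b
        elif isinstance(b, TyVar):
            self.unify(b, a)
        elif isinstance(a, FunTy) and isinstance(b, FunTy):
            self.unify(a.arg, b.arg)
            self.unify(a.ret, b.ret)
        else:
            raise TypeError(f"cannot unify {a} with {b}")

    def infer(self, expr, env):
        if isinstance(expr, Int):
            return IntTy()
        if isinstance(expr, Var):
            return env[expr.name]
        if isinstance(expr, Fun):
            param_ty = self.fresh()
            body_ty = self.infer(expr.body, {**env, expr.param: param_ty})
            return FunTy(param_ty, body_ty)
        if isinstance(expr, App):
            fun_ty = self.infer(expr.fun, env)
            arg_ty = self.infer(expr.arg, env)
            ret_ty = self.fresh()
            self.unify(fun_ty, FunTy(arg_ty, ret_ty))
            return ret_ty
        raise TypeError(f"unknown expression {expr}")
```

Applying the identity function to an integer, `App(Fun("x", Var("x")), Int(1))`, infers a type variable that `resolve` turns into `IntTy()`, while `App(Int(1), Int(2))` raises a `TypeError` because an integer is not a function.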
Lowering
Lowering turns our typed AST into our Intermediate Representation (IR). This shifts us from the frontend of the compiler to the middle/backend of the compiler.
Base
Rows
Items
Simplify
Simplification optimizes our IR in a type-preserving way. It produces an IR of the same type with better runtime performance. A key ingredient in accomplishing this is inlining.
Base
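As a rough illustration of inlining, here is a sketch in Python that replaces a call to a known function with that function's body. The IR nodes (`Int`, `Var`, `Fun`, `App`) are hypothetical, and the sketch assumes every binder has a unique name so substitution cannot capture variables. It also inlines unconditionally, where a real pass would likely weigh code size first.

```python
from dataclasses import dataclass

# Hypothetical IR nodes; we assume binder names are globally unique.
@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fun:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

def substitute(expr, name, replacement):
    # Replace free occurrences of `name` in `expr` with `replacement`.
    if isinstance(expr, Var):
        return replacement if expr.name == name else expr
    if isinstance(expr, Fun):
        if expr.param == name:  # `name` is shadowed inside this function
            return expr
        return Fun(expr.param, substitute(expr.body, name, replacement))
    if isinstance(expr, App):
        return App(substitute(expr.fun, name, replacement),
                   substitute(expr.arg, name, replacement))
    return expr

def simplify(expr):
    if isinstance(expr, Fun):
        return Fun(expr.param, simplify(expr.body))
    if isinstance(expr, App):
        fun, arg = simplify(expr.fun), simplify(expr.arg)
        if isinstance(fun, Fun):
            # Inline: the call becomes the body with the argument plugged in.
            return simplify(substitute(fun.body, fun.param, arg))
        return App(fun, arg)
    return expr
```

With these definitions, `simplify(App(Fun("x", Var("x")), Int(1)))` returns `Int(1)`: the call is gone, and the result has the same type as the original expression.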
Monomorph
Monomorphization is responsible for removing polymorphism from our language. This is an important step towards code emission, as we don't know how to emit code to execute polymorphism.
Base
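One common way to remove polymorphism is to stamp out a specialized copy of each polymorphic item for every set of type arguments it is used with. The Python sketch below shows that idea on type signatures only. `PolyItem`, `monomorphize`, and the `name<Type>` naming scheme are hypothetical, and a real pass would apply the same substitution to the item's body.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntTy:
    pass

@dataclass(frozen=True)
class TyVar:
    name: str

@dataclass(frozen=True)
class FunTy:
    arg: object
    ret: object

@dataclass(frozen=True)
class PolyItem:
    name: str
    ty_params: tuple  # names of the quantified type variables
    ty: object        # the item's type, mentioning those variables

def subst_ty(ty, mapping):
    # Replace type variables with the concrete types in `mapping`.
    if isinstance(ty, TyVar):
        return mapping.get(ty.name, ty)
    if isinstance(ty, FunTy):
        return FunTy(subst_ty(ty.arg, mapping), subst_ty(ty.ret, mapping))
    return ty

def show(ty):
    if isinstance(ty, IntTy):
        return "Int"
    if isinstance(ty, TyVar):
        return ty.name
    return f"({show(ty.arg)} -> {show(ty.ret)})"

def monomorphize(item, instantiations):
    # One specialized copy per distinct tuple of type arguments.
    mono = {}
    for ty_args in instantiations:
        mapping = dict(zip(item.ty_params, ty_args))
        mono_name = f"{item.name}<{', '.join(show(t) for t in ty_args)}>"
        mono[mono_name] = subst_ty(item.ty, mapping)
    return mono
```

For an identity item of type `a -> a` used at `Int` and at `Int -> Int`, this yields two monomorphic items, `id<Int>` and `id<(Int -> Int)>`, and repeated uses at the same type collapse into one copy.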
ClosureConvert
Closure conversion removes functions from our IR and replaces them with a lower level construct: the closure. Closures make the previously implicit variable capture of functions explicit, exposing it in a form more amenable to compilation.
Base
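To make "explicit variable capture" concrete, here is a small Python sketch. Each function is lifted into a top-level `Code` entry, and in its place we leave a `MakeClosure` that lists the variables it captures. Inside the lifted code, captured variables are read with `EnvGet`. All of these node names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical source IR.
@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Fun:
    param: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

# Hypothetical target constructs.
@dataclass(frozen=True)
class EnvGet:
    index: int  # read slot `index` of the closure's environment

@dataclass(frozen=True)
class MakeClosure:
    code: int        # index into the list of lifted code
    captured: tuple  # expressions producing the captured values

@dataclass(frozen=True)
class Code:
    param: str
    body: object  # refers only to `param` and environment slots

def free_vars(expr):
    if isinstance(expr, Var):
        return {expr.name}
    if isinstance(expr, Fun):
        return free_vars(expr.body) - {expr.param}
    if isinstance(expr, App):
        return free_vars(expr.fun) | free_vars(expr.arg)
    return set()

def convert(expr, env_slots, lifted):
    # `env_slots` maps each captured variable to its environment index.
    if isinstance(expr, Var):
        return EnvGet(env_slots[expr.name]) if expr.name in env_slots else expr
    if isinstance(expr, App):
        return App(convert(expr.fun, env_slots, lifted),
                   convert(expr.arg, env_slots, lifted))
    if isinstance(expr, Fun):
        captured = sorted(free_vars(expr))
        inner_slots = {name: i for i, name in enumerate(captured)}
        lifted.append(Code(expr.param, convert(expr.body, inner_slots, lifted)))
        code_index = len(lifted) - 1
        # Captured values are evaluated in the enclosing scope.
        values = tuple(convert(Var(n), env_slots, lifted) for n in captured)
        return MakeClosure(code_index, values)
    return expr
```

Converting `Fun("x", Fun("y", Var("x")))` with an empty `env_slots` and an empty `lifted` list lifts two pieces of code. The inner one reads `x` through `EnvGet(0)`, and the outer one builds the inner closure with `x` listed as its single captured value.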
Emit
Code emission, or code generation, turns our closure converted items into executable code by targeting WebAssembly (Wasm). This involves flattening our tree-structured IR into a list of Wasm operations and converting our closures into Wasm structs.
Base
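Flattening a tree into a list of operations works well for Wasm because Wasm is a stack machine: we emit the operands first, then the instruction that consumes them. The Python sketch below assumes a hypothetical arithmetic IR (`Int`, `Local`, `Add`) and produces instructions in Wasm text format. It leaves out closures and structs entirely.

```python
from dataclasses import dataclass

# Hypothetical tree-structured IR for 32-bit integer arithmetic.
@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Local:
    index: int

@dataclass(frozen=True)
class Add:
    left: object
    right: object

def emit(expr, out):
    # Post-order walk: operands land on the stack before their operator.
    if isinstance(expr, Int):
        out.append(f"i32.const {expr.value}")
    elif isinstance(expr, Local):
        out.append(f"local.get {expr.index}")
    elif isinstance(expr, Add):
        emit(expr.left, out)
        emit(expr.right, out)
        out.append("i32.add")
    else:
        raise ValueError(f"cannot emit {expr}")
    return out

ops = emit(Add(Local(0), Add(Int(1), Int(2))), [])
# ops == ["local.get 0", "i32.const 1", "i32.const 2", "i32.add", "i32.add"]
```

The nesting of the tree disappears in the output, and the order of the instructions alone encodes which operands belong to which `i32.add`.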
Parsing
Parsing checks that a source file contains valid syntax and constructs a Concrete Syntax Tree (CST) representing the syntax of our language.
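What sets a concrete syntax tree apart from an abstract one is that it keeps every token, whitespace included, so the original source text can be rebuilt from the tree. Here is a minimal Python sketch of that property. It assumes a made-up syntax of atoms and parenthesized lists, which is not the syntax of our language, and the names `Token`, `Node`, `lex`, `parse`, and `text` are hypothetical.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Token:
    kind: str
    text: str

@dataclass(frozen=True)
class Node:
    kind: str
    children: tuple  # Tokens and Nodes, in source order

# Every character matches some alternative, so no input is dropped.
TOKEN_RE = re.compile(
    r"(?P<ws>\s+)|(?P<lparen>\()|(?P<rparen>\))|(?P<atom>[^\s()]+)"
)

def lex(source):
    return [Token(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]

def parse(tokens):
    pos = 0

    def trivia(children):
        # Whitespace is kept in the tree rather than thrown away.
        nonlocal pos
        while pos < len(tokens) and tokens[pos].kind == "ws":
            children.append(tokens[pos])
            pos += 1

    def expr():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok.kind == "atom":
            return Node("Atom", (tok,))
        if tok.kind == "rparen":
            return Node("Error", (tok,))  # stray closing paren
        children = [tok]  # opening paren
        trivia(children)
        while pos < len(tokens) and tokens[pos].kind != "rparen":
            children.append(expr())
            trivia(children)
        if pos < len(tokens):
            children.append(tokens[pos])  # closing paren
            pos += 1
            return Node("List", tuple(children))
        return Node("Error", tuple(children))  # list was never closed

    children = []
    trivia(children)
    while pos < len(tokens):
        children.append(expr())
        trivia(children)
    return Node("Root", tuple(children))

def text(tree):
    # Rebuild the exact source text from the tree.
    if isinstance(tree, Token):
        return tree.text
    return "".join(text(child) for child in tree.children)
```

For any input, `text(parse(lex(source)))` gives back `source` unchanged. Invalid input such as `"(f x"` still produces a tree, with the problem recorded as an `Error` node, which is the kind of behavior an IDE wants while the user is mid-edit.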