Hacking on Swift Compiler (Part 1)
Hacking on Swift Compiler (Part 1)

I have been thinking to hack on the industrial-strength compiler for some time. The last time I worked on a compiler was my third year in the college, where every student has to take a required compiler course and wrote a C compiler with lex and yacc. Ever since my graduation I merely work on my own DSL but not a thorough compiler. With the advance of LLVM and as what Anders Hejlsberg has described in this Channel 9 video, the architecture of the compiler is no longer like the old school of a big pipeline, where the compilation would cost you minutes for a mid size program. The structure of compiler basically would try to make the compiler frontend as re-usable as possible to the IDE code completion part and type checking partially on code snippets.
Since around two months ago I started to work on iOS end again. Which got me a good opportunity to work on Swift compiler which was architected by compiler expert Chris Lattner. My attempt of hacking on compiler is to make Swift to compile to golang. My opinion on the golang is that it has a outstanding runtime however it is so mediocre in the language design. Swift though yet is also not so advanced but at least striking some balance regarding the ergonomic and the speed (without resorting to Rust’s linear logic). Also, it got generics! I put some thought on this and I think it should be feasible since golang as the target language having garbage collection, most of the semantics could be one-to-one translated to the language. And I just need to add some keyword to make it look like libmill: https://github.com/sustrik/libmill , which provides C lib support for coroutines. And dropping weak, unowned semantic since golang doesn’t need that.
Swift was designed to have it’s own intermediate langauge, to provide higher level structure for multi-pass optimization that llvm’s IR cannot provide.
Initially I think I should codegen from SIR but actually the SIR is too low level for golang. The typechecked swift AST should be enough for that. And swiftc already providing that. By typing the following in the console
swiftc -dump-ast hello.swift
You can get the AST dump in the format like s-expression
(source_file
(func_decl "main()" type='() -> ()' access=internal
(parameter_list)
(brace_stmt
(pattern_binding_decl
(pattern_typed type='Int'
(pattern_named type='Int' 'a')
(type_ident
(component id='Int' bind=Swift.(file).Int)))
(call_expr implicit type='Int' location=hello.swift:2:16 range=[hello.swift:2:16 - line:2:16] nothrow
(constructor_ref_call_expr implicit type='(_builtinIntegerLiteral: Int2048) -> Int' location=hello.swift:2:16 range=[hello.swift:2:16 - line:2:16] nothrow
(declref_expr implicit type='Int.Type -> (_builtinIntegerLiteral: Int2048) -> Int' location=hello.swift:2:16 range=[hello.swift:2:16 - line:2:16] decl=Swift.(file).Int.init(_builtinIntegerLiteral:) specialized=no)
(type_expr implicit type='Int.Type' location=hello.swift:2:16 range=[hello.swift:2:16 - line:2:16] typerepr='Int'))
(tuple_expr implicit type='(_builtinIntegerLiteral: Int2048)' location=hello.swift:2:16 range=[hello.swift:2:16 - line:2:16] names=_builtinIntegerLiteral
(integer_literal_expr type='Int2048' location=hello.swift:2:16 range=[hello.swift:2:16 - line:2:16] value=1))))
(var_decl "a" type='Int' access=private let storage_kind=stored)
)))
If you would like to get the type alone, you can use swift-ide-test for that
swift-ide-test -print-types -source-filename=hello.swift
You would get
FuncDecl '''main''' () -> ()
VarDecl '''a''' Int
BinaryExpr:2 '''1 + 1''' Int
DeclRefExpr:2 '''+''' (Int, Int) -> Int
TupleExpr:2 '''1 + 1''' (Int, Int)
CallExpr:2 '''1''' Int
ConstructorRefCallExpr:2 '''1''' (_builtinIntegerLiteral: Int2048) -> Int
TypeExpr:2 '''1''' Int.Type
DeclRefExpr:2 '''1''' Int.Type -> (_builtinIntegerLiteral: Int2048) -> Int
TupleExpr:2 '''1''' (_builtinIntegerLiteral: Int2048)
IntegerLiteralExpr:2 '''1''' Int2048
CallExpr:2 '''1''' Int
ConstructorRefCallExpr:2 '''1''' (_builtinIntegerLiteral: Int2048) -> Int
TypeExpr:2 '''1''' Int.Type
DeclRefExpr:2 '''1''' Int.Type -> (_builtinIntegerLiteral: Int2048) -> Int
TupleExpr:2 '''1''' (_builtinIntegerLiteral: Int2048)
IntegerLiteralExpr:2 '''1''' Int2048
VarDecl '''b''' <<error type>>
BinaryExpr:3 '''a + 2.0''' <<error type>>
OverloadedDeclRefExpr:3 '''+''' <<error type>>
TupleExpr:3 '''a + 2.0''' <<error type>>
DeclRefExpr:3 '''a''' Int
FloatLiteralExpr:3 '''2.0''' <<error type>>
Which should be enough of information for all of the codegen. I just need to write a parser for the AST dump and walk the tree to generate the corresponding golang code. This would be left for the future series to detail my attempt.
In the meantime of digging into the code base, I was also interested into how swift implemented it’s type system and type checker. I presented my learning on Hindley Milner Type system at Haskell.sg before, so I have a basic understanding of how the HM-like system works. However, for industrial strength compiler especially with the bridging to ObjC part, I am really interested into how to architect your type system and make it work.
The swift repository on github is pretty undocumented, I can only find two docs drafting the whole picture but without too much detail. You have to peak into the code to look for detail.
https://github.com/apple/swift/blob/master/docs/TypeChecker.rst
After digging into the code base I found that you can actually turn on the debug mode in Swift REPL when you compile your swift binary with DEBUG flag on. You just simply use your DEBUG version swift to run into REPL mode and type the following meta command
(swift) :constraints debug on
Then it would spit all of the type checking process step by step
(swift) let c = 1
---Initial constraints for the given expression---
(integer_literal_expr type='$T0' location=<REPL Input>:1:9 range=[<REPL Input>:1:9 - line:1:9] value=1)
Score: 0 0 0 0 0 0 0 0 0 0 0 0
Type Variables:
#0 = $T0
#1 = $T1
Active Constraints:
Inactive Constraints:
$T0 conforms to IntegerLiteralConvertible [[locator@0x7ffcc5819e00 [IntegerLiteral@<REPL Input>:1:9]]];
$T0 conv $T1 [[locator@0x7ffcc5819e00 [IntegerLiteral@<REPL Input>:1:9]]];
Active bindings: $T0 := Int
(trying $T0 := Int
Active bindings: $T1 := Int
(trying $T1 := Int
(found solution 0 0 0 0 0 0 0 0 0 0 0 0)
)
)
---Solution---
Fixed score: 0 0 0 0 0 0 0 0 0 0 0 0
Type variables:
$T1 as Int
$T0 as Int
Overload choices:
It even come with the Constraint Graph with the calculated connected components for debugging the solver. Pretty neat.
---Constraint graph---
$T0:
$T1:
$T2:
Constraints:
$T2 conforms to FloatLiteralConvertible [[locator@0x7ffcc4967838 [FloatLiteral@<REPL Input>:1:13]]];
$T2 operator arg conv Int [[locator@0x7ffcc4967ca8 [Binary@<REPL Input>:1:11 -> apply argument -> comparing call argument #1 to parameter #1]]];
$T3:
Constraints:
$T3 conv $T4 [[locator@0x7ffcc4967b80 [Binary@<REPL Input>:1:11]]];
Adjacencies: $T4
$T4:
Constraints:
$T3 conv $T4 [[locator@0x7ffcc4967b80 [Binary@<REPL Input>:1:11]]];
Adjacencies: $T3
The above is the progess so far. I also found that Swift compiler is actually a good source of C++ code base for reading. It adopts pretty modernized C++ syntax, just it is a little pity that the compiler is not reaching to bootstrapping where the compiler is written in its own language.
Flight with Elixir
Flight with Elixir

I kept hearing people saying good comments about Elixir and its most popular web framework: Phoenix, from last year. People pitched the language with the Ruby syntax however much more performant because it is built on BEAM, leveraging its robustness and speedy performance. And due to some personal business I had to travel and took about 5 hours flight and therefore I decided to get my flight time to have a brief overlook of the language. I am using the “Getting Started” documentation from the elixir-lang.org. The first impression is unlike other mainstreet langauge, elixir got a few things right.
- Reference semantic
The example is extracted from the example of Joe Armstrong’s A week of Elixir. The creator of the language apparently understand that the mechanics that the value itself should be immutable and we are just binding a value to a name, if we would like to “modify” it then we should just rebind the name to a different value, but not make the semantic having the side-effect instead.
iex(9)> s = HashSet.new
#HashSet<[]>
iex(10)> s = HashSet.put(s, :element)
#HashSet<[:element]>
- Closure done right
It is common when main street language trying to catch-up with the wagon of functional programming, they did it wrong. One famous example is python’s lambda, which obviously didn’t honor the semantic of closure where the value should be captured and freezed.
a = 1
>>> f = lambda x: x + a
>>> f(5)
6
>>> a = 3
>>> f(5)
8
And Elixir did it right.
iex(11)> a = 5
5
iex(12)> f = fn(x) -> x + a end
#Function<6.50752066/1 in :erl_eval.expr/5>
iex(13)> f.(1)
6
iex(14)> a = 10
10
iex(15)> f.(1)
6
- Macro
I didn’t check the manual to verify if everything in Elixir is an expression or not, but I played with a few of control flow structure and it seems like so.
iex(1)> if nil do
…(1)> “haha”
…(1)> else
…(1)> “world”
…(1)> end
“world”
And the good thing is that I didn’t aware that Elixir actually comes with Macro. The above if/else control flow is actulally just a macro and rewrite it to the following structure.
iex(18)> if false, do: :this, else: :that
:that
- Sigil
Not sure if Sigil is also achieved with Macro, but it is great that with a PCRE like Regex to make the daily scripting handy.
iex(16)> “HELLO” =~ ~r/hello/
false
- Exit is catchable
It should be inherited from BEAM’s feature, most of the languages has exit to abort the program, but with BEAM’s own lightweight process. you can actually use exit as a signal to notify the parent process to do something about it.
iex(17)> try do
…(17)> exit “lalala”
…(17)> catch
…(17)> :exit, _ -> “boom”
…(17)> end
“boom”
Although I still prefer to program in a well-designed staic type language, overall I like Elixir. It looks like Ruby on a distant, but actually more about good inheritance from Erlang under the hood. If its community could catch up the size of Ruby, then it is definitely better than the world full of Ruby on Rails shop.