compiler

Unnamed Compiled Systems Language Project
git clone http://frotz.net/git/compiler.git

why-write-a-compiler.md (4163B)


      1 # Why Write a Compiler?
      2 
      3 I think my primary motivation is simply that I had never done so before.
      4 
      5 Across 30-some years programming, I've written a bunch of assemblers, several
      6 interpreted languages, a number of bytecode runtimes, a linker/preprocessor that
      7 turns Java class files and bytecode into a smaller/lighter bytecode (for the hiptop
      8 family of devices), various disassemblers, and a pile of domain-specific
      9 configuration languages, but never a full top-to-bottom compiled programming
     10 language that starts with source code and ends up with executable machine
     11 instructions of some form or another.
     12 
     13 It feels like one of those things that everybody should do at least once.
     14 
     15 ### Open Source FPGA Toolchains
     16 
     17 Advances in the state of the art of open source FPGA toolchains have made doing
     18 "real" digital design stuff without awful vendor tools possible, which
     19 re-kindled my interest in this space.  It's pretty trivial to build a little
     20 CPU in Verilog.  It's not hard to crank out a little assembler for it.  But if
     21 you're doing anything serious, writing a lot of software in assembly is just
     22 not that fun or efficient, at least not for me.  I'd like to have a little
     23 systems compiler amongst my tools that I could easily target various little
     24 CPUs with.
     25 
     26 ### The Big Compilers are Way Too Big
     27 
     28 GCC and LLVM/clang are open source.  And very powerful.  And frickin' enormous.
     29 
     30 So while I could invest in learning how to make use of, say, LLVM to build my
     31 own frontends and share backends, or write backends for either, both of them
     32 are quite enormous, complex source bases, and I'm not super excited about
     33 dealing with giant piles of other people's software.
     34 
     35 ### Self-Hosting and Embedded Platforms
     36 
     37 While I'm writing the initial version in C, once I have sufficient feature
     38 coverage I hope to mechanically translate the C version and migrate to building
     39 the compiler with itself.  A Self-Hosting Compiler is another one of those
     40 "it'd be fun to do that at least once" sort of things.
     41 
     42 I'd also like to be able to use the compiler not just to build for but run on
     43 small embedded devices, retro computing systems, or experimental small
     44 platforms.  On the scale of single-digit megabytes of memory or less at the low
     45 end.  That definitely rules out GCC and clang.
     46 
     47 ### At Most, I Want "Just Enough" Optimization
     48 
     49 Modern optimizing compilers are amazing things, but they increasingly seem to
     50 be getting too clever for their own good (or at least mine).  For systems
     51 programming, especially, I really don't want the compiler to silently drop code
     52 or massively rearrange it.  I want to be able to rely on the compiler mostly
     53 doing what I tell it and not getting all inventive about "undefined behaviour."
     54 
     55 C and C++ have a bunch of undefined behaviour around integer and unsigned math,
     56 for example, which results in a weird gap between what the underlying machine
     57 does if you ask it and what a modern compiler considers "valid"; if code is "not
     58 valid", the compiler happily pretends it doesn't matter.  In a way I'm looking for a bit
     59 more of the "high level assembly language" that C is often accused of being
     60 while it is increasingly not...
     61 
     62 https://www.yodaiken.com/2021/05/19/undefined-behavior-in-c-is-a-reading-error/
     63 
     64 https://www.yodaiken.com/2021/05/16/c-is-not-a-serious-programming-language/
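
A minimal C sketch of the gap described above. It is not taken from the linked posts; the function names are hypothetical, and it assumes a mainstream two's-complement target with GCC/Clang-style integer conversion. The first function is the kind of code an optimizer is allowed to quietly simplify; the other two say the same thing in ways the language actually defines.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: looks like an overflow test, but `x + 1` on a signed
 * int is undefined behaviour when x == INT_MAX, so an optimizing compiler is
 * allowed to assume that never happens and may fold this to `return 0`. */
int overflows_naive(int x) {
    return x + 1 < x;
}

/* Well-defined version: compare against the limit before doing any math. */
int overflows_checked(int x) {
    return x == INT_MAX;
}

/* Ask for the machine's wraparound explicitly. Unsigned addition wraps by
 * definition; converting the result back to int32_t is implementation-defined
 * before C23 (assumption: GCC/Clang behaviour, which reduces modulo 2^32). */
int32_t wrapping_add(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* Usage examples that stay inside defined behaviour. */
void check_examples(void) {
    assert(overflows_checked(INT_MAX) == 1);
    assert(overflows_checked(0) == 0);
    assert(overflows_naive(41) == 0);   /* no overflow here, so well-defined */
    assert(wrapping_add(INT32_MAX, 1) == INT32_MIN);
    assert(wrapping_add(-1, 1) == 0);
}
```

Calling `overflows_naive(INT_MAX)` is deliberately left out of the examples: what it returns depends on the compiler and optimization level, which is exactly the complaint.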
     65 
     66 ### They Still Haven't Built Me a "Better C"
     67 
     68 Go comes close, but its insistence on green threads, userspace scheduling,
     69 garbage collection, largish libraries, large runtime memory footprint, and
     70 really awkward interworking with native C/ELF ABI code are deal breakers.
     71 
     72 Rust is more pragmatic about interworking with C/C++ native code and native
     73 platform threading (yay), but still suffers from a large standard library that
     74 doesn't seem to layer/subset well, resulting in even small programs being
     75 several megabytes in size.
     76 
     77 I feel like its heart is in the right place with the lifetime tracking stuff,
     78 but it feels rather awkward to code with.  It also suffers from very slow
     79 compilation.
     80 
     81 And so on and so on.
     82 
     83 This project gives me a chance to do my own experimenting, and while I am
     84 skeptical that I shall succeed wildly where all others have failed, at least
     85 I'll have fun trying.