compiler

Unnamed Compiled Systems Language Project
git clone http://frotz.net/git/compiler.git

commit cb0b30dedee55d74eeae6f6e684b6b9ded30d75f
parent 67fac0cdab5ee935a5507f00587fd7fca50996de
Author: Brian Swetland <swetland@frotz.net>
Date:   Sun,  8 Mar 2020 21:49:06 -0700

readme, docs, rename

Diffstat:
M Makefile | 8 ++++----
R docs/tlc.bnf -> docs/bnf.txt | 0
A docs/why-write-a-compiler.md | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
A readme.md | 25 +++++++++++++++++++++++++
M runtest.sh | 2 +-
R src/tlc.c -> src/compiler.c | 0
6 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/Makefile b/Makefile
@@ -1,5 +1,5 @@
 
-all: bin/tlc bin/fs bin/r5d bin/r5e bin/mkinstab out/test/summary.txt
+all: bin/compiler bin/fs bin/r5d bin/r5e bin/mkinstab out/test/summary.txt
 
 clean:
 	rm -rf bin out
@@ -7,9 +7,9 @@ clean:
 CFLAGS := -Wall -O2 -g
 CC := gcc
 
-bin/tlc: src/tlc.c src/risc5dis.c out/risc5ins.h
+bin/compiler: src/compiler.c src/risc5dis.c out/risc5ins.h
 	@mkdir -p bin
-	$(CC) -o $@ $(CFLAGS) src/tlc.c src/risc5dis.c
+	$(CC) -o $@ $(CFLAGS) src/compiler.c src/risc5dis.c
 
 bin/fs: src/fs.c src/fs.h
 	@mkdir -p bin
@@ -31,7 +31,7 @@ out/risc5ins.h: src/risc5ins.txt bin/mkinstab
 	@mkdir -p out
 	bin/mkinstab < src/risc5ins.txt > $@
 
-out/test/%.txt: test/%.src bin/tlc bin/r5d ./runtest.sh
+out/test/%.txt: test/%.src bin/compiler bin/r5d ./runtest.sh
 	@mkdir -p out/test
 	@rm -f $@
 	@./runtest.sh $< $@
diff --git a/docs/tlc.bnf b/docs/bnf.txt
diff --git a/docs/why-write-a-compiler.md b/docs/why-write-a-compiler.md
@@ -0,0 +1,76 @@
+# Why Write a Compiler?
+
+I think my primary motivation is simply that I had never done so before.
+
+Across 30-some years programming, I've written a bunch of assemblers,
+several interpreted languages, a number of bytecode runtimes, a linker/preprocessor
+for java class files and bytecode into a smaller/lighter bytecode (for
+the hiptop family devices), various disassembler, and a pile of domain specific
+configuration languages, but never a full top-to-bottom compiled programming language
+that starts with source code and ends up with executable machine instructions of
+some form or another.
+
+It feels like one of those things that everybody should do at least once.
+
+### Open Source FPGA Toolchains
+
+Advances in the state of the art of open source FPGA toolchains have made doing
+"real" digital design stuff without awful vendor tools possible, which re-kindled
+my interest in this space. It's pretty trivial to build a little CPU on verilog.
+It's not hard to crank out a little assembler for it. But if you're doing anything
+serious, writing a lot of software in assembly is just not that fun or efficient,
+at least not for me. I'd like to have a little systems compiler amongst my tools
+that I could easily target various little CPUs with.
+
+### The Big Compilers are Way Too Big
+
+GCC and LLVM/clang are open source. And very powerful. And frickin' enormous.
+
+So while I could invest in learning how to make use of, say, LLVM to build my own
+frontends and share backends, or write backends for either, both of them are quite
+enormous, complex source bases, and I'm not super excited about dealing with giant
+piles of other peoples' software.
+
+### Self-Hosting and Embedded Platforms
+
+While I'm writing the initial version in C, once I have sufficient feature coverage
+I hope to mechanically translate the C version and migrate to building the compiler
+with itself. A Self-Hosting Compiler is another one of those "it'd be fun to do
+that at least once" sort of things.
+
+I'd also like to be able to use the compiler not just to build for but run on small
+embedded devices, retro computing systems, or experimental small platforms. On the
+scale of single-digit megabytes of memory or less at the low end. That definitely
+rules out GCC and clang.
+
+### At Most, I Want "Just Enough" Optimization
+
+Modern optimizing compilers are amazing things, but they increasingly seem to be
+getting to clever for their own good (or at least mine). For systems programming,
+especially, I really don't want the compiler to silently drop code or massively
+rearrange it. I want to be able to rely on the compiler mostly doing what I tell
+it and not getting all inventive about "undefined behaviour."
+
+C and C++ have a bunch of undefined behaviour around integer and unsigned math,
+for example, which result in a weird gap between what the underlying machine does
+if you ask it and what a modern compiler considers "valid", and if "not valid"
+happily pretends like it doesn't matter. In a way I'm looking for a bit more of
+the "high level assembly language" that C is often accused of being while it is
+increasingly not...
+
+### They Still Haven't Built me a "Better C"
+
+Go comes close, but its insistence of green threads, userspace scheduling, garbage
+collection, largish libraries, large runtime memory footprint, and really awkward
+interworking with native C/ELF ABI code are deal breakers.
+
+Rust is more pragmatic about inteworking with C/C++ native code and native platform
+threading (yay), but still suffers from a large standard library that doesn't seem
+to layer/subset well, resulting in even small programs being several megabytes in size.
+I feel like its heart is in the right place with the lifetime tracking stuff, but it
+feels rather awkward to code with. It also suffers from very slow compilation.
+
+And so on and so on.
+
+This project gives me a chance to do my own experimenting, and while I am skeptical
+that I shall succeed wildly where all others have failed, at least I'll have fun trying.
diff --git a/readme.md b/readme.md
@@ -0,0 +1,25 @@
+
+# Unnamed Compiled Systems Language Project
+
+"O. Inspired by Oberon and gO, and it's like C without the sharp edges." - @adamwp
+
+It doesn't really have a name yet. The project is still very early. The syntax and features are
+incomplete and will change. It is way, way too early to do much of anything but watch
+me tinker with things, building this incrementally.
+
+The general plan is a small compiled systems language (in complexity, source size, and binary size)
+borrowing syntax from some of my favorite "braces languages", C, Go, and Rust, aiming to be a bit
+safer than C, and suitable for small, embedded, self-hosted systems.
+
+## [Why write a compiler?](docs/why-write-a-compiler.md)
+
+## Status
+
+It's currently compiling a subset of the work-in-progress language and generating binaries for
+the Project Oberon [Risc 5 Architecture](docs/project-oberon-risc5-architecture.txt).
+
+I plan to also support RISCV (RV32I) as a target soonish, and if the project keeps moving
+will eventually target X86-64 because why not.
+
+It's presently written in C, but once the core language is suitably featureful and codegen is
+reasonably reliable I plan to translate it to itself and become self-hosted.
diff --git a/runtest.sh b/runtest.sh
@@ -12,7 +12,7 @@ msg="${txt%.txt}.msg"
 gold="${src%.src}.log"
 
 echo "RUNTEST: $src: compiling..."
-if bin/tlc -o "$bin" -l "$lst" "$src" 2> "$msg"; then
+if bin/compiler -o "$bin" -l "$lst" "$src" 2> "$msg"; then
 	# success!
 	if [[ "$txt" == *"-err"* ]]; then
 		# but this was an error test, so...
diff --git a/src/tlc.c b/src/compiler.c