<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Kartik Agaram</title>
    <link>http://akkartik.name</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Quickly make any LÖVE app programmable from within the app</title>
      <link>http://akkartik.name/post/love-repl</link>
      <pubDate>Tue, 22 Aug 2023 11:51:16 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/love-repl</guid>
      <description><![CDATA[
<p>
It's a very common workflow. Type out a <a href='https://love2d.org'>LÖVE</a>
app. Try running it. Get an error, go back to the source code.

<p>
How can we do this <em>from within the LÖVE app</em>? So there's nothing to
install?

<p>
This is a story about a hundred lines of code that do it. I'm probably not the
first to discover the trick, but I hadn't seen it before and it feels a bit
magical.

<p>
<a href='https://forum.malleable.systems/t/adding-malleability-to-any-love-app/90'>Read more</a>
]]></description>
    </item>
    <item>
      <title>Using computers more freely and safely</title>
      <link>http://akkartik.name/post/freewheeling</link>
      <pubDate>Tue, 23 May 2023 09:03:45 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/freewheeling</guid>
      <description><![CDATA[
<p>
<a href='/freewheeling'>A 15-minute manifesto (video and transcript) on
lessons learned trying to build situated software for a year.</a>
]]></description>
    </item>
    <item>
      <title>A year of freewheeling apps</title>
      <link>http://akkartik.name/post/roundup22</link>
      <pubDate>Thu, 29 Dec 2022 17:56:32 PST</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/roundup22</guid>
      <description><![CDATA[
<p>
Over the course of 2022, I've found myself gradually programming in a certain
way that has been working really well. Here, let me show you a few examples,
see if you can spot the pattern:

<p>
<ol>
<li> <a href='http://akkartik.name/lines.html'>A plain-text editor where you can
also draw line drawings.</a>

<p>
<img src='/images/20221229-roundup/gravity.png' style='margin-left:1em' width='100%'
alt='Screenshot showing text interspersed with a line drawing covering the
whole window width (48KB)'
/>

<p>
Minimal dependencies, easy to build, runs anywhere you can install apps
without asking permission, thoroughly tested, designed above all to reward
curiosity about its internals.

<p>
<li> <a href='https://github.com/akkartik/lines-polygon-experiment'>A different
way to draw polygons.</a> Old way:

<p>
<!-- more -->

<p>
<img src='/images/20221229-roundup/20220615-before.gif' style='margin-left:2em' width='100%'
alt='Drawing a square by specifying two points and one side of the resulting
line (200KB)'
/>

<p>
New way:

<p>
<img src='/images/20221229-roundup/20220615-after.gif' style='margin-left:2em' width='100%'
alt='Drawing a square by specifying a centroid and vertex (200KB)'
/>

<p>
<a href='https://github.com/akkartik/lines.love#mirrors-and-forks'>First of a
sprawling family tree of over a dozen forks of the original editor.</a>

<p>
<img src='/images/20221229-roundup/20220903-freewheeling-forks.png' style='margin-left:2em' width='100%'
alt='Family tree of 9 forks, showing relative complexity of each along the x
axis (32KB)'
/>

<p>
(Image drawn using itself, of course.)

<p>
<li> <a href='https://codeberg.org/akkartik/pong.love'>The Pong fork.</a>
Baking the editing environment into other apps for a self-contained curiosity-rewarding/forking
experience.

<p>
<video controls width='90%' style='margin-left:2em; margin-bottom:2em'>
<source src='/images/20221229-roundup/20220903-freewheeling-pong.webm' type='video/webm'
alt='A game of Pong being edited in place, and its logs browsed. The logs are
graphical and animate like a flipbook to show the game when scrolled at speed. (1.2MB)'
>
</video>

<p>
(Notice the extensible graphical logs. It takes very little code to augment
debug-by-print. This work stemmed from the <a href='https://handmade.network/p/283/bifold-text'>Handmade
Network Wheel Reinvention Jam</a>.)

<p>
<li> Using the editing environment to debug the editing environment. (More
tools should support a command palette; it's the best of commandline and GUI
worlds.)

<p>
<img src='/images/20221229-roundup/20220918-logging-menu-changes.png' style='margin-left:2em; margin-bottom:2em' width='100%'
alt='Another graphical log, this time for the editing environment itself,
showing the command palette state changing in response to keystrokes (200KB)'
/>

<p>
<li> Pivot: <a href='https://codeberg.org/akkartik/20221018-live.love'>making
changes to programs as they run.</a> They maintain state even after a crash
(the red error on the left).

<p>
<video controls width='90%' style='margin-left:2em; margin-bottom:2em'>
<source src='/images/20221229-roundup/20221017-live-coding.webm' type='video/webm'
alt='Video showing an editor making code changes to an app in a second window.
The app crashes and its window hangs, but the editor is able to get it
unblocked (1MB)'
>
</video>

<p>
<li> <a href='https://codeberg.org/akkartik/luaML.love'>LuaML</a>, a box model
over an infinite 2D surface that you can pan and zoom without restriction.
Built in the live style, of course.

<p>
<video controls width='90%' style='margin-left:2em; margin-bottom:2em'>
<source src='/images/20221229-roundup/20221224-luaML-yule.webm' type='video/webm'
alt='Video showing a hierarchical box model of blocks of text, including two
columns and a table. The surface can be panned and zoomed in and out. (3.4MB)'
>
</video>

<p>
(You can edit each text widget, and scrolling within a widget pans the whole
surface. This took me a couple of tries to boil down to a reasonably elegant
implementation.)

<p>
<li> <a href='https://codeberg.org/akkartik/driver.love'>Pulling LuaML
&ldquo;into the left window,&rdquo; the editing environment.</a>

<p>
<video controls width='90%' style='margin-left:2em; margin-bottom:2em'>
<source src='/images/20221229-roundup/20221228-luaML-driver.webm' type='video/webm'
alt='Video showing the hierarchical box model running in the editor to
visualize multiple functions from an app while retaining all its capabilities
of panning and zooming (850KB)'
>
</video>

<p>
</ol>

<p>
I've taken to calling these <em>freewheeling apps</em>. They're
freewheeling in two ways. First, they're easy to get started with so you can
be off doing your thing. Second, they stay freewheeling over time. They don't
cramp your style with constraints <a href='https://250bpm.com/blog:51'>after
you've gotten suckered into adopting them</a>.

<p>
<div style='margin-left:2em'>
&ldquo;You want me to trust your binaries? I'm just not ready for that
commitment, man. Do you have a rock-solid build process that's guaranteed to
work on my machine? I'd like the <em>option</em> to look at what you're up to
when I feel like it.&rdquo;

<p>
&ldquo;It autoupdates twice a week? I'd rather not spend time tracking down
what change broke my habits, thanks.&rdquo;

<p>
&ldquo;1GB install? What if I'm in, you know, the other 95% of the
planet?&rdquo;

<p>
&ldquo;You have a PR submission process? How lovely for you. Hey, how about I
just publish a fork, and you take what you want. (I love getting comments,
though.)&rdquo;

<p>
&ldquo;I don't want to remember a bunch of idiosyncrasies about your language
and app. Can you just give me good error messages when I mess up? Just don't
harsh my buzz about portability, compatibility constraints and whatnot.&rdquo;
</div>

<p>
Replace &ldquo;I&rdquo; with yourself, dear reader. These apps should work
anywhere (except mobile platforms), be easy to try out, easy to edit in place,
and easy to subvert if you dislike a design choice. I'm going to continue
improving the hacking experience, and I want to support forks in staying up to
date with my changes. Unfortunately they require a little bit of programming
experience for now (particularly <tt>git</tt>), but it should all seem pretty
familiar regardless of what languages and tools you've used in the past. And
if they don't, feel free to reach out. I welcome questions.

<p>
<em>Coda</em>

<p>
A lot of the bang here comes from the stack I'm using: the <a href='https://www.lua.org'>Lua</a>
programming language and the <a href='https://love2d.org'>LÖVE</a> game engine
for Lua. You don't <em>have</em> to use Lua and LÖVE to be freewheeling, I
think; any parsimonious stack designed to be portable and easy to build will
do. But if you have another candidate that meets those criteria, I'd like to
see it.

<p>
<hr style='width:50%'>
]]></description>
    </item>
    <item>
      <title>Linear reading and the Silfen Paths</title>
      <link>http://akkartik.name/post/silfen-paths</link>
      <pubDate>Tue, 30 Nov 2021 17:56:32 PST</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/silfen-paths</guid>
      <description><![CDATA[
<p>
I spent the pandemic year reading a lot of Peter Hamilton. I wouldn't necessarily
recommend it; they all blur together after a while, and I start to wonder if
they aren't perhaps all the same story&hellip;

<p>
Regardless, the first Peter Hamilton I read, <em>Pandora's Star</em>, still
sticks with me for a motif that didn't come together until right at the end:
the Silfen Paths. In this universe humanity has portals that can span light
years, often conveying train service between star systems, but there are
occasional legends of an older interstellar network by an ancient alien
civilization. Needless to say, our intrepid protagonist manages to get on this
network. And suffers years of privation and amazing adventures (while everyone
else in the novel is moving the story forward) before coming out the other
end. Unlike the portals created by humans, the Silfen paths don't contain
abrupt transitions between two points in space. Things blend together more
gradually. Also unlike portals, the Silfen Paths aren't in the traveller's
control. Instead, to go forth on the paths is to open oneself to the new, the
unexpected. Extreme heat and cold. Danger. The occasional prancing Silfen
who'll happen upon you and help you out, but who doesn't quite seem to get the
idea of &ldquo;home,&rdquo; or that you're trying to get there, before
outpacing you again, inevitably leaving you behind to find your own path
through the maze.

<p>
<!-- more -->

<p>
If I could go back in time and give my younger self a message, it would be to
spend more time reading linearly through guided documentation. After growing
up falling asleep on textbooks, the interactive and random-access nature of
computers was a great learning aid. And yet, somewhere along the way, I
started to rely far too much on the crutch of just <em>Googling</em> for my
immediate problem. Many times I prematurely dismissed painstakingly written
documentation while bemoaning how poorly documented everything was. Everything
wasn't set up just right for me &mdash; because it can't ever be. And by
taking too often the easy portal out, I was being inefficient with the
opportunities for learning that were coming my way. Just because they <em>seemed</em>
inefficient for my task at the time, something seemingly all-important that I
forgot about the next day as I went dancing on my way.

<p>
I consider rereading <em>Pandora's Star</em> every once in a while. I'm sure it'll
happen eventually. But I'm equally sure I'll skip all the interminable
interludes about the Silfen Paths, now that I know where they end. Just
reading about someone so out of control is more than my pampered, twentieth-century,
low-attention-span, summer-child self can handle. It almost got me to give up
on the book the first time around. And yet, I cherish my one time walking the
Silfen Paths in my imagination. I did learn something.

<p>
For an eternal-seeming few years fifteen years ago, there was a frequent
debate about the value of comments in a blog. On one side: <em>blogs are meant to
be interactive!</em> On the other: <em>why do I need teenagers tramping
through my homestead? <a href='https://www.joelonsoftware.com/2007/07/20/learning-from-dave-winer'>If
you have something to say, get your own damn blog!</a></em> With the perspective
of hindsight, it's apparent who inherited the earth. Blogs largely are the
preserve of the few, and even the few committed bloggers that remain have to
go out to find readers on the street, the curbside, the microblog, where
everything is a comment. You can't get others to read you without giving them
the opportunity to appropriate you with a pin, a bookmark or a retweet.

<p>
Have we noticed yet that our blogs are now just as disintermediated from our
readers as the mainstream newspapers whose disintermediation we celebrated?
&ldquo;Comments are dead,&rdquo; blogs proclaimed. But comments didn't die,
they became the whole platform.

<p>
It's a grave decision, where you draw boundaries between pages on a website.
Every link is a portal, a beginning, an opportunity for a search engine or
microblogger to start a cowpath. With that in mind, I'm trying something new,
the guided tour for Mu. Ironically, it atomizes my previous docs by linking
repeatedly into anchors in the middle of pages. <a href='https://github.com/akkartik/mu/blob/main/tutorial/index.md'>Proceed
if you dare.</a>
]]></description>
    </item>
    <item>
      <title>Mu's neighborhood</title>
      <link>http://akkartik.name/post/neighborhood</link>
      <pubDate>Sun, 13 Jun 2021 14:54:27 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/neighborhood</guid>
      <description><![CDATA[
<p>
<a href='/about'>My goal</a> for <a href='https://github.com/akkartik/mu'>Mu</a> is
software that is accountable to the people it affects. But it's been difficult
to talk to people about Mu's goals because of the sheer number of projects
that use similar words but lead to very different priorities and actions. Some
of these I like to be associated with, some <span style='color:red'>not so
much</span>.

<p>
<em>if you care about making software accountable</em>

<p>
<!-- more -->

<p>
There are two ways to describe the provenance of software: in terms of the
people who made it or checked it out, and in terms of its internals. The two
ways are often described with similar-sounding words.

<p>
On one hand, <b><a style='color:red' href='https://en.wikipedia.org/wiki/Trust_metric#Transitivity'>trust
chaining</a></b> is a way to transform trust in one person into trust in
another.

<p>
Trust chaining is also used to transform trust in people into trust in
artifacts, by performing arithmetic on an artifact to <b>verify</b> a trust
relationship such as "person X owns this signature".

<p>
On the other hand, <b>proofs</b> convey an argument to a reader and let the
reader assess how objective and ironclad it is without needing to trust the
author.

<p>
Automated proofs can be checked by a computer, but are still amenable to human
readers. (Beyond automated proofs, zero-knowledge proofs are more about
verification. Conveying ownership of artifacts rather than objectivity of
knowledge. That way lie ideas like <span style='color:red'>proof of work</span>.
I want to exclude these from my preferred notion of 'proof'. <a href='/contact'>Let
me know</a> if you can think of a clearer word than 'proof' for my desired
category.)

<p>
<span style='color:red'>Trusted computing</span> uses trust chaining to shackle
a software stack to some distant server running some arbitrary software.
(<a href='https://cs.stanford.edu/people/eroberts/cs201/projects/trusted-computing/what.html'>more
details</a>)

<p>
<a href='https://en.wikipedia.org/wiki/Reproducible_builds'>Reproducible
builds</a> are about getting some arbitrary set of software to generate
deterministic output given identical input. The build recipe acts as a proof
that a binary was generated from it.

<p>
<a href='http://bootstrappable.org'>Bootstrappable builds</a> are about making
<em>all</em> the software needed for a program available for auditing.
Including all the software needed to build it, and all the software needed to
build that, and so on. The build recipe acts as a proof that source code for
the entire supply chain is available for auditing.

<p>
(<span style='color:red'>Bootstrapping</span> is about getting a compiler to
build its own source code. Such a similar term, so <a href='/trusting_trust.pdf'>antithetical</a>
to bootstrappable builds.)

<p>
<a href='https://github.com/crev-dev/crev'>Collaborative code review</a> is
about getting people to sign off on software packages once source code is
available. There's no proof here, but it becomes possible to verify that some
people inspected sources and found no major issues.

<p>
Problems in this category:
<ul>
<li>people make mistakes; I want to verify not just what people think, but <em>why</em>
<li>stacks currently grow complex faster than attempts to make them auditable
<li>auditability doesn't help answer people's questions about <em>why</em> a
piece of software does something seemingly questionable
</ul>

<p>
Even so:
<ul>
<li>some auditability is a vast improvement over none
<li>auditability is vastly more important for the lowest levels of the stack,
where it's more tractable to obtain
</ul>

<p>
<em>if you care about bringing software closer to the people it affects</em>

<p>
Minimalism is about building simple programs with as little code as possible,
starting from some arbitrary set of software. (<a href='http://arclanguage.org'>example</a>
<a href='https://dwm.suckless.org'>example</a> <a href='https://100r.co/site/nasu.html'>example</a>)

<p>
<a href='https://malleable.systems'>Malleable software</a> or
<a href='https://tcher.tech/publications/PhilipTchernavskij_PhDThesis.pdf'>end-user programming</a>
is about building programs to be as easy to change as they are to use.
Building atop some arbitrary set of software.

<p>
Problems in this category:
<ul>
<li>things (<a href='https://wiki.xxiivv.com/site/uxn.html'>example</a>) are often not
minimal or simple if you get into details (<a href='https://packages.ubuntu.com/hirsute/libsdl2-2.0-0'>example</a>)
<li>to permit open-ended changes you have to take control of more and more of
the stack (<a href='https://medium.com/feenk/one-rendering-tree-918eae49bcff'>example</a>)
<li>minimal software has a constant temptation to grow less minimal over time
(<a href='https://en.wikipedia.org/wiki/Unix'>example</a>)
</ul>

<p>
Even so:
<ul>
<li>they can deliver a great deal of capability even if they're not perfect
</ul>

<p>
<em>editorializing</em>

<p>
I care about software that is <em>accountable</em> to <em>the people it
affects</em>. Both sides matter.

<p>
The word 'trust' provides cover for a large amount of bad behavior.
A <a href='https://en.wikipedia.org/wiki/Trusted_Platform_Module'>trusted platform module</a>
provides trust to large companies about the hardware their "intellectual
property" runs on. It does not provide trust to the individual people whose
computers it inhabits.

<p>
Proofs work better than some cryptographic sign-off. If a trust relationship is
found to have a problem, it must be revoked wholesale. Repeated revocations
reduce confidence. If a proof is found to have a problem, it's usually easy to
patch. There are enough details to determine if things are improving.

<p>
Trust relationships can change. Habits can be hard to change. That contrast
implies that the result of an audit cannot be binary. It is untenable to tell
people to stop using software that grows abusive. When software does something
a person dislikes, they should be able to 1. find the sources, 2. build the
sources, 3. modify the sources at any level of granularity, and 4. feel
confident in the results of their actions.

<p>
It's important that this be an incremental process. Minor tweaks in response to
minor dissatisfactions aren't just first-world problems. Going through all four
steps for something minor creates confidence that one will be prepared when
major changes are needed.

<p>
Size matters. It can take surprisingly little code to lose this property of
<em>incremental accountability</em>.

<p>
The number of zones of ownership matters. You can make incrementally accountable
software by relying on others, but not too many others. Minimizing the
dependency tree may well be more important than minimizing lines of code.

<p>
<a href='https://github.com/akkartik/mu'>Mu</a> is reproducible, auditable and almost
entirely incrementally accountable (property 3. above hasn't been stress-tested
much and so remains a work in progress). I flatter myself that it's difficult
to ask "why" questions in the form of code changes without triggering a failing
test or other error message that answers them.

<p>
However, Mu too has problems:

<p>
<ul>
<li>I have to do without many things at the moment: network support, concurrency,
files, pointing device, performance, etc., etc.
<li>it's unclear if this way leads to any program anybody else would find
useful enough to want to modify
</ul>

<p>
Even so:
<ul>
<li>if it's hard to create incrementally accountable software, perhaps we
shouldn't be relying on software so much
</ul>

<p>
<em>Conclusion</em>

<p>
It's worth thinking about different pieces of software in terms of what you
give up when using them. Some points of comparison:

<p>
<ul>
<li>Most software: Memory safety (either directly or in build dependencies), reproducible builds, auditable builds, incremental accountability.
<li><a href='http://suckless.org'>The suckless school</a>: Memory safety, reproducible builds, auditable builds, incremental accountability.
<li>Rust: <a href='https://users.rust-lang.org/t/testing-out-reproducible-builds/9758'>reproducible builds</a>, auditable builds, incremental accountability.
<li><a href='http://bootstrappable.org/projects/mes.html'>bootstrappable</a>: Memory safety, incremental accountability.
<li><a href='https://wiki.xxiivv.com/site/uxn.html'>Uxn</a>: Memory safety, large screen, lots of RAM, reproducible builds. (Incremental accountability seems less important when programs are so tiny.)
<li><a href='https://github.com/akkartik/mu/tree/main/linux'>Mu on Linux</a>: Graphics, some memory safety and incremental accountability (Linux dependency), portability.
<li><a href='https://github.com/akkartik/mu'>Mu</a>: Networking, concurrency, files, performance, portability.
</ul>

<p>
Thank you for reading. Here's a screenshot of a Mu program I made for my kids
today: "chessboard with rainbows".

<p>
<img style='width:100%; margin-bottom:1em' src='/images/20210613-mu-bowboard.png'>

<p>
<span class='btw'>(This post was inspired by <a href='https://www.ribbonfarm.com/you-are-here'>Ribbonfarm's
periodic maps</a>.)</span>

<p>
<em>comments</em>

<p>
<ul><div class="comment">
&nbsp;&nbsp; <li><a name="518e0283c8bf400a06907a84f91e12c3fe0b16aa7aa5eedd615fb1658480101b"></a><a href="https://twitter.com/TriKro">Tristan Kromer #BlackLivesMatter</a>, 2021-07-04: I notice you never use the word transparency. Does the program make clear to the user what it is doing at all times?

<p>
This is a little different from auditable. Eg, I would prefer a webcam hardwired with a light to show when it is powered on vs a webcam where I could inspect it to determine the status, but that requires my curiosity and effort.
&nbsp;&nbsp; <ul><div class="comment">
&nbsp;&nbsp;   <li><a name="90c1c5c6be4b6965011afbc73c51ac777a2d1d4a3934248b923ff08acf08618e"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2021-07-04: It's a good question. I was enamored with 'transparency' a year ago but dropped it because of associations with stuff like <a href='https://www.computer.org/csdl/magazine/cs/2017/01/mcs2017010005/13rRUxC0SLw'>https://www.computer.org/csdl/magazine/cs/2017/01/mcs2017010005/13rRUxC0SLw</a>. People often use 'transparency' to mean 'invisible,' which is the opposite of what I mean. So it's been challenging to carve out an unpolluted term.

<p>
The distinction you make is also important. A device can only have a limited number of lights and dials, and I don't have much to contribute regarding what to use them for. It feels like a plenty hard problem just to provide all the data somewhere so that others can decide what to highlight.
</div></ul>
&nbsp;&nbsp; <li><a name="f2b5293aedc4ab36d62cd0f6ba2557470f2fe0553dba331a5ac39f4d343d9000"></a>Anonymous, 2021-09-04: I liked "habitable" <a href='http://akkartik.name/post/habitability'>http://akkartik.name/post/habitability</a>.
&nbsp;&nbsp; <ul><div class="comment">
&nbsp;&nbsp;   <li><a name="b1d62387ba9aabf8d9a0756109611fe46b793297ddf5382f87ee02a69f6e3080"></a>Anonymous, 2021-09-04: PS: Similar ontological analysis from earlier this year: <a href='https://news.ycombinator.com/item?id=25458080'>https://news.ycombinator.com/item?id=25458080</a>
&nbsp;&nbsp;   <ul><div class="comment">
&nbsp;&nbsp;     <li><a name="d55f33a96a3294f7043fa38ead023b826620ae28f49762b49c3d39138a111c7b"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2021-09-04: Thanks! That's a good point.

<p>
Is that comment by you? I just want to clarify that <a href='http://akkartik.name/post/habitability'>http://akkartik.name/post/habitability</a> is not written by me. It's a long quote from an essay by Richard Gabriel.
&nbsp;&nbsp;     <ul><div class="comment">
&nbsp;&nbsp;       <li><a name="259e5742aa9a6f34df8966fff40eb9daf3e06691f6c3f8bfd419f73821bf391c"></a>Anonymous, 2021-09-13: &gt; Is that comment by you?

<p>
Yes.

<p>
&gt; <a href='http://akkartik.name/post/habitability'>/post/habitability</a> is [...] a long quote from an essay by Richard Gabriel

<p>
Yes, but it's a nice descriptor.

<p>
Or maybe we should be bold enough to synthesize our own word.  "hactile software"?  (*hack* + *tactile* = *hactile*; something that exhibits the quality is said to have *hactility*)  Too corny?
&nbsp;&nbsp;       <ul><div class="comment">
&nbsp;&nbsp;         <li><a name="1815a0af5ab600d6cd2d18616236a1c27224dcee097f3d737c7b49a5070707da"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2021-09-13: :) I love that you brought up tactile. Here's something from Mu's Readme several iterations ago:

<p>
&gt; Trading off notational convenience for tests may seem regressive, but I suspect high-level languages aren't particularly helpful in understanding large codebases. No matter how good a notation is, it can only let you see a tiny fraction of a large program at a time. Logs, on the other hand, can let you zoom out and take in an entire *run* at a glance, making them a superior unit of comprehension. If I'm right, it makes sense to prioritize the right *tactile* interface for working with and getting feedback on large programs before we invest in the *visual* tools for making them concise.

<p>
(From <a href='https://github.com/akkartik/mu/blob/14b33e59a528b996f682110564182524d1503e91/Readme'>November 2014</a> to <a href='https://github.com/akkartik/mu/blob/a538edba2d461f785093f36e48c046cce3eee920/Readme.md'>March 2016</a>)
&nbsp;&nbsp;         <li><a name="f46abe4797b352405eba9ebbc9c32c3929979a89ed6e5f87246617eb901867ae"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2021-09-13: Last week's exchange has had me mulling an update to this post. In particular, I now see Mu's neighborhood as more pronouncedly bilobed, between habitability and auditability. (I think both benefit from increased tactility?)

<p>
In the process, I was also reminded of another fount of neighbors on the habitability side: the literature on tailorable software in the 90's, as described in <a href='https://tcher.tech/publications/PhilipTchernavskij_PhDThesis.pdf'>chapter 2 of Philip Tchernavskij's thesis</a>.
</div></ul>
</div></ul>
</div></ul>
</div></ul>
</div></ul>
]]></description>
    </item>
    <item>
      <title>The Mu computer in 2020</title>
      <link>http://akkartik.name/post/mu-2020</link>
      <pubDate>Wed, 30 Dec 2020 15:41:14 PST</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/mu-2020</guid>
      <description><![CDATA[
<p>
<div style='margin-left:3em'>
<span class='left_quote_char'>&ldquo;</span><em>There are two ways of constructing
software. One way is to make it so simple that there are obviously no deficiencies,
and the other way is to make it so complicated that there are no obvious
deficiencies. The first method requires a willingness to accept limitations,
and to compromise when conflicting objectives cannot be met.&rdquo;</em>
<br>&mdash; <a href='http://worrydream.com/refs/Hoare%20-%20The%20Emperors%20Old%20Clothes.pdf'>C.
A. R. Hoare</a>
</div>

<p>
It seems to me that modern computers trap people in a vicious cycle. Compatibility
guarantees breed complexity over time as the world changes. Complexity is
managed by introducing layers of abstraction. Abstractions introduce new
compatibility guarantees. Over the decades this vicious cycle leads to even
professional programmers understanding only a tiny fraction of the software
infrastructure that runs their computers. As a result, our world is increasingly
captured by software that is unaccountable to people.

<p>
For several years now I've had <a href='/about'>a vision</a> for a computer
that allows anyone to audit its inner workings, where any operation can be
decomposed strictly into a parsimonious combination of simpler operations,
terminating without cyclic dependencies or circular reasoning at some ground
level. Ideally it would do this in a way that rewards curiosity, leading to a
virtuous cycle where an order of magnitude more people grow to understand how
their computer works as they use it.

<p>
Nowhere in this picture are compatibility guarantees, version numbers or
forced upgrades. At any point your computer should be internally consistent
and free of known historical accidents. Even if this means upgrades are more
work and so more infrequent, and that our computers must be slower. Or do
less. That seems like a worthwhile trade for <a href='https://www.youtube.com/watch?v=pW-SOdj4Kkk'>a
more sustainable world</a>.

<p>
At the start of 2020 the state of <a href='https://github.com/akkartik/mu'>the
Mu computer</a> looked like this:

<p>
<!-- more -->

<p>
<img src='/images/20201230-2019.png' style='width:100%; margin-bottom:1em'>

<p>
The blue stack on the left was my current computer, without changes. The stack
on the right was a better computer using some existing pieces (blue) and some
new pieces (red). The arrow between the stacks was like a wormhole between
worlds. I was using the existing world to build the new world, but the idea
was for the new world to be self-sufficient once it was set up.

<p>
At the end of 2020, more details of the picture are becoming clear:

<p>
<img src='/images/20201230-2020.png' style='width:100%; margin-bottom:1em'>

<p>
I now have a self-sufficient computer in the middle stack that can
<a href='http://akkartik.name/akkartik-convivial-20200607.pdf'>rebuild itself</a>
without needing much mainstream software. Since its focus is on building
itself, the currency of the realm is streams of text. An existing Linux kernel
provides clean primitives for operating on streams of text over <tt>stdin</tt>/<tt>stdout</tt>.
Beyond that the Mu computer fends for itself. It even has a shell, though it
doesn't look anything like Unix has taught us to expect:

<p>
<img src='/images/20201229-mu-environment.png' style='width:100%; margin-bottom:1em'>

<p>
It's a <a href='http://worrydream.com/#!/StopDrawingDeadFish'>live-updating</a>
postfix environment entirely in text mode that shows the top-level evolution
of your computation (including side-effects in little fake screens) at a
glance, where you can drill down into any function call when you need more
details. Check out the <a href='https://archive.org/details/@kartik_agaram'>15
two-minute demos</a> I made in 2020.

<p>
I don't represent size and complexity in my pictures above. The mainstream
stack running on our computers contains
<a href='https://caseymuratori.com/blog_0031'>hundreds of millions of lines of code</a>,
and would look like a ball of spaghetti if we zoomed into the connections
between levels. The middle stack requires twelve million lines for the Linux
kernel, but aside from that weighs in at 50k lines of straight-line dependencies,
most of them mapping to individual instructions of machine code, two thirds of
which are comments or automated tests. (It does a lot less, of course. The
goal is to start with a sustainable stack and then preserve sustainability
properties as we thoughtfully add functionality.) Finally, the third and most
nascent stack on the right gets rid of the twelve million lines of Linux, and
can do even less at the moment. All it can do is <a href='/images/20201227-256color.png'>draw pixels on the screen</a>
and process keystrokes.

<p>
<ul>
<li> No wifi, no networking.
<li> No file system yet, just sectors on a local disk.
<li> No multitouch, no touchscreen, no mouse, not even any shift key support yet.
<li> No graphics acceleration, no fonts, no way to print text.
<li> No virtual memory, no GC, not even any memory reclamation yet.
</ul>

<p>
But it's a start. A moderately sized screen, a keyboard, gigabytes of RAM and
a guarantee of memory-safety. Let's see where 2021 leads.

<p>
<span class='btw'>(<a href='https://merveilles.town/@akkartik/105467785655674076'>Initial
revision on Mastodon</a>.)</span>
]]></description>
    </item>
    <item>
      <title>Mu: The first 6 years</title>
      <link>http://akkartik.name/post/convivial-computing</link>
      <pubDate>Sun, 15 Mar 2020 22:23:46 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/convivial-computing</guid>
      <description><![CDATA[
<p>
Over the last few months I've written up in one place the entire argument
for&mdash;and comprehensive description of&mdash;what I've been working on
since 2014. It will be published in the proceedings of the <em>Convivial
Computing Salon</em>. From <a href='https://2020.programming-conference.org/home/salon-2020#Call-for-Submissions'>the
call for submissions</a>:

<p>
<blockquote style='border-left:2px #888 solid; padding-left:0.5em'>
In <a href='/illich.pdf'><em>Tools for Conviviality</em> [1973]</a>, Ivan Illich said, &ldquo;I choose
the term &lsquo;conviviality&rsquo; to designate the opposite of industrial
productivity&hellip; Tools foster conviviality to the extent to which they can
be easily used, by anybody, as often or as seldom as desired, for the accomplishment
of a purpose chosen by the user&hellip; People need new tools to work with
rather than tools that work &lsquo;for&rsquo; them.&rdquo;

<p>
We were promised bicycles for the mind, but we got aircraft carriers instead.
We believe Illich’s critique of the damage to society from technology
escalation offers a fresh perspective from which to discuss the pathologies of
modern software development, and to seek better alternatives.
</blockquote>

<p>
An inspiring theme. My response: <a href='/akkartik-convivial-20200607.pdf'>&ldquo;Bicycles
for the mind have to be see-through.&rdquo;</a> Get it? When I look over at my
bicycle I can see right through its frame. I can take in at a glance how the
mechanism works, how the pedals connect up with the wheels, and how the wheels
connect up with the brakes. And yet, when we try to build bicycles for the
mind, we resort to &ldquo;hiding&rdquo; and &ldquo;abstraction&rdquo;. I think
this analogy has a lot more power than we credit, a lot more wisdom to impart
if we only let it in. I think conviviality requires tools with exposed
mechanisms that reward curiosity.

<p>
I've been trying to falsify this hypothesis for 6 years. There are still large
gaps to investigate, but so far it's holding up. <a href='/akkartik-convivial-20200607.pdf'>Read
on &rarr;</a> [pdf; 25 pages]
]]></description>
    </item>
    <item>
      <title>Mu: Sketching out a minimal system programming language</title>
      <link>http://akkartik.name/post/mu-2019-2</link>
      <pubDate>Tue, 15 Oct 2019 15:18:14 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/mu-2019-2</guid>
      <description><![CDATA[
<p>
<a href='/post/mu-2019-1'>In the previous post</a>, I described what <a href='https://github.com/akkartik/mu#readme'>my
new hobbyist computing stack</a> looks like today, and how the design decisions
seemed to accumulate inevitably from a small set of axiomatic goals. In this
post I describe some (far more speculative) future plans for Mu and try to
articulate how the design process continues to seem pre-ordained.

<p>
(Many of the sections below outline constraints before describing a design
that mostly fits them. This flow is inspired by <a href='https://en.wikipedia.org/wiki/Notes_on_the_Synthesis_of_Form'>
Christopher Alexander's <em>&ldquo;Notes on the synthesis of form&rdquo;</em></a>.)

<p>
<!-- more -->

<p>
To recap, I plan 3 levels of languages for Mu:

<p>
<ul>
<li> Level 1 (SubX): just the processor's instruction set with some syntax
sugar
<li> Level 2 (Mu): a memory-safe language
<li> Level 3 (TBD): an expressive high-level language
</ul>

<p>
So far Mu has just level 1 (described in <a href='/post/mu-2019-1'>the
prequel</a>). After writing 30k LoC in level 1, the thing I miss most is the
safety of a higher-level language. In particular, memory safety. I repeatedly
make the following kinds of errors:

<p>
<ol>
<li> I allocate local variables on the stack but forget to clean them up
before returning.

<p>
<li> I accidentally clobber a register in a function without first saving the
caller's version of it.

<p>
<li> I accidentally clobber memory outside the bounds of an array.

<p>
<li> I accidentally continue to hold on to a heap allocation after I've freed
it.

<p>
</ol>

<p>
C solves the first two of these problems but not the latter two. Accordingly,
level 2 will have some similarity with C but also major differences. Like C's
original goals, it's intended to be easy to translate to native machine code.
Unlike C, however:

<p>
<ul>
<li> It is intended to <em>remain</em> easy to translate over time. C compilers
have <em>optimization passes</em> that have grown ever more complex and
sophisticated over time in a search for higher performance. Mu is intended to
never have an optimizer.

<p>
<li> It is intended to always be built out of lower level languages. This
imposes some tight constraints on the number of features it can have.

<p>
<li> Performance is not a priority. <a href='https://cr.yp.to/qmail/qmailsec-20071101.pdf'>I
want to make it safe before I make it fast.</a> In particular, to keep the
translation simple, I'm willing to use run-time checks for things like
out-of-bounds memory access.

<p>
<li> To keep the translation simple, I give up mathematical notation. You
won't be able to say things like `<tt>a + b*c</tt>` (that C compilers then
need <a href='https://en.wikipedia.org/wiki/Common_subexpression_elimination'>CSE</a>
to optimize).

<p>
</ul>

<p>
Since the compiler won't optimize code, not just in the initial phase but
<em>ever</em>, the language should allow the programmer complete control over
performance. It won't let you write utterly unsafe code (bump down to machine
code for that), but it shouldn't introduce gratuitous overheads when emitting
machine code. C started out simple to translate, but gained declarative
keywords like <tt>inline</tt> and <tt>register</tt> over time. I want to push
much harder than C on being easy to translate. For the most part there should
be a 1-to-1 mapping between statements and x86 instructions. Just with more
safety.

<p>
&lsquo;Compiling&rsquo; feels too pompous a word for this process. Let's call
it just &lsquo;translating&rsquo; instead. And it turns out to be surprisingly
constraining. If you start out with the goals stated above about the target
instruction set and how parsimonious the translation process needs to be, there
seems to be exactly one core language you can end up with.

<p>
<em>Ingredients</em>

<p>
As hinted above, the language has to be exclusively statement-oriented. While
I <a href='/post/wart'>prefer</a> the syntax of purely expression-oriented languages,
that's something to build higher up in Mu. Not having to translate expressions
to statements avoids the need to detect common sub-expressions and so on.

<p>
I said above that I want statements to map 1:1 to instructions. The x86
instruction set doesn't allow instructions to access more than one memory
location. The simplest way to work with this restriction is to make registers
explicit in our code, so that they're a constant consideration for the
programmer. Mu will be a manually register-allocated language. It will still
<em>check</em> your register allocation and flag an error if multiple variables
overlap on the same register. But it's up to the programmer to manage registers
and spill to memory when necessary. The good news: the programmer can <em>really</em>
optimize register use when it matters.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
x/<span style='color:DarkSeaGreen'>eax</span> <span style='color:blue; margin-left:1em'># variable x is stored in register <span style='color:DarkSeaGreen'>eax</span></span>
x <span style='color:blue; margin-left:3.4em'># variable x is stored in memory</span>
</pre>

<p>
The language will specify types of variables. The translator need only
compare the input and output types of each statement in isolation.

<p>
Types allow restrictions to instructions to support memory safety. For
example, `<tt>add</tt>` can be restricted to numbers, `<tt>index</tt>` to
arrays, and `<tt>get</tt>` to records (structs). All three may emit the same
instruction in machine code, but the logical semantics allow us to forbid
arbitrary pointer arithmetic.
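
<p>
To make the idea of checking each statement in isolation concrete, here's a
rough Python sketch. It is only an illustration under my own assumptions: the
type names, the <tt>ALLOWED</tt> table and <tt>check_statement</tt> are
hypothetical and not part of Mu.

```python
# Hypothetical sketch: each primitive accepts only one kind of operand.
# ALLOWED and check_statement are illustrative names, not Mu's.
ALLOWED = {
    "add": "int",       # arithmetic only on numbers
    "index": "array",   # indexing only on arrays
    "get": "record",    # field access only on records
}


def check_statement(op, operand_type):
    """Return None if the statement type-checks, else an error message."""
    expected = ALLOWED.get(op)
    if expected is None:
        return "unknown operation: " + op
    if operand_type != expected:
        return op + " expects " + expected + ", got " + operand_type
    return None
```

<p>
Because the check only looks at one statement at a time, applying
<tt>add</tt> to an array is rejected without any whole-program analysis.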

<p>
In addition to types, variables can also specify their storage location:
whether they're allocated on a register, on the stack or on the data segment.
The skeleton of the translator starts to take shape now: after parsing and
type-checking, a simple dispatch loop that decides what instruction to emit
based on both the operation and where the operands are stored.

<p>
The one feature I miss most when programming in machine code is the ability to
declare a struct/record type and access record fields rather than numeric
offsets. Mu will have user-defined types.

<p>
While most Mu code will be translated 1:1 into binary, there will be a handful
of situations where instructions are inserted without any counterpart in the
sources. The list so far:

<p>
<ul>
<li> Stack management. When I declare local variables I'd like them to be
automatically popped off the stack when exiting scope.
<li> Checking array bounds. Every index into an array compares with its
length. To make this check convenient, all arrays start with their lengths in
Mu.
<li> Checking pointers into the heap. Heap allocations can be reclaimed, and
we want to immediately flag dereferences to stale pointers after they've been
reclaimed.
</ul>
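
<p>
The bounds check on length-prefixed arrays can be sketched in a few lines of
Python. This is a simplified model under stated assumptions: memory is a flat
list of words rather than bytes, and <tt>checked_index</tt> is a hypothetical
helper, not something Mu provides.

```python
# Hypothetical model: memory is a flat list of words. An array at address
# `base` stores its length in the first word, followed by its elements.
def checked_index(memory, base, i):
    """Return the address of element i, or raise if i is out of bounds."""
    length = memory[base]
    if i not in range(length):
        raise IndexError(
            "index " + str(i) + " out of bounds for length " + str(length))
    return base + 1 + i
```

<p>
Storing the length at the start means the check needs nothing beyond the
array's own address.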

<p>
This is the hazy plan. Still lots of details to work out.

<p>
<em>Milestones</em>

<p>
Here's an outline of the broad milestones planned for Mu's eponymous level-2
language:

<p>
<ul>
<li> Parse core syntax into some intermediate representation.
<li> Code-generate just instructions with integer operands, for starters.
<li> Variable declarations and scopes.
<li> User-defined types.
<li> Address types. We probably need them. How to make them safe?
<li> Addresses on the heap, and how to detect when they're reclaimed.
</ul>

<p>
I'll sketch out the first three in this post, and then go off to build the
language with just integers. Once I'm done with it I'll flesh out the rest.

<p>
<em>Core syntax</em>

<p>
As I said in <a href='/post/mu-2019-1'>the previous post</a>, machine code for
any processor consists of linear sequences of instructions. These instructions
uniformly consist of an operation and some number of operands. Conceptually:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
op o1, o2, &hellip;
</pre>

<p>
Different processors impose constraints on these operands. In x86 most
instructions can take at most two operands. At most one of them can lie in
memory, and at most one operand can be written to. The constraints are independent;
instructions may write to either memory or registers.
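
<p>
As a hedged illustration, these three constraints could be checked
independently like so in Python. The operand representation (a location and a
written flag) and the name <tt>check_operands</tt> are assumptions of this
sketch.

```python
# Hypothetical representation: each operand is a (location, is_written)
# pair, where location is "reg", "mem" or "literal".
def check_operands(operands):
    """Return a list of violated constraints (empty if the operands fit)."""
    problems = []
    if len(operands) > 2:
        problems.append("more than two operands")
    if sum(1 for location, _ in operands if location == "mem") > 1:
        problems.append("more than one memory operand")
    if sum(1 for _, written in operands if written) > 1:
        problems.append("more than one written operand")
    return problems
```

<p>
Note that a written memory operand with a register source passes, as does a
written register with a memory source.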

<p>
Machine code usually requires multiple primitive instructions to perform a
call to a user-defined function. It seems useful for primitive operations and
calls to user-defined functions to have a uniform syntax.

<p>
It would be nice to be able to see at a glance which operands of an instruction
are read or written.

<p>
Conventional Assembly languages use order to distinguish the output parameter.
However they don't support a syntax for function calls.

<p>
One potential syntax may be:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
o1, o2, &hellip; &larr; op i1, i2, &hellip;
</pre>

<p>
However, this scheme doesn't admit a clean place for operands that are both
read and written. (We'll call them &lsquo;<em>inout</em>&rsquo; operands,
extrapolating from &lsquo;input&rsquo; and &lsquo;output&rsquo;.) Inout
operands are common in x86 machine code, where you can have at most two
operands:

<p>
<ul>
<li> Add operand o1 <em>to</em> o2;
<li> subtract o1 <em>from</em> o2;
<li> compute the bitwise `<tt>AND</tt>` of o1 and o2, and store the result in
o1;
<li> &hellip;and so on.
</ul>

<p>
User-defined functions can also have inout operands. It would be nice to
highlight them uniformly.

<p>
It would also be nice to see at a glance which operands are in registers, and
which are in memory.

<p>
However, specifying both input/output and reg/mem properties explicitly for
every single operand seems excessive, both for the implementor and for users.

<p>
Function calls can pass inputs and outputs either on the stack or in registers.
Typically the stack holds only input or inout operands. You <em>could</em>
move the return address down to make some extra room to write outputs to. But
these stack manipulations take up instructions, and calls would lose a core
symmetry: it's easy and useful to check that the amount of data pushed before
a call matches the amount of data popped off after.

<p>
Function calls can return results in registers, but callers must always know
precisely which registers are being written. Separating output registers gives
a sense of flexibility that is false.

<p>
Finally, registers can't hold types larger than a single word. Larger types
need memory, and often they need to allocate this memory. If we assume memory
management is manual, it makes functions more self-contained and easy to test
if they don't do their own memory allocation. Instead the predominant idiom in
C and Assembly is for the caller to be responsible for allocating space for
results. The callee simply writes into this space.

<p>
That's a lot of constraints, and quite heterogeneous! Putting them all together,
it seems to me they point toward only one solution:

<p>
<ul>
<li> Give up on separating inputs from outputs in the general case. C is wise
here.

<p>
<li> Highlight output registers of function calls. These are just for documentation
and error-checking. As we saw above, you can't mix and match arbitrary
registers into the outputs of a call. But it's still nice for the reader to be
able to tell at a glance which registers are being modified where, and for the
translator to detect when the wrong output register is assumed in a call.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:DarkSeaGreen'>fn</span> factorial n : int <span style='color:DarkSeaGreen'>&rarr;</span> result/<span style='color:DarkSeaGreen'>eax</span> : int [
&nbsp;  &hellip;
]
<span style='color:blue'># call</span>
x/<span style='color:DarkSeaGreen'>eax</span> &larr; factorial 20  <span style='color:blue; margin-left:1em'># ok</span>
x/<span style='color:DarkSeaGreen'>ecx</span> &larr; factorial 20  <span style='color:blue; margin-left:1em'># error</span>
</pre>

<p>
<li> Highlight output registers of primitives:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
x/<span style='color:DarkSeaGreen'>eax</span> &larr; copy y
</pre>

<p>
Unlike function calls, primitive instructions can usually write to any
register.

<p>
<li> Put inout registers of primitive instructions only on the output. For
example adding y to x when x is a register:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
x/<span style='color:DarkSeaGreen'>eax</span> &larr; add y
</pre>

<p>
<li> Other than registers, memory addresses written to are really inout. Never
put them on the output side. The above two instructions when the destination is
not a register:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
copy-to x, y/<span style='color:DarkSeaGreen'>ecx</span>
add-to x, y/<span style='color:DarkSeaGreen'>ecx</span>
copy-bytes-to x, y <span style='color:blue; margin-left:1em'># function call</span>
</pre>

<p>
The <tt>-to</tt> suffix helps indicate which operand is written. It also seems
to suggest a convention of putting outputs first. You <em>could</em> use a
<tt>-from</tt> suffix and keep outputs clearly at the end. But what if you have
multiple inputs? It seems more common to have a single output.

<p>
</ul>
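
<p>
The error-checking on output registers of calls might look roughly like this
in Python. The <tt>SIGNATURES</tt> table and <tt>check_call</tt> are
hypothetical names for illustration; the real translator may organize this
quite differently.

```python
# Hypothetical signature table: function name -> register holding its result.
SIGNATURES = {"factorial": "eax"}


def check_call(output, fn_name):
    """Check a call's output, written as 'name/register' (e.g. 'x/eax')."""
    _name, _, register = output.partition("/")
    expected = SIGNATURES[fn_name]
    if register != expected:
        return "error: " + fn_name + " writes " + expected + ", not " + register
    return "ok"
```

<p>
This mirrors the two calls to <tt>factorial</tt> above: the one into
<tt>eax</tt> passes, and the one into <tt>ecx</tt> is flagged.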

<p>
<em>Code generation</em>

<p>
Once we've parsed a core syntax we need to emit code for each instruction.
Function calls are straightforward, given pre-existing syntax sugar. In Mu:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
o1, o2 &larr; f i1, i2
</pre>

<p>
In machine code with syntax sugar:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
(f i1 i2) <span style='color:blue; margin-left:1em'># output registers are implied</span>
</pre>

<p>
In machine code without syntax sugar:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
push i2
push i1
call f
add 8 bytes to <span style='color:DarkSeaGreen'>esp</span>
</pre>

<p>
Since each operand is in a separate instruction, it is free to live in a
register or memory.
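
<p>
Here's a minimal Python sketch of that desugaring, assuming every input is a
single 32-bit word. The function name <tt>translate_call</tt> is hypothetical,
and skipping the stack adjustment for calls without inputs is my own
simplification.

```python
def translate_call(fn_name, inputs):
    """Emit pseudo-instructions for a call; each input is one 32-bit word."""
    # Inputs are pushed in reverse so the first one ends up on top.
    lines = ["push " + arg for arg in reversed(inputs)]
    lines.append("call " + fn_name)
    if inputs:
        # Pop everything that was pushed: 4 bytes per input.
        lines.append("add " + str(4 * len(inputs)) + " bytes to esp")
    return lines
```

<p>
The amount added to <tt>esp</tt> after the call always matches the amount
pushed before it, which is the symmetry mentioned earlier.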

<p>
Now for primitives. I'll show a quick sketch of how we translate <tt>add</tt>
instructions to x86 opcodes based on the storage type of their operands. Other
instructions will follow similar logic.

<p>
The x86 instruction set operates on registers, memory or literals. Data in
registers is addressed directly. Data in the stack is accessed in offsets of
the stack pointer: `<tt>*(esp+n)</tt>` where `<tt>n</tt>` is usually a static
value. Global variables are accessed as labels (32-bit addresses). We also
capitalize global variables by convention.

<p>
Each add instruction takes two operands. Here are the opcodes we care about,
from <a href='https://c9x.me/x86/html/file_module_x86_id_5.html'>the
Intel manual</a> (with the optional sub-opcode or <em>subop</em> after the slash):

<p>
<style>
.table1 th { font-weight:bold; }
table.table1, .table1 th, .table1 td {
  border:1px solid black;
  padding:0.5em;
}
</style>
<table class='table1' style='margin-left:4em; margin-top:-1.5em'>
<tr>
&nbsp;  <th width='100px'></th>
&nbsp;  <th colspan=4 style='width:300px; text-align:center'>&larr; Source operand &rarr;</th>
</tr>
<tr>
&nbsp;  <th>Destination<br>Operand &darr;</th>
&nbsp;  <th>Literal</th>
&nbsp;  <th>Register</th>
&nbsp;  <th>Local</th>
&nbsp;  <th>Global</th>
</tr>
<tr>
&nbsp;  <td style='font-weight:bold'>Register</td>
&nbsp;  <td>81 /0</td>
&nbsp;  <td>01</td>
&nbsp;  <td>03</td>
&nbsp;  <td>03</td>
</tr>
<tr>
&nbsp;  <td style='font-weight:bold'>Local</td>
&nbsp;  <td>81 /0</td>
&nbsp;  <td>01</td>
&nbsp;  <td>X</td>
&nbsp;  <td>X</td>
</tr>
<tr>
&nbsp;  <td style='font-weight:bold'>Global</td>
&nbsp;  <td>81 /0</td>
&nbsp;  <td>01</td>
&nbsp;  <td>X</td>
&nbsp;  <td>X</td>
</tr>
</table>

<p>
Alternatively, showing where operands are stored (destination first):

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='display:inline-block; width:200px'>reg += literal</span> &#8658; 81 0/subop %reg literal/imm32
<span style='display:inline-block; width:200px'>stack += literal</span> &#8658; 81 0/subop *(esp+offset) literal/imm32
<span style='display:inline-block; width:200px'>Global += literal</span> &#8658; 81 0/subop *Global literal/imm32
<span style='display:inline-block; width:200px'>reg += reg2</span> &#8658; 01 %reg reg2/r32
<span style='display:inline-block; width:200px'>stack += reg2</span> &#8658; 01 *(esp+offset) reg2/r32
<span style='display:inline-block; width:200px'>Global += reg2</span> &#8658; 01 *Global reg2/r32
<span style='display:inline-block; width:200px'>reg += stack</span> &#8658; 03 *(esp+offset) reg/r32
<span style='display:inline-block; width:200px'>reg += Global</span> &#8658; 03 *Global reg/r32
</pre>

<p>
Other binary operations get translated similarly. (For details on the <em>metadata</em>
after slashes, see the section on &ldquo;Error-checking&rdquo; in <a href='/post/mu-2019-1'>the
previous post</a> and <a href='https://github.com/akkartik/mu#readme'>the
project Readme</a>.)
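
<p>
The dispatch on storage locations can be sketched in Python as follows. This
is a rough illustration rather than the real translator: the storage kinds
(<tt>literal</tt>, <tt>reg</tt>, <tt>stack</tt>, <tt>global</tt>) and the
names <tt>operand_text</tt> and <tt>translate_add</tt> are assumptions of the
sketch.

```python
# Storage kinds are an assumption of this sketch: "literal", "reg",
# "stack" (a local at an offset from esp) and "global".
def operand_text(kind, name):
    """Render a register or memory operand in SubX-like notation."""
    if kind == "reg":
        return "%" + name
    if kind == "stack":
        return "*(esp+" + name + ")"
    if kind == "global":
        return "*" + name
    raise ValueError("not addressable: " + kind)


def translate_add(dest, src):
    """dest and src are (kind, name) pairs; returns one line of output."""
    dest_kind, dest_name = dest
    src_kind, src_name = src
    if dest_kind == "literal":
        raise ValueError("cannot write to a literal")
    if src_kind == "literal":
        return ("81 0/subop " + operand_text(dest_kind, dest_name)
                + " " + src_name + "/imm32")
    if src_kind == "reg":
        return "01 " + operand_text(dest_kind, dest_name) + " " + src_name + "/r32"
    if dest_kind == "reg":
        return "03 " + operand_text(src_kind, src_name) + " " + dest_name + "/r32"
    raise ValueError("x86 cannot add memory to memory")
```

<p>
The final <tt>ValueError</tt> corresponds to the X cells in the table: both
operands would have to live in memory.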

<p>
<em>Variable declarations</em>

<p>
Mu needs to be able to declare friendly names for locations in memory or
registers. How would they work? Again, we'll start with some goals/constraints:

<p>
<ul>
<li> We'd like to be able to attach types to registers or memory locations, so
that errors in writing incompatible values to them can be quickly caught.

<p>
<li> Registers can only hold a word-sized type.

<p>
<li> We shouldn't have to explicitly deallocate variables when exiting a scope.

<p>
<li> We'd like to allocate vars close to their use rather than at the top of a
function. Ideally we'd like to support scopes in blocks rather than just for
entire function bodies.

<p>
<li> We'd like to be able to exit early and automatically clean up.

<p>
<li> We'd like to avoid unnecessary shadowing. If a variable isn't used anymore it
shouldn't need to be restored.

<p>
</ul>

<p>
Here's a declaration for a local variable with a type:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:DarkSeaGreen'>var</span> x : <span style='color:DarkSeaGreen'>int</span>
</pre>

<p>
This gets translated to (&#8658;)

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
subtract 4 from <span style='color:DarkSeaGreen'>esp</span>
</pre>

<p>
You can also initialize a variable with 0's:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:DarkSeaGreen'>var</span> x : <span style='color:DarkSeaGreen'>int</span> &larr; 0
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
push 0
</pre>

<p>
Variables on the stack (offsets from <span style='color:DarkSeaGreen'>esp</span>)
have a single type for their entire lifetimes.

<p>
If you're allocating a variable to a register, you can also turn any instruction
writing to a single register into an initialization:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:DarkSeaGreen'>var</span> x/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; copy y
</pre>

<p>
It's also fine to read from a register and write to a conceptually new variable in the same register:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:DarkSeaGreen'>var</span> opcode/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; and inst/<span style='color:DarkSeaGreen'>ecx</span> 0xff
</pre>

<p>
While register variables have a single type for their entire lifetimes, it's totally fine for a register to be used by multiple variables of distinct types in a single function or block. Put another way, register variables tend to have shorter lifetimes than variables on the stack.

<p>
<hr style='width:20%; border:1px dotted #888'>

<p>
Variable declarations interact with Mu's <tt>{}</tt> blocks. Here's a variable allocated on the stack:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='color:DarkSeaGreen; margin-left:2em'>var</span> x : <span style='color:DarkSeaGreen'>int</span> &larr; 0
<span style='margin-left:2em'>&hellip;</span>
}
</pre>

<p>
This gets translated to (&#8658;) the following level-1 pseudocode:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='margin-left:2em'>push 0</span>
<span style='margin-left:2em'>&hellip;</span>
<span style='margin-left:2em'>add 4 to <span style='color:DarkSeaGreen'>esp</span></span>
}
</pre>

<p>
Similarly, a variable allocated to a register:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='color:DarkSeaGreen; margin-left:2em'>var</span> x/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; copy y
<span style='margin-left:2em'>&hellip;</span>
}
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='margin-left:2em'>push <span style='color:DarkSeaGreen'>ecx</span></span> <span style='color:blue; margin-left:1em'># spill register</span>
<span style='margin-left:2em'><span style='color:DarkSeaGreen'>ecx</span> &larr; copy y</span>
<span style='margin-left:2em'>&hellip;</span>
<span style='margin-left:2em'>pop to <span style='color:DarkSeaGreen'>ecx</span> <span style='color:blue; margin-left:1em'># restore register</span></span>
}
</pre>

<p>
(This approach fits a calling convention where all registers are saved by the
callee rather than the caller.)

<p>
Variables don't always need to be shadowed; sometimes they're safe to clobber.
Rather than taking on responsibilities of analyzing lifetimes, Mu provides a
single simple rule: the first variable in a register shadows previous values,
and subsequent variables to the same register in the same block clobber:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='color:DarkSeaGreen; margin-left:2em'>var</span> x/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; copy y
<span style='color:DarkSeaGreen; margin-left:2em'>var</span> z/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; &hellip;A&hellip;
<span style='margin-left:2em'>&hellip;B&hellip;</span>
}
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='margin-left:2em'>push <span style='color:DarkSeaGreen'>ecx</span></span>
<span style='margin-left:2em'><span style='color:DarkSeaGreen'>ecx</span> &larr; copy y</span>
<span style='margin-left:2em'><span style='color:DarkSeaGreen'>ecx</span> &larr; &hellip;A&hellip;</span>
<span style='margin-left:2em'>&hellip;B&hellip;</span>
<span style='margin-left:2em'>pop to <span style='color:DarkSeaGreen'>ecx</span></span>
}
</pre>

<p>
To shadow, create a new inner block.
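
<p>
The shadow-then-clobber rule is simple enough to sketch in Python. This is a
simplified model under my own assumptions: all declarations come before the
rest of the block, and <tt>translate_block</tt> is a hypothetical name.

```python
def translate_block(declarations, body):
    """Translate one block's register variables to pseudo-instructions.

    declarations: list of (register, initializer) pairs, in source order.
    body: already-translated lines for the rest of the block.
    """
    spilled = []
    lines = []
    for register, initializer in declarations:
        if register not in spilled:
            # First variable in this register: save the previous value.
            spilled.append(register)
            lines.append("push " + register)
        # Later variables in the same register just clobber it.
        lines.append(register + " \u2190 " + initializer)
    lines.extend(body)
    # Restore spilled registers in reverse order of spilling.
    for register in reversed(spilled):
        lines.append("pop to " + register)
    return lines
```

<p>
Each register is pushed at most once per block, so a block never needs more
than one restore per register on exit.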

<p>
<hr style='width:20%; border:1px dotted #888'>

<p>
Early exits should also clean up any variables in a block.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='color:DarkSeaGreen; margin-left:2em'>var</span> x/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; copy y
<span style='color:DarkSeaGreen; margin-left:2em'>break-if</span> &hellip;A&hellip;
<span style='margin-left:2em'>&hellip;B&hellip;</span>
}
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='margin-left:2em'>push <span style='color:DarkSeaGreen'>ecx</span></span>
<span style='margin-left:2em'>ecx &larr; copy y</span>
<span style='margin-left:2em'>{</span>
<span style='color:DarkSeaGreen; margin-left:4em'>break-if</span> &hellip;A&hellip;
<span style='margin-left:4em'>&hellip;B&hellip;</span>
<span style='margin-left:2em'>}</span>
<span style='margin-left:2em'>pop to <span style='color:DarkSeaGreen'>ecx</span></span>
}
</pre>

<p>
Early returns are more complicated, because we may need to unwind multiple
blocks. The Mu translator is responsible for tracking the size of the stack
frame at any point, and updating the stack appropriately before returning.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='color:DarkSeaGreen; margin-left:2em'>var</span> x/<span style='color:DarkSeaGreen'>ecx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; copy y
<span style='color:DarkSeaGreen; margin-left:2em'>return-if</span> &hellip;A&hellip;
<span style='margin-left:2em'>&hellip;B&hellip;</span>
}
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
<span style='margin-left:2em'>push <span style='color:DarkSeaGreen'>ecx</span></span> <span style='color:blue; margin-left:1em'># spill register</span>
<span style='margin-left:2em'>ecx &larr; copy y</span>
<span style='margin-left:2em'>{</span>
<span style='color:DarkSeaGreen; margin-left:4em; font-weight:bold'>break-unless</span> &hellip;A&hellip;
<span style='margin-left:4em'>&laquo;increment <span style='color:DarkSeaGreen'>esp</span> appropriately&raquo;</span>
<span style='color:DarkSeaGreen; margin-left:4em'>return</span>
<span style='margin-left:2em'>}</span>
<span style='margin-left:2em'>&hellip;B&hellip;</span>
<span style='margin-left:2em'>pop to <span style='color:DarkSeaGreen'>ecx</span> <span style='color:blue; margin-left:1em'># single restore</span></span>
}
</pre>

<p>
<em>Wrap up</em>

<p>
Putting these ideas together, here's what factorial looks like in a bare-bones
but type-safe system programming language targeting x86 (supporting only
integers so far):

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:DarkSeaGreen'>fn</span> factorial n : int <span style='color:DarkSeaGreen'>&rarr;</span> result/<span style='color:DarkSeaGreen'>eax</span> : int {
<span style='color:blue; margin-left:2em'># if (n &lt;= 1) return 1</span>
<span style='margin-left:2em'>compare n, 1</span>
<span style='margin-left:2em'>{</span>
<span style='color:DarkSeaGreen; margin-left:4em'>break-if</span> &gt;
<span style='color:DarkSeaGreen; margin-left:4em'>return</span> 1
<span style='margin-left:2em'>}</span>
<span style='color:blue; margin-left:2em'># otherwise return n * factorial(n-1)</span>
<span style='margin-left:2em'>{</span>
<span style='color:DarkSeaGreen; margin-left:4em'>break-if</span> &lt;=
<span style='color:blue; margin-left:4em'># var tmp = n-1</span>
<span style='color:DarkSeaGreen; margin-left:4em'>var</span> tmp/<span style='color:DarkSeaGreen'>ebx</span> : <span style='color:DarkSeaGreen'>int</span> &larr; copy n
<span style='margin-left:4em'>decrement tmp</span>
<span style='color:blue; margin-left:4em'># return n * factorial(tmp)</span>
<span style='color:DarkSeaGreen; margin-left:4em'>var</span> tmp2/<span style='color:DarkSeaGreen'>eax</span> : <span style='color:DarkSeaGreen'>int</span> &larr; factorial tmp
<span style='margin-left:4em'>result &larr; multiply n, tmp2</span>
<span style='color:DarkSeaGreen; margin-left:4em'>return</span> result
<span style='margin-left:2em'>}</span>
}
</pre>

<p>
And here's the level-1 machine code I plan for it to be translated to:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
factorial:
<span style='color:blue; margin-left:2em'># function prologue</span>
<span style='margin-left:2em'>55/push-<span style='color:DarkSeaGreen'>ebp</span></span>
<span style='margin-left:2em'>89/&lt;- %<span style='color:DarkSeaGreen'>ebp</span> 4/r32/<span style='color:DarkSeaGreen'>esp</span></span>
<span style='color:blue; margin-left:2em'># if (n &lt;= 1) return 1</span>
<span style='margin-left:2em'>81 7/subop/compare *(<span style='color:DarkSeaGreen'>ebp</span>+8) 1/imm32</span>
<span style='margin-left:2em'>{</span>
<span style='margin-left:4em'>7f/jump-if-greater <span style='color:DarkSeaGreen'>break</span>/disp8</span>
<span style='margin-left:4em'>b8/copy-to-<span style='color:DarkSeaGreen'>eax</span> 1/imm32</span>
<span style='margin-left:4em'>e9/jump $factorial:end/disp32</span>
<span style='margin-left:2em'>}</span>
<span style='color:blue; margin-left:2em'># otherwise return n * factorial(n-1)</span>
<span style='margin-left:2em'>{</span>
<span style='margin-left:4em'>7e/jump-if-lesser-or-equal <span style='color:DarkSeaGreen'>break</span>/disp8</span>
<span style='color:blue; margin-left:4em'># var tmp = n-1</span>
<span style='margin-left:4em'>53/push-<span style='color:DarkSeaGreen'>ebx</span></span> <span style='color:grey; margin-left:1em'># spill</span>
<span style='margin-left:4em'>8b/-> *(<span style='color:DarkSeaGreen'>ebp</span>+8) 3/r32/<span style='color:DarkSeaGreen'>ebx</span></span>
<span style='margin-left:4em'>4b/decrement-<span style='color:DarkSeaGreen'>ebx</span></span>
<span style='color:blue; margin-left:4em'># return n * factorial(tmp)</span>
<span style='margin-left:4em'>(factorial %<span style='color:DarkSeaGreen'>ebx</span>)</span> <span style='color:blue; margin-left:1em'># &rarr; <span style='color:DarkSeaGreen'>eax</span></span>
<span style='margin-left:4em'>f7 4/subop/multiply-into-eax *(<span style='color:DarkSeaGreen'>ebp</span>+8)</span>
<span style='margin-left:4em'>5b/pop-to-<span style='color:DarkSeaGreen'>ebx</span></span> <span style='color:grey; margin-left:1em'># restore</span>
<span style='margin-left:4em'>e9/jump $factorial:end/disp32</span>
<span style='margin-left:2em'>}</span>
$factorial:end:
<span style='color:blue; margin-left:2em'># function epilogue</span>
<span style='margin-left:2em'>89/&lt;- %<span style='color:DarkSeaGreen'>esp</span> 5/r32/<span style='color:DarkSeaGreen'>ebp</span></span>
<span style='margin-left:2em'>5d/pop-to-<span style='color:DarkSeaGreen'>ebp</span></span>
<span style='margin-left:2em'>c3/return</span>
</pre>

<p>
This isn't as efficient as what I wrote by hand at the bottom of <a href='/post/mu-2019-1'>the
previous post</a>, but it seems like a realistic translation of the design in this post.

<p>
Ok, time to get to work building it. I'd love to hear suggestions or feedback
based on this design, either in the comments below or <a href='mailto:ak@akkartik.com'>over
email</a>.

<p>
<div class='btw'>(Thanks Garth Goldwater, <a href='https://twitter.com/vladimir_vg'>Vladimir Gordeev</a>,
<a href='https://twitter.com/CodingFiend'>Edward de Jong</a>, <a href='https://breckyunits.com'>Breck
Yunits</a>, <a href='http://boomla.com'>Tibor Halter</a>, <a href='https://github.com/rdentato'>Remo
Dentato</a> and <a href='http://www.federicopereiro.com'>Federico Pereiro</a>
for helpful feedback on drafts of this post.)</div>

<p>
<em>comments</em>

<p>
<ul><div class="comment">
&nbsp;&nbsp; <li><a name="ccfedb5e9583df1a16c5422635c1b9ffd675e7f2a076f7985eb37911ea17cd64"></a>Peter Van Sandt, 2019-11-22: Interesting to see something that is even more a nicer version of codegen for assembly. I also see how optimizers make programmers less able to reason about the output assembly. Couple of thoughts:

<p>
1. Have you considered basing this on SSA or LLVM IR rather than x86 ASM? It might be interesting to see what that would look like if there were a way to write LLVM IR that was designed for humans to write.

<p>
2. Mathematical equivalences and fancy bitmasking are pretty hard for humans to come up with for every place that they're needed, and without an optimizer, you lose that. To keep the 1-1 equivalence of code with output, what do you think about an optimizing linter? Something that could complain about areas where your code could be optimized, but not actually do it for you.
&nbsp;&nbsp; <ul><div class="comment">
&nbsp;&nbsp;   <li><a name="31fc1decb6f99cd804f30275af0bb9a17f4d1240fbb2c6442c215eae870d6044"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2019-11-22: I like both ideas! An optimizing linter sounds like a <em>great</em> idea, because it goes with the grain of this project: to keep humans in the loop, to encourage them to understand details, and to reward curiosity.

<p>
I'd be curious to hear what about LLVM IR you find to be hard for humans to write. It seemed fairly clean when I looked at it. The problem is not so much with the IR syntax itself but with all the myriad design decisions in the implementation that assume it's going to be just compilers emitting the IR.
&nbsp;&nbsp;   <ul><div class="comment">
&nbsp;&nbsp;     <li><a name="dc758ee01d298c288b5455a0b12c69dbea258ad628774bd45396269cb7283214"></a>Anonymous, 2019-12-09: An optimizing linter has the problem of being destructive.  It goes like this:

<p>
The programmer will write his or her program in a readable way.  They'll run it through the compiler, which points out that something can be optimized, the programmer—having already gone through the process of writing the first implementation with all its constraints and other ins and outs fresh in their mind—will slap their head and mutter "of course!", and then replace the original naive implementation with one based on the notes the compiler has given.  Chances are high that the result will be less comprehensible to other programmers who come along—or even to the same programmer revisiting their own code 6 months later.

<p>
What you really want is something like Knuth's "interactive program-manipulation system". <a href='https://cr.yp.to/qhasm/literature.html'>https://cr.yp.to/qhasm/literature.html</a>

<p>
The first use case could be doing away with inline concerns that `factorial`'s `tmp2` is a register variable.  The original program P would elide storage class annotation altogether (which by default would be understood that it resides on the stack), and then your "transformations that make it efficient" would include specifying that `tmp2` can/should actually live in eax.

<p>
I've had a draft post on this topic for a while and just now dumped it on keybase. <a href='https://crussell.keybase.pub/drafts/optimization-after-the-fact.markdown?text=1'>https://crussell.keybase.pub/drafts/optimization-after-the-fact.markdown?text=1</a>

<p>
Also AIUI, "LLVM IR" doesn't actually exist.  It's not stable.
&nbsp;&nbsp;     <ul><div class="comment">
&nbsp;&nbsp;       <li><a name="e7863cb7a4b38124b5d5235a5c1f00adaa84ffce9aa193343b1390af63afa9f5"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2019-12-09: Very interesting draft, thanks. I ended up <a href='https://news.ycombinator.com/favorites?id=akkartik'>favoriting</a> both <a href='https://news.ycombinator.com/item?id=13440324#13442157'>the Animats comment you linked to</a> as well as its parent thread on Dafny.

<p>
Even though I find it more interesting, I don't follow all the intricacies of your verification-side argument (and theirs); I don't have much background in verification yet. But I spent some time thinking about the rewrite system you describe with similarities to Literate Programming and Aspect-Oriented Programming. I haven't been able to come up with a good way to separate a declarative 'spec' from optimizations using these techniques. This is why I'm currently using the hackier approach of a) having lots of tests, and b) gradually building up a complex system across multiple layers. My layers often end up repeating the code of an earlier layer but with greater elaboration. While the repetition is distasteful, it does allow the reader to start with a simpler version at an earlier layer and then separately see how it's optimized. And the tests ensure that the new version continues to pass old constraints in addition to new ones defined alongside it.

<p>
Maybe someday I or someone else will be able to replace many of Mu's tests with a verification system of some sort. But for now it's been hard enough to just support these hacks while bootstrapping from machine code.
</div></ul>
</div></ul>
</div></ul>
</div></ul>
]]></description>
    </item>
    <item>
      <title>Mu: A minimal hobbyist computing stack</title>
      <link>http://akkartik.name/post/mu-2019-1</link>
      <pubDate>Mon, 14 Oct 2019 15:13:21 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/mu-2019-1</guid>
      <description><![CDATA[
<p>
<div style='margin-left:3em'>
<span class='left_quote_char'>&ldquo;</span><em>It is far better to have an
under-featured product that is rock solid, fast, and small than one that
covers what an expert would consider the complete requirements.&rdquo;</em>
<br>&mdash; <a href='https://en.wikipedia.org/wiki/Richard_P._Gabriel'>Richard
Gabriel</a>'s <a href='https://www.dreamsongs.com/Files/PatternsOfSoftware.pdf'>best
summary</a> (pg 219) of his essay, <a href='https://www.dreamsongs.com/RiseOfWorseIsBetter.html'>&ldquo;Worse
is Better&rdquo;</a>
</div>

<p>
Over the past year I've been working on a minimal-dependency hobbyist computing
stack (everything above the processor) called <a href='https://github.com/akkartik/mu#readme'>Mu</a>.
The goal is to:
<ol>
<li> build up infrastructure people can enjoy programming on,
<li> using as little code as possible, so that people can also hack on the
underpinnings, modifying them to suit diverse desires.
</ol>

<p>
<!-- more -->

<p>
Conventional stacks kinda support 1 if you squint, but they punt on 2, so it
can take years to understand just one piece of infrastructure (like the C
compiler). It looks like nobody understands the entire stack anymore. I'd like
Mu to be a stack a single person can hold in their head all at once, and modify
in radical ways.

<p>
(While it should support understanding everything, you aren't expected to
understand everything. Mu tries to reward curiosity, but should get out of the
way when you're just trying to get something done.)

<p>
One implication of fitting in a single head: Mu constrains the number of
supported languages. Languages have a way of growing into isolated universes,
and interoperation between languages adds its own complexities. It seems
better for future readers if the stack minimizes the number of such boundaries,
even if writers are inconvenienced somewhat.

<p>
I eventually want to make it easy to swap out one kind of language for
another, so that people can program in Lisp-like or Python-like or C-like
syntax according to their taste. But regardless of which notations any given Mu
computer has, it will be parsimonious in the number of <a href='https://www.jsoftware.com/papers/tot.htm'>notations</a>
readers have to learn to understand and take ownership of a single computer.

<p>
Each of these notations needs to be simple enough that it can be implemented
out of lower-level notations. The fact that C compilers are written in C
contributes a lot of the complexity that makes compilers black magic to most
people.

<p>
A year in, it's surprising how inevitable the design has seemed. If the
journey starts at a specific processor architecture and the goal is to
minimize layers of translation above it while paying attention to dependencies,
the final destination seems quite inevitable (ignoring minor syntactic
choices). This discovery seems worth sharing more broadly, if only so others
can prove me wrong and suggest alternative designs.

<p>
<em>Outline</em>

<p>
Since Mu looks quite different from conventional stacks, I need a few blog
posts to cover all the ground. My plan is to give Mu just 3 distinct languages
(only the first currently exists; the third is not even planned yet).

<p>
<ul>
<li> Level 1: just the processor's instruction set with some syntax sugar
<li> Level 2: a memory-safe language
<li> Level 3: an expressive high-level language
</ul>

<p>
These levels don't quite map to familiar terms. Level 1 has some attributes of
machine code and Assembly. Level 2 has some attributes of Assembly and very
high-level languages.

<p>
An alternative view of this stack is <a href='/parnas.pdf'>the
primary affordance each level provides</a>:

<p>
<ul>
<li> Level 1: structured control flow
<li> Level 2: strong typing
<li> Level 3: mould the computer to how people think
</ul>

<p>
Each of these affordances requires a global change to the programming model.
Within each level I try to keep things as <a href='/post/habitability'>habitable</a> as
possible using just local syntax sugar (tools that don't need to understand
the entire codebase).

<p>
The rest of this post focuses on level 1. <a href='/post/mu-2019-2'>The
sequel</a> describes a preliminary design for level 2. I won't get to level 3
until I finish building level 2. So far I've written 40k lines of code (LoC)
in level 1, and 30k LoC in <a href='/post/mu'>an earlier prototype</a> that
has some resemblance to level 2. Since I want each level to be habitable
in isolation, it seems like a good idea to write a decent amount of code in it
before moving on to higher levels. (This is a big difference from <a href='http://web.archive.org/web/20061108010907/http://www.rano.org/bcompiler.html'>past</a>
<a href='https://github.com/kragen/stoneknifeforth'>approaches</a> to <a href='http://git.savannah.nongnu.org/cgit/stage0.git/tree/README'>bootstrapping</a>,
which try to do as little as possible in each level before jumping to higher
levels. The lower levels inevitably end up being hard for newcomers to
understand or interactively take apart.)

<p>
<em>Level 1: SubX</em>

<p>
At the lowest level my goal is to pick a processor and make the experience of
programming in raw machine code not completely suck.

<p>
I mentioned above that each level builds on levels below, but that's a lie at
this level. I don't enjoy editing binary and don't want my readers to have to
either. Instead I have an alternative plan, with two prongs:

<p>
<ul>
<li> A C++ translator that converts a textual format to binary.
<li> A self-hosted translator that converts the textual format to binary.
</ul>

<p>
The C++ translator is more familiar, but pulls in some gigantic dependencies.
The self-hosted translator will look strange but have a tiny codebase and a
tiny surface area (just 3 OS syscalls, for example). Both translators will
emit <em>identical</em> binaries, which is helpful when debugging the system.

<p>
Complete details for how to operate these tools are in the <a href='https://github.com/akkartik/mu#readme'>Readme</a>.
This post is a quick tour of the major design choices, but if it's interesting
you should clone the repo and read the Readme in a text editor. Mu is really
intended to be <a href='/post/comprehension'>played with interactively, rather
than passively read</a>.

<p>
Ok, design choices. My computer runs x86, and it's the most open platform I
have. So Mu is going to run on x86. It is explicitly not designed to be
portable. So we'll start with a quick tour of x86. Here's what you need to
know.

<p>
The x86 instruction set is 40 years old. It started out as a 16-bit processor,
then turned into a 32-bit and now 64-bit processor. It has accumulated
hacks and bolted-on features over its history. To help it fit in my head, I've
chosen a <a href='https://github.com/akkartik/mu/blob/master/subx_opcodes'>fairly
regular subset</a> of x86 for SubX (hence the name), focusing on just 32-bit
values (and a couple of instructions for 8-bit bytes so that I can iterate
over strings).

<p>
<em>Instructions</em>

<p>
Machine code for any processor consists of linear sequences of instructions.
All the nested block structure and nested calls of higher-level languages are
gone by this point. All you have is long lists of instructions with the
ability to conditionally skip some subsequences and run some subsequences
repeatedly. (Really, instructions are just numbers, so all you have are long
sequences of numbers.)

<p>
Every instruction in machine code starts with an <em>opcode</em>, some series
of bits that specifies which instruction to run. Assembly languages then
provide friendly names for opcodes like `<tt>add</tt>` and `<tt>jump</tt>`. In
a complex instruction set like for the x86 processor, `<tt>add</tt>` maps to
<a href='https://c9x.me/x86/html/file_module_x86_id_5.html'>a whole family of
opcodes</a>, and there's significant logic to deduce the opcode for an
instruction based on its arguments. It also takes a significant amount of code
to provide good error messages. There are many reasonable-seeming combinations
of operands that x86 doesn't let you add together, and it's hard to compress
that knowledge into a short error message tailored to a specific situation.

<p>
SubX doesn't provide names for opcodes; you have to use the raw opcodes
directly. This eliminates code for translating names to opcodes, and it also
enormously simplifies error messages. The programmer needs to work from the
list of opcodes, and each error needs to only handle a single case. In
practice, this hasn't felt like a major hindrance, because understanding error
messages returned by Assemblers often requires understanding the underlying
opcodes anyway. I've been finding good error messages to be more valuable than
syntactic conveniences. (I'd love <a href='mailto:mu@akkartik.com'>feedback</a>
on this decision.)

<p>
To see the complete list of opcodes supported by SubX at any point in time,
type:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:#888'>$</span> bootstrap help opcodes
</pre>

<p>
<em>Operands</em>

<p>
Each instruction operates on some small number of operands. In x86, instructions
can't take more than 2 operands for the most part. Since there are only two
operands, and since binary operations like addition are fundamental, most
instructions both read from and write to one of their operands:

<p>
<ul>
<li> Add operand o1 <em>to</em> o2;
<li> subtract o1 <em>from</em> o2;
<li> compute the bitwise `<tt>AND</tt>` of o1 and o2, and store the result in
o2;
<li> &hellip;and so on.
</ul>

<p>
Operands may be stored in either <em>registers</em> or memory. Registers are
some small number of named (really numbered) locations. The x86 processor has
8:

<p>
<ul>
<li> <span style='color:DarkSeaGreen'>eax</span> (0)
<li> <span style='color:DarkSeaGreen'>ecx</span> (1)
<li> <span style='color:DarkSeaGreen'>edx</span> (2)
<li> <span style='color:DarkSeaGreen'>ebx</span> (3)
<li> <span style='color:DarkSeaGreen'>esp</span> (4)
<li> <span style='color:DarkSeaGreen'>ebp</span> (5)
<li> <span style='color:DarkSeaGreen'>esi</span> (6)
<li> <span style='color:DarkSeaGreen'>edi</span> (7)
</ul>

<p>
Operands in memory are also usually specified using registers somehow:

<p>
<ul>
<li> `*<span style='color:DarkSeaGreen'>eax</span>`: value in memory at address provided in <span style='color:DarkSeaGreen'>eax</span>
<li> `*(<span style='color:DarkSeaGreen'>ecx</span>+4)`: value in memory at address <span style='color:DarkSeaGreen'>ecx</span>+4
<li> &hellip;and so on.
</ul>

<p>
All this gets turned into numbers, but we don't need the details for this
post. Consult the Readme for details.

<p>
<em>Error-checking</em>

<p>
One big problem with programming in raw x86 machine code is that instructions are
not all the same size. Instruction boundaries aren't aligned to, say, every 4
bytes. The processor is <em>parsing</em> future instruction
boundaries as it reads in earlier instructions. It's really easy to accidentally
add an extra byte to an instruction or leave one out. When that happens, bytes
that were intended to be opcodes can be interpreted as operands, and <em>vice
versa</em>. The program goes silently off the rails, and may not show an error
message until much later. The difficulty of debugging such errors is arguably
the single biggest obstacle to a good programming experience.
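<p>
To make the hazard concrete, here's a toy sketch in Python rather than SubX.
The two-opcode instruction set is invented for illustration and is not real
x86; the only property it shares with x86 is that the opcode determines how
many operand bytes follow.

```python
# Toy instruction set, invented for this sketch (not real x86):
#   opcode 0x01 is followed by one operand byte
#   opcode 0x02 is followed by two operand bytes
OPERAND_BYTES = {0x01: 1, 0x02: 2}

def decode(code):
    """Split a byte sequence into (opcode, operands) pairs, left to right."""
    instructions = []
    i = 0
    while i < len(code):
        opcode = code[i]
        n = OPERAND_BYTES[opcode]  # raises KeyError on an unknown opcode
        instructions.append((opcode, tuple(code[i+1:i+1+n])))
        i += 1 + n
    return instructions

good = [0x02, 0x01, 0x07, 0x01, 0x02]  # intended: 02 (01 07), then 01 (02)
bad = [0x02, 0x01, 0x01, 0x02]         # same program with the 07 forgotten

print(decode(good))  # [(2, (1, 7)), (1, (2,))]
print(decode(bad))   # [(2, (1, 1)), (2, ())]
```

<p>
In the second program the byte intended as opcode 01 is swallowed as an
operand, and the byte intended as an operand is decoded as opcode 02. Nothing
complains at decode time.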

<p>
This is the point at which Assembly languages nope out and go build a whole new
syntax for themselves, categorizing opcodes into instruction names and so on.
Since SubX needs to be self-hosted and every feature in it needs to be
programmed in SubX, its solution to this problem is to stay with machine code,
but add lots of space for <em>metadata</em>. Here's a sample instruction in
SubX:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
89/&lt;- 5/rm32/<span style='color:DarkSeaGreen'>ebp</span> 4/r32/<span style='color:DarkSeaGreen'>esp</span> <span style='color:blue; margin-left:2em'># copy <span style='color:DarkSeaGreen'>esp</span> to <span style='color:DarkSeaGreen'>ebp</span></span> 
</pre>

<p>
Instructions lie all on one line. They consist of multiple words separated by
whitespace. Each word contains a <em>datum</em> until the first slash. The
datum is the only part that makes it into the binary. Everything in a word
after the datum and the first slash is metadata. As you see, you can have
multiple bits of metadata squirrelled away in a word, all separated by slashes.
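<p>
As a rough illustration of the word format (in Python, not SubX), the split
between datum and metadata can be sketched like this. The function names
`<tt>parse_word</tt>` and `<tt>data_of</tt>` are made up for this sketch, and
it only collects the data; it doesn't pack them into bytes the way a real
translator has to.

```python
def parse_word(word):
    """Split a SubX-style word into its datum and a list of metadata."""
    datum, *metadata = word.split('/')
    return datum, metadata

def data_of(instruction):
    """Keep only the datum of each word, ignoring any trailing '#' comment."""
    code = instruction.split('#', 1)[0]
    return [parse_word(word)[0] for word in code.split()]

print(parse_word('5/rm32/ebp'))           # ('5', ['rm32', 'ebp'])
print(data_of('b8/copy-to-eax 1/imm32'))  # ['b8', '1']
```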

<p>
Metadata is optional and often ignored, so it's a good place for little
comments. However, certain special words for <em>argument types</em> trigger
error-checking. In the above instruction, Mu ignores the `<tt>/&lt;-</tt>` and
reads just the &lsquo;89&rsquo;. However, it knows that opcode 89 expects a
`<tt>/rm32</tt>` and `<tt>/r32</tt>` argument. If it fails to see one of them
in the rest of the instruction, it immediately flags an error. If it sees any
of the other known argument types in the instruction when it doesn't expect
them, it immediately flags an error. (The register names `/<span style='color:DarkSeaGreen'>ebp</span>`
and `/<span style='color:DarkSeaGreen'>esp</span>` are just comments to aid the
reader.)
reader.)

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
<span style='color:#888'>$</span> bootstrap translate init.linux examples/ex1.subx -o a.elf
'bb/copy-to-ebx' (copy imm32 to EBX): missing imm32 operand
</pre>

<p>
(The code samples in this post hide some details that aren't important on a
first encounter with Mu. I'll elide the actual opcodes further down.)

<p>
While this use of metadata is specific to the properties of x86, metadata is a
general mechanism for lots of different checks one may want to apply. I used
it in several ways in <a href='https://github.com/akkartik/mu1'>a previous
prototype</a>. I've started to think of the structure of words and metadata as
&ldquo;s-statements&rdquo;, by analogy with s-expressions: a fairly fundamental
uniform syntax that can be used in diverse situations where one doesn't want
arbitrary nesting.

<p>
<em>Structured programming</em>

<p>
As I mentioned above, the primary motivation for the SubX layer is to make
working with control flow more ergonomic. The x86 processor tracks what
instruction to execute using the special <span style='color:DarkSeaGreen'>eip</span>
register. This register can't be modified by most instructions. Only jumps and
calls modify it.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
jump 237  <span style='color:blue; margin-left:10em'># add 237 to <span style='color:DarkSeaGreen'>eip</span></span> 
</pre>

<p>
It gets tedious to adjust the byte offsets every time you add or delete an
instruction. So SubX provides named <em>labels</em> for specific points in the
instruction stream, just like conventional Assembly languages. For example:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
jump-if-equal $foo
&hellip;
$foo:  <span style='color:blue; margin-left:12em'># jump to here</span> 
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
jump-if-equal 237
</pre>

<p>
Another convenience (not usually found in Assembly language) is a set of special
labels: `<tt>{</tt>` and `<tt>}</tt>`, `<tt>break</tt>` and `<tt>loop</tt>`.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
{
&nbsp;  jump-if-equal break
&nbsp;  &hellip;
&nbsp;  jump loop
}
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
$loop1:
&nbsp;  jump-if-equal $break1
&nbsp;  &hellip;
&nbsp;  jump $loop1
$break1:
</pre>

<p>
Basically the `<tt>{</tt>` and `<tt>}</tt>` get translated into labels, and
`<tt>break</tt>` gets translated into a jump to the enclosing `<tt>}</tt>`.
Correspondingly, `<tt>loop</tt>` gets translated to a jump to the enclosing
(earlier) `<tt>{</tt>`. This syntax is surprisingly ergonomic and
<a href='/post/mu'>proved surprisingly easy to teach to non-programmers during
the Mu1 prototype</a>. Where you would say, in an imperative language:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
if (&lt;predicate&gt;) {
&nbsp;  &hellip;
}
</pre>

<p>
in SubX you would say:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
&lt;flag&gt; = &lt;predicate&gt;
{
&nbsp;  jump-unless &lt;flag&gt; break
&nbsp;  &hellip;
}
</pre>

<p>
<em>Functions</em>

<p>
Idiomatic machine code programs consist of <em>functions</em> operating on
<em>data</em>. No matter what your high-level language does, and no matter
what processor it runs on, at bottom it's translated to some interplay between
functions and data. This section gets into that interplay in some detail.

<p>
We keep a program's functions and data segregated in memory into a <em>code
segment</em> and <em>data segment</em>, respectively. (There are a variety of
historic reasons for this that are not interesting. There is also one currently
topical reason that is very interesting: security. A segment is a contiguous
block of memory that gets a single access restriction. Code segments can be
executed from but not written to (after the OS loads them initially). Data
segments can be written but not executed. This <a href='https://en.wikipedia.org/wiki/W%5EX'>W^X
constraint</a> is a critical pillar for securing computers; bad things happen
when programs can be induced to modify their own code.)

<p>
In addition to the data segment a program starts out with, it can request
various segments of empty working memory. A crucial one needed for functions
to work is the <em>stack</em>. Stacks help keep local variables in functions
isolated from each other. In particular, they're necessary for recursive
functions that call themselves either directly or indirectly. Each call must
get new copies of locals that don't interfere with other calls.

<p>
Here's one way function calls can work (used by many modern platforms): Before
making a call the caller pushes arguments on the stack. The function can now
access them from the stack. After it returns the caller pops the arguments off
the stack to clean up. Recursive calls get separate <em>frames</em> on the
stack.

<p>
Since the stack is temporary space that gets cleaned up after a function
returns, it's also an ideal place for local variables. Just push stuff on the
stack and make sure to clean it up before you return. The caller is none the
wiser.

<p>
Managing arguments and local variables does require each function to know
precisely where they live. One unambiguous way to specify arguments and local
variables (again used by many modern platforms) is as offsets off of a special
<em>stack pointer</em> register (<span style='color:DarkSeaGreen'>esp</span>
above). The x86 instruction set provides `<tt>push</tt>` instructions that
automatically decrement <span style='color:DarkSeaGreen'>esp</span>, and
`<tt>pop</tt>` instructions that automatically increment <span style='color:DarkSeaGreen'>esp</span>.
(The stack grows downward.)
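<p>
To see why arguments end up at fixed offsets, here's a toy model in Python
(not SubX). The `<tt>Stack</tt>` class, the starting address and the
placeholder values are all made up for this sketch. It relies on two facts
about x86: `<tt>call</tt>` pushes a return address, and the conventional
function prologue pushes the caller's <span style='color:DarkSeaGreen'>ebp</span>
before copying <span style='color:DarkSeaGreen'>esp</span> into it.

```python
class Stack:
    """A toy stack of 4-byte slots that grows toward lower addresses."""
    def __init__(self, top):
        self.esp = top      # address of the most recently pushed slot
        self.memory = {}    # address -> value

    def push(self, value):
        self.esp -= 4       # push decrements esp, then stores
        self.memory[self.esp] = value

    def pop(self):
        value = self.memory.pop(self.esp)
        self.esp += 4       # pop loads, then increments esp
        return value

stack = Stack(top=0x1000)
stack.push(5)        # caller pushes the argument n
stack.push(0x1234)   # 'call' pushes a return address (placeholder value)
stack.push(0)        # prologue: push the caller's ebp (placeholder value)
ebp = stack.esp      # prologue: copy esp to ebp
print(stack.memory[ebp + 8])  # 5
```

<p>
With one argument, it always sits 8 bytes above
<span style='color:DarkSeaGreen'>ebp</span>: 4 bytes for the saved
<span style='color:DarkSeaGreen'>ebp</span> and 4 for the return address.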

<p>
Bottom line: calls in SubX consist by convention of some number of `<tt>push</tt>`
instructions, one `<tt>call</tt>` instruction, and some cleanup of the stack.
SubX provides the following syntax sugar:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
(f o1 o2 o3)
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
push o3
push o2
push o1
call f
add 12 to <span style='color:DarkSeaGreen'>esp</span> <span style='color:blue; margin-left:1em'># pop 3 args * 4 bytes each</span>
</pre>
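<p>
The expansion is mechanical enough to sketch in a few lines of Python (not
SubX). The function name `<tt>expand_call</tt>` is made up, the output reuses
the informal notation above rather than real opcodes, and arguments containing
spaces (such as string literals) aren't handled.

```python
def expand_call(line):
    """Expand '(f o1 o2 ...)' into pushes, a call, and stack cleanup."""
    name, *args = line.strip()[1:-1].split()
    output = [f'push {arg}' for arg in reversed(args)]  # last argument first
    output.append(f'call {name}')
    if args:
        output.append(f'add {4 * len(args)} to esp')    # 4 bytes per argument
    return output

print(expand_call('(f o1 o2 o3)'))
# ['push o3', 'push o2', 'push o1', 'call f', 'add 12 to esp']
```

<p>
Pushing in reverse order leaves the first argument closest to the top of the
stack, so its offset doesn't depend on how many arguments follow it.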

<p>
<em>Strings</em>

<p>
Strings (arrays of bytes, ignoring character encodings) are a workhorse of
high-level languages, but Assembly doesn't make them convenient to deal with.
Programming in SubX involves writing lots of automated tests, and tests are
most useful when they give good error messages. So passing strings into
functions is a crucial mechanism. SubX allows string literals. When it
encounters one, it appends the new literal to the data segment and replaces it
with its label in the code segment.

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
&nbsp;== code
(f o1 o2 "foo")

&nbsp;== data
&hellip;<em>data</em>&hellip;
</pre>

<p>
&#8658;

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
&nbsp;== code
(f o1 o2 $string1)

&nbsp;== data
&hellip;<em>data</em>&hellip;
$string1:
&nbsp;  66 6f 6f  <span style='color:blue; margin-left:1em'># UTF-8 for &lsquo;f&rsquo; &lsquo;o&rsquo; &lsquo;o&rsquo;</span>
</pre>
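<p>
Here's a rough Python sketch of hoisting string literals into the data
segment, mirroring the simplified picture above. The function name
`<tt>hoist_strings</tt>` is made up, escape sequences inside strings aren't
handled, and whatever layout the real translator uses for strings in memory is
left out.

```python
import re

def hoist_strings(code_lines):
    """Replace string literals with labels; return (code, data) line lists."""
    data, count = [], 0

    def replace(match):
        nonlocal count
        count += 1
        label = f'$string{count}'
        data.append(f'{label}:')
        # one hex byte per byte of the UTF-8 encoding
        data.append(' '.join(f'{b:02x}' for b in match.group(1).encode('utf-8')))
        return label

    code = [re.sub(r'"([^"]*)"', replace, line) for line in code_lines]
    return code, data

print(hoist_strings(['(f o1 o2 "foo")']))
# (['(f o1 o2 $string1)'], ['$string1:', '66 6f 6f'])
```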

<p>
<em>Tests</em>

<p>
One of the ways I'm able to sling large programs at such a low level is by
writing lots of automated tests. A test harness isn't a common sight in
Assembly programming. SubX adds a single new mechanism that makes testing
ergonomic: when emitting a binary it generates a new function called `run-tests`
that calls every function in the program that starts with &lsquo;test-&rsquo;.
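<p>
As a sketch of how little machinery this takes (in Python, not SubX): scan for
labels, keep the ones starting with &lsquo;test-&rsquo;, and emit one call per
test. The function name `<tt>generate_run_tests</tt>` and the informal
`<tt>call</tt>`/`<tt>return</tt>` notation in the output are assumptions of
this sketch.

```python
def generate_run_tests(lines):
    """Build a 'run-tests' function that calls every 'test-' function."""
    labels = [line[:-1] for line in lines if line.endswith(':')]
    tests = [name for name in labels if name.startswith('test-')]
    return ['run-tests:'] + [f'  call {name}' for name in tests] + ['  return']

program = [
    'factorial:',
    '  c3/return',
    'test-factorial:',
    '  (factorial 5)',
    '  c3/return',
]
print(generate_run_tests(program))
# ['run-tests:', '  call test-factorial', '  return']
```

<p>
There's no registration step for tests; naming a function
&lsquo;test-&hellip;&rsquo; is all it takes.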

<p>
Putting all this syntax sugar together, here's a SubX function to compute the
factorial of a number, along with one automated test:

<p>
<pre style='margin-left:2em; font-family:courier,fixed'>
factorial: <span style='color:blue; margin-left:1em'># n : int &rarr; result/<span style='color:DarkSeaGreen'>eax</span> : int</span>
<span style='color:blue; margin-left:2em'># function prologue</span>
<span style='margin-left:2em'>55/push-<span style='color:DarkSeaGreen'>ebp</span></span>
<span style='margin-left:2em'>89/&lt;- <span style='color:DarkSeaGreen'>ebp</span> 4/r32/<span style='color:DarkSeaGreen'>esp</span></span>
<span style='margin-left:2em'><span style='color:blue'># save registers</span></span>
<span style='margin-left:2em'>53/push-<span style='color:DarkSeaGreen'>ebx</span></span>
<span style='color:blue; margin-left:2em'># if (n &lt;= 1) return 1</span>
<span style='margin-left:2em'>81 7/subop/compare *(<span style='color:DarkSeaGreen'>ebp</span>+8) 1/imm32</span> <span style='color:blue; margin-left:1em'># n is at *(<span style='color:DarkSeaGreen'>ebp</span>+8)</span>
<span style='margin-left:2em'>{</span>
<span style='margin-left:4em'>7f/jump-if-greater <span style='color:DarkSeaGreen'>break</span>/disp8</span>
<span style='margin-left:4em'>b8/copy-to-<span style='color:DarkSeaGreen'>eax</span> 1/imm32</span>
<span style='margin-left:2em'>}</span>
<span style='color:blue; margin-left:2em'># otherwise return n * factorial(n-1)</span>
<span style='margin-left:2em'>{</span>
<span style='margin-left:4em'>7e/jump-if-lesser-or-equal <span style='color:DarkSeaGreen'>break</span>/disp8</span>
<span style='color:blue; margin-left:4em'># <span style='color:DarkSeaGreen'>ebx</span> = n-1</span>
<span style='margin-left:4em'>8b/-> *(<span style='color:DarkSeaGreen'>ebp</span>+8) 3/r32/<span style='color:DarkSeaGreen'>ebx</span></span>
<span style='margin-left:4em'>4b/decrement-<span style='color:DarkSeaGreen'>ebx</span></span>
<span style='color:blue; margin-left:4em'># return n * factorial(ebx)</span>
<span style='margin-left:4em'>(factorial <span style='color:DarkSeaGreen'>ebx</span>)</span> <span style='color:blue; margin-left:1em'># &rarr; <span style='color:DarkSeaGreen'>eax</span></span>
<span style='margin-left:4em'>f7 4/subop/multiply-into-eax *(<span style='color:DarkSeaGreen'>ebp</span>+8)</span>
<span style='margin-left:2em'>}</span>
<span style='margin-left:2em'><span style='color:blue'># restore registers</span></span>
<span style='margin-left:2em'>5b/pop-to-<span style='color:DarkSeaGreen'>ebx</span></span>
<span style='color:blue; margin-left:2em'># function epilogue</span>
<span style='margin-left:2em'>89/&lt;- <span style='color:DarkSeaGreen'>esp</span> 5/r32/<span style='color:DarkSeaGreen'>ebp</span></span>
<span style='margin-left:2em'>5d/pop-to-<span style='color:DarkSeaGreen'>ebp</span></span>
<span style='margin-left:2em'>c3/return</span>

test-factorial:
<span style='margin-left:2em'>(factorial 5)</span>
<span style='margin-left:2em'>(check-ints-equal 120 <span style='color:DarkSeaGreen'>eax</span> <span style='color:darkCyan'>"failure: factorial(5)"</span>)</span>
<span style='margin-left:2em'>c3/return</span>
</pre>

<p>
(Ignore all the magic numbers for opcodes; just trust that &lsquo;5b&rsquo;
means &lsquo;pop to <span style='color:DarkSeaGreen'>ebx</span>&rsquo;, that
&lsquo;5/r32&rsquo; means &lsquo;<span style='color:DarkSeaGreen'>ebp</span>&rsquo;,
and so on.)
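
<p>
For readers more at home in C, the control flow of the listing above can be
paraphrased roughly as follows. This is only a sketch for orientation, not
something generated from the SubX source: the variable names <tt>result</tt>
and <tt>m</tt> are made up here to stand in for the registers, and the two
<tt>if</tt> statements mirror the two <tt>{ }</tt> blocks, each of which is
skipped by its leading conditional jump.

```c
#include <assert.h>

/* Hypothetical C paraphrase of the SubX factorial above. */
int factorial(int n) {
    int result = 0;           /* stands in for eax */
    if (n <= 1) {             /* first block: skipped when n > 1 */
        result = 1;
    }
    if (n > 1) {              /* second block: skipped when n <= 1 */
        int m = n - 1;        /* stands in for ebx */
        result = factorial(m);
        result = result * n;  /* multiply eax by n, which lives at *(ebp+8) */
    }
    return result;
}

/* Counterpart of test-factorial in the listing. */
void test_factorial(void) {
    assert(factorial(5) == 120);
}
```

<p>
Calling <tt>test_factorial()</tt> from a <tt>main</tt> function exercises the
same check as the SubX test.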

<p>
<em>Summary</em>

<p>
SubX is designed to be easy to self-host, so it mixes and matches features from
machine code, conventional Assembly and higher-level programming languages:

<p>
<ul>
<li> No mnemonics; programmer must provide all numeric opcodes and operands
directly.
<li> Error-checking using metadata.
<li> Syntax sugar for function calls and expressions like <tt>*(<span style='color:DarkSeaGreen'>ebp</span>+8)</tt>.
<li> Literal strings.
<li> Automated test harness.
</ul>

<p>
The combination of these mechanisms has been ergonomic enough that I've written
40k LoC in SubX over the past year. I've successfully gotten SubX bootstrapped
in itself. I've built an emulator for SubX that can emit traces of instructions
executed, and this trace provides me with <a href='https://github.com/akkartik/mu/blob/master/browse_trace/Readme.md'>a
time-travel debugging experience</a>. I can also package up SubX programs with
a third-party OS kernel into bootable disk images that run natively or <a
href='/post/iso-on-linode'>on a cloud server</a>. In spite of these milestones,
the syntax above is still (obviously) not entirely copacetic. In <a href='/post/mu-2019-2'>the
next post</a> I describe my attempts to design the next level up, with strong
type-safety and memory-safety. Rather to my surprise, the design process
continues to seem inevitable.

<p>
<div class='btw'>(Thanks Garth Goldwater and <a href='https://github.com/rdentato'>Remo
Dentato</a> for helpful feedback on drafts of this post.)</div>

<p>
<em>comments</em>

<p>
<ul><div class="comment">
&nbsp;&nbsp; <li><a name="ae91bb23700b5ff98b6371f8de45012a635e4e9af1cb3bf734f5d74b7f468050"></a>Anonymous, 2019-10-18: I am unclear on why the actual opcodes themselves were retained in code that will obviously need to be fed through an assembler to produce a binary.  Why not elide them and simply create a more ergonomic assembly language?  Is the concern that the complexity of the assembler would threaten habitability more than the foreign-seeming syntax and necessity to memorize opcodes rather than mnemonics does?
&nbsp;&nbsp; <ul><div class="comment">
&nbsp;&nbsp;   <li><a name="f31732388e13d30c094d52e69c6bc2d14f8b7d907a965e5fa831d464f17c00b2"></a><a href="http://akkartik.name/about">Kartik Agaram</a>, 2020-01-04: Yeah, though I'm open to revisiting this decision. The trouble with conventional Assembly languages is that a mnemonic may expand to many opcodes depending on the arguments. That complicates things greatly.

<p>
There's <a href='https://github.com/akkartik/mu/issues/39'>a ticket open in the project to design mnemonics for SubX opcodes in a 1:1 manner</a>. It's a hard problem. We want the mnemonics to be more useful/memorable than opcodes, but not too verbose. I haven't been able to come up with a design that satisfies these constraints, but maybe somebody else can?
</div></ul>
</div></ul>
]]></description>
    </item>
    <item>
      <title>Four example projects</title>
      <link>http://akkartik.name/post/four-repos</link>
      <pubDate>Sat, 16 Mar 2019 00:08:22 PDT</pubDate>
      <guid isPermaLink="false">http://akkartik.name/post/four-repos</guid>
      <description><![CDATA[
<div style='margin-left:3em'>
<span class='left_quote_char'>&ldquo;</span><em>Most kinds of power require a
substantial sacrifice&hellip; By the time someone has acquired it, he has also
matured to the point where he won't use it unwisely.</em>&rdquo;
<br>&mdash; Ian Malcolm, <a href='https://www.imdb.com/title/tt0107290/quotes?item=qt1464414'>&ldquo;Jurassic Park&rdquo;</a>
</div>

<div style='margin-left:3em'>
<span class='left_quote_char'>&ldquo;</span><em>It is impossible to form
anything which has the character of nature by adding pre-formed parts.</em>&rdquo;
<br>&mdash; Christopher Alexander, <a href='https://www.amazon.com/Timeless-Way-Building-Christopher-Alexander/dp/0195024028'>&ldquo;The Timeless Way of Building&rdquo;</a>
</div>

<p>
Lately I tend to program in a certain unconventional manner. A series of design
choices, each seemingly reasonable in isolation, take me pretty far from
conventional wisdom.

<!-- more -->

<ol>

<li> Axiomatically, <a href='/about'>I care as much about the experience of
reading my code</a> as that of running my programs.

<li> Programs are easier to grok when you can run them, so I try to keep my
programs super easy to build.

<li> Dependencies add moving parts when building from source, so I try to
minimize them.

<li> All languages tend to directly or indirectly depend on C, so I try to cut
out the middlemen and program directly in C (unless I'm <a href='https://archive.org/details/akkartik-mu-2021-04-22'>prototyping</a>).

<li> Describing dependencies explicitly (header files and so on) tends to make
C projects hard to reorganize, so I rely on small codebase size and automatically
generated headers to keep my code supple.

<li> Grokking strange new codebases is hard, so I organize my projects in
<a href='/post/wart-layers'>layers</a> that can be ignored in the beginning to
focus on a tiny subset of code that still builds and runs.

<li> Determining if a code change is good <a href='/post/1451852'>can be harder
than making the change itself</a>, so I try to provide lots of automated tests
as guardrails for newcomers.

<li> And finally, this list is already pretty long so I give up on almost
everything else, making a virtue of a <a href='/post/unfolding'>rough</a>, low-polish aesthetic in hopes of
minimizing my area of concern and so controlling complexity over time.
Sometimes people are forced to go in and modify my codebase.
<a href='https://en.wikipedia.org/wiki/Wabi-sabi'>So be it!</a>

</ol>

<p>
Applied together, these concerns make my projects look fairly unfamiliar at
first glance, with strange directives mixed into what seems like a C program,
files auto-generated by the build system, strange looking tests and so on. To
help newcomers gain an initial orientation, here is a series of four example
repos that gradually introduce my idioms:

<ul>

<li> <a href='https://git.sr.ht/~akkartik/basic-build'>basic-build</a>: a
zero-dependency build system in 30 lines of code. All it needs is <tt>/bin/sh</tt>.

<li> <a href='https://git.sr.ht/~akkartik/basic-test'>basic-test</a>: a
bare-bones test harness for C in 45 lines of code.

<li> <a href='https://git.sr.ht/~akkartik/basic-whitebox-test'>basic-whitebox-test</a>:
a sandbox for <a href='/post/tracing-tests'>my white-box tests</a> that enable
more comprehensive testing, more radical code reorganizations, and <a href='https://git.sr.ht/~akkartik/basic-whitebox-test/tree/main/browse_trace/Readme.md'>more scalable &ldquo;debug by <tt>print</tt>&rdquo;</a>.

<li> <a href='https://git.sr.ht/~akkartik/basic-layers'>basic-layers</a>: a
sandbox for <a href='/post/wart-layers'>my non-modular, abstraction-busting
approach to organizing large-ish projects</a>.

</ul>
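
<p>
To give a flavor of how little a bare-bones C test harness needs, here is a
minimal sketch. It is not the code in <tt>basic-test</tt>; every name in it
(<tt>CHECK</tt>, <tt>test_fn</tt>, <tt>run_tests</tt> and the two sample tests)
is hypothetical and chosen only for illustration. The assumed design is that a
failing check prints its location and is counted, without aborting the
remaining tests.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names throughout; this is not the code in basic-test. */
static int num_failures = 0;

#define CHECK(cond) \
    do { \
        if (!(cond)) { \
            ++num_failures; \
            fprintf(stderr, "F - %s:%d: %s\n", __FILE__, __LINE__, #cond); \
        } \
    } while (0)

typedef void (*test_fn)(void);

/* Two sample tests: one that passes and one that fails on purpose. */
void test_addition(void) { CHECK(1 + 1 == 2); }
void test_deliberate_failure(void) { CHECK(2 * 2 == 5); }

/* Run every test in the list and return the number of failed checks. */
int run_tests(const test_fn tests[], int n) {
    assert(tests != NULL && n >= 0);
    num_failures = 0;
    for (int i = 0; i < n; ++i) {
        tests[i]();
    }
    return num_failures;
}
```

<p>
A <tt>main</tt> function would build an array of <tt>test_fn</tt> pointers,
pass it to <tt>run_tests</tt>, and use the returned count as the process exit
status.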

<p>
My hope is that these repos act like foundational &ldquo;meta layers&rdquo;
that help peel back some of the complexity in my larger <a href='https://github.com/akkartik/mu/blob/master/Readme.md'>Mu</a>
project, and so make it more accessible to others. They aren't really
intended to be read in a browser, so <tt>git clone</tt> them (Mac or *nix) and
try following the Readme. <a href='mailto:ak@akkartik.com'>Comments and
feedback most appreciated,</a> particularly if you're moved to try building
something using them.

<p>
<em>(Update 2019-08-14: <a href='https://github.com/rdentato/bld'>Remo Dentato
has a project developing <tt>basic-build</tt> further.</a>)</em>
]]></description>
    </item>

  </channel>
</rss>
