Articles for #airplane-articles

Software development is changing. Tool calling, inference scaling, and RL with Verifiable Rewards have combined over the past year to enable agent harnesses like Claude Code, which can reliably navigate, modify, and contribute to large codebases.

LLMs scale amazingly well with the amount of training data you throw at them. But I’ve been thinking about how to build tools that work with the characteristics of LLMs, rather than requiring language models to learn how to use existing human-centric tools during training.

I have a hunch that a programming environment built around the strengths and limitations of autoregressive LLMs could lead to cheaper and higher-quality agent-powered development. How could we prove out that hypothesis? We would first need to design a language that aligns with how LLMs “think”. What would such a language look like? In this post, I put forward some ideas for a language called Markov that I think would fit the bill.

Read More...

One of my pet peeves about my natural writing style is how I lean into complex sentences divided by commas. Left unchecked, my prose starts looking like ChatGPT’s attempt at writing a blog post in the style of s-expressions. I thought it would be neat to try writing some code to help me proofread for this specific issue and improve my posts.

So much of my Python experience is from writing apps with Django that I forgot how quick and easy it is to whip up a small script that does some text processing with nothing but the standard library. As much as I appreciate static types and exhaustiveness checking in larger programs, being able to ignore edge cases that I know don’t appear in the specific input I’m concerned with is a relief for scripts like this.
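To give a flavor of what I mean, here’s a rough sketch of the kind of script the post describes, using nothing but the standard library. The sentence-splitting regex and the three-comma threshold are illustrative assumptions of mine, not the exact choices from the post.

```python
import re
import sys

# Sketch: flag comma-heavy sentences in a text/Markdown file using only the
# standard library. The splitting heuristic and threshold are illustrative.
COMMA_THRESHOLD = 3


def sentences(text: str) -> list[str]:
    # Naive split on sentence-ending punctuation; good enough for a quick
    # proofreading pass over my own prose.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def main(path: str) -> None:
    with open(path, encoding="utf-8") as f:
        text = f.read()
    for sentence in sentences(text):
        commas = sentence.count(",")
        if commas >= COMMA_THRESHOLD:
            print(f"{commas} commas: {sentence}")


if __name__ == "__main__":
    main(sys.argv[1])
```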

Read More...

I’ve been working on-and-off on a plugin for Obsidian called Obsidian Full Calendar for the past 10 months or so. For most of that time the plugin had no unit tests, and I finally got around to adding some test coverage during a big refactor.

Tests are easiest when code doesn’t have side effects, since filesystems and network calls often aren’t available in the environment the tests run in. Obsidian’s core code is closed-source and can only be run from inside the Electron app, so plugin developers who want test coverage have little choice but to test their plugins entirely outside of Obsidian. Unfortunately for me, my plugin is mostly a pile of glue sitting between FullCalendar as the view layer and the Obsidian filesystem APIs for persistence, so I would need to mock out the relevant Obsidian APIs to get any meaningful test coverage of my own code.

There isn’t yet a comprehensive mock of the Obsidian API that I could reach for in a testing environment, so I went ahead and wrote my own!
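To sketch the idea (a simplified illustration, not the actual mock from the post; the types and method shapes below are approximations of the real `obsidian` typings rather than verbatim copies): an in-memory map from paths to file contents can stand in for the handful of Vault methods the plugin actually touches, which is enough to exercise the glue code under a test runner like Jest without Electron.

```typescript
// In-memory stand-in for a slice of Obsidian's Vault API, so plugin code can
// run under a Node test runner without the Electron app. Simplified shapes,
// not the real `obsidian` types.

interface MockTFile {
  path: string;
}

class MockVault {
  private files = new Map<string, string>();

  async create(path: string, contents: string): Promise<MockTFile> {
    if (this.files.has(path)) {
      throw new Error(`File already exists: ${path}`);
    }
    this.files.set(path, contents);
    return { path };
  }

  async read(file: MockTFile): Promise<string> {
    const contents = this.files.get(file.path);
    if (contents === undefined) {
      throw new Error(`File not found: ${file.path}`);
    }
    return contents;
  }

  async modify(file: MockTFile, contents: string): Promise<void> {
    if (!this.files.has(file.path)) {
      throw new Error(`File not found: ${file.path}`);
    }
    this.files.set(file.path, contents);
  }
}
```

A real mock has to cover much more surface area than this, but the pattern is the same: keep everything in memory and mirror just the pieces of the API the plugin actually calls.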

Read More...