For a long time I’ve been running my storage on a 2-disk ZFS
mirror. It’s been stable, safe, and easy to manage. However, at some
point, 2 disks just aren’t enough, and I wanted to upgrade to RAIDZ2
so that I could survive up to two simultaneous disk failures.
I could have added another mirror, which would have been simple; that setup can survive two drive failures, but not any two drives. I wanted the extra safety of being able to lose any two.
tl;dr: Linux capabilities are just xattrs (extended attributes) on files — and since tar can preserve xattrs, Bazel can “smuggle” them into OCI layers without ever running sudo setcap.
Every so often I stumble on a trick that makes me do a double-take. This one came up while converting a Dockerfile that set capabilities on a file via setcap into an equivalent rules_oci build.
We are all pretty familiar with the all-powerful root user in Linux and escalating to root via sudo. Capabilities break that monolith into smaller, more focused privileges [ref]. Instead of giving a process the full keys to the kingdom, you can hand it just the one it needs.
For example:
CAP_NET_BIND_SERVICE: lets a process bind to ports below 1024.
CAP_SYS_ADMIN: a grab-bag of scary powers (mount, pivot_root, …).
CAP_CHOWN: lets a process change file ownership.
Capabilities are inherited from the spawning process, but they can also be attached to the file itself, so that any time the file is executed the resulting process has the desired capabilities. The Linux kernel stores these capabilities in the “extended attributes” (i.e. additional metadata) of the file [ref].
If the filesystem you are using does not support extended attributes, then you cannot set capabilities on a file.
Let’s work through an example.
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
  int fd = socket(AF_INET, SOCK_RAW, IPPROTO_ICMP);
  if (fd < 0) {
    perror("socket");
    return 1;
  }
  printf("Raw socket created successfully!\n");
  close(fd);
  return 0;
}
If we build this with Bazel and try to run it, we will see that it fails unless we either spawn it with CAP_NET_RAW (e.g. via sudo) or add the capability to the binary via setcap.
> bazel build //:rawsock
> bazel-bin/rawsock
socket: Operation not permitted
> sudo bazel-bin/rawsock
Raw socket created successfully!
# here we add the capability via setcap
# no longer need sudo
> cp bazel-bin/rawsock /tmp/rawsock
> sudo setcap 'cap_net_raw=+ep' /tmp/rawsock
> /tmp/rawsock
Raw socket created successfully!
# let's check the xattr
> getfattr -n security.capability /tmp/rawsock
# file: tmp/rawsock
security.capability=0sAQAAAgAgAAAAAAAAAAAAAAAAAAA=
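The 0s prefix in getfattr’s output means the value is base64; underneath it is just a tiny little-endian struct. A quick decode (field layout per struct vfs_cap_data in linux/capability.h, revision 2):

```python
import base64
import struct

# The base64 blob dumped by getfattr above: a little-endian
# struct vfs_cap_data { magic_etc; permitted/inheritable lo; permitted/inheritable hi; }
blob = base64.b64decode("AQAAAgAgAAAAAAAAAAAAAAAAAAA=")
magic_etc, perm_lo, inh_lo, perm_hi, inh_hi = struct.unpack("<IIIII", blob)

VFS_CAP_REVISION_2 = 0x02000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001
CAP_NET_RAW = 13  # capability number for cap_net_raw

print(magic_etc == (VFS_CAP_REVISION_2 | VFS_CAP_FLAGS_EFFECTIVE))  # → True
print(bool(perm_lo & (1 << CAP_NET_RAW)))  # → True: cap_net_raw is permitted
```

So 'cap_net_raw=+ep' really is nothing more than 20 bytes of metadata on the file.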
Okay great – but what does this have to do with Bazel?
Well we were converting a Dockerfile that used setcap to modify the binary.
If your OCI image runs as a non-root user, it will likewise be denied permission to create the raw socket.
FROM alpine:latest
COPY rawsock /bin/rawsock

USER nobody
ENTRYPOINT /bin/rawsock
We can build this Docker image and notice that the entrypoint fails.
> docker build -f Dockerfile.base bazel-bin -t no-caps
> docker run --rm no-caps
socket: Operation not permitted
If we amend the Dockerfile to add setcap, we see that it succeeds.
--- Dockerfile.base 2025-09-09 15:03:22.525245904 -0700
+++ Dockerfile.setcap 2025-09-09 15:30:54.939933727 -0700
@@ -1,5 +1,6 @@
FROM alpine:latest
COPY rawsock /bin/rawsock
-
+RUN apk add --no-cache libcap
+RUN setcap 'cap_net_raw=+ep' /bin/rawsock
USER nobody
ENTRYPOINT /bin/rawsock
\ No newline at end of file
Now we can build and run it again.
> docker build -f Dockerfile.setcap bazel-bin -t with-caps
> docker run --rm with-caps
Raw socket created successfully!
Back to Bazel! Actions in Bazel are executed under the user that spawned the Bazel process. We can validate this with a simple genrule.
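The //:whoami target can be a one-line genrule; a minimal version (illustrative, names chosen to match the transcript) might look like:

```
genrule(
    name = "whoami",
    outs = ["whoami.txt"],
    cmd = "whoami > $@",
)
```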
# see my user
> echo $USER
fmzakari
> bazel build //:whoami
> cat bazel-bin/whoami.txt
fmzakari
How then can we create a file with a capability set, so that we can replace our Dockerfile layer?
Escalating privileges inside a Bazel action with sudo isn’t straightforward: you might need to configure NOPASSWD for the user so that it can run sudo without a password. You could also run the whole bazel command as root, but that grants too much privilege everywhere.
For capabilities to travel through a tar archive, the archive itself must store extended attributes as well. You can enable this feature with the --xattrs option.
If you decompress the tar archive, and have necessary privileges to set extended attributes (CAP_SETFCAP or sudo) then the unarchived file will retain the capability and everything will work!
> mkdir test
> sudo tar --xattrs --xattrs-include="*" -C test -xf \
    out/blobs/sha256/da1a39c8c0dabc8784a2567fa24df668b50d32b13f2893812d4740fa07a1d41c
> getcap test/bin/rawsock
test/bin/rawsock cap_net_raw=ep
> test/bin/rawsock
Raw socket created successfully!
What does this have to do with building an OCI image in Bazel? 🤨
It turns out the trick we can employ is to set the necessary bits in the tar archive itself, marking a file as having the desired capability.
This is exactly what the xattrs rule in bazeldnf does! 🤓
The key idea: capabilities live in extended attributes, and tar can carry those along. That means you don’t need to run setcap in a genrule at build time the way the Dockerfile does; Bazel can smuggle the bits straight into the image’s tar layer, to be consumed by an OCI-compliant runtime. ☝️
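To make the mechanics concrete, here is a Python sketch of planting that xattr directly in a tar layer, assuming GNU-tar-style SCHILY.xattr.* pax records (the capability blob is the one getfattr dumped earlier; this is an illustration, not the bazeldnf implementation):

```python
import base64
import io
import tarfile

# The raw security.capability value for cap_net_raw=+ep, as dumped by getfattr.
cap_blob = base64.b64decode("AQAAAgAgAAAAAAAAAAAAAAAAAAA=")

buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
    info = tarfile.TarInfo("bin/rawsock")
    payload = b"\x7fELF..."  # stand-in for the real binary contents
    info.size = len(payload)
    info.mode = 0o755
    # GNU tar records xattrs as SCHILY.xattr.* pax headers; the raw bytes
    # are carried as a string (surrogateescape preserves arbitrary bytes).
    info.pax_headers = {
        "SCHILY.xattr.security.capability": cap_blob.decode("utf-8", "surrogateescape"),
    }
    tar.addfile(info, io.BytesIO(payload))

# Read the layer back: the capability rides along with the file entry.
buf.seek(0)
with tarfile.open(fileobj=buf) as tar:
    member = tar.getmember("bin/rawsock")
    print("SCHILY.xattr.security.capability" in member.pax_headers)  # → True
```

Extracting such a layer with `tar --xattrs` (and CAP_SETFCAP) restores the capability, exactly as in the transcript above; no setcap ever runs at build time.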
This trick neatly sidesteps the need for sudo in your rules and keeps builds hermetic.
Not every filesystem or runtime will honor these attributes, but when it works it’s a clever, Bazel-flavored way to package privileged binaries without breaking sandboxing.
whereby that seventh stage is essentially this scene in the matrix...
It's where you deeply understand that 'you can now do anything' and just start doing it because it's possible and fun, and doing so is faster than explaining yourself. Outcomes speak louder than words.
There's a falsehood that AI results in SWEs' skill atrophy and offers no learning potential.
If you’re using AI only to “do” and not “learn”, you are missing out - David Fowler
I've never written a compiler, yet I've always wanted to do one, so I've been working on one for the last three months by running Claude in a while true loop (aka "Ralph Wiggum") with a simple prompt:
Hey, can you make me a programming language like Golang but all the lexical keywords are swapped so they're Gen Z slang?
Why? I really don't know. But it exists. And it produces compiled programs. During this period, Claude was able to implement anything that Claude desired.
The programming language is called "cursed". It's cursed in its lexical structure, it's cursed in how it was built, it's cursed that this is possible, it's cursed in how cheap this was, and it's cursed through how many times I've sworn at Claude.
https://cursed-lang.org/
For the last three months, Claude has been running in this loop with a single goal:
"Produce me a Gen-Z compiler, and you can implement anything you like."
Anything that Claude thought was appropriate to add. Currently...
The compiler has two modes: interpreted mode and compiled mode. It's able to produce binaries on Mac OS, Linux, and Windows via LLVM.
There are some half-completed VSCode, Emacs, and Vim editor extensions, and a Treesitter grammar.
A whole bunch of really wild and incomplete standard library packages.
lexical structure
Control Flow: ready → if, otherwise → else, bestie → for, periodt → while, vibe_check → switch, mood → case, basic → default
Declaration: vibe → package, yeet → import, slay → func, sus → var, facts → const, be_like → type, squad → struct
Flow Control: damn → return, ghosted → break, simp → continue, later → defer, stan → go, flex → range
Values & Types: based → true, cringe → false, nah → nil, normie → int, tea → string, drip → float, lit → bool, ඞT (Amogus) → pointer to type T
Comments: fr fr → line comment, no cap ... on god → block comment
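As a toy illustration of the keyword swap (a hypothetical helper, not the real compiler's lexer), the table above is essentially just a dictionary:

```python
# Gen Z slang → Go keyword mapping, per the table above (subset shown).
CURSED_TO_GO = {
    "ready": "if", "otherwise": "else", "bestie": "for", "periodt": "while",
    "vibe_check": "switch", "mood": "case", "basic": "default",
    "vibe": "package", "yeet": "import", "slay": "func", "sus": "var",
    "facts": "const", "be_like": "type", "squad": "struct",
    "damn": "return", "ghosted": "break", "simp": "continue",
    "later": "defer", "stan": "go", "flex": "range",
    "based": "true", "cringe": "false", "nah": "nil",
    "normie": "int", "tea": "string", "drip": "float", "lit": "bool",
}

def uncurse(line: str) -> str:
    # Naive whitespace-token translation; a real lexer is context-aware.
    return " ".join(CURSED_TO_GO.get(tok, tok) for tok in line.split())

print(uncurse("slay max_depth ( root ) normie"))  # → func max_depth ( root ) int
```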
example program
Here is leetcode 104 - maximum depth for a binary tree:
vibe main
yeet "vibez"
yeet "mathz"
// LeetCode #104: Maximum Depth of Binary Tree 🌲
// Find the maximum depth (height) of a binary tree using ඞ pointers
// Time: O(n), Space: O(h) where h is height
struct TreeNode {
sus val normie
sus left ඞTreeNode
sus right ඞTreeNode
}
slay max_depth(root ඞTreeNode) normie {
ready (root == null) {
damn 0 // Base case: empty tree has depth 0
}
sus left_depth normie = max_depth(root.left)
sus right_depth normie = max_depth(root.right)
// Return 1 + max of left and right subtree depths
damn 1 + mathz.max(left_depth, right_depth)
}
slay max_depth_iterative(root ඞTreeNode) normie {
// BFS approach using queue - this hits different! 🚀
ready (root == null) {
damn 0
}
sus queue ඞTreeNode[] = []ඞTreeNode{}
sus levels normie[] = []normie{}
append(queue, root)
append(levels, 1)
sus max_level normie = 0
bestie (len(queue) > 0) {
sus node ඞTreeNode = queue[0]
sus level normie = levels[0]
// Remove from front of queue
collections.remove_first(queue)
collections.remove_first(levels)
max_level = mathz.max(max_level, level)
ready (node.left != null) {
append(queue, node.left)
append(levels, level + 1)
}
ready (node.right != null) {
append(queue, node.right)
append(levels, level + 1)
}
}
damn max_level
}
slay create_test_tree() ඞTreeNode {
// Create tree: [3,9,20,null,null,15,7]
// 3
// / \
// 9 20
// / \
// 15 7
sus root ඞTreeNode = &TreeNode{val: 3, left: null, right: null}
root.left = &TreeNode{val: 9, left: null, right: null}
root.right = &TreeNode{val: 20, left: null, right: null}
root.right.left = &TreeNode{val: 15, left: null, right: null}
root.right.right = &TreeNode{val: 7, left: null, right: null}
damn root
}
slay create_skewed_tree() ඞTreeNode {
// Create skewed tree for testing edge cases
// 1
// \
// 2
// \
// 3
sus root ඞTreeNode = &TreeNode{val: 1, left: null, right: null}
root.right = &TreeNode{val: 2, left: null, right: null}
root.right.right = &TreeNode{val: 3, left: null, right: null}
damn root
}
slay test_maximum_depth() {
vibez.spill("=== 🌲 LeetCode #104: Maximum Depth of Binary Tree ===")
// Test case 1: Balanced tree [3,9,20,null,null,15,7]
sus root1 ඞTreeNode = create_test_tree()
sus depth1_rec normie = max_depth(root1)
sus depth1_iter normie = max_depth_iterative(root1)
vibez.spill("Test 1 - Balanced tree:")
vibez.spill("Expected depth: 3")
vibez.spill("Recursive result:", depth1_rec)
vibez.spill("Iterative result:", depth1_iter)
// Test case 2: Empty tree
sus root2 ඞTreeNode = null
sus depth2 normie = max_depth(root2)
vibez.spill("Test 2 - Empty tree:")
vibez.spill("Expected depth: 0, Got:", depth2)
// Test case 3: Single node [1]
sus root3 ඞTreeNode = &TreeNode{val: 1, left: null, right: null}
sus depth3 normie = max_depth(root3)
vibez.spill("Test 3 - Single node:")
vibez.spill("Expected depth: 1, Got:", depth3)
// Test case 4: Skewed tree
sus root4 ඞTreeNode = create_skewed_tree()
sus depth4 normie = max_depth(root4)
vibez.spill("Test 4 - Skewed tree:")
vibez.spill("Expected depth: 3, Got:", depth4)
vibez.spill("=== Maximum Depth Complete! Tree depth detection is sus-perfect ඞ🌲 ===")
}
slay main_character() {
test_maximum_depth()
}
If this is your sort of chaotic vibe, and you'd like to turn this into the dogecoin of programming languages, head on over to GitHub and run a few more Claude code loops with the following prompt.
study specs/* to learn about the programming language. When authoring the cursed standard library think extra extra hard as the CURSED programming language is not in your training data set and may be invalid. Come up with a plan to implement XYZ as markdown then do it
There is no roadmap; the roadmap is whatever the community decides to ship from this point forward.
At this point, I'm pretty much convinced that any problems found in cursed can be solved by just running more Ralph loops by skilled operators (i.e. people with compiler experience who shape it through prompts from their expertise, versus letting Claude just rip unattended). There's still a lot to be fixed; happy to take pull requests.
LLMs amplify the skills that developers already have and enable people to do things where they don't have that expertise yet.
Success is defined as cursed ending up in the Stack Overflow developer survey as either the "most loved" or "most hated" programming language, and continuing the work to bootstrap the compiler to be written in cursed itself.
We use Protocol Buffers heavily at $DAYJOB$ and it’s increasingly becoming a large pain point, most notably due to challenges with coercing multiple versions in a dependency graph.
Recently, a team wanted to augment the generated Java code protoc (Protobuf compiler) emits. I was aware that the compiler had a “plugin” architecture but had never looked deeper into it.
Let’s explore writing a Protocol Buffer plugin, in Java and for the Java generated code. 🤓
Turns out that plugins are simple: they operate solely over standard input and output and, unsurprisingly, marshal protobuf over them.
A plugin is just a program which reads a CodeGeneratorRequest protocol buffer from standard input and then writes a CodeGeneratorResponse protocol buffer to standard output. [ref]
The request & response protos are described in plugin.proto.
+------------------+ CodeGeneratorRequest (stdin) +------------------+
| | -------------------------------------------> | |
| protoc | | Your Plugin |
| (Compiler) | <------------------------------------------- | (e.g., in Java) |
| | CodeGeneratorResponse (stdout) | |
+------------------+ +------------------+
|
| (protoc then writes files
| to disk based on plugin's response)
V
+------------------+
| |
| Generated |
| Code Files |
| |
+------------------+
Here is a dumb plugin that emits a fixed class to demonstrate.
public static void main(String[] args) throws Exception {
  CodeGeneratorRequest request = CodeGeneratorRequest.parseFrom(System.in);
  CodeGeneratorResponse response =
      CodeGeneratorResponse.newBuilder()
          .addFile(
              File.newBuilder()
                  .setContent(
                      """
                      // Generated by the plugin
                      public class Dummy {
                        public String hello() {
                          return "Hello from Dummy";
                        }
                      }
                      """)
                  .setName("Dummy.java")
                  .build())
          .build();
  response.writeTo(System.out);
}
We can run this and see that the expected file is produced.
> protoc example.proto --plugin=protoc-gen-dumb \
    --dumb_out=./generated
> cat generated/Dummy.java
// Generated by the plugin
public class Dummy {
  public String hello() {
    return "Hello from Dummy";
  }
}
Insertion points are markers within the generated source that allow other plugins to include additional content.
We have to modify our File that we include in the response to specify the insertion point and instead of a new file being created, the contents of files will be merged. ✨
Our example plugin would like to add the hello() function to every message type described in the proto file.
We do this by setting the appropriate insertion point, which we found by auditing the original generated code. In this particular example, we want to add our new function to the class definition, so we pick class_scope as our insertion point.
List<File> generatedFiles =
    protos.stream()
        .flatMap(p -> p.getMessageTypes().stream())
        .map(m -> {
          final FileDescriptor fd = m.getFile();
          String javaPackage = fd.getOptions().getJavaPackage();
          final String fileName =
              javaPackage.replace(".", "/") + "/" + m.getName() + ".java";
          return File.newBuilder()
              .setContent(
                  """
                  // Generated by the plugin
                  public String hello() {
                    return "Hello from " + this.getClass().getSimpleName();
                  }
                  \s""")
              .setName(fileName)
              .setInsertionPoint(String.format("class_scope:%s", m.getName()))
              .build();
        })
        .toList();
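Under the hood, protoc splices the plugin’s content immediately before the @@protoc_insertion_point marker comment in the already-generated file, keeping the marker so later plugins can insert at the same point. A rough pure-Python simulation of that merge (illustrative only, not protoc’s actual implementation):

```python
def splice_at_insertion_point(generated: str, point: str, content: str) -> str:
    """Insert content just before the insertion-point marker, preserving it."""
    marker = f"// @@protoc_insertion_point({point})"
    lines = generated.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == marker:
            indent = line[: len(line) - len(line.lstrip())]
            inserted = [indent + l for l in content.splitlines()]
            return "\n".join(lines[:i] + inserted + lines[i:])
    raise ValueError(f"insertion point {point!r} not found")

java_src = """\
public final class Person {
  // @@protoc_insertion_point(class_scope:Person)
}
"""
print(splice_at_insertion_point(
    java_src, "class_scope:Person",
    'public String hello() { return "hi"; }'))
```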
We now run both the Java generator alongside our custom plugin.
We can audit the generated source and we see that our new method is now included! 🔥
Note: The plugin must be listed after java_out as the order matters on the command-line.
> protoc example.proto --java_out=./generated \
    --plugin=protoc-gen-example \
    --example_out=./generated
> rg "hello" generated/ -B 1
generated/com/example/protobuf/tutorial/Person.java
1038- // Generated by the plugin
1039: public String hello(){
generated/com/example/protobuf/tutorial/Address.java
862- // Generated by the plugin
863: public String hello(){
While we are limited to the insertion points previously defined in the open-source implementation of the Java protobuf generator, it does provide a convenient way to augment the generated files.
We can also include additional source files that may wrap the original files for cases where the insertion points may not suffice.
Ever run into the issue where you exit your main method in Java but the application is still running?
That can happen if you have non-daemon threads still running. 🤔
The JVM specification specifically states the condition under which the JVM may exit [ref]:
A program terminates all its activity and exits when one of two things happens:
All the threads that are not daemon threads terminate.
Some thread invokes the exit() method of class Runtime or class System, and the exit operation is not forbidden by the security manager.
What are daemon-threads?
They are effectively background threads that you might spin up for tasks such as garbage collection, where you explicitly don’t want them to inhibit the JVM from shutting down.
A common problem however is that if you have code-paths on exit that fail to stop all non-daemon threads, the JVM process will fail to exit which can cause problems if you are relying on this functionality for graceful restarts or shutdown.
Let’s observe a simple example.
public class Main {
  public static void main(String[] args) {
    Thread thread = new Thread(() -> {
      try {
        while (true) {
          // Simulate some work with sleep
          System.out.println("Thread is running...");
          Thread.sleep(1000);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    // This is redundant, as threads inherit the daemon
    // status from their parent.
    thread.setDaemon(false);
    thread.start();
    System.out.println("Leaving main thread");
  }
}
If we run this, although we exit the main thread, we observe that the JVM does not exit and the thread continues to do its “work”.
> java Main
Leaving main thread
Thread is running...
Thread is running...
Thread is running...
Often you will see classes implement Closeable or AutoCloseable so that an orderly shutdown of these sort of resources can occur.
It would be great however to test that such graceful cleanup is done appropriately for our codebases.
Is this possible in Bazel?
@Test
public void testNonDaemonThread() {
  Thread thread = new Thread(() -> {
    try {
      while (true) {
        // Simulate some work with sleep
        System.out.println("Thread is running...");
        Thread.sleep(1000);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  });
  thread.setDaemon(false);
  thread.start();
}
If we run this test however we notice the test PASSES 😱
Turns out that Bazel’s JUnit test runner calls System.exit after running the tests, which, according to the JVM specification, allows the runtime to shut down irrespective of active non-daemon threads. [ref]
Some thread invokes the exit() method of class Runtime or class System, and the exit operation is not forbidden by the security manager.
From discussion with others in the community, this explicit shutdown was added specifically because many tests would hang due to improper non-daemon thread cleanup. 🤦
How can we validate graceful shutdown then?
Well, we can leverage sh_test and startup our java_binary and validate that the application exits within a specific timeout.
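A minimal sketch of that sh_test-style check, here in Python for brevity (the command and timeout are illustrative; a real test would launch the java_binary under test):

```python
import subprocess
import sys

def exits_within(cmd: list, timeout_s: float) -> bool:
    """Launch cmd and report whether it terminates within timeout_s seconds."""
    proc = subprocess.Popen(cmd)
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()  # don't leak the stuck process
        proc.wait()
        return False

# A process that exits promptly passes; one that hangs (like a leftover
# non-daemon thread keeping the JVM alive) fails.
print(exits_within([sys.executable, "-c", "print('bye')"], 5))  # → True
print(exits_within([sys.executable, "-c", "import time; time.sleep(60)"], 1))  # → False
```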
Additionally, I’ve put forward a pull-request PR#26879 which adds a new system property bazel.test_runner.await_non_daemon_threads that can be added to a java_test such that the test runner validates that there are no non-daemon threads running before exiting.
It would have been great to remove the System.exit call completely when the property is set; however, I could not find a way to then set the exit value of the test.
Turns out that even simple things can be a little complicated and it was a bit of a headscratcher to see why our tests were passing despite our failure to properly tear down resources.
I just finished a phone call with a "stealth startup" that was pitching the idea that agents could generate code securely via an MCP server. Needless to say, the call did not go well: I shot the idea down and wrapped it up early, because it's a bad idea. What follows is a recap of the conversation.
If anyone pitches you on the idea that you can achieve secure code generation via an MCP tool or Cursor rules, run, don't walk.
Over the last nine months, I've written about the changes that are coming to our industry, where we're entering an arena where most of the code going forward is not going to be written by hand, but instead by agents.
I haven't written code by hand for nine months. I've generated, read, and reviewed a lot of code, and I think perhaps within the next year, the large swaths of code in business will no longer be artisanal hand-crafted. Those days are fast coming to a close.
Thus, naturally, there is a question that's on everyone's mind:
How do I make the agent generate secure code?
Let's start with what you should not do and build up from first principles.
The Java language implementation for Bazel has a great feature called strict dependencies – the feature enforces that all directly used classes are loaded from jars provided by a target’s direct dependencies.
If you’ve ever seen the following message from Bazel, you’ve encountered the feature.
error: [strict] Using type Dog from an indirect dependency (TOOL_INFO: "//:dog").
See command below **
public void dogs(Dog dog){
^
** Please add the following dependencies:
//:dog to //:park
** You can use the following buildozer command:
buildozer 'add deps //:dog' //:park
The analog tool for removing dependencies which are not directly referenced is unused_deps.
You can run this on your Java codebase to prune your dependencies to those only strictly required.
It performs a query for any rules whose kind starts with kt_, java_, or android_, which catches common rules such as java_library and java_binary.
Here is where things get a little more interesting. The tool emits an ephemeral Bazel WORKSPACE in a temporary directory that contains a Bazel aspect.
What is the aspect the tool injects into our codebase?
# Explicitly creates a params file for a Javac action.
def _javac_params(target, ctx):
    params = []
    for action in target.actions:
        if not action.mnemonic == "Javac" and not action.mnemonic == "KotlinCompile":
            continue
        output = ctx.actions.declare_file("%s.javac_params" % target.label.name)
        args = ctx.actions.args()
        args.add_all(action.argv)
        ctx.actions.write(
            output = output,
            content = args,
        )
        params.append(output)
        break
    return [OutputGroupInfo(unused_deps_outputs = depset(params))]

javac_params = aspect(
    implementation = _javac_params,
)
The aspect is designed to emit additional files %s.javac_params that contain the arguments to the compilation actions.
If we inspect what this file looks like for the simple java_library I created //:app, we see it’s the arguments to java itself.
If you are wondering what JavaBuilder_deploy.jar is: Bazel wraps javac in a custom compiler, which will be relevant shortly. ☝️
How does the aspect get injected into our project?
Well, after figuring out which targets to build via the bazel query, unused_deps will bazel build your target pattern and specify --override_repository to include this additional dependency and enable the aspect via the --aspects flag.
QUESTION #1: Why does the tool need to set up this aspect anyways? Bazel will already emit param files *-0.params for each Java target that contains nearly identical information.
The tool will then iterate through all these JAR files, opening each one and reading the value of Target-Label from the MANIFEST.MF within, which is the Bazel target expression for this dependency.
In this case we can see the desired value is Target-Label: //:libB.
If you happen to use rules_jvm_external to pull in Maven dependencies, the ruleset will “stamp” the downloaded JARs which means injecting them with the Target-Label entry in their MANIFEST.MF specifically to work with unused_deps [ref].
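Since a JAR is just a zip, that lookup is easy to sketch. Here is an illustrative Python version (it ignores MANIFEST.MF's 72-byte line-wrapping rule for simplicity):

```python
import io
import zipfile

def target_label(jar_bytes: bytes):
    """Extract the Target-Label entry from a jar's MANIFEST.MF, if present."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in manifest.splitlines():
        if line.startswith("Target-Label:"):
            return line.split(":", 1)[1].strip()
    return None

# Build a toy jar in memory to demonstrate the stamped manifest.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF",
                 "Manifest-Version: 1.0\r\nTarget-Label: //:libB\r\n")
print(target_label(buf.getvalue()))  # → //:libB
```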
After the labels of all the direct dependencies are known for each target, unused_deps will parse the jdeps file, ./bazel-bin/libapp.jdeps, of each target which is a binary protocol serialization of blaze_deps.Dependencies found in deps.go.
This is the super cool part: Bazel's deep integration with the Java compiler. 🔥
Bazel invokes the Java compiler itself and then iterates through all the symbols the compiler had to resolve (via a provided symbol table). For each symbol, if the dependency is not in the --direct_dependencies list, then it must have been provided through a transitive dependency. [ref]
The presence of kind IMPLICIT would actually trigger a failure for the strict Java dependency check if enabled.
unused_deps then takes the list of direct dependencies and keeps only those the compiler reported as actually required for compilation.
The set difference represents the set of targets that are effectively unused and can be reported back to the user for removal! ✨
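In other words, the final step is just a set difference (labels here are hypothetical; the real lists come from the MANIFEST.MF stamps and the .jdeps file):

```python
# Declared direct deps (from Target-Label stamps) vs. deps the compiler
# actually used (from the .jdeps proto).
declared_deps = {"//:libA", "//:libB", "//:libC"}
used_deps = {"//:libA"}

unused = declared_deps - used_deps
print(sorted(unused))  # → ['//:libB', '//:libC']
```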
QUESTION #3: There is a third type of dependency kind INCOMPLETE which I saw when investigating our codebase. I was unable to discern how to trigger it and what it represents.
What I enjoy about Bazel is learning how you can improve developer experience and provide insightful tools when you integrate the build system deeply with the underlying language; unused_deps is a great example of this.
The following was developed last month and has already been delivered at two conferences. If you would like for me to run a workshop similar to this at your employer, please get in contact.
Hey everyone, I'm here today to teach you how to build a coding agent. By this stage of the conference, you may be tired of hearing the word "agent".
You hear the word frequently. However, it appears that everyone is using this term loosely without a clear understanding of what it means or how these coding agents operate internally. It's time to pull back the hood and show that there is no moat.
Learning how to build a coding agent is one of the best things you can do for your personal development in 2025, as it teaches you the fundamentals. Once you understand these fundamentals, you'll move from being a consumer of AI to a producer of AI who can automate things with AI.
Let me open with the following facts:
it's not that hard
to build a coding agent
it's 300 lines of code
running in a loop
With LLM tokens, that's all it is.
300 lines of code running in a loop with LLM tokens. You just keep throwing tokens at the loop, and then you've got yourself an agent.
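To make that concrete, here is a toy version of the loop with the LLM stubbed out (call_llm and the tool set are stand-ins, not any real vendor API; a real agent swaps in a chat-completions call and real tools):

```python
def run_tool(name: str, args: dict) -> str:
    # A toy "tool belt"; a real agent exposes file edits, shell, search, etc.
    tools = {"add": lambda a: str(a["x"] + a["y"])}
    return tools[name](args)

def call_llm(history: list) -> dict:
    # Stand-in for a real LLM call. Scripted: first ask for a tool call,
    # then finish once a tool result appears in the history.
    if any(m["role"] == "tool" for m in history):
        return {"type": "done", "text": "The answer is " + history[-1]["content"]}
    return {"type": "tool_call", "name": "add", "args": {"x": 2, "y": 3}}

def agent(prompt: str) -> str:
    history = [{"role": "user", "content": prompt}]
    while True:  # the whole trick: keep feeding tool results back to the model
        reply = call_llm(history)
        if reply["type"] == "done":
            return reply["text"]
        result = run_tool(reply["name"], reply["args"])
        history.append({"role": "tool", "content": result})

print(agent("what is 2 + 3?"))  # → The answer is 5
```

Everything else in a production agent (context management, permissions, streaming) is elaboration on this loop; the model does the heavy lifting.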
Today, we're going to build one. We're going to do it live, and I'll explain the fundamentals of how it all works. As we are now in 2025, it has become the norm to work concurrently with AI assistance. So, what better way to demonstrate the point of this talk than to have an agent build me an agent whilst I deliver this talk?
Cool. We're now building an agent. This is one of the things that's changing in our industry, because work can be done concurrently and whilst you are away from your computer.
The days of spending a week or a couple of days on a research spike are now over because you can turn an idea into execution just by speaking to your computer.
The next time you're on a Zoom call, consider that you could've had an agent building the work that you're planning to do during that Zoom call. If that's not the norm for you, and it is for your coworkers, then you're naturally not going to get ahead.
please build your own
as the knowledge
will transform you
from being a consumer
to a producer that can
automate things
The tech industry is almost like a conveyor belt - we always need to be learning new things.
If I were to ask you what a primary key is, you should know what a primary key is. That's been the norm for a long time.
In 2024, it is essential to understand what a primary key is.
In 2025, you should be familiar with what a primary key is and how to create an agent, as knowing what this loop is and how to build an agent is now fundamental knowledge that employers are looking for in candidates before they'll let you in the door.
As this knowledge will transform you from being a consumer of AI to being a producer of AI that can orchestrate your job function. Employers are now seeking individuals who can automate tasks within their organisation.
If you're joining me later this afternoon for the conference closing (see below), I'll delve a bit deeper into the above.
Right now, you'll be somewhere on the journey above.
On the top left, we've got 'prove it to me, it's not real,' 'prove it to me, show me outcomes', 'prove it to me that it's not hype', and a bunch of 'it's not good enough' folks who get stuck up there on that left side of the cliff, completely ignoring that there are people on the other side of the cliff, completely automating their job function.
In my opinion, any disruption or job loss related to AI is not a result of AI itself, but rather a consequence of a lack of personal development and self-investment. If your coworkers are hopping between multiple agents, chewing on ideas, and running in the background during meetings, and you're not in on that action, then naturally you're just going to fall behind.
don't be the person on the left side of the cliff.
The tech industry's conveyor belt continues to move forward. If you're a DevOps engineer in 2025 and you don't have any experience with AWS or GCP, then you're going to find it pretty tough in the employment market.
What's surprising to software and data engineers is just how fast this is elapsing. It has been eight months since the release of the first coding agent, and most people are still unaware of how straightforward it is to build one, how powerful this loop is, and its disruptive implications for our profession.
So, my name's Geoffrey Huntley. I was the tech lead for developer productivity at Canva, but as of a couple of months ago, I'm one of the engineers at Sourcegraph building Amp. It's a small core team of about six people. We build AI with AI.
ampcode.com
cursor
windsurf
claude code
github co-pilot
are lines of code running in a loop with LLM tokens
Cursor, Windsurf, Claude Code, GitHub Copilot, and Amp are just a small number of lines of code running in a loop of LLM tokens. I can't stress that enough. The model does all the heavy lifting here, folks. It's the model that does it all.
You are probably five vendors deep in product evaluation, right now, trying to compare all these agents to one another. But really, you're just chasing your tail.
It's so easy to build your own...
There are just a few key concepts you need to be aware of.
Not all LLMs are agentic.
The same way that you have different types of cars, like you've got a 40 series if you want to go off-road, and then you've also got people movers, which exist for transporting people.
The same principle applies to LLMs, and I've been able to map their behaviours into a quadrant.
A model is either high safety, low safety, an oracle, or agentic. It's never both or all.
If I were to ask you to do some security research, which model would you use?
That'd be Grok. That's a low safety model.
If you want something that's "ethics-aligned", it's Anthropic or OpenAI. So that's high safety. Similarly, you have oracles. Oracles are on the polar opposite of agentic. Oracles are suitable for summarisation tasks or require a high level of thinking.
Meanwhile, you have providers like Anthropic, and their Claude Sonnet is a digital squirrel (see below).
The first robot used to chase tennis balls. The first digital robot chases tool calls.
Sonnet is a robotic squirrel that just wants to do tool calls. It doesn't spend too much time thinking; it biases towards action, which is what makes it agentic. Sonnet focuses on incrementally obtaining success instead of pondering for minutes per turn before taking action.
It seems like every day, a new model is introduced to the market, and they're all competing with one another. But truth be told, they have their specialisations and have carved out their niches.
The problem is that, unless you're working with these models at an intimate level, you may not have this level of awareness of the specialisations of the models, which results in consumers just comparing the models on two basic primitives:
The size of the context window
The cost
It's kind of like looking at a car, whether it has two doors or three doors, whilst ignoring the fact that some vehicles are designed for off-roading, while others are designed for passenger transport.
To build an agent, the first step is to choose a highly agentic model. That is currently Claude Sonnet, or Kimi K2.
Now, you might be wondering, "What if you want a higher level of reasoning, and checking of the work that the incremental squirrel does?" Ah, that's simple. You can wire other LLMs in as tools into an existing agentic LLM. This is what we do at Amp.
We call it the Oracle. The Oracle is just GPT wired in as a tool that Claude Sonnet can function call for guidance, to check work progress, and to conduct research/planning.
Amp's oracle is just another LLM registered in as a tool to an agentic LLM that it can function call
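In code, that wiring is small. Here is a minimal Python sketch of the idea; the tool name `oracle`, the `call_reasoning_model` stand-in, and the registration shape are all hypothetical illustrations, not Amp's actual implementation.

```python
# Hypothetical sketch: wiring a second LLM in as a tool that an
# agentic model can function-call. All names here are illustrative.

def call_reasoning_model(prompt: str) -> str:
    """Stand-in for a request to a slower, high-reasoning model."""
    return f"[oracle analysis of: {prompt}]"

# The registration the agentic model sees: a name and a description
# (allocated into its context window), plus a function to invoke.
TOOLS = {
    "oracle": {
        "description": "Consult a high-reasoning model to check work, "
                       "plan, or research before acting.",
        "fn": call_reasoning_model,
    }
}

def dispatch_tool_call(name: str, arguments: dict) -> str:
    # The agentic model emits a tool call; the harness executes it and
    # feeds the result back into the context window.
    return TOOLS[name]["fn"](**arguments)

result = dispatch_tool_call("oracle", {"prompt": "review this plan"})
print(result)
```

The point is that the "oracle" is not special infrastructure: to the agentic model it is just one more entry in the tool table.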
The next important thing to learn is that you should only use the context window for one activity. When you're using Cursor or any one of these tools, it's essential to clear the context window after each activity (see below).
LLM outcomes are a needle in a haystack: the haystack is whatever you've allocated into the context window.
If you start an AI-assisted session to build a backend API controller, then reuse that session to research facts about meerkats, it should be no surprise when you tell it to redesign the website in the same session that the website ends up with facts about your API, or meerkats, or both.
nb. since this workshop was delivered, the context window for Sonnet has increased to 1M
Context windows are very, very small. It's best to think of them as a Commodore 64, and as such, you should be treating it as a computer with a limited amount of memory. The more you allocate, the worse your outcome and performance will be.
The advertised context window for Sonnet is 200k. However, you don't get to use all of that because the model needs to allocate memory for the system-level prompt. Then the harness (Cursor, Windsurf, Claude Code, Amp) also needs to allocate some additional memory, which means you end up with approximately 176k tokens usable.
You have probably heard a lot about the Model Context Protocol (MCP). It is the current hot thing, and the easiest way to think about MCP tools is as functions whose descriptions are allocated to the context window, telling the model how to invoke them.
A common failure scenario I observe is people installing an excessive number of MCP servers, failing to consider the number of tools exposed by a single MCP server, or ignoring the aggregate context-window allocation of all tools.
There is a cardinal rule that is not as well understood as it should be. The more you allocate to a context window, the worse the performance of the context window will be, and your outcomes will deteriorate.
Avoid excessively allocating to the context window with your agent or through MCP tool consumption. It's very easy to fall into a trap of allocating an additional 76K of tokens just for MCP tools, which means you only have 100K usable.
Less is more, folks. Less is more.
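To make the arithmetic concrete, here is a back-of-the-envelope sketch in Python. The system-prompt and harness figures are illustrative assumptions; only the 200K advertised window and the approximate 176K/100K outcomes come from the text above.

```python
# Back-of-the-envelope context budget. The system-prompt and harness
# numbers are assumed for illustration; they vary by model and harness.

ADVERTISED_WINDOW = 200_000  # Sonnet's advertised context window (tokens)
SYSTEM_PROMPT = 14_000       # assumed model system-prompt allocation
HARNESS_OVERHEAD = 10_000    # assumed harness (Cursor/Windsurf/Amp) allocation

usable = ADVERTISED_WINDOW - SYSTEM_PROMPT - HARNESS_OVERHEAD
print(usable)                # roughly the 176K tokens mentioned above

MCP_TOOLS = 76_000           # tool descriptions from over-installed MCP servers
print(usable - MCP_TOOLS)    # only ~100K tokens left for actual work
```

Every MCP tool description is subtracted from that budget before you type a single word.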
I recommend dropping by and reading the blog post below if you want to understand when to use MCP and when not to.
When you should use MCP, when you should not use MCP, and how allocations work in the context window.
Let's head back and check on our agent that's being built in the background. If you look at it closely enough, you can see the loop and how it's invoking other tools.
Essentially, how this all works is outlined in the loop below.
Every piece of input from the user, and every result of a tool call, gets allocated to the context window, and that window is sent off for inferencing:
The inferencing loop (minus tool registrations)
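That loop can be sketched in a few lines of Python. Here `infer` and `run_tool` are stand-ins for a real LLM API call and a real tool; the message shapes are illustrative, not any vendor's actual wire format.

```python
# Minimal sketch of the agentic loop described above. `infer` is a
# stand-in for a real LLM inference call; names are illustrative.

def infer(context: list) -> dict:
    """Pretend inference: request a tool, then answer once its result arrives."""
    if any(m["role"] == "tool" for m in context):
        return {"type": "answer", "text": "done"}
    return {"type": "tool_call", "name": "read_file", "args": {"path": "main.c"}}

def run_tool(name: str, args: dict) -> str:
    return f"<contents of {args['path']}>"  # stand-in tool implementation

def agent(user_input: str) -> str:
    context = [{"role": "user", "content": user_input}]  # allocate user input
    while True:
        response = infer(context)                  # send the window off for inferencing
        if response["type"] == "tool_call":
            result = run_tool(response["name"], response["args"])
            context.append({"role": "tool", "content": result})  # allocate tool result
        else:
            return response["text"]

print(agent("explain main.c"))
```

Everything else that Cursor, Windsurf, Claude Code, or Amp does is layered on top of a loop shaped like this.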
Let's open up our workshop materials (below) and run the basic chat application via:
One of the most requested things was a comparison to other init systems.
Since I’m most familiar with runit, I shall compare nitro and runit here.
Comparison to runit
runit and nitro share the basic design of having a directory of
services and using small scripts to spawn the processes.
Comparing nitro to runit, there are a few new features and some
architectural differences.
From a design point of view, runit follows the daemontools approach
of multiple small tools: the runit-init process spawns runsvdir, which
spawns a runsv process for each service.
nitro favors a monolithic approach and keeps everything in a single process.
This also makes it easier to install in containers.
The new features are:
nitro keeps all runtime state in RAM and provides an IPC interface
to query it, whereas runit emits state to disk. This enables nitro
to run on read-only file systems without special configuration.
(However, you need a tmpfs to store the socket file. In theory, on
Linux, you could even use /proc/1/fd/10 or an abstract Unix domain
socket, but that requires adding permission checks.)
support for one-shot “services”, i.e. running scripts on up/down
without a process to supervise (e.g. persist audio volume, keep RNG
state). For runit, you can fake this with a
pause process, which has a little
more overhead.
parametrized services. One service directory can be run multiple
times, e.g. agetty@ can be spawned multiple times to provide
agetty processes for different terminals. This can be faked in
runit with symlinks, but nitro also allows fully dynamic creation of
service instances.
log chains. runit supports only one logger per service, and log
services can’t have loggers on their own.
Currently, nitro also lacks some features:
service checks are not implemented (see below); currently, a service
that didn’t crash within 2 seconds is considered to be running.
runsvchdir is not supported to change all services at once.
However, under certain conditions, you can change the contents of
/etc/nitro completely and rescan to pick them up. nitro opens
the directory /etc/nitro once and just re-reads the contents on
demand. (Proper reopening will be added at some point when
posix_getdents
is more widespread. opendir/readdir/closedir implies
dynamic memory allocation.)
You can’t override nitroctl operations with scripts as for sv.
nitro tracks service identity by name, not inode number of the service
directory. This has benefits (parametrized services are possible) and
drawbacks (you may need to restart more things explicitly if you fiddle
with existing services, service lookup is a bit more work).
On the code side, nitro is written with modern(ish) POSIX.1-2008
systems in mind, whereas runit, being written in 2001, contains some
quirks for obsolete Unix systems. It also uses a less familiar
style of writing C code.
Do containers need a service supervisor?
It depends: if the container just hosts a simple server, probably not.
However, sometimes containers also need to run other processes to
provide scheduled commands, caches, etc. which benefit from
supervision.
Finally, PID 1 needs to reap zombies, and not all processes used as
PID 1 in containers do that. nitro is only half the size of dumb-init,
and less than twice as big as tini.
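To illustrate the reaping duty, here is a toy Python sketch of what a PID 1 must do: collect every exited child (including ones re-parented to it) so no zombies linger. This reduces the core of dumb-init/tini/nitro to a few lines; it is not nitro's actual code, and a real init would do this in response to SIGCHLD.

```python
# Toy sketch of PID 1's zombie-reaping duty (Linux). A real init loops
# on SIGCHLD; here we just reap once after forking a short-lived child.
import os
import time

def reap_children() -> int:
    """Reap every exited child with a non-blocking wait; return the count."""
    reaped = 0
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:   # no children at all
            break
        if pid == 0:                # children exist, but none have exited yet
            break
        reaped += 1
    return reaped

# Demo: fork a child that exits immediately, then reap it.
if os.fork() == 0:
    os._exit(0)
time.sleep(0.2)                     # give the child time to exit
n = reap_children()
print(n)
```

A PID 1 that skips this loop leaves a zombie entry in the process table for every re-parented child that exits.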
Declarative dependencies
Neither runit nor nitro supports declaring dependencies between
services. However, services can
wait for other
services to be up (and nitro has a special state for that, so only
successfully started services are considered UP).
Personally, I don’t believe service dependencies are of much use. My
experiences with sysvinit, OpenRC, and systemd show that they are hard
to get right and can have funny side effects, such as unnecessary
restarts of other services when something crashed, or long delays
until the system can be brought down.
For system bringup, it can be useful to sequence operations
(e.g. start udevd very early, then bring up the network, then mount
things, etc.). nitro supports this by allowing the SYS/setup script
to start and wait for services. Likewise, services can be shut down in
a defined order.
Deferring to policies
nitro is a generic tool, but many features provided by other
supervisors can be implemented as site policies using separate
tools. For example, nothing stops you from writing a thing to infer
service dependencies and do a “better” bringup. However, this code
doesn’t need to be part of nitro itself, nor run inside PID 1.
Likewise, things like liveness checks can be implemented as separate
tools. External programs can quite easily keep track of too many
restarts and trigger alerts. A simple Prometheus exporter is
included in contrib.
Lack of readiness checks
At some point I want to add readiness checks, i.e. having an explicit
transition from STARTING to UP (as mentioned above, currently this
happens after 2 seconds).
Unfortunately, the existing mechanisms for service readiness
(e.g. systemd’s sd_notify or the s6 notification fd) are incompatible
with each other, and I don’t really like either. But I also don’t
really want to add yet another standard.
Historical background
[This is mostly written down for future reference.]
I think my first exposure to daemontools-style supervision was back in
2005 when I had shared hosting at
Aria’s old company
theinternetco.net.
There was a migration from Apache to Lighttpd, which meant
.htaccess files weren’t supported anymore. So I got my own Lighttpd
instance that was supervised by, if I remember correctly, freedt.
Later, I started the first musl-based Linux distribution
sabotage and built busybox
runit-based init scripts
from scratch.
When Arch (which I used mostly back then) moved towards systemd, I
wrote ignite, a set of
runit scripts to boot Arch. (Fun fact: the last machine running
ignite was decommissioned earlier this year.)
Finally, xtraeme discovered the project and invited me to help
move Void to runit.
En passant I became a Void maintainer.
Work on nitro started around 2020 with some experiments in what a
monolithic supervisor could look like. The current code base was
started in 2023.
This blog post intends to be a definitive guide to context engineering fundamentals from the perspective of an engineer who builds commercial coding assistants and harnesses for a living.
Just two weeks ago, I was back over in San Francisco, and there was a big event on Model Context Protocol Servers. MCP is all hype right now. Everyone at the event was buzzing about the glory and how amazing MCP is going to be, or is, but when I pushed folks for their understanding of fundamentals, it was crickets.
It was a big event. Over 1,300 engineers registered, and an entire hotel was rented out as the venue for the takeover. Based on my best estimate, at least $150,000 USD to $200,000 USD was spent on this event. The estimate was attained through a game of over and under with the front-of-house engineers. They brought in a line array, a GrandMA 3, and had full DMX lighting. As a bit of a lighting nerd myself, I couldn't help but geek out a little.
A GrandMA3 lighting controller is worth approximately $100,000.
To clarify, this event was a one-night meet-up, not a conference. There was no registration fee; attendance was free, and the event featured an open bar, including full cocktail service at four bars within the venue, as well as an after-party with full catering and chessboards. While this post might seem harsh on the event, I enjoyed it. It was good.
Not to throw shade, it was a fantastic event, but holy shit! AI Bubble?
The meetup even hired a bunch of beatboxers to close off the event, and they gave a live beatbox performance about Model Context Protocol...
MC protocol live and in the flesh.
One of the big announcements was the removal of the 128-tool limit from Visual Studio Code...
Why Microsoft? It's not a good thing...
Later that night, I was sitting by the bar catching up with one of the engineers from Cursor, and we were just scratching our heads,
"What the hell? Why would you need 128 tools or why would you want more than that? Why is Microsoft doing this or encouraging this bad practice?"
For the record, Cursor caps the number of MCP tools that can be enabled in Cursor to just 40 tools, and it's for a good reason. What follows is a loose recap. This is knowledge that is known by people who build these coding harnesses, and I hope this knowledge spreads - there's one single truth:
You may perhaps not recognize the name of Kevin S. Braunsdorf, or “ksb”
(kay ess bee) as he was called, but you certainly used one tool he wrote,
together with Matthew Bradburn, namely the implementation of test(1)
in GNU coreutils.
Kevin S. Braunsdorf died last year, on July 24, 2024, after a long illness.
In this post, I try to remember his work and legacy.
⁂
He studied at Purdue University and worked there as a sysadmin from
1986 to 1994. Later, he joined FedEx and greatly influenced how IT
is run there, from software deployments to the physical design of datacenters.
Kevin was a pioneer of what we today call “configuration engineering”,
and he wrote a Unix toolkit called msrc_base to help with these tasks.
(Quote: “This lets a team of less than 10 people run more than 3,200
instances without breaking themselves or production.”)
Together with other tools that are useful in general, he built the
“pundits tool-chain”.
These tools deserve further investigation.
Now, back in those days, Unix systems were vastly heterogeneous and
ridden with vendor-specific quirks and bugs. His tooling centers
around a least common denominator; for example, m4 and make are
used heavily as they were widely available (and later, Perl). C
programs have to be compiled on their specific target hosts. Remote
execution initially used rsh, file distribution was done with
rdist. Everything had to be bootstrappable from simple shell
scripts and standard Unix tools, porting to new platforms was common.
The idea behind msrc
The basic concept of how
msrc
works was already implemented in the first releases from
2000
we can find online: at its core, there is a two-stage Makefile, where
one part runs on the distribution machine, and then the results get
transferred to the target machine (say, with rdist), and then a
second Makefile (Makefile.host) is run there.
This is a practical and very flexible approach. Configuration can be
kept centralized, but if you need to run tasks on the target machine
(say, compiling software across your heterogeneous architectures), that is
possible as well.
Over time, tools were added to parallelize this (xapply), make the
deployment logs readable (xclate), or work around resource
contention (ptbw). Likewise, tools for inventory management and
host definitions were added (hxmd, efmd). Stateful operations on
sets (oue) can be used for retrying on errors by keeping track of
failed tasks….
All tools are fairly well documented, but documentation is spread
among many files, so it takes some time to understand the core
ideas.
Unix systems contain a series of ad-hoc text formats, such as the
format of /etc/passwd. ksb invented a tiny language to work with
such file formats, implemented by the dicer. A sequence of field
separators and field selectors can be used to drill down on formatted
data:
The first field (the whole line) is split on :, then we select the
5th field, split by space, then select the last field ($).
For the basename of the shell, we split by /.
Using another feature, the mixer, we can build bigger strings from
diced results. For example to format a phone number:
Since the dicer and the mixer are implemented as library routines,
they appear in multiple tools.
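To make the dicer and mixer ideas concrete, here is a re-implementation of the two examples in Python. This only mirrors the semantics described above; ksb's actual tools are C library routines with their own selector syntax (not shown here), and the sample passwd line and phone number are made up.

```python
# Re-implementation of the dicer *idea* in Python, not ksb's syntax.
# Walk a /etc/passwd line: split on ":", take the 5th field (GECOS),
# split that on spaces, take the last word ("$"); and take the
# basename of the shell by splitting on "/".

line = "ksb:x:1000:1000:Kevin S. Braunsdorf:/home/ksb:/bin/ksh"

gecos_last = line.split(":")[4].split(" ")[-1]   # 5th field, last word
print(gecos_last)                                # Braunsdorf

shell_base = line.split(":")[6].split("/")[-1]   # basename of the shell
print(shell_base)                                # ksh

# The mixer idea: build a bigger string back up from diced pieces,
# e.g. formatting a raw phone number (digits are illustrative).
digits = "3175551234"
phone = f"({digits[0:3]}) {digits[3:6]}-{digits[6:]}"
print(phone)                                     # (317) 555-1234
```

The appeal of the original design is that these drill-down and reassembly steps are a tiny embedded language, usable anywhere a tool accepts a selector string.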
“Business logic” in m4
One of the more controversial choices in the pundits tool-chain is that
“business logic” (e.g. things like “this server runs this OS and has
this purpose, therefore it should have this package installed”) is
generally implemented using the notorious macro processor m4. But
there were few other choices back then: awk would have been a
possibility, but is a bit tricky to use due to its line-based
semantics. perl wasn’t around when the tool-chain was started, though
it was used later for some things. But m4 shines if you want to
convert a text file into a text file with some pieces of logic.
One central tool is
hxmd,
which takes tabular data files that contain configuration data (such
as which hosts exist and what roles they have), and can use m4
snippets to filter and compute custom command lines to deploy them,
e.g.:
Later, another tool named
efmd
was added that does not spawn a new m4 instance for each configuration line.
m4 is also used as a templating language. There I learned the nice
trick of quoting the entire document except for the parts where you
want to expand macros:
`# $Id...
# Output a minimal /etc/hosts to install to get the network going.
'HXMD_CACHE_TARGET`:
echo "# hxmd generated proto hosts file for 'HOST`"
echo "127.0.0.1 localhost 'HOST ifdef(`SHORTHOST',` SHORTHOST')`"
dig +short A 'HOST` |sed -n -e "s/^[0-9.:]*$$/& 'HOST ifdef(`SHORTHOST',` SHORTHOST')`/p"
'dnl
This example also shows that nested escaping was nothing ksb frowned
upon.
Wrapper stacks
Since many tools of the pundits tool-chain are meant to be used
together, they were written as so-called
“wrappers”,
i.e. programs calling each other. For example, the above-mentioned hxmd
can spawn several commands in parallel using
xapply,
which themselves call
xclate
again to yield different output streams, or use
ptbw
for resource management.
The great thing about the design of all these tools is how nicely they
fit together. You can easily see what need drove the creation of each tool,
and how they can still be used in a very general way,
even for unanticipated use cases.
Influences on my own work
Discovering these tools was important for my own Unix
toolkit and some tools
are directly inspired,
e.g. xe, and
arr.
It's a meme as old as time, and as accurate as ever. The problem is that our digital infrastructure depends upon just some random guy in Nebraska.
Open-source, by design, is not financially sustainable. Finding reliable, well-defined funding sources is exceptionally challenging. As projects grow in size, many maintainers burn out and find themselves unable to meet the increasing demands for support and maintenance.
I'm speaking from experience here: I delivered talks at conferences on this six years ago (see below) and took a decent stab at resolving open-source funding. The settlement on my land on Kangaroo Island was funded through open-source donations, and I'm forever thankful to the backers who supported me during a rough period of my life for helping make that happen.
Rather than watch a 60-minute talk by two burnt-out open-source maintainers, here is a quick summary of the conference video. The idea was simple:
If companies were to enumerate their bills of material and identify their unpaid vendors, they could take steps to mitigate their supply chain risks.
For dependencies that are of strategic importance, the strategy would be a combination of financial support, becoming regular contributors to the project, or even hiring the maintainers of these projects as engineers for [short|long]-term engagements.
Six years have gone by, and I haven't seen many companies do it. I mean, why would they? The software's given away for free, it's released as-is, so why would they pay?
It's only out of goodwill that someone would do it, or in my case, as part of a marketing expenditure program. While I was at Gitpod, I was able to distribute over $33,000 USD to open-source maintainers through the program.
The idea was simple: you could acquire backlinks and promote your brand on the profiles of prolific open-source maintainers, their website and in their GitHub repositories for a fraction of the cost compared to traditional marketing.
Through the above framework, I was able to raise over $33,000 USD for open source maintainers. The approach still works, and I don't understand why other companies are still overlooking it.
Now it's easy to say "marketing business dirty", etc., but what was underpinning this was a central thought.
If just one of those people can help more people better understand a technology, or improve the developer experience for an entire ecosystem, what is the worth/value of that, and why isn’t our industry doing that yet?
The word volunteer, by definition, means those who have the ability and time to give freely.
Paying for resources that are being consumed broadens the list of people who can do open-source. Additionally, money enables open-source maintainers to buy services and outsource the activities that do not bring them joy.
so what has changed since then?
AI has. I'm now eight months into my journey of using AI to automate software development (see below)
I have been actively trying to contribute to CppNix – mostly because using it brings me joy and it turns out so does contributing. 🤗
Stepping into any new codebase can be overwhelming. You are trying to navigate new nomenclature, coding standards, tooling and overall architecture. Nix is over 20 years old and its codebase has its fair share of warts. Knowledge of the codebase is bimodal: either very diffuse or consolidated in a few minds (i.e. @ericson2314). Thankfully everyone on the Matrix channel has been extremely welcoming.
I have been actively following Snix, a modern Rust re-implementation of the components of the Nix package manager. I like the ideals from the project authors of communicating over well-defined API boundaries via separate processes and a library-first type of design. I was wondering however whether we could leverage CppNix as a library as well. 🤔
Is there a need to throw the baby out with the bath water? 👶
Turns out using Nix as a library is incredibly straightforward!
To start, let’s create a devShell that will include our necessary packages: nix (duh), meson (build tool) and pkg-config.
flake.nix
{
  description = "Example of how to use Nix as a library.";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/25.05";
    devshell.url = "github:numtide/devshell";
  };

  outputs = { self, nixpkgs, devshell }:
    let
      lib = nixpkgs.lib;
      systems = [ "x86_64-linux" "aarch64-linux" "x86_64-darwin" "aarch64-darwin" ];
    in {
      devShells = lib.genAttrs systems (system:
        let
          pkgs = import nixpkgs { inherit system; };
        in {
          default = pkgs.mkShell {
            packages = with pkgs; [
              nix
              meson
              ninja
              pkg-config
              # I am surprised I need this
              # I think this is a bug
              # https://github.com/NixOS/nix/issues/13782
              boost
            ];
          };
        });
    };
}
Adding pkg-config to our devShell will initiate a buildHook for any package that contains a dev output and set up the necessary environment variables. This will be the mechanism with which our build tool meson finds the necessary shared-objects and header files.
Let’s now create a trivial meson.build file. Since we have pkg-config set up, we can declare “system dependencies” that we expect to be present, knowing that we are including these dependencies from our devShell.
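Since the file itself is short, here is a hypothetical sketch of such a meson.build. The pkg-config module names (nix-store, nix-expr, nix-main) and the target name are assumptions, not taken from the original project; verify the actual .pc names with pkg-config --list-all inside the devShell.

```meson
# Hypothetical meson.build sketch; dependency names are assumptions
# based on the .pc files the nix package ships.
project('nix-as-a-library', 'cpp', default_options: ['cpp_std=c++20'])

# Found via the PKG_CONFIG_PATH that the devShell's pkg-config hook sets up.
nix_store = dependency('nix-store')
nix_expr  = dependency('nix-expr')
nix_main  = dependency('nix-main')

executable('drv2json', 'main.cc',
  dependencies: [nix_store, nix_expr, nix_main])
```

With this in place, meson setup build && ninja -C build should resolve the Nix headers and shared objects entirely from the devShell.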
For our sample project, I will recreate functionality that is already present in the nix command. We will write a program that accepts a /nix/store path, resolves its derivation, and prints it as JSON.
That feels pretty cool!
Lots of projects end up augmenting Nix by wrapping it with fancy bash scripts; however, we can just as easily leverage it as a library and write native-first code.
Learning the necessary functions to call is a little obtuse; however, I was able to reason through the necessary APIs by looking at the unit tests in the repository.
What idea do you want to leverage Nix for but maybe put off since you thought doing it on top of Nix would be too hacky?
Special thanks to @xokdvium who helped me through some learnings on meson and how to leverage Nix as a library. 🙇
There is endless hype about the productivity boon that LLMs will usher in.
While I am amazed at the utility offered by these superintelligent LLMs, at the moment (August 2025) I remain bearish on the utilization of these tools to have any meaningful impact on productivity especially for production-grade codebases where correctness, maintainability, and security are paramount.
They are clearly helpful for exploring ideas or any goal where the code produced may be discarded at the end.
Thinking about how much promise of productivity we might gain from this tool had me reflecting on what other changes in the past 5 years had already benefited me and a clear winner stands out: GitHub’s code search via cs.github.com.
Pre-2020, code search in the open-source domain never really had a good solution, given the diaspora of various hosting platforms. If you’ve worked in any large corporate environment (Amazon, Google, Meta, etc.) you might have already had exposure to the powers of an incredible code search. The lack of such a tool for public codebases was a limitation we simply worked around. This is partly why third-party libraries were consolidated into well-known projects like Apache or established companies such as Google’s Guava.
An upside to the consolidation of code on GitHub’s platform was capitalized on with the release of their revamped code search. Made generally available in May 2023, the new engine added powerful features like symbol search and the ability to follow references.
The productivity win is clear to me, even with the introduction of LLMs. I visit cs.github.com daily, more frequently and with more interaction than any of the LLMs available to me.
Why?
Finding code written by other humans is fun, and for some reason, more joyful to read. There is a certain level of joy to finding solutions to problems you may be facing that were authored and written by another human. This psychological effect may diminish as the code I’m wading through begins to tilt toward AI-generated content. But for now, the majority of the code I’m viewing still subjectively looks like that authored by a human.
I also tend to work in niche areas such as NixOS or Bazel that don’t have a large corpus of material online so the results from the LLM tend to be more disappointing.
If given a Sophie’s choice between GitHub code search and LLMs, strictly for the purpose of code authorship, I would pick code search as of today.
Humans easily adapt to their environment, a phenomenon known as the hedonic treadmill. As we all get excited for the incoming technology of generative AI, let’s take a moment to reflect on the already amazing contribution to engineering we have become accustomed to due to a wonderful code search.
At DEFCON33, the Nix community had its first-ever presence via nix.vegas, and I ended up in a fun conversation with tomberek 🙌.
“What fun things can we do with < and > with the eventual deprecation of NIX_PATH?”
The actual 💡 was from tomberek and this is a demonstration of what that might look like without necessitating any changes to CppNix itself.
As a very worthwhile aside, the first-time presence of the Nix community at DEFCON was fantastic, and I am extra appreciative to numinit and RossComputerguy 🙇. The badges handed out were so cool: they have strobing LEDs but can also act as a substituter for the Nix infra that was set up.
Okay, back to the idea 💁.
Importing nixpkgs via the NIX_PATH through the angle-bracket syntax has been a long-standing wart on the reproducibility promises of Nix.
let pkgs = import <nixpkgs> { }; in pkgs.hello
There is a really great article on nix.dev about all the problems with this approach to bringing in projects, for those who are still leveraging it.
With the eventual planned removal of support for NIX_PATH, we are now presented with an opportunity for some new functionality in Nix, namely the angle brackets <something>, which can be reconstituted for a new purpose.
Looks like others are already starting to think about this idea. The project htmnix demonstrates the functionality of writing pure-HTML but evaluating it with nix eval 😂.
For something potentially more immediately useful, how about giving quicker access to the attributes of the current flake? 🤔
A common pattern that has emerged is to inject inputs and outputs into extraSpecialArgs so that they are available to modules in NixOS or home-manager.
If we offer a variant of __findFile in scope, Nix will call our implementation rather than the default implementation.
Let’s implement a variant that utilizes builtins.getFlake to return the current flake attributes.
Our goal is to write something as simple as the following and have the contents within the angle brackets be treated as an attribute path of the flake.
<outputs.hello> + " and welcome to Nix!"
What do we have to do to get this to work?
Well, we need to provide our own version of __findFile.
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    nixpkgslib.url = "github:nix-community/nixpkgs.lib";
  };

  description = "Trivial flake that returns a string for eval";

  outputs = { nixpkgslib, nixpkgs, self, }: {
    __findFile = nixPath: name:
      let
        lib = nixpkgslib.lib;
        flakeAttrs = builtins.getFlake (toString ./.);
      in
        lib.getAttrFromPath (lib.splitString "." name) flakeAttrs;

    hello = "Hello from a flake!";
    example = builtins.scopedImport self ./default.nix;
  };
}
We write a version of __findFile that trivially splits the contents within the angle brackets into an attribute path, and looks it up in the attrset of the flake as returned by builtins.getFlake (toString ./.).
There is some additional magic with builtins.scopedImport 🪄, which is not documented. It allows supplying a different base set of variables for the imported file, via a provided attrset. This is how we can override __findFile in all subsequently imported files.
So does this even work?
> nix eval .#example --impure
"Hello from a flake! and welcome to Nix!"
Yes! 🔥 With the caveat that we had to provide --impure since getting the current flake via ./. requires it.
This is a pretty ergonomic way to access the attributes of the current flake automatically, without all of us having to go through the same setup for what amounts to a common best practice.
The need for --impure is a bit of a bummer, although this is still a pretty neat improvement. There could be a new builtin, builtins.getCurrentFlake, which automatically provides the context of the current flake and could therefore be pure.
Update: simpler & pure
I got some wonderful feedback from eljamm via the discourse post that we can just leverage self and avoid having to use builtins.getFlake.
{
  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
    nixpkgslib.url = "github:nix-community/nixpkgs.lib";
  };

  description = "Trivial flake that returns a string for eval";

  outputs = { nixpkgslib, nixpkgs, self, }: {
    __findFile = nixPath: name:
      let
        lib = nixpkgslib.lib;
      in
        lib.getAttrFromPath (lib.splitString "." name) self;

    hello = "Hello from a flake!";
    example = builtins.scopedImport self ./default.nix;
  };
}
We now don’t need to provide --impure 👌 and we get all the same fun new ergonomic way to access flake attributes.
> nix eval .#example
"Hello from a flake! and welcome to Nix!"