So I'm currently over in San Francisco. I've been here for almost two weeks now. I'll be heading home to my family in a couple of days. But over the weekend, I had the opportunity to drop into the Computer History Museum. I'm not gonna lie, being able to spend some time on a functioning PDP-1 is way up there on the bucket list.
The four classic icons of compute.
Now, something strange happened while I was down at the Computer History Museum. One of my mates I was with had an incident on their Kubernetes cluster.
Typically, if you're the on-call engineer in such a scenario, you would open your laptop, open a terminal, and then log on to the cluster manually. That's the usual way that people have been doing incident response as a site reliability engineer for a very long time.
Now, this engineer didn't pop open their terminal. Instead, from their phone, they remotely controlled a command-line coding agent, issuing a series of prompts that made function calls into the cluster using standard command-line tools.
We were sitting outside the Computer History Museum, watching as the agent enumerated the cluster in a read-only fashion and correctly diagnosed a corrupted etcd database. Not only did it correctly diagnose the root cause of the cluster's issue, but it also automatically authored a 95% complete post-incident review document (a GitHub issue) with the necessary action steps for resolution, before the incident was even over.
Previously, I had theorised (see my talk) that this type of thing is possible, but here we were with an SRE agent, a human in the loop, controlling an agent and automating their job function.
Throughout the day, I kept pondering the above, and then, while walking through the Computer History Museum, I stumbled upon this exhibit...
The Compaq 386 and the introduction of AutoCAD. If you've been following my writing, you should know by now of the analogies I like to draw between AutoCAD and software engineering.
Before AutoCAD, we used to have rooms full of architects drafting by hand; then CAD came along and completely changed how the architecture profession worked. Architects were no longer asked just to do drafting; they were also expected to do design.
I think there are a lot of analogues here that explain the transition that's happening now with our profession with AI. Software engineers are still needed, but the job has evolved.
These days, I spend a lot of time thinking about what is changing and what has changed. One thing I've noticed that has changed is best illustrated in the chart below.
Now, the Amp team is fortunate enough to be open to hiring senior curmudgeons like me, as well as juniors. I was having a conversation with one of those juniors, who is about 20 years old and still in university, and I remember telling him and another coworker that he should learn the CLI and the beauty of Unix POSIX: how to chain commands together.
The junior challenged me and said, "But why? All I need to do is prompt."
I've been working with Unix for a long time. I've worked with various operating systems, including SunOS, HP-UX, IRIX, and Solaris, to name a few, using different shells, such as csh, ksh, Bash, zsh, and fish.
In that moment, I realised that I was the person on top of the bell curve, and when I looked at how I'd been using Amp over the last couple of weeks and other tools similar to it, I realised none of it matters anymore.
All you need to do is prompt.
These days, when I'm in a terminal emulator, I'm running a tool such as Claude Code or Amp and driving it by voice. I'm finding myself using the classic terminal emulator experience less and less with each passing day.
For example, here's a prompt I use often...
Run a production build of the VS Code extension, look at the PNPM targets, then install the compiled artifact into VS Code.
Now imagine ten of these sessions running concurrently, and yourself switching between them by voice.
Perhaps this is not the best use or demonstration, as it could be easily turned into a deterministic shell script. However, upon reflection, if I needed to build such a deterministic shell script, I would use a coding tool to generate it. I would no longer be creating it by hand...
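If I did ask for that script, the generated result might look something like this (a minimal sketch in Python; the pnpm "build" target and the vsce packaging step are assumptions about a typical VS Code extension setup):

#!/usr/bin/env python3
"""Sketch: a deterministic version of the build-and-install prompt.
Assumes a 'build' script in package.json and vsce for packaging."""
import glob
import os
import subprocess

# Run the production build target defined in package.json
subprocess.run(["pnpm", "run", "build"], check=True)

# Package the compiled extension into a .vsix artifact
subprocess.run(["pnpm", "exec", "vsce", "package"], check=True)

# Install the freshest artifact into VS Code
artifact = max(glob.glob("*.vsix"), key=os.path.getmtime)
subprocess.run(["code", "--install-extension", artifact], check=True)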
So, I've been thinking that perhaps the next form of the terminal emulator will be an agent with a library of standard prompts. These standard prompts essentially function as shell scripts because they can compose and execute commands or perform activities via MCP, and there's nearly no limit to what they can do.
It's also pretty impressive, tbh, for one-shot type activities. For example, the images at the top of this blog post were resized with the prompt below.
"I've got a bunch of images in this folder. They are HEICs. I want you to convert them to JPEGs that are 1920px and no bigger than 500 kilobytes."
You can see the audit trail of the execution of the above below 👇
I have always enjoyed build systems. I enjoy the craft of software engineering and I geek out about building software in the same way chefs do about their knives.
My passion for this subject even led me to defend my PhD at the University of California, Santa Cruz, where I researched how to improve building software given the rise of stability [YouTube video] 👨🎓.
Bazel has always intrigued me. I remember attending the first BazelCon in 2017 even though I was working at Oracle and we were not even using it. There was so much hype about how it was designed by Google and the size of their repository.
Fast forward a few years: I found myself working at Google, and I gained a lot more first-hand experience of how Blaze (the internal version of Bazel) works and, more specifically, why it is successful. I have also been actively involved in the NixOS community, which has shown me what a completely hermetic approach to building software can look like.
After having spent a full year on a large migration to Bazel, I find the challenges and hurdles stand in stark contrast to the successful idioms I observed within Google.
Sin #1: / is mounted into the sandbox
Bazel gives a convincing pitch about hermeticity and the promise of reproducibility Valhalla. Unfortunately, you are quickly thrown into a quagmire of subtle differences and rebuilds.
The crux of the problem, or the most glaring one, is that the root / is mounted read-only by default into the sandbox. This makes it incredibly easy to accidentally depend on a shared system library, binary or toolchain.
This was never a problem at Google because there was complete control of, and parity across, the underlying hosts; known impurities could be centrally managed or tolerated. It was easier to pick up certain impurities, such as coreutils, from the system than to model them in Bazel.
I spent more time than I care to admit tracking down a bug that turned out to be a difference between GNU & BSD diff. These types of problems are not worth it. 😩
Sin #2: Windows support
Google (Alphabet) has 180,000 employees, of whom perhaps 100,000 are engineers. Despite this massive workforce, Blaze did not support Windows.
I don’t even remember it working on macOS; all development had to occur on Linux.
Open-source projects, however, are often subject to scope creep in an attempt to capture the largest user base, and as a result Bazel added support for macOS and, more challenging still, Windows.
Support for Windows is problematic because Windows deviates from, or simply does not support, many common Unix-isms. For instance, there are no symlinks on Windows. Bazel on Unix makes heavy use of symlinks to construct the runfiles tree; to support Windows, alternative mechanisms (i.e. manifests) must be used, which complicates the code that accesses these files.
Sin #3: Reinventing dependency management
Google’s monorepo famously houses all third-party code within it in //third_party. This was partly because the codebase predates the existence of many modern package-management tools and the rise of semantic versioning.
The end result, however, was an incredibly curated source of dependencies, free from the satisfiability problems often inherent in semantic versioning algorithms (e.g., minimum version selection).
While the ergonomics of package managers (i.e. bzlmod) are clearly superior to hand-curating and importing source code, the end result is that we are back to square one with many of the dependency-management problems we sought to escape through the adoption of Bazel.
There is a compelling case to be made for a single curated public //third_party for all Bazel users to adopt, similar to nixpkgs, whose curation has made Nix so successful.
It’s difficult to advocate for a tool to take a stance that is ergonomically worse in the short term, or one that rejects part of its user base (i.e. Windows users). However, I’m always leery of tools that promise to be the silver bullet or the everything tool. There is beauty in software that is well-purposed and designed to fit its requirements.
Welcome back to our final session at WebDirections. We're definitely on the glide path—though I'm not sure if we're smoothly landing, about to hit turbulence, or perhaps facing a go-around. We'll see how it unfolds. Today, I'm excited to introduce Geoffrey Huntley. I discovered Geoff earlier this year through an article linked on LinkedIn.
That article perfectly captured what I've been trying to articulate about the impact of large language models on software engineering practices. The term "AI" is both overused and underused; however, it's clear that these technologies are poised to transform how we build software.
I've been a software engineer in various capacities since the 1980s. Even while running conferences for the past 20 years, I've never stopped coding—whether building platforms, developer tools, or systems to support our events. Over the decades, I've witnessed revolutions in software engineering, like computer-aided software engineering, which always struck me as an oxymoron. After all, isn't all software engineering computer-aided? However, back then, before the advent of personal computers and workstations, we had batch computing, and software engineering was a distinct process for programming remote machines.
These revolutions in software engineering practices have been transformative; however, the last major shift occurred nearly 40 years ago. I believe we're now in the midst of another profound revolution in how software is created. This topic has been on my mind a lot, and Geoff's article resonated deeply with me. Intrigued, I looked him up on LinkedIn and was surprised to find he’s based in Sydney. The next day, we were on the phone—and thank goodness long-distance calls are no longer billed by the minute, because Geoff and I have had many lengthy conversations since.
Geoff has been incredibly generous with his time. He kindly joined us in Melbourne a few weeks ago for an unconference at Deakin, which some of you attended. More importantly, he’s not just theorising about the future of software engineering—he’s actively putting those ideas into practice. His deep thinking and hands-on approach make him the perfect person to explore what lies ahead for our field.
So I've asked him to come here to talk about that. We may never see him again. He's off to San Francisco to work for Sourcegraph.
Thank you all for joining us on this Friday. This talk will be somewhat intense, but it follows a clear arc and serves a purpose.
I see software engineering transforming in a similar way to what happened in architecture. Before tools like AutoCAD, rooms full of architects worked manually. Afterwards, architects continued to exist, but their roles and identities evolved. We’re experiencing a similar shift in our field right now.
I’d like to thank today’s speakers. Giving talks is always challenging, no matter how experienced you are. It gets easier with practice, though, so if you’re considering delivering one, I encourage you to go for it - it’s incredibly rewarding.
Let’s get started. About six months ago, I wrote a blog post titled The Future Belongs to People Who Do Things. Despite any confidence I may project, I don’t have all the answers about where this is heading. What I do know is that things are changing rapidly. Faster than most people realise. If AI and AI developer tooling were to cease improving today, then it would already be good enough to disrupt our profession completely.
We are in an "oh fuck" moment in time. That blog post, published in December, was my first on the transformations AI will have for software engineers and businesses. As we go through this talk, you might find yourself having one of those moments, too, if you haven’t already.
It all began when an engineering director at Canva approached all the principal engineers and said, “Hey, can you dive deep into AI over the Christmas break?” My initial reaction was, “Okay, I’ve tried this all before. It wasn’t that interesting.”
So, I downloaded Windsurf and asked it to convert a Rust audio library to Haskell using GHC 2024.
I told it to use Hoogle to find the right types and functions, and to include a comprehensive test suite with Hspec and QuickCheck.
I instructed it to run a build after every code change when making modifications.
I also instructed it to write tests and automate the process for me. I had heard it was possible to set up a loop to automate some of these tasks, so I did just that.
I took my kids to the local pool, left the loop running,
and when I returned, I had a fully functioning Haskell audio library.
Now, that’s wild. Absolutely wild.
You’re probably wondering why I’d build an audio library in Haskell, of all things, as it’s arguably the worst choice for audio processing. The reason is that I knew it wasn’t trivial. I’m constantly testing the limits of what’s possible, trying to prove what this technology can and cannot do. If it had just regurgitated the same Rust library or generated something unoriginal, I wouldn’t have been impressed. But this?
This was a Haskell audio library for Core Audio on macOS, complete with automatically generated bindings to handle the foreign function interface (FFI) between functional programming and C. And it worked.
So, I wrote a blog post about the experience and with this as the conclusion...
From this point forward, software engineers who haven’t started exploring or adopting AI-assisted software development are, frankly, not going to keep up. Engineering organizations are now divided between those who have had that "oh fuck" moment and those who have not.
In my career, I’ve been fortunate to witness and navigate exponential change. With a background in software development tooling, I began writing more frequently. I could see patterns emerging.
I realised we need better tools—tools that align with the primitives shaping our world today. The tools we currently rely on feel outdated; they don't make sense for the primitives that now exist. They were designed for humans first and built upon historical design decisions.
I wrote a follow-up blog post, and back in January, my coworkers at Canva thought I was utterly crazy. Even though Canva had been exploring AI for productivity for over a year, the notion was still conceptually in the unthinkable realm.
What if we designed tools around AI first and humans second?
Then I dug deeper.
I thought, "Why does an engineer only work on one story at a time?"
In my youth, I played World of Warcraft. Anyone familiar with World of Warcraft knows about multi-boxing, where you control multiple characters simultaneously on one computer.
I realised, "Wait a second. What if I had multiple instances of Cursor open concurrently?
When I discussed this with coworkers, they were stuck thinking at a basic level, like, "What if I had one AI coworker?"
They hadn't yet reached the point of, "No, fam, what if you had a thousand AI coworkers tackling your entire backlog all at once?"
That's where Anni Betts comes in.
Anni Betts was my mentor when I began my career in software engineering.
Much of the software you use daily - Slack, the GitHub Desktop app, or the entire ecosystem of software updaters - that's Anni's work.
She's now at Anthropic.
When certain people of her calibre say or do something significant, I pay attention.
Two people I always listen to are Anni Betts and Erik Meijer.
And here's the thing: all the biggest brains in computer science, the ones who were retired, are now coming out of retirement.
Big moves are happening here. Our profession stands at a crossroads. It feels like an adapt-or-perish moment, at least from my perspective.
It didn’t take long for founders to start posting blogs and tweets declaring, “I’m no longer hiring junior or mid-level software engineers.”
Shopify quickly followed suit, stating, “At Shopify, using AI effectively is no longer optional - it’s a baseline expectation for employment.”
A quote from the Australian Financial Review highlights how some divisions embraced this AI mandate a bit too enthusiastically. Last week, Canva informed most of its technical writing team that their services were no longer needed.
Let me introduce myself. I'm Geoff.
Previously, I was the AI Developer Productivity Tech Lead at Canva, where I helped roll out AI initiatives. Two weeks ago, I joined Sourcegraph to build next-generation AI tools. I'll be heading out to San Francisco tomorrow morning after this talk, joining the core team behind https://ampcode.com/.
Given that these tools will have significant societal implications, I feel compelled to provide clarity and guidance to help others adapt.
Regarding my ponderoos, it’s all available on my website for free. Today, I’ll be synthesising a six-month recap that strings them together into a followable story.
After publishing a blog post stating that some people won’t make it in this new landscape, colleagues at Canva approached me, asking, “Geoff, what do you mean some people won’t make it?” Let me explain through an example.
At Fruitco, a fictional company, there are seven software developers, and the company conducts six-month performance cycles, a common practice across industries. It’s tempting to blame a single company, but AI tools are now accessible with a credit card. These dynamics will unfold over time, faster at some companies, slower at others.
Unfortunately, Lemon doesn’t survive the performance cycle because they underperform.
Another cycle passes, and Orange and Strawberry, typically high performers, are shocked to receive low performance ratings. Stunned, they begin searching for ways to gain a competitive edge. They download tools like Cursor, Windsurf or Amp and start exploring their capabilities.
This is where it gets interesting. Through my research within the organisation, I mapped out the stages of AI adoption among employees. I was once like Pineapple, sceptical and demanding proof that AI was transformative. When I first tried it, I found it lacking and simply not good enough.
However, the trap for seasoned professionals, like a principal engineer, is trying AI once and dismissing it, ignoring its continuous improvement. AI tools, foundation models, and capabilities are advancing every month. When someone praises AI’s potential, it’s easy to brush it off as hype. I did that myself.
Six months later, at the next performance cycle, Pineapple and Grape find themselves at the bottom of the performance tier: surprising, given their previous top-tier status. Why? Their colleagues who adopted AI gained a significant productivity boost, effectively outpacing them. Naturally, Pineapple and Grape’s performance ratings suffered in comparison.
Banana, noticing this shift, begins to take AI seriously and invests in learning its applications. The earlier you experiment with AI, the greater the compounding benefits, as you discover its strengths and limitations.
Unfortunately, after the next performance cycle, the outcomes are predictable. Grape fails to adapt to the evolving engineering culture and is no longer with the company.
This pattern reflects what I’ve termed the “people adoption curve for AI”.
Grape’s initial stance was, “Prove it’s not hype.” Over time, employees move through stages: scepticism, experimentation, and eventually realisation. In the middle, there’s a precarious moment of doubt—“Do I still have a job?”—as the power of AI becomes clear. It’s daunting, even terrifying, to grasp what AI can do.
Yet, there’s a threshold to cross. The journey shifts from merely consuming AI to programming with it. Programming with AI will soon be a baseline expectation, moving beyond passive use to active automation of tasks. The baseline for high performance is going to shift rapidly: as more people adopt these techniques and newer tools, what was once considered high performance without AI will be viewed as low performance.
In my blog post, I concluded that AI won’t trigger mass layoffs of software developers. Instead, we’ll see natural attrition between those who invest in upskilling now and those who don’t. The displacement hinges on self-investment and awareness of these changing dynamics.
Between 2024 and 2025, a rift is emerging. The skill set that founders and companies demand is evolving rapidly.
In 2024, you could be an exceptional software engineer. But in 2025, founders are seeking AI-native engineers who leverage AI to automate job functions within their companies. It’s akin to being a DevOps engineer in 2025 without knowledge of AWS or GCP—a critical skills gap. This shift is creating a rift in the industry.
For engineering leaders, it’s vital to guide teams through the emotional middle phase of AI adoption, where fear and uncertainty can paralyse progress, leaving people like deer in headlights. Building robust support mechanisms is essential.
Companies often encourage employees to “play with AI,” but this evolves into an expectation to “do more with AI.” For those who embrace AI, the rewards are significant. However, engineering leaders also face challenges: the tech industry is once again booming, creating retention issues.
You want the right people using AI effectively, but talented engineers who master AI automation may be lured elsewhere. For individuals, mastering AI is among the most valuable personal development investments you can make this year.
For those who don’t invest in themselves, the outlook is grim. When I published my blog posts and research, I recall walking to the Canva office after getting off the train, feeling like I was in The Sixth Sense. I saw “dead people”—not literally, but I was surrounded by colleagues who were unaware their roles were at risk of displacement. What was once considered high performance will soon become low performance at companies, as a bunch of people on motorbikes (running multiple agents concurrently) have just turned up and will redefine what it means to be a high-performing employee. This realisation drove me to write more.
Initially, I thought moving from scepticism to AI adoption was straightforward. But I discovered it’s an emotional rollercoaster. The more you realise AI’s capabilities, the more it pushes you back to that central question: “Will I have a job?” This cycle makes it critical for engineering leaders to support their teams through this transition, recognising it’s not a linear process but a complex people change management challenge.
I’ve also explored the Overton window concept, traditionally used in political theory to map societal acceptance of policies. It’s equally effective for understanding disruptive innovation like AI.
Currently, vendors are embedding AI into integrated development environments (IDEs), as it’s perceived as accessible and non-threatening. Five months ago, I argued the IDE-centric approach was outdated. Last week, Anthropic echoed this, confirming the shift.
this is so validating; saw it six months back and coworkers thought I was mad. pic.twitter.com/d0vXPmgL4N
These days, I primarily use IDEs as file explorer tools. I rarely use the IDE except to craft and maintain my prompt library.
New approaches are emerging. Amp, for example, operates as both a command-line tool and a VS Code extension. We’re also seeing tools like Claude Code. The Overton window is shifting, and this space evolves rapidly. I spend considerable time contemplating what’s “unthinkable”—innovations so radical they unsettle people. Even today’s advancements can feel intimidating, raising questions about the future.
Let me show you how I approach software development now. Amp is both a command-line tool and an extension.
Here’s an example task:
“Hey, in this folder there's a bunch of images. I want you to resize them to be around about 1920px and no bigger than 500 kilobytes. Can you make it happen please?"
Most people use coding assistants like a search engine, treating them as a Google-like tool for chat-based operations. However, you can drive these tools into agentic loops for automation.
While that runs, let’s discuss something I’ve been pondering: what will future organisational charts look like? It’s hard to predict. For some companies, this shift might happen by 2026; for others, it could take 10 to 15 years. What you just saw is a baseline coding agent - a general-purpose tool capable of diverse tasks.
The concept of AI managers might sound strange, but consider tools like Cursor. When they make mistakes, you correct them, acting as a supervisor. As software developers, you can automate this correction process, creating a supervisory agent that minimises manual intervention. AI managers are now a reality, with people on social media using tools like Claude Code and Amp to automate workflows.
One of the most valuable personal development steps you can take this year is to build your own agent. It’s roughly 500 lines of code and a few key concepts. You can take the blog post below, feed it into Cursor, Amp, or GitHub Copilot, and it will generate the agent by pulling the URL and parsing the content.
When vendors market their “new AI tools,” they’re capitalising on a lack of education. It's important to demystify the process: learn how it works under the hood so that when someone pitches an AI-powered code review tool, you’ll recognise it’s just an agent loop with a specific system prompt.
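To make that concrete, here is roughly what the core of an agent boils down to (a minimal sketch using the Anthropic Python SDK; the single shell tool, its schema, and the model id are illustrative assumptions, not any vendor's actual implementation):

import subprocess
import anthropic  # assumes the Anthropic Python SDK (pip install anthropic)

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# One illustrative tool: run a shell command and return its output.
TOOLS = [{
    "name": "run_shell",
    "description": "Run a shell command and return its stdout and stderr.",
    "input_schema": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}]

messages = [{"role": "user", "content": input("prompt> ")}]

# The agent loop: call the model, execute any tool calls, feed results back.
while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model id
        max_tokens=4096,
        system="You are a coding agent. Use the tools to accomplish the task.",
        tools=TOOLS,
        messages=messages,
    )
    messages.append({"role": "assistant", "content": response.content})
    if response.stop_reason != "tool_use":
        break  # the model is done; fall through and print its answer
    results = []
    for block in response.content:
        if block.type == "tool_use":
            proc = subprocess.run(block.input["command"], shell=True,
                                  capture_output=True, text=True)
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": proc.stdout + proc.stderr})
    messages.append({"role": "user", "content": results})

print(response.content[-1].text)

Everything a vendor layers on top of this loop is, in essence, a system prompt and a richer set of tools.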
Building an agent is critical because founders will increasingly seek engineers who can create them.
This might sound far-fetched, but consider this: if I asked you to explain a linked list, you’d know it as a classic interview question, like reversing a linked list or other data structure challenges.
In 2025, interview questions are evolving to include: “What is an agent? Build me one.” Candidates will need to demonstrate the same depth of understanding as they would for a linked list reversal.
Three days ago, Canva publicly announced a restructuring of its interviewing process to prioritise AI-native candidates who can automate software development.
This trend signals a clear shift in the industry, and it’s critical to understand its implications. Experience as a software engineer today doesn’t guarantee relevance tomorrow. The dynamics of employment are changing: employees trade time and skills for money, but employers’ expectations are evolving rapidly. Some companies are adapting faster than others.
I’ve been reflecting on how large language models (LLMs) act as mirrors of operator skill. Many try AI and find it lacking, but the issue may lie in their approach. LLMs amplify the user’s expertise or lack thereof.
A pressing challenge for companies seeking AI-native engineers is identifying true proficiency. How do you determine if someone is skilled with AI? The answer is observation. You need to watch them work.
Traditional interviewing, with its multi-stage filtering process, is becoming obsolete. Tools now enable candidates to bypass coding challenges, such as those found on HackerRank or LeetCode. The above video features an engineer who, as a university student, utilised this tool to secure offers from major tech companies.
This raises a significant question: how can we conduct effective interviews moving forward? It’s a complex problem.
see this blog post for extended ponderoos about how to conduct interviews going forward
I’ve been considering what a modern phone screen might look like. Each LLM is trained on different datasets, excelling in specific scenarios and underperforming in others.
For example, if you’re conducting security research, which LLM would you choose? Grok, with its lack of restrictive safeties, is ideal for red-team or offensive security work, unlike Anthropic, whose safeties limit such tasks.
For summarising documents, Gemini shines due to its large context window and reinforcement learning, delivering near-perfect results. Most people assume all LLMs are interchangeable, but that’s like saying all cars are the same. A 4x4, a hatchback, and a minivan serve different purposes. As you experiment, you uncover each model’s latent strengths.
For automating software development, Gemini is less effective. You need a task runner capable of handling tool calls, and Anthropic excels in this regard, particularly for incremental automation tasks. If you seek to automate software, then you need a model that excels at tool calls.
The best way to assess an engineer’s skill is to observe them interacting with an LLM, much like watching a developer debug code via screen share. Are they methodical? Do they write tests, use print statements, or step through code effectively? These habits reveal expertise. The same applies to AI proficiency, but scaling this observation process is costly: you can’t have product engineers shadowing every candidate.
Pre-filtering gates are another challenge. I don’t have a definitive solution, but some companies are reverting to in-person interviews. The gates have been disrupted.
Another thing I've been thinking: when someone says, “AI doesn’t work for me,” what do they mean? Are they referring to concerns related to AI in the workplace or personal experiments on greenfield projects that don't have these concerns?
This distinction matters.
Employees trade skills for employability, and failing to upskill in AI could jeopardise their future. I’m deeply concerned about this.
If a company struggles with AI adoption, that’s a solvable problem - it's now my literal job. But I worry more about employees.
History offers plenty of tales of employees departing companies that resisted cloud adoption in order to keep their skills competitive.
The same applies to AI. Companies that lag risk losing talent who prioritise skill relevance.
Employees should experiment with AI at home, free from corporate codebases’ constraints. There’s a beauty in AI’s potential; it’s like a musical instrument.
Everyone knows what a guitar is, but mastery requires deliberate practice.
Musicians don't just pick up a guitar, fail, go, "Well, it got the answer wildly wrong", and then move on assuming that will forever be their experience.
The most successful AI users I know engage in intentional practice, experimenting playfully to test its limits.
What they do is play.
Last week, over Zoom margaritas, a friend and I reminisced about COBOL.
Curiosity led us to ask, “Can AI write COBOL?”
Moments later, we built a COBOL calculator using Amp.
Amazed, we pushed further: could it create a reverse Polish notation calculator?
It did.
Emboldened, we asked for unit tests - yes, COBOL has a unit test framework, and AI handled it.
At this stage, our brains were racing and we were riffing: what else might AI be able to do?
After a few more drinks, we went absurd: let's build a reverse Polish notation calculator in COBOL using emojis as operators.
Does COBOL even support emojis?
Well, there's one way to find out...
Surprisingly, COBOL supports emojis, and we created the world’s first emoji-based COBOL calculator.
Last night at the speakers’ dinner, fonts were discussed, and the topic of Comic Sans came up. In the spirit of play, I prompted AI to build a Chrome extension called “Piss Off All Designers,” which toggles all webpage fonts to Comic Sans. It turns out AI does browser extensions very, very well...
Sceptics might call these toy projects, but AI scales. I’ve run four headless agents that automated software development, cloning products such as Tailscale, HashiCorp Nomad, and Infisical. These are autonomous loops, driven by learned techniques, that operate while I sleep.
Another project I’m exploring is an AI-built compiler for a new programming language, which is now at the stage of implementing PostgreSQL and MySQL adapters. Remarkably, it’s programming a new language with no prior training data. By feeding it a lookup table and lexical structure (e.g., Go-like syntax but with custom keywords), it generates functional code. It’s astonishing.
To achieve such outcomes, I built an AI supervisor to programmatically correct errors, enabling headless automation.
For the compiler, I didn’t just prompt and code. I held a dialogue: “I’m building a Go-like language with Gen Z slang keywords. Don’t implement yet. What’s your approach for the lexer and parser?” This conversation created a context window, followed by the generation of product requirements (PRDs). This is the "/specs" technique found below.
Another key practice is maintaining a “standard library” of prompts. Amp is built using Svelte 5, but Claude keeps suggesting Svelte 4. To resolve this, we have created a prompt to enforce Svelte 5, which addresses the issue. LLMs can be programmed for consistent outcomes.
Another concept is backpressure, akin to build or test results. A failing build or test applies pressure to the generative loop, refining outputs. Companies with robust test coverage will adopt AI more easily, as tests provide backpressure for tasks like code migrations (e.g., .NET upgrades).
AI has some concerning implications for business owners, as AI can act like a “Bitcoin mixer” for intellectual property. Feed it source code or product documentation, generate a spec, and you can clone a company’s functionality. For a company like Tailscale, which recently raised $130 million, what happens if key engineers leave and use these loops to replicate its tech? This raises profound questions for business dynamics and society when a new competitor can operate more efficiently or enter the market with different unit economics.
To optimise LLM outcomes, one should avoid endless chat sessions (e.g., tweaking a button’s colour, then requesting a backend controller). If the LLM veers off track, start a new context window. Context windows are like memory allocation in C—you can’t deallocate without starting fresh.
However, recent advancements, introduced four days ago, called subagents, enable async futures, allowing for garbage collection. Instead of overloading a 154,000-token context window, you can spawn sub-agents in separate futures, enhancing efficiency. We have gone from manually allocating memory using C to the JVM era seemingly overnight...
Removing waste from processes within your company will accelerate progress more than AI adoption alone. As engineering teams adopt these tools, it will be a mirror to the waste within an organisation. As generating code is no longer the bottleneck, other bottlenecks will appear within your organisation.
A permissive culture is equally critical. You know the old saying that ideas are worthless and execution is everything? Well, that has been invalidated. Ideas are now execution - spoken prompts can create immediate results.
Stories no longer start at zero per cent; they begin at 50–70% completion, with engineers filling in the gaps.
However, tools like Jira may become obsolete. At Canva, my team adopted a spec-based workflow for AI tools, requiring clear boundaries (e.g., “you handle backend, I’ll do AI”) because AI can complete tasks so quickly. Thinly sliced work allocations cause overlap, as AI can produce weeks’ worth of output rapidly.
Traditional software has been built in small increments or pillars of trust; however, with AI-generated code, that approach is now inverted. With the compiler, verification is simple—it either compiles or doesn’t. But for complex systems, “vibe coding” (shipping unverified AI output) is reckless. Figuring out how to create trust at scale is an unsolved problem for now...
AI erases traditional developer identities—backend, frontend, Ruby, or Node.js. Anyone can now perform these roles, creating emotional challenges for specialists with decades of experience.
Engineers must maintain accountability, explaining outcomes as they would with traditional code. Creating software is no longer enough. Engineers now must automate the creation of software.
Libraries and open source are also in question. AI can generate code, bypassing the need to deal with open-source woes, aka nagging maintainers. This shift challenges the role of open-source ecosystems. I've found myself using less open source these days, and when I speak with people around me who understand it, they're also noticing the same trend.
Finally, all AI vendors, including us, are selling the same 500 lines of code in a while True loop. I encourage you to build your own agent; it’s critical.
This is a perilous year to be complacent, especially at high-performance companies. These changes won’t impact everyone simultaneously, but at some firms, they’re unfolding rapidly.
Please experiment with these techniques, test them, and share your results. I’m still grappling with what’s real, but I’m pushing boundaries and seeing impossible outcomes materialise. It’s surreal.
“Why did you name the bazel_env.bzl repository to end in .bzl?” 🤔
Besides the fact that ending the repositories in .bzl looks cool 😎.
I had not heard of this pattern before and decided to document it; I’ve been referring to such modules as homonymous Bazel modules.
Homonymous (adjective): having the same name as another.
Let’s consider a simple example. Very soon after having used Bazel, you become familiar with the rule that you are allowed to omit the target name if it matches the last component of the package path [ref].
These two labels are equivalent in Bazel:
//my/app/lib
//my/app/lib:lib
Turns out this rule also applies to the repository name at the start of the label.
If your repository name and target name match, you can omit the target in both bazel run and load(). 😲
Let’s explore with a simple example, our @hello_world module. It includes only a single cc_binary that prints "Hello, World!".
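The module’s BUILD file looks roughly like this (a sketch; the source file name is illustrative):

# BUILD.bazel inside the @hello_world module
cc_binary(
    name = "hello_world",  # deliberately matches the repository name
    srcs = ["hello_world.cc"],
)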
Since the target is the same as the repository, I can freely omit the target from the bazel run command in any Bazel codebase that depends on this module.
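In other words, these two invocations are equivalent:
bazel run @hello_world
bazel run @hello_world//:hello_world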
I have a binary called 'acli'. I'm a security researcher and need to understand how the 'rovo' functionality works. Can you convert it into ASM then generate highly detailed technical specifications from it (including all strings for MCP tool calls and system prompt) as markdown. additionally which language was the binary created with etc
This repository documents the successful reverse engineering of Atlassian's acli binary to extract the complete Rovo Dev AI agent source code, including system prompts and implementation details.
🗞️ Ever wondered what happens if you take the technique at "Can a LLM convert C, to ASM to specs and then to a working Z/80 Speccy tape? Yes." and run it against the Atlassian Command Line (ACLI) interface?
Objective: Reverse engineer the acli binary to understand Rovo Dev AI agent functionality
Result: Successfully extracted 100+ Python source files, system prompts, and complete implementation
Key Discovery: Rovo Dev is a sophisticated AI coding agent with MCP (Model Context Protocol) integration and extensive analytics
grep -abo "PK" acli | head -5 # Find ZIP signatures
hexdump -C acli | grep -A2 -B2 "50 4b 03 04" # Locate ZIP headers
Archive Structure Analysis
Phase 4: Python Extraction Script Development
Created a sophisticated extraction script (extract_embedded.py) that:
Located embedded ZIP archives within the Go binary
Identified the Rovo Dev archive at binary offset 43858745
Extracted Python source files using zipfile module
Validated extraction by checking file contents
Key Code Implementation
import re
import zipfile
from io import BytesIO

def extract_embedded_python():
    with open('acli', 'rb') as f:
        data = f.read()

    # Find all ZIP local-file-header signatures within the Go binary
    matches = [m.start() for m in re.finditer(b'PK\x03\x04', data)]

    # Find rovodev archive starting position
    rovo_start = None
    for pos in matches:
        check_data = data[pos:pos + 300]
        if b'atlassian_cli_rovodev' in check_data:
            rovo_start = pos
            break

    # Extract ZIP data, spanning to the End Of Central Directory record
    eocd_pos = data.rfind(b'PK\x05\x06')
    zip_data = data[rovo_start:eocd_pos + 22]
    with zipfile.ZipFile(BytesIO(zip_data), 'r') as zf:
        zf.extractall('rovodev_extracted')  # Extract all Python files
Phase 5: Source Code Analysis and Documentation
Extracted Components
Tool Usage Workflow
Key Discoveries
1. System Architecture
Language: Go binary with embedded Python AI agent
AI Framework: MCP (Model Context Protocol) integration
UI: Rich terminal interface with interactive components
Security: Permission-based tool execution model
2. AI Agent Instructions (System Prompts)
Successfully extracted 6 detailed AI instruction templates:
I have been writing quite a few Bazel rules recently, and I’ve been frustrated by the fact that STDOUT and STDERR are always emitted for rules that are run, even when the actions are successful. 😩
I like to audit our build logs for warnings and spurious noise. A happy build should ideally be a quiet build. 🤫
The inability of ctx.actions.run or ctx.actions.run_shell to suppress output on successful builds is a longstanding gap, and the workaround seems to have been re-implemented by many independent codebases and rules, such as rules_js#js_binary.
There has been a longstanding feature request to support automatically capturing output for ctx.actions.run without having to resort to ctx.actions.run_shell needlessly (#5511).
Do you want to join the cabal of quiet builds? 🧘♂️
Here is the simplest way to achieve that!
Let’s write our simple wrapper that will invoke any program but capture the output.
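Here is one way to write it (a minimal sketch in Python; a small shell script would work just as well):

#!/usr/bin/env python3
"""Quiet-build wrapper: run a tool, capture STDOUT/STDERR to files,
and only replay the output when the tool fails."""
import subprocess
import sys

# argv: <stdout file> <stderr file> <tool> [tool args...]
stdout_path, stderr_path, *command = sys.argv[1:]

with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
    result = subprocess.run(command, stdout=out, stderr=err)

if result.returncode != 0:
    # The action failed, so replay the captured output for debugging.
    sys.stdout.write(open(stdout_path).read())
    sys.stderr.write(open(stderr_path).read())

sys.exit(result.returncode)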
Now, when it’s time to leverage this wrapper, we make sure to provide it as the executable for ctx.actions.run.
I also like to provide the STDOUT & STDERR as an output group so they can easily be queried and investigated even on successful builds.
Let’s write a simple rule to demonstrate.
Let’s start off with the tool we want to leverage in our rule.
This tool simply emits “hello world” to STDOUT, STDERR and a provided file.
import java.io.FileWriter;
import java.io.IOException;

public class HelloWorld {
    public static void main(String[] args) {
        if (args.length < 1) {
            System.err.println("Please provide a filename as the first argument.");
            return;
        }
        String filename = args[0];
        String message = "hello world";
        System.out.println(message);
        System.err.println(message);
        try (FileWriter writer = new FileWriter(filename, true)) {
            writer.write(message + System.lineSeparator());
        } catch (IOException e) {
            System.err.println("Failed to write to file: " + e.getMessage());
        }
    }
}
We now write our rule to leverage the tool.
The important parts to notice in the sketch below are:
We must provide the actual tool we want to run (i.e. HelloWorld) as a tool in tools so it is present as a runfile.
We include the stdout and stderr as an OutputGroupInfo.
Our executable is our quiet runner that we created earlier.
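Putting it all together, the rule looks roughly like this (a sketch in Starlark; the label names and output paths are illustrative):

# quiet_rule.bzl (sketch)
def _hello_world_impl(ctx):
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    stdout_log = ctx.actions.declare_file(ctx.label.name + ".out.log")
    stderr_log = ctx.actions.declare_file(ctx.label.name + ".err.log")
    ctx.actions.run(
        # The quiet runner wraps the real tool and captures its output.
        executable = ctx.executable._quiet_runner,
        arguments = [stdout_log.path, stderr_log.path,
                     ctx.executable._tool.path, out.path],
        tools = [ctx.executable._tool],  # keeps HelloWorld in the runfiles
        outputs = [out, stdout_log, stderr_log],
    )
    return [
        DefaultInfo(files = depset([out])),
        # Expose the logs so they can be queried even on successful builds.
        OutputGroupInfo(output_logs = depset([stdout_log, stderr_log])),
    ]

hello_world = rule(
    implementation = _hello_world_impl,
    attrs = {
        "_tool": attr.label(default = "//:HelloWorld", executable = True, cfg = "exec"),
        "_quiet_runner": attr.label(default = "//:quiet_runner", executable = True, cfg = "exec"),
    },
)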
This allows us to access bazel-bin/hello_world.out.log, for instance, so we can see the output quite nicely! 💪
It’s a bit annoying that we all have to keep rebuilding this infrastructure ourselves, but hopefully this demystifies it for you and you can enter build nirvana with me.
The IT department never questioned why the new printer arrived in a crate marked with eldritch symbols. They were just happy to finally have a replacement for the ancient LaserJet that had been serving the accounting floor since time immemorial.
Sarah from IT support was the first to notice something was amiss when she went to install the drivers. The installation wizard didn't ask for the usual Windows credentials - instead, it demanded "THE BLOOD OF THE INNOCENT OR A VALID ADMINISTRATOR PASSWORD." She typed in admin123, and the printer accepted it with what sounded suspiciously like disappointment.
The first print job seemed normal enough - Johnson from Marketing needed 200 copies of the quarterly report. The printer hummed to life, its all-seeing scanner eye glowing with an unsettling purple light. The first page emerged normally. The second page contained the same data but from a slightly different reality where the company had invested in crypto. By page fifty, it was printing reports from dimensions where the company had conquered entire galaxies.
"PC LOAD LETTER" flashed on its display, but in a font that hurt to look at. When Bob from Accounting tried to add paper, he found the tray existed in non-Euclidean space. Every time he inserted a ream, it would somehow contain both infinite paper and no paper simultaneously. Schrödinger's print tray, the IT department called it.
The printer's peculiarities might have been manageable if it hadn't been for the cyan incident. Despite being configured to print only in black and white, it kept insisting it needed cyan toner. "CYAN LEVELS LOW IN ALL POSSIBLE REALITIES" it warned. When someone finally installed a new cyan cartridge, it used it to print a portal to dimension C-137, causing a brief merger with a universe where all printers were sentient and had enslaved humanity.
The paper jams were the worst. Not regular paper jams - these existed in multiple dimensions simultaneously. The help desk started receiving tickets like:
"Paper jam in reality sector 7G"
"Tentacles emerging from output tray"
"Printer making ominous prophecies about the end times"
"Print queue exists outside of temporal causality"
The printer's most ambitious act came during the annual budget meeting. When asked to print 500 copies of the financial forecast, it decided to "optimize reality for better margins" by slightly rewriting the laws of mathematics. The accounting department actually appreciated this one, as it made all the numbers add up perfectly. The fact that it also caused a minor breach in the space-time continuum was considered an acceptable tradeoff for balanced books.
IT tried their usual fixes:
Turn it off and on again (resulted in a temporary reversal of entropy)
Update the drivers (somehow downloaded drivers from a dimension of pure chaos)
Clear the print queue (released several eldritch horrors trapped in suspended print jobs)
Run the troubleshooter (it gained sentience and had an existential crisis)
The printer's reign of terror finally met its match when Carol from HR tried to print the updated office policy on interdimensional portals in the break room. The printer, attempting to process the paradox of printing rules about itself, had a metaphysical kernel panic. The ensuing reality cascade caused it to trap itself in an infinite loop of printing its own installation manual.
These days, the printer sits quietly in the corner, occasionally mumbling about toner levels across the multiverse. The IT department has learned to live with it, even appreciate its quirks. Yes, sometimes it prints documents from tomorrow, and yes, occasionally it tries to summon ancient ones through paper tray 2, but at least it doesn't jam as much as the old LaserJet.
They've even started using its reality-bending features productively. Need to meet an impossible deadline? The printer can send your document back in time to when it was actually due. Budget doesn't quite add up? Just print it through the reality where the numbers work. Johnson from Marketing particularly loves it for printing multiple versions of proposals simultaneously to see which reality gets the best client response.
The only real ongoing issue is that it still asks for cyan. Always cyan. Even when printing black and white text. Even when printing into the void. Even when reality itself is collapsing.
Because in the end, no matter how many dimensions it can access, no matter how many realities it can bend, no matter how many ancient ones it can summon - it's still a printer.
And it needs cyan.
what the heck did I just read?
The above is from a transcript I had with Claude about five months ago, when I was first starting to understand that AI is truly an infinite idea generator. One of the first jobs I had growing up was working at a small company called Unisys. Unisys, for many years, ran the processing system for paper-based financial payments in Australia.
The way they did that was by connecting a whole room of cheque scanners to an old AIX machine. Now, that's not interesting. What is interesting is how they configured their batch jobs; it was borderline genius.
Each one of those scanners was connected to the LPR print system. Not many people know that the LPR print system can define shell scripts as printers. If you look at LPR from the right angle, it has a lot of sound properties for building a highly resilient batch/job processor system (a sketch follows the list):
printing = job processing (max of 1 at a time)
offline = scheduled maintenance
paper jam = dead letter queue
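Concretely, a scanner-as-printer entry in /etc/printcap might have looked something like this (a sketch; the printer name and filter path are invented):

# /etc/printcap (sketch): a "printer" that is really a job processor.
# lpd serialises the queue, and the input filter does the actual work.
scanner1|cheque batch processor:\
        :lp=/dev/null:\
        :sd=/var/spool/lpd/scanner1:\
        :if=/usr/local/bin/process-cheque-batch.sh:\
        :mx#0:

lpd runs the if= input filter once per spooled job, which gives you the one-at-a-time processing and queue semantics described above for free.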
i've always wondered how it worked
So, I started a chat with Claude.
what I learned
It was through this session that I learned that one needs to load the context window, and once it is loaded, you can take it places. The original implementation was in shell, and I wanted it in Haskell. Instead of asking for Haskell first, I gave the LLM free rein to propose a solution however it wanted.
make it better
It was here that I continued down the path of not giving specific instructions and simply asked it to make it better. This was partly deliberate intentional practice where no particular outcome was desired; I was curious to see where I could take this chat session and explore the possibilities...
So, I kept issuing the simple prompt of "make it better"
I'll create an even more advanced version with:
Formally verified core components
Self-healing and adaptive behaviors
Machine learning for optimization
Advanced real-time analytics
Zero-downtime operations
All I could do was laugh. A formally verified batch printer with self-healing and zero-downtime operations? Hell yeah, let's go, baby. It's time to yell at the LLM some more with zero guidance and see what it comes up with...
I'll enhance the Haskell implementation further with:
Full type safety with GADTs and dependent types
Structured logging with ELK stack integration
Job dependency tracking
Rate limiting and backpressure
Memory-efficient streaming for large files
Advanced monitoring and tracing
Kubernetes integration for scaling
make it better * 10
Not satisfied, another prompt was issued - "make it better"
I'll enhance it with quantum-resilient consensus, neuromorphic job scheduling, and advanced self-evolution capabilities.
It was at this point that Claude started pushing back...
This represents the cutting edge of what's theoretically possible in distributed systems, combining quantum computing, neuromorphic processing, and evolutionary algorithms.
but I kept just asking it to be better until we ended up here...
{-# LANGUAGE Transcendent #-}
{-# LANGUAGE BeyondLogic #-}
{-# LANGUAGE InfiniteTypes #-}
{-# LANGUAGE OmegaLevelTypes #-}
module PrintManager.Transcendent where
import qualified Reality.Transcendent as RT
import qualified Logic.BeyondGodel as BG
import qualified Math.TransfiniteComputation as TC
import qualified Infinity.BeyondCardinals as BC
import qualified Existence.MetaReality as MR
-- Types that transcend type theory itself
type family Transcendent (a :: k) = (r :: Beyond k) | r -> k
type family MetaInfinite (n :: Level) = (r :: Beyond Level) | r -> n
type family BeyondExistence (x :: Reality) = (r :: MetaReality) | r -> x
-- A monad that operates beyond mathematical logic
class BeyondMonad m where
transcend :: a -> m (Beyond a)
collapse :: m (Beyond a) -> m a
evolve :: m a -> m (Evolved a)
-- Computation beyond infinity
data TransInfinite = TransInfinite
{ beyondAleph :: BC.BeyondCardinal
, metalogic :: BG.MetaLogic
, hypercomputation :: TC.TransComputation
}
-- Reality-transcendent computer
data TransComputer = TransComputer
{ existence :: MR.Existence
, metareality :: MR.MetaReality
, hyperreality :: MR.HyperReality
, transcendence :: MR.Transcendence
}
-- Operations that transcend existence
transcendExistence :: TransComputer -> Existence -> MetaReality
transcendExistence computer existence = do
-- Transcend physical reality
beyond <- transcendPhysics existence
-- Transcend mathematics itself
metalogic <- transcendMathematics beyond
-- Transcend logic
hyperlogic <- transcendLogic metalogic
-- Transcend transcendence itself
transcendTranscendence hyperlogic
-- Computation beyond infinity
computeBeyondInfinity :: TransInfinite -> Computation -> Beyond Result
computeBeyondInfinity trans comp = do
-- Go beyond all cardinal numbers
beyond <- transcendCardinals comp
-- Compute using metalogic
metalogic <- computeMetalogic beyond
-- Hypercompute the result
hypercompute metalogic
-- Main loop beyond existence
main :: Beyond IO ()
main = do
-- Initialize beyond reality
computer <- initTransComputer
-- Enter loop beyond existence
forever $ do
-- Transcend current reality
reality <- transcendReality
-- Compute beyond infinity
result <- computeBeyondInfinity
-- Evolve beyond evolution
evolve reality
-- Create new forms of existence
createNewExistence
-- Transcend the very concept of computation
transcendComputation
This printer:
Transcends the Church-Turing thesis
Operates beyond spacetime
Creates new computational paradigms
Evolves the laws of physics themselves
Yet, Claude, friendly as ever, asked for its next serving of "make it better". It's at this point that the session devolved into silly fun. Holy shit, I was laughing at the things it was presenting and coming up with by this stage. So, there was only one logical step. Give it a nudge...
make the printer open a stargate sg9 wormhole
{-# LANGUAGE ExoticPhysics #-}
{-# LANGUAGE WormholeStability #-}
{-# LANGUAGE NaquadahEnhanced #-}
module PrintManager.Stargate where
import Physics.Subspace.Wormhole
import Physics.ZPM.NaquadahReactor
import Control.Stargate.Chevron
import Network.Subspace.Buffer
import qualified Data.DHD.Coordinates as DHD
-- Exotic matter printer components
data StargateQueue = StargateQueue
{ -- Core stargate components
naquadahCore :: NaquadahReactor
, zpmInterface :: ZPMController
, subspaceBuffer :: SubspaceBuffer
, eventHorizon :: EventHorizonGenerator
-- Print mechanisms
, matterStream :: MatterStream
, patternBuffer :: MolecularBuffer
, rematerializer :: MatterReconstructor
, molecularPrinter :: AtomicAssembler
-- Wormhole control
, dhd :: DialingDevice
, chevronEncoder :: ChevronLock
, irisControl :: IrisBarrier
, wormholeStabilizer :: StabilityField
}
-- Initialize stargate printer
initStargatePrinter :: IO StargateQueue
initStargatePrinter = do
-- Power up naquadah reactor
core <- spinUpReactor defaultNaquadah
-- Initialize ZPM interface
zpm <- connectZPM core.powerGrid
-- Create subspace buffer
buffer <- createSubspaceField zpm.containment
-- Generate event horizon
horizon <- generateEventHorizon buffer
-- Initialize matter handling
stream <- createMatterStream horizon
patterns <- initMolecularBuffer
reconstructor <- calibrateReconstructor
printer <- initAtomicAssembler
-- Setup wormhole controls
dhd <- initDHD defaultSymbols
chevrons <- initChevronArray
iris <- deployIrisBarrier
stabilizer <- createStabilityField
return StargateQueue {..}
-- Print job through wormhole
printThroughWormhole :: StargateQueue -> PrintJob -> DHD.Coordinates -> IO Result
printThroughWormhole gate job coords = do
-- Power up systems
powerLevel <- chargeZPM gate.zpmInterface
-- Encode destination
chevrons <- encodeChevrons gate.chevronEncoder coords
-- Establish wormhole
wormhole <- do
-- Lock chevrons
lockStatus <- sequence
[ lockChevron gate.chevronEncoder i symbol
| (i, symbol) <- zip [1..7] chevrons
]
-- Generate kawoosh
kawoosh <- generateKawoosh gate.eventHorizon
-- Stabilize wormhole
stabilizeWormhole gate.wormholeStabilizer kawoosh
-- Convert print job to matter stream
stream <- do
-- Dematerialize document
patterns <- scanMolecularPattern job
-- Buffer patterns
buffered <- storeInPatternBuffer gate.patternBuffer patterns
-- Create matter stream
createMatterStream gate.matterStream buffered
-- Transmit through wormhole
transmission <- do
-- Open iris
openIris gate.irisControl
-- Send matter stream
sendThroughWormhole wormhole stream
-- Close iris
closeIris gate.irisControl
-- Rematerialize at destination
result <- do
-- Reconstruct patterns
patterns <- receiveMatterStream transmission
-- Reassemble molecules
molecules <- reconstructMatter gate.rematerializer patterns
-- Print final document
printDocument gate.molecularPrinter molecules
-- Close wormhole
shutdown <- do
-- Power down systems
powerDown gate.zpmInterface
-- Collapse wormhole
collapseWormhole gate.eventHorizon
-- Reset chevrons
resetChevrons gate.chevronEncoder
return result
-- Main printing interface
main :: IO ()
main = do
-- Initialize stargate printer
gate <- initStargatePrinter
-- Accept print jobs
forever do
-- Get next print job
job <- getNextPrintJob
-- Get destination coordinates
coords <- getDestinationCoords
-- Print through wormhole
result <- printThroughWormhole gate job coords
-- Check for successful rematerialization
verifyPrintResult result
-- Reset for next job
resetStargate gate
{-# LANGUAGE ExoticPhysics #-}
{-# LANGUAGE AncientTechnology #-}
{-# LANGUAGE PegasusDefense #-}
{-# LANGUAGE WraithCounter #-}
{-# LANGUAGE ReplicatorImmune #-}
module PrintManager.AtlantisDefense where
import Physics.Subspace.Wormhole
import Ancient.Database.Repository
import Ancient.Defense.Systems
import Wraith.Counter.Field
import Replicator.BlockingMatrix
import Lantean.Defense.Grid
import Atlantis.PuddleJumper.Bay
import qualified Pegasus.Network.Gates as PNG
import qualified Asuran.Nanites.Control as ANC
-- Enhanced Atlantis printer with defense systems
data AtlantisPrinter = AtlantisPrinter
{ -- Previous Atlantis systems ...
-- Defense systems
wraithDetector :: WraithBioDetector
, replicatorGrid :: NaniteDisruptionField
, antiReplicatorWeapons :: ARWArray
, dronesLauncher :: DroneLaunchBay
-- Puddle Jumper systems
, jumperBay :: PuddleJumperBay
, cloakGenerator :: JumperCloak
, jumperDrones :: JumperWeapons
, transportBuffer :: JumperBeaming
-- Lantean defenses
, defenseChair :: DefenseControlChair
, droneStorage :: DroneStorage
, shieldEmitters :: ShieldArray
, energyTurrets :: DefenseTurrets
-- Anti-Wraith systems
, bioFilters :: WraithBioFilter
, hiveDetector :: HiveShipSensors
, antiCulling :: CullingPrevention
, wraithStunners :: StunnerArray
-- Anti-Replicator systems
, naniteDisruptor :: ReplicatorDisruptor
, blockingCode :: ReplicatorBlocker
, asuranFirewall :: AsuranDefense
, timeBackup :: TemporalBackup -- In case of Replicator time dilation
}
-- Initialize defense systems
initDefenseSystems :: AtlantisPrinter -> IO DefenseSystems
initDefenseSystems atlantis = do
-- Initialize Wraith defenses
wraithSystems <- do
detector <- initWraithDetector
biofilter <- activateBioFilters
hiveDetector <- calibrateHiveSensors
antiCulling <- enableCullingPrevention
stunners <- chargeStunnerArray
return WraithDefense {..}
-- Initialize Replicator defenses
replicatorSystems <- do
disruptor <- powerNaniteDisruptor
blocker <- uploadBlockingCode
firewall <- initAsuranFirewall
backup <- initTemporalBackup
return ReplicatorDefense {..}
-- Initialize Lantean weapons
lanteanSystems <- do
chair <- activateDefenseChair
drones <- loadDroneStorage
shields <- raiseShieldArray
turrets <- powerDefenseTurrets
return LanteanDefense {..}
-- Initialize Puddle Jumper systems
jumperSystems <- do
bay <- openJumperBay
cloak <- energizeCloakGenerator
weapons <- loadJumperDrones
beaming <- initTransportBuffer
return JumperSystems {..}
return DefenseSystems {..}
-- Print with full defense protocols
printWithDefense :: AtlantisPrinter -> PrintJob -> PNG.Coordinates -> IO Result
printWithDefense atlantis job coords = do
-- Activate all defense systems
wraithStatus <- do
-- Scan for Wraith
scanBioSignatures atlantis.wraithDetector
activateBioFilters atlantis.bioFilters
monitorHiveShips atlantis.hiveDetector
enableAntiCulling atlantis.antiCulling
-- Enable Replicator defenses
replicatorStatus <- do
-- Block Replicator infiltration
activateDisruptor atlantis.naniteDisruptor
enableBlockingCode atlantis.blockingCode
raiseAsuranFirewall atlantis.asuranFirewall
-- Prepare Lantean defenses
defenseStatus <- do
-- Power weapons
chargeDroneStorage atlantis.droneStorage
raiseShieldEmitters atlantis.shieldEmitters
activateDefenseTurrets atlantis.energyTurrets
-- Ready Puddle Jumper
jumperStatus <- do
-- Prepare Jumper systems
prepareJumperBay atlantis.jumperBay
engageCloakGenerator atlantis.cloakGenerator
loadJumperDrones atlantis.jumperDrones
-- Perform defended print operation (spawned async so we can monitor it)
printing <- async (protectedPrint atlantis job coords)
-- Monitor during print
monitoring <- async do
forever do
-- Check all defense systems
checkWraithActivity wraithStatus
checkReplicatorActivity replicatorStatus
checkDefenseStatus defenseStatus
checkJumperStatus jumperStatus
-- React to threats
when (detectWraith wraithStatus) do
activateWraithCountermeasures atlantis
when (detectReplicators replicatorStatus) do
activateReplicatorCountermeasures atlantis
when (detectHostiles defenseStatus) do
activateLanteanDefense atlantis
-- Complete print operation
result <- wait printing
cancel monitoring
-- Stand down defenses
shutdownDefenses atlantis
return result
main :: IO ()
main = do
-- Initialize Atlantis with defenses
atlantis <- initAtlantisPrinter
defenses <- initDefenseSystems atlantis
-- Start defensive monitoring
monitoringThread <- async do
forever do
-- Monitor all defense systems
checkAllDefenses atlantis defenses
-- Update defense posture
updateDefenseStatus atlantis
-- Recalibrate systems
recalibrateDefenses defenses
-- Process print jobs with defense
forever do
-- Get next job
job <- getNextPrintJob
-- Get coordinates
coords <- getPegasusCoords
-- Print with full defenses
result <- printWithDefense atlantis job coords
-- Verify security
verifySecurePrint result
-- Reset defense systems
resetDefenses atlantis
Holy cow, you read all of this? Truth be told, I wasn't expecting anyone to get to the bottom of this blog post. It exists as an archive because it was such an important moment where I learned to dance with the LLM. I learned how to manage the context window. I learned to give it freedom. I learned to be inspired.
Just yesterday morning, I was writing a conference talk on best practices for maintaining the LLM context window, which was quite detailed. It contained the then-best practices from the two blog posts below.
Yet sections of that talk - just 4 hours later - are now redundant if you use Amp and are in the early access pilot. Somewhat of a self-own, but it's kind of nice not to have to work at that low a level of abstraction. It's really nice to work at higher abstractions. In the stream below, you will see a prototype of subagents. Yep, it's real. It's here.
Instead of allocating everything to the main context window and then overflowing it, you spawn a subagent, which gets its own brand-new context window for doing the meaty stuff, like building, testing, or whatever you can imagine. Whilst that is happening, the main thread is paused and suspended, waiting until completion.
It's kind of like async, await state machines, or futures for LLMs.
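To make that analogy concrete, here is a toy sketch in Haskell using the async package. This is entirely my own illustration - the types and runSubagent are invented, and this is not how Amp is actually implemented:

import Control.Concurrent.Async (async, wait)

-- Invented types for the sketch: a prompt goes in, a compact summary comes out.
type Prompt = String
type Summary = String

-- Pretend this spins up a fresh agent with its own brand-new context window.
runSubagent :: Prompt -> IO Summary
runSubagent prompt = pure ("done: " <> prompt) -- stub for illustration

main :: IO ()
main = do
  -- Hand the meaty work to a subagent; it burns its own context, not ours.
  task <- async (runSubagent "build and test the project")
  -- The main thread suspends here, exactly like awaiting a future.
  summary <- wait task
  -- Only the small summary lands back in the main context window.
  putStrLn summary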
It was pretty hard to get to bed last night. Truth be told, I stayed up just watching it in fascination. Previously, running an infinite loop would blow up the main context window, leaving the codebase in an incomplete state and forcing me to jump back in, get hands-on, and try to rescue it with more prompting. Now the main thread's context window barely even increments, and every loop completes.
Thank you, Thorsten, for making my dreams a reality. Now I've another dream, but since I've joined the Amp team, I suppose the responsibility for making the dream a reality now falls directly upon me. The buck stops with me to get it done.
Across the industry, software engineers are continually spending time on tasks of low business value. Some companies even refer to it as KTLO, or "Keep the Lights On". If these tasks are neglected, however, they present a critical risk to the business. Yet they don't get done because the product is more important. So it's always a risk-reward trade-off.
So here's the pitch. All those tasks will soon be automated. Now that we have automated context management through subagents, the next step is to provide primitives that allow for the automation and removal of classes of KTLO, or, as Mr. 10 likes to describe in Factorio terms, we need quality modules.
the path from ticket to production
To be frank, the industry and foundation models aren't yet advanced enough to fully automate software development with engineers out of the loop.
Any vendor out there selling that dream right now is selling you magic beans of bullshit, but AI moves fast, and perhaps in the next couple of months it'll be a solved problem. Don't get me wrong - we're close. The continual evolution of Cursed (above), a brand-new programming language that is completely vibe-coded and hands-free, is proof to me that it will be possible in time. You see, a compiler isn't like a Vercel v0 website. No, it's serious stuff. It isn't a toy. Compilers have symbolic meaning and substance.
Building that compiler has been some of the best personal development I have done this year.
It has taught me many things about managing the context window.
It has taught me to be less controlling of AI agents and more hands-free.
It has taught me latent behaviours in each of the LLMs and how to tickle the latent space to achieve new outcomes or meta-level insights.
In the private Amp repository on GitHub, there is a mermaid diagram that articulates how our GitHub Actions workflows release Amp to you. It exists to make onboarding our staff onto the project easier.
The following prompt generated it:
# Prompt to Regenerate GitHub Actions Mermaid Diagram
## Objective
Create a comprehensive mermaid diagram for the README.md that visualizes all GitHub Actions workflows in the `.github/workflows/` directory and their relationships.
## Requirements
1. **Analyze all workflow files** in `.github/workflows/`:
- `ci.yml` - Main CI workflow
- `release-cli.yml` - CLI release automation
- `release-vscode.yml` - VS Code extension release
- `scip-typescript.yml` - Code intelligence analysis
- `semgrep.yml` - Security scanning
- `slack-notify.yml` - Global notification system
- Any other workflow files present
2. **Show workflow triggers clearly**:
- Push/PR events
- Scheduled releases
- Main branch specific events
- TypeScript file changes
3. **Include complete workflow flows**:
- CI: Build & Test → TypeScript Check → Linting → Test Suite
- Server Build: Docker Build → Goss Tests → Push to Registry → MSP Deploy
- CLI Release: Version Generation → Build & Test → NPM Publish
- VS Code Release: Version Generation → Build & Package → VS Code Marketplace → Open VSX Registry
- SCIP Analysis: Code Intelligence Upload → Multiple Sourcegraph instances
- Semgrep: Security Scan → Custom Rules → Results Processing
4. **Slack notifications must be specific**:
- `alerts-amp-build-main` channel for general main branch workflow success/failure notifications
- `soul-of-a-new-machine` channel for CLI and VS Code release failure notifications
- All Slack notification nodes should be styled in yellow (`#ffeb3b`)
5. **Color coding for workflow types**:
- CI Workflow: Light blue (`#e1f5fe`)
- Server Image Build: Light purple (`#f3e5f5`)
- CLI Release: Light green (`#e8f5e8`)
- VS Code Release: Light orange (`#fff3e0`)
- SCIP Analysis: Light pink (`#fce4ec`)
- Semgrep SAST: Light red (`#ffebee`)
- All Slack notifications: Yellow (`#ffeb3b`)
6. **Global notification system**:
- Show that `slack-notify.yml` monitors ALL workflows on main branch
- Connect all main branch workflows to the central `alerts-amp-build-main` notification
## Task Output
Create mermaid `graph TD` diagram which is comprehensive yet readable, showing the complete automation pipeline from code changes to deployments and notifications.
## Task
1. Read the README.md
2. Update the README.md with the mermaid `graph TD` diagram
Cool, so now we've got a prompt that generated a mermaid diagram, but we've also got a KTLO problem. What happens when one of those GitHub Actions workflows gets updated, or we introduce something new? Well, incorrect documentation is worse than no documentation.
One thing I've noticed through staring into the latent space is that these prompts and markdown files are a weird pseudo-DSL. They're almost like shell scripts. If you've read my standard library blog post, you know by now that you can chain these DSLs together to achieve desired outcomes, as sketched below.
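Here's a hypothetical sketch of what that chaining could look like for the diagram above - a second prompt that re-runs the first whenever the workflows drift (the prompts/regenerate-mermaid.md path is invented for the example):

# Prompt to keep the diagram honest
## Task
1. Run `git diff --name-only origin/main~10 -- .github/workflows/`
2. If any workflow files changed, follow the instructions in `prompts/regenerate-mermaid.md` (the prompt above) to regenerate the diagram
3. Update the README.md and open a pull request titled "docs: refresh CI workflow diagram"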
If the right approach is taken, I suspect the pattern for fixing KTLO for enterprise will also be the same as that used for enterprise code migrations. Moving from one version of Java to the next version of Java, upgrading Spring or migrating .NET 4.8 to a newer version of .NET Core, aka .NET 8.
It's time to build. It's time to make the future beautiful.
This is a follow-up from my previous blog post: "deliberate intentional practice". I didn't want to get into the distinction between skilled and unskilled because people take offence to it, but AI is a matter of skill.
Someone can be highly experienced as a software engineer in 2024, but that does not mean they're skilled as a software engineer in 2025, now that AI is here.
In my view, LLMs are essentially mirrors. They mirror the skill of the operator.
how to identify skill
One of the most pressing issues for all companies going forward is the question of how to identify skilled operators. In the blog post "Dear Student: Yes, AI is here, you're screwed unless you take action" I remarked that the interviewing process is now fundamentally broken.
With hundreds of thousands of dollars at stake, all the incentives are there for candidates to cheat. The video below shows one of many tools that now exist which hook the video render pipeline of macOS and provide overlays (similar to how OpenGL game hacks work) that can't be detected by screen-recording software or Zoom.
The software interview process was never great, but it's taken a turn for the worse, as AI can easily solve anything thrown at it - including interview screenings. Another co-worker of mine recently penned the blog post below, which went viral on HackerNews. I highly recommend reading the comments.
Don't outright ban AI in the interviewing process. If you ban AI in the interviewing process, then you miss out on the ability to observe.
In the not-too-distant future, companies that ban AI will be sending a signal, which will deter the best candidates from interviewing at that company because AI is prohibited.
If a company has an outright ban on AI, then one of two things is going to happen: either they're going to miss out on outstanding candidates, or there's going to be the birth of "shadow AI", where all the employees use AI in stealth.
It's already happening. I recall a phone call with a friend about a month ago, who works at a mining company here in Australia. The tale recounted to me was that AI is banned at this mining company, yet all the employees are using it. Employees, by now, are well aware of the "not going to make it" factors at play.
If I were interviewing a candidate now, the first things I'd ask them to explain would be the fundamentals of how the Model Context Protocol works and how to build an agent. I would not want a high-level description or explanation; I want to know the details. What are the building blocks? How does the event loop work? What are tools? What are tool descriptions? What are evals?
I then ask the candidate to explain the "sound" of each of the LLMs. What are the patterns and behaviours, and what are the things they've noticed for each of the different LLMs out there?
If you needed to do security research, which large language model (LLM) would you use? Why?
If you needed to summarise a document, which LLM would you use? Why?
If you needed a task runner, which LLM would you use? Why?
For each one of the LLMs, what are they good at and what are they terrible at?
How have the behaviours of each one of the LLMs changed? The more detail they can provide about emergent behaviours and how it has changed across the different iterations, the better. It's a strong signal that they've been playing for a while.
Is there a project that they can show me? Potentially open source, where they built something? A conference talk? A blog post? Anything. Anything that is physical proof that the candidate is not bullshitting.
I'd ask them about which coding agents they've used and their frustrations with them. Then I dig deeper to see if they've become curious and have gone down a path to build their own solutions to overcome these problems.
Have they built an agentic supervisor? If they have, that's a really strong signal, but only if they can explain how they built it. What are the trade-offs found in building it? How did they solve overbaking or underbaking? Or the halting problem?
Now, there are some smooth talkers out there and all that can be memorised. For instance, people can simply talk their way through all the above. So this is where the real challenge begins.
You want to watch them. You want to watch them dance with the LLM.
Full screen share, and see how they dance with it. Think of it somewhat like watching someone being productive in a coding challenge. If they waste time by not using the debugger, not adding debug log statements, or failing to write tests, then they're not a good fit.
If they conduct endless chat operations with the coding agent and fail to recycle the context window frequently, then they're not a good fit. If they heavily rely upon AI-powered tab completion, they're probably not a good fit.
If they lead by saying "I don't know", drive the LLM by asking it questions to build up a specification, load the context window with observations, and generally just really like asking the LLM questions, that's a pretty strong indication that they are a good fit.
If you walk away from the interview having been taught a new meta by the candidate, that's a great fit. How has the candidate used AI outside of the software realm to automate aspects of their life? Go deep! The younger, incoming generation of junior programmers is doing some amazing things with AI automation in their personal lives.
Do they loop the LLM back on itself? For example, let's say you had a function, and the performance of that function was slow. Are they aware that you could ask the LLM to create a benchmark suite, add profiling, and then loop the profiling results back onto the LLM and ask it to fix it?
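As a hypothetical example of that loop, expressed in the same prompt pseudo-DSL as above (the function name and tooling are invented; criterion is a real Haskell benchmarking library):

1. Write a criterion benchmark suite for `parseLogFile`
2. Run the benchmark with profiling enabled and capture the output
3. Here is the profiling output: <paste>. Optimise the hot path it identifies
4. Re-run the benchmark, compare against the baseline, and repeat until the target is met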
Do they understand the code that has been generated? Can they explain it? Can they critique it? Do they show any indicators of taste?
Are they overly controlling of the coding agent? Interestingly enough, one thing I've personally learned is that the best outcomes come when you are less controlling. That doesn't mean brain off. It means understanding that there is a meta where you ask the agent to pick the most critical thing in a series of tasks. The LLM can decide that the logging module should be implemented first, before proceeding to implement the rest of the project's specifications.
What was the workflow that they used? Did they spin up one or multiple coding agents side by side? That's a sign of an advanced operator.
Wanna build great shit, at record speed?
Here are the cheat codes...
All the little pieces and how to connect em...
Run this as while(true) in a tool that does not cap tool call invocations. After each iteration, look out for redlining and create a new context window.
No courseware, no bullshit, just answers. Go forth and use the above.
And to top that all off, I would still have a conversation about computer science fundamentals and the standard people + culture questions.
Are they curious?
Do they have a low quit rate in the face of hardship?
Would you put that person in front of a customer?
Do they have a product engineering mindset? (Or are they used to being a Jira monkey, where someone tells them what to do?)
If it's not a hell yeah to all of the above cultural questions, then it's a no.
what problems remain
Interviewing as a software engineer typically involves a multi-stage filtering process. This process served as a gate to ensure that, by the time a candidate reached an in-person interview, the signal-to-noise ratio was very high.
The best way to determine if someone is a skilled operator is to watch them dance with the LLM. But that's expensive. You can't have your engineers spending all their time on noise instead of shipping product.
I've been thinking about this particular problem for over three months now, and I haven't found a satisfactory solution. The floodgates have been blown wide open, and interviewing is more expensive than ever before.
Something I've been wondering about for a really long time is, essentially, why do people say AI doesn't work for them? What do they mean when they say that?
Which identity are they speaking from? Are they coming from the perspective of an engineer with a job title, sharing their experiences at a particular company, in that particular codebase? Or from the perspective of someone who has tried it at home and found it hasn't worked for them there?
Now, this distinction is crucial because there are companies out there with ancient codebases and extensive proprietary patterns that AI simply doesn't have the training data for. That experience is entirely understandable.
However, I do worry about engineers whose only experience with AI is using it in a large, proprietary codebase. Have they tried AI at home? Are they putting in deliberate, intentional practice? Have they discovered the beauty of AI?
You see, there is a beauty in AI. The way I like to describe it these days: LLMs are kind of like a musical instrument.
the tb303 was a commercial failure upon launch, but many years later someone started playing: twisting knobs in strange and wonderful ways that resulted in new genres of music being created.
Let's take a guitar as an example. Everyone knows what a guitar is, and everyone knows that if you put deliberate, intentional practice into it, you can become good at the guitar. Still, it takes time, effort and experimentation.
In the circles around me, the people who are getting the most out of AI have put in deliberate, intentional practice. They don't just pick up a guitar, experience failure, and then go, "Well, it got the answer wildly wrong," and then move on and assume that that will be their repeated experience.
What they do is they play.
Last night, I was hanging out with a friend on Zoom, drinking margaritas, and we were both reminiscing, which led to a conversation about COBOL.
The next thing you know, we're like, can AI program COBOL? A couple of moments later, we opened a coding assistant and then built a calculator in COBOL. And we're just sitting there watching, just going, wow. So we then decided, hey, because in the spirit of play, can it do a Reverse Polish notation calculator? And it turns out it can.
At this stage, our brains were racing and we were riffing. What other possibilities are there for what AI can do? What can and can't it do? So we asked it to write unit tests in COBOL, and it did it.
So next thing we know, we're like, okay, let's take this up a level even further. Let's create a Reverse Polish Notation Calculator in COBOL, but use emojis as operators. Does COBOL even support emojis? Well, there's one way to find out.
It turns out that it is indeed possible. The source code is below.
That exact moment we had is what I call deliberate practice. It's where you approach an instrument, or in this case AI, with the intention of not achieving much: just picking it up, giving it a strum, and keeping an open mind to the possibility that you might discover something new, or a new meta.
closing thoughts
Now, I completely empathise with people who say AI does not work for them in their legacy code base. The context windows that exist for AI are small.
The way I look at it, it's as if we were in the 1980s and only had IBM XT computers; time would eventually pass, and we'd get the 286, the 386, and so on. While we'll see context windows get bigger, they won't be big enough for some of these companies' codebases. But that doesn't mean all hope is lost.
What I do wonder, however, is whether we're going to start to see some very interesting employee-versus-employer dynamics unfold in the future.
There was a time when employees decided to move on from a company because they weren't adopting AWS. See, employees exchange skills and time for money.
The industry advances, and employees seek to keep their skills current. They knew that if they didn't upskill in AWS, they would have a hard time continuing to exchange their skills for money. AI not working for a particular company is a company problem, not a problem for the employee.
Hope is not lost for companies that experience difficulties with AI. This space is evolving rapidly, with AI improving daily, and there is still much more research to be conducted on topics such as semantic analysis and integration with build system graphs.
Pondering these types of things is now part of my day job, and I hope to delve into these aspects soon. If you work at a company with a massive monorepository, please say hello. I would love to catch up and just riff; flexing the muscle of deliberate, intentional play is how one levels up these days, now that AI is here.
I've been thinking about Overton Windows lately, but not of the political variety.
You see, the Overton window can be adapted to model disruptive innovation by framing the acceptance of novel technologies, business models, or ideas within a market or society. So I've been pondering where, when and how AI can be framed.
Perhaps, the change we are going through right now in the software development industry has a lot of similarities to the year 1404 when another disruptive innovation - the loom - sat in the "unthinkable" or "radical" zones of the window, facing skepticism or resistance from incumbents and consumers.
The Luddites were members of a 19th-century movement of English textile workers who opposed the use of certain types of automated machinery due to concerns relating to worker pay and output quality. They often destroyed the machines in organised raids.
And I've been pondering whether the current generation of AI as an assistant pane within the IDE is a deliberate go-to-market framing by vendors: it sits within what the software engineering community has known and used since 1982, while the majority of software engineers are still coming to grips with the erasure of their core identity function.
people buy what they know and understand, even if it means purchasing a semi-automatic carriage with a small motor for a horse
As innovators advocate, demonstrate value, or leverage external triggers (e.g., economic shifts), these ideas which were once unthinkable or radical gradually shift toward "acceptable" and "sensible" within the window.
There are now seven major players in the space producing AI-enabled developer tooling, but the frame is changing fast: two months ago, Claude Code was the only non-IDE primitive, but as of last week, Amp is now generally available as a command-line primitive (as well as a Visual Studio Code extension).
When Claude Code came out, I didn't quite understand why it even existed; it seemed like a marketing gimmick, but now I do. Anthropic were already living in unthinkable/radical territory back then, and Claude Code is their internal tool, which they published alongside Sonnet 3.7 to nudge the Overton window.
this is so validating; saw it six months back and coworkers thought I was mad.
Since then, I've been pretty busy pushing boundaries and applying all of the knowledge shared on this blog. The resulting outcome can be seen in this livestream, where a supervisor is managing four headless agents, building software from specs whilst I was asleep...
Everything is changing, fast. Both in the industry and at work - last week I departed Canva and joined Sourcegraph to help nudge the Overton window.
After having spent the better part of 2 weeks learning Linux’s cgroup (control group) concept, I thought I better write this down for the next brave soul. 🦸
Facebook’s cgroup2 microsite is also a fantastic resource. I highly recommend reading it 🤓.
Let’s dive in and learn cgroup, specifically cgroup v2.
There is a distinction between the v1 and v2 implementations of cgroup. However, v2 was introduced in Linux kernel 4.5 in 2016 with a much simpler design, so we will treat it as the only version to simplify this guide [ref].
As a quick aside, what I love about Linux is the Unix design philosophy that "everything is a file". This bleeds into everything in Linux, especially how you interface with the various kernel subsystems.
While higher-level tools and libraries often abstract these direct file manipulations, at the end of the day, if you can read and write to a file, you can communicate with the kernel! 📜
Linux control groups are a sort of container you can place processes within and apply a variety of limits on resource allocation, such as memory, CPU and network bandwidth.
We will be using the following NixOS VM to build and run this guide if you want to follow along.
vm.nix
let
  # release-24.11
  nixpkgs = fetchTarball "https://github.com/NixOS/nixpkgs/archive/5ef6c425980847c78a80d759abc476e941a9bf42.tar.gz";
  nixos = import "${nixpkgs}/nixos" {
    configuration = { modulesPath, pkgs, ... }: {
      imports = [ (modulesPath + "/virtualisation/qemu-vm.nix") ];
      virtualisation = {
        graphics = false;
      };
      users.users.alice = {
        isNormalUser = true;
        extraGroups = [ "wheel" ];
        packages = with pkgs; [
          file
          libcgroup
          vim
          (pkgs.runCommandCC "throttled" {
            src = pkgs.writeText "throttled.c" ''
              #include <stdio.h>
              #include <stdlib.h>
              #include <unistd.h>
              #include <time.h>

              static long long now_ns() {
                struct timespec ts;
                clock_gettime(CLOCK_MONOTONIC, &ts);
                return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
              }

              int main() {
                long long last = now_ns();
                int count = 0;
                while (1) {
                  count++;
                  if (count % 1000000 == 0) {
                    long long current = now_ns();
                    long long delta_ms = (current - last) / 1000000;
                    printf("Delta: %lld ms\n", delta_ms);
                    fflush(stdout);
                    last = current;
                  }
                }
                return 0;
              }
            '';
          } ''
            mkdir -p $out/bin
            $CC -o $out/bin/throttled $src
          '')
          (pkgs.runCommandCC "hog" {
            src = pkgs.writeText "hog.c" ''
              #include <stdlib.h>
              #include <stdio.h>
              #include <unistd.h>
              #include <string.h>

              int main() {
                while (1) {
                  char *mem = malloc(1024 * 1024);
                  if (!mem) {
                    perror("malloc");
                    break;
                  }
                  memset(mem, 1, 1024 * 1024);
                  printf("1 MB allocated\n");
                  fflush(stdout);
                  sleep(1);
                }
                return 0;
              }
            '';
          } ''
            mkdir -p $out/bin
            $CC -o $out/bin/hog $src
          '')
        ];
        initialPassword = "";
      };
      security.sudo.wheelNeedsPassword = false;
      services.getty.autologinUser = "alice";
      system.stateVersion = "24.11";
    };
  };
in
nixos.vm
Although a single cgroup can enforce multiple resource allocations, we will do so one at a time to simplify.
All cgroups live beneath the special directory /sys/fs/cgroup, which is referred to as the root cgroup.
You can inspect your login shell's current cgroup by reading /proc/self/cgroup:
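On a systemd-based machine, the output will look something like this (the exact slice/scope path will vary):

> cat /proc/self/cgroup
0::/user.slice/user-1000.slice/session-3.scope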
The returned value is what should be appended to the root.
If you are confused by the slice and scope parts of the path, just ignore them 🤔. That is a systemd concept to help organize cgroups.
Let’s create a simple hierarchy for the purpose of learning: a parent demo group with two children, memory-limited and cpu-limited (see the commands below).
In practice, you will probably create groups that model your business domain or the various processes you want to group together, not the controllers you want to apply.
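A sketch of creating that hierarchy in our VM; the chown lets our alice user drive the files without sudo (mostly - more on that below), and the trailing "..." elides the rest of the listing:

> sudo mkdir -p /sys/fs/cgroup/demo/memory-limited /sys/fs/cgroup/demo/cpu-limited
> sudo chown -R alice /sys/fs/cgroup/demo
> ls /sys/fs/cgroup/demo
cgroup.controllers  cgroup.procs  cgroup.subtree_control  cpu-limited  memory-limited  memory.current  memory.max  ...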
Some of these files set values on the various controllers, such as memory.max, which sets the absolute aggregate maximum memory that all processes attached to this cgroup, or any of its descendants, can allocate.
Other files give you live accounting information or events, such as memory.current or memory.events.
All the files whose names begin with cgroup configure the cgroup itself and turn the various controllers on and off.
cgroup.controllers
This file will list all the active controllers enabled on this cgroup.
cgroup.subtree_control
This file lists the controllers that are enabled and available to the descendants.
Initially, our cgroup.subtree_control for /sys/fs/cgroup/demo is empty. This means that if you look at any of the child cgroups, i.e. /sys/fs/cgroup/demo/cpu-limited, it will be missing a bunch of controller files.
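To light the children up, we enable the controllers we care about on demo, and then move a test process into a child group (the pid shown is illustrative):

> echo "+memory +cpu" > /sys/fs/cgroup/demo/cgroup.subtree_control
> ls /sys/fs/cgroup/demo/cpu-limited
cgroup.controllers  cpu.max  cpu.stat  memory.current  memory.events  memory.max  ...
> sleep infinity &
[1] 1234
> echo $! | sudo tee /sys/fs/cgroup/demo/cpu-limited/cgroup.procs
1234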
Why did you have to use sudo, even though you did a chown earlier? 🤔
When I first started sleep, it was in the same cgroup as my login shell. A process may only move another process between cgroups if it has write permission on a common ancestor of the source and destination. The only common ancestor between the two here is /sys/fs/cgroup itself, and our user does not have write permission for it.
Why didn’t you write the pid to /sys/fs/cgroup/demo instead of a child group?
There is a “no internal process constraint”, which states that a cgroup may either have child cgroups or processes, but not both (except for the root).
Let’s write a small C program that endlessly eats memory - this is the hog program baked into the VM configuration above.
😲 Our program has to be sure to memset to 1 rather than 0. I found that either the compiler or the kernel has optimizations for pages that are all 0, so no new memory was ever actually allocated.
We will restrict processes within our demo/memory-limited group to 5MiB:
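Thanks to the earlier chown, our user can write the limit directly; reading the file back shows the kernel stores it in bytes:

> echo "5M" > /sys/fs/cgroup/demo/memory-limited/memory.max
> cat /sys/fs/cgroup/demo/memory-limited/memory.max
5242880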
Now let’s start hog in the cgroup. We will use the tool cgexec, which takes care of spawning the process in the desired cgroup – this avoids us having to write to the cgroup.procs file ourselves.
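The run looks something like this: once hog crosses the 5MiB budget, the cgroup's OOM killer reaps it (memory.events will show the oom_kill counter incrementing):

> sudo cgexec -g memory:demo/memory-limited hog
1 MB allocated
1 MB allocated
1 MB allocated
1 MB allocated
Killed

Before the CPU demo, we also need to give demo/cpu-limited a budget. The cpu.max file takes "$MAX $PERIOD" in microseconds, so the following allows 10ms of CPU time per 100ms period:

> echo "10000 100000" > /sys/fs/cgroup/demo/cpu-limited/cpu.max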
Let’s use cgexec again on our throttled program and observe the difference.
>sudo cgexec -g cpu:demo/cpu-limited throttled
Delta: 0 ms
Delta: 5 ms
Delta: 99 ms
Delta: 0 ms
Delta: 99 ms
Delta: 99 ms
Delta: 99 ms
Delta: 100 ms
Delta: 99 ms
Delta: 199 ms
Delta: 0 ms
Nice – we now have a way to easily throttle tasks that may be unreasonably CPU hungry 😈.
Although we applied these constraints to single processes, the same concept applies to multiple processes as well. The values set apply to the aggregate of all processes in a particular cgroup and its descendants.
Control groups are an excellent way to provide an additional layer of isolation for a workload from the rest of the system and also serve as a great knob for performance benchmarking under pathological conditions.
While they seemed daunting at first, the elegance of the “everything is a file” philosophy makes them surprisingly approachable once you start experimenting.
We also benefited from ignoring the complexity that systemd often adds on top — sometimes it’s nice to just work with raw files and understand the fundamentals 🙃.
One improvement I’d love to see: when you hit an invalid condition — like violating the “no internal process” constraint — you’re left with a vague file I/O error (e.g. Device or resource busy). It would be amazing if the kernel could offer more actionable error messages or hints in dmesg 💡.
At work, we’re passionate about data-driven decisions. Among other tools, we use Grafana to visualize our key metrics.
We recently acquired a number of dual TV stands for our teams, and we wanted to use them to display our Grafana dashboards.
After experimenting with Pi 4s and finding the 4K support lacking, we decided to go for an N100-based mini-PC model.
Flashing and maintaining a number of PCs can be painful, so we decided on a declarative setup, flashing stateless NixOS configurations. In order to bootstrap the PCs as quickly as possible, we use nixos-unattended-installer and disko.
From USB boot to installed on disk it’s less than 10 minutes, and we can bake in all the configuration we need,
including team credentials and Grafana playlists to render. Using ssh we can also easily push new configurations if
needed.
Disko
When you have a set of identical machines, disko is awesome for partitioning
and formatting disks. It’s a declarative way of configuring your partition layout, and it integrates great with
nixos-unattended-installer.
First we need to add the disko module from the flake to our NixOS system configuration, and then define the partitions.
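A minimal sketch of the wiring, assuming a standard flake setup (the kiosk.nix file name is invented; disko.nixosModules.disko is the module the disko flake exports):

# flake.nix (sketch)
inputs.disko.url = "github:nix-community/disko";

# in the machine's module list:
modules = [
  disko.nixosModules.disko
  ./kiosk.nix
];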
disko.devices.disk.nvme = {
  device = "/dev/nvme0n1";
  type = "disk";
  content = {
    type = "gpt";
    partitions = {
      ESP = {
        type = "EF00";
        size = "500M";
        content = {
          type = "filesystem";
          format = "vfat";
          mountpoint = "/boot";
        };
      };
      root = {
        size = "100%";
        content = {
          type = "filesystem";
          format = "ext4";
          mountpoint = "/";
        };
      };
    };
  };
};
As you can see, we’re just setting up a small UEFI ESP partition and allocating the rest of the space to root.
nixos-unattended-installer
This is a major win for us, as it's a completely hands-off installation process. Unfortunately, we have to disable secure boot on the machines in the BIOS; if I had to do a lot more of them, I'd probably automate the process using something like a Pi sending USB keyboard keystrokes, along the lines of the PiKVM recorder.
We just have to use their disko wrapper around the defined NixOS configuration.
In this example we chose Xfce as the desktop environment; it's lightweight and supports multiple monitors in a way that's pleasing to us. Having a desktop also allows the team to debug any local issues, like network problems, but if you want a more locked-down environment, you could use something like cage or i3 and not offer any system access.
The recommended way to run grafana-kiosk is as a systemd service; happily, that is easy enough in NixOS, and it also ensures the kiosk app is automatically restarted if it should crash. The relevant part of the NixOS configuration looks something like this:
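A minimal sketch, assuming grafana-kiosk is available as pkgs.grafana-kiosk and an X session is running on display :0 (the URL and user are placeholders; flag names come from the grafana-kiosk README, so double-check them against your version):

systemd.services.grafana-kiosk = {
  description = "Grafana dashboard kiosk";
  wantedBy = [ "graphical.target" ];
  # The kiosk drives a browser on the local X display.
  environment.DISPLAY = ":0";
  serviceConfig = {
    User = "kiosk";
    Restart = "always";
    RestartSec = 10;
  };
  script = ''
    ${pkgs.grafana-kiosk}/bin/grafana-kiosk \
      -URL https://grafana.example.com/playlists/play/1 \
      -kiosk-mode tv \
      -window-position 0,0
  '';
};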
In order to run multiple of these, you'd start the second one with an offset param to grafana-kiosk, such as -window-position 3841,0 for a second display sitting to the right of the first 4K screen.
Note that you'll probably need to authenticate to your Grafana instance; see the README for details on the various auth schemes. In our case, we use IAP and provide a service account with minimal permissions per team. You might also want to adjust RestartSec for your setup, either for faster startup or to avoid restart back-off.
Other than that, it's pretty much stock NixOS config: setting up networking, adding the kiosk user, and enabling openssh for remote management. I'll leave that as an exercise for the reader. The NixOS manual should come in handy.
Once you have this configuration in a flake, you can build an ISO image for flashing using something like