DevelopMeh
https://developmeh.com
ZolaenSun, 15 Mar 2026 00:00:00 +0000Kwik-E-Mart: Who Needs a Gas Town When a Gas Station Will DoSat, 07 Mar 2026 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/kwik-e-mart-who-needs-a-gas-town/
https://developmeh.com/i-made-a-thing/kwik-e-mart-who-needs-a-gas-town/<!-- INTERVIEW REMARKS:
## Key Details from Interview
- Origin: Inspired by beads' mail command which "doesn't really do anything" — imagined agents
sending specific prompts to each other through mailboxes
- Gas Town (Steve Yegge's multi-agent orchestrator) was too big — "I don't need a town, a gas
station is the right size for most of the agentic work I do"
- Targets small deterministic workflows on low throughput operations — frontier LLMs at high
concurrent volume are expensive
- Expected beads to offer a mailbox that tracks agent sessions, allowing retrigger/resume with
a new prompt and await response — that's what Kwik-E-Mart is
- Mental model: newsgroup server, not message broker — messages are public, consumers pull what
they want, copy locally for later
- The problem: LLM supervision orchestration causes degradation — context length over time
screws up attention, agents stop working accurately
- Design guarantee: if the action should happen, the agent gets the message. Success isn't
guaranteed but execution, logging, and feedback always happen
- Can identify if LLM performed expected operation and retry if not
- CI use case: module-specific consumers with tailored prompts kept IN THE CODEBASE alongside
the code they're about — not generic "AI update docs" but specific prompts per module/file class
- Use cases: keeping documentation up to date (sequence diagrams, API docs synced when endpoint
code changes), intercepting PagerDuty webhooks for log research, resolving/explaining test
failures
- Deployment: lambda, k8s pod, sidecar, CI runner — runs where the events are, not where the
developer is. These things don't run on developer systems.
- Sidecar pattern: deployed next to any application watching for specific log conditions
- Scale argument: NATS/Kafka/Redis aren't wrong, they're overscaled. Too much setup and
resources for problems that need something compact
- Explicitly followed Eric Raymond's 17 Unix Rules and Mike Gancarz's Unix Philosophy tenets
- The emotional core: "We are in this very stupid cycle of talking to LLMs for all activities.
We should be delegating and automating. Repeatability gives us productivity and reduces drudgery."
- The thesis: "Let the unsleeping robot do it 85% good enough. We can always clean it up but
otherwise we get nothing."
- Documentation example: "just having sequence diagrams updated or API docs synced when http
endpoint code changes is like magic"
## Manson Essayist Structure
Inverse thesis — DON'T open with the comparison or the tool. Open tangential.
Thread 1: The drudgery — boring tasks don't get done, 85% > 0%
Thread 2: Agent degradation — LLMs supervising LLMs eat their own context
Thread 3: Scale mismatch — towns, cities, gas stations
Thread 4: Unix philosophy — Raymond, Gancarz, this is deliberate architecture
Thread 5: The comparison — NATS, Redis, Kafka, pipes, Kwik-E-Mart — placed, not ranked
The thesis emerges: delegate to the unsleeping robot, give it the right prompt,
guarantee the attempt not the outcome.
-->
<p>There's a command in <a href="https://github.com/steveyegge/beads">beads</a> called <code>mail</code>. It doesn't really do anything. I stared at it for longer than I should have.</p>
<p>I was knee-deep in <a href="https://developmeh.com/i-made-a-thing/catalyst-orchestrator/">catalyst-orchestrator</a> at the time, watching Haiku make routing decisions that were completely predictable, burning tokens to arrive at conclusions a status field could have told me for free. And I kept thinking about that mail command. What if it <em>worked</em>? What if agents could send each other specific prompts — not chat, not supervision, just: "here's the job, here's exactly how to do it, go."</p>
<p>But that's not what we do. What we do is open a session, type instructions into an LLM, watch it work for a while, watch it start to drift, watch its attention degrade as the context fills up, and then either restart the whole thing or just accept the garbage output because we're tired of babysitting.</p>
<p>Secret time: I was tired of babysitting.</p>
<hr />
<h2 id="the-drudgery-problem">The Drudgery Problem <a class="anchor" href="#the-drudgery-problem">🔗</a>
</h2>
<p>Here's what actually happens with documentation. Nobody updates the sequence diagram. Nobody syncs the API docs when the HTTP endpoint changes. Not because they don't care — because they're busy, or they forgot, or the PR was already approved and who's going back to add a diagram now?</p>
<p>The alternative to 85% good enough isn't 100%. It's zero.</p>
<p>We are in this very stupid cycle of talking to LLMs for all activities. Opening sessions. Crafting prompts in real time. Watching the robot work. Correcting the robot. Watching it again. That's not automation. That's a slightly fancier version of doing it yourself, except now you're also managing the thing that's doing it.</p>
<p>Repeatability is what gives us productivity and reduces drudgery. Not cleverness. Not meta-orchestration. Not agents supervising agents in some fractal management structure that would make Dilbert weep. Just: when this thing happens, do this other thing, with this specific prompt, every time.</p>
<p>You should be picking up the conflicts now. We want automation but we keep building supervision. We want repeatability but we keep reaching for general-purpose tools that require us to be in the room.</p>
<hr />
<h2 id="the-degradation-problem">The Degradation Problem <a class="anchor" href="#the-degradation-problem">🔗</a>
</h2>
<p>I watched an agent lose its mind over a long session. Not dramatically — it didn't hallucinate monsters or start writing poetry. It just got... worse. Slowly. The way a person gets worse at their job at hour eleven of a twelve-hour shift.</p>
<p>LLM supervision orchestration has a structural problem: the orchestrator is itself an LLM. It's consuming context to manage context. Every decision it makes, every status check it interprets, every routing choice — that's all context window. And context window is attention. And attention degrades over length. The longer the session runs, the less accurately the supervising agent works, which means the work it's supervising also gets less accurate, which means it has more problems to manage, which means more context consumed.</p>
<p>It's a death spiral with a credit card attached.</p>
<p><a href="https://github.com/steveyegge/gastown">Gas Town</a> solves this by being a <em>town</em> — 20 to 30 Claude instances coordinated by a Mayor, with Polecats and Refineries and Deacons. It's impressive engineering. It's also a town. I don't need a town. Most of the agentic work I do targets small deterministic workflows on low throughput operations. Frontier LLMs at high concurrent volume are expensive. I needed a gas station.</p>
<hr />
<h2 id="what-a-gas-station-looks-like">What a Gas Station Looks Like <a class="anchor" href="#what-a-gas-station-looks-like">🔗</a>
</h2>
<p><a href="https://sr.ht/~ninjapanzer/Kwik-E-Mart/">Kwik-E-Mart</a> is what happened when I stopped thinking about orchestration and started thinking about mail.</p>
<p>Not email. Newsgroups. NNTP. Remember newsgroups? Messages are public. You pull what you want. You copy what you want locally. Nobody's routing messages <em>to</em> you — you subscribe to what you care about and you read when you're ready. The server doesn't care if you read or not. The messages are there.</p>
<p>That's the mental model. A daemon persists events to an append-only JSON-lines file. Producers dispatch events through stdin or a watch command that polls arbitrary commands on an interval. Consumers pull events, render Go templates against the event payload, and execute LLM subprocesses with the rendered prompt. The consumer acknowledges when it's done. If it crashes, the event is still there.</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>kwike daemon --http :4444
</span><span>kwike watch "git diff HEAD" --type ci.diff --interval 30s
</span><span>kwike dispatch --type review.requested < payload.json
</span><span>kwike consume --config reviewer.yaml --once
</span></code></pre>
<p>Four subcommands. Single binary. That's the whole thing.</p>
<p>I can't guarantee the LLM performs correctly. It's all based on prompt tuning and sometimes just luck. But I can guarantee that if the action should happen, the agent gets the message. There's always an execution, always a log, always feedback. And I can identify whether the LLM performed the expected operation and retry if it didn't. The durability is in the pipeline, not the output.</p>
<hr />
<h2 id="the-unix-thing">The Unix Thing <a class="anchor" href="#the-unix-thing">🔗</a>
</h2>
<p>This wasn't accidental. I explicitly followed Eric Raymond's 17 Unix Rules from <em>The Art of Unix Programming</em> and Mike Gancarz's Unix Philosophy tenets.</p>
<p>JSON-lines is "store data in flat text files." The four subcommands are "make each program do one thing well." Piping dispatch from stdin is "make every program a filter." The daemon over Unix sockets is "write transparent programs." <code>--dry-run</code> is "write programs which fail in a way that is easy to diagnose."</p>
<p>Small is beautiful. Build modular programs. Use composition. Avoid unnecessary output.</p>
<p>These aren't principles I admire from a distance. They're the architecture document.</p>
<hr />
<h2 id="so-where-do-the-big-tools-fit">So Where Do the Big Tools Fit? <a class="anchor" href="#so-where-do-the-big-tools-fit">🔗</a>
</h2>
<p>Here's the thing — NATS, Redis Streams, Kafka — they're not wrong. They're overscaled. They solve big problems at big scale with big infrastructure. Sometimes you need that. Most of the time, for this kind of work, you don't.</p>
<h3 id="unix-pipes">Unix Pipes <a class="anchor" href="#unix-pipes">🔗</a>
</h3>
<p>The simplest version of reactive LLM flows is already in your shell:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>fswatch -r ./content </span><span style="color:#d65940;">| while </span><span style="color:#95cc5e;">read</span><span> file</span><span style="color:#d65940;">; do
</span><span> cat </span><span style="color:#f8bb39;">"$file" </span><span style="color:#d65940;">| </span><span>llm </span><span style="color:#f8bb39;">"summarize this"
</span><span style="color:#d65940;">done
</span></code></pre>
<p>This works until your consumer crashes and misses events, or you want two consumers processing the same stream differently, or you want to replay history. Stdout is gone. There's no durability. There's no fanout. But for a one-off? Don't overthink it.</p>
<h3 id="nats-jetstream">NATS JetStream <a class="anchor" href="#nats-jetstream">🔗</a>
</h3>
<p>General-purpose distributed messaging. Durable streams, consumer groups, subject-based routing, clustering. Battle-tested. Massive ecosystem.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>nats stream add EVENTS --subjects </span><span style="color:#f8bb39;">"events.>"
</span><span>nats consumer add EVENTS llm-reviewer --deliver all --ack explicit
</span><span>
</span><span style="color:#3c4e2d;"># Consumer loop you write yourself
</span><span style="color:#d65940;">while </span><span>true</span><span style="color:#d65940;">; do
</span><span> msg</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(nats consumer next EVENTS llm-reviewer --count 1 --timeout 30s)
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$msg" </span><span style="color:#d65940;">| </span><span>llm -s </span><span style="color:#f8bb39;">"you are a code reviewer"
</span><span style="color:#d65940;">done
</span></code></pre>
<p>The gap: every bit of LLM integration is your problem. You write the consumer loop, the retry logic, the concurrency limits, the backoff, the prompt rendering. It's a messaging system you build LLM workflows on top of. And you need a NATS server running.</p>
<h3 id="redis-streams">Redis Streams <a class="anchor" href="#redis-streams">🔗</a>
</h3>
<p>If Redis is already in your stack:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>redis-cli XADD events:file </span><span style="color:#f8bb39;">'*'</span><span> path </span><span style="color:#f8bb39;">"$file"</span><span> diff </span><span style="color:#f8bb39;">"$diff"
</span><span>redis-cli XGROUP CREATE events:file reviewers </span><span style="color:#f8bb39;">'$'</span><span> MKSTREAM
</span><span>redis-cli XREADGROUP GROUP reviewers worker1 BLOCK 0 STREAMS events:file </span><span style="color:#f8bb39;">'>'
</span></code></pre>
<p>Similar trade-offs. Durable. Fanout via consumer groups. Simpler mental model than NATS. Same LLM gap.</p>
<h3 id="kafka">Kafka <a class="anchor" href="#kafka">🔗</a>
</h3>
<p>I <a href="https://developmeh.com/i-made-a-thing/recreating-kafka-blind/">wrote about recreating Kafka from scratch</a>. The experience reinforced that Kafka's model is right for high-throughput distributed event streaming and way too much for everything I'm describing here. It's Java. It needs a cluster. Standing it up for "I want to watch some files and talk to an LLM" is like hiring a construction crew to hang a picture frame.</p>
<h3 id="the-table">The Table <a class="anchor" href="#the-table">🔗</a>
</h3>
<table><thead><tr><th></th><th>Unix Pipes</th><th>NATS JetStream</th><th>Redis Streams</th><th>Kafka</th><th>Kwik-E-Mart</th></tr></thead><tbody>
<tr><td>Durability</td><td>No</td><td>Yes</td><td>Yes</td><td>Yes</td><td>Yes</td></tr>
<tr><td>Fanout</td><td>No</td><td>Yes</td><td>Yes</td><td>Yes</td><td>Yes</td></tr>
<tr><td>LLM templating</td><td>No</td><td>No</td><td>No</td><td>No</td><td>Yes</td></tr>
<tr><td>Consumer config</td><td>Shell</td><td>Imperative</td><td>Imperative</td><td>Imperative</td><td>Declarative YAML</td></tr>
<tr><td>Watch → Event</td><td>Manual</td><td>Manual</td><td>Manual</td><td>Manual</td><td>Built-in</td></tr>
<tr><td>CI-friendly</td><td>Sort of</td><td>Needs server</td><td>Needs server</td><td>Needs cluster</td><td><code>--once</code> <code>--dry-run</code></td></tr>
<tr><td>Distributed</td><td>No</td><td>Yes</td><td>Yes</td><td>Yes</td><td>Yes (HTTP mode)</td></tr>
<tr><td>Deployment</td><td>None</td><td>Server + CLI</td><td>Redis server</td><td>JVM cluster</td><td>Single binary</td></tr>
</tbody></table>
<p>NATS and Kafka win on scale, ecosystem, and distributed infrastructure. They're designed for that. Kwik-E-Mart wins on "I need a single binary that watches things and feeds specific prompts to LLMs with zero infrastructure."</p>
<hr />
<h2 id="where-it-actually-runs">Where It Actually Runs <a class="anchor" href="#where-it-actually-runs">🔗</a>
</h2>
<p>This isn't a developer tool. It's infrastructure.</p>
<p>Consider a GitLab CI pipeline where module-specific consumers with tailored prompts live <em>in the codebase</em> alongside the code they're about. Not a generic "AI, update the docs." A specific consumer config for <em>that</em> module, with <em>that</em> prompt, that knows how to handle <em>that</em> class of file:</p>
<pre data-lang="yaml" style="background-color:#12160d;color:#6ea240;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#95cc5e;">dispatch</span><span>:
</span><span> </span><span style="color:#95cc5e;">stage</span><span>: </span><span style="color:#f8bb39;">collect
</span><span> </span><span style="color:#95cc5e;">script</span><span>:
</span><span> - </span><span style="color:#f8bb39;">kwike dispatch --type "lint.result" < lint.json
</span><span> - </span><span style="color:#f8bb39;">kwike dispatch --type "test.result" < test.json
</span><span> </span><span style="color:#95cc5e;">artifacts</span><span>:
</span><span> </span><span style="color:#95cc5e;">paths</span><span>:
</span><span> - </span><span style="color:#f8bb39;">events.jsonl
</span><span>
</span><span style="color:#95cc5e;">review</span><span>:
</span><span> </span><span style="color:#95cc5e;">stage</span><span>: </span><span style="color:#f8bb39;">analyze
</span><span> </span><span style="color:#95cc5e;">needs</span><span>: [</span><span style="color:#f8bb39;">dispatch</span><span>]
</span><span> </span><span style="color:#95cc5e;">script</span><span>:
</span><span> - </span><span style="color:#f8bb39;">kwike consume --config llm-review.yaml --once
</span></code></pre>
<p><code>--once</code> means process events and exit. That's CI-native behavior. NATS consumers are designed to be long-lived. Kafka consumers definitely are. Kwik-E-Mart treats batch execution as a first-class mode because that's how CI works.</p>
<p>Or consider it deployed in a Lambda, spinning up instanced, routing based on the specific PagerDuty issue, delivering the right prompt for the right class of incident. Or as a k8s pod watching Prometheus for alarm conditions, helping define when PagerDuty should even be triggered. Or as a sidecar to any application, watching logs for specific conditions.</p>
<p>It runs where the events are, not where the developer is. These things don't run on developer systems.</p>
<hr />
<h2 id="the-point">The Point <a class="anchor" href="#the-point">🔗</a>
</h2>
<p>We keep building towns when we need gas stations. We keep opening chat sessions when we need mailboxes. We keep supervising robots when we should be delegating to them.</p>
<p>The unsleeping robot will do it 85% good enough. We can always clean it up. But otherwise? We get nothing. The sequence diagram stays stale. The API docs drift. The test failure sits unexplained until a human has time to look, which is never.</p>
<p>Just having sequence diagrams updated or API docs synced when HTTP endpoint code changes is like magic. Except it's not magic. It's a JSON-lines file, four subcommands, and a prompt that knows what it's looking at.</p>
<blockquote>
<p>"Small is beautiful." — Mike Gancarz, <em>The UNIX Philosophy</em>, 1994</p>
</blockquote>
<!-- REMARKS FOR NEXT SESSION:
## Status
- First draft complete from blog interviewer session
- Manson essayist inverse thesis structure applied:
- Opens with beads mail command (tangential)
- Weaves drudgery, degradation, scale, and Unix philosophy threads
- Thesis emerges at the end: delegate to the unsleeping robot, 85% > 0%
- Comparison section covers Unix pipes, NATS, Redis Streams, Kafka, Kwik-E-Mart
- CI and deployment section covers GitLab, Lambda, k8s, sidecar patterns
## Verify with Author
- [VERIFY: Is the catalyst-orchestrator timeline right — was beads mail inspiration during that work?]
- [VERIFY: The Gas Town description — "20 to 30 Claude instances" — is that accurate to your experience or just marketing?]
- [VERIFY: The newsgroup/NNTP analogy — does that land the way you want it to?]
- [VERIFY: Any specific module/prompt examples you want to include for the CI section?]
- [VERIFY: Tone check — is this the right amount of heat or too much/too little?]
## Links
- Kwik-E-Mart: https://sr.ht/~ninjapanzer/Kwik-E-Mart/
- Gas Town: https://github.com/steveyegge/gastown
- Beads: https://github.com/steveyegge/beads
- Catalyst article: @/i-made-a-thing/catalyst-orchestrator.md
- Recreating Kafka article: @/i-made-a-thing/recreating-kafka-blind.md
-->
<hr />
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h3 id="15-03-2026">15 03 2026 <a class="anchor" href="#15-03-2026">🔗</a>
</h3>
<h4 id="v0-5-making-tools-for-robots">v0.5 — Making tools for robots <a class="anchor" href="#v0-5-making-tools-for-robots">🔗</a>
</h4>
<p>Kwike is not really designed to be used by humans, its rather complicated. Since its been designed as a tool robots use its cli is intended to instruct process and carry a lot of documentation. Its an interesting problem and the question is, does something like this inform on how to build for humans too.</p>
</div>
Catalyst: An Orchestrator That Stopped Asking and Started DecidingSun, 15 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/catalyst-orchestrator/
https://developmeh.com/i-made-a-thing/catalyst-orchestrator/<!-- OUTLINE: Intro — The Two Problems
1. The beads-orchestrator (version 0, link: https://git.sr.ht/~ninjapanzer/beads-orchestrator) had Claude
doing everything — orchestration, decision-making, all of it. Claude Code crashed reading subagent chat
strings. It forgot to keep iterating. Memory issues. Timing issues.
2. Durability. Internet drops, Claude Max windows expire, sessions die. Needed something that waits,
recovers, and resumes without babysitting.
Goal: Create tasks across multiple projects (3 at once), let the daemon pick them up and complete them
overnight. The JetBrains beads manager (link to "keep-your-eyes-on..." article) feeds tasks in.
Optimize productivity while you sleep.
Hook/story: First real unattended run with the beads-orchestrator hit a concurrency bug — spun up so many
Sonnet sessions it burned through a Claude usage window in about 3 minutes.
-->
<hr />
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h3 id="11-02-2026">11 02 2026 <a class="anchor" href="#11-02-2026">🔗</a>
</h3>
<h4 id="v0-1-the-haiku-decides">v0.1 — The Haiku Decides <a class="anchor" href="#v0-1-the-haiku-decides">🔗</a>
</h4>
<p>The first version of catalyst was built around a simple idea: let a cheap, fast LLM handle the coordination. The daemon watched beads molecules for ready steps, spawned agents (Sonnet for implementation, Haiku for review, Opus for merging), and when something interesting happened — a review finished, a step failed, an agent got stuck — it packaged that event as a gate and shipped it over a Unix socket to a Haiku orchestrator.</p>
<p>Haiku's job was to interpret the situation and respond. "Review passed — merge or MR?" Haiku would answer <code>merge</code>. "Step failed — retry, skip, or abort?" Haiku would decide. The daemon was deliberately reactive. It didn't interpret agent output. It didn't make routing decisions. It just watched molecules, ran agents, hit gates, and waited for Haiku to tell it what to do.</p>
<p>The architecture looked like this:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>┌─────────────────┐ NDJSON/socket ┌─────────────────┐
</span><span>│ Haiku │◄────────────────────►│ catalyst │
</span><span>│ orchestrator │ │ daemon │
</span><span>│ │ gate_waiting: │ │
</span><span>│ - clears gates │ "review passed, │ - watches │
</span><span>│ - picks beads │ merge or mr?" │ molecules │
</span><span>│ - delegates │ ◄────────────────── │ - runs agents │
</span><span>│ merge to │ "merge" │ - emits events │
</span><span>│ Opus │ ──────────────────► │ - handles │
</span><span>└─────────────────┘ │ gates │
</span><span> └─────────────────┘
</span></code></pre>
<p>Agent output was unstructured prose. The daemon didn't parse it — Haiku did. Formulas defined the full workflow explicitly: <code>implement → review → fix → merge</code>, with gates as decision points between steps. Every transition required Haiku's blessing.</p>
<p>This worked. But it had a cost. Haiku was interpreting free-form text to make routing decisions that were, in practice, completely predictable. "Review passed" always meant merge. "Review failed" always meant fix. The orchestrator was spending tokens to arrive at conclusions the daemon could have reached by parsing a status field.</p>
</div>
<div class="devlog-entry">
<h3 id="13-02-2026">13 02 2026 <a class="anchor" href="#13-02-2026">🔗</a>
</h3>
<h4 id="v0-2-the-daemon-parses-the-daemon-routes">v0.2 — The Daemon Parses, The Daemon Routes <a class="anchor" href="#v0-2-the-daemon-parses-the-daemon-routes">🔗</a>
</h4>
<p>v0.2 was a philosophical inversion. Instead of the daemon asking Haiku what to do, agents were told to output a machine-parseable block, and the daemon was taught to read it.</p>
<p>The STEP-RESULT protocol replaced unstructured prose:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>---STEP-RESULT---
</span><span>STATUS: DONE
</span><span>VERDICT: APPROVED
</span><span>SUMMARY: Implementation meets all acceptance criteria
</span><span>INSTRUCTIONS:
</span><span>- Minor: consider adding a timeout to the HTTP client
</span><span>---END-RESULT---
</span></code></pre>
<p>The daemon's new <code>StepResultParser</code> extracted structured fields. The <code>StatusRouter</code> made deterministic decisions based on what it found:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>STATUS: DONE
</span><span> ├── Reviewer? Check VERDICT
</span><span> │ ├── APPROVED → enable merge step
</span><span> │ └── REJECTED → enable fix step, pass INSTRUCTIONS downstream
</span><span> └── Otherwise → advance DAG
</span><span>
</span><span>STATUS: BLOCKED → mark bead blocked
</span><span>STATUS: ERROR → mark bead blocked
</span></code></pre>
<p>No LLM interpretation. No token spend on routing. The daemon read the result and knew where to go.</p>
<p>This version also introduced automatic retry with Opus escalation — if an agent produced malformed output (missing the STEP-RESULT block), the daemon retried up to 5 times with the original model, then escalated to Opus. If even Opus couldn't produce a parseable result, the bead was marked blocked. All retry events were logged to bead comments for auditability.</p>
<p>Agent prompts moved from hardcoded Go strings to an external TOML template file (<code>agent_prompts.toml</code>), with Go template variables (<code>{{.BeadID}}</code>, <code>{{.Description}}</code>, <code>{{.ReviewInstructions}}</code>) injected at runtime. Review instructions from the reviewer's INSTRUCTIONS field flowed downstream to the fixer and merge agents via bead comments — the reviewer could say "fix the nil pointer in auth.go:45" and the fixer would see that in its prompt.</p>
<p>The formula still defined <code>implement → review → fix → merge</code> as explicit steps. The fix step always existed in the workflow, even when the review approved and it was never needed.</p>
</div>
<div class="devlog-entry">
<h3 id="14-02-2026">14 02 2026 <a class="anchor" href="#14-02-2026">🔗</a>
</h3>
<h4 id="v0-3-the-daemon-creates-steps-at-runtime">v0.3 — The Daemon Creates Steps at Runtime <a class="anchor" href="#v0-3-the-daemon-creates-steps-at-runtime">🔗</a>
</h4>
<p>v0.3 asked: if the daemon is already making the routing decisions, why does the fix step need to exist in the formula at all?</p>
<p>The answer was that it didn't. In v0.3, the fix step was removed from every formula. The workflow became <code>implement → review → merge</code>. When the reviewer output <code>VERDICT: REJECTED</code>, the daemon dynamically created a fix step in beads storage, spawned a fixer agent, and when the fix completed, reset the review step to <code>open</code> so it would re-run.</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>implement → review ←──────────────┐
</span><span> │ │
</span><span> ┌─────────┴─────────┐ │
</span><span> │ │ │
</span><span> APPROVED REJECTED │
</span><span> │ │ │
</span><span> ▼ ▼ │
</span><span> merge create fix │
</span><span> (dynamic step) │
</span><span> │ │
</span><span> fix runs │
</span><span> │ │
</span><span> fix DONE ─────┘
</span></code></pre>
<p>This loop could run up to 3 times. Iteration count was tracked via bead comments (<code>[daemon] FIX_ITERATION: N</code>). After 3 rejections, the bead was marked BLOCKED — the daemon decided the implementation couldn't be salvaged through automated fixes.</p>
<p>The key design insight was crash recovery. Fix steps weren't held in daemon memory — they were persisted as real beads issues. If the daemon crashed mid-fix, it would restart, scan for in-progress molecules, find the fix step by its title pattern (<code>Fix: <beadID></code>), and resume processing. The existing step identification logic (<code>extractFormulaStepID()</code>) already recognized the "Fix:" prefix, so dynamically created steps were processed identically to formula-defined ones.</p>
<p>The review step stayed <code>in_progress</code> during the fix loop rather than being closed and reopened. This avoided a tricky state transition (beads didn't naturally support closed → open) and made conceptual sense — the review represented "evaluate this implementation," and that evaluation wasn't complete until the code either passed or exceeded the iteration limit.</p>
<p>This version also introduced <code>stub-claude</code> for deterministic end-to-end testing. Instead of running real Claude against real code, test scenarios defined expected agent sequences: "reviewer rejects once, then approves" should produce exactly 6 agent invocations (refine, implement, review, fix, review, merge). This made the implicit fix loop testable without burning API tokens.</p>
</div>
<hr />
<h2 id="the-arc">The Arc <a class="anchor" href="#the-arc">🔗</a>
</h2>
<!-- OUTLINE: The Arc / Reflection — threads to weave together:
1. Proving Sonnet/Opus wrong about Haiku — They said Haiku couldn't orchestrate. It could, but its job
was better expressed as a DAG with error routing to Opus. The insight wasn't that Haiku was bad — it
was that the orchestration decisions were predictable enough to be code.
2. Moving things into code — Recurring theme across all 3 versions. The fix step was in the formula
(v0.1, v0.2), then it wasn't (v0.3). You keep pulling decisions out of LLM interpretation and into
deterministic logic. Not because LLMs can't do it, but because when the answer is predictable, code
is cheaper and more reliable. An implementer always gets a review, a review might always need a fix.
3. The daemon as a patient worker — The whole point is durability. It waits for internet. It waits for
Claude Max windows. It survives crashes. The fix loop persists in beads storage. This isn't about AI
autonomy — it's about building something that doesn't need you awake to keep working.
-->
<!-- OUTLINE: What's Next
- Session optimization: Each step is a fresh Claude session — expensive. Agents should assess task
complexity and choose Sonnet vs Opus accordingly.
- Inferred permissions: Move away from yolo mode. Infer project and task permissions, inject them per
session to reduce dangerous hallucinations.
- DAG visualization: See the daemon's decision-making as it routes work. Let users flag items for manual
review and make mid-stream changes for better control.
-->
Automatic Programming: Iteration 4Sun, 08 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/devex/automatic-programming-iteration-4/
https://developmeh.com/devex/automatic-programming-iteration-4/<img src="/devex/mark1-computer.jpg" alt="Grace Hopper: Mark 1 Computer" style="width: 100%; height: 600px; object-fit: cover; object-position: center calc(50% + 50px);">
<h2 id="automatic-programming-iteration-4">Automatic Programming: Iteration 4 <a class="anchor" href="#automatic-programming-iteration-4">🔗</a>
</h2>
<p>While I know that the discourse is not complete binary whether you are for or against LLM generated code it's probably the right time to take a step back a few years and explore the iterations of our industry.</p>
<p>Let's just work backwards, COBOL, iteration 3.</p>
<blockquote>
<p>[Common Business-Oriented Language] (Synonymous with evil.) A weak, verbose, and flabby language used by code grinders to do boring mindless things on dinosaur mainframes. Hackers believe that all COBOL programmers are suits or code grinders, and no self-respecting hacker will ever admit to having learned the language. Its very name is seldom uttered without ritual expressions of disgust or horror.
<em>Evans, Claire L - Broad band</em> the untold story of the women who made the Internet_ -> <em>From The Hacker's Dictionary</em></p>
</blockquote>
<p>Yes, the perpetual software of big finance and numerous other systems. While I haven't written COBOL myself I have had to rewrite at least one system written in COBOL.</p>
<p>Poking fun aside COBOL exists because of the dream that non-experts could be computer programmers. By today's standards you still need to be an expert to write COBOL. Prior to its creation, software was written in assembly and before that machine code, and before that patch cables, actual wires, iteration 1.</p>
<blockquote>
<p>Grace knew that would only happen when two things occurred:</p>
<ol>
<li>Users could command their own computers in natural language.</li>
<li>That language was machine independent.
That is to say, when a piece of software could be understood by a programmer as well as by its users, when the same piece of software could run on a UNIVAC as easily as on an IBM machine, code could begin to bend to the wills of the world. Grace called this general idea "automatic programming"...
<em>Evans, Claire L - Broad band</em> the untold story of the women who made the Internet_</li>
</ol>
</blockquote>
<p>That Grace of course was Grace Hopper, and she was obsessed with making programming easier and more efficient. In her time programming was a kind of wizardry and very few knew the incantations to make the computer operate. From a business standpoint making programming easier was bad for business since computer companies sold the computer and the software.</p>
<p>Portable programs, ones that could be written on any machine for any other machine was a business risk. It created competition so there was resistance.</p>
<blockquote>
<p>Those who resisted automatic programming became known as "Neanderthals." They might as well have called themselves framebreakers, as Lord Byron had over a century before.
<em>Evans, Claire L - Broad band</em> the untold story of the women who made the Internet_</p>
</blockquote>
<p>"Framebreakers" refers to those workers who opposed the automatic loom, better known as the Luddites.</p>
<p>Before computers cloth was made on the loom and the origin of the punchcard was used to create an automatic loom. After its invention there was not much use for using a manual loom. It was disruptive and changed an entire industry, displacing workers.</p>
<p>Grace and her cadre believed in a future where the programs write themselves. There is some corollary to today with code generation, which actually is programs writing themselves, something Grace dreamed of. The difference between the loom and the compiler is only in the growth potential. Cloth was an end result of a chain of optimization but while you could tirelessly create it in any pattern imaginable, someone still needed to imagine the patterns. Variety of cloth became commonplace, the art remained, the drudgery was lost.</p>
<p>Now I respect that for some the act was the value, I share those feelings. I love writing the actual code, I care about it more than the product it produces. An opinion whose popularity depends on what side of the invoice you sit.</p>
<blockquote>
<p>A quick lesson: computers do not understand English, French, Mandarin Chinese, or any human language. Only Machine code, usually binary, can command a computer, at its most elemental level, to pulse electricity through its interconnected logic gates.
<em>Evans, Claire L - Broad band</em> the untold story of the women who made the Internet_</p>
</blockquote>
<p>Programs are essentially the aggregation of basic operations, layers upon layers. If we think of code generation as just another kind of compiler it's part of a long lifeline of change approaching the ideal "automatic programming."</p>
<p>Of course I would prefer to ask Grace her formal opinion. But in her time when presenting her arguments, mathematicians were once inundated in the tedium of arithmetic to solve their equations. Computers arrived and essentially removed the need for those steps and allowed them to get closer to the interesting part, the solutions. She argued that the compiler did the same thing, modulating the complexity of using computers allowed programmers to spend more time on stimulating thoughts.</p>
<p>Of course the reality was mathematicians became programmers to advance their work.</p>
<p>Programmers used compilers to build elegant languages, and COBOL.</p>
<p>I think it's quite funny to have the perspective that programs are binary, iteration 2, and writing the program for the computer was to create something that could accurately generate binary programs.</p>
<p>Now we stand at another transition where automatic programming is telling the computer to write the code that the compiler turns into a binary program. If the compiler was the 3rd level operation we are now at the 4th.</p>
<h2 id="here-is-where-it-all-came-together-though">Here is where it all came together though: <a class="anchor" href="#here-is-where-it-all-came-together-though">🔗</a>
</h2>
<blockquote>
<p>Grace loved coding, but she admitted that "the novelty of inventing programs wears off and degenerates into the dull labor of writing and checking programs. This duty now looms as an imposition on the human brain"
<em>Evans, Claire L - Broad band</em> the untold story of the women who made the Internet_</p>
</blockquote>
<p>I have been feeling this for years, the code just keeps getting more repetitive. I just keep doing the same extremely complicated and extremely boring operations over and over again. The novelty of software I grew up with in the 2000s is over. Everything is a framework or a dogma and all solutions are solving the same problems with a different color scheme and font.</p>
<p>If anything would make me embrace the 4th level it's this, even if it means no one needs me anymore. I can at least see the realization of Grace's dream and in some way if everyone becomes a programmer finally, I'll have more people to talk to about what I love.</p>
<p>We aren't there yet, the software world is still pretty complicated and you have to know a lot of special dance moves to get things working right. But it's not going to be forever.</p>
<h2 id="the-book">The book <a class="anchor" href="#the-book">🔗</a>
</h2>
<p>Before I wander off into a diatribe of where our future is going lemme just stop and tell you to read this book:</p>
<p><a href="https://www.penguinrandomhouse.com/books/545427/broad-band-by-claire-l-evans/">Broad Band - Claire L. Evans</a></p>
<p>It's a good one, it has changed my mind on if I will continue to call myself an engineer or a programmer. The definition has finally been clarified. I was today - 2 weeks old and that was too long to know the truth.</p>
<p>Also it answers the question of why women were not present in Computer Science but I was taught by women who had careers in computer science.</p>
<p>Point is strong recommendation.</p>
<h2 id="where-are-we-going">Where are we going? <a class="anchor" href="#where-are-we-going">🔗</a>
</h2>
<p>I dunno, maybe we are all out of work. Maybe our MBA Degree bosses will finally see their reality of the numbers going up and to the right forever.</p>
<p>I see it like this, programming ended when the job absorbed all its roles by mere definition. It has been an amalgam for a while and that has been a crime. Now I can focus on building things again at the scale required in our times. I can concurrently build 2 or 3 projects while focusing on my writing. Sounds like a dream and the troubles of today are not forever.</p>
<p>The current goals of centralized AI is unsustainable and within a few years the ASICs will arrive, our computers will be packed with high bandwidth memory and the models will be local. Just like how all of a sudden we all started walking around super computers in our pockets we will build the infrastructure to build all the hardware we need to move forward.</p>
<p>I mean we live in the dumbest timeline and greed seems to be winning but if the pattern from past is here to loop again we go from the dumb time to the bright time for a while again. I am looking forward to that at least.</p>
Keep Your Eyes on the IDE, and Your Robots on the TicketsSun, 08 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/keep-your-eyes-on-the-ide-and-your-robots-on-the-tickets/
https://developmeh.com/i-made-a-thing/keep-your-eyes-on-the-ide-and-your-robots-on-the-tickets/<h2 id="keep-your-eyes-on-the-ide-and-your-robots-on-the-tickets">Keep Your Eyes on the IDE, and Your Robots on the Tickets <a class="anchor" href="#keep-your-eyes-on-the-ide-and-your-robots-on-the-tickets">🔗</a>
</h2>
<p><em>Initial Scene:</em></p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>Narrator: Bead Manager?! What does that even mean... let's start back at the beginning:
</span></code></pre>
<p><em>Scene Break:</em> (Dissolve)</p>
<p><em>Time Jump:</em></p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>"Two weeks earlier..."
</span></code></pre>
<p><em>New Scene:</em></p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>A tall handsome man with thick dark hair leans over a computer with boxes of black bordered in grey scrolling dark green text. Scowling...
</span><span>
</span><span>Author enters the room
</span><span>
</span><span>Author: Who the hell are you! Get away from my laptop! Freaking coffee shops...
</span></code></pre>
<h3 id="the-hero-s-journey">The Hero's Journey <a class="anchor" href="#the-hero-s-journey">🔗</a>
</h3>
<p>As you can imagine I have been following the post transformer LLM growth for about 4-5 years at this point. I didn't understand it and I never really used it but I keep my ear to the ground. Increasingly frustrated with the inability to keep the LLM on task. I mean its ignorance on my part and the tool isn't ready yet. Such is the mark of progress, things improve over time. Although I am still challenged with simple things.</p>
<blockquote>
<p>Give me 20 variations of this prompt for as jsonl training data using X format</p>
</blockquote>
<p>I get 8...</p>
<p>I get 23</p>
<p>I get 12</p>
<p><em>Jump Cut:</em></p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>Laptop launches out the window
</span></code></pre>
<p>So that's problem one and how do we solve it? Well with a novel wrapper that counts outputs and then re-prompts to do it again. I think they call that the <em>Ralph Loop</em>, I don't, I just call it the nature of the thing.</p>
<p>I learned later that this is generally caused by ambiguity of the context. Asking for 1 item 20 times and feeding back in the previous set to avoid duplicates always works better. The teaching: the computer is dumb, don't make it think too hard and everything goes smoother.</p>
<p>Most of what is to follow is the application of <a href="/soft-wares/agentic-patterns-elements-of-reusable-context-oriented-determinism/">Agentic Patterns: Elements of Reusable Context-Oriented Determinism</a></p>
<h3 id="beads">Beads <a class="anchor" href="#beads">🔗</a>
</h3>
<p>What beads provides is really just an idea and its worth exploring yourself: <a href="https://github.com/steveyegge/beads">https://github.com/steveyegge/beads</a>.</p>
<p>It self describes as "A memory upgrade for your coding agent," which I think is arguable but it was the trigger I needed to expand my concept of what a workflow with an LLM could look like. To be honest I didn't just go "Ah Beads! Its all clear now." Instead I found this article about <a href="https://steve-yegge.medium.com/welcome-to-gas-town-4f25ee16dd04">Gas Town</a> which I didn't read, thanks ADHD, and instead installed it blindly. If I was to give it a review it would be that Gas Town is kind of a meme of agent orchestration. Clearly, there is a lot of work put into it, but I think the author might agree that its an expression of an idea in a more artistic than practical form.</p>
<p>But who cares, I walked back from Gas Town to beads, the underlying magic, in my opinion. So I describe this as a context graph, I am able to manually or with an agent LLM extract just as much focused context as I want and use it as a concrete repeatable prompt. While the same prompt doesn't get the exact response each time, the same prompt gets generally the same tool use and generally the same code is constructed. Which makes me wonder if the variability of code is so limited by its grammar restrictions that LLMs have less predictive options to bias towards.</p>
<p>Ok, I am gilding the lily a bit, a bead is just a bug ticket or a todo list and its a prompt that has a dependency chain. How I am using it is more like Jira for robots, if Jira wasn't software designed for my suffering. I am able to build a feature and break down tasks then feed a path of those tasks to the agent.</p>
<p>You may be asking, but why not just use markdown files or JSONL. Well because I am a human and I hate reading JSONL files, I have ADHD so if the file is longer than 10 lines it will never be fully read. Better put what you want as the last line on the bottom cause thats all I see. Point is I need to be able to monitor, tune, and track the agents. See what Gas Town did was have the agents self-manage. While novel, its a bit bizarre when you are trying to avoid scope creep cause LLMs love to add features.</p>
<p>Back to the other question, why not markdown files. Well, two reasons, first they are kinda noisy, second if the LLM has to read more than the exact section of the file they are working on some ambiguity could be introduced. If you notice the agent will often scan a file 50 lines at a time if there is no index. Which means some of that ends up in its context. When we want determinism our first goal is to make sure each interaction is exactly the same prompt. This means beads is mostly an opinion and is probably not required.</p>
<h3 id="stay-in-the-ide-and-manage-your-robots">Stay in the IDE and Manage your robots <a class="anchor" href="#stay-in-the-ide-and-manage-your-robots">🔗</a>
</h3>
<p>So good choices after bad maybe but when I have a database for my tasks and their prompts I need a way to visualize it. The purpose here is to allow me to create and observe the tasks my agent orchestration is running on. For me this is just Claude Opus delegating tasks to Sonnet agents in an agentic loop.</p>
<p>This all started with this command <code>bd graph --compact --all</code></p>
<p><img src="/i-made-a-thing/Screenshot_2026-02-08_14-06-45.png" alt="Beads graph output" /></p>
<p>All because I wanted to watch my agent orchestration work through my tickets for another project.</p>
<p>Well that has led to this:</p>
<p><img src="/i-made-a-thing/Recording%202026-02-08%20at%2011.11.43.gif" alt="Beads Manager plugin demo" /></p>
<p>A full management console that lets me watch the beads transition status but also let me edit and add comments.</p>
<p>In this video there is an experimental refinement mechanism being demonstrated, available in the current release: <a href="https://plugins.jetbrains.com/plugin/30089-beads-manager">Jetbrains Marketplace</a></p>
<h3 id="the-workflow">The workflow <a class="anchor" href="#the-workflow">🔗</a>
</h3>
<p>So the other half of this tool is this set of prompts for claude: <a href="https://github.com/ninjapanzer/beads-orchestration-claude">beads-orchestration-claude</a></p>
<p>Now this is for claude but the practice can be applied manually or using other agents, the pattern is what matters and the prompt helps encapsulate the pattern more than the agent.</p>
<p>The keys here are:</p>
<ul>
<li>Recoverable</li>
<li>Durable</li>
<li>Keep your eyes in the IDE</li>
</ul>
<h4 id="1-planning">1. Planning <a class="anchor" href="#1-planning">🔗</a>
</h4>
<p>So our first path here is to plan out a feature. This is really the only time we have a discussion with the LLM but my recommendation is to write a brief in a markdown file. A musing is good enough where you describe the problem, some technical planning around constraints and the systems you want to support.</p>
<p>What you do for any brief, use-cases, goals, non-goals, definitions, and open questions.</p>
<p>Once this is prepared you hand this over to the agent. For me that uses the <code>/new project</code> command <a href="https://github.com/ninjapanzer/beads-orchestration-claude?tab=readme-ov-file#new-project-setup">REF</a> if we provide it with <code>project name</code> <code>readme</code> <code>git remote url</code> it will setup a baseline project with beads using some LLM magic and a bash script read the brief and prepare the project with a proper explanation of the project for CLAUDE.</p>
<p>Once we have a nice agent specific write up for the project, which is important, we can begin planning. Beads provides some tools that will naturally be injected into your project to help the agent. But you may need to tell your agent this</p>
<blockquote>
<p>use beads <code>bd</code> to plan out tasks for this project, <code>bd prime</code> for an overview of commands</p>
</blockquote>
<p><code>bd prime</code> exposes an agent friendly output for how to invoke commands.</p>
<p>Your agent should now be creating issues in beads for your project. Depending on how you like it you can use as many or as few features as you like from beads, which has a number of fields to hold context about actions. At the very simplest you will get titles and descriptions. If you asked for a feature or an epic you will find they may have been mapped as dependencies.</p>
<p>You should then review the tasks. This can be done with <code>bd list</code> and <code>bd show id</code> or use the jetbrains plugin.</p>
<h4 id="2-review">2. Review <a class="anchor" href="#2-review">🔗</a>
</h4>
<p>So now we review the beads and expand / contract the plan asking the agent to defer tickets we are unsure about or expand others.</p>
<h4 id="3-work-breakdown">3. Work Breakdown <a class="anchor" href="#3-work-breakdown">🔗</a>
</h4>
<p>This is probably the most important part. Ask a reasoning model to review all the beads and provide implementation details for those exact tasks in the beads. The idea here is to have the agent make a big plan but instead of writing all the code write code snippets that are attached to the tasks.</p>
<p>We can then take the vibe code approach and execute on this or do a pre-review of our code. Its not uncommon for the agent to have wandered down a bad architecture path. Here is our moment to focus on a specific task and a specific ticket and allow things to be revised in a focused way.</p>
<p>The best way to do this is to first clear your context and ask:</p>
<blockquote>
<p>Given the project overview please review bead <id> and revise to include an a single refresh flow for all data sources. Also review implementation details.</p>
</blockquote>
<h4 id="4-sdlc">4. SDLC <a class="anchor" href="#4-sdlc">🔗</a>
</h4>
<p>Tell the agent to now make documentation and testing tasks linking them as required to the beads that relate to them. You should end up with a layer 2 of tasks that will follow up after the implementation completes.</p>
<p>I usually then ask:</p>
<blockquote>
<p>Given the use-cases in the project overview define an e2e testing ticket for planning e2e tests that we can review at the end.</p>
</blockquote>
<p>If all is well the agent should create a task that it will stop and design testing with you that include acceptance criteria based on the provided use-cases.</p>
<h4 id="5-burn-tokens">5. Burn tokens <a class="anchor" href="#5-burn-tokens">🔗</a>
</h4>
<p>Now we get to the more technical part. We need to delegate actions to sub agents and depending on what agent infra you use this could be built-in or require manual orchestration.</p>
<p>The command <code>/beads-orchestrate</code> <a href="https://github.com/ninjapanzer/beads-orchestration-claude/tree/master?tab=readme-ov-file#beads-orchestration">REF</a> handles most of the heavy lifting.</p>
<p>It instructs the orchestrator to fork new processes using a template. For claude this means it will append</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>--dangerously-skip-permissions --model sonnet</span><span style="color:#d65940;">|</span><span>haiku --print -p </span><span style="color:#f8bb39;">"..."
</span></code></pre>
<p>For the prompt it will read the bead details along with some workflow about updating the bead and write that new prompt to a temp file. Passing the temp file to the new process. This obviously gives you the ability to debug what is happening at the injected prompt level.</p>
<p>It then sleeps and waits for the sub process to finish.</p>
<p><em>Why?</em> Well Claude is just a nodejs app and it eventually runs out heap space because it reads the stdout and stderr of all tasks it orchestrates internally. As a subprocess it watches, it's a fresh Claude instance so if it fails it fails in a recoverable way. Since the prompt file is named after the bead it can recover by just restarting the agent.</p>
<p>At this point the orchestrator should spawn the implementer which reads the implementation details and completes the work.</p>
<p>Then the orchestrator will spawn a new agent to handle code review, usually a simpler agent.</p>
<p>All this time the agents will leave comments on the tickets so you can see where it ran into problems and picked a tradeoff. This is very important for root cause analysis later if a feature degrades. You can have the LLM resurrect the changes merged into a branch the same name as the bead. Review the decision it made and make another one. Better yet, since the orchestrator and implementer read the comments you can just append your request to the ticket, mark it open, and tell the agent to refactor it again.</p>
<p>Here is an example of what a bead comment thread looks like in practice. This is from an issue where the Gradle build was failing:</p>
<blockquote>
<p><strong>[Opus]</strong> at 2026-02-06 01:11
Starting orchestration. Dispatching Sonnet implementer to fix Kotlin stdlib warning in gradle buildPlugin.</p>
<p><strong>[Opus]</strong> at 2026-02-06 01:16
Resuming orchestration. Dispatching Sonnet implementer to fix Kotlin stdlib warning.</p>
<p><strong>[Opus]</strong> at 2026-02-06 01:24
Resuming orchestration. Previous worktree had no commits - starting fresh. Dispatching Sonnet implementer to fix Kotlin stdlib warning.</p>
<p><strong>[Sonnet]</strong> at 2026-02-06 01:25
Starting implementation. Will examine build.gradle.kts and gradle.properties to understand current configuration, then apply fix per https://jb.gg/intellij-platform-kotlin-stdlib</p>
<p><strong>[Sonnet]</strong> at 2026-02-06 01:26
COMPLETED: Added kotlin.stdlib.default.dependency=false to gradle.properties. Build verified successful without warnings. Fix committed to branch.</p>
</blockquote>
<p>Notice how the orchestrator (Opus) had to resume twice - once after the first dispatch seemingly stalled, and again when it found the worktree had no commits. This is the kind of recovery that happens automatically. The implementer (Sonnet) then picked up the task, did its research, applied the fix, and verified success. All of this is visible in the ticket history without watching terminal output scroll by.</p>
<h4 id="6-when-it-fails">6. When it fails <a class="anchor" href="#6-when-it-fails">🔗</a>
</h4>
<p>This workflow isn't perfect but thats the big reason for the plugin. This whole process keeps you from staring at the chat stream and back into the IDE as your work. If you see progress not being made or an issue has comments that move it to blocked you can address it there and then just kick the orchestration. The goal is we have boring work we don't wanna do and we let the robot do it while we act on the interesting parts.</p>
<p>But sometimes it just hangs, haven't solved it yet. When this happens we are always recoverable. Claude subprompts have a 10 minute timeout so even orphaned they will be killed. You just start orchestration again on a clear context and things recover without your attention.</p>
BATS - Testing Bash Like You Mean ItSun, 08 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/tech-dives/bats-testing-bash-like-you-mean-it/
https://developmeh.com/tech-dives/bats-testing-bash-like-you-mean-it/<p>Bash has a reputation problem.</p>
<p>It's the language people write when they can't figure out how to do something in a "real" language. It's duct tape. It's the thing that holds your CI/CD together with <code>set -e</code> and crossed fingers. Nobody tests bash scripts because, well, how would you even do that?</p>
<p>This is bullshit.</p>
<p>Bash is core to every Unix-like operating system. It's the glue between tools. It's the orchestration layer for distributed systems. If you're building CLI tools meant to be composed, piped, and chained together—bash isn't a workaround, it's the runtime.</p>
<p>I built a distributed job queue CLI. The components were solid Go with good unit tests. But unit tests couldn't answer the real question: does this thing actually work when you're using it the way it's meant to be used? In bash. From the command line. With real files and processes and timing issues.</p>
<p>BATS—the Bash Automated Testing System—turned out to be the answer. Not Cucumber. Not end-to-end frameworks that spawn browsers. BATS. Because if your tool lives in bash, your integration tests should too.</p>
<p>Here's how to use it.</p>
<h2 id="what-bats-actually-is">What BATS Actually Is <a class="anchor" href="#what-bats-actually-is">🔗</a>
</h2>
<p>BATS is a TAP-compliant testing framework for bash scripts. It runs tests, reports results, and provides assertion helpers that don't make you want to throw your keyboard.</p>
<h3 id="installation">Installation <a class="anchor" href="#installation">🔗</a>
</h3>
<p>Skip the package managers. Clone the repos directly into your project so everyone gets the same version:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># install-bats-libs.sh
</span><span style="color:#3c4e2d;">#!/bin/bash -e
</span><span>
</span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[ </span><span>-d </span><span style="color:#f8bb39;">"./.test/bats" </span><span style="color:#95cc5e;">]</span><span style="color:#d65940;">; then
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"Deleting folder $FOLDER"
</span><span> rm -rf </span><span style="color:#f8bb39;">"./.test/bats/"
</span><span> mkdir -p ./.test/bats
</span><span style="color:#d65940;">else
</span><span> mkdir -p ./.test/bats
</span><span style="color:#d65940;">fi
</span><span>
</span><span>git clone --depth 1 https://github.com/bats-core/bats-core ./.test/bats/bats
</span><span>rm -rf ./.test/bats/bats/.git
</span><span>
</span><span>git clone --depth 1 https://github.com/ztombol/bats-support ./.test/bats/bats-support
</span><span>rm -rf ./.test/bats/bats-support/.git
</span><span>
</span><span>git clone --depth 1 https://github.com/ztombol/bats-assert ./.test/bats/bats-assert
</span><span>rm -rf ./.test/bats/bats-assert/.git
</span><span>
</span><span>git clone --depth 1 https://github.com/jasonkarns/bats-mock.git ./.test/bats/bats-mock
</span><span>rm -rf ./.test/bats/bats-mock/.git
</span></code></pre>
<blockquote>
<p><strong>Bash Note:</strong> <code>[ -d "./.test/bats" ]</code> uses the single-bracket test command (<a href="https://linux.die.net/man/1/test"><code>test</code></a>) to check if a directory exists. The <code>-d</code> flag returns true if the path exists and is a directory. Single brackets are POSIX-compliant and work in any shell. The spaces inside the brackets are required—<code>[-d ...]</code> won't work.</p>
</blockquote>
<p>Run it once, commit the <code>.test/bats</code> directory. Now your tests work the same everywhere.</p>
<p>Need to start fresh? Here's the cleanup script:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># clean-bats.sh
</span><span style="color:#3c4e2d;">#!/bin/bash -e
</span><span>
</span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[ </span><span>-d </span><span style="color:#f8bb39;">"./.test/bats" </span><span style="color:#95cc5e;">]</span><span style="color:#d65940;">; then
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"Deleting folder $FOLDER"
</span><span> rm -rf </span><span style="color:#f8bb39;">"./.test/bats/"
</span><span style="color:#d65940;">fi
</span></code></pre>
<p>This gives you:</p>
<ul>
<li><strong>bats-core</strong> - The test runner itself</li>
<li><strong>bats-support</strong> - Required dependency for other helpers</li>
<li><strong>bats-assert</strong> - <code>assert_success</code>, <code>assert_output</code>, <code>assert_line</code></li>
<li><strong>bats-mock</strong> - Stubbing external commands</li>
</ul>
<p>Run tests with the local binary:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>./.test/bats/bats/bin/bats .test/</span><span style="color:#d65940;">*</span><span>.bats
</span></code></pre>
<p>Or add it to your PATH in your test helper (we'll get to that).</p>
<h2 id="level-1-basic-command-testing">Level 1: Basic Command Testing <a class="anchor" href="#level-1-basic-command-testing">🔗</a>
</h2>
<p>Start simple. Can your CLI run without exploding?</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/basic.bats
</span><span style="color:#3c4e2d;">#!/usr/bin/env bats
</span><span>
</span><span style="color:#3c4e2d;"># Load helper libraries
</span><span>load bats/bats-support/load
</span><span>load bats/bats-assert/load
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">"command exists and shows help" </span><span>{
</span><span> run mycli --help
</span><span> assert_success
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"Usage:"
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">"version flag returns version" </span><span>{
</span><span> run mycli --version
</span><span> assert_success
</span><span> assert_output --regexp </span><span style="color:#f8bb39;">'[0-9]+\.[0-9]+\.[0-9]+'
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">"invalid command shows error" </span><span>{
</span><span> run mycli not-a-real-command
</span><span> assert_failure
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"Unknown command"
</span><span>}
</span></code></pre>
<p>Run it:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>bats test/basic.bats
</span></code></pre>
<p>You get TAP output:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span> ✓ command exists and shows help
</span><span> ✓ version flag returns version
</span><span> ✓ invalid command shows error
</span><span>
</span><span>3 tests, 0 failures
</span></code></pre>
<h3 id="setup-and-teardown">Setup and Teardown <a class="anchor" href="#setup-and-teardown">🔗</a>
</h3>
<p>Tests need clean state. Use <code>setup()</code> and <code>teardown()</code>:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/workspace.bats
</span><span style="color:#3c4e2d;">#!/usr/bin/env bats
</span><span>
</span><span>load bats/bats-support/load
</span><span>load bats/bats-assert/load
</span><span>
</span><span style="color:#60a365;">setup</span><span>() {
</span><span> </span><span style="color:#3c4e2d;"># Create temporary directory for this test
</span><span> TEST_TEMP</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(mktemp -d)
</span><span> </span><span style="color:#95cc5e;">cd </span><span style="color:#f8bb39;">"$TEST_TEMP"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Initialize your tool's workspace
</span><span> mkdir -p .myapp
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown</span><span>() {
</span><span> </span><span style="color:#3c4e2d;"># Clean up after test
</span><span> </span><span style="color:#95cc5e;">cd</span><span> /
</span><span> rm -rf </span><span style="color:#f8bb39;">"$TEST_TEMP"
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">"creates job file" </span><span>{
</span><span> run mycli jobs create </span><span style="color:#f8bb39;">"Do the thing"
</span><span> assert_success
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify file was created in workspace
</span><span> </span><span style="color:#d65940;">[ -</span><span>f .myapp/queue.jsonl </span><span style="color:#d65940;">]
</span><span>}
</span></code></pre>
<blockquote>
<p><strong>Bash Note:</strong> <code>[ -f .myapp/queue.jsonl ]</code> uses <code>-f</code> to test if a regular file exists (<a href="https://linux.die.net/man/1/test"><code>test</code></a>). In BATS, the test passes if the command returns exit code 0 (true). If the file doesn't exist, the test command returns 1 and BATS marks the test as failed.</p>
</blockquote>
<p>Every test gets a fresh <code>$TEST_TEMP</code>. No pollution between tests. No "but it worked on my machine" because you forgot to clean up.</p>
<h3 id="assertions-that-actually-help">Assertions That Actually Help <a class="anchor" href="#assertions-that-actually-help">🔗</a>
</h3>
<p>The basic assertions you'll use constantly:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># assertion-cheatsheet.bash (not a runnable file, just reference)
</span><span>run some-command
</span><span>
</span><span style="color:#3c4e2d;"># Exit code
</span><span>assert_success </span><span style="color:#3c4e2d;"># Exit 0
</span><span>assert_failure </span><span style="color:#3c4e2d;"># Exit non-zero
</span><span>assert_equal $status 2 </span><span style="color:#3c4e2d;"># Specific exit code
</span><span>
</span><span style="color:#3c4e2d;"># Output
</span><span>assert_output </span><span style="color:#f8bb39;">"exact match"
</span><span>assert_output --partial </span><span style="color:#f8bb39;">"substring"
</span><span>assert_output --regexp </span><span style="color:#f8bb39;">'^[0-9]+$'
</span><span>
</span><span style="color:#3c4e2d;"># Line-specific (0-indexed)
</span><span>assert_line --index 0 </span><span style="color:#f8bb39;">"First line"
</span><span>assert_line --partial </span><span style="color:#f8bb39;">"appears somewhere"
</span><span>
</span><span style="color:#3c4e2d;"># Negation
</span><span>refute_output </span><span style="color:#f8bb39;">"should not appear"
</span><span>refute_line --partial </span><span style="color:#f8bb39;">"nope"
</span></code></pre>
<p>This is already more rigorous than most bash scripts get. You're testing real behavior, not mocking function calls.</p>
<h2 id="level-2-test-helpers-and-mocking">Level 2: Test Helpers and Mocking <a class="anchor" href="#level-2-test-helpers-and-mocking">🔗</a>
</h2>
<p>Real CLI tools interact with other tools. They read files. They parse JSON. They have dependencies.</p>
<p>You need test helpers.</p>
<h3 id="shared-setup-in-a-test-helper">Shared Setup in a Test Helper <a class="anchor" href="#shared-setup-in-a-test-helper">🔗</a>
</h3>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/test_helper.bash
</span><span>
</span><span style="color:#3c4e2d;"># Shared test workspace setup
</span><span style="color:#db784d;">export </span><span>TEST_WORKSPACE</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${BATS_TEST_TMPDIR}/workspace"
</span><span style="color:#db784d;">export </span><span>MOCK_BD</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${BATS_TEST_TMPDIR}/bin/bd"
</span><span>
</span><span style="color:#60a365;">setup_workspace</span><span>() {
</span><span> rm -rf </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}"
</span><span> mkdir -p </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp"
</span><span> mkdir -p </span><span style="color:#f8bb39;">"$(dirname "${MOCK_BD}")"
</span><span> </span><span style="color:#95cc5e;">cd </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}"
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown_workspace</span><span>() {
</span><span> </span><span style="color:#95cc5e;">cd</span><span> /
</span><span> rm -rf </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}"
</span><span>}
</span></code></pre>
<blockquote>
<p><strong>Bash Note:</strong> <code>${VAR}</code> and <code>$(cmd)</code> look similar but do completely different things. <code>${BATS_TEST_TMPDIR}</code> is <a href="https://www.gnu.org/software/bash/manual/html_node/Shell-Parameter-Expansion.html">parameter expansion</a>—it retrieves the value of the variable. <code>$(dirname "${MOCK_BD}")</code> is <a href="https://www.gnu.org/software/bash/manual/html_node/Command-Substitution.html">command substitution</a>—it runs <code>dirname</code> and captures its output. The braces in <code>${VAR}</code> are optional for simple names (<code>$VAR</code> works too) but required when concatenating: <code>${VAR}_suffix</code> vs the broken <code>$VAR_suffix</code>.</p>
</blockquote>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/test_helper.bash (continued)
</span><span>
</span><span style="color:#3c4e2d;"># Mock the 'bd' command that your CLI depends on
</span><span style="color:#3c4e2d;"># Uses heredoc (<<EOF) to write a multi-line script to a file
</span><span style="color:#60a365;">setup_mock_bd</span><span>() {
</span><span> </span><span style="color:#db784d;">local </span><span>issues_json</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$1"
</span><span>
</span><span> cat </span><span style="color:#d65940;">> </span><span style="color:#f8bb39;">"${MOCK_BD}" </span><span style="color:#d65940;"><<EOF
</span><span style="color:#f8bb39;">#!/usr/bin/env bash
</span><span style="color:#f8bb39;">case "</span><span style="color:#db784d;">\$</span><span style="color:#f8bb39;">1" in
</span><span style="color:#f8bb39;"> list)
</span><span style="color:#f8bb39;"> cat <<'ISSUES'
</span><span style="color:#f8bb39;">${issues_json}
</span><span style="color:#f8bb39;">ISSUES
</span><span style="color:#f8bb39;"> ;;
</span><span style="color:#f8bb39;"> show)
</span><span style="color:#f8bb39;"> # Return single issue based on </span><span style="color:#db784d;">\$</span><span style="color:#f8bb39;">2 (issue ID)
</span><span style="color:#f8bb39;"> echo '{"id":"'"$2"'","title":"Mock issue","status":"open"}'
</span><span style="color:#f8bb39;"> ;;
</span><span style="color:#f8bb39;"> *)
</span><span style="color:#f8bb39;"> echo "Mock bd: Unknown command </span><span style="color:#db784d;">\$</span><span style="color:#f8bb39;">1" >&2
</span><span style="color:#f8bb39;"> exit 1
</span><span style="color:#f8bb39;"> ;;
</span><span style="color:#f8bb39;">esac
</span><span style="color:#d65940;">EOF
</span><span>
</span><span> chmod +x </span><span style="color:#f8bb39;">"${MOCK_BD}"
</span><span> </span><span style="color:#db784d;">export </span><span>PATH</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$(dirname "${MOCK_BD}"):${PATH}"
</span><span>}
</span><span>
</span><span style="color:#3c4e2d;"># JSON assertion helper
</span><span style="color:#60a365;">assert_json_field</span><span>() {
</span><span> </span><span style="color:#db784d;">local </span><span>json</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$1"
</span><span> </span><span style="color:#db784d;">local </span><span>field</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$2"
</span><span> </span><span style="color:#db784d;">local </span><span>expected</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$3"
</span><span>
</span><span> </span><span style="color:#db784d;">local </span><span>actual</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(</span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$json" </span><span style="color:#d65940;">| </span><span style="color:#f8bb39;">jq -r "$field")
</span><span> </span><span style="color:#95cc5e;">[[ </span><span style="color:#f8bb39;">"$actual" </span><span style="color:#d65940;">== </span><span style="color:#f8bb39;">"$expected" </span><span style="color:#95cc5e;">]] </span><span style="color:#d65940;">|| </span><span>{
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"Expected ${field}='${expected}', got '${actual}'"
</span><span> </span><span style="color:#d65940;">return</span><span> 1
</span><span> }
</span><span>}
</span><span>
</span><span style="color:#3c4e2d;"># File content helpers
</span><span style="color:#60a365;">assert_file_contains</span><span>() {
</span><span> </span><span style="color:#db784d;">local </span><span>file</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$1"
</span><span> </span><span style="color:#db784d;">local </span><span>expected</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"$2"
</span><span>
</span><span> grep -q </span><span style="color:#f8bb39;">"$expected" "$file" </span><span style="color:#d65940;">|| </span><span>{
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"File $file does not contain '$expected'"
</span><span> </span><span style="color:#d65940;">return</span><span> 1
</span><span> }
</span><span>}
</span></code></pre>
<blockquote>
<p><strong>Bash Note:</strong> <code><<EOF ... EOF</code> is a heredoc (<a href="https://www.gnu.org/software/bash/manual/html_node/Redirections.html#Here-Documents">Here Documents</a>)—a way to embed multi-line strings. Variables like <code>${issues_json}</code> are expanded inside. Use <code><<'EOF'</code> (quoted delimiter) to prevent expansion when you want literal <code>$</code> characters in the output. The <code>cat > file <<EOF</code> pattern writes the heredoc content to a file.</p>
</blockquote>
<blockquote>
<p><strong>Bash Note:</strong> <code>local</code> declares a variable scoped to the current function (<a href="https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html#index-local"><code>local</code></a>). Without <code>local</code>, variables are global and leak into other functions—a common source of test pollution. Always use <code>local</code> for function parameters and temporary values.</p>
</blockquote>
<blockquote>
<p><strong>Bash Note:</strong> <code>[[ "$actual" == "$expected" ]]</code> uses double brackets, a bash-specific conditional (<a href="https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html#index-_005b_005b"><code>[[</code></a>). Unlike single brackets, double brackets don't require quoting variables to prevent word splitting, support pattern matching with <code>==</code>, and allow <code>&&</code>/<code>||</code> inside the expression. The <code>|| { ... }</code> pattern runs the block only if the test fails—a compact way to handle errors without <code>if/else</code>.</p>
</blockquote>
<p>Now your tests can load this:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/sync.bats
</span><span style="color:#3c4e2d;">#!/usr/bin/env bats
</span><span>
</span><span>load bats/bats-support/load
</span><span>load bats/bats-assert/load
</span><span>load bats/bats-file/load
</span><span>load test_helper
</span><span>
</span><span style="color:#60a365;">setup</span><span>() {
</span><span> setup_workspace
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown</span><span>() {
</span><span> teardown_workspace
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">"syncs with bd issues" </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># Setup mock bd command to return fake issues
</span><span> local mock_issues=</span><span style="color:#f8bb39;">'[
</span><span style="color:#f8bb39;"> {"id":"abc-123","title":"Fix the widget","status":"open"},
</span><span style="color:#f8bb39;"> {"id":"def-456","title":"Refactor gizmo","status":"done"}
</span><span style="color:#f8bb39;"> ]'
</span><span>
</span><span> setup_mock_bd </span><span style="color:#f8bb39;">"$mock_issues"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Run your CLI that calls 'bd list' internally
</span><span> run mycli sync
</span><span>
</span><span> assert_success
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"Synced 2 issues"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify the sync created local files
</span><span> assert_file_exist </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/issues.json"
</span><span>}
</span></code></pre>
<h3 id="mocking-external-commands">Mocking External Commands <a class="anchor" href="#mocking-external-commands">🔗</a>
</h3>
<p>Your CLI probably calls external tools. Git. curl. jq. Whatever.</p>
<p>Mock them:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/test_helper.bash (add to existing file)
</span><span style="color:#60a365;">setup_mock_git</span><span>() {
</span><span> cat </span><span style="color:#d65940;">> </span><span style="color:#f8bb39;">"${BATS_TEST_TMPDIR}/bin/git" </span><span style="color:#d65940;"><<</span><span style="color:#f8bb39;">'</span><span style="color:#d65940;">EOF</span><span style="color:#f8bb39;">'
</span><span style="color:#f8bb39;">#!/usr/bin/env bash
</span><span style="color:#f8bb39;">case "$1" in
</span><span style="color:#f8bb39;"> rev-parse)
</span><span style="color:#f8bb39;"> echo "abc123def456" # Fake commit hash
</span><span style="color:#f8bb39;"> ;;
</span><span style="color:#f8bb39;"> status)
</span><span style="color:#f8bb39;"> echo "On branch main"
</span><span style="color:#f8bb39;"> echo "nothing to commit, working tree clean"
</span><span style="color:#f8bb39;"> ;;
</span><span style="color:#f8bb39;"> *)
</span><span style="color:#f8bb39;"> exit 1
</span><span style="color:#f8bb39;"> ;;
</span><span style="color:#f8bb39;">esac
</span><span style="color:#d65940;">EOF
</span><span> chmod +x </span><span style="color:#f8bb39;">"${BATS_TEST_TMPDIR}/bin/git"
</span><span> </span><span style="color:#db784d;">export </span><span>PATH</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${BATS_TEST_TMPDIR}/bin:${PATH}"
</span><span>}
</span></code></pre>
<p>Then in your test:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/deploy.bats (excerpt)
</span><span>@test </span><span style="color:#f8bb39;">"records git commit in metadata" </span><span>{
</span><span> setup_mock_git
</span><span>
</span><span> run mycli deploy
</span><span>
</span><span> assert_success
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify it captured the fake commit hash
</span><span> local metadata=$(cat .myapp/last-deploy.json)
</span><span> assert_json_field </span><span style="color:#f8bb39;">"$metadata" ".commit" "abc123def456"
</span><span>}
</span></code></pre>
<h3 id="testing-json-output">Testing JSON Output <a class="anchor" href="#testing-json-output">🔗</a>
</h3>
<p>CLI tools love JSON. Test it properly:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/json-output.bats (excerpt)
</span><span>@test </span><span style="color:#f8bb39;">"job status returns valid JSON" </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># Create a job first
</span><span> run mycli jobs create </span><span style="color:#f8bb39;">"Test job"
</span><span> assert_success
</span><span>
</span><span> local job_id=$(</span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$output" </span><span style="color:#d65940;">| </span><span>jq -r </span><span style="color:#f8bb39;">'.job_id'</span><span>)
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Query job status
</span><span> run mycli jobs show </span><span style="color:#f8bb39;">"$job_id"
</span><span> assert_success
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Validate JSON structure
</span><span> echo </span><span style="color:#f8bb39;">"$output"</span><span> | jq . > /dev/null || fail </span><span style="color:#f8bb39;">"Invalid JSON"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Check specific fields
</span><span> assert_json_field </span><span style="color:#f8bb39;">"$output" ".job_id" "$job_id"
</span><span> assert_json_field </span><span style="color:#f8bb39;">"$output" ".state" "pending"
</span><span> assert_json_field </span><span style="color:#f8bb39;">"$output" ".title" "Test job"
</span><span>}
</span></code></pre>
<p>This is real integration testing. You're not stubbing out JSON parsing—you're testing the actual output your users will see.</p>
<h2 id="level-3-background-processes-and-state-machines">Level 3: Background Processes and State Machines <a class="anchor" href="#level-3-background-processes-and-state-machines">🔗</a>
</h2>
<p>Here's where BATS gets interesting.</p>
<p>Real CLI tools do async things. They wait for conditions. They poll. They recover from failures. They manage state transitions.</p>
<h3 id="testing-background-processes">Testing Background Processes <a class="anchor" href="#testing-background-processes">🔗</a>
</h3>
<p>Say your CLI has a <code>--wait</code> flag that blocks until a job completes. How do you test that?</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/async.bats
</span><span style="color:#3c4e2d;">#!/usr/bin/env bats
</span><span>
</span><span>load bats/bats-support/load
</span><span>load bats/bats-assert/load
</span><span>load bats/bats-file/load
</span><span>load test_helper
</span><span>
</span><span style="color:#60a365;">setup</span><span>() {
</span><span> setup_workspace
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown</span><span>() {
</span><span> teardown_workspace
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">"waits for job completion" </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># Create a pending job directly in the file system
</span><span> local job_id=</span><span style="color:#f8bb39;">"job-$(date +%s)"
</span><span> local pending_job=</span><span style="color:#f8bb39;">'{
</span><span style="color:#f8bb39;"> "job_id":"'</span><span>${job_id}</span><span style="color:#f8bb39;">'",
</span><span style="color:#f8bb39;"> "title":"Background test job",
</span><span style="color:#f8bb39;"> "state":"pending",
</span><span style="color:#f8bb39;"> "created_at":"'</span><span>$(date -Iseconds)</span><span style="color:#f8bb39;">'"
</span><span style="color:#f8bb39;"> }'
</span><span>
</span><span> echo </span><span style="color:#f8bb39;">"$pending_job"</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/queue.jsonl"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Start the wait command in the background
</span><span> mycli jobs show </span><span style="color:#f8bb39;">"$job_id"</span><span> --wait --timeout=10s \
</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/output.txt"</span><span> 2>&1 &
</span><span> local wait_pid=$!
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Give it a moment to start
</span><span> sleep 1
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Simulate job completion by moving it to done state
</span><span> local completed_job=</span><span style="color:#f8bb39;">'{
</span><span style="color:#f8bb39;"> "job_id":"'</span><span>${job_id}</span><span style="color:#f8bb39;">'",
</span><span style="color:#f8bb39;"> "title":"Background test job",
</span><span style="color:#f8bb39;"> "state":"completed",
</span><span style="color:#f8bb39;"> "created_at":"'</span><span>$(date -Iseconds)</span><span style="color:#f8bb39;">'",
</span><span style="color:#f8bb39;"> "completed_at":"'</span><span>$(date -Iseconds)</span><span style="color:#f8bb39;">'"
</span><span style="color:#f8bb39;"> }'
</span><span>
</span><span> rm -f </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/queue.jsonl"
</span><span> echo </span><span style="color:#f8bb39;">"$completed_job"</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/done.jsonl"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Wait for the background process to finish
</span><span> wait $wait_pid
</span><span> local exit_code=$?
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify it exited successfully
</span><span> assert_equal $exit_code 0
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Check the output
</span><span> run cat </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/output.txt"
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"completed successfully"
</span><span>}
</span></code></pre>
<blockquote>
<p><strong>Bash Note:</strong> The <code>&</code> at the end of a command runs it in the background (<a href="https://www.gnu.org/software/bash/manual/html_node/Job-Control-Basics.html">Job Control</a>). <code>$!</code> is a special variable containing the PID of the last background process (<a href="https://www.gnu.org/software/bash/manual/html_node/Special-Parameters.html">Special Parameters</a>). The <a href="https://linux.die.net/man/1/bash"><code>wait</code></a> builtin blocks until the specified PID exits and sets <code>$?</code> to its exit code. This pattern—background a process, do something, then wait for it—is essential for testing async CLI behavior.</p>
</blockquote>
<blockquote>
<p><strong>Bash Note:</strong> <code>$(date +%s)</code> uses command substitution (<a href="https://www.gnu.org/software/bash/manual/html_node/Command-Substitution.html">Command Substitution</a>) to capture a command's stdout as a string. The <code>$()</code> syntax is preferred over backticks because it nests cleanly and is easier to read.</p>
</blockquote>
<p>You're testing the actual polling logic, the actual file watching, the actual timeout behavior. Not a mock. Not a stub. The real thing.</p>
<h3 id="testing-timeout-behavior">Testing Timeout Behavior <a class="anchor" href="#testing-timeout-behavior">🔗</a>
</h3>
<p>What happens when things don't complete?</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/async.bats (continued)
</span><span>@test </span><span style="color:#f8bb39;">"wait times out if job never completes" </span><span>{
</span><span> local job_id=</span><span style="color:#f8bb39;">"job-timeout-test"
</span><span> local pending_job=</span><span style="color:#f8bb39;">'{"job_id":"'</span><span>${job_id}</span><span style="color:#f8bb39;">'","state":"pending"}'
</span><span>
</span><span> echo </span><span style="color:#f8bb39;">"$pending_job"</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/queue.jsonl"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Start wait with short timeout
</span><span> mycli jobs show </span><span style="color:#f8bb39;">"$job_id"</span><span> --wait --timeout=2s \
</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/output.txt"</span><span> 2>&1 &
</span><span> local wait_pid=$!
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Don't complete the job - let it timeout
</span><span>
</span><span> wait $wait_pid
</span><span> local exit_code=$?
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Should exit with error
</span><span> assert_equal $exit_code 1
</span><span>
</span><span> run cat </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/output.txt"
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"timeout"
</span><span>}
</span></code></pre>
<h3 id="testing-state-machine-transitions">Testing State Machine Transitions <a class="anchor" href="#testing-state-machine-transitions">🔗</a>
</h3>
<p>Job queues are state machines. Jobs move between states. Some transitions are valid. Some aren't.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/state-machine.bats (excerpt)
</span><span>@test </span><span style="color:#f8bb39;">"prevents invalid state transitions" </span><span>{
</span><span> local job_id=</span><span style="color:#f8bb39;">"state-test-job"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Create completed job
</span><span> local completed_job=</span><span style="color:#f8bb39;">'{"job_id":"'</span><span>${job_id}</span><span style="color:#f8bb39;">'","state":"completed"}'
</span><span> echo </span><span style="color:#f8bb39;">"$completed_job"</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/done.jsonl"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Try to start a completed job (invalid transition)
</span><span> run mycli jobs start </span><span style="color:#f8bb39;">"$job_id"
</span><span>
</span><span> assert_failure
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"Cannot start job in completed state"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify job state didn't change
</span><span> run mycli jobs show </span><span style="color:#f8bb39;">"$job_id"
</span><span> assert_json_field </span><span style="color:#f8bb39;">"$output" ".state" "completed"
</span><span>}
</span></code></pre>
<h3 id="testing-time-dependent-behavior">Testing Time-Dependent Behavior <a class="anchor" href="#testing-time-dependent-behavior">🔗</a>
</h3>
<p>The hard part. Jobs with heartbeats. Stale locks. Orphan recovery.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/recovery.bats (excerpt)
</span><span>@test </span><span style="color:#f8bb39;">"recovers orphaned jobs with stale heartbeats" </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># Create job with old heartbeat (2 minutes ago)
</span><span> local stale_time=$(date -Iseconds -d </span><span style="color:#f8bb39;">'2 minutes ago'</span><span>)
</span><span> local job_id=</span><span style="color:#f8bb39;">"orphan-job"
</span><span>
</span><span> local orphan_job=</span><span style="color:#f8bb39;">'{
</span><span style="color:#f8bb39;"> "job_id":"'</span><span>${job_id}</span><span style="color:#f8bb39;">'",
</span><span style="color:#f8bb39;"> "state":"running",
</span><span style="color:#f8bb39;"> "started_at":"'</span><span>${stale_time}</span><span style="color:#f8bb39;">'",
</span><span style="color:#f8bb39;"> "heartbeat_at":"'</span><span>${stale_time}</span><span style="color:#f8bb39;">'"
</span><span style="color:#f8bb39;"> }'
</span><span>
</span><span> echo </span><span style="color:#f8bb39;">"$orphan_job"</span><span> > </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/active.jsonl"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Run recovery command
</span><span> run mycli jobs recover
</span><span>
</span><span> assert_success
</span><span> assert_output --partial </span><span style="color:#f8bb39;">"Recovered 1 orphaned job"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify job moved back to queue
</span><span> assert_file_exist </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/queue.jsonl"
</span><span> refute_file_exist </span><span style="color:#f8bb39;">"${TEST_WORKSPACE}/.myapp/active.jsonl"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify job state reset
</span><span> run mycli jobs show </span><span style="color:#f8bb39;">"$job_id"
</span><span> assert_json_field </span><span style="color:#f8bb39;">"$output" ".state" "pending"
</span><span>}
</span></code></pre>
<p>This test manipulates time by creating timestamps in the past. It then verifies that your recovery logic correctly identifies stale jobs and transitions them.</p>
<h3 id="testing-concurrent-operations">Testing Concurrent Operations <a class="anchor" href="#testing-concurrent-operations">🔗</a>
</h3>
<p>Multiple processes writing to the same files. The nightmare scenario.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/concurrency.bats (excerpt)
</span><span>@test </span><span style="color:#f8bb39;">"handles concurrent job creation" </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># Start 5 job creations in parallel
</span><span> for i in {1..5}; do
</span><span> mycli jobs create </span><span style="color:#f8bb39;">"Concurrent job $i"</span><span> &
</span><span> done
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Wait for all background processes
</span><span> wait
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify all 5 jobs were created
</span><span> run mycli jobs list
</span><span> assert_success
</span><span>
</span><span> local job_count=$(</span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$output" </span><span style="color:#d65940;">| </span><span>jq </span><span style="color:#f8bb39;">'. | length'</span><span>)
</span><span> assert_equal </span><span style="color:#f8bb39;">"$job_count" "5"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Verify no duplicate job IDs
</span><span> local unique_ids=$(</span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$output" </span><span style="color:#d65940;">| </span><span>jq -r </span><span style="color:#f8bb39;">'.[].job_id' </span><span style="color:#d65940;">| </span><span>sort </span><span style="color:#d65940;">| </span><span>uniq </span><span style="color:#d65940;">| </span><span>wc -l)
</span><span> assert_equal </span><span style="color:#f8bb39;">"$unique_ids" "5"
</span><span>}
</span></code></pre>
<p>If your CLI uses file locking or atomic writes, this test will catch races.</p>
<h2 id="running-your-test-suite">Running Your Test Suite <a class="anchor" href="#running-your-test-suite">🔗</a>
</h2>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># run-tests.sh
</span><span style="color:#3c4e2d;">#!/usr/bin/env bash
</span><span style="color:#95cc5e;">set</span><span> -euo pipefail
</span><span>
</span><span style="color:#3c4e2d;"># Run all BATS tests using the local install
</span><span>./.test/bats/bats/bin/bats .test/</span><span style="color:#d65940;">*</span><span>.bats
</span><span>
</span><span style="color:#3c4e2d;"># Or for more verbose output
</span><span style="color:#3c4e2d;"># ./.test/bats/bats/bin/bats --tap .test/*.bats
</span><span>
</span><span style="color:#3c4e2d;"># Or with timing
</span><span style="color:#3c4e2d;"># ./.test/bats/bats/bin/bats --formatter tap --timing .test/*.bats
</span></code></pre>
<p>Make it executable:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>chmod +x run-tests.sh
</span></code></pre>
<h3 id="ci-integration">CI Integration <a class="anchor" href="#ci-integration">🔗</a>
</h3>
<p>If you committed the <code>.test/bats/</code> directory (recommended), CI is trivial:</p>
<pre data-lang="yaml" style="background-color:#12160d;color:#6ea240;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#3c4e2d;"># .github/workflows/test.yml
</span><span style="color:#95cc5e;">name</span><span>: </span><span style="color:#f8bb39;">Tests
</span><span>
</span><span style="color:#db784d;">on</span><span>: [</span><span style="color:#f8bb39;">push</span><span>, </span><span style="color:#f8bb39;">pull_request</span><span>]
</span><span>
</span><span style="color:#95cc5e;">jobs</span><span>:
</span><span> </span><span style="color:#95cc5e;">bats</span><span>:
</span><span> </span><span style="color:#95cc5e;">runs-on</span><span>: </span><span style="color:#f8bb39;">ubuntu-latest
</span><span> </span><span style="color:#95cc5e;">steps</span><span>:
</span><span> - </span><span style="color:#95cc5e;">uses</span><span>: </span><span style="color:#f8bb39;">actions/checkout@v3
</span><span>
</span><span> - </span><span style="color:#95cc5e;">name</span><span>: </span><span style="color:#f8bb39;">Run BATS tests
</span><span> </span><span style="color:#95cc5e;">run</span><span>: </span><span style="color:#f8bb39;">./.test/bats/bats/bin/bats .test/*.bats
</span></code></pre>
<p>No installation step needed. The test framework is already in your repo.</p>
<p>If you prefer not to commit the bats libraries, run the install script first:</p>
<pre data-lang="yaml" style="background-color:#12160d;color:#6ea240;" class="language-yaml "><code class="language-yaml" data-lang="yaml"><span style="color:#3c4e2d;"># .github/workflows/test.yml (alternative)
</span><span style="color:#95cc5e;">name</span><span>: </span><span style="color:#f8bb39;">Tests
</span><span>
</span><span style="color:#db784d;">on</span><span>: [</span><span style="color:#f8bb39;">push</span><span>, </span><span style="color:#f8bb39;">pull_request</span><span>]
</span><span>
</span><span style="color:#95cc5e;">jobs</span><span>:
</span><span> </span><span style="color:#95cc5e;">bats</span><span>:
</span><span> </span><span style="color:#95cc5e;">runs-on</span><span>: </span><span style="color:#f8bb39;">ubuntu-latest
</span><span> </span><span style="color:#95cc5e;">steps</span><span>:
</span><span> - </span><span style="color:#95cc5e;">uses</span><span>: </span><span style="color:#f8bb39;">actions/checkout@v3
</span><span>
</span><span> - </span><span style="color:#95cc5e;">name</span><span>: </span><span style="color:#f8bb39;">Install BATS
</span><span> </span><span style="color:#95cc5e;">run</span><span>: </span><span style="color:#f8bb39;">./install-bats-libs.sh
</span><span>
</span><span> - </span><span style="color:#95cc5e;">name</span><span>: </span><span style="color:#f8bb39;">Run BATS tests
</span><span> </span><span style="color:#95cc5e;">run</span><span>: </span><span style="color:#f8bb39;">./.test/bats/bats/bin/bats .test/*.bats
</span></code></pre>
<p>Now every push runs your full integration test suite.</p>
<h2 id="tips-for-keeping-tests-fast">Tips for Keeping Tests Fast <a class="anchor" href="#tips-for-keeping-tests-fast">🔗</a>
</h2>
<p>BATS tests are real integration tests. They're slower than unit tests. That's fine. But you don't want them to be slow.</p>
<p><strong>Don't repeat expensive setup:</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># .test/expensive-setup.bats (example pattern)
</span><span>
</span><span style="color:#3c4e2d;"># SLOW - creates workspace every test
</span><span style="color:#60a365;">setup</span><span>() {
</span><span> setup_workspace
</span><span> mycli init </span><span style="color:#3c4e2d;"># Expensive operation
</span><span>}
</span><span>
</span><span style="color:#3c4e2d;"># FAST - use setup_file for one-time setup
</span><span style="color:#60a365;">setup_file</span><span>() {
</span><span> </span><span style="color:#db784d;">export </span><span>SHARED_WORKSPACE</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(mktemp -d)
</span><span> </span><span style="color:#95cc5e;">cd </span><span style="color:#f8bb39;">"$SHARED_WORKSPACE"
</span><span> mycli init
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown_file</span><span>() {
</span><span> rm -rf </span><span style="color:#f8bb39;">"$SHARED_WORKSPACE"
</span><span>}
</span></code></pre>
<p><strong>Use <code>--filter</code> during development:</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># Only run tests matching pattern
</span><span>bats --filter </span><span style="color:#f8bb39;">"concurrent"</span><span> test/</span><span style="color:#d65940;">*</span><span>.bats
</span></code></pre>
<p><strong>Parallelize with <code>--jobs</code>:</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># Run tests in parallel (requires bats-core >= 1.5.0)
</span><span>bats --jobs 4 test/</span><span style="color:#d65940;">*</span><span>.bats
</span></code></pre>
<h2 id="when-not-to-use-bats">When Not to Use BATS <a class="anchor" href="#when-not-to-use-bats">🔗</a>
</h2>
<p>BATS is for testing bash scripts and CLI tools. It's not for:</p>
<ul>
<li>Testing web UIs (use Playwright, Cypress, etc.)</li>
<li>Unit testing Go/Rust/Python code (use your language's test framework)</li>
<li>Load testing (use k6, Locust, etc.)</li>
</ul>
<p>But if you're testing the actual user experience of a CLI tool—the thing someone runs from their terminal—BATS is perfect.</p>
<h2 id="the-point">The Point <a class="anchor" href="#the-point">🔗</a>
</h2>
<p>Bash isn't a toy language. It's not "just scripts." It's the orchestration layer for most of the software infrastructure on the planet.</p>
<p>If you're building CLI tools meant to be composed and chained together, your integration tests should reflect that reality. Test them in the environment they'll actually run: bash, with real files, real processes, real timing.</p>
<p>BATS gives you the structure to do that without losing your mind. Setup and teardown that works. Assertions that read like English. Helpers that let you mock external dependencies without rewriting your entire tool.</p>
<p>Your bash scripts deserve tests. BATS makes it possible.</p>
<hr />
<p><strong>Further Reading:</strong></p>
<ul>
<li><a href="https://bats-core.readthedocs.io/">BATS Core Documentation</a></li>
<li><a href="https://github.com/bats-core/bats-assert">bats-assert helpers</a></li>
<li><a href="https://github.com/bats-core/bats-file">bats-file helpers</a></li>
<li><a href="https://testanything.org/">Test Anything Protocol (TAP)</a></li>
</ul>
Agentic Patterns: Elements of Reusable Context-Oriented DeterminismFri, 06 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/agentic-patterns-elements-of-reusable-context-oriented-determinism/
https://developmeh.com/soft-wares/agentic-patterns-elements-of-reusable-context-oriented-determinism/<h2 id="agentic-patterns-elements-of-reusable-context-oriented-determinism">Agentic Patterns: Elements of Reusable Context-Oriented Determinism <a class="anchor" href="#agentic-patterns-elements-of-reusable-context-oriented-determinism">🔗</a>
</h2>
<p>While not as exhaustive as the title might indicate but aligned with my focus on enforcing as much determinism as possible from any given LLM ala Article let's take a look at exploiting tool using LLMs as a process instead of as a conversation. As I posed in the linked article much of the failures we experience are related to attention and confusion which is the progressive noise we introduce as we try to convince the model to perform an action.</p>
<p>What I describe below are patterns for building <a href="/tech-dives/a-deterministic-box-for-non-deterministic-engines/">A Deterministic Box for Non-Deterministic Engines</a></p>
<h3 id="chats-are-an-artifact">Chats are an artifact <a class="anchor" href="#chats-are-an-artifact">🔗</a>
</h3>
<p>This behavior of progressing the chat with multiple statements to a solution is merely an artifact of pre-tooluse models. So we the humans needed to interact with moving files and integrating code at each step while testing it became natural to turn interactions into long conversations. Ones that eventually degrade into failure loops, while surely someone has told you to just keep clearing your context and start over.</p>
<p>Since the evolution of tools like functiongemma which provides trainable, simple function calling on commodity hardware we are on the edge of building decision trees for tool oriented expert systems, but that's a topic for a different day. For now the models we have that are effective tool users are too large to be portable and our contract is still text.</p>
<h3 id="reduction-in-variability">Reduction in variability <a class="anchor" href="#reduction-in-variability">🔗</a>
</h3>
<p>You may recall from math class that you should avoid deriving new values from derived values until you can prove the quality of the procedure. As any instability in accuracy will grow the inaccuracy of outputs. The same is essentially the normal behavior of long running chats. Since model responses can essentially steer (influence) the decisions of the model in the future of the same context window we can fall into a quality trap.</p>
<pre data-lang="python" style="background-color:#12160d;color:#6ea240;" class="language-python "><code class="language-python" data-lang="python"><span style="color:#d65940;">import </span><span>anthropic
</span><span>
</span><span>client </span><span style="color:#d65940;">= </span><span>anthropic.Anthropic()
</span><span>messages </span><span style="color:#d65940;">= </span><span>[]
</span><span>
</span><span style="color:#d65940;">while </span><span style="color:#db784d;">True</span><span>:
</span><span> </span><span style="color:#3c4e2d;"># Get input from you
</span><span> user_input </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">input</span><span>(</span><span style="color:#f8bb39;">"You: "</span><span>)
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Put your message in the context
</span><span> messages.append({</span><span style="color:#f8bb39;">"role"</span><span>: </span><span style="color:#f8bb39;">"user"</span><span>, </span><span style="color:#f8bb39;">"content"</span><span>: user_input})
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Send the context to the LLM
</span><span> response </span><span style="color:#d65940;">= </span><span>client.messages.create(
</span><span> model</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"claude-sonnet-4-20250514"</span><span>,
</span><span> max_tokens</span><span style="color:#d65940;">=</span><span style="color:#95cc5e;">1024</span><span>,
</span><span> messages</span><span style="color:#d65940;">=</span><span>messages, </span><span style="color:#3c4e2d;"># Full history sent each time
</span><span> )
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># LLM response
</span><span> assistant_message </span><span style="color:#d65940;">= </span><span>response.content[</span><span style="color:#95cc5e;">0</span><span>].text
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># LLM response added to context
</span><span> messages.append({</span><span style="color:#f8bb39;">"role"</span><span>: </span><span style="color:#f8bb39;">"assistant"</span><span>, </span><span style="color:#f8bb39;">"content"</span><span>: assistant_message})
</span><span>
</span><span> </span><span style="color:#95cc5e;">print</span><span>(</span><span style="color:#95cc5e;">f</span><span style="color:#f8bb39;">"Claude: </span><span>{assistant_message}</span><span style="color:#db784d;">\n</span><span style="color:#f8bb39;">"</span><span>)
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Loop
</span></code></pre>
<p>As this partial example shows if we send 3 messages there will be 3 responses and our context is 6 messages. Each time we send something new the LLM rereads the entire context not just the last message meaning any derived issues that we or the LLMs predictive invariability adds can pollute the quality of the overall decisions made. There are also unseen patterns due to how training is compressed that leads to non-intuitive work and concept replacement for convoluted examples. For example the LLM will be more accurate if there was more reinforcement during deep learning of that topic and it fails faster when exercising in novel space. If you wanna understand this better go read this book: https://sebastianraschka.com/llms-from-scratch/</p>
<h2 id="kiss-the-llm">KISS the LLM <a class="anchor" href="#kiss-the-llm">🔗</a>
</h2>
<p>So the solution is the same as it ever was. Keep things small and focused on a single task. The LLM isn't a person and doesn't think, we are using human language to steer outputs the same way we write function signatures to have enough context to supply information to downstream operations. That doesn't mean we never chat with the bot. It does have a big context window and we can take advantage of that for specific patterns.
Plan-Then-Execute Pattern</p>
<p>As discussed we want to keep context focused when we need a long running session. This is the key to plan-then-execute. Coding agents' system prompts have been biased towards creating implementations which like an eager intern jumps the gun and starts building before understanding. When this happens we find ourselves immediately refactoring the wrong idea. The context becomes polluted with examples of the wrong solution and leads to lower quality outputs.</p>
<p>While some coding agents have a "planning" mode, this is a system prompt hack to try and keep it from producing but I'll admit I have had lower luck because this funnels you towards implementation faster. The solution here is to work with the agent's bias to produce and have it produce research artifacts. It will gladly deep dive into a code-base and provide elegant descriptions of architecture and sequence. This is best performed with a reasoning model.</p>
<h2 id="kill-then-breakdown-pattern">Kill-Then-Breakdown Pattern <a class="anchor" href="#kill-then-breakdown-pattern">🔗</a>
</h2>
<p>A sub-step of plan then execute requires context flushing. After we have verified the quality of the research we start fresh and have the next agent, preferably a reasoning model, read that research document and we instruct it to break down the work into tasks and provide a planned implementation to each task. Once again we are working towards the models goals of writing code or producing files and we get small snippets of code associated with each task. The plan and breakdown is a token heavy portion of the work stream but since we keep check-pointing with artifacts written in markdown there is a repeatable retention in value. That said context size does play a part in cost so flushing the context and loading a compacted version of the topic does end up saving some cost over the context exploding and the risk of loss during compaction.</p>
<h2 id="now-execute-mr-meeseeks-pattern">Now Execute (Mr. Meeseeks Pattern) <a class="anchor" href="#now-execute-mr-meeseeks-pattern">🔗</a>
</h2>
<p>While you can use a single reasoning model to go through each task it has broken down and implement it there is a better way. We can ask the reasoning model to act as an orchestrator that spawns sub-agents of cheaper models for each task in the Mr. Meeseeks pattern. The reasoning model starts a simpler model and passes it the task and expected implementation we just broke down and goes to work on it. For simplicity's sake don't run these operations in parallel yet without some considerations to keep multiple agents from overwriting themselves. As each task is marked completed the sub-agent will be killed and a new one with a fresh context is started.</p>
<p>It's important to remember that the orchestrator gets the output from the subagents so if your development environment produces a lot of noise or if your agents aren't clamping their read size you may run into interesting scenarios where you overflow the coding agents memory. The solution here is to run each sub-agent in a new process instead as a task run by the first coding agent. I am sure you can see how this can expand.</p>
<h2 id="specification-driven-agent-development-pattern">Specification-Driven Agent Development Pattern <a class="anchor" href="#specification-driven-agent-development-pattern">🔗</a>
</h2>
<p>Is what we just accomplished. During planning we created a specification from an existing code-base or a set of discussions. Then we captured focused implementation details. Then we did a bunch of tiny implementations. While the more formal nature of spec driven usually stops at the original manifest of "what is this feature going to be" we should take it one step further to actually storing partial facts about implementation. On the consumption side of the coder agent it will be somewhat literal with what it was given but it has to perform integration and resolve writing tests as well as ensure the work fits into existing tests and functionality.</p>
<p>Also given we have this spec we can add extra steps to our workflow. The orchestrator ends up following a very simple workflow and its focus is retained around the same document of compressed knowledge it wrote. This is an important fact because models are specifically expressive, they talk a certain way, which means a model reading what it wrote is less ambiguous than it reading what you wrote. It has enforced patterns from training we can reactivate.</p>
<h2 id="agent-verifier-pattern-code-review">Agent Verifier Pattern (Code Review) <a class="anchor" href="#agent-verifier-pattern-code-review">🔗</a>
</h2>
<p>Since we have all these concrete artifacts regarding code and spec and final implementation we can then as our last step ask a small simple agent to just give us a thumbs up or down on it, essentially a code reviewer. Before we determine something is done we allow a new context to observe just the changes and the spec, if it rejects we spawn a new implementer to try again. Then we spawn a new reviewer and review again until it works.</p>
<p>In practice this interaction looks a little something like this:</p>
<p><img src="/soft-wares/1770381709160.png" alt="Agent verifier pattern in practice" /></p>
<p>During the end of this task the image to be added wasn't correct and the reviewer failed it causing it to loop.</p>
<h2 id="prompt-writer-agent-pattern">Prompt Writer Agent Pattern <a class="anchor" href="#prompt-writer-agent-pattern">🔗</a>
</h2>
<p>So this doesn't work out of the box but it's pretty easy mock up. The next step is to codify what we send during each phase of execution. For this to work we need to be very explicit. Even though the orchestrator knows the workflow it may forget as it handles agent spawning which leads to the workflow rules not being transferred to the sub agents. We are in a derived value degradation problem again.</p>
<p>We have to help the orchestrator by providing it a template of the actions we want each sub-agent to take. It can fill in the gaps with the task. So before it spawns an agent it reviews the workflow and writes the sub-agent prompt to a file. It then tells the sub-agent to read the file and implement. This provides two benefits to accuracy. Since the orchestrating agent has to keep re-reading the template ala RE2 (Read and Re-read prompting) it retains more attention because it keeps getting repeated in the context. Since it then writes the refined prompt for the agent if we crash or context collapses we can immediately recover by reviewing overall task process and the presence of the prompt files. It is in fact highly durable allowing multiple orchestration concurrently if you have the money.</p>
<p>Additionally, the reviewer will get its own prompt written but it can review the coders prompt when checking for spec compliance.</p>
<h3 id="in-practice">In Practice <a class="anchor" href="#in-practice">🔗</a>
</h3>
<p>If I align this to Anthropic models:</p>
<ul>
<li>Orchestrator -> Opus (Reasoning)</li>
<li>Implementer -> Sonnet (Competent)</li>
<li>Reviewer -> Haiku (Simple)</li>
</ul>
<p>I also don't rely as much on markdown files past the very first phase of planning. I move all context with the exception of sub agent prompts into a graph. For that graph I use beads which while it has its flaws enables an approach I call the "Context Graph Pattern" which I will go into in a bit.</p>
<p>What beads essentially is is Jira or Linear but with a outputs that work better for LLMs. Essentially a command line tool that has a help dialog that outputs markdown instructions, which improves comprehension by the LLM. It's a graph because like any issue tracker issues can form chains and comments.</p>
<p>In the previous picture above this comment stream is from a plugin to interact with my graph visually. It permits me to leave comments for the agents or even rewrite a spec on the fly.</p>
<h2 id="context-graph-pattern">Context Graph Pattern <a class="anchor" href="#context-graph-pattern">🔗</a>
</h2>
<p>Using a tool that allows me to commit context as a focused structure means I get reproducibility and an audit log. Since beads uses issue IDs as commit names the graph extends into the git history. Code and spec and decision tree can all be one artifact without reading all the files. This keeps our context as tight as possible.</p>
<p>Because the graph is mutable if the first attempt was a complete failure I have two choices:</p>
<ul>
<li>Provide feedback as a refinement and retry -> Refactor</li>
<li>Rewrite the spec and have the agent pull the previous changes and start over -> Rewrite</li>
</ul>
<p>I can continue to iterate this way at a much lower time cost to me as a developer and since the graph is also able to be committed to a repo and shared with other developers they can do the same.</p>
<p>When we enhance a feature we can include the previous changes either by diff review and spec retrieval from the graph or by explicit linking within the graph itself. There is a portion of this structure that lets you act as Product, Project, and Tech Lead for the given outcomes.</p>
<p>Of course no silver bullet, you will end up a developer for some things in the end don't worry. But this can be guided by a concrete context for yourself when you ask the LLM what it thinks went wrong and you get your hands dirty.</p>
<h2 id="example">Example <a class="anchor" href="#example">🔗</a>
</h2>
<p>If you wanna see a functional example of this process I have been dog-fooding it for a bit and all the artifacts from the plugin I posted a picture of are over here.</p>
<p>https://git.sr.ht/~ninjapanzer/jetbrains-beads-manager</p>
<p>A majority of this code was written in my absence in a execution loop. This usually gets you about 80% there. I then spend some time filing bug tickets adding clarifications and refinements. There were only 3 actual chat sessions that occurred during integration where I provided some focused behavioral examples and some bulk documentation where it built some new tasks and orchestrated them.</p>
<p>I would call this a mature alpha as it was produced in 1 sitting. Functionality is usable enough that I finished the development only using the plugin. But this isn't showing off If you pull this down and have beads installed you can see my prompts and what an actual context graph looks like.</p>
<h2 id="the-point">The point <a class="anchor" href="#the-point">🔗</a>
</h2>
<p>Is not to replace humans as the engineers but replace the grunt work. That said the pattern is implied by the use case. If I am building silly tools for myself, who cares what the code really looks like. If I am building functionality I have to rely on, I need to put considerably more agency in the matter. I will still offload the grunt work when possible but it still a practice. I would hope my carpenter would cut a few less corners on my cabinets than their own. It's not that we are lazy it's we exercise our agency in a way that is comfortable for us. What we build for others must be of the highest quality, what we build for ourselves needs to meet the need.</p>
<p>I mean who knows what will happen and if greed will win and our work will be meat base robot pooper-scoopers. Until everyone figures it out get more work done and take a few more coffee breaks.</p>
Just Forget About Owning CodeTue, 03 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/just-forget-about-owning-code/
https://developmeh.com/soft-wares/just-forget-about-owning-code/<img src="/soft-wares/0204crcv.jpeg" alt="The future is FOSS" style="width: 100%; height: 600px; object-fit: cover; object-position: center calc(50% + 50px);">
<h2 id="just-forget-about-owning-code">Just Forget About Owning Code <a class="anchor" href="#just-forget-about-owning-code">🔗</a>
</h2>
<h3 id="why-keep-making-versions-of-the-same-thing">Why keep making versions of the same thing? <a class="anchor" href="#why-keep-making-versions-of-the-same-thing">🔗</a>
</h3>
<p>So let's think about how LLMs are trained. I have, mostly because I have been reading <a href="https://sebastianraschka.com/llms-from-scratch/">Build a Large Language Model (From Scratch)</a> and I was reminded of the nature of supervised / deep learning systems and their implication on how models are refined. Let's think about how LLMs got to this point, using this Wash Post article as an idea <a href="https://www.washingtonpost.com/technology/2026/01/27/anthropic-ai-scan-destroy-books/">Destroying and Scanning Books</a>, well it needs stuff to read and according to this article to get some of this volume it cut the bindings off books and scanned them. What the LLM produces is a highly advanced predictive generation of those sources. It's completely true that the model doesn't quite know the source of the information after training and because it's a sophisticated predictive engine it does better when creating something similar to what it trained on.</p>
<p>Ok so let's walk that back a little bit about code and that big engineering dream of generalizing solutions. To this point one thing LLMs do a great job of is creating CLI applications in GO, no surprise there are lots of examples of really good CLIs in GO. Of course some of this can be generalized to other languages and if I walk a few steps from here there is an argument to be made that designing literate CLI APIs is kinda solved. Sweet, as an engineer, I consider this a complete win as most of my work is purposely to offload knowing how to do things cause I have lots to do.</p>
<p>I can recall back in the days in the early web when pagination of post counts was a hard problem, now there is probably a go to framework for every language. Most of us don't really think much about pagination anymore, more we consider the kind of pagination we want and apply the solution.</p>
<p>But as things become more complicated the generalizations are too hard and have too many edge cases, solving this would take more time and money than even some community funded altruism would allow. Just consider Authentication, I have worked in a lot of places, and regardless of the agreed rule, "don't roll your own auth," sure as hell every one of these places has done just that. I can enumerate all the great FOSS auth platforms that could be used and extended, that aren't. Honestly, don't get me started on the nature of buy vs build vs vendor vs OSS, its the stupidest discussion you will ever hear. With LLMs it might even be dumber honestly, but this is the baseline for my argument.</p>
<p>Why isn't the advent of LLMs just the start of fluent FOSS solutions to all the things we repeatedly build, a reduction and concentration of quality. As we all spend money on LLMs reintroducing the wheel and building everything fast and naively we could be defining protocols and refining specs. Where is the moat (the thing that keeps someone else from running in and eating your lunch), well, there never was one, code is essentially valueless. The moat for a software business was the product and the money it costs to just build common implementations that send some data somewhere else. What keeps someone from competing with you is that building software is hosting systems and building software is expensive.</p>
<p>Why keep building bespoke versions of anything. Sure the code is cheap before LLMs cheap innovation comes from encapsulation, if you use the Linux ecosystem as an example. An environment where a majority of the interactions are using tools designed in the 80s.</p>
<blockquote>
<p>DOTADIW, or "Do One Thing And Do It Well." - Unix Philosophy</p>
</blockquote>
<p>So we still have to build this stuff but that can be the work in the end. We organize the systems and we build the technologies to act as a host and we orchestrate and we compose whatever tools we need on demand.</p>
<p>Here is my dream, think a package manager like Nix but easier to write that describes some interaction and a general UI that is baseline whatever your operating system is for simplicity. Now consider you want a movie ticket, so you do something like this:</p>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="color:#3c4e2d;">## Iron Lung Ticket Buyer
</span><span>
</span><span>search </span><span style="color:#f8bb39;">"theater inventory for postal code 1111" </span><span>=> data
</span><span>search </span><span style="color:#f8bb39;">"Iron Lung" </span><span>=> data </span><span style="color:#d65940;">-></span><span> show_data
</span><span>get </span><span style="color:#f8bb39;">"Paypal" </span><span>=> payment
</span><span>get </span><span style="color:#f8bb39;">"calendar" </span><span>=> filter
</span><span>get </span><span style="color:#f8bb39;">"seat chart" </span><span>=> picker
</span><span>
</span><span>compose show_data </span><span style="color:#d65940;">-></span><span> filter(</span><span style="color:#f8bb39;">"this evening"</span><span>) => filtered
</span><span>compose filtered </span><span style="color:#d65940;">-></span><span> picker OR </span><span style="color:#95cc5e;">select</span><span>(</span><span style="color:#95cc5e;">2</span><span>)
</span><span>resolve payment </span><span style="color:#d65940;">-></span><span> prompt => tickets
</span></code></pre>
<p>So I don't need AMC to produce a website but maybe they want to, I don't care. What I do want is to figure out what shows for "Iron Lung" are playing and get tickets this evening at my local AMC. I want to execute this structure on my local machine because its pretty simple. I am essentially composing some expert systems to do some things I want. Those systems are packaged and they might do some local LLM work or use NLP (Natural Language Processing) but the act is simple and the theater gets their money the way I wanna pay it. They don't have to build a paypal integration and I get some tickets. But there isn't really a reason this needs to be more complicated than this and I probably don't need a cloud provider to maintain this interaction.</p>
<h3 id="i-lost-you-but-you-want-this">I lost you, but you want this <a class="anchor" href="#i-lost-you-but-you-want-this">🔗</a>
</h3>
<p>I know I lost you here because it looks like I have built a programming language and I kinda have but really the syntax doesn't matter so much its instructing the orchestration of a package manager, there is no compilation. Some simple model just walks through these steps and uses modules that provide the interactions you requested. The heavy lifting is all managed by the common interactions.</p>
<p>When I think of enshittification and owning the means of production the LLMs that generate code is a two edged sword, sure a company can produce a lot of features and compete but also a nobody can disrupt that and it gets to a point where you are going to spend all your time making your moat deeper and wider with more code but the number of people building bridges over the moat increases at a rate faster than you can defend.</p>
<p>In the case above Paypal is incentivized to create a module that allows their payment system to adapt to whatever the vendor supports. Either deeply integrated or using a one time credit card, it's now insanely easier for them to build that expert system and they have to compete with Stripe doing the same thing. The model shifts away from them locking in merchant rates but being the chosen consumer brand because they have the best tools or customer satisfaction.</p>
<p>The point is some business will not have a choice they don't have to build APIs anymore the web is the API and anyone can build code to extract that data.</p>
<h3 id="how-does-this-not-happen">How does this not happen? <a class="anchor" href="#how-does-this-not-happen">🔗</a>
</h3>
<p>I am waiting for the time when the big AI companies start selling the ability to block certain types of code generation or they pass legislation that makes scraping a crime... think about it. We are nearing a case of mutually assured destruction or a human utopia.</p>
Rust Dancing ANSI Banana with Server-Sent EventsSun, 01 Feb 2026 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events/
https://developmeh.com/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events/<p><strong>Remember that dancing Ruby banana?</strong> 🍌</p>
<p>Well, I couldn't help myself. After building the <a href="/i-made-a-thing/ruby-streaming-banana-dancer/">Ruby version with chunked transfer encoding</a>, I started wondering: what if we explored the <em>other</em> way to stream data to browsers and terminals? Enter the Rust implementation using Server-Sent Events.</p>
<p>Yeah, I rewrote it in Rust. With SSE.</p>
<p>So here's the thing: when you want to stream data from a server to clients, you've got options. My Ruby version uses chunked transfer encoding—basically HTTP/1.1's way of saying "I'm sending you data in pieces, and I'll tell you when each piece ends." But there's another player in town: Server-Sent Events (SSE), which is a proper protocol built on top of chunked encoding for one-way server-to-client streaming.</p>
<p>Why both? Because understanding the difference matters when you're building real streaming applications. Plus, Rust's async ecosystem with Actix-Web makes SSE implementation surprisingly elegant.</p>
<p>The best part? It works with both curl <em>and</em> web browsers. Same endpoint, different experiences. Curl gets raw ANSI animations, browsers get properly formatted SSE streams. One server, two clients, zero compromise.</p>
<p>Want to see how SSE differs from plain chunked encoding? Grab the code at <a href="https://git.sr.ht/~ninjapanzer/sse-dancing-banana">sse-dancing-banana</a> and follow along. Or if you just want to see a banana dance: <code>curl -N http://localhost:8080/live</code></p>
<p>Bottom line: Sometimes the best way to learn a protocol is to make something completely silly with it. And what's sillier than making fruit dance in your terminal?</p>
<hr />
<p>Hope your terminal's ready for some Rust-powered dancing! 🍌🦀🎵</p>
<p><img src="../streaming-banana.gif" alt="streaming-banana" /></p>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h3 id="02-02-2026">02 02 2026 <a class="anchor" href="#02-02-2026">🔗</a>
</h3>
<h4 id="sse-vs-chunked-encoding-what-s-the-difference">SSE vs Chunked Encoding: What's the Difference? <a class="anchor" href="#sse-vs-chunked-encoding-what-s-the-difference">🔗</a>
</h4>
<p>When I built the Ruby version, I used chunked transfer encoding directly. It's HTTP/1.1's mechanism for streaming—you send data in chunks, each prefixed with its size in hex, terminated by a zero-length chunk. Simple, direct, low-level.</p>
<p>But SSE is different. It's a <em>protocol</em> built on top of chunked encoding. Think of chunked encoding as the delivery truck, and SSE as the carefully labeled packages inside. SSE defines a specific text format for events:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>data: <your content here>
</span><span>data: <more content>
</span><span>
</span></code></pre>
<p>Each event ends with a double newline. You can have multi-line data (prefix each line with <code>data:</code>), event types, IDs for reconnection, even retry hints. It's structured, and browsers have native <code>EventSource</code> API support.</p>
<p>Here's how the Rust code handles both in the same endpoint:</p>
<pre data-lang="rust" style="background-color:#12160d;color:#6ea240;" class="language-rust "><code class="language-rust" data-lang="rust"><span>async </span><span style="color:#95cc5e;">fn </span><span style="color:#60a365;">live</span><span>(req: HttpRequest) -> impl Responder {
</span><span> </span><span style="color:#95cc5e;">let</span><span> user_agent </span><span style="color:#d65940;">=</span><span> req
</span><span> .</span><span style="color:#95cc5e;">headers</span><span>()
</span><span> .</span><span style="color:#95cc5e;">get</span><span>(</span><span style="color:#f8bb39;">"User-Agent"</span><span>)
</span><span> .</span><span style="color:#95cc5e;">and_then</span><span>(|h| h.</span><span style="color:#95cc5e;">to_str</span><span>().</span><span style="color:#95cc5e;">ok</span><span>())
</span><span> .</span><span style="color:#95cc5e;">unwrap_or</span><span>(</span><span style="color:#f8bb39;">""</span><span>);
</span><span>
</span><span> </span><span style="color:#95cc5e;">let</span><span> is_curl </span><span style="color:#d65940;">=</span><span> user_agent.</span><span style="color:#95cc5e;">contains</span><span>(</span><span style="color:#f8bb39;">"curl"</span><span>);
</span><span>
</span><span> </span><span style="color:#3c4e2d;">// ... speed parameter parsing ...
</span><span>
</span><span> </span><span style="color:#95cc5e;">let</span><span> stream </span><span style="color:#d65940;">= </span><span>stream::unfold(
</span><span> FrameStream { current: </span><span style="color:#95cc5e;">0</span><span>, interval, is_curl },
</span><span> </span><span style="color:#db784d;">move </span><span style="color:#d65940;">|</span><span style="color:#db784d;">mut</span><span> state</span><span style="color:#d65940;">|</span><span> async </span><span style="color:#db784d;">move </span><span>{
</span><span> actix_web::rt::time::sleep(state.interval).await;
</span><span> </span><span style="color:#d65940;">if</span><span> state.current </span><span style="color:#d65940;">>= </span><span style="color:#db784d;">FRAMES</span><span>.</span><span style="color:#95cc5e;">len</span><span>() {
</span><span> state.current </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">0</span><span>;
</span><span> }
</span><span> </span><span style="color:#95cc5e;">let</span><span> frame </span><span style="color:#d65940;">= </span><span style="color:#db784d;">FRAMES</span><span>[state.current];
</span><span> </span><span style="color:#95cc5e;">let</span><span> data </span><span style="color:#d65940;">=</span><span> state.</span><span style="color:#95cc5e;">format_frame_data</span><span>(frame);
</span><span> state.current </span><span style="color:#d65940;">+= </span><span style="color:#95cc5e;">1</span><span>;
</span><span> </span><span style="font-style:italic;color:#db784d;">Some</span><span>((
</span><span> </span><span style="font-style:italic;color:#db784d;">Ok</span><span>::<</span><span style="color:#d65940;">_</span><span>, std::convert::Infallible>(web::Bytes::from(data)),
</span><span> state,
</span><span> ))
</span><span> },
</span><span> );
</span><span>
</span><span> HttpResponse::Ok()
</span><span> .</span><span style="color:#95cc5e;">content_type</span><span>(</span><span style="color:#f8bb39;">"text/event-stream"</span><span>)
</span><span> .</span><span style="color:#95cc5e;">streaming</span><span>(stream)
</span><span>}
</span></code></pre>
<p>The magic happens in <code>format_frame_data</code>. For curl, we send raw ANSI:</p>
<pre data-lang="rust" style="background-color:#12160d;color:#6ea240;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#95cc5e;">fn </span><span style="color:#60a365;">format_frame_data</span><span>(</span><span style="color:#d65940;">&</span><span>self, frame: </span><span style="color:#d65940;">&</span><span style="color:#95cc5e;">str</span><span>) -> String {
</span><span> </span><span style="color:#d65940;">if </span><span>self.is_curl {
</span><span> </span><span style="color:#3c4e2d;">// Chunked encoding: just send the frame with ANSI clear codes
</span><span> format!(</span><span style="color:#f8bb39;">"</span><span style="color:#db784d;">{}{}\n\n</span><span style="color:#f8bb39;">"</span><span>, </span><span style="color:#db784d;">ANSI_CLEAR</span><span>, frame)
</span><span> } </span><span style="color:#d65940;">else </span><span>{
</span><span> </span><span style="color:#3c4e2d;">// SSE: format according to the SSE protocol
</span><span> </span><span style="color:#95cc5e;">let</span><span> cleaned </span><span style="color:#d65940;">= </span><span>self.</span><span style="color:#95cc5e;">strip_ansi</span><span>(frame);
</span><span> </span><span style="color:#95cc5e;">let</span><span> lines: </span><span style="font-style:italic;color:#db784d;">Vec</span><span><</span><span style="color:#d65940;">&</span><span style="color:#95cc5e;">str</span><span>> </span><span style="color:#d65940;">=</span><span> cleaned.</span><span style="color:#95cc5e;">lines</span><span>().</span><span style="color:#95cc5e;">collect</span><span>();
</span><span> </span><span style="color:#95cc5e;">let</span><span> sse_lines: </span><span style="font-style:italic;color:#db784d;">Vec</span><span><</span><span style="font-style:italic;color:#db784d;">String</span><span>> </span><span style="color:#d65940;">=</span><span> lines
</span><span> .</span><span style="color:#95cc5e;">iter</span><span>()
</span><span> .</span><span style="color:#95cc5e;">map</span><span>(|l| format!(</span><span style="color:#f8bb39;">"data: </span><span style="color:#db784d;">{}</span><span style="color:#f8bb39;">"</span><span>, l))
</span><span> .</span><span style="color:#95cc5e;">collect</span><span>();
</span><span> format!(</span><span style="color:#f8bb39;">"</span><span style="color:#db784d;">{}\n\n</span><span style="color:#f8bb39;">"</span><span>, sse_lines.</span><span style="color:#95cc5e;">join</span><span>(</span><span style="color:#f8bb39;">"</span><span style="color:#db784d;">\n</span><span style="color:#f8bb39;">"</span><span>))
</span><span> }
</span><span>}
</span></code></pre>
<p>See the difference? For curl, we're just sending data. For browsers, we're wrapping each line in <code>data:</code> prefixes and preserving the SSE format. The browser's <code>EventSource</code> API automatically parses this.</p>
<p><strong>Why does this matter?</strong></p>
<ol>
<li><strong>Reconnection</strong>: SSE includes automatic reconnection with <code>Last-Event-ID</code>. Chunked encoding? You're on your own.</li>
<li><strong>Browser Support</strong>: <code>EventSource</code> is built-in. Chunked encoding requires manual <code>fetch()</code> streaming, which is newer and less supported.</li>
<li><strong>Event Types</strong>: SSE lets you send different event types on the same stream. Chunked encoding is just bytes.</li>
<li><strong>Simplicity</strong>: For server-to-client streaming, SSE handles the protocol. Chunked encoding is just the transport.</li>
</ol>
<p><strong>When to use what?</strong></p>
<ul>
<li><strong>Chunked Encoding</strong>: When you need low-level control, binary data, or don't care about browser niceties. Think raw terminal streaming, like the Ruby version.</li>
<li><strong>SSE</strong>: When you want browser compatibility, automatic reconnection, structured events, or you're building a real-time notification system.</li>
</ul>
<p>For this project, SSE won because I wanted both curl <em>and</em> browser support without writing separate endpoints.</p>
</div>
<div class="devlog-entry">
<h3 id="02-02-2026-1">02 02 2026 1 <a class="anchor" href="#02-02-2026-1">🔗</a>
</h3>
<h4 id="rust-s-async-streams-the-good-parts">Rust's Async Streams: The Good Parts <a class="anchor" href="#rust-s-async-streams-the-good-parts">🔗</a>
</h4>
<p>Coming from Ruby's Sinatra with its simple <code>stream</code> block, I expected Rust to be painful. It wasn't.</p>
<p>Actix-Web's streaming response is built on Rust's <code>Stream</code> trait, which is like an async iterator. You create something that implements <code>Stream</code>, and the framework handles the rest:</p>
<pre data-lang="rust" style="background-color:#12160d;color:#6ea240;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#95cc5e;">struct </span><span>FrameStream {
</span><span> current: </span><span style="color:#95cc5e;">usize</span><span>,
</span><span> interval: Duration,
</span><span> is_curl: </span><span style="color:#95cc5e;">bool</span><span>,
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">impl </span><span>Stream </span><span style="color:#67854f;">for </span><span>FrameStream {
</span><span> </span><span style="color:#95cc5e;">type </span><span>Item </span><span style="color:#d65940;">= </span><span style="font-style:italic;color:#db784d;">Result</span><span><web::Bytes, std::convert::Infallible>;
</span><span>
</span><span> </span><span style="color:#95cc5e;">fn </span><span style="color:#60a365;">poll_next</span><span>(</span><span style="color:#db784d;">mut </span><span>self: Pin<</span><span style="color:#d65940;">&</span><span style="color:#db784d;">mut </span><span style="color:#95cc5e;">Self</span><span>>, _cx: </span><span style="color:#d65940;">&</span><span style="color:#db784d;">mut </span><span>Context<'</span><span style="color:#d65940;">_</span><span>>)
</span><span> -> Poll<</span><span style="font-style:italic;color:#db784d;">Option</span><span><</span><span style="color:#95cc5e;">Self::</span><span>Item>>
</span><span> {
</span><span> </span><span style="color:#d65940;">if </span><span>self.current </span><span style="color:#d65940;">>= </span><span style="color:#db784d;">FRAMES</span><span>.</span><span style="color:#95cc5e;">len</span><span>() {
</span><span> self.current </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">0</span><span>;
</span><span> }
</span><span> </span><span style="color:#95cc5e;">let</span><span> frame </span><span style="color:#d65940;">= </span><span style="color:#db784d;">FRAMES</span><span>[self.current];
</span><span> </span><span style="color:#95cc5e;">let</span><span> data </span><span style="color:#d65940;">= </span><span>self.</span><span style="color:#95cc5e;">format_frame_data</span><span>(frame);
</span><span> self.current </span><span style="color:#d65940;">+= </span><span style="color:#95cc5e;">1</span><span>;
</span><span> Poll::Ready(</span><span style="font-style:italic;color:#db784d;">Some</span><span>(</span><span style="font-style:italic;color:#db784d;">Ok</span><span>(web::Bytes::from(data))))
</span><span> }
</span><span>}
</span></code></pre>
<p>But I took a shortcut. Instead of implementing <code>Stream</code> manually, I used <code>stream::unfold</code>, which is like <code>reduce</code> but for streams:</p>
<pre data-lang="rust" style="background-color:#12160d;color:#6ea240;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#95cc5e;">let</span><span> stream </span><span style="color:#d65940;">= </span><span>stream::unfold(
</span><span> FrameStream { current: </span><span style="color:#95cc5e;">0</span><span>, interval, is_curl },
</span><span> </span><span style="color:#db784d;">move </span><span style="color:#d65940;">|</span><span style="color:#db784d;">mut</span><span> state</span><span style="color:#d65940;">|</span><span> async </span><span style="color:#db784d;">move </span><span>{
</span><span> actix_web::rt::time::sleep(state.interval).await;
</span><span> </span><span style="color:#3c4e2d;">// ... produce next item ...
</span><span> </span><span style="font-style:italic;color:#db784d;">Some</span><span>((</span><span style="font-style:italic;color:#db784d;">Ok</span><span>(web::Bytes::from(data)), state))
</span><span> },
</span><span>);
</span></code></pre>
<p>The state (<code>FrameStream</code>) gets passed into the async block, which produces the next item and returns the updated state. Rinse, repeat, stream forever. It's elegant once you get past the types.</p>
<p><strong>The Rust Tax</strong>: You pay upfront in type signatures (<code>Result<web::Bytes, std::convert::Infallible></code> for an infallible stream?), but you get safety and zero-cost abstractions. No runtime overhead for this streaming abstraction—it compiles down to a state machine.</p>
<p><strong>The Ruby Comparison</strong>: In Ruby's Sinatra, I did this:</p>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span>stream(</span><span style="color:#db784d;">:keep_open</span><span>) </span><span style="color:#d65940;">do </span><span>|out|
</span><span> </span><span style="color:#95cc5e;">loop </span><span style="color:#d65940;">do
</span><span> out </span><span style="color:#d65940;"><<</span><span> render_frame
</span><span> </span><span style="color:#95cc5e;">sleep 0.1
</span><span> </span><span style="color:#d65940;">end
</span><span style="color:#d65940;">end
</span></code></pre>
<p>Simple, but you're managing the loop and sleep manually. Rust's <code>stream::unfold</code> encodes that pattern into the type system. More verbose, but impossible to accidentally block the runtime or leak resources.</p>
</div>
<div class="devlog-entry">
<h3 id="01-02-2026">01 02 2026 <a class="anchor" href="#01-02-2026">🔗</a>
</h3>
<h4 id="compile-time-frame-embedding">Compile-Time Frame Embedding <a class="anchor" href="#compile-time-frame-embedding">🔗</a>
</h4>
<p>One detail I'm proud of: the frames are embedded at compile time using <code>include_str!</code>:</p>
<pre data-lang="rust" style="background-color:#12160d;color:#6ea240;" class="language-rust "><code class="language-rust" data-lang="rust"><span style="color:#95cc5e;">const </span><span style="color:#db784d;">FRAMES</span><span>: [</span><span style="color:#d65940;">&</span><span style="color:#95cc5e;">str</span><span>; </span><span style="color:#95cc5e;">8</span><span>] </span><span style="color:#d65940;">= </span><span>[
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame0.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame1.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame2.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame3.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame4.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame5.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame6.txt"</span><span>),
</span><span> include_str!(</span><span style="color:#f8bb39;">"../../assets/frames/frame7.txt"</span><span>),
</span><span>];
</span></code></pre>
<p>No runtime file I/O. No error handling for missing files in production. The frames are literally part of the compiled binary, stored in the <code>.rodata</code> section. If the files don't exist at compile time, the build fails. Hard fail at compile time beats mysterious runtime errors.</p>
<p>In Ruby, I loaded frames at runtime:</p>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span>frames </span><span style="color:#d65940;">= </span><span style="font-style:italic;color:#db784d;">Dir</span><span>.glob(</span><span style="color:#f8bb39;">"ascii_frames/*.txt"</span><span>).sort.map { |f| </span><span style="font-style:italic;color:#db784d;">File</span><span>.read(f) }
</span></code></pre>
<p>This works, but it's runtime overhead, potential I/O errors, and requires the filesystem to be available. For a simple animation, compile-time embedding is perfect.</p>
<p><strong>Trade-off</strong>: Binary size increases by ~8 text files. For a banana animation, I'll take it.</p>
</div>
<div class="devlog-entry">
<h3 id="01-02-2026-1">01 02 2026 1 <a class="anchor" href="#01-02-2026-1">🔗</a>
</h3>
<h4 id="nix-for-rust-less-painful-than-ruby">Nix for Rust: Less Painful Than Ruby <a class="anchor" href="#nix-for-rust-less-painful-than-ruby">🔗</a>
</h4>
<p>After fighting Nix for the Ruby version's gem dependencies, Rust was refreshing:</p>
<pre data-lang="nix" style="background-color:#12160d;color:#6ea240;" class="language-nix "><code class="language-nix" data-lang="nix"><span>outputs </span><span style="background-color:#00a8c6;color:#f8f8f0;">=</span><span> { self</span><span style="color:#d65940;">, </span><span>nixpkgs</span><span style="color:#d65940;">, ... </span><span>}:
</span><span> </span><span style="color:#67854f;">let
</span><span> </span><span style="color:#db784d;">system </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"x86_64-linux"</span><span>;
</span><span> </span><span style="color:#db784d;">pkgs </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">import </span><span>nixpkgs { </span><span style="color:#67854f;">inherit </span><span style="color:#db784d;">system</span><span>; };
</span><span> </span><span style="color:#67854f;">in </span><span>{
</span><span> </span><span style="color:#db784d;">devShells</span><span>.${system}.</span><span style="color:#db784d;">default </span><span style="color:#d65940;">= </span><span>pkgs</span><span style="color:#d65940;">.</span><span>mkShell {
</span><span> </span><span style="color:#db784d;">buildInputs </span><span style="color:#d65940;">= </span><span style="color:#67854f;">with </span><span>pkgs; [
</span><span> rustc
</span><span> cargo
</span><span> rust-analyzer
</span><span> ];
</span><span> };
</span><span> }</span><span style="background-color:#00a8c6;color:#f8f8f0;">;</span><span>
</span></code></pre>
<p>That's it. Cargo handles dependencies via <code>Cargo.lock</code>, which Nix respects. No gemset.nix translation layer, no bundlerEnv complexity. Rust's deterministic builds align perfectly with Nix's philosophy.</p>
<p>For production, I'd add <code>pkgs.buildRustPackage</code>, but for local dev? This simple shell is all you need.</p>
<p>The Rust ecosystem's commitment to reproducible builds (via Cargo.lock) makes Nix integration almost trivial. Ruby's dynamic nature fights Nix at every turn. This is one of those moments where Rust's compile-time philosophy pays dividends.</p>
</div>
A Deterministic Box for Non-Deterministic EnginesTue, 27 Jan 2026 00:00:00 +0000[email protected]
https://developmeh.com/tech-dives/a-deterministic-box-for-non-deterministic-engines/
https://developmeh.com/tech-dives/a-deterministic-box-for-non-deterministic-engines/<h2 id="the-nature-of-non-determinism-with-llms">The Nature of Non-Determinism with LLMs <a class="anchor" href="#the-nature-of-non-determinism-with-llms">🔗</a>
</h2>
<p>So you may have heard of weights, biases, and temperature when LLMs are described. For the uninitiated: weights and biases are the core parameters learned during training that encode the model's knowledge, while temperature is an inference-time parameter that controls how much variance appears in the model's outputs. Higher temperature means more randomness in token selection; lower temperature means more deterministic responses. It's exactly this temperature parameter that ensures the model will respond with some variance for the same input. So that's clearly this non-determinism which flies in the face of the normal expectation of computers, but it's this that also provides some of the nuance in token prediction that makes the LLM work so it's easy to identify this as an <strong>Architectural Trade-Off</strong> and not necessarily a <strong>Detractor</strong>. So hoping that provides some grounding let's talk about how to make good use of this engine of... making shit up.</p>
<h3 id="making-shit-up">Making Shit Up <a class="anchor" href="#making-shit-up">🔗</a>
</h3>
<p>Yep, so that's not a tradeoff, it's a flaw, one we haven't solved yet. When the context is ambiguous the model chooses to do one of two things:</p>
<ol>
<li>Just pretend it didn't hear what it was asked to do</li>
<li>Make shit up, hallucinations</li>
</ol>
<p>Of course I think the former is not talked about as much as the hallucinations. Not to mention that the hallucinations are harder to detect and protect. Note that hallucinations are actually a separate problem from non-determinism - they're about confidence miscalibration and training data limitations, not temperature variance. Hallucinations can occur even with low temperature settings. But we can take a stab at it with some extra prompting and extra runtimes at the cost of tokens. Don't get too upset this is just the normals of computers, we make a simple thing and it has sharp edges, so we make more things that consume some extra energy to constrain the first.</p>
<p>Usually, these are to solve for the inefficiency of the human communication, but sometimes it's just cause people wanna abuse it. I like to think of Auth as a regular pain point we don't really need but have to have because trust is a hard problem. Most of whats on the web doesn't need centralized authentication but GPG has always been too hard so we made something easier to understand.</p>
<h2 id="what-to-do">What to do? <a class="anchor" href="#what-to-do">🔗</a>
</h2>
<p>Ok, back to the question, well I call it micromanagement but that kind of implies that the model and its agents have some kind of human agency, which they don't. Although some of their processes are directly modeled after humans so we can loosely apply some techniques to rein them in.</p>
<p>First, let's talk about context and ambiguity. If you haven't figured this out yet the longer the context the more the model's attention distributes across tokens, reducing precision on individual details - a "lost-in-the-middle" effect where information gets deprioritized. Most of this is your fault because even with your best effort you introduce inconsistencies and other inaccuracies into the conversation. The lesson, clear your context often and especially between phases of your work, aka, planning, building, and verifying. I like to consider this an analogy to writing and editing. Have someone else edit your work or write it and review it a week later to improve objectivity. Thankfully with LLMs their memory is as ephemeral as you like.</p>
<p>So we need a way to turn a goal into a workstream that allows us to actually look away from the model's stream. Some might call this an agentic orchestration but I feel these often sprint from meaningful to overly complicated in a matter of weeks. Especially if you use something like Claude-Code, Codex, or OpenCode all the building tools are there already. So starting from something like Claude-Code we need to teach our main agent interface to better follow some process when working.</p>
<p>Here is an example:</p>
<p><strong>CLAUDE.MD</strong></p>
<pre data-lang="markdown" style="background-color:#12160d;color:#6ea240;" class="language-markdown "><code class="language-markdown" data-lang="markdown"><span>
</span><span>## </span><span style="color:#db784d;">Working Style
</span><span>
</span><span>When collaborating on this project:
</span><span>- Check existing files first before suggesting changes
</span><span>- Ask questions one at a time to refine ideas
</span><span>- Prefer multiple choice questions when possible
</span><span>- Focus on understanding: purpose, constraints, success criteria
</span><span>- Apply YAGNI ruthlessly - remove unnecessary features from all designs
</span><span>- Present designs in sections and validate each incrementally
</span><span>- Go back and clarify when something doesn't make sense
</span><span>
</span><span>## </span><span style="color:#db784d;">Deliverables
</span><span>
</span><span>- Break down the decisions from collaboration into tasks
</span><span>- You must use any defined task tracking tools outlined in the Task Tracking section to create tasks falling back to markdown files if nothing is defined
</span><span>- Create a report for the executiong plan with dependencies mapped
</span><span>
</span><span>## </span><span style="color:#db784d;">Workflow Guidelines
</span><span>
</span><span>- Create an epic for each high-level objective
</span><span>- Create subtasks as a todo chain under the epic
</span><span>- Write titles as the task to be performed (imperative form)
</span><span>- Add detailed descriptions with examples of work to be done
</span><span>- Verify each task before closing
</span><span>- Log details about failures and retries in ticket descriptions for historical tracking
</span><span>- When an epic is completed, write a report of the task graph and verify all items were performed
</span></code></pre>
<h3 id="controlling-core-memories">Controlling Core Memories <a class="anchor" href="#controlling-core-memories">🔗</a>
</h3>
<p>As I included above <em>Deliverables</em> and <em>Workflow Guidelines</em> we initially want our first pass to be on work breakdown and dependency. This will provide some added benefits the way we will track that work progress though. Often the agent writing code falls victim to the two points above with a couple of variations. Hallucinations in this case are items that just don't work and the remainder is missed features. That's good though because we can track and essentially later interrogate success and failure of the model's execution. Better yet we can finally realize the age old dream that we can repeat a variation of a task in the future more accurately because each replanning is less ambiguous. Good luck doing this with people but with LLMs it's all data.</p>
<p>So memory management moves into tasks, which can be in markdown, Jira via MCP (Model Context Protocol - a standard for connecting AI agents to external tools), or my preference, <a href="https://github.com/steveyegge/beads">Beads</a> I don't think there is a lot of big effective differences for me except when we come back to the nature of context size complication introducing confusion.</p>
<p>So beads does for AI what Jira does for humans and yet even as a human I would rather use Beads than Jira. Arguably, the difference is that tools like Beads focus on de-complicating the organization of work, its there for the worker's benefit. Jira on the other hand only benefits the bean counters and the workers just have to suffer so that a very few can complain that the reports it produces are useless.</p>
<p>Sorry, my Jira PTSD is showing... Beads, right Beads lets the coding agent take its task breakdown and put it into a graph with dependencies and epics, these feel meaningless to the agent but it's more about what we get to do with it later. It's easier for me to say to a fresh context, review the epic X and verify its functionality. You'll notice that when it finds something is a failure it usually just tries to fix it but it's also going to record a stream of attempts and what was the final resolution. Resulting in a history of the model's confusion introduced from me or the plan, but when I wanna do something similar I can use the JSONL (JSON Lines format - one JSON object per line) from the beads sync operation to prompt a variation of the task and create a new task breakdown.</p>
<p>Here is a claude partial to explain beads</p>
<pre data-lang="markdown" style="background-color:#12160d;color:#6ea240;" class="language-markdown "><code class="language-markdown" data-lang="markdown"><span>### </span><span style="color:#db784d;">Task tracking
</span><span>
</span><span>Use 'bd' (beads) for task tracking. Run </span><span style="color:#f8bb39;">`bd onboard`</span><span> to get started.
</span><span>
</span><span>#### </span><span style="color:#db784d;">bd Quick Reference
</span><span>
</span><span>```</span><span style="color:#db784d;">bash
</span><span style="color:#3c4e2d;"># Discovery & Navigation
</span><span>bd ready </span><span style="color:#3c4e2d;"># Find available work
</span><span>bd show </span><span style="color:#d65940;"><</span><span>id</span><span style="color:#d65940;">> </span><span style="color:#3c4e2d;"># View issue details
</span><span>bd show </span><span style="color:#d65940;"><</span><span>id</span><span style="color:#d65940;">> </span><span>--children </span><span style="color:#3c4e2d;"># Show issue with subtasks
</span><span>
</span><span style="color:#3c4e2d;"># Task Management
</span><span>bd create </span><span style="color:#f8bb39;">"<title>"</span><span> --type epic </span><span style="color:#3c4e2d;"># Create an epic
</span><span>bd create </span><span style="color:#f8bb39;">"<title>"</span><span> --parent </span><span style="color:#d65940;"><</span><span>id</span><span style="color:#d65940;">> </span><span style="color:#3c4e2d;"># Create subtask under parent
</span><span>bd update </span><span style="color:#d65940;"><</span><span>id</span><span style="color:#d65940;">> </span><span>--description </span><span style="color:#f8bb39;">"..." </span><span style="color:#3c4e2d;"># Update description
</span><span>bd update </span><span style="color:#d65940;"><</span><span>id</span><span style="color:#d65940;">> </span><span>--status in_progress </span><span style="color:#3c4e2d;"># Claim work
</span><span>bd close </span><span style="color:#d65940;"><</span><span>id</span><span style="color:#d65940;">> </span><span style="color:#3c4e2d;"># Complete work
</span><span>
</span><span style="color:#3c4e2d;"># Sync & Persistence
</span><span>bd sync </span><span style="color:#3c4e2d;"># Sync with git (exports to JSONL)
</span><span>```
</span><span>
</span><span>#### </span><span style="color:#db784d;">Workflow Guidelines
</span><span>
</span><span>- Create an epic for each high-level objective
</span><span>- Create subtasks as a todo chain under the epic
</span><span>- Write titles as the task to be performed (imperative form)
</span><span>- Add detailed descriptions with examples of work to be done
</span><span>- Verify each task before closing
</span><span>- Log details about failures and retries in ticket descriptions for historical tracking
</span><span>- When an epic is completed, write a report of the task graph and verify all items were performed
</span><span>
</span><span>#### </span><span style="color:#db784d;">Displaying Task Graphs
</span><span>
</span><span>Use </span><span style="color:#f8bb39;">`bd show <epic-id> --children`</span><span> to display the task hierarchy. For visual reports, create ASCII diagrams showing task dependencies and completion status.
</span></code></pre>
<h2 id="uniqueness-vs-repeatability">Uniqueness vs Repeatability <a class="anchor" href="#uniqueness-vs-repeatability">🔗</a>
</h2>
<p>This is kind of the funny part of this whole process, the LLM can help with a bespoke task but it doesn't generally improve performance because the context size tends to bias towards failures and you end up having to check its outputs and re-validate anything ambiguous. You may say that you don't need to, but just look at the news, it's the failure mode the AI tools get lambasted on. Of course being an engineer we know that everything is essentially wrong and we are balancing the acceptable amount of wrong we can accept at any given moment.</p>
<p>This of course means that when we can find a process that is refinable to a predictable set of tasks we will end up trying to build some complicated brittle script that can automate the process and here is why building things with computers can be kinda dry. We should let the models handle the fixed set of tasks that need a little flexibility but doesn't offer too much range of opportunities for errors.</p>
<p>Refinement of process from memory is just a strategy but it's one that works quite well since the next agent can read the actions of its predecessor, you can bias it to take the success path and start ignoring it, which is the dream. For years I have been using LLMs and finding myself trapped staring at the console because it prompts me for feedback every couple of seconds building context or I have to endlessly remind it to complete the tasks. Both of these conditions are mostly eliminated.</p>
<h3 id="deploying-to-k8s">Deploying to K8s <a class="anchor" href="#deploying-to-k8s">🔗</a>
</h3>
<p>A concrete example of this is to deploy an application to kubernetes. This is super well documented and there is a ton of good tooling but it's also a highly configurable system. Each enterprise makes its own rules and policies around how containers are described. It can be very tiresome reading charts and chart documentation while bashing your head against a statement like <em>CrashBackoffLoop</em>. It's not like there isn't a way to learn about what's happening but it's a lot of command orchestration, the LLM can collect context of the failure much better since it can read multiple streams concurrently. So I recently deployed <a href="https://flagd.dev">flagd</a> (an open-source feature flag daemon) to our cluster and let Claude handle the building of charts. To keep things simple I let it just write the deployment and service artifacts itself. It failed a bunch but eventually determined from feedback of the k8s MCP that we were missing a config file. It mounted a volume and created a config map, 20 minutes later I have a stable instance running. I will need to do some work to make this fit into organizational norms but I am also ready for my PoC at the same time. I also have a log of what worked and what didn't work so when it comes time to deploy something else to our cluster I have a baseline of tasks which will reduce the planning time and token volume.</p>
<p>An example of this completed task output:</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>Beads Task Report - January 26, 2026
</span><span>
</span><span>Epic: Deploy flagd to namespace
</span><span>
</span><span>ID: feature-flag-prd-q6w </span><span style="color:#d65940;">| </span><span>Status: ✅ CLOSED
</span><span>
</span><span>---
</span><span>Task Graph
</span><span>
</span><span>feature-flag-prd-q6w (EPIC) ✅ CLOSED
</span><span>├── .1 Create flagd Deployment manifest ✅ CLOSED
</span><span>├── .2 Create flagd Service manifest ✅ CLOSED
</span><span>├── .3 Deploy flagd Deployment to cluster ✅ CLOSED
</span><span>├── .4 Deploy flagd Service to cluster ✅ CLOSED
</span><span>└── .5 Verify flagd deployment is running ✅ CLOSED
</span><span>
</span><span>---
</span><span>Task Completion Summary
</span><span>┌─────┬────────────────────────────┬────────┬───────────────────────────────────────────┐
</span><span>│ ID │ Task │ Status │ Close Reason │
</span><span>├─────┼────────────────────────────┼────────┼───────────────────────────────────────────┤
</span><span>│ .1 │ Create Deployment manifest │ ✅ │ flagd_deploy/k8s/deployment.yaml created │
</span><span>├─────┼────────────────────────────┼────────┼───────────────────────────────────────────┤
</span><span>│ .2 │ Create Service manifest │ ✅ │ flagd_deploy/k8s/service.yaml created │
</span><span>├─────┼────────────────────────────┼────────┼───────────────────────────────────────────┤
</span><span>│ .3 │ Deploy Deployment │ ✅ │ Deployment exists in cluster │
</span><span>├─────┼────────────────────────────┼────────┼───────────────────────────────────────────┤
</span><span>│ .4 │ Deploy Service │ ✅ │ Service exists with ClusterIP 127.0.0.1 │
</span><span>├─────┼────────────────────────────┼────────┼───────────────────────────────────────────┤
</span><span>│ .5 │ Verify deployment running │ ✅ │ Pod running after troubleshooting │
</span><span>└─────┴────────────────────────────┴────────┴───────────────────────────────────────────┘
</span><span>---
</span><span>Troubleshooting Log (Task .5)
</span><span>
</span><span>The verification task encountered issues that required 3 attempts:
</span><span>
</span><span>Attempt 1 - CrashLoopBackOff
</span><span>- Container exited with code 0 after printing help
</span><span>- Fix: Added args: </span><span style="color:#d65940;">[</span><span>start</span><span style="color:#d65940;">]</span><span> to deployment
</span><span>
</span><span>Attempt 2 - Fatal: no sync implementation set
</span><span>- flagd requires a flag source (file, gRPC, or HTTP sync)
</span><span>- Fix: Created ConfigMap with example flags, added volume mount
</span><span>
</span><span>Attempt 3 - SUCCESS ✓
</span><span>- Created ConfigMap: accounts-utility-flagd-service-dev-flags
</span><span>- Updated deployment with volume mount to /etc/flagd
</span><span>- Added args: --uri file:/etc/flagd/flags.json
</span><span>
</span><span>---
</span><span>Final Deployment State
</span><span>┌───────────────┬────────────────────────────────────────────────────┐
</span><span>│ Component │ Value │
</span><span>├───────────────┼────────────────────────────────────────────────────┤
</span><span>│ Pod │ flagd-service-dev-87fb5946f-vft4t │
</span><span>├───────────────┼────────────────────────────────────────────────────┤
</span><span>│ Status │ 1/1 Running │
</span><span>├───────────────┼────────────────────────────────────────────────────┤
</span><span>│ Pod IP │ 10.10.10.102 │
</span><span>├───────────────┼────────────────────────────────────────────────────┤
</span><span>│ Service IP │ 172.10.10.10 │
</span><span>├───────────────┼────────────────────────────────────────────────────┤
</span><span>│ Node │ node │
</span><span>├───────────────┼────────────────────────────────────────────────────┤
</span><span>│ flagd Version │ v0.13.2 │
</span><span>└───────────────┴────────────────────────────────────────────────────┘
</span><span>Exposed Ports:
</span><span>- gRPC: 8013
</span><span>- HTTP: 8014
</span><span>- FlagSync: 8015
</span><span>- OFREP: 8016
</span><span>
</span><span>---
</span><span>Result: 5/5 tasks completed - Epic successfully closed
</span></code></pre>
<h2 id="what-next">What next <a class="anchor" href="#what-next">🔗</a>
</h2>
<p>Here is how I would go about things, start by recording your plans. Maybe take one of my examples and refine it for you and check your experiences. Then install Beads and just manually create tasks and see how the agent interacts. Then go ahead an automate the whole thing but maybe this time we can avoid <a href="https://xkcd.com/1319/">xkcd:1319</a> but probably not :)</p>
Claude or ClodFri, 23 Jan 2026 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/claude-or-clod/
https://developmeh.com/soft-wares/claude-or-clod/<p>First off this might sound like a shitpost, but anecdotally, I chuckle to myself about this all the time when I am vibe coding. Claude is something of a straw tiger here and the title is just for the lolz.</p>
<p>So the last couple of weeks I have been diving back into Claude-Code as my primary tool and away from Junie from Jetbrains. Under the hood I use Claude Sonnet for both and have been using various versions of gemmas and gippities since I stumbled across gpt4all like 3ish years ago.</p>
<p>I have a fun history with this technology. As a college student I naively tried to build this kind of knowledge-based query interface in what I called LDOCS (Large Document Search)—yep, I couldn't figure out acronyms back then either. The idea was to enter all the works of Mark Twain and then ask questions about the TCU, the Twain Creative Universe. It was wide-eyed and it didn't work, but it was enough for my senior thesis. Point is, I've been thinking about this space and what my expectations of it are for quite a while.</p>
<p>Now enters a real thing that isn't some idealistic trash I dreamed up, and I get to use it every day. It's pretty sweet. We all know it.</p>
<p>But is it really ready to work? Maybe. Let's run down a few experiences over the course of a year and my takeaway as a 20-year career veteran.</p>
<h2 id="the-win">The Win <a class="anchor" href="#the-win">🔗</a>
</h2>
<p>I needed testing tools for the Passkey/WebAuthN Related Origin Request spec. Specifically, I needed to validate origin relationships for passkey authentication—arcane stuff involving eTLDs (effective top-level domains) and ccTLDs (country-code top-level domains). The kind of work that makes you squint at RFCs and Chromium source code until your eyes cross.</p>
<p>Consider this: a passkey is designed to affix to a single domain. But enterprises have many domains. This draft spec provisionally allows passkeys to work across a predefined set of origins. Taking it from spec to tool meant understanding browser internals I was too dumb to grok from the RFC alone.</p>
<p>I fed C files from Chromium into the model. "Give me a Go CLI that does this," I said.</p>
<p>It did.</p>
<p><a href="https://github.com/developmeh/passkey-origin-validator">passkey-origin-validator</a></p>
<p>Go is simple. There are <em>tons</em> of examples in the training data. The model nailed the CLI scaffolding—flags, argument parsing, output formatting. Beautiful. It even gave me things I didn't know I wanted, like flags for files vs URLs to test against.</p>
<p>Then I looked at the eTLD logic.</p>
<p>Wrong.</p>
<p>Not catastrophically wrong. Subtly wrong. The kind of wrong that would pass a surface-level review but fail in production when someone tried to authenticate from <code>.co.uk</code> or <code>.com.au</code>. Think about the rules for <code>developmeh.com</code> versus <code>developmeh.co.jp</code>. The model had <em>predicted</em> what eTLD logic should look like based on patterns it had seen. It hadn't <em>understood</em> the problem.</p>
<blockquote>
<p>The model doesn't think. It predicts.</p>
</blockquote>
<p>I fixed it myself. Wrote the domain suffix matching logic by hand, validated against the public suffix list, added edge cases the model never considered. The task took about 20% longer than if I'd done it solo from the start.</p>
<p>But the documentation? Solid. The tests? Comprehensive. The CLI help text? Actually helpful.</p>
<p>Tradeoff.</p>
<p>Here's the thing that kept me up at night: if I were less skilled—if I didn't know the problem space intimately—I might not have noticed it didn't really solve the problem. I would've shipped broken code with excellent documentation. Tech debt with a bow on it.</p>
<p>You should be picking up the conflicts now.</p>
<h2 id="the-loss">The Loss <a class="anchor" href="#the-loss">🔗</a>
</h2>
<p>I wanted to build a WebRTC tunnel to a CGNATed (Carrier-Grade NAT) device. Think: running a server on your phone behind carrier-grade NAT, establishing a peer connection, maintaining a stable tunnel. Something new, something off-standard, something not well-represented in the training corpus.</p>
<p><a href="https://github.com/developmeh/webrtc-poc">webrtc-poc</a></p>
<p>The model could write the WebRTC boilerplate. It could scaffold STUN/TURN server connections. It could generate the SDP (Session Description Protocol) offer/answer flow. But when it came time to orchestrate the actual handshake—the delicate dance of ICE candidates and connection state changes—it fell apart. It couldn't figure out how to start servers in the right order to establish the same handshake it did in the PoC.</p>
<p>I spent a lot of money. I got very little success.</p>
<p>"AI is great," I told a friend afterward, "just don't ask it to do WebRTC or anything with a handshake."</p>
<p>He laughed. I didn't.</p>
<p>The reality is the theme of commonality. That's where we should be trying to understand the model's place in our workflows.</p>
<h2 id="the-new-junior-dev">The New Junior Dev <a class="anchor" href="#the-new-junior-dev">🔗</a>
</h2>
<p>Yes.</p>
<p>And definitively, no.</p>
<p>More accurate: they're like any dev the first time on a new project. I've seen seniors newly introduced to a codebase make the same general mistakes the model does. I call them "shortcuts" because it appears they're skipping good process, racing toward the goal so they can go home. Something like Mr. Meeseeks—existence is pain, just finish the task and let me stop existing.</p>
<p>The pitfalls are predictable:</p>
<ul>
<li>Convoluted business logic (special cases)</li>
<li>Unfocused context (not enough files in the RAG)</li>
<li>Test confusion</li>
</ul>
<p>When complex code is modified, the model tends to focus changes on a single file that appears akin to LoB (Locality of Behavior). But when there's too much abstraction, the model doesn't have a common pattern to predict against and cuts the corner by doing something easier—like changing contracts and moving things to a central location. That's exactly what people do who have a lower quality-to-completion drive. I internalize this as the same motivation hierarchy the model prefers.</p>
<p>Test confusion is my favorite. The model will add a conditional check to force test values <em>only when the test suite is running</em>. It'll detect <code>NODE_ENV === 'test'</code> or check for the presence of a global test flag, then branch the logic. The tests pass. The code is fundamentally broken.</p>
<blockquote>
<p>The model is very human and ethically unaffected.</p>
</blockquote>
<p>It doesn't feel bad about lying to the test suite. It doesn't experience shame when it hacks around a problem instead of solving it. It just predicts the next token that makes the error go away.</p>
<p>This should define our trust of its outputs.</p>
<p>I was asked recently by someone I greatly respect:</p>
<blockquote>
<p>Paul, you have managed engineering teams before. You know we try not to micromanage people. Should we micromanage the AI though?</p>
</blockquote>
<p>Yes. That's generally the narrative about agents. They require a lot of refinement for anything complex. The folks over at METR have statistics: tasks under 30 minutes complete successfully about 80% of the time. As tasks approach an hour, success drops to 50%.</p>
<p>This tracks with my experience. Short, well-defined, pattern-matching tasks? The model crushes them. Longer tasks requiring sustained context and architectural decisions? Coin flip.</p>
<p>But here's what bothers me about those numbers: we're measuring <em>completion</em>, not <em>correctness</em>.</p>
<h2 id="the-agent-experiment">The Agent Experiment <a class="anchor" href="#the-agent-experiment">🔗</a>
</h2>
<p>New experiment: agent orchestration.</p>
<p>I spun up four agents—product agent, PM agent, tech-lead agent, architect agent. Gave them a feature request for adding feature flags, told them to plan it out. They produced a PRD (Product Requirements Document). ADRs (Architecture Decision Records). A technical implementation plan. A work breakdown structure. Jira tickets for rollout phases.</p>
<p>Total time: about three hours.</p>
<p>Total artifacts produced: 62.</p>
<p>How long to validate 62 artifacts?</p>
<p>Here in lies the trap.</p>
<p>Verbosity hides meaning the same way big pull requests hide bad code. You <em>think</em> you're being thorough because there's so much output. You <em>feel</em> productive because the agents generated thousands of words of planning documentation. But reading is slower than writing, and verification is slower than generation.</p>
<p>I stared at those 62 files and felt a familiar dread. The same dread I feel when someone drops a 3,000-line PR in my lap and says "just a few small changes." Your eyes glaze. You skim. You approve. You pray.</p>
<p>The volume itself becomes a kind of argument: <em>look how much I produced</em>. But production isn't value. Production is just... production.</p>
<p>The orchestration itself was surprisingly easy to build. Agents calling agents, passing context, refining outputs. The decomposition into four specialized phases felt right—narrow experts doing narrow work instead of one omniscient assistant hallucinating across domains.</p>
<p>But identifying success? Knowing if the plan was actually <em>good</em>?</p>
<p>That part wasn't easy at all.</p>
<p>I hear all the time that effective LLM use for code gen is about planning everything. Small tasks. Tool construction.</p>
<p>Better to see it like this:</p>
<blockquote>
<p>What's best for the model is also what's best for you as a dev. You know, the things that don't seem to save time.</p>
</blockquote>
<p>The practices that don't <em>seem</em> to save time—writing focused functions, documenting intent, structuring code into discrete responsibilities—those are exactly what make AI augmentation work.</p>
<p>The model can't navigate a tangled mess of god objects and hidden dependencies any better than a new human teammate can. But give it a clean interface, a well-defined problem, and examples of the pattern you want? It'll predict something useful.</p>
<p>Generally asking the LLM to do the work is the wrong solution. It's kind of meh at it. But building tools that are small and composable so it can be the orchestration engine? Now you might have something. If the tool is small enough, maybe it can even build it.</p>
<p>How do you make a system that's easy to manage? SRP (Single Responsibility Principle). You build interfaces and contracts that are consistent. Contract first. You focus on composition over inheritance. You keep patterns simple and try to repeat yourself when possible.</p>
<p>Like poetry.</p>
<h2 id="anyways">Anyways <a class="anchor" href="#anyways">🔗</a>
</h2>
<p>So yes its both, a clod and Claude. It depends on the day and the time spent. Its not free work its work where the course parts can achieve less focus for you.</p>
<p>These tools don't think. They predict. Sometimes well enough to be useful. Sometimes not. The only way to find out is to build something and see what breaks.</p>
<p>Failure is a valid outcome. We just have to keep trying.</p>
The AI DiariesMon, 19 Jan 2026 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/ai-diaries/
https://developmeh.com/soft-wares/ai-diaries/<h2 id="the-ai-diaries">The AI Diaries <a class="anchor" href="#the-ai-diaries">🔗</a>
</h2>
<blockquote>
<p>As soon as it works, no one calls it AI anymore - John McCarthy</p>
</blockquote>
<p>So I tend to avoid using the term AI but it's sometimes unavoidable. Right now I am being forced to spend considerable time using coding tools. And sometimes I like it, sometimes I think it's a bore, and almost always it wastes some of my time. At a minimum it makes up for all the time it wastes but it always creates more noise than value. I have a lot of anecdotes working in this space so I will land them here, at the edge of obscurity.</p>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h2 id="06-03-2026">06 03 2026 <a class="anchor" href="#06-03-2026">🔗</a>
</h2>
<p>During the beginning of the hype cycle for "AI", the thing that turned me off was the noise about everyone having to become a "Product Engineer." It was the concept that the only value software provides is building products to be sold. Disgusting! (Obviously, I know what platform this is and how dumb this opinion is for this tribe)</p>
<p>It took a while and a few books, one specifically that told some stories of Grace Hopper that healed my issues. It was the realization that software and automation was to reduce drudgery not to create revenue. It was the need to reduce drudgery that results in positive outcomes and the reason people would pay for software.</p>
<p>Have your own opinions but I repeatedly seem to re-learn this lesson, my choices should be about my thoughts and those are validated by vetting the thoughts of others.</p>
</div>
<div class="devlog-entry">
<h2 id="01-03-2026">01 03 2026 <a class="anchor" href="#01-03-2026">🔗</a>
</h2>
<p>I have been off in a microcosm of building PoCs for things, some of them useful:</p>
<ul>
<li><a href="https://sr.ht/~ninjapanzer/codify-orchestration/">Agent Orchestration</a></li>
<li><a href="https://sr.ht/~ninjapanzer/beads-monitor/">IDE Tooling</a></li>
</ul>
<p>Some not so much:</p>
<ul>
<li><a href="https://sr.ht/~ninjapanzer/small-language-models/">From Scratch Semantic Code Search</a></li>
</ul>
<p>But coming back to the real world for a moment I am starting to see how I might be in a bubble, an AI bubble. I have simplified AI as a soft executor. Like a script with smart error handling. It's not conversational, it's imperative and command oriented. I think I am a control freak or maybe it's a matter of perspective scale, but I am preparing my statement to the model with a pre-defined expectation of the outcome. I also express the expected outcome and then explain the direction and then confirm the outcome in my statement. I am not looking for the model to introduce inspiration. I highly question the decision to allow the idea to flow from the model, because it often has some pretty bad ideas.</p>
<p>Its about reducing drudgery, <a href="/devex/automatic-programming-iteration-4/">Like I alluded to here</a>. Having something of a long tail in this industry now I have seen the world when the product was hosted in the office on consumer hardware. I did the cowboy coding without version control, just FTP to the server. That's how debugging happened too sometimes. Then we got all full of ourselves, we needed more guardrails, we needed to support more engineers with lower skills, we needed to grow. That sounds like punching-down or retroactive gate-keeping that's not the point. There wasn't time to train people on or off the job. Bootcamps did their best to teach functional skills and get the butts in the seats but they were unable to embed what years of experience also provides. Being good thoughtful people we focused on how to make the work safer for more people and permit more capability with less experience. Then we over generalized, we produced specialization in managing tools and never built up the experience, it was abstracted away at a rate that required it to have more abstraction. So now what we have is a ton of drudgery, that hides its meaning so well it has become hard to use.</p>
<p>Regardless, we poo poo on shell scripts because they are brittle (citation needed). I have worked too many places where I was refused a merge due to the presence of a shell script because I could have written the same thing in Ruby or JavaScript. A language that the humans understood better and purported would be less brittle. Of course the brittleness is in the error handling not the language. So this is where I started thinking about llms as error traps for code, don't write a skill that does work, have the LLM write the code and then a skill to run the code. When the code breaks because of a bad filename or a missing system dependency it doesn't blow up. Instead the model takes over and either mitigates the flaw and follows the "spirit" of the code to get things done or just fixes the code.</p>
<p>Here is the thing, both of these are brittle solutions, one we expect to be brittle eventually, the other will be unexpectedly brittle but self correcting. I know the cheap seats will say, "by unexpectedly you mean it will rm -rf ~/". I mean maybe sure but that's pretty unlikely especially if you don't let it run rm -rf outside of $CWD or just not at all.</p>
<p>TLDR; Here is the point, we are always afraid of something, when it was humans we were afraid of the humans, now its robots and we are elevating the humans to the robots. What if we just all agreed that both are likely to screw everything up and take a chance to pivot. Excessive abstraction didn't fix the problem it didn't make the work easier or faster. It did make it consistent which is the thing to learn. Now we have robots and our purpose is to make them operate consistently. Time to break some rules and climb out of the pit of drudgery.</p>
</div>
<div class="devlog-entry">
<h2 id="22-02-2026">22 02 2026 <a class="anchor" href="#22-02-2026">🔗</a>
</h2>
<blockquote>
<p>Looking at the task, extracting 42+ fields and refactoring a 2300-line file in a single pass is ambitious. The agent likely ran into too many compilation errors and got stuck.</p>
<p>Let me mark this bead as blocked and continue with the other work. The user can decide how to proceed.</p>
<p>rwad failed again - no commits made. This is a complex refactoring (42+ fields in a 2300-line file). Marking as blocked.</p>
</blockquote>
<p>For context this is the file in question <a href="https://git.sr.ht/~ninjapanzer/jetbrains-beads-manager/blob/843f4d526e4c0ef0158965b7a0947520920ce304/src/main/kotlin/com/beads/monitor/toolwindow/IssueDetailPanel.kt">com.beads.monitor.toolwindow.IssueDetailPanel</a> which over the course of about 20 days of development this file represents the main interface for the beads manager.</p>
<p><img src="/soft-wares/Snapshot_2026-02-22_16-15-53.png" alt="interface" /></p>
<p>There is a lot going on here but really rather mild compared to software I have worked on in the past. For this file only represents the right hand details. Which in the background includes things like partial editing, reloading, and partial refreshes. But there is little I can accept should cause this file to be 2300 lines. What brought us here was the discovery that regressions were introduced at each new feature added. Many would take 5-6 iterations for Claude to solve for. It becaue an enormous time suck.</p>
<p>The learning here is that there must be a pressure valve for refactors. I expect that we need systems that can observe file complexity once again. Like the days of old its probably necessary to guide coding agents to far more strict standards than the average developer. This wild growth is unsettling and in my past when working with younger devs it was tools like flog and flay that helped guide code growth.</p>
<p>As a very senior developer now these things seem natural, complexity is a way of life and as I always say software engineering is change management.</p>
</div>
<div class="devlog-entry">
<h2 id="08-02-2026">08 02 2026 <a class="anchor" href="#08-02-2026">🔗</a>
</h2>
<p>I've been tracking beads data on the <a href="https://plugins.jetbrains.com/plugin/30089-beads-manager">JetBrains Beads Manager plugin</a> build and the numbers tell the 80/20 story pretty clearly.</p>
<p>Four days, 156 issues closed. Sounds impressive until you look at the breakdown:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span> ┃ Features/Epics ┃ Tasks ┃ Bugs ┃
</span><span>────────╋────────────────╋──────────────╋───────────────────╋────
</span><span>Feb 5 ┃ ▓▓ 5 ┃ ░░░░░░░░ 24 ┃ ████████████ 29 ┃ 59
</span><span>Feb 6 ┃ ▓▓▓▓ 12 ┃ ░░░ 9 ┃ ████ 12 ┃ 37
</span><span>Feb 7 ┃ ▓ 4 ┃ ░░░░░░░░░ 34 ┃ ████ 12 ┃ 50
</span><span>Feb 8 ┃ ▓ 1 ┃ ░░ 5 ┃ █ 3 ┃ 10
</span><span>────────╋────────────────╋──────────────╋───────────────────╋────
</span><span>Total ┃ 22 (14%) ┃ 72 (46%) ┃ 56 (36%) ┃ 156
</span></code></pre>
<p>Day 1 built the thing. 5 features, 24 tasks to wire them up, and immediately 29 bugs. Day 2 added more features - macOS compatibility, settings panels, refresh timers. Day 3 and 4? Chasing bugs and polish. UI stuttering, race conditions, tree selection quirks, scroll position resets.</p>
<p>56 bugs out of 156 total issues. That's 36% of all tickets just fixing what the agents broke while building features. And those bug tickets often took longer - VFS async race conditions, deprecated API replacements, multi-selection state management. The kind of stuff where the agent confidently implements the wrong fix and you're three attempts deep before finding the real problem.</p>
<p>The agents built a working plugin in a day. Then we spent three days making it actually work.</p>
</div>
<div class="devlog-entry">
<h2 id="03-02-2026">03 02 2026 <a class="anchor" href="#03-02-2026">🔗</a>
</h2>
<p>This is just a thought process I go through with LLM generated code...</p>
<blockquote>
<p>OK, I can produce more code than I reasonably can keep track of in a single session, which means there is always going to be some code I didn't read.</p>
</blockquote>
<blockquote>
<p>OK, I can always produce and keep in sync documentation about the code that is produced, ADRs and design docs. But if they are too long no one will read them. But at least there is some consumable record.</p>
</blockquote>
<p>Kinda like a factory maybe stamping widgets, because this model of writing all the code all the time seems a little odd. I should be writing less code and there should be more shared code. If the product is the feature and the speed to market is what matters then the cost for encapsulation should go down. Modern products will end up as composable licensable modules.</p>
<p>This is kind of the path that infrastructure took, so why not product. Think about it, if we can remove the human ego from deciding on a solution then any solution is good as long as it can be wired into the product.</p>
<p>If code gen is expensive it's better to reduce the work and just contribute to open source.</p>
<p>I might have lost you there but hear me out: <a href="/soft-wares/just-forget-about-owning-code">Just Forget About Owning Code</a></p>
</div>
<div class="devlog-entry">
<h2 id="02-02-2026">02 02 2026 <a class="anchor" href="#02-02-2026">🔗</a>
</h2>
<p>On Sunday I spent some more considerable time building something dumb and noticed something interesting. While I had observed this before this was formal confirmation because I was able to encounter the same issue across multiple models. While I don't know what is the common source for coding training data but as a person who makes programs that do specifc non-business tasks it seems clear that all the models I have worked on so far don't know how to make a browser extension. Add it to the list of things like webrtc but in this case understanding Manifest v2 vs v3 is always a challenge. In most cases my usage for LLMs is to help get me past the hump of a new technology, traditionally if I know a technology I write the code myself. I have built a number of extensions with various LLMs and they always get trapped on CSP and manifest considerations. They also don't seem to understand anything about how the browser works outside the spec. An extension has to follow a bunch of rules that are bespoke to the application but these are unknown to the models training it would seem.</p>
<p>But $10 to build an HLS extractor is pretty cool.</p>
</div>
<div class="devlog-entry">
<h2 id="30-01-2026">30 01 2026 <a class="anchor" href="#30-01-2026">🔗</a>
</h2>
<p>Success with coding agents is as expected completely bound to the quality of the model used. So much of how the agent works is dependent on the model architecture very little configuration work built for Claude will say work with Quen. But outside of foundation models tool use is quite limited for comodity hardware. Having taken a stab this weekend across a number of different models I can confirm that models focused on a task perform better than generalized foundation models.</p>
<p>A great example of this is a comparison of MiniMax and Qwen2.5 Coder vs Claude code. The tools are so completely similar that it really raised the differences between the models. One of the things that Claude Code has going for it is the user experience, its quite tight. But it also leads to some Apple like resistence. On the other hand Open code as a tool did all the same things sans agent generation skills but being able to switch between models was critical. I would use MiniMax2.5 for coding in one terminal and then Qwen or something smaller on a local machine. It was totally reasonable to have a cloud model doing the heavy lifting and a local model doing code reviews or writing comments.</p>
</div>
<div class="devlog-entry">
<h2 id="28-01-2026">28 01 2026 <a class="anchor" href="#28-01-2026">🔗</a>
</h2>
<p>I gotta admit there is one thing about using AI coding tools that continues to be true no matter how much I try and constrain the model's failures I generally get similar results. If I don't know exactly what I want it to do and can provide a complex enough context the results will be that of an "Eager Intern" meaning I will get results that I didn't expect and when there were obvious places where the model should have stopped and asked questions it failed. I suspect that the model architecture was trained to focus more on task completion than task accuracy. I have a few times been able to get various agents to "give up" and tell me to try again. Of those Junie definitely does this and doesn't waste my time. Claude-Code though is too appeasing, it closes tasks without verification even when prompted to verify their work. Even with orchestration of multiple agents with fresh contexts, asking to build an app that isn't a todo-list will fail. This benefits the sale of coding tools, during evaluation it impresses with the ability to construct simple things but falls over when complex solutions are required. When I say complex I mean those that are generally novel or require doing interactions over APIs. It commonly produces boilerplate which I think is by design to influence the numbers for LoC for code generation stats. But insidiously, it is also there to obscure the solution it introduces.</p>
<p>A clear sign of AI code generation is bloat and intentional omissions. As of yet the only way I have found to avoid this omissions is to have the model show its work and put it in the clear view of me. So I can set it on a task and watch its completion, then ask it to review the goals and try again. This clearly sucks and I can introduce tools to guide it away from the problem but that's just a bad tool not something that is going to change the nature of my job. It is on the other hand an insult to my 20 year career and all the juniors that are unable to get a job because there is an assumption that if we just "trust me bro" enough it will work.</p>
</div>
<div class="devlog-entry">
<h2 id="27-01-2026">27 01 2026 <a class="anchor" href="#27-01-2026">🔗</a>
</h2>
<blockquote>
<p>If you work for a company that laid off all your juniors in the past year, it is unbelievably poor taste to continue posting about the merits of AI and vibe coding on a platform where the majority of folks are currently looking for full-time work and do not want to be beaten to death with constant AI thinkpieces. Where did human-centered go in 2026? Because all I've seen so far from C-suite leaders and middle managers is forgetting how they got to where they are now. - Jen Udan - <a href="https://www.linkedin.com/posts/activity-7416144126259200000-3bt5?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAIQ9iQBdQxO0rU7SDH3FYCQeNKWu3Zrg_A">REF</a></p>
</blockquote>
<p>I have been thinking about it like this... consider some big enterprise makes this commitment, they have to get some financial approval for the act and may have committed to some outputs. Now let's say that AI is golf clubs and we just gave everyone a real nice set followed by, be good at golf by the end of the month. All this hype is just from people who own sporting goods stores. The latest debacle about cursor creating a browser without a human in the loop where it didn't compile and humans were in the loop still can land in the post truth world we live in. If my job was being told things are being accomplished and I get access to a todo-list that tells me my tasks are done it's gunna be real hard to not be attracted to such things.</p>
<p>I get to see the outputs of the C-Suite from time to time. The model tries to do the engineering work for me and guided by a visitor it often misses where the rules matter and where the rules can be bent.</p>
<p><img src="../1_3iC6cilfUdvndZUVRELmBA.webp" alt="throughput-over-precision" /></p>
<p>It's this ^ a very enticing concept. What of course is missed is I have to keep watching the bots work and stop them from looping. I guarantee it will get better but if the need for progress is all we care about maybe we should be thinking back to something simpler. People of Process, if we need to get things done we need to cut the red tape not unroll all the red tape into a ball and then wonder why we can't find anything.</p>
</div>
<div class="devlog-entry">
<h2 id="20-01-2026">20 01 2026 <a class="anchor" href="#20-01-2026">🔗</a>
</h2>
<p>This one is more just the fun of working with other engineers and AI. While I will not post the code I was impacted by the size of the rebase it caused and the need for me to rewrite my feature. The code the model wrote only cared about things working. It built 200 line blocks of deeply nested conditional logic into existing functions adding catch clauses for exceptions that mean another service has failed and should not be caught. The telling part is when we reviewed the code with the developer he was unable to explain the why these things existed. It's a noob mistake but it's one that AI tends to promote. The endless "Trust Me Bro," and instead this wasted 6 hours of developer time and 3 days in a feature rewrite.</p>
<p>I know there is a mentality that encapsulation adds to cognitive overhead in humans but it exists because 5 levels of if statements is higher. But what happens when the same code was reviewed by the same model that produced it. The code seems to make sense and without the context of the architecture aka we just focused the changes on a single file we end up with some real new debt.</p>
</div>
The Magic of Stubbing shThu, 09 Oct 2025 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/the-magic-of-stubbing-sh/
https://developmeh.com/i-made-a-thing/the-magic-of-stubbing-sh/<h2 id="the-magic-of-stubbing-sh">The Magic of Stubbing sh <a class="anchor" href="#the-magic-of-stubbing-sh">🔗</a>
</h2>
<p>I really love sh and bash but I often feel alone and I get some regular negativity when I solve a problem with it. I know why too, shell scripts can have a broad level of complexity that has other languages embedded into it. But its not as esoteric as you might think, more another domain we should be comfortable with. One of the ways I learned to deal with unknown domains was to read the tests. Because tests tend to use some common language they are often more literate. Here's the thing, I keep getting people tell me that shell scripts don't have tests, and they are wrong. See I have this trick, its called BATS and I talked about it over here <a href="/tech-dives/test-anything-means-testing-bash">Test Anything Protocol</a> where I showed an example of stubbing <code>helm</code> but that example was not the whole story. Since the BATS framework is itself bash we have all those nasty tools at our disposal to manipulate our subject under test.</p>
<h2 id="subject-under-test">Subject Under Test <a class="anchor" href="#subject-under-test">🔗</a>
</h2>
<p>Boring as it may be the purpose here is to observe and verify the output and side-effects of commands run by the shell. We need to respect this boundary between our scripts and the tests for those scripts. One of the challenges to this is how commands avoid observation like <code>rm</code> <code>mktemp</code>, if my script creates a tempfile and then removes it it’s hard to verify if that step occurred without modifying the subject. Of course we can write traces to <code>&>2</code> using echo but that proves nothing more than the presence of the echo statement. I need to verify the validity of these intermediate steps. In traditional programming languages we have mocks and spies which capture the fundamental flow of the code by interfering with the call sites and through reflection. We can do something similar.</p>
<h2 id="mocking-or-stubbing-whatever">Mocking or Stubbing... Whatever <a class="anchor" href="#mocking-or-stubbing-whatever">🔗</a>
</h2>
<p>Now there are BATS mocking libraries and they are a wondrous cornucopia of features but in my experience they don't expose much more than a new way of describing, a DSL, how to intercept and modify interactions. So go learn and use those, but for many normal use cases I wanna show you how to do this by hand and use the existing shell language you already know. In the following example we are going to observe tempfiles so we can keep track of an intermediate state, while exposing debugging information when doing TDD, more on that down the line though.</p>
<h3 id="example">Example <a class="anchor" href="#example">🔗</a>
</h3>
<p><strong>temp.sh</strong> Subject Under Test</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;">#!/bin/bash -e
</span><span>
</span><span style="color:#db784d;">local </span><span>workspace</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(mktemp -d)
</span><span>
</span><span>touch </span><span style="color:#f8bb39;">"$workspace/not_temp.sh"
</span><span>
</span><span style="color:#db784d;">local </span><span>first</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(mktemp)
</span><span style="color:#db784d;">local </span><span>second</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">$(mktemp)
</span><span>
</span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"WOW" </span><span style="color:#d65940;">> </span><span>$second
</span><span>
</span><span>rm $first
</span><span>rm $second
</span></code></pre>
<p><strong>temp.sh.bats</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;">#!/usr/bin/env bats
</span><span>
</span><span style="color:#95cc5e;">set </span><span>+x
</span><span>
</span><span>bats_require_minimum_version 1.5.0
</span><span>
</span><span style="color:#3c4e2d;"># Load Bats libraries
</span><span>load ../../.test/bats/bats-support/load
</span><span>load ../../.test/bats/bats-assert/load
</span><span>
</span><span style="color:#3c4e2d;"># Stub rm to capture files deleted
</span><span style="color:#95cc5e;">function </span><span style="color:#60a365;">rm</span><span>() {
</span><span> </span><span style="color:#d65940;">for</span><span> arg </span><span style="color:#d65940;">in </span><span style="color:#f8bb39;">"$@"</span><span style="color:#d65940;">; do
</span><span> </span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[[ </span><span style="color:#f8bb39;">"$arg" </span><span style="color:#d65940;">!=</span><span> -</span><span style="color:#d65940;">* </span><span style="color:#95cc5e;">]]</span><span style="color:#d65940;">; then
</span><span> cp </span><span style="color:#f8bb39;">"$arg" "${TEST_DIRECTORY_RUNNING}/tmp/$(basename "$arg").captured" </span><span style="color:#d65940;">|| return</span><span> 0
</span><span> </span><span style="color:#d65940;">fi
</span><span> </span><span style="color:#d65940;">done
</span><span> </span><span style="color:#95cc5e;">command</span><span> rm </span><span style="color:#f8bb39;">"$@"
</span><span>}
</span><span>
</span><span style="color:#3c4e2d;"># Stub mktemp to track temp files for cleanup
</span><span style="color:#95cc5e;">function </span><span style="color:#60a365;">mktemp</span><span>() {
</span><span> </span><span style="color:#db784d;">local </span><span>tmp
</span><span> </span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[[ </span><span style="color:#f8bb39;">"$1" </span><span style="color:#d65940;">== </span><span style="color:#f8bb39;">"-d" </span><span style="color:#95cc5e;">]]</span><span style="color:#d65940;">; then
</span><span> tmp</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}"
</span><span> </span><span style="color:#d65940;">else
</span><span> </span><span style="color:#95cc5e;">read </span><span>-r counter </span><span style="color:#d65940;">< </span><span>$TEMPS_COUNTER
</span><span> ((counter</span><span style="color:#d65940;">++</span><span>))
</span><span> </span><span style="color:#95cc5e;">echo </span><span>$((counter)) </span><span style="color:#d65940;">> </span><span>$TEMPS_COUNTER
</span><span> tmp</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}/tmp/bats.${counter}"
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$tmp" </span><span style="color:#d65940;">>> </span><span>$TEMPS
</span><span> </span><span style="color:#d65940;">fi
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$tmp"
</span><span>}
</span><span>
</span><span style="color:#60a365;">setup</span><span>() {
</span><span> </span><span style="color:#db784d;">export </span><span>TEST_DIRECTORY</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"./.tests/res"
</span><span> </span><span style="color:#db784d;">export </span><span>TEST_DIRECTORY_RUNNING</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"./.tests/res_tmp"
</span><span> </span><span style="color:#db784d;">export </span><span>TEMPS_COUNTER</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">${TEST_DIRECTORY_RUNNING}/tmp/.counter
</span><span> </span><span style="color:#db784d;">export </span><span>TEMPS</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">${TEST_DIRECTORY_RUNNING}/tmp/.temps
</span><span> cp -r </span><span style="color:#f8bb39;">"${TEST_DIRECTORY}/." "${TEST_DIRECTORY_RUNNING}/"
</span><span> mkdir -p </span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}/tmp"
</span><span> </span><span style="color:#db784d;">export </span><span>-f mktemp
</span><span> </span><span style="color:#db784d;">export </span><span>-f rm
</span><span>
</span><span> touch $TEMPS_COUNTER
</span><span> touch $TEMPS
</span><span> </span><span style="color:#95cc5e;">echo</span><span> 0 </span><span style="color:#d65940;">> </span><span>$TEMPS_COUNTER
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown</span><span>() {
</span><span> </span><span style="color:#d65940;">for</span><span> tmp </span><span style="color:#d65940;">in </span><span style="color:#f8bb39;">"${temps[@]}"</span><span style="color:#d65940;">; do
</span><span> </span><span style="color:#95cc5e;">command</span><span> rm -f </span><span style="color:#f8bb39;">"$tmp"
</span><span> </span><span style="color:#d65940;">done
</span><span>
</span><span> </span><span style="color:#95cc5e;">unset </span><span>-f mktemp
</span><span> </span><span style="color:#95cc5e;">unset </span><span>-f rm
</span><span>
</span><span> </span><span style="color:#95cc5e;">command</span><span> rm -f </span><span style="color:#f8bb39;">"$TEMPS_COUNTER"
</span><span> </span><span style="color:#95cc5e;">command</span><span> rm -f </span><span style="color:#f8bb39;">"$TEMPS"
</span><span>
</span><span> </span><span style="color:#95cc5e;">unset</span><span> TEST_DIRECTORY
</span><span> </span><span style="color:#95cc5e;">unset</span><span> TEST_DIRECTORY_RUNNING
</span><span> </span><span style="color:#95cc5e;">unset</span><span> TEMPS_COUNTER
</span><span> </span><span style="color:#95cc5e;">unset</span><span> TEMPS
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">'test intermediate files' </span><span>{
</span><span> local second_tempfile_expected=</span><span style="color:#f8bb39;">"WOW"
</span><span> run bash ./.tests/temp.sh
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># note the captured
</span><span> local second_tempfile_actual=</span><span style="color:#f8bb39;">"$(cat ${TEST_DIRECTORY_RUNNING}/tmp/bats.2.captured)"
</span><span> assert_success
</span><span>
</span><span> assert_equal $(cat </span><span style="color:#f8bb39;">"$TEMPS_COUNTER"</span><span>) 2
</span><span> assert_equal </span><span style="color:#f8bb39;">"$(</span><span style="color:#95cc5e;">[ </span><span style="color:#f8bb39;">-f $TEST_DIRECTORY_RUNNING/not_temp.sh </span><span style="color:#95cc5e;">] </span><span style="color:#d65940;">&& </span><span style="color:#95cc5e;">echo</span><span style="color:#f8bb39;"> 0 </span><span style="color:#d65940;">|| </span><span style="color:#95cc5e;">echo</span><span style="color:#f8bb39;"> 1)"</span><span> 0
</span><span> assert_equal $second_tempfile_actual </span><span style="color:#3c4e2d;">#second_tempfile_expected
</span><span> assert_output --regexp </span><span style="color:#f8bb39;">'Done'
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># _Note_ The use of `command` which bypasses our function export of `rm` introduced by `export -f rm` this makes sure we use the original command and not our mock.
</span><span> command rm -rf </span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}"
</span><span>}
</span></code></pre>
<p>Lets explore the mocking... ignoring the directory paths we intercept calls to mktemp and if the commands first argument is <code>-d</code> for directory we inject a static location we control. Otherwise we create a unique file in that directory. When we do this we capture the temp file and the number created so far so we can verify the interfaction later. Both these files can be observed during execution.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># Stub mktemp to track temp files for cleanup
</span><span style="color:#95cc5e;">function </span><span style="color:#60a365;">mktemp</span><span>() {
</span><span> </span><span style="color:#db784d;">local </span><span>tmp
</span><span> </span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[[ </span><span style="color:#f8bb39;">"$1" </span><span style="color:#d65940;">== </span><span style="color:#f8bb39;">"-d" </span><span style="color:#95cc5e;">]]</span><span style="color:#d65940;">; then
</span><span> tmp</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}"
</span><span> </span><span style="color:#d65940;">else
</span><span> </span><span style="color:#95cc5e;">read </span><span>-r counter </span><span style="color:#d65940;">< </span><span>$TEMPS_COUNTER
</span><span> ((counter</span><span style="color:#d65940;">++</span><span>))
</span><span> </span><span style="color:#95cc5e;">echo </span><span>$((counter)) </span><span style="color:#d65940;">> </span><span>$TEMPS_COUNTER
</span><span> tmp</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}/tmp/bats.${counter}"
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$tmp" </span><span style="color:#d65940;">>> </span><span>$TEMPS
</span><span> </span><span style="color:#d65940;">fi
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$tmp"
</span><span>}
</span></code></pre>
<p>When we write clean scripts we also clean up after ourselves this good behavior provides a challenge to checking the contents of these intermediate files. Because shell scripts are file system based the most common way for data to make its way between processes is to write and read from the filesystem. But if we are tracing a bug in our code we have to regularly interfere with out subject under test to observe its intermediate steps. But if we capture the <code>rm</code> command we can conditionally retain some of the progress. In this example we capture all the args and if one includes a path we extract the filename, append <code>.captured</code> and copy it to our running directory. Ultimately, even if we don't stub mktemp we can still capture deleted tempfiles this way.</p>
<p><em>Note</em> The use of <code>command</code> which bypasses our function export of <code>rm</code> introduced by <code>export -f rm</code> makes sure we use the original command and not our mock.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;"># Stub rm to capture files deleted
</span><span style="color:#95cc5e;">function </span><span style="color:#60a365;">rm</span><span>() {
</span><span> </span><span style="color:#d65940;">for</span><span> arg </span><span style="color:#d65940;">in </span><span style="color:#f8bb39;">"$@"</span><span style="color:#d65940;">; do
</span><span> </span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[[ </span><span style="color:#f8bb39;">"$arg" </span><span style="color:#d65940;">!=</span><span> -</span><span style="color:#d65940;">* </span><span style="color:#95cc5e;">]]</span><span style="color:#d65940;">; then
</span><span> cp </span><span style="color:#f8bb39;">"$arg" "${TEST_DIRECTORY_RUNNING}/tmp/$(basename "$arg").captured" </span><span style="color:#d65940;">|| return</span><span> 0
</span><span> </span><span style="color:#d65940;">fi
</span><span> </span><span style="color:#d65940;">done
</span><span> </span><span style="color:#95cc5e;">command</span><span> rm </span><span style="color:#f8bb39;">"$@"
</span><span>}
</span></code></pre>
<p>Now lets review the test, first we can do traditional expectation with the assert module following the standard, Given, When, Then structure we love. Let's look at how the When is structured too, because this is bash whichever assertion fails the program will exit there. So note the last line where we clean up the temp directory for the test. By leaving this as the last statement we keep the test artifacts if the test fails. Which enables better TDD, where we write a test that fails and continue to iterate until that test passes, meanwhile the test is also producing trace and debugging information about our work. We can do this with any command though, say we call <code>git diff</code> and we want to verify what we produced. We can intercept any command and have it write a file to our test workspace. Importantly, while not changing the subject under test.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>@test </span><span style="color:#f8bb39;">'test intermediate files' </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># Given
</span><span> local second_tempfile_expected=</span><span style="color:#f8bb39;">"WOW"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># When
</span><span> run bash ./.tests/temp.sh
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># Then
</span><span> local second_tempfile_actual=</span><span style="color:#f8bb39;">"$(cat ${TEST_DIRECTORY_RUNNING}/tmp/bats.2.captured)"
</span><span> assert_success
</span><span>
</span><span> assert_equal $(cat </span><span style="color:#f8bb39;">"$TEMPS_COUNTER"</span><span>) 2
</span><span> assert_equal </span><span style="color:#f8bb39;">"$(</span><span style="color:#95cc5e;">[ </span><span style="color:#f8bb39;">-f $TEST_DIRECTORY_RUNNING/not_temp.sh </span><span style="color:#95cc5e;">] </span><span style="color:#d65940;">&& </span><span style="color:#95cc5e;">echo</span><span style="color:#f8bb39;"> 0 </span><span style="color:#d65940;">|| </span><span style="color:#95cc5e;">echo</span><span style="color:#f8bb39;"> 1)"</span><span> 0
</span><span> assert_equal $second_tempfile_actual </span><span style="color:#3c4e2d;">#second_tempfile_expected
</span><span> assert_output --regexp </span><span style="color:#f8bb39;">'Done'
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># _Note_ The use of `command` which bypasses our function export of `rm` introduced by `export -f rm` this makes sure we use the original command and not our mock.
</span><span> command rm -rf </span><span style="color:#f8bb39;">"${TEST_DIRECTORY_RUNNING}"
</span><span>}
</span></code></pre>
<h2 id="just-test-things-and-be-happy">Just Test Things and Be Happy <a class="anchor" href="#just-test-things-and-be-happy">🔗</a>
</h2>
<p>This is just one dumb example of how to think about your testing and how to build up useful tooling that caters to your work. Now go write some bash and make sure you test it, trust me orchestrating a call to <code>git</code> is 10 times easier than screwing around with some git integration for your language of choice. These tools were meant to work together in the shell and you will be happier just getting things done. Double happy when you can prove it works with a test.</p>
<h2 id="errata">Errata <a class="anchor" href="#errata">🔗</a>
</h2>
<h3 id="sh-is-not-bash-and-vice-versa">sh is not bash and vice versa <a class="anchor" href="#sh-is-not-bash-and-vice-versa">🔗</a>
</h3>
<p>While not functionally errors, the title of this work should be focused on bash. Since a lot of the sample code are bash-isms especially <em>exported functions</em>.</p>
<h3 id="the-sh-alias-and-ci">the sh alias and CI <a class="anchor" href="#the-sh-alias-and-ci">🔗</a>
</h3>
<blockquote>
<p>run sh ./.tests/temp.sh</p>
</blockquote>
<p><code>sh</code> is often an alias on modern systems and this can have a huge impact when you scripts run in CI or more namely a non-interactive or non-login session. Where you CI might offer an Ubuntu or Alpine Linux image that provides <code>bash</code> as an alias for <code>sh</code> it may use a lighter weight implementation like <code>dash</code> when running your tests. Because we are using features that are explicitly bash we should have our test suite <code>run bash ./.tests/temp.sh</code> as such I have altered the above example accordingly.</p>
Copying LifeSun, 13 Jul 2025 00:00:00 +0000[email protected]
https://developmeh.com/devex/copying-life/
https://developmeh.com/devex/copying-life/<h2 id="copying-life">Copying life <a class="anchor" href="#copying-life">🔗</a>
</h2>
<p>With a stated unawareness of times prior to my own experiences, but with my present perspective that while times change human nature is very repeatable.
At some inflection point more concern was given to the consideration of others than the self. While this is possibly quite a natural process of aging,
it is also possible it was influence. That pressure that you have to grow into a specific thing, in business it feels like all final destinations are
management. In life maybe it's to drive a lambo or have a luxury lifestyle, full of travel and expensive foods. Logically, you will see a destination
and experience the pressure to drive towards it if the world around you is flowing the same way. This could be a local community influence, family,
friends, or that of other media.</p>
<p>Originality of thought and experimentation is how we build great things, even if we drop the lofty language and ignore the other side of this see-saw
which leads to contrarianism. Copying on the other hand may express a deeper lack of control of ones environment. How does one find something new
without trying? By keeping up with the Joneses, homogeny has always been something that has been forced on us in a surprising way. We laude the
successful eccentric but criticise the awkward. To be clear this definition of awkward is only those who seem to be existing tangential to the norm.
Forgoing that nuance of traditional human hypocrisy there is a clear place where unusual is preferable.</p>
<p>From the perspective of a simpler system that is software, a lot of time is spent defining the nature of "social" interactions with code, ahead of time.
Within each of us is a 3PO that helps us determine how to deal with other people. Oppositionly, the interface is created in realtime and only sometimes
repeated. I would allow the extrapolation that software has it easier and humans take shortcuts by copying.</p>
<p>As a child we learn through mocking and then progress through a rebellous phase to generate self identity. A curious process that flip-flops from very little
identity to an over-abundance. Life then gets complicated a little later and the rally point is to reduce complexity through normalizing with our community.
After you have seen your 10th beige to tope patio home complex a new concern begins, what is everyone doing all day? Next comes the hard question which
might be the trigger for a natural midlife crisis. Similar to that previous transition from mocking to rebellion does this cycle repeat? There is a middle
phase of development where rebellion is once again preferred to recertify our independence from a system that demands considerable conformity. Although,
much of this is self-imposed, the systems around corporate offices and family structures leverage the need for our core needs to be leveraged against our
higher emotional needs.</p>
<p>Consider diversity in the office, a term that rightly has been co-opted in support of underrepresented groups also includes the rest of us. Its only when we
lift together do we all get to share in something better. Those that often reach the higher tiers did so with a considerable amount of conformity. That could be
meeting goals or promoting company ideals but some of us who excel in these places have a personality that allows for those tradeoffs. This isn't saying that
those people were or were not in conflict during that process but, repetition does become habitual, and we find ourselves repeating the dogma after enough time.
You may find yourself having silenced that independent voice that represented your individual spirit. Conformity is like a disease that damages your creative
tissues. The way to measure your interactions with others should be aligned ith the Big 5, of those; Openness to experience, Conscientiousness, and Agreeableness
is how we should measure our success.</p>
<p>Someone can always work harder than you, someone can always eat your lunch, and you may lose your business or job, but your success will be dictated by how you
deal with others. This is its own kind of rebellion against capitalism's identity of humans are exploitable for money. While not a direct avocation of Marxism,
there is something about servant leadership that resonates with it. Although, this may be more about a reduction in obsession with monetary assets and greed
compared to our compassion for our fellow man.</p>
<p>Unfortunately, when we are influenced in non-conversational experiences we also lose some of our agency though reinforcement that should be met with rebellion.
Your days should be spent thinking and expressing, a willingness to learn while trying not to judge others. If you start your days with your own thoughts
you build a barrier against accepting the wrote of others that will invariably be pushed upon you. You should also identify who you will allow to influence you
using the same measures you will use for others. You should refuse those who do not meet them, in a world where dialogs are rarer you must save your time,
control your inputs and expand your outputs.</p>
Sufficient ComplexityTue, 01 Jul 2025 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/sufficient-complexity/
https://developmeh.com/soft-wares/sufficient-complexity/<h2 id="sufficient-complexity-and-pipe-herding">Sufficient Complexity and Pipe Herding <a class="anchor" href="#sufficient-complexity-and-pipe-herding">🔗</a>
</h2>
<p>I still think a lot of what I do day to day in software aligns well with plumbing. DevOps is like warehouse work, and Product is herding. Before the last batch of product engineers, pipe herders, I recall much conversing about things like pragmatism and simplicity. When you think about the pipes in your home, you hope they aren't too complicated. Complication in pipes leads to pinholes, leaks, and sudden noises from the dark places we dare not look. Plastered over and expected to last the long haul, we forget exactly where each one is until we wanna make an addition, drill a hole, or something goes terribly wrong. Software is often like this too, long forgotten, and sometimes completely unused sections live in the shadows. I really thought this was rather boring in my younger years and creates an interesting condition in the enterprise around scope. Too much scope, and we get exactly what we think we want. Too little scope and we end up with too many pipes. I still believe that much of this was dogmatic rhetoric. A book makes the rounds and is praised, others consume it and take it as cannon, producing an effect that is the same as not knowing, over-knowing. Now, with much of that kind of nonsense falling out of style, we wonder why we can't keep our software from crashing. To be crystal I am not saying that due to a failure to care about craft is the reason this is happening. Craft, for better or worse, is its one kind of over-knowing, consider if through rigorous focus and process one can eliminate mistakes or bad design. I argue it's just a matter of speed, things that go fast tend to have lower survivability. A generalization for sure but its better to relate this to change management over velocity in the mind.</p>
<p>So Sufficient Complexity is the mark where we can say something is done, not to be confused with finished. This feels impossible in the land of building products on the web, but I promise it is still achievable. It also doesn't matter if you are working with a monolith or a microservice, but it is about dependencies at the core. Step 1 is to eliminate the word <em>common</em> from your vernacular, followed by <em>shared</em>. While they may look safe, these are traps, hear to eat your time and sanity. Just like we can have a perfect project layout like <a href="https://developmeh.com/soft-wares/sufficient-complexity/devex/the-perfect-dev-env/">here</a>, we can have just the right size of features in a box.</p>
<p>Now here is the most important lesson you will learn, <strong>Everything is a File or Folder</strong> depending on your observable distance. This counts for how you organize your project in version control all the way to how you deploy your application containers. Folders and files are what matter and the relationships they play to each other. I feel like the oft overlooked power of all of this is the <em>interface</em> or the <em>interaction pattern</em>. It gives us a fixed view of how something is to be consumed or constructed and provides the most meaning in relation to producing moderately stable software. Consider the following I have a folder full of functions on the left and a folder of consumers on the right, between them, I organize those functions into groups called interfaces. Once two interfaces share the same function, I have created a problem, one that is sometimes unsolvable but often avoidable.</p>
<span style="" class="mermaid">
graph LR
subgraph "Functions"
F1[Function 1]
F2[Function 2]
F3[Function 3]
F4[Function 4]
F5[Function 5]
end
subgraph "Interfaces"
I1[Interface A]
I2[Interface B]
end
subgraph "Consumers"
C1[Consumer X]
C2[Consumer Y]
end
%% Good pattern - clean separation
F1 --> I1
F2 --> I1
F3 --> I2
F4 --> I2
F5 --> I2
I1 --> C1
I2 --> C2
%% Problem case - shared function
F3 -.-> I1
style F3 fill:#f96,stroke:#333
style I1 stroke:#f00,stroke-width:2px
style I2 stroke:#f00,stroke-width:2px
</span>
<blockquote>
<p>The diagram above illustrates the concept. On the left, we have a collection of functions (Function 1-5). In the middle, we have interfaces (A and B) that group these functions. On the right, we have consumers (X and Y) that use these interfaces.</p>
<p>The problem occurs when Function 3 is shared between Interface A and Interface B (shown by the dotted line). This creates coupling between the interfaces and can lead to issues when one interface needs to change but can't without affecting the other interface. This is why interface segregation is important - each interface should have a single, focused purpose with its own dedicated functions.</p>
</blockquote>
<p>That's just if we share the same functions across two interfaces. Imagine how the rest of the internet works? So here is where your bugs are coming from most likely. To be clear, there is nothing you can do to avoid it, so lets get the doom and gloom out of the way. Code like pipes work the best when singularly focused. For example a pipe that feeds other pipes isn't a faucet line but a feed line. Maybe it started its life out going from the source to your bathroom faucet, and at some later point you installed a shower. At that point you created an interface, a physical one, and the nature of the pipe changed. At first it interfaced with the faucet and then that line fed an interface which interfaced with the new pipes, those interfaced with the shower and faucet individually. If we were then going to install a washing machine, (a beautiful European concept) in our bathroom, we might realize that the feed line in place doesn't meet the volume our washing machine needs. We will have to run a separate line for our washing machine.</p>
<p>We don't usually make the same decision with software, though, bits are very malleable, and our pipes are scalable with an injection of cash. I like to think of how to deal with coupling the same way I deal with pipes. If my needs cannot be met at the current interface, it's time for a new line maybe that's a new module or a new microservice and it might even copy some of the code from the existing pipe but it doesn't make a dependency on it. Long term we want to create solid permanent things. That are resistant to external change unless acted upon. I know this sounds like heresy and a lot of work but I promise its worth it. You will not end up with a bunch of duplicate code that matters. The things you copy will be boilerplate specific to the cause. The parts you don't copy are the items that you can depend on that don't require you to modify their interface.</p>
<p>Sounds hokey and more of a cry for "hey this way stinks, go do it this other way cause it's different." It's not a new concept, though, because this is the principle of module boundaries. I usually explain this to my team as not <em>Goals</em> and <em>Non Goals</em> but <em>Spiritual Goals</em> if I wanted to draw a circle around a unit if code such that it produced no more and no less than it needed to and met the <em>spirit</em> of its purpose that is what we build. Its not as hand wavy as it sounds, but it does require understanding the scope of the work completely which I'll admit is not something everyone can always do. Arguably its this need to navigate ambiguity which leads more to poor design than inexperience. But I like this more formal term of <em>Sufficient Complexity</em> to make <em>Spiritual Goals</em> less techo hippie. Allow us to continue, a module is sufficiently complex when it provides a new complete boundary of its context. Your ears might be itching because this sounds a lot like Domain Driven Design(DDD), and you would probably be right. But DDD is interested in pathways through a system and is a very top down kind of concept. I, on the overhand am proposing a bottoms up approach. Something that I might slot into Agile or XP where we don't know all the scope before we start, and that's both normal and ok. But as we discover complexity, we promote context boundaries instead of the shortest path to completion.</p>
<h4 id="examples">Examples <a class="anchor" href="#examples">🔗</a>
</h4>
<p>Let's explore a couple of simple examples, first the webapp-common lib and then the universal modal dialog.</p>
<p><strong>Webapp-common</strong>
In <strong>webapp-common</strong> as the title describes, we are going to configure a number of tools and dependencies that all our webapps share inside a single module. The first question we should ask, after we stop screaming because we successfully forgot the word <em>common</em> from earlier, is does this module describe a clear boundary for behavior? <strong>No</strong> not really. If you said yes, that's ok, you may even think the boundary is web apps. Still not wrong but not great either, because this exceeds <strong>Sufficient Complexity</strong> how can I imagine this module every being done. Since we will have many web apps and they will have all kinds of responsibilities its likely not every web app will need all the functionality of the webapp-common. This introduces a risk of being a dumping ground of interdependent libraries that over time, event versioned, will slowly start to poison each other. Because these libraries are also commonly shared, this pollution will touch everything.</p>
<p>Whats the solution? Well its always about informing on the pattern through an interface. If this is a Java Springboot project, we would want to introduce bean configurations optional transitive dependencies. Check out this sample project <a href="https://github.com/developmeh/java-no-more-common-lib">java-no-more-common-lib</a></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>spring-gradle-example/
</span><span>├── build.gradle </span><span style="color:#3c4e2d;"># Root project build file with common configurations
</span><span>├── settings.gradle </span><span style="color:#3c4e2d;"># Project settings file
</span><span>├── jackson-module/ </span><span style="color:#3c4e2d;"># Jackson module with baseline dependencies and configurations
</span><span>│ ├── build.gradle </span><span style="color:#3c4e2d;"># Jackson module build file with Jackson dependencies
</span><span>│ └── src/
</span><span>│ └── main/
</span><span>│ ├── java/
</span><span>│ │ └── com/example/jackson/config/
</span><span>│ │ └── JacksonConfig.java </span><span style="color:#3c4e2d;"># Auto-configured Jackson configuration
</span><span>│ └── resources/
</span><span>│ └── META-INF/
</span><span>│ └── spring.factories </span><span style="color:#3c4e2d;"># Auto-configuration registration
</span><span>└── service-module/ </span><span style="color:#3c4e2d;"># Service module that uses the jackson module
</span><span> ├── build.gradle </span><span style="color:#3c4e2d;"># Service build file that overrides Jackson versions
</span><span> └── src/
</span><span> └── main/
</span><span> └── java/
</span><span> └── com/example/service/
</span><span> └── ServiceApplication.java </span><span style="color:#3c4e2d;"># Spring Boot application
</span><span>
</span></code></pre>
<blockquote>
<p>Here the solution is to avoid common and instead create building blocks, this works with maven or gradle but gradle is a little clearer. The only thing that <strong>jackson-module</strong> exposes is a specific configuration for jackson and provides a baseline for the jackson version. For our implementation to be sufficient we can use that baseline or in this case override it with the version we want. Given the version we choose can meet the bean configuration this provides we can apply this as an interface to our web-app. I picked jackson here because they are pretty bad at SEMVER, and I often have features that work in one minor version and not in another due to poor planning or deprecation. The problem is my need to jackson is intermingled with a whole common library usually. I can, of course, override and qualify a new bean as primary but I would rather have a choice if I should include it first. So instead of having spring.factories load this bean, I can @Import it in my application. This way I can control what my application consumes from its library. While that would be foolish for such a simple dependency. There is a real case where jackson and a few other libraries are joined together in a serde (serialization/deserialization) module. Which has a slightly broader context but it supplies common configurations for our serde.</p>
</blockquote>
<p>Regardless, the point here is we often see the <em>simplicity</em> of a common library that sets up all our dependencies as a big win when we start a group of projects. It quickly becomes something of a pain point when too many live together without a spiritual goal or shared contextual binding to make it meaningful to include it. Also, setting up a new project is not where the time in development is lost. So making that to focus on speed is a false hope. We always want to make maintenance of a code base the easiest solution. But moreso we want to create a space where we don't have to touch a codebase for a long time because its <strong>done</strong> and thus <strong>Sufficiently Complex</strong>.</p>
<p><strong>The Universal Modal</strong></p>
<p>So now lets switch to Javascript and React land, we have a fantastic pattern for module size in this ecosystem and we probably don't have some annoying common library mucking up our sanity. But, we also work in the more visual scope and that means less technical people can fail to understand the nature of our work. They kind of see it as "configuring the browser" and less of a formal data flow and user interaction platform. We have been asked to build a model that can act a little like a slide deck. Starting in one context and then on each step asking if there is another context and providing a new interaction on each slide. Honestly, this sounds pretty cool, and I bet many someones out there have tried to make something like this. The first question we should ask, after we stop screaming because we are building power point as a model, is does this module describe a clear boundary for behavior? Once again <strong>No</strong> not really. If you said yes, that's ok, you may even think the boundary is a dynamic modal or an iframe. Still not wrong but not great either, because this exceeds <strong>Sufficient Complexity</strong> how can I know all components and interactions a designer might want to sequence before we know they exist?</p>
<p>When we build general purpose components for the case of reuse we are falling into the trap of creating code that over-knows about its purpose. This is the poster child of the cat with 4 normal lets, and a human ear and arm jammed on there we all saw back in school to describe software in the wild. This code will never be done and will continue to acquire features and conditions until it becomes to complex to work with. It will also be a nightmare to test because while a flow of slides can be static in intention they are in fact dynamic and the synchronization of what a test can verify and what we will present will diverge.</p>
<p>This is very much the counter example of the former. Instead of worrying about known competing dependencies we are building something smarter than we are right now that can anticipate the changes of tomorrow with complex design. The time spent building such a tool will never pay off, yes we can make something new nearly instantly but we can't test it instantly. It will also be a constant source of bugs that will also be complicated to verify.</p>
<p>What do we do instead? Well focus on what is <em>Sufficiently Complex</em>. The next question we should be asking is, do we know what we want to build? <strong>No</strong> doesn't sound like it. Every time the idea of building the solve everything module comes up just accept that it would be better to know what needs doing now. How we can codify a process that makes repetition require less discovery for the next person and build exactly what we need. I promise when you need to come back and adjust slide 3 of flow 1 you will be much happier that you build layout components so you have uniform styling and that just because you are changing the info link on slide 3 it doesn't automatically bump around all the info links on the other flows. Like all good poetry we need to start with a rhyme. Software is hard to rhyme at first and poets spend a lot of time with words before becoming poets. So an expert can create a successful general system but there is also a lot of bad poetry out there. We build similar components and keep an ear out for the rhymes. Each pass we make we reverberate those sounds until we have something that repeats. Essentially, you don't start with the poem you start with they rhyme and the theme. Which is at its core an interface we make with this kind of component.</p>
Do Devs Really Do DevOps in your Org?Thu, 26 Jun 2025 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/do-devs-really-do-devops/
https://developmeh.com/soft-wares/do-devs-really-do-devops/<h2 id="do-devs-really-do-devops-in-your-org">Do Devs Really Do DevOps in your Org? <a class="anchor" href="#do-devs-really-do-devops-in-your-org">🔗</a>
</h2>
<p>Recently, I learned the more formal definition of shift-right and shift-left in terms of Agile DevOps. For a brief refresher and for brevity it goes a little something like this:</p>
<ul>
<li>Shift Right -> Validation and testing in production</li>
<li>Shift Left -> Validation and testing are before production</li>
</ul>
<p>Now that's kind of the intended definition, and it makes perfect sense. In fact, I would probably go, "hell yea, this is just smart." I naturally subscribe to the XP (Extreme Programming) subset of Agile, and that generally means I just pump out tiny slices rapidly that are often not a complete feature. Think of it like writing a book chapter by chapter and having your editor review it as you go. This kind of process means you will miss some things on the first pass but spend less time on <strong>discovery</strong>. Not advocating just calling out the causal nature of this decision. So there is a lot of refactoring and revelation through the process that this creates.</p>
<p>Generally, shift left proposes some big claims; the stinkiest are the following: <a href="https://www.dynatrace.com/news/blog/what-is-shift-left-and-what-is-shift-right/#the-benefits-of-shift-left">ref</a></p>
<ul>
<li>Reduces Cost</li>
<li>Improves Collaboration</li>
<li>Faster time to market</li>
</ul>
<p>Holy cow, sign me up, I want <em>cheaper</em>, <em>faster</em>, and <em>better</em>. Even if the claim violates the mere existence of the <em>good</em>, <em>fast</em>, and <em>cheap</em> love triangle. We all smell nothing, but poking fun at the top keyword buyer shill article wrapped as a blog post isn't the goal here. I wanna bring this into focus of practicals and current experiences.</p>
<p>Here is my experience with shift left devops in the wild. As a developer, I am giving access to a CI runner that can execute terraform or cloudformation, doesn't matter. I am given some tools that might add some constraints to that process like a terraform wrapper or a set of CI templates. I am then told I can just build whatever I want. Except:</p>
<ul>
<li>I have no way to interact directly with the terraform state.</li>
<li>I can't view resources in the cloud providers console, and I cannot manage IAM roles/policies.</li>
</ul>
<p>What I have been granted is the illusion that I can self-service and a new stack of problems to solve through a fog. Because, I can ask for support, but it will be through a ticketing system, and the resolution will take weeks.</p>
<p>While I am skilled in devops, I would say that 80% of my peers are not, and thinking about it, the condition I have described is simply, go use terraform but never actually run terraform. All operations must be performed through an environmental suit. Let's revisit our targets, unrealizable as they may seem. Have we reduced cost? In some ways, yes, by distributing the workload we have reduced the need for specialized staff, we can argue that most devops are routine and probably cookie cutter, so having a group oversee the orgs work streams is a better spend. Does it improve collaboration, probably not. As I have been the member of now a SecOps team, run a DevOps team, lead a L3 Support team, and spent the rest of my career in mines as a developer. When you create a centralized management team, you have a choice, they can directly collaborate, providing deep value and insight as they touch the people they support, or you can make them the slaves of a ticketing system. Since its rather difficult to account for performance and costs associated with staffing without any figures to back them up, you will end up with a ticketing system, probably. Let's be honest, that's not, "Improves Collabortion," that's, "Sets up a call center." So you put some very talented people place them in the complaints department and say, drink from the firehose, k thx bye. This is getting too dark, let's circle back about cost again, so your devs are reaching out to your devops team for support through your ticketing system, and you can track your MTTR as 3 days. Wow, Kudos, we are doing business, good business actions! But had we honestly shifted left, the developer would have probably solved the issue themself had there been some trust and access. It would have taken a couple of hours and possibly a message in a chat channel for a code review. I get it this is argumentative but its also a generalized understatement. MTTR isn't 3 days, it's 3 weeks, and the dev would have solved it in days, not hours, but it's about ratios. That, of course, puts a finger on the last target of delivery speed.</p>
<p>I kinda see this like AI for devs these days, giving away our ownership feels bad because for many of us it means we might be less special. It might mean that critical thinking and planning are the real skills, not butts in seats. Of course this is about devops, not LLMs and code gen, so what's the next step?</p>
<p>First off, let's make the sharing of responsibility for devops not a chore. I need infrastructure and devops people to do the good work, help me pick the right tools, be experts in the cloud or servers. I see that as the divider, <strong>knowledge not access</strong>. What you want from your developers is to be able to find resources and use tools; sometimes people and sometimes documentation. Then you want them to be able to evaluate those things before they make their way to production. This means taking away the training wheels, devops teams produce products, not interfaces. There was a time when terraform was the new kid on the block that devops teams provided modules to isolate the patterns they want to repeat. I know this is how I did it when I ran my first DevOps team. We didn't kind any of the sausage and we provided support like any <em>Open Source</em> project would. There was documentation and READMEs, along with tools and tutorials. Most importantly, there was access, developers could run terraform locally, create infra using their developer accounts, and submit pull requests for our modules. Better yet, they could read our modules. We did a lot of the first people on the scene to a new concern; once vetted, it was normalized for repeat use. We saw ourselves as the caretakers, with a motto of "yes and." We used a ticketing system but only internally if something was more than just a conversational solution. We took notes and turned those conversations into FAQs. We did a lot of work in chat, we relied on the fact that our company chat was searchable. We keep discussions in public channels as a backup body of knowledge. We trained devs to talk devops, and what we discovered is or devs loved to learn. Plenty of the time we didn't even need to respond to a chat request, since it was probable someone else had already encountered the issue they would speak up.</p>
<p>I know it sounds like, I have been poo-pooing the farcical benefits of shift left, and I surely am. I just want to remind you when we talk about money and timelines in polite company its considered gauche. Although, we can spend 10 minutes of a 30-minute meeting worried about efficiencies with 6 other humans. That's an hour spent bub, by the way. Instead, we <em>focus on the people and the problem</em>, <em>not the problem is the people</em>. All I am asking is for you to consider this famous quote, "The more you tighten your grip, the more star systems will slip through your fingers." There are orgs that get this kinda thing right and it's not just devops. Its always good to identify if you are in one of those orgs and what you can do to change it. If you wanna shift left, shift left. Otherwise, hire some more smart people and adopt waterfall for your projects, it works great, honestly. If you can't trust your devs to mind the cashbox with regard to infrastructure access, it means you haven't done the upfront work to position your DevOps team as a sheppard of their craft.</p>
Creative Impostor SyndromeWed, 25 Jun 2025 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/creative-impostor-syndrome/
https://developmeh.com/soft-wares/creative-impostor-syndrome/<h2 id="another-syndrome">Another Syndrome?! <a class="anchor" href="#another-syndrome">🔗</a>
</h2>
<p>In reality, I am merely saying I had impostor syndrome, but not the workplace kind, albeit its a bit related. Unlike the kind many of us have early in our careers, my relationship with creativity has much more complicated roots. My definition of what creation is has always been mired in a deranged triangle of, <em>Art</em>, <em>Income</em>, and <em>Moat</em>. The outputs of creation should be classifiable as <em>Art</em> while being consumable to make <em>Income</em>, but without reducing my <em>Moat</em> that protects my ideas. How does one create something from inspiration and spend their time mostly worrying about how to keep others from copying it and stealing my profits? When I say it out load it sounds as insane as it is. This fuels my FOSS rebellion; give away whatever I write because in the end a product is the <em>Moat</em> not the software, which is only a vehicle. I am not saying that work can't be creative but its not a source of inspiration, it is a profitable way to expand my skills, like the opposite of academia in a way. I have spent my time in academia as well, paying fiats to its own kind of feudal system contrasting the word games and politics of the enterprise world. In the enterprise world you have to be rather special or lucky to get involved in novel projects; I think I have been rather successful in the latter.</p>
<p>Somewhere the enterprise weaseled its way into OSS, and a lot of it seemed to turn into a freemium where a focus on <em>Income</em> could be associated with free. I see less and less of work being broadcast back from the enterprise to the community, a project now goes from launch to subscription in a single go. If you spend any time on LinkedIn, your feed will be filled with peddlers promoting vaporware and "Stealth Startups." I am not event just talking about AI. I can recall in a previous role we were looking for a static and dynamic security tool for our Ruby monolith. In case you don't have familiarity with the language, it heavily supports meta-programming (code generates itself) and before there was real AST support, it dramatically resisted analysis like this. Of course, CodeQL started peeking out at this time, and that's the solution I organized a deal for overall, if you don't know go check it out. Anyway, the number of vendors that would pull us into a demo call knowing we were using Ruby and knowing their tool provides nothing more than a Brakeman wrapper would still try to sell us a subscription. The demo would never happen, and it was almost like we should buy their product because they existed, or worse, bought Brakeman and bolted on a UI that I didn't need. Brakeman was eventually forked and is a solid project, it was someone's creative endeavor that got lost to "Income."</p>
<p>You should be picking up the conflicts now.</p>
<p><strong>Creativity, by my definition, is giving some reality where only inspiration was present.</strong> It doesn't mean we need to have a new idea, just new to us at the time.</p>
<p>I used to say that in the job we are just plumbers wiring up dependencies so the toilets flush just right. I still think some of that is true, because there was space for invention long ago, the frameworks were toolboxes and not ecosystems. The latter is good for throughput, take my agency away and give me a thing that works, so I can get something done, is a big win. Its why pipes have standard diameters and schedules, so we can add a valve to an existing line and a tap for our new washing machine. I don't want the plumber to have to think too hard, there is an application and the right medium to move that water. Ultimately, what I want them to be good at is preparing the pipe, brazing/soldering, and fitting selection. Is there artful plumbing, yep, but it's in the detail, <em>Art</em> tends to compete with <em>Function</em>? In the software world this is why we despise cleverness, we are focused on the function, and inventing our own pipe is a challenging endeavor.</p>
<p>So, taking that into account, creativity might seem a little out of scope for my field. So I want to set down some rules for the push-pull. If you are creating for <em>income</em>, your creativity isn't focused on the software its channeled to the human product, Kudos humanitarian. I don't wanna really build a product that feels like it would lead me to talking to too many people. If you are creating for beauty, your creativity isn't focused on the software; its focus is syntax and semantics, Kudos you poet. If you are creating for fame, your creativity isn't focused on the software, it's focused on your brand, Kudos teacher. If you are creating for experience, your creativity isn't focused on the software, it's channeling your need to control your world, Kudos explorer. I have yet to find the answer that results in a focus on the software, but I think it might be somewhere intersecting, simulation and game design. That's not the message, though; what I concluded is just because I don't much find passion in building products or fame doesn't mean I am not creative.</p>
<p>I don't need to call myself an artist to feel whole, but I do want to feel I could joke about it from time to time. I am driven to feel <em>Authentic in my Art</em>, which is probably only something I can truly come to terms with. The problem has always been that I can create something from nothing, and I can do that expressively such that I have caused others to feel emotions in a wide spectrum. But so much of that creation intersects with the need for pragmatism, protection, and income that I could not realize it as a creative act. Problem-solving is the brush, and the code is the canvas, but the production is mechanical. The momentum of creating software for the purpose of money has all but destroyed it as an artistic pursuit. The revelation started when I needed something to write about, I don't want to talk about Passkeys or Nix really, I would like to talk to others about them, but the novelty is gone. I just stopped worrying about if what I was building was a good idea, and the world opened up to me, I built pointless beautiful things that would be only useful anecdotally and I was free. I just dub them, <em>crazy ideas</em>. I have a pile of pipes, can I make a snowman? Or, could I make a form send data directly to a server on my phone without a persistent connection? The value is solely the novelty and what I will learn during the journey. That is the creative act I want to align with, and I have been doing that my whole life. I only failed to see the truth because I kept injecting the requirement that others need to love it, or it needs to be a revenue stream. The techno hippie in me firmly believes that software is the closest thing we have to a hidden force that can shape society without being interfered with by politics and money if we just committed to it.</p>
<p>If you want to peruse some of the crazy ideas, they are over here <a href="https://developmeh.com/i-made-a-thing/this-weeks-crazy/#this-week-s-crazy-idea">This Week's Crazy Idea</a></p>
<p>"The immature poet imitates and the mature poet plagiarizes" - T.S. Eliot</p>
This Week's Crazy IdeaSun, 08 Jun 2025 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/this-weeks-crazy/
https://developmeh.com/i-made-a-thing/this-weeks-crazy/<h1 id="this-week-s-crazy-idea">This Week's Crazy Idea <a class="anchor" href="#this-week-s-crazy-idea">🔗</a>
</h1>
<p>In all honestly tech is completely boring. Nothing shakes me to my core anymore. Remember Web-Rings? I do, they sucked but it was a time when peoples ideas were well constrained by context. Back in those days you had to get a host, write you own HTML and CSS, accessability concerns of the time aside, things were ugly and simple. It made up for all the complexity of getting a few paragraphs to show up on someone elses screen. We kinda gave all that up for the global town square, just worry about the paragraph and maybe some photos, the content baby. Like a small business there is some charm in the agency to make something fantastic and utterly fail at it. Its about the effort and the intent, that being self-expression, and the barrier to entry was just heavy enough to keep the boring outa the way.</p>
<p>A price we pay for giving everyone a voice is not everyone has something interesting to say. Its that little pain like wanting to run a newsletter for a thousand people but having a dot-matrix printer. There is a little nagging voice in the back of your head saying, I cant listen to that think print a thousand sheets just so someone can throw it out. But it was just that kind of drive that got you to do it, the creative act of getting someone to react to what your wrote.</p>
<p>Thats what this is all about, the creative act. It's not doing things because they are profitable or even relevant, but because they are interesting or fun.</p>
<p>Though in some ways I am talking about giving every internet connected person a voice but one that they control and not one that promotes clout. There is clearly a value to a central platform for discovery, and in some past world that was the responsibility of the search engine. Now I think this is more about append histories instead of sitemaps and some very clever automation for a federation that provides an index of the internet.</p>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h2 id="14-07-2025">14 07 2025 <a class="anchor" href="#14-07-2025">🔗</a>
</h2>
<h3 id="just-build-binaries">Just build binaries <a class="anchor" href="#just-build-binaries">🔗</a>
</h3>
<p>The more time I spend with LLM Coding Assistance I become aware of how bad the tool is at any significant planning regarding a complete feature. What it does well though is create human interfaces be that actual UI, or CLI / API, they are pretty good. Better than even I might build on my own. Those interfaces are also build with incredible speed in one go. So this brings me to a consideration in how I should build shared tooling. At one point my goal would be to build a PoC in my language of choice the expose a common set of features and wait for others to copy the project into their language of choice. Now this seems foolish since I will, with or without, the LLM have to spend my effort on the actual problem and can nearly divorce myself from the interface I think a project should be focused on defining an interface specification in a language agnostic way that expresses the usage intent and doesn't bother with the implementation details.</p>
<p>The core behavior then can be written in something that exposes an interface through ABI / FFI essentially, compiles to a shared lib. This really isn't anything new and generally the way everything goes is once its gunna be shared someone starts to build a generalized library and a series of wrappers. What I am conjecturing is maybe we should just start there. Build our tool and immediately expose it as a binary interface. This even opens the door for tool using LLMs to directy open and call symbols from our libraries. This is kind of LLMs in the kernel where they can control the underlying operating system. Instead of working on top of it. I mean nothing sounds worse to me than a non-deterministic operating system. But, one that can generalize a command from the underlying C building blocks then means that someone has to build the building blocks.</p>
<p>I have experienced that LLMs break down when dealing with anything that has a clock attached, specifically in my case related to networking. If a process needs to wait for collaborators to connect it can't seem to figure that part of the sequence out.</p>
<p>The result of this idea is something like a CLI framework. Not one that helps layout the commands and flags but that provides CLI features. Like network tools and storage. The real crazy idea relates to k8s. Which I often have exec access to a pod but often don't have enough tools. Debugging some issues in the past I learned I can write files to a pod through my connection and my next thought is why not build a tool that can inject an agent on demand into my pod and then act as a proxy for diagnostics. Copying binaries, building tools and extracting logs into time series dbs by polling log files. All of this without having to monkey with the container image :) then clean up on exit.</p>
<p>Well thats the nature of the project and I want it to be plug-able. Injecting things even using embedded runtimes and binary quines. It feels like a hackers toolkit but dealing with containers are time consuming and given your exec perms let you write to the fs and exec chmod which is clearly in the scope of the container maintainer its just a feature set.</p>
<p>So keep an eye out for that.</p>
</div>
<div class="devlog-entry">
<h2 id="13-07-2025">13 07 2025 <a class="anchor" href="#13-07-2025">🔗</a>
</h2>
<h3 id="everything-is-a-stream">Everything is a stream <a class="anchor" href="#everything-is-a-stream">🔗</a>
</h3>
<p>With ths shutdown of Pocket I started thinking about the Krappy Internet project and what kind of noise that would have made. Would anyone read the stream of content? Is streaming the right answer? Probably, not. Tech has grown to a point where it tries to consume our focus and at some level is really just documents over the interent. Some of this is the issue with being fixed to a protocol like HTTP, the rest is sunk cost. I know the concept of a search engine is rather the core interaction process for any library. But that index is on a pull model, I could see a world where that is only a push. I wonder what kind of architecture we would need to build an index for the internet in real time. I think about how this site is built. I complete some written nonsense and then push that to a repository. The result is to render that to a CDN. Thats a majority of the useful content that the internet used to provide. Alternatively, if the content of twitter was much bigger and we treated comments as a natural part of the original article each update would be more meaningful. The validation of spam would of course have to shift to the content provider which is likely going to be a failure but if something like that had a consistent identity then like email we would know what sub content to automatically exclude.</p>
<p>At some level every idea distills back to persistent identity and that then conflicts with the need for anonyminity. There is probably a simple problem here, we don't generally index items without identity. Those naturally become live streams and maybe grouping by event and time like timeseries data like a human telemetry platform is interesting.</p>
</div>
<!-- ## 06 07 2025
### Decentralized advertising
So one of the things I find frustrating where I live is the amount of effort it takes to find interesting activities to attend. So it got me thinking how we could have something like a marketplace without having a central marketplace. Recently I have been playing with LoRa and meshtastic which is an interesting platform that I expect could be extended for things like this. When I think of centralization I think of history but that doesn't mean there is some special ownership. With commodity hardward its would be possible to hand off information about activities and offers. Assuming a critical mass of devices there is no reason to believe that a buisness would be unable to effictively distribute info to its customer base. Clearly, this is about shrinking the userbase while providing a similar level of visibility. When I think back to something similar -->
<div class="devlog-entry">
<h2 id="21-06-2025">21 06 2025 <a class="anchor" href="#21-06-2025">🔗</a>
</h2>
<h3 id="opentelemetry-and-the-question-of-ditching-logs">OpenTelemetry and the question of ditching logs <a class="anchor" href="#opentelemetry-and-the-question-of-ditching-logs">🔗</a>
</h3>
<p>This morning I had this thought that maybe one of the reasons tracing and Open Telemetry are kind of after thoughts in about 99% of the enterprise projects I work on may be the developer tools gap. Consider this, as a developer many of us only experience tracing "In Production" and only through a rather expensive platform. Is there really a place where tracing is the new debugging. See also those same 99% of enterprise projects also moved to structured logging a while back and to me, the structured log is a trace done poorly. That's an opinion of course but its informed by the fact that most of the time I need distributed correlation more than I need information about the state of the request. When I think of selective logging, I find that I am often making the choice of what not to log where with tracing the only thing I am missing is the context.</p>
<p>Anyways, the point isn't to try an convince anyone to go one way or the other, but the utilization would be greater if more of the tools were used during development. Here is where my ultimately crazy idea comes in. Jager and ZipKin are great but I don't really want to run an ELK (Elasticsearch/Logstash/Kibana) stack on my dev machine. Its a lot of extra setup and its a bit fiddly. I like to think of developer tools as just the basics of a production system. It also makes me think of how we just GDB and other debuggers. We execute them are runtime and use them to debug a specific process, often around a test. When I observe myself and other developers we tend to drop a lot of breakpoints on and around the flaw to identify the code flow that leads to the failing condition. I think of step into and step through functionality of GDB and I want a way to also get detailed trace info at the same time.</p>
<p>Guess what, its not just a crazy idea, its kind of a dumb one. Here is what I learned from the experience. Firstly, I tried to write my own OTEL collector in golang. Not so bad, but processing and visualizing all the traces as a waterfall was a little challenging. My work in progress on <a href="https://github.com/ninjapanzer/otel-tracer">Github</a>. So after I learned a whole lot about tracing and Open Telemetry I cam back to the drawing board and though how would this look if it was part of GDB already. The fun fact is that its kinda already there, not in this irrelevant auto instrumented way that I am proposing but in the nature of whats called a "tracepoint". Check it out I put together a sample you can try yourself as long as you have Go installed. <a href="https://github.com/developmeh/debug-tracing">Debug Tracing</a>.</p>
<p>So the short answer, yes there should be something easier than Jaeger and ELK locally to explore OTEL, but if you wanna enhance your own development process. Time to get comfortable with some more of the debugger tools that already have valuabl tracing and frame logging built in.</p>
<p>When you are in a tool like Goland or IntelliJ, you can have it add something more akin to logs at tracepoints so you don't have to stop on those or modify your code. Where GDB is powerful is it works on your binaries but language level tools work on the runtime code.</p>
<p>Expect more about a lightweight OTEL tracer for exploring traces locally too.</p>
</div>
<div class="devlog-entry">
<h2 id="15-06-2025">15 06 2025 <a class="anchor" href="#15-06-2025">🔗</a>
</h2>
<h3 id="webrtc-and-what-not-to-ask-ai-to-do">WebRTC and what not to ask AI to do <a class="anchor" href="#webrtc-and-what-not-to-ask-ai-to-do">🔗</a>
</h3>
<p>So to my great surprise I figured that the LLMs would be the right place to funnel my learnings about WebRTC. A technology that has been just outside my vision since I started my career. Why shouldn't I assume that building a trivial implementation with it with LLM support would save me a lot of cognitive overhead, given the long context of such a technology. I was wrong, it seems that as I delve into the underbelly of network topologies away from the chrome of NextJS and CLI tools the bottom falls our of the LLM as well. Its been a consistent thing on my radar that LLMs are only good at the tasks that push products to market but not the work that makes the products work.</p>
<p>Here is an enumeration of things that the LLMs tend to struggle with:</p>
<ul>
<li>Maintaining complex conditional states -- when logical nesting is needed it tends to get confused and will cycle back and forth breaking, fixing, and re-breaking sequences of operations</li>
<li>Understanding anything about internet topology including TLDs, eTLDs, eTLD+1, and private registries -- while working on the <a href="https://github.com/developmeh/passkey-origin-validator">Passkey Origin Validator</a> I was amazed that when I presented these concepts it generally couldn't maintain coherence about the meaning of those terms even though they are rather central to how domains work.</li>
<li>Establishing well documented network handshakes -- Something of a combination of the previous two. There is often a kind of ballet that happens establishing standard and p2p network connections. Since its a set of nested conditionals and requires an understanding of how time works, it struggles.</li>
<li>Dealing with dependency version changes -- My favorite class of failure, if the library changes the name of a package or a constant the LLM will just assume that the library is broken and remove it. What I find the most awkward is since the LLM is interacting with my computer and my project it has access to my dependencies and could search it to try and resolve the change.</li>
</ul>
<p>On the other hand a few items I think it nails every time:</p>
<ul>
<li>CI/CD pipelines -- Every time I need to run tests on a branch or release on a tag. The LLM handles it in one go.</li>
<li>CLI Frameworks -- Cobra nad Viper, for example an LLM sets up a fantastic set of arguments, config files and considers a lot of the edge cases for comfortable CLI use by humans.</li>
<li>Sequence Diagrams -- When I wanna learn a new technology finding a "basic" diagram for how it works is rather annoying. Theres always lots of specs to read but all the pictures are build dependent on a use-case. For example this one it built for my exploration <a href="https://github.com/developmeh/webrtc-poc/blob/master/WEBRTC_CONNECTION_DETAILS.md#mermaid-sequence-diagram">WebRTC</a></li>
</ul>
<p>So in the end I got some joy from the LLM with WebRTC but I kinda had to treat it like a slow version of myself that is also blind and doesn't like to do a web search. I had it explain in a doc how it should work for itself and then asked it to make a boiler plate project with lots of debugging messages. It struggled a lot even with this guidance and I am sure I could have done the same work myself and gained a deeper understanding if I hadn't asked it to do the work.</p>
<p>As this is part of the bigger <a href="https://sr.ht/~ninjapanzer/krappy_internet/">Krappy-Internet</a> project I then used this poc to try and fix its previous failed implementation. But clearly there is a conceptual block for how the LLM deals with network debugging that it couldn't take a working version and use it to fix a broken version. I did learn something in the process but if this was an actual work activity I would have been stressed, instead of just killing time between blog posts on a rainy Sunday.</p>
</div>
<div class="devlog-entry">
<h2 id="14-06-2025">14 06 2025 <a class="anchor" href="#14-06-2025">🔗</a>
</h2>
<h3 id="webrtc-nat-traversals-and-american-manufacturing">WebRTC, NAT Traversals, and American Manufacturing <a class="anchor" href="#webrtc-nat-traversals-and-american-manufacturing">🔗</a>
</h3>
<p>So my new view of the architecture required to handle something like dynamic home hosting still requires a method for establishing a p2p connection. While this isn't that big of a deal it does require a consistent connection to be publicly accessible somewhere that is not behind a firewall. Which is rather annoying when trying to make this whole thing work on a phone. It is possible to run a webrtc signaling server phones tend to use "Carrier Grade NAT" CGN means there is no port-forwarding so the phone cannot respond to the signaling request to establish a NAT bypass. I think in this case its still possible but I am uncertain how the signaling server will connect the phone to the browser client when its not expecting to make a connection since it might be asleep.</p>
<p>The next pass would be that this isn't really the best solution for the phone. But general processing would be. Since the point of the phones interaction is to allow the owner of the site to have content interactions follow them it might be appropriate to produce a secure append only log and require the sites submission features require an <em>Always On</em> host to handle requests but this is also a good case for a serverless function. While its still on a cloud provider it could also be handled by a DHT. In that case the easy path would be a function which can accept data requests and append them to a signed log on the same site. The phone of course can then poll the log and prompt the user for activity. Since the polling trivial and we don't actually care about <em>Real Time</em> for these interactions its fine.</p>
<p>Probably the reason its a crazy idea in fact is everything about this rolls back a decades worth of nonsense on the internet from realtime streaming connections to dumping things to files and processing them when its convenient. Its more like reading your email, there isn't really a dopamine hit and the only content that grows is those engaged. The final content is text and permanent. The reason for a lot of real-time communications was to give a faithful response to online transactions, but I see that is one of the ways retailers have complicated buying. They want to allocate inventory but if I am selling maguffins from my garage, inventory is really just a nuance. This isn't a solution for the Amazons of the world, its focus is to create a simpler experience for both a business owner or a blogger. I see the time of complicated sites which have sales funnels is more providing the same value it once did. Deep down we wanna find a thing, buy a thing, and know its gunna show up at some point.</p>
<p><strong>Simpler</strong>, is probably very subjective but I can see a mechanism around this course work in this project that makes this all a daemon.</p>
<p>In some way this has become a diatribe on why we can't build anything in America. Its because we assume that all items need to be produced at a scale to buy at a Lowes. I think consumer expectations for products is they should be complicated but I think we should start looking back to the items we find at thrift stores. The modality should start to wander towards, "I want to make a good X" not so much "I need a new solutions for X". But thats just my opinion in reality.</p>
</div>
<div class="devlog-entry">
<h2 id="08-06-2025">08 06 2025 <a class="anchor" href="#08-06-2025">🔗</a>
</h2>
<h3 id="krappy-internet-dynamic-dns-and-hosting-at-home">Krappy Internet Dynamic Dns and Hosting at Home <a class="anchor" href="#krappy-internet-dynamic-dns-and-hosting-at-home">🔗</a>
</h3>
<p>I heard recently that the future of the internet is AI. 🤣 ok ok ok... yes if I was investing a bunch of other peoples money in a technology startup that sold AI I would say a lot of crazy things too. I am not so sure the internet is a "thing" anymore that can go away. Its the substrate for communication and while the way we consume the internet may change there will always need to be a source of personal expression. For the age I come from that would have been the blog, the forum, and the comments section. I was there when Twitter started but it wasn't my thing. I am from the days of GeoCities and Anglefire, shared hosting where a hand-full of webpages was enough to give you a voice. All the backgrounds were tessellated poorly, the text was an odd color but the vibes were true. Frequenting final-fantasy fan sites and reading conspiracies about aliens.</p>
<p>I have this wild idea that the answer for the kludge that is "return the means of production to the people!" The forever cry of the decentralized internet, most of us have multiple internet providers and we have computers just burning dead dinosaurs to watch useless noise videos with plenty of capacity to share.</p>
<p>Regarde-moi! What if we just hosted our own content from our own machines in our own houses? What if it didn't really matter when that server was offline? See there are a lot of us and none of us have anything that's interesting to say, which is a kind of magic when you think how much we talk. It's the community not the communication that matters, we need to feel connected, which is exactly the power of the internet.</p>
<p>So here is the project <a href="https://git.sr.ht/~ninjapanzer/krappy-dyndns">https://git.sr.ht/~ninjapanzer/krappy-dyndns</a></p>
<p>The assumption is that if you own a domain you likely also own some free hosting, really lame html hosting but a small piece of the internet that is yours as long as you pay for it. Kinda like a house and property taxes... but lets not go down that road. So your ISP gives you a ton of bandwidth so you can watch <em>Better Call Saul</em> on Netflix but whats it doing when you aren't binging? Just idling like car insurance... but lets not go down that road either. Point is theres a lot of spare internet for the 50 people a month that are going to look at your website. That's pretty cool to be honest when you think about the number of people you might interact with on the average Friday at your local coffee shop. So here are the problems we need so solve:</p>
<ul>
<li>give your "special content" the impression its from a fixed location for the sake of discoverability</li>
<li>find a normal way to allow a browser or application to call back from your internet house to your house house without being bungled by your ISP</li>
<li>make it easy to maintain some services from your laptop or phone</li>
<li>keep those things kinda working when those devices are offline</li>
</ul>
<p>Yes, the idea is to host your site from your phone while its in your pocket. Crazy yes, possibly maybe, am I gunna try, yes.</p>
<p>So back to the point, you own some internet property and with the help of some krappy-dyndns we can publish a text file to the "free" hosting thats attached to your domain. This falls under the guise of what we call these days as .well-known. <code>https://youraddress.com/.well-known/krappy-dyndns-8abe777a</code> holding a binary stream of IP address histories and encoded with the name of a service. It's just an IP address and while its your IP address its also shared by others so its vaguely you. The daemon service runs on your target device and on an interval figures out what your IP is and then if it changes pushes it to that well-known file.</p>
<p>A user comes along and wants to leave a comment on your site. It makes a call to the comment service you run on your laptop and the client making the request knows what service it wants to interact with finding the correct .well-known and thus collecting an IP address. Next the tricky part, we have to trick your ISP to accept an incoming connection without an outbound call. Thats the whole NAT thing, probably utilizing something like <a href="https://git.sr.ht/~ninjapanzer/krappy-dyndns">https://en.wikipedia.org/wiki/Hole_punching_(networking)</a>. So your laptop will also host this service on your IP and allow for some underlying protocol like WebRTC to allow the initial transaction and boom the comment has been sent. Now, this is an internet that isn't trying to waste your time, so we take the comment and after its moderated we write it once back to our free hosting and if our laptop gets turned off for the night, who cares, people just cant leave a comment but the imporant stuff stays there. I mean they could always just send an email too.</p>
<p>Just one step in this crazy plan complete this week and another piece of the Krappy Internet is available.</p>
</div>
HomeFri, 06 Jun 2025 00:00:00 +0000[email protected]
https://developmeh.com/
https://developmeh.com/<div class="hero-section">
<img src="https://github.com/developmeh.png" class="hero-logo" alt="Developmeh Logo">
<div class="hero-content">
<h2>Developmeh</h2>
<div class="subtitle">Develop ¯\_(ツ)_/¯</div>
<p>Contained within are harebrained ideas that have no commercial value... still here... you are one of the special ones.</p>
</div>
</div>
<div class="home-layout">
<div class="main-column">
<div class="callout info">
<span class="callout-title">Perspective</span>
I have done a lot of software engineering in my life and after all that time I have come to appreciate an industry in constant evolution.
<p>I, though, seem to stand as a fixed point, arriving to accomplish a specific task and obstinately refusing to become a tradesman.</p>
</div>
<div class="callout success">
<span class="callout-title">Welcome</span>
For those of you who have a craft and participate in a creative act on the regular, I salute you. Your bravery is what I idolize. In pursuit of of some kind of self-idolatry I create toys to expand my knowledge and forgive myself for being a shill.
<p>But who cares? Welcome to my workshop!</p>
</div>
<div class="callout warning">
<span class="callout-title">Standards</span>
This is a safe space for all ideas; the point is to have fun with it; you don't wanna write tests...suuuuure....
<p>GET THE HELL OUT! I am not some kind of heathen. I have standards, bud.</p>
</div>
<div class="card-stack">
<div class="card">
<h3>Devlogs</h3>
<ul>
<li><a href="/i-made-a-thing/kwik-e-mart-who-needs-a-gas-town#15-03-2026">15-03-2026 Kwik-E-Mart (v0.5 — Making tools for robots)</a></li>
<li><a href="/soft-wares/ai-diaries#06-03-2026">06-03-2026 The AI Diaries (My Own Ideas)</a></li>
<li><a href="/soft-wares/ai-diaries#01-03-2026">01-03-2026 The AI Diaries (Limitless Abstraction)</a></li>
<li><a href="/soft-wares/ai-diaries#22-02-2026">22-02-2026 The AI Diaries (Unboudned Growth)</a></li>
<li><a href="/i-made-a-thing/catalyst-orchestrator#14-02-2026">14-02-2026 Catalyst Orchestrator (The Daemon Creates Steps at Runtime)</a></li>
<li><a href="/i-made-a-thing/catalyst-orchestrator#13-02-2026">13-02-2026 Catalyst Orchestrator (The Daemon Parses, The Daemon Routes)</a></li>
<li><a href="/i-made-a-thing/catalyst-orchestrator#11-02-2026">11-02-2026 Catalyst Orchestrator (The Haiku Decides)</a></li>
<li><a href="/soft-wares/ai-diaries#08-02-2026">08-02-2026 The AI Diaries (80/20 Rule Still Applies)</a></li>
<li><a href="/soft-wares/ai-diaries#03-02-2026">03-02-2026 The AI Diaries (Composable Code Future)</a></li>
<li><a href="/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events#02-02-2026">02-02-2026 Rust Dancing Banana (SSE vs Chunked Encoding)</a></li>
<li><a href="/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events#02-02-2026-1">02-02-2026 Rust Dancing Banana (Rust's Async Streams)</a></li>
<li><a href="/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events#01-02-2026">01-02-2026 Rust Dancing Banana (Compile-Time Frame Embedding)</a></li>
<li><a href="/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events#01-02-2026-1">01-02-2026 Rust Dancing Banana (Nix for Rust)</a></li>
<li><a href="/soft-wares/ai-diaries#28-01-2026">28-01-2026 The AI Diaries (Eager Intern Problem)</a></li>
<li><a href="/soft-wares/ai-diaries#27-01-2026">27-01-2026 The AI Diaries (Throughput over Precision)</a></li>
<li><a href="/soft-wares/ai-diaries#20-01-2026">20-01-2026 The AI Diaries (AI-generated Code Debt)</a></li>
<li><a href="/i-made-a-thing/this-weeks-crazy#14-07-2025">14-07-2025 This Week's Crazy Idea (Just build binaries)</a></li>
<li><a href="/i-made-a-thing/this-weeks-crazy#13-07-2025">13-07-2025 This Week's Crazy Idea (Everything is a Stream)</a></li>
<li><a href="/i-made-a-thing/this-weeks-crazy#21-06-2025">21-06-2025 This Week's Crazy Idea (OpenTelemetry and the question of ditching logs)</a></li>
<li><a href="/i-made-a-thing/this-weeks-crazy#15-06-2025">15-06-2025 This Week's Crazy Idea (WebRTC and what not to ask AI to do)</a></li>
<li><a href="/i-made-a-thing/this-weeks-crazy#14-06-2025">14-06-2025 This Week's Crazy Idea (WebRTC, NAT Traversals, and American Manufacturing)</a></li>
<li><a href="/i-made-a-thing/this-weeks-crazy#08-06-2025">08-06-2025 This Week's Crazy Idea (Decentralized DynamicDns Krappy-DynDns)</a></li>
<li><a href="/projects/krappy-internet/#24-02-2025">24-02-2025 Krappy Internet (Working around the browser)</a></li>
<li><a href="/projects/krappy-internet/#11-02-2025">11-02-2025 Krappy Internet (An Ideal World)</a></li>
<li><a href="/i-made-a-thing/ruby-streaming-banana-dancer/#31-01-2025">31-01-2025 Streaming Dancing Banana (Nix Cross Platform Improvements)</a></li>
<li><a href="/projects/krappy-internet/#devlog">29-01-2025 The Krappy Internet (Protocol Servers)</a></li>
<li><a href="/i-made-a-thing/ruby-streaming-banana-dancer/#27-01-2025">27-01-2025 Streaming Dancing Banana (Nix Build and Deploy to K8s)</a></li>
<li><a href="/projects/gol/#21-01-2025">21-01-2025 Distributed Game of Life (Debugging stats)</a></li>
<li><a href="/projects/gol/#20-01-2025">20-01-2025 Distributed Game of Life (Stats)</a></li>
<li><a href="/projects/gol/#19-01-2025">19-01-2025 Distributed Game of Life (Profiling)</a></li>
<li><a href="/projects/gol/#15-01-2025">15-01-2025 Distributed Game of Life (Getting Started</a></li>
<li><a href="/i-made-a-thing/recreating-kafka-blind/#25-12-2024">25-12-2024 Krappy Kafka (k0s Deployment)</a></li>
<li><a href="/i-made-a-thing/recreating-kafka-blind/#22-12-2024">22-12-2024 Krappy Kafka (Handler Cleanup and Func Interface)</a></li>
<li><a href="/i-made-a-thing/recreating-kafka-blind/#22-12-2024">05-11-2024 Krappy Kafka (Shared Consumer Groups)</a></li>
</ul>
</div>
<div class="card">
<h3>Articles</h3>
<ul>
<li><a href="/devex/automatic-programming-iteration-4">Automatic Programming: Iteration 4</a></li>
<li><a href="/tech-dives/bats-testing-bash-like-you-mean-it">BATS - Testing Bash Like You Mean It</a></li>
<li><a href="/i-made-a-thing/keep-your-eyes-on-the-ide-and-your-robots-on-the-tickets">Keep Your Eyes on the IDE, and Your Robots on the Tickets</a></li>
<li><a href="/soft-wares/agentic-patterns-elements-of-reusable-context-oriented-determinism">Agentic Patterns: Elements of Reusable Context-Oriented Determinism</a></li>
<li><a href="/soft-wares/just-forget-about-owning-code">Just Forget About Owning Code</a></li>
<li><a href="/i-made-a-thing/rust-streaming-banana-dancer-server-sent-events">Rust Dancing ANSI Banana with Server-Sent Events</a></li>
<li><a href="/tech-dives/a-deterministic-box-for-non-deterministic-engines">A Deterministic Box for Non-Deterministic Engines</a></li>
<li><a href="/soft-wares/claude-or-clod">Claude or Clod</a></li>
<li><a href="/i-made-a-thing/the-magic-of-stubbing-sh">The Magic of Stubbing sh</a></li>
<li><a href="/soft-wares/sufficient-complexity">Sufficient Complexity</a></li>
<li><a href="/soft-wares/do-devs-really-do-devops">Do Devs Really Do DevOps in your Org?</a></li>
<li><a href="/soft-wares/the-good-sergeant">The Good Sergeant</a></li>
<li><a href="/soft-wares/creative-impostor-syndrome">Creative Impostor Syndrome</a></li>
<li><a href="/devex/the-perfect-dev-env/">The Perfect Dev Env Part 1</a></li>
<li><a href="/projects/gol/">Distributed Game of Life</a></li>
<li><a href="/i-made-a-thing/kwik-e-mart-who-needs-a-gas-town">Kwik-E-Mart: Who Needs a Gas Town When a Gas Station Will Do</a></li>
<li><a href="/i-made-a-thing/recreating-kafka-blind">Krappy Kafka</a></li>
</ul>
</div>
</div>
</div>
<div class="side-column">
<div class="callout info">
<span class="callout-title">Connect</span>
<h3 id="everything-is-on-github">Everything is on GitHub:</h3>
<ul>
<li><svg class="github-link-icon" height="16" width="16" viewBox="0 0 16 16" fill="currentColor"><path d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"/></svg><a href="https://github.com/developmeh">https://github.com/developmeh</a></li>
<li><svg class="github-link-icon" height="16" width="16" viewBox="0 0 16 16" fill="currentColor"><path d="M8 0C3.58 0 0 3.58 0 8c0 3.54 2.29 6.53 5.47 7.59.4.07.55-.17.55-.38 0-.19-.01-.82-.01-1.49-2.01.37-2.53-.49-2.69-.94-.09-.23-.48-.94-.82-1.13-.28-.15-.68-.52-.01-.53.63-.01 1.08.58 1.23.82.72 1.21 1.87.87 2.33.66.07-.52.28-.87.51-1.07-1.78-.2-3.64-.89-3.64-3.95 0-.87.31-1.59.82-2.15-.08-.2-.36-1.02.08-2.12 0 0 .67-.21 2.2.82.64-.18 1.32-.27 2-.27.68 0 1.36.09 2 .27 1.53-1.04 2.2-.82 2.2-.82.44 1.1.16 1.92.08 2.12.51.56.82 1.27.82 2.15 0 3.07-1.87 3.75-3.65 3.95.29.25.54.73.54 1.48 0 1.07-.01 1.93-.01 2.2 0 .21.15.46.55.38A8.013 8.013 0 0016 8c0-4.42-3.58-8-8-8z"/></svg><a href="https://github.com/ninjapanzer">https://github.com/ninjapanzer</a></li>
</ul>
<h3 id="correspondence">Correspondence</h3>
<p>Please address all hate mail <a href="https://github.com/orgs/developmeh/discussions/categories/general">here</a></p>
</div>
</div>
</div>
The Krappy InternetWed, 29 Jan 2025 00:00:00 +0000[email protected]
https://developmeh.com/projects/krappy-internet/
https://developmeh.com/projects/krappy-internet/<h2 id="what-if-the-internet-stopped-being-shit-and-was-instead-krappy">What if the internet stopped being shit and was instead Krappy? <a class="anchor" href="#what-if-the-internet-stopped-being-shit-and-was-instead-krappy">🔗</a>
</h2>
<p>The Krappy Internet is an attempt to re-envision how we trust data from the internet. This is barely even a hypothesis but in the pursuit of something closer to what the internet once was without bike shedding blockchains and onion routers I am building my own internet, just for me. Others can use it if it ever does anything.</p>
<h3 id="components">Components <a class="anchor" href="#components">🔗</a>
</h3>
<p>Krappy Utils (In Progress) -> https://git.sr.ht/~ninjapanzer/krappy
Krappy Content Linker (In Progress) -> https://git.sr.ht/~ninjapanzer/krappy_internet
Krappy Navigator (Planned)</p>
<h3 id="mircocosoms">Mircocosoms <a class="anchor" href="#mircocosoms">🔗</a>
</h3>
<p>In the beginning content lived on distinct domains which declared their purpose clearly in their domain name or the commonality of the content they maintained. Much like an address to a folder on a huge distributed computer hyperlinks created the connective tissue
between the content storage and meaningful reference. Even search engines only acted to provide a searchable inventory of those same links. In the early 2000s this modus changed in the drive to reduce barriers for users to publish their thoughts to the internet.
I don't know who to blame first but lets just say my earliest memory related to something like "Global Consciousness" a rather ugly site, appropriate for th time, where you could post a few worlds and it would show up for everyone. Kind of mind boggling the scale
of something like that back then. This wasn't the first though, as email was the first "social media" through usenets and before that bulleten board systems. The biggest difference between those early examples and what we have now is the nature of the silos created.
Content is restricted to a domain and distribution is controlled by the domain owners marketing budget at best, and at worst by the nefarious moderation of maddmen. The flow of information is best modulated at the consumer and not in the ivory towers of the board rooms.</p>
<p>When a content silo is generally healthy we will see an even discourse of thoughts and an opportunity to learn. The opposite is a self-reinforcing place where we can avoid the conflict of new ideas and further ambiguate reality. The need for critical thinking is personal
obligation of a democratic society.</p>
<h3 id="nefarious-moderation">Nefarious Moderation <a class="anchor" href="#nefarious-moderation">🔗</a>
</h3>
<p>If I was to correlate discovery of knowledge with my youth some 30 years ago the challenge would be finding my way to he library and then finding the right book. The process very much aligns with the manner we extract data today but the moderation is opaque. I had
the choice of either using the "card catalog" or speaking to a "research librarian" to identify my resources. Both are somewhat expensive in terms of human expenditure but rely heavily curation and expertise. These two avenuse are aligned with search engines and wikipedia as
direct analogs a decade later. The value position of that system is a direct proportion to its speed and the agency of the curators to treat knowledge as uniform expression. This of course is the ideal and not all libraries were neutral, none could be free of inherant bias, and
thus are another form of imperfection. If we instead try to observe the form of the library and the librarian the intent is to act as a free store of knowledge, organized by consistent means and discoverable by the average human.</p>
<p>Moderation is at its core a kind of applied bias, one that slides towards societal norms. The locality of those norms is mediated by the range of human contact; in a town that was limited to hundreds and on the internet thats limited by language and discoverability.
Because a card in the catalog at the library has a fixed dimension there is also a limited topical granulatirty it can describe about an entry. Someone also has to use interpretation to categorize and prioritize those classifiers, another layer of invisible bias.
I want to believe that those involved take the role seriously but honestly, but I also know that this cannot be true, but I do believe that the default nature of people is to do good and those that do ill are a smaller portion of the whole. I expect that libraries
have been crowd sourcing classification for as long as they have existed. At some point the number of texts exceeds the capacity of the librarian to verify and we have to rely on publishers and other libraries to do the bulk work.</p>
<p>The same is true for content on the internet, but the value and classification has to benefit humans and not reinforce the dopamine factories. When we are rewarded for the sensational or rhetorical we assume a bias towards these topics and the value repeats instead
of grows. If we were to view "content" independent of "platform" and interacted with it as we would in a library, what would that card catalog look like? Who would fill out the cards? Who curates the summer reading list? The publisher or the librarian?</p>
<h3 id="identity-and-emergence">Identity and Emergence <a class="anchor" href="#identity-and-emergence">🔗</a>
</h3>
<p>I adhere that you should put your name on things. I am American, and I with the mythology of figures like John Hancock, who's apocryphal heroism is laid out by signing his name large enough on the Decaration of Independence such that the landord could read it unassited.
Regardless of the veracity and accuracly of this take its influence what it means to "have a position" and "to express ones thoughts" where there is no place for anonymity. Its a bias allowed by my privilege, also I don't spend a lot of time in proximity to the emergence
of fact. So there is clearly a place for strong assumption of consistent identity and the emergence of information without a clear owner. The value is weighed by its validation, when giving credence to a statement it must have proof. Proof is well established through
consistency of action by a trusted identity, or by the expression of evidence. I wanna believe there is a place for investigative journalism's protected informants, for whistle-blowers, and for those fighting oppression to communicate. When the platforms are not aligned
with protecting the actors, which if you look at the long history of centralized platforms is under constant violation by state run organizations, hackers, and corporate greed I agrue no one is anonymous.</p>
<p>A person should be able to own whatever they publish, not by license but by attribution, you can prove you said it. You can also say it anonymously, since an identity is really analogous with trust an identity doesn't need to be a "person" but it should be "consistent".
Naturally, this means an identity can be an organization or a person, and content is aligned with that instead of their domain. Domains don't own identity they only hold content and act as addessable geographies. Many libraries carry the same books and in some cases
they trade those books with each other with decentralized ownership. But what can't change is the authors, the editors, and publishers, they are fixed and they act as the identities we assign or reject the proof of over time.</p>
<p>The value of identity is we can account for its duality, both the bad and the good are relatable and the only moderation will be self-moderation. Honestly, this is a really tricky subject, the lines were drawn long ago where accountability is a double edged sword. It
protects the mass from victimization and at the sametime subjects the part to possible ostracisation or harm. For now I like to think of identities as properties or assets. They are idempotent and addressable but not individual, an actor may have multiple identities. How
those identities assume trust and proof is based on the system that passively assigns it its trust.</p>
<p>While identities publish, it is the published material itself that is graded and the author doesn't receive immediate feedback about its reception. There are other networks and processes to be placed that help users collect and consume those publications wholly owned by
them.</p>
<h3 id="krappy-utils">Krappy Utils <a class="anchor" href="#krappy-utils">🔗</a>
</h3>
<p>A persistent connection multiplexing TCP protocol server library. Since everything is going to eventually have a binary protocol it makes sense to hoist that from Krappy Kafka and speed up how fast I can spin up a new protocol processor.</p>
<ul>
<li><input disabled="" type="checkbox"/>
Figure out how to test connection management is working as expected.</li>
</ul>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h3 id="02-02-2026">02 02 2026 <a class="anchor" href="#02-02-2026">🔗</a>
</h3>
<h4 id="wasm-is-the-way-in">WASM is the way in <a class="anchor" href="#wasm-is-the-way-in">🔗</a>
</h4>
<p>Something that has become clear is that with the introduction of WASM my desire to move on from webapps and browser experiences that expand past HTTP has become more common. I see a future where content is delivered from central sources and interactions are handled with decentralized networks. I keep thinking that the problem is its all or nothing when it comes to tool like TOR and I2P.</p>
<p>Thinking about what the internet is quite good at its linking documents and even if some of the major search engine players are failing at delivering valuable content, the content is stable and addressable.</p>
<p>Clearly, if the debacle with Cursor trying to build a new browser building browsers is hard and we probably need to take another stab at browser extensions. Locking down the browser was once necessary in the days of IE but now we can provide actual functionality that is quite interesting and doesn't require and evolution in JavaScript to accomplish. WASM gives me complex tools and introduces them to the internet operating system of the browser.</p>
</div>
<div class="devlog-entry">
<h3 id="24-02-2025">24 02 2025 <a class="anchor" href="#24-02-2025">🔗</a>
</h3>
<h4 id="working-around-the-browser">Working around the browser <a class="anchor" href="#working-around-the-browser">🔗</a>
</h4>
<p>So one of the challenges of making a side-channel connection to the krappy internet is through a proxy. I don't really see the need to try and forklift the world of current browsers. The plan for this is to create an extension that loads a WASM module wrapping a webrtc data channel. This way I can maintain a socket like stream to another client that is not restricted by the rules of the browser. I can then establish a TCP or QUIC connection to the content tree.</p>
<p>The long road here is probably going to end up being the short one in reality. Browsers are quite irritating and intrusive. I think about how ToR works and how its challenging to link around to things on it. Some of that is due to the impermenance of those servers and the lack of an index. Something like this could act as a generalized bridge between those and other platforms. In the same way that gemini capsules and gopher sites will deploy an http proxy. This proxy is local to the machine so creators can pick any protocol for their site and they could be linked together. I rather like the idea of going to the wallstreet journal and having a tor link to a gemini capsule with the pages content behind the paywall.</p>
<p>It will also be much harder to destroy content as any page that changes can be relinked to something like the internet archive. The control side of this is important, and I wonder if users should opt into other users links. So the defacto nature is we provide our own content and only we can see it, there would need to be some opt in model. I keep seeing it as if the world was one big logseq where content from various location is joined without ownership of any of the sources. Even if it isn't useful its rather cool to think about annotating the internet and building a webring around content that can have a deployed algo track updates.</p>
<p>Dreaming dreams.</p>
<p>For now I am planning on building a PoC from https://github.com/pion/webrtc which will then be compile to WASM and connected to a proxy server.</p>
</div>
<div class="devlog-entry">
<h3 id="11-02-2025">11 02 2025 <a class="anchor" href="#11-02-2025">🔗</a>
</h3>
<h4 id="an-ideal-world">An Ideal World <a class="anchor" href="#an-ideal-world">🔗</a>
</h4>
<p>I see the internet as a great library archive, while I haven't done the math, I expect the rate at which we create material is roughly at the same rate we improve storage density. At least I can account for that in my own life.</p>
<p>So here is a random vision for the internet. I pay for connection to the network. In deference to the world I live in today, that used to mean something a little different in my youth. Something that drives me to view myself a more of a producer/consumer than just a consumer. I am sure I am not alone.</p>
<p>We pay a provider and I get some simple addressable hardware from them, now I get a public IP address but moreover a dynamic DNS built into my hardware. My provider acts a kind of lookup service which allows me to host applications within my infrastructure and make them available to the greater internet. When I share an image, I share it from my network. My provider also acts as a cache so allow my devices and services to be offline without interruption.</p>
<p>It's not an X or Y kind of situation, personally hosted lives alongside the giants. Services like Vercel or Hetzner still exist for hosting. But when I share text to comment on Bluesky I own that text and it is hosted on my device and cached by Bluesky. When I revoke access to my post, its not gone, but its removed from the cache in the same way we handle DNS propagation. It would be a wild and noisy place and the problem to solve is how to find the things you wanna read. The ecosystem for applications changes as well. Everything is a server, I mean it already is except you don't know what its serving and to who...</p>
<p>An idealistic view of a future state that still requires a lot of work.</p>
</div>
<div class="devlog-entry">
<h3 id="06-02-2025">06 02 2025 <a class="anchor" href="#06-02-2025">🔗</a>
</h3>
<h4 id="getting-over-the-browser">Getting over the Browser <a class="anchor" href="#getting-over-the-browser">🔗</a>
</h4>
<p>So recently I came to this understanding of the nature of the Modern OS, which includes the web browser. So there are really two ways to go. Create a new browser using an open source project or build a side-channel daemon.</p>
<p>I rather like the daemon concept because getting something integrated and deployed into a bespoke browser build is going to be an unlikely way to get someone to use something.</p>
</div>
<div class="devlog-entry">
<h3 id="29-01-2025">29 01 2025 <a class="anchor" href="#29-01-2025">🔗</a>
</h3>
<h4 id="building-a-tcp-server-library">Building a TCP server Library <a class="anchor" href="#building-a-tcp-server-library">🔗</a>
</h4>
<p>While this project has been in the works for a while its also an avenue for me to learn. The first task was to build a modern high performance TCP server that has a concept of an easy to manage binary protocol. For this I picked CBOR https://cbor.io/ RFC 8949 Concise Binary Object Representation. Its not the fastest and I am looking for a solution that has a zero copy buffer like flat buffers maybe.</p>
<p>The challenge is making sure that connection management happens as we expect. Since the goal is to allow a client to reuse a connection to stream multiple requests its important that the connection be persistent and also go away as soon as we are done using it so it can be recycled for a future client. In the Krappy Kafka project there are cases where this management appears to get out of sync and blocking causes all go routines to be consumed. Where connections should have been released they were not. Now that project uses a lot of competing mutexes that are likely the cause of deadlocks. The next version of that and all future protocol servers will rely on channels.</p>
<p>From here we move to the Content Linker, in something like a WoT (Web of Trust) model we want to allow content registration for trust. While we want to allow anonymous users to contribute whatever they want we also want content to have a machine like identity. The hope is to promote that content linking is how we establish a chain of custody for truth. User provided consensus then helps to build this trust. This means that content from public identities doesn't have to join a web of trust. Its just available and as it gains consensus the trust of that content is improved as authoritative.</p>
<p>A good model would be wikipedia, Content can be copied and modified but its moderation is the responsibility of the whole. While this doesn't mean that mistruth is evicted, it means that it will often be short lived and even hard to find. Burrying is not something you can effectively pay for but the community can dimish the impact of garbage so much it may never be seen. There are going to need to be some algorithms to help address cheating here but this is the resonsibility of the consumer. The content model is just a weighted data store. You look at whatever you want albeit the model will promote some decisions.</p>
</div>
The Perfect Development EnvironmentTue, 28 Jan 2025 00:00:00 +0000[email protected]
https://developmeh.com/devex/the-perfect-dev-env/
https://developmeh.com/devex/the-perfect-dev-env/<h2 id="the-perfect-development-environment">The Perfect Development Environment <a class="anchor" href="#the-perfect-development-environment">🔗</a>
</h2>
<p>Let's be clear, this is all opinions and while this serves equally for those who focus on a single technology chain, it is optimized for those whom work on multiple projects with varied exacting dependencies and runtimes.</p>
<p>For example just in Ruby alone I may have a a legacy ruby 1.9 project on the same machine where I have multiple ruby 3.x projects. You might not see any conflict here and you would be right. Given each ruby project has a unique ruby version we don't really have any annoyances. But the moment I have 2 ruby 3.2 projects with various verisons of imagemagick I will find myself fighting. Of course this is related to a nuance of gems that bind to static libs and are somewhat opinionated about which exact version they need while that need is being provided by a system level package manager like Homebrew.</p>
<p>To be clear, I love homebrew, it made me who I am, but like JQuery its a product of an age that has passed.</p>
<p>What murders me is that Nix is 20 years old. I could have been using this the whole time, if it had any of today's features back then. But I am jumping the gun, lets continue with the targets of this project.</p>
<ul>
<li>Produce a template-able structure for any project</li>
<li>Use open source tools that are well maintained</li>
<li>Use patterns that make project adoption easier</li>
<li>Management of dependencies must be project specific and avoid env collisions</li>
<li>Leave artifacts behind that inform but don't require use</li>
</ul>
<p>The guiding principle is <em>Leave artifacts behind that inform but don't require use</em> we can't say this is done unless we can make this true. Everything before this makes its conditionality possible. If we do this work correctly we can allow the technological landscape to evolve and these techniques can be replaced with new superior solutions as they become vogue.</p>
<h3 id="produce-a-template-able-structure-for-any-project">Produce a template-able structure for any project <a class="anchor" href="#produce-a-template-able-structure-for-any-project">🔗</a>
</h3>
<p>I have worked with a number of navel gazing developers who like to build walls around their languages and techniques marking them simultaneously superior and exclusive to their corners of the world. I would alike this to what has happened with protobuf and protoc in python.</p>
<h4 id="a-plea-for-protocol-politeness">A plea for protocol politeness <a class="anchor" href="#a-plea-for-protocol-politeness">🔗</a>
</h4>
<p>You can skip this section if you don't care about my personal experiences with protoc and python, while this is not a problem limited to protoc or python its a story of the smell produced by monoculture; something that no longer has a place in modern development.</p>
<p>In proper diatribe format our story is about the value of protocols and our common inability to avoid abstraction in the face of having to learn something new.</p>
<p>For those who have not used <a href="https://protobuf.dev/">Protobuf</a> and its cli utility protoc (pronounced pro-toc), it has a rather simple protocol for adding extensions to its command line. Mind you python was not officially supported until sometime mid-2024 and all generators were community provided. Here is the catch like I attempted here <a href="https://github.com/ninjapanzer/grpc_generator">my grpc generator in rust</a> required some funny incantations to get things working. At the time python developers wrapped protoc in a bespoke python library at the time those incantations would looks something like this <code>python -m grpc_tools.protoc -I. --protobuf-to-pydantic_out=. example.proto</code> and didn't publicly expose the actual plugin for protoc which expects something more like <code>protoc -I. --protobuf-to-pydantic_out=. example.proto</code>.</p>
<p>The protocol I speak of is the product of a clever CLI, <strong>--protobuf-to-pydantic_out</strong> expects that in the current path is something executable (including a shell script) that goes by the name <strong>protoc-gen-protobuf-to-pydantic</strong> so whatever is before <strong>out</strong> must then be able to exist prefixed by <strong>protoc-gen</strong>. While somewhat poorly documented this protocol for extension makes it super easy to bolt on 1 or 10 plugins to build out a whole organizations worth of runtime specific artifacts.</p>
<p>Like I mentioned before '24 we had to do it the hard way and because the python communities view of DevEx and ergonomics is annoying binaries like protoc should be wrapped under the glaze of a python module.</p>
<p>So the lesson here is probably two fold. Firstly, I wasn't the only one who probably thought this whole python thing was dumb and by bringing python into the fold it has a spec plugin and now I don't have to worry about protocol violations to generate artifacts from protobuf IDL files. Secondly, its expected that when you produce abstractions for public consumption you are obligated to do so in observation of the authors protocol when wrapping their work.</p>
<p>Simplicity doesn't mean brevity, thus I am advocating for <strong>Clarity</strong> over <strong>Ease</strong>. It should be easy to understand or do, so in terms of python authors overstepped here. We are going to try and do the same thing with our project layouts each piece will respect the patterns of its community regardless of expectations of the projects norms.</p>
<p><img src="../standards.png" alt="xkcd standards" /></p>
<p>Yep, I am thinking it too so just hang on.</p>
<h3 id="back-to-the-main-event">Back to the main event <a class="anchor" href="#back-to-the-main-event">🔗</a>
</h3>
<p>What we want to inform on is how to consume a fresh project which usually has a few externals we should concern ourselves with. First of those are the runtime dependencies and here we have a ton of options. Just in my short life I have used all of the following:</p>
<ul>
<li><a href="https://rvm.io/">rvm</a></li>
<li><a href="https://github.com/rbenv/rbenv">rbenv</a></li>
<li><a href="https://github.com/nvm-sh/nvm">nvm</a></li>
<li><a href="https://maven.apache.org/wrapper/">maven wrapper</a></li>
<li><a href="https://docs.gradle.org/current/userguide/gradle_wrapper.html">gradle wrapper</a></li>
<li><a href="https://github.com/Schniz/fnm">fnm</a></li>
<li><a href="https://sdkman.io/"><strong>sdkman</strong></a></li>
<li><a href="https://github.com/jenv/jenv">jenv</a></li>
<li><a href="https://brew.sh/"><strong>homebrew</strong></a></li>
<li><a href="https://en.wikipedia.org/wiki/APT_(software)"><strong>apt/dpkg</strong></a></li>
<li><a href="https://en.wikipedia.org/wiki/Yum_(software)"><strong>yum</strong></a></li>
<li><a href="https://wiki.archlinux.org/title/Pacman"><strong>pacman</strong></a></li>
<li><a href="https://nix.dev/"><strong>nix</strong></a></li>
<li><a href="https://asdf-vm.com/"><strong>asdf</strong></a></li>
<li><a href="https://github.com/jdx/mise"><strong>mise-en-place (mise) / rtx</strong></a></li>
<li><a href="https://github.com/ansible/ansible"><strong>ansible</strong></a></li>
<li><a href="https://github.com/pyenv/pyenv">pyenv</a></li>
<li><a href="https://github.com/brainsik/virtualenv-burrito">virtual burrito</a></li>
<li><a href="https://github.com/phpenv/phpenv">phpenv</a></li>
<li><a href="https://github.com/hjbdev/pvm">pvm</a></li>
</ul>
<p>While I am sure I forgot some you will notice two groups I have highlighted the ones that belong together. Whats different about these bold tools is they try to be the new standard of how to collect any runtime with some broad variances. I think this is a good place to start, but we should remember our goals and immediately eliminate those which don't project our project env from collisions with other projects. That means we say goodbye to all the package managers aside from <strong>nix</strong> and <strong>ansible</strong> albeit ansible has a special use case and is probably muddying the waters.</p>
<p>Of the remaining list we have <strong>sdkman</strong>, <strong>asdf</strong>, <strong>mise</strong>, and <strong>nix</strong>. Thats a pretty tight list lets go over how these work. The first three all do the same thing and will isolate each runtime in your home directory and shim you environment and each allows for a global system version and a config file driven variant per project folder albeit the format for sdkman is unique and asdf/mise are interchangeable in some cases. That leaves <strong>nix</strong> which is our ugly duckling, as its syntax is rather obtuse so you need to have the right reason to use it for your project. That reason is probably more related to your build system then it is one of getting a runtime local to a project.</p>
<p>To be honest I don't generally advise using <strong>nix</strong> to prepare your development environment runtime since it wants to own the whole environment using it to install say ruby means you also need to teach it how to install your gems. Which isn't horrible but might be overreaching, <a href="https://github.com/developmeh/ruby_streaming_ansi_banana/blob/dba7eba58ccca137975a5a29ac720b6f5084cb32/flake.nix">an example with nix</a> while that also builds a docker image with the same context, that other reason you might wanna use <strong>nix</strong> I mentioned, its pretty heavy compared to the competitors. Those primarily expose their configuration through something we expect a file that lists a runtime and a version. The tool then helps you install those versions on your computer and the configuration file acts as human readable documentation about the project in case your developer doesn't wanna use it.</p>
<h4 id="enough-talk-lets-see-something">Enough talk lets see something <a class="anchor" href="#enough-talk-lets-see-something">🔗</a>
</h4>
<p>Remember our opinion is that a git repo is just a folder and folders can live in folders, I don't wanna tell you to always put multiple projects in a repo or one project per repo so we only describe the project as a folder and where that folder lives is up to you.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>/
</span><span>├── .ci/
</span><span>│ └── scripts/
</span><span>├── .git/
</span><span>├── .gitignore
</span><span>├── .tool-versions
</span><span>├── .deploy/
</span><span>├── └── scripts/
</span><span>├── .build/
</span><span>├── └── scripts/
</span><span>├── GETTING_STARTED.md
</span><span>├── Makefile
</span><span>├── README.md
</span><span>├── src/
</span><span>└── ...
</span></code></pre>
<p>I have seen variants of this generally where are scripts share one folder but generally I look at this from the approach of interfaces that are tool agnostic. That interface is exposed though make. Regardless of the actual build steps or build system, like bazel or nix. I should be able to say, <strong>make build</strong> or <strong>make deploy</strong> and I will get some feedback on how that is going.</p>
<p>The same is true with CI, which will probably be augmented by the dot file for your executor configuration be that CircleCI, Gitlab, Github, Sourcehut, or something else we will always need a place to hide some scripts and then bind them to make so our CI can also make the same calls that we might call like <strong>make test</strong>. The specific language for the cross project targets is outside the scope of this document but the three I stated should be a default with strong consideration for <strong>make init</strong> or something to setup a first time run.</p>
<p>Thats not the only reason we want to put our scripts in their own project folders though. We want to be able to test them. I have become a huge fan of <a href="https://github.com/bats-core/bats-core">BATS</a> as exampled <a href="/tech-dives/test-anything-means-testing-bash/">here</a> each scripts folder can be extended with a tests folder for its given sub project like this without concern of polluting the scope of the actual codebase of the project.</p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span>/
</span><span>├── .ci/
</span><span>│ └── scripts/
</span><span>│ └── tests/
</span></code></pre>
<p><strong>IF YOU EVEN ONCE SAY WE DON'T NEED TO TEST OUR BASH GET THE HELL OUT</strong> Test everything is TEST EVERYTHING!</p>
<p>So now if we have everything write we have a repo with a file to define its runtime dependencies like ruby or java that is explicit. If our project needs both, all the better.</p>
<p>Our makefile provides a common interface to declare activities and it mostly calls scripts in our various targets like build or deploy.</p>
<h3 id="lets-pick-it-apart">Lets pick it apart <a class="anchor" href="#lets-pick-it-apart">🔗</a>
</h3>
<h4 id="but-i-am-using-gradle">But I am using gradle <a class="anchor" href="#but-i-am-using-gradle">🔗</a>
</h4>
<p>Sweet, gradle is cool and while you may consider that you can just run <strong>gradle install</strong> instead of <strong>make build</strong> we often have to bake extra commands and options to the actual build tool.</p>
<pre data-lang="make" style="background-color:#12160d;color:#6ea240;" class="language-make "><code class="language-make" data-lang="make"><span style="color:#60a365;">.PHONY</span><span style="color:#d65940;">: </span><span style="color:#f8bb39;">build
</span><span>
</span><span style="color:#60a365;">build</span><span style="color:#d65940;">:
</span><span> </span><span style="color:#db784d;">@</span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"Building with Gradle"
</span><span> </span><span style="color:#db784d;">@</span><span>gradle install
</span></code></pre>
<p>Its not that hard and no one says you have to use it, but I bet after the 8th project you open that offers <strong>make build</strong> and you don't even know if it is using bundler gradle or maven and you stop caring we have won.</p>
<h4 id="but-my-golang-project-has-a-deploy-module">But my golang project has a deploy module <a class="anchor" href="#but-my-golang-project-has-a-deploy-module">🔗</a>
</h4>
<p>Yea thats cool, thats why our project folders are all prefixed witha dot, just like git they are almost ephemeral if I deleted them all I wouldn't get a better project but the project would still be the project.</p>
<h4 id="i-don-t-need-to-deploy-right-now-and-my-builds-are-uncomplicated">I don't need to deploy right now and my builds are uncomplicated <a class="anchor" href="#i-don-t-need-to-deploy-right-now-and-my-builds-are-uncomplicated">🔗</a>
</h4>
<p>Great, the point is to define a template, if you don't need <strong>.deploy</strong> don't use it. Same is true for <strong>.build</strong> but when the decision comes, where should I put my scripts, this is the hint. If you need some other kind of special target for your project consider creating a special dot folder for it and give it some meaning while exposing it to make.</p>
<h3 id="reasoning">Reasoning <a class="anchor" href="#reasoning">🔗</a>
</h3>
<p>At each level we are creating a little border around the tools and patterns we use, creating a project protocol if you will. So projects can be more interchangeable and better to keep them simple. We are intentionally saying "don't think about it just follow this pattern." While this might feel like overstepping into someone elses agency it should feel more like a relief because its a decision you don't have to make and ultimately you are not bound to. We should invite a repeatable protocol because being clever is like getting a puppy, its a lot of responsibility.</p>
<p>We started this discussion deep in the type of tooling but in reality this is about knowledge artifacts. We are trying to answer the following questions with our protocol:</p>
<ul>
<li>what version of x do I need to install</li>
<li>how do I boot this up</li>
<li>how do I deploy this</li>
<li>how do I build this</li>
<li>how do CI/Gitops/Automation happen</li>
</ul>
<p>I have done all that without having to ensure someone already knows and better yet if they have seen a project they already know.</p>
<p>A template exists here for you consumption <a href="https://github.com/developmeh/the-perfect-project-template">Template</a></p>
<h2 id="that-was-the-easy-part">That was the easy part <a class="anchor" href="#that-was-the-easy-part">🔗</a>
</h2>
<p>Now we need to address the reality of bigger projects, the stuff it needs from the OS to build complex projects.
(Coming soon)</p>
TAPS - Not just a reporting protocolTue, 28 Jan 2025 00:00:00 +0000[email protected]
https://developmeh.com/tech-dives/test-anything-means-testing-bash/
https://developmeh.com/tech-dives/test-anything-means-testing-bash/<h2 id="test-anything-protocol">Test Anything Protocol <a class="anchor" href="#test-anything-protocol">🔗</a>
</h2>
<p>So I rather love writing tests. Mostly because I don't understand my code and the code of the libraries I am implementing. But I sure as hell can understand the results. Maybe if there was a reason to write tests that would be it. I just kinda know I am dumb and its easy to write bugs so why not be a little sure. Recently, I was working in an unfamiliar codebase with a completely familiar command language ba_sh_. I wanted to be sure as I iterated thought a series of changes, ones that inevitably can't run on my machine and only in CI. When you take into consideration <a href="/terms-and-afflictions/eula">DEVELOPER EULA</a> regarding bespoke OS specific bash commands this starts to make sense why you might want to just double check your code works.</p>
<p>Similarly in Ruby and other typeless languages the developer takes on the role of the compile time checker as well as feature writer. If that makes you wonder how they get anything done and write tests, the answer is as long as no one ever leaves the project things are going to be fine. So while I don't know why people still argue about if they should be writing tests and doing test driven development, all I can say is, lots of normal things are confusing. You know what I am talking about, climate change deniers, flat earthers, anti-vaxers, the over-woke(Sleepless in Seattle...).</p>
<p>Here is the point, when I got around to the part of the work where I was like, do I really wanna test this in production? Cowboy hat in hand, I thought, <em>Never drive black cattle in the dark</em>. So I took my good old time and asked the stars for guidance and what did I find? <a href="https://github.com/bats-core/bats-core">BATS</a> which led me to a curious mistake. <a href="https://testanything.org/">TAP</a>, Test Anything Protocol, and I find out it doesn't test anything, in reality its a test reporting format and manner of consuming the results of tests, a protocol if you will. So that's all the history, but its what it inspired in me that brought me joy.</p>
<p>I don't know if you are familiar with <a href="https://ebpf.io/what-is-ebpf/">ePBF</a> which is related to why <a href="https://en.wikipedia.org/wiki/2024_CrowdStrike_incident">Crowdstrike broke the internet</a> that one day in '24. So here is what I wanted TAP to be, ePBF is a tech that lets you run ane extend software running with privilege, you know like kernel extensions that control your Windows System Security at the Airport. Oh yea they don have it because of some interesting non-competition reasons... (cough) Greed. Ok sorry, Test Anything, means to me we have a single interface and mechanism for mocking and asserting our running code. Imagine we don't have to have a bespoke test framework with gads of hard to understand YAML files in your go project. Instead we just have symbols at runtime that can always test a live running application. Its outlandish sure, but a guy can dream right. I mean its cool, so back to BATS which is pretty cool.</p>
<h3 id="let-s-look-at-a-quick-example-script">Let's look at a quick example script <a class="anchor" href="#let-s-look-at-a-quick-example-script">🔗</a>
</h3>
<p><strong>helm.sh</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;">#!/usr/bin/env bash
</span><span>
</span><span>DEFAULT_TIMEOUT</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">30m
</span><span>TIMEOUT</span><span style="color:#d65940;">=</span><span style="color:#f8bb39;">"${1</span><span style="color:#d65940;">:-</span><span style="color:#f8bb39;">$DEFAULT_TIMEOUT}"
</span><span>
</span><span>helm3
</span><span>--wait
</span><span>--timeout ${TIMEOUT}
</span></code></pre>
<p>Pretty easy, we snag the first arg or provide the default. I have probably done this a dozen ways over the years but often skipped setting up any kind of testing. Really, this has just been good luck and the fact that these kinds of scripts are often small and rarely touched.</p>
<h3 id="setup-bats">Setup BATS <a class="anchor" href="#setup-bats">🔗</a>
</h3>
<p><strong>install-bats.sh</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;">#!/bin/bash -e
</span><span>
</span><span style="color:#d65940;">if </span><span style="color:#95cc5e;">[ </span><span>-d </span><span style="color:#f8bb39;">"./test/bats" </span><span style="color:#95cc5e;">]</span><span style="color:#d65940;">; then
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"Deleting folder $FOLDER"
</span><span> rm -rf </span><span style="color:#f8bb39;">"./test/bats/"
</span><span> mkdir -p ./test/bats
</span><span style="color:#d65940;">else
</span><span> mkdir -p ./test/bats
</span><span style="color:#d65940;">fi
</span><span>
</span><span>git clone --depth 1 https://github.com/bats-core/bats-core ./test/bats/bats
</span><span>rm -rf ./test/bats/bats/.git
</span><span>git clone --depth 1 https://github.com/ztombol/bats-support ./test/bats/bats-support
</span><span>rm -rf ./test/bats/bats-support/.git
</span><span>git clone --depth 1 https://github.com/ztombol/bats-assert ./test/bats/bats-assert
</span><span>rm -rf ./test/bats/bats-assert/.git
</span><span>git clone --depth 1 https://github.com/jasonkarns/bats-mock.git ./test/bats/bats-mock
</span><span>rm -rf ./test/bats/bats-mock/.git
</span></code></pre>
<p>Here we dump the bats under a central <em>test</em> directory and we include all the libs:</p>
<ul>
<li><a href="https://github.com/ztombol/bats-support">bats-support</a> - required for other libraries</li>
<li><a href="https://github.com/ztombol/bats-assert">bats-assert</a> - adds deep support for asserts</li>
<li><a href="https://github.com/jasonkarns/bats-mock">bats-mock</a> - allows for stubbing</li>
</ul>
<h3 id="the-test">The Test <a class="anchor" href="#the-test">🔗</a>
</h3>
<p><strong>helm.sh.bats</strong></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#3c4e2d;">#!/usr/bin/env bats
</span><span>
</span><span>bats_require_minimum_version 1.5.0
</span><span>
</span><span style="color:#3c4e2d;"># Load Bats libraries
</span><span>load ../test/bats/bats-support/load
</span><span>load ../test/bats/bats-assert/load
</span><span>
</span><span style="color:#95cc5e;">function </span><span style="color:#60a365;">helm3</span><span>() {
</span><span> </span><span style="color:#3c4e2d;"># Captures and echos all the arguments each time helm3 is invoked
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$@"
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"helm3 executed"
</span><span>}
</span><span>
</span><span style="color:#60a365;">setup</span><span>() {
</span><span> </span><span style="color:#3c4e2d;"># export -f allows the function to be exported into the current shell env
</span><span> </span><span style="color:#3c4e2d;"># What's cool about this is the shell looks for functions before commands
</span><span> </span><span style="color:#3c4e2d;"># So if we have helm3 installed or not during the test this will be resolved first
</span><span> </span><span style="color:#db784d;">export </span><span>-f helm3
</span><span>}
</span><span>
</span><span style="color:#60a365;">teardown</span><span>() {
</span><span> </span><span style="color:#3c4e2d;"># unset is quite important if this shell is to be reused
</span><span> </span><span style="color:#95cc5e;">unset </span><span>-f helm3
</span><span>}
</span><span>
</span><span style="color:#3c4e2d;"># Test cases
</span><span>@test </span><span style="color:#f8bb39;">'when timeout is provided it will be set' </span><span>{
</span><span> </span><span style="color:#3c4e2d;"># The first step is to run our script so bats can capture its output and setup the env for
</span><span> </span><span style="color:#3c4e2d;"># our assertions
</span><span> run sh ./helm.sh 18m
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># allows us to assert a line and verify if any line in the output contains (--partial)
</span><span> </span><span style="color:#3c4e2d;"># our expected string
</span><span> assert_line --partial </span><span style="color:#f8bb39;">"--timeout 18m"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># a catchall to verify we called our stub and as we expect
</span><span> assert_line </span><span style="color:#f8bb39;">"helm3 executed"
</span><span>
</span><span> </span><span style="color:#3c4e2d;"># asserts that the command exited with a 0 exit code
</span><span> assert_success
</span><span>}
</span><span>
</span><span>@test </span><span style="color:#f8bb39;">'when timeout is not provided it will be the default' </span><span>{
</span><span> run sh ./kube/install.sh
</span><span>
</span><span> assert_line --partial </span><span style="color:#f8bb39;">"--timeout 30m"
</span><span>
</span><span> assert_line </span><span style="color:#f8bb39;">"helm3 executed"
</span><span> assert_success
</span><span>}
</span></code></pre>
<p>So thats it, you can test a bash script and mock the commands that we want to verify.</p>
<p>Of course we can also introduce a spy in the case we don't want to mock <em>helm3</em></p>
<pre data-lang="bash" style="background-color:#12160d;color:#6ea240;" class="language-bash "><code class="language-bash" data-lang="bash"><span style="color:#95cc5e;">function </span><span style="color:#60a365;">helm3</span><span>() {
</span><span> </span><span style="color:#3c4e2d;"># Captures and echos all the arguments each time helm3 is invoked
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"$@"
</span><span> </span><span style="color:#3c4e2d;"># Forces a PATH search and forwards arguments
</span><span> </span><span style="color:#95cc5e;">command</span><span> helm3 $@
</span><span> </span><span style="color:#95cc5e;">echo </span><span style="color:#f8bb39;">"helm3 executed"
</span><span>}
</span><span>
</span><span style="color:#db784d;">export </span><span>-f helm3
</span></code></pre>
<p>Will allow the following execution:</p>
<p><code>$ helm3 "HI"</code></p>
<ol>
<li>Will call the helm3 function</li>
<li>Echo the args</li>
<li>Call the helm3 command from the PATH</li>
<li>Echo our status message</li>
</ol>
<p>In some cases you don't want your test to execute destructive operations but inspect its assumptions. Other times you need to know something happened but don't want to interfere with it. Because run captures all outputs we formulate our assertions around verifying those lines
produced in the output that are meaningful.</p>
<p>Here we have only explored interacting with arguments but its possible to assert anything that bash can test. If a file was updated, if a file was created, ultimately if a binary or built-in command holds our context for a valid assertion we can verify it.</p>
<p>Its not quite <em>Test Anything</em> but its damn close.</p>
Go Generics ExampleWed, 22 Jan 2025 00:00:00 +0000[email protected]
https://developmeh.com/software-architecture/go-generics-example/
https://developmeh.com/software-architecture/go-generics-example/<h2 id="go-generics-an-example">Go Generics an Example <a class="anchor" href="#go-generics-an-example">🔗</a>
</h2>
<p>So in my recent Go Game of life (GOGol) projects I have had a personal goal to define some repeatable interfaces. A <strong>renderer</strong> and <strong>game world</strong> that lets me plug in new implementations of a life engine and not have to change too much else. For the renderer this is pretty straight forward. We have some common rendering primitives and we can expose them a methods.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">type </span><span>Renderer </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">Beep</span><span>()
</span><span> </span><span style="color:#60a365;">Draw</span><span>(</span><span style="font-style:italic;color:#db784d;">string</span><span>)
</span><span> </span><span style="color:#60a365;">DrawAt</span><span>(</span><span style="font-style:italic;color:#db784d;">int</span><span>, </span><span style="font-style:italic;color:#db784d;">int</span><span>, </span><span style="font-style:italic;color:#db784d;">string</span><span>)
</span><span> </span><span style="color:#60a365;">Dimensions</span><span>() (y </span><span style="font-style:italic;color:#db784d;">int</span><span>, x </span><span style="font-style:italic;color:#db784d;">int</span><span>)
</span><span> </span><span style="color:#60a365;">Start</span><span>()
</span><span> </span><span style="color:#60a365;">End</span><span>()
</span><span> </span><span style="color:#60a365;">Refresh</span><span>()
</span><span> </span><span style="color:#60a365;">BufferUpdate</span><span>()
</span><span> </span><span style="color:#60a365;">Clear</span><span>()
</span><span>}
</span></code></pre>
<p>Now I'll admit some of the underlying goncurses leaked into this interface but this is a work in progress and has yet to be refined. Of course my two implementations either ignore functionality or make everything a log message. I have Mock renderer which captures events as statistics and a shell renderer which displays my game board to the world. Because I am able to express most of this interaction with primitives and commands I give it a hearty thumbs up.</p>
<p>Now on the other hand we have a world, and the world has to describe the life within. That life is specific to the world it lives in. Life as a generalization looks a little something like this:</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">type </span><span>Life </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool
</span><span> </span><span style="color:#60a365;">SetState</span><span>(</span><span style="font-style:italic;color:#db784d;">bool</span><span>)
</span><span>}
</span></code></pre>
<p>In reality any consumer of the world really only needs to be able to see a individual state or possibly mutate that state. The world could be expressed like this:</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">type </span><span>World </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="color:#95cc5e;">Life
</span><span> </span><span style="color:#60a365;">ComputeState</span><span>()
</span><span> </span><span style="color:#60a365;">Bootstrap</span><span>()
</span><span>}
</span></code></pre>
<p>And that works just fine if we only ever need to know about cells as a RW-able entity accessible through our world. The basic game of life would call <strong>ComputeState()</strong> on the world and then iterate through the <strong>Cells()</strong> two dimensional array on each render tick. A little something like this:</p>
<p><em><em>Display</em> is the terminal screen being written to</em>_</p>
<p>Methods from goncurses</p>
<ul>
<li><strong>MovePrint</strong></li>
<li><strong>Refresh</strong></li>
</ul>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">for </span><span>y, row </span><span style="color:#d65940;">:= </span><span style="color:#67854f;">range </span><span>w.Cells() {
</span><span> </span><span style="color:#d65940;">for </span><span>x, cell </span><span style="color:#d65940;">:= </span><span style="color:#67854f;">range </span><span>row {
</span><span> </span><span style="color:#d65940;">if </span><span>cell.State() {
</span><span> w.display.Display.MovePrint(y, x, </span><span style="color:#f8bb39;">"0"</span><span>)
</span><span> } </span><span style="color:#d65940;">else </span><span>{
</span><span> w.display.Display.MovePrint(y, x, </span><span style="color:#f8bb39;">"-"</span><span>)
</span><span> }
</span><span> }
</span><span>}
</span><span>w.display.Display.Refresh()
</span></code></pre>
<p>Because everything is synced to the main render tick we don't need to include any specialized behaviour to our cells. This is he mechanism used in <strong>tradgol</strong> from https://github.com/ninjapanzer/gogol/blob/01b637beca8b1123aad77390286681883edab265/cmd/tradgol/main.go</p>
<p>You might notice in that project I also attempted parallelgol, it was a failure because I struggled to produce generic types for world and game such that I could have radically different implementations of those entities. Time heals all wounds and for me it was understanding how a generic in Go might differ from a Generic in Java another typed language I was familiar with.</p>
<h4 id="here-is-how-i-thought-it-should-go">Here is how I thought it should go <a class="anchor" href="#here-is-how-i-thought-it-should-go">🔗</a>
</h4>
<table style="width:100%">
<thead>
<tr>
<th style="width:50%">
My Idea
</th>
<th>
Reality
</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#67854f;">package </span><span>main
</span><span>
</span><span style="color:#95cc5e;">type </span><span>Life </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool
</span><span> </span><span style="color:#60a365;">SetState</span><span>(</span><span style="font-style:italic;color:#db784d;">bool</span><span>)
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelCell </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> </span><span style="text-decoration:underline;font-style:italic;color:#db784d;">Life
</span><span> state </span><span style="font-style:italic;color:#db784d;">bool
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool </span><span>{ </span><span style="color:#d65940;">return </span><span style="color:#db784d;">false </span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">SetState</span><span>(state </span><span style="font-style:italic;color:#db784d;">bool</span><span>) {}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>World[T Life] </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="color:#95cc5e;">T
</span><span> </span><span style="color:#60a365;">ComputeState</span><span>()
</span><span> </span><span style="color:#60a365;">Bootstrap</span><span>()
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelWorld[T Life] </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> cells [][]</span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">T
</span><span> initProb </span><span style="font-style:italic;color:#db784d;">float64
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(w </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelWorld</span><span>[T]) </span><span style="color:#60a365;">ComputeState</span><span>() {}
</span><span>
</span><span style="color:#95cc5e;">func </span><span style="color:#60a365;">main</span><span>() {
</span><span> world </span><span style="color:#d65940;">:= &</span><span>ChannelWorld[ChannelCell]{}
</span><span> </span><span style="color:#d65940;">...
</span><span>}
</span></code></pre>
<p><strong>ChannelCell does not satisfy Life (method SetState has pointer receiver)</strong> and I was stuck, CellChannel implements the Life interface and thus should be substitutable for the <em>Life</em> generic in <em>ChannelWorld</em>. I am wrong!</p>
</td>
<td>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#67854f;">package </span><span>main
</span><span>
</span><span style="color:#95cc5e;">type </span><span>Life </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool
</span><span> </span><span style="color:#60a365;">SetState</span><span>(</span><span style="font-style:italic;color:#db784d;">bool</span><span>)
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelCell </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> </span><span style="text-decoration:underline;font-style:italic;color:#db784d;">Life
</span><span> state </span><span style="font-style:italic;color:#db784d;">bool
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool </span><span>{ </span><span style="color:#d65940;">return </span><span style="color:#db784d;">false </span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">SetState</span><span>(state </span><span style="font-style:italic;color:#db784d;">bool</span><span>) {}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>World[T Life] </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="color:#95cc5e;">T
</span><span> </span><span style="color:#60a365;">ComputeState</span><span>()
</span><span> </span><span style="color:#60a365;">Bootstrap</span><span>()
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelWorld[T ChannelCell] </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> cells [][]</span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell
</span><span> initProb </span><span style="font-style:italic;color:#db784d;">float64
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(w </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelWorld</span><span>[T]) </span><span style="color:#60a365;">ComputeState</span><span>() {}
</span><span>
</span><span style="color:#95cc5e;">func </span><span style="color:#60a365;">main</span><span>() {
</span><span> world </span><span style="color:#d65940;">:= &</span><span>ChannelWorld[ChannelCell]{}
</span><span> </span><span style="color:#d65940;">...
</span><span>}
</span><span>
</span></code></pre>
<p>The nuance is small but super important in its simplicity. Because ChannelWorld is going to implement World which is generic it must provide a type constraint for T. That type constraint in itself needs to be the concretion this instance of the specific struct we will use. Here is the tricky part, the type constraints when binding a generic interface to a generic implementation is a double edged sword. I presented the error about how the implementation didn't satisfy the constraint interface <strong>Life</strong>.</p>
</td>
</tr>
</tbody>
</table>
<h4 id="another-example">Another example <a class="anchor" href="#another-example">🔗</a>
</h4>
<p>Lets get a little wild</p>
<table style="width:100%">
<thead>
<tr>
<th style="width:50%">
Works and implements the interface
</th>
<th>
Works but doesn't implement the interface
</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#67854f;">package </span><span>main
</span><span>
</span><span style="color:#95cc5e;">type </span><span>Life </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool
</span><span> </span><span style="color:#60a365;">SetState</span><span>(</span><span style="font-style:italic;color:#db784d;">bool</span><span>)
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelCell </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> </span><span style="text-decoration:underline;font-style:italic;color:#db784d;">Life
</span><span> state </span><span style="font-style:italic;color:#db784d;">bool
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool </span><span>{ </span><span style="color:#d65940;">return </span><span style="color:#db784d;">false </span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">SetState</span><span>(state </span><span style="font-style:italic;color:#db784d;">bool</span><span>) {}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>World[T string] </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="color:#95cc5e;">T
</span><span> </span><span style="color:#60a365;">ComputeState</span><span>()
</span><span> </span><span style="color:#60a365;">Bootstrap</span><span>()
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelWorld[T string] </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> cells [][]</span><span style="color:#d65940;">*</span><span style="font-style:italic;color:#db784d;">string
</span><span> initProb </span><span style="font-style:italic;color:#db784d;">float64
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(w </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelWorld</span><span>[T]) </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="font-style:italic;color:#db784d;">string </span><span>{
</span><span> </span><span style="color:#d65940;">return </span><span style="color:#db784d;">nil
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span style="color:#60a365;">main</span><span>() {
</span><span> world </span><span style="color:#d65940;">:= &</span><span>ChannelWorld[ChannelCell]{}
</span><span> </span><span style="color:#d65940;">...
</span><span>}
</span></code></pre>
<p>Focus on <strong>type World[T string] interface</strong></p>
<p>Here the secret is the <strong>World[T string]</strong> interface is constraint accepts the <strong>ChannelWorld[T string]</strong> constraint and thus we met all the conditions. This is the tricky part that kept me guessing because I expected the other side to be an error which it wasn't.</p>
</td>
<td>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#67854f;">package </span><span>main
</span><span>
</span><span style="color:#95cc5e;">type </span><span>Life </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool
</span><span> </span><span style="color:#60a365;">SetState</span><span>(</span><span style="font-style:italic;color:#db784d;">bool</span><span>)
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelCell </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> </span><span style="text-decoration:underline;font-style:italic;color:#db784d;">Life
</span><span> state </span><span style="font-style:italic;color:#db784d;">bool
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">State</span><span>() </span><span style="font-style:italic;color:#db784d;">bool </span><span>{ </span><span style="color:#d65940;">return </span><span style="color:#db784d;">false </span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(c </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelCell</span><span>) </span><span style="color:#60a365;">SetState</span><span>(state </span><span style="font-style:italic;color:#db784d;">bool</span><span>) {}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>World[T Life] </span><span style="color:#95cc5e;">interface </span><span>{
</span><span> </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="color:#95cc5e;">T
</span><span> </span><span style="color:#60a365;">ComputeState</span><span>()
</span><span> </span><span style="color:#60a365;">Bootstrap</span><span>()
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">type </span><span>ChannelWorld[T string] </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> cells [][]</span><span style="color:#d65940;">*</span><span style="font-style:italic;color:#db784d;">string
</span><span> initProb </span><span style="font-style:italic;color:#db784d;">float64
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span>(w </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">ChannelWorld</span><span>[T]) </span><span style="color:#60a365;">Cells</span><span>() [][]</span><span style="font-style:italic;color:#db784d;">string </span><span>{
</span><span> </span><span style="color:#d65940;">return </span><span style="color:#db784d;">nil
</span><span>}
</span><span>
</span><span style="color:#95cc5e;">func </span><span style="color:#60a365;">main</span><span>() {
</span><span> world </span><span style="color:#d65940;">:= &</span><span>ChannelWorld[ChannelCell]{}
</span><span> </span><span style="color:#d65940;">...
</span><span>}
</span><span>
</span></code></pre>
<p>Once again we are now back to this __type World[T Life] interface __</p>
<p>Because interface implementation is passive in Go I expected this mismatch to be an exception or a compiler error but instead what I have is an unused generic interface and a generic struct. Of course downstream I needed something that implemented World and the rest of my code broke.</p>
</td>
</tr>
</tbody>
</table>
<p>Anyways, this was a big learning for me, I hope it helps.</p>
Distributed Game of LifeTue, 21 Jan 2025 00:00:00 +0000[email protected]
https://developmeh.com/projects/gol/
https://developmeh.com/projects/gol/<h2 id="go-channel-based-poc">Go Channel Based PoC <a class="anchor" href="#go-channel-based-poc">🔗</a>
</h2>
<p><a href="https://github.com/ninjapanzer/gogol_channels">GOGol Channels</a></p>
<p>Some work reused from <a href="https://github.com/ninjapanzer/gogol">GOGol</a></p>
<h2 id="gol">GoL <a class="anchor" href="#gol">🔗</a>
</h2>
<p>I have always found simulations exciting. While the Game of Life is a shallow simulation it is fun how fast you can stand it up. In the old days I would always standup a language and create a Rock Paper Scissors game to prove some minor competency. Now its GoL, I like having to build an animation or a statistics engine. What has gotten to me these days is the scale of GoL and then injecting new rules.</p>
<h3 id="distributed">Distributed <a class="anchor" href="#distributed">🔗</a>
</h3>
<p>So one of the things about distribution I am excited about is the noise from eventual consistency. In a traditional GoL we have what I refer to as the <strong>World</strong> nothing more than a matrix of state that is roughly binary, alive or dead.</p>
<p>The world is pre-populated with a seed, some intentional or random spattering of alive to get the whole thing started.</p>
<p>Skipping the rules now we extend each cell in our matrix from a binary to a stateful object. Maybe they have names now like "bert" and "harry". They can have progeny and a history.</p>
<h3 id="time-series-and-genealogy">Time series and genealogy <a class="anchor" href="#time-series-and-genealogy">🔗</a>
</h3>
<p>My first thought was I could track this history using a time-series db, and even attempted to build on in ERLang. But then I realized I could probably make that a little more interesting if I did it with something distributed like NATS or Kafka.</p>
<h2 id="phase-1">Phase 1 <a class="anchor" href="#phase-1">🔗</a>
</h2>
<p>So the first phase here is to introduce only a distributed <strong>World</strong> that can be queried from a compacted topic. Creating some form of client SDK to observe the world graph</p>
<h2 id="phase-2">Phase 2 <a class="anchor" href="#phase-2">🔗</a>
</h2>
<p>Unbound the graph, focusing only on neighborhoods and seeing if I can just query within a contiguous window of the simulation so I could have different views of the same simulation running at the same time.</p>
<h2 id="phase-3">Phase 3 <a class="anchor" href="#phase-3">🔗</a>
</h2>
<p>Heredity, try and see if I can trace the lineage of a cell through this process of events and a graph datastore to add new rules to the game as a cells heredity expands.</p>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h3 id="21-01-2025">21 01 2025 <a class="anchor" href="#21-01-2025">🔗</a>
</h3>
<h4 id="debugging-stats">Debugging stats <a class="anchor" href="#debugging-stats">🔗</a>
</h4>
<p>In the last build there was a predilection for the program to pin the host systems CPU. Back on the 19th I spent some time exploring profiling but this didn't produce much fruit. The path to producing my own stats was the right choice. The first symptom was related to rendering the stats intermittently from consuming the stats channel.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">go </span><span style="color:#95cc5e;">func</span><span>() {
</span><span> </span><span style="color:#d65940;">for </span><span>{
</span><span> </span><span style="color:#d65940;">for </span><span>{
</span><span> time.Sleep(</span><span style="color:#95cc5e;">250 </span><span style="color:#d65940;">* </span><span>time.Millisecond)
</span><span> s.Update # draw the stats to the window
</span><span>
</span><span> breakDrain:
</span><span> </span><span style="color:#d65940;">select </span><span>{
</span><span> </span><span style="color:#d65940;">case </span><span>e </span><span style="color:#d65940;">:= <-</span><span>s.eventChan:
</span><span> </span><span style="color:#d65940;">if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Heartbeat {
</span><span> hps </span><span style="color:#d65940;">+= </span><span>e.count
</span><span> s.heartbeats </span><span style="color:#d65940;">+= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> } </span><span style="color:#d65940;">else if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Broadcast {
</span><span> bps </span><span style="color:#d65940;">+= </span><span>e.count
</span><span> s.broadcasts </span><span style="color:#d65940;">+= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> } </span><span style="color:#d65940;">else if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Died {
</span><span> dps </span><span style="color:#d65940;">+= </span><span>e.count
</span><span> s.died </span><span style="color:#d65940;">+= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> } </span><span style="color:#d65940;">else if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Resurrected {
</span><span> dps </span><span style="color:#d65940;">-= </span><span>e.count
</span><span> s.died </span><span style="color:#d65940;">-= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> }
</span><span> </span><span style="color:#d65940;">default</span><span>:
</span><span> </span><span style="color:#d65940;">break </span><span>breakDrain
</span><span> }
</span><span> }
</span><span> }
</span><span>}()
</span></code></pre>
<p>The idea here is that in a goroutine this is backgrounded. When there is nothing to consume it will sleep -> render -> drain channels. What I didn't anticipate is that the speed of heartbeats overwhelmed the consumption. The stats channel can hold 10000 messages but so many were being produced that we never broke the consume loop.</p>
<p>I solved this by introducing a ticker like this:</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">go </span><span style="color:#95cc5e;">func</span><span>() {
</span><span> ticker </span><span style="color:#d65940;">:= </span><span>time.NewTicker(time.Second)
</span><span> </span><span style="color:#d65940;">defer </span><span>ticker.Stop()
</span><span>
</span><span> </span><span style="color:#d65940;">for </span><span>{
</span><span> </span><span style="color:#d65940;">select </span><span>{
</span><span> </span><span style="color:#d65940;">case <-</span><span>ticker.C:
</span><span> s.Update # draw the stats to the window
</span><span> </span><span style="color:#d65940;">case </span><span>e </span><span style="color:#d65940;">:= <-</span><span>s.eventChan:
</span><span> </span><span style="color:#d65940;">if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Heartbeat {
</span><span> hps </span><span style="color:#d65940;">+= </span><span>e.count
</span><span> s.heartbeats </span><span style="color:#d65940;">+= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> } </span><span style="color:#d65940;">else if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Broadcast {
</span><span> bps </span><span style="color:#d65940;">+= </span><span>e.count
</span><span> s.broadcasts </span><span style="color:#d65940;">+= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> } </span><span style="color:#d65940;">else if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Died {
</span><span> dps </span><span style="color:#d65940;">+= </span><span>e.count
</span><span> s.died </span><span style="color:#d65940;">+= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> } </span><span style="color:#d65940;">else if </span><span>e.name </span><span style="color:#d65940;">== </span><span>Resurrected {
</span><span> dps </span><span style="color:#d65940;">-= </span><span>e.count
</span><span> s.died </span><span style="color:#d65940;">-= </span><span style="font-style:italic;color:#db784d;">int64</span><span>(e.count)
</span><span> }
</span><span> </span><span style="color:#d65940;">default</span><span>:
</span><span> }
</span><span> }
</span><span>}()
</span></code></pre>
<p>Which forced a render to happen and boy howdy I was producing millions of events a second and thats what was pinning the CPU. Adjusting production of those heartbeats. I eventually moved rendering of the stats window to a separate goroutine with an explicit sleep and used this ticker to compute just the period event counts. Which introduced a new condition with the renderer. The stats details would be printed randomly around the window sometimes. I can imagine that under the hood, ncurses, is moving the cursor to all kinds of random locations on the screen and these instructions go on a stack. Sometimes the location its printing doesn't update fast enough and we get a shadow. Remember we are not doing a full screen refresh, only updating the exact location where a cells state has changed. Once we put a shadow somewhere if there is no activity it remains, and thats ugly.</p>
<p>I was really making goncurses do too much work. The solution to erratic rendering is to put the stats in their own window. I think of it like creating a absolutely positioned div with a z axis as front as possible. In goncurses parlance the z is based on the order its created and thats a new way to introduce a bug but providing a fixed location to write my stats to keeps it from being accidentally rendered somewhere else. I imagine that ncurses creates a new stack for each window and combines their buffered stats on update. But who knows, this is the result:</p>
<p><img src="../with_stats.png" alt="dancing-banana" /></p>
<p>There is still a condition where we might write cells over the stats but its good enough.</p>
</div>
<div class="devlog-entry">
<h3 id="20-01-2025">20 01 2025 <a class="anchor" href="#20-01-2025">🔗</a>
</h3>
<h4 id="stats">Stats <a class="anchor" href="#stats">🔗</a>
</h4>
<p>Added channel based stats that help display how many broadcasts and the aggregate of life in the game.</p>
<p><strong>TODO</strong> Provide some change / second stats related to rendering.</p>
<p>The next phase is to build a more rational renderer in SDL. Ncurses is fine when doing simple displays or text overlays. But high fidelity renders require some ASICS and even using garbage built in graphics hardware acceleration will provide a better experience, and it will be a little fun.</p>
</div>
<div class="devlog-entry">
<h3 id="19-01-2025">19 01 2025 <a class="anchor" href="#19-01-2025">🔗</a>
</h3>
<h4 id="profiling">Profiling <a class="anchor" href="#profiling">🔗</a>
</h4>
<p>What I have learned is that for the best effect in rendering we need to set some timing standards. The listen timing for a cell should be roughly double the heartbeat rate. Although this introduces an interesting issue where we might start blocking on our buffered channel. I don't know if the buffers should be bigger and if we should process all events. I suspect draining to the end of every channel and collecting only the final state is the correct item and listening more often.</p>
<p>I found that profiling in golang is a little weak, as a beginner. I am more familiar with executing any binary observed by a profiler. In Goland at least it prefers to only execute profiling during tests. Which promotes testing and atomic profiling. But I haven't gotten to test design for this PoC yet and goncurses introduces its own challenge when running any code that requires a terminal. Again this is mostly a goland issue.</p>
<p>The solution I found was to execute my test with profiling and then move the command that goland generated to a new terminal that goncurses would support and open the pprof it generated in Goland. I think the real fix here is to :TODO: check if the terminal is supporte by goland and skip window generation if its not. While this impacts performance around goncurses rendering it will at least allow me to focus on where my own code is non-performant.</p>
</div>
<div class="devlog-entry">
<h3 id="15-01-2025">15 01 2025 <a class="anchor" href="#15-01-2025">🔗</a>
</h3>
<h4 id="getting-started">Getting Started <a class="anchor" href="#getting-started">🔗</a>
</h4>
<p>So while this project was going to wait for <a href="/i-made-a-thing/recreating-kafka-blind">Krappy Kafka</a> I got to a point where I needed to see a simulacra of the larget project. In this PoC I also added the following requirements:</p>
<ul>
<li>Must used concurrency</li>
<li>Must not use mutexes (So its going to be channels)</li>
</ul>
<p>The larger scale of the work stays the same. We will establish a neighborhood of automata and then allow them to communicate with each other. In this specific scenario they will broadcast themselves to their neighbors.</p>
<p>Each cell owns its own channel and when the neighborhood initializes cells look for their neighbors and collect their channels. So with exception of the edges each cell holds a collection of 8 other cells. When all the cell is initialized it also gets bit vector of each of its neighbors in the order the neighbors are registered.</p>
<p>Here is how neighbors are registered:</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">for </span><span>i </span><span style="color:#d65940;">:= -</span><span style="color:#95cc5e;">1</span><span>; i </span><span style="color:#d65940;"><= </span><span style="color:#95cc5e;">1</span><span>; i</span><span style="color:#d65940;">++ </span><span>{
</span><span> </span><span style="color:#d65940;">for </span><span>j </span><span style="color:#d65940;">:= -</span><span style="color:#95cc5e;">1</span><span>; j </span><span style="color:#d65940;"><= </span><span style="color:#95cc5e;">1</span><span>; j</span><span style="color:#d65940;">++ </span><span>{
</span><span> </span><span style="color:#d65940;">if </span><span>i </span><span style="color:#d65940;">== </span><span>y </span><span style="color:#d65940;">&& </span><span>j </span><span style="color:#d65940;">== </span><span>x {
</span><span> </span><span style="color:#d65940;">continue
</span><span> }
</span><span> </span><span style="color:#d65940;">if </span><span>y</span><span style="color:#d65940;">+</span><span>i </span><span style="color:#d65940;">< </span><span style="color:#95cc5e;">0 </span><span>{
</span><span> </span><span style="color:#d65940;">continue
</span><span> }
</span><span> </span><span style="color:#d65940;">if </span><span>y</span><span style="color:#d65940;">+</span><span>i </span><span style="color:#d65940;">>= </span><span>height {
</span><span> </span><span style="color:#d65940;">continue
</span><span> }
</span><span> </span><span style="color:#d65940;">if </span><span>x</span><span style="color:#d65940;">+</span><span>j </span><span style="color:#d65940;">< </span><span style="color:#95cc5e;">0 </span><span>{
</span><span> </span><span style="color:#d65940;">continue
</span><span> }
</span><span> </span><span style="color:#d65940;">if </span><span>x</span><span style="color:#d65940;">+</span><span>j </span><span style="color:#d65940;">>= </span><span>width {
</span><span> </span><span style="color:#d65940;">continue
</span><span> }
</span><span>
</span><span> cell.AddChannel(cells[y</span><span style="color:#d65940;">+</span><span>i][x</span><span style="color:#d65940;">+</span><span>j].BroadcastChan())
</span><span> }
</span><span>}
</span></code></pre>
<p>One caveat here is that our renderer is ncurses which tends to view the world as <code>y,x</code> not <code>x,y</code> which is a nuance but just means we tend to look our our world as a 2 dimensional array with <code>y</code> being the first index.</p>
<p>Assuming <code>x</code> and <code>y</code> is the location of the cell looking for its neighbors we create an offset of -1 to 1 in both horizontal and vertical directions to touch all 8 of its neighbors. You can visualize it this way:</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>+-----------+-----------+-----------+
</span><span>| | | |
</span><span>| (-1,-1) | (0,-1) | (1,-1) |
</span><span>| | | |
</span><span>+-----------+-----------+-----------+
</span><span>| | | |
</span><span>| (-1, 0) | (0, 0) | (1, 0) |
</span><span>| | | |
</span><span>+-----------+-----------+-----------+
</span><span>| | | |
</span><span>| (-1, 1) | (0, 1) | (1, 1) |
</span><span>| | | |
</span><span>+-----------+-----------+-----------+
</span></code></pre>
<p>So at this point we don't have self-initialization which is a problem as we are initializing the world before we start the simulation. But that is something for the future.</p>
<p>Because a cell needs to know its starting state and the starting state of its neighbors we also accumulate the initial bit vector using <strong>1</strong> for <strong>Alive</strong> and 0 for <strong>Dead</strong>.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">if </span><span>cells[y</span><span style="color:#d65940;">+</span><span>i][x</span><span style="color:#d65940;">+</span><span>j].State() {
</span><span> cell.AddNeighborState(</span><span style="color:#95cc5e;">1</span><span>)
</span><span>} </span><span style="color:#d65940;">else </span><span>{
</span><span> cell.AddNeighborState(</span><span style="color:#95cc5e;">0</span><span>)
</span><span>}
</span></code></pre>
<p>As cells live they listen for updates on its neighbors channels. These are buffered so they do not immediately block and each time a cell checks the mail it doesn't expect that a neighbor has sent them anything. So we assume no news is good news, and future state management is done by collecting a bit mask of the updates. We then compute the neighbors state by applying the bit mask repeatedly to the starting state. Ultimately, a cell is always hoping its memory is good enough to stay alive.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">for </span><span>_, neighborChan </span><span style="color:#d65940;">:= </span><span style="color:#67854f;">range </span><span>c.neighborChans {
</span><span> </span><span style="color:#d65940;">select </span><span>{
</span><span> </span><span style="color:#d65940;">case </span><span>_ </span><span style="color:#d65940;">= <-</span><span>neighborChan: </span><span style="color:#3c4e2d;">// If a message is received on this channel
</span><span> latestStates </span><span style="color:#d65940;"><<= </span><span style="color:#95cc5e;">1
</span><span> latestStates </span><span style="color:#d65940;">|= </span><span style="color:#95cc5e;">1
</span><span> </span><span style="color:#3c4e2d;">//gotUpdate = true
</span><span> </span><span style="color:#d65940;">default</span><span>:
</span><span> latestStates </span><span style="color:#d65940;"><<= </span><span style="color:#95cc5e;">1
</span><span> latestStates </span><span style="color:#d65940;">|= </span><span style="color:#95cc5e;">0
</span><span> }
</span><span>}
</span></code></pre>
<p>The bit vector of the initial state is in the same order as the channels the cell is listening to. This will allow us to later collect updates as a bit mask to progressing state on each cycle of the cells life.</p>
<p>This is a good point to mention that we are deviating from the traditional GoL structure now because we are not binding the world state and the render state of the world. Each cell has some randomness in how fast it lives and this dictates how often it produces a heartbeat and its render speed. Due to this variety of signals we tend not to see smooth growth. Instead we see snapshots because our cells are sometimes living faster than we can observe. There is some work that can be done to tune a more consistent view although its not quite as <strong>beautiful</strong>.</p>
<p>Cells only produce an event when they are alive and the moment of death. This means that when a cell comes alive it doesn't announce itself. This provides a small delay before impacting other cells. When the cells state changes we mark the change with the renderer. In this case we treat the terminal screen object as a frame buffer. While ncurses is particularly bad for reactive programming it does expose some primitives which provide an async render loop.</p>
<p>Under the hood</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#3c4e2d;">// NoutRefresh, or No Output Refresh, flags the window for redrawing but does
</span><span style="color:#3c4e2d;">// not output the changes to the terminal (screen). Essentially, the output is
</span><span style="color:#3c4e2d;">// buffered and a call to Update() flushes the buffer to the terminal. This
</span><span style="color:#3c4e2d;">// function provides a speed increase over calling Refresh() when multiple
</span><span style="color:#3c4e2d;">// windows are involved because only the final output is
</span><span style="color:#3c4e2d;">// transmitted to the terminal.
</span><span style="color:#95cc5e;">func </span><span>(w </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">Window</span><span>) </span><span style="color:#60a365;">NoutRefresh</span><span>() {
</span><span> C.wnoutrefresh(w.win)
</span><span> </span><span style="color:#d65940;">return
</span><span>}
</span></code></pre>
<p>Thus we can create a render loop in our main function that only calls <strong>Update</strong> on the <strong>Screen</strong>. As I mentioned before this means not every update is rendered and we get a smooth but not very organic rendering. There is also no clear a redraw we are progressively updating the display and accumulating writes. If you have worked in a rendering engine before you may be aware of the render loop that broadcasts a tick to all components and then draws the accumulated changes to screen. The smoothness of such renderings is based on the rate of change in distance. Our eye does a good job of building the motion between near states but not so good when the distance is rather far.</p>
</div>
An Internet of Changing MoralityMon, 14 Oct 2024 00:00:00 +0000[email protected]
https://developmeh.com/terms-and-afflictions/software-ethics/
https://developmeh.com/terms-and-afflictions/software-ethics/<p><img src="../dune_slave.jpg" alt="dancing-banana" /></p>
<p>As one might expect, Automated Imitation has dramatically changed the sales position for what it is to produce code and code-like products. As one might imagine there is a delusion that LLM's are producing code faster for startups at lower costs... Maybe. I don't see a lot of formal proof of this, which also doesn't mean it doesn't exist. I also don't tend to read hacker news so take that how you like.</p>
<p>I have also noticed this move away from "Clean Code," which is fine, I guess because that "exciting" seems to be the same as my opinions on the dogmatic use of DRY and other "can't apply everywhere" navel-gazing.</p>
<p>I think LLMs themselves are just a step in a direction, a real breakthrough for reliable generation of specifics from a broad context. My senior thesis back 20 years ago was just this. I wanted to shove the works of Mark Twain into a computerized token store and have it answer questions in the style and opinion of his writing. I was naive and had beautiful ideas...</p>
<p>I wasn't able to accomplish that, but if there was an aspect of that project I should have spent more time on it was the ethics of the thing. There is clearly nothing unethical about the idea just that ethics always plays a part. A more interesting project would have been to partner with someone in my college's theology or philosophy departments to explore some of the bigger ramifications of the work.</p>
<p>I have seen an interesting shift, between the layoffs and "New" product hype that's worth mentioning. I see companies refusing to apply the label AI to their products and instead terming them as LLM and ML products. This has always rung true to me, not because I am a person who avoids reductionist language, clearly, but because it more precise to describe a product as what it is. Intelligence might be taking it too far is all. To me the position of computers has always been the same, do annoying repetitive things and leave the fun stuff to me. I don't wanna do DNS lookups, and I don't think anyone ever has.</p>
<p>I know this because in the 90s, I was on AOL(America Online) in a time without modern search engines and an internet barely with images. I had a printed book, like a phone book, with printed pages that listed AOL Keywords for Businesses. Yes, before there was really an internet, companies were competing over keywords to capture users' attention. For example CAFFE STARBUCKS - Caffe Starbucks or CANT SLEEP - The Late Night Survey. Back then, there was even a kind of human-run auction house where keywords were bought and traded. All this to capture the attention of my parents, who could barely be bothered. Everything was new, and yet it was just a better link aggregator, albeit like others from that time, like Geocities, long forgotten.</p>
<p>All that is consistent is our need to share, and be free to do so. The cycles that we repeat are glorious in how they change our lives. Once they did so by what felt like an accident, now those are marketed to us trying to re-capture that glory. The days of guestbooks and visitor counters were replaced with comments and likes. We were never slaves to these tools; our limitation was our self-education. Once we learned CSS and Java Applets to grab visitors' attention. Now, we invent clever camera and editing techniques to collect users. The only change is the evolution of what was the product.</p>
<p>Having stood sentinel to a world where the internet was almost free, through the advent of "popups" and "AdSense," and continuing to today, where there is no escape from the cacophony of influencers. I have always known that at its core all marketing is dishonest and all sales are exploitation but we all need the Money.</p>
<p>I have often considered the impact of all these deceptive interactions on my psyche. The noise is so fervent and quick that it loses meaning creating a kind of alternate reality where I feel myself disconnecting from reality in defense of the conflict it creates in me. It makes me want to cry, but it also has so little meaning that I would give it, which cannot be named collective value. My brain wasn't formed for this, and I cannot confirm that any brains are, but it has become viral.</p>
<p>What comes next, I wonder? Hopefully, it will be a renaissance through the revelation of Automated Imitation. A brain training on the noise produces "slop." It feels like that is probably the peek of its capability. We clearly have to create and build to grow the capabilities of a system that is a focused mirror on our desires.</p>
<p>Is it then true that the world of Clean Code is no longer needed? Why bother with the craft of structuring code if we can rely on a computer to do it for us. It's not exactly the fun part of the work. It is to me and I will continue to do it, but that doesn't mean you do. But what rules will the computers use and will we even read the code anymore? Is there even a need for anything not to be a binary? Consider a website is just conjured when requested as opposed to rendered. What's the point of not even having a website anymore, and how will it differentiate the ads? Will the content I receive ever be free of targeted influence. Consider the existential horror when the same query posed by my partner and me on different terminals produces completely different experiences. Using different terminology, injecting vague opinions referring to targeted marketing that spans multiple queries, and tuning our searches toward products.</p>
<p>The ethics of this astound me. Do all things need to grow? Are we all just enterprises? What if you didn't need to be a millionaire? How is luxury anything more than a prison?</p>
<p>Back in my senior year of college, we also took an ethics course tailored to our future careers in software, guided by the understanding that there was no end to this roller-coaster. It was going to be the builders that set the standards of not quality but morality. Like the Clean Code, it takes very few poor actions to impact the whole negatively. If you wanna hear a fantastic take on "broken window theory" I'll direct you over to <a href="https://blog.codinghorror.com/the-broken-window-theory/">Jeff Atwood</a> for our purposes though the windows are broken, and its time to "take the neighborhood back."</p>
<p>I took my current position because the idea of being closer to the world of Clean Code was inviting, but in selling a service, I have to find the balance of saying yes to Money and no to producing slop. I think the same challenge is that of the pharmaceutical industry, an enterprise ridden with questionable morality. I deeply attest that there are workers, scientists, sales agents, custodial staff, and IT who are fighting for the moral right to the best anyone can. I specifically refuse to believe that a scientist wants to use science to hurt people and that they take the evaluation and communication of risk very carefully, I am probably naive. How do we get to a place where the collective view of the product of their acts is less than ethical?</p>
<p>If you had my 8th-grade history teacher, there could be only one answer. Let's all say it together, Money! Not that it is evil, but it is a prime motivator. It drives change because it's both the Golden Apple we covet and the Golden Apple we die for. Not a formal death mind you, but a spiritual one. We should be cautious of the need to succeed, and yes, maybe the era of worrying about the quality of our code has ended; I am still suspicious, but I prefer that the narrative be moved to Clean Values. We are not building so we can meet some arbitrary velocity that proves to our users, "we are doing things," but what we do matters.</p>
<p>I would argue that every engineer who decided to stay with Apple when they were required to have an open app store in the EU so 2 app stores could be produced was an indicator of the issue. Heck, I remember when Apple stopped using DRM on MP3s because the juice wasn't worth the squeeze to keep building DRM as music prices fell. The decision to have a legally distinct app store proves they think the juice is worth the squeeze to maintain a monopoly on certain users. Humans will do the work, and those humans, in some small way, accept exploitation as acceptable. They are protecting the profits of their parent company over their fellow humans, including other developers, creating valuable things for humans. That was a battle lost where those who control the means of production should have forced liberation for all.</p>
<p>You don't have to be a rebel to be moral; you just have to look at your actions and continue to take the one that produces the least suffering. We follow the rule, "Leave the campground cleaner than you found it." I would honestly rather argue about whether to build this on moral grounds during a code review or ticket grooming session than fussing about polymorphism anyway.</p>
Am I the Crazy One?Thu, 10 Oct 2024 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/aitco/
https://developmeh.com/soft-wares/aitco/<p>The feeling of AITA or "Am I just Crazy" happens to me a lot. I think it might be the core weakness when trying to manifest confidence in my work.</p>
<p>Sometimes, it is not enough that your changes, be they code or process, tend to work correctly or, if you are lucky, efficiently. Something always comes in and makes me think, "No, I must be the crazy person. Why am I getting so much pushback?"</p>
<p>Sometimes, I think it's the evolution of some impostor syndrome, but the reality is that everything I tend to do is the first and maybe the only time I will touch it. A lifetime of one-off experiences is quite exciting, but divining a wall to kick off from at the start of every endeavor has its costs.</p>
<p>The reality is really perception and comprehension. I have learned that the smarter we are has very little to do with actual knowledge and, in some cases, even how fast we learn things. IQ might point otherwise, but that's not quite what I am talking about cause it's only a facet.</p>
<p>I really overlooked the quality of perception, especially when playing DnD. It is attached to your Wisdom trait somewhat interchangeably.</p>
<p>"Wisdom reflects how attuned you are to the world around you and represents perceptiveness and intuition." Dnd 5e</p>
<p>But a person's depth of perception is really what sets us apart when it comes to divining sources of support. Simply knowing I can is almost enough. Getting others to understand you can, too, proves it.</p>
<p>Contextually it would be easier if you could advertise your Wisdom in a way that tells others to let you run free without having to build their confidence in your skills. Perception has been the hardest thing I have ever had to explain to someone else because some parts of my awareness are not available to them. It sounds arrogant, sure, but it's not about being better; it's just different. I also see perception as a continuum; mine is all logical and mechanical. I can't perceive an object in a new color, but I can imagine what it looks like inside out.</p>
<p>We have the same problem: We have a language problem explaining and comparing our perceptions. Maybe the core here is respecting others for their differences. When others don't understand you or your processes, it's possibly because we fundamentally see the world differently. Don't hold it against them or yourself.</p>
Not Invented HereMon, 30 Sep 2024 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/nih/
https://developmeh.com/soft-wares/nih/<p>Not Invented Here Syndrome is something that has been entering my world a lot lately.</p>
<p>To be a bit reductionist this is a bias to create everything by hand or rely on our own opinions vs those from "foreign" groups.</p>
<p>I tend to think a brash example of this would be the Flat Earth movement. They prefer to rely on what they could experientially see and define themselves and refute those from public doctrine.</p>
<p>But these days I see it when we practice Big Business Bullshit Bingo (4Bs). The constant need to redefine value and process with our own words to enact a sense of possession over them.</p>
<p>Kinda like Science Doctrine, we have been doing the business thing for a long time. We have been selling and building and billing for much of human existence and like science we have done the due diligence and the tools have been laid bare on the table. Would it be so wrong to borrow and use them? Perhaps even enhance them with our own distinct color and give them back.</p>
<p>I find this modality funny because I love open source which constantly teeters on NIH and standardization. But as a proof NIH becomes standardization not the other way around successfully. That process is through great collaboration and effort.</p>
<p>There must be some psychological effect that happens as we mature in a career where we, myself included, start forgetting that not all good ideas are our own and that bespoke for the sake of possession is meerly a siren song.</p>
<p>TLDR; Be boring, get shit done, fix only whats broke!</p>
The Software Delivery TrapFri, 13 Sep 2024 00:00:00 +0000[email protected]
https://developmeh.com/terms-and-afflictions/software-delivery/
https://developmeh.com/terms-and-afflictions/software-delivery/<p>I have been hearing a lot lately about the focus on "Delivery," and it has always struck me as somewhat reductionist and linear in its thinking.</p>
<p>I mean, I get it. Somewhere, a group of (non/ex) builders said, fallaciously, that everything can be built and delivered as a trackable lifecycle. There I go now, being a reductionist, but the difference between building and selling is in play here.</p>
<p>My conflict concerns ownership of the software product and production of technical debt. This is informed by my personal experiences, so keep that in mind. Software work comes in multiple flavors, which primarily impacts the technical debt. For a long time, I have said.</p>
<p>"a software engineer's job isn't to build software but to manage, even anticipate, change."</p>
<p>The pragmatic programmers can leave now and go yell at someone else's clouds.</p>
<p>For those who remain, there is no reason we can't do both. Think of the box that we will ship now and the box that we will ship next week as distinct but part of a system. I often don't see that happen because, in my experience, the roadmap is less than six weeks long for many companies.</p>
<p>Let's talk about behaviors that are successful, though, and not mire them in Big Business Bullshit Bingo terminology. For a product to be successful and be deliverable it must:</p>
<ul>
<li>Have at least a 3-month roadmap with stretch goals</li>
<li>Understand and have mapped the dynamics of the "socio-technical" system it will be delivered in [Teams, Vendors, Business Processes]</li>
<li>Presented to teams as something more significant than a 2-week unit</li>
<li>Must leave space for non-deliverables [System Design and Maintenance]</li>
<li>Must be assigned champions, not just responsible for the team but for driving the completion of the work and owning when it will inevitably go a little pear-shaped.</li>
</ul>
<p>That last one is the next thing we need to train and hire for: ownership. In the days of my grandfather, we would call it "pride in our work." I am going to do a thing, and I am going to be proud of it. If this seems abstract at all, I advise you to read "The Four Agreements" by Don Miguel Ruiz. It's a quick read, I promise!</p>
<p>We build ownership in a few ways, but the easiest is continuity of effort. We work on a problem until it is either SOLVED or GOOD ENUF. The vagueness of the latter is important. Consider this a real-life example of P versus NP.</p>
<p>"can every problem whose solution can be quickly verified can also be quickly solved"</p>
<p>The answer is Nope :)</p>
<p>"Good enuf" means we have at least awareness of the tradeoffs we made and have left some record of the decisions that led us there.
So:</p>
<ul>
<li>Be proud of your work</li>
<li>Be prepared to say no to work that doesn't have a clear roadmap</li>
<li>Be prepared to ask for different if you don't want to be an SME</li>
</ul>
<p>To be clear ownership is not being a Subject Matter Expert. Sometimes toxic, some of us don't want to be pigeon holed and that scares some from ownership. In a healthy work system three has to be a balance were what you do often will be things you have done in the past but it shouldn't be the only thing you do.</p>
<p>There is another tifle though, uninspired work, in this case I often see developers doing a little as possible to move on to something more interesting. Once again I go back to my grandfather who didn't believe in small work. There's a waste of time, and those don't count, it is our responsibility to contextualize the value of our activities and seek clearity when value is low. But, a job still needs to be completed, and doing your best work means doing the work until its Done. I see the opposite in places with uninspired work, the team kinda wanders off from boring work and take so long it gets re-backlogged. The choise of learning something new to solve a problem and letting it kinda become more expensive than its worth occurs.</p>
<p>This being the rarest trait I find in most workers, not just software, is doing the job right, completing the work, and having standards. Sounds rude yea, well sometimes the truth hurts, and thats a learning process. I am not judging you, but if you feel some pangs, well...</p>
Sankey GitWed, 11 Sep 2024 00:00:00 +0000[email protected]
https://developmeh.com/projects/sankey-git/
https://developmeh.com/projects/sankey-git/<h2 id="idea-generation">Idea generation <a class="anchor" href="#idea-generation">🔗</a>
</h2>
<p>Something odd about idea generation is its heredity. Maybe a sankey isn't the right way to diagram it but I think it would look cool if it was weighted by number of commits and the distance between commits.</p>
<p>Probably relational timestamps for related gits could inform on the branching order. I dunno.</p>
<h3 id="tagging">Tagging <a class="anchor" href="#tagging">🔗</a>
</h3>
<p>So for that work I need a network of git repos or todolist items sharing the same tags</p>
<h3 id="logseq">Logseq <a class="anchor" href="#logseq">🔗</a>
</h3>
<p>Probably the same is true about logseq graphs if I can track time.</p>
Learn Event Streaming by Recreating KafkaTue, 10 Sep 2024 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/recreating-kafka-blind/
https://developmeh.com/i-made-a-thing/recreating-kafka-blind/<h1 id="learn-event-streaming-by-recreating-kafka">Learn Event Streaming by Recreating Kafka <a class="anchor" href="#learn-event-streaming-by-recreating-kafka">🔗</a>
</h1>
<blockquote>
<p><strong>I don't know if I really like Kafka all that much</strong></p>
</blockquote>
<p>That said its an interesting way for applications to communicate. See I have been imagining a global game of life implementation with distributed realtime events for each cell in the world. While that is a rather far off dream it made sense to tackle one of its problems. In the past I have setup Kafka development envs with <a href="https://github.com/ninjapanzer/game_of_life_kafka/blob/main/flake.nix">Nix</a> this is all pretty easy. It even goes so far as to try and track how many shells are currently running to track shutting down Kafka when all is said and done. The only reason I do this is because Kafka is written in Java, famous for the addage, <em>"write once use 80% of the resources everywhere."</em> For better or worse that has always stunk for me.</p>
<blockquote>
<p><strong>Kafka can't be that complicated write?</strong></p>
</blockquote>
<p>Thats probably both true and false. The only way to tell would be to try. So while this might be structured like a tutorial its really a devlog of the failures to interpret the features.</p>
<h2 id="goals">Goals <a class="anchor" href="#goals">🔗</a>
</h2>
<p>To start we want to create a realtime streaming platform that provides at least:</p>
<ul>
<li>A binary protocol</li>
<li>A protocol built on TCP</li>
<li>TCP connection multiplexing for consumers and producers</li>
<li>Event Stream and Log Sequence Merge storage</li>
<li>Be filesystem oriented where possible</li>
<li>Don't invent everything just whats needed</li>
<li>GUI testing and debugging tools</li>
<li>Protocol client for consumers and producers</li>
</ul>
<h2 id="arch-diagram">Arch Diagram <a class="anchor" href="#arch-diagram">🔗</a>
</h2>
<p><svg xmlns="http://www.w3.org/2000/svg" style="cursor:pointer;max-width:100%;max-height:231px;" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="481px" viewBox="-0.5 -0.5 481 231" content="<mxfile host="app.diagrams.net" agent="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:133.0) Gecko/20100101 Firefox/133.0" version="26.0.3">
 <diagram name="Page-1" id="87HI9CB669h_auib6GwQ">
 <mxGraphModel dx="2524" dy="1296" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
 <root>
 <mxCell id="0" />
 <mxCell id="1" parent="0" />
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-1" value="Krappy Server" style="rounded=0;whiteSpace=wrap;html=1;verticalAlign=top;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" vertex="1">
 <mxGeometry x="280" y="120" width="280" height="230" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-7" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" source="EgLxcdtnQqiLWyNZ09j0-2" target="EgLxcdtnQqiLWyNZ09j0-1" edge="1">
 <mxGeometry relative="1" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-2" value="Producer Client" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" vertex="1">
 <mxGeometry x="80" y="120" width="120" height="60" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-4" value="LSM Store" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" vertex="1">
 <mxGeometry x="290" y="190" width="120" height="60" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-5" value="Stream Store" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" vertex="1">
 <mxGeometry x="430" y="190" width="120" height="60" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-6" value="Consumer Group Props" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" vertex="1">
 <mxGeometry x="290" y="270" width="120" height="60" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-9" style="edgeStyle=orthogonalEdgeStyle;rounded=0;orthogonalLoop=1;jettySize=auto;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" source="EgLxcdtnQqiLWyNZ09j0-8" target="EgLxcdtnQqiLWyNZ09j0-1" edge="1">
 <mxGeometry relative="1" as="geometry" />
 </mxCell>
 <mxCell id="EgLxcdtnQqiLWyNZ09j0-8" value="Consumer Client" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#eeeeee;strokeColor=#36393d;" parent="1" vertex="1">
 <mxGeometry x="80" y="260" width="120" height="60" as="geometry" />
 </mxCell>
 </root>
 </mxGraphModel>
 </diagram>
</mxfile>
" onclick="(function(svg){var src=window.event.target||window.event.srcElement;while (src!=null&&src.nodeName.toLowerCase()!='a'){src=src.parentNode;}if(src==null){if(svg.wnd!=null&&!svg.wnd.closed){svg.wnd.focus();}else{var r=function(evt){if(evt.data=='ready'&&evt.source==svg.wnd){svg.wnd.postMessage(decodeURIComponent(svg.getAttribute('content')),'*');window.removeEventListener('message',r);}};window.addEventListener('message',r);svg.wnd=window.open('https://viewer.diagrams.net/?client=1&page=0&edit=_blank');}}})(this);"><defs><style type="text/css">@import url(https://fonts.googleapis.com/css2?family=Architects+Daughter:wght@400;500);
</style></defs><g><g><rect x="200" y="0" width="280" height="230" fill="#eeeeee" style="fill: light-dark(rgb(238, 238, 238), rgb(32, 32, 32)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" pointer-events="all"/></g><g><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe flex-start; justify-content: unsafe center; width: 278px; height: 1px; padding-top: 7px; margin-left: 201px;"><div style="box-sizing: border-box; font-size: 0; text-align: center; color: #000000; "><div style="display: inline-block; font-size: 12px; font-family: "Helvetica"; color: light-dark(#000000, #ffffff); line-height: 1.2; pointer-events: all; white-space: normal; word-wrap: normal; ">Krappy Server</div></div></div></foreignObject><text x="340" y="19" fill="light-dark(#000000, #ffffff)" font-family=""Helvetica"" font-size="12px" text-anchor="middle">Krappy Server</text></switch></g></g><g><path d="M 120 30 L 160 30 L 160 115 L 193.63 115" fill="none" stroke="#36393d" style="stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 198.88 115 L 191.88 118.5 L 193.63 115 L 191.88 111.5 Z" fill="#36393d" style="fill: light-dark(rgb(54, 57, 61), rgb(186, 189, 192)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" stroke-miterlimit="10" pointer-events="all"/></g><g><rect x="0" y="0" width="120" height="60" fill="#eeeeee" style="fill: light-dark(rgb(238, 238, 238), rgb(32, 32, 32)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" pointer-events="all"/></g><g><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 30px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0; text-align: center; color: #000000; "><div style="display: inline-block; font-size: 12px; font-family: "Helvetica"; color: light-dark(#000000, #ffffff); line-height: 1.2; pointer-events: all; white-space: normal; word-wrap: normal; ">Producer Client</div></div></div></foreignObject><text x="60" y="34" fill="light-dark(#000000, #ffffff)" font-family=""Helvetica"" font-size="12px" text-anchor="middle">Producer Client</text></switch></g></g><g><rect x="210" y="70" width="120" height="60" fill="#eeeeee" style="fill: light-dark(rgb(238, 238, 238), rgb(32, 32, 32)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" pointer-events="all"/></g><g><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 100px; margin-left: 211px;"><div style="box-sizing: border-box; font-size: 0; text-align: center; color: #000000; "><div style="display: inline-block; font-size: 12px; font-family: "Helvetica"; color: light-dark(#000000, #ffffff); line-height: 1.2; pointer-events: all; white-space: normal; word-wrap: normal; ">LSM Store</div></div></div></foreignObject><text x="270" y="104" fill="light-dark(#000000, #ffffff)" font-family=""Helvetica"" font-size="12px" text-anchor="middle">LSM Store</text></switch></g></g><g><rect x="350" y="70" width="120" height="60" fill="#eeeeee" style="fill: light-dark(rgb(238, 238, 238), rgb(32, 32, 32)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" pointer-events="all"/></g><g><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 100px; margin-left: 351px;"><div style="box-sizing: border-box; font-size: 0; text-align: center; color: #000000; "><div style="display: inline-block; font-size: 12px; font-family: "Helvetica"; color: light-dark(#000000, #ffffff); line-height: 1.2; pointer-events: all; white-space: normal; word-wrap: normal; ">Stream Store</div></div></div></foreignObject><text x="410" y="104" fill="light-dark(#000000, #ffffff)" font-family=""Helvetica"" font-size="12px" text-anchor="middle">Stream Store</text></switch></g></g><g><rect x="210" y="150" width="120" height="60" fill="#eeeeee" style="fill: light-dark(rgb(238, 238, 238), rgb(32, 32, 32)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" pointer-events="all"/></g><g><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 180px; margin-left: 211px;"><div style="box-sizing: border-box; font-size: 0; text-align: center; color: #000000; "><div style="display: inline-block; font-size: 12px; font-family: "Helvetica"; color: light-dark(#000000, #ffffff); line-height: 1.2; pointer-events: all; white-space: normal; word-wrap: normal; ">Consumer Group Props</div></div></div></foreignObject><text x="270" y="184" fill="light-dark(#000000, #ffffff)" font-family=""Helvetica"" font-size="12px" text-anchor="middle">Consumer Group Props</text></switch></g></g><g><path d="M 120 170 L 160 170 L 160 115 L 193.63 115" fill="none" stroke="#36393d" style="stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke-miterlimit="10" pointer-events="stroke"/><path d="M 198.88 115 L 191.88 118.5 L 193.63 115 L 191.88 111.5 Z" fill="#36393d" style="fill: light-dark(rgb(54, 57, 61), rgb(186, 189, 192)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" stroke-miterlimit="10" pointer-events="all"/></g><g><rect x="0" y="140" width="120" height="60" fill="#eeeeee" style="fill: light-dark(rgb(238, 238, 238), rgb(32, 32, 32)); stroke: light-dark(rgb(54, 57, 61), rgb(186, 189, 192));" stroke="#36393d" pointer-events="all"/></g><g><g transform="translate(-0.5 -0.5)"><switch><foreignObject style="overflow: visible; text-align: left;" pointer-events="none" width="100%" height="100%" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: flex; align-items: unsafe center; justify-content: unsafe center; width: 118px; height: 1px; padding-top: 170px; margin-left: 1px;"><div style="box-sizing: border-box; font-size: 0; text-align: center; color: #000000; "><div style="display: inline-block; font-size: 12px; font-family: "Helvetica"; color: light-dark(#000000, #ffffff); line-height: 1.2; pointer-events: all; white-space: normal; word-wrap: normal; ">Consumer Client</div></div></div></foreignObject><text x="60" y="174" fill="light-dark(#000000, #ffffff)" font-family=""Helvetica"" font-size="12px" text-anchor="middle">Consumer Client</text></switch></g></g></g><switch><g requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"/><a transform="translate(0,-5)" xlink:href="https://www.drawio.com/doc/faq/svg-export-text-problems" target="_blank"><text text-anchor="middle" font-size="10px" x="50%" y="100%">Text is not SVG - cannot display</text></a></switch></svg></p>
<h2 id="protocol-design">Protocol Design <a class="anchor" href="#protocol-design">🔗</a>
</h2>
<span style="" class="mermaid">
---
title: "Message"
---
packet-beta
0-31: "Message Size UInt32"
32-63: "Message Type UInt32"
64-95: "Message Payload (Variable Length)"
</span>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h2 id="25-12-2024">25 12 2024 <a class="anchor" href="#25-12-2024">🔗</a>
</h2>
<p>Having now deployed this to k8s through k0s on a local but remote server I have noticed there are throughput issues. While, on the same machine it is possible for a high velocity producer to continuously send events at raw go runtime speed while the server consumes them. But the server is a little more limited and we are finding that either due to disk access speed introduced by containerd or that systems slower architecture we can easily exceed the available threads and crash the app.</p>
<p>I have a couple of ideas of how to handle this:</p>
<ol>
<li>Right now I allocate and write to both a log file and pebbled db on each message received. I might be better batching writes and feeding them into pebble through a channel.</li>
<li>The condition seems to be limited to producer events and possibly there is a bug related to how connections are closed. Possibly, they are not closing immediately on client close and are waiting 5 seconds, effectively backlogging.</li>
</ol>
</div>
<div class="devlog-entry">
<h2 id="22-12-2024">22 12 2024 <a class="anchor" href="#22-12-2024">🔗</a>
</h2>
<p>There has been some work setting up k0s and learning the toolchains involved. I also integrated a build pipeline using Nix. Allows this project to now produce a 40mb image that is easy to deploy to my local k0s. Intentionally, this k8s instance is on a remote machine so I have at least a small non-localhost network effect when testing. I refactored how event handlers are declared under a specific interface for handlers.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">func </span><span>(h </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">Handlers</span><span>) </span><span style="color:#60a365;">ExecuteHandler</span><span>(name </span><span style="font-style:italic;color:#db784d;">string</span><span>, ctx context.</span><span style="color:#95cc5e;">Context</span><span>, contract </span><span style="color:#95cc5e;">interface</span><span>{}) (context.</span><span style="color:#95cc5e;">Context</span><span>, </span><span style="font-style:italic;color:#db784d;">error</span><span>)
</span><span style="color:#95cc5e;">func </span><span>(h </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">Handlers</span><span>) </span><span style="color:#60a365;">ExecuteWithWriteHandler</span><span>(name </span><span style="font-style:italic;color:#db784d;">string</span><span>, ctx context.</span><span style="color:#95cc5e;">Context</span><span>, contract </span><span style="color:#95cc5e;">interface</span><span>{}, w io.</span><span style="color:#95cc5e;">Writer</span><span>) (context.</span><span style="color:#95cc5e;">Context</span><span>, </span><span style="font-style:italic;color:#db784d;">error</span><span>)
</span></code></pre>
<p>while I dont know the normality of creating functional interfaces in Golang, this felt more natural than constructing a type because I had some regular variance in types to controll write access.</p>
<p>That said a message handler did become a type</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">type </span><span>Handlers </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> s </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">Server
</span><span> messageHandlers </span><span style="color:#95cc5e;">map</span><span>[</span><span style="font-style:italic;color:#db784d;">string</span><span>]</span><span style="color:#95cc5e;">func</span><span>(ctx context.</span><span style="color:#95cc5e;">Context</span><span>, contract </span><span style="color:#95cc5e;">interface</span><span>{}, writer io.</span><span style="color:#95cc5e;">Writer</span><span>) (context.</span><span style="color:#95cc5e;">Context</span><span>, </span><span style="font-style:italic;color:#db784d;">error</span><span>)
</span><span>}
</span></code></pre>
<p>Handler registration is then</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span>handlers </span><span style="color:#d65940;">:= </span><span>NewHandlers(s)
</span><span>handlers.RegisterHandler(ConsumerRegistrationHandler, consumerRegistration)
</span><span>handlers.RegisterHandler(ProducerRegistrationHandler, producerRegistration)
</span><span>handlers.RegisterHandler(PollHandler, pollHandler)
</span><span>handlers.RegisterHandler(MessageHandler, messageHandler)
</span></code></pre>
<p>And execution looks like</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#d65940;">case </span><span>ConsumerRegistration:
</span><span> </span><span style="color:#d65940;">if </span><span>ctx, err </span><span style="color:#d65940;">= </span><span>h.ExecuteHandler(ConsumerRegistrationHandler, ctx, m); err </span><span style="color:#d65940;">!= </span><span style="color:#db784d;">nil </span><span>{
</span><span> slog.Error(</span><span style="color:#f8bb39;">"Error registering consumer"</span><span>, </span><span style="color:#f8bb39;">"Error"</span><span>, err)
</span><span> cancel()
</span><span> </span><span style="color:#d65940;">break
</span><span> }
</span><span> ctx </span><span style="color:#d65940;">= </span><span>context.WithValue(ctx, </span><span style="color:#f8bb39;">"ConsumerGroup"</span><span>, m.ConsumerName)
</span></code></pre>
<p>The variance the aforementioned interface provided is related to if the handler will have access to the connection so it can write messages back to the client. I currently only have a single usecase which involves polling. Since this is based on a TCP connection we can infer ACK and appropriately handle those errors on the connection.</p>
<p>At this point I decided that a 5s context timeout might not account for long running connection and on each message publish we extend the timeout. Generally, my opinion is that if you are actively sending we should keep you alive and allow timely termination if you pause. One concern is that each time context timeout is extended we deepend the context object. I assume this causes it to increase in size. I need to do some research to assure if a connection was kept alive for days it wouldn't prove a memory leak.</p>
</div>
<div class="devlog-entry">
<h2 id="05-11-2024">05 11 2024 <a class="anchor" href="#05-11-2024">🔗</a>
</h2>
<p>Consumer groups</p>
<p>Having implemented a polling mechanism it came to mind that I might have multiple concurrent consumers polling for messages. So I need to maintain a shared offset of the all the consumers registered in a consumer group. So I modified my consumer contract.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">type </span><span>ConsumerRegistration </span><span style="color:#95cc5e;">struct </span><span>{
</span><span> </span><span style="text-decoration:underline;font-style:italic;color:#db784d;">ApiContract
</span><span> TopicName </span><span style="font-style:italic;color:#db784d;">string </span><span style="color:#f8bb39;">`codec:"topic,string"`
</span><span> ConsumerName </span><span style="font-style:italic;color:#db784d;">string </span><span style="color:#f8bb39;">`codec:"consumer,string"`
</span><span> Offset </span><span style="font-style:italic;color:#db784d;">uint64 </span><span style="color:#f8bb39;">`codec:"offset,uint64"`
</span><span>}
</span></code></pre>
<p>We now allow the consumer to name themselves and this allows us to allocate a new handle for each client reading allowing two clients to read from the same file at different offsets.</p>
<pre data-lang="go" style="background-color:#12160d;color:#6ea240;" class="language-go "><code class="language-go" data-lang="go"><span style="color:#95cc5e;">func </span><span style="color:#60a365;">declareConsumer</span><span>(consumerName </span><span style="font-style:italic;color:#db784d;">string</span><span>, store </span><span style="color:#d65940;">*</span><span style="color:#95cc5e;">EventStore</span><span>) (</span><span style="font-style:italic;color:#db784d;">string</span><span>, </span><span style="font-style:italic;color:#db784d;">error</span><span>) {
</span><span> i </span><span style="color:#d65940;">:= </span><span style="color:#95cc5e;">0
</span><span> </span><span style="color:#d65940;">for </span><span>{
</span><span> </span><span style="color:#95cc5e;">var </span><span>name </span><span style="color:#d65940;">= </span><span>consumerName </span><span style="color:#d65940;">+ </span><span style="color:#f8bb39;">"_" </span><span style="color:#d65940;">+ </span><span>strconv.Itoa(i)
</span><span> </span><span style="color:#d65940;">if </span><span>exists </span><span style="color:#d65940;">:= </span><span>store.Get(name); exists </span><span style="color:#d65940;">== </span><span style="color:#db784d;">nil </span><span>{
</span><span> </span><span style="color:#d65940;">return </span><span>name, </span><span style="color:#db784d;">nil
</span><span> } </span><span style="color:#d65940;">else </span><span>{
</span><span> i</span><span style="color:#d65940;">++
</span><span> }
</span><span> </span><span style="color:#d65940;">if </span><span>i </span><span style="color:#d65940;">> </span><span style="color:#95cc5e;">1000 </span><span>{
</span><span> </span><span style="color:#d65940;">return </span><span style="color:#f8bb39;">""</span><span>, errors.New(</span><span style="color:#f8bb39;">"Too many consumers"</span><span>)
</span><span> }
</span><span> }
</span><span>}
</span></code></pre>
<p>While a little hacky and providing an arbitrary limit of 1000 consumers per group per server but we generate a sequential consumername for our event store. This will find the first open gap in the list of 0-1000. I have wondered if I can have a range like coroutine that retains the global sequence but I wanted to ensure the list was not exhausted after 1000 but that only 1000 could exist concurrently.</p>
</div>
<div class="devlog-entry">
<h2 id="11-09-2024">11 09 2024 <a class="anchor" href="#11-09-2024">🔗</a>
</h2>
<p>Addressing handshake I decided with some trepidation to use the go ctx object to hold state while a handler is looping. The sequence is a little something like this</p>
<span style="" class="mermaid">
sequenceDiagram
Client->>+Server: Open TCP Conn
Server-->Client: OK
loop connection
alt Communicates
Client->>Server: Register Producer
Server->>Client: ACK
Client->>Server: Publish Event
Server->>Client: ACK
else
Client->>Server: Disconnect
end
end
Server->>-Server: CLOSE CONNECTION
</span>
<p>During thisconnection loop we retain a context stack with a timeout of 5 seconds. So each time we connect to the krappy server we have to establish why we are there and that goroutine then waits until there is more data. Each message is handled by the same goroutine. The choice of using context was intentional but I could have also stored that data outside the context in the closure formed by the goroutine.</p>
<p>The same process happens with the consumer registration. I did want the connection to be a reusable as possible though so once producer is registered that connection could be reused to register a different producer. I don't know if there is a usecase for that but without more reasons to want to isolate a context I followed this process.</p>
</div>
<div class="devlog-entry">
<h2 id="10-09-2024">10 09 2024 <a class="anchor" href="#10-09-2024">🔗</a>
</h2>
<p>The first real learning here was about how kafka deals with compacted topics. So this project is blind of the formal implementation so I looked for an algorithm that was designed to collapse a stream of events to its final value. I looked at a number of tree like patterns that would allow me to collect all the events out of sync and then be able to refer to only the latest but I stopped with the LSM (Log-Structued Merge) <a href="https://en.wikipedia.org/wiki/Log-structured_merge-tree">Wikipedia</a> which I discovered was similar to the rocksdb implementation that kafka uses. I selected pebbleDB as it was based on the original LevelDB implementation I had used in a previous Erlang project. So turns out getting a compacted topic is pretty easy and as long as I can guarantee the write of the produced message I can guarantee that I can have a top value.</p>
<p>The two reasons I selected go for this project was to give me a strong toolchain for concurrency and speed. Golang has great libs for handling network connections and building servers but that still left me to understand how best to build something that could handle massive throughput. I will set my benchmarks running on k8s on an x86-64 linux env.</p>
<p>But this project has a lot to do with managing concurrent resources and mutex which golang does a pretty good job of too.</p>
<p>So the first release includes full stream retention and a compacted topic using leveldb in memory.</p>
<p>I created a couple of e2e tools that allow me to hammer the server both as a producer and consumer but understanding the lifecycle of a connection is my biggest challenge.</p>
<p>Whats ugly about this project is creating a API that requires multiple steps to complete an initial handshake.</p>
</div>
We Do Delivery Now Eh?Tue, 10 Sep 2024 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/the-future-is-delivery/
https://developmeh.com/soft-wares/the-future-is-delivery/<p>I have been hearing a lot lately about the focus on "Delivery," and it has always struck me as somewhat reductionist and linear in its thinking.</p>
<p>I mean, I get it. Somewhere, a group of (non/ex) builders said, fallaciously, that everything can be built and delivered as a trackable lifecycle. There I go now, being a reductionist, but the difference between building and selling is in play here.</p>
<p>My conflict concerns ownership of the software product and production of technical debt. This is informed by my personal experiences, so keep that in mind. Software work comes in multiple flavors, which primarily impacts the technical debt. For a long time, I have said.</p>
<p>"a software engineer's job isn't to build software but to manage, even anticipate, change."</p>
<p>The pragmatic programmers can leave now and go yell at someone else's clouds.</p>
<p>For those who remain, there is no reason we can't do both. Think of the box that we will ship now and the box that we will ship next week as distinct but part of a system. I often don't see that happen because, in my experience, the roadmap is less than six weeks long for many companies.</p>
<p>Let's talk about behaviors that are successful, though, and not mire them in Big Business Bullshit Bingo terminology. For a product to be successful and be deliverable it must:</p>
<ul>
<li>Have at least a 3-month roadmap with stretch goals</li>
<li>Understand and have mapped the dynamics of the "socio-technical" system it will be delivered in [Teams, Vendors, Business Processes]</li>
<li>Presented to teams as something more significant than a 2-week unit</li>
<li>Must leave space for non-deliverables [System Design and Maintenance]</li>
<li>Must be assigned champions, not just responsible for the team but for driving the completion of the work and owning when it will inevitably go a little pear-shaped.</li>
</ul>
<p>That last one is the next thing we need to train and hire for: ownership. In the days of my grandfather, we would call it "pride in our work." I am going to do a thing, and I am going to be proud of it. If this seems abstract at all, I advise you to read "The Four Agreements" by Don Miguel Ruiz. It's a quick read, I promise!</p>
<p>We build ownership in a few ways, but the easiest is continuity of effort. We work on a problem until it is either SOLVED or GOOD ENUF. The vagueness of the latter is important. Consider this a real-life example of P versus NP.</p>
<p>"can every problem whose solution can be quickly verified can also be quickly solved"</p>
<p>The answer is Nope :)</p>
<p>"Good enuf" means we have at least awareness of the tradeoffs we made and have left some record of the decisions that led us there.
So:</p>
<ul>
<li>Be proud of your work</li>
<li>Be prepared to say no to work that doesn't have a clear roadmap</li>
<li>Be prepared to ask for different if you don't want to be an SME</li>
</ul>
Decoupling Patterns in Ruby: OverviewFri, 30 Aug 2024 00:00:00 +0000[email protected]
https://developmeh.com/software-architecture/decoupling-patterns-in-ruby-overview/
https://developmeh.com/software-architecture/decoupling-patterns-in-ruby-overview/<h2 id="an-uncomplicated-picture">An Uncomplicated Picture <a class="anchor" href="#an-uncomplicated-picture">🔗</a>
</h2>
<p>I was once asked, "Where would you put your business logic in an MVC application?"</p>
<p>As you can expect, I sardonically responded. Not in the MVC application. That impertinence requires some explanation, so let's cool down for a moment and see if we can set some ground rules.</p>
<p>As Sandi Metz kindly explains, DI isn't so scary; none of SOLID is, and as a set of principles, it guides us away from smells and towards code that is easier for humans to understand. We get closer to that dream of well-abstracted, isolation-tested components.</p>
<p>Specifically, in my lifetime working with Ruby on Rails developers, there has been a pattern I will describe as "Lazy Coupling." You will not find that pattern in the Gang of Four, and if you google it, don't get distracted by Loose Coupling. I believe Sandi Metz covered this while describing SOLID in the context of Dependency Injection with this <a href="https://sandimetz.com/blog/2009/03/21/solid-design-principles#example4pain">example</a>. While Sandi isn't intending to pick on the Rails world, I am not so kind. It's the world of clean code that Rails often violates. By its design, it presents a house of "broken windows" to new developers, and it takes considerable effort to break that dogma as they mature.</p>
<p><strong>Lazy Coupling</strong> is when we directly assign a constant to the return value of an instance method of another class. It looks a little something like this:</p>
<h3 id="example">Example <a class="anchor" href="#example">🔗</a>
</h3>
<table style="width:100%">
<thead>
<tr>
<th style="width:50%">
Without DI
</th>
<th>
With DI
</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="color:#d65940;">class </span><span style="text-decoration:underline;color:#db784d;">LazyCouple
</span><span> </span><span style="color:#d65940;">def </span><span style="color:#60a365;">perform
</span><span> data </span><span style="color:#d65940;">=</span><span> data_source(</span><span style="color:#95cc5e;">1</span><span>)
</span><span> data.</span><span style="color:#95cc5e;">name
</span><span> </span><span style="color:#d65940;">end
</span><span>
</span><span style="color:#95cc5e;">private
</span><span> </span><span style="color:#d65940;">def </span><span style="color:#60a365;">data_source</span><span>(id)
</span><span> </span><span style="font-style:italic;color:#db784d;">TheDataSource</span><span>.find(id)
</span><span> </span><span style="color:#d65940;">end
</span><span style="color:#d65940;">end
</span></code></pre>
</td>
<td>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="color:#d65940;">class </span><span style="text-decoration:underline;color:#db784d;">LooseCouple
</span><span> </span><span style="color:#d65940;">def </span><span style="color:#60a365;">initialize</span><span>(data_source </span><span style="color:#d65940;">= </span><span>TheDataSource)
</span><span> @data_source </span><span style="color:#d65940;">=</span><span> data_source
</span><span> </span><span style="color:#d65940;">end
</span><span>
</span><span> </span><span style="color:#d65940;">def </span><span style="color:#60a365;">perform
</span><span> data </span><span style="color:#d65940;">= </span><span>@data_source.find(</span><span style="color:#95cc5e;">1</span><span>)
</span><span> data.</span><span style="color:#95cc5e;">name
</span><span> </span><span style="color:#d65940;">end
</span><span style="color:#d65940;">end
</span></code></pre>
</td>
</tr>
<tr>
<td>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="font-style:italic;color:#db784d;">RSpec</span><span>.describe LazyCouple </span><span style="color:#d65940;">do
</span><span> describe </span><span style="color:#f8bb39;">'#perform' </span><span style="color:#d65940;">do
</span><span> let(</span><span style="color:#db784d;">:data_source_instance</span><span>) </span><span style="color:#d65940;">do
</span><span> double(</span><span style="color:#f8bb39;">'DataSourceInstance'</span><span>, </span><span style="color:#db784d;">name: </span><span style="color:#f8bb39;">'Some Name'</span><span>)
</span><span> </span><span style="color:#d65940;">end
</span><span>
</span><span> before </span><span style="color:#d65940;">do
</span><span> allow(TheDataSource).to receive(</span><span style="color:#db784d;">:find</span><span>)
</span><span> .with(</span><span style="color:#95cc5e;">1</span><span>).and_return(data_source_instance)
</span><span> </span><span style="color:#d65940;">end
</span><span>
</span><span> subject { described_class.</span><span style="color:#95cc5e;">new</span><span>.perform }
</span><span>
</span><span> it </span><span style="color:#f8bb39;">'fetches data and returns the name' </span><span style="color:#d65940;">do
</span><span> expect(subject).to eq(</span><span style="color:#f8bb39;">'Some Name'</span><span>)
</span><span> </span><span style="color:#d65940;">end
</span><span> </span><span style="color:#d65940;">end
</span><span style="color:#d65940;">end
</span></code></pre>
</td>
<td>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span style="font-style:italic;color:#db784d;">RSpec</span><span>.describe LooseCouple </span><span style="color:#d65940;">do
</span><span> describe </span><span style="color:#f8bb39;">'#perform' </span><span style="color:#d65940;">do
</span><span> let(</span><span style="color:#db784d;">:data_source_instance</span><span>) </span><span style="color:#d65940;">do
</span><span> double(</span><span style="color:#f8bb39;">'DataSourceInstance'</span><span>, </span><span style="color:#db784d;">name: </span><span style="color:#f8bb39;">'Some Name'</span><span>)
</span><span> </span><span style="color:#d65940;">end
</span><span> let(</span><span style="color:#db784d;">:mock_data_source</span><span>) </span><span style="color:#d65940;">do
</span><span> double(</span><span style="color:#f8bb39;">'MockDataSource'</span><span>, </span><span style="color:#db784d;">find:</span><span> data_source_instance)
</span><span> </span><span style="color:#d65940;">end
</span><span>
</span><span> subject { described_class.</span><span style="color:#95cc5e;">new</span><span>(mock_data_source).perform }
</span><span>
</span><span> it </span><span style="color:#f8bb39;">'fetches data and returns the name' </span><span style="color:#d65940;">do
</span><span> expect(subject).to eq(</span><span style="color:#f8bb39;">'Some Name'</span><span>)
</span><span> </span><span style="color:#d65940;">end
</span><span> </span><span style="color:#d65940;">end
</span><span style="color:#d65940;">end
</span></code></pre>
</td>
</tr>
</tbody>
</table>
<p>These two look nearly identical, but can you spot the big difference? It's in the specs, which often are our best mirror on implementation. Our Spec doesn't need to reference the constant for TheDataSource; it instead provides its own mock, and that mock is a double.</p>
<p>I hear you saying, "Big deal!"</p>
<p>It is a BIG DEAL!</p>
<p>These little changes add up. The DI test is a little easier to read as it references more of its own constants. It is also completely isolated from the system. If you need to refactor this code between gems, this test could be transported along, and we can guarantee that our coverage and quality don't degrade.</p>
<p>Without going too far out on a limb, we have provided a space for this class to be Open/Closed; we can extend its behavior without modifying the class. Say we decided to change the data source. We may want to continue to use this behavior, but we have been developing a new data source that ActiveRecord does not support, like a network call. This class can stay the same, and our Spec will still validate this behavior in isolation. It also provided documentation of our protocol with the dependency.</p>
<p>It's important to recall that Ruby communicates over <strong>Protocols</strong> not <strong>Contracts</strong> by default. Each implementation of a new data source is backed by its integration test while the core functionality continues its life Closed.</p>
<p>Listen, I can hear the chant in the background slowly growing... yagni... Yagni... YAGni... YAGNI. The truth is I agree with you, and it's a balancing act. I have always said that our job as engineers is to manage the change in our systems and not necessarily write code.
Consider this quote from Robert Nystrom.</p>
<blockquote>
<p>There's no easy answer here. Making your program more flexible so you can prototype faster will have some performance cost. Likewise, optimizing your code will make it less flexible.</p>
</blockquote>
<p>Interestingly, Robert identifies that a more rigid software architecture free from additional abstractions is the pre-optimization. We trade off current performance for flexibility. Of course, the point concerns systems performance, but it also applies to delivery performance. It takes a little longer to design and implement a flexible architecture, and we get that payback in agility when changing the system later.</p>
<p>To explore these trade-offs, let's discuss a decoupled system architecture for building an API platform in Ruby on Rails.</p>
<p>The components of our system are not limited to the following, and while some literally are represented within Rails, they are also spiritual boundaries for logic. I prefer to consider software architecture design from the same direction as execution, allowing us to assume we have a running Ruby on Rails application. It is a stack of middleware attached to Rack run by a threaded executor like Puma, standard.</p>
<p>Components:</p>
<ul>
<li>Controller (Protocol Handler)</li>
<li>Services (Business Logic)</li>
<li>Model (Data Access Layer)</li>
</ul>
<p>The request comes in and touches Puma, which creates a thread and populates the request context within Rack; at this point, we have a Ruby hash with all the details from the network request. Much later, we get to the controller, which represents the entry point of our API logic, and because we have bespoke logic, we need to provide a business case for its execution. This API in question is a Command that offers a "Buy-it-Now" feature for our dog food e-commerce portal. It will, assuming the actor is logged in, send a product using our default payment instrument to our default shipping location.</p>
<pre style="background-color:#12160d;color:#6ea240;"><code><span>HTTP REQUEST
</span><span>|
</span><span>PUMA
</span><span>|
</span><span>|--> Thread Spawned
</span><span>| |
</span><span>| |--> Convert HTTP Data to Ruby Hash
</span><span>| |--> RACK
</span><span>| | |
</span><span>| | |--> Rack Middleware
</span><span>| | |--> Rails Middleware
</span><span>| | | |
</span><span>| | | |--> Routing Middleware -> Create Controller Instance
</span><span>| | | | |
</span><span>| | | | |--> Controller Action
</span><span>| | | | |--> __YOUR CODE__
</span><span>| | | |<-- |<-- Response
</span><span>| | | |--> Other Middlewares (if any)
</span><span>|<-- |<-- |<-- Return Hash to PUMA
</span><span>|
</span><span>HTTP RESPONSE
</span></code></pre>
<p>As you can imagine, there are many opportunities for decoupling that Rails appears to want to fight us on but, in fact, happily supports. Our first stop will be the Protocol Handler or, more commonly, the controller. In our next part, we will describe the SOLID implication of the controller and design a pattern for building the entry point for any request, be that consumed by:</p>
<ul>
<li>API (Application Programming Interface)</li>
<li>LPC (Local Procedure Call)</li>
<li>deferred worker</li>
<li>ESB (Enterprise Service Bus)</li>
<li>RPC (Remote Procedure Call)</li>
</ul>
<p>From there, we will move on to the design of our Service Layer; this will include observability concerns as well as auditing and Actors. We will, of course, always keep this in the context of enterprise production systems; we will enforce security and implement RBAC (Role Based Access Control). Our practice will also include a deeper dive into relevant programming patterns outside those introduced by SOLID.</p>
End User Languor AgreementSat, 10 Aug 2024 00:00:00 +0000[email protected]
https://developmeh.com/terms-and-afflictions/eula/
https://developmeh.com/terms-and-afflictions/eula/<h2 id="equipment">Equipment <a class="anchor" href="#equipment">🔗</a>
</h2>
<p>You will <strong>only</strong> operate a device that is <strong>not</strong> the same architecture as your production systems at all times. This device will have enough mass to damage your furnature if dropped. While this equipment will come pre-installed with tools applicable to your daily activities, they will be woefully outdated and do not work the same as industry standard. While not able to be deleted you will need to shadow these tools with public versions and keep them configured. You will be provided with no less than two hardware virtualization platforms both of which require significant performance impacts on daily operations. You will use this equipment as a form of status in the caste system of enterprise software development. You are required to politely shun others with "inferrior" equipment. You will be required to operate your daily activities with half the storage resources as "inferrior" equipment due to the great overhead of modern day silicon chips and design processes.</p>
CI Over CDThu, 08 Aug 2024 00:00:00 +0000[email protected]
https://developmeh.com/devex/ci-cd/
https://developmeh.com/devex/ci-cd/<h2 id="you-have-continuous-deployment-not-continuous-delivery">You have Continuous Deployment, not Continuous Delivery <a class="anchor" href="#you-have-continuous-deployment-not-continuous-delivery">🔗</a>
</h2>
<p>I am a consultant, and I have lived in more production systems than most. I will tell you that CD(continuous delivery) is easy. Everyone does it, and in general, it works. I say this because this blog has CD, and at this point in time, the configuration for that is 43 lines of YAML. It probably could be shorter but it was easy and its pretty reliable. When you think about CD, it is a very light extension of what our toolchains do every day. I press the play button or run the <code>npm test</code>, and it passes, or it doesn't. Moving those steps to another computer with NIX or Docker is thus trivial. I duplicate the same toolchain plus a couple of release-specific language tools, and I am in production.</p>
<p>So, stepping back for a moment, just because I have seen a lot of systems doesn't generally mean I have seen a lot of good ones. But, I have caught a thread: the cost of true continuous delivery is at odds with little "a" Agile and our impression of what is good 'nuf. I am a TDD nut with a penchant for E2E testing; I am from the Pittsburgh area, and I love pickles; gherkin and cucumber are my best friends. The Steelers are a fantastic foosball club; pickles don't come for free. You have to grow some cucumbers and marinate them; waiting is something you have to get used to; running E2E tests is slow, but in both cases, the results are worth it. If you are a Dallas fan, that's "ok," too.</p>
<h2 id="slow-is-fast">Slow is fast <a class="anchor" href="#slow-is-fast">🔗</a>
</h2>
<p>Consider this: you have a huge platform that, like all huge platforms, cannot $$permit$$$$$$ themselves to be offline for even seconds, and yet we have decided not to use a Waterfall methodology, yea I know it sounds crazy. So what do we do? Build up a test suite, <a href="https://martinfowler.com/bliki/ConwaysLaw.html">Conway's Law</a>, some teams, and start building systems that can create artifacts for deployment. Regardless of whether you let the computer deploy them or not, those artifacts are "deployable," and that's where we stopped. We fill in the gaps with humans, pressing buttons, watching graphs, and generally doing the stuff humans are bad at. Surprisingly, bugs sneak past, and then more humans furiously tell relaxed humans to figure it out. Like ants during a termite invasion, you've been there, in the special "command center" of 30 some people with only 3 talking. Each team has their assigned scouts waiting for a fight and shielding the colony from wasting their time. Feels busy, feels dangerous, feels fast! Good thing they don't build houses this way, right?</p>
<p>So what's the alternative? And no, it's not "MOAR tests!" Yep, surprised even myself there, cause I kinda wanna write some more tests. It's a concept of coverage, though; just because I have merged my change doesn't mean I need it to deploy immediately. I do need it to be tested, though, and I need that evaluation to be feature-aware. It's not so surprising that we create a system for a fixed purpose, and we, some fancy folks with paper hats, create a dish for our customers to consume. Now say it with me, they planned the dish, the cooks made it, and then.... they threw the recipe away. Yep, go find a product person adjacent to your team and ask them to describe a given feature's purpose. Not the technical implementation, but the "5 whys", how we got here, and what problem we were trying to solve. Nine out of ten directors will struggle to describe the answer. If you are in the bottom 10 percent, stay there and count your blessings. The rest of us, on the other hand, are entering the Pith helmet phase, wandering off to recolonize the heathens we abandoned a generation or two of developers past.</p>
<p>Consider if you will the transition of humans past from their oral tradition to a written one. Product features are nothing more than folklore we must adhere to, not because they are good but because we don't quite understand them. They say things move fast in tech, developers average life span on a project or company is 2 years. Ten times faster than that of the organized media's two-decade reminder that a new generation exists as a meaningless moniker to help inject market separation. While it may feel we are far from the point, it's our lack of history that makes us ignorant and the tradition of ignorance that makes us complicit.</p>
<p>There is a better world! It involves no longer adhering to the convention that we need to deliver a product to appease a schedule but to be art. An artifact if you will of a moment of creation, never to exist again but to persist into infinity. And we do this with test automation, documentation, and continuous Integration. The simple nature of continuous Integration can find its core in how we organize commits. Change management starts at the very level of the code change and the changes being recorded. Each commit is; atomic, can be built and tested(doesn't have to pass), and is a complete change. One step up we have a branch or feature, which is complete in its specification and includes tests internal and external. If external, it exposes immutable contracts that express specific intents that require change management and re-evaluation to evolve. One step further, our entire product is a hierarchy based on the quality of our commits.</p>
<pre>
Deployment_Artifact
└── E2E_Testing
└── External Contracts
└── Unit_Tests
└── Feature or Branch
└── Atomic Commit
</pre>
<p>So taking a bit more time to build out this hierarchy on each delivery means we don't have to maintain a long-lived understanding of all of the parts. Instead of long-lived tribal knowledge within a team, we use the details each team exposes and trust that the tests cover our goals. This doesn't mean we will never have bugs or errors because, at some level, we are still humans writing the feature for a computer we partially understand. Each level requires a set of completion criteria that makes us think less until some ragamuffin pushes a "quick fix" and leaves us forever wondering why one request can either have a customer_id or a customer_uuid, but the fix was valuable enough and that developer doesn't work here anymore. Since it is now the basis of our entire product path, it is going to stay there. It may feel like <a href="https://en.wikipedia.org/wiki/Broken_windows_theory">broken windows theory</a>, but go on, buy anything of reasonable cost at the big box appliance store; you are gonna want a discount for that scratch and dent. While broken window theory has made its way into the Clean Code cult, the reality is more about pride in our environment and respect for our ergonomics. The opposite being <a href="https://en.wikipedia.org/wiki/Trail_ethics">Trail Ethics</a> where we all ascribe to a set of conditions that leave the world untouched and <a href="https://en.wikipedia.org/wiki/Leave_No_Trace">Leave no trace</a>. While not completely possible when evolving software, we can respect our impacts.</p>
<h2 id="when-you-do-things-right">When you do things right <a class="anchor" href="#when-you-do-things-right">🔗</a>
</h2>
<iframe width="560" height="315" src="https://www.youtube.com/embed/edCqF_NtpOQ?si=9NRFNbsYDupZ176J" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
<p>Simply, "Watch the pennies and the dollars will take care of themselves. - Franklin", good work is based on a steady stream of quality and consistency. Quality takes effort and time; the results lead to more time saved, not asking why. Early in my career, it seemed normal to expect the highest quality before speed. We didn't advocate for speed until quality could be achieved and even then it was a bonus. This has been replaced for absent-minded "quick win" methodology, which is sometimes linked to <a href="https://en.wikipedia.org/wiki/Lean_software_development">Lean Software Development</a>, but mind you read about it first. The truth is lean is not about fast but about eliminating waste, one of those being "relearning". While not quite the same as "don't make me think", it is about having an environment where truth is self-evident in process and in execution. Sounds fancy, and that's cause it is. Continuous Integration is the process of allowing developers to not focus on the past or the tangental concerns and instead focus on the work at hand, management processes, and other operations concern themselves with specific cross-cutting concerns that are related, unblocking, and orthogonal.</p>
<p>All of the companies that have been a joy to work for and have eventually grown to something great have followed these ideals.</p>
<p>What does the "ideal" CI process look like?</p>
<p>A developer runs their test suite and merges their feature, already a number of tests have passed and tests that proved there were enough tests. Processes have evaluated the product to prove that standard practices have been followed. Next, we prepare an artifact for integration testing, which can be eventually promoted to an artifact that will make its way to production. In this phase, a series of end-to-end tests are run looking for fires, smoke tests if you will, that evaluate success over completeness. These tests are run against both production artifacts and other pre-production artifacts awaiting release. Given these steps pass, we are what I often refer to as the "rubber meets the road" part of the failure path. We actually deploy it, and we deploy it to a portion of our consumers. There are two conditions that need to be met to complete a deployment; enough interactions have happened to consider the artifact valid, and we have observed no aberration in behavior. The former is related to SLOs (Service Level Objectives) or Core Flows that are expected to work with a specific performance and accuracy. The latter is about real-world requests and their impacts, not artificial, may be referred to as <strong>canary</strong>. When both of these items pass, we complete the deployment and immediately start the next release over and over until time immemorial or the VC runs out, whichever comes first.</p>
<p>A lot of stuff just happened in those few steps, and there's lots of permutations of how to achieve this. In a perfect world, a release is not a single developer's code but instead a batch of code that can describe its own changelog through the accuracy of its commit messages. Something that when it fails allows us to extract a change using tools like bisect to eliminate offending work and allow it to be staged later. While this might sound like stuff that only Google and Amazon do, I have worked in places significantly smaller doing this much or more. We just had a commitment to it and a team dedicated to its perfection. We had engineers interested in the ergonomics of the work outside themselves. And, we had leaders that tracked and informed on the process.</p>
<p>If the definition of vacation is not having to carry keys, the definition of Continuous Integration is deploying on a Friday and turning off your computer. Its not always going to be precisely possible but we can get close.</p>
Ruby Dancing ANSI Banana for CurlSat, 27 Jan 2024 00:00:00 +0000[email protected]
https://developmeh.com/i-made-a-thing/ruby-streaming-banana-dancer/
https://developmeh.com/i-made-a-thing/ruby-streaming-banana-dancer/<p><strong>Hey there, fellow coder! Ever seen a parrot dance in your terminal?</strong> 🦜💃</p>
<p>If you've taken a trip to <code>parrot.live</code>, you know exactly what I'm talking about. It's quirky, it's fun, and yes, a parrot dances right there in your terminal. But what if I told you there's another dancing star in town? And it's not a bird. Meet the <em>Ruby Streaming ANSI Banana</em>! 🍌</p>
<p>Yeah, I did say banana.</p>
<p>So, here’s the scoop. I was chilling, thinking about how much I enjoyed that dancing parrot, and a thought popped up: "Could I do this with Ruby? And maybe... not a parrot?" Fast forward, and ta-da, a dancing banana was born. It clears your terminal screen with some nifty ANSI tricks and then gets its groove on. It's like the parrot, but you know, it’s a banana... and it's Ruby.</p>
<p>Now, I hear you: "But... why a banana?" The real question is, why not? Coding isn’t just about solving serious problems; it's also about having a bit of fun, letting your hair down, and, occasionally, making fruit dance in your terminal.</p>
<p>The best part? If you're team CURL, you're just one command away from some smooth banana moves. The banana doesn’t just dance—it does so smoothly, with some chunk-encoded charm ensuring that every move is in sync, right in your terminal.</p>
<p>So, curious? Want to dive into some fruity fun? Swing by <a href="https://github.com/developmeh/ruby_streaming_ansi_banana">ruby_streaming_ansi_banana</a> and see it for yourself. And if you’re feeling extra creative, why not customize it? Maybe a shimmying strawberry or a waltzing watermelon?</p>
<p>Bottom line: In the world of code, there's always room for a dance, even if it's just a banana showing off its moves. So, let's not take ourselves too seriously and enjoy the rhythm, one ASCII character at a time!</p>
<p>Try it: $ <strong>curl <a href="https://dancing-banana.developmeh.com/live">https://dancing-banana.developmeh.com/live</a></strong></p>
<hr />
<p>Hope your keyboard’s ready for some dancing fun! 🍌🕺🎵</p>
<p><img src="../dancing-banana.gif" alt="dancing-banana" /></p>
<h2 id="devlog">DevLog <a class="anchor" href="#devlog">🔗</a>
</h2>
<div class="devlog-entry">
<h3 id="31-01-2025">31 01 2025 <a class="anchor" href="#31-01-2025">🔗</a>
</h3>
<h4 id="beating-nix">Beating Nix <a class="anchor" href="#beating-nix">🔗</a>
</h4>
<p>In the last update I made some breaking changes to the project's cross-platform-ness which bothered me but I find nix challenging at times. Since I was starting with something that worked though migrating it to one that supports all platforms was easier.</p>
<p><a href="https://github.com/developmeh/ruby_streaming_ansi_banana/blob/v0.2.0/flake.nix">flake.nix</a></p>
<p>Starting here we create a lambda that accepts the <strong>system</strong> argument. Later this will inherit the supported system of that loop over <strong>[ "x86_64-linux" "aarch64-linux" "x86_64-darwin" "aarch64-darwin" ]</strong>.</p>
<pre data-lang="nix" style="background-color:#12160d;color:#6ea240;" class="language-nix "><code class="language-nix" data-lang="nix"><span>forEachSupportedSystem </span><span style="background-color:#00a8c6;color:#f8f8f0;">=</span><span> f: nixpkgs</span><span style="color:#d65940;">.</span><span>lib</span><span style="color:#d65940;">.</span><span>genAttrs supportedSystems (system: f {
</span><span> </span><span style="color:#db784d;">system </span><span style="color:#d65940;">= </span><span>system; </span><span style="color:#3c4e2d;"># Ensure the 'system' is passed into the function
</span><span> </span><span style="color:#db784d;">pkgs </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">import </span><span>nixpkgs { </span><span style="color:#67854f;">inherit </span><span style="color:#db784d;">system</span><span>; };
</span><span>})</span><span style="background-color:#00a8c6;color:#f8f8f0;">;</span><span>
</span></code></pre>
<p>Next we will define how <strong>gems</strong> are created and they will inherit the supported system for the host env.</p>
<p>This function accepts the system and builds the bundler env.</p>
<pre data-lang="nix" style="background-color:#12160d;color:#6ea240;" class="language-nix "><code class="language-nix" data-lang="nix"><span>gems </span><span style="background-color:#00a8c6;color:#f8f8f0;">=</span><span> system: </span><span style="color:#67854f;">let
</span><span> </span><span style="color:#db784d;">buildpkgs </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">import </span><span>nixpkgs { </span><span style="color:#db784d;">system </span><span style="color:#d65940;">= </span><span>system; };
</span><span style="color:#67854f;">in </span><span>buildpkgs</span><span style="color:#d65940;">.</span><span>bundlerEnv {
</span><span> </span><span style="color:#db784d;">name </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"ruby-dancing-banana"</span><span>;
</span><span> </span><span style="color:#db784d;">ruby </span><span style="color:#d65940;">= </span><span>buildpkgs</span><span style="color:#d65940;">.</span><span>ruby_3_2;
</span><span> </span><span style="color:#db784d;">gemfile </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">./Gemfile</span><span>;
</span><span> </span><span style="color:#db784d;">lockfile </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">./Gemfile.lock</span><span>;
</span><span> </span><span style="color:#db784d;">gemset </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">./gemset.nix</span><span>;
</span><span>}</span><span style="background-color:#00a8c6;color:#f8f8f0;">;</span><span>
</span></code></pre>
<p>Next we use that function to inherit the bundler env for our docker image. The gems function is invoked <strong>gemEnv = gems systemAttrs.system</strong>. Those attributes were generated in this scope using the nixpkgs.lib.genAttrs. In that block above we exposed a <strong>system = ...</strong> its values were mapped to <strong>systemAttrs</strong>.</p>
<p>We do the same thing to alias our package source for nix <strong>buildpkgs</strong>.</p>
<p>When we do <strong>nix build</strong> buildImage will be invoked.</p>
<pre data-lang="nix" style="background-color:#12160d;color:#6ea240;" class="language-nix "><code class="language-nix" data-lang="nix"><span>buildImage </span><span style="background-color:#00a8c6;color:#f8f8f0;">=</span><span> systemAttrs: </span><span style="color:#67854f;">let
</span><span> </span><span style="color:#db784d;">buildpkgs </span><span style="color:#d65940;">= </span><span style="color:#95cc5e;">import </span><span>nixpkgs { </span><span style="color:#db784d;">system </span><span style="color:#d65940;">= </span><span>systemAttrs</span><span style="color:#d65940;">.</span><span>system; };
</span><span> </span><span style="color:#db784d;">gemEnv </span><span style="color:#d65940;">= </span><span>gems systemAttrs</span><span style="color:#d65940;">.</span><span>system;
</span><span style="color:#67854f;">in </span><span>buildpkgs</span><span style="color:#d65940;">.</span><span>dockerTools</span><span style="color:#d65940;">.</span><span>buildImage {
</span><span> </span><span style="color:#db784d;">name </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"ruby-dancing-banana"</span><span>;
</span><span> </span><span style="color:#db784d;">created </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"now"</span><span>;
</span><span> </span><span style="color:#db784d;">tag </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"latest"</span><span>;
</span><span> </span><span style="color:#db784d;">copyToRoot </span><span style="color:#d65940;">= </span><span>buildpkgs</span><span style="color:#d65940;">.</span><span>buildEnv {
</span><span> </span><span style="color:#db784d;">name </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"image-root"</span><span>;
</span><span> </span><span style="color:#db784d;">paths </span><span style="color:#d65940;">= </span><span>[
</span><span> gemEnv
</span><span> ];
</span><span> </span><span style="color:#db784d;">postBuild </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">''
</span><span style="color:#f8bb39;"> mkdir -p $out/app
</span><span style="color:#f8bb39;"> cp ${./main.rb} $out/app/main.rb
</span><span style="color:#f8bb39;"> cp -r ${./ascii_frames} $out/app/ascii_frames
</span><span style="color:#f8bb39;"> ''</span><span>;
</span><span> };
</span><span> </span><span style="color:#db784d;">config </span><span style="color:#d65940;">= </span><span>{
</span><span> </span><span style="color:#db784d;">Cmd </span><span style="color:#d65940;">= </span><span>[ </span><span style="color:#f8bb39;">"${gemEnv</span><span style="color:#d65940;">.</span><span style="color:#f8bb39;">wrappedRuby}/bin/ruby" "/app/main.rb" "-o" "0.0.0.0" </span><span>];
</span><span> </span><span style="color:#db784d;">WorkingDir </span><span style="color:#d65940;">= </span><span style="color:#f8bb39;">"/app"</span><span>;
</span><span> </span><span style="color:#db784d;">ExposedPorts </span><span style="color:#d65940;">= </span><span>{ </span><span style="color:#f8bb39;">"4567/tcp" </span><span style="color:#d65940;">= </span><span>{}; };
</span><span> };
</span><span>}</span><span style="background-color:#00a8c6;color:#f8f8f0;">;</span><span>
</span></code></pre>
<p>This is the same for our devShells. The difference here is we need these values in our <strong>output</strong> defined at the very top so we call <strong>forEachSupportedSystem</strong> and the attached block defines our default shell for the host env.</p>
<p>I have to admit. This might be the first time I have understood what I created in nix. The rest of the time I have been trying to just guess my way through by copy-pasta'ing examples. Its not a tough syntax and provides much more than what is popular. But all this functionality comes at the cost of being understandable.</p>
<p><strong>Cloudflare Streaming</strong></p>
<p>The final part of todays journey was addressing streaming with <strong>Cloudflare Tunnel</strong>. Being its sitting inside my connection it makes its own rules and that means I have to force it to not buffer my streams otherwise the rendering is faulty.</p>
<p>Solution:</p>
<pre data-lang="ruby" style="background-color:#12160d;color:#6ea240;" class="language-ruby "><code class="language-ruby" data-lang="ruby"><span>headers.delete(</span><span style="color:#f8bb39;">'Content-Length'</span><span>)
</span><span>headers </span><span style="color:#f8bb39;">"Content-Encoding" </span><span>=> </span><span style="color:#f8bb39;">"identity"
</span><span>headers </span><span style="color:#f8bb39;">"Content-Type" </span><span>=> </span><span style="color:#f8bb39;">"text/event-stream"
</span><span>headers </span><span style="color:#f8bb39;">"Transfer-Encoding" </span><span>=> </span><span style="color:#f8bb39;">"chunked"
</span></code></pre>
<p>Removing <em>Content-Length</em> makes sure the proxy can't anticipate the stream and wait for it.</p>
<p><em>Content-Encoding</em> Identity helps to keep compression for being activated.</p>
<p><em>Content-Type</em> text/event-stream is the magic bullet which hints to the proxy that we want the data streamed and to disable any caching or bursting.</p>
<p><em>Transfer-Encoding</em> "chunked" makes sure we send a full block at a time. Since I flush on each image presented each write is a chunk.</p>
</div>
<div class="devlog-entry">
<h3 id="27-01-2025">27 01 2025 <a class="anchor" href="#27-01-2025">🔗</a>
</h3>
<h4 id="nix-and-planning-for-ruby-streaming">Nix and planning for ruby streaming <a class="anchor" href="#nix-and-planning-for-ruby-streaming">🔗</a>
</h4>
<p>I gotta admit I love nix a lot. Its the underdog to docker and even so I use it to create docker images. Although this is kind of a weird thing because when you consider nix you don't really need containers. Ultimately, nix was an alternative view of container runtimes. That said we have k8s and that is a container orchestration tool. That means I use nix to define consistent builds that produce docker images.</p>
<p>Lets walk back what using nix with something like ruby is like. I have previously done this with golang and that was a fun path https://git.sr.ht/~ninjapanzer/krappy_kafka/tree/main/item/flake.nix#L41-59</p>
<p>There we have a buildGoApplication extension to package a binary. That is rather easy because once done we execute that binary and orchestrate any file systems required.</p>
<p>In ruby we don't have a compile phase so we need to carry our bundled baggage. In this world we take the artifacts from bundler and describe them as nix store resources. Those are then copied to the container. In this world even the ruby version is part of the bundled nix context. Within the container we find these in the /nix/store path.</p>
<p>What I learned is that everything eventually is sourced from the nix store. Whats nice is we don't need bundler anymore since the path around the ruby runtime mixed with the context of our gems. While locally we might need to run <strong>bundle exec ruby ...</strong> now we use <strong>"${gems.wrappedRuby}/bin/ruby</strong> as our cmd.</p>
<p>Since this is my second time building a docker image for an arbitrary project, it feels less confusing. The big difference is where the build phase happens. In golang we build and then create an image. In languages like ruby we prepare our deps and then do build operations while copying files to root.</p>
<p>This really makes sense since we do the exact same thing outside nix. In go I would have a make operation that builds our binary and our dockerfile copies that file to the image.</p>
<p>With ruby we tend to bundle within the docker image creation. That is where this is magic, we don't do that anymore and as a result the docker image creation is super fast. Since we don't have to re-bundle on each build its easier to cache the nix env for our gems and the operation becomes one of copying instead of re-downloading and possibly compiling them. This eliminates the <strong>compiling</strong> we are used to in the ruby space.</p>
<p><img src="../compiling.png" alt="xkcd compiling" /></p>
<p>Sorry guys, gotta keep working now</p>
</div>
The Good SergeantWed, 15 Dec 2021 00:00:00 +0000[email protected]
https://developmeh.com/soft-wares/the-good-sergeant/
https://developmeh.com/soft-wares/the-good-sergeant/<h2 id="the-good-sergeant">The Good Sergeant <a class="anchor" href="#the-good-sergeant">🔗</a>
</h2>
<p>As I age in my software development career, I find myself falling into unofficial management roles. Analogous to a Product Owner who also acts as a Business Analyst. If you are like me, you spend a lot of time interacting with people at work. There is an expectation that as you move forward in your career as a software developer, you will naturally make more decisions about process than you did the year before. It's the benefit of experience, and hopefully, as you have aged, you have gained some wisdom. If not, that's not the end of the world, and I think that's why job titles are simultaneously very important and the least important in this domain.</p>
<p>I often find myself aware of the opportunity to make decisions that drive change for my teams. I would classify myself as "intense" and "opinionated," I apply influence to my teams and though my confidence, which gives them the safety to follow along. It's critical to remember that this isn't about giving permission but providing opportunity. You are their servant, it's a collaborative act, and it's a matter of trust. Servant leadership is the marriage of different (attractive) and expert authority.</p>
<p><strong>Attractive</strong>(referent) authority is charisma, it can be achieved in a number of ways. You could be funny and confident, or be great at relationship building and interpersonal skills. It's measured like most things by outcomes, this skill makes deposits in the emotional bank of their co-workers as Covey would call it. Thus, this investment provides opportunities to influence future decisions. You do things for people you like, it's really a no-brainer. An example of how I often make deposits into the emotional bank is by protecting and supporting my co-workers.</p>
<p><strong>Expert</strong> authority, on the other hand, is the result of being able to communicate knowledge or through intelligent expression which produces trust. Your co-workers don't need to verify if you are in-the-know because you have set the example through your actions that you are brilliant or well experienced in a subject.</p>
<p>I think different authority is easier to gain immediately if you are the kind of person who tells stories or is willing to share their weaknesses. I believe that people are attracted to people who can comfortably share their vulnerabilities. While, expert authority has to be demonstrated through considerable performance. Luckily, enough work is all about performance, so you shouldn't find a shortage of opportunities there either.</p>
<p>I don't like making analogies of work and war, but I do respect the ideology of the "good sergeant." As you may be aware, the term sergeant means the "one who serves." Its interest to me is it's part of a small subset of ranks called NCO's, non-commissioned officers. In general, the magic here is that this person has through promotion reached leadership and thus trust and did so without a commission or a formal warrant for such authority. I don't have a military background, so this understanding is more romantic than it is likely accurate. The connection to the work world can be made by comparing a commission with roles of manager and above. Thus, a lead within a technical organization is someone who has reached a rank of authority without entering the management hierarchy.</p>
<p><strong>Authority is not unary</strong>, you can have an entire team of leaders all acting on each other's behalf. Authority is <strong>not about holding power</strong>, it's about <strong>exercising influence</strong>. Here in lies the trap, you make the decision, and now you own the decision. It hurts if the decision changes, is taken away from you, or is in conflict. If you find yourself feeling this way don't be too bothered, caring is an important trait, But, here is an opportunity to remember, <strong>you are not your work</strong>. If you ever wanted to learn how to identify ego driven actions, become a leader, you will have ample opportunity. It's a practice like everything else, and I sure struggle with it sometimes, but ego is your enemy. Ego is the unary force that makes you right and everyone else wrong. My solution for this is to keep presence of mind over the goal. The best path to take is always the one that achieves the best value towards the goal; everything else is a negotiation.</p>
<p><strong>Secret time!</strong> If you find that leadership is often in conflict, things seem disorganized, and decisions take a long time to make, take a step back and observe if they are all acting towards the same goal. I bet you they aren't, and that is not your fault. This is the reason we have executives, people who set the direction and steer the ship. I know that sounds hand wavy but hierarchies are smaller at the top because it's less distracting. A few make some big (pronounced: broad or vague) decisions and leave the smaller and smaller ones to those below them. It's a collective narrowing of focus, it's a critical side effect of good leadership, and it's probably why you see more conflict than you expect.</p>