Zarar's blog

Commit-based Progress Report

2026-03-29T20:20:14.528870+00:00

I had to give someone an update on what's new with the app over the last couple of months: what value was added and what, if any, technical improvements were made. It took an agent 2 minutes with a common-sense prompt to produce the report below. I asked it to scan the git log and produce an update based on that. Commit messages were always important, but they matter even more now, since they are the ultimate source of truth (along with the code) on what was actually done.
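For anyone who wants to try something similar, here is a minimal sketch of the data-gathering step. It is not the prompt or tooling the agent used; the `Commit` type, the `parse_git_log` and `read_commits` helpers, and the choice of separators are assumptions made for illustration. It pulls commit subjects and bodies for a date range out of `git log` into a structure that can be handed to an agent.

```python
import subprocess
from dataclasses import dataclass

# ASCII unit/record separators keep multi-line commit bodies parseable.
FIELD_SEP = "\x1f"
RECORD_SEP = "\x1e"
# %x1f and %x1e are git's hex escapes for the same two characters.
LOG_FORMAT = "%h%x1f%ad%x1f%s%x1f%b%x1e"


@dataclass
class Commit:
    sha: str
    date: str
    subject: str
    body: str


def parse_git_log(raw: str) -> list[Commit]:
    """Parse `git log` output produced with LOG_FORMAT into Commit records."""
    commits = []
    for record in raw.split(RECORD_SEP):
        # Only strip newlines: str.strip() would also eat the \x1f separators.
        record = record.lstrip("\n")
        if not record:
            continue
        sha, date, subject, body = record.split(FIELD_SEP, 3)
        commits.append(Commit(sha, date, subject, body.strip()))
    return commits


def read_commits(since: str, until: str, repo: str = ".") -> list[Commit]:
    """Run git log for a date range (hypothetical helper, not from the post)."""
    raw = subprocess.run(
        ["git", "-C", repo, "log", f"--since={since}", f"--until={until}",
         "--date=short", f"--pretty=format:{LOG_FORMAT}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_git_log(raw)
```

Calling `read_commits("2026-02-01", "2026-03-29")` inside a repository would return the commits for the period covered below; the better the commit messages, the less the agent has to infer from diffs.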

Jump Platform — Changes Summary (Feb 1 – Mar 29, 2026)

Product Features & Improvements

Payments & Checkout

Apple Pay & Google Pay — Customers can now tap to pay with Apple Pay or Google Pay on their phones, making checkout as fast as buying a coffee.
Smarter Email Typo Detection — If a customer accidentally types "gmial.com" or similar at checkout, a helpful popup suggests the correction before they complete the purchase — fewer missed confirmation emails.
Payment Progress Indicator — During PayPal payments, customers now see a clear step-by-step progress bar so they're never left wondering if their purchase went through.
Clearer Payment Errors — When something goes wrong with a payment, customers see a specific, helpful message instead of a generic error — making it easier to retry or fix the issue.
Optimized Payment Processing — Purchases are now automatically routed to the best payment processor based on currency, meaning lower fees for organizers and faster processing for customers.
Redesigned Quantity Selector — The ticket quantity picker now uses clean, branded +/- buttons that are easier to tap on mobile devices.
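The email typo detection described in this list can be approximated with fuzzy matching on the domain part of the address. The sketch below is an illustration only, not the platform's implementation: the `KNOWN_DOMAINS` list, the `suggest_email` name, and the 0.8 similarity cutoff are all assumptions.

```python
import difflib
from typing import Optional

# Hypothetical allow-list; a real deployment would use a much larger set.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "icloud.com"]


def suggest_email(email: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a corrected address if the domain looks like a typo, else None."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None  # not shaped like an email address; nothing to suggest
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None  # already a known domain
    matches = difflib.get_close_matches(domain, KNOWN_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None


print(suggest_email("sam@gmial.com"))  # sam@gmail.com
print(suggest_email("sam@gmail.com"))  # None
```

Returning a suggestion rather than silently rewriting the address matches the popup behavior described above: the customer stays in control of the final value.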

Tickets & Orders

Text Message Order Details — After purchasing, customers can text themselves a link to their order details — handy when they need quick access on their phone at the door.
Shareable Order Link — Every order now has its own link that customers can open without logging in, making it easy to forward order info to a friend or check details from any device.
Invoice Generation — Customers and organizers can now generate and download a proper invoice for any order — useful for expense reports, accounting, or tax purposes.
Automatic Ticket Grouping — Returning customers who buy tickets across multiple orders will see all their tickets grouped together automatically — no more hunting through separate emails.
Group Tickets with Friends — Customers can share an invite link so friends can join their ticket group, making it simple to coordinate who's sitting where at an event.
Badge Printing — Organizers running conferences or networking events can now print name badges for attendees directly from the platform.
Apple Wallet Passes — Tickets can be saved to Apple Wallet for one-tap access at the event door — no need to dig through emails.
Smarter Ticket Swap Notifications — When a ticket is swapped to a different showtime, an updated confirmation email is sent automatically so customers always have the right details.

Events & Products

Product Options & Variants — Organizers selling merchandise, food, or other items can now offer options like size and color, each with its own price — customers see a clean picker and the price updates automatically.
Digital Products — Organizers can sell digital downloads (e.g., recordings, guides) with automatic delivery emails sent right after purchase.
Related Event Suggestions — Event pages can now highlight related shows or products, helping customers discover more of what they'd enjoy.
Add-Ons Now Available to All — Every organizer can now offer add-ons (parking, merch, VIP upgrades) alongside their tickets — previously this was limited to select accounts.
Flexible Pricing Display — Organizers can set special "override" prices that are always visible to customers, useful for early promotions or group deals.

Passes & Subscriptions

Redesigned Passes Experience — The passes system got a major refresh with clearer management pages, making it easier for organizers to set up and for customers to understand what they're getting.
Automated Monthly Payouts — Organizers offering pass-based products now receive their revenue automatically each month — no manual payout requests needed.

Promotions & Discounts

Buy-One-Get-One Deals — Organizers can now create BOGO promotions targeted to specific events, customer groups, or date ranges — a powerful tool for filling seats.
Smart Promotion Targeting — Promotions can be configured to appear only for the right audience based on rules like past purchases, event type, or time window.
One-Click Promo Links — Organizers can generate a URL that automatically applies a discount when customers click it — perfect for social media posts, emails, or influencer partnerships.
QR Codes for Promo Codes — Each promo code now comes with a downloadable QR code, ideal for printing on flyers, posters, or table cards at events.

Marketing & Campaigns

Email Ticket Buyers Directly — Organizers can send a beautifully designed email to everyone who bought tickets for a specific event — great for pre-show reminders, post-show thank-yous, or upselling future events.
Showtime-Specific Waitlist Alerts — When a sold-out showtime reopens, only the people who signed up for that exact date and time get notified — no more irrelevant "tickets available!" emails.
Saved Audience Filters — Organizers can save their favorite email audience filters and reuse them for future campaigns, cutting setup time for repeat sends.
Campaign Performance Tracking — Email campaigns now show how many orders they generated, so organizers can see exactly which campaigns are driving ticket sales.
Email Health Tool — A built-in tool that finds and fixes problematic email addresses in the customer database, improving email deliverability and reducing bounces.

Checkout & Forms

Custom Checkout Questions — Organizers can add their own questions to checkout (meal preferences, t-shirt sizes, accessibility needs) with a new section header option for organizing longer forms.
Smart Fulfillment Fields — Checkout automatically asks for the right info based on what's being purchased — shipping address for merchandise, no address needed for digital products.

Discovery & Search Visibility

Better Google Search Results — Event pages now show up as rich results in Google with the event date, price range, and venue displayed right in the search listing — helping customers find events faster and driving more clicks.
Fresher Search Listings — Search engines now see accurate "last updated" dates for events, meaning upcoming shows get crawled and indexed more frequently.
Jump Blvd Landing Page — A dedicated landing page for Jump Blvd with a custom design and search engine optimization, giving the platform its own public-facing presence.
Multi-Tenant Onboarding — New organizations joining the platform can now set themselves up through a guided onboarding flow without needing manual help from the Jump team.

AI & Analytics

Soma AI (Rebrand) — The built-in AI assistant was rebranded to Soma AI with a refreshed look and feel.
Step-by-Step AI Guides — Soma AI can now walk organizers through platform features with interactive, step-by-step tutorials — like having a personal onboarding coach.
Historical Sales Analytics — A new analytics view lets organizers explore their ticket sales history across events and currencies, helping them spot trends and plan future pricing.
AI Conversation Monitoring — An internal tool that helps the Jump team review how organizers interact with Soma AI, enabling continuous improvements to the AI experience.

Other Improvements

Capacity-Aware Comp Tickets — Complimentary ticket invites now check venue capacity before being issued, so organizers never accidentally oversell a show.
Accurate Refund Tracking — Refundable amounts update in real-time as partial refunds are issued, giving organizers a clear picture of what's left to refund on any order.
Refreshed Order Details Page — The admin order details view was redesigned with a cleaner layout and quick-action dropdown, making order management faster.
Updated About Page — A refreshed company about page with improved content and design.
Correct Tax on Absorbed Fees — Fixed an issue where tax was miscalculated when organizers chose to absorb service fees, ensuring customers are charged the right amount.
Multi-Currency Sales Reports — Fixed sales reports to correctly show all currencies when an event sells in more than one currency.

Technical & Architectural Changes

Medusa-to-Elixir Migration

Cart Operations — Replaced Medusa HTTP-based cart operations (add/delete line items, recalculate shipping) with native Elixir modules (CartOperations, CartContext, CheckoutService, VariantPriceCalculator), eliminating cross-service HTTP overhead.
Discount Application — Moved discount/promo code application logic from Medusa to Elixir with native Ecto operations.
Order Totals Calculation — Individual order totals (tax, shipping, discounts, gift cards, service fees) are now computed entirely in Elixir, replacing the Medusa dependency.
Service Fee Calculation — Service fee is now derived from unit_price at calculation time rather than a stale cached value, fixing accuracy issues with PWYL and absorbed pricing.
Seat Allocations Endpoint — Migrated the seat allocations API to Elixir to resolve Node.js performance bottlenecks under load.
Campaign Stats Sync — Moved SendGrid campaign stats sync to Elixir with retry logic for transient API failures.
Reserved Seating Cleanup — Migrated the 15-minute reserved seat expiry job from Medusa to an Oban worker.
Order Cache Elimination — Replaced metadata.cache reads with direct Ecto column access on the orders table, removing stale-data risk.
Get Order Consolidation — Replaced Ecommerce.Orders.get_order HTTP calls with a direct Context.Orders.get_order DB query, reducing latency.
Cart API Expansion — The cart API now returns full cart and line item totals computed in Elixir, enabling the frontend to rely on Elixir-calculated values.
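The retry logic mentioned for the campaign stats sync follows a familiar shape: retry only failures known to be transient, and wait longer between attempts. The following is a generic sketch in Python, not the Elixir code from the migration; `TransientError`, `with_retry`, and the default delays are assumptions.

```python
import random
import time


class TransientError(Exception):
    """Raised by the wrapped call for failures that are worth retrying."""


def with_retry(fn, attempts: int = 4, base_delay: float = 0.5, sleep=time.sleep):
    """Call fn(), retrying on TransientError with exponential backoff and jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time up to base_delay * 2^attempt.
            sleep(random.uniform(0, base_delay * 2 ** attempt))
```

The `sleep` parameter exists so tests can pass a stub instead of actually waiting.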

PayPal Integration Overhaul

Dead-View PayPal Controller — Replaced the PayPal LiveView with a traditional Phoenix controller to avoid iframe destruction issues caused by LiveView re-renders.
Authorization Status Check — Cart completion now verifies PayPal authorization status before proceeding, preventing silent payment failures.
Apple Pay / Google Pay via PayPal — Integrated PayPal Advanced Checkout SDK with domain verification, cents-to-dollars conversion, and billing contact handling per PayPal docs.

Fulfillment & Email System

Product-Level Fulfillment Types — Added product.fulfillment_type_id with a derivation hierarchy (variant > product > NULL) and a Services.Fulfillment module for resolution logic.
Email Orchestration — Four separate fulfillment email pipelines (add-on codes, digital downloads, admin notifications, merchandise fulfillment) with Swoosh/MJML templates.
Async Email Dispatch — Emails are now sent asynchronously to avoid blocking request processing.
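The derivation hierarchy for fulfillment types (variant > product > NULL) reduces to a short fallback chain. This is a sketch in Python rather than the actual Services.Fulfillment module; the function and parameter names are hypothetical.

```python
from typing import Optional


def resolve_fulfillment_type(variant_type_id: Optional[int],
                             product_type_id: Optional[int]) -> Optional[int]:
    """Variant-level setting wins, then product-level, otherwise None (NULL)."""
    if variant_type_id is not None:
        return variant_type_id
    return product_type_id


print(resolve_fulfillment_type(3, 1))        # 3
print(resolve_fulfillment_type(None, 1))     # 1
print(resolve_fulfillment_type(None, None))  # None
```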

Pricing Engine

Modifier Pricing — Extended the pricing system to support combination-based pricing with early bird, group, and override types, calculated through VariantPriceCalculator.
Dynamic Refund Computation — refundable_amount is now computed at query time (total - SUM(issued_refunds)) rather than stored as a mutable column.
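To illustrate the query-time approach to refundable_amount, here is a small self-contained sketch using SQLite. The table and column names are assumptions (the real schema is not shown in the report), and amounts are stored in cents to keep the arithmetic exact.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER NOT NULL);
    CREATE TABLE refunds (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        amount INTEGER NOT NULL
    );
    INSERT INTO orders (id, total) VALUES (1, 10000);
""")


def refundable_amount(conn: sqlite3.Connection, order_id: int) -> int:
    """Compute total - SUM(issued refunds) at query time; nothing is stored."""
    row = conn.execute(
        """
        SELECT o.total - COALESCE(SUM(r.amount), 0)
        FROM orders o
        LEFT JOIN refunds r ON r.order_id = o.id
        WHERE o.id = ?
        GROUP BY o.id, o.total
        """,
        (order_id,),
    ).fetchone()
    if row is None:
        raise LookupError(f"order {order_id} not found")
    return row[0]


print(refundable_amount(conn, 1))  # 10000
conn.execute("INSERT INTO refunds (order_id, amount) VALUES (1, 2500)")
print(refundable_amount(conn, 1))  # 7500
```

Because the value is derived on every read, there is no stored column that can fall out of sync after a partial refund.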

Database & Performance

Ecto Connection Pooling — Switched to Ecto-managed connection pooling with tuned pool size for better resource utilization.
Race-Safe Token Generation — Public order tokens use a WHERE public_token IS NULL guard to prevent race conditions during concurrent order processing.
Variant Deletion Guard — Variant combinations referenced by existing orders are now protected from deletion, preventing data integrity issues.
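The WHERE public_token IS NULL guard for token generation can be sketched as a conditional update followed by a read. This uses SQLite and hypothetical names (`ensure_public_token`, an `orders` table with a `public_token` column); it illustrates the pattern rather than reproducing the Ecto code.

```python
import secrets
import sqlite3


def ensure_public_token(conn: sqlite3.Connection, order_id: int) -> str:
    """Assign a token only if none exists, then return whichever token won."""
    candidate = secrets.token_urlsafe(16)
    # The IS NULL guard makes this write a no-op if a token is already set,
    # so a late-arriving worker cannot overwrite the first worker's token.
    conn.execute(
        "UPDATE orders SET public_token = ? WHERE id = ? AND public_token IS NULL",
        (candidate, order_id),
    )
    conn.commit()
    row = conn.execute(
        "SELECT public_token FROM orders WHERE id = ?", (order_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"order {order_id} not found")
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, public_token TEXT)")
conn.execute("INSERT INTO orders (id) VALUES (1)")
first = ensure_public_token(conn, 1)
second = ensure_public_token(conn, 1)  # simulates a second worker arriving late
print(first == second)  # True
```

Both callers end up with the same token, which is what keeps a shared order link stable under concurrent processing.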

Frontend Architecture

Centralized Store Layout — Removed per-LiveView padding in favor of a single layout-level spacing system (store2 layout).
Shared UI Components — Extracted line_item_display, OrderDetailsComponents, ccreator_link, and OrderHelpers to reduce template duplication across cart, payment, and order pages.

Observability

PostHog Event Tracking — SMS sends from cart completion are tracked as PostHog events for funnel analysis.
Email Send Logging — All outbound emails now log delivery metadata for debugging and deliverability monitoring.
LLM Chats Viewer — Admin LiveView with paginated session list, infinite scroll via IntersectionObserver, and per-message metadata (model, latency, tokens, ratings).

Spec-Driven Development: From Vibe Coding to Structured Development

2026-02-25T00:48:06.655671+00:00

Note: I currently work with a Payments Engineering team and wrote this as we are introducing spec-driven development into our development workflow.

Introduction

If you've used an AI coding tool in the last year, you've probably had the experience: you describe what you want, the AI generates something that looks right, you run it, and... it doesn't quite work. You refine your prompt. The AI fixes one thing and breaks another. Three iterations later you're debugging code you didn't write and don't fully understand.

This is the failure mode of what Andrej Karpathy called "vibe coding," and it's become the default way most developers interact with AI. Spec driven development (SDD) is the emerging counter movement. Instead of throwing prompts at an LLM and hoping for the best, you write a structured specification first, then let the AI implement against it.

The idea isn't new. We've been writing requirements documents since forever, but the tooling is new. Tools like GitHub's Spec Kit, Amazon's Kiro, and Fission AI's OpenSpec are attempting to formalize this workflow into something repeatable. Whether that formalization helps or hinders depends entirely on what you're building, how you're building it, and the tradeoffs you're willing to make.

Our team uses OpenSpec, so most of the practical examples in this post come from that experience. But the principles apply regardless of which tool you pick.

The Problem: Why "Just Prompting" Breaks Down

The pitch for AI assisted coding is attractive: describe what you want in English and get working code back. And for simple tasks (a helper function, a config change, renaming a module) it works remarkably well. The challenges start when changes are no longer trivial and require edits across multiple files, packages, or modules.

The core issue is context loss. When you're five prompts deep into a feature, the AI has no persistent memory of the architectural decisions you made in prompt one. It doesn't know you chose a specific idempotency strategy for a reason. It doesn't remember that you explicitly avoided storing raw card data outside the tokenization boundary. Every new prompt starts from a partial view of the world, and the AI fills in the gaps with whatever patterns it's seen most in training data.

In payments systems, this produces particularly dangerous failures. Reconciliation logic scattered across three different modules because each prompt generated its own approach. A refund handler that doesn't account for partial captures. Currency conversion applied twice because the AI didn't know about the upstream normalization step. And perhaps most critically in our domain, security flaws: API keys committed to source, missing input validation on transaction amounts, authorization checks that live on the client instead of the server. Studies have found that roughly 45% of AI generated code contains security vulnerabilities. In a payments context, that's not just a bug; it's a compliance issue.

The other failure is architectural drift. Without a shared plan, each prompt/response cycle makes locally reasonable decisions that are globally incoherent. The AI can't refactor itself out of architectural problems it doesn't understand. You ask it to add retry logic to a payment gateway call and it builds a standalone retry mechanism, unaware that you already have a circuit breaker pattern in your infrastructure layer. Once the codebase reaches a certain size, the context window can only see fragments of it. You end up with a system that processes transactions but that nobody, including the AI, fully understands anymore.

This isn't the AI being dumb. It's the natural consequence of building without a map.

What Spec Driven Development Actually Is

At its simplest, spec driven development means: write down what you're building before you write the code, and make that written artifact the thing your AI agent works from.

That might sound like waterfall, but it's not, or at least it doesn't have to be. The key differences are timescale and scope. Traditional waterfall specs were project level documents written over weeks and often carved in stone. SDD specs are feature level documents written in minutes and meant to evolve. You're not planning an entire system upfront; you're planning the next meaningful chunk of work in enough detail that an AI can implement it without guessing.

A typical SDD workflow looks like this:

Define requirements. What should this feature do? Who is it for? What are the acceptance criteria? What are the edge cases?
Create a technical design. How should it be implemented? What's the data model? What APIs are involved? What patterns should be followed?
Break it into tasks. What are the discrete, testable units of work? In what order should they be done?
Implement. The AI executes against the task list, one piece at a time, with the full spec as context.

You're not writing all of this yourself. You describe the intent in natural language, and the AI generates the spec artifacts: the proposal, the requirements, the design, the task breakdown. Your job is to review, refine, and correct. You steer and the AI does the heavy lifting. This is what makes the process fast enough to be practical. Writing a 200 line spec by hand for every feature would be painful. Having the AI draft it in 30 seconds and then spending 5 minutes reviewing and adjusting it is a different proposition entirely.

The spec becomes a persistent artifact, a "super prompt" that doesn't disappear when your chat session ends. It lives in version control alongside your code. When the AI drifts, you point it back to the spec. When requirements change, you update the spec and regenerate.

The fundamental shift is that the specification becomes the source of truth, and code becomes the derived artifact. Traditional documentation describes code that already exists. SDD inverts that relationship. You define the behaviour, constraints, and architecture in the spec, and the AI produces code that conforms to it. The spec isn't something you write after the fact to explain what was built; it's the input that determines what gets built. Code is the output.

The Tooling Landscape

Three tools have emerged as the most prominent in this space. Each takes a different philosophical approach.

GitHub Spec Kit

Spec Kit is an open source CLI from GitHub that scaffolds a spec driven workflow into your existing project. It's agent agnostic, working with GitHub Copilot, Claude Code, Gemini CLI, and others. The workflow follows rigid phases driven by slash commands: /speckit.constitution to establish project principles, /speckit.specify to create feature specs, /speckit.plan for a technical plan, /speckit.tasks for work items, and /speckit.implement to execute.

Strengths: Thorough documentation output, the "constitution" concept for project wide principles, works with many agents.

Weaknesses: Heavyweight. It can generate a lot of artifacts for simple changes. Rigid phase gates mean you can't easily jump back and forth between planning and implementing.

Amazon Kiro

Kiro is a full IDE (a VS Code fork) with spec driven development baked into the editing experience. The workflow follows a similar shape (requirements → design → tasks → implement) but is tightly integrated with the editor. It generates user stories with acceptance criteria, creates technical design documents, and produces task lists. It also introduces "Hooks," user defined prompts triggered by file changes.

Strengths: Most polished integrated experience. The Hooks system is excellent and something you'd have to configure manually if you decide to do it on your own. No context switching between planning and editing because of the IDE integration.

Weaknesses: You're locked into their IDE and limited to Claude models. Can be overkill for small changes. One developer reported a simple bug fix generating 4 user stories with 16 acceptance criteria. The overhead can be significant.

OpenSpec (Fission AI)

OpenSpec is the most lightweight of the three. It's a TypeScript CLI with a fluid, iterative workflow and no rigid phase gates. Where Spec Kit enforces a strict sequence and Kiro wraps everything in an IDE, OpenSpec gets out of your way and lets you move between planning artifacts freely.

Its distinguishing philosophy is "brownfield first." While the other tools are optimized for building new things from scratch, OpenSpec is designed to work with existing codebases. Each change produces a "spec delta," a document that captures what's being added, modified, or removed relative to the existing system. Over time, these deltas merge into a living specification that reflects the current state of the system.

OpenSpec also handles change history better. Every completed change is archived with its full artifact set: the original proposal, the spec deltas, the design, and the task list. This means you can go back and see not just what changed in the system, but why it changed, what alternatives were considered in the design, and what the original acceptance criteria were. Spec Kit and Kiro generate artifacts during planning but don't have the same structured archive and merge cycle. In OpenSpec, the openspec/changes/archive/ directory becomes a chronological record of every significant change to the system, and the openspec/specs/ directory is always the merged, current truth. For regulated environments where auditability matters, this distinction is significant.

Strengths: Works with 20+ AI tools including Claude Code, Cursor, Copilot, Windsurf, and many others. The brownfield focus is valuable in our context as most real work is on existing codebases. Fluid workflow lets you update any artifact at any time and you are not forced into a linear way of working. The archive/merge cycle produces both a living spec and an auditable change history.

Weaknesses: Less hand holding in the spec writing process; that is the trade-off for being able to move back and forth between spec and implementation. The tool is newer and the ecosystem is still growing.

Installing OpenSpec

OpenSpec requires Node.js 20.19.0 or higher.

Install OpenSpec globally:

npm install -g @fission-ai/openspec@latest

Then navigate to your project directory and initialize:

cd your-project
openspec init

The init process will ask which AI tool you're using and configure the appropriate slash commands or agent instructions for your environment.

OpenSpec also works with pnpm, yarn, bun, and nix. See the official installation docs for alternative paths.

Keeping OpenSpec Updated

Upgrade the package:

npm install -g @fission-ai/openspec@latest

Then refresh agent instructions in each project:

openspec update

OpenSpec's Workflow in Depth

Understanding the full lifecycle of an OpenSpec change is worth the time, because the artifacts it generates serve different roles on the team in different ways.

The Core Commands

OpenSpec's workflow is built around the opsx slash commands. Here's the complete set:

Command	Purpose
`/opsx:onboard`	Guided tutorial through the complete workflow using real code
`/opsx:explore`	Think through ideas, investigate problems, clarify requirements before committing to a change
`/opsx:new`	Create a new change folder with metadata
`/opsx:continue`	Progress a change to its next phase (proposal → design → tasks)
`/opsx:ff`	"Fast forward": generate all planning artifacts at once
`/opsx:apply`	Implement tasks, writing code and checking off items
`/opsx:verify`	Validate that implementation matches the artifacts (completeness, correctness, coherence)
`/opsx:sync`	Merge delta specs into main specs without archiving (useful for long running changes)
`/opsx:archive`	Archive a completed change, merging delta specs into main specs
`/opsx:bulk-archive`	Archive multiple completed changes at once, handling spec conflicts

The typical flow is new → ff → apply → archive, but the power of OpenSpec is that you can break out of that sequence at any point. Need to revisit the design after you've started implementing? Just edit design.md. Want to add acceptance criteria while coding? Update the spec delta. There are no phase gates forcing you to "finish" one stage before moving to another.

Starting a Change: Explore vs. New

One of the first decisions in any OpenSpec workflow is how you enter it. There are two entry points, and choosing the right one makes a real difference in the quality of what comes out the other side.

/opsx:new is for when you know what you're building. You have a clear feature in mind, you understand the requirements well enough to describe them, and you're ready to start generating planning artifacts. Maybe you've already discussed this in a planning meeting. Maybe you've built something similar before. Maybe the ticket is well defined and you just need to formalize it. In these cases, /opsx:new add-feature-name followed by /opsx:ff gets you from zero to a full set of planning documents in minutes.

You: /opsx:new add-payment-retry-with-exponential-backoff

AI:  Created openspec/changes/add-payment-retry-with-exponential-backoff/
     Ready to create: proposal

You: /opsx:ff

AI:  Creating all planning artifacts...
     ✓ proposal.md
     ✓ specs/
     ✓ design.md
     ✓ tasks.md
     Ready for implementation!

This works well when the problem space is familiar and constraints are understood. You've implemented retry logic before, you know the gateway's behavior, you just need to formalize it.

/opsx:explore is for when you're still figuring things out. Maybe you have a vague idea but the requirements aren't clear. Maybe you're investigating a production anomaly and you don't know where the root cause is yet. Maybe a stakeholder said "we need to support instant payments" and you need to figure out what that actually means for your settlement architecture before you can spec anything.

Explore opens a broader, open ended conversation with the AI before any change folder is created and before any artifacts are generated. There's no structure imposed, no proposal template to fill out, no pressure to define acceptance criteria. You're just thinking out loud with an assistant that has context on your codebase.

You: /opsx:explore

AI:  What would you like to explore?

You: We're seeing intermittent failures on our direct debit 
     collections via the pain.008 pathway. Some mandates are 
     being rejected with AM04 (insufficient funds) but the 
     retry logic doesn't seem to be kicking in correctly. I'm 
     not sure if it's a status mapping issue, a timing problem 
     with our batch processing window, or something in our 
     mandate lifecycle management. I want to figure out where 
     to focus before committing to a specific fix.

AI:  Let me help investigate. Based on your codebase, I can 
     see a few potential areas...
     [Analyzes code paths, suggests profiling approaches, 
      identifies likely bottlenecks, asks clarifying questions]

The explore conversation might last five minutes or an hour. It might branch into several tangents before converging on a clear problem statement. That's intentional. You're doing the messy, nonlinear thinking that produces good requirements, and you're doing it before the spec exists rather than discovering gaps during implementation.

This is also where existing requirements documents from business analysts become valuable. In payments organizations, BAs often produce detailed feature requirements independently of any SDD tool: field mapping spreadsheets, business rule matrices, sample ISO 20022 payloads (pain.001, pain.008, pacs.008), regulatory constraint documents, and workflow diagrams. These artifacts are the raw material that explore turns into actionable specs.

Rather than asking the AI to invent requirements from scratch, you feed it the BA's work:

You: /opsx:explore

AI:  What would you like to explore?

You: Our BA has written up requirements for adding SEPA Instant 
     Credit Transfers. I'm attaching their document which 
     includes the field mappings from our internal format to 
     pacs.008, the business rules for amount limits and BIC 
     validation, and sample XML payloads. I need to understand 
     how this fits into our existing payment orchestration 
     layer and what the technical implications are before we 
     spec the implementation.

AI:  I've reviewed the BA requirements document. Let me walk 
     through the key integration points...
     [Maps BA requirements against existing codebase, identifies 
      gaps, flags technical decisions that need to be made]

The explore phase becomes a bridge between the BA's domain knowledge and the engineering reality of the codebase. The BA doesn't need to know about your GenServer architecture or your Ecto schema conventions. The developer doesn't need to memorize the ISO 20022 payload structure. Explore lets both perspectives converge into a proposal that reflects both business intent and technical feasibility.

When you've reached clarity, you transition naturally into the structured workflow:

You: OK, the main complexity is in the real-time settlement 
     confirmation flow. The BA's field mappings look solid 
     but we need to add timeout handling for the 10 second 
     SCT Inst window. Let's spec that.

You: /opsx:new add-sepa-instant-credit-transfers

AI:  Created openspec/changes/add-sepa-instant-credit-transfers/
     Ready to create: proposal

Now the proposal and specs will be grounded in both the BA's requirements and the technical understanding you built during exploration, rather than being generated from a one line prompt.

When to use which:

Use /opsx:new when you can describe the feature or fix in a sentence and you're confident in the scope. Use /opsx:explore when any of the following are true: you're unsure what the root cause of a problem is, the requirements are ambiguous or underspecified, you need to evaluate multiple approaches before committing to one, or you want to pressure test an idea before investing in formal planning. In practice, we find ourselves using explore more often than we initially expected. The few minutes spent thinking before speccing consistently produce better specs, which in turn produce better code.

The Artifact Lifecycle

When you run /opsx:new add-idempotent-refunds, OpenSpec creates a change directory:

openspec/changes/add-idempotent-refunds/
├── .openspec.yaml          # Metadata: change name, status, timestamps
└── (ready for artifacts)

Running /opsx:ff (or stepping through with /opsx:continue) generates the planning artifacts:

openspec/changes/add-idempotent-refunds/
├── .openspec.yaml
├── proposal.md             # Why we're doing this, what's changing, scope
├── specs/                  # Requirements and scenarios (the spec delta)
│   └── refunds/
│       └── spec.md         # Functional requirements with ADDED/MODIFIED/REMOVED markers
├── design.md               # Technical approach, data model, component structure
└── tasks.md                # Ordered implementation checklist

Each of these artifacts has a specific purpose and a specific audience. Let's look at what goes into them.

proposal.md is the "why" document. It describes the motivation for the change, the scope of what's included and excluded, and any constraints or dependencies. This is the document you'd share in a planning meeting or attach to a ticket. It answers the question: "Why are we doing this, and what does 'done' look like at a high level?" For a refunds feature, this might capture that the driver is duplicate refund incidents costing the business money, that the scope includes full and partial refunds but excludes chargebacks, and that the constraint is backwards compatibility with the existing refund API contract.

specs/ contains the spec delta, the functional requirements for this specific change. Requirements are marked as ADDED, MODIFIED, or REMOVED relative to the current system. Each requirement uses structured language ("The system SHALL...") with clear acceptance criteria and scenarios. This is where edge cases live. This is where you define what happens when a refund is submitted with the same idempotency key as a previous request, what the system does when the gateway returns a timeout mid refund, or how partial refunds interact with the original transaction's settlement status.

design.md is the technical blueprint. It covers the data model, API contracts, component architecture, sequence flows, and any technology choices specific to this feature. For the refunds example, it's where you'd document the idempotency key storage strategy, the state machine transitions for refund lifecycle, and the gateway adapter interface for multi acquirer support.

tasks.md breaks the work into discrete, ordered implementation steps. Each task is small enough to verify independently, ideally something that can be implemented in under 30 minutes. Tasks have clear completion criteria so both the developer and the AI know when they're done.

What Happens at Archive

When all tasks are complete and verified, /opsx:archive does something important: it merges the spec deltas from the change back into the main openspec/specs/ directory. The change folder moves to openspec/changes/archive/, preserving the history. The main specs now reflect the updated state of the system.

This is the mechanism that turns specs into a living document. After a dozen features have been built and archived, openspec/specs/ contains a comprehensive, up to date description of what the system does. Not what it was designed to do originally, but what it actually does right now.
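
To make the merge step concrete, here is a minimal sketch of applying a spec delta to the main spec. It is not OpenSpec's actual implementation: the SpecDelta class, the merge_delta function, and the requirement names are all hypothetical, and real specs are markdown files rather than dictionaries.

```python
from dataclasses import dataclass, field


@dataclass
class SpecDelta:
    """Hypothetical in-memory form of one change's spec delta."""
    added: dict[str, str] = field(default_factory=dict)
    modified: dict[str, str] = field(default_factory=dict)
    removed: set[str] = field(default_factory=set)


def merge_delta(main_spec: dict[str, str], delta: SpecDelta) -> dict[str, str]:
    """Return a new spec with the delta applied; the input is not mutated."""
    merged = dict(main_spec)
    for name in delta.removed:
        if name not in merged:
            raise KeyError(f"cannot remove unknown requirement: {name}")
        del merged[name]
    for name, text in delta.modified.items():
        if name not in merged:
            raise KeyError(f"cannot modify unknown requirement: {name}")
        merged[name] = text
    for name, text in delta.added.items():
        if name in merged:
            raise ValueError(f"requirement already exists: {name}")
        merged[name] = text
    return merged


# Illustrative requirement names, not taken from a real spec.
main_spec = {
    "refund-full": "The system SHALL support full refunds.",
    "refund-legacy-export": "The system SHALL export refunds as CSV.",
}
delta = SpecDelta(
    added={"refund-idempotency": "The system SHALL deduplicate refunds by idempotency key."},
    modified={"refund-full": "The system SHALL support full and partial refunds."},
    removed={"refund-legacy-export"},
)
updated = merge_delta(main_spec, delta)
```

Failing loudly on a MODIFIED or REMOVED marker that points at a missing requirement is a design choice in this sketch: it surfaces a delta that was written against a stale view of the main spec.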

Who Benefits: SDD Across Roles

One of the underappreciated aspects of spec driven development is that the artifacts aren't just for the developer writing the code. They create value across every role that touches the project.

For Developers

The immediate benefit is implementation quality. Instead of translating a vague Jira ticket into code via a series of increasingly frustrated prompts, you're working from a spec that already captures requirements, edge cases, and technical decisions. The AI produces better code because it has better context. You spend less time debugging and reworking because misunderstandings surface during spec review, not during code review.

The longer term benefit is onboarding and maintenance. When you come back to a feature six months later, or when a new developer joins the team, the spec explains not just what the code does but why it was built that way. The proposal captures the business motivation. The design doc captures the technical rationale. The spec captures the behavioral contract.

For Business Analysts and Product Managers

The proposal and spec artifacts are written in structured natural language, not code. A BA or PM can read proposal.md and immediately understand the scope, motivation, and acceptance criteria for a change without needing to parse a pull request.

More importantly, they can contribute to these documents. If the spec says "The system SHALL retry failed direct debit collections up to 3 times" and the BA knows the scheme rules mandate a maximum of 2 retries with specific interval requirements, they can flag that in the spec before any code is written. The spec becomes a shared contract between product and engineering, reviewable by both sides.

BAs in payments organizations often produce detailed requirements documents that exist outside of any development tool: field mapping spreadsheets between internal formats and ISO 20022 messages, business rule matrices for transaction routing, sample payloads for pain.001 or pacs.008 messages, regulatory constraint documents, and scheme specific validation rules. These documents don't need to be rewritten into OpenSpec format. Instead, they serve as input to the /opsx:explore conversation and as reference material that the proposal and specs can point to. The spec might say "Field mappings follow the BA's pain.008 mapping document (see docs/ba-requirements/sepa-dd-field-mappings.xlsx)" rather than duplicating that content. OpenSpec captures the engineering requirements; the BA's documents capture the domain requirements. The two reference each other.

For teams practicing any kind of requirements analysis, the spec delta format (ADDED/MODIFIED/REMOVED) maps naturally to how BAs think about change impact. You can see at a glance exactly what existing behavior is changing and what's new.

For QA Engineers

The specs are essentially test plans waiting to happen. Each requirement with its acceptance criteria maps directly to test cases. "WHEN a refund is submitted with an idempotency key matching a previously completed refund, THEN the system SHALL return the original refund response without processing a duplicate" is a test case in all but name.
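
As a rough illustration of how directly that requirement turns into a test, here is a sketch in Python. The RefundService class and its in-memory key store are hypothetical stand-ins; a real service would persist idempotency keys durably and call an actual gateway.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class RefundResponse:
    refund_id: str
    amount: Decimal
    status: str


class RefundService:
    """Hypothetical in-memory service used only to illustrate the scenario."""

    def __init__(self) -> None:
        self._by_key: dict[str, RefundResponse] = {}
        self.gateway_calls = 0  # counts how many refunds were actually processed

    def refund(self, idempotency_key: str, amount: Decimal) -> RefundResponse:
        previous = self._by_key.get(idempotency_key)
        if previous is not None:
            # Same key as a completed refund: return the original response.
            return previous
        self.gateway_calls += 1
        response = RefundResponse(f"rf_{self.gateway_calls}", amount, "completed")
        self._by_key[idempotency_key] = response
        return response


def test_duplicate_key_returns_original_response() -> None:
    service = RefundService()
    first = service.refund("key-123", Decimal("25.00"))
    second = service.refund("key-123", Decimal("25.00"))
    assert second == first             # THEN the original response is returned
    assert service.gateway_calls == 1  # and no duplicate refund is processed


test_duplicate_key_returns_original_response()
```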

QA can review specs before implementation begins, catching gaps in test coverage at the cheapest possible point in the development cycle. In payments, where edge cases around timeouts, partial failures, and concurrent operations are where bugs hide, having QA eyes on the spec early is especially valuable. They can also use specs to verify completeness: does the implementation actually cover every scenario in the spec? OpenSpec's /opsx:verify command automates part of this check, but human QA review of the spec itself is where the real value lies.

For Tech Leads and Principal Engineers

The design document is where architectural oversight happens. A principal can review design.md to ensure the proposed approach fits the system's overall architecture, without needing to wait for a code review to discover that someone introduced a new database table that duplicates an existing one, or bypassed the payment gateway abstraction layer by calling the acquirer API directly.

The proposal document is equally valuable at this level. It provides enough context to make prioritization decisions, estimate impact on downstream systems like settlement and reconciliation, and flag dependencies before work begins.

For organizations running architecture review boards or design review processes, OpenSpec artifacts slot directly into those workflows. The artifacts are markdown in version control, which means they can be reviewed through the same pull request process as code.

For the Whole Team

The openspec/specs/ directory, the living spec that accumulates as changes are archived, becomes something like institutional memory for the project. It captures not just the current state of the system but the evolution of requirements over time. New team members can browse the specs to understand the system. Archived changes provide an audit trail of what changed, when, and why.

This is especially valuable for distributed teams where not everyone is in every meeting. The spec is always available, always current, and always in the repo.

Bridging BA Requirements and Engineering Specs

In most payments organizations, business analysts produce detailed requirements documents long before any developer opens an IDE. These documents are the product of weeks of domain analysis: field mapping spreadsheets that map internal data structures to ISO 20022 message formats like pain.001 or pacs.008, business rule matrices that define transaction routing logic, validation rules derived from scheme specifications, sample XML payloads, regulatory constraint summaries, and workflow diagrams for processes like mandate lifecycle management or chargeback dispute flows.

These artifacts are enormously valuable. They represent concentrated domain knowledge that no AI model has in its training data. But they exist outside of any development tool, usually in Confluence pages, shared drives, or Excel files. The question is: where do they plug into the SDD workflow?

The answer is /opsx:explore.

Rather than asking the AI to invent payment domain requirements from scratch (which it will get wrong in subtle, dangerous ways), you feed the BA's existing documents into the explore conversation. The AI reads the BA's field mappings, business rules, and sample payloads, then maps them against your actual codebase. The explore conversation becomes a structured dialogue about feasibility: which business rules can be implemented directly, which ones require architectural changes, where the BA's requirements conflict with existing behavior, and what edge cases the BA may not have considered from a technical perspective.

You: /opsx:explore

AI:  What would you like to explore?

You: The BA team has finalized requirements for adding SEPA 
     Direct Debit collections. I'm attaching their document 
     which includes the pain.008 field mappings from our 
     internal mandate format, the business rules for collection 
     timing windows, the R-transaction handling matrix (AM04, 
     MS02, MD01 etc.), and sample XML for both first and 
     recurring collections. I need to understand what this 
     means for our payment orchestration layer before we plan 
     the implementation.

AI:  I've reviewed the BA's requirements. Let me walk through 
     the integration points with your existing codebase...
     
     Your mandate schema already covers most of the creditor 
     reference fields, but the BA's mapping requires three 
     additional fields for the CreditorSchemeIdentification 
     that aren't in your current model...
     
     The R-transaction handling matrix defines 12 reason codes 
     but your existing rejection handler only supports 4 
     generic rejection states. This will need a more granular 
     status model...

This conversation might reveal that 70% of the BA's requirements map cleanly to existing patterns and 30% require new design decisions. Those design decisions then flow into the proposal and spec with full context, rather than being invented by the AI from a one sentence prompt.
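
One of those design decisions, the more granular status model, might start out as something like the sketch below. The CollectionStatus enum and status_for_reason function are hypothetical, only the three reason codes named in the conversation are mapped, and the meanings in the comments reflect my reading of the ISO 20022 reason codes. The BA's handling matrix remains the authority on what each code means and how it should be handled.

```python
from enum import Enum


class CollectionStatus(Enum):
    """Hypothetical granular rejection states for direct debit collections."""
    REJECTED_INSUFFICIENT_FUNDS = "rejected_insufficient_funds"
    REJECTED_BY_DEBTOR = "rejected_by_debtor"
    REJECTED_NO_MANDATE = "rejected_no_mandate"
    REJECTED_UNKNOWN = "rejected_unknown"


# Only the codes mentioned in the conversation; the full matrix would have 12.
REASON_CODE_TO_STATUS = {
    "AM04": CollectionStatus.REJECTED_INSUFFICIENT_FUNDS,  # insufficient funds
    "MS02": CollectionStatus.REJECTED_BY_DEBTOR,           # refused by the debtor
    "MD01": CollectionStatus.REJECTED_NO_MANDATE,          # no valid mandate
}


def status_for_reason(code: str) -> CollectionStatus:
    """Map an R-transaction reason code to a status, defaulting to unknown."""
    normalized = code.strip().upper()
    return REASON_CODE_TO_STATUS.get(normalized, CollectionStatus.REJECTED_UNKNOWN)
```

Falling back to an explicit unknown state, rather than raising, is an assumption here; whether an unmapped code should halt processing is exactly the kind of question to settle in the spec.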

The key principle: the BA's documents are input to the spec, not replaced by it. The OpenSpec proposal can reference them directly ("Field mappings follow the BA's pain.008 mapping document, see docs/ba-requirements/sepa-dd-field-mappings.xlsx"). The spec captures the engineering interpretation of business requirements, while the BA's artifacts remain the authoritative source for domain rules. The two complement each other.

For teams with a strong BA function, this workflow turns explore into the most valuable step in the entire process. It's where domain expertise meets technical reality, and where misunderstandings between product and engineering get caught before they become expensive.

Beyond Epics and User Stories

For years, the standard way to decompose work in software organizations has been the Agile hierarchy: Epics break into Features, Features break into User Stories, User Stories break into Tasks. Each layer adds structure, and each layer adds overhead. Grooming sessions to refine stories. Estimation ceremonies to assign points. Sprint planning to negotiate what fits. Story splitting when something is "too big." Acceptance criteria written in Given/When/Then format.

This process was designed for a world where humans wrote every line of code, and work needed to be decomposed into pieces small enough for one developer to complete in a sprint. The granularity served a coordination function: if three developers are working on the same feature in parallel, you need clearly bounded units of work to avoid stepping on each other.

With AI agents handling the bulk of code generation, developers now work in significantly larger chunks. A feature that would have been split into 8 user stories with 24 tasks can be described as a single spec and implemented in one session. The AI doesn't need two week sprints to context switch between stories. It doesn't need story points to estimate effort. It doesn't care whether a unit of work is a 3 or a 5. It needs a clear description of what to build and enough context to build it correctly.

The overhead of the old hierarchy was always significant. Ceremonies consume 15-30% of a team's time. The BA writes detailed requirements and translates them into epics and stories. The tech lead estimates them. The developer re-interprets them during implementation. Each translation step is an opportunity for information loss.

SDD collapses this. Instead of an Epic with 5 Features containing 20 User Stories containing 60 Tasks, you have a proposal that captures the business intent, a spec that defines the requirements, a design that describes the technical approach, and a task list that the AI executes against. The entire chain from "why are we doing this" to "what code gets written" lives in one change folder, reviewable as a single unit. There's no translation step between roles, because the spec is the shared artifact that the BA, the tech lead, and the developer all read, review, and contribute to.

This doesn't mean you abandon planning. It means the unit of planning shifts from "what can one developer finish in two days" to "what is the next coherent change to the system." That change might be small (fix a settlement rounding error) or large (add multi currency support). The spec scales to fit either. And because the AI generates the task breakdown from the spec, you don't need to spend planning meetings debating granularity. The tasks exist to guide the AI's implementation, not to fill a sprint board.

SDD offers a natural landing spot for a switch to AI-first development. You keep the parts that matter (clear requirements, technical review, incremental delivery) and shed the parts that were only ever coordination scaffolding.

A Practical Workflow: Adding Multi Currency Settlement to a Payment Engine

Let's walk through a concrete example using OpenSpec to add multi currency settlement support to a payment processing platform.

Step 1: Explore (optional but useful)

If the requirements aren't fully clear, or if you have BA created documents that need to be digested before planning, start with exploration:

You: /opsx:explore
AI:  What would you like to explore?

You: We need to support settlement in multiple currencies. 
     Right now everything settles in the merchant's home 
     currency, but we're onboarding merchants in the EU who 
     need EUR settlement even when the cardholder pays in GBP.
     I've attached the BA's requirements document which includes 
     the field mappings, currency pair rules, and the regulatory 
     constraints for cross border settlement. I need to 
     understand how this fits into our existing reconciliation 
     pipeline and whether our current batch architecture can 
     handle currency grouping.

AI:  I've reviewed the BA requirements document. Let me map 
     these against your existing settlement pipeline...
     [Analyzes BA requirements against codebase, identifies 
      integration points, flags technical decisions needed]

This step is for thinking, not committing. No change folder is created yet.

Step 2: Create the change

You: /opsx:new add-multi-currency-settlement

AI:  Created openspec/changes/add-multi-currency-settlement/
     Ready to create: proposal

Step 3: Generate planning artifacts

If you have a clear picture of what to build, fast forward through all planning stages:

You: /opsx:ff

AI:  Creating all planning artifacts...
     ✓ proposal.md
     ✓ specs/
     ✓ design.md
     ✓ tasks.md
     Ready for implementation!

Alternatively, use /opsx:continue to step through one phase at a time, reviewing and refining each artifact before moving to the next.

Step 4: Review and refine the artifacts

This is the most important step. Read the generated proposal, specs, and design. Push back on anything that doesn't match your intent:

You: The design proposes fetching FX rates at settlement time. 
     We need to lock the rate at authorization and store it on 
     the transaction record. Also the spec is missing the 
     requirement that settlement files must include both the 
     original transaction currency and the settlement currency.

AI:  I'll update design.md to lock FX rates at authorization 
     and add the dual currency settlement file requirement 
     to the spec.
     [Updates artifacts]
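
The correction requested above, locking the rate at authorization and storing it on the transaction, could look roughly like this. It is a sketch under assumptions: the Transaction fields beyond fx_rate and settlement_currency, the authorize and settlement_amount functions, the 1.1700 rate, and the half-up rounding to two decimals are all illustrative rather than taken from a real platform.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class Transaction:
    amount: Decimal              # amount in the cardholder's currency
    currency: str                # original transaction currency
    settlement_currency: str
    fx_rate: Decimal             # locked at authorization, never re-fetched
    fx_rate_locked_at: datetime


def authorize(amount: Decimal, currency: str, settlement_currency: str,
              quoted_rate: Decimal, now: datetime) -> Transaction:
    """Capture the quoted rate on the transaction record at authorization."""
    return Transaction(amount, currency, settlement_currency, quoted_rate, now)


def settlement_amount(txn: Transaction) -> Decimal:
    """Convert using the stored rate, so settlement time does not matter."""
    converted = txn.amount * txn.fx_rate
    # Rounding mode is an assumption; the real rule belongs in the spec.
    return converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


txn = authorize(
    amount=Decimal("100.00"),
    currency="GBP",
    settlement_currency="EUR",
    quoted_rate=Decimal("1.1700"),  # illustrative rate, not market data
    now=datetime(2026, 2, 24, 12, 0, tzinfo=timezone.utc),
)
print(settlement_amount(txn))  # 117.00
```

Because both currencies and the rate live on the record, the settlement file can carry the original and settlement currency without any lookup at settlement time.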

Step 5: Implement

You: /opsx:apply

AI:  Working through tasks...
     ✓ 1.1 Add fx_rate and settlement_currency to transactions schema
     ✓ 1.2 Create FX rate locking module with gateway adapter
     ✓ 1.3 Update authorization flow to capture and persist rate
     ✓ 2.1 Modify settlement batch builder for multi currency grouping
     ✓ 2.2 Update reconciliation to match on settlement currency
     ✓ 2.3 Add settlement file format with dual currency columns
     All tasks complete!

Step 6: Verify

You: /opsx:verify

AI:  Checking implementation against artifacts...
     ✓ All requirements covered
     ✓ Design patterns followed
     ✓ All tasks marked complete
     ⚠ Spec mentions "handle rate expiration gracefully" but 
       current implementation raises an unhandled error when 
       the cached rate is older than 30 minutes. Consider a 
       re-fetch with merchant notification.
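
A fix for that warning might follow the shape below. This is only a sketch of the suggested re-fetch with merchant notification: CachedRate, usable_rate, and the refetch and notify callbacks are hypothetical names, and the 30 minute limit is taken from the verify output above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from typing import Callable

MAX_RATE_AGE = timedelta(minutes=30)  # limit quoted in the verify output


@dataclass(frozen=True)
class CachedRate:
    rate: Decimal
    fetched_at: datetime


def usable_rate(cached: CachedRate, now: datetime,
                refetch: Callable[[], Decimal],
                notify: Callable[[str], None]) -> CachedRate:
    """Return the cached rate if fresh; otherwise re-fetch and notify."""
    if now - cached.fetched_at <= MAX_RATE_AGE:
        return cached
    fresh = CachedRate(refetch(), now)
    notify(f"FX rate refreshed from {cached.rate} to {fresh.rate}")
    return fresh


# Usage with stand-in callbacks; a real system would call the rate provider
# and the merchant notification channel here.
messages: list[str] = []
start = datetime(2026, 2, 24, 12, 0, tzinfo=timezone.utc)
stale = CachedRate(Decimal("1.1700"), start)
result = usable_rate(stale, start + timedelta(minutes=31),
                     refetch=lambda: Decimal("1.1800"),
                     notify=messages.append)
```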

Step 7: Archive

You: /opsx:archive

AI:  ✓ Merged specs into openspec/specs/
     ✓ Archived to openspec/changes/archive/2026-02-24-add-multi-currency-settlement/
     Specs updated. Ready for the next feature.

Handling Interrupts and Stopping Mid Implementation

Real work isn't linear. OpenSpec handles context switches gracefully because your plan lives in the artifacts, not in the AI's memory.

One important detail: during /opsx:apply, you can stop the AI at any point while monitoring its output. If you see it heading in the wrong direction on task 1.3, you stop it, correct course, and resume. The task list in tasks.md tracks what's been completed, so nothing about your plan is lost. This is fundamentally different from vibe coding, where stopping mid stream means losing context and hoping you can reconstruct where you were.

The same principle applies to bigger interrupts. You can pause an entire feature to handle something urgent:

You: I'm working on multi-currency-settlement but we have an 
     urgent issue: the nightly batch is double-settling some 
     transactions.

You: /opsx:new fix-duplicate-settlement-batch

AI:  Created openspec/changes/fix-duplicate-settlement-batch/

You: /opsx:ff
     [Plans the fix]

You: /opsx:apply
     [Implements the fix]

You: /opsx:archive
     ✓ Archived fix-duplicate-settlement-batch

You: Let's get back to multi currency settlement.

You: /opsx:apply add-multi-currency-settlement
AI:  Resuming add-multi-currency-settlement...
     Picking up at task 2.2: Update reconciliation to match 
     on settlement currency...

The multi currency feature picks up exactly where it left off. The artifacts held the plan while you were away, whether that was five minutes or five days.
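
The reason resuming works is that progress is plain text on disk. As a minimal sketch, assuming tasks.md tracks completion with markdown checkboxes (the exact file layout and the section headings here are assumptions, and next_open_task is a hypothetical helper), finding where to pick up is a few lines:

```python
import re
from typing import Optional

# Matches "- [x] title" (done) and "- [ ] title" (open) checklist lines.
TASK_LINE = re.compile(r"^\s*- \[(?P<done>[ xX])\] (?P<title>.+)$")


def next_open_task(tasks_md: str) -> Optional[str]:
    """Return the title of the first unchecked task, or None if all are done."""
    for line in tasks_md.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group("done") == " ":
            return match.group("title").strip()
    return None


TASKS_MD = """\
## 1. Schema and rate locking
- [x] 1.1 Add fx_rate and settlement_currency to transactions schema
- [x] 1.2 Create FX rate locking module with gateway adapter
- [x] 1.3 Update authorization flow to capture and persist rate

## 2. Settlement
- [x] 2.1 Modify settlement batch builder for multi currency grouping
- [ ] 2.2 Update reconciliation to match on settlement currency
- [ ] 2.3 Add settlement file format with dual currency columns
"""

print(next_open_task(TASKS_MD))
# prints: 2.2 Update reconciliation to match on settlement currency
```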

When NOT to Use Spec Driven Development

SDD is not appropriate everywhere. Here are some cases where the overhead isn't worth it:

Quick bug fixes. If you know exactly what's wrong and the fix is a one line change to a gateway timeout value, writing a spec is like filing a building permit to hang a picture frame. Just fix it.

Exploratory prototyping. When you're trying to figure out what to build, not how to build it, specs slow you down. Vibe coding is genuinely great for rapid exploration. If you're prototyping a new merchant dashboard layout to see what feels right, just build it iteratively.

Highly visual or interactive work. SDD tools are text based. If your feature is primarily about UI layout, animation, or interaction design, you'll spend more time describing the visual result in markdown than you'd spend just building it with visual feedback (though pairing SDD with TideWave can work wonders for UI work).

Trivial features. Updating an error message string, renaming a config key, bumping a dependency version. These don't need a spec. Use your judgment about the complexity threshold.

Rapidly changing requirements. If you're in a phase where the payment scheme keeps revising the spec and requirements shift weekly, maintaining your own specs becomes overhead that fights against your pace. Get to stability first, then spec the features that need to stick.

The general rule: if you can hold the entire change in your head and verify it by looking at it, you probably don't need a spec. If the change involves multiple files, multiple concerns, or behavior you can't verify visually, a spec starts paying for itself.

What to Watch Out For

Having used these tools and studied the experiences of others, here are the traps:

Spec bloat. The AI loves to generate exhaustive specifications. A feature that would take you 30 minutes to implement can produce 800+ lines of markdown. You have to be disciplined about trimming specs to what's actually useful. If you're not reading the spec carefully, it's worse than not having one because you'll have false confidence that edge cases are covered when they're not.

The waterfall trap. SDD can slide into big design up front if you're not careful and start bundling many features into one spec. If changing the spec feels expensive or bureaucratic, you've over formalized. OpenSpec's fluid workflow helps here since there are no phase gates, but you still need the discipline to keep specs lightweight enough to throw away and rewrite if you find yourself going down the wrong path.

Spec drift. The spec says one thing; the code does another. This happens when you make implementation fixes outside the spec workflow. Either update the spec when you deviate, or accept that the spec is aspirational rather than authoritative. OpenSpec's /opsx:sync command can help keep specs aligned during long running changes.

The AI ignores its own spec. This is a real and documented problem. Context windows are larger, but that doesn't mean the AI attends to everything in them equally. People have reported that AI agents generate code that contradicts the spec they just wrote, creating duplicate classes, ignoring constraints, or implementing patterns the spec explicitly avoided. The /opsx:verify step exists specifically to catch this.

Review fatigue. SDD adds a new category of artifact to review. You're now reviewing specs AND code. If your team doesn't value spec review as highly as code review, specs become rubber stamped documents that provide an illusion of rigour.

Over application to small changes. The tooling doesn't scale down well. Applying the full SDD workflow to a minor feature creates overhead that dwarfs the implementation time. You need a personal threshold for when to spec and when to just build.

The Waterfall Question

Every discussion of SDD eventually arrives at the same question: isn't this just waterfall with better marketing?

The comparison is fair to raise and unfair to leave unexamined. Traditional waterfall failed because of long feedback loops: months of design, months of implementation, and discovery at the end that the design didn't match reality. The feedback cycle was measured in quarters.

SDD, practiced well, has feedback cycles measured in minutes to hours. You write a spec for a single feature, not an entire system. You review the generated design before implementation starts. You implement in small, verifiable tasks. And critically, changing the spec and regenerating is cheap. The whole point is that code is a derived artifact you can throw away and recreate.

SDD can slide into waterfall like rigidity if you treat specs as immutable, if the spec writing phase becomes its own bottleneck, or if you use SDD as a substitute for iterative discovery. As Gojko Adzic observed, the movement builds on solid intent-first ideas but could reintroduce rigidity if practitioners aren't thoughtful about it.

The Thoughtworks perspective captures the nuance well: the problems of vibe coding come from being too fast, spontaneous, and haphazard, while the problems of waterfall come from being too slow, rigid, and disconnected from reality. SDD, when practiced well, occupies the middle ground. It provides a mechanism for shorter and more effective feedback loops than either extreme.

The honest answer is that SDD sits on a spectrum. At one end, you have "spec as lightweight sketch," a quick outline that gives the AI direction without constraining it. At the other end, you have "spec as source of truth," a comprehensive document that the code must conform to. OpenSpec's fluid approach leans toward the lighter end of that spectrum, which is why it appeals to teams who want discipline without ceremony.

Pros and Cons

What SDD Gives You

Reduced rework. Catching misunderstandings at the spec level is dramatically cheaper than catching them in code. When a BA's field mapping is wrong, you want to discover that while reviewing a proposal, not while debugging a failed settlement file at 2 AM.

Persistent context. Specs survive session boundaries, tool switches, and team changes. Six months from now, when someone asks why the FX rate locking works the way it does, the spec and its proposal explain both the what and the why.

Reviewable intent across roles. You can review a spec without reading any code. Product managers, BAs, QA, and principals can participate in spec review and catch requirement gaps before implementation begins. In a payments context, this means compliance can review the spec for regulatory alignment without needing to read Elixir.

What SDD Costs You

Time upfront. Writing and reviewing specs takes time that vibe coding doesn't require. For simple tasks, this overhead is pure cost with minimal benefit.

False precision. Detailed specs can create an illusion of completeness. Just because the spec covers edge cases on paper doesn't mean the AI will implement them correctly. You still need to test.

Tool immaturity. These tools are all early stage. Expect rough edges, breaking changes, and workflow gaps. The ecosystem is moving fast, which means today's best practices may be obsolete in six months.

Where This Is Heading

Spec driven development is less than a year old as a named practice, and the tooling is evolving fast. The fundamental insight, that AI agents produce better code when given structured intent rather than ad hoc prompts, seems durable even if the specific tools don't survive.

What's interesting is the convergence. BDD (Behavior Driven Development), TDD (Test Driven Development), and now SDD all share the same DNA: define the desired behavior before writing the implementation. SDD is that idea adapted for a world where the implementer is an AI agent rather than a human developer.

The open question is whether specs will remain the domain of dedicated tools, or whether this discipline gets absorbed into the AI coding tools themselves. We're already seeing Cursor, Claude Code, and Copilot add planning and multi step reasoning capabilities that accomplish some of what SDD tools do, without the explicit spec writing step.

For now, the practical takeaway is simple: if you're doing anything more complex than a quick prototype with AI coding tools, some form of structured planning, whether you call it SDD or just "thinking before prompting," will produce better results than vibing your way through it. The tools can help enforce that discipline, but the discipline itself is what matters.

The spec isn't the point. The thinking is.

Podcast: New Age Development Workflows

2026-01-15T13:43:56.472609+00:00

With Cheezy and Csaba. Topics include:

Importance of ancillary tools built on top of coding agents
Review of Stride requirements system so you don't waste your tokens
Does Programming Language matter? Survey says...
What exactly are Product Managers doing?
Code Review and AI Trust - this just won't die
How important are Software Engineering principles?

It's a Great Time to be a Software Engineer

2026-01-07T14:49:12.572487+00:00

Here are some thoughts on AI development based on my experience of the last two years. As with any list, these are in no particular order.

Get excited. AI is only coming for your job if you treat it as an optional part of your job. It's here to help you become a better and more efficient software engineer. Embrace it wholeheartedly just like you embraced IDEs in favour of text editors. Using AI doesn't make you a lesser programmer and not using it doesn't make you special in any way. In fact, not using it or resisting it makes you look out-of-touch. This is what you have been waiting for to love your job again, and it just might remind you that you got into this business because it feels great to create things, not necessarily code things.
Most code (upwards of 80%) should be AI generated at this point. If it's not, there is something inherently flawed about your workflow. Just put your pride aside, and acknowledge that AI is a better programmer than you. Your coding skills are now worth little, but your software engineering skills are worth a lot more. Invest in the latter, don't cling on to the former. AI code is still "your" code so you can take the same pride in it as you did before. You just learned how to type faster. A lot faster!
SRP, DRY, SOLID and clean design/code should be the focus of the programmer. Guiding AIs to get these right requires understanding the business context in which the software is being used, which AI doesn't know. How a feature is expected to change in the future, and what trade-offs need to be made is something you need to be an expert at. Do I create a new module? Is this method named appropriately? Is it taking too many parameters? Am I violating Demeter's Law? Is this file getting too big? Should I separate these two concerns? What would make this more reusable? These are the decisions you should be spending time on. This requires understanding the product more than you needed to in the past. You're not only a Software Engineer, you're a Product Engineer, and that requires a deep understanding of something you may have ignored in the past.
Context management (or engineering) is where efficiencies are to be gained. If you find yourself repeating things to a forgetful AI, then that's a problem to be solved. Simple solutions include Claude Skills and more sophisticated ones include using Beads. Your workflow should be constantly "saving" things to memory to make you more efficient. Sometimes I find myself frustrated by having to remind Claude that it needs to "do X first when it's doing Y" - those rules should be codified. Don't treat AGENTS.md or any other instruction file as a static document or it'll waste your time. How to manage your own context (and your team's) is something to dedicate time to. If you work in a large company, this is an especially interesting challenge as you have to balance alignment and autonomy, hard rules and guidelines, etc.
Everyone should read a book where you build an LLM from scratch. It's going to be painful and, like me, you're probably going to have to re-read chapters just to get it through your head (I did, many times), but when it clicks, you'll be better off for it. Though chances are you'll never develop your own LLM and will probably use a frontier model most of the time, it helps to know how things work under the hood. You'll need to tweak model parameters at some point in your career, and having this foundational knowledge will be the difference between winging it and knowing what you're doing.
Code review is the new bottleneck. The good news is that we already have tools popping up that make this easier (e.g., CodeRabbit). For reviewing code locally, multi-agent workflows work great. A separate agent dedicated to reviewing code for correctness, security, etc., with its own rules and guidelines, is easy to implement, e.g. claude-code review --aspect "correctness" src/ > /tmp/review_correctness.md. If you're not using multi-agent workflows, this is an easy place to start. Here are a couple of other candidates: 1) an agent dedicated to writing good commit messages based on git diff, 2) a test refactoring agent which gets invoked to clean up tests; shoving test clean up rules into the "development" context may be too much, so having a separate focused agent will work better.
There is no excuse not to have clean code. Refactoring is cheap, writing tests is cheaper. If you have code that's not clean, generate higher-level tests for it, and then ask the agent to refactor. The tests will serve as your guiding light on whether something went wrong. This is especially valuable in brownfield codebases where changes are the riskiest. Having dedicated workflows to "clean up code" is another example of easy to implement multi-agent workflows.
Documentation is free. Whether it be inline code documentation, architectural diagrams or Correction of Error analysis, what used to take days now takes minutes. There is simply no excuse not to have comprehensive and up-to-date documentation, both from a product and engineering point of view. Not only should your code describe what it does where clarity is needed, it should also indicate the business rules behind it (whether it be inline or linked to external docs). A programmer reading the code should have a single point of entry to understand both the design decisions and the context in which the customer is using it.
Cost optimization is now part of software engineering. Not every task needs Claude Opus, and knowing when to delegate to cheaper AIs is a skill. Even better, a free one like Qwen Code should be installed locally for simple tasks and basic CRUD operations (which is about 90% of all development). Complex refactoring with business context is worth the Opus pricing. You should have mental models about which model to reach for given the problem at hand. Track your AI costs per feature just like you'd track compute costs on AWS so you can optimize your workflow and not just the code. Running expensive models on trivial tasks is wasteful and unprofessional.
High-Level System Design is where you are needed. AI will crush implementation details but architectural decisions require human judgment that understands business constraints, team capabilities, and long-term maintenance burden. You need to get better at system design, understanding trade-offs between different architectural patterns, and making decisions that account for factors AI can't know - like the fact your team hates microservices or that you're planning to acquire a company next quarter. This is where your value multiplies.

How to Set Up a New WordPress Site on Digital Ocean Droplet

2026-01-04T18:14:43.939380+00:00

I had set up a WordPress site a while back on Digital Ocean using their One-Click Installer. That installer sets it up for exactly one site with no direction on how to add additional sites. Months later I had to set up another one, and after some fiddling around, figured out all the steps. So here they are (more for my reference than anything).

This guide assumes you set up the site with the One-Click WordPress droplet and want to add additional WordPress sites to it.

Prerequisites

Digital Ocean droplet with WordPress One-Click already installed
SSH access to your server
A domain name pointed to your droplet's IP address

1. Create the Virtual Host Configuration

Create a new Apache config file for your site:

sudo nano /etc/apache2/sites-available/yoursite.conf

Add the following configuration (replace yourdomain.com and yoursite with your actual values):

<VirtualHost *:80>
    ServerName yourdomain.com
    ServerAlias www.yourdomain.com
    DocumentRoot /var/www/yoursite

    <Directory /var/www/yoursite>
        AllowOverride All
        Require all granted
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/yoursite_error.log
    CustomLog ${APACHE_LOG_DIR}/yoursite_access.log combined
</VirtualHost>

2. Create the Document Root Directory

sudo mkdir /var/www/yoursite
sudo chown -R www-data:www-data /var/www/yoursite

3. Enable the Site

sudo a2ensite yoursite.conf
sudo apache2ctl configtest
sudo systemctl reload apache2

4. Set Up HTTPS with Certbot

Certbot is preinstalled on the droplet. Run:

sudo certbot --apache -d yourdomain.com -d www.yourdomain.com

Follow the prompts to enter your email and agree to terms. Certbot will automatically configure SSL and set up auto-renewal.

Verify auto-renewal works:

sudo certbot renew --dry-run

5. Create MySQL Database and User

Get the MySQL root password (the file lives in root's home directory, so use sudo unless you are logged in as root):

sudo cat /root/.digitalocean_password

Log into MySQL:

mysql -u root -p

Create the database and user:

CREATE DATABASE yoursite;
CREATE USER 'yoursite_user'@'localhost' IDENTIFIED BY 'your_strong_password_here';
GRANT ALL PRIVILEGES ON yoursite.* TO 'yoursite_user'@'localhost';
FLUSH PRIVILEGES;
EXIT;

6. Download and Install WordPress

cd /var/www/yoursite
sudo wget https://wordpress.org/latest.tar.gz
sudo tar -xzf latest.tar.gz --strip-components=1
sudo rm latest.tar.gz
sudo chown -R www-data:www-data .

Visit your domain in a browser to complete the WordPress installation. You'll need:

Database Name: yoursite
Username: yoursite_user
Password: (the password you set)
Database Host: localhost
Table Prefix: wp_ (or customize for security)

Importing an Existing Database (Optional)

If you have a mysqldump file to import:

mysql -u yoursite_user -p yoursite < /path/to/your-database.sql

For gzipped files:

gunzip < your-database.sql.gz | mysql -u yoursite_user -p yoursite

Troubleshooting

Viewing Apache Error Logs

sudo tail -f /var/log/apache2/yoursite_error.log

Enabling WordPress Debug Mode

Edit wp-config.php:

sudo nano /var/www/yoursite/wp-config.php

Add or modify these lines:

define('WP_DEBUG', true);
define('WP_DEBUG_LOG', true);
define('WP_DEBUG_DISPLAY', false);

Then view the debug log:

tail -f /var/www/yoursite/wp-content/debug.log

PHP Not Working

PHP 8.0 is enabled globally on the droplet, so it should work automatically. Test with:

echo "<?php phpinfo(); ?>" | sudo tee /var/www/yoursite/test.php
sudo chown www-data:www-data /var/www/yoursite/test.php

Visit https://yourdomain.com/test.php — then delete the file after testing:

sudo rm /var/www/yoursite/test.php

Notes

The default web root for the original site is /var/www/html
MySQL credentials are stored in /root/.digitalocean_password
XML-RPC is blocked by default; enable it with sudo a2disconf block-xmlrpc followed by sudo systemctl reload apache2 if needed
Apache config files are in /etc/apache2/sites-available/
Enabled sites are symlinked in /etc/apache2/sites-enabled/

Struct Updates Now Require Pattern Matching in Elixir 1.19

2026-01-02T17:36:03.310436+00:00

After upgrading to Elixir 1.19, you might see warnings like this:

warning: a struct for Amplify.Models.Product is expected on struct update:

    %Amplify.Models.Product{product | variants: variants}

but got type:

    dynamic()

The fix is to pattern match on the struct when you define the variable:

# Before (now warns)
{:ok, product} = Products.get_product(id)
product = %Product{product | variants: filtered}

# After (two options)

# Option 1: Pattern match + struct update
{:ok, %Product{} = product} = Products.get_product(id)
product = %Product{product | variants: filtered}

# Option 2: Pattern match + map update (recommended)
{:ok, %Product{} = product} = Products.get_product(id)
product = %{product | variants: filtered}

Both work. Elixir's hint suggests Option 2 since the pattern match already guarantees the type.

Why the change? The struct update syntax %Product{product | key: value} carries a runtime assertion: if product isn't actually a Product, the update raises. The compiler trusted that you knew what you were doing, so although the code looked like it enforced a type, nothing checked it before the code ran.

This also doesn't work well with type inference, which is Elixir's typing approach (as opposed to explicit annotations). When product comes from a function returning {:ok, any()}, the compiler sees it as dynamic(). It can't verify the struct update is safe without runtime execution.

This change was needed since Elixir 1.19 continues the rollout of set-theoretic types, i.e., a gradual typing system that infers types from patterns and guards rather than explicit annotations. The goal is catching bugs at compile time without requiring you to annotate everything.

For this to work, the compiler needs evidence. A pattern match like %Product{} = product provides that evidence. Without it, the type system has to treat the variable as dynamic(), which defeats the purpose.

The struct update syntax was convenient shorthand, but it created a hole in type inference. You could write code that the compiler couldn't verify, getting neither the safety of static typing nor a clear signal that you'd opted out. The new approach is more verbose, but the pattern match documents your assumption explicitly. Six months from now, when someone refactors get_product/1 to return a different struct, the compiler will catch it instead of production.

If you're updating many struct fields and want the old convenience, the map update syntax %{product | key: value} produces the same result at runtime, but the struct type is only guaranteed if you pattern matched when defining the variable.
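To make Option 2 concrete, here is a minimal, self-contained sketch. The Product struct, the Products.get_product/1 stub and the keep_variants/2 helper are hypothetical stand-ins for illustration, not the real Amplify modules.

```elixir
defmodule Product do
  # Hypothetical stand-in for Amplify.Models.Product
  defstruct [:id, variants: []]
end

defmodule Products do
  # Hypothetical lookup that returns {:ok, struct}, like get_product/1 in the post
  def get_product(id), do: {:ok, %Product{id: id, variants: [:ga, :vip, :balcony]}}
end

defmodule Demo do
  def keep_variants(id, allowed) do
    # The pattern match is the compiler's evidence that `product` is a %Product{}
    {:ok, %Product{} = product} = Products.get_product(id)
    filtered = Enum.filter(product.variants, &(&1 in allowed))

    # Map update syntax: updates an existing key and keeps the struct type
    %{product | variants: filtered}
  end
end

IO.inspect(Demo.keep_variants(1, [:ga, :vip]))
```

If get_product/1 ever returned a different struct, the match on %Product{} would fail at the point where the variable is bound rather than somewhere further down.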

Why Regex Serialization Changed in Elixir 1.19/OTP 28

2026-01-02T06:23:03.497042+00:00

When upgrading to Elixir 1.19 with Erlang/OTP 28, the deployment failed with a cryptic error:

** (Mix) Could not write configuration file because it has invalid terms

Application: :cors_plug
Key: :origin
Invalid value: ~r/.*\.jumpcomedy\.com$/
Reason: you must use the /E modifier to store regexes

The fix was simple: just add one character.

config :cors_plug,
-  origin: ~r/.*\.jumpcomedy\.com$/
+  origin: ~r/.*\.jumpcomedy\.com$/E

But why did this break? And what does that /E actually do?

When you deploy an Elixir application, the build process creates a release, which is a packaged version of your app that includes all your code and configuration pre-compiled and ready to run (like a JAR in Java).

For this to work, Elixir needs to serialize your configuration into a format that can be stored on disk and loaded later. Erlang/OTP 28 has introduced runtime optimizations but to benefit from them, regexes need to be stored differently on disk than how they're represented in code.

The /E modifier tells Elixir: "Convert this regex to a format that can be stored in releases."

Without /E:

Your regex works fine in development (running with mix phx.server)
Your regex works fine when running tests
Your regex fails when building a production release

With /E:

Your regex works everywhere, including in releases

/E is the export modifier: it compiles the regex into an exported form that can be written to disk and imported again at runtime, so the regex can be packaged for deployment. In our case, we were using a regex in our CORS configuration:

config :cors_plug,
  origin: ~r/.*\.jumpcomedy\.com$/E

This lives in config/runtime.exs, which gets evaluated when the release starts up. For that to work, the regex needs to be in a format that can be saved to the release bundle, hence the need for /E.

You need the /E modifier when:

Using regexes in application configuration (like we did with CORS)
Storing regexes in struct defaults (though Elixir 1.19 now prevents this entirely - see the release notes)
Any place where the regex will be serialized into a release

You don't need /E when:

Using regexes in your runtime code (pattern matching, validations, etc.)
Using regexes in tests
Using regexes that are created dynamically at runtime

Elixir 1.19's stricter handling of regexes reflects Erlang/OTP 28's more sophisticated internal representation. While it requires updating configuration files, it's a small price to pay for the performance improvements.

Trade-offs in Aggregate Design when implementing CQRS in Elixir

2025-12-29T16:52:02.059205+00:00

Introduction

Event sourcing with CQRS is a powerful feature, but it presents difficult design decisions which can challenge dogmatic Domain Driven Design theory. Ultimately as with all software engineering trade-offs, the business need dictates whether the complexity is worth it.

It's not an easy decision to introduce the CQRS pattern when simpler ones appear to be adequate, at least on the surface. In this blog post we'll cover how we used it to solve the seemingly simple problem of waitlist notifications. We'll also cover how it addressed the need for efficient analytics and history tracking of sales.

This post walks through our implementation of CQRS (Command Query Responsibility Segregation) with the Commanded library to build a complete inventory audit trail. We'll cover:

The business problem: Why simple inventory tracking wasn't enough
Evaluating alternatives: From database triggers to CQRS, and why we chose what we did
Aggregate design: The critical decision of small vs large aggregates
Implementation: Commands, events, aggregates, projectors, handlers, and the service layer
Event handlers: How we used the event stream to solve notifying customers when sold-out items become available
Production challenges: EventStore on managed PostgreSQL
Test synchronization: Writing deterministic tests without brittle sleep calls
Key lessons: What we'd do the same, and what we learned

Let's dive in.

The Business Problem

Notifying Customers When Inventory Becomes Available

Inventory in the system can increase for an item for many reasons: refunds, swaps, or an administrator increasing capacity for an event. The question we are trying to answer is whether a sold out item just became available due to any of these reasons, and if so, can we notify people who signed up to get notified when it became available?

Popular events sell out fairly quickly, leaving many disappointed. Instead of having people call the organizer asking to allocate more tickets, or repeatedly refresh the page to see if something opened up, we decided to implement a "post sell out waitlist" where people can sign up to receive notifications if inventory becomes available.

Improving Auditability

We also wanted to improve the auditability and granularity of how we track inventory changes over time. The existing system tracked inventory_quantity on each item, but had no history of how it got there, at least not one that is easy and efficient to read. We could do several joins and some in-memory calculations to replay how sales went, but the user experience would be slow and the data wouldn't be conducive to analytics. We wanted efficient reads of sales history for both customer analytics and internal system auditability and debugging, so we know where things went wayward when problems inevitably happen. Customers also wanted to know:

How did this show sell out earlier than expected?
Who adjusted the capacity, and when?
What was the inventory level at any point in time and what happened to get it there?

We didn't just need to know what changed, but why it changed and who made it happen.

Evaluating the Options

Option 1: Database Triggers

PostgreSQL triggers could automatically log changes to the variants table:

CREATE TRIGGER log_inventory_change
AFTER UPDATE OF inventory_quantity ON product_variants
FOR EACH ROW EXECUTE FUNCTION log_variant_change();

This is easy to implement, transparent and requires no application code changes as the database does the work.

The problem with this approach is that business context is completely lost. The trigger sees "quantity changed from 100 to 98" but can't distinguish a sale from a return from an admin adjustment. Maintenance becomes a separate concern from application logic. The days of PL/SQL, where business logic sat inside the database, are long gone; today the application holds the business rules. A trigger doesn't have easy access to the larger context of why data is being manipulated.

Option 2: Ecto Callbacks

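We could use Ecto-style lifecycle callbacks to log changes (Ecto itself removed model callbacks in 2.0, so today this would need a library or hand-rolled hooks):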

defmodule ProductVariant do
  use Ecto.Schema

  after_update :log_inventory_change

  defp log_inventory_change(changeset) do
    # Log the change...
  end
end

This keeps the code in Elixir and there is some entity specific business context available via the changeset. The logic lives "near" the entity so it's hard to miss and the language seems logical. The issue with this pattern is that it's brittle as it still has to calculate in code why the change happened, which means the changeset will need to be bloated and carry more information than it actually is changing.

For example, just because we're updating the quantity from one value to another, the changeset would have to carry much more information than that to serve the auditability needs. It's also easy to bypass with direct Repo.update_all calls.

It also tightly couples the business transaction with logging needs.

Option 3: Manual Logging in Context Functions

We could add explicit log inserts alongside every inventory-changing operation:

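def process_refund(params) do
  variant = Repo.get!(ProductVariant, params.variant_id)

  # Update inventory: a refund puts the refunded quantity back on sale
  variant
  |> Ecto.Changeset.change(inventory_quantity: variant.inventory_quantity + params.quantity)
  |> Repo.update!()

  # Log the change (positive, because inventory went up)
  Repo.insert!(%InventoryLog{
    variant_id: params.variant_id,
    reason: "refund",
    quantity_change: params.quantity,
    order_id: params.order_id
  })
end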

We could do this for sales, swaps etc, and have the full business context available while having explicit code that writes to the log table.

This is easy to forget in some code paths, and it scatters logging code across the codebase. Consistency guarantees also get weaker: if the log insert fails, the whole transaction fails, which may not always be what we want. There is also a lot of code duplication, as the log entity needs to be constructed in multiple places.

Option 4: CQRS with Large Aggregates

CQRS with Commanded can work if we use aggregates at the Order level. An OrderInventory aggregate would track all inventory changes for an entire order. We get transactional consistency across all line items in an order.

However, aggregate boundary design is hard, and when multiple operations touch the same order we run into consistency challenges. The larger aggregate state needs to load/rebuild frequently and cross-order operations like admin adjustments don't fit the model well as it's not happening within the context of an order. We could design multiple aggregates like OrderInventory and AdminInventory but now there is overlap in concepts and language, which violates some core principles of Domain Driven Design.

Invariants are also hard to construct as the relationship between orders, item inventory and an admin's workflow spans many entities, making the invariant brittle.

Option 5: CQRS with Small Aggregates (Chosen One)

CQRS with Commanded but with smaller aggregates specific to an item/variant's inventory is what we landed on. Specifically, a VariantInventory aggregate per product variant, which tracks that item's inventory and doesn't explicitly tie the aggregate to larger entities like Order. A big reason we chose this was the guidance provided by Vaughn Vernon in his three-part series (1, 2, 3) discussing aggregate modelling.

There's also minimal contention as different variants get processed concurrently due to the simpler aggregate state. It's easy to reason about as each aggregate answers one question: "What happened to this variant's inventory?"

The audit requirements demanded explicit business intent capture. We needed "this inventory decreased because of a sale on order #123," not just "inventory_quantity changed from 100 to 98."

CQRS with Commanded gave us:

Explicit commands that capture intent (RecordSale, RecordReturn, RecordAdminAdjustment)
Immutable events stored in an append-only log (EventStore)
Separation of write model (aggregates) from read model (projections)
Inventory changes are naturally variant-scoped
High concurrency during ticket sales demands minimal contention
Each aggregate tracks one thing, making it easy to understand and debug

Eric Evans' DDD "Blue Book" often implies larger aggregates that enforce complex invariants. But when the domain naturally partitions (inventory per variant), smaller aggregates reduce complexity and improve performance.

The cons may be that cross-variant operations require multiple commands and we can't enforce cross-variant business rules in a single transaction. This is not currently a business requirement for us, so we went with the smaller, more purposeful aggregates rather than a more traditional one.

Architecture Overview

The CQRS Pattern in Our Context

Here's the flow from a sale to the audit log:

Service Layer (Inventory.record_order_sales)
    ↓
Command (RecordSale)
    ↓
Router (InventoryRouter)
    ↓
Aggregate (VariantInventory.execute)
    ↓
Event (InventoryChanged)
    ↓
Projector (InventoryProjector)
    ↓
Handler (InventoryHandler)
    ↓   
Read Model (inventory_events table)

The sequence diagram illustrates this further:

Each layer has a specific responsibility:

Commands represent intent ("record a sale of 2 tickets")
Aggregates enforce business rules and produce events
Events represent facts that happened ("inventory changed")
Projectors build read models optimized for queries
Handlers implement side effects (e.g., sending out emails)

Key Components

Component	Module	Purpose
Application	`Amplify.CommandedApplication`	Commanded application, supervises everything
Router	`Amplify.CQRS.Routers.InventoryRouter`	Routes commands to aggregates by variant_id
Aggregate	`Amplify.CQRS.Aggregates.VariantInventory`	Business logic, produces events
Event	`Amplify.CQRS.Events.InventoryChanged`	Immutable fact record
Projector	`Amplify.CQRS.Projectors.InventoryProjector`	Writes to inventory_events table
Handler	`Amplify.CQRS.Handlers.InventoryHandler`	Checks if any business actions with side effects need to be taken
Service	`Amplify.Services.Inventory`	Clean API for callers

Implementation Deep Dive

Command Design

We have six command types, each capturing specific business intent:

# Record a sale from an order
defmodule Amplify.CQRS.Commands.Inventory.RecordSale do
  defstruct [
    :variant_id,
    :order_id,
    :quantity_sold,
    :actor_id,
    :actor_type,
    metadata: %{}
  ]
end

# Record an admin capacity adjustment
defmodule Amplify.CQRS.Commands.Inventory.RecordAdminAdjustment do
  defstruct [
    :variant_id,
    :quantity_remaining,  # Absolute value, not delta
    :actor_id,
    :actor_type,
    metadata: %{}
  ]
end

# Other commands: RecordReturn, RecordSwapIn, RecordSwapOut, RecordVariantCreated

Notice the difference: RecordSale has quantity_sold (a delta), while RecordAdminAdjustment has quantity_remaining (an absolute value). This matches how humans think about these operations. A sale may reduce inventory by 2, but when an admin makes a change they change the overall capacity of an event from 50 to 60 and enter the number 60 into the UI instead of 10 (60-50). This is a tenet of Domain Driven Design where our language matches the business context of an operation.
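The arithmetic behind the two styles is small enough to sketch in isolation. The module and function names below are hypothetical; only the subtraction mirrors what the aggregate does with these commands.

```elixir
defmodule AdjustmentSketch do
  # A sale carries a delta: remaining stock shrinks by the quantity sold
  def remaining_after_sale(current_remaining, quantity_sold),
    do: current_remaining - quantity_sold

  # An admin adjustment carries the absolute target; the delta is derived from it
  def adjustment_for(current_remaining, new_remaining),
    do: new_remaining - current_remaining
end

# The admin types 60 while the current value is 50, so the derived delta is 10
IO.inspect(AdjustmentSketch.adjustment_for(50, 60))
# A sale of 2 against 100 remaining leaves 98
IO.inspect(AdjustmentSketch.remaining_after_sale(100, 2))
```

Either way the event can record both the resulting quantity and the delta, so readers of the audit log never have to redo the math.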

The Single Event Approach

We use one event type for all inventory changes:

defmodule Amplify.CQRS.Events.InventoryChanged do
  @derive Jason.Encoder
  defstruct [
    :variant_id,
    :order_id,
    :return_id,
    :reason,              # :sale, :return, :admin_adjustment, :swap_in, :swap_out
    :actor_id,
    :actor_type,
    quantity_remaining: 0,
    quantity_sold: 0,
    quantity_adjustment: 0,
    was_sold_out: false,
    is_sold_out: false,
    metadata: %{}
  ]
end

Why one event type instead of InventorySold, InventoryReturned, etc.? Simplicity. The reason field captures the business intent, and the projector handles all events uniformly. We can always split into multiple event types later if needed, but we opted to go for a simpler approach to start.
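Carrying both was_sold_out and is_sold_out on every event also makes the waitlist question cheap to answer from a single event. The following is a minimal sketch that uses plain maps in place of the real event struct; back_in_stock?/1 is a hypothetical name and not code from our handler.

```elixir
defmodule WaitlistSketch do
  # True only when an event moves a variant from sold out to available again
  def back_in_stock?(%{was_sold_out: true, is_sold_out: false}), do: true
  def back_in_stock?(_event), do: false
end

# A return on a sold-out variant frees up inventory: worth a notification
IO.inspect(WaitlistSketch.back_in_stock?(%{reason: :return, was_sold_out: true, is_sold_out: false}))
# A sale that sells the variant out is not
IO.inspect(WaitlistSketch.back_in_stock?(%{reason: :sale, was_sold_out: false, is_sold_out: true}))
```

Because the check only looks at the two flags, it behaves the same whether the reason was a return, a swap or an admin adjustment.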

The Aggregate

The aggregate is where business logic lives. It's identified by variant_id:

defmodule Amplify.CQRS.Routers.InventoryRouter do
  use Commanded.Commands.Router

  alias Amplify.CQRS.Aggregates.VariantInventory
  alias Amplify.CQRS.Commands.Inventory.{RecordSale, RecordAdminAdjustment, ...}

  # Each variant_id gets its own aggregate instance
  identify(VariantInventory, by: :variant_id, prefix: "variant-inventory-")

  dispatch([RecordSale, RecordAdminAdjustment, ...],
    to: VariantInventory,
    identity: :variant_id)
end

The aggregate's execute/2 function takes a command and returns an event:

defmodule Amplify.CQRS.Aggregates.VariantInventory do
  defstruct [
    :variant_id,
    quantity_remaining: 0,
    quantity_sold: 0,
    is_sold_out: false
  ]

  def execute(%__MODULE__{} = state, %RecordSale{} = cmd) do
    new_sold = state.quantity_sold + cmd.quantity_sold
    new_remaining = state.quantity_remaining - cmd.quantity_sold
    new_sold_out = new_remaining <= 0

    %InventoryChanged{
      variant_id: cmd.variant_id,
      order_id: cmd.order_id,
      reason: :sale,
      actor_id: cmd.actor_id,
      actor_type: cmd.actor_type,
      quantity_remaining: new_remaining,
      quantity_sold: new_sold,
      quantity_adjustment: -cmd.quantity_sold,
      was_sold_out: state.is_sold_out,
      is_sold_out: new_sold_out
    }
  end

  def execute(%__MODULE__{} = state, %RecordAdminAdjustment{} = cmd) do
    # Admin adjustments set absolute quantity, not delta
    adjustment = cmd.quantity_remaining - state.quantity_remaining
    %InventoryChanged{
      ...
    }
  end

  # apply/2 updates state from events (for rebuilding from event stream)
  def apply(%__MODULE__{} = state, %InventoryChanged{} = event) do
    %__MODULE__{state |
      variant_id: event.variant_id,
      quantity_remaining: event.quantity_remaining,
      quantity_sold: event.quantity_sold,
      is_sold_out: event.is_sold_out
    }
  end
end
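Rebuilding state from the event stream amounts to folding apply/2 over the events, oldest first. The sketch below shows that idea in plain Elixir with trimmed, hypothetical State and Changed structs; it illustrates the fold and is not Commanded's internal implementation.

```elixir
defmodule ReplaySketch do
  # Trimmed, hypothetical versions of the aggregate state and the event
  defmodule State do
    defstruct variant_id: nil, quantity_remaining: 0, quantity_sold: 0, is_sold_out: false
  end

  defmodule Changed do
    defstruct variant_id: nil, quantity_remaining: 0, quantity_sold: 0, is_sold_out: false
  end

  # Same shape as apply/2 above: copy the event's snapshot fields onto the state
  def apply_event(%State{} = state, %Changed{} = event) do
    %{state |
      variant_id: event.variant_id,
      quantity_remaining: event.quantity_remaining,
      quantity_sold: event.quantity_sold,
      is_sold_out: event.is_sold_out
    }
  end

  # Rebuilding is a left fold of apply_event/2 over the stream, oldest first
  def rebuild(events), do: Enum.reduce(events, %State{}, &apply_event(&2, &1))
end

events = [
  %ReplaySketch.Changed{variant_id: "v1", quantity_remaining: 2, quantity_sold: 0},
  %ReplaySketch.Changed{variant_id: "v1", quantity_remaining: 0, quantity_sold: 2, is_sold_out: true},
  %ReplaySketch.Changed{variant_id: "v1", quantity_remaining: 1, quantity_sold: 1}
]

IO.inspect(ReplaySketch.rebuild(events))
```

Since each event carries the resulting quantities rather than only a delta, the final state always equals the last event's snapshot. That keeps apply/2 trivial, at the cost of doing the arithmetic in execute/2.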

The Projector and Read Model

The projector subscribes to events and writes to the database. Importantly, it enriches the event with product_id and account_id that we derive from the variant:

defmodule Amplify.CQRS.Projectors.InventoryProjector do
  use Commanded.Projections.Ecto,
    application: Amplify.CommandedApplication,
    repo: Amplify.Repo,
    name: "InventoryProjector",
    consistency: :strong

  project(%InventoryChanged{} = event, _metadata, fn multi ->
    # Derive product_id and account_id from the variant
    {product_id, account_id} = get_product_and_account_ids(event.variant_id)

    changeset =
      %InventoryEvent{}
      |> Ecto.Changeset.change(%{
        variant_id: event.variant_id,
        product_id: product_id,
        account_id: account_id,
        order_id: event.order_id,
        return_id: event.return_id,
        reason: to_string(event.reason),
        actor_id: event.actor_id,
        actor_type: to_string(event.actor_type),
        quantity_remaining: event.quantity_remaining,
        quantity_adjustment: event.quantity_adjustment,
        was_sold_out: event.was_sold_out,
        is_sold_out: event.is_sold_out
      })

    Ecto.Multi.insert(multi, :inventory_event, changeset)
  end)

  defp get_product_and_account_ids(variant_id) do
    query = from v in ProductVariant,
      join: p in assoc(v, :product),
      where: v.id == ^variant_id,
      select: {p.id, p.account_id}

    Repo.one(query) || {nil, nil}
  end
end

This is a key design decision: commands only need variant_id, and the projector derives additional context. This keeps commands simple and decoupled. We could have passed in product_id and account_id as part of the command and event, but that seemed like unnecessary proliferation of data, especially when they can be easily and consistently derived.

The Service Layer: Encapsulating CQRS Complexity

Client code shouldn't need to know about commands, aggregates, or Commanded. The service layer provides a clean API:

defmodule Amplify.Services.Inventory do
  alias Amplify.CQRS.Commands.Inventory.{RecordSale, RecordReturn, ...}
  alias Amplify.Context.Orders

  def record_order_sales(order_id, opts \\ []) do
    order = Orders.get_order(order_id)
    dispatch_opts = if opts[:consistency] == :strong,
      do: [consistency: :strong],
      else: []

    Enum.each(order.line_items, fn line_item ->
      cmd = %RecordSale{
        variant_id: line_item.variant_id,
        order_id: order_id,
        quantity_sold: line_item.quantity,
        actor_id: nil,
        actor_type: :system
      }
      Amplify.CommandedApplication.dispatch(cmd, dispatch_opts)
    end)

    :ok
  end

  def record_admin_adjustment(variant_id, quantity_remaining, user_id) do
     ...
  end
end

Compare what client code looks like with and without the service layer:

Without service layer:

# In AMQP worker - messy, repeated, error-prone
order = Orders.get_order(order_id)
Enum.each(order.line_items, fn li ->
  cmd = %RecordSale{
    variant_id: li.variant_id,
    order_id: order_id,
    quantity_sold: li.quantity,
    actor_id: nil,
    actor_type: :system,
    metadata: %{}
  }
  Amplify.CommandedApplication.dispatch(cmd)
end)

With service layer:

# Clean, single line
Inventory.record_order_sales(order_id)

Design Decisions and Trade-offs

Small vs Large Aggregates: A Deep Dive

This was our most impactful architectural decision.

Eric Evans' "Blue Book" tends toward larger aggregates that enforce invariants across related entities. An Order aggregate containing LineItems ensures order totals stay consistent. This makes sense when you need transactional guarantees across the whole.

For inventory tracking, we chose one aggregate per variant rather than per-order or per-product:

Natural Domain Boundaries: When someone buys 2 GA and 1 VIP ticket, those are independent inventory operations. There's no invariant requiring atomic updates across variants.
Concurrency and Contention: During a hot ticket sale, hundreds of concurrent purchases hit the system. With per-product aggregates, every purchase would serialize. With per-variant, GA and VIP process in parallel.
Aggregate Loading Cost: Commanded rebuilds aggregate state by replaying events. Large aggregates accumulate more events, making each command slower.
Cognitive Simplicity: Each VariantInventory answers one question: "What happened to this variant's inventory?"

The trade-off here can be seen in ticket swaps, which affect two variants, while our design can only update one aggregate atomically. The solution is to dispatch two commands (RecordSwapOut, RecordSwapIn). We lose the atomic guarantee but can correlate the two via order_id. In this case, eventual consistency is more than acceptable. My personal view is that eventual consistency is often acceptable and developers tend to over-index on strong consistency models out of habit or an unfounded fear. We sometimes forget that not too long ago, almost everything was a batch job and never strongly consistent. I digress.
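As a rough sketch of the two-command approach: the post names RecordSwapOut and RecordSwapIn but does not show their fields, so the struct fields and the commands/4 helper below are assumptions made for illustration.

```elixir
defmodule SwapSketch do
  # Hypothetical command shapes; the real commands may carry more fields
  defmodule RecordSwapOut do
    defstruct [:variant_id, :order_id, :quantity]
  end

  defmodule RecordSwapIn do
    defstruct [:variant_id, :order_id, :quantity]
  end

  # One command per affected variant; order_id is the shared correlation key
  def commands(order_id, from_variant_id, to_variant_id, quantity) do
    [
      %RecordSwapOut{variant_id: from_variant_id, order_id: order_id, quantity: quantity},
      %RecordSwapIn{variant_id: to_variant_id, order_id: order_id, quantity: quantity}
    ]
  end
end

IO.inspect(SwapSketch.commands("order-1", "ga", "vip", 2))
```

Each command targets a different aggregate, so the two are dispatched and processed independently. If one fails, the shared order_id is what lets the two halves be matched up and reconciled later.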

Strong vs Eventual Consistency

CQRS often emphasizes eventual consistency, but we needed both:

For admin adjustments through the UI, we need strong consistency as users expect to see their change immediately as they're waiting on a screen.
Background order processing can be OK with eventual consistency as there is no user waiting, so we can focus on maximizing throughput.
For tests we need strong consistency so we can have deterministic assertions without sleep calls (waiting on background processes with sleeps makes for brittle tests).

# Background jobs: eventual (default)
Inventory.record_order_sales(order_id)

# Admin UI: strong
Inventory.record_admin_adjustment(variant_id, 100, user_id)  # Always strong

# Tests: strong for determinism
Inventory.record_order_sales(order_id, consistency: :strong)

Multiple Sources of Truth for Inventory

As it stands, we have two sources of truth for the inventory number. The first is the value in the inventory_quantity column in the table, and the second is the aggregate. This is an acceptable trade-off as we use the aggregate event sourcing to determine moment in time actions and auditability, while the inventory_quantity field can really be thought of as a read projection which will eventually go away.

Integration and Testing

AMQP Message Handlers

Sometimes new events can come in from other systems via a queue, and our inventory service integrates cleanly with AMQP workers listening for messages. In both examples below we use eventual consistency due to reasons stated earlier.

def handle_deliver(%{queue: "new_order_queue"}, message) do
  order_id = message.payload

  # ... other order processing ...

  # One line to record all inventory changes
  Inventory.record_order_sales(order_id)

  :ok
end

def handle_deliver(%{queue: "return_processed_queue"}, message) do
  return_id = message.payload

  # ... refund processing ...

  Inventory.record_return(return_id)

  :ok
end

End-to-End Testing Without Sleep Calls

One of the best aspects of this architecture is testability. Look at this test:

test "records sale for each line item in order" do
  # Setup: create test data
  account = insert(:account)
  product = insert(:event_product, account: account)
  variant1 = insert(:product_variant, product: product, inventory_quantity: 100)
  variant2 = insert(:product_variant, product: product, inventory_quantity: 50)
  customer = insert(:customer)

  order = create_order(customer: customer)
  insert(:line_item, order: order, variant: variant1, quantity: 2)
  insert(:line_item, order: order, variant: variant2, quantity: 3)

  # Execute: call the service layer with strong consistency
  :ok = Inventory.record_order_sales(order.id, consistency: :strong)

  # Assert: query the read model immediately - no sleep needed!
  events = Repo.all(from e in InventoryEvent, order_by: e.inserted_at)
  assert length(events) == 2

  [event1, event2] = events
  assert event1.variant_id == variant1.id
  assert event1.quantity_adjustment == -2
  assert event1.reason == "sale"
  assert event1.order_id == order.id

  assert event2.variant_id == variant2.id
  assert event2.quantity_adjustment == -3
end

This test exercises the entire CQRS stack:

Service Layer (Inventory.record_order_sales) loads the order and constructs commands
Router routes commands to the correct aggregate instances
Aggregate (VariantInventory.execute) produces events
EventStore persists the events
Projector writes to the inventory_events table
Database stores the read model

And we can assert immediately after the call because consistency: :strong blocks until the projection completes. No Process.sleep(100) hoping the async work finished.

The naive approach with eventual consistency:

# Bad: flaky, slow, non-deterministic
Inventory.record_order_sales(order_id)
Process.sleep(100)  # Hope 100ms is enough... it often isn't
event = Repo.one(InventoryEvent)  # Might still be nil!

Problems with sleep:

Unreliable: 100ms might not be enough under load, so there's no guarantee the work has finished. And when the work finishes sooner, you still pay the full 100ms "floor" even though you didn't need it.
Slow: 100ms per test × hundreds of tests = minutes wasted during CI and in local development.
Non-deterministic: results depend on timing, which makes continuous integration flaky and unreliable.

Strong consistency in tests solves all of this.

Closing the Loop: Waitlist Notifications

Remember the business problem from the introduction? We needed to notify customers when sold-out tickets become available. With our CQRS architecture in place, we now have all the pieces to solve this.

The Missing Piece: Event Handlers

Commanded provides Event Handlers that subscribe to the event stream and react to events. Unlike projectors (which build read models), handlers execute side effects, which makes them a good fit for triggering notifications.

Here's our waitlist notification handler:

defmodule Amplify.CQRS.Handlers.InventoryHandler do
  @moduledoc """
  Event handler that monitors inventory changes and triggers waitlist notifications
  when inventory becomes available (transitions from sold out to available).
  """

  use Commanded.Event.Handler,
    application: Amplify.CommandedApplication,
    name: "InventoryHandler",
    consistency: :strong

  alias Amplify.CQRS.Events.InventoryChanged

  require Logger

  @impl Commanded.Event.Handler
  def handle(%InventoryChanged{} = event, _metadata) do
    if inventory_became_available?(event) do
      # write code to handle waitlist notifications
    end

    :ok
  end

  # Check if inventory transitioned from sold out to available
  defp inventory_became_available?(%InventoryChanged{was_sold_out: true, is_sold_out: false}) do
    true
  end

  defp inventory_became_available?(_event), do: false
end

The Power of Event-Driven Design

Notice what's happening here. We didn't have to:

Modify any existing code, as the handler subscribes to the same events the projector already receives
Add notification logic to business operations, as the service layer doesn't know or care about waitlists
Track "previous state" manually, as the aggregate already computed was_sold_out and is_sold_out

The pattern match is elegant: %InventoryChanged{was_sold_out: true, is_sold_out: false} captures exactly the transition we care about: inventory that was sold out but isn't anymore.
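
To make the transition check concrete, here is a small self-contained sketch. The struct is a trimmed-down, hypothetical stand-in for the real event, build_event/3 is an illustrative helper, and treating "sold out" as a quantity of zero or less is an assumption.

```elixir
defmodule TransitionSketch do
  # Hypothetical, trimmed-down version of the event; the real one has more fields.
  defmodule InventoryChanged do
    defstruct [:variant_id, :quantity_remaining, :was_sold_out, :is_sold_out]
  end

  # Build the event from the quantity before and after a change,
  # so the before/after flags travel with the event itself.
  # Assumption: "sold out" means a quantity of zero or less.
  def build_event(variant_id, previous_quantity, new_quantity) do
    %InventoryChanged{
      variant_id: variant_id,
      quantity_remaining: new_quantity,
      was_sold_out: previous_quantity <= 0,
      is_sold_out: new_quantity <= 0
    }
  end

  # Only the sold out -> available transition matches the first clause.
  def became_available?(%InventoryChanged{was_sold_out: true, is_sold_out: false}), do: true
  def became_available?(%InventoryChanged{}), do: false
end

# An admin restock from 0 to 50 is the transition the handler reacts to.
IO.inspect(TransitionSketch.became_available?(TransitionSketch.build_event(1, 0, 50)))
# A normal sale from 10 to 8 is ignored.
IO.inspect(TransitionSketch.became_available?(TransitionSketch.build_event(1, 10, 8)))
```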

Testing Event Transitions

We verify the handler's detection logic by testing the events it would receive. The beauty of ExUnit and how easily it integrates with databases gives us confidence in our tests. We do almost zero manual testing of even the most complex use cases due to a strong integration test suite.

test "event correctly tracks sold out to available transition" do
  account = insert(:account)
  product = insert(:event_product, account: account)
  variant = insert(:product_variant, product: product, inventory_quantity: 0)
  user = insert(:user)

  # First, record sold out state
  :ok = Inventory.record_admin_adjustment(variant.id, 0, user.id)

  # Now increase inventory - this creates the waitlist trigger event
  :ok = Inventory.record_admin_adjustment(variant.id, 50, user.id)

  events = Repo.all(from e in InventoryEvent, order_by: [desc: e.inserted_at], limit: 1)
  [event] = events

  # This event represents the exact transition the handler looks for
  assert event.was_sold_out == true
  assert event.is_sold_out == false
  assert event.quantity_remaining == 50
end

Why This Architecture Shines

This is where CQRS pays off. The business asked: "Can we notify people when tickets become available?" With traditional CRUD, we'd need to:

Find every place inventory gets updated
Add "was it sold out before?" checks to each location
Hope we didn't miss any code paths
Couple notification logic to inventory operations

With event sourcing, we added one handler that subscribes to the event stream. Every inventory change - sales, returns, swaps, admin adjustments - flows through the same pipeline. The handler sees them all, filters for the transition it cares about, and triggers notifications.

The aggregate already tracked the state transition (was_sold_out → is_sold_out) because we designed events to capture complete before/after context. We can't anticipate what features are needed next, but this design gives us extensibility as new features become subscribers to existing events, not modifications to existing code. This is fundamentally why we decided the complexity was worth it.

Challenges and Resolutions

EventStore Setup on Managed PostgreSQL

When deploying to production on DigitalOcean's managed PostgreSQL, we hit an issue:

** (Postgrex.Error) ERROR 3D000 (invalid_catalog_name): database "postgres" does not exist

The problem was that EventStore.Tasks.Create.exec connects to the postgres maintenance database to create the EventStore database. Managed PostgreSQL often doesn't have this default database, so we needed to specify a default_database in our event store configuration, which wasn't needed locally.
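
A configuration along these lines is what we mean. Treat it as a sketch: the app name, the Amplify.EventStore module name, the environment variable, and the "defaultdb" value are all assumptions for illustration, so check what maintenance database your provider actually exposes.

```elixir
# config/runtime.exs (sketch; names and values below are assumptions)
import Config

config :amplify, Amplify.EventStore,
  url: System.fetch_env!("EVENTSTORE_DATABASE_URL"),
  # Database to connect to when creating the event store database,
  # used in place of the "postgres" maintenance database.
  default_database: "defaultdb"
```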

Swap Operations Spanning Two Variants

Ticket swaps move inventory from one variant to another. But with per-variant aggregates, we can't atomically update both.

The solution, as touched on earlier, was to dispatch two commands and correlate via order_id:

def record_swap(order_id, swapped_out_variant_id, swapped_in_variant_id, opts \\ []) do
  order = Orders.get_order(order_id)
  quantity = get_swap_quantity(order, swapped_out_variant_id)

  # Two separate commands, same order_id for correlation
  dispatch(%RecordSwapOut{
    variant_id: swapped_out_variant_id,
    order_id: order_id,
    quantity_returned: quantity
  }, opts)

  dispatch(%RecordSwapIn{
    variant_id: swapped_in_variant_id,
    order_id: order_id,
    quantity_sold: quantity
  }, opts)

  :ok
end

For audit purposes, this is fine. We can query both events by order_id to see the complete swap.
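
As a rough illustration of that correlation, the sketch below pairs up the two halves of a swap from a list of events. Events are plain maps here, and the reason values "swap_out" and "swap_in" are assumptions made for the example, not necessarily what the projection stores.

```elixir
defmodule SwapAuditSketch do
  # Collect the swap events for one order, keyed by reason.
  # The reason values "swap_out" and "swap_in" are illustrative assumptions.
  def swap_for_order(events, order_id) do
    events
    |> Enum.filter(&(&1.order_id == order_id and &1.reason in ["swap_out", "swap_in"]))
    |> Map.new(&{&1.reason, &1})
  end

  # A swap is complete once both halves have been recorded.
  def complete?(swap), do: Map.has_key?(swap, "swap_out") and Map.has_key?(swap, "swap_in")
end

events = [
  %{order_id: 7, variant_id: 1, reason: "swap_out", quantity_adjustment: 2},
  %{order_id: 7, variant_id: 2, reason: "swap_in", quantity_adjustment: -2},
  %{order_id: 8, variant_id: 1, reason: "sale", quantity_adjustment: -1}
]

IO.inspect(SwapAuditSketch.complete?(SwapAuditSketch.swap_for_order(events, 7)))
```

Because the two commands are not atomic, a check like complete?/1 is also a cheap way to spot a swap where only one half has landed so far.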

Lessons Learned

Start Simple

A single event type with a reason field was the right starting point. We can always split into InventorySold, InventoryReturned, etc. later if we need stronger typing. Starting with many event types adds complexity before you understand the domain.

Commands Capture Intent, Events Capture Facts

Commands describe what you want to do: "Record a sale of 2 tickets." Events describe what happened: "Inventory changed, reason: sale, adjustment: -2." This separation is where the audit value comes from.

Projector Enrichment is Powerful

Keeping commands minimal (variant_id only) and letting the projector derive product_id and account_id kept the command interface clean. The projector can afford the extra query; commands should be lightweight.

Make Room for Side Effects

The event handler feature in Commanded is critical for implementing side effects. It is an extensible escape hatch where you can do whatever you like (within reason) without being tied to CQRS rules such as avoiding side effects in aggregate state mutations or ensuring that the functions which transform commands into events never fail.

Strong Consistency Has Its Place

Despite CQRS literature emphasizing eventual consistency, having the option for strong consistency was essential for:

Admin UI responsiveness
Deterministic tests
Critical operations where "fire and forget" isn't acceptable

Conclusion

The Aggregate Design Decision

The most impactful choice wasn't whether to use CQRS - it was aggregate sizing.

Consideration	Large Aggregates	Small Aggregates
Invariant enforcement	Strong (atomic)	Weak (eventual)
Contention under load	High	Low
Event stream size	Large, slow rebuild	Small, fast rebuild
Cognitive load	Higher	Lower
Cross-entity operations	Single command	Multiple commands

For inventory tracking, small aggregates (per-variant) won because:

No cross-variant invariants require atomic enforcement
High concurrency demands low contention
Simple aggregates are easier to debug and evolve

Key Takeaways

Start with the smallest aggregate boundary that makes sense for your domain
Use strong consistency selectively: admin UIs and tests, not background processing
Encapsulate CQRS behind a service layer: callers shouldn't know about commands
Design events with before/after context - you'll thank yourself when new features need state transitions
The "right" aggregate size depends on your invariants, not DDD orthodoxy
Decide what is a "running total" versus what is a snapshot in your read-only projections; this will depend on the business problem you're trying to solve

Should You Use CQRS?

Before reaching for CQRS, evaluate whether your audit needs justify the complexity.

If you need "who changed what when", simple logging might suffice. If you need "why did this change and what was the business intent", CQRS shines. If you need "react to state transitions across the system", CQRS with event handlers is ideal.

For our inventory tracking, the explicit command-driven approach forces developers to think about why inventory changes. That's where the audit value comes from. And when the business asked "can you also notify waitlisted customers when sold-out tickets become available again?" we added a single event handler with no modifications to existing code. That's the real payoff of event-driven architecture, and that's why the added complexity was worth it.

Building a Double-Entry Payment System in Elixir

2025-12-31T03:01:16.616379+00:00

Multi-tenant platforms which facilitate e-commerce transactions hold people's money. Getting that money to the right people, in the right amounts, at the right time, is the core trust contract with your partners. This is the most critical and sensitive component in our system, because if it fails, trust erodes quickly.

We needed a payment engine that could collect funds from multiple sources, track what's owed across different transaction types, batch payments to minimize fees, and handle the messy reality of refunds, multi-currency, and timing mismatches.

This post covers:

Implementing double-entry bookkeeping in software: advantages, challenges, and key design decisions
Database schema design for ledgers, transactions, and journal entries
Transaction states, batching strategy, and idempotency
The complete payout pipeline from event completion to funds disbursement
Extensibility: how the system accommodates new transaction types cleanly

The Business Problem

Paying People Accurately and On Time

Event organizers, comedians, and venue partners rely on receiving exactly what they're owed, when they expect it. Someone who performed on Saturday expects their cut by Tuesday at the latest, as cash flow in the industry is generally tight. We are dealing with people's livelihoods and they have trusted us with one of the most important parts of their business: selling shows and collecting funds.

Money Flows Both Directions

Customers buy tickets (revenue flows in), but accounts also purchase card readers, run ads, or get charged service fees (charges flow out). Meanwhile, they earn referral bonuses and tips. The net result: at any given moment, someone either owes us money or we owe them, and this account needs to be settled cleanly. Oftentimes clubs have multiple shows where different accounts need to be paid even though all the performances were at the same club.

Minimizing Transaction Costs

Tipalti, our disbursement provider, charges $2.10 per outgoing payment. If we paid each transaction individually, we would bleed money as 50 payouts would cost us $105 in fees, making the business model untenable. Batching them into one payment at the right time and covering a reasonable span of transactions is key to minimizing transaction costs. Across hundreds of accounts paying out weekly, this compounds into significant savings, but this has to be balanced with timing.
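
The arithmetic is simple enough to show directly. This is only a worked example of the figures above, with amounts held as integer cents; FeeSketch is a hypothetical name.

```elixir
defmodule FeeSketch do
  # $2.10 per outgoing payment, expressed in cents.
  @fee_cents 210

  # Total fee cost in cents for a given number of outgoing payments.
  def fees(payment_count), do: payment_count * @fee_cents
end

unbatched = FeeSketch.fees(50)   # 10_500 cents = $105.00
batched = FeeSketch.fees(1)      # 210 cents = $2.10
IO.inspect(unbatched - batched)  # 10_290 cents = $102.90 saved for this one account
```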

Multi-Currency, Multi-Region Support

Events happen in USD, CAD, GBP, EUR. An organizer in London running shows might have ticket sales in pounds and euros, or if they're Canadian, in CAD. We must track revenue per currency, convert correctly, and pay in their preferred currency. Tipalti handles the actual disbursement mechanics, but we must get the numbers right.

One Invoice, Many Line Items

An organizer's weekly payout isn't just "event revenue." It might include:

Event ticket sales: +$500
Tips earned: +$45
Service fee split: +$30
Card reader purchase: -$75
Meta ad spend: -$120
Net payout: $380

All of these must appear on the same statement, clearly itemized.
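
In code terms, the statement is just a list of signed line items whose sum is the net payout. The sketch below uses the numbers from the example above as integer cents; StatementSketch and the tuple shape are hypothetical.

```elixir
defmodule StatementSketch do
  # Each line item is {label, amount_in_cents}; charges are negative.
  def net_payout(line_items) do
    line_items |> Enum.map(fn {_label, cents} -> cents end) |> Enum.sum()
  end
end

line_items = [
  {"Event ticket sales", 50_000},
  {"Tips earned", 4_500},
  {"Service fee split", 3_000},
  {"Card reader purchase", -7_500},
  {"Meta ad spend", -12_000}
]

IO.inspect(StatementSketch.net_payout(line_items))  # 38_000 cents = $380.00
```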

The Refund Timing Problem

A customer requests a refund on Monday. The event happened Saturday. We already paid the organizer on Sunday. That refund must be accounted for retroactively—deducted from their next payout—and clearly tracked so they understand why this week's payment is lower.

Why Not Off-the-Shelf?

Payment platforms like Stripe Connect handle simpler splits, but our requirements (multiple transaction types flowing in both directions, batched payouts to external payment processors, multi-currency with conversion, and retroactive adjustments) call for a purpose-built accounting layer.

We explored Stripe's Connected Accounts feature, but payment processing delays and Stripe's unintuitive Connected Account interface were causing a lot of confusion for customers. Their onboarding process was also problematic and often required manual intervention, which our customers found annoying. We also explored paying people using services like Interac E-Transfer but hit usage limits very quickly.

Double-Entry Bookkeeping & Key Design Decisions

Why Double-Entry?

We took our inspiration from Square and the Payments Engineer blog which have some guidance on how to think about the complexity tradeoffs that double-entry bookkeeping presents. In short, it's worth it if you can reason about it.

Every money movement creates balanced journal entries. If cash increases (debit), accounts_payable increases (credit). The books always balance, providing an immutable audit trail and catching errors immediately. This is an unintuitive approach unless you're an accountant or in finance, but it creates immutability and an audit trail of what happened and why.

By thinking about money movements as inflows and outflows in certain ledgers, rather than the maintenance of a specific balance, we're able to "replay" history more easily and derive rather than store balances.
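
To illustrate deriving a balance instead of storing it, here is a minimal sketch over an in-memory list of entries for a single accounts_payable ledger. The field names and the dates are illustrative, not the real schema; credits increase what is owed and debits decrease it.

```elixir
defmodule BalanceSketch do
  # Replays entries up to and including `date` and returns the balance in cents.
  # :cr increases what we owe the account, :dr decreases it.
  def balance_as_of(entries, %Date{} = date) do
    entries
    |> Enum.filter(&(Date.compare(&1.date, date) != :gt))
    |> Enum.reduce(0, fn
      %{type: :cr, amount: amount}, acc -> acc + amount
      %{type: :dr, amount: amount}, acc -> acc - amount
    end)
  end
end

entries = [
  %{date: ~D[2025-12-06], type: :cr, amount: 50_000},  # event revenue
  %{date: ~D[2025-12-08], type: :dr, amount: 2_500},   # refund
  %{date: ~D[2025-12-10], type: :cr, amount: 4_500}    # tips
]

IO.inspect(BalanceSketch.balance_as_of(entries, ~D[2025-12-07]))  # 50_000
IO.inspect(BalanceSketch.balance_as_of(entries, ~D[2025-12-31]))  # 52_000
```

Asking for the balance on an earlier date is the same query with a different cutoff, which is what makes the history replayable.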

Advantages and Challenges in an IT System

Though double-entry bookkeeping is much more complex to implement than maintaining a simple balance amount, it has its clear advantages:

Self-validating — If debits don't equal credits, something is wrong. No silent data corruption. Your accounting becomes intrinsically verifiable by examining individual transactions and the ledgers they impacted, rather than reverse-engineering why a balance is what it is.
Complete temporal audit trail — Every cent is traceable to its source transaction. Every money movement is the result of something changing, and that something is reflected as a transaction. The transaction stores what's changed, not the current state, so we can always ask the question "what was the balance three days ago" because we'd examine the "projections" (CQRS anyone?) up until three days ago.
Flexible reporting — Generating balance sheets, cash flow statements, and account summaries from the same data becomes easy due to the comprehensive audit trail.
Refunds are natural — A refund is simply the reverse entry pattern. No special logic needed. We just increment/decrement (or debit/credit) in the opposite manner.

And challenges:

Cognitive overhead — Developers must think in debits/credits, not just "add money" or "subtract money." This is not an easy hurdle to get past, and having a good understanding of core accounting principles helps (I did an MBA a while back and that came in handy here).
Schema complexity — More tables, more joins, more migrations than a simple balance column.
Query complexity — Calculating a balance requires summing journal entries, not reading a single field. It is easy to get the queries wrong if you don't understand how a "purchase" vs a "charge" is recorded, i.e., what ledgers are modified and how.
Eventual consistency concerns — Both sides of an entry must be written atomically (solved with database transactions). In cases where transactions span network boundaries, compensation is required; in our case, we use the Reactor library for that.

Database Design & Data Model

Here are the key elements of the data model with a brief description of each; more on this in later sections:

Account - a user's account in the system
Payment Destination - something like PayPal, a bank account
Tip, MetaAd - business objects which demand a need for payments
Transaction - models a singular business money movement
LedgerType - an accounting book, e.g., accounts payable
Ledger - an accounting book for a particular account
PayoutTransaction - shares the same physical table as Transaction (with a different type) but models outgoing money movements and is composed of other transaction types (e.g., event revenue). This is a powerful concept in the system.

Entity Relationship Diagram

Transaction Types and Journal Entry Patterns

Transaction Type	Debit	Credit	Business Meaning
event_revenue	cash	accounts_payable	We collected ticket sales, owe organizer
tips_earned	cash	accounts_payable	We collected tips, owe organizer
gift_card_revenue	cash	accounts_payable	Gift card redeemed for event
service_fee_split	cash	accounts_payable	Organizer's share of service fees
customer_cashback	accounts_payable	cash	Referral reward owed to customer
refund	accounts_payable	cash	We refunded customer, owe organizer less
purchase	accounts_payable	cash	Organizer bought something (card reader)
ads	accounts_payable	cash	Ad spend charged to organizer
payout	payouts	cash	We disbursed funds to organizer

Core Bookkeeping Function

defp create_journal_entry(ledger, transaction, entry_type, %{
       "amount" => amount,
       "currency" => currency
     }) do
  %JournalEntry{
    ledger: ledger,
    transaction: transaction,
    type: entry_type,
    amount: amount,
    currency: currency
  }
  |> Repo.insert!()
end

This simple function is the foundation of the entire system. Every money movement flows through it. Two journal entries are created in the appropriate ledgers for every transaction, giving us complete traceability on which direction money moved, what account was impacted, and for what reason. The JournalEntry -> Transaction -> Business Table relationship is at the heart of the system and provides verifiability and reversibility at any point.
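
A sketch of how a transaction type could be turned into its pair of entries is shown below. It uses plain maps instead of Ecto structs, includes only three of the types from the table, and the module and function names are hypothetical.

```elixir
defmodule JournalPatternSketch do
  # {debit_ledger, credit_ledger} per transaction type, copied from the table above.
  @patterns %{
    "event_revenue" => {"cash", "accounts_payable"},
    "refund" => {"accounts_payable", "cash"},
    "payout" => {"payouts", "cash"}
  }

  # Returns the two balanced entries for a transaction as plain maps.
  # Amounts are positive integer cents; direction is carried by :dr / :cr.
  def entries_for(type, amount, currency) when is_integer(amount) and amount > 0 do
    {debit_ledger, credit_ledger} = Map.fetch!(@patterns, type)

    [
      %{ledger: debit_ledger, type: :dr, amount: amount, currency: currency},
      %{ledger: credit_ledger, type: :cr, amount: amount, currency: currency}
    ]
  end
end

IO.inspect(JournalPatternSketch.entries_for("refund", 2_500, "usd"))
```

Note how the refund is simply the event_revenue pattern with the ledgers swapped.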

Transaction States, Batching & Idempotency

Finite State Machine for Transactions

Most transaction types (for example event_revenue, which represents money earned from an event) follow this (simplified) pattern:

pending → waiting_for_payout → settled

pending — Transaction recorded, awaiting payout batch
waiting_for_payout — Included in a payout batch, not yet sent
settled — Payment executed successfully

States prevent double-payments, enable clear reporting ("what's been paid vs pending"), and allow recovery if a process fails mid-batch. Being in a particular state implies that all previous rules concerning that state have been satisfied. For example, if a transaction of type event_revenue is in waiting_for_payout, we can assume that the total payment amount has been calculated, the account has not been paid, and it will be paid on the next scheduled run. These are corollaries of a transaction of a particular type being in a particular state.

The combination of transaction type plus transaction state gives us a very clear view on how the money is moving.
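
The simplified state machine can be expressed as a small transition table. This is a sketch of the idea rather than our actual code; the module name and the error tuple shape are made up for the example.

```elixir
defmodule TransactionStateSketch do
  # Allowed next states for each state in the simplified flow.
  @transitions %{
    "pending" => ["waiting_for_payout"],
    "waiting_for_payout" => ["settled"],
    "settled" => []
  }

  # Returns {:ok, to} for a legal move, otherwise an error tuple.
  def transition(from, to) do
    if to in Map.get(@transitions, from, []) do
      {:ok, to}
    else
      {:error, {:invalid_transition, from, to}}
    end
  end
end

IO.inspect(TransactionStateSketch.transition("pending", "waiting_for_payout"))
IO.inspect(TransactionStateSketch.transition("settled", "pending"))
```

Rejecting anything that isn't in the table is what prevents a settled transaction from being picked up and paid a second time.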

The Batching Algorithm

To minimize transaction costs, handle multiple currencies, and allow the same account to have funds disbursed to many different bank accounts, we batch the transactions and introduce the concept of a payment_destination, which has a many-to-one relation with an account.

The create_pending_payout_transactions/0 function groups all pending transactions by (currency, payment_destination_id, account_id). Each group becomes one payout transaction, regardless of how many underlying transactions exist.

def create_pending_payout_transactions do
  Repo.transaction(fn ->
    # Fetch all pending transactions grouped by currency, payment_destination_id, and account_id
    pending_transactions =
      Repo.all(
        from je in JournalEntry,
          join: t in assoc(je, :transaction),
          join: l in assoc(je, :ledger),
          where:
            t.status == "pending" and
              t.type in [
                "event_revenue",
                "refund",
                ...
              ],
          group_by: [fragment("lower(?)", je.currency), t.payment_destination_id, l.account_id],
          select: {fragment("lower(?)", je.currency), t.payment_destination_id, l.account_id}
      )

    # Process each group to create a payout transaction
    Enum.map(pending_transactions, fn {currency, payment_destination_id, account_id} ->
      payout_amount = calculate_payout_amount(account_id, currency, payment_destination_id)
      # ... create or update payout transaction
    end)
  end)
end

This is the central algorithm which looks at transactions and their journal entries, and calculates how much is owed to each account, and within each account, how funds must be disbursed to different payment destinations (e.g., PayPal, Bank).
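
Stripped of Ecto, the grouping step looks roughly like the sketch below. Transactions are plain maps and the field names mirror the query above, but the module itself is hypothetical and works on an in-memory list.

```elixir
defmodule BatchingSketch do
  # Groups pending transactions by {lowercased currency, destination, account}.
  # Each resulting key corresponds to one payout transaction.
  def payout_groups(transactions) do
    transactions
    |> Enum.filter(&(&1.status == "pending"))
    |> Enum.group_by(&{String.downcase(&1.currency), &1.payment_destination_id, &1.account_id})
  end
end

transactions = [
  %{id: 1, status: "pending", currency: "USD", payment_destination_id: 10, account_id: 1},
  %{id: 2, status: "pending", currency: "usd", payment_destination_id: 10, account_id: 1},
  %{id: 3, status: "pending", currency: "GBP", payment_destination_id: 11, account_id: 1},
  %{id: 4, status: "settled", currency: "USD", payment_destination_id: 10, account_id: 1}
]

# Two groups: one USD payout and one GBP payout; the settled row is ignored.
IO.inspect(Map.keys(BatchingSketch.payout_groups(transactions)))
```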

Idempotency by Design

It is important that these processes are not time-dependent and can be run at any time without affecting overall amounts. Processes/jobs should be able to be run multiple times without negative consequences, i.e., they need to be idempotent.

create_pending_payout_transactions/0 is idempotent—if a pending payout already exists for a destination, it updates the amount rather than creating a duplicate.
If a job fails and retries, or runs twice, the result is the same.
No orchestration dependencies—event revenue, tips, refunds, and ads can be processed in any sequence. The batching step always queries current state and produces correct results.

transaction =
  case Repo.get_by(Transaction,
         type: "payout",
         status: "pending",
         payment_destination_id: payment_destination_id
       )
       |> Repo.preload(:journal_entries) do
    nil ->
      # Create new payout transaction
      %Transaction{
        type: "payout",
        status: "pending",
        payment_destination_id: payment_destination_id,
        transaction_date: DateTime.utc_now()
      }
      |> Repo.insert!()

    existing_pending_payout_transaction ->
      # Update existing - idempotent!
      Enum.each(existing_pending_payout_transaction.journal_entries, fn je ->
        je |> Ecto.Changeset.change(%{amount: payout_amount}) |> Repo.update!()
      end)
      existing_pending_payout_transaction
  end
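
The same create-or-update idea can be shown without a database. In this sketch a map keyed by payment destination stands in for the table of pending payouts; the names are hypothetical, and the point is only that running the step twice leaves the same state.

```elixir
defmodule IdempotentPayoutSketch do
  # `payouts` maps destination_id => pending payout and stands in for the table.
  # A missing payout is created; an existing one has its amount overwritten.
  def upsert_pending_payout(payouts, destination_id, amount) do
    Map.update(
      payouts,
      destination_id,
      %{type: "payout", status: "pending", amount: amount},
      fn existing -> %{existing | amount: amount} end
    )
  end
end

once = IdempotentPayoutSketch.upsert_pending_payout(%{}, 10, 38_000)
twice = IdempotentPayoutSketch.upsert_pending_payout(once, 10, 38_000)
IO.inspect(once == twice)
```

Because the amount is overwritten with a freshly calculated value rather than incremented, a retry cannot inflate the payout.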

Calculating Net Payout

Calculating payouts is all about consulting the accounts_payable ledger as that keeps track of how much money is owed to a customer:

defp calculate_payout_amount(account_id, currency, _payment_destination_id) do
  from(je in JournalEntry,
    join: t in assoc(je, :transaction),
    join: l in assoc(je, :ledger),
    join: lt in assoc(l, :ledger_type),
    where:
      lt.type == "accounts_payable" and
        l.account_id == ^account_id and
        fragment("lower(?)", je.currency) == fragment("lower(?)", ^currency) and
        t.status in ["pending", "waiting_for_payout"],
    group_by: je.type,
    select: {je.type, sum(je.amount)}
  )
  |> Repo.all()
  |> Enum.into(%{})
  |> then(fn totals ->
    (totals[:cr] || 0) - (totals[:dr] || 0)
  end)
end

The result can be positive (we owe them), zero (nothing to pay), or negative (they owe us, carried forward to the next payout).

Key Design Decisions

1. Transaction Type Drives Journal Entries

We faced a choice: should the system track events by creating specific ledgers for different transactions (e.g., tips, hardware purchases, etc.) and then summing debits and credits, or by looking at transaction type and status? We chose the latter for its simplicity.

Transaction type and journal entries are related but orthogonal concepts:

Transaction type answers: "What happened?" (event_revenue, refund, purchase, payout)
Journal entries answer: "How did it affect money movement?"

When creating a transaction, the type determines which journal entry pattern to apply:

event_revenue → DR cash, CR accounts_payable
refund        → DR accounts_payable, CR cash
payout        → DR payouts, CR cash

Alternatively, we could have created event_cash and event_payable books and examined those, but we found it overkill to reason about so many ledgers. Besides, we can always add them later at no cost. Querying WHERE type = 'event_revenue' AND status = 'pending' is far more intuitive than traversing specific journals based on type to reconstruct intent. Type alone is sufficient.

2. Bridging Accounting and Business Domains via `transaction_id`

Transactions are generic accounting records as they know about amounts, currencies, and ledger entries. But the business needs to answer questions like "which tips are included in this payout?" or "what ad campaigns does this charge cover?"

We solve this by adding transaction_id as a foreign key on business domain tables:

Business Table	Links To	Purpose
`Tip`	Transaction	Which tips were paid in this transaction
`ProfitSharingFee`	Transaction	Which service fee splits are included
`ReferralReward`	Transaction	Which cashback rewards were processed
`MetaAd`	Transaction	Which ad charges are covered
`VariantTransaction`	Transaction + Variant	Which event variants generated this revenue

This creates bidirectional traceability:

Accounting → Business: Given a transaction, find all the tips/ads/rewards it covers
Business → Accounting: Given a tip, find which transaction (and ultimately which payout) included it

When creating a transaction, we update the business records:

# In create_tips_earned_transaction
Repo.update_all(
  from(t in Tip, where: t.id in ^tip_ids),
  set: [transaction_id: transaction.id]
)

This means payment summaries can show "Tips from 12 orders: $45" with full drill-down capability.

3. Per-Account Ledgers, Not Global Ledgers

Each account gets its own set of ledgers (cash, accounts_payable, payouts). We chose per-account because:

Queries for account balances are simpler
Natural isolation between accounts
Ledger-level reporting is account-scoped by default
setup_ledgers(account_id) lazily creates ledgers on first transaction—no upfront provisioning needed

4. Payment Destination as the Batching Key

Transactions are grouped by (currency, payment_destination_id, account_id), not just account. This allows:

One account to have multiple payment destinations (UK bank for GBP, US bank for USD)
Different payout schedules per destination if needed
Clean separation when an account changes bank details mid-cycle
Different events to be paid to different payment destinations within the same account

5. Amounts in Cents as Integers

All amounts are stored as integers representing the smallest currency unit (cents, pence). No floating point arithmetic means no rounding errors accumulating across thousands of transactions.
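
A quick illustration of why we avoid floats, plus a hypothetical formatting helper that turns cents into a display string without converting to a float along the way. It assumes a currency with two decimal places.

```elixir
defmodule MoneySketch do
  # Formats non-negative integer cents as a decimal string, e.g. 38_000 -> "380.00".
  # Assumes a currency with two decimal places.
  def format(cents) when is_integer(cents) and cents >= 0 do
    units = div(cents, 100)
    remainder = cents |> rem(100) |> Integer.to_string() |> String.pad_leading(2, "0")
    "#{units}.#{remainder}"
  end
end

IO.inspect(0.1 + 0.2 == 0.3)            # false: binary floats can't represent these exactly
IO.inspect(10 + 20 == 30)               # true: integer cents add exactly
IO.inspect(MoneySketch.format(38_000))  # "380.00"
```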

6. Status Lives on Transaction, Not JournalEntry

Journal entries don't have independent status—they inherit from their parent transaction. This avoids the complexity of partially-settled transactions and keeps state management in one place.

7. Join Table for Payout Associations

Rather than adding payout_transaction_id to the Transaction table, we use a separate PayoutTransaction join table. This:

Preserves the original transaction record unchanged
Allows a transaction to theoretically be part of multiple payout attempts (retry scenarios)
Makes "what was included in this payout?" a clean query

8. Atomic Transaction Creation

Every accounting operation is wrapped in Repo.transaction/1. If creating the second journal entry fails, the first is rolled back. The books are never unbalanced, even momentarily. As mentioned earlier, the Reactor library is used to compensate/undo transactions when API calls to the disbursement system fail.

9. Tax Handling

There were two main strategies for handling tax: either reflect tax in the ledgers as separate transactions with their own "tax_payable" or similar book, or reflect tax amounts as part of the business data, e.g., the event_revenue table's tax_collected field.

Either would work when it comes to creating invoices and audit trails, as each business record carries the transaction_id of the transaction that paid the money out. We chose to calculate tax amounts on the business side and have the ledger amounts reflect the full amount including tax (not as a separate ledger entry). If we want tax-specific details, we consult the business tables.

Our Ledger Types

cash — Money we hold (bank account)
accounts_payable — Money we owe to organizers
payouts — Money we've disbursed

The golden rule: Every transaction creates exactly two journal entries that sum to zero.
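
Read as "debits minus credits equals zero", the golden rule is easy to check mechanically. The sketch below works on plain maps with :dr / :cr entry types; the module is hypothetical and not part of the codebase described here.

```elixir
defmodule BalanceCheckSketch do
  # True when the transaction has exactly two entries and its debits equal its credits.
  def balanced?(entries) do
    length(entries) == 2 and sum(entries, :dr) == sum(entries, :cr)
  end

  # Sum of amounts (integer cents) for entries of the given type.
  defp sum(entries, type) do
    entries |> Enum.filter(&(&1.type == type)) |> Enum.map(& &1.amount) |> Enum.sum()
  end
end

good = [%{type: :dr, amount: 2_500}, %{type: :cr, amount: 2_500}]
bad = [%{type: :dr, amount: 2_500}, %{type: :cr, amount: 2_400}]

IO.inspect(BalanceCheckSketch.balanced?(good))
IO.inspect(BalanceCheckSketch.balanced?(bad))
```

A check like this can run as a periodic audit over all transactions to catch corruption early.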

The Payout Pipeline & Execution

Payment Flow Sequence

Scheduled Orchestration

An Oban worker runs on schedule, executing the full payout flow:

defmodule Amplify.ScheduledJobs.EventPayoutTipalti do
  use Oban.Worker, queue: :event_payouts, max_attempts: 1
  require Logger

  @impl Oban.Worker
  def perform(_args) do
    Logger.info("Starting Creating Payout Transactions")
    Amplify.EventPaymentMaker.create_event_revenue_transactions()
    Logger.info("Finished Creating Event Revenue Transactions")

    Amplify.GiftCardPaymentMaker.create_gift_card_transactions()
    Logger.info("Finished Creating Gift Card Transactions")

    Amplify.DigitalProductPaymentMaker.create_digital_product_transactions()
    Logger.info("Finished Creating Digital Product Transactions")

    Amplify.CustomerCashbackTransactionCreator.create_customer_cashback_transactions()
    Logger.info("Finished Creating Customer Cashback Transactions")

    Amplify.AccountingContext.create_pending_payout_transactions()
    Logger.info("Finished creating pending payout transactions")

    # Start ProcessAdPayments job to handle Meta ad billing
    %{}
    |> ProcessAdPayments.new()
    |> Oban.insert()

    Logger.info("Enqueued ProcessAdPayments job")
    :ok
  end
end

Payment Execution

TransactionPaymentMaker.pay/0 iterates pending payout transactions created earlier to disburse funds and update statuses:

def pay do
  {:ok, result} = AccountingContext.get_pending_payout_transactions()

  result
  |> Enum.map(fn {account_id, payment_destination_id, transactions} ->
    Repo.transaction(fn ->
      Enum.map(transactions, fn transaction ->
        # Amount and currency are read from the transaction's first journal entry.
        first_entry = List.first(transaction.journal_entries)
        amount = first_entry.amount
        currency = first_entry.currency

        payment_destination = Repo.get!(PaymentDestination, payment_destination_id)

        {:ok, [%Transaction{} = payout_transaction]} =
          case amount do
            amount when amount >= 0 ->
              AccountingContext.settle_payout_transactions([transaction.id])
            _ ->
              {:ok, [transaction]}
          end

        {:ok, batch_reference_id} =
          case amount do
            amount when amount > 0 ->
              Tipalti.make_payment(%{
                "amount" => amount / 100,
                "currency" => currency,
                ...
              })
            _ ->
              {:ok, nil}
          end

        if amount >= 0 do
          Tipalti.create_payout_history(%{
            payment_destination_id: payment_destination_id,
            transaction_id: payout_transaction.id,
            batch_reference_number: batch_reference_id,
            amount: amount,
            currency: currency
          })

          send_payment_summary_email(%{
            payout_transactions: payout_transaction.payout_transactions,
            batch_reference_id: batch_reference_id,
            account_id: account_id
          })
        end

        payout_transaction
      end)
    end)
  end)
end

Handling Edge Cases

Post-payment refunds — Creates a new refund transaction that reduces the next payout automatically. The DR accounts_payable, CR cash pattern naturally reduces the amount owed.

Zero/negative payouts — No Tipalti call; balance carries forward to the next payout cycle.

Missing payment destinations — Skip and notify via email; transactions remain pending until the organizer adds payment info.
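The branching behind these edge cases can be summarized in a small decision function. This is a sketch in TypeScript rather than the production Elixir. The action names are hypothetical, and it assumes amounts are stored in cents (suggested by the `amount / 100` in `pay/0`). The amount checks follow the same `> 0` and `>= 0` split as `pay/0`.

```typescript
// Hypothetical names; the amount checks mirror the branches in pay/0.
type PayoutAction =
  | "skip_and_notify" // no payment destination on file yet
  | "disburse" // positive balance: send money via the payout provider
  | "settle_without_payment" // zero balance: settle, nothing to send
  | "carry_forward"; // negative balance: stays pending for the next cycle

function decidePayoutAction(amountCents: number, hasDestination: boolean): PayoutAction {
  if (!hasDestination) return "skip_and_notify";
  if (amountCents > 0) return "disburse";
  if (amountCents === 0) return "settle_without_payment";
  return "carry_forward";
}

// Assumption: amounts are stored in cents and the provider expects major units.
function toMajorUnits(amountCents: number): number {
  return amountCents / 100;
}
```

Keeping the decision separate from the side effects (API call, history record, email) makes each branch easy to test on its own.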

Extensibility: Adding New Transaction Types

The Model Accommodates Change

New revenue streams and charge types are inevitable. The system was designed so adding a new transaction type doesn't require schema changes or breaking existing logic.

Adding a New Transaction Type Requires:

Define the type string (e.g., "membership_fee")
Decide the journal entry pattern (revenue: DR cash, CR accounts_payable; charge: reverse)
Create a function which creates a transaction using an established pattern
Add the type to the batching query's type in [...] list
Optionally: add transaction_id to a business domain table for traceability

Example: Adding Subscription Revenue

def create_subscription_revenue_transaction(%{"subscription_ids" => subscription_ids} = params) do
  Repo.transaction(fn ->
    setup_ledgers(params["account_id"])
    transaction = create_transaction(params |> Map.put("type", "subscription_revenue"))

    cash_ledger = get_ledger("cash", params["account_id"])
    accounts_payable_ledger = get_ledger("accounts_payable", params["account_id"])

    create_journal_entry(cash_ledger, transaction, :dr, params)
    create_journal_entry(accounts_payable_ledger, transaction, :cr, params)

    # Link to business domain for traceability
    Repo.update_all(
      from(s in Subscription, where: s.id in ^subscription_ids),
      set: [transaction_id: transaction.id]
    )

    Repo.get!(Transaction, transaction.id)
    |> Repo.preload([:journal_entries, :subscriptions])
  end)
end

The batching and payout logic picks it up automatically, with no changes needed downstream.

For variations within a type (e.g., differentiating "premium event" vs "standard event" revenue), we can always:

Add a subtype field to Transaction
Store additional context in a JSONB metadata field
Link to domain-specific tables via join tables

The core accounting remains unchanged; subtypes are for reporting and business logic. More important than what needs to change is what doesn't need to change when new payment requirements inevitably pop up:

Database schema (unless linking to new domain entities)
Batching logic (queries by status, not type)
Payout execution (reads journal entries, agnostic to transaction type)
State machine (all types follow the same pending → waiting_for_payout → settled flow)

This has been thoroughly tested in practice. Since inception, we've cleanly added support for the following transaction types:

customer_cashback — Referral rewards paid to customers (reverse flow)
ads — Meta ad spend charged to accounts
service_fee_split — Revenue sharing with partners

Each took hours to implement, not days, because the foundation was solid.
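The shared pending → waiting_for_payout → settled flow is small enough to sketch in a few lines. This is an illustrative TypeScript version, not our Elixir implementation; `advance` and the transition table are hypothetical names. Every transaction type moves through the same transitions, and a settled transaction cannot be advanced again, which is what guards against double-processing.

```typescript
type TransactionStatus = "pending" | "waiting_for_payout" | "settled";

// The only allowed next step for each status; null marks a terminal state.
const NEXT_STATUS: Record<TransactionStatus, TransactionStatus | null> = {
  pending: "waiting_for_payout",
  waiting_for_payout: "settled",
  settled: null,
};

// Move a transaction one step forward, refusing to re-process settled ones.
function advance(status: TransactionStatus): TransactionStatus {
  const next = NEXT_STATUS[status];
  if (next === null) {
    throw new Error(`transaction is already ${status}`);
  }
  return next;
}
```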

Lessons Learned

1. Invest in the Foundation Early

Building double-entry bookkeeping felt like over-engineering initially. A simple balance column would have shipped faster. But when we needed to add refunds, then tips, then ad charges, then multi-currency, the foundation paid for itself many times over. Each new transaction type slots in cleanly instead of requiring architectural surgery.

2. Idempotency Is Worth the Complexity

Making operations idempotent required more upfront thought—checking for existing records, updating instead of inserting, querying current state rather than relying on sequence. But it eliminated an entire class of bugs: duplicate payments, orphaned records, and state corruption from retried jobs. When your system handles money, "ran twice by accident" cannot mean "paid twice."
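As a rough illustration of "check for existing records before inserting", here is a TypeScript sketch with an in-memory map standing in for a database table. `PayoutStore`, `createOnce`, and keying by source transaction ID are assumptions made for the example, not our actual code.

```typescript
// Hypothetical record and store; a Map stands in for a database table.
interface PayoutRecord {
  sourceTransactionId: string;
  amountCents: number;
}

class PayoutStore {
  private bySource = new Map<string, PayoutRecord>();

  // Create the payout only if none exists for this source transaction.
  // A retried job gets the existing record back instead of paying twice.
  createOnce(
    sourceTransactionId: string,
    amountCents: number,
  ): { record: PayoutRecord; created: boolean } {
    const existing = this.bySource.get(sourceTransactionId);
    if (existing) return { record: existing, created: false };

    const record: PayoutRecord = { sourceTransactionId, amountCents };
    this.bySource.set(sourceTransactionId, record);
    return { record, created: true };
  }

  count(): number {
    return this.bySource.size;
  }
}
```

In a real database the same guarantee usually also needs a unique constraint, so that two concurrent runs cannot both pass the existence check.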

3. Separate Accounting from Business Logic

The transaction_id foreign key pattern was a late addition. Initially, we tried to make transactions self-describing with rich metadata. But queries like "show me all tips in this payout" became convoluted. Keeping transactions as clean accounting records and linking them to business entities via foreign keys gave us the best of both worlds: simple accounting logic and rich business context.

4. Status Is Easier Than Ledger Math

We debated whether to derive "what's pending" by summing ledger entries (the pure accounting approach) versus querying transaction status (the pragmatic approach). Status won. Developers can reason about status = 'pending' without understanding debits and credits. The journal entries remain the source of truth for amounts; status is the source of truth for workflow state.

5. Batching Is a Product Feature, Not Just a Cost Optimization

We initially thought of batching as purely an internal optimization to reduce Tipalti fees. But organizers love seeing a single weekly payment with a detailed breakdown rather than dozens of micro-deposits. The batching architecture became a product feature: predictable payment schedules with comprehensive statements.

6. Plan for Retroactive Adjustments

The refund-after-payment scenario seemed like an edge case, but it's actually quite common. Chargebacks, late refund requests, and corrections happen regularly. Because every adjustment is just another transaction with the appropriate DR/CR pattern, the system handles them naturally. The next payout simply reflects the updated balance.

7. Currency Handling Is Subtle

We underestimated currency complexity. It's not just conversion rates, it's also case sensitivity (USD vs usd), timing of conversion (at transaction time vs payout time), and reporting currency vs payment currency. Using fragment("lower(?)", je.currency) everywhere and storing the payment destination's preferred currency solved most issues, but we wish we'd been more deliberate from the start.
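The case-sensitivity problem is easy to reproduce outside the database. A minimal TypeScript sketch (the `Amount` shape and `totalsByCurrency` are hypothetical) that normalizes currency codes before grouping, in the same spirit as the `lower(?)` fragment:

```typescript
interface Amount {
  currency: string; // may arrive as "USD", "usd", "Usd", ...
  cents: number;
}

// Sum amounts per currency, treating currency codes case-insensitively.
function totalsByCurrency(amounts: Amount[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const { currency, cents } of amounts) {
    const key = currency.trim().toLowerCase();
    totals.set(key, (totals.get(key) ?? 0) + cents);
  }
  return totals;
}
```

Without the normalization step, "USD" and "usd" would land in separate buckets and a payout could be split or under-reported.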

8. Traceability Saves Support Time

Every support ticket about payments boils down to "why did I get this amount?" Full traceability—from payout email → payout transaction → source transactions → individual tips/events/charges → original orders—means support can answer definitively instead of guessing. The join tables and foreign keys that felt like extra work during development save hours of investigation weekly.

Conclusion

Building a payment engine from scratch is a significant undertaking, but for multi-sided marketplaces, it may be unavoidable. Off-the-shelf solutions optimize for simpler flows like point-to-point atomic transactions. However, when money moves in multiple directions, across currencies, with retroactive adjustments and batched payouts, you need control over the accounting layer.

The key architectural decisions that served us well:

Double-entry bookkeeping provides self-validation and an immutable audit trail
Transaction types drive journal entries, making business logic explicit
Linking accounting to business domains via transaction_id enables full traceability
Per-account ledgers simplify queries and provide natural isolation
Idempotent operations eliminate duplicate payment risks
State machine for transactions makes workflow explicit and prevents double-processing
Batching by payment destination minimizes fees while improving the organizer experience

The system has processed millions of dollars across multiple currencies, handled thousands of refunds (including post-payment adjustments), and scaled to support new transaction types as the business evolved. The upfront investment in a solid foundation made all of this possible.

If you're building something similar, start with the accounting model. Get the ledgers, journal entries, and transaction patterns right first. Everything else, like batching, scheduling, and notifications, builds cleanly on top of a sound foundation.

Built with Elixir, Phoenix, Ecto, Reactor and Oban. Accounts Receivable via Stripe and PayPal, Accounts Payable via Tipalti.

Building Embeddable Widgets with Phoenix Channels

2025-12-25T20:00:42.979271+00:00

When you're selling stuff, every click away from your website is a potential lost sale. Customers kept asking us to allow people to buy tickets without leaving their branded site. They loved the service, but wanted to retain brand presence and coherence.

The ask seemed simple enough. Drop a widget on any webpage, show some events, let people check out. But as we dug into it, the technical challenges started stacking up. How do you maintain cart state across page refreshes? How do you track marketing attribution when the purchase happens on someone else's domain? And how do you process payments securely in what is essentially a third party context?

This is a quick post about how we tried a bunch of approaches, hit walls, and eventually landed on Phoenix Channels as our solution.

The Business Problem

Our platform handles ticketing for comedy shows and events. Most of our customers have their own websites where they promote their shows, but when someone wants to buy a ticket, they get redirected to our checkout flow. That redirect isn't an issue if you're coming from Google or some social media platform, but it is a point of friction if you're coming from someone else's website and already viewing event information there.

The numbers were fairly clear: cart abandonment spiked whenever users had to leave the original site. People would click "Buy Tickets", land on our domain, get distracted, and never complete the purchase. Our customers wanted a way to keep buyers on their own turf through the entire flow.

So we set out to build an embeddable widget that could handle the complete purchase experience. Browse events, add to cart, enter customer info, pay with Stripe, get confirmation. All without leaving the host page.

What We Needed to Solve

Beyond the basic checkout flow, we had a bunch of technical requirements that made this tricky.

First, we needed to track marketing parameters. When someone lands on a page with utm_campaign or fbclid in the URL, those values need to follow the user all the way through checkout so we can attribute the sale correctly. Same deal with referral codes and other tracking identifiers.

Second, we needed location detection. Different regions have different currencies and tax rules. We needed to figure out where the user was coming from and apply the right pricing.

Third, cart persistence. If someone adds tickets to their cart and then refreshes the page, that cart better still be there. And ideally, if they come back the next day, same thing.

Fourth, conversion tracking. When an order completes, we need to fire Meta Pixel events, Google Tag conversions, Reddit Pixel tracking. All the usual suspects for marketing attribution.

And finally, all of this had to work on any website, regardless of what tech stack they're running. WordPress, Squarespace, custom Rails apps, static HTML. The widget needed to just work.

The Approaches We Tried

Web Components with REST

Our first instinct was the obvious one. Build a Web Component that makes REST calls to our API for everything. Add to cart? POST request. Update quantity? PUT request. Checkout? More requests.

We actually had an older implementation using this pattern from a few years back, built with Stencil. We got a prototype working pretty quickly. The problem was state management. Keeping the web component UI state in sync required us to retrieve data every time something changed and the component became a giant HTTP client with tons of boilerplate and error handling code. This isn't necessarily bad, but a core tenet of Elixir development is to reduce boilerplate and opt for messaging over RPC, so we quickly ditched this approach.

The bigger issue was that we were essentially rebuilding our entire checkout flow in JavaScript. All the price formatting logic, the discount calculations, the conditional display rules based on product configuration. We had all of this working beautifully in Elixir already. Duplicating it on the client felt wrong. The client side code didn't get gnarly, but there was considerable duplication between what the web component rendered and what our website rendered.

iframes

Next we tried the iframe approach. Just embed our existing checkout in an iframe on the host page. This way we could reuse all our LiveView code with zero modifications.

The cross origin communication was the first problem. Getting the parent page and the iframe to talk to each other meant a bunch of postMessage calls with careful origin checking. But the real killer was third party cookies.

Modern browsers are cracking down on third party cookies hard. Safari was already blocking them by default, Chrome is phasing them out, Firefox has restrictions. Our session management relied on cookies, and suddenly half our users couldn't stay logged in through the checkout flow. The iframe approach was dead on arrival even though the PoC worked. Chasing the JavaScript stack and tracking all the latest tweaks browsers and libraries make isn't a business worth getting into.
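For reference, the origin checking that postMessage requires looks roughly like the sketch below. The checkout origin, the message shape, and `readCheckoutMessage` are hypothetical; `MessageLike` is a minimal stand-in for the browser's MessageEvent so the logic can run outside a browser.

```typescript
// Minimal shape of a browser MessageEvent.
interface MessageLike {
  origin: string;
  data: unknown;
}

// Hypothetical origin of the embedded checkout iframe.
const CHECKOUT_ORIGIN = "https://checkout.example.com";

// Accept a message only if it comes from the expected origin and has the
// shape we expect; everything else is ignored.
function readCheckoutMessage(event: MessageLike): { type: string } | null {
  if (event.origin !== CHECKOUT_ORIGIN) return null;
  const data = event.data;
  if (typeof data !== "object" || data === null) return null;
  const type = (data as { type?: unknown }).type;
  return typeof type === "string" ? { type } : null;
}

// In a browser this would be wired up along the lines of:
// window.addEventListener("message", (event) => {
//   const message = readCheckoutMessage(event);
//   if (message) { /* react to message.type */ }
// });
```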

LiveView Inside Web Components

This one seemed promising. Phoenix LiveView gives you that real time reactivity we wanted. What if we could mount a LiveView inside a Web Component's Shadow DOM?

We spent a good chunk of time trying to make this work. The problem is that LiveView's DOM patching fundamentally conflicts with Shadow DOM encapsulation. LiveView needs to own the DOM tree it's managing, and Shadow DOM creates a boundary that the patching algorithm can't cross properly. Event delegation breaks. The morphdom updates don't penetrate the shadow boundary correctly.

We looked at the Phoenix source code, read through the forums and Discord. The consensus was clear: this isn't a supported pattern and probably never will be even though libraries like live_portal are trying hard to solve it.

Other Libraries

We evaluated a bunch of other options. Hologram looked interesting, but connecting the web component's socket to Hologram seemed brittle and intrusive, especially as Hologram is relatively early stage and its APIs are likely to change. LiveVue and LiveSvelte are designed for using those frameworks inside LiveView, not the other way around.

None of them gave us what we needed.

Why Phoenix Channels Worked

The breakthrough came when we stopped trying to embed LiveView and started thinking about what we actually needed from the server.

We didn't need the full LiveView abstraction. We needed real time communication with state management. We needed the server to be able to push updates when things changed. And we already had this infrastructure running: Phoenix Channels.

Channels give you a persistent WebSocket connection with built in topic subscription, message passing, and presence tracking. We use them extensively for other features in the platform. Why not for the widget?

The key insight was realizing we could send rendered HTML over the channel instead of JSON. This sounds backwards at first. Aren't you supposed to send data and let the client render it? But hear me out.

By rendering HTML on the server, we get to reuse all our existing Elixir code. Price formatting that handles multiple currencies and locales? Already written. Display logic that varies based on brand settings? Just a function call away. Conditional rendering for sold out events versus available ones? Same templates we use everywhere else.

If we sent JSON, we'd have to duplicate all of this in JavaScript and then keep the two implementations in sync. That might satisfy some esoteric separation-of-concerns goal, but it feels wonky and unnecessary. Besides, the widget and the main site are essentially the same app.

The other thing about WebSockets is they're fast. Once that connection is established, you're not paying the TCP handshake cost on every interaction. Headers get stripped down to almost nothing. For something like a cart that might see dozens of updates in a session, that overhead reduction adds up. A typical response to a call on the channel side looks like:

  @impl true
  def handle_in("event", %{"type" => "go-to-checkout"}, socket) do
    cart_id = socket.assigns[:cart_id]

    if cart_id do
      {:ok, cart} = CartContext.get_cart(cart_id)
      {socket, response} = prepare_checkout_view(socket, cart_id, cart.region_id) # Full HTML 
      {:reply, {:ok, response}, socket}
    else
      {:reply, {:error, %{reason: "No cart available"}}, socket}
    end
  end

Browser support used to be a concern but WebSockets have been solid across all major browsers since 2011. This isn't experimental technology.

The Architecture

Here's how the pieces fit together.

On the client side, we have a Web Component called product-list-widget. You drop it on any page with a script tag and a custom element, pass it a widget ID as an attribute, and it handles the rest.

We debated between Lit and Stencil for building the Web Component. We had used Stencil in our original REST based implementation years ago, but this time we went with Lit. Stencil is more oriented toward building design systems and component libraries, with its own compiler and build toolchain. For our use case of a single focused widget, Lit's lighter weight approach made more sense. You just extend a base class and you're done.

When the component mounts, it establishes a WebSocket connection to our Phoenix application and joins a channel specific to that widget. The channel handler on the server looks up the widget configuration, fetches the relevant products, renders the initial HTML, and sends it back.

From there, every user interaction goes through the channel. Click add to cart? The component sends a message, the server updates the cart, renders the new state, and replies with HTML. During checkout it's the same pattern. The server handles the business logic and responds with the next view to display.

Host Website (any domain)
    └── product-list-widget (Web Component)
          └── Phoenix Channel Connection
                └── ProductListChannel
                      └── CheckoutService (shared with main site)
                            └── All our existing Elixir code

The Web Component provides encapsulation so our styles don't leak into the host page and vice versa. The Shadow DOM keeps everything contained.

Handling Parameters

One of the first problems we hit was tracking parameters. When someone lands on a page with marketing UTMs in the URL, we need to capture those and associate them with any purchase that happens.

The solution is straightforward. When the widget initializes, it reads the current URL and extracts any parameters we care about. UTM campaign, source, medium. Facebook click IDs. Referral codes from our affiliate program. Venue access codes for special pricing.

These get sent along when joining the channel. On the server side, we stash them in the socket assigns and later write them into the cart context when the cart gets created.

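// Read tracking parameters from the host page's URL.
const url = new URL(window.location.href);

const trackingParams = {
  utm_campaign: url.searchParams.get('utm_campaign'),
  utm_source: url.searchParams.get('utm_source'),
  fbclid: url.searchParams.get('fbclid'),
  customer_referral_code: url.searchParams.get('customer_referral_code'),
  ref: url.searchParams.get('ref')
};

// Join params are passed when creating the channel; join() itself only takes
// an optional timeout. `socket` is an already connected Phoenix Socket.
const channel = socket.channel("product_list:1", { tracking_params: trackingParams });
channel.join();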

When the order eventually completes, all these parameters are sitting in the cart context, ready for our analytics pipeline to process.
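A slightly more defensive variant only sends the parameters that are actually present, so the join payload isn't full of nulls. This is a sketch: `extractTrackingParams` and the key list are assumptions (the key list adds `utm_medium`, which is mentioned above), not the widget's actual code.

```typescript
// Parameter names we care about; hypothetical list for illustration.
const TRACKING_KEYS = [
  "utm_campaign",
  "utm_source",
  "utm_medium",
  "fbclid",
  "customer_referral_code",
  "ref",
] as const;

type TrackingKey = (typeof TRACKING_KEYS)[number];
type TrackingParams = Partial<Record<TrackingKey, string>>;

// Collect only the tracking parameters that are present in the URL.
function extractTrackingParams(href: string): TrackingParams {
  const searchParams = new URL(href).searchParams;
  const params: TrackingParams = {};
  for (const key of TRACKING_KEYS) {
    const value = searchParams.get(key);
    if (value !== null && value !== "") params[key] = value;
  }
  return params;
}
```

In the widget this would be called with `window.location.href` and the result passed as the channel's join params.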

Location Detection

Currency and tax rules depend on where the buyer is located. We handle this by grabbing the IP address from the socket connection and running it through a geolocation service.

The channel join handler pulls the IP from the socket, looks up the location, and uses that to determine the default currency and region. If we detect someone in Canada, they see Canadian dollars. Someone in the US sees USD.

This happens transparently on the first connection. The user doesn't have to select their country from a dropdown or anything like that. We just figure it out and show them relevant pricing.

Cart Persistence

LocalStorage handles cart persistence on the client side. When we create a cart on the server, we get back a cart ID. The widget stores that ID in LocalStorage keyed by the widget ID.

Next time the page loads, the widget checks for an existing cart ID and sends it along when joining the channel. The server validates that the cart exists and is still usable, then restores the previous state. If the cart has expired or been completed, we just start fresh.

private getStoredCartId(): string | null {
  return localStorage.getItem(`widget_cart_${this.widgetId}`);
}

private storeCartId(cartId: string): void {
  localStorage.setItem(`widget_cart_${this.widgetId}`, cartId);
}

This works around the third party cookie problem entirely. LocalStorage is partitioned by origin, but the widget JavaScript runs in the context of the host page, so it has access to that page's storage.
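The restore flow described above can be sketched as two small helpers. This is illustrative TypeScript, not the widget's real code: `StorageLike` is a minimal subset of the Web Storage API (so `window.localStorage` can be passed in directly), and the `JoinReply` shape, where the server returns `null` for an expired or completed cart, is an assumption.

```typescript
// Minimal subset of the Web Storage API, so the sketch runs outside a browser too.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical shape of the server's join reply.
interface JoinReply {
  cart_id: string | null; // null when the stored cart expired or was completed
}

const cartKey = (widgetId: string): string => `widget_cart_${widgetId}`;

// Build the join payload, including a stored cart ID if there is one.
function buildJoinPayload(storage: StorageLike, widgetId: string): { cart_id?: string } {
  const cartId = storage.getItem(cartKey(widgetId));
  return cartId ? { cart_id: cartId } : {};
}

// After joining, keep the stored ID in step with what the server accepted.
function syncStoredCart(storage: StorageLike, widgetId: string, reply: JoinReply): void {
  if (reply.cart_id) {
    storage.setItem(cartKey(widgetId), reply.cart_id);
  } else {
    storage.removeItem(cartKey(widgetId));
  }
}
```

Clearing the stored ID when the server rejects it avoids re-sending a dead cart ID on every page load.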

DOM Diffing with Idiomorph

Here's a detail that makes a huge difference in user experience. When the server sends back new HTML, we don't just blast it into the DOM with innerHTML. That would reset focus on input fields, lose scroll position, and generally feel janky.

Instead we use Idiomorph, a library that does intelligent DOM diffing and morphing. You give it the current DOM and the new HTML you want, and it figures out the minimal set of changes needed to transform one into the other. Elements that haven't changed stay put. Focus stays on the input you were typing in. Scroll position is preserved.

import { Idiomorph } from 'idiomorph';

private updateContent(html: string) {
  Idiomorph.morph(this.contentContainer, html, {
    morphStyle: 'innerHTML'
  });
}

This gives us LiveView style reactivity without actually using LiveView. When you update your cart quantity, the rest of the page stays exactly where it was. You don't lose your place or have to refocus on the field. It feels smooth.

Payment Integration

Stripe Elements handles the payment UI. After the customer fills in their info and clicks continue, we create a PaymentIntent on the server and send back the client secret along with instructions to show the payment form.

The widget mounts Stripe Elements into a container within the Shadow DOM. User enters their card, we confirm the payment with Stripe, and on success we send a message back through the channel to complete the order.

Free orders skip the Stripe step entirely. If the cart total is zero (maybe they used a 100% off promo code), we just submit the order directly without involving payment processing.

Conversion Tracking

Getting conversion pixels to fire correctly took some thought. On our main site, we use LiveView's push_event/3 to send conversion data to a JavaScript hook that fires the various tracking pixels. But in the widget context, we don't have LiveView hooks.

The solution is to include the conversion event data in the order success response. When an order completes, the server looks up what analytics elements are configured for that account, builds out the event payloads for each one, and includes them in the response.

{:ok, conversion_events} = CheckoutService.get_order_conversion_data(order_id)

{:reply, {:ok, %{
  html: render_success_view(order_data),
  view: "success",
  conversion_events: conversion_events
}}, socket}

On the client side, we iterate through those events and fire them using the appropriate global functions. fbq for Meta, gtag for Google, rdt for Reddit.

if (response.conversion_events?.length) {
  setTimeout(() => {
    this.fireConversionEvents(response.conversion_events);
  }, 2000);
}

The two second delay matches what we do on the main site. It gives the page time to stabilize before we start making network requests to third party tracking services.

One caveat here: the host page needs to have these tracking libraries loaded. If they haven't included the Facebook Pixel script, there is no fbq to call, so that event never fires. We log a warning to the console but don't crash.
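A sketch of what that guarded firing might look like. The `ConversionEvent` shape and the function signature are assumptions, since the real payloads are built on the server. The globals object is passed in (in the browser it would be `window`) so the logic can be exercised without a page.

```typescript
// Hypothetical event shape; the real payloads are built on the server.
interface ConversionEvent {
  provider: "meta" | "google" | "reddit";
  args: unknown[];
}

// Global function name each provider's script installs on the page.
const TRACKER_GLOBALS: Record<ConversionEvent["provider"], string> = {
  meta: "fbq",
  google: "gtag",
  reddit: "rdt",
};

// Fire each event if its tracker is loaded; otherwise warn and move on.
// Returns the number of events that were actually fired.
function fireConversionEvents(
  events: ConversionEvent[],
  globals: Record<string, unknown>,
  warn: (message: string) => void = console.warn,
): number {
  let fired = 0;
  for (const event of events) {
    const name = TRACKER_GLOBALS[event.provider];
    const tracker = globals[name];
    if (typeof tracker !== "function") {
      warn(`${name} is not loaded on the host page; skipping ${event.provider} event`);
      continue;
    }
    try {
      tracker(...event.args);
      fired += 1;
    } catch (error) {
      warn(`${name} threw while tracking: ${String(error)}`);
    }
  }
  return fired;
}
```

The try/catch matters because a misconfigured third party script on the host page should never be able to break the order confirmation.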

Reducing Duplication

One of the things we were most worried about was maintaining two parallel implementations. The main site checkout and the widget checkout doing the same things in slightly different ways.

We solved this by extracting shared logic into a CheckoutService module. Cart validation, customer adding, Stripe setup, order completion. All the core checkout operations live in one place and get called from both the LiveView code and the channel handlers.

defmodule AmplifyWeb.CheckoutService do
  # Public API outline only; function bodies are omitted here.
  def complete_cart_with_validation(cart_id)
  def setup_stripe_payment(cart_id, currency_code)
  def add_customer_to_cart(cart_id, customer_params, opts)
  def validate_customer_params(params)
  def get_order_conversion_data(order_id)
end

Before this refactor, we had the same cart completion logic written out in three different places. Each one had slight variations that had accumulated over time. Consolidating them into a single service eliminated about a hundred lines of duplicated code and gave us one canonical implementation to maintain.

The conversion event building is a good example. Both the CartComplete LiveView and the widget channel need to fire the same tracking pixels with the same data. Now they both call the same get_order_conversion_data/1 function and get the same events back.

What we didn't try to share is the template rendering. The widget uses string interpolation templates since it's returning raw HTML over the channel. The main site uses HEEx templates with all the LiveView compile time goodness. These are fundamentally different rendering contexts, and trying to abstract over them would have been more trouble than it's worth. DRY isn't always good; sometimes a little duplication is OK.

Monorepo Structure

We keep the Lit widget code in the same repository as the Phoenix application. The widget lives in assets/widget/ with its own package.json, TypeScript config, and build scripts. This wasn't an obvious choice at first since we could have created a separate npm package and published it independently.

The monorepo approach won out for a few reasons. First, the widget and the server are tightly coupled. When we change how the channel sends data, we often need to update how the widget handles it. Having both in the same repo means we can make those changes atomically in a single commit. No version coordination, no wondering if the deployed widget matches the deployed server.

Second, shared types. We generate TypeScript interfaces from our Elixir structs for the channel message formats. Keeping everything together means the types stay in sync automatically. When we add a field to a response, the TypeScript compiler immediately tells us everywhere that needs updating.

Third, simpler CI. One repo means one pipeline. We run the Elixir tests, build the widget, and deploy everything together. If the widget build fails, the whole deploy fails. No chance of shipping a broken widget because someone forgot to bump a version number.
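To give a feel for the shared types from the second point, here is a sketch of what generated interfaces for channel replies could look like. The `html`, `view`, and `conversion_events` fields come from the success reply shown earlier; the other view names and the helper function are hypothetical.

```typescript
// Hypothetical generated types; view names other than "success" are assumptions.
interface ViewReply {
  html: string;
  view: "list" | "cart" | "checkout";
}

interface SuccessReply {
  html: string;
  view: "success";
  conversion_events: unknown[];
}

// Discriminated union on `view`, so the compiler knows which fields exist.
type ChannelReply = ViewReply | SuccessReply;

// Only success replies carry conversion events; the compiler enforces this.
function conversionEventCount(reply: ChannelReply): number {
  return reply.view === "success" ? reply.conversion_events.length : 0;
}
```

If a field is renamed on the Elixir side and the types are regenerated, every place in the widget that reads the old name fails to compile.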

The build process is straightforward. We have a mix task that shells out to npm to build the widget, then copies the output to priv/static/widget/. The widget JS and CSS get served as static assets. When someone includes our script tag, they're pulling from the same CDN that serves the rest of our static files.

amplify/
├── lib/                    # Elixir code
│   └── amplify_web/
│       └── channels/
│           └── product_list_channel.ex
├── assets/
│   ├── js/                 # Main app JavaScript
│   └── widget/             # Widget package
│       ├── src/
│       │   └── product-list.ts
│       ├── package.json
│       └── tsconfig.json
└── priv/static/widget/     # Built widget output

This also helps with our customer UX as they just have to paste a script tag into their Squarespace site. Hosting the widget ourselves and keeping it in the monorepo gives us full control over the experience.

What We Learned

Start with the primitives you already have. We wasted some time trying to shoehorn LiveView into a context it wasn't designed for. Phoenix Channels were sitting right there the whole time, battle tested and ready to go. Sometimes the answer isn't a new library or a clever workaround. It's the boring infrastructure you've been using for years.

Server rendered HTML over WebSockets is underrated. The conventional wisdom says send JSON and render on the client. But that only makes sense if you don't already have rendering logic on the server. We had years of Elixir code handling edge cases in price formatting, discount calculations, and conditional display logic. Sending HTML meant we could reuse all of it. The widget got our production tested rendering for free.

DOM diffing libraries are table stakes for any dynamic UI. We almost shipped with raw innerHTML updates. The first time we saw a user lose their place in a form because the whole container re-rendered, we knew we needed Idiomorph. The integration took maybe two hours. The UX improvement was immediate and obvious. If you're updating DOM content dynamically, use a morphing library. It's not optional.

LocalStorage beats cookies for cross origin state. The iframe approach died because of third party cookie restrictions. LocalStorage doesn't have that problem since the widget runs in the host page's context. The tradeoff is that carts are now per device rather than per user session, but for our use case that's actually fine. Most ticket purchases happen in a single session anyway.

Monorepo simplifies everything when client and server are tightly coupled. We briefly considered publishing the widget as a separate npm package. The coordination overhead would have been brutal. Every channel message format change would require syncing versions across repos. Keeping everything together means atomic commits and one deployment pipeline. The simplicity is worth the slightly larger repo.

Wrapping Up

Phoenix Channels gave us the foundation we needed to build an embeddable widget that actually works. The persistent WebSocket connection handles real time updates elegantly. Sending rendered HTML lets us reuse our existing Elixir codebase. Idiomorph keeps the DOM updates feeling smooth.

If you're building something similar with Elixir and Phoenix, consider whether channels might be a better fit than trying to embed LiveView directly. The pattern of Web Component plus Channel plus server rendered HTML is surprisingly powerful and sidesteps a lot of the complexity we ran into with other approaches.

The widget is now running in production on customer sites across different platforms and tech stacks. Drop in a script tag, add the custom element, and you've got a full checkout flow without sending users anywhere. Here's what the final widget looks like: