Kaspa Research - Latest posts https://research.kas.pa

Adaptive block sizes That's true, if it manages not to get stuck in an equilibrium where everyone pays the same negligible fee and gets the same perfect service. Also, when it activates, I'm a little uneasy about the hard cutoff that may abruptly exclude experimental use cases instead of just making them a bit slower. That said, I'm still trying to understand it better and don't yet have a clear picture of all its implications.

https://research.kas.pa/t/adaptive-block-sizes/464#post_3 Fri, 20 Feb 2026 11:06:18 +0000 research.kas.pa-post-728
Adaptive block sizes Nice proposal. One minor comment (while I'm digesting it): LSZ does not undermine quality of service; users paying higher fees are included faster / with higher probability.

https://research.kas.pa/t/adaptive-block-sizes/464#post_2 Tue, 17 Feb 2026 00:04:36 +0000 research.kas.pa-post-724
Adaptive block sizes Adaptive block size to improve the security budget

I’ve been discussing the idea of adaptive block sizes on X and in private chats, so I thought I would post it here to make it a little bit more formal and visible, and potentially gather some additional feedback. It is not a fully finished thing, just my current state of thinking about the subject.

Motivation

Kaspa’s fee market has great theoretical properties, thanks to the DAG parallelism. It however struggles to activate in practice, since the network has a far greater capacity than the current economic demand for transactions.

While there is no fixed number for when a security budget is deemed “enough”, the current low fees coupled with a very fast emission schedule are worrying many people inside and outside the Kaspa community, which is itself a factor that plays against adoption.

Most development efforts focus on increasing the demand by adding additional functionalities and use cases, and that’s of course great.

However, I think a functional fee market must also have an elastic supply, in order to sustain miners' revenues across multiple demand regimes: avoiding starvation when demand is low, and congestion if demand ever becomes too high.

This text describes a way to do so that is sufficiently simple and conservative as to avoid excessive shocks in the way Kaspa works, but also ambitious enough to potentially alleviate many worries about its future sustainability.

The proposal

I simply propose to divide time into capacity epochs, and have miners regularly vote (by hash power) on the block size limit to be used for the following epoch. For example, capacity epochs could last two weeks and the voting window could be 3 days. A commit-then-reveal voting scheme could also be used, although I'm not sure it's strictly necessary.

Miners would vote on the delta (in kB) to be added or subtracted from the current block size limit, by including that number in a new dedicated field of the blocks they mine. The protocol would then compute an X percentile of the submitted deltas, and clamp the resulting delta to a bounded range (for example [-Δmax, +Δmax]) to avoid excessive changes in a single epoch. That clamped value would be applied to the current capacity to determine the capacity for the following epoch.

The percentile used could be the median (X=50%) or possibly a higher one (say X=70%) to enforce a larger consensus for a block size decrease. Using asymmetric percentiles can help reduce the risk of sudden fee shocks, although it also introduces a tradeoff in the form of potential long-run drift, so the choice of X should be made carefully.

The maximum block size should of course also be fixed (most likely at the current level).
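A minimal sketch of this update rule, in Rust (the constants, the weighting by work, and the function names are illustrative assumptions, not a specification):

// Illustrative placeholders; actual values would be protocol parameters.
const DELTA_MAX_KB: i64 = 10;     // per-epoch clamp [-Δmax, +Δmax]
const MIN_LIMIT_KB: i64 = 100;    // hard floor
const MAX_LIMIT_KB: i64 = 1_000;  // hard ceiling (e.g. the current limit)

/// Hash-power-weighted X percentile of the voted deltas.
/// `votes` holds (delta_kb, work) pairs, one per voting block.
fn weighted_percentile(mut votes: Vec<(i64, u64)>, x: f64) -> i64 {
    votes.sort_by_key(|&(delta, _)| delta);
    let total: u64 = votes.iter().map(|&(_, w)| w).sum();
    let threshold = (total as f64 * x).ceil() as u64;
    let mut acc = 0u64;
    for (delta, w) in votes {
        acc += w;
        if acc >= threshold {
            return delta;
        }
    }
    0 // no votes: keep the current limit unchanged
}

/// Block size limit for the next capacity epoch.
fn next_block_size_limit(current_kb: i64, votes: Vec<(i64, u64)>, x: f64) -> i64 {
    let voted = weighted_percentile(votes, x);
    let clamped = voted.clamp(-DELTA_MAX_KB, DELTA_MAX_KB);
    (current_kb + clamped).clamp(MIN_LIMIT_KB, MAX_LIMIT_KB)
}

Note that the clamp is applied after the percentile is taken, so even a large hash-power majority can move the limit by at most Δmax per epoch.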

The philosophy of the proposal is that of a slow block size evolution, not of a hyper reactive feedback system. We want the users to consider the network capacity as locally fixed, because that’s what generates the economic scarcity that leads to a security budget (if capacity reacted too quickly to a surge in economic demand, users would factor that into their optimization problem and make lower bids).

The slow evolution also maintains a good level of short term fee predictability that is important for wallets and L2s.

In summary, the idea is that of setting a dynamic baseline capacity that reflects the general stage of adoption of the network, and is not overly affected by temporary events such as the latest memecoin launch.

Analysis

Maximizing the security budget is equivalent to maximizing the miners’ revenues. Therefore it seems natural to leave to the miners (suppliers of block space) the decision as to how much capacity they want to offer to the network, as it happens in most markets. The key is however to make sure that this is not done at the expense of the users.

So how can we guarantee that miners won’t take advantage of this power by endangering the network just to make more money? This must be done by studying their incentives and by carefully designing safeguards so that they won’t act too selfishly.

Let’s first assume that miners play a single round: they observe a demand function and they individually ask themselves what network capacity would help them maximize their revenues. In this simplified case, miners’ incentives seem to be reasonably well aligned with those of the users. In particular:

  • They won’t pick a block size that is too high, because that drives fees to zero and they won’t make any money. This is consistent with the interests of the users, because a network where miners make no money is insecure and ultimately useless.

  • They won’t pick a block size that is too small either, because that would increase the equilibrium fees so much that it would exclude too many people, driving revenues down. This alignment is strongest when demand is sufficiently elastic; if demand were to become highly inelastic at the top end, miners might be tempted by persistent under-capacity, which is why conservative floors, slow adjustments, and a voting system that is robust to downward manipulation are important.

Simulations confirm the intuition that in a one-round game, it is optimal for miners to set a capacity close to the region where fee revenue is maximized, which in the model coincides with serving most economically meaningful transactions.

(See notebook at Google Colab to reproduce the results under different assumptions on the concentration of market demand and number of leaders in the round)
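For intuition (this is a toy model, not a reproduction of the linked notebook), the sketch below maximizes one-round revenue against a simple decreasing demand curve; the optimum lands well inside the demand range rather than at either extreme:

/// Toy one-round model: bids are sorted in decreasing order of willingness to
/// pay; with capacity c the clearing fee is the bid of the marginal included
/// transaction, so miner revenue is c * bids[c - 1].
fn revenue(c: usize, bids_desc: &[f64]) -> f64 {
    if c == 0 || c > bids_desc.len() {
        return 0.0;
    }
    c as f64 * bids_desc[c - 1]
}

fn main() {
    // Hypothetical demand: 10_000 potential txs, bids decaying geometrically.
    let bids: Vec<f64> = (0..10_000).map(|i| 100.0 * 0.999_f64.powi(i)).collect();
    let (best_c, best_rev) = (1..=bids.len())
        .map(|c| (c, revenue(c, &bids)))
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .unwrap();
    // The optimum is interior: zero capacity prices everyone out, while full
    // capacity drives the clearing fee (and hence revenue) towards zero.
    println!("revenue-maximizing capacity: {best_c} txs, revenue: {best_rev:.0}");
}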

While this is of course a simplification, the result is somewhat reassuring, as it suggests that even in an environment focused purely on short-run revenue, miners still tend to offer substantial total space. That doesn’t imply that every transaction will be included, because some miners will pick up the same transactions, but typically a majority of them will (depending on the concentration of demand).

But of course miners don’t play for only one round, and that has both positive and potentially negative consequences. Positive because the long term involvement is an incentive to care about the network, and to avoid taking decisions that drive away participants or degrade its branding and appeal to potential future users (because that would affect the future profitability of mining). That is also an incentive to proceed with caution when evolving the network capacity, not necessarily optimizing only the one-period revenue, but rather leaning on the safe side of a slightly over-dimensioned capacity that still welcomes experimental use cases and is robust enough to demand fluctuations.

The potentially negative side of the multi-period game is that it opens the door for selfish strategies. Miners are not a monolith and may have different and sometimes opposing interests. While it is fairly clear what their incentives as a group are, it’s not immediately obvious whether they will coordinate to follow them.

For example, can a large miner enforce an excessively high capacity, in an attempt to reduce the overall mining profitability and drive smaller miners away from the competition? That is possible if they control enough hash power to push the chosen percentile in that direction. Conversely, can they shrink the block size to artificially drive fees up in the short term, even if it harms the long term appeal and sustainability of the network? Similarly, that would require controlling enough hash power to dominate the percentile vote.

The choice of X should therefore be based on which attack appears more worrying. If X=50%, the balanced choice, then both attacks would roughly require a majority of hash power. It should be noted that since the maximal capacity delta is also limited, these attacks would require keeping such a majority for a sustained amount of time (several months or years), not just for a couple of periods.

It is important to observe that whatever capacity is voted by the miners, it will hold for everyone. If it produces higher fee revenues, all miners will enjoy them, as proof of work naturally distributes them to everyone (in proportion to their hash power). Therefore if a miner wants to manipulate the capacity away from its optimal value, he will typically suffer losses too, at least in the short term. This does not rule out strategies motivated by relative advantage rather than absolute profit, but delta caps and long adjustment horizons raise the cost of such behavior. Furthermore, since capacity would be capped at the current value, there’s little risk of increases that would affect hardware requirements and the decentralization of the network.

An interesting side effect of this proposal is that if users or merchants are not entirely satisfied with the current state, it gives them a permissionless way to participate by contributing hash power and letting their preferences be reflected, which also has the collateral effect of increasing network security. With its 10 BPS, and soon probably 100 BPS (that is, more than 8.5 million blocks per day), Kaspa provides very frequent opportunities for such participation and signaling.

Another recurrent objection is the following: why have miners vote at all? Can't we regulate the block size simply based on throughput, for example? The problem is that throughput alone is not enough to inform capacity changes in any meaningful way. A naive throughput-based rule would for example increase capacity whenever a bunch of passionate users sets up a swarm of bots to artificially pump up the number of transactions, mistaking that for real market demand (while that's arguably rather a sign that capacity should be reduced).

A better indicator to use for a heuristic capacity update would be the actual security budget, that is the total amount of fees collected over a given period of time. An algorithm could for example attempt to maximize it using a gradient ascent update of the type: delta_capacity(t) = const * delta_total_fees(t-1) / delta_capacity(t-1).
The problem with such a mechanical approach is that it is completely backward-looking, and cannot take prospective information into account (say, the imminent launch of an L2), nor other intangible factors that affect the medium term health (and therefore profitability) of the network. The second problem is that it can be extremely noisy and slow to converge, especially if it starts very far away from the optimal value in a region that is mostly flat. While such an algorithm can very well be implemented as the default one in a mining software, I believe it’s better not to hard-code it into the protocol, but rather leave the miners free to override it with human knowledge when necessary. This is not because miners are necessarily better forecasters, but because a human-in-the-loop mechanism can incorporate off-chain signals, while conservative caps bound the damage from poor forecasts.
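As one possible default in mining software (a sketch only; the gain and cap are placeholders, and the operator can always override the suggested vote):

/// Gradient-ascent style default vote, following the rule above:
/// delta_capacity(t) = const * delta_total_fees(t-1) / delta_capacity(t-1).
fn suggest_delta_kb(
    prev_delta_kb: i64,   // capacity change applied in the previous epoch
    prev_fee_change: f64, // change in total fees observed over that epoch
    gain: f64,            // step-size constant
    delta_max_kb: i64,    // per-epoch cap, matching the protocol's clamp
) -> i64 {
    if prev_delta_kb == 0 {
        // Nothing to learn from yet: probe with a small step.
        return 1;
    }
    let step = gain * prev_fee_change / prev_delta_kb as f64;
    (step.round() as i64).clamp(-delta_max_kb, delta_max_kb)
}

Since the protocol would only take this as one vote among many and clamps the aggregate delta anyway, a noisy or poorly tuned default cannot move capacity quickly on its own.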

One might also wonder whether miners have any interest in voting a different value than their desired one, to act in a strategic way. This seems unlikely as classic results suggest that when using the median (or any other percentile) and assuming single-peaked preferences (which appears a reasonable approximation here), the dominant strategy for voters is to report the truth. This argument is strongest if preferences remain approximately single-peaked; deviations due to MEV or heterogeneous business models are possible, which again supports slow and bounded adjustments.

Of course it is possible that, especially around the optimal value, the voted capacity may oscillate from one epoch to the other, but that shouldn't be a problem if the maximum delta is sufficiently small.

Another potential concern is related to MEV, since a smaller capacity can generate a higher risk of MEV extraction. This is certainly worth discussing. The high parallelism of the network reduces some MEV opportunities by providing many inclusion paths rather than a single bottleneck, and this property will continue to improve as block rates increase toward 100 BPS. At the same time, tighter capacity can intensify competition for inclusion and ordering. Ongoing efforts to study MEV-reducing techniques by the Core team are therefore highly relevant in this context.

The proposal certainly shifts some power from the users and developers to the miners, which is not entirely without risks. However, I believe many worries can be alleviated by a suitable and conservative design that reduces oscillations, by focusing on the incentives and on the potential for a more secure and appealing network, and by considering that this could motivate additional participants to engage in mining, in accordance with the original Satoshi vision.

Relation with alternative solutions

The main proposal that is being discussed to boost the security budget of Kaspa is the LSZ (Lavi/Sattath/Zohar) monopolistic auction system.

While it certainly has merits, it represents a significant departure from the current fee market framework.

My main concern with it is that it eliminates the possibility of price discrimination (ability to pay more for a better quality of service, i.e. faster inclusion), which is a very nice property that Kaspa currently has and is valuable for UX and for extracting revenue and maximizing the security budget under heterogeneous urgency.

Implementation risks

The advantage of adaptive block sizes is that they can be initially implemented in a very conservative way, aggressively limiting the amount of capacity change that miners can vote for in a single epoch and maintaining reasonable floors and ceilings. This would leave the network initially very close to its current state, while giving time for users and miners to adapt to the idea and observe where it leads to, with the possibility to halt it at any time with little harm if any new considerations arise after having observed it in action.

https://research.kas.pa/t/adaptive-block-sizes/464#post_1 Mon, 16 Feb 2026 21:07:28 +0000 research.kas.pa-post-723
Random Linear Network Coding For Scalable BlockDAG

On redundancy for parents versus transactions: You are correct that parent hashes in Kaspa include indirect parents, often several hundred per block with higher concurrency under Crescendo. That is a major source of redundancy. To keep this focused I will treat headers and parents as propagated as usual for now and explain RLNC only for block data, meaning transactions. The same ideas can be extended to header factorization later.

What RLNC is in plain English for transactions:
RLNC turns a large, overlapping set of missing transaction bytes into a stream where any useful packet helps until you finish. Instead of asking peers for specific transactions which risks duplicates, stalls, and repeated request rounds, peers send small coded shreds that are linear combinations of the chunks you are missing. Your node keeps only shreds that add new information and drops the rest as soon as they arrive. Once you collect enough innovative shreds, you recover all original chunks at once and reassemble the transactions.

Why this helps Kaspa even when links are not very lossy:
It is primarily about coordination and duplication across many peers and many parallel blocks, not only about packet loss.

  • Cross block duplication becomes a single download. If the same transaction appears in multiple parallel blocks, RLNC treats it as one source item inside a small sliding window. You fetch it once and it satisfies all those blocks.
  • Any useful packet means no per tx choreography. With many peers, classic pulls waste bandwidth on duplicates and bookkeeping about who has which transaction. Under RLNC, whichever peer sends a coded shred, it is very likely useful until you finish.
  • Tail removal and endgame elimination. The last one to five percent of transactions often dominate wall clock time. RLNC adds a small planned overcode, typically five to ten percent, so you finish without hunting for specific transactions.
  • Partial peers still help. Even if a peer has only some of the window, its shreds are still useful. RLNC harvests partial availability automatically.

What exactly are the packets, for transactions only:

  • Build a sliding window of recent levels for IBD or catch up, for example four to eight levels.
  • Compute the node’s unique set of missing transactions across that window. Duplicates across blocks collapse to one.
  • Split each transaction into fixed size chunks, for example 512 to 1024 bytes. Across the window you get K chunks that are the sources.
  • A coded shred equals a tiny seed, for example four bytes, plus a payload that is the bytewise linear combination of those K chunks over GF(256). The seed deterministically expands through a public PRF to the coefficient row, and the receiver recomputes it for verification.
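A minimal sketch of that encoding step (the PRF here is a splitmix64-style expander used purely for illustration, not a proposed choice; GF(256) arithmetic uses the AES reduction polynomial, and all chunks are assumed to share the same fixed size):

/// Multiply two bytes in GF(2^8) with reduction polynomial x^8+x^4+x^3+x+1 (0x1B).
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut p = 0u8;
    for _ in 0..8 {
        if b & 1 != 0 {
            p ^= a;
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1B;
        }
        b >>= 1;
    }
    p
}

/// Stand-in for the public PRF: expand a small seed into K coefficients.
/// (A real scheme would derive these from an agreed PRF over window id,
/// stream id and shard index, so the receiver can recompute and verify them.)
fn coefficients_from_seed(seed: u32, k: usize) -> Vec<u8> {
    let mut state = seed as u64;
    (0..k)
        .map(|_| {
            // splitmix64 step
            state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
            let mut z = state;
            z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
            z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
            (z ^ (z >> 31)) as u8
        })
        .collect()
}

/// One coded shred: payload[j] = sum over i of c_i * chunk_i[j] in GF(256).
/// Assumes at least one chunk and equal chunk lengths.
fn encode_shred(seed: u32, chunks: &[Vec<u8>]) -> Vec<u8> {
    let coeffs = coefficients_from_seed(seed, chunks.len());
    let mut payload = vec![0u8; chunks[0].len()];
    for (chunk, &c) in chunks.iter().zip(&coeffs) {
        for (out, &byte) in payload.iter_mut().zip(chunk) {
            *out ^= gf_mul(c, byte);
        }
    }
    payload
}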

What the receiver does, and why memory remains bounded:

  • For each shred, regenerate coefficients from the seed, then run one elimination step.
  • If it increases rank, keep it. If it is dependent, drop it immediately. You do not buffer junk.
  • When rank reaches K, solve once, recover all chunks, reassemble transactions, and validate.
  • Memory is bounded at roughly K multiplied by chunk size for the innovative rows plus a small coefficient structure per active window, with strict caps and timeouts.
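A sketch of the "innovative or drop" bookkeeping just described, simplified to GF(2) so that a coefficient row fits in a u128 (K ≤ 128); the proposal itself uses GF(256), and a real decoder would apply the same row operations to the shred payloads so that reaching rank K lets it solve for all chunks:

/// Incremental rank tracking over GF(2): keep a shred only if its coefficient
/// row adds a new dimension; dependent rows are dropped immediately.
struct InnovationTracker {
    k: usize,
    rank: usize,
    basis: [u128; 128], // basis[i] = a stored row whose highest set bit is i (0 if empty)
}

impl InnovationTracker {
    fn new(k: usize) -> Self {
        assert!(k <= 128);
        Self { k, rank: 0, basis: [0; 128] }
    }

    /// One elimination step; returns true iff the shred increased the rank.
    fn offer(&mut self, mut coeffs: u128) -> bool {
        while coeffs != 0 {
            let lead = 127 - coeffs.leading_zeros() as usize;
            if self.basis[lead] == 0 {
                self.basis[lead] = coeffs; // innovative: keep the reduced row
                self.rank += 1;
                return true;
            }
            coeffs ^= self.basis[lead]; // cancel the leading bit and keep reducing
        }
        false // linearly dependent: nothing is buffered
    }

    /// Rank K reached: back-substitute once, recover chunks, reassemble txs.
    fn decodable(&self) -> bool {
        self.rank == self.k
    }
}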

Why RLNC is not only for lossy networks and fits inherent redundancy in Kaspa:
The main win is removing the coordination cost that comes from overlap.

  • In a multi peer pull, different peers often resend the same last few transactions. RLNC turns those overlaps into independent information, so they still push you toward completion.
  • The tail disappears because you do not need those specific last transactions. You need enough degrees of freedom. A small overcode in the range of five to ten percent provides that cushion with no per tx negative acknowledgments.

Scheme in practice:

  • Window selection on the receiver: choose recent levels, list the transactions you are missing, deduplicate, chunk, and count K.
  • Coefficient rule that is safe against DoS: each shred carries a seed. Coefficients derive from a public PRF of window id, stream id, and shard index. The receiver can dictate the stream id so peers cannot choose pathological schedules. Schedules can be prechecked for rank growth if desired.
  • Sending by peers: stream ratelessly about K times one plus epsilon coded shreds, with epsilon around five to ten percent. Multiple peers can send, no need to coordinate who has which transaction. A short systematic phase is optional on very clean links.
  • Receiving by the node: innovative or drop, stop at rank K, decode once, reassemble transactions, validate. Cache recovered transactions so later blocks do not re download them.
  • Bounds and fallback: keep a small fixed number of windows in flight, apply hard caps on rows and time. If progress stalls, request a small top up or fall back to per tx or per chunk for the last few, which is the same safety net used today.
  • Peer quality control: track an innovation ratio per peer, namely innovative over received. Peers that send mostly non innovative or invalid shreds are deprioritized or banned.
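For the peer quality control point, a small sketch of a per-peer innovation ratio (the sample-size and ratio thresholds are placeholders):

use std::collections::HashMap;

#[derive(Default)]
struct PeerStats {
    received: u64,   // coded shreds received from this peer
    innovative: u64, // shreds that actually increased the rank
}

#[derive(Default)]
struct PeerQuality {
    stats: HashMap<u64, PeerStats>, // keyed by peer id
}

impl PeerQuality {
    fn record(&mut self, peer: u64, was_innovative: bool) {
        let s = self.stats.entry(peer).or_default();
        s.received += 1;
        if was_innovative {
            s.innovative += 1;
        }
    }

    /// Deprioritize (or eventually ban) peers whose innovation ratio stays low
    /// once a minimum sample has been observed.
    fn should_deprioritize(&self, peer: u64) -> bool {
        match self.stats.get(&peer) {
            Some(s) if s.received >= 200 => {
                (s.innovative as f64) / (s.received as f64) < 0.2
            }
            _ => false,
        }
    }
}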

What this solves for Kaspa:

  • Cross block transaction duplication becomes a single download inside the window.
  • Multi peer overlap stops costing you since any helpful packet advances you.
  • The long tail and endgame vanish since a small overcode replaces rounds of requests for specific transactions.
  • IBD and catch up time becomes bandwidth proportional rather than RTT and coordination dominated.
  • Memory stays bounded using innovative only buffering with sliding window caps.
  • Partial peers still help without any per tx scheduling.

Non goals and expectations:

  • RLNC does not compress transactions and does not change consensus or validation. It is a transport and scheduling upgrade.
  • With a single perfect peer, zero loss, and an ideal scheduler, an oracle style unique pull can match or beat RLNC in bytes. RLNC earns its keep in Kaspa’s real conditions with parallel blocks, many peers, overlapping availability, and some loss.

TL;DR RLNC helps by streaming rateless coded shreds so progress isn’t gated by per-tx NACK/REQ exchanges; completion becomes closer to bandwidth-proportional rather than RTT-bound.

https://research.kas.pa/t/random-linear-network-coding-for-scalable-blockdag/429#post_6 Thu, 09 Oct 2025 01:38:57 +0000 research.kas.pa-post-696
Random Linear Network Coding For Scalable BlockDAG We can skip the Kaspa primers - for the record, parent hashes are not only direct parents but also indirect parents, which is why it's not "1-3" (that too, I think, is more like 7 since Crescendo), but rather several hundred hashes, and is a major part of the redundancy.

But let's just focus on block data (transactions) - assume for now that header data is propagated as usual.

The part I'm unfamiliar with is RLNC - I tried reading but I'm not sure how this works - it sounded to me like something that is good for lossy scenarios, not so much for inherent redundancy. I could be wrong - but I need you to explain it.

Please get into and explain (not code!) the scheme you are suggesting in practice, to the highest degree of detail, as well as what it aims to solve.

https://research.kas.pa/t/random-linear-network-coding-for-scalable-blockdag/429#post_5 Tue, 07 Oct 2025 21:54:24 +0000 research.kas.pa-post-695
Random Linear Network Coding For Scalable BlockDAG
FreshAir08:

Apologies for a belated response.

It's great to see such an initiative.

I realize part of this is background that I may be missing - but I wish for a more thorough explanation of the proposed scheme - for example, what exactly are the packets? Are these random sections of data, or do they correspond to something concrete (transactions/header hashes)?

Ignoring mempool, redundancy on Kaspa blockdag proper has two main sources:

  1. a transaction is often shared by several “parallel” blocks
  2. in the header, parents' hashes (direct and indirect) are shared and repeated by many, many blocks.

It has not been clear enough to me if this is what we are trying to optimize or something else. If that is it, please help us understand in detail what the reconstruction of a block looks like.

In practical terms - I believe the most substantial place where something like this could be employed is during IBD.

FreshAir08, thanks for reading my post.

A Kaspa block is conceptually similar to a Bitcoin block, but adapted for parallel DAG consensus. Each block encapsulates a set of transactions and references multiple parents instead of a single previous block. This enables high-throughput, low-latency block production while maintaining eventual ordering.

Basic block structure
Each block consists of:

  • Header core — version, timestamp, nonce, difficulty, and Merkle roots for transactions and accepted parents.
  • Parent hashes — a list of one or more parent block references (Kaspa typically 1–3).
  • Transaction list — user transactions selected by the miner.
  • Blue score / selected parent hash — values used in consensus to define ordering and cumulative work.
  • Subnetwork data (optional) — extra fields reserved for extensions like proof-of-stake, virtual programs, or L2 metadata.

In simplified pseudocode:

Block {
    Header {
        Version
        Timestamp
        Difficulty
        MerkleRootTx
        MerkleRootParents
        SelectedParentHash
        BlueScore
        Nonce
    }
    Parents[] = { hashA, hashB, ... }
    Transactions[] = { tx1, tx2, ... }
}

How it’s propagated
Unlike Bitcoin’s strictly linear chain, Kaspa nodes can receive and validate multiple blocks concurrently. Each block declares its set of parents, allowing it to “attach” to the DAG even before all tips are fully synced. This makes parallel IBD and asynchronous validation possible.

Relation to RLNC-based IBD
Because a Kaspa block’s payload (header + parents + txs) can be partially redundant with others in the same DAG layer, it’s a good candidate for coded propagation. My recent simulation demonstrates how Random Linear Network Coding (RLNC) could exploit this redundancy for faster and more resilient IBD.

Experiment repo: https://github.com/gordonmurray-coding/rlnc-kaspa-ibd-demo

The model factors the block payload into:

  • Header core (constant-size bytes)
  • Parent-hash items (shared among overlapping tips)
  • Transaction bytes (with configurable overlap probability)

By treating each of these as linear symbols over GF(2) or GF(256), RLNC achieves up to 40–50 % byte savings compared to naïve full-block transfer, while remaining loss-tolerant under high packet drop rates.

Key takeaway
Kaspa blocks are composable objects (header, parent references, and transactions) forming the vertices of a global DAG. Their partial redundancy between tips makes Kaspa a natural candidate for coded networking approaches like RLNC, which can reduce bandwidth and improve IBD convergence even under lossy conditions.

This is in reply to Maxim's question about DoS. I've created a MATLAB-based simulation to examine this issue: whether dependent or malformed coded packets could realistically cause a DoS-style stall in an RLNC-based IBD process.

The model emulates a Kaspa-like blockDAG with configurable:

  • Level width and parent fan-in
  • Sliding-window IBD with caching
  • Per-peer bandwidth, RTT, and packet loss
  • GF(2) and GF(256) fields
  • Rateless RLNC with feedback and repair batches

Each window compares three strategies:

  1. Full-block transfer (no deduplication)
  2. Unique pull (ideal de-dup baseline)
  3. RLNC coded download (loss-aware, adaptive)

Result summary (20 Mbps / 5 peers / 20 % loss):

  • RLNC reduced bytes by ≈ 40–50 % vs full-block IBD
  • Time savings ≈ 30–40 %
  • All windows reached full rank = K before timeout
  • GF(256) more stable under loss

[demo console output]

DoS feasibility
A malicious node could indeed craft dependent combinations to stall decoding, but this can be bounded and detected:

  • Rank-growth monitoring: reject peers whose packets stop increasing rank
  • Per-window repair limits: cap accepted coded packets to ≤ K (1 + ε + margin)
  • Deterministic coefficient seeds: tie each packet's coefficients to its hash/seed for verification
  • Rank-feedback gossip: allow peers to cross-validate decoding progress without exposing payloads

These mitigations keep memory bounded and prevent indefinite waiting on undecodable shards.
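To illustrate the per-window repair limit, the budget can be expressed as follows (ε and the margin are tunable placeholders):

/// Maximum number of coded packets a window is allowed to consume:
/// K * (1 + ε) plus a small fixed repair margin.
fn packet_budget(k: usize, epsilon: f64, margin: usize) -> usize {
    (k as f64 * (1.0 + epsilon)).ceil() as usize + margin
}

/// Accept another coded packet only while the window is within budget;
/// otherwise fall back to a per-transaction/per-chunk top-up request.
fn may_accept(accepted: usize, k: usize, epsilon: f64, margin: usize) -> bool {
    accepted < packet_budget(k, epsilon, margin)
}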

Next steps

  • Integrate rank-aware feedback into the simulation
  • Quantify computational cost of rank checks in live node context
  • Explore coded relay / partial sub-DAG propagation
  • Integrate with rusty-kaspa-simpa for modeling & simulation

Would appreciate feedback on practical validation overheads or other mitigation primitives that could coexist with Kaspa’s concurrency model.

https://research.kas.pa/t/random-linear-network-coding-for-scalable-blockdag/429#post_4 Tue, 07 Oct 2025 04:52:57 +0000 research.kas.pa-post-694
Random Linear Network Coding For Scalable BlockDAG Could a malicious actor perform a DoS attack by crafting blocks with intentionally flawed encodings?

For example, by generating shreds where the linear combinations are dependent or designed such that no shred (or combination of shreds) includes/reveals a specific critical piece of the block data.

This could cause nodes across the network to receive and store “almost complete” blocks in memory indefinitely while waiting for sufficient independent shreds to decode, but never actually succeeding. Repeating this for multiple blocks could exhaust memory resources network-wide.
Is this feasible under the proposed RLNC scheme? What mitigations could prevent it?

https://research.kas.pa/t/random-linear-network-coding-for-scalable-blockdag/429#post_3 Mon, 06 Oct 2025 08:38:04 +0000 research.kas.pa-post-692
Random Linear Network Coding For Scalable BlockDAG Apologies for a belated response.

It's great to see such an initiative.

I realize part of this is background that I may be missing - but I wish for a more thorough explanation of the proposed scheme - for example, what exactly are the packets? Are these random sections of data, or do they correspond to something concrete (transactions/header hashes)?

Ignoring mempool, redundancy on Kaspa blockdag proper has two main sources:

  1. a transaction is often shared by several “parallel” blocks
  2. in the header, parents' hashes (direct and indirect) are shared and repeated by many, many blocks.

It has not been clear enough to me if this is what we are trying to optimize or something else. If that is it, please help us understand in detail what the reconstruction of a block looks like.

In practical terms - I believe the most substantial place where something like this could be employed is during IBD.

https://research.kas.pa/t/random-linear-network-coding-for-scalable-blockdag/429#post_2 Sun, 05 Oct 2025 15:18:11 +0000 research.kas.pa-post-691
Pruning safety in the vProgs architecture In practice, your model turns “old history” into “recent proof obligations.” Any transaction that would otherwise fail because its scope dips below the pruning point can succeed by anchoring its logic to a recent zk proof of the account instead. The trade-off is larger proofs/witnesses, but much smaller storage requirements for nodes.

:small_blue_diamond: Example 1: Smart Contract with Old Transaction History

Suppose account A deployed a VProg (like a simple token contract) 10,000 blocks ago.
A new transaction x wants to read some intermediate balance changes that happened 9,500 blocks ago (well below the pruning horizon).

Without your pruning model:

  • Node tries to fetch the old transaction data.
  • If it’s pruned, the scope fails → transaction breaks.

With your pruning model:

  • Instead of reading the ancient balance updates, x can be reformulated as x′.
  • Provide witnesses from a more recent proof of A’s account state (e.g., a zk proof that the balance at block 9,800 already incorporates the old events).
  • Result: The node only needs to check recent proofs (within Δ=500 blocks), not store the whole 10,000-block history.

:small_blue_diamond: Example 2: Batch Payment Program

A VProg batches 1,000 payments into small chunks.
Some of those payments touch states 600 rounds old.
By the Δ=500 rule, those old vertices are discarded.

How it works:

  • Instead of providing witnesses for the ancient payment states, the transaction provides a witness to the account’s latest zk proof (say round 9,950).
  • The computation scope shrinks to the last Δ blocks, and the proof shows that all prior history is “absorbed” into the proven account state.

:small_blue_diamond: Example 3: DeFi Program with Reuse of Scopes

A lending VProg frequently rebuilds transaction scopes (e.g., when users roll over loans).
A scope references a state 400 blocks back.
A zk proof for that account was just refreshed 10 blocks ago.

Your Γ safeguard kicks in:

  • Because that vertex was “recently read,” it isn’t pruned for another Γ=50 rounds.
  • When the loan rollover transaction arrives, it can still reference that intermediate state without being forced to re-supply a long witness.

:small_blue_diamond: Example 4: Cold Account Wakes Up

Account B hasn’t been active in months. Its old vertices are far below pruning.
A new transaction wants to interact with B’s state.

With your model:

  • Transaction provides a zk proof for B’s state at a recent pruning boundary (e.g., “as of block 1,000,000, B had balance 50”).
  • No need to replay months of transactions — just prove membership of that final state.
  • Node discards all of B’s old vertices once Δ is exceeded.

VProg Scope with Pruning: Substitute Recent Proof


Top: Tx depends on pruned old state → breaks. Bottom: Tx uses a recent zk proof inside the pruning window → succeeds.

Δ / Γ Discard Rules for Vertices


Top: Discard vertices older than Δ rounds once a new proof exists. Bottom: If a vertex was recently read, preserve it temporarily for Γ rounds.

https://research.kas.pa/t/pruning-safety-in-the-vprogs-architecture/411#post_2 Fri, 26 Sep 2025 01:05:23 +0000 research.kas.pa-post-688
Random Linear Network Coding For Scalable BlockDAG About Me
I'm Gordon Murray, an independent researcher in the Kaspa community. My work focuses on improving Kaspa's networking layer, including publishing a whitepaper on RLNC for BlockDAG propagation (LinkedIn link). This post is a condensed version of that research, with additional ideas about enabling Kaspa to run over wireless mesh networks. The whitepaper was highlighted on my LinkedIn by Gilad Aviram, who encouraged me to share it here for discussion at the request of Michael Sutton. I should also note that I've been inspired by the pioneering work of Muriel Médard on network coding and the Optimum P2P project, which demonstrated RLNC's real-world benefits in Ethereum.

Proposal: Applying Random Linear Network Coding (RLNC) to Kaspa’s P2P Layer (Including Wireless Mesh Networks)


Executive Summary
Kaspa’s BlockDAG currently achieves ~10 blocks per second (BPS) and aims for 100+ BPS. While consensus (GHOSTDAG) is efficient, the network layer remains a bottleneck:

• Redundant gossip floods overlapping transactions.
• Multiple block bodies per second stress bandwidth.
• Global latency skew reduces blue-set inclusion fairness.

I propose integrating Random Linear Network Coding (RLNC) into Kaspa’s P2P layer. RLNC optimizes propagation by transmitting linear combinations of blocks/transactions, so nearly every received shard is innovative.

This directly benefits:
  • Block propagation (reduces redundancy)
  • IBD (Initial Block Download) (faster sync, fewer stalls)
  • Mempool sync (efficient transaction relay)
  • And critically, enables Kaspa to run over wireless mesh networks for maximum decentralization.


Layer Separation: RLNC is Consensus-Agnostic

┌─────────────────────────────────────┐
│ Application Layer                   │
├─────────────────────────────────────┤
│ Consensus Layer                     │
│ (Kaspa PoW: Mining + GHOSTDAG)      │
├─────────────────────────────────────┤
│ Network / P2P Layer                 │
│ ← RLNC OPERATES HERE →              │
├─────────────────────────────────────┤
│ Transport Layer                     │
└─────────────────────────────────────┘

RLNC optimizes the network layer. Consensus remains untouched: whether PoW or PoS, blocks must still propagate efficiently.


Current P2P Implementation (Rusty-Kaspa)
In kaspad/src/p2p/flows:
  • Block relay → flows/block/handle_incoming.rs
  • Transaction relay → flows/transaction/relay.rs
  • Initial Block Download → flows/ibd

All rely on gossip-based flooding: every block or transaction is propagated individually. This ensures delivery, but at high BPS causes:

  1. Redundant transmission of overlapping transactions.
  2. Bandwidth stress from concurrent block bodies.
  3. Latency skew → slower blocks earn less blue score, reducing fairness.

RLNC Benefits for Kaspa

1. Block Propagation
• Encode multiple concurrent blocks into shards.
• Peers forward different shards → receivers decode originals once rank is full.
• Reduces redundant gossip.

2. IBD (Initial Block Download)
• Sync peers send coded shards over block ranges.
• New nodes don’t stall on missing blocks.
• Faster historical sync.

3. Transaction Relay
• Batch encode mempool transactions.
• Even if wireless peers miss shards, decoding succeeds once enough arrive.


Kaspa over Wireless Mesh Networks

Why RLNC is Ideal for Mesh
Wireless mesh networks are lossy, variable-latency, and retransmission-expensive. RLNC was originally developed for these conditions (multicast, satellite, ad-hoc).

Mesh-Specific Benefits
  • Multicast efficiency → one radio transmission helps multiple neighbors.
  • Partition healing → shards rapidly fill in missing data after splits.
  • Resilience → innovative shards survive packet loss.
  • Energy efficiency → fewer retransmissions save battery-powered nodes.

To our knowledge, no high-throughput PoW BlockDAG has ever been adapted to wireless mesh. Prior experiments used Hyperledger Fabric or PoA Ethereum with low block rates. Kaspa + RLNC would be the first truly high-rate permissionless blockchain optimized for mesh networks.


Example: RLNC in Rusty-Kaspa

// protocol/src/messages/rlnc.rs
#[derive(Clone, Debug, Serialize, Deserialize)]
pub struct RlncShard {
    pub coefficients: Vec<u8>,  // Random GF(2^8) coefficients
    pub payload: Vec<u8>,       // Encoded block/tx data
    pub commitments: Vec<Hash>, // Merkle roots for integrity
}

Extend block relay (flows/block):

fn propagate_blocks_rlnc(blocks: Vec<Block>) {
    let shards = rlnc::encode(blocks, redundancy_factor);
    for (peer, shard) in peers.iter().zip(shards) {
        send_message(peer, Message::RlncShard(shard));
    }
}

On receiving side:

fn handle_rlnc_shard(shard: RlncShard) {
    decoder.add_shard(shard);
    if decoder.is_ready() {
        let blocks = decoder.decode();
        for block in blocks {
            validate_and_insert(block); // consensus unchanged
        }
    }
}

Quantitative Outlook (Whitepaper Results)
Formal model: RLNC improves finality when

3(1−η)τ > δ

where:
• η = efficiency factor (bandwidth savings)
• τ = network mixing time
• δ = decode overhead

Example from whitepaper:
• With W=10 blocks, ρ=0.6 overlap → η ≈ 0.46 (54% savings).
• τ ≈ 80ms → RLNC can tolerate δ up to ~130ms.
• Optimized decoding achieves δ < 5ms.

Thus RLNC yields 40–60% bandwidth savings and reduced ordering lag, improving blue-set fairness.


Deployment Strategy

  1. Phase I: Systematic RLNC for block bodies (send some uncoded chunks + coded shards).
  2. Phase II: Transaction batch gossip.
  3. Phase III: Multi-tip coding across sliding windows.
  4. Phase IV: Test with simpa under lossy/mesh conditions.
  5. Phase V: Opt-in rollout (--enable-rlnc).

This roadmap is backward-compatible and can begin in private relay overlays before public deployment.


Conclusion
RLNC directly addresses Kaspa’s network bottlenecks:
• Reduces redundant gossip traffic.
• Speeds up IBD and transaction relay.
• Improves blue-set inclusion fairness at high BPS.
• Enables Kaspa to run over wireless mesh networks, maximizing decentralization.

Bottom Line: RLNC is not just a bandwidth optimization — it is a novel enabler that could make Kaspa the first high-throughput PoW BlockDAG blockchain to run efficiently on mesh networks.


Call to Action
I invite Kaspa devs and researchers to:

  1. Review the proposed RlncShard message type and relay flow.
  2. Use simpa with lossy network parameters to emulate mesh conditions.
  3. Discuss feasibility, trade-offs, and next steps.

With RLNC, Kaspa can scale not just in datacenters but across community-built mesh deployments, making it censorship-resistant, resilient, and globally distributed.

https://research.kas.pa/t/random-linear-network-coding-for-scalable-blockdag/429#post_1 Sun, 21 Sep 2025 11:56:08 +0000 research.kas.pa-post-681
Data Availability Concerns Hi Gordon! Yes, gas is paid in iKAS, which is the native asset on Igra — same as ETH on Ethereum. iKAS is issued via the canonical bridge from Kaspa L1, so every iKAS is fully backed 1:1 by KAS locked on L1.

https://research.kas.pa/t/data-availability-concerns/375#post_7 Wed, 17 Sep 2025 07:41:28 +0000 research.kas.pa-post-675
Data Availability Concerns Hi Pavel. So gas is paid in Kas for Igra L2? I assumed the gas was paid in Igra. Edit: I read the documentation about the Kas to iKas bridge so I answered my own question.

https://research.kas.pa/t/data-availability-concerns/375#post_6 Wed, 17 Sep 2025 02:45:41 +0000 research.kas.pa-post-674
Data Availability Concerns Thanks for raising these points. Let me clarify how Igra addresses them:

  1. EVM Compatibility – we're using Reth with modifications for Kaspa's reorgs. Our data component handles L1 → L2 translation, while maintaining standard Ethereum RPC interfaces. Users interact normally through MetaMask, so everything else is compatible. In terms of known temporary limitations, there is a 20 kB limit for payloads.

  2. Reorgs. Kaspa’s frequent shallow reorgs are handled by following L1’s canonical chain. When L1 reorgs, L2 follows. We’ve implemented this in our Reth fork to handle Kaspa’s specific DAG structure.

  3. RPC Decentralization. Anyone can run an Igra node and provide RPC endpoints. There's no centralized relayer – reading from L1 and serving RPC requests are two independent processes.

Handling congestion is a very valid concern. It would be a good problem to have, though. :slight_smile:
L2 fees naturally flow to L1 miners since all sequencing happens on Kaspa. High-priority transactions pay higher L1 fees, maintaining proper incentives. In line with Fees and throughput regulation dynamics - #4 by hashdag

https://research.kas.pa/t/data-availability-concerns/375#post_5 Thu, 04 Sep 2025 17:55:08 +0000 research.kas.pa-post-665
Data Availability Concerns Recently, the Igra/Kasplex/Sparkle teams have been working on L2s to enable smart contract features on Kaspa. Meanwhile, everyone seems to be jumping from one question to another, losing focus on the priority issues, i.e. fees, node requirements, and the economic model.

This post is to show that this will not be just "another EVM chain replica". I am writing to organize all my questions and the difficulties I see in a structured way. I definitely welcome others to add more. Some were already mentioned by @FreshAir08: On the inherent tension between multileader consensus and inclusion-time proving - #3 by FreshAir08.


1. ETH Mempool/Txpool & Node Rules

When a TX nonce value is too large, or the gas fee is too low (e.g., during network congestion), ETH nodes cache the TX in the queue until conditions are met. Then the TX is executed.

Since the L2 lacks P2P consensus and relies solely on L1 sequencing, TX execution may differ across nodes. For instance, some nodes may cache transactions that others do not have and re-execute these TXs later in future blocks.

Specifically, nodes that have cached TXs with invalid nonce values will re-execute them once the conditions are met. Newly joined nodes cannot re-cache these TXs. It will then look as if "older nodes have more TXs," when in fact those should have been discarded.

ETH tried to solve this issue by switching this feature off.


2. L2 Node Reconstruction and Consistency

Based rollups mean that all TXs are sequenced and ordered by L1. The L2 functions solely as a layer for state storage and computation. In theory, L1 and L2 logic and consensus should be aligned. In that case, ideally, as long as the data sources are identical, all L2 data, state, and TXs can simply be rebuilt from any L1 archive.

By the way, this is also an overlooked feature of the KRC20 protocol. KRC20 needs no compatibility layer; KRC20 doesn't generate any blocks. A TX is an execution. Every TX generates a checkpoint. KRC20 blocks = L1 blocks. It is the most "based".

For EVM chains, in contrast, things are way more complicated. EVM L2s produce blocks by themselves, obviously. The biggest issue of typical EVM L2 designs is that TXs are NOT executed from L1 directly. It needs a LOT of compatibility work, including block binding, TX execution rules, mempool rules, etc. As a result, any two L2 nodes may diverge in transaction ordering due to caching (as mentioned above) or deep reorgs. Beyond the opcodes themselves, EVM chains need to incorporate the various EIPs, gas fee mechanisms, and other protocols. All of these live in the ETH clients and are very difficult to change. I don't know how the Igra/Kasplex teams will manage this problem.

One solution (maybe) is to support a multiple-supernodes mode: this assures that at least the supernodes have the same results when executing against the same L1 data.


3. Additional Thoughts

  • Reorg handling: When an L1 reorg occurs (this is a Kaspa-specific issue), the usual resolution is to determine the affected VSPC blocks, and then the L2 must roll back the affected TXs/state and re-execute them based on the new L1 data. This would require significant modification of the EVM's core components if the EVM L2 is on Kaspa.
  • To be attractive to existing EVM devs and dapps, the L2 must achieve close-to-complete compatibility. Just allowing programmability with Solidity is not good enough. Compatibility needs to extend across the infrastructure and protocol layers.
  • Wallet compatibility: Imo the EVM ecosystem is too heavy, but for the reasons above, if Kaspa makes an EVM L2, it must try its best to be as compatible as possible. Seamless wallet integration is required.
  • RPC Compatibility: Since TXs are submitted to L1 (if the L2 is based), read-only RPCs (on-chain data queries) can use existing EVM components directly. However, writing (transaction broadcasting) must be submitted to L1. There are maybe two possible solutions:

(a) Wallets directly construct and submit L1 transactions. The bad news is EVM wallets cannot be used then.

(b) Modify RPC to intercept broadcast calls and relay them to L1 (“relayer” compatible with EVM wallets). This is against decentralization.

  • L2 Explorer: L2 explorers must display the mapping between L2 blocks/transactions and L1 data, which is still not provided by either Kasplex or Igra.

------------ NON EVM PARTS ------------

I have a sense that Kaspa people don't really know much about ETH clients. Imo, Kasplex's design is more trustless and more challenging by design. It seems to be based, and it is consistent with their design of KRC20. Igra's plan is not as based, although their transparent communication and updates are highly appreciated.

The actual challenge is not the question "can they build an L2 that can run". The real questions concern boundary conditions, such as DA cost. A good question would be: under 10 BPS, if the L2 uses a lot of calldata (not memory or blobs), does the L2 have a strategy to manage potential congestion and bloat on L1 (fees would then rise)? That's quite similar to what happened during the KRC20 launch. Unfortunately, such questions are rare; we only see buzzwords like "decentralized nodes", "trust model", "pre-zk" and so on.

https://research.kas.pa/t/data-availability-concerns/375#post_4 Mon, 01 Sep 2025 17:13:56 +0000 research.kas.pa-post-662
Pruning safety in the vProgs architecture For simplicity, below I assume L1 has full access to the Computation Dag structure, and can calculate the corresponding scopes of all VProgs. I will not concern myself much with how this is done. This assumption might eventually be relaxed using an enshrined L2, or just pushing the issue to each L2 individually.

Pruning in Kaspa consists of throwing away transaction data. If a transaction scope requires executing transaction data from below the pruning point, it could fail or succeed depending on the real-time performance of the node, and it is clear such a transaction should not be allowed. I will argue below that the effects of disallowing this on functionality are not crucial:

For a potential transaction x, let W_x denote the set of vertices created by x, let F_x denote the set of witnesses provided to Prog_v, and let P_x denote all the paths from W_x to a vertex in F_x. Finally, let Y denote the set of vertices which are currently proven.

Observation: let C\subseteq Y be a covering of all paths in P_x. Then there is an equivalent transaction x' creating the same set of vertices W_x, which provides the witnesses C instead, and introduces a scope of computation contained in the one originally introduced by x.

Remark: there is no guarantee that the number of witnesses (i.e. the size of C) provided by x' will be smaller than in x, and it can in fact be bigger. As a sidenote, it may be beneficial in general to determine heuristics to find a set which does reasonably well for both the "width" of the cut and the scope size.

The above observation implies that in order to obtain the same functionality as a transaction x, one can equivalently provide witnesses for "later" vertices, so that the scope will be contained within the pruning period. This however requires these later vertices to be "proven", but all this requires is that a recent proof of the corresponding account exists up to this pruning period. It is a reasonable requirement to expect that VProgs aiming to be composed with recent proofs will provide proofs within the latest pruning period. It is remarked that under our design, a prover can always prove old data even after an extended period of inactivity (see On the design of based ZK rollups over Kaspa's UTXO-based DAG consensus).

Correspondingly, the data of a vertex in the computation Dag lying below the pruning point can be discarded. Pruning of the Dag structure is ignored for now, but as an afterthought we could prune it as well and only store membership proofs of its pre-pruned structure.

Discarding long-proven account data

While the above limits the number of vertices whose data needs to be stored, storing intermediary states for the entire pruning period remains a big load. For this reason we should also introduce regular pruning of vertices lying far below a recent zk proof.
Specifically, extend the definition of a vertex to (a,t,T), with the last entry denoting at which block number the transaction was accepted, i.e. its “round”.
Let then v=(a,t,T) be a vertex, and \pi_a be the latest proof submitted for the account a; then v should be discarded when T < T'-\Delta (\Delta=500 for example), where T' denotes the "round" of the latest transaction proven by \pi_a.
This scheme is logical because:

a) The more time passes after it, the less likely it is that a vertex will need to be read.
b) if such a vertex absolutely must be read, witnesses for it could still be provided via the intermediary states commitment on \pi_a. I.e. the functionality remains possible, but merely requires including witnesses again.

There is a case for including another condition - that if a vertex was recently read, it will not be discarded for \Gamma (=50) rounds. The logic being that transaction scopes will sometimes be rebuilt in parts, hence erasing them immediately would be detrimental. This however further complicates the objective inference of a transaction scope.

*The numbers 50, 500 etc. are pure placeholders.
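Putting the \Delta and \Gamma conditions together, a minimal sketch (the constants are exactly those placeholders; field and function names are illustrative):

const DELTA: u64 = 500; // rounds below the latest proof before a vertex may be discarded
const GAMMA: u64 = 50;  // grace period for recently read vertices

/// A vertex (a, t, T); the payload t is omitted here for brevity.
struct Vertex {
    account: u64,                // a
    accepted_round: u64,         // T: round at which the creating txn was accepted
    last_read_round: Option<u64>,
}

/// `proven_round` is T', the round of the latest transaction proven by the
/// account's most recent proof \pi_a; `now` is the current round.
fn should_discard(v: &Vertex, proven_round: u64, now: u64) -> bool {
    let old_enough = v.accepted_round + DELTA < proven_round; // T < T' - Δ
    let recently_read = v
        .last_read_round
        .map_or(false, |r| r + GAMMA >= now); // keep for Γ more rounds after a read
    old_enough && !recently_read
}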

https://research.kas.pa/t/pruning-safety-in-the-vprogs-architecture/411#post_1 Mon, 25 Aug 2025 16:19:47 +0000 research.kas.pa-post-653
A Basic Framework For Proofs Stitching
  • It is the responsibility of the transaction stitcher to ensure the correct sourcing of transactions. I assumed the intermediaries are easily derivable, and a merkle tree within the commitment is indeed a natural choice for that. I consider this important discussion to be part of point 3.

  • if it knows by independent execution that the transaction succeeded, or even if it trusts others attesting to that (and to the result of the transaction), it can proceed processing locally. I previously indeed thought of the latter scenario as an optimistic case (prover network), but now with the VProg architecture I tend to the former as the standard, which I think you are more comfortable with?

  • The way I saw it, the "settled state" commitment should not move until it has been fully stitched, because an outside observer has no way of verifying the state progression until then. Of course there are also other commitments for conditional "segments" already partially proven. These kinds of things form a metastate enveloping the settled state, but "rolling back" the settled state shouldn't be hard if it has never actually moved.

  • If I understood correctly, this is the way I imagined things work, and correspondingly I am not sure I'm bothered by this. Regardless, I believe we must first decide what we believe is the correct solution and then find ways to bridge the gap between existing tools and theory.

  • i.e. offchain stitching. This might make certain aspects simpler but loses the autonomy of provers to prove only their own segments, at their own pace, and makes them very codependent on each other. Depending on how you imagine it (it was not clear enough to me exactly), it might also force architectural changes to the UTXO covenants as they are. If you do not move the "settled" state, there is no real difference to me in how to deal with rollbacks.

https://research.kas.pa/t/a-basic-framework-for-proofs-stitching/323#post_3 Sun, 24 Aug 2025 16:15:41 +0000 research.kas.pa-post-652
    # On transaction scopes and the visibility of the object DAG

    Prerequisites: The definition and intuition of a transaction's scope from the point of view of a vprog, roughly: "the historical dependency subgraph a vprog must execute to validate a new transaction". (see previous post)


    Why do txn scopes require special treatment?

    In general, when a composing transaction interacts with a vprog, the vprog must be able to check its validity not only from its own perspective but also from that of other vprogs it interacts with. Typically, this validity check is a special case of the broader requirement that a vprog must maintain independence by executing any transaction that creates direct or indirect dependencies on its state, and that transactions must gas‑pay for this.


    A preface note on the importance of gas-payment

    Gas‑payment refers to the metering of a transaction’s resource consumption, simply the definition of its resource consumption. This is distinct from the actual transaction fee mechanism in KAS (clarification of this - in a separate post).

    A transaction causing a vprog to consume resources without gas-paying for it is not merely an economic problem; rather, it indicates a design flaw. It means vprog sovereignty and independence are violated. Therefore, the critical reader of our vprog design should look for bugs of the form "here's a scenario where vprog vp_i was compelled to execute computation or allocate storage without being gas-paid for it," or otherwise be convinced the design is sound. I argue this condition is both necessary and sufficient.


    Cross‑vprog scope independence

    In our design, a transaction tx_1 gas-pays vp_j for the computation of scope(tx_1, vp_j). See Michael's post linked above. Yet it is not immediately clear whether the design guarantees that either (1) vp_i does not need to compute scope(tx_1, vp_j), or (2) vp_i is gas-paid for it. Indeed, tx_1 could in principle be invalid due to insufficient gas funding for its vp_j-scope, which (e.g. due to atomicity) must invalidate it also from the perspective of vp_i. Does this imply a bug?

    To ensure soundness, we design for condition (1), by making a transaction’s vp_j‑scope fully determined from the object DAG’s topology, itself determined from transaction declarations. This follows from the definition of scopes. But what guarantees that the object DAG’s topology is unambiguous? We need to ensure that if a transaction declares a read and a write, the implied vertices always materialize and add to the object DAG as expected.


    Read failures

    Read operations create no dependency, so they can fail only if the transaction aborted, was gas‑drained, or simply did not include the read command. To prevent such scenarios, we enforce a meta‑instruction that fetches and reads all declared read accounts. Thus, a vprog computes a txn’s scope and reads its read-declared accounts before it considers the txn’s actual instructions.

    What happens if this meta-instruction is not sufficiently gas-funded by the txn? Such a scenario must be easily discernible from txn + object DAG metadata, and indeed it is, by using gas-commitments: The txn specifies in metadata its local gas and scope gas, namely, the max amount of gas its own computation consumes, and the max amount of gas its scope computation consumes from a vprog it writes to. Checking if the txn sufficiently funds all scope computations is now immediate: Traverse its scope wrt any involved (written to) vprog, add up all the local-gas fields, and if the sum is equal to or less than the txn's scope-gas commitment, the txn is sufficiently funded.
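    A sketch of that check, with hypothetical names (the scope traversal itself is assumed to be provided by the object DAG):

    struct TxnMeta {
        local_gas: u64, // max gas the txn's own computation consumes
        scope_gas: u64, // max gas its scope computation may consume from a written-to vprog
    }

    /// `scope` is the txn's scope wrt one written-to vprog, i.e. the set of
    /// transactions inferred from the object DAG topology.
    fn scope_sufficiently_funded(txn: &TxnMeta, scope: &[TxnMeta]) -> bool {
        let total: u64 = scope.iter().map(|t| t.local_gas).sum();
        total <= txn.scope_gas
    }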

    Write failures

    In prvs design posts we referred to the “state vertex of an account” as the output of the latest txn declaring a write to that account, the maximal vertex in the object DAG with this account ID. This definition is unambiguous from the pov of the vprog owning the account, but could be ambiguous from the perspective of other vprogs that were not required to execute it.

    If the txn requires both read and write access then the challenge is addressable via the same construction that disambiguated read success. Namely, the txn is preceded with a meta-instruction to read all of the declared account data, including the to-be-written account. We then define that the txn creates a new state vertex for this account regardless of whether its write succeeded or failed; a write failure reduces to the new account state vertex holding the same value as its predecessor.
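    A tiny sketch of this carry-forward rule (hypothetical names StateVertex, apply_write; the only point is that a declared write always yields a new vertex):

```python
# Sketch of the carry-forward rule; StateVertex and the `execute` callback are
# illustrative assumptions, not the design's actual data model.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StateVertex:
    account_id: str
    seq: int       # position of this vertex in the account's write history
    value: bytes   # account data at this vertex


def apply_write(prev: StateVertex, execute: Callable[[bytes], Optional[bytes]]) -> StateVertex:
    """A txn declaring a write always materializes a new state vertex.
    If execution fails (returns None), the new vertex simply repeats the
    predecessor's value, so the DAG topology stays unambiguous."""
    new_value = execute(prev.value)
    return StateVertex(prev.account_id, prev.seq + 1, new_value if new_value is not None else prev.value)


# A failed write (execute returns None) carries the old value forward.
assert apply_write(StateVertex("acct", 7, b"old"), lambda _v: None).value == b"old"
```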


    How to treat write-only access

    How should write‑only declarations be treated? Michael et al. suggest forbidding them: all writable accounts must also be declared readable (w(x) ⊆ r(x)). Their rationale is that there are only a handful of cases where a txn will want to override the entire data inside an account regardless of its prvs data.

    The classic example would be an oracle updating a pair’s price without any dependence on its prvs value. Assuming the oracle’s txn indeed overrides the entire account data, it does not require any read access. With their proposed forcing rule, txns that read the oracle account will need to commit to a large scope: If the oracle updated the account 100 times within one (validity proof) epoch, txns that read oracle data (and write to other vprogs) will need to commit to a scope 100 x larger than needed.

    Their counterargument is that oracle-like txns, and specifically those that override the entire account data, are rare and do not justify complicating the design. Such real world considerations are nauseating and should be looked down upon. Instead, I argue we can add an optional field to account ID, representing its sequence index (the nth write to it). This allows write‑only txns to reduce scope size. By extension, txns that read such accounts, too, can reduce their scope by declaring not only the plain account ID they wish to read from but also its sequencing index field (or a lower bound over it!). The txn issuer’s wallet can easily discern this (lower bound on the) index, but other vprogs need not trust it: vprogs execute the reduced txn scope, and in case of a write-failure the txn’s value is considered null by all vprogs (including the originating one). This “concession” allows third party vprogs to trustlessly process these txns with their reduced scopes. Upon interest I will elaborate on the soundness of this proposal, though something tells me the demand for this would be somewhat limited in scope.
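    To make the sequence-index idea a bit more concrete, here is a rough sketch; AccountRef, min_seq and the helper below are hypothetical names used for illustration, not part of the proposal’s spec:

```python
# Rough sketch; AccountRef and its min_seq lower bound are hypothetical names
# for the optional sequence-index field discussed above.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AccountRef:
    account_id: str
    min_seq: Optional[int] = None   # declared lower bound on the nth write, if any


def scope_anchor_seq(ref: AccountRef, latest_seq: int) -> int:
    """Returns the sequence index at which a reader's scope traversal may stop.
    Without a declared bound the reader must cover the account's full history
    within the epoch; with it, only writes at or above min_seq are in scope."""
    if ref.min_seq is None:
        return 0                      # traverse back to the epoch boundary
    return min(ref.min_seq, latest_seq)


# An oracle account written 100 times in the epoch: declaring min_seq=100
# lets a reading txn commit to a scope of one write instead of all 100.
assert scope_anchor_seq(AccountRef("oracle/KAS-USD", min_seq=100), latest_seq=100) == 100
assert scope_anchor_seq(AccountRef("oracle/KAS-USD"), latest_seq=100) == 0
```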


    Motivation to enshrine the object DAG in L1

    The design ensures txn scopes are inferable from the object DAG topology plus metadata.

    In principle, this assertion satisfies the soundness requirement. In practice, however, inferring the object DAG structure can be complicated, eg by txns that were legally mined yet whose gas-commitments were insufficient to fund their scopes across all involved vprogs.

    Therefore, we are currently leaning towards enshrining the object DAG in L1 consensus, by labeling txns with insufficient gas commitments as off-DAG (non-members of the object DAG), and maintaining the object DAG’s structure in consensus (@michaelsutton @FreshAir08 I think it is not necessary to enforce this on the block or merging block level; the UX will anyways be complicated due to multileaderness).

    While such a route will require additional metadata processing, it can be shown to keep L1 agnostic to txn execution and txn (non meta) data, and to incur a negligible overhead to consensus nodes. Nodes will only need to monitor the skeleton of the object DAG, via read-write declarations and gas commitments, and vprogs will be able to rely on L1’s verified data structure in their covenant proofs.


    End Notes

    • “vprog” technically refers to the stateless logic owning a set of accounts. It is frequently used also as an intuitive shorthand for stateful nodes of the vprog, nodes that choose to continuously monitor its total state.

    • Our architecture is inspired by Solana, and also Sui, in that txns statically declare dependencies. We ruled out Aptos-like dynamic dependency detection, precisely because txn dependencies, or scopes, are then not discernible without full execution of all vprog txns. This compromises vprog sovereignty and state growth independence, and reduces the system to one fat L2.

    • Unlike Solana’s batch treatment, we assume an underlying sequencer that prioritizes txns in order. This L1-based ordering prevents deadlocks and cyclic dependencies. Understanding how our “linear” design is still parallelism-friendly is a topic for another day.

    • Also unlike Solana, our design represents account states in a UTXO-like structure. In particular, token transfers, which in Solana are declared writable only, would require read-and-write declarations in our system.

    • @FreshAir08 assisted with much of this post but I don’t know that he’s particularly proud of it.

    ]]>
    https://research.kas.pa/t/on-transaction-scopes-and-the-visibility-of-the-object-dag/410#post_1 Wed, 20 Aug 2025 20:21:11 +0000 research.kas.pa-post-651
    Zoom-in: A formal backbone model for the vProg computation DAG Following offline discussions, I have fixed the definition of a scope from using the past relation to using the routable definition (see above). Essentially, using the past relation was wrong here, as seen in the example above.

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_8 Tue, 19 Aug 2025 11:09:19 +0000 research.kas.pa-post-649
    Zoom-in: A formal backbone model for the vProg computation DAG Below is a short extension to the above regarding the gas design as I currently see it. I focused on the execution aspect for now though parts of the discussion may apply to other resources such as permanent/transient storage.


    Gas Scaling Factor

    • GS(i) denotes the gas scale associated with VProg i.
      This reflects differences in proving environments across VProgs: distinct proving stacks may vary substantially in cost per unit of computation. Further, individual VProgs may wish to impose differing caps on their total proof resource usage. For simplicity, I will assume this scaling factor is transparent to L1.

    Proof Gas Vector

    • PG(x) is the (unscaled) proof gas vector for transaction x.
      This is a sparse vector indexed by vProgs i such that x writes to i, where each entry PGᵢ(x) denotes an upper bound on the prover-side cost of including x in the execution trace of VProg i.

    The intended role of PG(x) is:

    1. To support L1-enforced caps on the total proving load submitted per block for a VProg (adjusted by the gas scale; a small sketch follows this list)
    2. To enable reimbursement of provers within L2s, compensating them for their contributions according to the actual proving burden.
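    A small illustration of role (1): checking a per-block, per-VProg cap on the scaled proving load. The names and the cap structure are assumptions made for the sketch:

```python
# Illustrative check for role (1); PG, GS and the per-VProg caps are
# assumptions used to show the arithmetic, not a concrete spec.
def block_respects_proof_caps(
    block_txn_pgs: list[dict[str, int]],   # per txn: sparse map vprog_id -> unscaled PG_i(x)
    gas_scale: dict[str, int],             # GS(i) per VProg
    proof_cap: dict[str, int],             # per-block proving-load cap per VProg
) -> bool:
    load: dict[str, int] = {}
    for pg in block_txn_pgs:
        for vprog_id, unscaled in pg.items():
            load[vprog_id] = load.get(vprog_id, 0) + unscaled * gas_scale[vprog_id]
    return all(total <= proof_cap.get(v, 0) for v, total in load.items())


# Two txns writing to VProg "a" (GS=2): scaled load 2*(10+30)=80 <= cap of 100.
assert block_respects_proof_caps([{"a": 10}, {"a": 30}], gas_scale={"a": 2}, proof_cap={"a": 100})
```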

    Previously we assumed the raw computation per tx consumed by a VProg (“logic zone” in older jargon) was in direct proportion to the proof costs and hence the two aspects could be capped as one. This is no longer the case, as we now expect VProg provers (and VProg users) to execute but not prove the transactions in their missing scope, which could result in substantial overhead which cannot be disregarded.


    Execution Mass

    To retain execution feasibility in light of the above, we must separately constrain the execution load.

    • EM(x) - the execution mass vector of transaction x, with each coordinate EM_v(x) representing the execution burden of x from the perspective of VProg v.

    This execution mass of x is defined as:

    EM_v(x) = \sum_{y \in scope(x,v)} \sum_i PG_i(y) \cdot GS(i) + \sum_i PG_i(x) \cdot GS(i)

    That is, a follower of v must execute all transactions in the scope of x (as needed to compute x) as well as x itself.

    This expression defines a static upper bound on the compute burden placed on v, derivable solely from the DAG structure and the declared PG values, which can be inferred without execution of the transactions.

    If L1 is made aware of the DAG structure, then EM(x) can be deterministically computed by miners, similar to transaction masses today. Otherwise, transaction issuers should include within the transaction an upper bound for the execution mass commitment (per VProg), where a transaction is only considered “valid” if the commitment exceeds the true value on all coordinates. Validity could be determined by a dedicated enshrined VProg aware of the DAG structure, for example. In either case, it is L1 that enforces a cap on the masses (or their commitments).
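    A sketch of the EM_v(x) formula above, taking the scope as a precomputed input (in practice derived from the DAG structure); all names here are illustrative:

```python
# Sketch of EM_v(x); the scope is passed in precomputed and all names are assumptions.
def scaled_pg(pg: dict[str, int], gas_scale: dict[str, int]) -> int:
    # sum_i PG_i(.) * GS(i)
    return sum(unscaled * gas_scale[i] for i, unscaled in pg.items())


def execution_mass(
    tx_pg: dict[str, int],             # PG(x): sparse vector, vprog id -> unscaled proof gas
    scope_pgs: list[dict[str, int]],   # PG(y) for every y in scope(x, v)
    gas_scale: dict[str, int],         # GS(i)
) -> int:
    """EM_v(x) = sum_{y in scope(x,v)} sum_i PG_i(y)*GS(i) + sum_i PG_i(x)*GS(i)."""
    return sum(scaled_pg(pg, gas_scale) for pg in scope_pgs) + scaled_pg(tx_pg, gas_scale)


# x writes to VProg "v" (GS=3); its scope w.r.t. v holds one txn that wrote to "u" (GS=2).
assert execution_mass({"v": 5}, [{"u": 4}], {"v": 3, "u": 2}) == 4 * 2 + 5 * 3
```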


    Potential Issue: DoS via Proof Gas Overestimation

    To best illustrate this issue, consider a vProg v that is rarely written to, but frequently read from. A malicious transaction that writes to v with a vastly overstated PG_v will inflate EM_u(x) for the “next” transaction on all reading VProgs u, degrading execution performance.
    While this impact is limited to circa one transaction per u, it is cheap and can be repeated indefinitely if unconsumed gas is returned to the user. Some mechanism for discouraging overestimation may be necessary.

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_7 Mon, 18 Aug 2025 16:01:18 +0000 research.kas.pa-post-648
    Zoom-in: A formal backbone model for the vProg computation DAG There’s still some mismatch between text and drawing but the general point is understood – that the past of vp contains elements on which it is stateless (here B_{t-3},B_{t-4}).

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_6 Mon, 18 Aug 2025 10:18:07 +0000 research.kas.pa-post-646
    Zoom-in: A formal backbone model for the vProg computation DAG
    hashdag:

    Why did x_{t-2} enter the past of p?

    Edited to clarify the point.

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_5 Sun, 17 Aug 2025 19:27:49 +0000 research.kas.pa-post-645
    Zoom-in: A formal backbone model for the vProg computation DAG Why is

    Why did x_{t-2} enter the past of p?

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_4 Sun, 17 Aug 2025 19:04:43 +0000 research.kas.pa-post-644
    Zoom-in: A formal backbone model for the vProg computation DAG
    hashdag:

    Pls provide example explaining how defining scope as Past(R(x_t)) \setminus Past(p_{t-1}) does not hold. I’m using p_{t-1} as notation for the past of the prog’s state at time {t-1}

    There are two arguments as to why the pure past diff Past(R(x_t)) \setminus Past(p_{t-1}) does not suffice (related to the two terms in the anchor set).

    1. witnesses provided by the txn can cut the past diff thus reducing the scope size (ie a witness can be provided to a point way above Past(p_{t-1}))

    2. precisely because of the case above, being within Past(p_{t}) does not mean the transaction was actually computed by the prog, hence state data might be locally unknown for some vertices in this past (think of transactions within the past diff at time t but below the aforementioned cut).


    The following case illustrates both points:

    • Prog p owns account A. Accounts B, C are foreign
    • A_{t-5} is the maximal vertex for account A at time t-1
    • Txn x_t reads from B_{t-1}, A_{t-5} and writes to A_t. It provides a witness directly to B_{t-1}
    • Txn x_{t-2} read from B_{t-3} and wrote to C_{t-2} and B_{t-1}
    • Txn x_{t-3} read from B_{t-4} and wrote to B_{t-3}
    • Txn x_{t+1} reads from C_{t-2} and writes to A_{t+1}. It provides a witness to B_{t-4}.

    At time t, a witness was provided to B_{t-1}, so txns x_{t-2}, x_{t-3} were not executed by p, but entered its past.

    At time t+1, a witness is provided to B_{t-4}, so p will need to execute x_{t-3}, x_{t-2} in order to compute C_{t-2} (despite being in its past already).

    Edit: B_{t-1} should be technically named B_{t-2} to follow the correct indexing convention.

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_3 Sun, 17 Aug 2025 14:22:47 +0000 research.kas.pa-post-643
    Zoom-in: A formal backbone model for the vProg computation DAG Nice writeup!

    Pls provide example explaining how defining scope as Past(R(x_t)) minus Past(p_{t-1}) does not hold. I’m using p_{t-1} as notation for the past of the prog’s state at time t-1

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_2 Sun, 17 Aug 2025 12:56:48 +0000 research.kas.pa-post-642
    Zoom-in: A formal backbone model for the vProg computation DAG Context/broader-outlook: vprog proposal.

    The following presents a minimal formal model for the computational cost of synchronous composability for verifiable programs (vProgs) operating on a shared sequencer. The model defines a “computational scope”—the historical dependency subgraph a vProg must execute to validate a new transaction. The model shows that frequent zero-knowledge proof (ZKP) submissions act as a history compression mechanism, reducing this scope and thereby the cost of cross-vProg interaction.

    1. The Model

    Consider a system of vProgs interacting through a global, ordered sequence of operations.

    • vProg & Accounts: A verifiable program, p, is a state transition function combined with a set of accounts it exclusively owns, S_p. Only p has write-access to any account a \in S_p.

    • Operation sequence (T): The system’s evolution is defined by a global sequence of operations, T = \langle op_1, op_2, \dots \rangle, provided by a shared sequencer. An operation is either a Transaction (x) or a ZK-Proof Submission (z).

    • Transaction (x): An operation describing an intended state transition. It is defined by a tuple containing:

      • r(x): The declared read-set of account IDs.
      • w(x): The write-set of account IDs, with the constraint that w(x) \subseteq r(x).
      • \pi(x): The witness set. A (potentially empty) set of witnesses, where each proves the state of an account a at a specific time t'. These are provided to resolve the transaction’s dependency scope and are independent of the direct read-set r(x).
    • ZK-Proof submission & state commitments:

      • A proof object, z_p^i, submitted by vProg p, attests to the validity of its state transitions up to a time t. It contains a state commitment, C_p^t.
      • Structure: This commitment C_p^t is a Merkle root over the sequence of per-step state roots created by p since its last proof z_p^{i-1} (e.g., from time j to t=j+k).
      • Implication: This hierarchical structure is crucial because it allows for the creation of concise witnesses for the state of any account owned by p at any intermediate time between proof submissions.
    • Execution rule: Any vProg p that owns an account in w(x) must execute transaction x.

    • Remark on composability: This execution rule enables two primary forms of composable transactions:

      1. Single-writer composability: The write-set w(x) is owned by a single vProg, but the read-set r(x) includes accounts from other vProgs.
      2. Multi-writer composability: The write-set w(x) contains accounts owned by two or more vProgs. This implies a more complex interaction, such as a cross-vProg function call, and requires all owning vProgs to execute the transaction per the rule above.

    2. The Computation DAG

    The flow of state dependencies is modeled using a Computation DAG G = (V, E), whose structure is determined dynamically by the global sequence of transaction declarations.

    • Vertices (V): A vertex v = (a, t) represents the state of account a at time t. The state data for a vertex is not computed globally; instead, it is computed and stored locally by a vProg only when required by a scope execution.

    • Edges (E): The set E contains the system’s transactions. Each transaction x_t acts as a hyperedge, connecting a set of input vertices (its reads) to a set of output vertices (its writes).

    • Graph construction: When a transaction x_t appears in the sequence:

      1. New vertices are created for each account in its write-set, \{v_{a, t} \mid a \in w(x_t)\}.
      2. The transaction x_t itself is added to E, forming a dependency from the set of vertices it reads from to the set of new vertices it creates.
    • State compression: When a proof z_p^i appears in the sequence, it logically compresses the dependency history for all accounts in S_p. This creates a new, trustless anchor point in the DAG that other vProgs can reference. The historical vertices are now considered eligible for physical deletion by nodes after a safe time delay.

    • Remark on the write-set constraint: The rule w(x) \subseteq r(x), defined earlier, is enforced to ensure the DAG’s structure is independent of execution outcomes. If a transaction contains a conditional write to an account a_w \in w(x) that fails, the new vertex v_{a_w, t} must still be populated with a valid state. By requiring a_w to also be in the read-set r(x), we guarantee that the prior state of a_w is available to be carried forward.

    3. Computational Scope

    The scope is the set of historical transactions a vProg must re-execute to validate a new transaction, x_t.

    • Read vertices (R(x_t)): The dynamic set of graph vertices that the transaction’s declared read-set, r(x), resolves to at time t. For each account a \in r(x), this corresponds to the vertex v_{a,k} with the largest timestamp k < t.

    • Anchor set (A(p, x_t)): The set of vertices that serve as terminal points for the dependency traversal. For a vProg p executing x_t, this set includes:

      1. All vertices whose state is proven by a witness in \pi(x_t).
      2. All vertices for which p has already computed and stored the state data from a previous scope execution.
    • Scope definition: The scope is the set of historical transactions whose outputs are routable to the transaction’s inputs without passing through any known anchor. More formally:
      Let \text{routable}(S, A) be the set of all vertices v from which there exists a path to a vertex in the set S that does not pass through any vertex in the set A.

      The set of vertices in the scope is then V_{scope} = \text{routable}(R(x_t), A(p, x_t)).

      The Scope is the set of all transactions that created the vertices in V_{scope}.

      Note: this set can be computed procedurally by a backwards graph traversal, e.g., BFS, starting from the vertices in R(x_t), where the traversal along any path is halted upon reaching a vertex in the anchor set A(p, x_t); a code sketch of this traversal appears at the end of this section.

    • Remark on witnesses and anchor time: Unaligned proof submissions can create “alternating” dependencies. The hierarchical state commitment structure solves this by allowing a transaction issuer to choose a single, globally consistent witness anchor time, t_{anchor}, for all provided witnesses. This transforms the starting point for the scope computation from a jagged edge into a clean, straight line, limiting the scope to the DAG segment between t_{anchor} and t.
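    As referenced in the scope definition above, here is a minimal sketch of computing V_{scope} = routable(R(x_t), A(p, x_t)) by a backward traversal halted at anchors. The graph encoding (each vertex mapped to the input vertices of the txn that created it) is an assumption made for the sketch:

```python
# Minimal sketch of V_scope = routable(R(x_t), A(p, x_t)) via backward BFS.
# The encoding (vertex -> set of parent vertices read by the txn that created it)
# is an illustrative assumption.
from collections import deque


def scope_vertices(
    read_vertices: set[tuple[str, int]],                    # R(x_t): resolved (account, time) pairs
    anchors: set[tuple[str, int]],                          # A(p, x_t): witnessed or locally known vertices
    parents: dict[tuple[str, int], set[tuple[str, int]]],   # inputs of the txn creating each vertex
) -> set[tuple[str, int]]:
    """Backward traversal from the read vertices, halted at anchor vertices."""
    visited: set[tuple[str, int]] = set()
    queue = deque(v for v in read_vertices if v not in anchors)
    while queue:
        v = queue.popleft()
        if v in visited:
            continue
        visited.add(v)
        for u in parents.get(v, set()):
            if u not in anchors and u not in visited:
                queue.append(u)
    return visited   # the txns that created these vertices form the Scope


# Loosely following post #3 above: reading C_{t-2} with an anchor at B_{t-4}
# pulls the txns that produced B_{t-3} and C_{t-2} into the scope.
parents = {("C", 8): {("B", 7)}, ("B", 7): {("B", 6)}, ("B", 6): set()}
assert scope_vertices({("C", 8)}, {("B", 6)}, parents) == {("C", 8), ("B", 7)}
```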


    Core Contributions

    This model provides a formal ground for several key properties of the proposed architecture, centered on sovereignty, scalability, and the emergence of a shared economy.

    First, the model provides a formalization of vProg sovereignty. Despite the shared environment, vProgs retain full autonomy. Liveness sovereignty is guaranteed because a vProg’s ability to execute and prove its own state is never contingent on the cooperation of its peers. Furthermore, resource sovereignty is provided by the Scope function, which allows a vProg to price the computational load of any transaction and protect itself from denial-of-service attacks.

    Second, the model provides a rigorous framework for analyzing computational efficiency and scalability, which is detailed in the following section. The key insight is that system-wide performance exhibits a sharp phase transition that can be managed by tuning the proof frequency to prevent the formation of a giant dependency component.

    Finally, the Scope cost function serves as a primitive for a shared economy. It creates a positive-sum game of “dependency elimination,” where self-interested vProgs are incentivized to compress their history, making them cheaper dependencies for the entire ecosystem. It also provides a foundation for sophisticated economic mechanisms like scope-cost sharing or Continuous Account Dependency (CAD).


    Scalability and Phase Transitions

    System-wide performance is a direct function of the proof epoch length, F, which we define as the number of transactions between consecutive proof submissions. Assuming a globally coordinated proof frequency for simplicity, the total computational overhead does not degrade linearly but exhibits a sharp phase transition. This means efficiency is maintained by keeping the proof epoch length sufficiently short. The dependency entanglement within an epoch can be analyzed by reducing the timed computation DAG to a timeless, undirected graph G'_F, where vertices are accounts and an edge connects any two accounts that interact within a transaction. In this reduced graph, the size of the largest connected component serves as a proxy for the maximum aggregate scope a single vProg might need to compute.

    The formation of this graph can be modeled using the well-known Erdős–Rényi random graph model. Let N be the total number of accounts and q be the probability of a cross-vProg transaction. A classic result for random graphs is the existence of a sharp phase transition: a “giant component” emerges if the edge probability p (proportional to Fq/N^2) exceeds the critical threshold of 1/N. To avoid this, the epoch length must be kept below this threshold, approximately F < N/q. For any F below this value, the largest dependency clusters are small, on the order of O(\log N). This result implies that the computational overhead becomes manageable. Instead of the worst-case scenario where a vProg re-executes the entire epoch’s history—a cost of O(nC), where n is the number of vProgs and C is the ideal cost—the work for an average vProg is the cost of executing its own transactions, plus a small, logarithmically-sized fraction of other transactions.
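    A toy simulation of the reduced graph G'_F under the uniform-random assumptions above; the parameters are illustrative, and the only point is to observe the largest component jump once the epoch length crosses the critical order of N/q:

```python
# Toy simulation of G'_F under the uniform-random assumptions above; parameters
# are illustrative. A simple union-find tracks connected component sizes.
import random


def largest_component(num_accounts: int, epoch_len: int, q: float, seed: int = 0) -> int:
    rng = random.Random(seed)
    parent = list(range(num_accounts))

    def find(a: int) -> int:
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for _ in range(epoch_len):
        if rng.random() < q:   # a cross-vProg txn links two (uniformly) random accounts
            a, b = find(rng.randrange(num_accounts)), find(rng.randrange(num_accounts))
            parent[a] = b

    sizes: dict[int, int] = {}
    for v in range(num_accounts):
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values())


N, q = 10_000, 0.5
print(largest_component(N, epoch_len=int(0.2 * N / q), q=q))   # short epoch: largest cluster stays tiny
print(largest_component(N, epoch_len=int(2.0 * N / q), q=q))   # long epoch: a giant component emerges
```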

    While real-world transaction patterns may be non-uniform, similar critical threshold phenomena are known to exist in more complex models like power-law graphs. The key insight remains: the model provides a formal basis for managing system-wide efficiency by tuning the proof epoch length F to stay within a region that prevents the formation of a giant dependency component. This efficiency mechanism is the foundation for the positive-sum game described earlier. Self-interested vProgs are incentivized to compress their history frequently, as doing so makes them cheaper and more attractive dependencies for the entire ecosystem. This economic pressure helps justify our initial simplifying assumption of a global proof epoch, as vProgs that fail to keep pace will effectively be excluded from composable interactions due to their high cost.

    ]]>
    https://research.kas.pa/t/zoom-in-a-formal-backbone-model-for-the-vprog-computation-dag/407#post_1 Sun, 17 Aug 2025 12:27:35 +0000 research.kas.pa-post-641
    Concrete proposal for a synchronously composable verifiable programs architecture This proposal offers a strong foundation for synchronous composability in L1, but it needs clarification on risk mitigation mechanisms for lost or corrupted witness data, as well as cross-vProg interoperability standards that guarantee security across different VMs. Furthermore, the gas fee-sharing model should be tested through real-world economic simulations to avoid creating negative incentives like spam transactions or state bloat.

    ]]>
    https://research.kas.pa/t/concrete-proposal-for-a-synchronously-composable-verifiable-programs-architecture/387#post_4 Sun, 17 Aug 2025 07:11:16 +0000 research.kas.pa-post-639
    Concrete proposal for a synchronously composable verifiable programs architecture Hey.
    Love how this design keeps vProgs fully sovereign while enabling native sync composability — a beautiful way to avoid rollup lock-in while unifying L1 liquidity. Which do you see as the bigger challenge in practice: low proof latency, controlling witness/scope size, or standardizing cross-vProg rules — and how will you tackle it?
    BR

    ]]>
    https://research.kas.pa/t/concrete-proposal-for-a-synchronously-composable-verifiable-programs-architecture/387#post_3 Sat, 09 Aug 2025 18:01:35 +0000 research.kas.pa-post-625
    Concrete proposal for a synchronously composable verifiable programs architecture Can’t write in @kaspamd, so replying here:

    Igra architecture is fully aligned from the beginning with these ideas, and we will be happy to contribute to the development in any capacity needed.

    Can’t wait to get our hands dirty!

    ]]>
    https://research.kas.pa/t/concrete-proposal-for-a-synchronously-composable-verifiable-programs-architecture/387#post_2 Fri, 08 Aug 2025 16:22:25 +0000 research.kas.pa-post-618
    Concrete proposal for a synchronously composable verifiable programs architecture Cowritten with @michaelsutton.

    High level concepts, terminology:

    1. Terminology, Solana-inspired: Accounts hold state data, verifiable Programs/vProgs own accounts and define their state transition logic, transactions declare in advance their read/write accounts.

    2. Each vProg is practically a mini zkVM which commits and progresses its state to L1 through its own sovereign covenant.

    3. vProgs have complete sovereignty over their throughput and state size regulation; each vProg defines its own corresponding constants and scale, and in particular regulates its own state growth. A txn requiring permanent storage from vp1 must pay for this according to the gas scale and STORM constants of vp1.

    4. vProgs are mutually trustless: vp2 never relies on correct execution or state availability of vp1.

    Composability:

    1. The sync composability feature enables txns that create dependencies between accounts (read state of this account, use it as input to write to another account) belonging to different vProgs, whilst each vProg maintains sole ownership over its state; this includes in particular the ability to enforce cross-vProg atomicity.

    2. Reminder of why it is crucial to optimize for sync composability: Without native syncompo, users and liquidity will gradually flow to rollup entities, which offer syncompo and unified execution environments, yet whose inherent incentive is to win it all and remain a single parasitic entity—rollups have no incentive to interop and defragment. Native syncompo vProg design remedies this by optimizing for deployment of vProgs directly on L1 with no intermediaries. By replacing Solana’s account-centric Programs with vPrograms we inherit its coherent standards and unified liquidity without bloating L1 state, and while keeping full node HW requirements minimal. See this thread https://x.com/divine_economy/status/1884243869136740361

    3. vProgs are synchronously composable w/o compromising their sovereignty; thanks to the account-centric design, the dependency created by a syncompo transaction is limited to the relevant account (and its scope, see below) rather than to the entire vProg, eg an SPL token account transfer doesn’t create a dependency on the rest of the SPL accounts. Rule of thumb: roughly speaking, if a txn is parallelizable in a Programs setup, it creates no syncompo dependency in a vPrograms setup.

    4. The design intends to maximize inclusiveness of vProgs, zkVMs, and proof systems. Still, receiving dependency from any vProg is unsafe; some prerequisites are needed, eg vProg sourcecode availability (see below), VM familiarity (for on-site execution), gas scale conversion (point 6), etc. Under research: (i) precise characterization of features requiring vetting; (ii) automation of the filtering process through some standard + proof that the covenant locks a vProg adhering to the standard (in applied jargon this is referred to as zk-SBOM, which practically seems like rather ordinary usage of Merkle Trees to prove some properties and structure of the program; still a useful term to distinguish proofs of program properties from proofs of correct execution).

    5. Async composability is enabled too, but it introduces execution uncertainty to the transactor, since asyncompo implies either lack of atomicity or execution latency on the order of num_of_dependencies*proof_latency. Contrast this with the following design goal under syncompo: keep confirmation times on the order of L1 sequencing latency rather than proof latencies (without this requirement the architecture challenge is an order of magnitude easier). Note that as long as we adhere to this goal there’s no need for the timebound proof settlement mentioned here.

    Details on sync composability:

    1. The trustless cross-vProg communication requires that a syncompo txn that reads from vp1 and writes to an account owned by vp2 must provide all relevant witness data (that hasn’t been prvsly provided to vp2!), and must gas-pay vp2 for the remaining resources its scope consumes. scope := state transitions between this txn and backwards to the latest witness that was already zkp-anchored. The latest submitted witness, hence the scope, depends on the pov of the callee vProg, which is mostly determinable by the structure of the computation DAG (unrelated to the blockDAG), but note:

    2. A naive approach towards computation of scope would infer it solely from the topology of the computation DAG, but this fails to account for cases where a txn declared a read to an account yet failed to implement that read. A read-fail can be treated by (i) requiring txns to begin their execution with reading all declared-read accounts, and (ii) using gas commitments inside txns to reason, without executing the txn, on whether the read instructions are sufficiently gas-funded. It is crucial to additionally be convinced that failure to write to declared accounts creates no negative consequences; this topic is delicate, and requires careful attention, analysis, and a separate post.

    3. Storage of witnesses: As noted in article 10 in parenthesis, witness data is assumed to be stored by the receiving vProg. This storage can be defined by convention to be transient and therefore pruned post pruning epoch. The alternative permanent convention is doable too, but seems practically unnecessary.

    Validity proofs:

    1. Validity proofs aka zkp have two vital roles in this system: communicating the state to L1, and preventing the explosion of txn scopes due to cascading dependencies. The lower the proof latency, the smaller scopes become, and the cheaper and more feasible sync composable txns become.

    2. Each vProg has its own, ideally permissionless, set of provers which advance its state through its L1 covenant. In principle, its provers are able to prove the entire execution of syncompo txns, including segments which belong to other vProgs. A vProg thus controls its own liveness.

    3. To also enable the optimistic case where provers are responsive and collaborative, each should be able to submit to L1 conditional proofs regarding its own segment of the execution, and once conditional proofs for all of the txn’s components are submitted, these are stitched together into one proof that can advance each party’s covenant. conditional proof := a proof whose input is the state commitment of the (potentially yet-unproven) output of the prvs segment of the txn, and which becomes actionable only once the prvs segment was proven. See relevant post.
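    A rough sketch of the stitching rule, abstracting the actual proof system behind a verify_segment callback; ConditionalProof and its fields are placeholder names rather than a concrete format:

```python
# Rough sketch; ConditionalProof and verify_segment are placeholders for the
# real proof objects and verifiers, which are not specified in this post.
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ConditionalProof:
    vprog_id: str
    input_commitment: bytes    # state commitment of the previous segment's output
    output_commitment: bytes   # state commitment this segment produces
    blob: bytes                # the proof data itself


def stitch(
    segments: list[ConditionalProof],
    initial_commitment: bytes,
    verify_segment: Callable[[ConditionalProof], bool],
) -> bool:
    """True only if every segment verifies and the commitments chain:
    segment k's input is segment k-1's output. Only then can each party's
    covenant be advanced."""
    expected = initial_commitment
    for seg in segments:
        if seg.input_commitment != expected or not verify_segment(seg):
            return False
        expected = seg.output_commitment
    return True
```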

    Economics:

    1. Syncompo txns create two types of externalities on the callee vProg: one is the witness and scope computation; the other is the computation of the new txn, which creates a dependency and hence introduces sequentiality into the computation. In terms of a stateful node of the callee vProg, the former externality is the added CPU cycles in a separate core that the node runs in parallel, and the latter externality is the added computation depth due to dependency/sequentiality. vProgs are advised to define parallelism-aware gas functions, eg the Weighted Area function from this paper.

    2. vProgs which initiate steady, frequent sync composable txns with other vProgs (concretely, syncompo_frequency > 1/proof_latency) might find it beneficial to initiate and fund a continuous account dependency (CAD). When vp1 initiates a CAD from account A1 it owns to vp2, vp2 continuously monitors and computes the state of A1, an externality that is funded and gas-paid by the CAD issuer. Validity proofs might still be a better approach for the vProg since they reduce its users’ costs (= size of txn scopes) of interacting with all other vProgs, in one shot.

    3. Online cost-sharing mechanisms can be applied to smooth and share gas costs among syncompo transactors and/or with the CAD issuer. Assuming such a mechanism means that the CAD issuer doesn’t need to fund the continuous dependency from A1 (a vp1 account) to vp2; rather, it merely places an initial deposit which gets continuously refilled by future transactors.

    Miscellaneous:

    1. The above design is agnostic to the question of how vProgs ensure their state availability. Instead it focuses on the narrower challenge of syncing (caller vProg) substates relevant to each cross-vProg execution, which are by construction reconstructable from on-chain data, regardless of the availability of the entire state. Observe that loss of full state availability affects no other vProg, per our design (article 10, trustlessness).

    2. In contrast to state availability, vProgs do need a guarantee on the source code availability of vProgs with which they are syncomposable. One way to enforce this is through cryptographic proofs of replication that can attest eg to some fraction of miners holding all relevant vProgs. Such attestation will be used and enforced in the same manner discussed in article 8.

    3. Broadcasting witness data offchain will remove lots of inefficiency and add lots of complexity; if proof latency is too high and witness data too large for feasible high throughput, the complexity might be worth it. One path is to require on-chain witnesses only in case the callee vProg’s proofs are delayed by more than a predetermined threshold. Main caveat: compromises on the design goal of conf times ~ sequencing latency (article 9 above), unless the user of the callee vProg is willing to assume or trust that witness data of caller vProg will never go missing.

    4. Enshrined covenant: To ease coordination of standards across composable vProgs, it makes sense to develop a canonical covenant which compliant vProgs will instantiate (to enforce eg SBOM, see article 8), or/and to deploy a canonical meta vProg that will handle computation of txn scopes. The details of this are under research.

    Timeline:

    1. Timeline for yellow paper draft: end of month.

    2. Timeline for production-ready testnet: call for developers to enter the L1<>L2 channel and mention their availability and potential contribution (Telegram: @kasparnd).


    Acknowledgment and thanks to @FreshAir08 and @Hans_Moog for extensive discussions, contributions, and constructive criticism on the architecture.

    ]]>
    https://research.kas.pa/t/concrete-proposal-for-a-synchronously-composable-verifiable-programs-architecture/387#post_1 Thu, 07 Aug 2025 17:57:02 +0000 research.kas.pa-post-615
    A Basic Framework For Proofs Stitching A few remarks:

    1. I don’t think it is enough to prove the correct execution from arbitrary input data to output; for every input, we also need to prove that it is correctly sourced from the previous output, to ensure that inputs were not made up and retain a connection to the DAG of state transitions that led to the input having that specific value. So while it might not be necessary to do a full zk-proof, we still need to produce a commitment that aggregates the produced output values under e.g. a merkle tree, which allows us to then create a proof that a specific input value was sourced from that output / result of the previous subtransaction.

    2. A logic zone cannot continue to process further transactions until it learns about the result of the composable transactions, because it might eventually need to roll back the state transition before applying further changes (atomicity in composable transactions requires information to flow in both directions - forward and backward).

    3. This means that settling intermediary proofs on the L1 with delayed stitching will block the logic zone until that stitching result becomes available, and it might complicate the act of advancing the commitment on the L1 because we need to prove that the previously committed value was rolled back.

    4. SP1 does not support runtime modularity, i.e. invoking other programs and working on their result; inputs need to be known upfront. This implies a 2-layer separation of first proving subtransactions and then aggregating the existing results to produce actual commitments to the aggregate effects, which has to be a separate zk circuit.

    5. Would it maybe make more sense to only attest to finalized / stitched transactions on the L1 to not have to deal with rollbacks on the L1 and delegate that logic to the proof of the stitcher?

    ]]>
    https://research.kas.pa/t/a-basic-framework-for-proofs-stitching/323#post_2 Tue, 05 Aug 2025 13:14:34 +0000 research.kas.pa-post-612
    Data Availability Concerns
    hashdag:

    In the common good setup state replication is voluntary too. Perhaps you mean that users will opt in to state replication rather than opt out as in the default setup?

    In other networks, opting out of replicating historical data is usually supported but replicating the entire smart contract state is actually mandatory for nodes to be able to efficiently validate blocks.

    Making data retention the default mode of operation also goes a long way as it is not unreasonable to assume that at least some actors are lazy or altruistic enough to just follow best practices and retain data even if they theoretically don’t have to (especially if the underlying protocol limits state-growth to support this mode of operation).

    If so, notice that this too is a design choice, and the default L2 client can/should be set up in the same manner we set an L1 node – to save the available state.

    I am not saying that this is a show-stopper and you are right that just defaulting to the rule that everybody tracks everything forever like all other networks do would be an easy and straight forward solution.

    But this “solution” also means that you inherit the same limitations around state-growth and scalability as all other networks and I was actually assuming that Kaspa was planning to leverage its modularity to build a more scalable and fluid system where it is no longer necessary for “everybody to just globally store and execute everything” (even if separating execution from the L1).

    I agree pruning provides a new flavour to the state-availability challenge, I disagree that it is a newly introduced challenge, or that the reliance on social consensus is a new assumption that Kaspa introduces.

    I didn’t say that DA is a “new challenge” - what I am saying is that our system is “modular enough” to make this become a problem if we want to fully leverage our modularity and allow actors to only store and execute parts of the global load (that is somehow relevant for them).

    Cryptographic proofs-of-replication can be baked into the protocol, alleviating the reliance on social consensus. While this does not guarantee real time retrievability (replicas can still refuse to share the state on demand), this problem appears everywhere in crypto (eg L1 miners refusing to share the UTXO).

    What kind of proofs do you envision? This is usually done with things like data availability sampling and data availability committees (utilizing threshold signatures to attest to the availability of data), which seem to not translate well into the realm of PoW.

    And, yes you are absolutely right - it is in fact very related to things like mining attacks that withhold data to prevent others from being able to extend the latest chain (i.e. [1912.07497] BDoS: Blockchain Denial of Service).

    What makes this tricky is the fact that this can now be done by a user rather than a miner (who is at least bound by economic incentives to keep its own statements extendable and eventually reveal the missing data to stay relevant for the mining algorithm).

    Imagine I spawn up a new logic zone that nobody else tracks (and for which historic data is eventually lost) and then I compose my state with yours (paying whatever fee is necessary to pay for the assumed “externalities” to make this operation dynamically possible) while never revealing my input data / state to anybody else.

    This not only makes me the only person on the planet that can prove correct execution and advance the state commitment on the L1 but it also means that if I decide to never reveal the missing input data then everybody else will forever be locked out of accessing that shared state again.

    P.S.
    Ideally the platform would be rollup-unfriendly, so maybe we should use another term. In the past we used logic-zones as placeholder, and now I propose vApps from Succinct’s white paper. I mean, the entire design efforts are in order to counter rollups, defined (hereby) as logic zones optimized for more vapps joining under their own state, state commitment / proving; as opposed to vapps which are apps with defined logic which will naturally optimize for interoping with other vapps. Eg Arbitrum vs Aave. Perhaps we should elaborate more on the inherent L1-rollup misalignment, for now referring to this quick comment https://x.com/hashdag/status/1886191148533944366

    I agree that we should optimize for decentralization rather than specialized infra providers but tbh I don’t really care what we call things in our discussions as long as we recognize that Kaspa’s design choices and default parameters result in unique challenges that need to be addressed if we want to securely leverage our modularity.

    And what I am furthermore claiming is that solving these problems algorithmically does not work - but they “have to be solved on the social consensus layer” which means that the moment somebody launches a “vApp” that is supposed to be composable with other “vApps” (at some point in the future) then there needs to be a mechanism in place (backed by strong game-theoretic guarantees) that ensures that the state of that vApp is tracked by a sufficiently large group of actors (that will “never” forget its latest state).

    Establishing the social consensus that everybody just tracks everything forever absolutely solves this but if that is the goal / basic assumption for L2 nodes then I don’t understand why we even discuss things like atomic sync composability since if everybody is assumed to have access to the state of all other vApps then they can just natively call into each other?

    PS: I think that we can do orders of magnitude better than this and actually “solve” not just some but all of the hardest problems around smart contract enabled chains (scalability, state growth and state expiry) but we first need to recognize the problem and the fact that possible solutions will significantly influence and constrain the “open questions” we are currently trying to answer.

    ]]>
    https://research.kas.pa/t/data-availability-concerns/375#post_3 Mon, 14 Jul 2025 01:24:25 +0000 research.kas.pa-post-602
    Data Availability Concerns
    Hans_Moog:
    • In most traditional blockchains, shared smart contract state is treated as a common good - all nodes replicate it by default, ensuring broad availability.
    • In Kaspa’s model, state replication is voluntary. Users choose which roll-ups to follow, and by extension, which data to retain. This makes the system highly flexible but also fragile.

    Welcome aboard ser!

    In the common good setup state replication is voluntary too. Perhaps you mean that users will opt in to state replication rather than opt out as in the default setup? If so, notice that this too is a design choice, and the default L2 client can/should be set up in the same manner we set an L1 node – to store the available state.

    I agree pruning provides a new flavour to the state-availability challenge, I disagree that it is a newly introduced challenge, or that the reliance on social consensus is a new assumption that Kaspa introduces.

    Cryptographic proofs-of-replication can be baked into the protocol, alleviating the reliance on social consensus. While this does not guarantee real time retrievability (replicas can still refuse to share the state on demand), this problem appears everywhere in crypto (eg L1 miners refusing to share the UTXO set with new nodes).

    P.S.
    Ideally the platform would be rollup-unfriendly, so maybe we should use another term. In the past we used logic-zones as placeholder, and now I propose vApps from Succinct’s white paper. I mean, the entire design efforts are in order to counter rollups, defined (hereby) as logic zones optimized for more vapps joining under their own state, state commitment / proving; as opposed to vapps which are apps with defined logic which will naturally optimize for interoping with other vapps. Eg Arbitrum vs Aave. Perhaps we should elaborate more on the inherent L1-rollup misalignment, for now referring to this quick comment https://x.com/hashdag/status/1886191148533944366

    ]]>
    https://research.kas.pa/t/data-availability-concerns/375#post_2 Fri, 11 Jul 2025 08:02:14 +0000 research.kas.pa-post-601
    Data Availability Concerns Introduction

    Kaspa is pursuing an ambitious architectural goal: moving its entire execution layer off-chain into enshrined roll-ups secured by zk-proofs.

    Rather than persisting state and calldata on-chain indefinitely, Kaspa’s Layer 1 (L1) is designed to prune historical data and serve primarily as a data dissemination and anchoring layer.

    In this system, roll-ups periodically submit zk-commitments that:

    • Prove the correctness of off-chain execution.

    • Enable indirect communication between independent roll-ups through verifiable state transitions.

    This reimagining of L1 functionality creates a separation of concerns that brings both promising benefits and challenging implications.

    The good

    From an L1 perspective, Kaspa’s approach is elegant and efficient:

    • Avoids state bloat: By not storing all execution data on-chain, the protocol avoids the ever-growing state size that burdens full nodes in many other smart contract platforms.

    • Lightweight infrastructure: Users and nodes not interested in specific roll-ups are not forced to store or process their data.

    • Correctness without replication: Thanks to zk-proofs, correctness can be independently verified without everyone re-executing everything.

    • Selective participation: Only those interested in a particular roll-up need to follow and replicate it, reducing unnecessary overhead for the rest of the network.

    In essence, the system aligns computational effort with actual interest, while still preserving security and verifiability through cryptographic proofs.

    The bad

    However, these benefits come with non-trivial trade-offs:

    • No full reconstruction from L1: Since the L1 prunes state, it cannot serve as a canonical archive. Reconstructing a roll-up’s latest state requires cooperation from actors who have preserved it.

    • Withholding risks: If those who hold or mirror roll-up state become inactive or malicious, users may lose access to their funds or be unable to prove ownership/state transitions.

    • Fragmented DA assumptions: With many independent roll-ups, each potentially operated by different entities, users cannot easily assess the data availability guarantees of the roll-up they’re interacting with.

    This introduces a form of informational asymmetry - users may trust a roll-up without realizing that their ability to access their funds depends on the unstated behavior of off-chain actors.

    For instance, a user interacting with Rollup A may assume it’s as robustly available as Rollup B, not realizing that the latter is backed by a commercial DA service while the former depends on a small, volunteer-run mirror without much community participation.

    And the ugly

    At the heart of the data availability (DA) issue lies a game-theoretic dilemma, not just a technical one:

    • In most traditional blockchains, shared smart contract state is treated as a common good - all nodes replicate it by default, ensuring broad availability.

    • In Kaspa’s model, state replication is voluntary. Users choose which roll-ups to follow, and by extension, which data to retain. This makes the system highly flexible but also fragile.

    Even if a roll-up has sufficient replication today, this could deteriorate over time if interest wanes, or actors exit the network.

    This leads us into a classic tragedy-of-the-commons-like scenario:

    Everyone benefits from someone maintaining data, but no one is individually incentivized to do so for the collective good - especially if they are not directly impacted.

    Note: Unlike traditional commons problems, this isn’t just free-riding - it’s structural. Actors may act perfectly rationally by not storing what doesn’t affect them, yet the cumulative result is fragility.

    Because there is no global consensus on what data matters or how long it should persist, availability becomes subject to social consensus and economic incentives, not protocol guarantees.

    Conclusion and open questions

    Kaspa introduces a fascinating shift in blockchain design - from a model of forced consensus and replication to one of voluntary association and market-driven state tracking.

    But this raises critical open questions:

    • How can users trust that state will remain available without mandatory replication?

    • What incentives (or penalties) can ensure long-term DA without undermining Kaspa’s lean L1 goals?

    • How will users evaluate the reliability of roll-ups without transparent visibility into their DA infrastructure?

    These are non-trivial coordination problems that extend beyond code into social behavior, governance, and incentive design and solving them will (at least in my opinion) be key to Kaspa’s long-term success as a zk-secured, off-chain smart contract platform.


    PS: I am going to propose a concrete solution to this problem but since the research post I am writing about this covers a lot of ground and is still expanding in scope, I thought that it makes sense to separate the problem statement from the proposal (and post it already) so they can be discussed independently - maybe somebody has elegant answers that are completely unrelated to my line of thought.

    ]]>
    https://research.kas.pa/t/data-availability-concerns/375#post_1 Fri, 11 Jul 2025 07:26:08 +0000 research.kas.pa-post-600
    A proposal towards elastic throughput Re persistent_storage_mass, it seems no different than transient_storage_mass, in that we don’t care about the peak usage just about its regulated growth in time, be it abrupt or gradual (ofc apart from the CPU cost of the storing operation). Applying credit to persistent_storage_mass would imply the credit is never expiring.

    Re cpu_mass, I was alluding to a credit that plays on the margin between peak and avg within the scope of an epoch, but better leave that complex optimization aside and admit that in the CPU context, as you said, we are interested in peak and not only aggregate consumption over time.

    BTW I’ll mention here another type of mechanism, which I am not necessarily advocating but which we might draw inspiration from: there’s an EVM feature where operations on zero-led addresses enjoy discounted gas costs. https://blockwithanand.hashnode.dev/the-impact-of-leading-zeros-in-ethereum-addresses-on-transaction-costs. At the time, Eth teams used this hack to reduce their gas costs, by investing POW in creating zero-led addresses, in order to save on future costs. If this address-POW mechanism simplifies implementation, we can consider using it in order to regulate transient storage by allowing txn issuers, eg provers, that use such addresses to publish large txns/proofs that exceed the usual block size; the address or POW would need to be renewed every pruning period.

    The benefit would be avoiding the need to couple identity of miner and prover/txn issuer, and no need to compromise miner anonymity through block linkage. If you view this path as superior to credit, we can discuss details eg whether this POW needs adjustment.
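    For concreteness, a toy illustration of the address-POW check (counting leading zero bits of an address against a difficulty threshold); the threshold and any mapping to an enlarged txn/proof allowance are hypothetical:

```python
# Toy illustration only; the difficulty threshold and any resulting transient
# storage allowance are hypothetical, not a proposal spec.
def leading_zero_bits(address: bytes) -> int:
    bits = 0
    for byte in address:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def meets_address_pow(address: bytes, difficulty_bits: int) -> bool:
    """A txn issuer/prover could 'mine' an address with enough leading zero
    bits to qualify for publishing oversized txns/proofs; the address (and the
    work behind it) would need to be renewed every pruning period."""
    return leading_zero_bits(address) >= difficulty_bits


assert meets_address_pow(bytes.fromhex("000fab") + bytes(20), difficulty_bits=12)
```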

    ]]>
    https://research.kas.pa/t/a-proposal-towards-elastic-throughput/295#post_3 Tue, 03 Jun 2025 17:01:44 +0000 research.kas.pa-post-584
    A proposal towards elastic throughput
    hashdag:

    The proposed method can be applied to other resources as well, e.g., persistent_storage_mass, compute_mass.

    Applicability to transient storage is unique in that the epoch to which it is relevant is clearly defined (the pruning period). After such an epoch, the transient storage credits are reset, and at any time the expected maximum bound for storage use within the pruning period is maintained even if the transient storage limits are made elastic.

    However, given your claim above I’m curious what your thoughts are for how this credit system would be applied to persistent storage mass (which KIP9 applies to; it relates to UTXO storage and is boundless) and to compute mass (related to CPU usage, bounded to 100% cpu capacity but also technically a resource that can’t be “stored for use later” easily).

    ]]>
    https://research.kas.pa/t/a-proposal-towards-elastic-throughput/295#post_2 Mon, 02 Jun 2025 20:11:11 +0000 research.kas.pa-post-583
    On the inherent tension between multileader consensus and inclusion-time proving
  • Async interop should generally be discouraged, but the issue there is in my opinion similar only in cosmetics - it would resemble more a locking of outboxes and such, and off the top of my head, as long as all parties know what they are getting into, it has no effect on others in the system and they can set their rules as they see fit. The difficulty with sync interop is that until the cross tx is settled, all other txs of these logic zones cannot settle.

  • For complex composition cases - I’d kind of say the opposite. The lower the probability a tx is eventually proven, the less leeway I think it should get in terms of proving time. Allowing people to create complex transactions that could potentially fail to settle is an attack vector.

  • Generally speaking, different logic zones can have different timeouts, or even different tx of the same logic zones can have different timeouts, but it’s still imperative that these be globally known - everyone involved directly or indirectly must agree on whether a tx succeeded or failed.

  • The world I personally dream of is a permissionless world where anyone can participate and open a new zkapp (possibly constrained to standard code infras to maintain security), but their ability to interact with other zkapps is automatically in correlation to their credibility in supplying proofs, i.e. if your zkapp was unable to provide a proof for its part in a cross tx, it would get less and less leeway (and possibly higher fees) every time that happens henceforth - and vice versa.

    ]]>
    https://research.kas.pa/t/on-the-inherent-tension-between-multileader-consensus-and-inclusion-time-proving/347#post_3 Tue, 27 May 2025 11:08:29 +0000 research.kas.pa-post-577
    On the inherent tension between multileader consensus and inclusion-time proving Would it be possible to have a common default deadline (e.g. 5 min), but allow rollups to explicitly set a longer one when needed – for async interop or complex composition cases?
    In the interop layer, we’d isolate state impact, so if a cross-rollup proof fails, only the affected zone is rejected, and the rest of the rollup proceeds unaffected.

    ]]>
    https://research.kas.pa/t/on-the-inherent-tension-between-multileader-consensus-and-inclusion-time-proving/347#post_2 Fri, 23 May 2025 08:55:05 +0000 research.kas.pa-post-569
    On the inherent tension between multileader consensus and inclusion-time proving Parallelized L1s (e.g., Kaspa’s 10bps block-DAG, advanced multileader designs) inherently offer sub-RTT block time and strong intra-round censorship resistance. A key byproduct of their parallelism is execution uncertainty at inclusion time: transactions are included prior to final global ordering and execution. This uncertainty is not a flaw but an enabler for features such as MEV-resistance strategies which operate by obscuring sequence predictability from block composers. At the same time, for universal synchronous composability across multiple based ZK rollups (often conceived as distinct logic zones, each managing independent state), inclusion-time proving represents a near-ideal: Achieving atomic cross-zone operations for composable txns necessitates complex off-chain coordination—related to our framework for proof stitching—before these operations culminate in L1 settlement. Inclusion-time proving would offer immediate, verifiable L1 commitment for the state transitions resulting from these coordinated efforts.

    The inherent counter-duality

    The conflict between inclusion-time proving and execution uncertainty is direct:

    • Inclusion-time proving mandates a known, unambiguous pre-state for proof generation at the moment of L1 inclusion.
    • Execution uncertainty (resulting from multileader, and conducive to MEV-resistance) implies an indefinite pre-state at L1 inclusion, dependent on eventual sequencing of concurrently processed, potentially contending transactions.

    This basic conflict presents a choice. We opt for multileader consensus, embracing its natural execution uncertainty. Consequently, true inclusion-time proving for L1-visible L2 effects (like state commitments) cannot be achieved. Proofs for such effects must therefore be deferred, appearing on L1 only after parallel processing converges and transaction order is sufficiently established to define a clear state.

    Proof availability requirement

    This inherent gap introduces a critical challenge: L1 has already accepted the transaction data (achieved DA for it) and the system is, in a sense, committed to its potential effects. What if the required proof never materializes? This immediately necessitates a robust proof DA mechanism—the eventual availability of the proof itself must be guaranteed or its absence handled gracefully.

    Furthermore, in an ecosystem of autonomous Logic Zones (LGs)—where each LG is responsible for generating proofs for its own operational segments but cannot compel others—atomic cross-LG operations create interdependencies. The successful L1 settlement of such a composite transaction thus becomes reliant on the timely L1 submission and verification of valid proofs from all participating LGs. This critical interdependency, particularly given the autonomy of each LG, naturally leads to an operational model we term “Timebound proof settlement”.

    Timebound proof settlement

    Under this model, transaction data first achieves L1 DA. Ultimate L1 settlement of its cross-domain effects, however, is explicitly dependent on subsequent L1 verification of ZK proofs, which must be submitted within a defined time window, T, post-L1 sequencing. Confirmation of an L2 operation’s L1 impact is thus its proof-verified settlement within T; failure by any party in a multi-segment operation to provide its proof within this bound means that segment (and potentially the entire atomic operation) fails to settle, with penalties ensuring accountability.
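
    As a rough illustration (my own sketch; the names and the exact failure handling are assumptions, not part of this post), the rule can be stated as: an operation settles only if every participating logic zone’s proof is verified on L1 within T of the operation’s L1 sequencing, and fails otherwise.

    from dataclasses import dataclass, field

    @dataclass
    class CompositeOp:
        sequenced_at: float        # L1 sequencing time (DA achieved)
        required_zones: set        # logic zones that must each supply a proof segment
        proofs_at: dict = field(default_factory=dict)  # zone -> L1 verification time

        def record_proof(self, zone, verified_at: float) -> None:
            self.proofs_at[zone] = verified_at

        def status(self, now: float, T: float) -> str:
            deadline = self.sequenced_at + T
            late = any(t > deadline for t in self.proofs_at.values())
            missing = self.required_zones - set(self.proofs_at)
            if late or (missing and now > deadline):
                return "failed"    # some segment missed the window; the operation does not settle
            if not missing:
                return "settled"   # all proofs verified within T
            return "pending"       # still inside the window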

    A key implication of timebound proof settlement is the viability of fast, user-side optimistic confirmation well before L1 proof settlement. Unlike inclusion-time proving where prover censorship directly blocks L1 inclusion, here L1 DA of a transaction already binds it to a based rollup’s L1-registered program. Any designated prover failing to subsequently prove such an L1-committed transaction compromises their rollup’s liveness. This incentive structure extends to multi-rollup atomic operations: users running composite execution nodes can optimistically confirm transactions, relying on each participating rollup’s self-interest in maintaining its own liveness by submitting its proof segment. While such “fat node” optimistic confirmation offers immediate feedback, the underlying L1 settlement latency itself—determined by L1 sequencing plus the cumulative L2 proving times—remains crucial. Importantly, as ZK proving technology continues its rapid advance towards near real-time performance (where real-time << 12 seconds…), this L1 settlement latency under timebound proof settlement is poised to significantly decrease, enhancing the model’s practicality.

    This timebound proof settlement approach contrasts with embedding full witness DA within L1 transaction payloads, which, while ensuring eventual provability, imposes substantial and constant DA overhead.


    The architectural path an L1 takes will profoundly shape its multileader/MEV characteristics and the efficiency of its rollup ecosystem’s composability. Future L1 designs might explore tiered DA/execution models, offering distinct contexts for “uncertain inclusion” and “certain inclusion” (perhaps with different fee structures or trust assumptions). Ultimately, while timebound proof settlement offers a pragmatic path, novel cryptographic approaches (e.g., proofs over partially indeterminate states) could eventually reshape these trade-offs.

    ]]>
    https://research.kas.pa/t/on-the-inherent-tension-between-multileader-consensus-and-inclusion-time-proving/347#post_1 Thu, 22 May 2025 19:49:02 +0000 research.kas.pa-post-568
    KIP Draft: Threshold Address Format for KIP-10 Mutual Transactions It’s important to note that the conversion from an address to a script public key (SPK) is a one-way process. Once an address is converted to an SPK, the original script cannot be recovered from the SPK hash alone unless all placeholder values are known.

    ]]>
    https://research.kas.pa/t/kip-draft-threshold-address-format-for-kip-10-mutual-transactions/346#post_2 Thu, 22 May 2025 08:39:04 +0000 research.kas.pa-post-567
    KIP Draft: Threshold Address Format for KIP-10 Mutual Transactions Introduction

    This proposal introduces a new address type for Kaspa that leverages a compact floating-point representation for threshold values. The Float24 format allows efficient encoding of threshold values within P2SH scripts, enabling borrowing functionality while maintaining compatibility with existing address infrastructure.

    Motivation

    Current address formats don’t efficiently support threshold-based spending conditions as described in KIP-10. By encoding threshold values directly in addresses, we can:

    • Support borrowing functionality where UTXOs can only be spent by increasing their value
    • Provide a compact representation for a wide range of threshold values
    • Maintain compatibility with existing P2SH infrastructure

    Specification

    Address Version

    We propose a new address version for threshold-based scripts:

    Version::ThresholdScript = 9
    

    Address Structure

    A threshold address consists of:

    • Network prefix (e.g., “kaspa”, “kaspatest”)
    • Version byte (9 for ThresholdScript)
    • Address payload (35 bytes for both ECDSA and Schnorr)

    Address Payload Format

    The address payload consists of:

    • Public key X coordinate (32 bytes)
    • Float24 threshold value (3 bytes)

    The Float24 format uses 3 bytes (24 bits) with the following structure:

    [1 bit: signature type] [1 bit: Y coordinate parity (ECDSA) or reserved (Schnorr)] [17 bits: mantissa] [5 bits: exponent]
    

    Where:

    • Signature type bit: 0 for ECDSA, 1 for Schnorr
    • Y coordinate parity bit:
      • For ECDSA: 0 for even Y, 1 for odd Y
      • For Schnorr: Always 0 (reserved for future use)
    • Mantissa: 17-bit unsigned integer (0-131,071)
    • Exponent: 5-bit unsigned integer (0-31)

    The threshold value is calculated as:

    threshold = mantissa * (2^exponent)
    

    Range of Representable Values

    With this format, we can represent:

    • Minimum value: 0 * 2^0 = 0 sompi (allowing zero-threshold conditions)
    • Maximum value: 131,071 * 2^31 ≈ 281.5 trillion sompi (≈ 2.8 million KAS)

    This range easily covers most practical threshold values, though it falls short of the maximum possible Kaspa supply of 29 billion KAS (2.9 * 10^18 sompi).
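
    A minimal Python sketch of packing and unpacking this layout (assuming the four fields are packed most-significant-bit first into 3 big-endian bytes; the function names are illustrative, not part of the proposal):

    def float24_encode(schnorr: bool, y_odd: bool, mantissa: int, exponent: int) -> bytes:
        """Pack [sig type][parity][17-bit mantissa][5-bit exponent] into 3 bytes."""
        assert 0 <= mantissa < 2**17 and 0 <= exponent < 2**5
        value = (int(schnorr) << 23) | (int(y_odd) << 22) | (mantissa << 5) | exponent
        return value.to_bytes(3, "big")

    def float24_decode(data: bytes):
        """Return (schnorr, y_odd, mantissa, exponent, threshold_in_sompi)."""
        value = int.from_bytes(data, "big")
        schnorr = bool(value >> 23)
        y_odd = bool((value >> 22) & 1)
        mantissa = (value >> 5) & (2**17 - 1)
        exponent = value & (2**5 - 1)
        return schnorr, y_odd, mantissa, exponent, mantissa * (2**exponent)

    # Example 2 below: 1,024 sompi with Schnorr encodes as mantissa=1, exponent=10.
    assert float24_decode(float24_encode(True, False, 1, 10))[-1] == 1_024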

    Script Template

    The address automatically generates a P2SH script with the following pattern:

    For non-zero thresholds:

    OP_IF
       <pubkey> OP_CHECKSIG[_ECDSA]
    OP_ELSE
       OP_TXINPUTINDEX OP_TXINPUTSPK OP_TXINPUTINDEX OP_TXOUTPUTSPK OP_EQUALVERIFY
       OP_TXINPUTINDEX OP_TXOUTPUTAMOUNT
       <threshold_value> OP_SUB
       OP_TXINPUTINDEX OP_TXINPUTAMOUNT
       OP_GREATERTHANOREQUAL
    OP_ENDIF
    

    Examples

    Example 1: Zero Threshold with Schnorr

    Public key X coordinate: 5fff3c4da18f45adcdd499e44611e9fff148ba69db3c4ea2ddd955fc46a59522 (32 bytes)
    Signature type: Schnorr (1)
    Y coordinate bit: 0 (reserved for Schnorr)
    Threshold: 0 sompi
    Float24: [1][0][00000000000000000][00000] (mantissa=0, exponent=0)
    

    Example 2: Small Threshold (1,024 sompi) with Schnorr

    Public key X coordinate: 5fff3c4da18f45adcdd499e44611e9fff148ba69db3c4ea2ddd955fc46a59522 (32 bytes)
    Signature type: Schnorr (1)
    Y coordinate bit: 0 (reserved for Schnorr)
    Threshold: 1,024 sompi
    Float24: [1][0][00000000000000001][01010] (mantissa=1, exponent=10)
    

    Example 3: Medium Threshold (100,000 sompi) with ECDSA (Even Y)

    Public key X coordinate: ba01fc5f4e9d9879599c69a3dafdb835a7255e5f2e934e9322ecd3af190ab0f6 (32 bytes)
    Signature type: ECDSA (0)
    Y coordinate bit: 0 (even Y)
    Threshold: 100,000 sompi
    Float24: [0][0][11000011010100000][00000] (mantissa=100000, exponent=0)
    

    Example 4: Medium Threshold (5,242,880 sompi) with ECDSA (Odd Y)

    Public key X coordinate: ba01fc5f4e9d9879599c69a3dafdb835a7255e5f2e934e9322ecd3af190ab0f6 (32 bytes)
    Signature type: ECDSA (0)
    Y coordinate bit: 1 (odd Y)
    Threshold: 5,242,880 sompi
    Float24: [0][1][00000000000000101][10100] (mantissa=5, exponent=20)
    

    Example 5: Large Threshold (~1,000 KAS) with Schnorr

    Public key X coordinate: 5fff3c4da18f45adcdd499e44611e9fff148ba69db3c4ea2ddd955fc46a59522 (32 bytes)
    Signature type: Schnorr (1)
    Y coordinate bit: 0 (reserved for Schnorr)
    Threshold: 99,999,547,392 sompi (~999.995 KAS)
    Float24: [1][0][10111010010000111][10100] (mantissa=95367, exponent=20)
    

    Alternative Approaches

    Alternative 1: Adjusting Bit Allocation for Full Range Coverage

    One alternative approach would be to adjust the bit allocation to provide coverage for the full range of possible Kaspa values. By moving 1 bit from the mantissa to the exponent, we could use:

    [1 bit: signature type] [1 bit: Y coordinate parity] [16 bits: mantissa] [6 bits: exponent]
    

    With this adjustment:

    • Mantissa: 16-bit unsigned integer (0-65,535)
    • Exponent: 6-bit unsigned integer (0-63)

    The maximum representable value would be:
    65,535 * 2^63 ≈ 6.04 * 10^23 sompi

    This would easily cover the maximum Kaspa supply of 2.9 * 10^18 sompi with significant headroom. The tradeoff would be slightly less precision in the mantissa (16 bits instead of 17 bits), but 65,535 distinct mantissa values should still provide sufficient granularity for practical threshold purposes.

    Alternative 2: Non-Power-of-2 Exponent Base

    Another alternative would be to use a non-power-of-2 base for the exponent, such as 10:

    threshold = mantissa * (10^exponent)
    

    This would make the values more human-readable and potentially allow for more intuitive threshold settings. However, this approach has significant drawbacks:

    • Performance Penalties: Computing powers of 10 is more computationally expensive than powers of 2, which can be implemented as simple bit shifts.
    • Precision Issues: Powers of 10 cannot be represented exactly in binary, potentially leading to rounding errors.

    For these reasons, the power-of-2 approach is preferred for the Float24 format.

    Benefits

    • Practical Range: Float24 can represent values from 0 sompi to well beyond most practical threshold needs
    • Zero Threshold Support: Enables simple same-address spending without value constraints
    • Signature Type Choice: Allows selection between ECDSA and Schnorr signatures
    • Complete Public Key Information: Preserves Y coordinate parity for ECDSA keys

    Implementation Considerations

    • Bech32 Encoding: The complete 35-byte payload would be encoded using bech32
    • Script Generation: Wallet software would need to generate the appropriate script based on the Float24 value and signature type (see the sketch after this list)
    • Validation: Address validation would need to check the Float24 format is valid
    • Compatibility: Existing software would need updates to recognize and handle the new address type
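
    A rough sketch of the script-generation step (illustrative only; opcode spellings follow the template above, while the exact ECDSA checksig opcode name and the serialization of data pushes are assumptions left out of scope):

    def threshold_script(pubkey_x_hex: str, schnorr: bool, threshold_sompi: int) -> list[str]:
        """Render the non-zero-threshold P2SH template as a list of script tokens."""
        checksig = "OP_CHECKSIG" if schnorr else "OP_CHECKSIG_ECDSA"  # exact ECDSA name assumed
        return [
            "OP_IF",
            pubkey_x_hex, checksig,
            "OP_ELSE",
            "OP_TXINPUTINDEX", "OP_TXINPUTSPK", "OP_TXINPUTINDEX", "OP_TXOUTPUTSPK", "OP_EQUALVERIFY",
            "OP_TXINPUTINDEX", "OP_TXOUTPUTAMOUNT",
            str(threshold_sompi), "OP_SUB",
            "OP_TXINPUTINDEX", "OP_TXINPUTAMOUNT",
            "OP_GREATERTHANOREQUAL",
            "OP_ENDIF",
        ]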

    Summary

    The Float24 Threshold Address format provides an efficient way to encode threshold values directly in Kaspa addresses, enabling powerful compounding and borrowing functionality without requiring consensus changes. This proposal builds on the transaction introspection capabilities introduced in KIP-10 while maintaining compatibility with existing P2SH infrastructure.

    By standardizing this address format, we can enable a new class of applications that leverage threshold-based spending conditions, particularly for mining pools and other services that benefit from auto-compounding UTXOs. The support for both ECDSA and Schnorr signatures, along with a wide range of threshold values, provides flexibility for various financial applications.

    ]]>
    https://research.kas.pa/t/kip-draft-threshold-address-format-for-kip-10-mutual-transactions/346#post_1 Thu, 22 May 2025 08:23:49 +0000 research.kas.pa-post-566
    A Basic Framework For Proofs Stitching Introduction

    In an atomic synchronous composability model, a single transaction may invoke execution in several logic zones (conceptually, smart contracts or rollups). A priori, synchronous composability ambitions appear to be at odds with ZK execution scaling—if provers and verifiers are forced to ZK prove and verify such involved transactions as a whole, the separation into distinct and independent logic zones will effectively collapse to a single monolithic design.

    This post outlines a proof of concept for how provers of various logic zones can cumulatively create proofs for multi logic zone transactions, while each is only required to provide proofs limited to executions of code in their respective logic zone. This constitutes an initial but essential step to potentially prevent the aforementioned collapse.

    The main idea is stitching together several heterogeneous zk proofs in a coherent manner. In that regard, the framework bears some resemblance to works like LegoSNARK, but unlike those, it merely uses SNARKs in a black-box manner.

    Subtransactions

    In our setting, multiple logic zones execute within a single transaction, sequentially calling on one another, where the initial logic zone and its input arguments are determined by the transaction’s contents. Each zone maintains its own internal state, isolated from others, which can neither read from nor write to it.

    The execution trace derived from the above can be broken into continuous segments I refer to as subtransactions. Each subtransaction represents a continuous block of execution in one logic zone and is executed in a strict sequence, with a call stack managing the flow between them. To account for atomicity, the call stack is extended with a special element \bot that, when pushed to the stack, denotes that the transaction as a whole has failed.

    A subtransaction can be categorized by three elements:

    Inputs:

    • The internal state of the logic zone executed, at the outset of the subtransaction.
    • The input call stack, with the executed logic zone and the current local variables (possibly including a program counter) stored as a pair at the stack’s top. This call stack allows managing proper context switching between the subtransactions.

    Outputs:

    • The updated internal state of the executed logic zone, after execution of the subtransaction.
    • The output call stack, capturing any changes made to the call stack as a result of a call to a new logic zone (update of the top’s local variables, and a push of a new pair to the stack consisting of the called logic zone, and the local variables initialized by the call’s arguments) or a return from one (A pop of the current element of the stack, and an extension of the new top’s local variables with a new variable denoting the return value).

    Ordinal Number:

    A subtransaction is assigned an index representing its position in the overall sequence of execution. This explicit ordering will be useful for efficiently stitching subtransactions together.

    Zero-Knowledge Proof Messages

    A proof message is constructed as a tuple

    (index, inputs, outputs, \pi),

    conceptually representing a subtransaction’s execution.

    Namely:

    • index: ordinal number i of the subtransaction.
    • inputs: The starting inputs, including the executed logic zone’s internal state and the input call stack.
    • outputs: The resulting outputs, including the updated internal state and the output call stack.
    • \pi: a zero-knowledge proof demonstrating that, given these inputs, a subtransaction with index i correctly produces the specified outputs.

    The logic zone responsible for executing the subtransaction, as well as the next logic zone in line, are implicitly identified within the call stack data.

    Proofs can be internally valid (i.e. \pi verifies the subtransaction as stated correctly) even if they do not represent a computation that occurs in the transaction’s proper execution trace. Proofs hence need to be treated as conditional, and not finalized until they are “stitched” (see below) together with others up to the transaction’s conclusion.
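
    For concreteness, a proof message under this layout might be assembled as follows (a minimal sketch using the same dictionary shape consumed by the stitching code further below; the proof object itself is treated as opaque):

    def build_proof_message(index, in_state, in_stack, out_state, out_stack, proof):
        """Assemble an (index, inputs, outputs, π) proof message as a plain dictionary."""
        return {
            "index": index,                  # ordinal number i of the subtransaction
            "inputs": {
                "internalState": in_state,   # executed zone's internal state at the outset
                "callStack": in_stack,       # input call stack (its top identifies the zone)
            },
            "outputs": {
                "internalState": out_state,  # executed zone's state after the segment
                "callStack": out_stack,      # call stack after any call or return
            },
            "π": proof,                      # ZK proof that these inputs yield these outputs
        }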

    Proving Conditional Proofs

    Provers each specialize in a designated logic zone and produce proofs for its corresponding subtransactions.

    To generate a proof of a certain subtransaction’s execution, a prover needs to have the inputs of that subtransaction. These intermediary values may depend on the results of subtransactions in other logic zones. Provers could acquire the results of these intermediary computations in one of two ways:

    • Communicate intermediary results of subtransactions with each other via an offchain provers’ network.
    • Execute (but not prove) subtransactions of other logic zones in order to derive the intermediary results they require by themselves.

    Proofs Stitching

    All proof messages, once submitted and verified, are stored in a public database. They are assumed to be submitted in an unordered manner. To finalize a transaction, a sequence of proofs needs to be stitched together sequentially, in a manner such that outputs and inputs are consistent, starting from the initial subtransaction and the global state of all associated logic zones at its inception, until either the stack is empty or, alternatively, a \bot has been pushed to it.

    An implementation for stitching is given below.

    class TransactionStitcher:
        FAILURE_SIGN = '⊥'
    
        class StitchedChain:
            def __init__(self, initial_call_stack, initial_states):
                """
                :param initial_call_stack: The initial call stack object. It must include a field
                                           'currentLogicZone' that identifies the active logic zone.
                :param initial_states: A dict mapping each logic zone to its internal state at transaction inception.
                """
                self.proof_list = []  # List of proofs in forward order.
                self.latest_writes = dict(initial_states)  # zone -> internal state (latest updates)
                self.current_call_stack = initial_call_stack  # The current call stack.
    
        def __init__(self, initial_stack, initial_states):
            """
            :param initial_stack: The initial call stack (an object with a field 'currentLogicZone').
            :param initial_states: A dict mapping each logic zone to its initial internal state.
            """
            # proofs_by_index maps an index to a dict:
            # key: (internalState, callStack) from the proof's inputs, value: the proof.
            self.proofs_by_index = {}
            self.stitched_chain = self.StitchedChain(initial_stack, initial_states)
            self.initial_states = dict(initial_states)  # For failure case.
    
        # --- Main Method ---
    
        def submit_proof_message(self, proof):
            """
            Processes a new publicly verified proof by storing it by index and attempting
            to extend the active forward-growing chain.
            Outputs the result of the transaction if it can be fully stitched together.

            :param proof: A dictionary representing a proof with keys:
                - 'index': int, the subtransaction's ordinal index.
                - 'inputs': dict with keys:
                      'callStack': an object with a field 'currentLogicZone',
                      'internalState': the state of the active logic zone at the start.
                - 'outputs': dict with keys:
                      'callStack': an object (same structure as inputs['callStack']),
                      'internalState': the updated state after execution.
                - 'π': the zero-knowledge proof data (not used in stitching logic).
            """
            self.store_proof(proof)
    
            while True:
                candidate = self.lookup_candidate()
                if candidate:
                    self.extend_chain_with_candidate(candidate)
                    if self.is_candidate_ending(candidate):
                        if self.is_failure(candidate['outputs']['callStack']):
                            return self.initial_states  # Failure: return initial state.
                        else:
                            return self.stitched_chain.latest_writes  # Success: return latest writes.
                else:
                    break
            return None
        # --- Helper Functions ---
    
        def is_empty(self, call_stack):
            """
            Determine if the call stack is empty.
            (Implementation left out – should return True if call_stack represents an empty stack.)
            """
            pass
    
        def is_failure(self, call_stack):
            """
            Determine if the call stack signals failure (i.e. contains FAILURE_SIGN as the logic zone at the top).
            (Implementation left out.)
            """
            pass

        def is_candidate_ending(self, proof):
            """
            Returns True if the proof's output call stack indicates termination:
            either an empty call stack (success) or a failure signal.
            """
            cs = proof['outputs']['callStack']
            return self.is_empty(cs) or self.is_failure(cs)
    
        def get_candidate_key(self, proof):
            """
            Returns a tuple (internalState, callStack) from the proof's inputs for dictionary lookup.
            """
            return (proof['inputs']['internalState'], proof['inputs']['callStack'])
    
        def store_proof(self, proof):
            """
            Stores the proof in the proofs_by_index dictionary.
            """
            index = proof['index']
            key = self.get_candidate_key(proof)
            self.proofs_by_index.setdefault(index, {})[key] = proof
    
        def lookup_candidate(self):
            """
            Looks up and returns a candidate proof for the next index using the expected inputs,
            or returns None if no candidate exists.
            """
            active_zone = self.stitched_chain.current_call_stack.currentLogicZone
            expected_state = self.stitched_chain.latest_writes.get(active_zone)
            expected_stack = self.stitched_chain.current_call_stack
            next_index = len(self.stitched_chain.proof_list) + 1
            candidate_key = (expected_state, expected_stack)
            return self.proofs_by_index.get(next_index, {}).get(candidate_key, None)
    
        def extend_chain_with_candidate(self, candidate):
            """
            Extends the stitched chain with the given candidate proof and updates the chain's state.
            """
            self.stitched_chain.proof_list.append(candidate)
            # The updated internal state belongs to the zone that executed the subtransaction,
            # i.e., the zone identified at the top of the *input* call stack (the output stack's
            # top already points at the next zone to run, or is empty at termination).
            executed_zone = candidate['inputs']['callStack'].currentLogicZone
            self.stitched_chain.latest_writes[executed_zone] = candidate['outputs']['internalState']
            self.stitched_chain.current_call_stack = candidate['outputs']['callStack']
    
       
    
    

    Final Notes

    1. The precise identity of TransactionStitcher in the ecosystem remains to be finalized. It could be invoked within each logic zone separately, at a single stitching specialized logic zone, or even delegated to the L1. Regardless, it must be aware of the standard used to encode data (inputs and outputs) in proof messages.
    2. Whole transactions could be proven ahead of time and stitched together in a similar manner. Transactions notably form a dependency DAG between themselves according to their associated logic zones. Two branches of this DAG could potentially advance themselves independently of the other.
    3. Proof messages of the same logic zones will likely be submitted by provers in batches (likely even including messages from several transactions). Parsimonious encoding of these batches, and correspondingly decoding their contents, warrants more detailed discussion.
    4. A more general setting could consider parallelizable calls within the transaction. Such intra-transaction parallelism opens up various questions on data races prevention, determinism, and read and write permissions, which at the moment do not appear justified from a practical perspective.
    5. Challenge for the reader: must a proof message truly commit to the index of the subtransaction? Does it need the state of the stack in its entirety?
    ]]>
    https://research.kas.pa/t/a-basic-framework-for-proofs-stitching/323#post_1 Wed, 02 Apr 2025 12:08:08 +0000 research.kas.pa-post-543
    Crescendo Hardfork discussion thread Being tested … Hope it will be ok

    ]]>
    https://research.kas.pa/t/crescendo-hardfork-discussion-thread/279#post_6 Wed, 05 Mar 2025 19:42:14 +0000 research.kas.pa-post-530
    Crescendo Hardfork discussion thread Any decision about KIP-15?

    ]]>
    https://research.kas.pa/t/crescendo-hardfork-discussion-thread/279#post_5 Thu, 27 Feb 2025 22:12:19 +0000 research.kas.pa-post-525
    On the design of based ZK rollups over Kaspa's UTXO-based DAG consensus
    michaelsutton:

    OpChainBlockHistoryRoot: An opcode that provides access to the ordered_history_merkle_root field of a previous chain block. This opcode expects a block hash as an argument and fails if the block has been pruned (i.e., its depth has passed a threshold) or is not a chain-block from the perspective of the merging block executing the script. It will be used to supply valid anchor points to which the ZKP can prove execution.

    I want to emphasize the fact that this opcode is stateful, which breaks the assumption that a script’s validity depends only on the transaction itself and its previous UTXOs. This means that a transaction that spends a UTXO with such an opcode might be invalidated without an intentional double spend, which might harm the UX of the recipient.

    This can be dealt with by introducing a concept similar to coinbase maturity for such transactions, or by developing an off-chain mechanism that helps users identify such transactions (and dependent transactions) so they can set a higher confirmation time.
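
    A toy wallet-side heuristic along these lines (my own sketch; the opcode name is taken from the quote above, while the maturity depth and the way scripts are inspected are placeholders):

    STATEFUL_OPCODES = {"OpChainBlockHistoryRoot"}  # opcodes whose validity depends on chain state
    MATURITY_DEPTH = 120                            # illustrative confirmation requirement

    def is_spendable(script_opcodes: set[str], confirmations: int) -> bool:
        """Require extra confirmations before treating outputs of stateful-script txns as final."""
        required = MATURITY_DEPTH if script_opcodes & STATEFUL_OPCODES else 1
        return confirmations >= required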

    ]]>
    https://research.kas.pa/t/on-the-design-of-based-zk-rollups-over-kaspas-utxo-based-dag-consensus/208#post_9 Sun, 23 Feb 2025 15:17:32 +0000 research.kas.pa-post-521
    Crescendo Hardfork discussion thread Dear community - I propose to consider including KIP-15 to the Crescendo HF.

    The change is tiny (literally a few lines of code) but still IMHO very powerful.

    ]]>
    https://research.kas.pa/t/crescendo-hardfork-discussion-thread/279#post_4 Sun, 23 Feb 2025 12:54:08 +0000 research.kas.pa-post-520
    KIP 15 discussion thread This is the discussion thread for KIP 15

    ]]>
    https://research.kas.pa/t/kip-15-discussion-thread/303#post_1 Sun, 23 Feb 2025 09:07:49 +0000 research.kas.pa-post-519
    Additional practical considerations re hash function and zk opcodes it does seem PLONK is the right place to invest resources beyond just shipping for the moment. A culture of hard forks seems to work against a chain’s interests so elliptic curve-based systems in general seem problematic

    From the STARK end, JOLT is cool but from what I understand there’s still alot of precompile tinkering in that kind of system

    ]]>
    https://research.kas.pa/t/additional-practical-considerations-re-hash-function-and-zk-opcodes/219?page=2#post_25 Wed, 12 Feb 2025 18:14:53 +0000 research.kas.pa-post-517
    Additional practical considerations re hash function and zk opcodes yeah, I know it too … any suggestions ? PLONK does not require TS per APP … but Groth16 is still state of the art in the space. …

    ]]>
    https://research.kas.pa/t/additional-practical-considerations-re-hash-function-and-zk-opcodes/219?page=2#post_24 Thu, 06 Feb 2025 09:25:10 +0000 research.kas.pa-post-510
    Additional practical considerations re hash function and zk opcodes Thank you for reading !

    ]]>
    https://research.kas.pa/t/additional-practical-considerations-re-hash-function-and-zk-opcodes/219?page=2#post_23 Thu, 06 Feb 2025 09:22:57 +0000 research.kas.pa-post-509
    A proposal towards elastic throughput In the following post I’m discussing transient storage as one type of resource that is capped per block. The proposed method can be applied to other resources as well, e.g., persistent_storage_mass, compute_mass.

    Prepaid Transient Storage: A Proposal

    Fullnodes on commodity hardware require strict transient storage limits. Currently, a 200GB cap per pruning epoch (~42 hours) is enforced. The enforcement mechanism uniformly distributes this cap across blocks, allocating an equal fraction of storage per block. This approach limits flexibility and prevents occasional large transactions from utilizing excess available storage, even when the global transient storage cap is not exceeded.

    Problem with the current enforcement mechanism

    The current mechanism enforces a fixed per-block limit:

    transient_storage_mass_epoch_cap / total_blocks_per_epoch
    

    This method prevents miners from accommodating natural fluctuations in transaction size. If some blocks use less than their allocated storage, the excess cannot be transferred to other blocks. As a result, transactions requiring more storage than a single block’s allocation are not feasible, even when total transient storage remains within the global cap.

    Example usecase 1: supporting high peak txn demand (aka elastic throughput)

    If a miner expects high transaction demand within an epoch, such as during peak hours, or needs to guarantee block space for users who have prepaid for transaction approval, they can mine underutilized blocks at the beginning of an epoch. This reserves transient storage for future blocks within the same epoch, ensuring that miners can handle anticipated transaction surges efficiently and allocate block space as needed. (Implicitly, I’m assuming here the method is applied to compute mass as well.)

    The idea to support elastic throughput came to me from Gregory Maxwell and Meni Rosenfeld’s elastic block cap ideas (around the big block debates), https://bitcointalk.org/index.php?topic=1078521.msg11517847#msg11517847. The gap between max capacity and peak capacity is of greater importance in Bitcoin vs Kaspa, since the former (supposedly) employs no pruning. Practically though, Kaspa’s pruning epoch of ~42 hours corresponds to more than 4 months’ worth of data growth in Bitcoin. In short, the peak vs avg gap is relevant to Kaspa in-pruning-epoch as well.

    Example usecase 2: supporting native STARK rollups

    A zk rollup entity may seek to implement native STARK verification using arithmetic field operations. Unlike SNARKs, STARKs do not require a trusted setup and offer quantum resistance, making them especially attractive for zk rollup infra. However, STARK proofs and the verifier size are significantly larger than SNARK proofs and verification scripts, potentially exceeding a few hundred KBs. Under the current enforcement mechanism, such proofs may not fit within a single block, making STARK-based rollups cumbersome or requiring them to go through a SNARK reduction - a legit construction, though one that slightly compromises the trustless property (it’s not too bad, since the PLONK SNARK setup is universal and updatable).

    Proposed Solution: prepaid transient storage

    To enable occasional publication of large blocks while maintaining the global transient storage cap, miners should be able to accumulate transient storage credits by underutilizing previous blocks. The term credits is used metaphorically here, as a conceptual device within the consensus/fullnode.

    When a miner produces a block B, the transient storage consumed is recorded as transient_storage_mass(B). If transient_storage_mass(B) < transient_storage_mass_cap, the difference is stored as credit. In a future block X, the miner may prove via digital signature that it is the miner of block B, and utilize:

    transient_storage_mass_cap + (transient_storage_mass_cap - transient_storage_mass(B))
    

    Generalizing this across multiple previously mined blocks B_1, ..., B_n, the total allowable transient storage in block X is:

    C = (n+1) * transient_storage_mass_cap - Σ transient_storage_mass(B_i)
    

    The full node then charges X’s excess usage proportionally against the previously mined blocks (consuming their credits):

    transient_storage_mass(B_i) += (transient_storage_mass(X) - transient_storage_mass_cap) / n
    

    This mechanism enables miners to accumulate storage credits and later use them for transactions requiring more storage in a single block, ensuring better resource allocation while adhering to the global cap.
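
    A minimal sketch of the credit arithmetic above (my own illustration; how the consumed credit is then recorded back against the referenced blocks is left to the proposal’s bookkeeping):

    def allowable_transient_mass(cap: int, prior_masses: list[int]) -> int:
        """C = (n+1)*cap - sum(mass(B_i)) for a block X referencing n earlier blocks B_1..B_n."""
        n = len(prior_masses)
        return (n + 1) * cap - sum(prior_masses)

    def can_include(cap: int, prior_masses: list[int], requested_mass: int) -> bool:
        """Check whether a large block fits within the credits accumulated so far."""
        return requested_mass <= allowable_transient_mass(cap, prior_masses)

    # Example: cap = 100 units and three earlier blocks used 40, 60 and 70 units,
    # so the next block may use up to 4*100 - 170 = 230 units.
    assert allowable_transient_mass(100, [40, 60, 70]) == 230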

    Elastic throughput and DAGKNIGHT (DK)

    Recall that larger blocks propagate slower, which widens the DAG. Fortunately, the DK protocol can handle dynamic DAG widths, by readjusting the parameter k in real time. Even with DK, some hard cap on individual blocks’ sizes must be applied, e.g., each block shouldn’t exceed 2 MB.

    Solo miners and the prepaid approach

    One not accustomed to Kaspa’s high block creation rate – the upcoming 10 per second – might find this whole approach of prepaid block space awkward. However, with 10 bps and beyond, it is likely that the mining market will change and adjust. In particular, some service providers – e.g., wallets, rollup/prover teams – may find it profitable to either mine only (or primarily) their own users’ txns or to engage in some agreement with existing miners. This gives rise to the notion that mined blocks – the economics of mining txns – will sometimes reflect the needs of specific entities and sectors, alongside ordinary generic mining nodes. (All of the latter rambling describes offchain economics; no entity will receive privileged treatment from consensus’ POV.)

    Notes

    • Emphasizing again that this approach can be applied to other resource constraints as well, such as persistent storage and compute mass limits. Though, it seems particularly relevant for transient storage.
    • One caveat of this approach is that, by providing proof of mining of previous blocks, miners link their mined blocks, thereby reducing their anonymity. However, most miners seem to not actively conceal their identity, so this is unlikely to be a significant issue.
    • After pruning, all blocks’ credits must be zeroed, because the global transient storage cap is relevant for, and enforced per, pruning epoch (thank you @coderofstuff for this comment and general proofreading)
    ]]>
    https://research.kas.pa/t/a-proposal-towards-elastic-throughput/295#post_1 Wed, 05 Feb 2025 23:08:47 +0000 research.kas.pa-post-508