This article keeps the idea simple and practical. You’ll get a clear definition, reasons to use (and not use) the approach, and a generic .NET reference you can adapt to almost any context.
A cell is not “a microservice” and it’s not “a region”. A cell is a self‑sufficient slice of your workload that can serve a subset of traffic on its own. Think of it as a small copy of the system’s operational shape: API entry points, background workers if you have them, a cache if you need it, and (this is the part people often underestimate) its own state or a clearly separated state partition. The cell should be something you can scale, deploy, throttle, and even drain out of rotation without breaking the global system.
A typical view looks like this:
flowchart LR
  U[Clients] --> R[Cell Router / Gateway]
  R --> C0[Cell 00: API + Workers + Cache + Data partition]
  R --> C1[Cell 01: API + Workers + Cache + Data partition]
  R --> C2[Cell 02: API + Workers + Cache + Data partition]
  R --> CX[...]
The router decides which cell receives a request. The decision is usually driven by a partition key: tenant id, account id, region, cohort, or any other stable attribute that matches how your traffic naturally splits.
It helps to be explicit about what CBA doesn’t magically solve. It isn’t a synonym for microservices. You can have dozens of microservices and still have one shared database and one shared failure domain. It also isn’t automatically “multi‑region”; cells can live inside a single region if your main goal is isolation rather than geo redundancy. And it isn’t Domain‑Driven Design. DDD partitions by meaning, ownership, and business language. Cells partition by operational isolation and blast radius. They can coexist nicely, but they answer different questions.
The main value of CBA is not theoretical elegance. It’s everyday operational safety. When you isolate the workload into cells, you get failure containment almost for free. If cell‑03 is having trouble (maybe a bad deployment, maybe a hot tenant, maybe a misbehaving downstream dependency), you can reduce or remove traffic to that one cell while the rest keeps working. Capacity may go down, but the system doesn’t collapse into a single global incident.

Deployments become calmer as well. Instead of releasing to the whole fleet, you release to a single “canary cell”, observe, and then promote. This sounds like canary deployments in general, but the difference is that a cell gives you a bounded unit where not only the API binaries change: the local cache, queues, workers, and data partition behave together. You are testing a full slice of reality.

Finally, scaling becomes more predictable. If load spikes are localized (often true in multi‑tenant systems), you can scale the impacted cells without resizing everything else, which is healthier for cost and for operational focus.
Cell‑based architecture is most compelling when your workload already “wants” to be partitioned. If you serve multiple tenants or customer groups with different behaviors, CBA gives you a clean way to prevent one from hurting the others. If you operate in a regulated or mission‑critical environment where a broad outage is unacceptable, cells provide an isolation boundary that is easy to reason about in incident response. And if deployments are routinely stressful because a change can take down too much at once, the canary‑cell rollout model is a concrete way to reduce risk without slowing delivery. In other words, CBA shines when you can answer two questions with confidence: “what is my partition key?” and “can I keep state isolated per partition?”
CBA is not cheap. What you gain in independence, you pay for in duplication.
If your system is small enough that failures are manageable with simpler patterns, cells are likely overkill. If your traffic does not partition naturally, routing becomes arbitrary, and you’ll spend your time arguing about why a request should go to cell‑04 rather than cell‑07. If you cannot isolate state and you’re not ready to change that, cells won’t deliver their promise; you’ll end up with replicated stateless tiers all depending on the same stateful bottleneck.
A good practical check is this: if your “cells” share a database in a way that allows a global lock, a global schema migration, or a global saturation to take everything down, then you don’t have cell isolation yet. You have replicas.
People often start from routing and deployments because they’re visible and exciting. The real architecture work is state. The cleanest model is database per cell, because it gives you true isolation and straightforward failure boundaries. The more common compromise is a shard per cell: same database technology, but strict partitioning so that each cell owns its slice. Some teams start with a shared database and a strong partitioning discipline as a transition phase, but you should treat that as a temporary step. Shared state tends to reintroduce shared fate over time. State isolation also implies a mindset shift: cross‑cell synchronous calls should be rare. When information needs to flow between partitions, asynchronous integration (events, outbox/inbox patterns, projections) keeps the isolation boundary intact.
This section stays deliberately use‑case‑agnostic. Whether you run on Kubernetes, VMs, or managed services, the logical responsibilities remain the same. You’ll typically have a cell router at the edge and a replicated cell workload behind it. A control plane is optional at the beginning, but it becomes valuable once you need rebalancing and operational automation. In .NET, a pragmatic router can be built with YARP (Yet Another Reverse Proxy). YARP is a production‑grade reverse proxy that supports routing, transforms, health checks, and dynamic configuration.
The basic idea is simple:

- read a stable partition key from the request (for example, an X-Tenant-Id header);
- compute or look up the cell for that key;
- forward the request to the cluster that backs that cell.

There are two common ways to assign cells. A deterministic hash is the easiest place to start. It requires no lookup store and it gives you stable placement across restarts. The trade‑off is rebalancing: moving a tenant between cells becomes a project. A lookup table (tenant → cell) is more flexible. It enables hot‑tenant isolation and controlled migrations, but it introduces a dependency you must make highly available and fast. In practice, teams often start with hashing and later add a mapping layer once operational needs demand it. The reference below uses hashing because it’s the smallest thing that can work.
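To make the lookup‑table option concrete, here is a minimal sketch of a mapping layer with a hash fallback. The class name and the in‑memory dictionary are illustrative assumptions; in production the mapping would live in a fast, highly available store. The fallback reuses the same SHA‑256 scheme as the router middleware in the reference, so both strategies agree on placement.

```csharp
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;

// Hypothetical mapping layer: explicit entries allow hot-tenant isolation and
// controlled migrations; unmapped tenants fall back to deterministic hashing.
public sealed class CellMap
{
    private readonly ConcurrentDictionary<string, string> _overrides = new();
    private readonly int _cellCount;

    public CellMap(int cellCount) => _cellCount = cellCount;

    // Pin a tenant to a specific cell (e.g., during a migration).
    public void Assign(string tenantId, string cellId) => _overrides[tenantId] = cellId;

    public string Resolve(string tenantId)
    {
        if (_overrides.TryGetValue(tenantId, out var cell))
            return cell;

        // Deterministic fallback: same hashing as the router middleware.
        var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(tenantId));
        var value = BitConverter.ToInt32(bytes, 0) & int.MaxValue;
        return $"cell-{value % _cellCount:00}";
    }
}
```

A tenant move then becomes an explicit `Assign` call followed by a data migration, instead of a change to the hash function.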
Here is a minimal appsettings.json excerpt that defines one cluster per cell. The addresses are placeholders; in Kubernetes they could be service DNS names, elsewhere they can be regular endpoints.
{
"ReverseProxy": {
"Clusters": {
"cell-00": {
"Destinations": { "d1": { "Address": "http://cell-00-api/" } }
},
"cell-01": {
"Destinations": { "d1": { "Address": "http://cell-01-api/" } }
}
}
}
}
For a large number of cells you will not want to hardcode routes and clusters. You’ll generate them from a config store or service discovery. But starting simple helps you validate the pattern quickly.
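When you reach that point, YARP’s in‑memory configuration is a natural fit. The sketch below is an assumption‑laden example (the `BuildCellConfig` name is invented, addresses are placeholders) that generates one cluster per cell plus one route matching on an X-Cell-Id header:

```csharp
using Yarp.ReverseProxy.Configuration;

// Generate one header-matched route and one cluster per cell instead of
// hardcoding JSON. In practice the addresses would come from service
// discovery or a control plane.
static (IReadOnlyList<RouteConfig> Routes, IReadOnlyList<ClusterConfig> Clusters) BuildCellConfig(int cellCount)
{
    var routes = new List<RouteConfig>();
    var clusters = new List<ClusterConfig>();

    for (var i = 0; i < cellCount; i++)
    {
        var cellId = $"cell-{i:00}";

        routes.Add(new RouteConfig
        {
            RouteId = cellId,
            ClusterId = cellId,
            Match = new RouteMatch
            {
                Path = "/{**catch-all}",
                Headers = new[] { new RouteHeader { Name = "X-Cell-Id", Values = new[] { cellId } } }
            }
        });

        clusters.Add(new ClusterConfig
        {
            ClusterId = cellId,
            Destinations = new Dictionary<string, DestinationConfig>
            {
                ["d1"] = new DestinationConfig { Address = $"http://{cellId}-api/" }
            }
        });
    }

    return (routes, clusters);
}

// Wire-up, replacing LoadFromConfig:
// var (routes, clusters) = BuildCellConfig(10);
// builder.Services.AddReverseProxy().LoadFromMemory(routes, clusters);
```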
The router needs to compute a cell for each request. The snippet below reads a partition key from X-Tenant-Id, hashes it, and injects X-Cell-Id. This is intentionally generic: replace the header with whatever key makes sense in your system.
using System.Security.Cryptography;
using System.Text;
public sealed class CellAssignmentMiddleware
{
private readonly RequestDelegate _next;
private readonly int _cellCount;
public CellAssignmentMiddleware(RequestDelegate next, int cellCount) =>
(_next, _cellCount) = (next, cellCount);
public async Task Invoke(HttpContext ctx)
{
var key = ctx.Request.Headers["X-Tenant-Id"].ToString();
if (string.IsNullOrWhiteSpace(key))
{
ctx.Response.StatusCode = StatusCodes.Status400BadRequest;
await ctx.Response.WriteAsync("Missing X-Tenant-Id.");
return;
}
var cellId = ComputeCellId(key, _cellCount); // "cell-03"
ctx.Items["cell.id"] = cellId;
ctx.Request.Headers["X-Cell-Id"] = cellId;
await _next(ctx);
}
private static string ComputeCellId(string key, int cellCount)
{
var bytes = SHA256.HashData(Encoding.UTF8.GetBytes(key));
var value = BitConverter.ToInt32(bytes, 0) & int.MaxValue;
var idx = value % cellCount;
return $"cell-{idx:00}";
}
}
A straightforward technique is to have one route per cell that matches on X-Cell-Id. It is verbose but it is extremely easy to understand and debug. Later, once you have a control plane, you can switch to dynamic routing.
{
"ReverseProxy": {
"Routes": {
"cell00": {
"ClusterId": "cell-00",
"Match": {
"Path": "/{**catch-all}",
"Headers": [{ "Name": "X-Cell-Id", "Values": ["cell-00"] }]
}
},
"cell01": {
"ClusterId": "cell-01",
"Match": {
"Path": "/{**catch-all}",
"Headers": [{ "Name": "X-Cell-Id", "Values": ["cell-01"] }]
}
}
}
}
}
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddReverseProxy()
.LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"));
var app = builder.Build();
app.UseMiddleware<CellAssignmentMiddleware>(10); // cellCount
app.MapReverseProxy();
app.Run();
This is enough to demonstrate the core behavior: a request enters the router, gets assigned a cell, and is forwarded to the correct backend.
Your APIs and workers inside a cell should feel like normal .NET services, with one important constraint: they must be cell‑aware for state and telemetry.
If you run on Kubernetes, the simplest approach is to inject CELL_ID as an environment variable. On VMs, it can come from configuration. Either way, treat the cell id as part of the service identity.
From there, select the correct data partition. The cleanest pattern is to resolve connection settings from a small “cell context” service. Whether that context picks a different connection string, schema, or shard key depends on your storage strategy, but the principle is the same: a cell must not quietly drift into shared state.
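A minimal sketch of such a cell context, assuming connection strings are keyed by cell id in configuration (the class and key names are illustrative):

```csharp
using Microsoft.Extensions.Configuration;

// The cell id is part of the service identity and selects the data partition.
// CELL_ID comes from the environment (Kubernetes) or from configuration (VMs).
public sealed class CellContext
{
    private readonly IConfiguration _config;

    public string CellId { get; }

    public CellContext(IConfiguration config)
    {
        _config = config;
        CellId = config["CELL_ID"]
            ?? throw new InvalidOperationException("CELL_ID is not set");
    }

    // One connection string per cell, e.g. "ConnectionStrings:cell-03".
    public string ConnectionString =>
        _config.GetConnectionString(CellId)
        ?? throw new InvalidOperationException($"No connection string for {CellId}");
}

// Registration at startup:
// builder.Services.AddSingleton<CellContext>();
```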
Once you introduce cells, incident response becomes “which cell is sick?”. That only works if your telemetry carries cell.id consistently.
In .NET, OpenTelemetry is a practical default. Add the cell id as a resource attribute at startup, so every trace and metric inherits it. The exporters and backend are your choice; the key is that cell identity becomes a first‑class dimension.
using OpenTelemetry.Resources;
using OpenTelemetry.Trace;
var cellId = Environment.GetEnvironmentVariable("CELL_ID") ?? "cell-unknown";
builder.Services.AddOpenTelemetry()
.ConfigureResource(r => r.AddService("sample.cell.api")
.AddAttributes(new Dictionary<string, object> { ["cell.id"] = cellId }))
.WithTracing(t => t
.AddAspNetCoreInstrumentation()
.AddHttpClientInstrumentation()
.AddOtlpExporter());
Even if you do not adopt OpenTelemetry today, keep the idea: every log line, trace, metric, and alert should know which cell it belongs to.
The most satisfying “cell moment” is the day you stop doing big‑bang releases. A simple operational loop is enough. Deploy the new version to one canary cell. Route a small cohort to it (or only internal traffic). Watch the basics: error rate, latency, saturation, and any business signals that matter. If it looks healthy, promote to the next cells in waves. If it doesn’t look healthy, drain the canary cell and roll back without impacting global traffic. You still have work to do, but you’ve turned a platform‑wide incident into a contained issue.
It’s common to worry that cells “fight” Domain‑Driven Design. In practice they don’t, as long as you don’t try to force one partitioning dimension to do everything. DDD tells you how to structure business ownership and boundaries. Cells tell you how to contain operational failures. Sometimes a cell will contain multiple bounded contexts because the partition key is tenant or region rather than domain. Other times a single bounded context might be replicated across cells. That is fine. The design becomes confusing only when you pretend these are the same concept.
You know your cell‑based architecture is real when a broken cell behaves like a capacity issue, not like a correctness issue. You can drain it, keep the rest serving, and your operational tools can tell you, within minutes, what went wrong and where. If you can’t confidently drain a cell, or if draining a cell breaks global behavior, you’re not done yet. The main value of CBA is not the diagram; it’s the operational control it gives you on a bad day.
Once the basic pattern works, you’ll almost certainly want two upgrades. First, introduce a mapping layer so you can move tenants between cells without changing hash functions. Second, automate the lifecycle: provisioning a new cell should be infrastructure‑as‑code, not a weekend project. But don’t start there. Start with a single router, a handful of cells, strict state separation, and cell‑aware telemetry. If those four pieces hold, you’ll have a platform you can grow with confidence.
When you work with PDND-based interoperability, it is tempting to think that the job ends when the e-service exchange is “correct” from a protocol standpoint: authentication works, the request reaches the provider, and a payload comes back. In reality, this is only the starting point.
What matters in Public Administration scenarios is what happens after the exchange: can the data be trusted, compared across time, reused by multiple processes, and evolved without breaking everything downstream? If interoperability is the highway, data quality is what determines whether the traffic is safe and sustainable.
This is why I want to adopt the Medallion Architecture as a first-class pattern for PDND data exchanges. Medallion (often described as Bronze → Silver → Gold) is not a technology choice; it is an operating model for improving data quality in stages. Instead of attempting to produce a “perfect” dataset immediately (which usually leads to brittle pipelines or hidden assumptions), Medallion establishes a controlled progression where each layer has a clear contract and a clear responsibility.
In practice, this layered approach solves three recurring problems that are particularly common in interoperability ecosystems:
First, traceability and replayability: PDND exchanges happen between multiple parties and evolve over time. When a contract changes, or when an issue is discovered months later, you need to be able to go back to the original exchanged payload and reprocess it with new rules. Without this, every correction becomes a one-off patch and historical data becomes inconsistent.
Second, semantic stability: even when providers follow the same interoperability rules, payload semantics can vary. Code sets can drift, optional fields can be missing, different versions can coexist, and interpretations can differ between organizations. Medallion gives you a place to define a stable internal meaning (Silver) that downstream systems can rely on, independently from how external payloads evolve.
Third, reusability at scale: once PDND data is available, it tends to be consumed by many different use-cases: reporting, controls, workflow automation, downstream services, auditing, and analytics. If every consumer cleans and normalizes data on its own, you end up with duplicated logic, conflicting numbers, and fragile processes. Medallion addresses this by centralizing data quality and publishing consumption-ready products (Gold) with explicit contracts.
The key insight is simple:
Medallion Architecture is a maturity path for interoperability data.
It turns “data exchanged” into “data that multiple systems can trust”, while keeping evolution manageable.
This post keeps examples intentionally generic and focuses on how to interpret Bronze/Silver/Gold specifically in a PDND context, using high-level .NET design patterns to express the contracts between layers.
Interoperability data naturally changes over time. Payloads can be refined, fields can be added, and semantics can be clarified. Without structure, these changes quickly force downstream consumers to either break or to implement their own ad-hoc interpretation logic, which leads to inconsistent results and fragile processes.
Medallion Architecture addresses this by giving you three well-defined “quality gates”:
Bronze preserves what was exchanged so you can always replay and reprocess. Silver standardizes and validates meaning through a canonical internal contract. Gold then packages that trusted meaning into consumption-ready data products aligned to real operational and business needs. The result is an interoperability pipeline that becomes more reliable as the ecosystem grows.
The Bronze layer captures the PDND exchange as it happened, staying as close as possible to the source. Its purpose is not to improve the data; it is to preserve evidence and enable replay.
In practice, Bronze stores the original payload (as exchanged) together with minimal technical metadata to keep it traceable across time and systems. A useful design choice here is to avoid long method signatures and “flat” records; instead, model the exchange as an envelope that carries both metadata and payload. This keeps contracts stable when metadata evolves (new identifiers, environment markers, schema hashes, etc.) without forcing changes across the codebase.
public sealed record PdndExchangeMetadata(
string CorrelationId,
DateTimeOffset ReceivedAtUtc,
string EServiceId,
string ProviderOrganization,
string ContractVersion
);
public sealed record PdndBronzeExchange(
PdndExchangeMetadata Meta,
string Payload // raw payload as exchanged (e.g., JSON/XML as string)
);
A helpful mental model is simple: Bronze is the truth you can always return to, especially when rules change or issues are discovered later.
Silver is where “as exchanged” data becomes a canonical internal representation with stable semantics. This is the layer where data quality is actively increased and formalized.
The central concept is the canonical contract: your internal model that preserves meaning even when external payload versions evolve. In Silver you enforce schema and types, normalize formats and code sets, validate both syntax and semantics, and deal explicitly with idempotency/deduplication in a deterministic way. Silver is also where you make data quality visible by quarantining invalid or incomplete records instead of silently dropping them.
To keep the design clean and signatures stable, a useful approach is to pass cross-cutting concerns (reference data, policies, environment-specific rules) through a processing context, rather than adding parameters to every method. This makes the pipeline easier to evolve and test.
public sealed record ProcessingContext(
string Environment,
IReferenceData ReferenceData,
IPolicySet Policies
);
public sealed record ValidationResult(bool IsValid, string? Reason);
public interface ICanonicalMapper<in TIn, out TOut>
{
TOut Map(TIn input, ProcessingContext ctx);
}
public interface IValidator<in T>
{
ValidationResult Validate(T item, ProcessingContext ctx);
}
With those contracts in place, your Silver pipeline becomes readable and explicit about the quality boundary it enforces. Importantly, it stays stable even when you add new rules or enrichments, because those evolve inside ProcessingContext rather than in method signatures.
public sealed class SilverPipeline<TBronze, TSilver>
{
private readonly ICanonicalMapper<TBronze, TSilver> _mapper;
private readonly IValidator<TSilver> _validator;
public SilverPipeline(
ICanonicalMapper<TBronze, TSilver> mapper,
IValidator<TSilver> validator)
{
_mapper = mapper;
_validator = validator;
}
public (TSilver? Valid, TSilver? Invalid, string? Reason) Process(
TBronze bronze,
ProcessingContext ctx)
{
var canonical = _mapper.Map(bronze, ctx);
var result = _validator.Validate(canonical, ctx);
return result.IsValid
? (canonical, default, default)
: (default, canonical, result.Reason);
}
}
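To illustrate how the pieces compose, here is a deliberately tiny (and entirely hypothetical) canonical model, with a mapper and validator wired into the pipeline; the types and rules are invented for the example:

```csharp
// Hypothetical canonical model: a stable internal meaning, independent of
// how the external payload is shaped.
public sealed record CanonicalExchange(string CorrelationId, string EServiceId);

public sealed class ExchangeMapper : ICanonicalMapper<PdndBronzeExchange, CanonicalExchange>
{
    public CanonicalExchange Map(PdndBronzeExchange input, ProcessingContext ctx) =>
        // Normalization example: trim and lower-case the external identifier.
        new(input.Meta.CorrelationId, input.Meta.EServiceId.Trim().ToLowerInvariant());
}

public sealed class ExchangeValidator : IValidator<CanonicalExchange>
{
    public ValidationResult Validate(CanonicalExchange item, ProcessingContext ctx) =>
        string.IsNullOrWhiteSpace(item.EServiceId)
            ? new ValidationResult(false, "Missing e-service id")
            : new ValidationResult(true, null);
}

// Usage:
// var pipeline = new SilverPipeline<PdndBronzeExchange, CanonicalExchange>(
//     new ExchangeMapper(), new ExchangeValidator());
// var (valid, invalid, reason) = pipeline.Process(bronze, ctx);
```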
Gold takes validated Silver data and turns it into consumption-oriented datasets, often referred to as data products. Gold does not necessarily make the data “more correct” than Silver; it makes it more usable for a specific audience and use-case.
In Gold you typically apply controlled enrichments (for example by joining reference data), define domain projections that match how processes and services consume the information, and introduce derived attributes or aggregates when they add clarity and value. The key architectural choice is to avoid a single “catch-all” dataset and instead publish targeted products with explicit contracts.
Here again, the same “clean signature” rule applies: keep your projector interfaces simple and push cross-cutting concerns into the context.
public interface IGoldProjector<in TSilver, out TGold>
{
TGold Project(TSilver silver, ProcessingContext ctx);
}
public sealed record GoldDataProductRow(
string BusinessKey,
DateOnly BusinessDay,
string Category,
string ProviderOrganization,
string EServiceId
// + fields shaped for a specific consumer/use-case
);
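A sketch of a projector for such a row, assuming a hypothetical Silver model; the derived `BusinessDay` shows a typical controlled enrichment:

```csharp
// Hypothetical Silver model, defined here only so the example is self-contained.
public sealed record SilverExchange(
    string BusinessKey,
    DateTimeOffset OccurredAtUtc,
    string Category,
    string ProviderOrganization,
    string EServiceId);

public sealed class DailyRowProjector : IGoldProjector<SilverExchange, GoldDataProductRow>
{
    public GoldDataProductRow Project(SilverExchange silver, ProcessingContext ctx) =>
        new(
            silver.BusinessKey,
            DateOnly.FromDateTime(silver.OccurredAtUtc.UtcDateTime), // derived attribute
            silver.Category,
            silver.ProviderOrganization,
            silver.EServiceId);
}
```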
A practical advantage of Medallion in PDND scenarios is that it allows exchanges to evolve safely. External contracts can change, but Bronze preserves the original payload for replay. Silver stabilizes internal meaning through canonical mapping and validation. Gold delivers consumption-ready data products whose contracts you control.
This prevents a common interoperability anti-pattern: each downstream consumer interprets the same PDND payload differently, re-implementing cleaning rules and code mappings in inconsistent ways. Over time that leads to fragile processes and conflicting numbers. Medallion keeps meaning centralized, explicit, and versionable.
Treat Bronze as immutable evidence. Ensure Silver transformations are deterministic so reprocessing remains trustworthy. Treat external versions as explicit contracts and keep your canonical representation stable. Make quarantine a normal part of the pipeline, because visibility is how quality improves. Finally, design Gold as a set of purpose-driven data products rather than a single generic dataset.
In PDND interoperability, the exchange is the beginning, not the end. Medallion Architecture provides a clean and repeatable path to convert “data exchanged across entities” into “data that multiple systems can trust”: Bronze preserves the evidence, Silver stabilizes the meaning, and Gold publishes consumption-ready products whose contracts you control.
In enterprise systems, especially in Public Administration (PA), interoperability means exchanging information that stays correct, interpretable, and auditable over time. That “over time” part is where most projects break: models drift, regulations evolve, suppliers change, legacy systems stay put, and the same business concept ends up represented in five different ways across five different databases.
This is a pragmatic walkthrough of what actually works in that environment. We’ll cover Anti-Corruption Layers, Integration Layers, message brokers, normalization, and a few reliability patterns that keep integrations from collapsing under real-world conditions. No vendor talk, no “just adopt microservices”, and no pretending that legacy doesn’t exist.
Integration is the act of connecting systems: sending requests, receiving responses, delivering messages, moving files. Interoperability is what happens after that connection exists: whether both parties understand the same data in the same way, whether they can evolve without breaking each other, and whether you can reconstruct the truth when something goes wrong.
In PA this gets harder because integrations are institutional. You’re not linking one “service” to another; you’re linking organizations with different standards, responsibilities, and timelines. And traceability is not negotiable. When a citizen’s case is impacted, “the system replied 200 OK” is rarely a meaningful answer. You need to know what changed, why, and which source triggered it.
A simple way to frame interoperability is this: it’s not about connectivity, it’s about shared meaning + controlled evolution + auditability.
Most interoperability failures are predictable because the same shortcuts keep happening. Point-to-point growth is the classic one. It starts as “just one integration”, then becomes six, then becomes thirty. Each endpoint carries slightly different assumptions, each transformation is implemented differently, and eventually nobody can explain what the system really believes about a piece of data.

Another shortcut is extracting analytics directly from operational databases. The first report works fine, then volumes grow, queries get heavier, indexes get tuned for reporting instead of operations, and the transactional workload slows down. At that point you choose between degrading operational performance or killing analytics that stakeholders now depend on.

Then there’s the retry trap. It’s easy to implement “retry on error” and call it resilience. But blind retries can re-apply a command that no longer matches the current state. In legacy-heavy environments this is common: downstream systems aren’t idempotent, upstream systems don’t know the real state, and you introduce subtle misalignments that appear weeks later.

Finally, investigations often fail because history is missing. If you can’t reconstruct “what exactly happened to this record”, every incident becomes guesswork. In PA, that’s not only operational pain, it’s also governance pain.
Here are the shortcuts that usually show up together:

- point-to-point integrations that multiply until nobody owns the overall picture;
- analytics queries running directly on operational databases;
- blind retries that ignore the current state of the data;
- missing history, which turns every investigation into guesswork.
These failures aren’t caused by bad tools. They happen when interoperability is treated as wiring rather than architecture.
Two concepts are frequently confused: the Integration Layer and the Anti-Corruption Layer (ACL). They both sit between systems, but their goals are not the same.
The Integration Layer is about mechanics. It standardizes how you connect: protocols, authentication, routing, throttling, canonical headers, consistent error handling. It’s where you centralize the boring but essential concerns that otherwise get re-implemented everywhere.
The ACL is about meaning. It protects your internal model from external chaos. Legacy systems and external organizations often encode business concepts in ways that are convenient for them, not for you. Status codes like 7 or 9 might make sense historically, but they’re not a stable language for your domain. An ACL translates external representations into internal concepts and enforces your invariants.
This is where teams often underestimate the work. They implement DTO mapping and call it an ACL. But real ACL work is semantic. It’s deciding what an external field actually means in your domain, how to handle missing or contradictory information, and how to evolve without letting external changes ripple into your core services.
A useful rule of thumb is simple: your core domain should never “speak legacy”. If you find yourself introducing external codes, external lifecycle states, or external quirks inside your core model, you’re skipping the ACL and you’ll pay for it later.
A practical mental model: the Integration Layer standardizes how systems talk; the ACL protects what the data means.
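As a concrete (and deliberately simplified) sketch, an ACL translation for legacy status codes like the ones mentioned above might look like this; the codes, enum, and class name are invented for illustration:

```csharp
// The domain speaks in its own language, never in legacy codes.
public enum CaseStatus { Active, Suspended, Closed }

public static class LegacyStatusTranslator
{
    // The ACL is the only place that knows what "7" or "9" meant historically.
    // Unknown values are rejected explicitly, never guessed.
    public static CaseStatus ToDomain(string legacyCode) => legacyCode switch
    {
        "7" => CaseStatus.Active,    // meaning documented with the source system
        "9" => CaseStatus.Suspended,
        "0" => CaseStatus.Closed,
        _   => throw new ArgumentException($"Unknown legacy status '{legacyCode}'")
    };
}
```

The point is not the mapping itself but the boundary: if the legacy system adds a code, the change is absorbed here instead of rippling into core services.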
Normalization is often treated as “format cleanup”: date formats, trimming strings, validating identifiers, ensuring encoding consistency. That’s necessary, but it’s not sufficient. In interoperability projects, the harder part is semantic normalization. The same field can represent different meanings depending on the source, the time period, or the business process that generated it. Two systems may both expose a concept called “status”, but one encodes operational status while the other encodes legal eligibility. You can normalize formats all day and still exchange wrong information.

That’s why interoperability needs data contracts, not just endpoints. A contract makes expectations explicit: schema, semantics, versioning, validation rules, and compatibility commitments. In long-lived PA ecosystems, contracts are the difference between stable evolution and perpetual firefighting because they let you change one side without breaking the other, and they give you a shared reference when disputes arise.
What good contracts typically include:

- a schema with explicit types and required fields;
- field-level semantics, not just formats;
- versioning rules and compatibility commitments;
- validation rules for both syntax and meaning;
- a named owner and living documentation.
Treat contracts as products: documented, versioned, tested, and owned. Otherwise interoperability becomes folklore.
When organizations integrate, availability and timing rarely align. That’s why async-first architectures are so effective in PA and legacy landscapes. A message broker introduces decoupling: producers don’t need consumers to be online right now, and consumers can process at their own pace.

But a broker doesn’t magically solve reliability. It forces you to be explicit about things synchronous calls often hide: retry strategies, poison messages, ordering assumptions, and what at least once delivery means for your business operations.

This is also where the classic “blind retry” problem shows up again. If a failed transaction is retried without considering the current state, you can create contradictions. The same command that was correct yesterday might be incorrect today because new information arrived or the citizen’s case changed. Resilience that ignores state is just chaos with better logging.

A broker helps because it enables a controlled approach: dead-letter queues, delayed retries, backpressure, replay, and fan-out patterns. It turns integrations into pipelines rather than fragile call chains.
A few broker-related concerns you should always decide upfront (explicitly):

- delivery semantics: what at-least-once means for each operation;
- retry strategy: delays, backoff, and when to stop;
- poison messages: dead-letter queues and who watches them;
- ordering assumptions: what actually requires order, and what doesn’t;
- replay: whether, when, and how history can be reprocessed.
Once you accept that distributed systems fail in annoying ways, a few patterns become non-negotiable.
Transactional Outbox is the first. If a service updates its database and must publish an event, you want both actions to be consistent. Writing state to the DB and sending a message in the same “logical transaction” is harder than it looks. Without an outbox, you’ll eventually hit scenarios where the DB commit succeeds but message publishing fails (or vice versa). The outbox approach writes the outgoing event into a table in the same DB transaction and publishes it asynchronously. Not glamorous, but one of the best reliability trades you can make.
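A sketch of the write side, using Dapper-style `Execute` calls for brevity; the table names, SQL, and service shape are illustrative assumptions:

```csharp
using System.Data;
using Dapper;

public sealed record OutboxMessage(Guid Id, string Type, string Payload, DateTimeOffset OccurredAtUtc);

public static class OrderService
{
    // The essential rule: the business row and the outbox row are written in
    // the SAME database transaction. A background publisher later reads
    // unsent outbox rows, forwards them to the broker, and marks them sent.
    public static void PlaceOrder(IDbConnection db, string orderJson)
    {
        using var tx = db.BeginTransaction(); // db is an open connection

        // 1) business state change
        db.Execute("INSERT INTO orders (payload) VALUES (@p)", new { p = orderJson }, tx);

        // 2) outgoing event, same transaction
        var msg = new OutboxMessage(Guid.NewGuid(), "OrderPlaced", orderJson, DateTimeOffset.UtcNow);
        db.Execute(
            "INSERT INTO outbox (id, type, payload, occurred_at) VALUES (@Id, @Type, @Payload, @OccurredAtUtc)",
            msg, tx);

        tx.Commit(); // either both rows exist or neither does
    }
}
```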
Idempotency and deduplication are the second. Duplicates happen. Replays happen. Recovery produces duplicates by design. Consumers must safely process the same message multiple times without breaking state. This usually means idempotency keys, dedup stores, and “upsert-like” semantics. It also means designing commands/events so that replaying them is safe.
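A minimal sketch of an idempotent consumer backed by a dedup store; the in-memory implementation stands in for a durable store, and the names are invented:

```csharp
public interface IDedupStore
{
    // Returns false if the message id was already recorded.
    bool TryRecord(Guid messageId);
}

// Stand-in for a durable store (database table with a unique key, etc.).
public sealed class InMemoryDedupStore : IDedupStore
{
    private readonly HashSet<Guid> _seen = new();
    public bool TryRecord(Guid id) => _seen.Add(id);
}

public sealed class PaymentConsumer
{
    private readonly IDedupStore _dedup;
    public int AppliedCount { get; private set; }

    public PaymentConsumer(IDedupStore dedup) => _dedup = dedup;

    public void Handle(Guid messageId, string payload)
    {
        if (!_dedup.TryRecord(messageId))
            return; // duplicate or replay: safely ignored

        // ... apply the change with upsert-like semantics ...
        AppliedCount++;
    }
}
```

With at-least-once delivery, redelivering the same message is now harmless: the second attempt is recognized and skipped.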
Finally, for long-lived cross-system workflows you often need a Saga or process manager. PA workflows can span multiple domains and last days or weeks. Modeling them as a chain of synchronous calls is fragile and makes recovery painful. A process manager maintains state, correlates events, handles timeouts, and defines compensations. It’s the difference between a controlled workflow and a pile of retries.
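The smallest useful shape of a process manager is a persisted state machine keyed by a correlation id. A hypothetical sketch, with event names and states invented for illustration:

```csharp
public enum CaseState { AwaitingDocuments, AwaitingApproval, Completed, Compensated }

// State is persisted per case; events are correlated by case id, and a
// timeout event triggers compensation instead of an endless retry loop.
public sealed class CaseProcess
{
    public string CaseId { get; }
    public CaseState State { get; private set; } = CaseState.AwaitingDocuments;

    public CaseProcess(string caseId) => CaseId = caseId;

    public void On(string eventName) => State = (State, eventName) switch
    {
        (CaseState.AwaitingDocuments, "DocumentsReceived") => CaseState.AwaitingApproval,
        (CaseState.AwaitingApproval, "Approved")           => CaseState.Completed,
        (_, "TimedOut")                                    => CaseState.Compensated,
        _ => State // ignore events that don't apply in the current state
    };
}
```

A real process manager would persist this state and schedule the timeout, but the shape is the same: explicit states, explicit transitions, explicit compensation.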
If you want a quick checklist, these are the patterns most often missing in broken integrations:

- a transactional outbox for reliable event publishing;
- idempotent consumers with deduplication;
- a saga or process manager for long-lived workflows;
- dead-letter handling and controlled replay.
In many private contexts, you can survive with basic logging and an incident ticket. In PA contexts, auditability is a baseline requirement. You often need to prove what happened, not just fix it. That means correlation IDs across boundaries, structured logs, and distributed tracing for integration paths. It also means designing an audit trail that answers uncomfortable questions: who changed what, when, why, and based on which source. This becomes critical when the same “record” is influenced by multiple inbound integrations. At the same time, auditability must respect privacy. You can’t just log everything. You need tokenization, masking, and retention policies. The goal is to keep integrations observable and reconstructable without turning logs into a liability.
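For the correlation-id part, a small ASP.NET Core middleware is often enough as a starting point; the header name is a common convention and the behavior here is a sketch, not a standard:

```csharp
using Microsoft.AspNetCore.Http;

// Incoming requests keep their correlation id; requests without one get a
// fresh id. The id is stored for handlers/logs and echoed back to the caller,
// so both sides of the boundary can reference the same trail.
public sealed class CorrelationMiddleware
{
    private const string Header = "X-Correlation-Id";
    private readonly RequestDelegate _next;

    public CorrelationMiddleware(RequestDelegate next) => _next = next;

    public async Task Invoke(HttpContext ctx)
    {
        var id = ctx.Request.Headers[Header].ToString();
        if (string.IsNullOrWhiteSpace(id))
            id = Guid.NewGuid().ToString("n");

        ctx.Items["correlation.id"] = id;  // available to handlers and log scopes
        ctx.Response.Headers[Header] = id; // echoed for the caller's audit trail
        await _next(ctx);
    }
}
```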
One of the most consistent anti-patterns in enterprise systems is “analytics on production databases”. It usually starts with good intentions and ends with performance problems and fragile tuning. When volumes grow, operational and analytical workloads compete, and tuning becomes a zero-sum game. A healthier approach is workload isolation. Build read models, projections, marts, or any other dedicated layer for analytics. Populate it incrementally, ideally near-real-time, and let it evolve independently from the operational model. This is where “Data-as-a-Product” becomes practical: curated datasets with ownership, documentation, and expectations. Near-real-time analytics isn’t about buying a platform. It’s about not mixing responsibilities. When operations and analytics are separated, both become more reliable.
Interoperability improvements don’t require rewriting everything. They require choosing the right battles and sequencing changes. Start by identifying the critical exchange flows: high volume, high failure impact, or high business sensitivity. Formalize contracts for those flows and introduce an ACL where semantics are unstable. Move heavy exchange toward asynchronous messaging where coupling and availability are hurting you, and add idempotency and recovery mechanisms before you scale. If you publish events, introduce an outbox. If reporting is killing the operational DB, isolate analytics with projections. Then instrument the whole thing with correlation IDs and audit trails so you can operate it with confidence.
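The outbox mentioned above can be sketched in a few lines. This is a simplified illustration, not a production implementation; the entity and table names (`AppDbContext`, `Orders`, `OutboxMessages`) are invented for the example:

```csharp
// Transactional outbox sketch: the domain change and the outgoing message
// are saved in the same database transaction; a background worker later
// reads pending rows and publishes them to the broker.
public class OutboxMessage
{
    public Guid Id { get; set; }
    public string Type { get; set; } = default!;
    public string Payload { get; set; } = default!;
    public DateTime? ProcessedAtUtc { get; set; }
}

public async Task SaveOrderAsync(AppDbContext db, Order order)
{
    db.Orders.Add(order);
    db.OutboxMessages.Add(new OutboxMessage
    {
        Id = Guid.NewGuid(),
        Type = "OrderCreated",
        Payload = JsonSerializer.Serialize(new { order.Id })
    });
    // One transaction: either both the state change and the message are
    // persisted, or neither is. No more "saved but never published" events.
    await db.SaveChangesAsync();
}

// A hosted service then polls unprocessed rows, publishes them to the
// broker, and stamps ProcessedAtUtc so each message is dispatched once.
```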
A simple sequence that works surprisingly often is:
1. Formalize the contract on one critical flow.
2. Put an ACL in front of the most unstable external semantics.
3. Move that flow to asynchronous messaging.
4. Add idempotency and an outbox before scaling it.
5. Isolate reporting with a dedicated read model.
6. Instrument everything with correlation IDs and audit trails.
Small steps. Measurable wins. Repeat.
Interoperability is semantic stability over time, not simply a connection. In PA and legacy landscapes, architecture must assume drift, partial failure, and long-lived processes. The Integration Layer standardizes mechanics; the ACL protects meaning. Brokers, idempotency, and outboxes make integrations resilient and evolvable. Normalization must include semantics, not just formats. Analytics belongs outside operational databases. Observability and auditability are first-class requirements, not optional extras.
If you treat interoperability as an architecture discipline rather than wiring, you stop fighting fires and start shipping reliable change.
BenchmarkDotNet solves this problem. It's a straightforward yet extremely precise framework that lets developers measure the execution time and memory consumption of small sections of .NET code in a statistically reliable way. BenchmarkDotNet takes the noise and guesswork out of measuring performance, giving you trustworthy, repeatable results.
In most projects, performance issues appear late, usually when the system scales or users increase. Running benchmarks early in development helps you:
- Catch regressions before they reach production
- Compare alternative implementations with real numbers instead of intuition
- Validate that an optimization actually pays off
Benchmarking isn’t only for low-level optimization; it’s a decision-making tool.
BenchmarkDotNet is available as a standard NuGet package:
dotnet add package BenchmarkDotNet
Once added, you can define small benchmark classes decorated with attributes that describe what to measure and how. The library handles warmup runs, multiple iterations, and reports average performance, deviation, and allocations. It’s the same tool used by the .NET runtime team, which makes it a solid foundation for your own analysis.
Let’s benchmark two sorting approaches to see BenchmarkDotNet in action:
- Array.Sort(): the optimized built-in algorithm used in .NET
- BubbleSort: a deliberately naive O(n²) implementation, for contrast

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Order;
using BenchmarkDotNet.Running;
using System.Linq;
public class Config : ManualConfig
{
public Config()
{
AddJob(Job
.Default
.WithWarmupCount(3)
.WithIterationCount(8)
.WithLaunchCount(1));
AddColumn(RankColumn.Arabic); // numeric rank column
WithOrderer(new DefaultOrderer(SummaryOrderPolicy.FastestToSlowest));
}
}
[MemoryDiagnoser]
[Config(typeof(Config))]
public class SortingBenchmarks
{
private int[] data;
[Params(1_000, 10_000, 100_000)] // reduce 100_000 if runs take too long
public int Size;
[GlobalSetup]
public void Setup()
{
var random = new Random(42);
data = Enumerable.Range(0, Size).Select(_ => random.Next()).ToArray();
}
[Benchmark(Baseline = true)]
public int[] ArraySort()
{
var copy = (int[])data.Clone();
Array.Sort(copy);
return copy;
}
[Benchmark]
public int[] BubbleSort()
{
var copy = (int[])data.Clone();
for (int i = 0; i < copy.Length - 1; i++)
for (int j = 0; j < copy.Length - i - 1; j++)
if (copy[j] > copy[j + 1])
(copy[j], copy[j + 1]) = (copy[j + 1], copy[j]);
return copy;
}
}
public class Program
{
public static void Main() => BenchmarkRunner.Run<SortingBenchmarks>();
}
Run it in Release mode to get accurate results:
dotnet run -c Release
BenchmarkDotNet will execute both methods under controlled conditions and output a table like this:
| Method | Size | Mean (ms) | Error (ms) | StdDev (ms) | Allocated |
|---|---|---|---|---|---|
| ArraySort | 1,000 | 0.100 | 0.001 | 0.001 | ~3.9 KB |
| BubbleSort | 1,000 | 3.000 | 0.060 | 0.045 | ~3.9 KB |
| ArraySort | 10,000 | 1.329 | 0.013 | 0.011 | ~39.1 KB |
| BubbleSort | 10,000 | 300.000 | 6.000 | 4.500 | ~39.1 KB |
| ArraySort | 100,000 | 16.610 | 0.166 | 0.133 | ~390.6 KB |
| BubbleSort | 100,000 | 30000.000 | 600.000 | 450.000 | ~390.6 KB |
These results are consistent with the theoretical complexity of each algorithm: Array.Sort runs in O(n log n), while bubble sort is O(n²), so the bubble sort’s runtime grows roughly 100× every time the input grows 10×.
Memory allocations are identical because both methods clone the input array before sorting. This ability to measure, compare, and reason about performance with hard numbers is what makes BenchmarkDotNet so powerful for .NET developers and architects: the gap between two implementations is not an assumption or an estimate, it’s a reproducible measurement.
BenchmarkDotNet can do much more than measure time.
By adding attributes like [MemoryDiagnoser] or [Params], you can:
- Measure managed allocations and GC collections per operation
- Run the same benchmark across multiple input sizes or parameter values
- Compare results across jobs, runtimes, and configurations
For larger projects, you can use the BenchmarkSwitcher API to run multiple benchmark classes in the same session.
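A console entry point using BenchmarkSwitcher might look like this (a minimal sketch; you select which benchmarks to run via command-line arguments such as --filter):

```csharp
using BenchmarkDotNet.Running;

public class Program
{
    // BenchmarkSwitcher discovers every benchmark class in the assembly and
    // lets you choose which ones to run from the command line, e.g.:
    //   dotnet run -c Release -- --filter *SortingBenchmarks*
    public static void Main(string[] args) =>
        BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);
}
```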
Benchmarks are most useful when they’re not treated as one-off experiments. Running them occasionally can be insightful, but the real value comes when you make benchmarking part of your regular development and delivery workflow. In most teams, performance tends to drift over time: a new feature might introduce an extra allocation, a refactor might change an algorithm’s complexity, or a library update might slow things down under load. These changes are rarely visible through functional tests alone. Integrating benchmarks ensures that performance regressions are detected as early as functional bugs.
The first step is to treat benchmarks like any other form of validation. You can keep a Benchmarks project inside your solution, right next to your test projects. Developers can run it locally when they modify critical code paths, for example, data serialization, parsing, caching, or algorithms. BenchmarkDotNet projects build and run just like unit tests, but instead of pass/fail, they output numeric results. A good practice is to check those results into version control, so you have a historical view of how performance evolves.
Once the local workflow is stable, automate it. You can configure your CI pipeline (like GitHub Actions, Azure DevOps, or Jenkins) to:
- Run the benchmark project in Release mode on a schedule or on performance-sensitive pull requests
- Export the results (BenchmarkDotNet supports Markdown, JSON, and CSV exporters)
- Compare them against a stored baseline and flag significant regressions
This transforms performance testing into a continuous quality signal, rather than a last-minute audit before release.
Benchmarks are meaningful only when compared against something.
Define a baseline version of your application, perhaps the latest stable release or a known good commit, and store its benchmark results.
Subsequent runs can be compared against that baseline to show whether a change improved or degraded performance.
BenchmarkDotNet supports this directly through the [Baseline] attribute and relative comparison columns, but you can also manage it externally with exported data.
Beyond catching regressions, benchmarks can provide trend visibility. By tracking the same set of benchmarks across releases, you can observe how the system’s performance evolves, which areas are improving, which are degrading, and how architectural decisions impact real performance. You can even use tools like Power BI, Grafana, or Excel to visualize historical data exported from BenchmarkDotNet runs. This makes it easier to justify optimization work and demonstrate progress to non-technical stakeholders.
Benchmarks should be small, isolated, and purposeful. Focus on code paths where performance matters, like loops, serialization, parsing, or algorithmic components. Avoid full end-to-end scenarios that mix CPU and I/O, which are better suited for load testing tools like NBomber or k6. Keep each benchmark reproducible and independent. The goal is to measure computation, not environment noise.
Finally, share the results with the team. BenchmarkDotNet produces readable Markdown reports that can be published automatically in your documentation or wiki. Use them during sprint reviews or retrospectives to highlight improvements or issues. Encouraging developers to interpret and discuss these metrics builds a shared understanding of performance, helping teams write faster and more predictable code.
BenchmarkDotNet gives developers a reliable, repeatable way to see how .NET code actually performs. It replaces assumptions with data, and guesses with measurement. That means fewer surprises about performance in production and more confidence in architectural decisions. You will often be surprised by the results, which is exactly why it is worthwhile to benchmark.
Designing a public API is more than just an engineering exercise; it is an exercise in communication. Every endpoint, every field, and every error message communicates with another developer. That conversation needs to be clear, consistent, and respectful of their time.
When an organization publishes an API, it creates a contract. From that moment onward, every modification of that contract has a consequence. Bad design frequently shows itself months later, when teams notice that a very small change to an internal model breaks dozens of clients. At that stage versioning becomes messy, and the team spends more time maintaining compatibility than creating features. Good design prevents this. A well-defined API minimizes ambiguity, reduces support overhead, and provides a reliable base for growth. More importantly, it protects developers from unintended coupling to internal details. The purpose of a public API should be to expose capabilities, not implementation details. The moment consumers depend on the shape of your database tables or your internal class names, you have lost control of your own evolution.
Every application programming interface (API) starts with a fundamental question: who will be using it, and what outcomes will they achieve? APIs typically become reflections of the product’s internal layout rather than the needs of the user. A thoughtful design inverts that: instead of exposing raw data models, we expose actions, workflows, and outcomes. Designing from that perspective typically yields smaller APIs with clearer concepts and a design that is easier to maintain. Empathy is the first principle of API design. When we consider how another developer will read our documentation, understand the concepts we have named, and recover from an error, we design a better interface.
Simplicity is not the lack of features; simplicity is the practice of showing only what is necessary. A well-designed API feels natural and intuitive because it behaves the way the developer expects. When you use standard conventions, a consistent naming pattern for your resources, a consistent response format, and a meaningful use of HTTP verbs, the developer will spend less time learning and more time building. Predictability is what makes an API feel elegant. Once a developer understands how one piece works, they should be able to predict how the rest will work, and that predictability creates confidence and shortens the learning curve significantly. The best APIs have a personality: they are opinionated enough to be clear, but flexible enough to accommodate multiple use cases.
The best APIs in .NET often share consistent middleware pipelines for exception handling, validation, and response shaping, which can be achieved with tools like FluentValidation and global exception filters.
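As a sketch of that kind of pipeline, a global exception handler in ASP.NET Core might look like this (the shape of the error payload is an assumption for illustration):

```csharp
// Global exception-handling middleware: catches unhandled exceptions and
// returns a consistent JSON error shape instead of leaking stack traces.
public class ExceptionHandlingMiddleware
{
    private readonly RequestDelegate _next;

    public ExceptionHandlingMiddleware(RequestDelegate next) => _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        try
        {
            await _next(context);
        }
        catch (Exception)
        {
            context.Response.StatusCode = StatusCodes.Status500InternalServerError;
            context.Response.ContentType = "application/json";
            await context.Response.WriteAsJsonAsync(new
            {
                title = "An unexpected error occurred.",
                status = 500
            });
        }
    }
}

// Registration, early in the pipeline:
// app.UseMiddleware<ExceptionHandlingMiddleware>();
```

Placing this at the front of the pipeline guarantees that every consumer sees the same error shape, no matter which controller failed.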
When you publish an API, it’s akin to opening a window into your system. The wider you open it, the more you unveil, and the harder it becomes to modify the interior. A stable public interface should never leak internal identifiers, domain objects, or temporary fields. DTOs act as protective layers, giving you the flexibility to change internal logic without breaking clients. This boundary between internal and external representations is what gives you freedom: it permits you to refactor, optimize, and change behind the scenes while appearing reliable on the outside.
Libraries such as AutoMapper or Mapster simplify the mapping between domain models and public representations, ensuring your internal code can evolve independently.
Regardless of how well you think through your API design, you will have to change it. The trick is to change it without losing trust. Preserving backward compatibility shows respect for your consumers. Add new fields rather than changing the name or type of an existing one. If you are forced to make breaking changes, introduce new versions and communicate them well and early. Versioning is not just a technical strategy but also a product policy: assume that each version ships with a release and has a lifecycle, and that older versions will eventually be deprecated, with a clear announcement and a realistic migration window. Giving developers time to migrate builds trust and stability, and confidence encourages adoption.
In .NET, API versioning can be managed elegantly using the Asp.Versioning library. It integrates seamlessly with ASP.NET Core, supporting URL-based (/v1/orders) or header-based (api-version: 2.0) strategies. Combined with Swashbuckle.AspNetCore, you can generate versioned Swagger documentation automatically.
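A minimal configuration sketch with Asp.Versioning might look like this (the version numbers and options are illustrative):

```csharp
using Asp.Versioning;

var builder = WebApplication.CreateBuilder(args);

// Register API versioning: default to v1.0, report supported versions in
// response headers, and read the version from the URL segment (/v1/orders).
builder.Services.AddApiVersioning(options =>
{
    options.DefaultApiVersion = new ApiVersion(1, 0);
    options.AssumeDefaultVersionWhenUnspecified = true;
    options.ReportApiVersions = true;
    options.ApiVersionReader = new UrlSegmentApiVersionReader();
});

var app = builder.Build();
app.Run();
```

Swapping UrlSegmentApiVersionReader for HeaderApiVersionReader moves the same policy into an api-version header without touching the controllers.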
An API with no documentation is like a library without a catalog. The easier you make it to discover how your API works, the more developers will use it correctly. Effective documentation tells a narrative: how you authenticate, how you perform standard operations, what an error means, and what an acceptable response looks like. While useful, exhaustive parameter lists carry much less weight than good examples. Interactive documentation, such as explorers based on OpenAPI, encourages learning through experimentation. Developers can see how the API behaves and what a response looks like without writing a line of code, which lowers the barrier to entry and surfaces inconsistencies very quickly.
In the .NET ecosystem, Swashbuckle.AspNetCore and NSwag are the go-to tools for generating OpenAPI documentation automatically from your controllers or Minimal APIs. They can include authentication headers, example payloads, and even allow developers to test requests directly in the browser.
Security must be built in from the beginning, not added on as a patch. Secure communications, authentication and authorization, and input validation must be applied at every layer. Protect your API from abuse with rate limiting, and be prepared to monitor for abuse and anomalies. Protecting your users’ data and your own systems is also security: some of the most damaging incidents come from error messages that expose internal information, or from a missing access check. Treat every interaction with your API as if it came from a completely unknown source, because it will.
Protect your API from abuse using Microsoft.AspNetCore.RateLimiting, available natively from .NET 7, or the community package AspNetCoreRateLimit for more granular configuration. Always validate inputs; libraries like FluentValidation can enforce strong typing and consistent validation across DTOs.
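For instance, a fixed-window limiter with the built-in rate limiting middleware (.NET 7+) can be configured like this (the limits and policy name are placeholder values):

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Allow at most 100 requests per minute under the "fixed" policy; further
// requests are rejected with 429 Too Many Requests until the window resets.
builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
    });
});

var app = builder.Build();
app.UseRateLimiter();

app.MapGet("/orders", () => Results.Ok())
   .RequireRateLimiting("fixed");

app.Run();
```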
Errors are not defects in the design; they are part of the conversation between your API and the user. A vague 500 Internal Server Error tells you nothing. An informative, structured response explaining what went wrong and how to fix it converts frustration into trust. For example, when a request fails because of a missing parameter, returning a descriptive message like Field email is required, together with a precise error code, conveys respect for the developer’s time. The quality of your errors is the clearest expression of your attitude toward your users.
You can handle errors gracefully in ASP.NET Core using ProblemDetails responses, a standardized format supported by the framework out of the box. Combine this with FluentValidation to return consistent validation errors like:
{
"errors": {
"email": ["The 'email' field is required."]
},
"type": "https://httpstatuses.com/400",
"title": "Bad Request"
}
A public API should respond well to growth. When thinking about efficiency, don’t depend solely on hardware; design your API carefully. Features like pagination, caching, and asynchronous operations help you provide fast and consistent responses. Sometimes the easiest way to optimize is to expose less: smaller data payloads, fewer endpoints, fewer round trips. Scalability is also defined in terms of resilience. When throughput spikes or downstream systems are unhealthy, your API should degrade gracefully while communicating clearly, instead of silently timing out requests.
When performance is critical, use Response Caching Middleware or distributed caching solutions like StackExchange.Redis. For high traffic scenarios, consider gRPC for inter service communication, it offers excellent performance for binary payloads and is fully supported in ASP.NET Core.
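As a small sketch, the built-in response caching middleware can be enabled like this (the endpoint, payload, and cache duration are illustrative):

```csharp
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddResponseCaching();

var app = builder.Build();
app.UseResponseCaching();

// Instruct clients and the middleware to cache this response for 60 seconds
// via the Cache-Control header.
app.MapGet("/products", (HttpContext context) =>
{
    context.Response.GetTypedHeaders().CacheControl =
        new Microsoft.Net.Http.Headers.CacheControlHeaderValue
        {
            Public = true,
            MaxAge = TimeSpan.FromSeconds(60)
        };
    return Results.Ok(new[] { "apple", "pear" });
});

app.Run();
```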
Creating APIs for public use is an art and a responsibility. It requires clarity of thought, the ability to empathize with developers, and discipline in practice. A good API is not one that shows every feature of a system, but one that demonstrates intent clearly and can change without friction. When design is intentional, the API is more than a technical interface; it is a shared language between systems and individuals. Developers grow to trust it, to build on it, and to create value in ways you may never expect. In time, that trust becomes an ecosystem: integrations, tools, and applications that increase the value of what your product can do.
If it’s a good API, it disappears into the background of great software. It just works.
This is where MassTransit, a free, open-source distributed application framework for .NET, steps in.
At its core, MassTransit abstracts away the complexity of messaging systems (RabbitMQ, Kafka or Azure Service Bus) and lets developers focus on business logic instead of boilerplate. With MassTransit you get:
- Message consumers wired into dependency injection
- Retry policies and error queues out of the box
- Sagas for long-running workflows
- Scheduling, outbox support, and an in-memory test harness
It’s like getting a well-tested messaging toolkit built on top of proven transport engines.
Let’s start with a basic scenario: a service that publishes an EventSubmitted event, and another service that processes it.
public record EventSubmitted(Guid EventId, string Body);
services.AddMassTransit(x =>
{
x.AddConsumer<EventSubmittedConsumer>();
x.UsingRabbitMq((context, cfg) =>
{
cfg.Host("rabbitmq://localhost");
cfg.ConfigureEndpoints(context);
});
});
public class EventSubmittedConsumer : IConsumer<EventSubmitted>
{
public async Task Consume(ConsumeContext<EventSubmitted> context)
{
Console.WriteLine($"Processing event {context.Message.EventId} for {context.Message.Body}");
// Business logic goes here
}
}
await bus.Publish(new EventSubmitted(
Guid.NewGuid(),
"BODY_HERE"
));
So, the publishing service doesn’t care which component listens. The consumer picks up the event, processes it, and everything is safely routed via RabbitMQ.
In many public sector scenarios, processes do not fit within a single request/response cycle. Consider a citizen requesting a benefit certificate. The system must verify the documents first, conduct a background check, and only then make a decision. Each of those steps can be performed by separate services and may take time. When such workflows are managed manually, services tend to grow large and complex, full of conditional logic, retries, and fragile state. MassTransit addresses this complexity with sagas. A saga is a state machine that is aware of the lifecycle of a request and responds to events like “Documents Verified” or “Background Check Completed”. The durability and simplicity of sagas is what makes them so useful. The current state is persisted in a database, so if the system restarts while a request is still open, the process continues as though it never stopped. Developers do not need to worry about lifecycle management. They only need to define the events of interest, a class to hold the request’s current state, and a state machine that describes how those events transition the workflow. MassTransit orchestrates the correlation, persistence, and reliability so you can focus on the business rules.
MassTransit handles all the difficult work behind the curtain:
- Correlating incoming events to the right saga instance
- Persisting state after every transition
- Handling concurrency when events arrive out of order
- Applying retries and redelivery when a step fails
Developers can model a complex business process with just a few lines of configuration and a state machine definition. Rather than painstakingly writing thousands of lines of orchestration code, you simply specify how your process should transition, and MassTransit ensures it is reliable, distributed, and resilient by design. Sagas are especially valuable in domains where processes involve multiple asynchronous steps and no part of the process can be lost, such as e-commerce, finance, telecom, and healthcare. By adopting sagas, you make your workflows explicit, auditable, and easy to extend over time. In effect, sagas convert haphazard, ad-hoc orchestration into a lucid, event-driven flow, and in .NET, adopting the pattern with MassTransit is not only possible but surprisingly easy.
Scenario: a citizen submits a request for a family benefits certificate. The process spans multiple back-office steps: document verification, background checks, and final approval. Each step is asynchronous and possibly handled by different services.
public record BenefitRequestSubmitted(Guid RequestId, string CitizenFiscalCode);
public record DocumentsVerified(Guid RequestId, bool Passed, string? Notes = null);
public record BackgroundCheckCompleted(Guid RequestId, bool Passed, string? Notes = null);
public record RequestApproved(Guid RequestId);
public record RequestRejected(Guid RequestId, string Reason);
using MassTransit;
public class BenefitRequestState : SagaStateMachineInstance
{
public Guid CorrelationId { get; set; } // MassTransit saga key
public string CurrentState { get; set; } = default!;
public string CitizenFiscalCode { get; set; } = default!;
public bool? DocsOk { get; set; }
public bool? BackgroundOk { get; set; }
public string? LastNotes { get; set; }
public DateTime StartedAtUtc { get; set; }
}
using MassTransit;
public class BenefitRequestStateMachine : MassTransitStateMachine<BenefitRequestState>
{
public State Submitted { get; private set; }
public State Verifying { get; private set; }
public State Checking { get; private set; }
public State Completed { get; private set; }
public Event<BenefitRequestSubmitted> RequestSubmitted { get; private set; }
public Event<DocumentsVerified> DocsVerified { get; private set; }
public Event<BackgroundCheckCompleted> BgcCompleted { get; private set; }
public BenefitRequestStateMachine()
{
InstanceState(x => x.CurrentState);
Event(() => RequestSubmitted, x => x.CorrelateById(c => c.Message.RequestId));
Event(() => DocsVerified, x => x.CorrelateById(c => c.Message.RequestId));
Event(() => BgcCompleted, x => x.CorrelateById(c => c.Message.RequestId));
Initially(
When(RequestSubmitted)
.Then(ctx =>
{
ctx.Instance.CitizenFiscalCode = ctx.Data.CitizenFiscalCode;
ctx.Instance.StartedAtUtc = DateTime.UtcNow;
})
.TransitionTo(Submitted)
// kick off first back-office step (document verification)
.Publish(ctx => new /* command to verifier */ BenefitRequestSubmitted(
ctx.Instance.CorrelationId, ctx.Instance.CitizenFiscalCode))
);
During(Submitted,
When(DocsVerified)
.Then(ctx =>
{
ctx.Instance.DocsOk = ctx.Data.Passed;
ctx.Instance.LastNotes = ctx.Data.Notes;
})
.IfElse(ctx => ctx.Data.Passed,
thenBinder => thenBinder.TransitionTo(Verifying)
// request background check only if docs ok
.Publish(ctx => new /* command to background-check svc */ DocumentsVerified(
ctx.Instance.CorrelationId, Passed: true)),
elseBinder => elseBinder.Finalize().Publish(ctx =>
new RequestRejected(ctx.Instance.CorrelationId, "Document verification failed")))
);
During(Verifying,
When(BgcCompleted)
.Then(ctx =>
{
ctx.Instance.BackgroundOk = ctx.Data.Passed;
ctx.Instance.LastNotes = ctx.Data.Notes;
})
.IfElse(ctx => ctx.Data.Passed && ctx.Instance.DocsOk == true,
thenBinder => thenBinder.TransitionTo(Completed).Publish(ctx =>
new RequestApproved(ctx.Instance.CorrelationId)),
elseBinder => elseBinder.Finalize().Publish(ctx =>
new RequestRejected(ctx.Instance.CorrelationId, "Background check failed")))
);
// When saga reaches Completed (approved) or gets Finalized (rejected), it’s removed from storage:
SetCompletedWhenFinalized();
}
}
using MassTransit;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Metadata.Builders; // for EntityTypeBuilder<T>
public class BenefitSagaDbContext : SagaDbContext
{
public BenefitSagaDbContext(DbContextOptions options) : base(options) { }
protected override IEnumerable<ISagaClassMap> Configurations
=> new[] { new BenefitRequestStateMap() };
}
public class BenefitRequestStateMap : SagaClassMap<BenefitRequestState>
{
protected override void Configure(EntityTypeBuilder<BenefitRequestState> entity, ModelBuilder model)
{
entity.Property(x => x.CurrentState);
entity.Property(x => x.CitizenFiscalCode);
entity.Property(x => x.DocsOk);
entity.Property(x => x.BackgroundOk);
entity.Property(x => x.LastNotes);
entity.Property(x => x.StartedAtUtc);
}
}
builder.Services.AddDbContext<BenefitSagaDbContext>(opt =>
opt.UseSqlServer(builder.Configuration.GetConnectionString("SagaDb")));
builder.Services.AddMassTransit(x =>
{
x.AddSagaStateMachine<BenefitRequestStateMachine, BenefitRequestState>()
.EntityFrameworkRepository(r =>
{
r.ConcurrencyMode = ConcurrencyMode.Pessimistic;
r.AddDbContext<DbContext, BenefitSagaDbContext>((_, cfg) => { /* configured above */ });
});
x.UsingRabbitMq((ctx, cfg) =>
{
cfg.ConfigureEndpoints(ctx);
});
});
Anywhere in your system (like an API endpoint), publish the initial event:
[HttpPost("requests")]
public async Task<IActionResult> Submit([FromServices] IBus bus, [FromBody] string fiscalCode)
{
var id = Guid.NewGuid();
await bus.Publish(new BenefitRequestSubmitted(id, fiscalCode));
return Accepted(new { requestId = id });
}
MassTransit allows .NET teams to build resilient, scalable, and decoupled systems while staying above the level of low-level messaging details. Teams don’t need to hand-manage queues, exchanges, and retry loops, and can focus on creating business value. If your system is constantly threatened by tight coupling, message loss, or integration spaghetti code, MassTransit is worth a look. With some very simple configuration, you can take your architecture into the event-driven world.
All of this comes with a unified query API and built-in thread-safety, making it easy to reason about data in time-sensitive applications. TemporalCollections is ideal for scenarios like event streaming, sliding-window analytics, telemetry buffers, rate limiting, session tracking, and caches with expiry: any situation where time is a first-class concern.
Time is a first-class dimension in many systems:
- Event streams and audit logs
- Sliding-window analytics and rate limiting
- Telemetry buffers and session tracking
- Caches whose entries expire
While you could bolt timestamps onto standard collections, you would still need to solve ordering, race-free timestamp assignment, range queries, pruning, and concurrency consistently across multiple data structures. TemporalCollections addresses these concerns out-of-the-box with a monotonic timestamp guarantee and a common query surface.
Temporal collections only make sense if time behaves. In practice, though, system clocks don’t always cooperate: multiple calls to UtcNow within the same tick can return identical values; NTP can move the clock backwards; and highly concurrent code can interleave operations so tightly that two insertions appear to occur at the same instant.
If timestamps aren’t strictly ordered, time-window queries become flaky (GetInRange may miss or double count items on boundaries) and age-based pruning (RemoveOlderThan) isn’t deterministic.
To keep temporal behavior predictable, the library assigns a monotonic timestamp to every insertion: each generated value is guaranteed to be strictly greater than the one before it within the same process.
If the clock doesn’t advance between two reads, we simply step forward by one tick and move on.
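A monotonic timestamp generator along those lines can be sketched as follows. This is an illustrative implementation of the idea, not the library's actual code:

```csharp
using System;
using System.Threading;

// Lock-free generator of strictly increasing UTC timestamps: if the clock
// hasn't advanced since the last read (or has moved backwards), step
// forward by one tick instead of returning a duplicate value.
public static class MonotonicClock
{
    private static long _lastTicks;

    public static DateTime UtcNow()
    {
        while (true)
        {
            long now = DateTime.UtcNow.Ticks;
            long last = Interlocked.Read(ref _lastTicks);
            long next = now > last ? now : last + 1; // force strict increase

            // Publish the new value only if no other thread beat us to it.
            if (Interlocked.CompareExchange(ref _lastTicks, next, last) == last)
                return new DateTime(next, DateTimeKind.Utc);
            // Lost the race: retry against the updated _lastTicks.
        }
    }
}
```

Two concurrent calls can never observe the same value: the compare-and-swap guarantees each returned timestamp is strictly greater than the previous one within the process.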
- GetInRange(from, to) is inclusive on both ends.
- RemoveOlderThan(cutoff) removes Timestamp < cutoff (keeps >= cutoff).
- GetBefore(time) is strictly <; GetAfter(time) is strictly >.
- GetLatest() / GetEarliest() return extremes or null when empty.

These rules make window math predictable and prevent off-by-one bugs.
Enumerations return a stable snapshot at call time, preserving determinism under concurrency.
TemporalItem<T>
All collections store TemporalItem<T>, a lightweight wrapper that pairs an immutable value with a timestamp (DateTimeOffset) representing the insertion moment.
Timestamps are strictly increasing even under bursty or concurrent insertions: if UtcNow would produce a non-increasing value (precision limits / clock granularity), the library atomically increments by a tick to maintain order and uniqueness.
This yields deterministic chronology without races across threads.
ITimeQueryable<T>
Every structure implements ITimeQueryable<T>, exposing consistent operations:
- GetInRange(from, to): enumerate items in an inclusive time window.
- RemoveOlderThan(cutoff): age/prune items strictly older than cutoff.
- CountInRange(from, to): count items in a window.
- GetTimeSpan(): time span covered by the collection (latest − earliest).
- RemoveRange(from, to): delete items in a window.
- GetLatest() / GetEarliest(): fast access to extremes.
- GetBefore(time) / GetAfter(time): query by relative time.
- CountSince(from): rolling counts.
- GetNearest(time): nearest neighbor by timestamp.

This interface makes code collection-agnostic: you can prototype with a queue and later swap to a sorted structure or an interval tree without rewriting queries.
This section shows how to install and use TemporalCollections in your .NET projects with simple examples.
dotnet add package TemporalCollections
TemporalQueue
using System;
using System.Linq;
using TemporalCollections.Collections;
var queue = new TemporalQueue<string>();
// Enqueue items (timestamps are assigned automatically)
queue.Enqueue("event-1");
queue.Enqueue("event-2");
// Peek oldest (does not remove)
var oldest = queue.Peek();
Console.WriteLine($"Oldest: {oldest.Value} @ {oldest.Timestamp}");
// Dequeue oldest (removes)
var dequeued = queue.Dequeue();
Console.WriteLine($"Dequeued: {dequeued.Value} @ {dequeued.Timestamp}");
// Query by time range (inclusive)
var from = DateTime.UtcNow.AddMinutes(-5);
var to = DateTime.UtcNow;
var inRange = queue.GetInRange(from, to);
foreach (var item in inRange)
{
Console.WriteLine($"In range: {item.Value} @ {item.Timestamp}");
}
TemporalSet
using System;
using TemporalCollections.Collections;
var set = new TemporalSet<int>();
set.Add(1);
set.Add(2);
set.Add(2);
Console.WriteLine(set.Contains(1));
// Remove older than a cutoff
var cutoff = DateTime.UtcNow.AddMinutes(-10);
set.RemoveOlderThan(cutoff);
// Snapshot of all items ordered by timestamp
var items = set.GetItems();
TemporalDictionary<TKey, TValue>
using System;
using System.Linq;
using TemporalCollections.Collections;
var dict = new TemporalDictionary<string, string>();
dict.Add("user:1", "login");
dict.Add("user:2", "logout");
dict.Add("user:1", "refresh");
// Range query across all keys
var from = DateTime.UtcNow.AddMinutes(-1);
var to = DateTime.UtcNow.AddMinutes(1);
var all = dict.GetInRange(from, to);
// Range query for a specific key
var user1 = dict.GetInRange("user:1", from, to);
// Compute span covered by all events
var span = dict.GetTimeSpan();
Console.WriteLine($"Span: {span}");
// Remove a time window across all keys
dict.RemoveRange(from, to);
TemporalStack
using System;
using System.Linq;
using TemporalCollections.Collections;
var stack = new TemporalStack<string>();
// Push (timestamps assigned automatically, monotonic UTC)
stack.Push("first");
stack.Push("second");
// Peek last pushed (does not remove)
var top = stack.Peek();
Console.WriteLine($"Top: {top.Value} @ {top.Timestamp}");
// Pop last pushed (removes)
var popped = stack.Pop();
Console.WriteLine($"Popped: {popped.Value}");
// Time range query (inclusive)
var from = DateTime.UtcNow.AddMinutes(-5);
var to = DateTime.UtcNow;
var items = stack.GetInRange(from, to).OrderBy(i => i.Timestamp);
// Remove older than cutoff
var cutoff = DateTime.UtcNow.AddMinutes(-10);
stack.RemoveOlderThan(cutoff);
TemporalSlidingWindowSet
using System;
using System.Linq;
using TemporalCollections.Collections;
var window = TimeSpan.FromMinutes(10);
var swSet = new TemporalSlidingWindowSet<string>(window);
// Add unique items (insertion timestamp recorded)
swSet.Add("A");
swSet.Add("B");
// Periodically expire items older than the window
swSet.RemoveExpired();
// Snapshot (ordered by timestamp)
var snapshot = swSet.GetItems().ToList();
// Query by time range
var from = DateTime.UtcNow.AddMinutes(-5);
var to = DateTime.UtcNow;
var inRange = swSet.GetInRange(from, to);
// Manual cleanup by cutoff (if needed)
swSet.RemoveOlderThan(DateTime.UtcNow.AddMinutes(-30));
TemporalSortedList
using System;
using System.Linq;
using TemporalCollections.Collections;
var list = new TemporalSortedList<int>();
// Add items (kept sorted by timestamp internally)
list.Add(10);
list.Add(20);
list.Add(30);
// Fast range query via binary search (inclusive)
var from = DateTime.UtcNow.AddSeconds(-30);
var to = DateTime.UtcNow;
var inRange = list.GetInRange(from, to);
// Before / After helpers
var before = list.GetBefore(DateTime.UtcNow);
var after = list.GetAfter(DateTime.UtcNow.AddSeconds(-5));
// Housekeeping
list.RemoveOlderThan(DateTime.UtcNow.AddMinutes(-1));
Console.WriteLine($"Span: {list.GetTimeSpan()}");
TemporalPriorityQueue<TPriority, TValue>
using System;
using System.Linq;
using TemporalCollections.Collections;
var pq = new TemporalPriorityQueue<int, string>();
// Enqueue with explicit priority (lower number = higher priority)
pq.Enqueue("high", priority: 1);
pq.Enqueue("low", priority: 10);
// TryPeek (does not remove)
if (pq.TryPeek(out var next))
{
Console.WriteLine($"Peek: {next}");
}
// TryDequeue (removes highest-priority; stable by insertion time)
while (pq.TryDequeue(out var val))
{
Console.WriteLine($"Dequeued: {val}");
}
// Time-based queries are also available
var from = DateTime.UtcNow.AddMinutes(-5);
var to = DateTime.UtcNow;
var items = pq.GetInRange(from, to);
Console.WriteLine($"Count in range: {pq.CountInRange(from, to)}");
TemporalCircularBuffer
using System;
using System.Linq;
using TemporalCollections.Collections;
// Fixed-capacity ring buffer; overwrites oldest when full
var buf = new TemporalCircularBuffer<string>(capacity: 3);
buf.Add("A");
buf.Add("B");
buf.Add("C");
buf.Add("D"); // Overwrites "A"
// Snapshot (oldest -> newest)
var snapshot = buf.GetSnapshot();
foreach (var it in snapshot)
{
Console.WriteLine($"{it.Value} @ {it.Timestamp}");
}
// Range queries
var from = DateTime.UtcNow.AddMinutes(-5);
var to = DateTime.UtcNow;
var inRange = buf.GetInRange(from, to);
// Remove a time window
buf.RemoveRange(from, to);
// Cleanup by cutoff (keeps >= cutoff)
buf.RemoveOlderThan(DateTime.UtcNow.AddMinutes(-1));
TemporalIntervalTree
using System;
using System.Linq;
using TemporalCollections.Collections;
var tree = new TemporalIntervalTree<string>();
var now = DateTime.UtcNow;
tree.Insert(now, now.AddMinutes(10), "session:A");
tree.Insert(now.AddMinutes(5), now.AddMinutes(15), "session:B");
// Overlap query (values only)
var overlapValues = tree.Query(now.AddMinutes(7), now.AddMinutes(12));
// Overlap query (with timestamps = interval starts)
var overlapItems = tree.GetInRange(now.AddMinutes(7), now.AddMinutes(12));
Console.WriteLine($"Overlaps: {string.Join(", ", overlapValues)}");
// Remove intervals that ended before a cutoff
tree.RemoveOlderThan(now.AddMinutes(9));
All collections are thread-safe. Locking granularity and common operations (amortized):
| Collection | Locking | Add/Push | Range Query | RemoveOlderThan |
|---|---|---|---|---|
| TemporalQueue | single lock around a queue snapshot | O(1) | O(n) | O(k) from head |
| TemporalStack | single lock; drain & rebuild for window ops | O(1) | O(n) | O(n) |
| TemporalSet | lock-free dict + per-bucket ops | O(1) avg | O(n) | O(n) |
| TemporalSortedList | single lock; binary search for ranges | O(n) insert | O(log n + m) | O(k) |
| TemporalPriorityQueue | single lock; SortedSet by (priority,timestamp) | O(log n) | O(n) | O(n) |
| TemporalIntervalTree | single lock; interval overlap pruning | O(log n) avg | O(log n + m) | O(n) |
| TemporalDictionary | concurrent dict + per-list lock | O(1) avg | O(n) | O(n) |
| TemporalCircularBuffer | single lock; ring overwrite | O(1) | O(n) | O(n) |
n = items, m = matches, k = removed.
Measured with BenchmarkDotNet, the results paint a consistent picture:
Insert-heavy pipelines with periodic age-off
TemporalQueue and TemporalCircularBuffer deliver the lowest median insert times (constant-time appends) and predictable pruning
(head-first for the queue, overwrite for the ring).
Frequent, wide time-window queries over large datasets
TemporalSortedList (binary-search boundaries) and TemporalIntervalTree (overlap index) offer the best query latency,
at the cost of more expensive inserts—especially for the sorted list.
Middle ground
TemporalSet and TemporalSlidingWindowSet show good insertion behavior and simple maintenance,
but range scans are linear compared to indexed structures.
Priority-aware processing
TemporalPriorityQueue optimizes for priority-based dequeue, so time-range scans and pruning are comparatively slower.
Per-key histories + global time queries
TemporalDictionary<TKey,TValue> is a balanced option when you need per-key histories together with global time queries,
while TemporalStack mirrors the queue on inserts but pays linear costs on range queries and pruning.
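As a rough illustration of how such comparisons can be reproduced, here is a minimal BenchmarkDotNet harness. The scenario, sizes, and collection calls (taken from the examples above) are assumptions, not the report's exact methodology.

```csharp
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using TemporalCollections.Collections;

[MemoryDiagnoser]
public class TemporalBenchmarks
{
    [Params(10_000)]
    public int N;

    // Constant-time appends: expected to be among the fastest inserts.
    [Benchmark]
    public TemporalQueue<int> QueueInsert()
    {
        var q = new TemporalQueue<int>();
        for (int i = 0; i < N; i++) q.Enqueue(i);
        return q;
    }

    // Costlier inserts, but binary-search-backed range queries (O(log n + m)).
    [Benchmark]
    public int SortedListRangeQuery()
    {
        var list = new TemporalSortedList<int>();
        for (int i = 0; i < N; i++) list.Add(i);
        return list.GetInRange(DateTime.UtcNow.AddMinutes(-1), DateTime.UtcNow).Count();
    }
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<TemporalBenchmarks>();
}
```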
For exact median timings, environment details, and methodology, see the full report:
TemporalCollections offers a pragmatic, production-minded approach to managing time-aware data in .NET: you get consistent timestamps, a unified query API, and a portfolio of structures optimized for different temporal needs. Start simple with a queue or sliding window set; when your workload demands it, switch to a sorted or interval-based structure, without changing how you query by time.
The modular monolith is a compromise in architectural form: like traditional monolithic applications, it is simple, but it also imposes clear module boundaries, similar to microservices. The software is deployed as a single unit, yet it is built from well-encapsulated components, each covering a domain or business function.
Often referred to as the Goldilocks architecture, it provides just-right flexibility and simplicity. You avoid the operational overhead of microservices (service orchestration, network latency, etc.) while improving maintainability and scalability over a tightly coupled monolith. Other compelling advantages:
Encapsulation through Modules: every module encapsulates its own domain logic, data access, and potentially its own database context. Modules interact with one another only via abstractions; preventing cross-module leakage keeps coupling low and cohesion high.
Vertical Slices + Domain-Driven Design: structuring your code around features (Vertical Slice Architecture) keeps the implementation close to its business use cases. As with Clean or Onion Architecture layers, a domain-centred approach also helps establish clearer dependencies and enforce isolation.
Shared vs Isolated Data: even when multiple modules share a database, architectural discipline should prevent one module from reaching into another's data. Each module should interact only with its own aggregates or schemas; this can be enforced through internal visibility or architectural boundary checks.
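One way to enforce that separation with EF Core is to give each module its own DbContext pinned to its own database schema. This is a hedged sketch; the context, entity, and schema names here are illustrative, not part of the article's sample solution.

```csharp
using Microsoft.EntityFrameworkCore;

// Hypothetical per-module context: the Inventory module maps only its
// own tables, under its own schema, so it cannot reach Orders data.
public class InventoryDbContext : DbContext
{
    public InventoryDbContext(DbContextOptions<InventoryDbContext> options)
        : base(options) { }

    public DbSet<StockItem> StockItems => Set<StockItem>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // All Inventory tables live in the "inventory" schema.
        modelBuilder.HasDefaultSchema("inventory");
    }
}

public class StockItem
{
    public int ProductId { get; set; }
    public int Quantity { get; set; }
}
```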
In a modular monolith, each module is self-contained and exposes only a public API (interfaces, commands, events) to other modules, while internal classes remain inaccessible. Imagine a .NET solution structured like this:
MyApp.Web // Entry point
MyApp.Modules.Orders // Orders domain
MyApp.Modules.Inventory // Inventory domain
MyApp.SharedKernel // Shared contracts
MyApp.Modules.Orders.Tests
MyApp.Modules.Inventory.Tests
// SharedKernel/Contracts/IInventoryService.cs
public interface IInventoryService
{
bool AdjustStock(int productId, int quantity);
}
// Modules.Inventory/InventoryService.cs (internal: not visible outside the module)
internal class InventoryService : IInventoryService
{
public bool AdjustStock(int productId, int quantity) { ... }
}
Modules expose public interfaces for cross-module communication.
public static IServiceCollection AddInventoryModule(this IServiceCollection services)
{
services.AddScoped<IInventoryService, InventoryService>();
return services;
}
For example, OrderService can depend on an IInventoryService defined in a shared contract, promoting decoupling.
Use the internal modifier to enforce module boundaries. Some frameworks, like Ardalis.Modulith (https://github.com/ardalis/modulith), even automate boundary checking via tests.
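Boundary rules can also be asserted in an ordinary unit test, for instance with the NetArchTest.Rules package. A sketch under the sample solution's namespaces (the exact rule set is up to you):

```csharp
using NetArchTest.Rules;
using Xunit;

public class ModuleBoundaryTests
{
    [Fact]
    public void Orders_ShouldNotDependOn_Inventory()
    {
        // Orders may use the shared contracts, but never the Inventory
        // module's own namespace.
        var result = Types.InAssembly(typeof(OrderService).Assembly)
            .That().ResideInNamespace("MyApp.Modules.Orders")
            .ShouldNot().HaveDependencyOn("MyApp.Modules.Inventory")
            .GetResult();

        Assert.True(result.IsSuccessful);
    }
}
```

Such a test fails the build as soon as someone takes a shortcut across a module boundary, which keeps the discipline honest over time.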
Usage in Orders module:
public class OrderService
{
private readonly IInventoryService _inventory;
public OrderService(IInventoryService inventory) => _inventory = inventory;
public bool PlaceOrder(int productId, int qty) =>
_inventory.AdjustStock(productId, -qty);
}
The key is that modules never depend on each other’s internal classes only on public contracts. This keeps boundaries clear, reduces coupling, and makes future refactoring or extraction into microservices easier.
This style shines when:
You want clear boundaries and testability without the operational cost of a distributed system.
The team is comfortable with a single deployment unit but needs independently owned modules.
Extraction into microservices is a possible long-term goal and you want a low-risk migration path.
Modular monoliths are a clever compromise. They are easier to operate than a real distributed system, while still yielding structure, maintainability, and testability. They also fit .NET tooling well and provide a future migration path to microservices if that is the long-term goal.
Hexagonal Architecture, introduced by Alistair Cockburn, is an approach to software design that aims to isolate the core business logic from external concerns like databases, web APIs, UIs, and messaging systems.
Key Concepts:
The domain core holds the business rules and knows nothing about infrastructure.
Ports are interfaces the core defines for everything it needs or exposes (inbound use cases, outbound dependencies such as repositories).
Adapters implement those ports with concrete technology: a REST controller, an EF Core repository, a message consumer.
Benefits:
Testability: the core can be exercised with mocked ports, no database or web host required.
Separation of concerns: infrastructure changes stay at the edges.
Adaptability: databases, UIs, and frameworks become swappable details.
Software development in the .NET world is often quick off the blocks: ASP.NET Core gives us ready-to-use templates, Entity Framework simplifies data access, and integrating third-party services is just a package away. However, as expectations shift and the code grows, tightly coupled code becomes a drag rather than a benefit. This is where Hexagonal Architecture really shines. Instead of structuring your app around frameworks and technology, you structure it around your domain logic, the essence of your application that actually matters. By isolating the core business rules from external dependencies, Hexagonal Architecture helps .NET developers create systems that are more maintainable, testable, and adaptable over time. You are no longer tied to a specific database, UI, or even framework; those are just adapters that can be swapped out with minimal impact. This is why the approach can be a game-changer for .NET applications.
In the real world, this structure proves especially powerful in long-lived systems where infrastructure choices change over time.
Overall, Hexagonal Architecture does not slow your development process down; it simply ensures that the speed you gain today won't come at the expense of maintainability tomorrow.
Let’s implement a simple Webinar Management system following the hexagonal approach.
Domain: WebinarControl.Core
Application Layer: WebinarControl.Application
Adapters:
WebinarControl.Infrastructure: Database Adapter (EF Core)
WebinarControl.WebApi: REST API Adapter (ASP.NET Core)
WebinarControl.Core
// Models/Webinar.cs
namespace WebinarControl.Core.Models
{
public class Webinar
{
public Guid Id { get; set; } = Guid.NewGuid();
public string Title { get; set; } = string.Empty;
public string Speaker { get; set; } = string.Empty;
public DateTime ScheduledAt { get; set; }
}
}
// Ports/IWebinarRepository.cs
using WebinarControl.Core.Models;
namespace WebinarControl.Core.Ports
{
public interface IWebinarRepository
{
Task<Webinar> GetByIdAsync(Guid id);
Task<IEnumerable<Webinar>> GetAllAsync();
Task AddAsync(Webinar webinar);
}
}
// Ports/IScheduleWebinarUseCase.cs
using WebinarControl.Core.Models;
namespace WebinarControl.Core.Ports
{
public interface IScheduleWebinarUseCase
{
Task<Webinar> ScheduleAsync(string title, string speaker, DateTime scheduledAt);
}
}
WebinarControl.Application
// Services/ScheduleWebinarService.cs
using WebinarControl.Core.Models;
using WebinarControl.Core.Ports;
namespace WebinarControl.Application.Services
{
public class ScheduleWebinarService : IScheduleWebinarUseCase
{
private readonly IWebinarRepository _repository;
public ScheduleWebinarService(IWebinarRepository repository)
{
_repository = repository;
}
public async Task<Webinar> ScheduleAsync(string title, string speaker, DateTime scheduledAt)
{
var webinar = new Webinar { Title = title, Speaker = speaker, ScheduledAt = scheduledAt };
await _repository.AddAsync(webinar);
return webinar;
}
}
}
WebinarControl.Infrastructure
// EF/WebinarDbContext.cs
using Microsoft.EntityFrameworkCore;
using WebinarControl.Core.Models;
namespace WebinarControl.Infrastructure.EF
{
public class WebinarDbContext : DbContext
{
public DbSet<Webinar> Webinars => Set<Webinar>();
public WebinarDbContext(DbContextOptions<WebinarDbContext> options)
: base(options) { }
}
}
// Repositories/WebinarRepository.cs
using Microsoft.EntityFrameworkCore;
using WebinarControl.Core.Models;
using WebinarControl.Core.Ports;
namespace WebinarControl.Infrastructure.Repositories
{
public class WebinarRepository : IWebinarRepository
{
private readonly WebinarDbContext _context;
public WebinarRepository(WebinarDbContext context)
{
_context = context;
}
public async Task AddAsync(Webinar webinar)
{
_context.Webinars.Add(webinar);
await _context.SaveChangesAsync();
}
public async Task<IEnumerable<Webinar>> GetAllAsync()
{
return await _context.Webinars.ToListAsync();
}
public async Task<Webinar> GetByIdAsync(Guid id)
{
return await _context.Webinars.FindAsync(id);
}
}
}
WebinarControl.WebApi
// Program.cs
using Microsoft.EntityFrameworkCore;
using WebinarControl.Core.Ports;
using WebinarControl.Application.Services;
using WebinarControl.Infrastructure.EF;
using WebinarControl.Infrastructure.Repositories;
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddDbContext<WebinarDbContext>(opt =>
opt.UseInMemoryDatabase("WebinarDb"));
builder.Services.AddScoped<IWebinarRepository, WebinarRepository>();
builder.Services.AddScoped<IScheduleWebinarUseCase, ScheduleWebinarService>();
builder.Services.AddEndpointsApiExplorer();
builder.Services.AddSwaggerGen();
var app = builder.Build();
app.UseSwagger();
app.UseSwaggerUI();
app.MapPost("/webinars", async (string title, string speaker, DateTime scheduledAt, IScheduleWebinarUseCase useCase) =>
{
var webinar = await useCase.ScheduleAsync(title, speaker, scheduledAt);
return Results.Created($"/webinars/{webinar.Id}", webinar);
});
app.Run();
// UnitTests/ScheduleWebinarServiceTests.cs
using Moq;
using WebinarControl.Core.Ports;
using WebinarControl.Application.Services;
using Xunit;
public class ScheduleWebinarServiceTests
{
[Fact]
public async Task ScheduleAsync_ShouldCreateWebinarWithGivenDetails()
{
var repoMock = new Mock<IWebinarRepository>();
var service = new ScheduleWebinarService(repoMock.Object);
var scheduledAt = DateTime.UtcNow.AddDays(1);
var webinar = await service.ScheduleAsync(".NET Hexagonal", "F. Del Re", scheduledAt);
Assert.Equal(".NET Hexagonal", webinar.Title);
Assert.Equal("F. Del Re", webinar.Speaker);
Assert.Equal(scheduledAt, webinar.ScheduledAt);
repoMock.Verify(r => r.AddAsync(It.IsAny<Webinar>()), Times.Once);
}
}
While Hexagonal Architecture offers numerous benefits, especially around testability, separation of concerns, and adaptability, it’s important to understand that it comes with trade-offs.
Initial Complexity and Overhead: When starting a new project, Hexagonal Architecture introduces abstractions and layers that may feel premature or heavy-weight for small or prototype applications. You might be crafting multiple projects, interfaces, and dependency injections before you see a single feature working.
Over-Abstraction: Too much abstraction can lead to boilerplate code and cognitive overhead, especially when interfaces are created for every single service, even when there’s only a single implementation. For small teams or codebases, this slows down development rather than speeding it up.
Learning Curve for Developer Teams: developers new to the pattern may struggle to find their way around or contribute to the codebase. Terms like ports, adapters, and the separation of inbound and outbound interfaces require a shift in thinking from traditional layered architectures.
Hexagonal Architecture is a strong paradigm for .NET developers who want to build maintainable, testable, and adaptable systems. By keeping domain logic and infrastructure concerns well separated, you not only decouple your code but also allow your application to evolve independently of technology choices. With .NET and C#'s evolving features, it is now easier than ever to write well-architected systems that stand the test of time.
In a microservices context, and specifically in a Kubernetes environment, health probes are ways for the platform to track the status of each container and take relevant action when something goes wrong. Health probes help ensure high availability and are a key part of managing the orchestration of services because they deliver information to Kubernetes about your application’s internal state. There are three types of probes: liveness probe, readiness probe, and startup probe.
The liveness probe determines whether your application is still alive. It answers a simple question: is the application running, or is it deadlocked or stuck? If the liveness probe keeps failing, Kubernetes treats the container as broken and restarts it automatically. This is useful in cases where the app has stopped processing due to some internal failure but the process has not crashed. Liveness checks are usually simple and fast, just enough to confirm that the core application loop has not gone down. A properly configured liveness probe prevents long-running but non-live containers from staying in production, improving overall resilience.
The readiness probe checks whether a container is ready to serve requests. The container may be alive (as determined by the liveness probe) but still unable to serve traffic for a variety of reasons: it may still be initializing, waiting for configuration, or establishing a database connection. In that case, Kubernetes removes the Pod from the Service endpoint list until the readiness probe succeeds again. The container is not restarted; it is simply held back from receiving requests. This readiness check is especially important during deployments, rolling updates, and restarts, to ensure that only containers that are fully ready take on load.
The startup probe is intended for applications that take a long time to initialize. It runs only during startup, and while it is failing, Kubernetes will not run the liveness or readiness probes. This is particularly valuable for legacy systems or services with long bootstrapping processes: it avoids the case where the liveness probe prematurely marks the container as failed, and restarts it, before the application is even ready. Once the startup probe succeeds, Kubernetes begins running the regular readiness and liveness probes.
Health check functionality is built into the ASP.NET Core framework and does not require any additional packages.
In your Program.cs or Startup.cs, register health checks:
builder.Services.AddHealthChecks()
    .AddCheck("self", () => HealthCheckResult.Healthy())
    .AddCheck<DatabaseHealthCheck>("database");
You can create custom checks by implementing the IHealthCheck interface, which contains a single CheckHealthAsync method:
public class DatabaseHealthCheck : IHealthCheck
{
private readonly IConfiguration _config;
public DatabaseHealthCheck(IConfiguration config)
{
_config = config;
}
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
var connectionString = _config["Data:DefaultConnection"];
using var connection = new SqlConnection(connectionString);
try
{
await connection.OpenAsync(cancellationToken);
return HealthCheckResult.Healthy();
}
catch (Exception ex)
{
return HealthCheckResult.Unhealthy("Database connection failed", ex);
}
}
}
Map the health check endpoints in Program.cs:
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
Predicate = (check) => check.Name == "self"
});
app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
Predicate = (check) => check.Name == "database"
});
In certain cases, it’s useful to aggregate multiple health checks under a single, composite health check. This is particularly helpful when you want to expose a higher-level abstraction like StorageHealth, which internally evaluates the health of, say, a database, a blob storage, and a file system. Here’s how you can implement a composite health check by composing multiple IHealthCheck instances:
public class StorageHealthCheck : IHealthCheck
{
private readonly IEnumerable<IHealthCheck> _checks;
public StorageHealthCheck(IEnumerable<IHealthCheck> checks)
{
_checks = checks;
}
public async Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default)
{
var results = await Task.WhenAll(_checks.Select(c =>
c.CheckHealthAsync(context, cancellationToken)));
if(results.Any(r => r.Status == HealthStatus.Unhealthy))
{
return HealthCheckResult.Unhealthy();
}
return HealthCheckResult.Healthy();
}
}
You can register it like this:
builder.Services.AddHealthChecks()
.AddCheck<StorageHealthCheck>("storage_health");
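Note that the composite's constructor asks DI for IEnumerable<IHealthCheck>, so the child checks must themselves be registered as services. A hedged registration sketch (BlobStorageHealthCheck is a hypothetical second child, following the "blob storage" example above):

```csharp
// Register the children as IHealthCheck so the composite can resolve them,
// then register the composite as the only named health check.
builder.Services.AddSingleton<IHealthCheck, DatabaseHealthCheck>();
builder.Services.AddSingleton<IHealthCheck, BlobStorageHealthCheck>(); // hypothetical
builder.Services.AddHealthChecks()
    .AddCheck<StorageHealthCheck>("storage_health");
```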
Example configuration for Kubernetes deployment.yaml:
livenessProbe:
httpGet:
path: /health/live
port: 80
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
readinessProbe:
httpGet:
path: /health/ready
port: 80
initialDelaySeconds: 5
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
startupProbe:
httpGet:
path: /health/live
port: 80
initialDelaySeconds: 0
periodSeconds: 10
failureThreshold: 30
initialDelaySeconds: time to wait after the container starts before probing.
periodSeconds: how often to perform the check.
failureThreshold: number of failed checks before taking action.
timeoutSeconds: timeout for each probe request.
A few practical guidelines:
Use /health/ready to include checks for dependencies like databases, caches, etc.
Use /health/live to ensure your app is running, even if not fully operational.
Use a startupProbe for apps that need extra time to initialize.
Health probes are a vital part of robust microservices. By combining ASP.NET Core’s health check system with Kubernetes probes, you can ensure your services behave reliably and scale appropriately. When you correctly implement the liveness, readiness, and startup probes, you reduce downtime and increase observability.