RegexCraft

Build regex patterns with IntelliSense, not StackOverflow.

Fluent API for building regex patterns in C# with full IntelliSense support, an Explain method, GeneratedRegex integration, and pre-built common patterns.

Technical Details

Attribute        Value
Language         C#
Target           .NET 8+
Distribution     NuGet
Test framework   xUnit + FluentAssertions

Architecture

RegexCraft/
  RegexCraft.sln
  src/
    RegexCraft/
      RegexCraft.csproj
      Pattern.cs                    # Main fluent API entry point
      PatternBuilder.cs             # Internal builder state
      Quantifiers/
        QuantifierExtensions.cs     # .OneOrMore, .ZeroOrMore, .Optional, .Exactly(n), .Between(min,max), .AtLeast(n)
      CharacterClasses/
        CharacterClassExtensions.cs # .Digits, .Letters, .WordChars, .Whitespace, .Any, .AnyOf, .NoneOf
      Groups/
        GroupExtensions.cs          # .Group(...), .NamedGroup(...), .NonCapturing(...)
      Anchors/
        AnchorExtensions.cs         # .StartOfLine, .EndOfLine, .WordBoundary
      Core/
        Alternation.cs              # .Or(...)
        Literal.cs                  # .Then("..."), .Literal("...")
      Explain/
        PatternExplainer.cs         # Human-readable explanation of the pattern
      CodeGen/
        GeneratedRegexEmitter.cs    # Generates [GeneratedRegex] attribute code
      Prebuilt/
        Patterns.cs                 # Pre-built patterns: Email, Url, IpV4, Phone, Guid
  tests/
    RegexCraft.Tests/
      RegexCraft.Tests.csproj
      PatternBuilderTests.cs
      QuantifierTests.cs
      CharacterClassTests.cs
      GroupTests.cs
      AnchorTests.cs
      AlternationTests.cs
      ExplainTests.cs
      GeneratedRegexTests.cs
      PrebuiltPatternTests.cs
      IntegrationTests.cs

Fluent API Design

The API is built on method chaining. The entry point is Pattern, a static class that starts a builder chain; every subsequent method returns the builder so that calls can be chained.

Key Design Decision: Quantifier-First vs Target-First

The API uses a quantifier-first style where quantifiers come before the character class they apply to:

Pattern.OneOrMore.Digits           // \d+
Pattern.Exactly(3).Digits          // \d{3}
Pattern.Optional.Letters           // [a-zA-Z]?
Pattern.Between(2, 6).Letters      // [a-zA-Z]{2,6}

This reads naturally: "one or more digits", "exactly 3 digits", "optional letters".

For anchors and literals, no quantifier is needed:

Pattern.StartOfLine                // ^
Pattern.Then("-")                  // - (a hyphen needs no escaping outside a character class)
Pattern.EndOfLine                  // $

Chaining Examples

// Phone number: ^\d{3}-\d{4}$
Pattern.StartOfLine
    .Exactly(3).Digits
    .Then("-")
    .Exactly(4).Digits
    .EndOfLine
    .Build()

// Email (simplified): \w+@\w+\.[a-zA-Z]{2,6}
Pattern.OneOrMore.WordChars
    .Then("@")
    .OneOrMore.WordChars
    .Then(".")
    .Between(2, 6).Letters
    .Build()

// URL prefix: (https?)?://
// Group(...) is capturing; use NonCapturing(...) to get (?:https?)?:// instead.
Pattern.Optional.Group(p => p.Literal("http").Optional.Literal("s"))
    .Then("://")
    .Build()
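As a quick sanity check, the regex strings quoted in the comments for the phone and email examples can be run against sample input with the standard System.Text.RegularExpressions API. This sketch does not use RegexCraft itself; it only exercises the strings the builder is documented to produce, and the sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Regex strings copied from the comments in the chaining examples.
const string phone = @"^\d{3}-\d{4}$";
const string email = @"\w+@\w+\.[a-zA-Z]{2,6}";

Console.WriteLine(Regex.IsMatch("555-1234", phone));                 // True
Console.WriteLine(Regex.IsMatch("5551234", phone));                  // False (no hyphen)
Console.WriteLine(Regex.IsMatch("user@example.com", email));         // True
Console.WriteLine(Regex.IsMatch("a@b.c", email));                    // False (needs 2 to 6 letters after the dot)

// The email pattern has no anchors, so it also matches inside longer text.
Console.WriteLine(Regex.IsMatch("see user@example.com now", email)); // True
```

Adding StartOfLine and EndOfLine to the email chain, as the phone example does, would restrict it to whole-string validation.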

Builder State

The PatternBuilder maintains:

  • List<string> segments — accumulated regex fragments
  • A "pending quantifier" state: when .OneOrMore is called, the builder stores the quantifier and applies it to the next element added (a character class, or a group or literal as in the URL prefix example)
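A minimal sketch of how the two pieces of state above could interact. SketchBuilder, Quantify, and Append are hypothetical names chosen for illustration; they are not RegexCraft's actual internals.

```csharp
using System;
using System.Collections.Generic;

// "One or more digits", a literal hyphen, then "exactly 4 digits".
var pattern = new SketchBuilder()
    .Quantify("+").Append(@"\d")
    .Append("-")
    .Quantify("{4}").Append(@"\d")
    .Build();
Console.WriteLine(pattern); // \d+-\d{4}

// Hypothetical stand-in for PatternBuilder; not the library's real code.
public sealed class SketchBuilder
{
    private readonly List<string> _segments = new();
    private string? _pendingQuantifier;

    // Store the quantifier until the next fragment arrives.
    public SketchBuilder Quantify(string quantifier)
    {
        if (_pendingQuantifier is not null)
            throw new InvalidOperationException("A quantifier is already pending.");
        _pendingQuantifier = quantifier;
        return this;
    }

    // Add a fragment, attaching the pending quantifier (if any) as a suffix.
    public SketchBuilder Append(string fragment)
    {
        _segments.Add(fragment + _pendingQuantifier);
        _pendingQuantifier = null;
        return this;
    }

    // Concatenate the fragments; a quantifier with no target is an error.
    public string Build()
    {
        if (_pendingQuantifier is not null)
            throw new InvalidOperationException("Quantifier has no target.");
        return string.Concat(_segments);
    }
}
```

A real implementation would also have to wrap multi-character fragments in a group before attaching a quantifier, since a suffix binds only to the last atom.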

API Reference

Entry Point: Pattern (static class)

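Quantifiers (return a QuantifierState that expects the element to quantify, normally a character class; the URL prefix example also follows a quantifier with a group and a literal):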

  • Pattern.OneOrMore → pending +
  • Pattern.ZeroOrMore → pending *
  • Pattern.Optional → pending ?
  • Pattern.Exactly(int n) → pending {n}
  • Pattern.Between(int min, int max) → pending {min,max}
  • Pattern.AtLeast(int n) → pending {n,}
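One way to get the quantifier-first shape listed above is a separate state type that exposes only the elements a quantifier can apply to, so that a dangling quantifier does not compile. The sketch below assumes that design; SketchPattern, SketchQuantifierState, and SketchChain are hypothetical names, and only a few of the listed members are shown.

```csharp
using System;

Console.WriteLine(SketchPattern.OneOrMore.Digits.Build());       // \d+
Console.WriteLine(SketchPattern.Exactly(3).Digits.Build());      // \d{3}
Console.WriteLine(SketchPattern.Between(2, 6).Letters.Build());  // [a-zA-Z]{2,6}

// Hypothetical entry point mirroring a subset of the quantifier list.
public static class SketchPattern
{
    public static SketchQuantifierState OneOrMore => new(new SketchChain(""), "+");

    public static SketchQuantifierState Exactly(int n)
    {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        return new(new SketchChain(""), "{" + n + "}");
    }

    public static SketchQuantifierState Between(int min, int max)
    {
        if (min < 0 || max < min) throw new ArgumentOutOfRangeException(nameof(max));
        return new(new SketchChain(""), "{" + min + "," + max + "}");
    }
}

// Holds the pending quantifier; offers character classes but no Build().
public sealed class SketchQuantifierState
{
    private readonly SketchChain _chain;
    private readonly string _quantifier;

    public SketchQuantifierState(SketchChain chain, string quantifier)
    {
        _chain = chain;
        _quantifier = quantifier;
    }

    public SketchChain Digits => _chain.Append(@"\d" + _quantifier);
    public SketchChain Letters => _chain.Append("[a-zA-Z]" + _quantifier);
}

// Immutable chain: every Append returns a new instance.
public sealed class SketchChain
{
    private readonly string _pattern;
    public SketchChain(string pattern) => _pattern = pattern;
    public SketchChain Append(string fragment) => new(_pattern + fragment);
    public string Build() => _pattern;
}
```

Because SketchQuantifierState has no Build() member, an expression such as SketchPattern.OneOrMore.Build() is rejected by the compiler, and IntelliSense offers only valid continuations.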

Character Classes (can follow a quantifier or be used directly):

  • .Digits → \d (with a pending quantifier, e.g. \d+)
  • .Letters → [a-zA-Z]
  • .WordChars → \w
  • .Whitespace → \s
  • .Any → .
  • .AnyOf(string chars) → [chars] (escaped as needed)
  • .NoneOf(string chars) → [^chars]
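The "escaped as needed" note matters because the characters that are special inside a character class differ from those that are special elsewhere in a pattern. The sketch below shows one plausible escaping rule; CharClassSketch and EscapeForClass are hypothetical helpers, not RegexCraft's actual code.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

Console.WriteLine(CharClassSketch.AnyOf("a-z]"));                      // [a\-z\]]
Console.WriteLine(Regex.IsMatch("-", CharClassSketch.AnyOf("a-z]")));  // True: '-' is a literal member
Console.WriteLine(Regex.IsMatch("m", CharClassSketch.AnyOf("a-z]")));  // False: "a-z" is not a range here
Console.WriteLine(CharClassSketch.NoneOf("^"));                        // [^\^]

// Hypothetical helper; the escaping rule is an assumption for illustration.
public static class CharClassSketch
{
    public static string AnyOf(string chars) => "[" + EscapeForClass(chars) + "]";
    public static string NoneOf(string chars) => "[^" + EscapeForClass(chars) + "]";

    private static string EscapeForClass(string chars)
    {
        if (string.IsNullOrEmpty(chars))
            throw new ArgumentException("At least one character is required.", nameof(chars));

        var sb = new StringBuilder();
        foreach (char c in chars)
        {
            // Backslash-escape the characters that carry meaning inside [...].
            if (c is '\\' or ']' or '[' or '^' or '-')
                sb.Append('\\');
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```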

Anchors:

  • .StartOfLine → ^
  • .EndOfLine → $
  • .WordBoundary → \b

Literals:

  • .Then(string text) → escaped literal text
  • .Literal(string text) → same as .Then()
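The standard way to escape literal text in .NET is Regex.Escape. Whether RegexCraft calls it internally is an assumption, but its output is consistent with the regex strings shown in the chaining examples:

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(Regex.Escape("-"));    // -      (not a metacharacter outside [...])
Console.WriteLine(Regex.Escape("."));    // \.
Console.WriteLine(Regex.Escape("@"));    // @
Console.WriteLine(Regex.Escape("://"));  // ://
Console.WriteLine(Regex.Escape("a+b"));  // a\+b
```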

Groups:

  • .Group(Action<PatternBuilder> inner) → (...) (capturing group)
  • .NamedGroup(string name, Action<PatternBuilder> inner) → (?<name>...)
  • .NonCapturing(Action<PatternBuilder> inner) → (?:...)

Alternation:

  • .Or(Action<PatternBuilder> alternative) → |...
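The group and alternation methods take a callback that fills in the inner pattern. A minimal sketch of how that could be wired up follows; NestedSketch and its Raw method are hypothetical stand-ins for PatternBuilder, and fragments are appended without escaping to keep the example short.

```csharp
using System;
using System.Text;

var pattern = new NestedSketch()
    .NamedGroup("scheme", g => g.Raw("http").Or(a => a.Raw("ftp")))
    .Raw("://")
    .Build();
Console.WriteLine(pattern); // (?<scheme>http|ftp)://

// Hypothetical stand-in for PatternBuilder; not the library's real code.
public sealed class NestedSketch
{
    private readonly StringBuilder _sb = new();

    // Append a fragment as-is (no escaping in this sketch).
    public NestedSketch Raw(string fragment)
    {
        _sb.Append(fragment);
        return this;
    }

    public NestedSketch Group(Action<NestedSketch> inner) => Wrap("(", inner);

    public NestedSketch NamedGroup(string name, Action<NestedSketch> inner) =>
        Wrap("(?<" + name + ">", inner);

    public NestedSketch NonCapturing(Action<NestedSketch> inner) => Wrap("(?:", inner);

    // Append '|' and let the callback add the alternative to the same builder.
    public NestedSketch Or(Action<NestedSketch> alternative)
    {
        _sb.Append('|');
        alternative(this);
        return this;
    }

    public string Build() => _sb.ToString();

    // Build the inner pattern on a fresh builder, then wrap it in delimiters.
    private NestedSketch Wrap(string open, Action<NestedSketch> inner)
    {
        var child = new NestedSketch();
        inner(child);
        _sb.Append(open).Append(child.Build()).Append(')');
        return this;
    }
}
```

Because | has the lowest precedence in a regex, an Or at the top level of a chain splits the whole pattern; wrapping the alternatives in a group, as above, limits its scope.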

Terminal Methods:

  • .Build() → returns the regex string
  • .ToRegex() → returns a compiled Regex object
  • .Test(string input) → returns bool (calls Regex.IsMatch)
  • .Explain() → returns human-readable description
  • .ToGeneratedRegex() → returns C# code string with [GeneratedRegex] attribute
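The first three terminal methods map directly onto the standard Regex API. The sketch below shows one plausible mapping; BuiltPatternSketch is a hypothetical type, and reading "compiled" as RegexOptions.Compiled is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

var built = new BuiltPatternSketch(@"^\d{3}-\d{4}$");
Console.WriteLine(built.Build());                            // ^\d{3}-\d{4}$
Console.WriteLine(built.Test("555-1234"));                   // True
Console.WriteLine(built.ToRegex().Match("555-1234").Value);  // 555-1234

// Hypothetical wrapper around an already-built pattern string.
public sealed class BuiltPatternSketch
{
    private readonly string _pattern;

    public BuiltPatternSketch(string pattern) => _pattern = pattern;

    // Returns the raw regex string.
    public string Build() => _pattern;

    // Assumption: "compiled" means RegexOptions.Compiled.
    public Regex ToRegex() => new Regex(_pattern, RegexOptions.Compiled);

    // Thin wrapper over the static Regex.IsMatch(input, pattern).
    public bool Test(string input) => Regex.IsMatch(input, _pattern);
}
```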

Explain Method

PatternExplainer walks the built pattern and produces an English description:

Pattern.StartOfLine.Exactly(3).Digits.Then("-").Exactly(4).Digits.EndOfLine.Explain()
// → "Start of line, exactly 3 digits, literal '-', exactly 4 digits, end of line"

The explainer doesn't parse raw regex — it uses the builder's internal segment list with metadata.
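A sketch of what a segment list with metadata could look like. The Segment record and ExplainSketch class are hypothetical; the only detail taken from the document is the expected explanation string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var segments = new List<Segment>
{
    new("^", "start of line"),
    new(@"\d{3}", "exactly 3 digits"),
    new("-", "literal '-'"),
    new(@"\d{4}", "exactly 4 digits"),
    new("$", "end of line"),
};

Console.WriteLine(ExplainSketch.Build(segments));
// ^\d{3}-\d{4}$
Console.WriteLine(ExplainSketch.Explain(segments));
// Start of line, exactly 3 digits, literal '-', exactly 4 digits, end of line

// Hypothetical segment: the regex fragment plus its description.
public sealed record Segment(string Regex, string Description);

public static class ExplainSketch
{
    // The pattern is the concatenation of the fragments.
    public static string Build(IEnumerable<Segment> segments) =>
        string.Concat(segments.Select(s => s.Regex));

    // The explanation joins the descriptions and capitalizes the first letter.
    public static string Explain(IEnumerable<Segment> segments)
    {
        var text = string.Join(", ", segments.Select(s => s.Description));
        return text.Length == 0 ? text : char.ToUpperInvariant(text[0]) + text[1..];
    }
}
```

Recording the description when each segment is added keeps the explanation in step with the pattern and avoids writing a regex parser.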

GeneratedRegex Code Generation

.ToGeneratedRegex() produces:

[GeneratedRegex(@"^\d{3}-\d{4}$")]
private static partial Regex MyPattern();

The method name defaults to "MyPattern" but can be customized: .ToGeneratedRegex("PhoneRegex").
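A sketch of how such an emitter could be written, followed by how the emitted code is consumed. EmitterSketch and PhoneValidator are hypothetical names. The generated declaration compiles only inside a partial type in a file that imports System.Text.RegularExpressions, as PhoneValidator shows.

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(EmitterSketch.Emit(@"^\d{3}-\d{4}$", "PhoneRegex"));
// [GeneratedRegex(@"^\d{3}-\d{4}$")]
// private static partial Regex PhoneRegex();

Console.WriteLine(PhoneValidator.IsValid("555-1234")); // True

// Hypothetical emitter; not RegexCraft's GeneratedRegexEmitter.
public static class EmitterSketch
{
    public static string Emit(string pattern, string methodName = "MyPattern")
    {
        // In a verbatim string literal only the double quote needs escaping (doubled).
        var literal = pattern.Replace("\"", "\"\"");
        return $"[GeneratedRegex(@\"{literal}\")]\n" +
               $"private static partial Regex {methodName}();";
    }
}

// Where the emitted declaration would live once pasted into a project.
public static partial class PhoneValidator
{
    [GeneratedRegex(@"^\d{3}-\d{4}$")]
    private static partial Regex PhoneRegex();

    public static bool IsValid(string input) => PhoneRegex().IsMatch(input);
}
```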

Pre-built Patterns

The Patterns static class exposes pre-built PatternBuilder instances:

  • Patterns.Email — standard email validation pattern
  • Patterns.Url — HTTP/HTTPS URL pattern
  • Patterns.IpV4 — IPv4 address pattern
  • Patterns.Phone — US phone number pattern (various formats)
  • Patterns.Guid — GUID/UUID pattern (with or without hyphens)

Each can be further customized by chaining, or used directly via .Build().

XML Documentation

Every public method and property has XML doc comments for full IntelliSense support. This is a core feature — the entire value proposition is discoverability via IntelliSense.

/// <summary>
/// Matches one or more occurrences of the character class that follows.
/// </summary>
/// <example>
/// Pattern.OneOrMore.Digits.Build() // → \d+
/// </example>
public static QuantifierState OneOrMore => ...