PROMPT: Build RegexCraft from scratch

You are Claude Code. Build the entire RegexCraft project autonomously. Do NOT ask any questions. Make all decisions yourself. Follow every step below in order.


Project Overview

RegexCraft is a C# fluent API for building regex patterns with full IntelliSense support. It provides method chaining for pattern construction, an Explain method for human-readable descriptions, GeneratedRegex code generation, and pre-built common patterns.

  • Language: C#
  • Target: .NET 8.0
  • Test Framework: xUnit + FluentAssertions
  • Package Manager: NuGet (dotnet CLI)
  • Working Directory: /Users/kexxt/code-opensource/regexcraft

Step 1: Initialize the Project

cd /Users/kexxt/code-opensource/regexcraft
dotnet new sln -n RegexCraft
mkdir -p src/RegexCraft tests/RegexCraft.Tests

Create the projects:

  • src/RegexCraft/RegexCraft.csproj — Class library targeting net8.0. Set <PackageId>RegexCraft</PackageId>, <GenerateDocumentationFile>true</GenerateDocumentationFile>, <NoWarn>$(NoWarn);CS1591</NoWarn> (but aim to document everything).
  • tests/RegexCraft.Tests/RegexCraft.Tests.csproj — xUnit test project targeting net8.0. Reference RegexCraft. Add FluentAssertions.

Add all to solution:

dotnet sln add src/RegexCraft/RegexCraft.csproj
dotnet sln add tests/RegexCraft.Tests/RegexCraft.Tests.csproj

Step 2: Implement Core Builder

Design

The fluent API uses three key types:

  1. Pattern — static entry point with methods/properties that return builders
  2. PatternBuilder — mutable builder that accumulates regex segments
  3. QuantifierState — intermediate state after a quantifier is specified, awaiting a character class

The builder stores a list of Segment records internally, each containing the raw regex string and metadata (for the Explain feature).

src/RegexCraft/Segment.cs

internal record Segment(string Regex, string Description);

src/RegexCraft/PatternBuilder.cs

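Mutable builder class (public, so callers can chain on it):

public class PatternBuilder
{
    private readonly List<Segment> _segments = new();
    private string? _pendingQuantifier;      // e.g. "+" or "{3}"; null when nothing is pending
    private string? _pendingQuantifierDesc;  // e.g. "one or more"; used by Explain()

    // Members described below.
}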

Members (all return PatternBuilder for chaining, except the terminal methods):

Quantifiers (set pending state):

  • OneOrMore (property) → pending +, desc "one or more"
  • ZeroOrMore (property) → pending *, desc "zero or more"
  • Optional (property) → pending ?, desc "optional"
  • Exactly(int n) → pending {n}, desc "exactly N"
  • Between(int min, int max) → pending {min,max}, desc "between min and max"
  • AtLeast(int n) → pending {n,}, desc "at least N"

Character Classes (consume pending quantifier):

  • Digits (property) → \d + quantifier
  • Letters (property) → [a-zA-Z] + quantifier
  • WordChars (property) → \w + quantifier
  • Whitespace (property) → \s + quantifier
  • Any (property) → . + quantifier
  • AnyOf(string chars) → [chars] + quantifier
  • NoneOf(string chars) → [^chars] + quantifier
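
The pending-quantifier handoff described above can be sketched as follows. This is a minimal illustration, not the required implementation: SketchBuilder, SetPending, and AddClass are hypothetical names, and only OneOrMore, Exactly, and Digits are shown.

```csharp
using System.Collections.Generic;
using System.Linq;

// Stand-in for the Segment record from Segment.cs.
internal record Segment(string Regex, string Description);

// Hypothetical, trimmed-down builder: a quantifier records pending state,
// and the next character class consumes it.
public class SketchBuilder
{
    private readonly List<Segment> _segments = new();
    private string? _pendingQuantifier;
    private string? _pendingQuantifierDesc;

    // Quantifiers emit nothing yet; they only remember what to append later.
    public SketchBuilder OneOrMore => SetPending("+", "one or more");
    public SketchBuilder Exactly(int n) => SetPending($"{{{n}}}", $"exactly {n}");

    // A character class emits its regex plus the pending quantifier, if any.
    public SketchBuilder Digits => AddClass(@"\d", "digits");

    public string Build() => string.Concat(_segments.Select(s => s.Regex));

    private SketchBuilder SetPending(string quantifier, string description)
    {
        _pendingQuantifier = quantifier;
        _pendingQuantifierDesc = description;
        return this;
    }

    private SketchBuilder AddClass(string regex, string description)
    {
        var desc = _pendingQuantifierDesc is null
            ? description
            : $"{_pendingQuantifierDesc} {description}";
        _segments.Add(new Segment(regex + _pendingQuantifier, desc));

        // Clear the pending state so the quantifier applies only once.
        _pendingQuantifier = null;
        _pendingQuantifierDesc = null;
        return this;
    }
}
```

With this sketch, new SketchBuilder().Exactly(3).Digits.Build() yields \d{3}, and a second .Digits after it adds a bare \d because the quantifier has already been consumed.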

Anchors:

  • StartOfLine (property) → ^
  • EndOfLine (property) → $
  • WordBoundary (property) → \b

Literals:

  • Then(string text) → Regex.Escape(text); consumes the pending quantifier if present (a multi-character literal with a quantifier is wrapped in a non-capturing group)
  • Literal(string text) → alias for Then
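
The grouping rule for quantified literals could look roughly like this. LiteralSketch and Quantify are hypothetical names used only for illustration; Regex.Escape is the real System.Text.RegularExpressions call.

```csharp
using System.Text.RegularExpressions;

public static class LiteralSketch
{
    // Hypothetical helper: escapes the literal, then applies an optional quantifier.
    public static string Quantify(string text, string? quantifier)
    {
        var escaped = Regex.Escape(text);
        if (string.IsNullOrEmpty(quantifier))
            return escaped;

        // A quantifier binds only to the preceding atom, so a multi-character
        // literal must be grouped or the quantifier would hit just its last char.
        return text.Length > 1
            ? $"(?:{escaped}){quantifier}"
            : escaped + quantifier;
    }
}
```

Here Quantify("s", "?") gives s? and Quantify("ab", "+") gives (?:ab)+, matching the two quantifier tests listed for Then. The sketch does not guard against an empty literal combined with a quantifier, which would produce an invalid pattern.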

Groups:

  • Group(Action<PatternBuilder> inner) → (...)
  • NamedGroup(string name, Action<PatternBuilder> inner) → (?<name>...)
  • NonCapturing(Action<PatternBuilder> inner) → (?:...)

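All group methods: create a sub-builder, invoke the action, get the sub-pattern, wrap it in group syntax, and apply the pending quantifier if present. Note that C# only accepts a lambda body as an Action if it is a statement expression such as a method call, so a lambda ending in a property (p => p.Digits, as used in the tests) fails with CS0201; accepting Func<PatternBuilder, PatternBuilder> for the inner parameter avoids this.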
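
A minimal sketch of the sub-builder approach, assuming a simplified builder that tracks only a pattern string and one pending quantifier. GroupSketchBuilder, Raw, and Wrap are hypothetical names; the lambdas passed in the usage note end in a method call (Raw), which is what an Action body requires.

```csharp
using System;
using System.Text;

public class GroupSketchBuilder
{
    private readonly StringBuilder _pattern = new();
    private string? _pendingQuantifier;

    public GroupSketchBuilder Optional
    {
        get { _pendingQuantifier = "?"; return this; }
    }

    // Hypothetical escape hatch that appends raw regex text unchanged.
    public GroupSketchBuilder Raw(string regex)
    {
        _pattern.Append(regex);
        return this;
    }

    public GroupSketchBuilder Group(Action<GroupSketchBuilder> inner) => Wrap("(", inner);

    public GroupSketchBuilder NamedGroup(string name, Action<GroupSketchBuilder> inner) =>
        Wrap($"(?<{name}>", inner);

    public GroupSketchBuilder NonCapturing(Action<GroupSketchBuilder> inner) => Wrap("(?:", inner);

    public string Build() => _pattern.ToString();

    private GroupSketchBuilder Wrap(string open, Action<GroupSketchBuilder> inner)
    {
        // Build the inner pattern in isolation so it cannot see or consume
        // the outer builder's pending quantifier.
        var sub = new GroupSketchBuilder();
        inner(sub);

        // Appending a null string is a no-op, so an absent quantifier adds nothing.
        _pattern.Append(open).Append(sub.Build()).Append(')').Append(_pendingQuantifier);
        _pendingQuantifier = null;
        return this;
    }
}
```

For example, new GroupSketchBuilder().Optional.Group(p => p.Raw("abc")).Build() produces (abc)?, and nesting works because each level builds its own sub-builder.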

Alternation:

  • Or(Action<PatternBuilder> alternative) → appends | then the alternative pattern
  • Implementation: adds | segment, then builds the alternative inline
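
One possible shape for Or, sketched on a stripped-down builder. AltSketchBuilder is a hypothetical name, and the real builder would add segments with descriptions instead of appending to a StringBuilder.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public class AltSketchBuilder
{
    private readonly StringBuilder _pattern = new();

    public AltSketchBuilder Then(string text)
    {
        _pattern.Append(Regex.Escape(text));
        return this;
    }

    // Appends '|' followed by whatever the alternative builds.
    public AltSketchBuilder Or(Action<AltSketchBuilder> alternative)
    {
        var sub = new AltSketchBuilder();
        alternative(sub);
        _pattern.Append('|').Append(sub.Build());
        return this;
    }

    public string Build() => _pattern.ToString();
}
```

Because | has the lowest precedence in a regex, a pattern such as ^cat|dog$ reads as "^cat" or "dog$". Wrapping the alternation in a group keeps surrounding anchors and literals applied to both branches.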

Terminal Methods:

  • Build() → concatenates all segment regex strings, returns the pattern string
  • ToRegex(RegexOptions options = RegexOptions.None) → returns new Regex(Build(), options)
  • Test(string input) → ToRegex().IsMatch(input)
  • Explain() → joins all segment descriptions into a human-readable sentence
  • ToGeneratedRegex(string methodName = "MyPattern") → returns C# code string

src/RegexCraft/Pattern.cs

Static entry point class. Each static property/method creates a new PatternBuilder and calls the corresponding method on it:

public static class Pattern
{
    public static PatternBuilder StartOfLine => new PatternBuilder().StartOfLine;
    public static PatternBuilder EndOfLine => new PatternBuilder().EndOfLine;
    public static PatternBuilder WordBoundary => new PatternBuilder().WordBoundary;
    public static PatternBuilder OneOrMore => new PatternBuilder().OneOrMore;
    public static PatternBuilder ZeroOrMore => new PatternBuilder().ZeroOrMore;
    public static PatternBuilder Optional => new PatternBuilder().Optional;
    public static PatternBuilder Exactly(int n) => new PatternBuilder().Exactly(n);
    public static PatternBuilder Between(int min, int max) => new PatternBuilder().Between(min, max);
    public static PatternBuilder AtLeast(int n) => new PatternBuilder().AtLeast(n);
    // Character classes without quantifier
    public static PatternBuilder Digits => new PatternBuilder().Digits;
    public static PatternBuilder Letters => new PatternBuilder().Letters;
    public static PatternBuilder WordChars => new PatternBuilder().WordChars;
    public static PatternBuilder Whitespace => new PatternBuilder().Whitespace;
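    public static PatternBuilder Any => new PatternBuilder().Any;
    public static PatternBuilder AnyOf(string chars) => new PatternBuilder().AnyOf(chars);
    public static PatternBuilder NoneOf(string chars) => new PatternBuilder().NoneOf(chars);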
    public static PatternBuilder Then(string text) => new PatternBuilder().Then(text);
    public static PatternBuilder Literal(string text) => new PatternBuilder().Literal(text);
    public static PatternBuilder Group(Action<PatternBuilder> inner) => new PatternBuilder().Group(inner);
    public static PatternBuilder NamedGroup(string name, Action<PatternBuilder> inner) => new PatternBuilder().NamedGroup(name, inner);
    public static PatternBuilder NonCapturing(Action<PatternBuilder> inner) => new PatternBuilder().NonCapturing(inner);
}

Step 3: Implement Explain

src/RegexCraft/PatternExplainer.cs

The Explain() method on PatternBuilder joins the segment descriptions. Each segment stores both its regex and a description, for example:

  • ^ → "start of line"
  • \d{3} → "exactly 3 digits"
  • - → "literal '-'" (Regex.Escape leaves a hyphen unescaped)
  • \d{4} → "exactly 4 digits"
  • $ → "end of line"

Explain() joins the descriptions with ", " and capitalizes the first letter, giving: "Start of line, exactly 3 digits, literal '-', exactly 4 digits, end of line"
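
The join-and-capitalize step might be written like this. ExplainSketch is a hypothetical helper that takes the descriptions directly; in the library they would come from the builder's segments.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ExplainSketch
{
    public static string Explain(IEnumerable<string> descriptions)
    {
        // Skip empty descriptions so the sentence has no stray ", , " runs.
        var sentence = string.Join(", ",
            descriptions.Where(d => !string.IsNullOrWhiteSpace(d)));

        // An empty pattern explains to an empty string instead of throwing.
        return sentence.Length == 0
            ? sentence
            : char.ToUpperInvariant(sentence[0]) + sentence.Substring(1);
    }
}
```

Skipping blank descriptions is an assumption of this sketch; it is useful if some segments, such as the | added by Or, carry no description of their own.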


Step 4: Implement GeneratedRegex Code Generation

src/RegexCraft/GeneratedRegexEmitter.cs

ToGeneratedRegex(string methodName = "MyPattern") returns:

[GeneratedRegex(@"<pattern>")]
private static partial Regex <methodName>();

Escape the pattern for a verbatim string literal: double every double-quote character. Backslashes need no extra escaping inside @"...".
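
A rough sketch of the emitter under those rules. EmitterSketch and Emit are hypothetical names, and the sketch joins the two lines with "\n" for determinism; the real method may prefer Environment.NewLine.

```csharp
public static class EmitterSketch
{
    public static string Emit(string pattern, string methodName = "MyPattern")
    {
        // Inside a verbatim string the only escape is "" for a double quote.
        var escaped = pattern.Replace("\"", "\"\"");

        return $"[GeneratedRegex(@\"{escaped}\")]\n" +
               $"private static partial Regex {methodName}();";
    }
}
```

The emitted snippet only compiles when pasted into a partial class in a file that imports System.Text.RegularExpressions, which is worth stating in the method's XML documentation.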


Step 5: Implement Pre-built Patterns

src/RegexCraft/Patterns.cs

Static class with pre-built patterns. Each returns a new PatternBuilder instance (so users can further chain if desired):

public static class Patterns
{
    /// <summary>Standard email validation pattern.</summary>
    public static PatternBuilder Email => Pattern.OneOrMore.WordChars
        .Then("@").OneOrMore.WordChars.Then(".").Between(2, 63).Letters;

    /// <summary>HTTP/HTTPS URL pattern.</summary>
    public static PatternBuilder Url => ...;

    /// <summary>IPv4 address pattern.</summary>
    public static PatternBuilder IpV4 => ...;

    /// <summary>US phone number pattern.</summary>
    public static PatternBuilder Phone => ...;

    /// <summary>GUID/UUID pattern (with hyphens).</summary>
    public static PatternBuilder Guid => ...;
}

Build each using the fluent API itself. For complex patterns like IpV4, it's acceptable to use a hybrid approach where you manually add segments if the fluent API can't express the pattern cleanly. Alternatively, use groups and alternation.
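
For the hybrid approach, a raw IPv4 pattern could be assembled as below. This is one common way to write the octet alternation, not a prescribed one. IpV4Sketch is a hypothetical name, and how the raw string is fed into PatternBuilder (for example through an internal raw-segment method) is left open.

```csharp
using System.Text.RegularExpressions;

public static class IpV4Sketch
{
    // One octet in the range 0-255: 250-255 | 200-249 | 100-199 | 0-99.
    private const string Octet = "(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";

    // Four octets separated by literal dots, anchored at both ends.
    public static readonly string Pattern = $@"^{Octet}(?:\.{Octet}){{3}}$";

    public static bool Test(string input) => Regex.IsMatch(input, Pattern);
}
```

With range-checked octets, IpV4Sketch.Test("192.168.1.1") is true and IpV4Sketch.Test("999.999.999.999") is false. A simpler format-only pattern would accept the second input.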

Make the pre-built patterns reasonably accurate but don't over-engineer them. The point is demonstrating the API, not being RFC-compliant validators.


Step 6: Write Tests

Write comprehensive xUnit tests in tests/RegexCraft.Tests/:

PatternBuilderTests.cs

  • Building an empty pattern returns empty string
  • Pattern.Then("hello").Build() returns escaped literal
  • Chaining multiple Then calls concatenates
  • Pattern.Literal("test") is alias for Then

QuantifierTests.cs

  • Pattern.OneOrMore.Digits.Build() → \d+
  • Pattern.ZeroOrMore.Digits.Build() → \d*
  • Pattern.Optional.Digits.Build() → \d?
  • Pattern.Exactly(3).Digits.Build() → \d{3}
  • Pattern.Between(2, 6).Letters.Build() → [a-zA-Z]{2,6}
  • Pattern.AtLeast(1).WordChars.Build() → \w{1,}
  • Quantifier applied to Then for single char: Pattern.Optional.Then("s").Build() → s?
  • Quantifier applied to Then for multi-char wraps in non-capturing group: Pattern.OneOrMore.Then("ab").Build() → (?:ab)+

CharacterClassTests.cs

  • Pattern.Digits.Build() → \d
  • Pattern.Letters.Build() → [a-zA-Z]
  • Pattern.WordChars.Build() → \w
  • Pattern.Whitespace.Build() → \s
  • Pattern.Any.Build() → .
  • Pattern.AnyOf("abc").Build() → [abc]
  • Pattern.NoneOf("xyz").Build() → [^xyz]
  • Special characters in AnyOf are escaped properly

GroupTests.cs

  • Pattern.Group(p => p.Digits).Build() → (\d)
  • Pattern.NamedGroup("num", p => p.OneOrMore.Digits).Build() → (?<num>\d+)
  • Pattern.NonCapturing(p => p.Then("http").Optional.Then("s")).Build() → (?:https?). Optional sets a pending quantifier, and Then("s") consumes it to produce s?.
  • Nested groups work
  • Quantifier on group: Pattern.Optional.Group(p => p.Then("abc")).Build() → (abc)?

AnchorTests.cs

  • Pattern.StartOfLine.Build() → ^
  • Pattern.EndOfLine.Build() → $
  • Pattern.WordBoundary.Build() → \b
  • Combined: Pattern.StartOfLine.OneOrMore.Digits.EndOfLine.Build() → ^\d+$

AlternationTests.cs

  • Pattern.Then("cat").Or(p => p.Then("dog")).Build() → cat|dog
  • Alternation within group: Pattern.Group(p => p.Then("a").Or(q => q.Then("b"))).Build() → (a|b)

ExplainTests.cs

  • Pattern.StartOfLine.Exactly(3).Digits.Then("-").Exactly(4).Digits.EndOfLine.Explain() contains meaningful description
  • Pattern.OneOrMore.WordChars.Explain() mentions "one or more" and "word characters"
  • Pre-built patterns have explanations

GeneratedRegexTests.cs

  • Output contains [GeneratedRegex( attribute
  • Output contains the correct pattern
  • Custom method name is used
  • Output contains partial Regex

PrebuiltPatternTests.cs

  • Patterns.Email.Test("[email protected]") → true
  • Patterns.Email.Test("not-an-email") → false
  • Patterns.IpV4.Test("192.168.1.1") → true
  • Patterns.IpV4.Test("999.999.999.999") → depends on pattern complexity (at minimum, format check)
  • Patterns.Guid.Test("550e8400-e29b-41d4-a716-446655440000") → true
  • Patterns.Url.Test("https://example.com") → true
  • Patterns.Phone.Test("555-1234") → true or similar phone format

IntegrationTests.cs

Full pattern builds and regex matching:

  • Phone pattern: Pattern.StartOfLine.Exactly(3).Digits.Then("-").Exactly(4).Digits.EndOfLine matches "555-1234"
  • Email-like pattern matches "[email protected]"
  • URL-like pattern matches "https://example.com"
  • Pattern does NOT match invalid inputs
  • Complex patterns with groups and alternation

Aim for at least 50 test cases total.


Step 7: Add XML Documentation

Go through every public member of Pattern, PatternBuilder, and Patterns and add XML doc comments with <summary>, <param>, <returns>, and <example> tags. This is a core feature of the library — IntelliSense discoverability.
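
As an illustration of the expected level of detail, a documented member might look like the following. DocSketch and ExactlyQuantifier are hypothetical and stand in for any public member of the library.

```csharp
using System;

public static class DocSketch
{
    /// <summary>
    /// Builds a quantifier that matches exactly <paramref name="n"/> repetitions.
    /// </summary>
    /// <param name="n">The required repetition count; must not be negative.</param>
    /// <returns>The quantifier text, for example <c>{3}</c>.</returns>
    /// <exception cref="ArgumentOutOfRangeException">
    /// Thrown when <paramref name="n"/> is negative.
    /// </exception>
    /// <example>
    /// <code>
    /// string q = DocSketch.ExactlyQuantifier(3); // "{3}"
    /// </code>
    /// </example>
    public static string ExactlyQuantifier(int n)
    {
        if (n < 0)
            throw new ArgumentOutOfRangeException(nameof(n));
        return "{" + n + "}";
    }
}
```

The summary appears in the IntelliSense tooltip, and the param text is shown as each argument is typed, so both are worth keeping short and specific.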


Step 8: Create Supporting Files

.gitignore

Standard .NET gitignore (bin/, obj/, *.user, .vs/, etc.)

README.md

Brief README with:

  • Project name and one-line description
  • Installation: dotnet add package RegexCraft
  • Quick start examples showing fluent API
  • Pre-built patterns
  • Explain feature
  • GeneratedRegex feature
  • License: MIT

LICENSE

MIT license, copyright 2025 RegexCraft contributors.


Step 9: Build and Test

cd /Users/kexxt/code-opensource/regexcraft
dotnet restore
dotnet build --no-restore
dotnet test --no-build --verbosity normal

Fix any compilation errors or test failures. Iterate until all tests pass and the build is clean.


Step 10: Commit

cd /Users/kexxt/code-opensource/regexcraft
git init
git add -A
git commit -m "feat: initial implementation of RegexCraft - fluent regex builder for .NET

- Fluent API with quantifiers, character classes, groups, anchors, alternation
- Method chaining: Pattern.StartOfLine.Exactly(3).Digits.Then(\"-\").Build()
- Explain() method for human-readable pattern descriptions
- ToGeneratedRegex() for .NET 7+ source generator integration
- Pre-built patterns: Email, Url, IpV4, Phone, Guid
- Test() convenience method for quick matching
- Full XML documentation for IntelliSense support
- Comprehensive xUnit test suite with 50+ test cases"

Rules

  • Do NOT ask any questions. Make all decisions autonomously.
  • Do NOT skip any steps.
  • If something fails, debug and fix it yourself.
  • Use net8.0 as the target framework everywhere.
  • Use System.Text.RegularExpressions (NOT any third-party regex library).
  • Use xUnit + FluentAssertions for testing.
  • Ensure the project builds and all tests pass before committing.
  • Write clean, idiomatic C# code.
  • XML doc comments on ALL public APIs are mandatory — this is the core value proposition.
  • The fluent API must be intuitive and read like English.