Follow these precise instructions to build the Delimited Text Splitter (DTS) tool exactly as specified. Use Go (golang.org, version 1.21 or later) for the implementation, relying solely on the standard library.
The tool must be a command-line executable, cross-platform (Windows .exe, Linux, macOS), with performance optimizations for large files (10GB+). A key feature is the Windows interactive mode, triggered by dragging a file onto the executable, which provides a guided user experience. The implementation must remain encoding-agnostic by handling all data as byte streams.
- Create a new directory: `mkdir dts && cd dts`.
- Initialize a Go module: `go mod init github.com/fluxnull/dts`.
- Create the main file: `main.go`.
- In `main.go`, start with the specified imports and constants.
- In the `main` function, detect the execution mode. If run on Windows with a single file argument, call `runInteractiveMode(os.Args[1])`. Otherwise, proceed with flag parsing.
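The mode check described above could be sketched as follows. The helper name `isInteractiveInvocation` and the rule that an argument starting with `-` is a flag rather than a dropped file are assumptions made for illustration; the specification only requires "Windows with a single file argument".

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

// isInteractiveInvocation reports whether the process looks like a Windows
// drag-and-drop launch: the program name plus exactly one non-flag argument.
// goos and args are parameters (not globals) so the rule is easy to test.
func isInteractiveInvocation(goos string, args []string) bool {
	if goos != "windows" || len(args) != 2 {
		return false
	}
	// Assumption: anything starting with "-" is a flag, not a dropped file.
	return !strings.HasPrefix(args[1], "-")
}

func main() {
	if isInteractiveInvocation(runtime.GOOS, os.Args) {
		// The real tool would call runInteractiveMode(os.Args[1]) here.
		fmt.Println("interactive mode for", os.Args[1])
		return
	}
	fmt.Println("CLI mode: continue with flag parsing")
}
```

Passing `runtime.GOOS` and `os.Args` in as arguments keeps the decision a pure function, so it can be verified on any platform.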
- Define the `config` struct.
- Use the `flag` package to parse all command-line options.
- Update the `-r`/`--range` flag to parse `...` as the separator.
- Implement all validation logic (e.g., ensuring `-f` or `-l` is specified).
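A minimal sketch of the struct, flag parsing, and validation steps above. The field names, the reading of `-f` as "number of output files" and `-l` as "lines per output file", and the rule that the two are mutually exclusive are assumptions; only the flag names and the `...` separator come from the specification.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"os"
	"strings"
)

// config holds the parsed options. Field names are illustrative.
type config struct {
	files      int    // -f: assumed to mean "number of output files"
	lines      int    // -l: assumed to mean "lines per output file"
	rangeStart string // left side of -r/--range, e.g. "BOF"
	rangeEnd   string // right side of -r/--range, e.g. "EOF"
	input      string // first positional argument
}

// parseArgs parses args (without the program name) and validates the result.
func parseArgs(args []string) (config, error) {
	var cfg config
	var rng string
	fs := flag.NewFlagSet("dts", flag.ContinueOnError)
	fs.IntVar(&cfg.files, "f", 0, "number of output files")
	fs.IntVar(&cfg.lines, "l", 0, "lines per output file")
	// Both names share one variable; the flag package accepts -range and --range.
	fs.StringVar(&rng, "r", "", "line range, START...END")
	fs.StringVar(&rng, "range", "", "line range, START...END")
	if err := fs.Parse(args); err != nil {
		return cfg, err
	}
	if cfg.input = fs.Arg(0); cfg.input == "" {
		return cfg, errors.New("missing input file")
	}
	// Assumption: exactly one split mode must be chosen.
	if (cfg.files > 0) == (cfg.lines > 0) {
		return cfg, errors.New("specify exactly one of -f or -l")
	}
	if rng != "" {
		start, end, ok := strings.Cut(rng, "...")
		if !ok || start == "" || end == "" {
			return cfg, fmt.Errorf("invalid range %q, want START...END", rng)
		}
		cfg.rangeStart, cfg.rangeEnd = start, end
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Note that the standard `flag` package stops parsing at the first non-flag argument, so in this sketch the options must precede the input file, for example `dts -l 1000 --range BOF...500 data.csv`.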
- Create the `runInteractiveMode(filename string)` function.
- This function will:
  - Analyze the file to get total lines.
  - Display file stats.
  - Prompt the user for configuration (header, split mode, count).
  - Populate the `config` struct.
  - Run the main split logic, showing an animated spinner.
  - Wait for user confirmation to exit.
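The animated spinner mentioned in the list above could be sketched like this. The names `startSpinner` and `spinnerLine`, the four-frame animation, and the 100 ms interval are illustrative choices, not part of the specification.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"time"
)

const spinnerFrames = `|/-\`

// spinnerLine renders frame i. The leading "\r" moves the cursor back to the
// start of the line so each frame overwrites the previous one.
func spinnerLine(i int, msg string) string {
	return fmt.Sprintf("\r%c %s", spinnerFrames[i%len(spinnerFrames)], msg)
}

// startSpinner animates on w until the returned stop function is called.
// stop blocks until the goroutine has exited and must be called exactly once.
func startSpinner(w io.Writer, msg string, interval time.Duration) (stop func()) {
	done := make(chan struct{})
	finished := make(chan struct{})
	go func() {
		defer close(finished)
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for i := 0; ; i++ {
			fmt.Fprint(w, spinnerLine(i, msg))
			select {
			case <-done:
				fmt.Fprintln(w) // finish the line so later output starts clean
				return
			case <-ticker.C:
			}
		}
	}()
	return func() {
		close(done)
		<-finished
	}
}

func main() {
	stop := startSpinner(os.Stdout, "Splitting...", 100*time.Millisecond)
	time.Sleep(500 * time.Millisecond) // stand-in for the real split work
	stop()
	fmt.Println("Done.")
}
```

Because `stop` waits for the goroutine to exit, no spinner frame can be printed after the final status message.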
- Create an `openFile` function that uses build tags for platform-specific code.
- Windows: Use `syscall.CreateFile` with `FILE_FLAG_SEQUENTIAL_SCAN`.
- Linux: Use `os.Open` followed by `syscall.Fadvise` with `FADV_SEQUENTIAL`.
- Other: Use a standard `os.Open`.
- Implement the `countLinesAndIndex` function to concurrently count newlines and build a sparse index for efficient seeking. Use a worker pool and `sync.Pool` for buffers.
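One possible shape for the concurrent counting step, as a sketch under stated assumptions. Here the sparse index is one entry per fixed-size chunk, recording how many newlines precede that chunk; the `chunkInfo` type, the chunk size parameter, and the `indexBytes` wrapper are hypothetical. The function counts `\n` bytes only, so a final line without a trailing newline would need to be added by the caller.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// chunkInfo is one sparse-index entry (hypothetical layout).
type chunkInfo struct {
	Offset      int64 // byte offset where the chunk starts
	LinesBefore int64 // number of '\n' bytes in [0, Offset)
}

// countLinesAndIndex counts newlines in r using a pool of workers, one chunk
// per job, and derives the index from the per-chunk counts.
func countLinesAndIndex(r io.ReaderAt, size int64, chunk, workers int) (int64, []chunkInfo, error) {
	if chunk < 1 || workers < 1 {
		return 0, nil, fmt.Errorf("chunk and workers must be positive")
	}
	n := int((size + int64(chunk) - 1) / int64(chunk))
	counts := make([]int64, n) // each worker writes only its own slots
	pool := sync.Pool{New: func() any { b := make([]byte, chunk); return &b }}
	jobs := make(chan int)
	var wg sync.WaitGroup
	var mu sync.Mutex
	var firstErr error
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				bp := pool.Get().(*[]byte)
				m, err := r.ReadAt(*bp, int64(i)*int64(chunk))
				if err != nil && err != io.EOF { // EOF is expected on the last chunk
					mu.Lock()
					if firstErr == nil {
						firstErr = err
					}
					mu.Unlock()
				}
				counts[i] = int64(bytes.Count((*bp)[:m], []byte{'\n'}))
				pool.Put(bp)
			}
		}()
	}
	for i := 0; i < n; i++ {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	if firstErr != nil {
		return 0, nil, firstErr
	}
	index := make([]chunkInfo, n)
	var total int64
	for i, c := range counts { // prefix sum turns counts into LinesBefore
		index[i] = chunkInfo{Offset: int64(i) * int64(chunk), LinesBefore: total}
		total += c
	}
	return total, index, nil
}

// indexBytes is a convenience wrapper for in-memory data, used by the demo.
func indexBytes(data []byte, chunk, workers int) (int64, []chunkInfo, error) {
	return countLinesAndIndex(bytes.NewReader(data), int64(len(data)), chunk, workers)
}

func main() {
	total, index, err := indexBytes([]byte("a\nbb\nccc\ndddd\n"), 4, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("newlines:", total, "index:", index)
}
```

Using `ReadAt` lets every worker read its own region without sharing a file offset, and the `sync.Pool` keeps the number of live buffers close to the number of workers. A real run on a 10GB+ file would use a much larger chunk size than the 4 bytes in the demo.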
- Implement `resolveRange` to parse `BOF`, `COF`, `EOF`, and numeric boundaries.
- Implement `seekToLine` to use the sparse index to jump to the approximate location and then scan forward to the exact line.
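The two functions above might be sketched as follows. Several things here are assumptions rather than specification: `BOF` is taken as line 1, `EOF` as the last line, and `COF` as the middle line (rounded up); the index layout (`chunkInfo`, with the count of newlines before each chunk) and the helper names `resolveBoundary` and `lineOffsetInString` are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"sort"
	"strconv"
	"strings"
)

// chunkInfo is one sparse-index entry (hypothetical layout).
type chunkInfo struct {
	Offset      int64 // byte offset where the chunk starts
	LinesBefore int64 // number of '\n' bytes in [0, Offset)
}

// resolveBoundary maps one range token to a 1-based line number.
// Assumed meanings: BOF = first line, COF = middle line, EOF = last line.
func resolveBoundary(token string, totalLines int64) (int64, error) {
	switch strings.ToUpper(token) {
	case "BOF":
		return 1, nil
	case "COF":
		return (totalLines + 1) / 2, nil
	case "EOF":
		return totalLines, nil
	}
	n, err := strconv.ParseInt(token, 10, 64)
	if err != nil || n < 1 || n > totalLines {
		return 0, fmt.Errorf("invalid range boundary %q", token)
	}
	return n, nil
}

// seekToLine returns the byte offset at which the given 1-based line starts.
func seekToLine(r io.ReaderAt, size int64, index []chunkInfo, line int64) (int64, error) {
	if line <= 1 {
		return 0, nil
	}
	// Pick the last chunk with fewer than line-1 newlines before it. At least
	// one newline then remains to be skipped, so a chunk that starts in the
	// middle of a line is harmless.
	var start, before int64
	i := sort.Search(len(index), func(i int) bool { return index[i].LinesBefore >= line-1 }) - 1
	if i >= 0 {
		start, before = index[i].Offset, index[i].LinesBefore
	}
	br := bufio.NewReader(io.NewSectionReader(r, start, size-start))
	pos := start
	for skip := line - 1 - before; skip > 0; {
		piece, err := br.ReadSlice('\n')
		pos += int64(len(piece))
		switch err {
		case nil:
			skip-- // consumed one complete line terminator
		case bufio.ErrBufferFull:
			// Long line: keep reading the rest of the same line.
		default:
			return 0, fmt.Errorf("line %d not found: %w", line, err)
		}
	}
	return pos, nil
}

// lineOffsetInString is a convenience wrapper for in-memory data.
func lineOffsetInString(s string, index []chunkInfo, line int64) (int64, error) {
	return seekToLine(strings.NewReader(s), int64(len(s)), index, line)
}

func main() {
	idx := []chunkInfo{{0, 0}, {4, 1}, {8, 2}, {12, 3}}
	off, err := lineOffsetInString("a\nbb\nccc\ndddd\n", idx, 3)
	fmt.Println("line 3 starts at byte", off, "err:", err)
	mid, _ := resolveBoundary("COF", 4)
	fmt.Println("COF resolves to line", mid)
}
```

The binary search bounds the forward scan to roughly one chunk, which is the point of keeping the index sparse.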
- Implement `extractHeader` to read the first line of the file as a byte slice, preserving the original file seek position.
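A minimal sketch of the header step, assuming the header should keep its line terminator so it can be written back verbatim (that choice, the 4 KiB read size, and the `headerOf` wrapper are assumptions). It relies on `ReadAt`, which on an `*os.File` reads at an explicit offset and leaves the file's seek position untouched.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// extractHeader returns the first line of r as raw bytes, including its
// line terminator if present. No decoding is done, so any encoding works.
func extractHeader(r io.ReaderAt) ([]byte, error) {
	var header []byte
	buf := make([]byte, 4096)
	for off := int64(0); ; {
		n, err := r.ReadAt(buf, off)
		if i := bytes.IndexByte(buf[:n], '\n'); i >= 0 {
			return append(header, buf[:i+1]...), nil
		}
		header = append(header, buf[:n]...)
		off += int64(n)
		if err == io.EOF {
			return header, nil // single line without a trailing newline
		}
		if err != nil {
			return nil, err
		}
	}
}

// headerOf is a convenience wrapper for in-memory data.
func headerOf(s string) (string, error) {
	h, err := extractHeader(strings.NewReader(s))
	return string(h), err
}

func main() {
	h, err := headerOf("id,name\r\n1,alice\r\n2,bob\r\n")
	fmt.Printf("%q %v\n", h, err)
}
```

An alternative that works with any `io.ReadSeeker` is to record the current offset with `Seek(0, io.SeekCurrent)`, read from the start, and seek back afterwards.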
- Implement `runSplitByLines` for single-pass, streaming splitting based on a line count.
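The single-pass split could look roughly like this. The `newPart` factory, the `memPart` in-memory sink, and the `splitString` wrapper are hypothetical; a real implementation would create output files in `newPart`. The sketch also assumes the caller has already consumed the header line from the input when a header is repeated in every part.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
)

// splitByLines streams r once and starts a new part every linesPerPart lines.
// header (may be nil) is written at the top of every part. It returns the
// number of parts created.
func splitByLines(r io.Reader, linesPerPart int, header []byte,
	newPart func(n int) (io.WriteCloser, error)) (parts int, err error) {
	if linesPerPart < 1 {
		return 0, errors.New("linesPerPart must be positive")
	}
	br := bufio.NewReaderSize(r, 1<<20)
	var out io.WriteCloser
	defer func() { // close a part left open by an early error return
		if out != nil {
			out.Close()
		}
	}()
	inPart := 0
	for {
		line, readErr := br.ReadBytes('\n')
		if len(line) > 0 {
			if out == nil {
				parts++
				if out, err = newPart(parts); err != nil {
					out = nil
					return parts, err
				}
				if _, err = out.Write(header); err != nil {
					return parts, err
				}
			}
			if _, err = out.Write(line); err != nil {
				return parts, err
			}
			if inPart++; inPart == linesPerPart {
				err = out.Close()
				out, inPart = nil, 0
				if err != nil {
					return parts, err
				}
			}
		}
		if readErr == io.EOF {
			break
		}
		if readErr != nil {
			return parts, readErr
		}
	}
	if out != nil {
		err = out.Close()
		out = nil
	}
	return parts, err
}

// memPart is an in-memory sink for the demo; Close appends its contents to store.
type memPart struct {
	bytes.Buffer
	store *[]string
}

func (m *memPart) Close() error { *m.store = append(*m.store, m.String()); return nil }

// splitString runs splitByLines over a string and returns the parts.
func splitString(s string, n int, header string) ([]string, error) {
	var parts []string
	_, err := splitByLines(strings.NewReader(s), n, []byte(header),
		func(int) (io.WriteCloser, error) { return &memPart{store: &parts}, nil })
	return parts, err
}

func main() {
	parts, err := splitString("1\n2\n3\n4\n5\n", 2, "h\n")
	fmt.Printf("%q %v\n", parts, err)
}
```

Only one line and one output writer are held at a time, so memory use stays flat regardless of input size.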
- Implement `runSplitByFiles` for two-pass, concurrent splitting into a fixed number of files. Use channels to distribute work to writer goroutines.
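A rough sketch of the two-pass approach: pass 1 counts lines, pass 2 reads the input again and hands each line to the goroutine that owns the target output. The even distribution rule in `partSizes` (the first `total % n` parts get one extra line), the channel buffer size, and the helper names are assumptions.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
	"strings"
	"sync"
)

// countLines is pass 1. An unterminated final line counts as a line.
func countLines(r io.Reader) (int64, error) {
	buf := make([]byte, 64<<10)
	var lines int64
	last := byte('\n')
	for {
		n, err := r.Read(buf)
		if n > 0 {
			lines += int64(bytes.Count(buf[:n], []byte{'\n'}))
			last = buf[n-1]
		}
		if err == io.EOF {
			if last != '\n' {
				lines++
			}
			return lines, nil
		}
		if err != nil {
			return 0, err
		}
	}
}

// partSizes spreads total lines over n parts as evenly as possible.
func partSizes(total int64, n int) []int64 {
	sizes := make([]int64, n)
	for i := range sizes {
		sizes[i] = total / int64(n)
		if int64(i) < total%int64(n) {
			sizes[i]++
		}
	}
	return sizes
}

// splitByFiles writes the lines of rs into len(sinks) outputs.
func splitByFiles(rs io.ReadSeeker, sinks []io.Writer) error {
	if len(sinks) == 0 {
		return errors.New("need at least one output")
	}
	total, err := countLines(rs)
	if err != nil {
		return err
	}
	if _, err := rs.Seek(0, io.SeekStart); err != nil {
		return err
	}
	chans := make([]chan []byte, len(sinks))
	errs := make([]error, len(sinks)+1) // last slot holds a read error
	var wg sync.WaitGroup
	for i := range chans {
		chans[i] = make(chan []byte, 64)
		wg.Add(1)
		go func(i int) { // one writer goroutine per output
			defer wg.Done()
			for line := range chans[i] { // keep draining even after an error
				if errs[i] == nil {
					_, errs[i] = sinks[i].Write(line)
				}
			}
		}(i)
	}
	br := bufio.NewReader(rs)
	for i, size := range partSizes(total, len(sinks)) {
		for j := int64(0); j < size && errs[len(sinks)] == nil; j++ {
			line, err := br.ReadBytes('\n') // returns a fresh slice each call
			if len(line) > 0 {
				chans[i] <- line
			}
			if err != nil && err != io.EOF {
				errs[len(sinks)] = err
			}
		}
		close(chans[i])
	}
	wg.Wait()
	return errors.Join(errs...)
}

// splitString runs splitByFiles over a string with in-memory outputs.
func splitString(s string, n int) ([]string, error) {
	bufs, sinks, out := make([]bytes.Buffer, n), make([]io.Writer, n), make([]string, n)
	for i := range bufs {
		sinks[i] = &bufs[i]
	}
	err := splitByFiles(strings.NewReader(s), sinks)
	for i := range bufs {
		out[i] = bufs[i].String()
	}
	return out, err
}

func main() {
	parts, err := splitString("1\n2\n3\n4\n5\n", 2)
	fmt.Printf("%q %v\n", parts, err)
}
```

In this sketch reading overlaps with writing, but the parts are still filled one after another. A more parallel variant, not shown here, could give each goroutine its own byte range of the input using the sparse index.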
- Implement `generateName` to produce filenames in the format `<baseName>_<timestamp>_<part#>.<ext>`.
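The naming format above can be sketched as a small pure function. The specification does not fix the timestamp layout or any zero-padding of the part number, so the `20060102-150405` layout and the unpadded `%d` used here are assumptions, as is the choice to return a bare file name without a directory.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"time"
)

// generateName builds <baseName>_<timestamp>_<part#>.<ext> from the input
// path. stamp is passed in so every part of one run shares the same value.
func generateName(inputPath, stamp string, part int) string {
	base := filepath.Base(inputPath)
	ext := filepath.Ext(base) // includes the leading dot, or "" if none
	name := strings.TrimSuffix(base, ext)
	return fmt.Sprintf("%s_%s_%d%s", name, stamp, part, ext)
}

func main() {
	// Assumed layout; compute it once per run, not once per part.
	stamp := time.Now().Format("20060102-150405")
	for part := 1; part <= 3; part++ {
		fmt.Println(generateName("data.csv", stamp, part))
	}
}
```

Computing the timestamp once and passing it in keeps all parts of a run grouped under the same prefix even if the split crosses a second boundary.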
- Build the executable: `go build -o dts`.
- Cross-build using `GOOS` and `GOARCH`.
- Test Plan:
  - Verify CLI Mode with various flag combinations.
  - Verify Interactive Mode on Windows.
  - Benchmark performance and verify output correctness.