This Go script splits a large mbox file into smaller parts, each containing a maximum number of emails. The split files are named based on the date of the first email in each part. This is useful for managing large email archives and improving performance when processing email data.
- Make sure you have Go installed.
- Clone this repository:

      git clone https://github.com/VadimOnix/mbox-splitter.git
      cd mbox-splitter

- Build the executable:

      go build
Run the executable with the path to an mbox file:

    ./mbox-splitter <path to mbox file>

For example:

    ./mbox-splitter /path/to/my/large.mbox

This will create multiple mbox files in the current directory, named like this:

    mbox_part_<number>_<date>.mbox

Where:

- <number> is the sequential part number (starting from 0).
- <date> is the date of the first email in that part, in YYYY_MM_DD format.
- maxEmailsPerFile: The maximum number of emails per output file. This is controlled by the maxEmailsPerFile constant in the main.go file and is currently set to 1000. You can modify this value in the source code if needed.
- bufferSize: The buffer size used for writing to files. This is controlled by the bufferSize constant in main.go and is set to 4096 bytes. You can adjust this value as needed.
Let's say you have a large mbox file named large.mbox. Running:

    ./mbox-splitter large.mbox

might produce the following output files:

- mbox_part_0_2024_09_22.mbox
- mbox_part_1_2024_09_23.mbox
- mbox_part_2_2024_09_24.mbox
- ... and so on.
The script includes error handling for file operations and will print error messages to the console if any issues occur.
Contributions are welcome! Please feel free to open issues or submit pull requests.