Commit 33c1f28

Shaun Mahony and claude committed
Fix CIS-BP sync: stream ZIP extraction to avoid string length limit

The TF_Information_all_motifs.txt file exceeds Node.js's 512MB string limit
when extracted via adm-zip's getData().toString(). Replace the in-memory
approach with streaming for the web-download path:

- Download ZIPs as fetch streams; convert to Node.js Readable via
  Readable.fromWeb() instead of buffering the full arrayBuffer()
- Pipe through unzipper.Parse({ forceStream: true }) to read entries one
  at a time without materializing the whole archive
- Parse TF_Information line-by-line with readline so no single large
  string is ever allocated
- Buffer individual PWM entries (each ~1 KB) via entry.buffer(); safe
  because they are tiny
- Add periodic progress logging every 1,000 stored motifs
- Keep adm-zip path in syncCisbp() for smaller user-uploaded ZIPs
- Install unzipper + @types/unzipper

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
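
The line-by-line parsing step described in the message can be sketched with Node's built-in readline module alone. This is a minimal illustration, not the code from this commit: the helper name forEachTsvRow, the tab delimiter, and the assumption that the first line is a header are all hypothetical.

```typescript
import { Readable } from "node:stream";
import { createInterface } from "node:readline";

// Hypothetical helper (not taken from the commit): walks a text stream one
// line at a time, so the whole file is never held in a single string.
// Assumes tab-separated fields and a header on the first line.
async function forEachTsvRow(
  input: Readable,
  onRow: (fields: string[]) => void,
): Promise<number> {
  // crlfDelay: Infinity treats "\r\n" as a single line break.
  const lines = createInterface({ input, crlfDelay: Infinity });
  let rows = 0;
  let sawHeader = false;

  for await (const line of lines) {
    if (!sawHeader) {
      sawHeader = true; // skip the assumed header line
      continue;
    }
    if (line.length === 0) {
      continue; // ignore blank lines
    }
    onRow(line.split("\t"));
    rows++;
  }
  return rows;
}
```

Because the helper accepts any Node.js Readable, it would presumably work on a stream obtained from Readable.fromWeb() or on a ZIP entry stream as well as on a file stream. Only one line is decoded at a time, so memory use depends on the longest line rather than on the file size.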
1 parent c549e33 commit 33c1f28

3 files changed

Lines changed: 324 additions & 26 deletions

File tree

web/package-lock.json

Lines changed: 104 additions & 2 deletions

web/package.json

Lines changed: 2 additions & 0 deletions
@@ -13,6 +13,7 @@
     "test:watch": "vitest"
   },
   "dependencies": {
+    "@types/unzipper": "^0.10.11",
     "adm-zip": "^0.5.16",
     "archiver": "^7.0.0",
     "bullmq": "^5.0.0",
@@ -24,6 +25,7 @@
     "nodemailer": "^6.9.0",
     "react": "^18.3.0",
     "react-dom": "^18.3.0",
+    "unzipper": "^0.12.3",
     "zod": "^3.23.0"
   },
   "devDependencies": {
