Atomic-level static analysis across binaries and source.

cleave

cleave answers one question — what can this program do? It extracts capabilities from binaries, source, and archives, scoring each against 50,000+ behavior rules aligned to MBC and ATT&CK. Built for supply-chain and malware triage — useful standalone, and designed from day one to be embedded in other open-source or commercial software via its stable JSON contract. Apache-2.0, no telemetry.


What It Analyzes

  • Binaries: Mach-O, ELF, PE, Java .class, Python .pyc, compiled AppleScript
  • Source (~20 languages via tree-sitter): Python, JS/TS, Go, Rust, C/C++, Java, C#, Swift, ObjC, Ruby, PHP, Perl, Lua, Shell, PowerShell, Groovy, Scala, Zig, Elixir
  • Archives (recursive): zip, tar, 7z, rar, jar/war, deb, rpm, apk, gem, crate, whl, nupkg, phar, vsix, xpi, crx, ipa, epub
  • Documents & data: RTF, LNK, PNG (steganography), PDF, plist, VBScript, Batch, package manifests, GitHub Actions workflows
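To make the binary formats above concrete, here is a minimal sketch of how a tool might tell several of them apart from their leading magic bytes. It is an illustration only, not cleave's implementation (the Design section credits Goblin for header parsing); the `Kind` enum and `sniff` function are hypothetical names.

```rust
/// Coarse file kinds recognizable from the first few bytes.
/// Hypothetical type for illustration; not part of cleave.
#[derive(Debug, PartialEq, Eq)]
enum Kind {
    Elf,
    MachO,
    /// 0xCAFEBABE is shared by Java .class files and Mach-O universal (fat) binaries.
    JavaClassOrFatMachO,
    Pe,
    Zip,
    Unknown,
}

/// Classify a buffer by its magic number alone.
fn sniff(bytes: &[u8]) -> Kind {
    match bytes {
        [0x7F, b'E', b'L', b'F', ..] => Kind::Elf,
        // Thin Mach-O: 32-bit (..CE) or 64-bit (..CF) magic, in either byte order.
        [0xFE, 0xED, 0xFA, 0xCE | 0xCF, ..] | [0xCE | 0xCF, 0xFA, 0xED, 0xFE, ..] => Kind::MachO,
        [0xCA, 0xFE, 0xBA, 0xBE, ..] => Kind::JavaClassOrFatMachO,
        // "MZ" is the DOS stub; a real parser would follow e_lfanew to the PE signature.
        [b'M', b'Z', ..] => Kind::Pe,
        // Local file header shared by zip, jar, apk, whl, and similar containers.
        [b'P', b'K', 0x03, 0x04, ..] => Kind::Zip,
        _ => Kind::Unknown,
    }
}

fn main() {
    println!("{:?}", sniff(b"\x7fELF\x02\x01\x01\x00"));
    println!("{:?}", sniff(b"MZ\x90\x00"));
    println!("{:?}", sniff(b"plain text"));
}
```

Magic bytes only narrow the candidates: the `0xCAFEBABE` collision and the many zip-based package formats in the list above still need a second look at the header or the archive contents.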

Quick Start

brew tap atomdrift/tap https://codeberg.org/atomdrift/homebrew-tap.git
brew install atomdrift/tap/cleave
# or: make install
cleave suspect.bin                            # single sample
cleave /tmp/box-o-malware                     # recursive, unpacks archives
cleave --format jsonl --min-crit suspicious   # streaming triage feed
cleave diff old.tgz new.tgz                   # behavior delta across releases

Optional dependencies: rizin for disassembly, upx for unpacking UPX-packed binaries.
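The `--min-crit suspicious` flag above implies an ordered criticality scale. The sketch below shows one way such a threshold filter could work. It is an assumption-laden illustration: only `baseline`, `suspicious`, and `hostile` appear in this README, so the enum may be missing intermediate levels, and the `Finding` struct and its rule IDs are hypothetical rather than cleave's actual JSON contract.

```rust
use std::str::FromStr;

/// Criticality levels named in the README, lowest first.
/// Deriving `Ord` orders variants by declaration order.
/// Assumption: the real scale may contain more levels than these three.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Crit {
    Baseline,
    Suspicious,
    Hostile,
}

impl FromStr for Crit {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "baseline" => Ok(Crit::Baseline),
            "suspicious" => Ok(Crit::Suspicious),
            "hostile" => Ok(Crit::Hostile),
            other => Err(format!("unknown criticality: {other}")),
        }
    }
}

/// Hypothetical finding record; field names are illustrative only.
struct Finding {
    id: &'static str,
    crit: Crit,
}

/// Keep only findings at or above the requested threshold.
fn at_least(findings: &[Finding], min: Crit) -> Vec<&Finding> {
    findings.iter().filter(|f| f.crit >= min).collect()
}

fn main() {
    // Made-up rule IDs for demonstration.
    let findings = [
        Finding { id: "example/read-file", crit: Crit::Baseline },
        Finding { id: "example/spawn-shell", crit: Crit::Suspicious },
        Finding { id: "example/wipe-disk", crit: Crit::Hostile },
    ];
    let min: Crit = "suspicious".parse().expect("valid level");
    for f in at_least(&findings, min) {
        println!("{:?}\t{}", f.crit, f.id);
    }
}
```

Modeling the levels as an ordered enum keeps the threshold check to a single comparison and rejects misspelled level names at parse time.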

Design

  • Capabilities, not verdicts. Findings ranked from baseline to hostile. Downstream classifiers (e.g. litmus) consume the JSONL directly.
  • No skips. Every archive member is analyzed regardless of size or filename.
  • Layered unpacking. UPX, embedded binaries, and base64/hex/AES/XOR payloads via stng.
  • Deterministic output. JSONL streaming, SHA256-keyed cache, same input → same output.
  • AST matching via tree-sitter; YARA-X for signatures; Goblin for headers.
  • malcontent — predecessor; cleave significantly improves upon its accuracy.
  • capa — original inspiration; cleave has 20× the rule coverage and broader format support.
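The layered unpacking bullet above mentions hex and XOR payloads. As a rough illustration of what peeling two such layers involves, the sketch below hex-decodes a string and then brute-forces a single-byte XOR key until the output starts with a known marker. This is a simplified stand-in, not how stng or cleave work; the function names are hypothetical, and real payloads often use multi-byte keys or other encodings.

```rust
/// Decode an ASCII hex string; returns None on odd length or non-hex characters.
fn decode_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 {
        return None;
    }
    s.as_bytes()
        .chunks(2)
        .map(|pair| {
            let hi = (pair[0] as char).to_digit(16)?;
            let lo = (pair[1] as char).to_digit(16)?;
            Some((hi * 16 + lo) as u8)
        })
        .collect()
}

/// XOR every byte with a single-byte key.
fn xor(data: &[u8], key: u8) -> Vec<u8> {
    data.iter().map(|b| b ^ key).collect()
}

/// Try all 256 keys and return the first whose output starts with `marker`.
/// Key 0 matches when the data was never XOR-encoded.
fn find_xor_key(data: &[u8], marker: &[u8]) -> Option<u8> {
    (0..=u8::MAX).find(|&k| xor(data, k).starts_with(marker))
}

/// Peel a hex layer, then a single-byte XOR layer.
fn peel(text: &str, marker: &[u8]) -> Option<Vec<u8>> {
    let raw = decode_hex(text)?;
    let key = find_xor_key(&raw, marker)?;
    Some(xor(&raw, key))
}

fn main() {
    // Build a sample: a shell script XORed with 0x5A, then hex-encoded.
    let payload = b"#!/bin/sh\necho hi\n";
    let wrapped: String = xor(payload, 0x5A)
        .iter()
        .map(|b| format!("{b:02x}"))
        .collect();

    match peel(&wrapped, b"#!/") {
        Some(inner) => print!("{}", String::from_utf8_lossy(&inner)),
        None => println!("no known layer found"),
    }
}
```

Each recovered layer can then be fed back through format detection, which is what makes the unpacking recursive rather than a single pass.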

License

Apache-2.0