Atomic-level static analysis across binaries and source.
cleave answers one question — what can this program do? It extracts capabilities from binaries, source, and archives, scoring each against 50,000+ behavior rules aligned to MBC and ATT&CK. Built for supply-chain and malware triage — useful standalone, and designed from day one to be embedded in other open-source or commercial software via its stable JSON contract. Apache-2.0, no telemetry.
What It Analyzes
- Binaries: Mach-O, ELF, PE, Java .class, Python .pyc, compiled AppleScript
- Source (~20 languages via tree-sitter): Python, JS/TS, Go, Rust, C/C++, Java, C#, Swift, ObjC, Ruby, PHP, Perl, Lua, Shell, PowerShell, Groovy, Scala, Zig, Elixir
- Archives (recursive): zip, tar, 7z, rar, jar/war, deb, rpm, apk, gem, crate, whl, nupkg, phar, vsix, xpi, crx, ipa, epub
- Documents & data: RTF, LNK, PNG (steganography), PDF, plist, VBScript, Batch, package manifests, GitHub Actions workflows
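As an illustration of how a few of the binary formats above can be told apart before deeper parsing, here is a minimal Python sketch that sniffs leading magic bytes. It is not cleave's detection logic (the README says headers are handled via Goblin); the function name `sniff_format` and the returned labels are hypothetical. The Java `.class` versus Mach-O universal split relies on a common heuristic: both start with `CA FE BA BE`, but a class file's version word is at least 45, while a universal binary stores a small architecture count there.

```python
import struct

# Magic bytes of single-architecture Mach-O files, in both byte orders.
THIN_MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce",  # 32-bit, big-endian
    b"\xce\xfa\xed\xfe",  # 32-bit, little-endian
    b"\xfe\xed\xfa\xcf",  # 64-bit, big-endian
    b"\xcf\xfa\xed\xfe",  # 64-bit, little-endian
}


def sniff_format(header: bytes) -> str:
    """Guess a container format from the first bytes of a file (hypothetical helper)."""
    if header.startswith(b"\x7fELF"):
        return "elf"
    if header.startswith(b"MZ"):
        # Only the DOS stub is checked; a full check would follow e_lfanew to "PE\0\0".
        return "pe"
    if header[:4] in THIN_MACHO_MAGICS:
        return "macho"
    if header[:4] == b"\xca\xfe\xba\xbe" and len(header) >= 8:
        # Shared magic: disambiguate on the next big-endian 32-bit word.
        (word,) = struct.unpack(">I", header[4:8])
        return "macho-universal" if word < 45 else "java-class"
    return "unknown"
```

Reading the first eight bytes of a file and passing them to `sniff_format` is enough for this sketch; the remaining formats in the list need their own checks.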
Quick Start
brew tap atomdrift/tap https://codeberg.org/atomdrift/homebrew-tap.git
brew install atomdrift/tap/cleave
# or: make install
cleave suspect.bin # single sample
cleave /tmp/box-o-malware # recursive, unpacks archives
cleave --format jsonl --min-crit suspicious # streaming triage feed
cleave diff old.tgz new.tgz # behavior delta across releases
Optional: rizin for disassembly, upx for runtime unpacking.
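For consumers of the JSONL feed, a minimal Python sketch of a downstream filter might look like the following. The record schema is not documented in this README, so the field name `crit` is an assumption, as is the helper name `at_least`. Only the level names that appear in this README (baseline, suspicious, hostile) are modeled; the real contract may define more.

```python
import json
from typing import Iterable, Iterator

# Level names taken from the README, lowest to highest. Assumption: the real
# feed may define additional levels between these.
LEVELS = ["baseline", "suspicious", "hostile"]


def at_least(lines: Iterable[str], minimum: str, field: str = "crit") -> Iterator[dict]:
    """Yield JSONL records whose level is at or above `minimum`.

    `field` is a hypothetical key name; adjust it to the actual schema.
    """
    floor = LEVELS.index(minimum)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        record = json.loads(line)
        level = record.get(field)
        if level in LEVELS and LEVELS.index(level) >= floor:
            yield record
```

Passing `sys.stdin` as `lines` lets the filter sit at the end of a pipe. Records with an unrecognized or missing level are dropped rather than guessed at.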
Design
- Capabilities, not verdicts. Findings ranked from baseline to hostile. Downstream classifiers (e.g. litmus) consume the JSONL directly.
- No skips. Every archive member is analyzed regardless of size or filename.
- Layered unpacking. UPX, embedded binaries, and base64/hex/AES/XOR payloads via stng.
- Deterministic output. JSONL streaming, SHA256-keyed cache, same input → same output.
- AST matching via tree-sitter; YARA-X for signatures; Goblin for headers.
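The idea behind a SHA256-keyed cache can be sketched in a few lines of Python: key each result by the digest of the file's bytes, so identical content always maps to the same entry regardless of its path or name. This is an illustration of the general technique, not cleave's cache layout; `cached_analyze`, the one-JSON-file-per-digest layout, and the `analyze` callback are all hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_analyze(path: Path, cache_dir: Path, analyze: Callable[[Path], dict]) -> dict:
    """Run `analyze` once per distinct file content (hypothetical cache layout)."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{sha256_file(path)}.json"
    if entry.exists():
        return json.loads(entry.read_text())
    result = analyze(path)
    # sort_keys keeps the serialized form stable across runs.
    entry.write_text(json.dumps(result, sort_keys=True))
    return result
```

Because the key depends only on content, a renamed or copied sample hits the same entry, which fits the "same input → same output" goal.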
Related
- malcontent — predecessor; cleave significantly improves upon its accuracy.
- capa — original inspiration; cleave has 20× the rule coverage and broader format support.
License
Apache-2.0
