arnac-io/history-sanitizer

History Sanitizer 🔒

A powerful command-line tool written in Go that automatically scans your shell history files for sensitive information and obfuscates it to keep your data safe.

Why?

Shell history files are incredibly useful for daily work, but they can inadvertently store sensitive information such as:

  • API keys and tokens
  • Passwords and secrets
  • Database connection strings
  • Private keys
  • Credit card numbers
  • Authentication headers

history-sanitizer helps you maintain the utility of your history while protecting sensitive data.

Features

  • 🔍 Smart Detection: Uses detection patterns from Gitleaks, an industry-leading, community-maintained secret scanner (15k+ stars)
  • 🎨 Colored Output: Clear, colored terminal output for easy reading
  • 🔐 Safe Obfuscation: Replaces sensitive data with redacted placeholders
  • 💾 Non-Destructive: Creates a new sanitized file, preserving your original
  • 🌈 Multi-Shell Support: Works with bash, zsh, fish, and other shell history formats
  • 🚀 Fast & Efficient: Built with Go for speed and reliability
  • 🧪 Dry Run Mode: Preview changes before applying them
  • 🔄 Auto-Updated Patterns: Leverages Gitleaks' actively maintained detection rules

Detection Patterns

The tool uses detection patterns sourced from Gitleaks, a well-maintained, community-driven project. We've extracted and implemented 36 high-value patterns covering:

Cloud Providers & Services:

  • AWS (Access Keys, Secret Keys, Session Tokens)
  • Google Cloud (API Keys)
  • Stripe, Heroku, Square API keys

Version Control:

  • GitHub (Personal Access Tokens, App Tokens, OAuth tokens, Fine-Grained PATs)

Credentials & Secrets:

  • Private Keys (RSA, EC, DSA, PGP, SSH)
  • JWT Tokens
  • Database connection strings (MongoDB, MySQL, PostgreSQL)
  • Generic passwords, API keys, and secrets

Communication & Monitoring:

  • Slack (Bot/App/User/Webhook tokens)
  • SendGrid, MailChimp, Twilio API keys
  • Datadog, PagerDuty tokens

Other:

  • 1Password service tokens
  • Environment variables with secrets
  • Proxy URLs with passwords

The full Gitleaks config (200+ rules) is embedded for reference at pkg/scanner/gitleaks.toml.

How We Use Gitleaks Patterns

We extract and implement Gitleaks' regex patterns directly because:

  • ✅ Gitleaks patterns are open source and well-maintained by a large community
  • ✅ Gitleaks CLI is designed as a standalone tool, not a Go library
  • ✅ Direct pattern implementation is simpler and more maintainable
  • ✅ Avoids 50+ transitive dependencies from the full Gitleaks package
  • ✅ We get the same detection quality with full control over the implementation

Our implementation:

  • Patterns defined in pkg/scanner/patterns.toml (extracted from Gitleaks)
  • Full gitleaks.toml (95KB, 200+ rules) embedded for reference
  • Easy to update by syncing with the official Gitleaks repository
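
As a rough illustration of what direct pattern implementation can look like, the sketch below compiles a tiny rule set with Go's standard regexp package and checks one history line against it. The Rule type, the rule IDs, and both regular expressions are simplified assumptions made up for this example; they are not copied from patterns.toml or from the Gitleaks config.

```go
package main

import (
	"fmt"
	"regexp"
)

// Rule is a hypothetical in-memory form of one detection pattern.
type Rule struct {
	ID          string
	Description string
	Regex       *regexp.Regexp
}

// exampleRules returns two simplified, illustrative rules (assumptions,
// not the patterns shipped with the tool).
func exampleRules() []Rule {
	return []Rule{
		{
			ID:          "aws-access-key",
			Description: "AWS Access Key",
			Regex:       regexp.MustCompile(`\b(?:AKIA|ASIA)[A-Z0-9]{16}\b`),
		},
		{
			ID:          "github-pat",
			Description: "GitHub Token",
			Regex:       regexp.MustCompile(`\bghp_[A-Za-z0-9]{36}\b`),
		},
	}
}

// firstMatch returns the description of the first rule that matches line.
func firstMatch(rules []Rule, line string) (string, bool) {
	for _, r := range rules {
		if r.Regex.MatchString(line) {
			return r.Description, true
		}
	}
	return "", false
}

func main() {
	line := "export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE"
	if desc, ok := firstMatch(exampleRules(), line); ok {
		fmt.Println("matched:", desc)
	}
}
```

Compiling each regex once up front keeps per-line matching cheap, which matters when a history file has tens of thousands of entries.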

Installation

Using Homebrew (Recommended)

brew tap arnac-io/tap
brew install history-sanitizer

From Source

git clone https://github.com/arnac-io/history-sanitizer.git
cd history-sanitizer
go build -o history-sanitizer

Using Go Install

go install github.com/arnac-io/history-sanitizer@latest

Usage

Basic Usage

Scan and sanitize your default shell history (zsh):

./history-sanitizer

Specify a History File

./history-sanitizer -f ~/.bash_history

Dry Run (Preview Only)

See what would be changed without modifying any files:

./history-sanitizer --dry-run

Verbose Output

Show detailed information about each finding:

./history-sanitizer -v

List Available Detection Rules

See all detection rules provided by Gitleaks:

./history-sanitizer list-rules

Custom Output File

./history-sanitizer -f ~/.bash_history -o ~/safe_history.txt

Complete Example

# Scan with dry run to see what will be found
./history-sanitizer -f ~/.zsh_history --dry-run -v

# If satisfied, run the actual sanitization
./history-sanitizer -f ~/.zsh_history -o ~/.zsh_history.clean

# Review the cleaned file
less ~/.zsh_history.clean

# Replace original (make sure to back up first!)
cp ~/.zsh_history ~/.zsh_history.backup
mv ~/.zsh_history.clean ~/.zsh_history

Command-Line Options

Flag        Short  Description                              Default
--file      -f     Path to history file                     ~/.zsh_history
--output    -o     Output file path                         <input>.sanitized
--dry-run   -d     Show changes without modifying files     false
--verbose   -v     Show detailed information                false
--in-place  -i     Replace original file (creates .backup)  false
--help      -h     Show help message                        -
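
The defaults for --output and --in-place interact, so here is a minimal sketch of how the destination paths could be derived from the flags. The Options struct, the resolvePaths helper, the precedence of --in-place over --output, and the exact backup name (input path plus ".backup") are all assumptions for illustration; the tool's actual logic may differ.

```go
package main

import "fmt"

// Options mirrors the documented flags; the struct itself is hypothetical.
type Options struct {
	File    string // --file
	Output  string // --output ("" means use the default)
	InPlace bool   // --in-place
}

// resolvePaths returns where sanitized output is written and, for in-place
// runs, where the original is backed up ("" when no backup is made).
// Letting --in-place win over --output is an assumption.
func resolvePaths(o Options) (output, backup string) {
	switch {
	case o.InPlace:
		return o.File, o.File + ".backup"
	case o.Output != "":
		return o.Output, ""
	default:
		return o.File + ".sanitized", ""
	}
}

func main() {
	for _, o := range []Options{
		{File: "/home/you/.zsh_history"},
		{File: "/home/you/.bash_history", Output: "/home/you/safe_history.txt"},
		{File: "/home/you/.zsh_history", InPlace: true},
	} {
		out, backup := resolvePaths(o)
		fmt.Printf("output=%s backup=%q\n", out, backup)
	}
}
```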

Additional Commands

Command     Description
list-rules  Display all available Gitleaks detection rules

Example Output

🔍 Scanning history file: /Users/you/.zsh_history

⚠ Found 3 sensitive pattern(s)

Finding #1:
  Type: AWS Access Key
  Line: 42

Finding #2:
  Type: Generic Secret
  Line: 108

Finding #3:
  Type: GitHub Token
  Line: 234

✓ Sanitized history saved to: /Users/you/.zsh_history.sanitized

Original file preserved at: /Users/you/.zsh_history

To replace your history file, run:
  mv /Users/you/.zsh_history.sanitized /Users/you/.zsh_history

How It Works

  1. Scan: Reads your shell history file and scans each line against known patterns
  2. Detect: Uses regular expressions to identify sensitive information
  3. Obfuscate: Replaces sensitive data with safe placeholders like [REDACTED_KEY_a1b2c3d4]
  4. Save: Writes the sanitized content to a new file

Security Considerations

  • ✅ Original files are never modified unless you pass --in-place, which also creates a .backup copy
  • ✅ Obfuscated values include a short hash (for example, the a1b2c3d4 in [REDACTED_KEY_a1b2c3d4]) for consistency
  • ✅ Output files are created with restrictive permissions (0600)
  • ✅ All processing happens locally; no data is sent anywhere

Development

Project Structure

history-sanitizer/
├── main.go                      # Entry point
├── cmd/
│   ├── root.go                  # Main scan/sanitize command
│   └── list.go                  # List detection rules command
├── pkg/
│   ├── scanner/
│   │   ├── scanner.go           # Pattern detection logic
│   │   ├── patterns.toml        # Detection patterns (from Gitleaks)
│   │   └── gitleaks.toml        # Full Gitleaks config (reference)
│   └── sanitizer/
│       └── sanitizer.go         # Obfuscation logic
├── examples/
│   └── sample_history.txt       # Sample file for testing
├── go.mod                       # Go module definition
└── README.md                    # This file

Running Tests

go test ./...

Building

go build -o history-sanitizer

Cross-Platform Builds

# Linux
GOOS=linux GOARCH=amd64 go build -o history-sanitizer-linux

# macOS
GOOS=darwin GOARCH=amd64 go build -o history-sanitizer-macos

# Windows
GOOS=windows GOARCH=amd64 go build -o history-sanitizer.exe

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License.

Acknowledgments

  • Powered by Gitleaks, a well-maintained, industry-standard tool for secret detection
  • Built with Cobra as the CLI framework
  • Uses fatih/color for colored terminal output

Roadmap

  • Add configuration file support for custom patterns
  • Support for more shell history formats
  • Integration with git hooks
  • Cloud backup sanitization
  • Machine learning-based detection

Support

If you encounter any issues or have questions, please open an issue on GitHub.

⚠️ Remember: Always back up your history files before running any sanitization tool!
