This Python CLI tool analyzes text content to identify common indicators associated with scams, such as phishing, prize scams, tech support scams, and more. It provides a likelihood score and a list of specific heuristics triggered.
- Heuristic-Based Analysis: Checks text against predefined lists of keywords and regular expression patterns related to:
  - Urgency and pressure tactics.
  - Requests for sensitive information (credentials, financial details).
  - Too-good-to-be-true offers and prizes.
  - Generic greetings.
  - Tech support scam language.
  - Payment requests and mentions of cryptocurrencies/gift cards.
- URL Analysis (String-Based):
  - Detects URLs within the text.
  - Checks for suspicious Top-Level Domains (TLDs).
  - Looks for suspicious keywords (e.g., "login", "verify") in URL paths or domains (with a basic check to avoid flagging known major domains).
- Detection of Potential Identifiers:
  - Cryptocurrency addresses (BTC, ETH).
  - Phone numbers (basic detection).
- Scam Likelihood Score: Calculates a score based on the number and severity (weights) of indicators found.
- Configurable Input: Accepts text via direct command-line argument, from a file, or from standard input (stdin).
- Verbose Output: Option to display detailed analysis of URLs found.
- Adjustable Threshold: Set a score threshold for a "High Risk" warning.
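
The weighted scoring described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in analyzer.py or heuristics.py: the rule table, the patterns, the weights, and the half-threshold cut-off for "Medium risk" are all assumptions made for the example. Only the default threshold of 5.0 and the assessment labels come from this README.

```python
import re

# Hypothetical rule table: (label, compiled pattern, weight).
# The real keyword lists, patterns, and weights live in heuristics.py and may differ.
RULES = [
    ("Urgency or pressure language",
     re.compile(r"\b(urgent|immediately|act now|suspended)\b", re.I), 1.5),
    ("Request for sensitive information",
     re.compile(r"\b(password|ssn|credit card|verify your account)\b", re.I), 2.0),
    ("Generic greeting",
     re.compile(r"\bdear (customer|user|sir|madam)\b", re.I), 0.5),
    ("Gift card or cryptocurrency payment",
     re.compile(r"\b(gift card|bitcoin|btc|ethereum)\b", re.I), 2.0),
    ("Possible ETH address",
     re.compile(r"\b0x[a-fA-F0-9]{40}\b"), 1.0),
]


def score_text(text):
    """Return (score, indicators); each rule that matches adds its weight once."""
    score = 0.0
    indicators = []
    for label, pattern, weight in RULES:
        if pattern.search(text):
            score += weight
            indicators.append(label)
    return score, indicators


def assess(score, threshold=5.0):
    """Map a score to a qualitative label.

    "High risk" applies strictly above the threshold. The medium cut-off at
    half the threshold is an assumption made for this sketch.
    """
    if score > threshold:
        return "WARNING: High risk!"
    if score >= threshold / 2:
        return "Medium risk"
    return "Low risk"
```

With these sample rules, lowering the threshold (as --threshold does) moves the same text from "Medium risk" to the high-risk warning without changing its score.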
- Python 3.6+
- No external Python libraries are required for the core functionality (uses only standard libraries like re, argparse, urllib.parse).
- Download the Code:
  - Ensure you have scam_main.py and the scam_detector directory (containing analyzer.py, heuristics.py, __init__.py).
- No pip install needed for external libraries for the core tool.
The tool is run from the command line using python3 scam_main.py. You must provide one input method.
- Input (Required - choose one):
  - -t TEXT, --text TEXT: Text content to analyze directly.
  - -f FILE, --file FILE: Path to a plain text file to read content from.
  - --stdin: Read text content from standard input (e.g., via a pipe).
- Options:
  - -v, --verbose: Enable verbose output (shows detailed URL analysis if URLs are found).
  - --threshold THRESHOLD: Score threshold above which a 'High Risk' warning is displayed (default: 5.0).
  - -h, --help: Show help message and exit.
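
For readers curious how such an interface is typically wired up, here is a minimal argparse sketch of the flags listed above. It is an assumption about the structure, not a copy of scam_main.py: the function names build_parser and read_input are hypothetical, and using a required mutually exclusive group is just one way to enforce "choose one" input method.

```python
import argparse
import sys


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="scam_main.py",
        description="Analyze text for common scam indicators.",
    )
    # Assumption: the three input methods are mutually exclusive and one is required.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-t", "--text", help="Text content to analyze directly.")
    source.add_argument("-f", "--file", help="Path to a plain text file to read.")
    source.add_argument("--stdin", action="store_true",
                        help="Read text content from standard input.")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Show detailed URL analysis if URLs are found.")
    parser.add_argument("--threshold", type=float, default=5.0,
                        help="Score above which a 'High Risk' warning is shown.")
    return parser


def read_input(args):
    """Return the text to analyze from whichever input method was chosen."""
    if args.text is not None:
        return args.text
    if args.file is not None:
        with open(args.file, encoding="utf-8") as handle:
            return handle.read()
    return sys.stdin.read()
```

With this layout, argparse itself rejects a command line that supplies no input method or more than one, and -h/--help is generated automatically.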
- Analyze text directly:
    python3 scam_main.py --text "Dear Customer, your account is suspended. Please login at http://yourbank.suspicious-site.xyz/update to avoid closure."
- Analyze text from a file:
    python3 scam_main.py --file path/to/suspicious_email.txt
- Analyze text from a file with verbose output and a custom threshold:
    python3 scam_main.py --file message.txt --verbose --threshold 3.0
- Analyze text piped from another command (Linux/macOS):
    cat email_body.txt | python3 scam_main.py --stdin
  (On Windows, you can type the input for --stdin interactively and then press Ctrl+Z followed by Enter to end it.)
- Overall Scam Likelihood Score: A numerical score. Higher scores indicate a higher likelihood of the text being a scam based on the tool's heuristics.
- Assessment: A qualitative assessment (e.g., "Low risk," "Medium risk," "WARNING: High risk!") based on the score and the threshold.
- Indicators Found: A list of specific reasons why the text was flagged (e.g., presence of urgency keywords, suspicious URL TLDs).
- Detailed URL Analysis (with --verbose): For each URL found:
  - The URL string.
  - Whether it was deemed suspicious.
  - Specific reasons for suspicion (e.g., "Uses a potentially suspicious TLD," "URL contains suspicious keyword").
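
As an illustration of what a string-based URL check producing this kind of output might look like, here is a small sketch built on re and urllib.parse. The TLD set, the known-domain list, and the function names are assumptions for the example; only the keywords "login" and "verify" and the reason strings are taken from this README. The tool's actual lists are in heuristics.py.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.I)

# Assumed sample values; the real lists live in heuristics.py.
SUSPICIOUS_TLDS = {"xyz", "top", "tk"}
SUSPICIOUS_KEYWORDS = ("login", "verify")
KNOWN_DOMAINS = {"google.com", "microsoft.com", "paypal.com"}


def find_urls(text):
    """Extract http(s) URLs and trim trailing sentence punctuation."""
    return [match.rstrip(".,;:!?)") for match in URL_PATTERN.findall(text)]


def analyze_url(url):
    """Return a dict with the URL, a suspicious flag, and the reasons."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    reasons = []

    # TLD check: the last dot-separated label of the host name.
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    if tld in SUSPICIOUS_TLDS:
        reasons.append("Uses a potentially suspicious TLD")

    # Keyword check, skipped for known major domains and their subdomains.
    is_known = any(host == d or host.endswith("." + d) for d in KNOWN_DOMAINS)
    if not is_known:
        haystack = (host + parsed.path).lower()
        for keyword in SUSPICIOUS_KEYWORDS:
            if keyword in haystack:
                reasons.append(f"URL contains suspicious keyword: {keyword}")

    return {"url": url, "suspicious": bool(reasons), "reasons": reasons}
```

Because the check only inspects the URL string, it never contacts the site; a clean result says nothing about where the link actually leads.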
- Heuristic-Based, Not Foolproof: This tool uses a set of predefined rules, keywords, and patterns. It is not a definitive judgment on whether a piece of text is a scam. Scammers constantly evolve their tactics.
- False Positives/Negatives: The tool may incorrectly flag legitimate text as suspicious (false positive) or fail to detect a real scam (false negative).
- Context is Key: The tool does not understand the full context of the communication, the sender, or your relationship with them, all of which are crucial for accurately identifying scams.
- Use Your Judgment: Always exercise extreme caution and use your best judgment when dealing with unsolicited communications, requests for personal information, or offers that seem too good to be true.
- Do Not Rely Solely on This Tool: This tool is an aid and should be one of many factors in your decision-making process. If you are unsure about a message, consult trusted sources or individuals.
.
├── scam_detector/
│ ├── __init__.py # Makes 'scam_detector' a Python package
│ ├── analyzer.py # Core scam analysis logic
│ └── heuristics.py # Keyword lists, regex patterns, and weights
├── scam_main.py # CLI entry point
└── README_scam_detector.md # This documentation file