anpa1200/Android-Malware-Analysis

Android Malware Analysis Tool

AI-powered static analysis framework for Android APK files. Combines YARA rule matching, semantic component analysis, threat indicator scoring, and multi-provider LLM classification (Claude, OpenAI, Google Gemini, or local Ollama) into a single terminal pipeline — producing detailed malware reports, MITRE ATT&CK mappings, VirusTotal cross-validation, and Frida instrumentation scripts.

📖 Full guide on Medium: Android APK Analysis Tool: AI-Powered Static Malware Analysis in Your Terminal


Features

| Layer | What it does |
|---|---|
| Static Analysis | Extracts metadata, permissions, components, API calls, certificates, strings, and network IOCs from APK/DEX via androguard |
| Threat Scoring | Weighted risk scoring (0–100) across 4 dimensions: permissions, behavior, network, obfuscation |
| Semantic Analysis | Decodes malware capabilities directly from service/activity names (ServiceRAT → RAT, EncryptionService → Ransomware) |
| YARA Scanning | 20 rules covering Banking Trojans (Anubis, Cerberus), Ransomware, Spyware, RAT (Metasploit), Joker, Stalkerware, Evasion |
| AI Classification | Claude, OpenAI (GPT-4o), Google Gemini, or local Ollama, auto-selected based on available API keys; produces family name, MITRE techniques, IOC list, Frida hooks |
| VirusTotal | Optional hash lookup for cross-validation (requires VT_API_KEY) |
| Frida Hooks | AI-generated JavaScript targeting specific APIs found in the sample |
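To make the Semantic Analysis layer more concrete, here is a minimal sketch of name-based capability lookup plus a Shannon entropy helper. It is not the tool's actual code: the names `CAPABILITY_PATTERNS`, `infer_capabilities`, and `shannon_entropy` are hypothetical, the substring matching is an assumption, and the pattern table contains only the examples quoted in this README.

```python
import math
from collections import Counter

# Name fragment -> capability, copied from the examples in this README.
# Hypothetical table: the real tool is described as having 25+ patterns.
CAPABILITY_PATTERNS = {
    "ServiceRAT": "Remote Access Trojan",
    "EncryptionService": "Ransomware (File Encryptor)",
    "ServiceVNC": "VNC Remote Control",
    "ActivityStartUSSD": "Bank Account Wipe",
    "NMSGService": "Covert C2 Messaging",
    "SyncTGData": "Telegram C2 Exfil",
}


def infer_capabilities(component_names):
    """Return the sorted capabilities whose fragment occurs in any component name."""
    found = set()
    for name in component_names:
        for fragment, capability in CAPABILITY_PATTERNS.items():
            if fragment in name:  # assumed: plain substring match
                found.add(capability)
    return sorted(found)


def shannon_entropy(text):
    """Shannon entropy in bits per character; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    return sum((n / total) * math.log2(total / n) for n in Counter(text).values())


components = ["com.example.ServiceRAT", "com.example.ActivityStartUSSD"]
print(infer_capabilities(components))
print(shannon_entropy("aaaa"), shannon_entropy("abcd"))
```

A string drawn uniformly from four symbols scores 2.0 bits per character, while a repeated character scores 0.0. Long strings with high entropy are the kind that may indicate encrypted payloads or keys.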

Quickstart

git clone https://github.com/anpa1200/Android-Malware-Analysis.git
cd Android-Malware-Analysis
./setup.sh

# Set one or more API keys (tool auto-selects best available):
export ANTHROPIC_API_KEY="sk-ant-..."      # Claude (best results)
export OPENAI_API_KEY="sk-..."             # OpenAI GPT-4o
export GOOGLE_API_KEY="..."                # Google Gemini Flash
export VT_API_KEY="..."                    # VirusTotal (optional)

# Analyze a single APK
python analyzer.py analyze malware.apk

# Choose provider explicitly
python analyzer.py analyze malware.apk -p claude
python analyzer.py analyze malware.apk -p openai
python analyzer.py analyze malware.apk -p google

# Skip AI, static analysis only (~3 seconds)
python analyzer.py analyze malware.apk --no-ai

# Override model
python analyzer.py analyze malware.apk --model claude-sonnet-4-6

# Batch analyze a directory
python analyzer.py batch /path/to/apks/

# Quick file info (no analysis)
python analyzer.py info malware.apk

Requirements

  • Python 3.10+
  • Ollama with qwen3:8b pulled (only for local AI; no API key needed)
  • Anthropic, OpenAI, or Google API key (optional; cloud analysis is faster and more detailed than local Ollama)
  • VirusTotal API key (optional)

Install dependencies:

pip install -r requirements.txt

Or use the setup script which creates a venv automatically:

./setup.sh

Analysis Pipeline

APK file
   │
   ├─ Phase 1: Static Analysis (androguard)
   │    ├─ Package metadata, SDK versions, signing certificate
   │    ├─ Permissions (31 dangerous permissions scored)
   │    ├─ Components: activities, services, receivers, providers
   │    ├─ Suspicious API calls (DexClassLoader, SmsManager, etc.)
   │    ├─ Network IOCs: URLs, IPs, domains
   │    └─ Obfuscation detection (ProGuard, fddo pattern, gibberish names)
   │
   ├─ Phase 2: Threat Indicator Scoring (0–100)
   │    ├─ Permission scoring (BIND_ACCESSIBILITY +8, READ_SMS +5, ...)
   │    ├─ Dangerous permission combos (SMS+accessibility+overlay = banking trojan)
   │    ├─ Behavioral API patterns
   │    └─ MITRE ATT&CK mapping per indicator
   │
   ├─ Phase 2b: Semantic Analysis
   │    ├─ Component name → capability mapping (25+ patterns)
   │    │    ServiceRAT       → Remote Access Trojan
   │    │    EncryptionService → Ransomware (File Encryptor)
   │    │    ServiceVNC       → VNC Remote Control
   │    │    ActivityStartUSSD → Bank Account Wipe
   │    │    NMSGService      → Covert C2 Messaging
   │    │    SyncTGData       → Telegram C2 Exfil
   │    ├─ String-based semantic detection (Telegram bot, Discord webhook, USSD, emulator checks)
   │    ├─ Shannon entropy analysis (encrypted payloads / C2 keys)
   │    └─ Rule-based family inference (Banking Trojan, Stalkerware, Ransomware, RAT, SMS Fraud, Dropper)
   │
   ├─ Phase 2c: VirusTotal Lookup (optional)
   │    └─ Detection ratio, threat label, family names, top AV detections
   │
   ├─ Phase 3: YARA Scanning (20 rules)
   │    └─ APK binary + all embedded DEX/SO files
   │
   └─ Phase 4: AI Analysis (Claude, OpenAI, Gemini, or Ollama)
        ├─ Threat classification + specific family name
        ├─ Executive summary + technical analysis
        ├─ MITRE ATT&CK techniques with evidence
        ├─ IOC list (hashes, packages, services, certificates)
        ├─ Remediation steps
        └─ Complete Frida instrumentation script
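The Phase 2 permission scoring in the diagram above can be sketched as follows. This is an illustration under stated assumptions, not the tool's scoring engine: only the +8 and +5 weights come from this README, while the overlay weight, the combo bonus, and the names `PERMISSION_WEIGHTS`, `COMBOS`, and `score_permissions` are hypothetical.

```python
ACCESSIBILITY = "android.permission.BIND_ACCESSIBILITY_SERVICE"
READ_SMS = "android.permission.READ_SMS"
OVERLAY = "android.permission.SYSTEM_ALERT_WINDOW"

# Weights: +8 and +5 are quoted in the pipeline diagram; +4 is assumed.
PERMISSION_WEIGHTS = {
    ACCESSIBILITY: 8,
    READ_SMS: 5,
    OVERLAY: 4,  # assumed weight for illustration
}

# (required permissions, finding label, bonus); the bonus value is assumed.
COMBOS = [
    ({READ_SMS, ACCESSIBILITY, OVERLAY},
     "SMS + accessibility + overlay = banking trojan", 20),
]


def score_permissions(permissions):
    """Return (score capped at 100, list of combo findings)."""
    requested = set(permissions)
    score = sum(w for perm, w in PERMISSION_WEIGHTS.items() if perm in requested)
    findings = []
    for required, label, bonus in COMBOS:
        if required <= requested:  # every permission in the combo is present
            score += bonus
            findings.append(label)
    return min(score, 100), findings


print(score_permissions([READ_SMS, ACCESSIBILITY, OVERLAY]))
print(score_permissions([READ_SMS]))
```

Treating combos as a separate bonus lets a set of individually common permissions raise the score sharply only when they appear together.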

YARA Rules

| Rule | Family | Severity |
|---|---|---|
| BankingTrojan_Anubis | Anubis/BankBot (ServiceRAT + USSD + overlay combo) | CRITICAL |
| BankingTrojan_Obfuscated_fddo | Anubis builder fddo obfuscation pattern | HIGH |
| BankingTrojan_Cerberus | Cerberus/Alien (accessibility + clipboard hijack + crypto) | CRITICAL |
| BankingTrojan_Accessibility | Generic banking trojan (accessibility + overlay) | CRITICAL |
| Ransomware_FileEncryption | File-encrypting ransomware (EncryptionService/DecryptionService) | CRITICAL |
| Ransomware_DeviceAdmin | Screen-locker ransomware (lockNow + ransom strings) | CRITICAL |
| RAT_Metasploit | Metasploit Android Meterpreter stage | CRITICAL |
| Spyware_MultiVector | Call log + SMS + network exfil | CRITICAL |
| Spyware_AudioLocation | Audio recording + location tracking | CRITICAL |
| Spyware_SMSStealer | SMS/contact stealing spyware | HIGH |
| Stalkerware_Covert | Hidden-icon covert surveillance | CRITICAL |
| SMSFraud_PremiumRate | Premium-rate SMS fraud | HIGH |
| SMSFraud_Joker | Joker/Bread WebView subscription fraud | HIGH |
| NotificationStealer_OTP | OTP theft via notification listener | CRITICAL |
| Overlay_PhishingActivity | Overlay phishing over banking apps | CRITICAL |
| Dropper_DexClassLoader | Dynamic DEX loading (dropper/loader) | HIGH |
| HiddenInstaller_Silent | Silent APK installation | CRITICAL |
| AntiAnalysis_EmulatorCheck | Emulator/debugger detection | MEDIUM |
| C2_TelegramBot | Telegram bot as C2 channel | HIGH |
| C2_DiscordWebhook | Discord webhook for exfiltration | HIGH |
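Phase 3 of the pipeline applies these rules to the APK binary and to every embedded DEX/SO file. Collecting those scan targets might look like the sketch below, which uses only the standard library. The function name `collect_scan_targets` and the `<apk>` key are hypothetical, and the demo APK is a tiny in-memory ZIP, not a real sample.

```python
import io
import zipfile


def collect_scan_targets(apk_bytes):
    """Return {name: bytes} for the APK itself plus each embedded .dex/.so entry."""
    targets = {"<apk>": apk_bytes}  # the whole APK binary is scanned too
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as archive:
        for info in archive.infolist():
            if info.filename.lower().endswith((".dex", ".so")):
                targets[info.filename] = archive.read(info)
    return targets


# Build a tiny stand-in APK in memory for demonstration purposes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("classes.dex", b"dex\n035\x00")
    demo.writestr("lib/arm64-v8a/libnative.so", b"\x7fELF")
    demo.writestr("AndroidManifest.xml", b"<manifest/>")

targets = collect_scan_targets(buffer.getvalue())
print(sorted(targets))
```

With yara-python installed, each buffer could then be passed to a compiled rule set through `rules.match(data=...)`. Whether the tool itself uses yara-python in this way is an assumption.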

AI Backend

Use --provider / -p to choose: auto (default), claude, openai, google, ollama.

auto tries providers in order — Claude → OpenAI → Google → Ollama — based on which API keys are present in the environment or config.yml.

| Provider | Default Model | Key Variable |
|---|---|---|
| Claude | claude-opus-4-6 | ANTHROPIC_API_KEY |
| OpenAI | gpt-4o | OPENAI_API_KEY |
| Google | gemini-2.0-flash | GOOGLE_API_KEY |
| Ollama | qwen3:8b (auto-detected) | (none required) |
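The auto fallback order described above can be sketched as a small function. This is a simplified illustration, not the tool's code: `PROVIDER_ORDER` and `select_provider` are hypothetical names, and the sketch checks only environment variables, ignoring keys set in config.yml.

```python
import os

# Cloud providers in the documented priority order, with their key variables.
PROVIDER_ORDER = [
    ("claude", "ANTHROPIC_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("google", "GOOGLE_API_KEY"),
]


def select_provider(requested="auto", env=None):
    """Pick a provider name; an explicit choice wins over auto-detection."""
    env = os.environ if env is None else env
    if requested != "auto":
        return requested  # assumed: explicit -p values are passed through as-is
    for name, key_var in PROVIDER_ORDER:
        if env.get(key_var):  # first provider with a non-empty key
            return name
    return "ollama"  # local fallback, no API key needed


print(select_provider(env={"OPENAI_API_KEY": "sk-demo", "GOOGLE_API_KEY": "demo"}))
print(select_provider(env={}))
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.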

Cloud providers receive the full evidence prompt (~5000 tokens) for rich technical analysis and detailed Frida hooks.

Ollama (fully offline, no API key required):

  • Auto-selects best available local model
  • Compact prompt mode (~450 tokens) for CPU inference
  • ~110–170s per sample on CPU

# Install Ollama and pull model (fully offline fallback)
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull qwen3:8b

# Run without any API key — Ollama kicks in automatically
python analyzer.py analyze malware.apk

Config file (config.yml in project dir or ~/.android_malware_analyzer.yml):

provider: claude        # auto, claude, openai, google, ollama
# model: claude-opus-4-6
# api_key: sk-ant-...
# vt_key: your-vt-key
output_dir: reports

Output

Terminal output (Rich tables, color-coded severity):

Phase 1: Static Analysis
  Package: naqsl.ebxcb.exu  App: pandemidestek  SDK: 15→28
  SHA256: 041ccba5...  Size: 859 KB

Phase 2: Threat Indicator Scoring
  Risk Score: 71/100  Level: HIGH
  ┌─ CRITICAL: Accessibility + SMS interception + overlay (T1417)
  ├─ CRITICAL: USSD execution capability (T1582)
  └─ HIGH: fddo obfuscation pattern — Anubis/BankBot builder

Phase 3: YARA Scanning
  4 rule(s) matched: BankingTrojan_Anubis, BankingTrojan_Obfuscated_fddo, ...

Phase 4: AI Analysis
  Classification: Banking Trojan
  Family: Anubis  Confidence: High
  "This Android Banking Trojan is a variant of the Anubis family..."

JSON report (reports/<package>_<timestamp>.json):

{
  "apk_metadata": { "package": "naqsl.ebxcb.exu", "sha256": "041c...", ... },
  "risk_assessment": { "risk_score": 71, "risk_level": "HIGH", ... },
  "yara_matches": [{ "rule": "BankingTrojan_Anubis", ... }],
  "ai_analysis": {
    "classification": "Banking Trojan",
    "family": "Anubis",
    "confidence": "High",
    "mitre_techniques": [{ "id": "T1417", "name": "Input Capture", ... }],
    "frida_hooks": "Java.perform(function() { ... });",
    ...
  }
}
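Because the report is plain JSON, it can be consumed by other tooling. The sketch below summarizes a report using only the field names visible in the excerpt above. `summarize_report` is a hypothetical helper, the inline sample is a trimmed stand-in for a real report file, and any fields elided in the excerpt are not assumed.

```python
import json


def summarize_report(report):
    """Extract a few headline fields from a parsed report dictionary."""
    risk = report.get("risk_assessment", {})
    ai = report.get("ai_analysis", {})
    return {
        "package": report.get("apk_metadata", {}).get("package"),
        "risk": f'{risk.get("risk_score")}/100 ({risk.get("risk_level")})',
        "family": ai.get("family"),
        "yara_rules": [m.get("rule") for m in report.get("yara_matches", [])],
        "mitre_ids": [t.get("id") for t in ai.get("mitre_techniques", [])],
    }


# Trimmed stand-in for reports/<package>_<timestamp>.json
sample_text = """
{
  "apk_metadata": {"package": "naqsl.ebxcb.exu"},
  "risk_assessment": {"risk_score": 71, "risk_level": "HIGH"},
  "yara_matches": [{"rule": "BankingTrojan_Anubis"}],
  "ai_analysis": {
    "classification": "Banking Trojan",
    "family": "Anubis",
    "confidence": "High",
    "mitre_techniques": [{"id": "T1417"}]
  }
}
"""
summary = summarize_report(json.loads(sample_text))
print(summary)
```

For a report on disk, `json.load` on the opened file would replace the `json.loads` call.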

Frida script (reports/<package>.js):

// Auto-generated Frida hooks for naqsl.ebxcb.exu
Java.perform(function() {
    // Hook SMS interception
    var SmsManager = Java.use("android.telephony.SmsManager");
    SmsManager.sendTextMessage.overload(...).implementation = function(...) { ... };
    // Hook overlay injection
    ...
});

Validated Results

Tested against 12 real-world malware samples. The tool classified all of them correctly with High confidence:

| Sample | Official Family | AI Classification | YARA | Semantic |
|---|---|---|---|---|
| pandemidestek.apk | Anubis Banking Trojan | Banking Trojan / Anubis | BankingTrojan_Anubis | Banking Trojan/RAT (95%) |
| sep_cerberus.apk | Cerberus Banking Trojan | Banking Trojan / Cerberus | BankingTrojan_Cerberus | Banking Trojan/RAT (50%) |
| RansomwareCryDroid.apk | CryDroid Ransomware | Ransomware / File Encryptor | Ransomware_FileEncryption | Ransomware (80%) |
| dec_sextortionistSpyware.apk | Sextortion Spyware | Spyware / SMS Stealer | Spyware_SMSStealer | SMS Fraud/Stealer (90%) |
| nov_jokerNew.apk | Joker Premium Dialer | SMS Fraud / Joker | SMSFraud_Joker | — (obfuscated payload) |
| projectSpy.apk | Stalkerware | Stalkerware / Spyware | Spyware_AudioLocation | Stalkerware (80%) |
| mar_CovidMetasploit.apk | Metasploit RAT | RAT / Metasploit | RAT_Metasploit | |
| roamingMantis1.apk | Roaming Mantis | RAT / C2 Messaging | — (encrypted) | RAT (40%) |
| mysteryBot.apk | Banking Trojan | Banking Trojan / RAT | | Banking Trojan (80%) |

Project Structure

android-malware-analysis/
├── analyzer.py              # CLI entry point (click)
├── setup.sh                 # One-command setup script
├── requirements.txt
├── core/
│   ├── apk_analyzer.py      # Static analysis via androguard
│   ├── indicators.py        # Weighted threat scoring engine
│   ├── semantic_analyzer.py # Component name → capability mapping
│   ├── yara_scanner.py      # YARA rule execution
│   ├── ai_engine.py         # Claude API + Ollama with lite prompt mode
│   ├── ollama_engine.py     # Local LLM streaming client
│   ├── vt_lookup.py         # VirusTotal API v3
│   └── reporter.py          # Rich terminal UI + JSON + Frida output
├── rules/
│   ├── malware.yar          # 20 YARA rules
│   ├── permissions.yaml     # 31 dangerous Android permissions with scores
│   └── suspicious_patterns.yaml  # API call and string patterns
└── templates/               # Report templates

Environment Variables

| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | Anthropic API key for Claude |
| OPENAI_API_KEY | OpenAI API key for GPT-4o |
| GOOGLE_API_KEY | Google API key for Gemini |
| VT_API_KEY | VirusTotal API key (also VIRUSTOTAL_API_KEY) |
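As an illustration of how the VirusTotal key might be used, the sketch below resolves the key from either variable name and builds a hash lookup request against the VirusTotal API v3 file endpoint. The helper names `get_vt_key` and `build_vt_request` are hypothetical, and giving VT_API_KEY precedence over VIRUSTOTAL_API_KEY is an assumption. No network call is made here.

```python
import hashlib
import os
import urllib.request

VT_FILES_URL = "https://www.virustotal.com/api/v3/files/"


def get_vt_key(env=None):
    """Return the VirusTotal key, or None if neither variable is set."""
    env = os.environ if env is None else env
    # Assumed precedence: VT_API_KEY first, then VIRUSTOTAL_API_KEY.
    return env.get("VT_API_KEY") or env.get("VIRUSTOTAL_API_KEY")


def build_vt_request(apk_bytes, api_key):
    """Build (but do not send) a GET request for the file's SHA-256 report."""
    sha256 = hashlib.sha256(apk_bytes).hexdigest()
    return urllib.request.Request(
        VT_FILES_URL + sha256,
        headers={"x-apikey": api_key},  # VirusTotal v3 authentication header
    )


key = get_vt_key({"VIRUSTOTAL_API_KEY": "demo-key"})
request = build_vt_request(b"", key)
print(request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` would return a JSON document for known hashes and an HTTP 404 for unknown ones.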

Adding Custom YARA Rules

Add rules to rules/malware.yar. Required metadata fields:

rule MyFamily_Variant {
    meta:
        description = "Description of what this detects"
        severity    = "CRITICAL"    // CRITICAL / HIGH / MEDIUM / LOW
        category    = "Banking Trojan"
        mitre       = "T1417"
        family      = "FamilyName"  // optional
    strings:
        $s1 = "SomeService" ascii
        $s2 = "SomeAPI" ascii
    condition:
        $s1 and $s2
}

The tool automatically picks up new rules on next run — no recompilation needed.


License

MIT License. For educational and authorized security research purposes only.

Do not use this tool against systems you do not own or have explicit permission to analyze.

Malware samples are NOT included in this repository. Sources for research samples:
