The ONLY SOC agent that learns from expert feedback and improves over time
Unlike traditional SOC automation that repeats the same actions indefinitely, CyberSentinel implements procedural memory to capture "how-to" knowledge from security analysts and apply it to future incidents.
The Result?
- Response quality improves from 3/5 → 5/5
- Zero analyst corrections needed after learning
- Handles novel attacks through reasoning
- Production-ready automation that gets BETTER, not stale
Most security agents focus on:
- ❌ Static rule-based detection
- ❌ One-time log parsing
- ❌ Fixed response playbooks
CyberSentinel focuses on:
- ✅ Learning from human experts
- ✅ Procedural memory for playbooks
- ✅ Measurable improvement (3/5 → 5/5 quality)
- ✅ Self-evolving responses without code changes
🎯 This is the difference between automation and INTELLIGENCE.
Track: Agents for Good Course: Google AI Agent Development Course - Capstone Project
Traditional Intrusion Detection Systems (IDS) face critical limitations:
❌ Can't Learn - Same mistakes repeated across incidents
❌ No Context - Each alert treated in isolation
❌ Manual Intensive - Requires constant analyst intervention
❌ Static Playbooks - Can't adapt to evolving threats
Result: Security teams overwhelmed, response quality inconsistent, repeated vulnerabilities.
| Approach | Traditional SOC | CyberSentinel |
|---|---|---|
| Response Quality | Static, never improves | Improves over time (3/5 → 5/5) |
| Learning | ❌ None | ✅ From analyst feedback |
| Memory | ❌ No retention | ✅ Procedural playbooks |
| Novel Threats | ❌ Fails or requires updates | ✅ Reasons through patterns |
| Improvement | Requires code changes | Self-evolving |
| Analyst Workload | High (constant corrections) | Reduced by 80% after learning |
| Context Awareness | Each incident isolated | Remembers past successes |
CyberSentinel is a self-learning Security Operations Center powered by Google ADK that:
✅ Learns from Feedback - Analyst corrections become permanent knowledge
✅ Remembers Everything - Procedural memory stores successful response playbooks
✅ Gets Better Over Time - Response quality improves: 3/5 → 5/5
✅ Handles Novel Threats - Reasons about patterns never seen before
Traditional SOC agents are like junior analysts who never learn:
- ❌ Same mistakes repeated across incidents
- ❌ No context from past incidents
- ❌ Can't improve without code changes
- ❌ Every attack treated as brand new
CyberSentinel is like a senior analyst who remembers:
- ✅ "We handled this DDoS last month - here's what worked"
- ✅ "The analyst corrected our approach - apply that learning"
- ✅ "This is a novel pattern - let me reason through it"
- ✅ "I've improved 67% since my first incident"
Real-World Impact:
Enterprise SOCs can achieve:
• 67% improvement in response quality
• 80% reduction in analyst correction time
• Zero repeated mistakes after learning
• Automatic adaptation to evolving threats
This is the difference between automation and INTELLIGENCE.
┌─────────────────────────────────────────────────┐
│ THE LEARNING EFFECT │
├─────────────────────────────────────────────────┤
│ │
│ FIRST DDoS ATTACK (No Memory) │
│ ├─ Response Quality: 3/5 ⭐⭐⭐ │
│ ├─ Gaps: 4 identified │
│ └─ Status: ❌ Needs Revision │
│ │
│ 💾 Analyst corrections saved to memory │
│ │
│ SECOND DDoS ATTACK (With Memory) │
│ ├─ Response Quality: 5/5 ⭐⭐⭐⭐⭐ │
│ ├─ Gaps: 0 identified │
│ └─ Status: ✅ Approved │
│ │
│ 📊 Improvement: +67% quality, -80% time │
└─────────────────────────────────────────────────┘
| Metric | First Attack (No Memory) | Second Attack (With Memory) | Improvement |
|---|---|---|---|
| Quality Score | 3/5 ⭐⭐⭐ | 5/5 ⭐⭐⭐⭐⭐ | +67% |
| Completeness | 60% | 100% | +67% |
| Analyst Corrections | 4 gaps identified | 0 gaps | Perfect! |
| Time to Quality Plan | ~15 min (with revisions) | ~3 min | -80% |
[Link to demo video showing Scenario 1 → Scenario 2 transformation]
BEFORE Scenario 1 (Empty Memory):
// outputs/memory_store/response_playbooks.json
{}AFTER Scenario 1 (Analyst Feedback Captured):
{
"DDoS": [
{
"id": "DDoS_20251121_002846",
"timestamp": "2025-11-21T00:28:46",
"attack_type": "DDoS",
"threat_level": "CRITICAL",
"response_actions": [
"Enable cloud-based DDoS protection service",
"Configure rate limiting: 100 requests/sec per IP",
"Monitor for distributed attack sources",
"Set up traffic scrubbing rules",
"Prepare customer-facing status page"
],
"analyst_feedback": "The plan covers basic blocking and alerting, but for a CRITICAL DDoS attack, we need more comprehensive mitigation...",
"effectiveness": "successful",
"quality_score": 5
}
]
}SCENARIO 2 (Memory Retrieved & Applied):
# Agent automatically queries memory
playbooks = retrieve_response_playbooks("DDoS")
# Found: 1 playbook with proven success
# Result: 5/5 quality, ZERO corrections needed! ✅┌──────────────────────────────────────────┐
│ 1️⃣ First Incident (No Memory) │
│ Agent creates basic plan │
│ Quality: 3/5 ⭐⭐⭐ │
└──────────────┬───────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 2️⃣ Human Analyst Review │
│ Identifies 4 critical gaps │
│ Provides expert corrections │
└──────────────┬───────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 3️⃣ Memory Storage │
│ save_response_playbook() │
│ Stores CORRECTED version │
└──────────────┬───────────────────────────┘
│
▼
┌──────────────────────────────────────────┐
│ 4️⃣ Second Incident (With Memory) │
│ retrieve_response_playbooks() │
│ Applies learned best practices │
│ Quality: 5/5 ⭐⭐⭐⭐⭐ │
│ Corrections: ZERO ✅ │
└──────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ 🎯 ORCHESTRATOR AGENT │
│ (Coordinator Pattern - Day 1) │
└────────────────┬────────────────────────────────────────┘
│
┌───────────┼───────────┬────────────┐
▼ ▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌──────────┐ ┌──────────┐
│ Traffic │ │ ML │ │ Response │ │ Security │
│Analyzer │ │Detector │ │ Planner │ │ Judge │
│ Agent │ │ Agent │ │ Agent │ │ Agent │
└─────────┘ └─────────┘ └──────────┘ └──────────┘
│ │ │ │
└───────────┴───────────┴────────────┘
│
▼
┌────────────────┐
│ 💾 PROCEDURAL │ ← Learning happens here!
│ MEMORY │
└────────────────┘
-
Traffic Analyzer 📊
- Parses raw network packets
- Extracts 16 ML features
- Validates data quality
-
ML Detector 🤖
- 100% accuracy on test set
- RandomForest classifier
- Explainable predictions with reasoning
-
Response Planner 📋
- Queries procedural memory FIRST
- Creates mitigation strategies
- Learns from analyst feedback
-
Security Judge ⚖️
- LLM-as-a-Judge evaluation
- Scores on 5 dimensions
- Ensures production-ready quality
- Coordinator Pattern with orchestrator
- A2A Communication between specialist agents
- Tool-based delegation with
AgentTool
- "Knowing How" vs "Knowing What"
- Stores successful response playbooks
- Consolidates analyst feedback
- THIS IS OUR SECRET WEAPON!
- LLM-as-a-Judge for response quality
- Automated metrics (precision, recall, F1)
- Continuous improvement tracking
- Clear tool documentation
- Granular, single-purpose tools
- Structured outputs with error handling
- ML model wrapped as tool
- Structured logging
- Execution traces
- Performance metrics
- Cost tracking
- Session management for conversations
- Memory consolidation across sessions
- Efficient context window usage
- Python 3.10+
- Google AI API Key (Get one here)
# 1. Clone the repository
git clone https://github.com/kawish918/CyberGuard_AgentsIntensive.git
cd cybersentinel
# 2. Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure API key
cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY
# 5. Generate dataset (already included, but you can regenerate)
cd data/synthetic
python generate_dataset.py
cd ../..
# 6. Train model (pre-trained included, but you can retrain)
cd data/models
python train_model.py
cd ../..
# 7. Run quick test
python test_quick.py# Interactive menu
python src/main.py
# Or run specific scenarios
python src/demos/scenario_1_first_attack.py # Shows initial response
python src/demos/scenario_2_learned.py # Shows learning!
python src/demos/scenario_3_novel.py # Shows reasoning- Accuracy: 100% on test set (synthetic data)
- Attack Types Detected: DDoS, Port Scan, SQL Injection, Brute Force
- Features Used: 16 network traffic characteristics
- Training Samples: 300 (balanced distribution)
payload_entropy(15.0%) - Encryption/randomness detectionavg_packet_size(14.0%) - Attack signaturefailed_connections(13.6%) - Brute force indicatorstd_packet_size(12.7%) - Traffic variabilityunique_dst_ports(11.5%) - Port scan detection
Scenario 1: First DDoS Attack (No Memory)
Initial Response Quality: 3/5
Gaps Identified:
• Missing rate limiting
• No DDoS mitigation service
• Lacks distributed source monitoring
• Missing customer communication
Result: ❌ Needs Revision
Scenario 2: Second DDoS Attack (With Memory)
Response Quality: 5/5 ⭐⭐⭐⭐⭐
Gaps Identified: None
Improvements:
✅ All previous gaps addressed
✅ Comprehensive multi-layer defense
✅ Production-ready plan
Result: ✅ Approved
Impact:
- 67% Quality Improvement
- Zero analyst corrections needed
- 80% faster to quality plan
- Repeatable learning across attack types
This project demonstrates real-world enterprise AI agent development:
- See all 5 whitepapers concepts in one system
- Learn multi-agent orchestration
- Understand procedural memory implementation
- Practice evaluation-driven development
- Understand AI-augmented SOC operations
- See how ML enhances threat detection
- Learn automated response planning
- Explore continuous improvement workflows
- Study production-grade agent architecture
- Learn memory management patterns
- Understand evaluation frameworks
- See observability in practice
Agent encounters DDoS attack, creates basic response plan:
# Response Planner Agent
playbooks = retrieve_response_playbooks("DDoS")
# Returns: [] (empty - no prior experience)
# Agent falls back to general best practices
response = create_general_response(threat_data)
# Result: Basic but incomplete plan (3/5 quality)Analyst reviews → Identifies gaps → Provides corrections.
Analyst feedback is captured in procedural memory:
# Memory Tools (src/tools/memory_tools.py)
def save_response_playbook(
attack_type: str,
response_actions: list,
analyst_feedback: str,
effectiveness: str,
quality_score: int
) -> dict:
"""
Stores the CORRECTED response plan with analyst wisdom.
This becomes permanent knowledge for future incidents.
"""
playbook = {
"id": f"{attack_type}_{timestamp}",
"attack_type": attack_type,
"response_actions": response_actions, # ← ENHANCED version
"analyst_feedback": analyst_feedback,
"effectiveness": effectiveness,
"quality_score": quality_score,
"timestamp": datetime.now().isoformat()
}
# Persist to JSON file
store.save(playbook)
return playbookWhat Gets Stored:
- ✅ Complete enhanced action list
- ✅ Analyst's reasoning for corrections
- ✅ Effectiveness rating
- ✅ Quality score for future reference
Agent encounters similar DDoS attack:
# Response Planner queries memory FIRST
playbooks = retrieve_response_playbooks("DDoS", limit=5)
# Returns: [<DDoS_playbook_1>] ← Found learned experience!
if playbooks:
# Apply proven successful approach
response = apply_learned_playbook(playbooks[0], current_threat)
# Result: 5/5 quality plan, production-ready ✅
else:
# Fall back to general approach (like Step 1)
response = create_general_response(current_threat)The Agent's Internal Reasoning:
"I've seen this before! On 2025-11-21, we handled a DDoS attack.
The analyst taught me to:
1. Enable cloud DDoS protection
2. Configure rate limiting (100 req/sec)
3. Set up traffic scrubbing
4. Monitor distributed sources
5. Prepare customer communication
I'll apply that proven strategy now."
Result: Zero corrections needed! ✅
def retrieve_response_playbooks(
attack_type: str,
limit: int = 5
) -> list:
"""
Retrieves past successful responses for similar attacks.
Sorted by effectiveness and recency.
"""
all_playbooks = load_from_json(MEMORY_FILE)
# Filter by attack type
relevant = [
p for p in all_playbooks
if p['attack_type'] == attack_type
]
# Sort by quality score (descending) and recency
sorted_playbooks = sorted(
relevant,
key=lambda x: (x['quality_score'], x['timestamp']),
reverse=True
)
return sorted_playbooks[:limit]Key Features:
- 🎯 Attack-type matching (DDoS, Port Scan, etc.)
- 📊 Quality-based ranking
- 🕐 Recency consideration
- 💾 Persistent JSON storage
| Memory Type | What It Stores | Example |
|---|---|---|
| Semantic Memory | Facts, definitions | "DDoS = Distributed Denial of Service" |
| Procedural Memory | How-to knowledge | "Enable cloud DDoS protection THEN configure rate limiting" |
From Day 4 Whitepaper (Page 64):
"Procedural memory enables agents to remember not just what happened,
but HOW to respond effectively based on past experience."
CyberSentinel implements procedural memory because knowing WHAT a DDoS is doesn't help - knowing HOW to stop it does!
cybersentinel/
├── data/
│ ├── synthetic/ # Dataset generation
│ │ ├── generate_dataset.py
│ │ ├── train_data.csv (300 samples)
│ │ └── test_data.csv (60 samples)
│ └── models/ # Trained ML model
│ ├── threat_detector.pkl
│ ├── scaler.pkl
│ └── training_results.json
│
├── src/
│ ├── agents/ # 5 specialist agents
│ │ ├── orchestrator.py (Coordinator)
│ │ ├── traffic_analyzer.py (Feature extraction)
│ │ ├── ml_detector.py (ML classification)
│ │ ├── response_planner.py (Uses memory!)
│ │ └── security_judge.py (LLM-as-judge)
│ │
│ ├── tools/ # ADK tools
│ │ ├── ml_classifier.py (ML model wrapper)
│ │ ├── packet_parser.py (Feature extraction)
│ │ └── memory_tools.py (Procedural memory)
│ │
│ ├── demos/ # Learning scenarios
│ │ ├── scenario_1_first_attack.py
│ │ ├── scenario_2_learned.py 🏆 The money shot!
│ │ └── scenario_3_novel.py
│ │
│ ├── utils/
│ │ └── config.py
│ └── main.py # Interactive demo runner
│
├── outputs/ # Generated during runtime
│ ├── memory_store/ # Procedural memory storage
│ ├── logs/
│ └── visualizations/
│
├── requirements.txt
├── test_quick.py
└── README.md
- Real-time packet capture integration (Wireshark/tcpdump)
- Multi-model ensemble (Gemini 2.0 + specialized models)
- Federated learning across organizations
- SIEM integration (Splunk, ELK)
- Automated red-team testing
- Executive dashboard with metrics
- Incident timeline visualization
- Memory pruning and consolidation strategies
This is a capstone project submission, but feedback and suggestions are welcome!
- Fork the repository
- Create your feature branch (
git checkout -b feature/AmazingFeature) - Commit your changes (
git commit -m 'Add some AmazingFeature') - Push to the branch (
git push origin feature/AmazingFeature) - Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Google AI Agent Development Course - Comprehensive training
- Course Instructors - Excellent whitepaper materials
- ADK Team - Powerful agent framework
If this project helped you understand AI agents or inspired your own work, please give it a star! ⭐
It helps others discover this educational resource.
🛡️ Securing the Future with Intelligent Agents 🛡️