Intelligent incident management system for automated detection, classification, and resolution tracking.
About The Project • Architecture • Key Features • Quick Start • Usage • API Endpoints
Thala is an intelligent incident management system that automatically:
- Detects incidents from Slack messages, Jira tickets, and emails
- Classifies messages and predicts severity, category, and likelihood using an LLM agent
- Tracks resolutions and links them to original incidents
- Searches similar past incidents using semantic similarity
- Extracts text from image attachments using AWS Textract
- Ingestion: Slack/Jira/Email → Connectors → Kafka
- Classification: Llama 3.3 70b LLM classifies messages (incident, resolution, discussion, unrelated)
- Prediction: AWS Bedrock (llama-3.3-70b) agent predicts category & severity
- Attachment Processing: Images → S3 → Textract → Extracted text → Context
- Storage: Flask API → Elasticsearch (with embeddings for semantic search)
- Resolution Tracking: Links resolution messages to original incidents
- UI: Slack bot commands (/thala latest_issue, /thala search)
- Uses LLM from AWS Bedrock (llama-3.3-70b) to classify messages semantically
- No keyword matching - pure agent understanding
- Types: incident_report, resolution, discussion, unrelated
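A model reply rarely arrives as a clean label, so the four types above usually need a normalization step before routing. The following is a minimal sketch of that step, not the project's actual code: `MessageType`, `parse_message_type`, and the choice to treat unknown labels as `unrelated` are all assumptions made for illustration.

```python
from enum import Enum


class MessageType(str, Enum):
    """The four message types listed in this README."""
    INCIDENT_REPORT = "incident_report"
    RESOLUTION = "resolution"
    DISCUSSION = "discussion"
    UNRELATED = "unrelated"


def parse_message_type(raw_label: str) -> MessageType:
    """Map free-form model output onto one of the four known types."""
    # Drop surrounding whitespace, quotes, and trailing periods, then
    # normalize "Incident Report" style answers to snake_case.
    cleaned = raw_label.strip().strip("\"'.").strip().lower().replace(" ", "_")
    try:
        return MessageType(cleaned)
    except ValueError:
        # Assumption: anything unrecognized is treated as unrelated so that
        # a malformed reply never creates an incident by accident.
        return MessageType.UNRELATED
```

Falling back to `unrelated` is the conservative choice here; a stricter pipeline might instead retry the model call.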
- Links vague resolutions ("auth issue fixed") to correct incidents
- Uses semantic similarity (embeddings) + conversational context
- Automatically marks incidents as "Resolved" in Elasticsearch
- Downloads images from Slack/Jira attachments
- Uploads to S3 bucket (thala-images)
- Extracts text using AWS Textract
- Adds extracted text to message context for classification
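The upload-then-extract steps above could look roughly like the sketch below. It assumes boto3-style clients are passed in (for example `boto3.client("s3")` and `boto3.client("textract")`); the function names, the object key, and the way extracted text is appended to the message are hypothetical and not taken from the project source.

```python
def extract_text_from_image(s3_client, textract_client, image_bytes: bytes,
                            key: str, bucket: str = "thala-images") -> str:
    """Upload an image to S3 and return the text lines Textract detects in it."""
    # Store the downloaded attachment in the bucket named in this README.
    s3_client.put_object(Bucket=bucket, Key=key, Body=image_bytes)

    # Ask Textract to read the object directly from S3.
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )

    # Keep LINE blocks only; WORD blocks repeat the same text word by word.
    lines = [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]
    return "\n".join(lines)


def build_context(message_text: str, extracted_text: str) -> str:
    """Append extracted attachment text to the message (format is an assumption)."""
    if not extracted_text:
        return message_text
    return f"{message_text}\n\n[Attachment text]\n{extracted_text}"
```

Passing the clients in as arguments keeps the function easy to exercise with stand-in objects when AWS credentials are not available.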
- Category: Database, API, Frontend, Infrastructure, Authentication, etc.
- Severity: Critical, High, Medium, Low
- Likelihood: Likely, Unlikely (for new queries)
- Uses Llama model with few-shot learning
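Few-shot prediction generally means placing a handful of labelled examples in the prompt ahead of the new description. The sketch below shows one way to assemble such a prompt; the example incidents, the prompt wording, and `build_prediction_prompt` are invented for illustration and are not the project's training data or prompt.

```python
# Hypothetical labelled examples; the real training examples are not shown here.
FEW_SHOT_EXAMPLES = [
    ("Login page returns 401 for every user", "Authentication", "Critical"),
    ("Dashboard chart legend overlaps on mobile", "Frontend", "Low"),
    ("Primary database connection pool exhausted", "Database", "High"),
]


def build_prediction_prompt(description: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Build a few-shot prompt that ends where the model should continue."""
    lines = [
        "Predict the category and severity of the incident.",
        "Severity is one of: Critical, High, Medium, Low.",
        "",
    ]
    for text, category, severity in examples:
        lines += [
            f"Incident: {text}",
            f"Category: {category}",
            f"Severity: {severity}",
            "",
        ]
    # Leave the final answer open so the model completes it.
    lines += [f"Incident: {description}", "Category:"]
    return "\n".join(lines)
```

Ending the prompt on `Category:` nudges the model to answer in the same two-line shape as the examples, which makes the reply easier to parse.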
- Finds similar past incidents using vector embeddings
- Prioritizes resolved incidents with complete resolution info
- Returns similarity scores and resolution details
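One simple way to prioritize resolved incidents with complete resolution info is to sort on that property first and on similarity second. This is a sketch under assumptions: the hit layout and the `score` field are hypothetical, while `status` and `resolution_text` follow the field names used in the API payloads in this README.

```python
def rank_results(hits):
    """Order hits: resolved-with-resolution first, then by descending score."""

    def has_resolution(hit) -> bool:
        return hit.get("status") == "Resolved" and bool(hit.get("resolution_text"))

    # False sorts before True, so "not has_resolution" puts complete hits first.
    return sorted(hits, key=lambda h: (not has_resolution(h), -h.get("score", 0.0)))
```

Because `sorted` is stable, hits with equal rank keep the order the search backend returned them in.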
- /thala latest_issue [page] - View ongoing incidents (paginated)
- /thala search <query> - Search similar resolved incidents
- /thala predict <description> - Predict category/severity
- /thala - Show help
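Slack delivers everything typed after the slash command as a single text string, so the bot has to split it into a subcommand and an argument. The parser below is an illustrative sketch of that dispatch step; `parse_thala_command` and its return shape are assumptions, not the bot's real handler.

```python
def parse_thala_command(text: str):
    """Split the text after /thala into (subcommand, argument)."""
    parts = text.strip().split(maxsplit=1)
    if not parts:
        # A bare /thala shows help.
        return ("help", None)

    subcommand = parts[0].lower()
    argument = parts[1].strip() if len(parts) > 1 else ""

    if subcommand == "latest_issue":
        # The page number is optional and defaults to the first page.
        page = int(argument) if argument.isdigit() and int(argument) > 0 else 1
        return ("latest_issue", page)

    if subcommand in ("search", "predict") and argument:
        # Tolerate a query wrapped in double quotes.
        return (subcommand, argument.strip('"'))

    # Unknown subcommands and missing arguments fall back to help.
    return ("help", None)
```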
- Python 3.12+
- Elasticsearch 9.1.5+ (running)
- Kafka (KRaft mode, optional for real-time)
- AWS Account (for S3, Textract, and Bedrock)
Install dependencies:
pip install -r requirements.txt
pip install -r team-thala/src/ui_requirements.txt
Create .env file in the root directory:
GEMINI_API_KEY=
FLASK_API_URL=http://localhost:5000
# Elasticsearch Configuration (if remote, change localhost to your ES host)
ELASTICSEARCH_HOST=https://localhost:9200
SLACK_APP_TOKEN=
JIRA_URL=https://kphotos1803.atlassian.net
JIRA_EMAIL=
JIRA_API_TOKEN=
SLACK_BOT_TOKEN=
SLACK_CHANNEL_ID=
# Kafka Configuration
KAFKA_BOOTSTRAP_SERVERS=localhost:9092
KAFKA_TOPIC_SLACK=thala-slack-events
KAFKA_TOPIC_JIRA=thala-jira-events
# Logging Configuration
LOG_LEVEL=INFO
LOG_FILE=logs/thala_ingestion.log
AWS_LAMBDA_URL=
SEARCH_BACKEND=opensearch_serverless
AWS_REGION=us-east-2
AWS_ACCESS_KEY_ID=""
AWS_SECRET_ACCESS_KEY=""
AWS_SESSION_TOKEN=""
FunctionUrl=""
FunctionArn=""
AWS_BEARER_TOKEN_BEDROCK=""
OPENSEARCH_HOST=""
KAFKA_BOOTSTRAP_SERVERS=""
REDIS_FALLBACK_ENABLED=true
REDIS_HOST=127.0.0.1
REDIS_PORT=6379
REDIS_LIST_PREFIX=thala:queue:
BEDROCK_LLAMA_MODEL_ID=meta.llama3-3-70b-instruct-v1:0
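A small loader that validates these variables at startup can catch a missing token before any connector runs. The sketch below reads from `os.environ` by default; which variables count as required, the default values for blank entries, and the `load_settings` name are assumptions for illustration.

```python
import os

# Assumption: the Slack connector cannot start without these two values.
REQUIRED_VARS = ("SLACK_BOT_TOKEN", "SLACK_CHANNEL_ID")


def load_settings(env=None) -> dict:
    """Read Thala settings from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env

    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))

    return {
        "slack_bot_token": env["SLACK_BOT_TOKEN"],
        "slack_channel_id": env["SLACK_CHANNEL_ID"],
        # "or" makes blank values fall back to the local defaults.
        "flask_api_url": env.get("FLASK_API_URL") or "http://localhost:5000",
        "elasticsearch_host": env.get("ELASTICSEARCH_HOST") or "https://localhost:9200",
        "kafka_bootstrap_servers": env.get("KAFKA_BOOTSTRAP_SERVERS") or "localhost:9092",
        "redis_host": env.get("REDIS_HOST") or "127.0.0.1",
        "redis_port": int(env.get("REDIS_PORT") or 6379),
        "redis_fallback_enabled": (env.get("REDIS_FALLBACK_ENABLED") or "false").lower() == "true",
        "log_level": env.get("LOG_LEVEL") or "INFO",
    }
```

If the project loads `.env` with a helper such as python-dotenv, that call would run before `load_settings()`; this sketch only covers the validation step.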
- Create Slack app at https://api.slack.com/apps
- Add Bot Token Scopes:
  - channels:history, channels:read
  - chat:write, commands
  - app_mentions:read, im:history
  - files:read (REQUIRED for attachments)
- Install app to workspace
- Copy Bot Token (xoxb-...) to .env
See: team-thala/SLACK_FILES_READ_SETUP.md for detailed setup instructions.
python integrated_main.py
# Terminal 1: Flask API
python new.py
# Terminal 2: Kafka Consumer
python team-thala/src/kafka_consumer_to_flask.py
# Terminal 3: Slack Connector
python team-thala/src/slack_connector_enhanced.py
# Terminal 4: Slack Bot UI
python team-thala/src/slack_bot_ui.py
/thala # Show help and available commands
/thala latest_issue [page] # View ongoing incidents (paginated, 10 per page)
/thala search <query> # Search similar resolved incidents
Slack: "API server is down"
→ LLM from AWS Bedrock (llama-3.3-70b) classifies as "incident_report"
→ It predicts: Category=API, Severity=High
→ Sent to Kafka → Flask → Elasticsearch
→ Tracked in Incident Tracker
→ Available in Slack: /thala latest_issue
Slack: "API issue has been fixed"
→ LLM from AWS Bedrock (llama-3.3-70b) classifies as "resolution"
→ Semantic search finds matching open incident
→ Updates status to "Resolved" in Elasticsearch
→ Logs resolution text, resolved_by, resolved_at
→ Removed from ongoing incidents list
Slack: [Image attachment] "Check this error"
→ Download image from Slack (files_info API)
→ Upload to S3 bucket
→ Extract text using Textract
→ Add extracted text to message context
→ Classify with full context (image + text)
→ Create incident if classified as incident_report
Slack: /thala search "database timeout"
→ Flask API performs semantic search in Elasticsearch
→ Returns similar resolved incidents
→ Prioritizes incidents with complete resolution info
→ Displays in Slack with rich formatting
- Monitors Slack channels for messages
- Classifies messages using LLM from AWS Bedrock (llama-3.3-70b)
- Processes attachments (S3 + Textract)
- Detects resolutions and links to incidents
- Prevents resolution messages from creating new incidents
- Handles vague messages intelligently
- Slack bot with slash commands
- Paginated incident listing
- Semantic search interface
- Rich UI with Slack Block Kit
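Paginated listing comes down to slicing the incident list and clamping the requested page to a valid range. The helper below is a minimal sketch using the 10-per-page size mentioned for `/thala latest_issue`; the `paginate` name and its return tuple are assumptions.

```python
import math


def paginate(items, page: int = 1, page_size: int = 10):
    """Return (items_on_page, page_used, total_pages) for a 1-based page number."""
    total_pages = max(1, math.ceil(len(items) / page_size))
    # Clamp out-of-range requests instead of returning an empty page.
    page = min(max(page, 1), total_pages)
    start = (page - 1) * page_size
    return items[start:start + page_size], page, total_pages
```

Clamping means a request for a page past the end shows the last page, which tends to be friendlier in a chat UI than an error message.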
- Predicts category & severity
- Uses few-shot learning with training examples
- Caches predictions (24h TTL)
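A 24-hour prediction cache can be as small as a dictionary that stores an expiry time next to each value. The class below is an in-memory sketch of that idea; the project may well use Redis or another store instead, and `TTLCache` with its injectable clock is hypothetical.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be exercised without waiting
        self._store = {}

    def set(self, key, value) -> None:
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._store[key]
            return default
        return value
```

Keying the cache on the normalized incident description would let repeated `/thala predict` calls for the same text skip the model call.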
- Downloads attachments from Slack/Jira
- Uploads to S3 bucket
- Extracts text using Textract
- Handles image format conversion (PNG → JPEG)
- /index - Store incidents in Elasticsearch
- /search - Semantic similarity search
- /predict_incident - Predict likelihood
- /update_status - Mark incidents as resolved
- /lookup_incident - Find incident by ID
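Calling these endpoints from a script needs only the standard library. The sketch below assumes each endpoint accepts a JSON body via POST and that the API listens at the `FLASK_API_URL` default of `http://localhost:5000`; the HTTP method and the helper names are assumptions, since the README does not state them.

```python
import json
import urllib.request


def build_json_request(base_url: str, endpoint: str, payload: dict) -> urllib.request.Request:
    """Build a JSON request for a Thala endpoint (POST is an assumption)."""
    url = base_url.rstrip("/") + "/" + endpoint.lstrip("/")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_api(base_url: str, endpoint: str, payload: dict, timeout: float = 10.0):
    """Send the request and decode the JSON response."""
    request = build_json_request(base_url, endpoint, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires the Flask API to be running):
# call_api("http://localhost:5000", "/search",
#          {"query": "database connection timeout", "top_k": 10})
```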
Store new incident in Elasticsearch
{
  "texts": ["API server is down"],
  "timestamp": "2025-11-01T10:00:00",
  "status": "Open",
  "source": "slack",
  "category": "API",
  "severity": "High"
}
Semantic similarity search
{
  "query": "database connection timeout",
  "top_k": 10
}
Mark incident as resolved
{
  "issue_id": "slack_1234567890",
  "status": "Resolved",
  "resolution_text": "Fixed connection pool",
  "resolved_by": "U08L203J5TK",
  "resolved_at": "2025-11-01T10:15:00"
}
Find incident by ID
{
  "issue_id": "slack_1234567890"
}
- Bot Token (xoxb-...): Required for Web API calls (files_info, channels, etc.)
- App Token (xapp-...): Only for Socket Mode (not used currently)
- Use Bot Token in SLACK_BOT_TOKEN environment variable
- Slack app must have files:read scope
- AWS credentials must be configured
- S3 bucket must exist (thala-images)
- Textract must be enabled in AWS region
- No keyword matching - pure semantic understanding
- Links resolutions even if ID not mentioned explicitly
- Uses conversational context (recent incidents)
- Fallback to most recent open incident if no match
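The matching and fallback rules above can be illustrated with plain cosine similarity over embeddings. This is a sketch, not the project's implementation: the 0.5 threshold, the `embedding` and `timestamp` field names, and `link_resolution` are assumptions, and in practice the embeddings would come from whatever model the pipeline uses.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def link_resolution(resolution_vec, open_incidents, threshold: float = 0.5):
    """Return the issue_id a resolution message most likely refers to, or None."""
    if not open_incidents:
        return None

    scored = [
        (cosine_similarity(resolution_vec, incident["embedding"]), incident)
        for incident in open_incidents
    ]
    best_score, best_incident = max(scored, key=lambda pair: pair[0])
    if best_score >= threshold:
        return best_incident["issue_id"]

    # No confident semantic match: fall back to the most recent open incident.
    # ISO 8601 timestamps in the same format compare correctly as strings.
    latest = max(open_incidents, key=lambda incident: incident["timestamp"])
    return latest["issue_id"]
```

The fallback trades precision for coverage: a vague "fixed it" still closes something, so the threshold is worth tuning against real message history.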













