A modular pipeline for searching and analyzing GitHub issues and providing recommendations for issue classification
This project provides an intelligent, multi-agent system for processing, analyzing, and managing GitHub issues at scale. It leverages LLMs, vector databases, and cloud-native Kubernetes infrastructure to automate search, triage, and enrichment of GitHub issues, supporting advanced workflows and integration using LangGraph agents.
The detailed implementation of the project can be found in the following blog.
- Multi-agent orchestration for issue processing
- Integration with GitHub, PostgreSQL, and Qdrant vector store
- Modular architecture for agents, guardrails, tools, and pipelines
- Infrastructure-as-code with AWS CDK and Kubernetes support
- GitHub Issues Multiagent Intelligence
├── LICENSE
├── Makefile
├── README.md
├── SETUP.md
├── alembic.ini
├── aws_cdk_infra
│ ├── README.md
│ ├── app.py
│ ├── aws_eks_rds
│ │ ├── __init__.py
│ │ ├── eks_stack.py
│ │ ├── rds_stack.py
│ │ └── vpc_stack.py
│ ├── requirements.txt
│ └── source.bat
├── docker
│ ├── dev.Dockerfile
│ ├── docker-compose.yml
│ └── prod.Dockerfile
├── env.example
├── kubernetes
│ ├── fastapi-deployment.yaml
│ ├── fastapi-service.yaml
│ ├── iam_policy.json
│ └── test-pod.yaml
├── langgraph.json
├── migrations
│ ├── README
│ ├── env.py
│ ├── script.py.mako
│ └── versions
│ └── 77e4d0a13aa8_create_comments_and_issues_table.py
├── pyproject.toml
├── scripts
│ └── lint-makefile.sh
├── src
│ ├── __init__.py
│ ├── agents
│ │ ├── __init__.py
│ │ ├── agents.py
│ │ ├── graph.py
│ │ └── graph_service.py
│ ├── api
│ │ ├── __init__.py
│ │ └── main.py
│ ├── config
│ │ ├── guardrails.yaml
│ │ └── repos.yaml
│ ├── data_pipeline
│ │ ├── __init__.py
│ │ ├── ingest_embeddings.py
│ │ └── ingest_raw_data.py
│ ├── database
│ │ ├── __init__.py
│ │ ├── drop_tables.py
│ │ ├── init_db.py
│ │ └── session.py
│ ├── models
│ │ ├── __init__.py
│ │ ├── agent_models.py
│ │ ├── api_model.py
│ │ ├── db_models.py
│ │ ├── github_models.py
│ │ ├── guardrails_models.py
│ │ └── repo_models.py
│ ├── utils
│ │ ├── __init__.py
│ │ ├── config.py
│ │ ├── error_handler.py
│ │ ├── guardrails.py
│ │ └── promps.py
│ └── vectorstore
│ ├── __init__.py
│ ├── create_collection.py
│ ├── create_index.py
│ ├── delete_collection.py
│ ├── payload_builder.py
│ ├── qdrant_store.py
│ └── qdrant_store_sync.py
├── tests
│ ├── integration
│ │ ├── test_api_process_issue.py
│ │ ├── test_full_graph_output_guardrails.py
│ │ └── test_query_search.py
│ └── unit
│ ├── test_db_ingest_qdrant.py
│ ├── test_input_guardrail_agent.py
│ ├── test_output_guardrail_agent.py
│ └── test_qdrant_collection.py
└── uv.lock
- Python 3.12+
- uv
- Docker & Docker Compose
- PostgreSQL
- Qdrant
- AWS CLI (for CDK)
- Node.js (for AWS CDK)
- Kubernetes CLI (kubectl)
- OpenAI API Key
- Guardrails AI API Key
- GitHub Token
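Before installing, it can help to confirm that the command-line prerequisites are available. The following is a minimal sketch using only the standard library. The executable names in `REQUIRED_TOOLS` are assumptions (the usual names for uv, Docker, the AWS CLI, Node.js, and kubectl), and the helper names `missing_tools` and `python_ok` are illustrative, not part of this project.

```python
import shutil
import sys

# Assumption: the usual executable names for the command-line tools listed above.
REQUIRED_TOOLS = ("uv", "docker", "aws", "node", "kubectl")


def missing_tools(tools=REQUIRED_TOOLS) -> list[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(version_info=sys.version_info) -> bool:
    """Return True when the interpreter is Python 3.12 or newer."""
    return tuple(version_info[:2]) >= (3, 12)
```

Calling `missing_tools()` returns the names that still need to be installed, and `python_ok()` reports whether the current interpreter meets the version requirement. API keys and tokens are not checked here.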
git clone https://github.com/benitomartin/github-issues-multiagent-intelligence.git
cd github-issues-multiagent-intelligence

uv sync --all-groups
source ./.venv/bin/activate

Create two environment files, one for development and one for production:

cp env.example .env.dev
cp env.example .env.prod

Development mode runs on localhost. Production mode uses AWS RDS as the database and AWS EKS with Fargate to run FastAPI.

Follow SETUP.md before running the commands below.
Start the database and supporting services, either in development or production mode:
make docker-build APP_ENV=dev

Access Adminer at http://localhost:8080.
Update the database schema:
alembic upgrade head

The repositories configuration file (src/config/repos.yaml) defines which repositories to pull issues from, how many issues to pull, and in which state (open, closed, or all).

- owner: scikit-learn
  repo: scikit-learn
  state: all
  per_page: 100
  max_pages: 1

The guardrails configuration file (src/config/guardrails.yaml) sets the thresholds for Guardrails agents such as jailbreak, toxicity, and secrets detection.
jailbreak:
  threshold: 0.8
  on_fail: "filter"
toxicity:
  threshold: 0.5
  validation_method: "full"
  on_fail: "filter"
secrets:
  on_fail: "filter"

Install the CDK dependencies in a separate virtual environment:

pip install -r aws_cdk_infra/requirements.txt

Deploy the following infrastructure:
- AWS EKS with Fargate and Load Balancer
- AWS RDS
- VPC
cd aws_cdk_infra
cdk bootstrap
cdk deploy

Run all tests (unit and integration):

make all-tests

Or run individual test suites.
The FastAPI server is defined in src/api/main.py.
Start the API server (example):
uvicorn src.api.main:app --reload

Example request body:

{
  "title": "Test Issue",
  "body": "Test Issue"
}

API docs are available at /docs while the server is running.
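The example request can also be sent from Python. The following is a minimal sketch using only the standard library. It assumes the server started with the uvicorn command is listening on uvicorn's default address, http://127.0.0.1:8000, that the endpoint path is /process-issue (as in the curl example in the deployment section), and that the response is a JSON object. The helper names `build_request` and `send_issue` are illustrative and not part of this project.

```python
import json
import urllib.request

# Assumption: uvicorn's default host and port; adjust to your deployment.
BASE_URL = "http://127.0.0.1:8000"


def build_request(title: str, body: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /process-issue endpoint."""
    payload = json.dumps({"title": title, "body": body}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/process-issue",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_issue(title: str, body: str, base_url: str = BASE_URL) -> dict:
    """Send the issue to a running API server and return the decoded JSON response."""
    request = build_request(title, body, base_url)
    # Assumption: the server answers with a JSON object.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `send_issue("Test Issue", "Test Issue")` posts the same payload as the example body. Pass a different `base_url` to target a deployed instance.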
Kubernetes manifests are in kubernetes. Once the CDK Stack has been deployed, the environment variables must be adapted. Sensitive information can be found under AWS Secrets Manager.
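Because a blank value is easy to miss when adapting the environment variables, a small check before creating the Kubernetes resources may be useful. The following is a sketch under stated assumptions: `REQUIRED_VARS` is an illustrative subset of the variable names used in the ConfigMap and Secret commands in this section, and `missing_vars` is a hypothetical helper, not part of this project.

```python
import os
from collections.abc import Mapping

# Illustrative subset of the variable names used in the ConfigMap and Secret
# commands in this section; extend it to match your deployment.
REQUIRED_VARS = (
    "APP_ENV",
    "AWS_REGION",
    "POSTGRES_DB",
    "POSTGRES_PORT",
    "POSTGRES_USER",
    "POSTGRES_HOST",
    "POSTGRES_PASSWORD",
    "QDRANT_URL",
    "QDRANT_API_KEY",
    "OPENAI_API_KEY",
)


def missing_vars(env: Mapping[str, str], required=REQUIRED_VARS) -> list[str]:
    """Return the required names that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_vars(os.environ)` after loading the production environment lists the names that still need a value.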
Update your cluster configuration and add a new namespace:
aws eks --region <aws-region> update-kubeconfig --name <cluster-name>
kubectl create namespace my-app

Add the environment variables to the Kubernetes cluster as a ConfigMap and a Secret:
kubectl create configmap app-config \
--from-literal=APP_ENV=prod \
--from-literal=AWS_REGION= \
--from-literal=POSTGRES_DB= \
--from-literal=POSTGRES_PORT= \
--from-literal=ADMINER_PORT= \
--from-literal=ISSUES_TABLE_NAME= \
--from-literal=COMMENTS_TABLE_NAME= \
--from-literal=DENSE_MODEL_NAME= \
--from-literal=SPARSE_MODEL_NAME= \
--from-literal=LEN_EMBEDDINGS= \
--from-literal=COLLECTION_NAME= \
--from-literal=CHUNK_SIZE= \
--from-literal=BATCH_SIZE= \
--from-literal=CONCURRENT_COMMENTS= \
--from-literal=LLM_MODEL_NAME= \
--from-literal=TEMPERATURE= \
--from-literal=REPOS_CONFIG=src/config/repos.yaml \
--from-literal=GUARDRAILS_CONFIG=src/config/guardrails.yaml \
-n my-app

kubectl create secret generic app-secrets \
--from-literal=GH_TOKEN= \
--from-literal=POSTGRES_USER= \
--from-literal=POSTGRES_HOST= \
--from-literal=POSTGRES_PASSWORD= \
--from-literal=QDRANT_API_KEY= \
--from-literal=QDRANT_URL= \
--from-literal=LANGSMITH_API_KEY= \
--from-literal=OPENAI_API_KEY= \
--from-literal=GUARDRAILS_API_KEY= \
--from-literal=SECRET_NAME= \
-n my-app

Build the production image and push it to AWS ECR:

aws ecr get-login-password --region <aws-region> | docker login --username AWS --password-stdin <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com
aws ecr create-repository --repository-name fastapi-app --region <aws-region>
docker tag myapp-prod-image:latest <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/fastapi-app:latest
docker push <aws-account-id>.dkr.ecr.<aws-region>.amazonaws.com/fastapi-app:latest

Then update the image name in the deployment manifest and apply it:

kubectl apply -f kubernetes/fastapi-deployment.yaml

Because the VPC is private, you cannot send requests to the service directly from your local machine. You can forward the port, create an EC2 instance in the same network, or add a load balancer to your Kubernetes cluster following these instructions.
Then you can apply the load balancer manifest:
kubectl apply -f kubernetes/fastapi-service.yaml

This exposes an external IP that can be used to make requests:
curl -X POST "http://k8s-myapp-fastapie-96d739e92d-4d28b27c27683b40.elb.eu-central-1.amazonaws.com/process-issue" \
-H "Content-Type: application/json" \
-d '{
"title": "Test Issue",
"body": "Test Issue"
}'

This project is licensed under the MIT License. See the LICENSE file.

