VigiLens is a comprehensive, AI-powered surveillance system designed to detect suspicious shoplifting behaviors for both real-time monitoring and post-event security investigations. The system leverages state-of-the-art pose estimation to analyze video feeds, flag anomalies, and present findings in a centralized, user-friendly web interface.
This project was developed as a Capstone for the Bachelor of Science in Computer Science program at Adamson University.
VigiLens combines cutting-edge AI technology with a modern web interface to provide intelligent surveillance capabilities. The system can simultaneously monitor multiple live camera feeds and analyze uploaded video footage, making it suitable for both real-time security monitoring and forensic analysis.
- Live Camera Feeds: Connect to and process multiple live camera feeds (webcams, IP cameras, RTSP streams) simultaneously
- Real-Time Pose Detection: Continuous human pose estimation using YOLOv11 for immediate anomaly detection
- Live Streaming Dashboard: View annotated live feeds directly in the web browser via MJPEG streams
- Multi-Camera Support: Monitor multiple cameras with dedicated AI workers for each feed
- Video Upload & Processing: Upload recorded surveillance footage for automated background analysis
- Asynchronous Processing: Non-blocking video analysis with real-time progress tracking
- Incident Logging: Automatic detection and logging of suspicious activities with timestamps
- Advanced Pose Estimation: Utilizes YOLOv11 model for accurate human pose detection and tracking
- Anomaly Detection Model: Custom transformer-based model for identifying suspicious behaviors
- Continuous Learning: System designed to improve detection accuracy over time
- Centralized Dashboard: Clean, modern React-based interface for all system operations
- Real-Time Statistics: Live overview of total incidents, active cameras, and system status
- Incident Management: Detailed incident logs with video evidence and thumbnails
- Camera Management: Dynamic addition/removal of camera feeds without system restart
- Video Evidence: View original and annotated video clips for each detected incident
- Scalable Multi-Process Design: Independent services ensure responsive UI during intensive processing
- Production-Ready: Uses Waitress WSGI server for robust performance
- Database Integration: SQLite database for reliable incident storage and retrieval
- Modular Structure: Clean separation of concerns with dedicated modules for different functionalities
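The incident store is a plain SQLite file, so it can be inspected with any SQLite client. As a minimal sketch of what incident logging looks like at the database level (the actual schema is defined by the backend's SQLAlchemy models, so the column names here are illustrative):

```python
import sqlite3

# Hypothetical incident table for illustration; the real schema lives in
# the backend's SQLAlchemy models.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE incidents (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        camera_id TEXT NOT NULL,
        detected_at TEXT NOT NULL,
        clip_path TEXT,
        thumbnail_path TEXT
    )
    """
)
conn.execute(
    "INSERT INTO incidents (camera_id, detected_at, clip_path) VALUES (?, ?, ?)",
    ("ENTRANCE-CAM", "2025-01-15T10:30:00", "processed_data/clip_001.mp4"),
)
conn.commit()
rows = conn.execute("SELECT camera_id, clip_path FROM incidents").fetchall()
print(rows)  # [('ENTRANCE-CAM', 'processed_data/clip_001.mp4')]
```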
VigiLens uses a sophisticated multi-process architecture designed for scalability and reliability:
Main Web Server (`app.py`)
- Technology: Flask with Waitress WSGI server
- Purpose: Serves the React frontend and provides REST API endpoints
- Features:
- Incident management and retrieval
- Video file serving for playback
- Database operations
- Static file serving for the compiled React app
- Port: 5000
Live Stream Server (`stream_server.py`)
- Technology: Lightweight Flask application
- Purpose: Handles real-time video streaming
- Features:
- Receives annotated frames from AI workers
- Serves live MJPEG streams to web browsers
- Manages frame buffers for multiple cameras
- Port: 8080
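MJPEG streaming works by pushing a sequence of JPEG images over a single `multipart/x-mixed-replace` HTTP response, which is why it plays in a plain `<img>` tag with no JavaScript. A minimal sketch of the framing, with stub bytes standing in for `cv2.imencode(".jpg", frame)` output:

```python
# Sketch of MJPEG framing over HTTP. Boundary name and headers follow the
# common multipart/x-mixed-replace convention; the stub bytes stand in for
# real JPEG data.
BOUNDARY = b"frame"

def mjpeg_part(jpeg_bytes: bytes) -> bytes:
    """Wrap one JPEG image in a multipart part."""
    return (
        b"--" + BOUNDARY + b"\r\n"
        b"Content-Type: image/jpeg\r\n"
        b"Content-Length: " + str(len(jpeg_bytes)).encode() + b"\r\n\r\n"
        + jpeg_bytes + b"\r\n"
    )

def mjpeg_stream(frames):
    """Generator suitable for a Flask Response with
    mimetype='multipart/x-mixed-replace; boundary=frame'."""
    for jpeg in frames:
        yield mjpeg_part(jpeg)

part = mjpeg_part(b"\xff\xd8fakejpeg\xff\xd9")
print(part.startswith(b"--frame\r\nContent-Type: image/jpeg"))  # True
```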
AI Worker Processes (`worker.py`)
- Technology: OpenCV + Ultralytics YOLOv11
- Purpose: Dedicated per-camera AI processing
- Features:
- Continuous video feed processing
- Real-time pose estimation
- Anomaly detection
- Frame annotation and streaming
- Incident clip generation
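At its core, the per-camera worker is a read–infer–publish loop. The skeleton below shows that control flow with the heavy pieces (capture, pose model, anomaly detector, stream publisher) injected as callables so it runs without OpenCV or YOLO installed; all function names are illustrative, not the actual `worker.py` API:

```python
# Illustrative skeleton of a per-camera worker loop. The real worker.py
# wires in cv2.VideoCapture and the Ultralytics YOLOv11 model instead of
# the injected stubs used here.
def run_worker(read_frame, estimate_pose, detect_anomaly, publish, max_frames=None):
    processed = 0
    incidents = 0
    while max_frames is None or processed < max_frames:
        frame = read_frame()
        if frame is None:          # feed ended or dropped
            break
        poses = estimate_pose(frame)
        if detect_anomaly(poses):
            incidents += 1         # real code would trigger clip saving here
        publish(frame, poses)      # push annotated frame to the stream server
        processed += 1
    return processed, incidents

# Stub run: three frames, anomaly flagged on the last one.
frames = iter(["f1", "f2", "f3", None])
stats = run_worker(
    read_frame=lambda: next(frames),
    estimate_pose=lambda f: {"frame": f},
    detect_anomaly=lambda p: p["frame"] == "f3",
    publish=lambda f, p: None,
)
print(stats)  # (3, 1)
```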
Clip Saver Process (`save_clip.py`)
- Technology: FFmpeg integration
- Purpose: Background video processing
- Features:
- Non-blocking incident recording
- Video compression and optimization
- Thumbnail generation
- Database logging
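Clip extraction with FFmpeg typically amounts to building a command line and running it in a background subprocess, which keeps the worker's capture loop non-blocking. A sketch of such a command (flags and paths are illustrative; `save_clip.py` may use different settings):

```python
# Sketch of building an FFmpeg clip-extraction command as a subprocess
# argument list. Encoder flags and paths are illustrative assumptions.
def build_clip_command(src, dst, start_sec, duration_sec):
    return [
        "ffmpeg",
        "-ss", str(start_sec),       # seek to the incident start
        "-i", src,
        "-t", str(duration_sec),     # clip length in seconds
        "-c:v", "libx264",           # re-encode for broad browser support
        "-preset", "veryfast",
        "-movflags", "+faststart",   # metadata up front for progressive playback
        "-y", dst,                   # overwrite output if it exists
    ]

cmd = build_clip_command("uploads/cam1.mp4", "processed_data/clip.mp4", 42, 10)
print(cmd[:5])
# A real call would then be: subprocess.run(cmd, check=True)
```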
```
Camera Feed → AI Worker → Pose Estimation → Anomaly Detection
     ↓            ↓              ↓                 ↓
Stream Server ← Annotated Frames       Database ← Incident Log
     ↓
Web Browser ← Live Stream
```
The `run_all.py` master script coordinates all services using Python multiprocessing:
- Automatic service startup and coordination
- Graceful shutdown handling
- Process isolation for stability
- Centralized logging and monitoring
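The orchestration pattern can be sketched with the standard library alone. Service names below are illustrative, and the sleep is a stand-in for each service's main loop:

```python
import multiprocessing as mp
import time

# Minimal sketch of the run_all.py pattern: one process per service,
# joined on normal exit and terminated on shutdown.
def service(name, run_for):
    time.sleep(run_for)   # stand-in for the service's main loop

def main():
    jobs = [
        mp.Process(target=service, args=("web_server", 0.1), name="web_server"),
        mp.Process(target=service, args=("stream_server", 0.1), name="stream_server"),
        mp.Process(target=service, args=("worker:CAM-1", 0.1), name="worker:CAM-1"),
    ]
    for p in jobs:
        p.start()
    try:
        for p in jobs:
            p.join()
    finally:
        for p in jobs:        # graceful shutdown: stop anything still alive
            if p.is_alive():
                p.terminate()
    return [p.exitcode for p in jobs]

if __name__ == "__main__":
    print(main())  # [0, 0, 0] once all services exit cleanly
```

Process isolation means a crash in one AI worker cannot take down the web server, at the cost of communicating between services over HTTP rather than shared memory.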
- Python 3.10+ - Core backend language
- Flask - Web framework for API and service endpoints
- Waitress - Production WSGI server for stable video streaming
- SQLAlchemy - Database ORM for incident management
- OpenCV - Computer vision and video processing
- Ultralytics YOLOv11 - State-of-the-art pose estimation model
- FFmpeg - Video encoding, compression, and clip generation
- NumPy - Numerical computing for AI operations
- Requests - HTTP client for inter-service communication
- React 19 - Modern UI framework
- Vite - Fast development server and build tool
- Axios - HTTP client for API communication
- React Router - Single-page application routing
- Material-UI - Component library for consistent design
- React Player - Video playback component
- React Icons - Icon library
- SQLite - Embedded database for incident logs
- Local File System - Video clips and thumbnails storage
- YOLOv11 - Real-time pose estimation
- Custom Transformer Model - Anomaly detection algorithm
- PyTorch - Deep learning framework
Before setting up VigiLens, ensure your system meets these requirements:
- Operating System: Windows 10/11, macOS 10.15+, or Linux (Ubuntu 18.04+)
- RAM: Minimum 8GB (16GB recommended for multiple cameras)
- Storage: At least 5GB free space (more for video storage)
- CPU: Multi-core processor (quad-core recommended)
- GPU: Optional but recommended for faster AI processing
- Download: python.org
- Verify installation: `python --version` (should display Python 3.10.x or higher)
- Download: nodejs.org (LTS version recommended)
- Verify installation: `node --version` and `npm --version`
- Windows: Download from ffmpeg.org and add to PATH
- macOS: Install via Homebrew: `brew install ffmpeg`
- Linux: Install via package manager: `sudo apt install ffmpeg`
- Verify installation: `ffmpeg -version` (should display version information without errors)
- Webcam: For live monitoring (built-in or USB)
- Network Access: For RTSP camera connections
- Camera Specifications: IP cameras should support standard RTSP protocols
```bash
git clone https://github.com/bear-hunter/Camonyou.git
cd Camonyou
```

Navigate to the backend directory and set up the Python environment:
```bash
cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Windows:
venv\Scripts\activate
# On macOS/Linux:
source venv/bin/activate

# Verify activation (should show (venv) in the terminal prompt)

# Install all required packages
pip install -r requirements.txt

# Verify key packages are installed
pip list | grep -E "(flask|opencv|ultralytics|torch)"
```

The system requires AI models for pose estimation and anomaly detection:
```bash
# Create models directory if it doesn't exist
mkdir -p models

# Place your trained models in the models directory:
# - yolov11s-pose.pt (YOLOv11 pose estimation model)
# - transformer_anomaly_detector.pt (Custom anomaly detection model)
# - shopformer_v2.pth (Transformer model weights)
# - gcae_tokenizer_v2.pth (Tokenizer for the model)
```

Note: The YOLOv11 model will be automatically downloaded on first run if not present. Custom models should be trained separately or obtained from the project maintainers.
```bash
# The database will be automatically created on first run
# Optional: Seed with sample data
python app.py seed
```

Open a new terminal and navigate to the frontend directory:
```bash
cd frontend

# Install all frontend dependencies
npm install

# Install additional required packages if not already included
npm install axios react-router-dom @mui/material @emotion/react @emotion/styled react-player

# Verify installation
npm list --depth=0
```

For development this step is optional, as Vite serves files directly. For production deployment:

```bash
npm run build
```

Configure the cameras you want to monitor by editing the configuration file:
```bash
# Navigate to backend directory
cd backend

# Edit the camera configuration
# Use your preferred text editor to modify cameras.json
```

Edit `backend/cameras.json` to define your camera sources:
```json
[
  {
    "id": "LAPTOP-WEBCAM",
    "rtsp_url": "0"
  },
  {
    "id": "OFFICE-CAMERA-1",
    "rtsp_url": "rtsp://username:[email protected]:554/stream1"
  },
  {
    "id": "PARKING-CAMERA",
    "rtsp_url": "rtsp://admin:[email protected]/live/main"
  }
]
```

| Source Type | Configuration | Example |
|---|---|---|
| Built-in Webcam | `"rtsp_url": "0"` | Laptop camera |
| USB Camera | `"rtsp_url": "1"` | External USB camera |
| IP Camera (RTSP) | `"rtsp_url": "rtsp://user:pass@ip:port/path"` | Network security camera |
| HTTP Stream | `"rtsp_url": "http://ip:port/stream"` | Web-based camera |
- Use descriptive, unique identifiers
- Avoid spaces and special characters
- Examples: `ENTRANCE-CAM`, `CASHIER-1`, `WAREHOUSE-NORTH`
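One detail worth noting: OpenCV expects an integer device index for local webcams but a string URL for network streams, which is why `"rtsp_url": "0"` selects the laptop camera. A sketch of how such a config might be interpreted (the backend's actual parsing may differ, and the URL below is a placeholder):

```python
import json

# Purely numeric source strings select a local device index for
# cv2.VideoCapture; anything else is treated as a stream URL.
CAMERAS_JSON = """
[
  {"id": "LAPTOP-WEBCAM", "rtsp_url": "0"},
  {"id": "OFFICE-CAMERA-1", "rtsp_url": "rtsp://user:pass@camera-host:554/stream1"}
]
"""

def resolve_source(rtsp_url: str):
    """Return an int device index, or the URL string unchanged."""
    return int(rtsp_url) if rtsp_url.isdigit() else rtsp_url

cameras = json.loads(CAMERAS_JSON)
sources = {c["id"]: resolve_source(c["rtsp_url"]) for c in cameras}
print(sources["LAPTOP-WEBCAM"])   # 0 (an int, not the string "0")
```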
Ensure your project structure matches this layout:
```
Camonyou/
├── backend/
│   ├── models/            # AI model files
│   ├── uploads/           # Uploaded video storage
│   ├── processed_data/    # Processed clips and thumbnails
│   ├── instance/          # Database files
│   ├── vigilens_core/     # Core application modules
│   ├── cameras.json       # Camera configuration
│   ├── requirements.txt   # Python dependencies
│   └── run_all.py         # Main startup script
├── frontend/
│   ├── src/               # React source code
│   ├── public/            # Static assets
│   ├── package.json       # Node.js dependencies
│   └── dist/              # Built frontend (created after build)
└── README.md
```
The easiest way to start VigiLens is using two terminals:
```bash
# Navigate to backend directory
cd backend

# Ensure virtual environment is active
source venv/bin/activate   # On macOS/Linux
# OR
venv\Scripts\activate      # On Windows

# Start all backend services
python run_all.py
```

This single command starts:
- ✅ Main Web Server (Port 5000)
- ✅ Live Stream Server (Port 8080)
- ✅ AI Workers (one per camera)
```bash
# Navigate to frontend directory (in a new terminal)
cd frontend

# Start the development server
npm run dev
```

Once both terminals show successful startup messages:
- Open your web browser
- Navigate to: `http://localhost:5173`
- You should see the VigiLens dashboard

- Main API: `http://localhost:5000/api/dashboard/stats`
- Live Stream: `http://localhost:8080/stream/CAMERA-ID` (replace with the actual camera ID)
- Development Server: `http://localhost:5173` (should display the React-based VigiLens interface)
For production deployment:

```bash
# Build frontend
cd frontend
npm run build

# Start backend only (serves built frontend)
cd ../backend
source venv/bin/activate
python run_all.py

# Access at: http://localhost:5000
```

- Total Incidents: View cumulative count of detected anomalies
- Active Cameras: Monitor currently connected camera feeds
- System Status: Real-time status of all services
- Top Cameras: Cameras with most incident detections
- Navigate to Camera View: Access live feeds from all configured cameras
- Real-Time Annotations: See pose estimation overlays in real-time
- Incident Alerts: Automatic notifications when anomalies are detected
- Multi-Camera Grid: Monitor multiple feeds simultaneously
- Upload Videos: Drag and drop or select video files for analysis
- Background Processing: Videos are processed asynchronously
- Progress Tracking: Monitor analysis progress in real-time
- Results Review: View detected incidents with timestamps
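On the client side, progress tracking reduces to polling a status endpoint until the job reports completion. A sketch with the HTTP call injected so the loop runs standalone (the response shape is an assumption, not the actual API contract):

```python
# Sketch of progress polling for an uploaded video. The status-fetching
# call is injected; a real client would issue it via axios or requests.
def poll_until_done(fetch_status, max_polls=100):
    """fetch_status() -> dict like {"progress": 0-100, "done": bool}."""
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("done"):
            return status
    raise TimeoutError("analysis did not finish")

# Stub server: reports 40% once, then completion.
responses = iter([{"progress": 40, "done": False}, {"progress": 100, "done": True}])
final = poll_until_done(lambda: next(responses))
print(final["progress"])  # 100
```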
- Incident List: Browse all detected anomalies chronologically
- Video Playback: Watch original and annotated video clips
- Thumbnail Preview: Quick visual reference for each incident
- Export Capabilities: Download video evidence for reporting
- Add Cameras: Configure new camera sources through the UI
- Remove Cameras: Deactivate camera feeds as needed
- Live Configuration: Changes take effect after service restart
- Camera Testing: Verify camera connectivity before deployment
Problem: `ModuleNotFoundError` when starting backend

```bash
# Solution: Ensure virtual environment is activated
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows

# Reinstall dependencies if needed
pip install -r requirements.txt
```

Problem: Camera connection fails
```bash
# Check camera configuration in cameras.json
# Verify RTSP URL format: rtsp://username:password@ip:port/path
# Test with VLC or a similar player first
```

Problem: FFmpeg errors during video processing
```bash
# Verify FFmpeg installation
ffmpeg -version

# Check that the PATH environment variable includes FFmpeg
# Reinstall FFmpeg if necessary
```

Problem: `npm run dev` fails to start
```bash
# Clear npm cache and reinstall
npm cache clean --force
rm -rf node_modules package-lock.json
npm install
```

Problem: Cannot connect to backend API
- Ensure backend is running on port 5000
- Check for CORS errors in browser console
- Verify firewall settings
Problem: High CPU usage with multiple cameras
- Solution: Reduce number of simultaneous cameras
- Alternative: Upgrade hardware or use GPU acceleration
Problem: Memory leaks during long-term operation
- Solution: Restart services periodically
- Monitor: Use system monitoring tools to track resource usage
```bash
# Run with verbose output
cd backend
python run_all.py

# Check individual service logs
python worker.py CAMERA-ID RTSP-URL
```

```bash
# Development server logs
npm run dev

# Browser console for JavaScript errors
# Open browser DevTools (F12) → Console tab
```

Insufficient Memory:
- Close unnecessary applications
- Consider reducing video resolution
- Limit number of concurrent cameras
Storage Space:
- Regularly clean the `processed_data/` folder
- Implement automatic cleanup policies
- Monitor disk usage
- RTSP Credentials: Use strong passwords for camera access
- Network Isolation: Consider separate VLAN for security cameras
- Firewall Rules: Limit access to necessary ports only
- Local Storage: All data remains on local system by default
- Access Control: Implement user authentication for production use
- Data Retention: Establish policies for video data lifecycle
- HTTPS: Enable SSL/TLS for production environments
- Authentication: Implement proper user management
- Backup: Regular backup of incident database and video files
- GPU Acceleration: Install CUDA for faster AI processing
- Storage: Use SSD for better video I/O performance
- Network: Ensure stable network for RTSP streams
- Model Selection: Use lighter models for lower-end hardware
- Frame Rate: Adjust processing frame rate based on requirements
- Resolution: Balance detection accuracy with performance
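Frame-rate throttling is the simplest of these levers: run the detector on every Nth frame only, and reuse the last result in between. A sketch (the stride value is an assumption):

```python
# Sketch of frame-rate throttling: process only every Nth frame to trade
# detection latency for CPU load.
def frames_to_process(total_frames, stride):
    """Indices of frames the detector would actually see."""
    return list(range(0, total_frames, stride))

# At 30 FPS input, a stride of 5 runs the model at roughly 6 FPS.
print(frames_to_process(30, 5))  # [0, 5, 10, 15, 20, 25]
```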
We welcome contributions to improve VigiLens! Please follow these guidelines:
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
- Follow PEP 8 for Python code
- Use ESLint for JavaScript code
- Add comments for complex logic
- Include unit tests where applicable
This project is developed as an academic capstone project. Please refer to the repository for licensing information.
For issues and questions:
- Check the troubleshooting section above
- Search existing GitHub issues
- Create a new issue with detailed information
- Include system specifications and error logs
- Adamson University - Computer Science Program
- Ultralytics - YOLOv11 implementation
- OpenCV Community - Computer vision tools
- React Team - Frontend framework
Developed by: Adamson University Computer Science Students
Project Type: Capstone Project
Academic Year: 2024-2025