A modular Python-based system for real-time human detection and face recognition using computer vision and deep learning. Supports multiple camera sources including webcams, USB cameras, IP cameras, and video files.
- Human Detection: Real-time full-body detection using YOLOv8 (with HOG fallback)
- Face Detection: Multi-face detection with MTCNN or Haar Cascades
- Face Recognition: Identity matching using FaceNet embeddings
- Flexible Camera Input: Webcam, USB, IP cameras (RTSP/HTTP), video files
- Modular Architecture: Easy to swap models and extend functionality
- GPU Acceleration: Optional CUDA support for improved performance
- Privacy-First: All processing happens locally, no cloud dependency
- Configurable: YAML-based configuration for all parameters
- OS: Windows, Linux
- Python: 3.10 or higher
- RAM: 4GB minimum (8GB+ recommended)
- GPU: Optional (NVIDIA with CUDA for acceleration)
See requirements.txt
cd f:\ai-camera\Ai-human-finder
python -m venv venv
# Windows
venv\Scripts\activate
# Linux/Mac
source venv/bin/activate
pip install -r requirements.txt

YOLOv8 models will download automatically on first run. For faster startup, you can pre-download:
python -c "from ultralytics import YOLO; YOLO('yolov8n.pt')"

Edit config.yaml to customize:
camera:
  source_type: 'webcam'     # 'webcam', 'usb', 'ip', 'video'
  source: 0                 # 0 for default webcam, path for video file
  resolution: [640, 480]    # or null for default
  fps_limit: 30
human_detection:
  enabled: true
  model: 'yolov8n'          # n=fastest, s/m/l/x=slower but more accurate
  confidence_threshold: 0.5
  use_gpu: true
face_detection:
  enabled: true
  model: 'mtcnn'            # 'mtcnn' or 'haar'
  confidence_threshold: 0.9
face_recognition:
  enabled: true
  model: 'facenet'
  recognition_threshold: 0.6  # Lower = stricter matching

Simply run the main application:
python main.py

Controls:
- Q - Quit application
- F - Toggle fullscreen
- Create a folder structure in data/known_faces/:
data/known_faces/
├── John_Doe/
│   ├── photo1.jpg
│   ├── photo2.jpg
│   └── photo3.jpg
├── Jane_Smith/
│   ├── photo1.jpg
│   └── photo2.jpg
└── Bob_Johnson/
    ├── photo1.jpg
    └── photo2.jpg
- Run enrollment:
python enroll_faces.py --mode directory

# Capture 5 samples for a person
python enroll_faces.py --mode camera --name "John_Doe" --samples 5

Follow on-screen instructions:
- Position face in frame
- Press SPACE to capture each sample
- Vary pose and expression between captures
- Press Q to finish early
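
For directory-mode enrollment, each subfolder of data/known_faces/ names one person and holds that person's photos. The sketch below shows one way such a layout could be scanned. The helper name `collect_known_faces` and the accepted file extensions are assumptions for illustration and are not taken from enroll_faces.py.

```python
from pathlib import Path

# Assumed set of accepted image extensions (hypothetical, adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_known_faces(root):
    """Map each person folder name to a sorted list of its image paths."""
    people = {}
    for person_dir in sorted(Path(root).iterdir()):
        if not person_dir.is_dir():
            continue  # ignore stray files next to the person folders
        images = sorted(
            p for p in person_dir.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
        if images:  # skip folders that contain no usable photos
            people[person_dir.name] = images
    return people
```

Calling `collect_known_faces("data/known_faces")` on the layout above would return a dictionary keyed by `John_Doe`, `Jane_Smith`, and `Bob_Johnson`.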
After enrolling faces:
python main.py

The system will now display names above recognized faces!
Ai-human-finder/
├── main.py                 # Main application
├── enroll_faces.py         # Face enrollment tool
├── config.yaml             # Configuration file
├── requirements.txt        # Python dependencies
├── README.md               # This file
│
├── src/                    # Source modules
│   ├── camera.py           # Camera input handler
│   ├── human_detector.py   # Human detection
│   ├── face_detector.py    # Face detection
│   └── face_recognizer.py  # Face recognition
│
├── data/                   # Data directory
│   ├── known_faces/        # Known face images
│   │   └── PersonName/     # One folder per person
│   └── embeddings.pkl      # Trained embeddings
│
└── logs/                   # Application logs
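
The tree lists data/embeddings.pkl as the store for enrolled embeddings. The actual file layout is not documented here, so the following is only a sketch that assumes a dictionary mapping each person's name to a list of embedding vectors. `save_embeddings` and `load_embeddings` are hypothetical helpers, not functions from this project.

```python
import pickle
from pathlib import Path


def save_embeddings(path, embeddings):
    """Write a {name: [embedding, ...]} dictionary to disk (assumed format)."""
    with open(path, "wb") as f:
        pickle.dump(embeddings, f)


def load_embeddings(path):
    """Return the stored dictionary, or an empty one if nothing is enrolled yet."""
    path = Path(path)
    if not path.exists():
        return {}
    with open(path, "rb") as f:
        return pickle.load(f)
```

Pickle files can execute code when loaded, so only load an embeddings file that was created on your own machine.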
USB camera:
camera:
  source_type: 'usb'
  source: 1  # Camera index

IP camera:
camera:
  source_type: 'ip'
  source: 'rtsp://username:password@<camera-ip>:554/stream'

Video file:
camera:
  source_type: 'video'
  source: 'path/to/video.mp4'

Human detection models:
- yolov8n - Fastest, lowest accuracy (recommended for CPU)
- yolov8s - Small, balanced
- yolov8m - Medium, good accuracy
- yolov8l - Large, high accuracy
- yolov8x - Extra large, best accuracy (GPU recommended)

Face detection models:
- mtcnn - More accurate, slower (recommended)
- haar - Faster, less accurate (CPU-friendly fallback)
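
The `source_type` and `source` settings of the `camera` section decide what gets opened as a capture device. As a rough sketch, a loader could turn them into the value handed to OpenCV's `cv2.VideoCapture`, which accepts either an integer device index or a file path/URL string. `resolve_capture_source` is a hypothetical helper, and the real logic in src/camera.py may differ.

```python
def resolve_capture_source(camera_cfg):
    """Return an int device index or a path/URL string for the given camera config."""
    source_type = camera_cfg.get("source_type", "webcam")
    source = camera_cfg.get("source", 0)

    if source_type in ("webcam", "usb"):
        # Local cameras are addressed by index: 0 is usually the default device.
        return int(source)

    if source_type in ("ip", "video"):
        # Streams and files are addressed by an RTSP/HTTP URL or a file path.
        if not isinstance(source, str) or not source:
            raise ValueError(f"{source_type!r} source must be a non-empty string")
        return source

    raise ValueError(f"unknown source_type: {source_type!r}")
```

For example, `resolve_capture_source({"source_type": "usb", "source": 1})` returns `1`, which could then be passed to `cv2.VideoCapture`.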
In config.yaml:
face_recognition:
  recognition_threshold: 0.6  # 0.4 = strict, 0.8 = lenient

- Lower values = stricter matching (fewer false positives)
- Higher values = looser matching (more false positives)
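
To make the effect of the threshold concrete, here is a minimal sketch of threshold-based matching. It assumes the distance is cosine distance (1 minus cosine similarity) and that one reference embedding is stored per person. Both are assumptions for illustration, and `cosine_distance` and `identify` are hypothetical names rather than functions from face_recognizer.py.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def identify(embedding, known, threshold=0.6):
    """Return (name, distance) of the closest known person, or ("Unknown", distance)."""
    best_name, best_dist = None, float("inf")
    for name, reference in known.items():
        dist = cosine_distance(embedding, reference)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist < threshold:
        return best_name, best_dist
    return "Unknown", best_dist
```

With this rule, a probe whose closest distance is about 0.29 is accepted at a threshold of 0.4 but rejected at 0.2, which is why lowering the value makes matching stricter. The vectors here can be any length; the FaceNet embeddings described in this README are 512-dimensional.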
For speed (CPU only):
human_detection:
  model: 'yolov8n'
  use_gpu: false
face_detection:
  model: 'haar'
camera:
  resolution: [640, 480]
  fps_limit: 15

For accuracy (GPU):
human_detection:
  model: 'yolov8m'  # or yolov8l
  use_gpu: true
face_detection:
  model: 'mtcnn'
camera:
  resolution: [1280, 720]
  fps_limit: 30

- Check camera index (try 0, 1, 2)
- Ensure camera isn't used by another application
- On Linux, check permissions:
sudo usermod -a -G video $USER
- Reduce camera resolution
- Use smaller model (yolov8n)
- Enable GPU acceleration
- Increase skip_frames in config
- Ensure embeddings.pkl exists (run enroll_faces.py)
- Add more training samples per person (5-10 recommended)
- Adjust recognition_threshold
- Ensure good lighting during enrollment and runtime
pip install ultralytics mtcnn facenet-pytorch

- Use smaller models
- Reduce camera resolution
- Process fewer frames (increase skip_frames)
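
Several of the tips above suggest increasing skip_frames. A minimal sketch of the idea, assuming skip_frames means the number of frames skipped between two detector runs, with the last detections reused in between. `detections_with_skipping` and its `detect` argument are hypothetical and stand in for whatever the application actually does.

```python
def detections_with_skipping(frames, detect, skip_frames=2):
    """Yield (frame, detections), running `detect` only on every (skip_frames + 1)-th frame."""
    if skip_frames < 0:
        raise ValueError("skip_frames must be >= 0")
    last_detections = []
    for index, frame in enumerate(frames):
        if index % (skip_frames + 1) == 0:
            last_detections = detect(frame)  # expensive model call
        # Skipped frames reuse the most recent detections.
        yield frame, last_detections
```

With `skip_frames=2` the detector runs on frames 0, 3, 6, and so on, cutting model calls to a third at the cost of slightly stale boxes on the frames in between.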
Test System: Intel i7-10700K, RTX 3070, 640x480 resolution
| Configuration | FPS |
|---|---|
| YOLOv8n + Haar (CPU) | 12-15 |
| YOLOv8n + MTCNN (CPU) | 8-10 |
| YOLOv8n + MTCNN (GPU) | 28-32 |
| YOLOv8m + MTCNN (GPU) | 22-25 |
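
To compare your own hardware against the table, frame rate can be estimated with a small rolling counter. This is only a sketch of one way to measure FPS and is not necessarily how the figures above were obtained. `FpsMeter` is a hypothetical helper.

```python
import time
from collections import deque


class FpsMeter:
    """Rolling FPS estimate over the last `window` frame timestamps."""

    def __init__(self, window=30, clock=time.perf_counter):
        self._stamps = deque(maxlen=window)
        self._clock = clock  # injectable so the meter can be tested without waiting

    def tick(self):
        """Record one processed frame and return the current FPS estimate."""
        self._stamps.append(self._clock())
        if len(self._stamps) < 2:
            return 0.0
        elapsed = self._stamps[-1] - self._stamps[0]
        return (len(self._stamps) - 1) / elapsed if elapsed > 0 else 0.0
```

Calling `tick()` once per processed frame in the main loop gives a smoothed value that can be drawn on screen or written to the logs.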
- Multi-camera support
- Object tracking (ID persistence)
- Emotion detection
- Mask detection
- Activity recognition
- Web interface
- Edge deployment (Raspberry Pi, Jetson Nano)
- Video recording with detections
- Database integration
- RESTful API
Human Detection: YOLOv8 (You Only Look Once)
- Trained on COCO dataset
- Real-time object detection
- Detects 'person' class
Face Detection: MTCNN (Multi-task Cascaded CNN)
- Three-stage cascade: P-Net, R-Net, O-Net
- Fast and accurate face localization
Face Recognition: FaceNet
- 512-dimensional embeddings
- Cosine similarity for matching
- Pre-trained on VGGFace2 dataset
- Capture frame from camera
- Detect humans (YOLOv8) and faces (MTCNN)
- Extract face crops from detections
- Compute embeddings using FaceNet
- Compare with known embeddings using cosine similarity
- Identify the person when the distance to the closest known embedding is below recognition_threshold
- Render results with bounding boxes and labels
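
The steps above can be sketched as a single per-frame function. The detector, embedder, and matcher are passed in as plain callables so the flow is visible without loading any model. All names here are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` pixel coordinates, and the real main.py may be organized differently.

```python
def process_frame(frame, detect_humans, detect_faces, embed, identify):
    """Run one frame through detection, cropping, embedding, and matching."""
    humans = detect_humans(frame)  # list of person boxes
    faces = []
    for (x1, y1, x2, y2) in detect_faces(frame):
        # Crop the face region; rows are indexed by y, columns by x.
        crop = [row[x1:x2] for row in frame[y1:y2]]
        name = identify(embed(crop))
        faces.append(((x1, y1, x2, y2), name))
    # The caller would draw these boxes and labels on the frame.
    return {"humans": humans, "faces": faces}
```

The frame is treated as a list of pixel rows here to keep the sketch dependency-free; with a NumPy image from OpenCV the crop would more commonly be written as `frame[y1:y2, x1:x2]`.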
Contributions are welcome! Areas for improvement:
- Additional detection models
- Performance optimizations
- New features from roadmap
- Documentation improvements
- Bug fixes
This project is for educational and research purposes.
Model Licenses:
- YOLOv8: AGPL-3.0
- MTCNN: MIT
- FaceNet (facenet-pytorch): MIT
This system uses face recognition technology. Please use responsibly:
- Obtain consent before capturing faces
- Comply with privacy laws (GDPR, CCPA, etc.)
- Secure face data appropriately
- Do not use for surveillance without authorization
- Be aware of algorithmic bias in face recognition
- Consider ethical implications of your use case
For issues, questions, or suggestions:
- Check Troubleshooting section
- Review config.yaml documentation
- Examine logs in the logs/ directory
Built with:
- OpenCV - Computer vision library
- PyTorch - Deep learning framework
- Ultralytics YOLOv8 - Object detection
- MTCNN - Face detection
- FaceNet-PyTorch - Face recognition
Made for learning and experimentation
Remember: With great power comes great responsibility. Use AI ethically!