A Python application that captures webcam or video frames and sends them to the LM Studio API for AI-powered image analysis and description.
Features Real-time Webcam Analysis: Capture and analyze webcam feed in real-time
Video File Support: Process pre-recorded video files Customizable Prompts: Choose from multiple analysis levels (basic, detailed, maximum) or create your own Configurable API Settings: Adjust API URL, timeout, FPS, and query intervals Results Tracking: Save analysis results with timestamps to CSV Conversation History: Maintains context with the AI model
Setup & Usage Prerequisites:
Python 3.7+ LM Studio running locally with API enabled (default: http://127.0.0.1:1234) Webcam (for live analysis)
Installation: bash pip install -r requirements.txt Running the Application:
bash python main.py Configuration:
Adjust settings in the Configuration tab Set your desired API endpoint, FPS, and analysis interval Choose or customize analysis prompts Using the Application: Start webcam analysis or select a video file View real-time analysis results in the response window Monitor performance with FPS counter Review all results in the Results tab Key Configuration Options API URL: Endpoint for LM Studio API API Timeout: Request timeout in seconds FPS: Frames per second for capture Analysis Interval: How often to send frames to API CSV File: Path to save results Prompt Selection: Choose analysis detail level
The application automatically saves your configuration and results, making it easy to continue previous sessions.
This application requires a multimodal vision-language model to analyze images. Standard text-only models will not work, as they cannot process visual data. You must load a vision-capable model in LM Studio for the application to function properly. Look for models specifically tagged as "multimodal" or "vision" in the LM Studio interface. Popular choices include models from the Llava, BakLLaVA, and Moondream families.
https://www.youtube.com/watch?v=wJUy5waWZTs (V2.0) https://www.youtube.com/watch?v=o0yITTFbbRE (V1.0)