This is the backend API for the SkoolMe course generation system. It provides file upload, analysis, and content processing capabilities.
- File Upload: Support for documents (.txt, .pdf, .docx, .png, .jpg, .jpeg, .bmp) and audio files (.mp3, .wav, .m4a)
- Document Processing: OCR and text extraction from various document formats
- Audio Processing: Speech-to-text transcription for audio files
- Progress Tracking: Real-time progress updates for file analysis
- File Size Validation: 100MB limit for documents, 50MB for audio files
- Google Cloud Integration: Uses Google Cloud Speech-to-Text and Vision APIs
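As a rough illustration, the size and type checks listed above could look like the following sketch (constant and function names here are illustrative, not the backend's actual code):

```python
import os

# Extension sets and limits as described in the feature list above.
ALLOWED_DOCS = {".txt", ".pdf", ".docx", ".png", ".jpg", ".jpeg", ".bmp"}
ALLOWED_AUDIO = {".mp3", ".wav", ".m4a"}
MAX_DOC_BYTES = 100 * 1024 * 1024   # 100 MB limit for documents
MAX_AUDIO_BYTES = 50 * 1024 * 1024  # 50 MB limit for audio

def classify_upload(filename: str, size: int) -> str:
    """Return "document" or "audio", or raise ValueError for a bad upload."""
    ext = os.path.splitext(filename.lower())[1]
    if ext in ALLOWED_DOCS:
        kind, limit = "document", MAX_DOC_BYTES
    elif ext in ALLOWED_AUDIO:
        kind, limit = "audio", MAX_AUDIO_BYTES
    else:
        raise ValueError(f"unsupported file type: {ext or filename}")
    if size > limit:
        raise ValueError(f"{kind} exceeds the {limit // (1024 * 1024)} MB limit")
    return kind
```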
```bash
cd backend
pip install -r requirements.txt
```

Windows:
- Install Tesseract OCR
- Add Tesseract to your system PATH
- Install Poppler for PDF processing
macOS:

```bash
brew install tesseract poppler
```

Linux (Ubuntu/Debian):

```bash
sudo apt-get install tesseract-ocr poppler-utils
```

- Create a Google Cloud project
- Enable the following APIs:
  - Speech-to-Text API
  - Cloud Vision API
  - Cloud Storage API
- Create a service account and download the JSON key file
- Place the key file as `skoolme-ocr-b933da63cd81.json` in the backend directory
```bash
gsutil mb gs://skoolme-audio-transcripts
```

For Development:

```bash
python run_server.py
```

For Production:
```bash
gunicorn --bind 0.0.0.0:5000 wsgi:app
```

### GET /api/health

Returns server status and the number of active sessions.
### POST /api/upload

Upload files for analysis. Returns a session ID.

Request: multipart form data with a `files` field.

Response:
```json
{
  "session_id": "uuid",
  "files": [
    {
      "filename": "document.pdf",
      "original_name": "document.pdf",
      "file_type": "document",
      "size": 1024000
    }
  ],
  "message": "Successfully uploaded 1 files"
}
```

### POST /api/analyze

Start analysis of uploaded files.
Request:

```json
{
  "session_id": "uuid"
}
```

Response:
```json
{
  "session_id": "uuid",
  "message": "Analysis started",
  "status": "processing"
}
```

### GET /api/progress/{session_id}

Get real-time analysis progress.
Response:

```json
{
  "status": "processing|completed|error",
  "progress": 85,
  "message": "Processing file 3 of 5...",
  "results": [...],
  "overall_score": 82.5,
  "generated_title": "Introduction to Physics",
  "error": null
}
```

### DELETE /api/cleanup/{session_id}

Clean up session files and data.
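The endpoints above compose into a simple client workflow: upload, start analysis, then poll progress until it finishes. Here is a hedged sketch of the polling step, with the HTTP call injected as a callable so the loop itself has no dependency on any particular HTTP library:

```python
import time
from typing import Callable, Dict

def poll_progress(fetch: Callable[[], Dict],
                  interval: float = 1.0,
                  timeout: float = 300.0) -> Dict:
    """Repeatedly fetch the GET /api/progress/{session_id} payload until
    its status leaves "processing", then return the final payload."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        payload = fetch()
        if payload.get("status") in ("completed", "error"):
            return payload
        time.sleep(interval)
    raise TimeoutError("analysis did not finish before the timeout")
```

With `requests` installed, `fetch` could be as simple as `lambda: requests.get(f"{base_url}/api/progress/{session_id}").json()`.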
- PDF: Text extraction with OCR fallback
- DOCX: Native text extraction
- TXT: Direct text reading with encoding detection
- Images: OCR using Google Vision API or Tesseract
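For the `.txt` path, the "encoding detection" step might look like the following sketch; the real backend may instead use a detector library such as chardet, and the encoding list here is an assumption:

```python
def read_text_with_fallback(data: bytes) -> str:
    """Decode raw .txt bytes, trying common encodings in order and
    falling back to replacement characters as a last resort."""
    for enc in ("utf-8", "utf-16", "latin-1"):
        try:
            return data.decode(enc)
        except (UnicodeDecodeError, UnicodeError):
            continue
    return data.decode("utf-8", errors="replace")
```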
- Format Support: MP3, WAV, M4A
- Conversion: Auto-converts to 16kHz mono WAV
- Transcription: Google Cloud Speech-to-Text with timestamps
- Speaker Diarization: Identifies multiple speakers
The system calculates extraction scores based on:
- Content Length (50%): Amount of text extracted
- Word Diversity (30%): Unique words vs total words
- Structure (20%): Presence of sentences and paragraphs
- 80-100%: 🟢 Green - Excellent extraction
- 30-79%: 🟡 Yellow - Good extraction with some issues
- 0-29%: 🔴 Red - Poor extraction, unusable content
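Putting the weights and color bands together, a hypothetical reimplementation could look like this (the 1,000-word saturation point for the length component is an assumption, not taken from the source):

```python
def extraction_score(text: str) -> float:
    """Score extracted text 0-100 using the 50/30/20 weighting above."""
    words = text.split()
    # Content length (50%): saturates at an assumed 1,000 words.
    length_part = min(len(words) / 1000, 1.0)
    # Word diversity (30%): unique words vs. total words.
    diversity_part = len({w.lower() for w in words}) / len(words) if words else 0.0
    # Structure (20%): presence of sentence terminators and paragraph breaks.
    structure_part = (0.5 if any(c in text for c in ".!?") else 0.0) \
                   + (0.5 if "\n\n" in text else 0.0)
    return 100 * (0.5 * length_part + 0.3 * diversity_part + 0.2 * structure_part)

def score_band(score: float) -> str:
    """Map a score to the color bands listed above."""
    if score >= 80:
        return "green"
    if score >= 30:
        return "yellow"
    return "red"
```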
The API provides comprehensive error handling:
- File size validation
- File type validation
- Google Cloud API errors
- Processing timeouts
- Storage errors
```bash
GOOGLE_APPLICATION_CREDENTIALS=path/to/credentials.json
FLASK_ENV=production
```

```dockerfile
FROM python:3.9-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

# Install system dependencies
RUN apt-get update && apt-get install -y \
    tesseract-ocr \
    poppler-utils \
    && rm -rf /var/lib/apt/lists/*

COPY . .

EXPOSE 5000

CMD ["gunicorn", "--bind", "0.0.0.0:5000", "wsgi:app"]
```

- Use Redis for session storage in production
- Implement file cleanup jobs
- Set up monitoring and logging
- Use cloud storage for file uploads
- Implement rate limiting
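For the rate-limiting item, a minimal token-bucket sketch is shown below; all names and parameters are illustrative, and production deployments would more often reach for existing middleware such as Flask-Limiter:

```python
import time

class TokenBucket:
    """Token-bucket rate limiter: `capacity` requests may burst at once,
    then tokens refill at `rate` per second."""

    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate              # tokens added per second
        self.capacity = capacity     # maximum burst size
        self.tokens = float(capacity)
        self.now = now               # injectable clock, eases testing
        self.last = now()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request passes."""
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a Flask app, one bucket per client IP checked in a `before_request` hook is a common arrangement.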
```bash
# Test file upload
curl -X POST -F "files=@test.pdf" http://localhost:5000/api/upload

# Test health check
curl http://localhost:5000/api/health
```

Set `FLASK_ENV=development` for detailed error messages and auto-reload.
- Documents: Usually < 30 seconds
- Audio files: 1-2 minutes per minute of audio
- Large files: May take longer, progress is tracked
- File validation prevents malicious uploads
- Temporary files are cleaned up automatically
- Google Cloud credentials should be kept secure
- Implement authentication for production use
Import Errors:
- Ensure all dependencies are installed: `pip install -r requirements.txt`
- Check system dependencies (Tesseract, Poppler)
Google Cloud Errors:
- Verify credentials file exists and is valid
- Check API permissions and billing
- Ensure storage bucket exists
File Processing Errors:
- Check file format compatibility
- Verify file size limits
- Ensure sufficient disk space
Check the console output for detailed error messages and processing status.