A production-ready web application that uses AI and machine learning to automatically screen and rank CVs/resumes against job descriptions. Built with a React frontend and a Flask backend.
- High Performance: Processes ~890 resumes per second (8.9x faster than baseline)
- Smart Matching: Uses TF-IDF and cosine similarity for intelligent resume ranking
- Skill Analysis: Automatically extracts and matches technical skills
- Real-time Progress: Live progress tracking with WebSocket-like polling
- Responsive Design: Works seamlessly on desktop, tablet, and mobile
- Production Ready: Environment-based configuration, CORS support, error handling
```
SmartHire_2.0/
├── frontend/                 # React + Vite frontend
│   ├── src/
│   │   ├── App.jsx           # Main application component
│   │   ├── App.css           # Styles and responsive design
│   │   └── main.jsx
│   ├── package.json
│   └── vite.config.js
│
├── backend/                  # Flask backend API
│   ├── src/
│   │   ├── app.py            # Main Flask application
│   │   ├── database.py       # Database initialization
│   │   └── skills_master.py  # Skills database
│   ├── requirements.txt
│   └── test_performance.py
│
└── docs/                     # Documentation
```
- Python 3.8+ (for backend)
- Node.js 16+ (for frontend)
- pip (Python package manager)
- npm (Node package manager)
- Navigate to the backend directory:

  ```bash
  cd backend
  ```

- Create a virtual environment (recommended):

  ```bash
  python -m venv .venv
  source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download the spaCy language model:

  ```bash
  python -m spacy download en_core_web_sm
  ```

- Configure the environment (optional):

  ```bash
  cp .env.example .env  # Edit .env with your settings
  ```

- Run the backend server:

  ```bash
  cd src
  python app.py
  ```

  The backend will start at http://localhost:5000.
- Navigate to the frontend directory:

  ```bash
  cd frontend
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Configure the environment (optional):

  ```bash
  cp .env.example .env  # Edit .env with your API URL
  ```

- Run the development server:

  ```bash
  npm run dev
  ```

  The frontend will start at http://localhost:5173.
- Collect all CVs/resumes (PDF, DOCX, or TXT format)
- Organize them in a folder structure
- Create a ZIP archive of the folder
- Open the application in your browser (http://localhost:5173)
- Enter the Job Description (include required skills and responsibilities)
- Add Must-Have Skills (comma-separated, optional but recommended)
- Upload the ZIP file containing CVs
- Click Start Screening
- Monitor real-time progress as resumes are analyzed
- View the Top 5 Candidates ranked by match score
- Review matched skills and missing must-haves for each candidate
- Start a new screening job or export results
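The real-time progress step above uses the backend's WebSocket-like polling rather than a true socket connection. A minimal client-side sketch of that loop, assuming a `fetch_status` callable that stands in for the HTTP call to the backend's progress endpoint (the callable and its return shape are illustrative, not the project's actual client code):

```python
import time

def poll_until_done(fetch_status, interval=1.0, max_polls=600):
    """Poll a status source until every resume has been processed.

    fetch_status() stands in for an HTTP GET against the backend's
    progress endpoint; it is assumed to return a dict like
    {"status": "Processing", "processed": 75, "total": 150, "percentage": 50.0}.
    """
    for _ in range(max_polls):
        status = fetch_status()
        # Done when every discovered CV has been processed.
        if status.get("processed", 0) >= status.get("total", float("inf")):
            return status
        time.sleep(interval)
    raise TimeoutError("screening job did not finish within the polling budget")
```

The frontend performs the same loop in JavaScript on a timer, which is what makes the progress bar feel live without holding a socket open.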
SmartHire uses a sophisticated multi-factor scoring system:
- TF-IDF Cosine Similarity (50%): Measures overall text similarity between job description and resume
- Skill Matching (50%): Analyzes technical skills from a database of 200+ skills
- Must-Have Penalty: Significantly reduces score for missing critical skills
- Bonus Points: Extra points for high-demand skills (React, Full Stack, etc.)
Final score: 0-100 scale
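A hedged sketch of how such a scorer can be assembled with scikit-learn. The 50/50 weighting follows the list above; the penalty size (15 points per missing must-have), bonus size (2 points), skill handling, and function name are illustrative assumptions, not the project's actual constants:

```python
# Sketch of SmartHire-style multi-factor scoring (assumed constants noted below).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_resume(job_desc, resume_text, skills_db, must_haves,
                 bonus_skills=("react", "full stack")):
    # TF-IDF cosine similarity between job description and resume (50%).
    matrix = TfidfVectorizer(stop_words="english").fit_transform(
        [job_desc, resume_text])
    similarity = cosine_similarity(matrix[0:1], matrix[1:2])[0][0]

    # Skill matching against the skills database (50%).
    resume_lower = resume_text.lower()
    found = sorted(s for s in skills_db if s.lower() in resume_lower)
    skill_ratio = len(found) / len(skills_db) if skills_db else 0.0

    score = 50 * similarity + 50 * skill_ratio

    # Must-have penalty: significantly reduce the score per missing skill
    # (15 points here is an assumed magnitude).
    missing = [s for s in must_haves if s.lower() not in resume_lower]
    score -= 15 * len(missing)

    # Bonus points for high-demand skills (2 points here is assumed).
    score += 2 * sum(1 for b in bonus_skills if b in resume_lower)

    # Clamp to the README's 0-100 scale.
    return max(0.0, min(100.0, round(score, 1))), found, missing
```

In production the job description's TF-IDF vector would be computed once and reused across all resumes, as noted in the optimizations below.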
- Regex Pattern Caching: 85x faster pattern matching
- Job Description Pre-processing: Compute once, reuse for all resumes (5.2x faster)
- Batch Database Operations: 90% reduction in I/O operations
- Optimized Text Extraction: Efficient PDF and DOCX parsing
- No Unused Dependencies: Removed 100MB+ unused spaCy model
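The regex pattern caching item can be illustrated with `functools.lru_cache`, so each skill's pattern is compiled once and then reused across every resume (helper names here are hypothetical, not the project's actual code):

```python
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def skill_pattern(skill):
    # Compiled once per skill; later calls for the same skill hit the cache
    # instead of recompiling. Word-boundary anchors suit alphanumeric names.
    return re.compile(r"\b" + re.escape(skill) + r"\b", re.IGNORECASE)

def find_skills(text, skills):
    # Scan one resume's text for every skill in the database.
    return [s for s in skills if skill_pattern(s).search(text)]
```

With ~200 skills checked against every resume, caching turns thousands of `re.compile` calls into a few hundred, which is where a speedup of this order typically comes from.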
See PERFORMANCE_OPTIMIZATIONS.md for details.
Create a .env file in the backend/ directory:

```bash
FLASK_ENV=development
FLASK_DEBUG=True
FLASK_HOST=0.0.0.0
FLASK_PORT=5000
FRONTEND_URL=http://localhost:5173
DB_PATH=smarthire.db
UPLOAD_FOLDER=uploads
MAX_CONTENT_LENGTH=524288000  # 500MB
```

Create a .env file in the frontend/ directory:
```bash
VITE_API_URL=http://localhost:5000
```

```bash
# Install dependencies
pip install -r requirements.txt
pip install gunicorn

# Run with Gunicorn (production server)
cd backend/src
gunicorn -w 4 -b 0.0.0.0:5000 app:app
```

```dockerfile
# Backend Dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY backend/requirements.txt .
RUN pip install -r requirements.txt
RUN python -m spacy download en_core_web_sm
COPY backend/src/ .
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:5000", "app:app"]
```

- Create a `Procfile` in `backend/`:

  ```
  web: cd src && gunicorn app:app
  ```

- Deploy using the platform CLI or Git integration
```bash
# Build for production
npm run build

# Deploy the 'dist' folder to your hosting provider
```

```bash
# Build
npm run build

# Serve with any static file server (nginx, Apache, etc.)
```

Upload a ZIP file with CVs and start a screening job.
Request: `Content-Type: multipart/form-data`

- `zip_file`: ZIP archive containing CVs
- `description`: Job description text
- `must_haves`: Comma-separated must-have skills

Response:

```json
{
  "message": "Started processing ZIP file",
  "job_id": 1,
  "total_cvs_found": 150
}
```

Get the processing status of a job.
Response:

```json
{
  "status": "Processing",
  "processed": 75,
  "total": 150,
  "percentage": 50.0
}
```

Get the top 5 candidates for a job.
Response:

```json
{
  "status": "Completed",
  "progress": "150/150",
  "top_5": [
    {
      "id": 1,
      "filename": "john_doe.pdf",
      "score": 87.5,
      "found_skills": "[\"Python\", \"React\", \"AWS\"]",
      "missing_skills": "[]"
    }
  ]
}
```

```bash
cd backend
python test_performance.py
```

Expected output:
- ✅ Regex caching working correctly (85x faster)
- ✅ Job description preprocessing working correctly (5.2x faster)
- ✅ Text extraction working correctly
- ✅ Database context manager working correctly
- ✅ Benchmark completed (890 resumes/second)
```bash
cd frontend
npm run lint
```

This is a thesis/academic project. For collaboration:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
SmartHire 2.0 - Academic Project / Thesis Work
- TF-IDF and Cosine Similarity algorithms from scikit-learn
- Flask web framework
- React and Vite for modern frontend development
- Performance optimization techniques from Python best practices
For questions, issues, or thesis inquiries:
- Open an issue on GitHub
- Review the existing documentation in `/docs`
- Check the performance guides in the repository
Future enhancements for production use:
- User authentication and multi-tenant support
- Export results to PDF/Excel
- Advanced filtering and sorting
- Resume download functionality
- Email notifications
- Interview scheduling integration
- Analytics dashboard
- Bulk job management
- Custom skill databases per organization
Made with ❤️ for automated recruitment | Processing at 890 resumes/second ⚡