An automated job aggregation and alert system that scrapes job listings, filters relevant roles, removes duplicates, and sends email alerts with structured data.
Job seekers often need to manually track multiple company career pages, which is time-consuming and inefficient.
This system automates the process by monitoring selected companies, filtering relevant roles, and delivering structured job alerts.
- 🔍 Keyword-based job search (Python, Backend, AI, etc.)
- 🌐 Scraping from multiple sources
- 🧠 Smart filtering (removes irrelevant roles like HR, Sales, etc.)
- 🗂️ SQLite database for deduplication
- 📧 Email alert system
- 📊 Structured job storage for further analysis
- ⚙️ Modular and scalable architecture
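
The keyword search and smart filtering features could look roughly like the sketch below. The keyword lists and the function names `is_relevant` and `filter_jobs` are hypothetical; the project's actual values would live in config.py and filter.py. The sketch matches whole words so that a short keyword such as "ai" does not match inside "Retail".

```python
from typing import Iterable

# Hypothetical keyword lists; the real values would live in config.py.
INCLUDE_KEYWORDS = ("python", "backend", "ai")
EXCLUDE_KEYWORDS = ("hr", "sales")


def _words(title: str) -> set[str]:
    """Lowercase the title and split it into alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return set(cleaned.split())


def is_relevant(title: str,
                include: Iterable[str] = INCLUDE_KEYWORDS,
                exclude: Iterable[str] = EXCLUDE_KEYWORDS) -> bool:
    """Keep a title that has an include keyword and no exclude keyword."""
    words = _words(title)
    if any(keyword in words for keyword in exclude):
        return False
    return any(keyword in words for keyword in include)


def filter_jobs(jobs: list[dict]) -> list[dict]:
    """Return only the jobs whose title passes is_relevant (assumes a 'title' key)."""
    return [job for job in jobs if is_relevant(job.get("title", ""))]
```

Checking exclusions first means a title such as "Sales Engineer (Python)" is dropped even though it contains an include keyword; whether that is the desired behavior depends on how strict the filter should be.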
- Python
- Requests + BeautifulSoup
- SQLite
- smtplib (Email automation)
- JSON (data handling)
job-scraper-2/
│
├── scraper.py # Main pipeline (orchestrates everything)
├── fetcher.py # Fetches raw job data
├── filter.py # Filters relevant jobs
├── database.py # Handles DB operations (SQLite)
├── email_service.py # Sends email alerts
├── config.py # Configurations (keywords, settings)
├── companies.json # Input companies / sources
├── jobs.db # SQLite DB (ignored)
├── requirements.txt # Dependencies
├── .gitignore
└── Not Important/ # Experimental scripts (ignored)
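
As an illustration of how companies.json might feed the pipeline, here is a minimal loader. The schema (a JSON array of objects with `name` and `careers_url` keys) and both function names are assumptions made for this sketch, not the project's confirmed format.

```python
import json


def parse_companies(text: str) -> list[dict]:
    """Parse a JSON array of companies, skipping incomplete entries.

    Assumed (hypothetical) schema: [{"name": ..., "careers_url": ...}, ...]
    """
    companies = []
    for entry in json.loads(text):
        name = entry.get("name")
        url = entry.get("careers_url")
        if not name or not url:
            continue  # skip entries missing a required field
        companies.append({"name": name, "careers_url": url})
    return companies


def load_companies(path: str = "companies.json") -> list[dict]:
    """Read the companies file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_companies(handle.read())
```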
- Fetch Jobs: fetcher.py collects job listings from sources
- Filter Data: filter.py removes irrelevant roles
- Deduplicate: database.py ensures no duplicate entries
- Store Data: saves new jobs into the SQLite database
- Send Alerts: email_service.py sends email notifications
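
The deduplicate and store steps can be sketched with the standard-library sqlite3 module. This is one possible approach, not necessarily what database.py does: it assumes each job has `company`, `title`, and `url` keys, treats the URL as the unique identity of a listing, and uses hypothetical function names `init_db` and `save_new_jobs`.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the jobs table; the UNIQUE constraint on url drives deduplication."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS jobs (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            company TEXT NOT NULL,
            title TEXT NOT NULL,
            url TEXT NOT NULL UNIQUE
        )
        """
    )
    conn.commit()


def save_new_jobs(conn: sqlite3.Connection, jobs: list[dict]) -> list[dict]:
    """Insert jobs and return only those that were not already stored."""
    new_jobs = []
    for job in jobs:
        cursor = conn.execute(
            "INSERT OR IGNORE INTO jobs (company, title, url) VALUES (?, ?, ?)",
            (job["company"], job["title"], job["url"]),
        )
        # rowcount is 1 when the row was inserted, 0 when it was ignored.
        if cursor.rowcount == 1:
            new_jobs.append(job)
    conn.commit()
    return new_jobs
```

Only the list returned by `save_new_jobs` would be passed on to the alert step, so a listing that was already emailed in an earlier run is not sent again.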
pip install -r requirements.txt
python scraper.py
- Enable 2-Step Verification in Gmail
- Generate an App Password
- Add credentials in config.py or email_service.py
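
Once an App Password exists, sending an alert through Gmail might look like the sketch below. It is an assumption-laden example rather than the contents of email_service.py: the function names are hypothetical, and instead of hard-coding credentials in a source file it reads them from environment variables (`GMAIL_ADDRESS`, `GMAIL_APP_PASSWORD`, and `ALERT_RECIPIENT` are names invented for this example).

```python
import os
import smtplib
from email.message import EmailMessage


def build_alert_email(jobs: list[dict], sender: str, recipient: str) -> EmailMessage:
    """Build a plain-text alert listing each job's title, company, and URL."""
    message = EmailMessage()
    message["Subject"] = f"{len(jobs)} new job(s) found"
    message["From"] = sender
    message["To"] = recipient
    lines = [f"{job['title']} at {job['company']}\n{job['url']}" for job in jobs]
    message.set_content("\n\n".join(lines))
    return message


def send_alert(jobs: list[dict]) -> None:
    """Send the alert via Gmail's SMTP server over SSL (port 465)."""
    # Hypothetical environment variable names; adapt to your own setup.
    sender = os.environ["GMAIL_ADDRESS"]
    app_password = os.environ["GMAIL_APP_PASSWORD"]
    recipient = os.environ.get("ALERT_RECIPIENT", sender)
    message = build_alert_email(jobs, sender, recipient)
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(sender, app_password)
        server.send_message(message)
```

Keeping message construction separate from sending makes it possible to check the email contents without a network connection.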
- Automating the job search process
- Learning real-world web scraping pipelines
- Building production-ready Python automation systems
- Add a Streamlit dashboard
- Export jobs to Excel/CSV
- Add an API (FastAPI)
- Deploy to the cloud (AWS/GCP) with a scheduler
- Add AI-based job relevance scoring