Shubham37204/Job-Scraper-2

🚀 Company-Specific Job Scraper & Email Alert System

An automated job aggregation and alert system that scrapes job listings, filters relevant roles, removes duplicates, and sends email alerts with structured data.

🎯 Problem Statement

Job seekers often need to manually track multiple company career pages, which is time-consuming and inefficient.

This system automates the process by monitoring selected companies, filtering relevant roles, and delivering structured job alerts.

🔥 Features

  • 🔍 Keyword-based job search (Python, Backend, AI, etc.)
  • 🌐 Scraping from multiple sources
  • 🧠 Smart filtering (removes irrelevant roles such as HR and Sales)
  • 🗂️ SQLite database for deduplication
  • 📧 Email alert system
  • 📊 Structured job storage for further analysis
  • ⚙️ Modular and scalable architecture
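The keyword search and smart filtering features could look roughly like the sketch below. It is a minimal illustration, not the code in filter.py: the keyword lists and the `is_relevant` helper are hypothetical names, and the real lists would presumably live in config.py. Matching is done on whole words so that a short keyword such as "ai" does not fire inside "Maintenance".

```python
import re

# Hypothetical keyword lists (assumption: the real ones are defined in config.py).
INCLUDE_KEYWORDS = ["python", "backend", "ai"]
EXCLUDE_KEYWORDS = ["hr", "sales"]


def _has_keyword(title, keywords):
    """Return True if any keyword appears as a whole word in the title."""
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    return any(keyword in words for keyword in keywords)


def is_relevant(title):
    """Keep a job only if it matches an include keyword and no exclude keyword."""
    return _has_keyword(title, INCLUDE_KEYWORDS) and not _has_keyword(
        title, EXCLUDE_KEYWORDS
    )
```

Excluded keywords take priority here, so a title such as "Backend Sales Engineer" is dropped even though it contains "Backend". Whether that is the right trade-off depends on how noisy the sources are.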

🧩 Tech Stack

  • Python
  • Requests + BeautifulSoup
  • SQLite
  • smtplib (Email automation)
  • JSON (data handling)

📁 Project Structure

job-scraper-2/
│
├── scraper.py           # Main pipeline (orchestrates everything)
├── fetcher.py           # Fetches raw job data
├── filter.py            # Filters relevant jobs
├── database.py          # Handles DB operations (SQLite)
├── email_service.py     # Sends email alerts
├── config.py            # Configurations (keywords, settings)
├── companies.json       # Input companies / sources
├── jobs.db              # SQLite DB (ignored)
├── requirements.txt     # Dependencies
├── .gitignore
└── Not Important/       # Experimental scripts (ignored)

⚙️ How It Works

  1. Fetch Jobs: fetcher.py collects job listings from the configured sources

  2. Filter Data: filter.py removes irrelevant roles

  3. Deduplicate: database.py ensures no duplicate entries

  4. Store Data: new jobs are saved to the SQLite database

  5. Send Alerts: email_service.py sends email notifications
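The deduplicate and store steps can be combined in a single insert. The sketch below is one possible approach and not necessarily what database.py does: the `jobs` table layout, the use of the job URL as the unique key, and the function names are all assumptions made for illustration.

```python
import sqlite3


def init_db(conn):
    """Create the jobs table if it does not exist (hypothetical schema)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               url     TEXT PRIMARY KEY,
               title   TEXT NOT NULL,
               company TEXT NOT NULL
           )"""
    )


def save_new_jobs(conn, jobs):
    """Insert jobs and return only those that were not stored before."""
    new_jobs = []
    for job in jobs:
        cur = conn.execute(
            "INSERT OR IGNORE INTO jobs (url, title, company) VALUES (?, ?, ?)",
            (job["url"], job["title"], job["company"]),
        )
        # rowcount is 1 when the row was inserted, 0 when the URL already existed.
        if cur.rowcount == 1:
            new_jobs.append(job)
    conn.commit()
    return new_jobs
```

Only the list returned by `save_new_jobs` would then be passed to the alert step, so a listing seen in an earlier run is not emailed again.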

▶️ Run Locally

pip install -r requirements.txt
python scraper.py

🔐 Setup Email Alerts

  1. Enable 2-Step Verification on your Google account
  2. Generate an App Password
  3. Add your Gmail address and the App Password to config.py or email_service.py
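Once the App Password exists, sending an alert with smtplib might look like the sketch below. This is an illustration under stated assumptions, not the contents of email_service.py: `build_alert` and `send_alert` are hypothetical helpers, and the job dictionaries are assumed to carry `title`, `company`, and `url` keys. The Gmail SMTP host and port (smtp.gmail.com, 587 with STARTTLS) are the standard ones.

```python
import smtplib
from email.message import EmailMessage


def build_alert(sender, recipient, jobs):
    """Build a plain-text alert listing each job's title, company, and URL."""
    msg = EmailMessage()
    msg["Subject"] = f"{len(jobs)} new job(s) found"
    msg["From"] = sender
    msg["To"] = recipient
    lines = [f"{job['title']} at {job['company']}\n{job['url']}" for job in jobs]
    msg.set_content("\n\n".join(lines))
    return msg


def send_alert(msg, user, app_password):
    """Send the message through Gmail over STARTTLS using an App Password."""
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(user, app_password)
        server.send_message(msg)


# Example usage (the address and password are placeholders):
# msg = build_alert("you@gmail.com", "you@gmail.com", new_jobs)
# send_alert(msg, "you@gmail.com", "your-app-password")
```

Reading the address and App Password from environment variables instead of hard-coding them in config.py is one way to keep them out of the repository.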

🎯 Use Cases

  • Automating the job search process
  • Learning real-world web scraping pipelines
  • Building production-ready Python automation systems

🚀 Future Improvements

  • Add a Streamlit dashboard
  • Export jobs to Excel/CSV
  • Add an API (FastAPI)
  • Deploy to the cloud (AWS/GCP) with a scheduler
  • Add AI-based job relevance scoring

About

End-to-end job scraping pipeline in Python that aggregates listings, filters relevant roles, deduplicates using SQLite, and sends automated email alerts with structured reports.
