Gojer16/Dock-Hunter

Job Hunter CLI & Dashboard

A powerful Python CLI tool that automates job discovery by searching multiple job boards, parsing job postings, and intelligently filtering them based on:

  • ✔️ Remote work preferences
  • ✔️ Geographic regions (LATAM, Europe, USA)
  • ✔️ Duplicates (smart deduplication)
  • ✔️ Freshness and relevant ATS sources
  • ✔️ Dynamic query strategies with role synonyms and progressive widening

Now with a beautiful web dashboard, smart duplicate detection, and senior-level query generation!

🚀 Quick Start (Super Easy!)

One-Click Setup

Mac/Linux:

git clone <repository-url>
cd work-automation
./start.sh

Windows:

git clone <repository-url>
cd work-automation
start.bat

The startup script will:

  • ✅ Auto-install all dependencies
  • 🎯 Launch interactive job search wizard
  • 🌐 Open web dashboard to view results
  • 📊 Show job statistics and analytics

Interactive Mode (Recommended)

uv run python main.py start

The wizard guides you through:

  • Role Selection - What position you're looking for
  • Region Targeting - LATAM, EUROPE, USA, or Global
  • Country Specific - Target specific countries like Germany, Colombia, Brazil
  • Platform Selection - Exclude platforms you don't want
  • Remote Preference - Strict remote, hybrid, or any

🌐 Web Dashboard

Launch the beautiful web interface to browse jobs:

uv run python main.py dashboard

Features:

  • 🔍 Live Search & Filtering - Filter by company, location, remote type
  • ⭐ Job Bookmarking - Save favorites with one click
  • 🔄 Auto-Refresh - Real-time updates when new jobs are found
  • 📱 Mobile Friendly - Works great on all devices
  • 🎨 Clean Interface - Easy to browse and interact with jobs

What It Does

Dork Hunter uses a 5-phase pipeline to find relevant job opportunities:

  1. DISCOVER - Searches job boards using Google dorks with regional intelligence
  2. PARSE - Extracts structured data from job postings
  3. CLASSIFY - Determines remote work type (strict/hybrid/onsite)
  4. TAG - Identifies eligible regions (USA/EUROPE/LATAM/GLOBAL)
  5. FILTER - Applies user preferences and removes duplicates
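The five phases above can be pictured as a chain of small functions. What follows is a minimal sketch under stated assumptions, not the project's actual code: the `Job` dataclass, the keyword rules, and every function name are hypothetical, and DISCOVER is replaced by a hard-coded list so the example runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical record; the real project's schema may differ.
    url: str
    title: str
    company: str
    description: str
    remote_type: str = "unknown"
    regions: list[str] = field(default_factory=list)


def discover() -> list[dict]:
    # DISCOVER: stand-in for the search step; returns raw postings.
    return [
        {"url": "https://example.com/1", "title": "Python Developer",
         "company": "Acme", "description": "Fully remote, LATAM only"},
        {"url": "https://example.com/2", "title": "Python Developer",
         "company": "Acme", "description": "Fully remote, LATAM only"},
        {"url": "https://example.com/3", "title": "Backend Engineer",
         "company": "Globex", "description": "Onsite in Berlin, Europe"},
    ]


def parse(raw: dict) -> Job:
    # PARSE: turn a raw posting into a structured record.
    return Job(raw["url"], raw["title"], raw["company"], raw["description"])


def classify(job: Job) -> Job:
    # CLASSIFY: naive keyword rules for the remote work type (assumed).
    text = job.description.lower()
    if "hybrid" in text:
        job.remote_type = "hybrid"
    elif "onsite" in text:
        job.remote_type = "onsite"
    elif "remote" in text:
        job.remote_type = "strict"
    return job


def tag(job: Job) -> Job:
    # TAG: attach every region named in the description, else GLOBAL.
    text = job.description.lower()
    job.regions = [r for r in ("USA", "EUROPE", "LATAM") if r.lower() in text]
    if not job.regions:
        job.regions = ["GLOBAL"]
    return job


def filter_jobs(jobs: list[Job], remote: str, region: str) -> list[Job]:
    # FILTER: apply preferences, then drop exact (company, title) repeats.
    seen, kept = set(), []
    for job in jobs:
        key = (job.company.lower(), job.title.lower())
        if job.remote_type == remote and region in job.regions and key not in seen:
            seen.add(key)
            kept.append(job)
    return kept


jobs = [tag(classify(parse(raw))) for raw in discover()]
result = filter_jobs(jobs, remote="strict", region="LATAM")
print([job.url for job in result])
```

Keeping each phase as a separate function means any one of them can be tested or swapped without touching the others.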

🌍 Regional Intelligence

Smart Platform Selection

  • LATAM: Computrabajo, InfoJobs, Indeed LATAM sites
  • EUROPE: Xing, StepStone, Indeed EU sites
  • USA: Indeed US, Monster, ZipRecruiter, major ATS platforms
  • Global: LinkedIn, Glassdoor, Greenhouse, Lever, Workable, Ashby

Country-Specific Targeting

  • LATAM: Colombia, Brazil, Mexico, Argentina, Chile
  • EUROPE: Germany, Spain, UK, France, Netherlands
  • USA: United States, Canada
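One way to picture the platform selection is a plain lookup table keyed by region. The sketch below is illustrative only: `REGIONAL_PLATFORMS`, `GLOBAL_PLATFORMS`, and `platforms_for` are hypothetical names, the lists copy the platform names given above (the unnamed "major ATS platforms" are left out), and always appending the global platforms is an assumption.

```python
# Platform names taken from the lists above; the structure is assumed.
REGIONAL_PLATFORMS = {
    "LATAM": ["Computrabajo", "InfoJobs", "Indeed LATAM"],
    "EUROPE": ["Xing", "StepStone", "Indeed EU"],
    "USA": ["Indeed US", "Monster", "ZipRecruiter"],
}
GLOBAL_PLATFORMS = ["LinkedIn", "Glassdoor", "Greenhouse", "Lever", "Workable", "Ashby"]


def platforms_for(region, exclude=()):
    """Return regional platforms followed by global ones, minus exclusions."""
    region = region.upper()
    if region != "GLOBAL" and region not in REGIONAL_PLATFORMS:
        raise ValueError(f"unknown region: {region}")
    candidates = REGIONAL_PLATFORMS.get(region, []) + GLOBAL_PLATFORMS
    blocked = {name.lower() for name in exclude}
    return [name for name in candidates if name.lower() not in blocked]


latam = platforms_for("latam", exclude=["linkedin"])
print(latam)
```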

🔧 Advanced Features

Smart Duplicate Detection

uv run python main.py dedupe --input-file filtered_jobs.json

  • Multi-field matching: Company, title, location, remote type, regions
  • 90% similarity threshold: Conservative to avoid false positives
  • Detailed reporting: Shows which jobs were duplicates and why
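The multi-field matching described above could look roughly like the sketch below. It makes several assumptions: the standard library's `difflib.SequenceMatcher` stands in for `rapidfuzz` so the example has no third-party dependency, the per-field scores are combined with a plain average, and the names `job_similarity` and `dedupe` are hypothetical. Only the field list and the 90% threshold come from the description above.

```python
from difflib import SequenceMatcher

FIELDS = ("company", "title", "location", "remote_type", "regions")
THRESHOLD = 0.90  # the 90% similarity threshold mentioned above


def field_similarity(a: str, b: str) -> float:
    # Case-insensitive similarity in [0, 1] between two field values.
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def job_similarity(job_a: dict, job_b: dict) -> float:
    # Assumed scoring: unweighted mean of the per-field similarities.
    scores = [
        field_similarity(str(job_a.get(name, "")), str(job_b.get(name, "")))
        for name in FIELDS
    ]
    return sum(scores) / len(scores)


def dedupe(jobs: list[dict]) -> tuple[list[dict], list[tuple[dict, dict]]]:
    # Keep the first occurrence; report each duplicate with the job it matched.
    unique, duplicates = [], []
    for job in jobs:
        match = next((u for u in unique if job_similarity(job, u) >= THRESHOLD), None)
        if match is None:
            unique.append(job)
        else:
            duplicates.append((job, match))
    return unique, duplicates


a = {"company": "Acme Corp", "title": "Senior Python Developer",
     "location": "Bogotá", "remote_type": "strict", "regions": "LATAM"}
b = {"company": "Acme Corp.", "title": "Senior Python Developer",
     "location": "Bogotá", "remote_type": "strict", "regions": "LATAM"}
c = {"company": "Globex", "title": "Frontend Developer",
     "location": "Berlin", "remote_type": "hybrid", "regions": "EUROPE"}

unique, duplicates = dedupe([a, b, c])
print(len(unique), "unique,", len(duplicates), "duplicate(s)")
```

Returning the matched pairs alongside the unique list is what makes a "which jobs were duplicates and why" report possible.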

Job Statistics & Analytics

uv run python main.py stats

  • Top companies hiring
  • Remote work distribution
  • Regional job availability
  • Platform performance metrics
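Statistics like these reduce to counting fields across the job list. The following is a small sketch, assuming each job is a dict with `company`, `remote_type`, `regions`, and `platform` keys (hypothetical field names) and treating "platform performance" simply as the number of jobs found per platform.

```python
from collections import Counter


def summarize(jobs: list[dict], top_n: int = 3) -> dict:
    # Each entry mirrors one bullet above; the key names are illustrative.
    return {
        "top_companies": Counter(j["company"] for j in jobs).most_common(top_n),
        "remote_distribution": dict(Counter(j["remote_type"] for j in jobs)),
        "jobs_per_region": dict(Counter(r for j in jobs for r in j["regions"])),
        "jobs_per_platform": dict(Counter(j["platform"] for j in jobs)),
    }


sample_jobs = [
    {"company": "Acme", "remote_type": "strict", "regions": ["LATAM"], "platform": "Lever"},
    {"company": "Acme", "remote_type": "hybrid", "regions": ["LATAM", "USA"], "platform": "Greenhouse"},
    {"company": "Globex", "remote_type": "strict", "regions": ["EUROPE"], "platform": "Lever"},
]
report = summarize(sample_jobs)
for name, value in report.items():
    print(name, value)
```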

Export & Integration

uv run python main.py export --format csv --output my_jobs

  • Export to CSV or JSON
  • Integration-ready data structure
  • Preserve all metadata and classifications
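As a rough illustration of what such an export involves, the sketch below writes the same job list as CSV or JSON using only the standard library. The function name `export_jobs`, the job fields, and the choice to join list values with `;` in CSV cells are all assumptions made for the example.

```python
import csv
import json
import tempfile
from pathlib import Path


def export_jobs(jobs: list[dict], output: str, fmt: str = "csv") -> Path:
    """Write jobs to '<output>.<fmt>' and return the resulting path."""
    path = Path(f"{output}.{fmt}")
    if fmt == "json":
        path.write_text(json.dumps(jobs, indent=2, ensure_ascii=False), encoding="utf-8")
    elif fmt == "csv":
        # Use the union of all keys so no metadata column is dropped.
        columns = sorted({key for job in jobs for key in job})
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=columns)
            writer.writeheader()
            for job in jobs:
                # Flatten list values (e.g. regions) so each cell is plain text.
                writer.writerow({k: ";".join(v) if isinstance(v, list) else v
                                 for k, v in job.items()})
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return path


export_sample = [{"title": "Python Developer", "company": "Acme",
                  "remote_type": "strict", "regions": ["LATAM", "GLOBAL"]}]
with tempfile.TemporaryDirectory() as tmp:
    csv_text = export_jobs(export_sample, f"{tmp}/my_jobs", "csv").read_text(encoding="utf-8")
    json_path = export_jobs(export_sample, f"{tmp}/my_jobs", "json")
    round_trip = json.loads(json_path.read_text(encoding="utf-8"))
print(csv_text)
```

JSON keeps nested values such as the region list intact, while CSV has to flatten them, which is why the two formats are handled separately.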

Dynamic Role Management

# Add new role synonyms without code changes
uv run python main.py add-role rust "rust developer" "systems engineer" "rust engineer"

# Test query generation for any role
uv run python main.py test-queries golang

  • Add roles dynamically without rewriting search logic
  • Preview generated query variants
  • Role synonym expansion and progressive widening
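Synonym expansion and progressive widening can be sketched as building a list of queries that starts narrow and relaxes one constraint at a time. Everything below is hypothetical, including the `ROLE_SYNONYMS` registry, the `add_role` and `build_queries` helpers, and the three widening levels; the real strategy is described in DYNAMIC_QUERY_STRATEGY.md and may differ.

```python
ROLE_SYNONYMS: dict[str, list[str]] = {}


def add_role(role: str, *synonyms: str) -> None:
    # Register synonyms for a role, skipping ones already present.
    known = ROLE_SYNONYMS.setdefault(role.lower(), [])
    for synonym in synonyms:
        if synonym.lower() not in known:
            known.append(synonym.lower())


def build_queries(role: str, site: str) -> list[str]:
    # Unregistered roles fall back to the role name itself.
    terms = ROLE_SYNONYMS.get(role.lower(), [role.lower()])
    any_term = " OR ".join(f'"{term}"' for term in terms)
    return [
        f'site:{site} "{terms[0]}" "remote"',  # narrowest: first term only
        f'site:{site} ({any_term}) "remote"',  # wider: any synonym
        f'site:{site} ({any_term})',           # widest: drop the remote keyword
    ]


add_role("rust", "rust developer", "systems engineer", "rust engineer")
queries = build_queries("rust", "jobs.lever.co")
for query in queries:
    print(query)
```

A caller would typically try the queries in order and stop as soon as one returns enough results.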

Platform Health Monitoring

uv run python main.py platforms --region LATAM

  • Check platform availability
  • Regional platform listings
  • Performance monitoring

Installation

Automatic (Recommended)

Use the startup scripts - they handle everything automatically!

Manual Installation

# Clone the repository
git clone <repository-url>
cd work-automation

# Install with uv (recommended)
uv sync

# Or install with pip
pip install -r requirements.txt

Usage Examples

Find Remote Python Jobs in LATAM

uv run python main.py start
# Select: Python Developer → LATAM → colombia, brazil → Exclude: linkedin → strict

Search Only European Platforms

uv run python main.py search --role "Frontend Developer" --region EUROPE --country germany --country spain

Full Pipeline with Deduplication

# 1. Search for jobs
uv run python main.py search --role "Software Engineer" --region USA

# 2. Process and filter
uv run python main.py process raw_jobs.json --remote strict

# 3. Remove duplicates
uv run python main.py dedupe --input-file filtered_jobs.json

# 4. Launch dashboard
uv run python main.py dashboard

Commands Reference

Core Commands

  • start - 🎯 Interactive wizard (recommended for beginners)
  • search - 🔍 Discover job URLs with regional filtering
  • process - ⚙️ Parse, classify, and tag jobs
  • dashboard - 🌐 Launch web interface
  • dedupe - 🔄 Remove duplicate jobs intelligently

Analytics & Utilities

  • stats - 📊 Show job search statistics
  • platforms - 🌍 List available platforms by region
  • export - 📤 Export jobs to CSV/JSON
  • health - 🏥 Check platform availability

Dynamic Query Management

  • add-role - ➕ Add role synonyms without code changes
  • test-queries - 🔍 Preview generated query variants for any role

Advanced Options

  • --region - Target specific regions (LATAM, EUROPE, USA)
  • --country - Focus on specific countries
  • --remote - Filter by remote work type (strict, hybrid, any)
  • --exclude - Skip specific platforms

Supported Platforms

ATS Platforms (Global)

  • Greenhouse (boards.greenhouse.io) - US-focused
  • Lever (jobs.lever.co) - US-focused
  • Workable (apply.workable.com) - Europe-focused
  • Ashby (jobs.ashbyhq.com) - US-focused

Major Job Boards

  • LinkedIn (linkedin.com/jobs) - Global
  • Indeed (indeed.com + regional sites) - Global
  • Glassdoor (glassdoor.com + regional sites) - Global

Regional Specialists

  • LATAM: Computrabajo, InfoJobs, Catho, Vagas, OCC
  • EUROPE: Xing, StepStone, Reed, Totaljobs, APEC
  • USA: Monster, ZipRecruiter, Dice

Features

  • 🎯 Regional Intelligence - Only searches relevant platforms per region
  • 🔍 Smart Duplicate Detection - Multi-field similarity matching
  • 🌐 Web Dashboard - Beautiful interface with live filtering
  • ⭐ Job Bookmarking - Save and organize favorite opportunities
  • 🔄 Auto-Refresh - Real-time job updates
  • 📊 Analytics - Comprehensive job market insights
  • 🚀 Easy Setup - One-click startup scripts
  • 🔧 Highly Configurable - Extensive filtering and customization options
  • ⚡ Fast Processing - Async parallel job parsing
  • 🧪 Well Tested - 40+ automated tests ensure reliability
  • 📱 Mobile Friendly - Dashboard works on all devices
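To give a feel for the async parallel parsing mentioned in the list above, here is a minimal sketch using `asyncio.gather` with a semaphore to cap concurrency. It is not the project's code: `fetch_and_parse` and `parse_all` are hypothetical names, and the network request is simulated with `asyncio.sleep` so the example runs offline.

```python
import asyncio


async def fetch_and_parse(url: str, limit: asyncio.Semaphore) -> dict:
    # The semaphore caps how many postings are processed at once.
    async with limit:
        await asyncio.sleep(0.01)  # stand-in for an HTTP request plus parsing
        return {"url": url, "title": f"Job at {url.rsplit('/', 1)[-1]}"}


async def parse_all(urls: list[str], max_concurrent: int = 5) -> list[dict]:
    limit = asyncio.Semaphore(max_concurrent)
    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(fetch_and_parse(url, limit) for url in urls))


urls = [f"https://example.com/jobs/{i}" for i in range(10)]
parsed = asyncio.run(parse_all(urls))
print(len(parsed), "postings parsed")
```

Capping concurrency keeps the speed benefit of parallel requests without sending every request to a job board at the same moment.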

Development

Running Tests

# Install test dependencies
uv add --group test pytest pytest-asyncio pytest-mock responses

# Run all tests
uv run pytest tests/ -v

# Run specific test categories
uv run pytest tests/test_platforms.py -v

Project Structure

work-automation/
├── src/                          # Core application modules
│   ├── enhanced_search.py        # Multi-platform search engine
│   ├── dynamic_query_builder.py  # Dynamic query strategy system
│   ├── regional_platforms.py     # Regional platform mapping
│   ├── parser.py                 # Job parsing and classification
│   ├── deduplicator.py          # Smart duplicate detection
│   ├── tagger.py                # Region tagging logic
│   ├── scorer.py                # Job scoring and ranking
│   ├── health_checker.py        # Platform monitoring
│   ├── rate_limiter.py          # Request rate limiting
│   ├── config.py                # Configuration management
│   └── logger.py                # Logging and statistics
├── tests/                       # Comprehensive test suite (40+ tests)
├── templates/                   # Web dashboard templates
├── dashboard.py                 # Flask web application
├── main.py                     # CLI interface
├── start.sh / start.bat        # Easy startup scripts
├── demo_query_strategy.py      # Dynamic query demonstration
├── test_dynamic_queries.py     # Query strategy testing
├── DYNAMIC_QUERY_STRATEGY.md   # Implementation documentation
├── QUICKSTART.md               # Beginner-friendly guide
├── CHANGELOG.md                # Version history
└── pyproject.toml             # Project configuration

Dependencies

  • typer - CLI framework with rich formatting
  • httpx - Async HTTP client for fast requests
  • beautifulsoup4 - HTML parsing and data extraction
  • duckduckgo-search - Search engine integration
  • rich - Beautiful terminal formatting
  • rapidfuzz - Fast fuzzy string matching for deduplication
  • flask - Web dashboard framework

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new functionality
  4. Run the test suite: uv run pytest tests/
  5. Submit a pull request

License

MIT License - see LICENSE file for details.


Happy Job Hunting! 🎯✨

Built with ❤️ for developers seeking remote opportunities worldwide
