
Enhance Webflow page detection with multiple indicators #12

Merged
KoblerS merged 2 commits into main from copilot/fix-webflow-page-detection-check
Sep 9, 2025

Copilot AI commented Sep 9, 2025

The current Webflow page detection logic only checks for <link> tags containing "website-files.com", which misses many valid Webflow sites that may not have CSS links but still have JavaScript files from the CDN.

This PR enhances the check_url() function to use multiple detection methods:

Before (single indicator):

# Only checked for CSS links
links = soup.find_all('link', href=True)
if not any("website-files.com" in link['href'] for link in links):
    return False  # Site rejected if no CSS links found

After (multiple indicators):

# Check 1: Links with "website-files.com" (existing)
# Check 2: Scripts with "website-files.com" (NEW)
# Check 3: Meta generator tag with "Webflow" (NEW)

Key improvements:

  • Script detection: Now detects <script src="https://cdn.prod.website-files.com/.../js/webflow.*.js"> patterns
  • Meta tag fallback: Checks for <meta name="generator" content="Webflow">
  • Backward compatibility: Maintains existing CSS link detection
  • Better logging: Debug output shows which indicators were found
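
The three checks listed above can be sketched roughly as follows. This is not the PR's actual diff: it uses the standard library's `html.parser` instead of BeautifulSoup so the example is self-contained, and the names `IndicatorParser`, `find_webflow_indicators` and `is_webflow_page` are hypothetical.

```python
import logging
from html.parser import HTMLParser

logger = logging.getLogger(__name__)

CDN_MARKER = "website-files.com"


class IndicatorParser(HTMLParser):
    """Collects which Webflow indicators appear in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; normalise them to "".
        attr = {name: (value or "") for name, value in attrs}
        if tag == "link" and CDN_MARKER in attr.get("href", ""):
            self.found.add("link")      # Check 1: CSS link (existing)
        elif tag == "script" and CDN_MARKER in attr.get("src", ""):
            self.found.add("script")    # Check 2: script from the CDN
        elif (tag == "meta"
              and attr.get("name", "").lower() == "generator"
              and "Webflow" in attr.get("content", "")):
            self.found.add("meta")      # Check 3: meta generator tag


def find_webflow_indicators(html):
    """Return the set of indicator names found in the HTML string."""
    parser = IndicatorParser()
    parser.feed(html)
    parser.close()
    return parser.found


def is_webflow_page(html):
    """Accept the page if any one indicator is present."""
    found = find_webflow_indicators(html)
    logger.debug("Webflow indicators found: %s", sorted(found) or "none")
    return bool(found)
```

Returning the set of indicators, rather than a bare boolean, is one way to support the debug output mentioned above: the caller can log exactly which checks matched.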

Test scenarios:

  • Sites with only JS files: ❌ → ✅ (main issue fixed)
  • Sites with CSS + JS: ✅ → ✅ (unchanged)
  • Sites with only meta tag: ❌ → ✅ (additional coverage)
  • Non-Webflow sites: ❌ → ❌ (correctly rejected)
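
The four scenarios can also be written as a small table-driven check. The HTML snippets and the functions `old_check` and `new_check` below are illustrative stand-ins, not the repository's test suite, and parsing again uses the standard library's `html.parser` rather than BeautifulSoup.

```python
from html.parser import HTMLParser

CDN = "website-files.com"


class TagCollector(HTMLParser):
    """Records (tag, attributes) for every start tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, {k: (v or "") for k, v in attrs}))


def collect(html):
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def old_check(html):
    # Before: only a <link href> pointing at the CDN counted.
    return any(t == "link" and CDN in a.get("href", "") for t, a in collect(html))


def new_check(html):
    # After: any one of the three indicators is enough.
    tags = collect(html)
    has_script = any(t == "script" and CDN in a.get("src", "") for t, a in tags)
    has_meta = any(
        t == "meta" and a.get("name") == "generator" and "Webflow" in a.get("content", "")
        for t, a in tags
    )
    return old_check(html) or has_script or has_meta


# Made-up sample markup for each kind of page.
CSS = '<link rel="stylesheet" href="https://cdn.prod.website-files.com/site/css/site.css">'
JS = '<script src="https://cdn.prod.website-files.com/site/js/webflow.js"></script>'
META = '<meta name="generator" content="Webflow">'
OTHER = '<link rel="stylesheet" href="https://example.com/style.css">'

# (scenario, html, expected before, expected after)
SCENARIOS = [
    ("only JS files", JS, False, True),
    ("CSS + JS", CSS + JS, True, True),
    ("only meta tag", META, False, True),
    ("non-Webflow", OTHER, False, False),
]

for name, html, before, after in SCENARIOS:
    assert old_check(html) is before, name
    assert new_check(html) is after, name
```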

The enhanced detection makes the tool more robust by checking for any of the three Webflow indicators rather than requiring specific CSS links.

Fixes #11.


Copilot AI changed the title from "[WIP] Not all webflow pages are detected" to "Enhance Webflow page detection with multiple indicators" Sep 9, 2025
Copilot AI requested a review from KoblerS September 9, 2025 07:27
KoblerS marked this pull request as ready for review September 9, 2025 08:06
Copilot AI review requested due to automatic review settings September 9, 2025 08:06
Copilot AI left a comment

Pull Request Overview

This PR enhances the Webflow site detection mechanism to be more robust by checking for multiple indicators instead of relying solely on CSS links. The enhancement addresses cases where valid Webflow sites may only have JavaScript files or meta generator tags without CSS links from the Webflow CDN.

Key changes:

  • Adds detection for <script> tags containing "website-files.com"
  • Adds detection for <meta name="generator" content="Webflow"> tags
  • Maintains backward compatibility with existing CSS link detection

KoblerS merged commit 4cca62e into main Sep 9, 2025
6 of 8 checks passed
Development

Successfully merging this pull request may close these issues.

Not all webflow pages are detected

3 participants