name: github-tech-scanner
description: Scans all active GitHub repositories using a Personal Access Token (PAT) and produces a full inventory of every programming language and framework in use — with repo counts, byte-share percentages, and version info where available. Use this skill whenever the user wants to audit their GitHub tech stack, see what languages or frameworks they use across repos, understand technology spread across an org, generate a tech inventory, or answer questions like "what frameworks do I use on GitHub?", "what languages appear in my repos?", "scan my GitHub for technologies", or "give me a breakdown of my GitHub stack". Trigger even when the user just mentions a GitHub PAT alongside any technology or repo question.

GitHub Tech Scanner

This skill takes a GitHub Personal Access Token (PAT) and produces a clean report showing every language and framework in use across all active repositories the token can access.


What you'll need from the user

Ask for these if not already provided:

  1. GitHub PAT — a classic token needs the repo scope to read private repos; a token with no scopes at all is enough for public-only scanning. If they don't have one, direct them to: GitHub → Settings → Developer settings → Personal access tokens.
  2. Active window (optional, default 365 days) — how far back to look when deciding whether a repo is "active". Repos that haven't been pushed to within this window, or are archived/disabled, are skipped.
  3. Show repo breakdown (optional) — whether to list which repos use each language/framework (off by default to keep output concise).
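The "active" rule in item 2 can be sketched as a small predicate. This is a minimal illustration rather than the bundled script's code: the function name `is_active` is hypothetical, and it assumes each repo is a dict shaped like the GitHub REST API's repository object, with `pushed_at`, `archived`, and `disabled` fields.

```python
from datetime import datetime, timedelta, timezone


def is_active(repo, active_days=365, now=None):
    """Return True if the repo was pushed to within the window and is usable.

    `repo` is assumed to be a dict like the GitHub API's repository object.
    """
    # Archived or disabled repos are skipped regardless of push date.
    if repo.get("archived") or repo.get("disabled"):
        return False

    pushed_at = repo.get("pushed_at")
    if not pushed_at:
        # Never pushed to (e.g. an empty repo): treat as inactive.
        return False

    now = now or datetime.now(timezone.utc)
    # GitHub timestamps look like "2024-05-01T12:34:56Z".
    pushed = datetime.strptime(pushed_at, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return now - pushed <= timedelta(days=active_days)
```

Passing `now` explicitly keeps the check deterministic, which makes it easy to try different windows against the same repo list.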

How to run it

The skill bundles a Python script that does all the heavy lifting via the GitHub REST API. Run it from the shell:

pip install requests --break-system-packages --quiet

python /path/to/github-tech-scanner/scripts/scan_repos.py \
  --token <PAT> \
  --active-days 365 \
  --verbose

The script lives at scripts/scan_repos.py relative to this SKILL.md, so resolve it against the skill's own directory path.

Useful flags:

  • --active-days N — change the activity cutoff (e.g. 180 for 6 months)
  • --org NAME — limit scan to a specific GitHub organization (e.g. --org my-company)
  • --show-repos — include per-language/framework repo lists
  • --json — emit raw JSON (useful for further processing)
  • --verbose / -v — show progress on stderr while scanning
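For reference, the flags above could be declared with argparse roughly as follows. This is a sketch assuming a conventional layout, not a copy of scan_repos.py: the `build_parser` name, the help strings, and the `as_json` destination are illustrative, while the flag names and the 365-day default come from this document.

```python
import argparse


def build_parser():
    """Build a parser for the documented flags (hypothetical layout)."""
    parser = argparse.ArgumentParser(
        prog="scan_repos.py",
        description="Inventory languages and frameworks across GitHub repos.",
    )
    parser.add_argument("--token", required=True, help="GitHub PAT")
    parser.add_argument(
        "--active-days", type=int, default=365,
        help="activity cutoff in days (default: 365)",
    )
    parser.add_argument("--org", help="limit the scan to one organization")
    parser.add_argument(
        "--show-repos", action="store_true",
        help="include per-language/framework repo lists",
    )
    # `json` would shadow the stdlib module name, so store it as `as_json`.
    parser.add_argument(
        "--json", action="store_true", dest="as_json", help="emit raw JSON",
    )
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="show progress on stderr",
    )
    return parser
```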

What the script detects

Languages

The script uses GitHub's native /repos/{owner}/{repo}/languages endpoint, which is fast and returns byte counts per language, so the report can show both raw byte counts and a percentage share.
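The languages endpoint returns one mapping of language name to byte count per repo. A minimal sketch of how such responses could be combined into byte-share percentages and repo counts is shown here; the function name `aggregate_languages` is hypothetical, and it assumes the share is computed over the total bytes of all scanned repos, which may differ from how the bundled script weights things.

```python
from collections import Counter


def aggregate_languages(per_repo_languages):
    """Combine per-repo {language: bytes} dicts into report rows.

    Returns a list of (language, percent_share, repo_count) tuples,
    largest byte share first.
    """
    total_bytes = Counter()
    repo_counts = Counter()
    for langs in per_repo_languages:
        for name, nbytes in langs.items():
            total_bytes[name] += nbytes
            repo_counts[name] += 1

    grand_total = sum(total_bytes.values())
    rows = []
    for name, nbytes in total_bytes.most_common():
        # Guard against division by zero when no repo reports any bytes.
        share = 100.0 * nbytes / grand_total if grand_total else 0.0
        rows.append((name, round(share, 1), repo_counts[name]))
    return rows
```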

Frameworks (via manifest files in the repo root)

The script looks for these files in each repo's root directory and parses them:

| File | Ecosystems detected |
| --- | --- |
| package.json | React, Vue, Angular, Next.js, Express, NestJS, Vite, … |
| requirements.txt / pyproject.toml | Django, Flask, FastAPI, PyTorch, … |
| Gemfile | Rails, Sinatra, RSpec, … |
| go.mod | Gin, Echo, Fiber, GORM, … |
| pom.xml / build.gradle | Spring, Quarkus, Hibernate, … |
| composer.json | Laravel, Symfony, … |
| Cargo.toml | Actix, Axum, Tokio, Diesel, Tauri, … |
| pubspec.yaml | Flutter, Riverpod, Firebase, … |
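As an illustration of manifest-based detection, the sketch below checks a package.json for a few of the frameworks listed above and records the declared version. The `NPM_FRAMEWORKS` mapping and the `detect_npm_frameworks` name are assumptions made for this example; the bundled script's actual lookup table and version handling are not shown in this document and may differ.

```python
import json

# Hypothetical mapping from npm package name to the label used in the report.
NPM_FRAMEWORKS = {
    "react": "React",
    "vue": "Vue",
    "@angular/core": "Angular",
    "next": "Next.js",
    "express": "Express",
    "@nestjs/core": "NestJS",
    "vite": "Vite",
}


def detect_npm_frameworks(package_json_text):
    """Return {framework label: declared version} for known npm packages."""
    try:
        manifest = json.loads(package_json_text)
    except json.JSONDecodeError:
        return {}  # Malformed manifest: report nothing rather than fail.
    if not isinstance(manifest, dict):
        return {}

    found = {}
    for section in ("dependencies", "devDependencies"):
        deps = manifest.get(section) or {}
        if not isinstance(deps, dict):
            continue
        for pkg, spec in deps.items():
            label = NPM_FRAMEWORKS.get(pkg)
            if label and label not in found:
                # Strip leading range operators such as ^, ~ and >=.
                found[label] = str(spec).lstrip("^~>=< ")
    return found
```

Stripping only the leading range operator is a simplification; specs such as `>=18 <19` or `workspace:*` would need fuller handling.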

Output format

The default output is a human-readable report printed to stdout:

============================================================
  GitHub Tech Stack Report — @username
============================================================
  Repos scanned: 42 active (of 87 total, active = pushed within 365 days)

── Languages ────────────────────────────────────────────
  TypeScript             41.2%  ████████              (18 repos)
  Python                 28.5%  █████                 (12 repos)
  JavaScript             15.3%  ███                   (9 repos)
  ...

── Frameworks & Libraries ───────────────────────────────
  React                              (11 repos)  [18.1.0, 18.2.0]
  Django                             (6 repos)   [4.2.0]
  Express                            (5 repos)
  ...
============================================================

Present this output clearly in your response. If --show-repos was used, the per-repo lists will appear indented under each entry.


Presenting the results

After running the script, present a clean summary to the user:

  1. Top languages — highlight the top 3–5 by percentage
  2. Framework highlights — call out the most widely-used frameworks
  3. Observations — note interesting patterns (e.g. mixed frontend stacks, heavy ML footprint, polyglot backend, etc.)
  4. Offer to re-run with --show-repos if they want to know which repos use each technology
  5. Offer to re-run with a different --active-days if the active window seems off

Handling common issues

Rate limiting: The GitHub API allows 5,000 requests/hour for authenticated requests. Large accounts (100+ active repos) may approach this. If you get a 403 or 429 with a rate-limit message, wait until the limit resets and retry, or suggest the user narrow the scope with --org NAME or a shorter window such as --active-days 90.
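How long to wait can usually be read from the response headers. The sketch below assumes the standard GitHub headers Retry-After, X-RateLimit-Remaining, and X-RateLimit-Reset (a Unix timestamp); the helper name `seconds_until_reset` is hypothetical, and the bundled script may handle this differently.

```python
import time


def seconds_until_reset(status_code, headers, now=None):
    """Return how many seconds to wait if the response looks rate limited.

    Returns None when the response is not a rate-limit response.
    `headers` is looked up with exact-case keys here; a requests response's
    headers object is case-insensitive, a plain dict is not.
    """
    if status_code not in (403, 429):
        return None

    # Secondary rate limits may tell us directly how long to back off.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return max(0, int(retry_after))

    # Primary rate limit: remaining is 0 and reset holds a Unix timestamp.
    if headers.get("X-RateLimit-Remaining") == "0":
        reset = int(headers.get("X-RateLimit-Reset", "0"))
        now = time.time() if now is None else now
        return max(0, int(reset - now) + 1)  # +1s of slack past the reset

    # A 403 without these headers is a permissions problem, not rate limiting.
    return None
```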

No PAT / wrong scope: If you get a 401, the token is invalid or expired. If private repos show 0 languages, the token might lack the repo scope.
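For classic tokens, GitHub reports the granted scopes in the X-OAuth-Scopes response header, which gives a quick way to tell these cases apart. The following is a rough sketch under that assumption; the `diagnose_token` name and its return labels are made up for illustration, and fine-grained tokens may not send this header at all.

```python
def diagnose_token(status_code, headers):
    """Classify a response from an authenticated GitHub API call.

    Returns one of: "invalid_or_expired", "scopes_unknown",
    "missing_repo_scope", "ok". The labels are hypothetical.
    """
    if status_code == 401:
        return "invalid_or_expired"

    scopes_header = headers.get("X-OAuth-Scopes")
    if scopes_header is None:
        # Header absent (e.g. possibly a fine-grained token): cannot tell.
        return "scopes_unknown"

    scopes = {s.strip() for s in scopes_header.split(",") if s.strip()}
    return "ok" if "repo" in scopes else "missing_repo_scope"
```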

Org-owned repos: The script fetches repos via /user/repos with affiliation=owner,collaborator,organization_member, so it picks up repos the user contributes to across orgs, not just their personal account.
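A paginated walk over that endpoint might look like the sketch below. It is an illustration under stated assumptions, not the bundled script: `iter_user_repos` is a hypothetical name, and `session` is assumed to be a `requests.Session()` or any object with a compatible `get(url, headers=..., params=...)` method.

```python
def iter_user_repos(session, token, per_page=100):
    """Yield every repo the token can see via /user/repos, page by page.

    `session` is assumed to behave like requests.Session.
    """
    url = "https://api.github.com/user/repos"
    headers = {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
    page = 1
    while True:
        params = {
            "affiliation": "owner,collaborator,organization_member",
            "per_page": per_page,
            "page": page,
        }
        response = session.get(url, headers=headers, params=params)
        response.raise_for_status()  # Surface 401/403 instead of hiding them.
        batch = response.json()
        if not batch:
            return
        yield from batch
        if len(batch) < per_page:
            return  # A short page means this was the last one.
        page += 1
```

With requests installed, `list(iter_user_repos(requests.Session(), token))` would collect the full list before any activity filtering is applied.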

Empty results: If a repo has no languages, it may be empty, contain only binaries, or the default branch may have no code files.


Security note

The PAT is only used during the scan and is never stored or logged. Remind the user not to paste their token into shared documents or public chats. Fine-grained tokens with read-only Contents and Metadata repo permissions are sufficient and recommended over classic tokens with full repo scope.