Inspiration

The talent market for exceptional AI/ML engineers has become a needle-in-a-haystack problem at scale. xAI recruiters consistently report spending disproportionate time on two bottlenecks: initial outreach and information gathering, and routing candidates to the correct pipeline. Meanwhile, X has evolved into the town square for technical discourse—a place where engineers reveal curiosity, execution capability, and intellectual depth long before they update their resumes or appear on traditional job boards.

We asked ourselves a simple question: what if the signals of exceptional talent are already being broadcast in public, waiting to be systematically decoded? Rather than waiting for candidates to apply, we built a system that proactively identifies and engages with engineers who demonstrate the behavioral fingerprints of excellence.

What it does

RecruiterX is an AI-powered talent intelligence platform that identifies exceptional engineers on X through a rigorous 8-signal scoring framework, deploys autonomous agents for personalized outreach, and integrates directly with Greenhouse ATS for seamless pipeline handoff.

The platform enables recruiters to query X's technical ecosystem using natural language—searching for "kernel engineers with CUDA experience who discuss attention mechanisms" returns ranked profiles with confidence scores. Each candidate undergoes deep profile analysis with real-time streaming results, surfacing proof-of-work highlights, peer recognition patterns, and trajectory indicators. When a candidate crosses the confidence threshold, an autonomous DM agent initiates personalized outreach, handles responses conversationally, and collects artifacts like resumes and GitHub profiles. Candidates flow into Kanban-style talent pools mapped to xAI's core teams (Pretraining, Post-training, Human Data, Inference, Infrastructure) and can be exported to Greenhouse with structured notes and reasoning.

How we built it

We developed an 8-dimensional signal framework to quantify engineering excellence from X activity. The framework captures curiosity through early adopter behavior, craftsmanship through proof-of-work artifacts, speed through experimentation velocity, peer recognition through expert endorsement graphs, clarity of thought through discourse quality metrics, discipline through posting consistency, substance through anti-noise filtering, and trajectory acceleration through bio-change momentum.

The curiosity signal measures how far ahead of the global "surge moment" a user discusses emerging topics like transformers, LoRA, FlashAttention, or vLLM. We identify the surge point for each topic by finding when the 7-day rate of increase in global mentions peaks, then measure how early each user made substantive contributions relative to that moment. The craftsmanship signal captures observable building behavior—code links to GitHub/Gist/Colab, multi-tweet technical threads, artifact diversity across repositories, and the presence of error logs and stack traces that indicate real debugging. Speed measures how quickly someone engages with new model releases, frameworks, or kernels after they drop, with faster reactions yielding higher scores through a sigmoid-normalized latency metric.
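The surge-point and latency ideas above can be sketched as follows. This is a minimal illustration, not the production pipeline: the function names, the use of a simple 7-day difference as the "rate of increase", and the sigmoid midpoint and steepness values are all assumptions.

```python
import math

def surge_day(daily_mentions, window=7):
    """Index of the day on which the `window`-day increase in mentions peaks."""
    if len(daily_mentions) <= window:
        raise ValueError("need more than `window` days of counts")
    # Increase over the trailing window: mentions[t] - mentions[t - window].
    deltas = [daily_mentions[t] - daily_mentions[t - window]
              for t in range(window, len(daily_mentions))]
    # max() returns the first peak if several days tie.
    return window + max(range(len(deltas)), key=deltas.__getitem__)

def lead_days(first_post_day, surge):
    """Days by which a user's first substantive post preceded the surge (0 if late)."""
    return max(0, surge - first_post_day)

def speed_score(latency_hours, midpoint_hours=48.0, steepness=0.1):
    """Sigmoid-normalized latency: close to 1 for fast reactions, close to 0 for slow ones.

    The midpoint and steepness are hypothetical defaults, not tuned values.
    """
    z = min(steepness * (latency_hours - midpoint_hours), 700.0)  # avoid exp overflow
    return 1.0 / (1.0 + math.exp(z))
```

A user who first posted substantively on day 3 of a topic whose surge falls on day 13 would get a lead of 10 days, which could then be normalized into the Early-Adopter Index.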

Peer recognition tracks interactions from a curated seed set of high-signal experts. When respected researchers, OSS maintainers, or top technical voices reply to, quote, or follow someone's work, that signal carries far more weight than generic engagement. We combine direct expert interactions with PageRank computed over the expert subgraph to produce an endorsement score that highlights accounts recognized by elite builders rather than influencers. The clarity signal evaluates information density and technical relevance using a classifier-based approach, distinguishing engineers who communicate with substance from those who post vague commentary or hype.
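One way to combine direct expert interactions with PageRank is sketched below. It uses a plain power-iteration PageRank over (source, target) interaction edges; the blend weight `alpha`, the interaction cap, and both function names are illustrative assumptions rather than the values used in the project.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {n: (1.0 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank

def endorsement_score(user, expert_interactions, rank, alpha=0.5, cap=10):
    """Blend capped direct expert interactions with normalized PageRank (hypothetical mix)."""
    direct = min(expert_interactions.get(user, 0), cap) / cap
    top = max(rank.values(), default=0.0)
    graph = rank.get(user, 0.0) / top if top > 0 else 0.0
    return alpha * direct + (1.0 - alpha) * graph
```

Here an edge `("expert_a", "alice")` means that `expert_a` replied to, quoted, or followed `alice`, so rank flows from experts toward the accounts they engage with.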

The anti-noise filter is critical. A large share of X is dominated by influencers, bots, marketers, and engagement farmers who generate massive posting volume but no engineering signal. We drop accounts that exceed thresholds for giveaway tweets, engagement bait phrases, affiliate links, or quote-spam without original content. This keeps the system from spending compute on ranking noisy, non-technical users.
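A threshold-based filter of this kind might look like the sketch below. The regex patterns, the per-category thresholds, and the tweet shape (a dict with `text` and an optional `is_quote` flag) are hypothetical placeholders chosen for illustration.

```python
import re

# Hypothetical patterns; a real filter would use a larger, tuned phrase list.
NOISE_PATTERNS = {
    "giveaway": re.compile(r"\b(giveaway|airdrop)\b", re.I),
    "engagement_bait": re.compile(r"\b(like and retweet|follow me|tag a friend)\b", re.I),
    "affiliate_link": re.compile(r"[?&](ref|aff|affiliate)=", re.I),
}
# Hypothetical maximum fraction of an account's tweets allowed per category.
MAX_FRACTION = {"giveaway": 0.05, "engagement_bait": 0.10,
                "affiliate_link": 0.10, "quote_spam": 0.30}

def noise_fractions(tweets):
    """Fraction of tweets falling into each noise category."""
    n = len(tweets)
    if n == 0:
        return {name: 0.0 for name in MAX_FRACTION}
    fractions = {name: sum(bool(p.search(t["text"])) for t in tweets) / n
                 for name, p in NOISE_PATTERNS.items()}
    # Quote-spam: a quote tweet that adds no original text.
    fractions["quote_spam"] = sum(
        bool(t.get("is_quote", False) and not t["text"].strip()) for t in tweets) / n
    return fractions

def is_noisy(tweets):
    """True if any category exceeds its threshold, so the account is dropped."""
    fractions = noise_fractions(tweets)
    return any(fractions[name] > limit for name, limit in MAX_FRACTION.items())
```

Because the check needs only tweet text, it can run before any of the more expensive scoring steps.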

The final composite score combines the seven scored dimensions with learned weights; the anti-noise (substance) signal acts as a pre-filter rather than a weighted term:

$$ \text{Score}_u = 0.22 \cdot \text{EAI}_u + 0.22 \cdot \text{PoW}_u + 0.18 \cdot \text{Vel}_u + 0.18 \cdot \text{Endorsement}_u + 0.12 \cdot \text{Discourse}_u + 0.08 \cdot \text{Stability}_u + 0.05 \cdot \text{BioMomentum}_u $$

where Early-Adopter Index and Proof-of-Work carry the most weight because they directly reflect curiosity and execution—the two traits xAI values most highly.
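The weighted sum can be written directly from the formula. In this sketch each signal is assumed to be pre-normalized to the range 0 to 1. The weights as listed add up to 1.05, so the optional `normalize` flag (an addition for illustration, not part of the original formula) divides by the weight sum to keep the result within that range.

```python
# Weights copied from the composite-score formula above.
WEIGHTS = {
    "EAI": 0.22,
    "PoW": 0.22,
    "Vel": 0.18,
    "Endorsement": 0.18,
    "Discourse": 0.12,
    "Stability": 0.08,
    "BioMomentum": 0.05,
}

def composite_score(signals, normalize=False):
    """Weighted sum of the seven signals; raises KeyError if one is missing."""
    score = sum(weight * signals[name] for name, weight in WEIGHTS.items())
    if normalize:
        # Divide by the total weight (1.05 as listed) so the maximum is 1.0.
        score /= sum(WEIGHTS.values())
    return score
```

For example, a candidate with a perfect Early-Adopter Index and zeros elsewhere scores 0.22 before normalization.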

The technology stack combines Next.js 16 with React 19 and TypeScript on the frontend, styled with Tailwind CSS and animated with Framer Motion. State management uses Zustand for reactive candidate data. The backend runs on Next.js API routes, with Supabase Edge Functions handling X API authentication via OAuth 1.0a with HMAC-SHA1 signatures. The xAI Grok API powers natural language search with live web verification, using the grok-4-1-fast-reasoning model. A Python pipeline implements the feature engineering for deep scoring, and Greenhouse integration enables one-click ATS export.

The autonomous outreach agent follows a two-phase protocol. First, it resolves the X handle to a user ID via the v2 API. Then it sends an OAuth 1.0a signed DM with a personalized message crafted using Grok, referencing specific work the candidate has shared. The agent tracks response status and parses replies for resume links, LinkedIn URLs, and GitHub profiles. This vets candidate interest before consuming pipeline resources—recruiting is a two-way street, and we wanted to respect that from the first interaction.
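The reply-parsing step can be illustrated with a small URL classifier. The function name, the regex, and the heuristic for spotting resume links (a `.pdf` suffix or "resume" in the path) are assumptions made for this sketch; no network calls are involved.

```python
import re
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s<>\"']+")

def _host_matches(host, domain):
    """True for the domain itself or any of its subdomains."""
    return host == domain or host.endswith("." + domain)

def extract_artifacts(reply_text):
    """Sort URLs found in a DM reply into GitHub, LinkedIn, and resume buckets."""
    artifacts = {"github": [], "linkedin": [], "resume": []}
    for raw in URL_RE.findall(reply_text):
        url = raw.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        path = parsed.path.lower()
        if _host_matches(host, "github.com"):
            artifacts["github"].append(url)
        elif _host_matches(host, "linkedin.com"):
            artifacts["linkedin"].append(url)
        elif path.endswith(".pdf") or "resume" in path:
            # Hypothetical heuristic for resume links.
            artifacts["resume"].append(url)
    return artifacts
```

URLs that match none of the buckets are ignored, so an unrelated link in a reply does not get attached to the candidate record.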

Challenges we ran into

Our initial ambition was to score millions of X accounts. This presented compounding challenges. Feature extraction across \(n \approx 10^6\) profiles requires \(O(n \cdot m)\) API calls, where \(m\) is the number of tweets per user. There are no ground-truth labels for "exceptional engineer" at scale, which makes supervised approaches expensive. And X's user base is dominated by non-technical accounts, so naive sampling yields fewer than 1% relevant profiles.

Rather than boiling the ocean, we constrained the search space through topical community seeding:

$$\mathcal{U}_{\text{seed}} = \left\{ u \;\middle|\; u \in \bigcup_{l \in \mathcal{L}} \text{followers}(l) \cup \text{members}(\text{topical\_spaces}) \right\}$$

where \(\mathcal{L}\) represents technical leaders like xAI, Ilya Sutskever, and prominent ML researchers. This reduced our candidate pool from approximately \(10^6\) to \(10^4\) while preserving signal density. We then applied the anti-noise filter, dropping another 60% of accounts before the expensive scoring computations.
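In code, the seed set is a plain set union. The sketch below assumes follower lists and topical-space members have already been fetched; the function name and input shapes are hypothetical.

```python
def build_seed_set(followers_by_leader, space_members):
    """Union of every leader's followers and the members of topical spaces."""
    seed = set(space_members)
    for followers in followers_by_leader.values():
        seed.update(followers)
    return seed
```

Using a set deduplicates accounts that follow several leaders, so each candidate is scored only once.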

The X API's DM endpoint also required OAuth 1.0a request signing with HMAC-SHA1 rather than the more common SHA-256. Implementing RFC 3986 percent encoding and constructing the Authorization header correctly required significant debugging, particularly around parameter sorting and base string construction.
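The signing steps that caused the debugging can be sketched with the Python standard library. The project's Edge Functions run on Deno, so this is a Python illustration of the same OAuth 1.0a procedure rather than the deployed code; the function names are chosen for this example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

def percent_encode(value):
    """RFC 3986 percent encoding: only unreserved characters stay unescaped."""
    return quote(str(value), safe="-._~")

def signature_base_string(method, url, params):
    """METHOD & encoded URL & encoded, sorted parameter string."""
    # Encode keys and values first, then sort by key (and by value on ties).
    encoded = sorted((percent_encode(k), percent_encode(v)) for k, v in params.items())
    param_string = "&".join(f"{k}={v}" for k, v in encoded)
    return "&".join([method.upper(), percent_encode(url), percent_encode(param_string)])

def sign(method, url, params, consumer_secret, token_secret=""):
    """Base64-encoded HMAC-SHA1 signature over the base string."""
    key = f"{percent_encode(consumer_secret)}&{percent_encode(token_secret)}"
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("ascii"), base.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")

def authorization_header(oauth_params):
    """Build the Authorization header from oauth_* params, including oauth_signature."""
    pairs = ", ".join(f'{percent_encode(k)}="{percent_encode(v)}"'
                      for k, v in sorted(oauth_params.items()))
    return f"OAuth {pairs}"
```

The two usual pitfalls are visible here: parameters must be percent-encoded before sorting, and the joined parameter string is encoded a second time when it is placed into the base string.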

Accomplishments that we're proud of

Rather than vibes-based recruiting, we built a mathematically rigorous scoring system grounded in observable behaviors. The entire flow from discovery through scoring to outreach and Greenhouse export requires zero manual data entry. The system gracefully handles missing indicators—lacking a GitHub profile does not disqualify a candidate because exceptional talent is not monolithic. Some brilliant engineers publish papers but never push public code; others ship constantly but avoid social media discourse. Our complementary signal design accommodates this diversity.

We also implemented real-time streaming via Server-Sent Events, so recruiters see analysis progress live rather than waiting for batch completion. This creates a responsive experience despite the computational complexity of the backend scoring pipeline.
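The Server-Sent Events wire format is simple enough to show in a few lines. The project's frontend and API routes are TypeScript, so this Python generator only illustrates the event framing; the event names `progress` and `done` and the payload fields are assumptions.

```python
import json

def sse_event(data, event=None):
    """Format one Server-Sent Event: optional event name, JSON data, blank line."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # json.dumps emits a single line by default, so one data field is enough.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"

def stream_analysis(signal_scores):
    """Yield one progress event per computed signal, then a final done event."""
    for name, score in signal_scores:
        yield sse_event({"signal": name, "score": score}, event="progress")
    yield sse_event({"status": "complete"}, event="done")
```

A server would send these chunks with the `text/event-stream` content type, and the browser's `EventSource` would dispatch them as they arrive.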

What we learned

X functions as a leading indicator for talent. Engineers reveal their trajectory on the platform months before their LinkedIn updates or job switches. The discourse patterns, the topics they engage with early, and the peers who recognize their work all telegraph future potential in ways that traditional recruiting channels miss entirely.

Anti-noise filters proved essential. Without aggressive pre-filtering, any scoring model gets drowned in engagement farming and influencer content. We also learned that autonomous outreach actually works—candidates respond well to personalized, contextual DMs that demonstrate genuine familiarity with their work, as opposed to generic recruiter spam.

Most importantly, we learned that focus beats breadth. By doing one part of the pipeline excellently—sourcing and initial outreach—rather than building shallow features across the entire recruitment funnel (voice screening, scheduling, interview prep), we created tooling that delivers genuine value. A mediocre end-to-end system helps no one; a precise talent identification engine changes how recruiting operates.

What's next for RecruiterX

We plan to deploy secondary agents that fetch complementary signals from GitHub and arXiv once candidates cross a confidence threshold, enriching profiles without requiring those signals for initial qualification. A calibration loop will feed recruiter feedback back into weight tuning, allowing the model to learn from hiring outcomes over time. We want to incorporate multi-modal scoring that analyzes profile images, pinned tweet structure, and thread formatting patterns. Finally, passive monitoring will track watchlisted candidates and alert recruiters when BioMomentum spikes—when someone updates their bio to reflect a new research direction or lab affiliation, that's often the optimal moment to reach out.

Built With

  • deno
  • dnd-kit
  • framer-motion
  • greenhouse-api
  • next.js
  • oauth-1.0a
  • postgresql
  • python
  • radix-ui
  • react
  • recharts
  • server-sent-events
  • supabase
  • tailwind-css
  • typescript
  • x-api
  • xai-grok-api
  • zustand