HR Tech · RAG Architecture · Neo4j Knowledge Graph · Multi-LLM

Hunel: AI-Native Recruitment Intelligence

Building a production RAG system that transforms how recruiters discover and match candidates — with multi-LLM orchestration, hybrid semantic search, and agentic workflows.

40+ Data Sources
92% Match Accuracy
<200ms Search Latency
70% Screening Time Saved

Production AI Stack

Neo4j knowledge graph for skill/role relationships
RAG with hybrid retrieval (dense + sparse + graph)
Multi-LLM orchestration with intelligent routing
Graph neural network-inspired matching
Multi-vector embeddings per entity
GDPR-compliant EU-hosted inference
Entering Scale Phase — Q2 2025
Approved for Public Sharing: This case study has been reviewed and approved by Hunel. Technical specifications, infrastructure details, and third-party service names have been generalized to protect proprietary systems.
The AI Problem

Keyword Matching Is Fundamentally Broken

Traditional recruitment technology relies on keyword matching — a fundamentally broken approach. "Python Developer" doesn't match "Software Engineer (Python)". Scanned CVs are invisible. Skill relationships are ignored.

Hunel needed an AI system that understands recruitment the way humans do — but at machine scale and speed. Not another chatbot wrapper. A production-grade RAG system that could:

  • Ingest and index data from 40+ job boards and national employment services
  • Process any CV format — text, scanned, creative/infographic
  • Understand semantic relationships between skills, roles, and career trajectories
  • Explain why a candidate matches, not just return a score

  • Keyword mismatch: qualified candidates missed by exact matching
  • 30% of CVs invisible: scanned documents can't be searched
  • No skill relationships: "React" doesn't surface "Frontend" experts
  • Black-box scoring: no explanation for why candidates match

The Solution

What We Delivered

92% Match Accuracy (vs recruiter decisions)
70% Screening Reduction (time saved per role)
<15s CV Processing (including OCR)
3x Throughput (qualified candidates)

Hybrid Semantic Search

Dense + sparse + graph retrieval with cross-encoder reranking. Not just vectors — real understanding of skill relationships.

Explainable AI Matching

Every match comes with human-readable reasoning: strengths, gaps, deal-breakers, and recommended pitch angles.

Agentic Workflows

Autonomous agents with tool-calling that search, score, enrich, and draft personalized outreach — with human-in-the-loop approval.
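
To make the agentic workflow concrete, here is a minimal TypeScript sketch of a tool-calling loop with a human-in-the-loop gate. It is illustrative only: the tool names, the LlmClient interface, and the approval hook are placeholders I introduce for the example, not the production interfaces.

// TypeScript sketch: a minimal tool-calling agent loop with human-in-the-loop
// approval. Tool names, the LLM client, and the approval hook are illustrative
// placeholders.

type ToolCall = { tool: string; args: Record<string, unknown> };
type AgentStep = { done: boolean; toolCall?: ToolCall; draft?: string };

interface LlmClient {
  // Returns either a tool call to execute next or a final draft to surface.
  next(history: string[]): Promise<AgentStep>;
}

const tools: Record<string, (args: Record<string, unknown>) => Promise<string>> = {
  searchCandidates: async (args) => `candidates for ${JSON.stringify(args)}`,
  scoreCandidate:   async (args) => `score for ${JSON.stringify(args)}`,
  enrichProfile:    async (args) => `enriched data for ${JSON.stringify(args)}`,
};

async function runAgent(
  llm: LlmClient,
  goal: string,
  approve: (draft: string) => Promise<boolean>, // human-in-the-loop gate
  maxSteps = 10,
): Promise<string | null> {
  const history = [`GOAL: ${goal}`];

  for (let step = 0; step < maxSteps; step++) {
    const { done, toolCall, draft } = await llm.next(history);

    if (done && draft) {
      // Nothing leaves the system without explicit recruiter approval.
      return (await approve(draft)) ? draft : null;
    }
    if (toolCall && tools[toolCall.tool]) {
      const observation = await tools[toolCall.tool](toolCall.args);
      history.push(`TOOL ${toolCall.tool} -> ${observation}`);
    } else {
      history.push("ERROR: unknown tool requested");
    }
  }
  return null; // step budget exhausted without an approved result
}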

Multi-Modal CV Processing

Text, scanned, creative/infographic — all processed with OCR, Vision AI, and structured extraction. No CV left unsearchable.

Technical Deep Dive

The RAG Architecture

A production-grade retrieval system that goes far beyond basic vector search

Hybrid Retrieval Pipeline

Stage 1 · Parallel Retrieval: dense (semantic), sparse (keyword), graph (relations)
Stage 2 · Fusion: Reciprocal Rank Fusion (sketched below), deduplication, diversity preservation
Stage 3 · Reranking: cross-encoder, deep attention, high precision
Stage 4 · LLM Reasoning: match scoring, strengths/gaps, pitch angles
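
Stage 2's fusion step is easy to sketch. This is a generic Reciprocal Rank Fusion implementation, not the production fusion code; k = 60 is just a commonly used default.

// TypeScript sketch: Reciprocal Rank Fusion over the three ranked lists
// produced in Stage 1 (dense, sparse, graph). Generic illustration only.

type RankedList = string[]; // candidate IDs, best first

function reciprocalRankFusion(lists: RankedList[], k = 60): string[] {
  const scores = new Map<string, number>();

  for (const list of lists) {
    list.forEach((id, rank) => {
      // Each list contributes 1 / (k + rank); k dampens the head of each list.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }

  // Deduplication falls out of the Map; sort by fused score, best first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Example: fuse dense, sparse, and graph results before cross-encoder reranking.
const fused = reciprocalRankFusion([
  ["cand_42", "cand_7", "cand_99"],   // dense (semantic)
  ["cand_7", "cand_13", "cand_42"],   // sparse (keyword)
  ["cand_99", "cand_42", "cand_55"],  // graph (relations)
]);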

Multi-Vector Embeddings

Not one embedding per CV — multiple vectors for skills, experience, trajectory, and ideal job. Match on specific experience, not just overall profile.
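
A minimal sketch of the idea, assuming a generic embed() function rather than any particular embedding provider: each facet of the CV gets its own vector, and a query can target one facet instead of the whole profile.

// TypeScript sketch: one embedding per facet of a CV instead of a single
// profile vector. embed() stands in for whatever embedding model is used.

declare function embed(text: string): Promise<number[]>;

interface CandidateVectors {
  candidateId: string;
  skills: number[];      // "Python, Django, PostgreSQL, ..."
  experience: number[];  // concatenated experience descriptions
  trajectory: number[];  // ordered role history, e.g. "Dev -> Senior Dev -> Lead"
  idealJob: number[];    // stated preferences / objective section
}

async function embedCandidate(cv: {
  id: string; skills: string; experience: string; trajectory: string; idealJob: string;
}): Promise<CandidateVectors> {
  const [skills, experience, trajectory, idealJob] = await Promise.all([
    embed(cv.skills), embed(cv.experience), embed(cv.trajectory), embed(cv.idealJob),
  ]);
  return { candidateId: cv.id, skills, experience, trajectory, idealJob };
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// "Has done fintech" queries the experience vector, not the whole profile.
async function scoreOnExperience(query: string, c: CandidateVectors): Promise<number> {
  return cosine(await embed(query), c.experience);
}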

Multi-LLM Orchestration

Intelligent routing by task complexity, latency, and privacy requirements. Reasoning LLM for complex matching, speed LLM for autocomplete, privacy LLM for sensitive data.
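
The routing logic can be sketched in a few lines. The tier names and thresholds below are illustrative assumptions, not the production policy.

// TypeScript sketch: route each request to a model tier based on task,
// latency budget, and whether it carries PII. Tier names are illustrative.

type ModelTier = "reasoning" | "speed" | "privacy-eu";

interface LlmRequest {
  task: "matching" | "autocomplete" | "extraction" | "outreach";
  containsPii: boolean;
  latencyBudgetMs: number;
}

function routeModel(req: LlmRequest): ModelTier {
  // Privacy wins over everything else: PII never leaves compliant endpoints.
  if (req.containsPii) return "privacy-eu";

  // Tight latency budgets (autocomplete, typeahead) go to the fast model.
  if (req.latencyBudgetMs < 500 || req.task === "autocomplete") return "speed";

  // Complex scoring and reasoning get the slower, stronger model.
  return "reasoning";
}

// Examples
routeModel({ task: "autocomplete", containsPii: false, latencyBudgetMs: 300 });  // "speed"
routeModel({ task: "matching",     containsPii: true,  latencyBudgetMs: 5000 }); // "privacy-eu"
routeModel({ task: "matching",     containsPii: false, latencyBudgetMs: 5000 }); // "reasoning"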

Neo4j Knowledge Graph

Graph neural network approach: skills, roles, companies, and trajectories as nodes. Relationships encode "leads to", "requires", "similar to". Traverse the graph to expand queries and infer implicit matches.

Vision AI Processing

OCR with language detection. Image preprocessing. Creative CV extraction. Portfolio screenshot analysis. No document type left unsearchable.
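
As a rough illustration of the routing decision, here is a sketch that falls back to OCR only when a PDF has no usable text layer. pdf-parse and tesseract.js are stand-in libraries chosen for the example, not necessarily the production tools, and the page renderer is a declared placeholder.

// TypeScript sketch: route a CV to native text extraction or OCR depending on
// whether a usable text layer exists. Libraries here are stand-ins.

import pdf from "pdf-parse";
import Tesseract from "tesseract.js";

// Placeholder: render each PDF page to an image buffer (e.g. via poppler);
// the implementation is out of scope here.
declare function renderPdfPagesToImages(buffer: Buffer): Promise<Buffer[]>;

const MIN_TEXT_LAYER_CHARS = 200; // below this we assume a scanned document

async function extractCvText(pdfBuffer: Buffer): Promise<string> {
  // 1. Try the embedded text layer first: cheap and lossless when present.
  const parsed = await pdf(pdfBuffer);
  if (parsed.text.trim().length >= MIN_TEXT_LAYER_CHARS) return parsed.text;

  // 2. Scanned or image-based CV: render pages and OCR them.
  //    "eng+fra" is an assumption about the dominant languages in the corpus.
  const pages = await renderPdfPagesToImages(pdfBuffer);
  const ocrResults = await Promise.all(
    pages.map((img) => Tesseract.recognize(img, "eng+fra")),
  );
  return ocrResults.map((r) => r.data.text).join("\n\n");
}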

Structured Extraction

LLM-powered parsing to JSON schema. Experience, skills, achievements, seniority level — all inferred and structured from raw text.
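
A sketch of what schema-validated extraction can look like, using zod for validation; the schema fields follow the description above, and the LLM call itself is a declared stub.

// TypeScript sketch: validate LLM extraction output against a strict schema
// before it enters the database. zod is used for illustration; the prompt
// and model call are placeholders.

import { z } from "zod";

const CvSchema = z.object({
  fullName: z.string(),
  seniority: z.enum(["junior", "mid", "senior", "lead", "executive"]),
  skills: z.array(z.string()),
  experience: z.array(z.object({
    role: z.string(),
    company: z.string(),
    startYear: z.number().int(),
    endYear: z.number().int().nullable(), // null = current position
    achievements: z.array(z.string()),
  })),
});
type StructuredCv = z.infer<typeof CvSchema>;

declare function llmExtractJson(prompt: string, rawCvText: string): Promise<unknown>;

async function extractCv(rawCvText: string): Promise<StructuredCv> {
  const raw = await llmExtractJson(
    "Extract the CV into the provided JSON schema. Use null when unknown.",
    rawCvText,
  );
  const result = CvSchema.safeParse(raw);
  if (!result.success) {
    // Reject rather than store malformed or hallucinated structure.
    throw new Error(`Extraction failed validation: ${result.error.message}`);
  }
  return result.data;
}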

GDPR-Compliant Inference

EU-hosted LLM options for sensitive PII. Data residency controls. Automatic routing of personal data to compliant endpoints.

Graph Intelligence

Neo4j Knowledge Graph

Why vectors alone aren't enough — and how graph relationships enable true semantic understanding

The Problem with Pure Vector Search

Vector similarity finds "Python Developer" ≈ "Software Engineer" — but it doesn't understand that Python → Django → Web Backend → API Design is a skill progression. It doesn't know that someone who did Django for 5 years probably knows REST APIs, even if they never listed it. That's where the graph comes in.

Vector-Only Approach

"React Developer" and "Frontend Engineer" are similar vectors, but you miss that React → TypeScript → Testing Library is a common skill cluster that implies deeper frontend expertise.

Graph-Enhanced Approach

Traverse from "React" node to connected skills. Weight by co-occurrence frequency. Expand the search to include implied competencies. Score based on graph centrality.

Graph Node Types & Relationships

Skills: Python, React, AWS...
Roles: Backend Dev, Tech Lead...
Companies: type, size, industry...
Trajectories: career paths, progressions...
// Neo4j Cypher — Expand skill query with graph traversal
MATCH (skill:Skill {name: "Python"})
      -[:OFTEN_USED_WITH|LEADS_TO|REQUIRES*1..2]->(related:Skill)
WHERE related.category IN ["backend", "data", "devops"]
RETURN related.name AS name, count(*) AS weight
ORDER BY weight DESC
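
To show how this feeds retrieval, here is a sketch that runs the expansion query through the official neo4j-driver package and returns weighted related skills for query expansion. Connection details are placeholders.

// TypeScript sketch: run the expansion query above via neo4j-driver and merge
// the related skills into the retrieval query. Connection details illustrative.

import neo4j from "neo4j-driver";

const driver = neo4j.driver(
  "neo4j://localhost:7687",
  neo4j.auth.basic("neo4j", "password"),
);

async function expandSkill(skill: string): Promise<{ name: string; weight: number }[]> {
  const session = driver.session();
  try {
    const result = await session.run(
      `MATCH (s:Skill {name: $skill})
             -[:OFTEN_USED_WITH|LEADS_TO|REQUIRES*1..2]->(related:Skill)
       WHERE related.category IN ["backend", "data", "devops"]
       RETURN related.name AS name, count(*) AS weight
       ORDER BY weight DESC LIMIT 10`,
      { skill },
    );
    return result.records.map((r) => ({
      name: r.get("name") as string,
      weight: (r.get("weight") as { toNumber(): number }).toNumber(),
    }));
  } finally {
    await session.close();
  }
}

// "Python" might expand to Django, FastAPI, PostgreSQL, ... which are then
// added with lower weights to the dense and sparse retrieval queries.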

Graph Neural Network-Inspired Matching

Message Passing

Skills "propagate" relevance to neighbors. A strong Python signal boosts Django, FastAPI, Flask nodes in the candidate's profile.

Graph Centrality

Candidates with skills at "hub" positions (high connectivity) are more versatile. PageRank-style scoring identifies T-shaped profiles.

Path Analysis

Career trajectory = path through role nodes. Predict next likely role. Identify unconventional but successful transitions.
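
A toy sketch of the message-passing idea: one propagation round over a weighted skill graph lets explicitly listed skills boost their neighbours. The edge weights here are invented for illustration; in practice they would come from co-occurrence statistics in the knowledge graph.

// TypeScript sketch: one round of "message passing" over a weighted skill
// graph. A strong Python signal boosts Django, FastAPI, and so on. Toy data.

type SkillGraph = Record<string, { neighbor: string; weight: number }[]>;

const graph: SkillGraph = {
  Python:  [{ neighbor: "Django", weight: 0.8 }, { neighbor: "FastAPI", weight: 0.7 }],
  Django:  [{ neighbor: "REST APIs", weight: 0.9 }, { neighbor: "PostgreSQL", weight: 0.6 }],
  FastAPI: [{ neighbor: "REST APIs", weight: 0.9 }],
};

function propagate(explicitSkills: string[], damping = 0.5): Map<string, number> {
  // Explicit skills start with full relevance.
  const relevance = new Map<string, number>(explicitSkills.map((s) => [s, 1]));

  for (const skill of explicitSkills) {
    for (const { neighbor, weight } of graph[skill] ?? []) {
      // Neighbours receive a damped share of the source skill's relevance.
      const boost = damping * weight * (relevance.get(skill) ?? 0);
      relevance.set(neighbor, Math.max(relevance.get(neighbor) ?? 0, boost));
    }
  }
  return relevance;
}

// A CV listing only "Python" and "Django" still scores on "REST APIs",
// even though the candidate never wrote those words.
propagate(["Python", "Django"]);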

Technology

The Production Stack

Built for scale from day one — not a prototype that needs rebuilding

AI & Machine Learning

Multi-LLM Gateway · Neo4j Knowledge Graph · Vector Database · Cross-Encoder Reranking · Multilingual Embeddings

Backend

Node.js · TypeScript · Express · Async Workers · Message Queue

Data Layer

PostgreSQL · Redis · Object Storage · Search Engine

Infrastructure

Docker · Dual-Container Architecture · API + Worker Separation

Production Challenges

The Hard Parts

The problems that separate POC-land from production reality

Subjective Job Titles & Semantic Ambiguity

Job titles are subjective and inconsistent. "Tech Lead" at a startup ≠ "Tech Lead" at a bank. "Full-Stack Developer" could mean React+Node or PHP+jQuery. You can't match on titles — you need to understand the context behind them.

Solution: Context-first approach. Extract the actual responsibilities, technologies, team size, and company context from experience descriptions. Map to a normalized skill/role taxonomy (ESCO) before matching. The match happens on what they did, not what their title said.
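
A sketch of that context-first flow, with the extraction and taxonomy-mapping calls as declared stubs; the ESCO identifier shown is a placeholder, not a real code.

// TypeScript sketch: context-first normalization before matching. The ESCO
// identifier is a placeholder; the LLM extraction calls are stubs.

interface ExperienceContext {
  responsibilities: string[];
  technologies: string[];
  teamSize: number | null;
  companyContext: "startup" | "scaleup" | "enterprise" | "unknown";
}

interface NormalizedRole {
  escoCode: string;   // e.g. "esco:placeholder-backend-developer" (illustrative)
  escoLabel: string;
  context: ExperienceContext;
}

declare function extractContext(experienceText: string): Promise<ExperienceContext>;
declare function mapToTaxonomy(ctx: ExperienceContext): Promise<{ escoCode: string; escoLabel: string }>;

async function normalizeExperience(experienceText: string): Promise<NormalizedRole> {
  // 1. Context: what did they actually do, with what, at what scale?
  const context = await extractContext(experienceText);
  // 2. Map: project that context onto a normalized skill/role taxonomy.
  const { escoCode, escoLabel } = await mapToTaxonomy(context);
  // 3. Match happens later, on escoCode + context, never on the raw job title.
  return { escoCode, escoLabel, context };
}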

Hallucinations & Confidence Calibration

LLMs confidently make things up. A recruiter sending outreach based on hallucinated experience destroys trust instantly.

Solution: Multi-stage verification. RAG grounds all claims in retrieved documents. Confidence scoring with explicit uncertainty. "Unable to verify" is a valid answer. Cross-reference extracted data against source documents.
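
For illustration, a deliberately naive grounding check: every claim must point at a retrieved chunk, and anything that cannot be matched is flagged rather than presented as fact. Production verification would use an entailment model or stricter citation formats; this is only a sketch of the shape.

// TypeScript sketch: naive grounding check. Every claim about a candidate
// must cite a retrieved source chunk; anything unmatched is flagged.

interface Claim { text: string; sourceChunkId: string | null }
interface Chunk { id: string; text: string }

function verifyClaims(claims: Claim[], retrievedChunks: Chunk[]) {
  const chunksById = new Map(retrievedChunks.map((c) => [c.id, c]));

  return claims.map((claim) => {
    const source = claim.sourceChunkId ? chunksById.get(claim.sourceChunkId) : undefined;
    const sourceText = source?.text ?? "";

    // Very rough textual-overlap check, standing in for a real entailment model.
    const supported =
      sourceText.length > 0 &&
      claim.text.split(/\s+/).some((w) => w.length > 4 && sourceText.includes(w));

    return { ...claim, status: supported ? "verified" : "unverified" };
  });
}

// Downstream, "unverified" claims are shown with an explicit warning, and
// "unable to verify" remains a valid answer for the recruiter.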

Context Window Limits in Production

Matching requires comparing job requirements against multiple candidates, each with multi-page CVs. Context windows fill up fast. Chunking destroys semantic coherence.

Solution: Multi-vector representation — separate embeddings for skills, experience, trajectory. Hierarchical summarization for context-heavy comparisons. Smart chunking that respects document structure (sections, not arbitrary splits).
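
A sketch of structure-aware chunking: split on CV section headings first, and only fall back to size-based splitting inside oversized sections. The heading list is an assumption about typical CV structure.

// TypeScript sketch: structure-aware chunking that respects CV sections and
// paragraph boundaries instead of arbitrary character splits.

const SECTION_HEADINGS = /^(experience|work history|education|skills|projects|certifications)\b/im;

interface Chunk { section: string; text: string }

function chunkCv(cvText: string, maxChars = 2000): Chunk[] {
  // 1. Split on section boundaries so each chunk stays semantically coherent.
  const parts = cvText.split(
    /\n(?=(?:experience|work history|education|skills|projects|certifications)\b)/i,
  );

  const chunks: Chunk[] = [];
  for (const part of parts) {
    const section = (part.match(SECTION_HEADINGS)?.[0] ?? "header").toLowerCase();

    if (part.length <= maxChars) {
      chunks.push({ section, text: part.trim() });
      continue;
    }
    // 2. Oversized section: split on paragraph boundaries, not mid-sentence.
    let buffer = "";
    for (const para of part.split(/\n{2,}/)) {
      if (buffer.length + para.length > maxChars && buffer) {
        chunks.push({ section, text: buffer.trim() });
        buffer = "";
      }
      buffer += para + "\n\n";
    }
    if (buffer.trim()) chunks.push({ section, text: buffer.trim() });
  }
  return chunks;
}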

Latency vs Accuracy Trade-offs

Users expect autocomplete in <500ms. But high-quality matching needs cross-encoder reranking and LLM reasoning. You can't have both with a naive implementation.

Solution: Multi-LLM routing. Fast model for real-time suggestions. Reasoning model for final scoring. Progressive disclosure — show fast results immediately, refine in background. Cache common patterns.

GDPR & Data Residency

CVs contain sensitive PII. European regulations require data residency controls. Most LLM providers are US-based. Compliance isn't optional.

Solution: EU-hosted inference options in the LLM gateway. Automatic routing of PII-heavy requests to compliant endpoints. Data classification at ingestion. Audit trail for all AI decisions.
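
A sketch of that compliance path: classify at ingestion, pin PII to an EU endpoint, and write an audit record for every AI decision. The endpoint URLs, classifier, and audit sink below are placeholders.

// TypeScript sketch: classify a request, route PII to an EU-hosted endpoint,
// and record an audit entry. URLs and helpers are placeholders.

interface InferenceRequest { payload: string; purpose: string }
interface AuditEntry {
  timestamp: string;
  purpose: string;
  endpoint: string;
  containedPii: boolean;
}

const EU_ENDPOINT = "https://llm.eu-region.example.internal";      // placeholder
const DEFAULT_ENDPOINT = "https://llm.default.example.internal";   // placeholder

// Placeholder classifier: production would combine regex, NER, and field-level
// metadata attached when the document was ingested.
declare function containsPii(text: string): boolean;
declare function appendAuditLog(entry: AuditEntry): Promise<void>;

async function routeInference(req: InferenceRequest): Promise<string> {
  const pii = containsPii(req.payload);
  const endpoint = pii ? EU_ENDPOINT : DEFAULT_ENDPOINT;

  await appendAuditLog({
    timestamp: new Date().toISOString(),
    purpose: req.purpose,
    endpoint,
    containedPii: pii,
  });
  return endpoint; // the caller sends the actual inference request here
}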

Key Innovations

What Made This Different

1

Context → Map → Match Pipeline

Job titles are subjective garbage. "Tech Lead" means different things everywhere. So we extract context first (responsibilities, team size, tech), map to a normalized taxonomy (ESCO), then match. Never skip to similarity without understanding.

2

Multi-Vector CV Representation

Instead of one embedding per candidate, we generate multiple vectors capturing different aspects: skills, experience, trajectory, ideal job. "Show me candidates who HAVE DONE fintech" vs "know fintech".

3

Hybrid Retrieval with Reranking

Combining dense (semantic), sparse (keyword), and graph (relational) search with cross-encoder reranking. Vectors alone miss exact matches. Keywords alone miss semantics. We use both.

4

Explainable AI Matching

Every match score comes with human-readable reasoning — strengths, gaps, deal-breakers, and recommended pitch angles. Recruiters can trust and verify.

Key Insight

What I Learned

" The hardest part wasn't the LLM — it was understanding that job titles are meaningless without context. A 'Senior Developer' at a 5-person startup has completely different experience than one at a bank. The breakthrough was: context first, then map to taxonomy, then match. You can't skip straight to vector similarity. The LLM is maybe 20% of the work. The other 80% is understanding the domain well enough to know what actually matters. "

Building an AI System?

Whether it's RAG, multi-LLM orchestration, agentic workflows, or semantic search — I've shipped production systems that handle the hard parts. Let's talk about your project.

Let's Talk About Your AI Project
Approved for Public Sharing: This case study has been prepared with the explicit consent of Hunel. Technical specifications, infrastructure details, third-party service names, and other sensitive information have been generalized or omitted to protect proprietary systems and security posture.
Available now

Let's Build Something That Ships

Free 30-minute strategy call. No pitch deck. No sales pressure. Just an honest conversation about your project.

Faster · 50% less cost · 100% satisfaction
No commitment · 24h response · Free discovery call

Trusted by startups, scale-ups, and enterprises across France and internationally.