Production-style Prototype
Search, Retrieval & Applied AI

ExpertMatchAI

Hybrid expert search combining FAISS semantics, BM25 relevance, and location-aware ranking

Indexed 12,168 profiles with average search latency below 200 ms and p95 below 350 ms.

During my Software Engineering internship focused on AI/ML and data infrastructure at Global Futures Group, I built the retrieval and ranking infrastructure for an expert-matching platform covering 12,168 profiles. The system combined semantic search, BM25 lexical retrieval, structured filters, and tunable ranking signals, with average search latency under 200 ms, p95 latency under 350 ms, and full index rebuilds in under 60 seconds.

Context: Internship
Role: Software Engineer Intern, AI/ML & Data Infrastructure
Team: Solo
Date: Oct–Dec 2025

During my Software Engineering internship focused on AI/ML and data infrastructure at Global Futures Group, I designed and implemented the retrieval, ranking, API, and data infrastructure for an expert-matching platform. I transferred the production codebase to the company at the end of the internship.

12,168 profiles indexed | Average search latency under 200 ms | p95 search latency under 350 ms
Next.js, TypeScript, FastAPI, Python, PostgreSQL, Prisma

Overview

ExpertMatchAI is the retrieval and ranking platform I built during my internship at Global Futures Group. It helps users find experts by intent, specialty, and location: it combines semantic and lexical retrieval with structured filtering, then presents a transparent match score and explanation for each result.

Architecture

The platform combined a Next.js and TypeScript application with FastAPI retrieval services and PostgreSQL-backed profile data. Queries passed through structured location and specialty filters, semantic retrieval using sentence-transformer embeddings and FAISS, and BM25 lexical retrieval. A weighted ranking layer combined those signals into explainable match results, while a lexical fallback kept search available when semantic retrieval was unavailable.
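As a rough illustration of the weighted ranking layer described above, the sketch below blends three normalized signals into one score plus a short explanation string. The production code is not published, so the `Signals` fields, the `match_score` function, and the example weights are hypothetical stand-ins, not the deployed implementation.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    """Per-profile signals, each assumed to be pre-normalized to [0, 1]."""
    semantic: float  # embedding similarity
    lexical: float   # normalized BM25 score
    filters: float   # share of structured filters matched


def match_score(signals, weights):
    """Return (score, explanation) for one profile.

    Weights are rescaled to sum to 1 so the score stays in [0, 1].
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    contributions = {
        name: weights[name] / total * getattr(signals, name)
        for name in ("semantic", "lexical", "filters")
    }
    score = sum(contributions.values())
    explanation = ", ".join(
        f"{name} {value:.2f}" for name, value in contributions.items()
    )
    return score, explanation


# Hypothetical weights and signals, chosen only for illustration.
weights = {"semantic": 0.5, "lexical": 0.3, "filters": 0.2}
score, why = match_score(Signals(semantic=0.8, lexical=0.5, filters=1.0), weights)
print(f"{score:.2f}: {why}")
```

Returning the per-signal contributions alongside the total is one simple way to make a blended score explainable to the user.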

What I Built

I designed and implemented the semantic and lexical retrieval pipeline, weighted ranking logic, profile ingestion and indexing workflow, FastAPI services, PostgreSQL data layer, fallback behavior, observability, and deployment configuration. I also built the testing workflow across backend, frontend, and end-to-end layers.

  • Indexed 12,168 expert profiles and served hybrid semantic and lexical retrieval over the full directory.
  • Built semantic search using sentence-transformer embeddings and FAISS vector search, combined with BM25 lexical retrieval for exact-term matching.
  • Designed a weighted ranking layer that combined semantic, lexical, and structured location and specialty filter signals into a single explainable match score, with tunable weighting.
  • Added a lexical fallback so search stayed available when semantic retrieval was unavailable.
  • Built the profile ingestion and index-rebuild workflow, completing full-corpus rebuilds in under 60 seconds.
  • Reduced cold-start latency by 30% and validated the platform with pytest, Vitest, and Playwright across backend, frontend, and end-to-end layers.
  • Deployed the frontend on Vercel and backend services and PostgreSQL on Railway, then transferred the production codebase to Global Futures Group at the end of the internship.

Engineering Decisions

FAISS vector retrieval plus BM25 lexical search

Why — Semantic search captures related expertise expressed with different wording, while BM25 protects exact specialties, certifications, and query terms. Hybrid search preserves both conceptual recall and lexical precision.

Trade-off — Blending multiple signals requires deliberate weight calibration and relevance evaluation.
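The blend can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the published system: brute-force cosine similarity stands in for FAISS vector search, the two-dimensional vectors stand in for sentence-transformer embeddings, and the function names, example documents, and 0.6/0.4 weights are all hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for the query terms."""
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def min_max(values):
    """Rescale scores to [0, 1] so the two signals are comparable."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_terms, query_vec, docs, doc_vecs, w_sem=0.6, w_lex=0.4):
    """Return document indices ordered by the blended score, best first."""
    lex = min_max(bm25_scores(query_terms, docs))
    sem = min_max([cosine(query_vec, v) for v in doc_vecs])
    blended = [w_sem * s + w_lex * l for s, l in zip(sem, lex)]
    return sorted(range(len(docs)), key=lambda i: blended[i], reverse=True)


# Synthetic corpus and toy vectors, for illustration only.
docs = [
    ["pediatric", "cardiologist", "boston"],
    ["heart", "surgeon", "children"],
    ["tax", "attorney", "chicago"],
]
doc_vecs = [[0.9, 0.1], [0.8, 0.3], [0.0, 1.0]]
ranking = hybrid_rank(["pediatric", "cardiologist"], [1.0, 0.0], docs, doc_vecs)
print(ranking)
```

In this toy example the second document shares no query terms, so its BM25 score is zero, yet its vector is close to the query and it still ranks above the unrelated third document. That is the conceptual recall the semantic signal contributes, while the exact-term match keeps the first document on top.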

Runtime-tunable ranking weights

Why — Environment-configured semantic, BM25, and filter weights allow ranking behavior to change without rebuilding the application.

Trade-off — The weights still require evaluation against labeled matches or interaction data.
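A minimal sketch of environment-configured weights, assuming hypothetical variable names (`RANK_WEIGHT_SEMANTIC`, `RANK_WEIGHT_BM25`, `RANK_WEIGHT_FILTER`) and hypothetical defaults; the real configuration keys are not published. Invalid or negative values fall back to the default, and the result is normalized so the weights sum to 1.

```python
import math
import os

# Hypothetical defaults; the deployed values are not published.
DEFAULT_WEIGHTS = {"semantic": 0.5, "bm25": 0.3, "filter": 0.2}


def load_weights(env=None):
    """Read ranking weights from the environment and normalize them."""
    env = os.environ if env is None else env
    weights = {}
    for name, default in DEFAULT_WEIGHTS.items():
        raw = env.get(f"RANK_WEIGHT_{name.upper()}")
        try:
            value = float(raw) if raw is not None else default
        except ValueError:
            value = default  # unparseable input: keep the default
        if not math.isfinite(value) or value < 0:
            value = default  # reject NaN, infinity, and negative weights
        weights[name] = value
    total = sum(weights.values())
    if total == 0:
        return dict(DEFAULT_WEIGHTS)
    return {name: value / total for name, value in weights.items()}


# Passing a dict instead of os.environ keeps the example self-contained.
print(load_weights({"RANK_WEIGHT_SEMANTIC": "2", "RANK_WEIGHT_BM25": "1",
                    "RANK_WEIGHT_FILTER": "1"}))
```

Because the weights are read from the process environment, changing them in this sketch needs only a restart or redeploy of the service with new variables, not a code change.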

PostgreSQL filtering before final ranking

Why — Prisma applies geographic, specialty, rating, and experience filters before the ranking layer combines semantic and lexical signals.

Trade-off — Candidate filtering improves precision but can exclude profiles when structured metadata is incomplete.
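The filter-then-rank order, and the trade-off it carries, can be shown with an in-memory stand-in. In the real system the filtering ran as Prisma queries against PostgreSQL; the `Profile` fields, the `filter_candidates` function, and the sample data below are hypothetical and synthetic.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    name: str
    specialty: Optional[str]
    lat: Optional[float]
    lon: Optional[float]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def filter_candidates(profiles, specialty, lat, lon, radius_km):
    """Keep profiles that match the specialty and fall inside the radius."""
    kept = []
    for p in profiles:
        if p.specialty != specialty:
            continue
        if p.lat is None or p.lon is None:
            continue  # incomplete metadata: the profile is silently excluded
        if haversine_km(lat, lon, p.lat, p.lon) <= radius_km:
            kept.append(p)
    return kept


# Synthetic profiles; coordinates are approximate city centers.
profiles = [
    Profile("A", "cardiology", 42.37, -71.11),  # Cambridge, MA
    Profile("B", "cardiology", 40.71, -74.01),  # New York, NY
    Profile("C", "cardiology", None, None),     # missing location
    Profile("D", "tax law", 42.36, -71.06),     # wrong specialty
]
nearby = filter_candidates(profiles, "cardiology", 42.36, -71.06, radius_km=50)
print([p.name for p in nearby])
```

Profile C might be a perfectly good match, but it never reaches the ranking layer because its location is missing, which is the precision-versus-coverage cost described above.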

Graceful lexical fallback

Why — BM25 and structured filters keep search available when the FastAPI service or the embedding model cannot respond.

Trade-off — Fallback results lose the semantic signal and depend more heavily on keyword overlap.
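A minimal sketch of the fallback pattern, assuming the semantic path signals failure by raising `ConnectionError` or `TimeoutError`. The function names and the response shape are hypothetical; the stubs only simulate an unreachable embedding service.

```python
def search_with_fallback(query, semantic_search, lexical_search):
    """Try the semantic path first; fall back to lexical search on failure."""
    try:
        return {"mode": "hybrid", "results": semantic_search(query)}
    except (ConnectionError, TimeoutError) as exc:
        # Record why the fallback fired so it can be logged or monitored.
        return {
            "mode": "lexical_fallback",
            "results": lexical_search(query),
            "reason": type(exc).__name__,
        }


def broken_semantic(query):
    """Stub that simulates an unreachable embedding service."""
    raise ConnectionError("embedding service unreachable")


def keyword_search(query):
    """Stub standing in for BM25 plus structured filters."""
    return [f"lexical match for {query!r}"]


response = search_with_fallback("water policy", broken_semantic, keyword_search)
print(response["mode"])  # lexical_fallback
```

Tagging each response with the mode that produced it makes degraded results visible to both monitoring and the user interface.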

Results & Validation

Indexed 12,168 profiles, served search responses in under 200 ms on average with p95 below 350 ms, and rebuilt the full index in under 60 seconds. During deployment, the system sustained 99%+ uptime, reduced cold-start latency by 30%, and used zero-downtime delivery across Vercel and Railway.
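The average and p95 figures are summary statistics over per-request latencies. The sketch below shows the arithmetic using the nearest-rank percentile definition, which is an assumption about method. The samples are synthetic and exist only to demonstrate the calculation; they are not measurements from the deployed system.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in milliseconds, for illustration only.
latencies_ms = [120, 150, 180, 140, 160, 310, 130, 170, 190, 145]
mean_ms = sum(latencies_ms) / len(latencies_ms)
p95_ms = percentile(latencies_ms, 95)
print(f"mean={mean_ms} ms, p95={p95_ms} ms")
```

The two numbers answer different questions: a single slow request barely moves the mean but dominates the p95, which is why both are reported.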

I validated the platform with pytest for backend retrieval behavior, Vitest for frontend and ranking utilities, and Playwright for end-to-end search workflows.

A lexical fallback and structured filters keep retrieval working when semantic search is unavailable. This is supported by health checks, rate-limit handling, and a fallback that returns top-rated experts.

The application was deployed during the internship period with the frontend on Vercel and backend services and PostgreSQL on Railway. The production codebase was transferred to Global Futures Group at the end of the internship.

This case study describes the system at an architectural level. Code, internal data, and proprietary implementation details are not published. Any examples referenced in the portfolio are synthetic or anonymized.

Evidence / Technologies

Next.js, TypeScript, FastAPI, Python, PostgreSQL, Prisma, FAISS, BM25, Sentence Transformers, Hybrid Search, Semantic Search, Lexical Search, Vector Search, Retrieval Systems, Docker, Vercel, Railway, Playwright, Vitest, pytest