Girijesh Singh

PydanticAI · FastMCP · Azure OpenAI · PyTorch · Transformers · XGBoost · SHAP · FastAPI · PySpark · Azure Cosmos DB · OpenTelemetry · Arize Phoenix · Hierarchical RAG · GPT-4o · Azure Service Bus · Scikit-learn · Python · SQL · Docker · Citation Validation · Multi-Agent Systems · NLP · Fine-tuning

What I Bring

What I Bring to the Table

Three things that separate a production AI architect from someone who can run a notebook.

Zero-to-One GenAI Architecture

I don't just prompt-engineer; I build deterministic, multi-agent microservices from scratch. When off-the-shelf tools like LangChain fail at scale, I design custom orchestration layers on PydanticAI and FastMCP that hold up in production.

High-Performance System Optimization

I bridge the gap between Data Science and Data Engineering. I tune vector indices (Azure Cosmos DB IVF) to cut query latency from minutes to seconds, avoiding hundreds of thousands of dollars in cloud scale-out costs.

Measurable Enterprise Impact

My architectures don't just live in notebooks. I've directed teams of 10–14 engineers to deploy systems that deliver $10M+ in operational efficiency gains and process tens of thousands of queries daily across 4 global regions.

Technical Philosophy

Why I Stopped Using LangChain for Enterprise RAG

Moving a Generative AI prototype to a regulated production environment exposes the severe limitations of standard wrapper libraries. Within 30 days of deploying standard RAG on highly interlinked, 100+ page insurance documents, I diagnosed catastrophic context collapse and citation failure.

My solution wasn't a larger context window - it was a better architecture. I engineered a custom runtime schema transpilation layer and a hierarchical node retrieval engine from scratch. The result? A system that captures complex cross-references without hallucinations, dynamically scaling to thousands of users while keeping cloud costs flat.

“The best GenAI architecture isn't the one with the most features - it's the one that can provably tell you exactly why it gave the answer it did, every single time.”

Featured Work

What I've Built

Production systems. Published research. Shipped products.

🏗️

PRODUCTION · NDA

CoverAI: Zero-Hallucination Retrieval Engine

Lead Data Scientist · 2024–Present · Confidential Employer

Problem: Off-the-shelf LangChain RAG failed catastrophically on 700+ page insurance policies — hallucinating citations with legal consequences.

Build: Custom hierarchical JSON tree retrieval with runtime schema transpilation. Deterministic citation validation runs while output streams character by character. New carriers onboard via config, with zero code changes.
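A minimal sketch of what deterministic citation validation over a hierarchical document tree can look like. This is not the CoverAI code: the `Node` shape, the `(node_id, quote)` citation format, the normalization rule, and the sample policy text are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One section of a policy in a hierarchical tree (hypothetical shape)."""
    node_id: str
    text: str
    children: list[Node] = field(default_factory=list)


def index_tree(root: Node) -> dict[str, Node]:
    """Flatten the tree into an id -> node lookup using an explicit stack."""
    index: dict[str, Node] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        index[node.node_id] = node
        stack.extend(node.children)
    return index


def _normalize(s: str) -> str:
    """Lowercase and collapse whitespace so layout differences don't matter."""
    return " ".join(s.lower().split())


def validate_citations(citations: list[tuple[str, str]], index: dict[str, Node]) -> list[str]:
    """Return a list of problems; an empty list means every citation checks out.

    A citation passes only if the cited section exists and the quoted text
    appears in that section after normalization. No model call is involved,
    so the same input always gives the same verdict.
    """
    problems = []
    for node_id, quote in citations:
        node = index.get(node_id)
        if node is None:
            problems.append(f"{node_id}: unknown section")
        elif _normalize(quote) not in _normalize(node.text):
            problems.append(f"{node_id}: quote not found in section")
    return problems


# Made-up policy fragment, for illustration only.
policy = Node("1", "General conditions", [
    Node("1.1", "Water damage is covered up to the limit stated in Schedule A."),
    Node("1.2", "Flood is excluded unless endorsement F is attached."),
])
index = index_tree(policy)

good = [("1.1", "water damage is covered")]
bad = [("1.3", "anything"), ("1.2", "flood is covered")]
print(validate_citations(good, index))  # no problems reported
print(validate_citations(bad, index))   # one unknown section, one unsupported quote
```

Because the check is plain string and lookup logic, an answer can be blocked or flagged before it reaches the user whenever the returned list is non-empty.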

🔥 Scale: 30K+ daily queries · 2,000+ enterprise users

⚡ Speed: 10× lower latency (7.5s → 0.7s) · end-to-end under 30s

🛡️ Safety: 100% citation validation · zero hallucinated references

💰 Cost: $500K+ cloud scale-out costs avoided

PydanticAI · FastMCP · Hierarchical Node Retrieval · Azure Cosmos DB (IVF) · Microservices · Real-Time Citation Validation · FastAPI · OpenTelemetry · Azure OpenAI/GPT-4o · Adobe PDF Services

10× Lower Latency · <30s E2E Response · $500K+ Cloud Saved · 2,000+ Active Users · 30K+ Daily Queries · 4 Regions

⚡ PRODUCTION

Autonomous Multi-Line Claims Routing

ML Engineer / Lead · 2018–2021

XGBoost pipeline for insurance claim triage. SHAP values for 100% regulatory audit trail. CTO: "No dedicated team has ever built anything like this." 45% cost reduction · $200K annual savings · 35% faster settlements.
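A small sketch of the property that makes SHAP-style attributions usable as an audit trail: per-feature contributions that add up exactly to the gap between one prediction and the baseline prediction. To stay dependency-free it uses a linear score, where (assuming independent features) SHAP values reduce to `weight * (value - mean)`; it does not reproduce the XGBoost pipeline or the `shap` library. The feature names, weights, and numbers are invented for illustration.

```python
import math


def predict(weights: dict[str, float], bias: float, x: dict[str, float]) -> float:
    """Linear triage score: bias plus weighted sum of features."""
    return bias + sum(weights[k] * x[k] for k in weights)


def linear_attributions(weights: dict[str, float],
                        mean_x: dict[str, float],
                        x: dict[str, float]) -> dict[str, float]:
    """Per-feature contribution relative to the average claim."""
    return {k: weights[k] * (x[k] - mean_x[k]) for k in weights}


# Hypothetical model and data.
weights = {"claim_amount": 0.002, "prior_claims": 0.5, "days_to_report": 0.05}
bias = -1.0
mean_x = {"claim_amount": 1000.0, "prior_claims": 1.0, "days_to_report": 4.0}
claim = {"claim_amount": 2500.0, "prior_claims": 3.0, "days_to_report": 2.0}

attributions = linear_attributions(weights, mean_x, claim)
baseline_score = predict(weights, bias, mean_x)
claim_score = predict(weights, bias, claim)

# Audit rows, largest absolute contribution first.
for name, value in sorted(attributions.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>15}: {value:+.2f}")

# The contributions account for the whole difference from the baseline.
assert math.isclose(sum(attributions.values()), claim_score - baseline_score)
```

Storing these rows alongside each routing decision gives a reviewer a complete, additive explanation of why a claim scored above or below the average case.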

XGBoost · SHAP · Django · RabbitMQ · Explainable ML

🔤 INTERNAL

PriML: Natural Language → SQL

Team Lead of 10 · 2018–2021 · Pre-LLM era

Fine-tuned RAT-SQL transformer. NL query → SQL → Plotly dashboard, self-serve. 87% accuracy on complex multi-table JOINs, before general-purpose LLMs were widely available.

Transformers (RAT-SQL) · Fine-tuning · 87% Accuracy · Plotly · PostgreSQL

⚡ PRODUCTION · NDA

FNOL Classification Agent System

Lead Data Scientist · 2024–Present

Three-service microservice architecture (FastAPI + FastMCP + Azure Service Bus). A PydanticAI agent generates type-safe Pydantic models from per-tenant schemas at runtime. 95% alignment · 2% hallucination rate.
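A rough sketch of the idea of building a typed model from a per-tenant schema at runtime. To keep it dependency-free it uses the standard library's `dataclasses.make_dataclass` as a stand-in; in a Pydantic codebase the analogous tool is `pydantic.create_model`. The schema format, `TYPE_MAP`, `build_model`, `parse`, and the tenant fields are hypothetical and are not taken from the production system.

```python
from dataclasses import fields, make_dataclass

# Assumed mapping from schema type names to Python types.
TYPE_MAP = {"string": str, "integer": int, "number": float, "boolean": bool}


def build_model(name: str, schema: dict[str, str]) -> type:
    """Create an immutable dataclass at runtime from {field_name: type_name}."""
    spec = [(field_name, TYPE_MAP[type_name]) for field_name, type_name in schema.items()]
    return make_dataclass(name, spec, frozen=True)


def parse(model: type, payload: dict):
    """Instantiate the model, rejecting missing, extra, or wrongly typed fields."""
    expected = {f.name: f.type for f in fields(model)}
    if set(payload) != set(expected):
        raise ValueError(f"field mismatch: expected {sorted(expected)}, got {sorted(payload)}")
    for key, typ in expected.items():
        # Strict check: bool is not accepted for int, int is not accepted for float.
        if type(payload[key]) is not typ:
            raise TypeError(f"{key}: expected {typ.__name__}, got {type(payload[key]).__name__}")
    return model(**payload)


# Each tenant supplies its own schema as data; no new code per tenant.
tenant_a = build_model(
    "TenantAClaim",
    {"claim_id": "string", "loss_amount": "number", "injury": "boolean"},
)
claim = parse(tenant_a, {"claim_id": "C-1", "loss_amount": 1200.0, "injury": False})
print(claim)
```

The design point is that the schema lives in configuration, so an agent's output for a new tenant can be validated against that tenant's fields without a code change.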

PydanticAI · FastMCP · Azure Service Bus · Dynamic Schema Generation · Multi-Tenant

📄 PUBLISHED

Selective EEG Anonymization via Multi-Objective Autoencoders

PST 2023 · Copenhagen, Denmark · Lakehead University

Privacy-preserving ML for Brain-Computer Interfaces. Selective anonymization preserves clinical signal while eliminating re-identification vectors.

View on IEEE Xplore ↗

Multi-Objective Autoencoders · EEG/BCIs · PyTorch · PST 2023

Side Projects

🎬 OPEN SOURCE

DirectorAI

Browser-native AI video editor. Natural language → FFmpeg.wasm. TensorFlow.js face detection client-side. No uploads, no server, no privacy tradeoff.

React 19 · FFmpeg.wasm · TensorFlow.js · Vite

GitHub

🔌 LIVE

AI Conversation Exporter

Export ChatGPT, Claude, Gemini conversations as TXT, Markdown, JSON, or HTML. Zero permissions. All processing is local. Chrome + Firefox.

Chrome/Firefox Extension · Manifest V3 · Zero Permissions

GitHub

Experience

Career Timeline

Jan 2024 – Present

Lead Data Scientist

Primus Software Corporation

Waterloo, ON

Led cross-functional team of 10–14. Scaled enterprise AI to 2,000+ users globally. Resolved two production crises. Built zero-code carrier onboarding.

Promoted from Senior → Lead in 12 months

Jan 2023 – Jan 2024

Senior Data Scientist

Primus Software Corporation

Waterloo, ON

Diagnosed LangChain's fundamental limits on multi-document insurance policies. Designed hierarchical RAG architecture solo in 3 months. Resolved latency crisis: 2.5 min → 40 sec.

Sep 2021 – Apr 2023

M.Sc. Computer Science

Lakehead University

Thunder Bay, ON

Project-based Master's supervised by Dr. Garima Bajwa. Published privacy-preserving ML research at PST 2023, Copenhagen. Continued AI development at Primus concurrently.

Published at PST 2023 · Copenhagen

May – Aug 2022

Data Science Intern

Ciena

Ottawa, ON

PySpark pipelines, divisive clustering, manufacturing batch anomaly detection.

Jun 2018 – Dec 2022

ML Engineer → Senior ML Engineer

Primus Software Development

Noida, India → Canada (2021)

Built FNOL classification for Crawford & Company solo. Led PriML NL-to-SQL project (team of 10). Six years of insurance domain expertise start here.

CTO recognition + $1,000 bonus

Research

Publications

📚 Peer-Reviewed · IEEE · International Conference

Selective EEG Signal Anonymization using Multi-Objective Autoencoders

PST 2023 · Copenhagen, Denmark

Co-authored research on advanced autoencoder architectures for securing biological telemetry. Selective anonymization preserves clinical signal while eliminating re-identification vectors. Supervised by Dr. Garima Bajwa, Lakehead University.

View on IEEE Xplore ↗

📚 Peer-Reviewed · Springer

In-Memory Computation for Real-time Face Recognition

ICICT 2019 · Springer

Optimized edge-compute inference for computer vision on resource-constrained hardware. In-memory computation strategies significantly reduce latency for real-time face recognition workloads.

View on Springer ↗

Beyond the Code

Leadership & Communication

Data scientists who can communicate build better systems. The evidence is below.

🎤

Toastmasters International

Competitive public speaking training that directly informs how I present technical findings to non-technical stakeholders — executives, clients, and insurance carriers.

7× Best Impromptu Speaker

4× Best Evaluator

3× Best Prepared Speech

👥

Cross-Functional Team Lead

Ran day-to-day technical and delivery decisions for a team of 10–14. Direct stakeholder requirement gathering, refinement, and brainstorming. Second-most senior person on the team.

10–14 person cross-functional team

Multi-region, multi-carrier delivery

Client-facing requirement ownership