Open to Lead & Staff DS Roles · $190K+ CAD

Girijesh Singh

Architecting Enterprise GenAI for High-Stakes Environments · Waterloo, ON

I build production-grade AI systems where hallucinations carry legal consequences, latency costs millions, and scale is non-negotiable. Currently driving $10M+ in operational efficiency as a Lead Data Scientist.

2,000+ Production Users
30K+ Daily AI Queries
$10M+ Annual Impact
6 yrs Domain Depth

What I Bring to the Table

Three things that separate a production AI architect from someone who can run a notebook.

01

Zero-to-One GenAI Architecture

I don't just prompt-engineer; I build deterministic, multi-agent microservices from scratch. When off-the-shelf tools like LangChain fail at scale, I design custom orchestration layers (PydanticAI, FastMCP) that actually work in production.

02

High-Performance System Optimization

I bridge the gap between Data Science and Data Engineering. I optimize vector indices (Azure Cosmos DB IVF) to cut query latencies from minutes to seconds, avoiding hundreds of thousands of dollars in cloud scale-out costs.

03

Measurable Enterprise Impact

My architectures don't just live in notebooks. I've led cross-functional teams of up to 14 to deploy systems that drive $10M+ in operational efficiency and process tens of thousands of queries daily across 4 global regions.

Why I Stopped Using LangChain for Enterprise RAG

Moving a Generative AI prototype to a regulated production environment exposes the severe limitations of standard wrapper libraries. Within 30 days of deploying standard RAG on highly interlinked, 100+ page insurance documents, I diagnosed catastrophic context collapse and citation failure.

My solution wasn't a larger context window - it was a better architecture. I engineered a custom runtime schema transpilation layer and a hierarchical node retrieval engine from scratch. The result? A system that captures complex cross-references without hallucinations, dynamically scaling to thousands of users while keeping cloud costs flat.

"The best GenAI architecture isn't the one with the most features - it's the one that can provably tell you exactly why it gave the answer it did, every single time."

Real Impact at Scale

Production AI in a regulated, high-stakes industry - every number below comes from a live system, not a demo.

2,000+
Active Users
Insurance adjusters · US · UK · AU · EU
30K+
Daily AI Queries
Multi-carrier, multi-tenant platform
-
Latency Reduction
Hours of manual work → under 30 seconds
$10M+
Est. Annual Savings
In insurance adjuster time costs
4
Global Regions
US · UK · Australia · EU deployment
-
Throughput Gain
More queries handled - zero added cloud cost

What I've Built

Production systems, published research, and shipped products - not demos. Everything here is real and has been used at scale.

PRODUCTION
Autonomous Multi-Line Claims Routing
ML Engineer / Lead · 2018–2021 · Confidential Employer
The Problem: Multi-line insurance claims were routed manually, which cost excessive time and money and lacked the auditability regulators required.
The Build: Designed an explainable, human-in-the-loop XGBoost architecture that autonomously assessed claims based on cost, history, and location. Maintained 100% regulatory compliance using feature-level SHAP values. CTO: "No dedicated team has ever built anything like this."
📉 Impact: Handled 300+ daily claims with 45% reduction in routing costs (~$200K annual savings)
Speed: Accelerated claim settlements by 35% - adjusters moved from manual triage to high-value decisions
45%
Routing Cost Reduction
$200K
Annual Savings
35%
Faster Settlement
XGBoost · SHAP · Django · RabbitMQ · Human-in-the-Loop · Explainable ML
🔤
INTERNAL PRODUCT
PriML: Natural Language Business Intelligence
Team of 10 · 2018–2021 · Pre-LLM Era
The Problem: Non-technical stakeholders were waiting up to 3 days for custom BI reports because of complex multi-table JOINs across 50+ diverse relational schemas.
The Build: Fine-tuned Microsoft's RAT-SQL transformer model, served via a Django/RabbitMQ backend. Built an end-to-end pipeline: natural language query → SQL → database → auto-generated Plotly dashboard. Demoed to external insurance clients; adopted internally.
🎯 Accuracy: 87% zero-shot accuracy on complex multi-table JOINs - before general-purpose LLMs were widely available
⏱️ Speed: Cut BI reporting turnaround from up to 3 days to seconds - stakeholders got answers self-serve
Transformers (RAT-SQL) · Fine-tuning · 87% Zero-Shot Accuracy · Plotly/Dash · Django · PostgreSQL
📄
PUBLISHED · PST 2023
Selective EEG Signal Anonymization using Multi-Objective Autoencoders
Research · Lakehead University · Copenhagen, Denmark · 2023
Co-authored research on advanced autoencoder architectures for securing sensitive biological telemetry. Proposes selective EEG anonymization - preserving clinically relevant signal components while eliminating re-identification vectors. Published at PST 2023, Copenhagen.
Multi-Objective Autoencoders · EEG / BCIs · PyTorch · Published Research · PST 2023 (Copenhagen)

What I Work With

Tools chosen for production reliability, not resume padding.

🤖 LLM Orchestration
FastMCP · PydanticAI · Real-Time Citation Validation · Hierarchical Node Retrieval · Cohere v3.5 Neural Reranking · Azure OpenAI / GPT-4o · Embedding Models · Semantic Search
📊 Core ML / AI
PyTorch · Transformers · XGBoost · SHAP (Explainability) · Adobe / GPT-4o Vision OCR · NLP / Fine-tuning · Time Series (Prophet) · Scikit-learn
☁️ Data & Scaling
PySpark · Azure Cosmos DB (IVF) · Azure Service Bus · RabbitMQ · PostgreSQL · Azure Blob Storage · MongoDB · Docker
🚀 Delivery
FastAPI · OpenTelemetry · Arize Phoenix · Django · Flask · Adobe PDF Services API · Team Leadership (10–14) · Insurance Domain (6 yrs)

Career Timeline

7+ years of progressively deeper AI work, from solo ML engineer to leading a cross-functional team of 14.

Jan 2024 - Present
Lead Data Scientist
Primus Software Corporation
📍 Waterloo, ON, Canada
Led a cross-functional team of 10–14 (frontend, backend, QA, 7 data scientists). Scaled enterprise AI platform to 2,000+ active users across US, UK, AU, and EU. Resolved two production crises, built the graph-based execution engine for zero-code carrier onboarding, and extended the platform into coverage verdicts and automated letter generation.
Promoted from Senior → Lead in 12 months
Jan 2023 - Jan 2024
Senior Data Scientist
Primus Software Corporation
📍 Waterloo, ON, Canada
Diagnosed LangChain's fundamental limits on multi-document insurance policies, proposed scrapping the framework, and designed the hierarchical RAG architecture from scratch - solo - in 3 months. Resolved a latency crisis (2.5 min → 40 sec) through single-trip retrieval redesign and cutting LLM calls from 5 to 2.
Sep 2021 - Apr 2023
M.Sc. Computer Science + Graduate Research Assistant
Lakehead University
📍 Thunder Bay, ON, Canada
Project-based Master's, supervised by Dr. Garima Bajwa. Published privacy-preserving ML research for Brain-Computer Interfaces at the PST 2023 international conference (Copenhagen). Continued part-time AI development at Primus Software concurrently, retained by the company throughout the degree.
Published at PST 2023 · Copenhagen, Denmark
May - Aug 2022
Data Science Intern
Ciena
📍 Ottawa, ON, Canada
Worked on PySpark-based data pipelines, divisive clustering, and manufacturing batch anomaly detection. Recognized early that the role skewed toward data engineering, which confirmed my commitment to an applied data science path.
Jun 2018 - Dec 2022
ML Engineer → Senior ML Engineer
Primus Software Development
📍 Noida, India (+ Remote, Canada from 2021)
Built the FNOL classification system for Crawford & Company as sole ML engineer with direct client ownership. Built PriML (NL-to-SQL product) as technical lead on a team of 10. Insurance domain expertise begins here. Retained remotely by Primus after moving to Canada for my Master's - a clear signal of value.
CTO recognition + $1,000 performance bonus

Publications

📚 Peer-Reviewed · International Conference
Selective EEG Signal Anonymization using Multi-Objective Autoencoders
20th International Conference on Privacy, Security and Trust (PST 2023) · Copenhagen, Denmark
Co-authored research on advanced autoencoder architectures for securing sensitive biological telemetry. Proposes selective anonymization of EEG signals - preserving clinically relevant signal components while eliminating re-identification vectors. Supervised by Dr. Garima Bajwa, Lakehead University, Thunder Bay, ON.
📚 Peer-Reviewed · Springer
In-Memory Computation for Real-time Face Recognition
International Conference on Inventive Computation Technologies (ICICT 2019) · Springer
Published research on optimizing edge-compute inference times for computer vision models. Demonstrates in-memory computation strategies that significantly reduce latency for real-time face recognition workloads on resource-constrained hardware.

Leadership & Communication

Data scientists who can communicate build better systems. The evidence is below.

🎤
Toastmasters International
Competitive public speaking training that directly informs how I present technical findings to non-technical stakeholders - executives, clients, and insurance carriers.
7× Best Impromptu Speaker
4× Best Evaluator
3× Best Prepared Speech
👥
Cross-Functional Team Lead
Ran day-to-day technical and delivery decisions for a team of 10–14. Handled stakeholder requirement gathering, refinement, and brainstorming directly. Second-most senior person on the team: a manager above me, and I ran everything technical below.
10–14 person cross-functional team
Multi-region, multi-carrier delivery
Client-facing requirement ownership

Open to the Right Lead & Staff DS Roles

Available for $190K+ Lead & Staff DS Roles
Waterloo area or remote · Big Tech, AI-native, Enterprise AI

If your engineering bar is high and you need an architect who thinks in systems rather than just notebooks, let's talk.