Open to Lead & Staff DS Roles · $190K+ CAD

Girijesh Singh

Lead Data Scientist · Applied AI Architect · Waterloo, ON

I build AI systems that work in the real world — where hallucinations carry legal consequences, latency costs millions, and production scale is non-negotiable.

2,000+ Production Users
30K+ Daily AI Queries
$10M+ Annual Impact
6 yrs Domain Depth

The 6-Year Thread

Most engineers change companies and domains. I stayed with the same core problem for six years and solved it with progressively more powerful technology. That's rare — and hard to replicate.

2024
Enterprise Platform Era
Enterprise RAG Platform
Custom hierarchical retrieval. No framework dependencies. 4 global regions. 2,000+ active users. New carriers onboarded through configuration alone — zero code.
Sections → subsections → bullets → node numbers → network retrieval (proprietary hierarchy)
Real-time citation validation — hallucinated sources eliminated character-by-character before reaching users
Graph-based execution engine: new insurance carriers onboarded via config, zero code changes required
10× latency reduction, 20× throughput gain, $10M+ estimated annual savings in adjuster time
Azure OpenAI · GPT-4o · Microservices · Adobe PDF API
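The production citation validator is proprietary and is not reproduced here. As a rough illustration of the idea only, the sketch below keeps a citation when its quoted text appears verbatim (after whitespace and case normalization) in the section it cites, and rejects it otherwise. The `Citation` class, the `validate_citations` function, the section IDs, and the sample policy text are all hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Citation:
    section_id: str  # hypothetical identifier of the cited policy section
    quote: str       # text the model claims appears in that section


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences do not cause false rejections."""
    return re.sub(r"\s+", " ", text).strip().lower()


def validate_citations(citations, sections):
    """Split citations into (valid, rejected) by verbatim lookup in the cited section."""
    valid, rejected = [], []
    for citation in citations:
        source = sections.get(citation.section_id)
        quote = normalize(citation.quote)
        if source is not None and quote and quote in normalize(source):
            valid.append(citation)
        else:
            # Unknown section, empty quote, or text that is not in the source.
            rejected.append(citation)
    return valid, rejected


# Made-up policy text and model citations, for illustration only.
sections = {"4.2": "Water damage caused by  burst pipes is covered up to the policy limit."}
citations = [
    Citation("4.2", "water damage caused by burst pipes is covered"),
    Citation("4.2", "flood damage is covered"),
    Citation("9.9", "burst pipes"),
]
valid, rejected = validate_citations(citations, sections)
```

Only the first citation survives: the second quotes text that is not in section 4.2, and the third points at a section that does not exist.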

Real Impact at Scale

Production AI in a regulated, high-stakes industry — every number below is earned, not estimated from a demo.

2,000+ Active Users
Insurance adjusters · US · UK · AU · EU
30K+ Daily AI Queries
Multi-carrier, multi-tenant platform
10× Latency Reduction
Hours of manual work → under 30 seconds
$10M+ Est. Annual Savings
In insurance adjuster time costs
4 Global Regions
US · UK · Australia · EU deployment
12 months Time to Promotion
Senior → Lead Data Scientist

What I've Built

Production systems, published research, and shipped products — not demos. Everything here is real and has been used at scale.

PRODUCTION
FNOL Claim Classification System
Sole ML Engineer · 2018–2021 · Confidential Employer
First Notice of Loss (FNOL) classification agent for a major US insurance carrier. As the sole ML engineer, I owned model development, client interaction, and deployment. The system automated routing of 200–300 incoming insurance claims per day to the correct claim type and adjuster team. The CTO called it the best thing any dedicated team had built.
30%
Labor Reduction
45%
Cost Reduction
35%
Faster Settlement
XGBoost · Scikit-learn · Python · Insurance ML · Interactive Dashboard
🔤
INTERNAL PRODUCT
PriML — Natural Language to SQL
Team of 10 · 2018–2021 · Pre-LLM Era
Before general-purpose LLMs were widely available, built an NL-to-SQL product for insurance domain data. Fine-tuned RAT-SQL to handle the complex multi-table, multi-JOIN queries specific to insurance schemas. Natural-language queries were translated into SQL, run against the database, and rendered as auto-generated interactive dashboards, which were demoed to external clients.
RAT-SQL · Fine-tuning · NLP · Python · PostgreSQL · Auto-Dashboard
📄
PUBLISHED · PST 2023
Privacy-Preserving ML for Brain-Computer Interfaces
Research · Lakehead University · Copenhagen, Denmark · 2023
Published at the 20th IEEE International Conference on Privacy, Security and Trust (PST 2023) in Copenhagen. Proposes a framework for privacy-preserving machine learning applied to BCI neural data — preventing re-identification of sensitive brain activity signals while preserving model utility.
Differential Privacy · BCIs · PyTorch · Published Research · IEEE PST 2023

What I Work With

Tools chosen for production reliability, not resume padding. Highlighted chips indicate expert-level depth.

🤖 LLMs & Generative AI
Custom RAG Architecture · GPT-4o / Azure OpenAI · Hierarchical Retrieval · Prompt Engineering · Embedding Models · Citation Validation · Multi-Document QA · LangChain · LlamaIndex · Pydantic Output Schemas
📊 ML & Data Science
Python · SQL · XGBoost · Scikit-learn · PyTorch · PySpark · Feature Engineering · Statistical Modeling · Anomaly Detection · NLP / Fine-tuning
☁️ Infrastructure & Engineering
Azure · FastAPI · Docker · PostgreSQL · MongoDB · Redis · Azure Service Bus · Azure Blob Storage · Microservices · REST APIs
🛠️ Domain & Leadership
Insurance Domain (6 yrs) · Team Leadership (10–14) · Adobe PDF Services API · Tesseract OCR · GPT-4o Vision API · DAG Orchestration · Observability · Stakeholder Management · Git

Career Timeline

7+ years of progressively deeper AI work, from solo ML engineer to leading a cross-functional team of 14.

Jan 2024 — Present
Lead Data Scientist
Primus Software Corporation
📍 Waterloo, ON, Canada
Led a cross-functional team of 10–14 (frontend, backend, QA, 7 data scientists). Scaled enterprise AI platform to 2,000+ active users across US, UK, AU, and EU. Resolved two production crises, built the graph-based execution engine for zero-code carrier onboarding, and extended the platform into coverage verdicts and automated letter generation.
Promoted from Senior → Lead in 12 months
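The execution engine itself is proprietary. As a minimal sketch of the general pattern only, the example below describes a carrier's pipeline as configuration (each step lists the steps it depends on) and runs it in dependency order with the standard-library `graphlib.TopologicalSorter`. The step names, the `STEPS` registry, the `CARRIER_CONFIG` layout, and `run_pipeline` are all hypothetical and chosen for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical step registry: each step takes a context dict and returns an updated one.
STEPS = {
    "extract_text": lambda ctx: {**ctx, "text": ctx["document"].strip()},
    "retrieve": lambda ctx: {**ctx, "passages": [ctx["text"][:40]]},
    "answer": lambda ctx: {**ctx, "answer": f"Based on {len(ctx['passages'])} passage(s)"},
}

# Hypothetical per-carrier config: step name -> list of steps it depends on.
# In this sketch, onboarding a carrier means adding an entry here, not writing new code.
CARRIER_CONFIG = {
    "carrier_a": {
        "answer": ["retrieve"],
        "retrieve": ["extract_text"],
        "extract_text": [],
    },
}


def run_pipeline(carrier: str, ctx: dict):
    """Run the carrier's configured steps so that every dependency runs first."""
    graph = CARRIER_CONFIG[carrier]
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        ctx = STEPS[name](ctx)
    return order, ctx


order, result = run_pipeline("carrier_a", {"document": "  Policy text  "})
```

Because the step order is derived from the config rather than hard-coded, a misconfigured cycle fails fast: `TopologicalSorter` raises `graphlib.CycleError` before any step runs.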
Jan 2023 — Jan 2024
Senior Data Scientist
Primus Software Corporation
📍 Waterloo, ON, Canada
Diagnosed LangChain's fundamental limits on multi-document insurance policies, proposed scrapping the framework, and designed the hierarchical RAG architecture from scratch — solo — in 3 months. Resolved a latency crisis (2.5 min → 40 sec) through single-trip retrieval redesign and cutting LLM calls from 5 to 2.
Sep 2021 — Apr 2023
M.Sc. Computer Science + Graduate Research Assistant
Lakehead University
📍 Thunder Bay, ON, Canada
Project-based Master's, supervised by Dr. Garima Bajwa. Published privacy-preserving ML research for Brain-Computer Interfaces at the PST 2023 international conference (Copenhagen). Continued part-time AI development at Primus Software concurrently, retained because I was difficult to replace.
Published at PST 2023 · Copenhagen, Denmark
May — Aug 2022
Data Science Intern
Ciena
📍 Ottawa, ON, Canada
Worked on PySpark-based data pipelines, divisive clustering, and manufacturing batch anomaly detection. Recognized early that the role skewed toward data engineering, which confirmed my commitment to an applied data science path.
Jun 2018 — Dec 2022
ML Engineer → Senior ML Engineer
Primus Software Development
📍 Noida, India (+ Remote, Canada from 2021)
Built the FNOL classification system for Crawford & Company as sole ML engineer with direct client ownership. Built PriML (NL-to-SQL product) as technical lead on a team of 10. Insurance domain expertise begins here. Retained remotely by Primus after moving to Canada for my Master's, a clear signal of value.
CTO recognition + $1,000 performance bonus

Publications

📚 Peer-Reviewed · International Conference
Privacy-Preserving Machine Learning Framework for Brain-Computer Interfaces
20th IEEE International Conference on Privacy, Security and Trust (PST 2023) · Copenhagen, Denmark
Addresses critical re-identification risks in neural data processing. Proposes a framework that applies privacy-preserving techniques to BCI machine learning pipelines — preventing re-identification of sensitive brain activity data while maintaining model utility across tasks. Supervised by Dr. Garima Bajwa, Department of Computer Science, Lakehead University, Thunder Bay, ON.

Leadership & Communication

Data scientists who can communicate build better systems. The evidence is below.

🎤
Toastmasters International
Competitive public speaking training that directly informs how I present technical findings to non-technical stakeholders — executives, clients, and insurance carriers.
7× Best Impromptu Speaker
4× Best Evaluator
3× Best Prepared Speech
👥
Cross-Functional Team Lead
Ran day-to-day technical and delivery decisions for a team of 10–14. Handled stakeholder requirement gathering, refinement, and brainstorming directly. Second-most senior person on the team: I reported to the manager and ran everything technical below that level.
10–14 person cross-functional team
Multi-region, multi-carrier delivery
Client-facing requirement ownership

Open to the Right
Lead & Staff DS Roles

I'm looking for roles where the technical bar is high, the domain is meaningful, and there's real scale to build for.
$190K+ CAD · Waterloo area or remote · Big Tech, AI-native, Enterprise AI.