available for interesting problems
Hi, I'm Utshav.
AI Engineer at MedForce AI. I build production LLM systems for clinical workflows. Currently focused on evaluation and reliability of medical AI.
Based in Kathmandu.
Experience
In [1]:
experience.list()
Out[1]:
AI Engineer
MedForce AI (current) London, UK · Remote
- AI engineer on a clinical-automation platform deployed in an NHS pilot. Designed the agent architecture and clinical RAG pipeline targeting clinician documentation and decision-support workflows.
- Fine-tuned domain-specific LLMs (Mixtral 8×22B, Llama 3, Command R family) on curated hepatology cases for medical reasoning and structured documentation; built clinical evaluation harnesses covering factuality, citation grounding, and hallucination rate.
- Cut inference cost and latency for real-time clinical use through caching (KV cache, prompt caching, semantic response cache), request batching, and quantization; built monitoring for latency, drift, and hallucination signals.
AI Engineer
Digital-Dandelion London, UK · Remote
- Built a clinical RAG system over EASL liver-disease guidelines using a multi-route retrieval architecture: simple queries are served by embedding-based retrieval, and complex queries are routed to page-indexed advanced RAG. Reached 97% answer accuracy vs. 85% for a state-of-the-art baseline and cut end-to-end latency by 50%.
- Trained an EfficientNet image-classification system on GCP to score 30,000+ dental websites for modernization; lifted accuracy from 89% to 95% via targeted data augmentation and an architecture switch from the prior baseline. Released the dental scraping and extraction datasets on Hugging Face.
- Built a multimodal ranking system for JLL that evaluates hundreds of commercial real-estate offices via pairwise tournament comparisons across building exterior, interior, workspace, and floor-plan imagery to identify top performers.
- Fine-tuned an LLM on hundreds of past winning creative campaigns and creative-director rubrics for Page & Page (Novo Nordisk account); validated outputs on KPIs spanning relevance, creativity, and brand-fit.
Junior ML Engineer
Omdena New York, USA · Remote
- Coordinated a 10–20-person distributed team across multiple timezones building scraping infrastructure that assembled a 46,000-article Nepali news corpus, the working dataset for the program's media-representation research.
- Built Nepali-language NLP models capturing local idioms to classify how women and marginalized groups are represented in Nepali news media.
- Designed a media-diversity scoring model with explicit progress metrics surfaced to research stakeholders.
Projects
In [2]:
projects.featured()
Out[2]:
MeroDaktar
Healthcare
AI medical consultation platform for Nepali patients. Voice-driven patient interviews powered by a custom MedGemma-Nepali multimodal model.
MedGemma 4B · Gemma-3 · Voice AI · FastAPI
Llama-3-70B Extractor
LLMs
Fine-tuned Llama-3-70B-Instruct for structured information extraction from web content.
Llama 3 70B · Information Extraction
Writing
In [3]:
posts.recent(n=3)