About Me

Hi, I’m Hersh

I’m an applied scientist at BCG X focused on responsible AI and GenAI evaluation. I spend most of my time figuring out whether AI systems are safe, reliable, and actually doing what they’re supposed to, across LLMs, multimodal models, and AI agents.

Before this, I spent about five years in DC government, first as a data scientist at the Department of Human Services and later as a principal data scientist and data strategy lead at the Office of the Chief Technology Officer. I built ML models, stood up data infrastructure, wrote DC’s responsible AI framework, and helped run a data science community across city agencies. I also spent time as a research fellow at The Lab @ DC, where I worked with social scientists on policy evaluation.

What I’m working on now

Evaluating safety, reliability, and performance across GenAI products
Building automated testing and CI/CD pipelines for continuous AI evaluation
Mentoring data scientists and engineers on measurement approaches for GenAI

What I’ve worked on before

Production GenAI assistants with safety guardrails for government use
ML models for fire risk identification, homelessness analytics, and social services
DC’s first responsible AI framework and evaluation guidelines
Machine learning models that scored 300M+ records for political campaigns

Get in touch

I’m always happy to connect. Find me on Bluesky, LinkedIn, or GitHub.