About Me

Hi, I’m Hersh

I’m an applied scientist at BCG X focused on responsible AI and GenAI evaluation. I spend most of my time figuring out whether AI systems are safe, reliable, and actually doing what they’re supposed to, across LLMs, multimodal models, and AI agents.

Before this, I spent about five years in DC government, first as a data scientist at the Department of Human Services and later as a principal data scientist and data strategy lead at the Office of the Chief Technology Officer. I built ML models, stood up data infrastructure, wrote DC’s responsible AI framework, and helped run a data science community across city agencies. I also spent time as a research fellow at The Lab @ DC, where I worked with social scientists on policy evaluation.

What I’m working on now

  • Evaluating safety, reliability, and performance across GenAI products
  • Building automated testing and CI/CD pipelines for continuous AI evaluation
  • Mentoring data scientists and engineers on measurement approaches for GenAI

What I’ve worked on before

  • Production GenAI assistants with safety guardrails for government use
  • ML models for fire risk identification, homelessness analytics, and social services
  • DC’s first responsible AI framework and evaluation guidelines
  • Machine learning models that scored 300M+ records for political campaigns

Get in touch

I’m always happy to connect. Find me on Bluesky, LinkedIn, or GitHub.