Date
Nov 22, 2024, 12:00 pm - 1:30 pm
Location
Bendheim House 103

Details

Event Description

Abstract: Evaluations (evals) have been undervalued in recent years and will be the topic of this talk. I will first discuss what makes an eval successful and give examples of successful evals. Then I will discuss the most common mistakes that make evals unsuccessful and give some thoughts on recent approaches to evals in the LLM space. Finally, I will talk about SimpleQA, a new hallucinations eval that we open-sourced, which aims to meet the criteria of a good eval.

Jason Wei bio

Sponsor
Event organized by PLI