AI Safety and Benchmarks for LLMs
Efforts focus on developing robust benchmarks and safety measures to mitigate risks like hallucinations and biases in LLMs. This includes work on alignment techniques and evaluations to ensure reliable deployment.
7 Related Opinions · 30 Related Papers · 5 KOLs Discussing

As companies and governments increasingly depend on LLMs for important decisions, verifiable outputs become essential. Great demo!

$3M to support the development of open benchmarks!

A truly generative meta-model of activations, for steering, probing, and understanding LLMs at scale!

Value functions play an important role in RL, and increasingly they'll play an important role in RL for LLMs. This new paper led by @rohin_manvi is one step in this direction: using value functions to optimize test-time compute with adaptive computation.
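The idea in that post, using a value function to allocate test-time compute adaptively, can be illustrated with a minimal sketch. Everything here is hypothetical: `value_fn` stands in for a learned value model, and the generator stands in for an LLM sampler; only the early-stopping pattern is the point.

```python
# Hypothetical value function: scores a candidate answer in [0, 1].
# In practice this would be a learned model; here it is a toy stub
# that prefers longer candidates.
def value_fn(candidate: str) -> float:
    return min(1.0, len(candidate) / 20)

def adaptive_best_of_n(generate, value_fn, max_samples=8, threshold=0.8):
    """Spend test-time compute adaptively: keep sampling candidates
    until the value estimate clears a threshold or the budget runs out."""
    best, best_v = None, float("-inf")
    used = 0
    for used in range(1, max_samples + 1):
        cand = generate()
        v = value_fn(cand)
        if v > best_v:
            best, best_v = cand, v
        if best_v >= threshold:  # early exit saves compute on easy queries
            break
    return best, best_v, used

# Toy generator standing in for an LLM sampler.
answers = iter(["short", "a much longer candidate answer", "mid answer"])
best, score, used = adaptive_best_of_n(lambda: next(answers), value_fn)
```

Here the second candidate already clears the threshold, so the loop stops after two samples instead of exhausting the budget; hard queries would keep sampling up to `max_samples`.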

Debug your model with StringSight: LLMs all the way down!

Introducing gpt-5.2, our latest model and most capable for knowledge work. It sets a new state of the art across many benchmarks, including GDPval, which captures a cross-section of real-world tasks. It's better at building spreadsheets, drafting presentations, coding, long…

Super excited about our new work on pretrained 4-D robotic foundation models. LLMs learned with 4-D representations on egocentric datasets transfer well to real-world tasks!