30 papers found
Uncertainty-Aware Step-wise Verification with Generative Reward Models
arXiv (Cornell University), 2025, 1 citation
The human factor in explainable artificial intelligence: clinician variability in trust, reliance, and performance
npj Digital Medicine, 2025, 3 citations
Challenges and Considerations in the Evaluation of Bayesian Causal Discovery
arXiv (Cornell University), 2024, 1 citation
Physically Motivated Deep Learning to Superresolve and Cross Calibrate Solar Magnetograms
The Astrophysical Journal Supplement Series, 2024, 7 citations
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
arXiv (Cornell University), 2024, 4 citations
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs
arXiv (Cornell University), 2024, 2 citations
Deep Bayesian Active Learning for Preference Modeling in Large Language Models
arXiv (Cornell University), 2024, 1 citation
Detecting hallucinations in large language models using semantic entropy
Nature, 2024, 419 citations
Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities
arXiv (Cornell University), 2024, 2 citations
Estimating the Hallucination Rate of Generative AI
arXiv (Cornell University), 2024, 4 citations
Tractable Function-Space Variational Inference in Bayesian Neural Networks
arXiv (Cornell University), 2023, 6 citations
DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design
arXiv (Cornell University), 2023, 3 citations
Continual Learning via Sequential Function-Space Variational Inference
arXiv (Cornell University), 2023, 5 citations
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
arXiv (Cornell University), 2023, 6 citations
Can Active Sampling Reduce Causal Confusion in Offline Reinforcement Learning?
arXiv (Cornell University), 2023, 1 citation
In-Context Learning Learns Label Relationships but Is Not Conventional Learning
arXiv (Cornell University), 2023, 4 citations
LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?
arXiv (Cornell University), 2023, 19 citations
ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages
arXiv (Cornell University), 2023, 2 citations
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
arXiv (Cornell University), 2023, 48 citations
Prediction-Oriented Bayesian Active Learning
arXiv (Cornell University), 2023, 7 citations
Fine-tuning can cripple your foundation model; preserving features may be the solution
arXiv (Cornell University), 2023, 1 citation
