
23 papers found
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
Proceedings of the AAAI Conference on Artificial Intelligence, 2024, 42 citations
RedPajama: an Open Dataset for Training Large Language Models
arXiv (Cornell University), 2024, 17 citations
Benchmarking Large Language Models for News Summarization
Transactions of the Association for Computational Linguistics, 2024, 246 citations
Generative Agent Simulations of 1,000 People
arXiv (Cornell University), 2024, 30 citations
Lost in the Middle: How Language Models Use Long Contexts
Transactions of the Association for Computational Linguistics, 2024, 648 citations
BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
arXiv (Cornell University), 2024, 30 citations
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
arXiv (Cornell University), 2024, 13 citations
DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
arXiv (Cornell University), 2023, 14 citations
Evaluating Verifiability in Generative Search Engines
arXiv (Cornell University), 2023, 14 citations
Generative Agents: Interactive Simulacra of Human Behavior
arXiv (Cornell University), 2023, 158 citations
Robust Distortion-free Watermarks for Language Models
arXiv (Cornell University), 2023, 18 citations
Lost in the Middle: How Language Models Use Long Contexts
arXiv (Cornell University), 2023, 51 citations
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
arXiv (Cornell University), 2023, 47 citations
Whose Opinions Do Language Models Reflect?
arXiv (Cornell University), 2023, 95 citations
Ecosystem Graphs: The Social Footprint of Foundation Models
Research Square (Research Square), 2023, 12 citations
Large Language Models as Analogical Reasoners
arXiv (Cornell University), 2023, 14 citations
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
arXiv (Cornell University), 2023, 29 citations
MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records
arXiv (Cornell University), 2023, 18 citations
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
arXiv (Cornell University), 2023, 53 citations
Holistic Evaluation of Language Models
Annals of the New York Academy of Sciences, 2023, 395 citations