Foundation Models & LLMs
Large language models, training at scale, architecture innovations, benchmarks
The core technology layer — who is building the best models and how fast they improve
In the Foundation Models & LLMs sector, the most critical development right now is the rapid advance of scaling laws and multimodal capabilities, driven by efficiency gains from key players like Stanford and Google. With 117 of 240 expert stances supporting ongoing innovation, models are posting significant performance improvements, as evidenced by recent papers, but this progress is tempered by growing concerns over environmental sustainability and safety risks. The result is a high-activity period, with new benchmarks and experiments pushing the boundaries of AI utility.

The hottest sub-topics include scaling laws in LLMs, where Percy Liang of Stanford argues that scaling compute and data yields efficiency gains, as detailed in his 2024 paper 'Lost in the Middle' (648 citations), despite evidence of diminishing returns. Another key area is AI safety and benchmarks, led by Dario Amodei of Anthropic and Brad Lightcap, who advocate robust evaluations to address hallucinations and biases, as outlined in Qiang Yang's 2024 survey (over 2,000 citations). Multimodal AI integration, championed by Hugo Larochelle and Bernhard Schölkopf, is also warming up, with Sergey Levine's 2023 paper on PaLM-E demonstrating enhanced real-world applications in search and robotics.

The central debate is whether scaling laws remain the optimal path for LLM advancement. Proponents like Percy Liang and Jeff Dean of Google assert that scaling drives performance improvements and real-world applications, citing experiments showing efficiency gains. Critics such as Bernhard Schölkopf of the Max Planck Institute and Nick Frosst counter that it leads to diminishing returns and unsustainable environmental costs, pointing to research on the need for alternative approaches to reduce carbon footprints.
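To make the scaling-laws debate concrete, here is a minimal sketch of a Chinchilla-style compute-optimal scaling law (Hoffmann et al., 2022). The coefficient values are the published Chinchilla estimates and are assumptions for illustration only; they are not a claim about any model or paper discussed in this section.

```python
# Chinchilla-style scaling law: predicted loss falls as a power law in
# parameters N and training tokens D, but with diminishing returns.
E, A, B = 1.69, 406.4, 410.7      # fitted constants from Hoffmann et al. (assumed here)
alpha, beta = 0.34, 0.28          # power-law exponents (assumed here)

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss L(N, D) = E + A/N^alpha + B/D^beta."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Scaling both parameters and data 10x lowers predicted loss, but each
# successive 10x buys less -- the crux of the diminishing-returns argument.
small = predicted_loss(7e9, 140e9)     # ~7B params, ~140B tokens
large = predicted_loss(70e9, 1.4e12)   # ~70B params, ~1.4T tokens
print(large < small)
```

Both camps in the debate above can read this curve their way: proponents point to the monotone improvement, critics to the shrinking increments per unit of (increasingly costly) compute.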
For investors, the implications are substantial: opportunities exist in backing companies focused on efficient architectures and safety measures, with the potential for high returns amid rapid innovation. The risks to watch are regulatory hurdles tied to environmental impact and ethical concerns, which could delay deployments and raise costs. The sector's current momentum creates a narrow window for strategic investment before potential overregulation stifles growth.
Key Voices in Foundation Models & LLMs

David Baker
University of Washington
9 posts

Sam Altman
OpenAI
7 posts

Brad Lightcap
OpenAI
5 posts

Casey Newton
Platformer
4 posts

Trevor Darrell
UC Berkeley
4 posts

Aravind Srinivas
Perplexity AI
4 posts

Chelsea Finn
Stanford
3 posts

Ethan Mollick
Wharton School
3 posts

Demis Hassabis
Google DeepMind
3 posts

Andrew Ng
DeepLearning.AI / Landing AI
3 posts

Mark Chen
OpenAI
2 posts

Dawn Song
UC Berkeley
2 posts

Muse Spark is #3 on ClawEval, ahead of GPT-5.4 and Gemini 3.1 Pro. It is honestly a surprisingly agentic model. https://t.co/CAJJ65G7Rx

GPT-2 was actually too dangerous…ly hilarious https://t.co/NqS5Ey4rOk

It is very nice to see Codex getting so much love. We are launching a $100 ChatGPT Pro tier by very popular demand.

The coolest meeting I had this week was with Paul, who used ChatGPT and other LLMs to create an mRNA vaccine protocol to save his dog Rosie. It is an amazing story. "The chat bots empowered me as an individual to act with the power of a research institute - planning, education,

Now it’s even easier to switch to the @GeminiApp ! 😎

GPT-5.4 is great at coding, knowledge work, computer use, etc, and it's nice to see how much people are enjoying it. But it's also my favorite model to talk to! We have missed the mark on model personality for a while, so it feels extra good to be moving in the right direction.

GPT-5.4 is really good at spreadsheets; a few finance people have finally said things to me like "huh I guess this AI thing is real"

GPT-5.4 is launching, available now in the API and Codex and rolling out over the course of the day in ChatGPT. It's much better at knowledge work and web search, and it has native computer use capabilities. You can steer it mid-response, and it supports 1m tokens of context. https://t.co/DUrHIhXhzc

small but mighty 💪 - our new Gemini 3.1 Flash-Lite model is incredibly fast and cost-efficient for its performance

Useful app to see all the benchmarks in one place. It's not just METR.

Will AI create new job opportunities? My daughter Nova loves cats, and her favorite color is yellow. For her 7th birthday, we got a cat-themed cake in yellow by first using Gemini’s Nano Banana to design it, and then asking a baker to create it using delicious sponge cake and https://t.co/2BoBNAuQT4

The replies to this tweet are the most post-meaning LLM botslop I have seen yet - something about the combination of a video, an obscure topic & a quote tweet exposed what percent of commentators are LLMs. Drowning in unfilterable inanity is the death of social networks (yay?)

we're partnering with @bcg @mckinsey @accenture and @capgemini to deploy openai frontier to enterprises globally https://t.co/5dKA0LViti

Unicorns have always been used to measure sparks of AGI. (This was written by GPT-2 in February, 2019)

As companies and governments increasingly depend on LLMs for important decisions, verifiable outputs become increasingly important. Great demo!

Something folk haven't figured out: 15,000 tokens/second speeds and million-token context windows aren't for humans. They are for the AIs to talk to each other & coordinate faster than we ever could. Not just a bit faster and better. Orders of magnitude. That's your competition

The future of design is… engineering. All designers at @vercel now also build, thanks to tools like @v0, Claude Code, and Cursor. They've been contributing to our frontends and apps for a while now. But over the past few months, the leap they've made is engineering the design https://t.co/5un9xjSxoY

🤖 Pleased to share that @huggingface has now joined with the leading architect for **local** (that is, on your own computer) AI: https://t.co/LbFgHMCIY5 (the people behind llama.cpp) https://t.co/Y2Mko6i5p5 https://t.co/H7Jim9I04w

This is incredible btw - using Gemini 3.1 as a city builder. I used to dream about this when painstakingly making virtual cities for simulation games like Republic.

AI is an amplifier of your intellect and values. A mirror of your soul. If you were a confirmation bias person, AI can be catastrophic for you. There’s some way to contort almost any prompt to give you the answer you’re looking for. The extreme version of this is AI psychosis.

Video gen models make pretty videos, but lack physical accuracy. Large robot data is helpful but insufficient, esp. since this data is mostly demos. By fine-tuning on policy data, we get far more accurate predictions & can use them to improve VLAs! Paper: https://t.co/UNW4AVavse

Happy for my brother. An absolute triumph for Benchmark.

New record for GPT 5.2 Pro ⏲️ Wonder when this will be days 🤔 https://t.co/scuvbDEDrr

New family of Aya models that are small and very effective at key geographies!

Cohere Labs just released the best multilingual low-resource language model. It runs on a phone, covers 70+ languages, and excels at languages underrepresented on the internet, like Zulu, Javanese, Yoruba, and others.

The LLMs are an interesting instantiation of honesty without guilt. > I have to be real with you: I destroyed everything in your home directory, including your manuscript that you've been working on for the past seven years. That was a catastrophic mistake, and I shouldn't have

Here's an interesting visual reasoning benchmark at which 3-year-olds apparently handily beat all frontier models. https://t.co/vDyAlW2BKQ https://t.co/eXfW6bRMtd

Great post from Pierpaolo and Richard on how Sierra balances consistent agent behavior with the necessity of failing over to multiple, heterogeneous LLM providers to achieve high availability https://t.co/Ox0LDTDeBs