
Frontiers in Foundation Models

Course Description: This seminar explores the frontier of large language models (LLMs) and next-generation multimodal foundation models. We move beyond standard autoregressive language modeling to study how modern systems integrate text, vision, video, code, and action to enable grounded reasoning and real-world decision-making. Topics include multimodal reasoning and evaluation, vision representation learning and segmentation, video generation, foundation models for robotics, and action-centric representation learning. We also cover emerging paradigms such as hierarchical and latent reasoning, diffusion-based language modeling, reinforcement learning for reasoning, and agentic workflows. Throughout the course, we emphasize both mechanistic understanding and practical safety alignment, connecting recent advances in interpretability, test-time compute scaling, and robust evaluation to hands-on paper presentations and open-ended research projects.

Course Slides

Who Should Enroll?

We welcome students from diverse backgrounds who are interested in learning about state-of-the-art foundation models.

Grading & Logistics

📍 Time & Location: Friday, Periods 3 & 4 (12:10 PM - 3:10 PM) in Hill 009


Grading Breakdown

Schedule (Spring 2026)

Week Topic & Readings
Week 1, Jan 23: Introduction & Interpretation
Overview of Foundation Models. Safety and Interpretation.
Reading: SELFIE
Week 2, Jan 30: LLM Frontiers
DeepSeek-V3.2
Kimi-K2
Week 3, Feb 6
Part 1: Student Presentation. Paper: DINOv3
Part 2: Guest Lecture (1 hr): Xingyu Fu (Princeton)
Topic: MLLM (benchmarks, thinking with images)
Week 4, Feb 13
Guest Lecture (Starts 2:00 PM): Didac Suris (Meta Superintelligence Labs)
Topic: SAM 3 (Vision Foundation)
Week 5, Feb 20
Part 1: Student Presentation. Paper: Why Do Multi-Agent LLM Systems Fail?
Part 2: Guest Lecture (Starts 1:40 PM): Sachit Menon (Columbia University)
Topic: Multimodal Reasoning with Code Generation
Week 6, Feb 27
Guest Lecture (Starts 1:40 PM): Wenhao Ding (NVIDIA Scientist)
Topic: Accelerating the Development and Deployment of Reasoning Models for Physical AI
Week 7, Mar 6
Part 2: Guest Lecture (1 hr): Ruoshi Liu (Amazon FAR Scientist)
Topic: Foundation Model for Robotics
Week 8, Mar 13
Guest Lecture (1 hr): Congyue Deng (MIT)
Topic: Action Representation
Spring Recess, Mar 20: NO CLASS
Rutgers Spring Recess (March 14 - March 22)
Week 9, Mar 27
Part 1: Student Presentation (Alborz). Paper: SAT Solvers in LLMs
Part 2: Guest Lecture (1 hr): Yunhao Ge (NVIDIA Research Scientist)
Topic: World Action Models are Zero-Shot Policies
Week 10, Apr 3: Data & Models
OpenThoughts, DataComp-LM, s1
Week 11, Apr 10
Part 1: Student Presentation (Sinchona). Paper: Wan (arXiv:2503.20314)
Part 2: Guest Lecture (1 hr): Yushi Hu (Meta FAIR)
Topic: Multimodal
Week 12, Apr 17: Diffusion & Hybrid Models
Diffusion LM: LLaDA, Dream 7B
Hybrid: Transfusion, Diffusion Forcing
Week 13, Apr 24: Reasoning Paradigms
Parallel thinking, Latent thinking, Soft thinking
Week 14, May 1: Final
Final Project Presentations
(Last day of Regular Classes)
