Course Description: This seminar (Advanced Topics in Foundation Models) explores the frontier of Large Language Models (LLMs) and Multimodal Foundation Models. We move beyond standard autoregressive architectures to examine next-generation paradigms, including hierarchical reasoning, latent reasoning, diffusion-based language modeling, and reinforcement learning for reasoning. The course emphasizes both theoretical understanding and practical safety alignment, drawing heavily on recent breakthroughs in mechanistic interpretability, test-time compute scaling, and agentic workflows.
We welcome students from diverse backgrounds who are interested in learning about state-of-the-art (SOTA) foundation models.
📍 Time & Location: Friday, Periods 3 & 4 (12:10 PM - 3:10 PM) in Hill 009
| Week | Topic & Readings |
|---|---|
| Week 1 | Introduction & Interpretation: Overview of Foundation Models. Safety and Interpretation. |
| Week 2 | LLM Frontiers: DeepSeek-V3.2, Kimi-K2 |
| Week 3 | Video Generation: Wan (arXiv:2503.20314), Hunyuan Video 1.5 |
| Week 4 | Part 1: Guest Lecture (1 hr)<br>Xingyu Fu (Princeton)<br>Topic: MLLM (benchmarks, thinking with images)<br>Part 2: Student Presentation<br>Paper: Lessons from the Trenches on Reproducible Evaluation of Language Models |
| Week 5 | Part 1: Guest Lecture (1 hr)<br>Didac Suris (Meta Super Intelligence Lab)<br>Topic: SAM 3 (Vision Foundation)<br>Part 2: Student Presentation<br>Paper: Dino v3 (Vision Representation) |
| Week 6 | Part 2: Student Presentation<br>Paper: Highly accurate protein structure prediction with AlphaFold |
| Week 7 | Part 1: Guest Lecture (1 hr)<br>Ruoshi Liu (Amazon FAR Scientist)<br>Topic: Foundation Model for Robotics<br>Part 2: Student Presentation<br>Paper: DAPO, the art of scaling RL for LLMs |
| Week 8 | Part 1: Guest Lecture (1 hr)<br>Congyue Deng (MIT)<br>Topic: Action Representation<br>Part 2: Student Presentation<br>Paper: TBD |
| Week 9 | Hierarchical Reasoning: Hierarchical Reasoning Model (arXiv), Mixture of Depth (arXiv), Scaling up Test-Time Compute with Latent Reasoning |
| Week 10 | Data & Models: Openthought, DataComp-LM, s1 |
| Week 11 | Part 1: Guest Lecture (1 hr)<br>Yushi Hu (Meta FAIR)<br>Topic: Multimodal<br>Part 2: Student Presentation<br>Paper: Why Do Multi-Agent LLM Systems Fail? |
| Week 12 | New Architectures: Diffusion LM (Llada, dream-7B) |
| Week 13 | Reasoning Paradigms: Parallel thinking, Latent thinking, Soft thinking |
| Week 14 | Hybrid Reasoning: Transfusion, Diffusion Forcing |
| Week 15 | Final: Final Project Presentations |