Weekly reflections and notes from the Columbia AI Alignment Club Technical Fellowship, covering key papers in AI safety, alignment, interpretability, control, and evaluation.
September 24, 2025
Topics: AI safety, alignment, RLHF
Read notes →

October 01, 2025
Topics: AI safety, alignment, misgeneralization
Read notes →

October 08, 2025
Topics: AI safety, forecasting, capabilities, risk scenarios
Read notes →

October 15, 2025
Topics: AI safety, mechanistic interpretability, superposition, sparse autoencoders
Read notes →

October 22, 2025
Topics: AI safety, control, scheming, red teaming
Read notes →

October 29, 2025
Topics: AI safety, scalable oversight, debate, weak-to-strong
Read notes →

November 05, 2025
Topics: AI safety, red teaming, evaluations, adversarial attacks
Read notes →

November 12, 2025
Topics: AI safety, timelines, careers, forecasting
Read notes →