CS 6785 Deep Probabilistic and Generative Models (2021SP)

Generative models are a class of machine learning algorithms that define probability distributions over complex, high-dimensional objects such as images, sequences, and graphs. Recent advances in deep neural networks and optimization algorithms have significantly enhanced the capabilities of these models and renewed research interest in them. This course explores the foundational probabilistic principles of deep generative models, their learning algorithms, and popular model families, which include variational autoencoders, generative adversarial networks, autoregressive models, and normalizing flows. The course also covers applications in domains such as computer vision, natural language processing, and biomedicine, and draws connections to the field of reinforcement learning.

First Lecture Information

The first lecture is going to be on Monday 02/08. Please use the following Zoom link to connect to the lecture:

You will also find Zoom links to all the lectures in Canvas under the "Zoom" tab.

Information

Instructor: Volodymyr Kuleshov

Credits: 3

Course Frequency: Spring Term

Times: Mon/Wed - 1:00-2:15pm Eastern Time.

The class will be held twice a week, on Mondays and Wednesdays. Instruction will be fully remote, and the lectures will be given via Zoom (see above for the URLs).

Teaching Staff and Office Hours

Volodymyr Kuleshov (Instructor). Office Hours: Tue 1pm-1:45pm ET; Wed 2:15pm-3pm ET. https://www.cs.cornell.edu/~kuleshov/

Shachi Deshpande (Teaching Assistant). Office Hours: Mon 8:30am-9:30am ET. https://www.cs.cornell.edu/~shachi/

Prerequisites

  • Basic knowledge of machine learning from at least one of: CS4780, CS4701, CS5785.
  • Basic knowledge of probabilities and calculus: students will work with computational and mathematical models.
  • Basic knowledge of deep neural networks (CNNs, RNNs; CS5787). Extensive experience implementing deep neural networks is not required but will be helpful for the class project.
  • Proficiency in a programming language (preferably Python) will be helpful for the class project if it involves an implementation.

Textbooks and Other Materials

We offer our own self-contained notes for this course. While there is no required textbook, we recommend "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. The online version is available for free here.

Grading

Homework 1 | Combination of theory and programming questions | 15%
Homework 2 | Combination of theory and programming questions | 15%
Homework 3 | Combination of theory and programming questions | 15%
Presentation | Discuss 2-3 papers in teams of ~2. | 15%
Project Proposal | Brief description of the planned project, around 300 words. | 5%
Project Milestone | Mid-semester progress report on course project, 3-5 pages in length. | 10%
Final Project | Final report on the course project, 5 pages in length. | 25%
Total Points | - | 100%

Assignments

Homeworks should be written up clearly and succinctly; you may lose points if your answers are unclear or unnecessarily complicated. You are encouraged to use LaTeX to write up your homeworks, but this is not a requirement. You can find a basic LaTeX template here. Assignments will be submitted on Gradescope. You may work in teams of two: make sure to put both of your names on the submission and submit as a team in Gradescope.

Course Project

The course project will give the students a chance to explore machine learning in greater detail. Course projects will be done in groups of up to 3 students and can fall into one or more of the following categories:

  • Application of machine learning to a practical problem or a dataset.
  • Improvements to machine learning algorithms.
  • Theoretical analysis of any aspect of machine learning models.

Pick a topic that's meaningful to you and that excites you. For example, if you do PhD research in biology, you can do a project related to a dataset that you work with. If you're in Urban Tech, you can work with a city dataset that you find interesting. You are encouraged to find something on your own, but feel free to talk to the teaching team during office hours about project ideas.

Proposal (Due 03/08 at 11:59pm ET)

Your proposal should give the title of the project, the project category, the names of your team members, their NetIDs, and a 300-500 word description of what you plan to do. It should contain the following information:

  • Motivation: What problem are you tackling? Is this an application or a theoretical result?
  • Method: What machine learning techniques are you planning to apply or improve upon and how?
  • Future work: What experiments are you planning to perform or what theorems do you want to prove?

The goal of the proposal is to make sure you're on the right track. As long as you follow the above guidelines, you should do well.

Please submit the proposal via Gradescope and make sure to submit as a team.

Milestone (Due 04/19 at 11:59pm ET)

The milestone submission should describe what you've accomplished so far and briefly say what else you plan to do. The format should be the same as that of the final project, with an approximate length of 3-5 pages (excluding references). The goal is to make sure that you are on track to finish the final project. It should cover the following points:

  • Motivation: What problem are you tackling? Is this an application or a theoretical result?
  • Method: What machine learning techniques are you planning to apply or improve upon and how?
  • Preliminary experiments: Describe the experiments that you've run, the outcomes, and any error analysis that you've done. You should have tried at least one baseline.
  • Future work: What else do you plan to do?

The goal of the milestone is to make sure you're on the right track. As long as you follow the above guidelines, you should do well.

Please submit the milestone via Gradescope and make sure to submit as a team.

Final Writeup (Due 05/21 at 11:59pm ET -- no late days!)

The final writeup should describe all the work you did for your course project and summarize the main results. You can think of it as a technical report that presents your findings to a general machine learning audience.

The style and format of the writeup should be similar to that of a research paper. The maximum length is 8 pages, excluding references. 

There are no strict requirements on the structure of the final writeup, but one way to structure it would be to include the following sections, which are fairly standard for a research paper.

  • Abstract: Summarize the problem, novel contributions, and results in one paragraph.
  • Introduction: Provide motivation for the problem and expand upon the overview in the abstract.
  • Background: Briefly summarize the background knowledge needed to understand the work.
  • Method: Describe the methods that will be used or implemented in the paper.
  • Theoretical analysis: If you are doing a theory project, describe your theoretical results here.
  • Experimental analysis: Describe in detail your experiments.
  • Discussion and Prior Work: Discuss the key takeaways from your experiments. Put your results in the context of previous work.
  • Conclusion: You may summarize the paper or talk about open problems and open directions.

Regardless of how the writeup is structured, please make sure to cover the following points.

  • Motivation: What problem are you tackling? Why is it interesting? What type of project will this be (application, method, theory)?
  • Method: What machine learning techniques are you planning to apply or improve upon and how? Make sure to describe them in detail and provide enough context for the reader to understand the methods at least at a high level. Provide any background that is necessary for that.
  • Experiments: Describe the experiments that you've run, the outcomes, and any error analysis that you've done. Make sure that the setup is described in enough detail for someone else to reproduce your results. Also, if you have an experimental project, make sure to provide a detailed experimental analysis. Things you should consider including are: train/test performance, learning curves, model samples, error analyses, ablation analyses, etc. Most projects should also include baselines.
  • Theory: If doing a theory project, state your results formally as theorems. Make sure that all the symbols are defined. The best presentations of theoretical results also explain the results in plain language and convey the intuition behind them.
  • Context: Explain how you build upon previous work and how your results compare to what has been done previously.

Writeups will be evaluated on the clarity of their presentation, adherence to the above guidelines, the significance of the project (does it explore a toy dataset or a real problem?), and the technical quality of the work (the level of depth in the experimental or theoretical analyses, whether the approach makes sense technically, whether the implemented algorithms are reasonable and studied in enough detail, etc.).

Please submit the writeup via Gradescope and make sure to submit as a team.

Late Submissions

You have 4 late days, which you can use at any time during the term without penalty (for both assignments and projects). The final project writeup cannot be submitted late because we need to grade it in a short amount of time. Once you run out of late days, you will incur a 20% penalty for each extra late day you use. When submitting as a team, each team member must use a late day. Each late submission should be clearly marked as “Late” on the first page. No submission will be accepted more than 3 days after the deadline.

Collaboration Policy and Honor Code

You are free to form study groups and discuss homeworks and projects. However, you must write up homeworks and code from scratch independently without referring to any notes from the joint session. You should not copy, refer to, or look at solutions to previous years’ homeworks when preparing your answers. It is an honor code violation to intentionally refer to a previous year’s solutions, either official or written up by another student. Anybody violating the honor code will be referred to the Office of Judicial Affairs.

Paper Presentations

Students will be asked to deliver a presentation in the second half of the course. The expected length is 45-50 minutes, followed by a 10-15 min discussion. Presentations can be done individually or in groups of two and should cover 1-4 research papers.

Topic Ideas: The best topics are ones that you choose, but you can also choose from our list.

  • The expected length of each presentation is 45-50 minutes. Presentations will be followed by 10-15 minutes of discussion. Students are asked to conclude their presentation with an initial set of discussion topics for the team and to contribute towards driving the discussion.
  • The expected length of the reviews is up to 1 page each. Each review should be on a topic that will be presented and should be submitted before that topic is presented.
  • There will be about ten presentation slots in the second half of the class. Students should email the instructor by 03/07 to reserve a presentation slot, choose a presentation topic, and choose one of the papers for reviews. Slots will be filled on a first-come, first-served basis.
  • Each presentation team needs to (1) send a presentation outline to the instructor at least two weeks before the talk and (2) send presentation slides at least two days before the talk.

Presentation Format

The ideal presentation will touch on the following topics:

  • motivation for the problem being studied (why is it interesting?);
  • context and previous work in this area;
  • high-level summary of the novel ideas and contributions in the presented papers;
  • detailed explanation of the technical material in the papers;
  • summary of experimental or theoretical results;
  • discussion of the results;
  • conclusion and open-ended questions.

Students should conclude the presentation with follow-up topics and questions for the audience to seed a discussion and Q&A covering the presentation and future research directions based on these papers. Students should drive the discussion, and the instructor will help with that as well.

Paper/Presentation Review

Additionally, each team should write a review of the papers presented by another team. The expected length is 1-2 pages, standard page formatting, 12pt font, single-spaced, submitted as a PDF in Gradescope. The review will count for 5% of the class grade, and the presentation will count for 10%. Together, they count for 15%, as defined in the grading section.

We encourage you to submit the paper review as a team, preferably with the same team as for your presentation.

We ask you to adhere to the following timeline:

  • Have a look at the presentation topics and select the one you would like to review. It has to be different from your presentation. Send the instructor and the TA the set of papers you will review by April 11. All the presentation topics will be finalized at least a week before then.
    • To ensure uniform coverage, each presentation has two review slots. In other words, at most two different teams can review the same set of papers. Review slots will be filled on a first-come, first-served basis.
    • Please reserve your slot as soon as possible. Don't wait until the deadline.
  • We ask you to submit your review the day before the presentation (by 11:59pm in the time zone of your choice). We ask this because we want you to be familiar with the topic of the presentation already, so that you can ask questions and/or give a different perspective on it.
    • This year, because we are experiencing delays in collecting presentation topics, we will make a slight adjustment to the above rule.
    • If you are reviewing papers that are presented in the first two weeks (between April 5 and April 14), you can submit your review a week after the presentation.
    • If you are reviewing papers that are presented on April 19 and later, we ask you to submit the review the day before the presentation, as explained above.
  • Attendance at the presentations is mandatory. It is also mandatory to turn on your video.
  • We will be giving up to 3 bonus points for participation during presentations (i.e., asking questions).
  • We will publish the submitted reviews on the website after every presentation.

Review Format

We ask you to prepare your reviews following the NeurIPS review guidelines (see the section "Review Content"). In particular, your review should contain the following elements:

  • Summary and contributions: Briefly summarize the paper and its contributions.
  • Strengths: Describe the strengths of the work. Typical criteria include: soundness of the claims (theoretical grounding, empirical evaluation), significance and novelty of the contribution.
  • Weaknesses: Explain the limitations of this work along the same axes as above.
  • Correctness: Are the claims and method correct? Is the empirical methodology correct?
  • Clarity: Is the paper well written?
  • Relation to prior work: Is it clearly discussed how this work differs from previous contributions?
  • Reproducibility: Are there enough details to reproduce the major results of this work?
  • Additional feedback, comments, suggestions for improvement and questions for the authors
  • What do you see as the broader impact of this work, including potential negative ethical and societal implications of the work?
  • Does the submission raise potential ethical concerns? This includes methods, applications, or data that create or reinforce unfair bias or that have a primary purpose of harm or injury. If so, please explain briefly.

For examples of reviews, have a look at the NeurIPS web page linked above.

Schedule

The slides below are from last year; you can look at them ahead of time. The slides for this year will be mostly the same, and I will gradually release them under "Files".

Week | Date | Lecture Topics | Coursework | Optional Readings
1 | Feb 8 & 10 | Introduction and Background (slides 1, slides 2) | - | -
2 | Feb 15 & 17 | Autoregressive Models (slides 3, slides 4) | HW 1 released | van den Oord et al. (2016a, 2016b); Kalchbrenner et al. (2016); Vaswani et al. (2017)
3 | Feb 22 & 24 | Variational Autoencoders (slides 5, slides 6) | - | Kingma et al. (2014); Gregor et al. (2015); Burda et al. (2016); Maddison et al. (2017)
4 | Mar 1 & 3 | Normalizing Flow Models (slides 7, slides 8) | HW 1 due (03/03), HW 2 released | Kingma and Dhariwal (2018); Chen et al. (2018); Chen et al. (2019); Kumar et al. (2019)
5 | Mar 8 | Energy-Based Models (slides 11) | Project Proposal: Due Monday, March 8, 2021. | -
6 | Mar 15 & 17 | Generative Adversarial Networks (slides 9, slides 10) | HW 2 due (03/17), HW 3 released | Dumoulin et al. (2016); Arjovsky et al. (2017); Zhu et al. (2017)
7 | Mar 22 & 24 | Probabilistic Reasoning, Combining Generative Model Variants (slides 12) | - | -
8 | Mar 29 & 31 | Discreteness in Generative Modeling (slides 14); Evaluating Generative Models (slides 13) | HW 3 due (03/31) | -
9 | Apr 5 & 7 | Student Presentations: Malcolm Yang and Allan Bishop (04/05); Varsha, Oliver and Aaron (04/07) | - | -
10 | Apr 12 & 14 | Student Presentations: Tejas, Aaron and Hadi (04/12); Sachi Angle and Alexander Amy (04/14) | - | -
11 | Apr 19 & 21 | Student Presentations: Zheng Li, Wilson Yoo and Yucheng Lu (04/19); Rohan and Jonathan (04/21) | Project Progress Report: Due April 19, 2021. | -
12 | Apr 28 | Student Presentations: Kiran Tomlinson and Matt Wilber [Interpreting Transformer Language Models] | - | -
13 | May 3 & 5 | Student Presentations: Junxiong Wang, Tao Yu and Jingyi Duan (05/03); Frank Kim, Josephine Monika and Vijay Kumar Yarlagadda (05/05) | - | -
14 | May 10 & 12 | Student Presentations: Yuxuan Zhao and Ertai Liu [Deep generative models with missing values] (05/10); Roberto Halpin, Jack Wang and Tate Keller (05/12) | Final Project Reports: Due May 21, 2021. | -

 
