# CSCI 6962/4962 Project Details, Fall 2023

The second major component of the course (the primary one being the homework) is the course project. Its purpose is to give you experience reading research literature critically and crafting articulate technical presentations.

Your project selection must relate to the content of the course: it should involve large-scale machine learning or optimization. Due to the size of the class, the projects will be done in groups, each consisting entirely of undergrads or of grads. Preliminary group assignments will be posted to Piazza by Lecture 5; students are welcome to change group membership as long as everyone affected agrees.

Groups will choose to do one of two types of projects: *research* projects, or *pedagogical* projects. Graduate students must select a research project.

**Research projects.** In these, you will do original research related to the content of the course, either theoretical or applied; this can be work you have already been conducting or new work. The research will be presented in a 20-minute presentation and a report written in the format of an ICLR workshop submission.

**Pedagogical projects.**
For these, you will develop written lecture notes and an accompanying 20-minute presentation covering a topic that is related to, but not covered in, the course; part of this presentation must consist of a freshly designed empirical evaluation of the method or result under consideration. The presentation should be accompanied by a problem set, with solutions, similar in difficulty to the ones assigned in class, to test students' understanding after they watch the presentation. You can choose to survey algorithms in particular subfields of ML or optimization that we do not cover in class, or focus on a single paper in detail.

## Potential project ideas

Here are a few sample ideas to get your thoughts flowing:

- Pedagogical: Graph neural networks
- Pedagogical: Online convex optimization
- Pedagogical: Neural Architecture Search
- Pedagogical: Low Rank Adaptation for Large Language Models
- Pedagogical: Stochastic approaches to large-scale variational inference
- Pedagogical: Metropolis-Hastings with applications in ML or optimization
- Research: Select a recent paper from NeurIPS, ICML, or ICLR
- Research: An algorithm for structured low-rank tensor approximation
- Research: A novel method for fair federated learning
- Research: Adversarial missingness attacks on low-rank matrix factorization
- Research: Outlier robust kernel learning
- Research: AI alignment for LLMs

## Grading Rubric and Deadlines

Task | Due date (by 11:59pm ET of the indicated day) | Percentage of grade | Details |
---|---|---|---|
Project group assignment | Lecture 5 | 0 | The instructors will post the group assignments on Piazza. You are welcome to switch group membership if everyone in both groups approves and you send the instructors a group email saying so before the project selection deadline. Any group with a graduate student will do a research project. |
Project selection | Lecture 8 | 10 | via Submitty: the submitted project idea must have been preapproved |
Project progress report | Lecture 20 | 35 | via Submitty: should be 2/3 of the final report and include some experimental design and results as well as complete introduction, background, and related work sections |
Deliverables | Lecture 26 | 20 | via Submitty: submit a link to a public GitHub repo |
Presentations and Reports | Lecture 26 | 35 | via Submitty: submit the report and a prerecorded presentation uploaded to RPI's Box |

See the "What is Research?", "Giving Talks", and "Reading Papers" slide decks from the 2019 CS Grad Skills Seminar to understand the expectations I will use to evaluate your projects.

## Project selection

Project selection will be via Submitty: each person in the group should submit the same project idea. Groups must get *approval* to present the research or pedagogical topic that they have chosen. This means I need to be informed of your decision *and* have verified that it is an appropriate choice by the deadline. You must provide me with enough information to make this choice: for pedagogical topics, what specific topic will you cover, and why does it relate to the class? Similarly, for research topics, what is the problem you will tackle, what techniques will you attempt to use, and why does it relate to the class?

## Project Progress Report

Each member of your group must submit a written progress report in Submitty by the due date. In the case of pedagogical projects, this will be a written-out form of the lecture notes; these should be written in proper English and give full mathematical details and algorithms when appropriate. In the case of research projects, this will consist of a draft of the final workshop paper.

The report will be graded in accordance with the fact that it will be submitted 2/3 of the way through the project timeline. In particular, I *require* you to have a significant portion (50%) of your experiments done so that we can discuss them, and the complete problem set if relevant.

If you have difficulties with your research or pedagogy, meet with me to resolve them well before your scheduled formal discussion.

See my suggestions on reading papers from the 2019 CS Grad Skills Seminar.

## Deliverables

*Code* should be submitted in a single GitHub/GitLab repo for each project. The experimental results presented in your talk must be easily reproducible given access to this repo. The repo should contain:

- Well-documented cross-platform code for reproducing your experimental evaluations. Julia, Python, R, and C++/C are acceptable.
- Either include the data sets you used (if small enough), or provide a script that downloads and preprocesses them to the format that your code expects as input
- A PDF slide deck for your 20-minute in-class presentation, using appropriately *typeset math and legible figures*, that addresses all of the points below. This may be uploaded after the deliverables are due, as long as it is there before the in-class presentations.
- If your project is pedagogical, post the problem set here as well, clearly labeled.
- If your project is research-based, post the report pdf here.
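
As a rough illustration of the data-set requirement above, a download-and-preprocess script might look like the following minimal sketch. The function names (`sha256_of`, `fetch`, `preprocess`) and the drop-incomplete-rows cleaning rule are assumptions made for illustration, not a required interface; adapt them to whatever input format your own code expects.

```python
import csv
import hashlib
import urllib.request
from pathlib import Path
from typing import Optional


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch(url: str, dest: Path, expected_sha256: Optional[str] = None) -> Path:
    """Download url to dest unless dest already exists, then verify the checksum."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urllib.request.urlretrieve(url, str(dest))
    if expected_sha256 is not None and sha256_of(dest) != expected_sha256:
        raise ValueError(f"checksum mismatch for {dest}")
    return dest


def preprocess(raw_csv: Path, out_csv: Path) -> int:
    """Copy the header, drop rows with empty fields, and return the rows kept.

    This cleaning rule is only a placeholder for project-specific preprocessing.
    """
    kept = 0
    with raw_csv.open(newline="") as src, out_csv.open("w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # header row
        for row in reader:
            if all(field.strip() for field in row):
                writer.writerow(row)
                kept += 1
    return kept
```

A grader would then call `fetch` with the data set's URL and a local path, followed by `preprocess` on the downloaded file. Recording the expected checksum in the repo is one way to make sure everyone reproduces the experiments on the same data.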

## Presentations

Presentations will be played in class (20 minutes per group); each group member should speak during the presentation. See Prof. Anshelevich's suggestions on giving a good talk from the 2019 CS Grad Skills Seminar. Address the following points in your presentation to receive full credit for this portion. (These points are written for pedagogical projects; adjust them appropriately for research projects.)

- Who are the authors, and the date and venue of publication?
- What is the problem that is addressed (pick one, if the paper addresses more than one), and why is it interesting or useful?
- What is the main result of the paper?
- Describe the result or algorithm and motivate it intuitively.
- What is the cost (time, space, or some other metric) of this algorithm, and how does it compare to prior algorithms for the same problem? (and similarly, for non-algorithmic results)
- What performance guarantees, if any, are provided for the algorithm?
- Give an accurate description of the analysis given in the paper: in simple cases this may be a tour through the entire argument; when this is not possible, focus on explaining a core lemma/theorem that supports the claim of the paper.
- Provide an empirical evaluation of the algorithm: compare its performance to reasonable baselines, and explore relevant aspects of the algorithm (its variability, sensitivity to relevant properties of the input, etc.). If presenting a non-algorithmic result and it is possible, provide some experimental evidence of its sharpness or lack thereof.
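
To make the last point concrete, here is a minimal sketch of an evaluation harness that runs a method and a baseline over several random seeds and reports the mean and standard deviation of the test error, which is one simple way to expose variability. Everything in it is a toy assumption chosen for illustration: the synthetic 1-D regression data, the choice of SGD as the "method" and full-batch gradient descent as the "baseline", and the function names.

```python
import random
import statistics
from typing import Callable, Dict, List, Sequence, Tuple

Data = List[Tuple[float, float]]
Method = Callable[[Data, random.Random], float]


def make_data(n: int, rng: random.Random, noise: float = 0.1) -> Data:
    """Toy data y = 3x + Gaussian noise; a stand-in for a real data set."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 3.0 * x + rng.gauss(0.0, noise)) for x in xs]


def mse(w: float, data: Data) -> float:
    """Mean squared error of the linear model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def gradient_descent(data: Data, rng: random.Random,
                     steps: int = 50, lr: float = 0.5) -> float:
    """Baseline: full-batch gradient descent (rng is unused, kept for a common interface)."""
    w = 0.0
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def sgd(data: Data, rng: random.Random,
        steps: int = 50, lr: float = 0.5) -> float:
    """Method under study: SGD using one randomly chosen example per step."""
    w = 0.0
    for _ in range(steps):
        x, y = rng.choice(data)
        w -= lr * 2.0 * (w * x - y) * x
    return w


def evaluate(methods: Dict[str, Method], seeds: Sequence[int],
             n: int = 200) -> Dict[str, Tuple[float, float]]:
    """Run every method once per seed; return (mean, std) of test MSE per method."""
    results = {}
    for name, method in methods.items():
        losses = []
        for seed in seeds:
            rng = random.Random(seed)  # same seed gives the same data for every method
            train, test = make_data(n, rng), make_data(n, rng)
            losses.append(mse(method(train, rng), test))
        results[name] = (statistics.mean(losses), statistics.stdev(losses))
    return results


results = evaluate({"gd_baseline": gradient_descent, "sgd": sgd}, seeds=range(5))
for name, (mean, std) in results.items():
    print(f"{name}: test MSE = {mean:.4f} +/- {std:.4f}")
```

Fixing the seed per run keeps the comparison fair, since each method sees the same training and test data. The same loop extends naturally to sweeping a property of the input (for example, the noise level or `n`) to study sensitivity.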