# CSCI 6220/4030 Project Details, Fall 2019

As teams, you will give a 20 minute presentation of an interesting or cool result or an algorithm (or class of algorithms) from a course-relevant technical or survey paper to the class; if your research is relevant, that could also be the topic of your presentation.

This can be material that requires no background beyond general knowledge of CS, or more niche/applied knowledge; in the latter case, you should quickly explain the relevant concepts and background. See the "What is Research?", "Giving Talks", and "Reading Papers" slide decks from the 2019 CS Grad Skills Seminar.

## Grading Rubric and Deadlines

Task | Due date | Percentage of grade |
---|---|---|

Paper selection | 11:59pm, October 28 | 10 |

Discuss paper with me | by 5pm, November 22 | 40 |

Deliverables | start of class, December 2 | 10 |

In-class 20 minute presentations | December 2 or 5 | 40 |

## Paper selection

You group must *get my approval* to present your research or the paper you have chosen. This means I need to be informed of your decision *and* have verified that it is an appropriate choice by the deadline.

- the kmeans++ algorithm
- the Blendenpik algorithm for solving linear systems
- the Goemans–Williamson randomized approximation to MAX-CUT
~~randomized triangle counting in graphs~~- randomized linear-time computation of minimum spanning trees
- locality sensitive hashing (e.g. SimHash)
- a stochatic variance reduced gradient method (I recommend the discussion around Theorem 6.5 of Bubeck's convex optimization book for a concise explanation of SVRG)
- the randomized approximate Caratheodory theorem
~~randomized element-wise sparsification of a matrix~~- compressed sensing (using the fact that random matrices satisfy the restricted isometry principle)
- the Johnson–Lindenstrauss transform, and application to approximate nearest neighbor search

## Discuss paper with me

See my suggestions on reading papers from the 2019 CS Grad Skills Seminar.
*All of your group* must discuss your research/paper in a group meeting with me by the deadline. Address all the points I raise below, to show that you will be able to give a quality presentation. Justify your choice of empirical evaluations and baselines. I *highly* suggest you have your experiments done so that we can discuss them.

Schedule this meeting at least the week before the deadline: I will not be able to meet with every group on the day of the deadline. If you have difficulties with your paper, meet with me to resolve them well before your scheduled formal discussion.

## Deliverables:

These should be submitted in a single github/gitlab repo for each project. The experimental results presented in your talk must be easily reproducible given access to this repo.

- Well-documented cross-platform code for reproducing your experimental evaluations. Julia, Python, R, and C++/C are acceptable.
- Either include the data sets you used (if small enough), or provide a script that downloads and preprocesses them to the format that your code expects as input
- A pdf slide deck for your 20 minute in-class presentation, using appropriately
*typeset math and legible figures*that addresses all of the points below.

## In-class presentation

See Prof. Anshelevich's suggestions on giving a good talk from the 2019 CS Grad Skills Seminar. Address the following points in your presentation to receive full credit for this portion.

- Who are the authors, and the date and venue of publication?
- What is the problem that is addressed (pick one, if the paper addresses more than one), and why is it interesting or useful?
- What is the main result of the paper?
- Describe the result or algorithm and motivate it intuitively.
- What is the cost (time, space, or some other metric) of this algorithm, and how does it compare to prior algorithms for the same problem? (and similarly, for non-algorithmic results)
- What performance guarantees, if any, are provided for the algorithm?
- Give an accurate description of the analysis given in the paper: in simple cases this may be a tour through the entire argument; when this is not possible, focus on explaining a core lemma/theorem that supports the claim of the paper.
- Provide an empirical evaluation of the algorithm: compare its performance to reasonable baselines, and explore relevant aspects of the algorithm (its variability, sensitivity to relevant properties of the input, etc.). If presenting a non-algorithmic result and it is possible, provide some experimental evidence of its sharpness or lack thereof.