SankalpK1/CS498

 
 

CS 498: Machine Learning Systems (F'25)

Logistics

Lectures: 1310 Digital Computer Laboratory, WF: 11:00 AM – 12:15 PM

Member (NetID) | Role | Office Hours
Fan Lai (fanlai) | Instructor | 3128 Siebel Center. W 3:00 PM – 4:00 PM
Jimmy Shong (jimmys2), Jamlee Jin (jianlij2) | TAs | Zoom. F 2:00 PM – 3:00 PM

Canvas: ALL communication regarding this course must be via Canvas. This includes questions, discussions, announcements, assignments, as well as private messages.

Course Description

Learning Objectives: This course will introduce the basic concepts and cutting-edge practices in the design and implementation of efficient software systems for supporting machine learning (ML) models, with a particular focus on Generative AI (GenAI). By the end of the course, students will be able to:

  • Understand and critique the design principles behind state-of-the-art ML systems, from model architecture to system-level considerations.
  • Develop and utilize tools to profile, monitor, and optimize the performance of ML systems.
  • Explore and conduct research in topics related to the practical deployment and optimization of ML systems, contributing to the evolving landscape of efficient ML operations.

Structure: The course will combine lectures, guest lectures from practitioners, lab assignments, reading summaries, and a semester-long project. We will explore key ML topics from a systems perspective, addressing the relevant challenges across the ML lifecycle. Topics include, but are not limited to:

  • Basics of ML models from a systems perspective.
  • Systems for the ML lifecycle (pre-training, training, fine-tuning, inference serving, and grounding).

Note that this course is NOT focused on AI methods. Instead, we will focus on how one can build software systems so that existing AI methods can be used in practice and new AI methods can emerge.

Prerequisites: Students are expected to have good programming skills and must have taken at least one systems-related course (from operating systems, databases, distributed systems, or networking). Having an ML/AI background is helpful but not required.

Tentative Schedule

This is an evolving list and is subject to change due to the breakneck pace of AI.

Date | Topic | Lecturer | Slides | Assignment/Summary
Aug 27 | Course Introduction and Logistics | Fan Lai | Slides |
Aug 29 | Transformers | Jimmy Shong | Slides |
Sept 3 | Transformers Deep Dive | Fan Lai | Slides |
Sept 5 | Distributed Training Overview | Fan Lai | Slides | DeepSeek-V3 Report (Sec 1-3)
Sept 10 | Data Parallelism | Fan Lai | Slides |
Sept 12 | Tensor Parallelism | Fan Lai | Slides | LlamaRL
Sept 17 | Pipeline Parallelism | Fan Lai | Slides |
Sept 19 | Multi‑Dimensional Parallelism | Fan Lai | Slides | Alpa
Sept 24 | Mixed Precision Training | Fan Lai | Slides | Assignment 1 Release
Sept 26 | No Class (Meetings to discuss project ideas) | Fan Lai | |
Oct 1 | Memory Optimization | Fan Lai | Slides | Project Proposal Due
Oct 3 | Finetuning Techniques | Fan Lai | Slides | ZeRO-style Data Parallelism
Oct 8 | Course Project Proposal Feedback | Fan Lai | |
Oct 10 | Course Project Proposal Feedback | Fan Lai | |
Oct 15 | Course Project Proposal Feedback | Fan Lai | | Assignment 1 Due
Oct 17 | Efficient Machine Learning for Intelligent Machines (Guest Lecture) | Chenfeng Xu | |
Oct 22 | Inference Overview | Fan Lai | | Speculative Decoding
Oct 24 | Batch Serving Techniques | Fan Lai | | DistServe
Oct 29 | Paged Attention | Fan Lai | | SGLang
Oct 31 | Adaptive KV | Fan Lai | | Assignment 2 Release
Nov 5 | Quantization | Fan Lai | | AWQ
Nov 7 | LLM Inference Scheduling | Fan Lai | | Mid-semester Report Due
Nov 12 | Advanced topics: RAG Systems | Fan Lai | | MoonCake
Nov 14 | Advanced topics: Caching GenAI | Fan Lai | | NIAVANA
Nov 19 | Buffer | | | Assignment 2 Due
Nov 21 | Guest Lecture | | |
Nov 22-30 | Fall Break | | |
Dec 3 | Final Presentations | | |
Dec 5 | Final Presentations | | |
Dec 10 | Final Presentations | | |
Dec 19 | No Class | | | Final Report Due

Tentative Grading

Groups: The panel discussion and research project will be done in groups of 4-5 students. Form a group and declare your group's membership and paper preferences by Sept 8. After this date, we will form groups from the remaining students.

Component | Weight
Attendance | 10%
Panel Discussion | 6% (3% + 3%)
Lab assignments | 20% (2 lab assignments, 10% each)
Reading summary | 24% (opt-in 8 out of 10 readings, 3% each)
Final project presentation | 15%
Project report | 25% (5% + 5% + 15%)
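
As a rough illustration of how the weights above combine, the sketch below computes a weighted course total in Python. The weights are copied from the table; the function name `course_total`, the "fraction earned" input format, and the choice to count a missing component as 0 are assumptions made for this example, not part of the official grading procedure.

```python
# Weights copied from the grading table (percent of the course grade).
WEIGHTS = {
    "Attendance": 10,
    "Panel Discussion": 6,
    "Lab assignments": 20,
    "Reading summary": 24,
    "Final project presentation": 15,
    "Project report": 25,
}

# Sanity check: the listed components add up to 100%.
assert sum(WEIGHTS.values()) == 100


def course_total(fractions):
    """Return the weighted total in percent.

    `fractions` maps a component name to the fraction of that component
    earned, in [0, 1]. Components missing from the mapping count as 0
    (an assumption of this sketch).
    """
    for name, value in fractions.items():
        if name not in WEIGHTS:
            raise KeyError(f"unknown component: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"fraction out of range for {name}: {value}")
    return sum(WEIGHTS[name] * fractions.get(name, 0.0) for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical student: full marks everywhere except half credit on labs.
    example = {name: 1.0 for name in WEIGHTS}
    example["Lab assignments"] = 0.5
    print(course_total(example))
```

Any rounding or letter-grade cutoffs are outside the scope of this sketch.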

Academic integrity: The University's Honor Code applies to all activities related to this course. All material you submit in this course (reading summary, project reports, and presentation materials) must be your own. If you use someone else’s material, you must cite it properly.

AI Tool Policy: AI tools may be used for grammar checking and refining initial brainstorms, but the final reviews and code must be authored by the student. Students are responsible for the entire content and must adhere to the Academic Integrity Policy.

Policies

Participation

Before Each Lecture: Some lectures may include a required reading. You must submit a summary of the paper by 11:59 PM on the due date.

During Lectures: Active participation is crucial both for your own understanding and for the overall quality of the course. You are expected to attend all lectures (up to 2 absences allowed for legitimate reasons) and, more importantly, to participate in class discussions. Not everyone must add something every day, but everyone is expected to have something to share over the semester.

Paper Summaries

You need to write summaries for 8 of the 10 listed papers. Each summary should be done independently and include the following contents (five paragraphs):

  • P1: The problem the paper is trying to tackle. What's the impact of the work, e.g., why is it an important problem to solve?
  • P2: The main proposed idea(s).
  • P3: A summary of your understanding of different components of the proposed technique, e.g., the purpose of critical design choices.
  • P4: Your perceived strengths and weaknesses of the work, e.g., novelty, significance of improvements, quality of the evaluation, ease of use.
  • P5: Is there room for improvement? If so, what idea do you have for improving the techniques?

You do not need to write super long paragraphs, as long as you have the key points listed out in each paragraph. You can discuss the paper with other students, but all of your writing work should be your own. DO NOT use AI tools to draft it!

In terms of grading criteria, each summary has 10 points in total. For each review item above, you get:

  • 2: The summary item demonstrates a clear understanding of the paper.
  • 1: The summary item misses the point of the paper.
  • 0: The summary item is missing.

Because you choose only 8 of the 10 papers to summarize, late submissions will not be accepted and will receive 0 points.
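
The scoring rule above (five items, each worth 0, 1, or 2 points, for 10 points per summary) can be sketched in Python as follows. The function names are hypothetical, and the linear mapping from summary points to the 3% per-summary weight in the grading table is an assumption of this sketch rather than a stated policy.

```python
ITEMS_PER_SUMMARY = 5        # P1 through P5
ALLOWED_ITEM_SCORES = (0, 1, 2)
MAX_SUMMARY_POINTS = 10      # 5 items x 2 points
SUMMARIES_REQUIRED = 8       # 8 of the 10 listed papers
WEIGHT_PER_SUMMARY = 3.0     # percent of the course grade, from the grading table


def summary_points(item_scores):
    """Total points (0-10) for one summary, given its five item scores."""
    if len(item_scores) != ITEMS_PER_SUMMARY:
        raise ValueError("a summary has exactly five scored items")
    if any(score not in ALLOWED_ITEM_SCORES for score in item_scores):
        raise ValueError("each item is scored 0, 1, or 2")
    return sum(item_scores)


def reading_component(summaries):
    """Percent of the course grade earned from reading summaries.

    `summaries` is a list of five-item score lists, one per submitted
    summary. Assumes points scale linearly to the 3% weight; a late or
    missing summary is simply absent from the list and earns 0.
    """
    if len(summaries) > SUMMARIES_REQUIRED:
        raise ValueError("at most 8 summaries count toward the grade")
    return sum(
        WEIGHT_PER_SUMMARY * summary_points(scores) / MAX_SUMMARY_POINTS
        for scores in summaries
    )


if __name__ == "__main__":
    # Hypothetical item scores for a single summary.
    print(summary_points([2, 2, 1, 2, 0]))
```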

Post-Lecture Panel Discussion

To foster a deeper understanding of the papers and encourage critical thinking, lectures with a paper summary will be followed by a panel discussion. This discussion will involve three distinct roles played by different student groups, simulating an interactive and dynamic scholarly exchange.

Roles and Responsibilities

  1. The Companion (Author) Group
  • Responsibility: As authors, you are expected to defend your paper against critiques, answer questions, and discuss how you might improve or extend your research in the future, akin to writing a rebuttal during the peer-review process.
  2. The Reviewer Group
  • Responsibility: Reviewers critically assess the paper, posing challenging questions and highlighting potential weaknesses or areas for further investigation. Your goal is to engage in a constructive critique of the paper, simulating a peer review scenario.
  3. Rest of the Class
  • Responsibility: During the panel discussions, feel free to actively ask questions and engage in the dialogue.

The lecturer will also pose challenging questions to both the companion and reviewer groups, so please come well-prepared!

Project

You will have to complete substantive work on an instructor-approved problem and make an original contribution. Surveys are not permitted as projects; instead, each project must contain a survey of background and related work.

You must meet the following milestones (unless otherwise specified in future announcements) to ensure a high-quality project at the end of the semester:

  • Turn in a 2-page draft proposal (template), plus as many pages as needed for references, by Oct 1. Remember to include the names and UIUC email addresses of the group members.
  • Each group must schedule a project discussion with the instructor during class hours or office hours in the weeks of Oct 3 and Oct 8.
  • Each group must turn in a 3/4-page mid-semester report via email on or before 6:00 PM CST on Nov 7.
  • Each group must turn in an 8-page final report and your code via email on or before 6:00 PM CST on Dec 19. The report must be submitted as a PDF file, with formatting similar to that of the papers you've read in the class. The self-contained (i.e., including ALL dependencies) code must be submitted as a zip file. Each zip file containing the code must include a README file with a step-by-step guide on how to compile and run the provided code.
  • You can find how to access GPU resources here.

Acknowledgements

This course alternates with Prof. Minjia Zhang's CS 498. Big thanks to Prof. Minjia Zhang!
