Course Notes: High Performance Machine Learning
These are my notes for the High Performance Machine Learning course.
Introduction to Benchmarking
Coding Exercise
We implement microbenchmarks that measure execution time, memory bandwidth, and floating-point throughput (FLOPS), giving insight into system efficiency and computational throughput.
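As a concrete illustration, the sketch below times a dense matrix multiply and a large array copy to estimate FLOPS and memory bandwidth. The NumPy implementation, problem sizes, and iteration counts are my own assumptions for illustration, not the exercise's exact code.

```python
# A minimal microbenchmark sketch (assumed sizes and iteration counts).
import time
import numpy as np

def time_it(fn, iters=10):
    """Run fn several times and return the average wall-clock time in seconds."""
    fn()  # warm-up run
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters

# Compute benchmark: an NxN matmul performs roughly 2*N^3 floating-point ops.
N = 1024
a = np.random.rand(N, N).astype(np.float32)
b = np.random.rand(N, N).astype(np.float32)
t = time_it(lambda: a @ b)
print(f"matmul: {t * 1e3:.2f} ms, {2 * N**3 / t / 1e9:.1f} GFLOP/s")

# Memory benchmark: a copy reads and writes the buffer once each (~2x its size).
x = np.random.rand(64 * 1024 * 1024).astype(np.float32)  # 256 MiB buffer
t = time_it(lambda: x.copy())
print(f"copy: {t * 1e3:.2f} ms, {2 * x.nbytes / t / 1e9:.1f} GB/s")
```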
Introduction to Profiling ML Models
Coding Exercise
We implement a ResNet-18 model and train it on the CIFAR-10 dataset under various training configurations. We profile performance, analyzing the impact of the number of data-loader workers, the choice of optimizer, and batch normalization layers.
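The sketch below shows the shape of one such experiment: timing a fixed number of ResNet-18 training steps on CIFAR-10 while varying the number of DataLoader workers. The batch size, learning rate, and step count are assumptions for illustration, not the assignment's exact settings.

```python
# A minimal sketch of a data-loader worker-count experiment (assumed hyperparameters).
import time
import torch
import torchvision
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
transform = transforms.Compose([transforms.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)

for num_workers in (0, 2, 4):
    loader = torch.utils.data.DataLoader(train_set, batch_size=128,
                                         shuffle=True, num_workers=num_workers)
    model = torchvision.models.resnet18(num_classes=10).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()

    start = time.perf_counter()
    for step, (images, labels) in enumerate(loader):
        if step == 50:  # time a fixed number of steps rather than a full epoch
            break
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"num_workers={num_workers}: "
          f"{time.perf_counter() - start:.1f} s for 50 steps")
```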
Introduction to Model Tuning
Coding Exercise
We build a chatbot trained with hyperparameters obtained from a Weights & Biases (W&B) sweep. We profile the model using the PyTorch Profiler to analyze performance bottlenecks, and we create an optimized TorchScript version for efficient deployment.
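The sketch below illustrates only the profiling and TorchScript-export steps, using a small stand-in model; the actual chatbot architecture and the W&B sweep configuration are omitted, and the layer sizes are placeholders.

```python
# A minimal sketch of PyTorch Profiler usage and TorchScript export
# (stand-in model; sizes are placeholders, not the chatbot's architecture).
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Sequential(          # stand-in for the chatbot model
    torch.nn.Embedding(10_000, 256),
    torch.nn.Linear(256, 256),
    torch.nn.ReLU(),
    torch.nn.Linear(256, 10_000),
).eval()
tokens = torch.randint(0, 10_000, (1, 32))

# Profile a forward pass to surface the dominant operators.
with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with torch.no_grad():
        model(tokens)
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))

# Export an optimized TorchScript version for deployment.
scripted = torch.jit.trace(model, tokens)
scripted = torch.jit.freeze(scripted)
scripted.save("chatbot_scripted.pt")
```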
Introduction to CUDA Programming
Coding Exercise
We implement CUDA kernels for matrix multiplication and convolution, and experiment with Unified Memory. We then benchmark GPU performance, measuring execution time, memory throughput, and computational efficiency.
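The kernels themselves are written in CUDA C++; the sketch below only illustrates the benchmarking side from Python, timing a GPU matrix multiply with CUDA events and deriving FLOP and memory-throughput figures. The matrix size and the use of PyTorch's built-in matmul are assumptions for illustration.

```python
# A minimal sketch of GPU timing with CUDA events (assumed size; uses
# PyTorch's matmul as a stand-in for the hand-written CUDA kernel).
import torch

assert torch.cuda.is_available()
N = 4096
a = torch.randn(N, N, device="cuda")
b = torch.randn(N, N, device="cuda")

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
torch.matmul(a, b)            # warm-up
torch.cuda.synchronize()

start.record()
c = torch.matmul(a, b)
end.record()
torch.cuda.synchronize()

ms = start.elapsed_time(end)                   # elapsed time in milliseconds
flops = 2 * N**3                               # multiply-adds in an NxN matmul
bytes_moved = 3 * N * N * a.element_size()     # read a and b, write c (ideal traffic)
print(f"{ms:.2f} ms, "
      f"{flops / (ms * 1e-3) / 1e12:.1f} TFLOP/s, "
      f"{bytes_moved / (ms * 1e-3) / 1e9:.1f} GB/s")
```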