Foundations of Machine Learning
CSCI 3151 is a rigorous introduction to the foundations of machine learning. The course emphasizes the mathematical, statistical, and computational ideas that make machine learning work in practice, while also giving students hands-on experience with modern machine learning tools and real datasets.
Students move from core learning paradigms and probabilistic foundations to model evaluation, regularization, and optimization; from kernel methods and feature engineering to dimensionality reduction and learned representations; and from feedforward neural networks to convolutional networks, recurrent networks, transformers, and self-/semi-supervised learning.
What students learn
By the end of the course, students will be able to:
- explain core machine learning paradigms, including supervised, unsupervised, semi-supervised, and self-supervised learning
- apply maximum likelihood estimation and understand its role in model training
- evaluate models using appropriate metrics, validation strategies, and experimental design
- analyze bias–variance tradeoffs and use regularization to improve generalization
- implement and interpret kernel methods, neural networks, and dimensionality reduction techniques
- diagnose training challenges such as overfitting and vanishing/exploding gradients
- engineer features and use augmentation to improve robustness
- describe the structure and intuition behind convolutional, recurrent, and transformer architectures
- communicate machine learning results clearly and critically assess model behaviour
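One of these outcomes, applying maximum likelihood estimation in model training, can be sketched in a few lines. The following is an illustrative example only (not course material): logistic regression fit by gradient ascent on the average log-likelihood of a tiny hand-made 1-D dataset. The data, learning rate, and iteration count are all made up for the sketch.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 1-D dataset: negative inputs labeled 0, positive inputs labeled 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

# Fit weight w and bias b by gradient ascent on the average log-likelihood.
# For the Bernoulli model, the gradient takes the simple form (y - p) * x.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(500):
    grad_w = sum((y - sigmoid(w * x + b)) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(y - sigmoid(w * x + b) for x, y in zip(xs, ys)) / len(xs)
    w += lr * grad_w
    b += lr * grad_b

# The fitted model assigns high probability of class 1 to positive inputs.
p_pos = sigmoid(w * 2.0 + b)   # probability of class 1 for x = 2
p_neg = sigmoid(w * -2.0 + b)  # probability of class 1 for x = -2
```

Because maximizing this log-likelihood is equivalent to minimizing cross-entropy loss, the same sketch connects the probabilistic and loss-function views of training that the course develops.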
Course structure
The course is organized around a progression from ML foundations to modern architectures:
- Foundations and paradigms: what machine learning is, when it is appropriate, and how supervised, unsupervised, semi-supervised, and self-supervised settings differ
- Classical ML workflows: supervised pipelines, loss functions, logistic regression, maximum likelihood estimation, expectation maximization, model evaluation, train/validation/test design, and the bias–variance tradeoff
- Capacity, optimization, and feature spaces: regularization, gradient descent, kernels, support vector machines, feature engineering, data augmentation, and dimensionality reduction with PCA, t-SNE, and UMAP
- Deep learning foundations: feedforward neural networks, backpropagation, training stability, dropout, batch normalization, and representation learning
- Modern architectures: convolutional neural networks, residual networks, recurrent neural networks, LSTMs, GRUs, attention, transformers, and introductory self-/semi-supervised learning
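Several ideas from the "Capacity, optimization, and feature spaces" unit can be previewed with a minimal sketch. The example below (illustrative only, with made-up data) shows ridge regression in one dimension, where the closed-form solution makes the effect of the L2 penalty visible: increasing the regularization strength shrinks the fitted weight toward zero.

```python
# Illustrative sketch (not course code): 1-D ridge regression in closed form.
# The hypothetical data roughly follow y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]

def ridge_weight(lam):
    """Minimize sum((y - w*x)^2) + lam * w^2; closed form for a single weight."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

w_unreg = ridge_weight(0.0)    # ordinary least squares fit
w_reg = ridge_weight(10.0)     # heavily regularized: the weight shrinks
```

The penalty deliberately biases the estimate in exchange for lower variance on new data, which is exactly the bias–variance tradeoff analyzed earlier in the course.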