Foundations of Machine Learning

Foundations

CSCI 3151 is a rigorous introduction to the foundations of machine learning. The course emphasizes the mathematical, statistical, and computational ideas that make machine learning work in practice, while also giving students hands-on experience with modern machine learning tools and real datasets.

Students move from core learning paradigms and probabilistic foundations to model evaluation, regularization, and optimization; from kernel methods and feature engineering to dimensionality reduction and learned representations; and from feedforward neural networks to convolutional networks, recurrent networks, transformers, and self-/semi-supervised learning.

What students learn

By the end of the course, students will be able to:

  • explain core machine learning paradigms, including supervised, unsupervised, semi-supervised, and self-supervised learning
  • apply maximum likelihood estimation and explain its role in model training
  • evaluate models using appropriate metrics, validation strategies, and experimental design
  • analyze bias–variance tradeoffs and use regularization to improve generalization
  • implement and interpret kernel methods, neural networks, and dimensionality reduction techniques
  • diagnose training challenges such as overfitting and vanishing/exploding gradients
  • engineer features and use augmentation to improve robustness
  • describe the structure and intuition behind convolutional, recurrent, and transformer architectures
  • communicate machine learning results clearly and critically assess model behaviour
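Several of these outcomes meet in one small exercise: logistic regression is trained by minimizing the negative log-likelihood with gradient descent, then evaluated on accuracy. The sketch below is illustrative only, not part of the course materials; the toy dataset and all parameter values are made up.

```python
import numpy as np

# Toy illustration (not course material): fit logistic regression by
# gradient descent on the negative log-likelihood of a Bernoulli model.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)])[:, None]
y = np.concatenate([np.zeros(50), np.ones(50)])
X = np.hstack([X, np.ones((100, 1))])  # append a bias column

def nll(w):
    """Negative log-likelihood for p(y=1 | x) = sigmoid(x @ w)."""
    z = X @ w
    return np.sum(np.logaddexp(0, z) - y * z)

w = np.zeros(2)
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w)))   # predicted probabilities
    w -= 0.01 * (X.T @ (p - y))      # gradient of the NLL is X^T (p - y)

acc = np.mean(((X @ w) > 0) == y)    # training accuracy of the fitted model
```

Maximizing the likelihood and minimizing the cross-entropy loss are the same computation here, which is the connection the course draws between probabilistic foundations and model training.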

Course structure

The course is organized around a progression from ML foundations to modern architectures:

  • Foundations and paradigms: what machine learning is, when it is appropriate, and how supervised, unsupervised, semi-supervised, and self-supervised settings differ
  • Classical ML workflows: supervised pipelines, loss functions, logistic regression, maximum likelihood estimation, expectation maximization, model evaluation, train/validation/test design, and bias–variance
  • Capacity, optimization, and feature spaces: regularization, gradient descent, kernels, support vector machines, feature engineering, data augmentation, and dimensionality reduction with PCA, t-SNE, and UMAP
  • Deep learning foundations: feedforward neural networks, backpropagation, training stability, dropout, batch normalization, and representation learning
  • Modern architectures: convolutional neural networks, residual networks, recurrent neural networks, LSTMs, GRUs, attention, transformers, and introductory self-/semi-supervised learning
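The "deep learning foundations" stage above centers on feedforward networks trained with backpropagation. As a rough sketch of that idea only (a hypothetical example, not drawn from course assignments), here is a two-layer network fit to XOR by hand-derived backpropagation:

```python
import numpy as np

# Hypothetical minimal example: a two-layer feedforward network trained
# with backpropagation on XOR. Architecture and hyperparameters are made up.
rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden layer (tanh)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output layer (sigmoid)

losses, lr = [], 0.5
for _ in range(2000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))
    losses.append(float(np.mean(-y * np.log(p) - (1 - y) * np.log(1 - p))))

    # Backward pass: cross-entropy with a sigmoid output gives error (p - y)
    d_out = (p - y) / len(X)
    dW2 = h.T @ d_out;  db2 = d_out.sum(0)
    d_h = (d_out @ W2.T) * (1 - h ** 2)   # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ d_h;    db1 = d_h.sum(0)

    # Gradient descent update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

XOR is the classic case a linear model cannot fit, which is why it is a natural first test of a hidden layer; the same forward/backward pattern generalizes to the convolutional and recurrent architectures covered later.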