Foundations of Machine Learning
CSCI 3151 is a rigorous introduction to the foundations of machine learning. The course emphasizes the mathematical, statistical, and computational ideas that make machine learning work in practice, while also giving students hands-on experience with modern machine learning tools and real datasets.
Students move from core learning paradigms and probabilistic foundations to model evaluation, regularization, and optimization; from kernel methods and feature engineering to dimensionality reduction and learned representations; and from feedforward neural networks to convolutional networks, recurrent networks, transformers, and self-/semi-supervised learning.
What students learn
By the end of the course, students will be able to:
- explain core machine learning paradigms, including supervised, unsupervised, semi-supervised, and self-supervised learning
- apply maximum likelihood estimation and understand its role in model training
- evaluate models using appropriate metrics, validation strategies, and experimental design
- analyze bias–variance tradeoffs and use regularization to improve generalization
- implement and interpret kernel methods, neural networks, and dimensionality reduction techniques
- diagnose training challenges such as overfitting and vanishing/exploding gradients
- engineer features and use augmentation to improve robustness
- describe the structure and intuition behind convolutional, recurrent, and transformer architectures
- communicate machine learning results clearly and critically assess model behaviour
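One of these outcomes, applying maximum likelihood estimation in model training, can be sketched in a few lines. The following is an illustrative example only (not course material): logistic regression fit by gradient ascent on the average log-likelihood of a tiny hand-made 1-D dataset. The data, learning rate, and iteration count are all made up for the sketch.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 1-D dataset: negative inputs labeled 0, positive inputs labeled 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

# Fit weight w and bias b by gradient ascent on the average log-likelihood.
# For the Bernoulli model, the gradient takes the simple form (y - p) * x.
w, b, lr = 0.0, 0.0, 0.5
for _ in range(500):
    grad_w = sum((y - sigmoid(w * x + b)) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(y - sigmoid(w * x + b) for x, y in zip(xs, ys)) / len(xs)
    w += lr * grad_w
    b += lr * grad_b

# The fitted model assigns high probability of class 1 to positive inputs.
p_pos = sigmoid(w * 2.0 + b)   # probability of class 1 for x = 2
p_neg = sigmoid(w * -2.0 + b)  # probability of class 1 for x = -2
```

Because maximizing this log-likelihood is equivalent to minimizing cross-entropy loss, the same sketch connects the probabilistic and loss-function views of training that the course develops.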
Course structure
The course is organized around a progression from ML foundations to modern architectures:
- Foundations and paradigms: what machine learning is, when it is appropriate, and how supervised, unsupervised, semi-supervised, and self-supervised settings differ
- Classical ML workflows: supervised pipelines, loss functions, logistic regression, maximum likelihood estimation, expectation maximization, model evaluation, train/validation/test design, and the bias–variance tradeoff
- Capacity, optimization, and feature spaces: regularization, gradient descent, kernels, support vector machines, feature engineering, data augmentation, and dimensionality reduction with PCA, t-SNE, and UMAP
- Deep learning foundations: feedforward neural networks, backpropagation, training stability, dropout, batch normalization, and representation learning
- Modern architectures: convolutional neural networks, residual networks, recurrent neural networks, LSTMs, GRUs, attention, transformers, and introductory self-/semi-supervised learning
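Several ideas from the "Capacity, optimization, and feature spaces" unit can be previewed with a minimal sketch. The example below (illustrative only, with made-up data) shows ridge regression in one dimension, where the closed-form solution makes the effect of the L2 penalty visible: increasing the regularization strength shrinks the fitted weight toward zero.

```python
# Illustrative sketch (not course code): 1-D ridge regression in closed form.
# The hypothetical data roughly follow y = 2x with a little noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]

def ridge_weight(lam):
    """Minimize sum((y - w*x)^2) + lam * w^2; closed form for a single weight."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

w_unreg = ridge_weight(0.0)    # ordinary least squares fit
w_reg = ridge_weight(10.0)     # heavily regularized: the weight shrinks
```

The penalty deliberately biases the estimate in exchange for lower variance on new data, which is exactly the bias–variance tradeoff analyzed earlier in the course.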