AL3451 Machine Learning Syllabus
| Course Detail | Description |
| --- | --- |
| Course Code | AL3451 – Machine Learning |
| L–T–P–C | 3–0–0–3 |
| Total Periods | 45 periods |
Course Objectives
The course aims to help students:
Understand basic concepts of machine learning.
Build supervised and unsupervised learning models.
Evaluate algorithms using appropriate metrics.
Course Outcomes (COs)
Upon successful completion of the course, students should be able to:
CO1: Explain basic Machine Learning (ML) concepts.
CO2: Construct supervised learning models.
CO3: Construct unsupervised learning algorithms.
CO4: Evaluate and compare different models.
Textbooks
Required textbooks for the course:
Ethem Alpaydin, Introduction to Machine Learning, 4th ed., MIT Press.
Stephen Marsland, Machine Learning: An Algorithmic Perspective, 2nd ed., CRC Press.
UNIT I: Introduction to Machine Learning (8 periods)
This unit focuses on the foundational concepts of Machine Learning.
| Topic | Subtopics |
| --- | --- |
| Review of Linear Algebra for ML | Minimum Linear Algebra concepts, applications (datasets, linear regression, regularization, PCA, SVD, Deep Learning) |
| Introduction and Motivation for Machine Learning | Introduction to ML, Machine Learning Definitions, Machine Learning Types (Supervised, Unsupervised, Reinforcement) |
| Examples of Machine Learning Applications | Netflix Recommendation Engine, Facebook’s Auto-tagging feature, Amazon’s Alexa, Google’s Spam Filter, Image Recognition, Traffic Prediction, Product Recommendations, Self-driving cars, Medical Diagnosis, Automatic Language Translation |
| Vapnik-Chervonenkis (VC) Dimension | Introduction, Shattering, Finding VC Dimension, Considerations & Keynotes |
| Probably Approximately Correct (PAC) Learning | — |
| Hypothesis Spaces, Inductive bias | Hypothesis in Machine Learning, Hypothesis in Statistics, Null Hypothesis, Alternative Hypothesis |
| Generalization | Definition of generalization, Factors that determine how well a model generalizes (Dataset, ML Algorithm, Model complexity, Regularization) |
| Bias variance trade-off | Bias, Variance Error, Different Combinations of Bias-Variance, Bias-Variance Trade-off |
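
The shattering idea behind the VC dimension can be checked by brute force on a tiny case. The sketch below is illustrative only and is not taken from the syllabus or the textbooks. It assumes a hypothesis class of closed intervals on the real line (points inside the interval are labelled 1), and the helper names `interval_labels` and `is_shattered` are hypothetical.

```python
from itertools import product


def interval_labels(points, a, b):
    """Label each point 1 if it lies in the closed interval [a, b], else 0."""
    return tuple(1 if a <= x <= b else 0 for x in points)


def is_shattered(points):
    """Return True if intervals [a, b] can realise every labelling of `points`.

    Assumption: it is enough to try intervals whose endpoints are sample
    points, plus the empty interval (all labels 0), because any other
    interval produces the same labels as one of these.
    """
    achievable = {tuple(0 for _ in points)}  # empty interval
    for a in points:
        for b in points:
            if a <= b:
                achievable.add(interval_labels(points, a, b))
    all_labellings = product((0, 1), repeat=len(points))
    return all(labelling in achievable for labelling in all_labellings)


print(is_shattered([1.0, 2.0]))       # True: all 4 labellings are realisable
print(is_shattered([1.0, 2.0, 3.0]))  # False: (1, 0, 1) needs two intervals
```

Two points can be shattered but these three cannot, which is consistent with the interval class having VC dimension 2.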
UNIT II: Supervised Learning (11 periods)
This unit covers various supervised learning models for both regression and classification.
| Topic | Subtopics |
| --- | --- |
| Linear Regression Models | Least squares method, single and multiple variables, Regression Algorithms |
| Bayesian linear regression | Implementing Bayesian Linear Regression, Bayesian Linear Modeling Application |
| Gradient descent | Definition, How Gradient Descent works, Types of Gradient Descent (Batch, Stochastic, Mini-Batch), Challenges (Local Minima/Saddle Point, Vanishing/Exploding Gradient) |
| Linear Classification Models | Linear Classification, Discriminant function |
| Perceptron algorithm | Perceptron Structure (Inputs, Weights, Weighted Sum, Thresholding), Update Rule |
| Probabilistic discriminative model | Logistic regression, Sigmoid Activation Function |
| Probabilistic generative model | Generative Modelling, Types of Generative Models, Naive Bayes Classifier Algorithm (Bayes' Theorem, Types: Gaussian, Multinomial, Bernoulli) |
| Maximum margin classifier | Support vector machine (SVM), Maximum Margin Classifiers, Kernel Trick |
| Decision Tree | Definition, Tree Structure (Internal Node, Branch, Leaf Node), Attribute Selection Measures (Information Gain, Gini Index) |
| Random Forests | Combining multiple Decision Trees using bagging method |
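
The perceptron structure listed above (inputs, weights, weighted sum, thresholding) and its update rule can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation. The AND dataset, the learning rate of 1.0, and the function names `predict` and `train_perceptron` are assumptions made for the example.

```python
def predict(weights, bias, x):
    """Weighted sum followed by a hard threshold at zero."""
    weighted_sum = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if weighted_sum > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Train on (inputs, label) pairs with the classic perceptron update.

    Update rule: w <- w + lr * (y - y_hat) * x, b <- b + lr * (y - y_hat).
    Training stops early once an epoch makes no mistakes.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in samples:
            delta = y - predict(weights, bias, x)
            if delta != 0:
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


# Assumed toy data: the logical AND function, which is linearly separable.
and_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_data)
print([predict(w, b, x) for x, _ in and_data])  # [0, 0, 0, 1]
```

Because AND is linearly separable, the update rule converges here; on non-separable data such as XOR the loop would simply stop after `epochs` passes without reaching zero mistakes.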
UNIT III: Ensemble Techniques and Unsupervised Learning (9 periods)
This unit covers combining multiple models and algorithms designed for unlabeled data.
| Topic | Subtopics |
| --- | --- |
| Combining multiple learners, Voting | Model combination schemes, Simple Ensemble Techniques (Max Voting, Averaging, Weighted Averaging) |
| Ensemble Learning | Bagging (Bootstrap Aggregating), Boosting, Stacking (Advanced Ensemble Technique), Blending |
| Unsupervised learning | Definition, Types (Clustering, Association), Popular unsupervised learning algorithms (K-means, GMM, etc.); KNN, though a supervised instance-based method, is also covered in this unit |
| K-means | K-means Clustering, Applications (Academic performance, Diagnostic systems, Search engines, Wireless sensor networks) |
| Instance-based learning | K-Nearest Neighbour (KNN) Algorithm for Machine Learning, Advantages and Disadvantages of KNN |
| Gaussian mixture models | GMM definition, Training steps, Applications (Density estimation, Clustering, Image segmentation, Anomaly detection, Time series analysis) |
| Expectation Maximization (EM) | EM algorithm steps (Initialization, E-Step, M-Step, Convergence), EM intuition, Applications |
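
The K-means topic above alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its cluster. The following is a minimal one-dimensional sketch under stated assumptions: the data, the initial centroids, and the function name `kmeans_1d` are made up for illustration, and real use would normally rely on a library implementation with better initialisation.

```python
def kmeans_1d(values, initial_centroids, max_iters=100):
    """Plain K-means on a list of numbers.

    Each iteration assigns every value to its nearest centroid, then
    moves each centroid to the mean of its assigned values. The loop
    stops when the centroids no longer change.
    """
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(max_iters):
        # Assignment step
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step (an empty cluster keeps its previous centroid)
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


data = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(data, initial_centroids=[1.0, 12.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```

The same assign-then-update pattern reappears in EM for Gaussian mixture models, where the hard assignment is replaced by soft responsibilities.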
UNIT IV: Neural Networks (9 periods)
This unit details the architecture and training methodologies of neural networks.
| Topic | Subtopics |
| --- | --- |
| Perceptron, Multilayer perceptron | Basic components (Input Nodes, Weights, Bias, Activation function), How Perceptron works, Multi-Layer Perceptron (MLP) Model, Advantages/Disadvantages of MLP |
| Activation functions | Types: Sigmoid, Tanh (Hyperbolic Tangent), ReLU (Rectified Linear Unit), Leaky ReLU, ELU (Exponential Linear Units) |
| Network training, Gradient descent optimization, Stochastic gradient descent (SGD) | Network training process (minimizing loss function), Gradient Descent objective, Types of Gradient Descent (Batch, Stochastic, Mini-batch), SGD algorithm steps |
| Error backpropagation | Definition, Backpropagation steps/algorithm (Forward/Backward stage), Types of Backpropagation (Static, Recurrent), Advantages |
| From shallow networks to deep networks | Error backpropagation applied to deep networks |
| Unit saturation / vanishing gradient problem | The problem, Why it happens (Sigmoid derivative), Solutions (Residual block, Batch normalization, ReLU) |
| ReLU | Advantages (Simpler computation, Representational Sparsity, Linearity), Disadvantages (Exploding Gradient, Dying ReLU) |
| Hyperparameter tuning | Definition of hyperparameters, Examples (C and sigma for SVM, k for k-nearest neighbors, learning rate), Strategies (Grid Search CV, Randomized Search CV) |
| Batch normalization | Normalization of the Input, Calculation steps (mean and standard deviation of hidden activation), Advantages (speeds up training, reduces internal covariate shift) |
| Regularization | Regularization Parameter definition, Techniques (Ridge Regression/L2 norm, Lasso Regression/L1 norm, Elastic Net Regression), Common Regularization Methods (Early stopping, Weight decay, Dropout, Model combination) |
| Dropout | Definition (randomly ignoring nodes), Dropout implementation, Downside (typically takes 2-3 times longer to train than a standard network) |
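
The vanishing gradient topic above attributes the problem to the sigmoid derivative. A rough numerical sketch makes the point. It deliberately ignores the weight matrices and multiplies only the activation derivatives across layers, which is a simplifying assumption, and the function names (including `chained_gradient`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum value is 0.25 at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def relu(z):
    return max(0.0, z)


def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0


def chained_gradient(local_grad, z, depth):
    """Product of `depth` identical activation derivatives.

    Simplification: every layer is assumed to see the same
    pre-activation z and weights are left out of the product.
    """
    return local_grad(z) ** depth


print(sigmoid_grad(0.0))                        # 0.25
print(chained_gradient(sigmoid_grad, 0.0, 10))  # about 9.5e-07
print(chained_gradient(relu_grad, 1.0, 10))     # 1.0
```

Even in the sigmoid's best case the factor shrinks geometrically with depth, while an active ReLU unit passes the gradient through unchanged. A ReLU unit with a non-positive input contributes a factor of 0, which is the "dying ReLU" issue listed above.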
UNIT V: Design and Analysis of Machine Learning Experiments (8 periods)
This unit focuses on experimental guidelines, evaluation, and comparison of models.
| Topic | Subtopics |
| --- | --- |
| Guidelines for Machine Learning Experiments | Project Life Cycle, Anatomy of a Machine Learning Experiment, Properties of an Experiment, Trial and Trial Component |
| Cross Validation (CV) and Resampling | Cross-Validation, The Validation Set Approach, K-Fold Cross-Validation, Bootstrapping Sampling Method |
| Measuring classifier performance | — |
| Assessing a single classification algorithm and comparing two classification algorithms | t-test, McNemar’s test, K-fold CV paired t-test |
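
K-fold cross-validation, listed above, splits the data into k folds and uses each fold once as the test set while training on the rest. The sketch below shows only the index bookkeeping. The generator name `k_fold_indices` is hypothetical, and it does not shuffle the data, which a practical experiment usually would do first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds.

    Folds are contiguous blocks; when n_samples is not divisible by k,
    the first n_samples % k folds get one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, test_idx
        start = stop


for train_idx, test_idx in k_fold_indices(n_samples=6, k=3):
    print("train:", train_idx, "test:", test_idx)
# train: [2, 3, 4, 5] test: [0, 1]
# train: [0, 1, 4, 5] test: [2, 3]
# train: [0, 1, 2, 3] test: [4, 5]
```

Each sample appears in exactly one test fold, so the k per-fold scores can be paired across two algorithms for the k-fold CV paired t-test named in the table.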