Machine Learning is a branch of Artificial Intelligence that has driven major technological advances in recent years. This book introduces the topic and explains its practical applications. Aimed at graduate students, teachers and researchers, it will also help practitioners implement ML algorithms.

This book explores concepts such as feature engineering, model selection, model estimation, model validation and model explanation and provides an in-depth discussion of the main classification and clustering techniques and algorithms. It also examines optimal predictors and provides an introduction to Deep Learning architecture, including autoencoders and various neural networks.

This book is a valuable resource for anyone interested in machine learning, data mining and pattern recognition.

*Salient features*

- Clear and concise chapter learning objectives and summary of topics
- Over 125 solved examples to aid and enhance understanding of concepts
- Over 150 figures to provide visual impact and help readers visualize abstract concepts
- Applications drawn from real-life data sets
- Over 125 conceptual and application-based exercise questions
- Comprehensive bibliography of sources and topics for further reading
- Appendix with hints in the form of code snippets for all the practical exercises
- Android app with chapter-wise PowerPoint slides and code snippets for the ML programs given in the book

Online resources available at: https://www.universitiespress.com/MachineLearningTheoryandPractice

**M N Murty** is Honorary Professor at the Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, India.

**Ananthanarayana V S** is Professor at the Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangaluru, India.

*Preface*

*Acknowledgements*

*List of Acronyms*

**Chapter 1: Introduction to Machine Learning**

Evolution of Machine Learning | Paradigms for ML | Learning by Rote | Learning by
Deduction | Learning by Abduction | Learning by Induction | Reinforcement Learning |
Types of Data | Matching | Stages in Machine Learning | Data Acquisition |
Feature Engineering | Data Representation | Model Selection | Model Learning |
Model Evaluation | Model Prediction | Model Explanation | Search and Learning |
Explanation Offered by the Model | Data Sets Used

**Chapter 2: Nearest Neighbor-Based Models**

Introduction to Proximity Measures | Distance Measures | Minkowski Distance | Weighted
Distance Measure | Non-Metric Similarity Functions | Levenshtein Distance |
Mutual Neighborhood Distance (MND) | Proximity Between Binary Patterns |
Different Classification Algorithms Based on the Distance Measures | Nearest Neighbor
Classifier (NNC) | K-Nearest Neighbor Classifier | Weighted K-Nearest Neighbor
(WKNN) Algorithm | Radius Distance Nearest Neighbor Algorithm | Tree-Based Nearest
Neighbor Algorithm | Branch and Bound Method | Leader Clustering | KNN
Regression | Concentration Effect and Fractional Norms | Performance Measures |
Performance of Classifiers | Performance of Regression Algorithms | Area Under the
ROC Curve for the Breast Cancer Data Set

**Chapter 3: Models Based on Decision Trees**

Introduction to Decision Trees | Decision Trees for Classification | Impurity Measures
for Decision Tree Construction | Properties of the Decision Tree Classifier (DTC) |
Applications in Breast Cancer Data | Embedded Schemes for Feature Selection |
Regression Based on Decision Trees | Bias–Variance Trade-off | Random Forests
for Classification and Regression | Comparison of DT and RF Models on Olivetti
Face Data | AdaBoost Classifier | Regression Using DT-Based Models | Gradient Boosting
(GB) | Practical Application

**Chapter 4: The Bayes Classifier**

Introduction to the Bayes Classifier | Probability, Conditional Probability and Bayes’
Rule | Conditional Probability | Total Probability | Bayes’ Rule and Inference | Bayes’ Rule
and Classification | Random Variables, Probability Mass Function, Probability Density
Function and Cumulative Distribution Function, Expectation and Variance | Random
Variables | Probability Mass Function (PMF) | Binomial Random Variable | Cumulative
Distribution Function (CDF) | Continuous Random Variables | Expectation of a Random
Variable | Variance of a Random Variable | Normal Distribution | The Bayes Classifier
and its Optimality | Multi-Class Classification | Parametric and Non-Parametric Schemes
for Density Estimation | Parametric Schemes | Class Conditional
Independence and Naive Bayes Classifier | Estimation of the Probability Structure |
Naive Bayes Classifier (NBC)

**Chapter 5: Machine Learning Based on Frequent Itemsets**

Introduction to the Frequent Itemset Approach | Frequent Itemsets | Frequent Itemset
Generation | Frequent Itemset Generation Strategies | Apriori Algorithm | Frequent
Pattern Tree and Variants | FP Tree-Based Frequent Itemset Generation | Pattern Count
(PC) Tree-Based Frequent Itemset Generation | Frequent Itemset Generation Using
the PC Tree | Dynamic Mining of Frequent Itemsets | Classification Rule Mining |
Frequent Itemsets for Classification Using PC Tree | Frequent Itemsets for Clustering
Using the PC Tree

**Chapter 6: Representation**

Introduction to Representation | Feature Selection | Linear Feature Extraction |
Vector Spaces | Basis of a Vector Space | Row Vectors and Column Vectors |
Linear Transformations | Eigenvalues and Eigenvectors | Symmetric Matrices |
Rank of a Matrix | Principal Component Analysis | Experimental Results on Olivetti
Face Data | Singular Value Decomposition | PCA and SVD | Random Projections

**Chapter 7: Clustering**

Introduction to Clustering | Partitioning of Data | Data Re-organization | Data
Compression | Summarization | Matrix Factorization | Clustering of Patterns |
Data Abstraction | Clustering Algorithms | Divisive Clustering | Agglomerative
Clustering | Partitional Clustering | K-Means Clustering | K-Means++ Clustering |
Soft Partitioning | Soft Clustering | Fuzzy C-Means Clustering | Rough Clustering |
Rough K-Means Clustering Algorithm | Expectation Maximization-Based Clustering |
Spectral Clustering | Clustering Large Data Sets | Divide-and-Conquer Method

**Chapter 8: Linear Discriminants for Machine Learning**

Introduction to Linear Discriminants | Linear Discriminants for Classification |
Parameters Involved in the Linear Discriminant Function | Learning w and b | Perceptron
Classifier | Perceptron Learning Algorithm | Convergence of the Learning Algorithm |
Linearly Non-Separable Classes | Multi-Class Problems | Support Vector Machines |
Linearly Non-Separable Case | Non-linear SVM | Kernel Trick | Logistic Regression |
Linear Regression | Sigmoid Function | Learning w and b in Logistic Regression |
Multi-Layer Perceptrons (MLPs) | Backpropagation for Training an MLP | Results on
the Digits Data Set

**Chapter 9: Deep Learning**

Introduction to Deep Learning | Non-Linear Feature Extraction Using Autoencoders |
Comparison on the Digits Data Set | Deep Neural Networks | Activation Functions |
Initializing Weights | Improved Optimization Methods | Adaptive Optimization | Loss
Functions | Regularization | Adding Noise to the Output or Label Smoothing |
Experimental Results on the MNIST Data Set | Convolutional Neural Networks |
Convolution | Padding Zero Rows and Columns | Pooling to Reduce Dimensionality |
Recurrent Neural Networks | Training an RNN | Encoder–Decoder Models | Generative
Adversarial Networks

*Conclusions*

*Appendix – Hints to Practical Exercises*

*Index*