Machine Learning is a cutting-edge branch of Artificial Intelligence that has brought forth exciting new technological advances in recent years. This book introduces the topic and explains its practical applications. Aimed at graduate students, teachers and researchers, it will also help practitioners implement ML algorithms.
This book explores concepts such as feature engineering, model selection, model estimation, model validation and model explanation and provides an in-depth discussion of the main classification and clustering techniques and algorithms. It also examines optimal predictors and provides an introduction to Deep Learning architecture, including autoencoders and various neural networks.
This book is a valuable resource for anyone interested in machine learning, data mining and pattern recognition.
Salient features
Online resources available at: https://www.universitiespress.com/MachineLearningTheoryandPractice
M N Murty is Honorary Professor at the Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, India.
Ananthanarayana V S is Professor at the Department of Information Technology, National Institute of Technology Karnataka, Surathkal, Mangaluru, India.
Preface | Acknowledgements | List of Acronyms
Chapter 1: Introduction to Machine Learning Evolution of Machine Learning | Paradigms for ML | Learning by Rote | Learning by Deduction | Learning by Abduction | Learning by Induction | Reinforcement Learning | Types of Data | Matching | Stages in Machine Learning | Data Acquisition | Feature Engineering | Data Representation | Model Selection | Model Learning | Model Evaluation | Model Prediction | Model Explanation | Search and Learning | Explanation Offered by the Model | Data Sets Used
Chapter 2: Nearest Neighbor-Based Models Introduction to Proximity Measures | Distance Measures | Minkowski Distance | Weighted Distance Measure | Non-Metric Similarity Functions | Levenshtein Distance | Mutual Neighborhood Distance (MND) | Proximity Between Binary Patterns | Different Classification Algorithms Based on the Distance Measures | Nearest Neighbor Classifier (NNC) | K-Nearest Neighbor Classifier | Weighted K-Nearest Neighbor (WKNN) Algorithm | Radius Distance Nearest Neighbor Algorithm | Tree-Based Nearest Neighbor Algorithm | Branch and Bound Method | Leader Clustering | KNN Regression | Concentration Effect and Fractional Norms | Performance Measures | Performance of Classifiers | Performance of Regression Algorithms | Area Under the ROC Curve for the Breast Cancer Data Set
Chapter 3: Models Based on Decision Trees Introduction to Decision Trees | Decision Trees for Classification | Impurity Measures for Decision Tree Construction | Properties of the Decision Tree Classifier (DTC) | Applications in Breast Cancer Data | Embedded Schemes for Feature Selection | Regression Based on Decision Trees | Bias–Variance Trade-off | Random Forests for Classification and Regression | Comparison of DT and RF Models on Olivetti Face Data | AdaBoost Classifier | Regression Using DT-Based Models | Gradient Boosting (GB) | Practical Application
Chapter 4: The Bayes Classifier Introduction to the Bayes Classifier | Probability, Conditional Probability and Bayes’ Rule | Conditional Probability | Total Probability | Bayes’ Rule and Inference | Bayes’ Rule and Classification | Random Variables, Probability Mass Function, Probability Density Function and Cumulative Distribution Function, Expectation and Variance | Random Variables | Probability Mass Function (PMF) | Binomial Random Variable | Cumulative Distribution Function (CDF) | Continuous Random Variables | Expectation of a Random Variable | Variance of a Random Variable | Normal Distribution | The Bayes Classifier and its Optimality | Multi-Class Classification | Parametric and Non-Parametric Schemes for Density Estimation | Parametric Schemes | Class Conditional Independence and Naïve Bayes Classifier | Estimation of the Probability Structure | Naive Bayes Classifier (NBC)
Chapter 5: Machine Learning Based on Frequent Itemsets Introduction to the Frequent Itemset Approach | Frequent Itemsets | Frequent Itemset Generation | Frequent Itemset Generation Strategies | Apriori Algorithm | Frequent Pattern Tree and Variants | FP Tree-Based Frequent Itemset Generation | Pattern Count (PC) Tree-Based Frequent Itemset Generation | Frequent Itemset Generation Using the PC Tree | Dynamic Mining of Frequent Itemsets | Classification Rule Mining | Frequent Itemsets for Classification Using PC Tree | Frequent Itemsets for Clustering Using the PC Tree
Chapter 6: Representation Introduction to Representation | Feature Selection | Linear Feature Extraction | Vector Spaces | Basis of a Vector Space | Row Vectors and Column Vectors | Linear Transformations | Eigenvalues and Eigenvectors | Symmetric Matrices | Rank of a Matrix | Principal Component Analysis | Experimental Results on Olivetti Face Data | Singular Value Decomposition | PCA and SVD | Random Projections
Chapter 7: Clustering Introduction to Clustering | Partitioning of Data | Data Re-organization | Data Compression | Summarization | Matrix Factorization | Clustering of Patterns | Data Abstraction | Clustering Algorithms | Divisive Clustering | Agglomerative Clustering | Partitional Clustering | K-Means Clustering | K-Means++ Clustering | Soft Partitioning | Soft Clustering | Fuzzy C-Means Clustering | Rough Clustering | Rough K-Means Clustering Algorithm | Expectation Maximization-Based Clustering | Spectral Clustering | Clustering Large Data Sets | Divide-and-Conquer Method
Chapter 8: Linear Discriminants for Machine Learning Introduction to Linear Discriminants | Linear Discriminants for Classification | Parameters Involved in the Linear Discriminant Function | Learning w and b | Perceptron Classifier | Perceptron Learning Algorithm | Convergence of the Learning Algorithm | Linearly Non-Separable Classes | Multi-Class Problems | Support Vector Machines | Linearly Non-Separable Case | Non-linear SVM | Kernel Trick | Logistic Regression | Linear Regression | Sigmoid Function | Learning w and b in Logistic Regression | Multi-Layer Perceptrons (MLPs) | Backpropagation for Training an MLP | Results on the Digits Data Set
Chapter 9: Deep Learning Introduction to Deep Learning | Non-Linear Feature Extraction Using Autoencoders | Comparison on the Digits Data Set | Deep Neural Networks | Activation Functions | Initializing Weights | Improved Optimization Methods | Adaptive Optimization | Loss Functions | Regularization | Adding Noise to the Output or Label Smoothing | Experimental Results on the MNIST Data Set | Convolutional Neural Networks | Convolution | Padding Zero Rows and Columns | Pooling to Reduce Dimensionality | Recurrent Neural Networks | Training an RNN | Encoder–Decoder Models | Generative Adversarial Networks
Conclusions | Appendix – Hints to Practical Exercises | Index