
# Learning Algorithms

| Learning Type | Task | Algorithm | Comment | Probabilistic | Parametric | Scope | \(d_\text{VC}\) | Bias | Variance | Generalization | Advantages | Disadvantages |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Supervised | Regression | OLS | | | | Global | \(k+1\) | High | Low | Good (\(n \gg k\)) | | |
| | Classification | Logistic | | | | Global | \(k+1\) | High | Low | Good (\(n \gg k\)) | | |
| | Regression/Classification | Piecewise Constant | | | | Local | | | | | | |
| | Regression/Classification | Piecewise Polynomial | | | | Local | | | | | | |
| | Regression/Classification | SVM | Margin-Based | | | | | | | | | |
| | Classification | Gaussian | | | | | | | | | | |
| | Regression/Classification | KNN | Nearest Neighbor | | | | | | | | | |
| | Regression/Classification | Decision Tree | Automatic Piecewise Constant; exactly opposite in characteristics w.r.t. OLS | | | Local | | Low | High | | Highly interpretable<br>Auto-detects non-linear relationships<br>Auto-models variable interactions<br>Fast evaluation: traversal only occurs on a subset of attributes | Poor regression performance<br>Unstable: tree structure is sensitive to the training data; changing the training data changes the tree<br>Requires a large number of splits for even simple relationships |
| | Regression/Classification | Linear Tree | Automatic Piecewise Polynomial | | | Local | | | | | | |
| | Regression/Classification | Random Forest | Bagged Trees | | | Local | | | | | | |
| | Regression/Classification | XGBoost | Boosted Trees | | | Local | | | | | | |
| | Regression/Classification | CatBoost | Boosted Trees | | | Local | | | | | | |
| | Regression/Classification | LightGBM | Boosted Trees | | | Local | | | | | | |
| | Anomaly Detection | Isolation Forest | | | | | | | | | | |
| Unsupervised | Clustering | K-Means | | | | | | | | | | |
| | Anomaly Detection | Kernel Density Estimation | | | | | | | | | | |
| Reinforcement Learning | | Q-Learning | | | | | | | | | | |
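As an illustration of the "exactly opposite characteristics" comment in the Decision Tree row, here is a minimal sketch (not part of the original notes; the synthetic sine data and the scikit-learn estimators are assumptions on my part). It refits OLS and an unpruned decision tree on many resampled training sets and compares their estimated squared bias and variance on a fixed grid:

```python
# Minimal sketch: OLS (high bias, low variance) vs decision tree (low bias, high variance)
# on a synthetic 1-D regression task. Data-generating process is assumed, not from the notes.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def make_data(n=200):
    x = rng.uniform(0, 2 * np.pi, size=(n, 1))
    y = np.sin(x).ravel() + rng.normal(scale=0.3, size=n)  # non-linear truth + noise
    return x, y

x_grid = np.linspace(0, 2 * np.pi, 100).reshape(-1, 1)
true_y = np.sin(x_grid).ravel()
ols_preds, tree_preds = [], []

# Refit both models on many resampled training sets to expose the variance difference
for _ in range(50):
    x, y = make_data()
    ols_preds.append(LinearRegression().fit(x, y).predict(x_grid))
    tree_preds.append(DecisionTreeRegressor().fit(x, y).predict(x_grid))

for name, preds in [("OLS", ols_preds), ("Decision Tree", tree_preds)]:
    preds = np.asarray(preds)
    bias_sq = np.mean((preds.mean(axis=0) - true_y) ** 2)  # squared bias of the mean prediction
    variance = preds.var(axis=0).mean()                    # average prediction variance
    print(f"{name:13s}  bias^2 = {bias_sq:.3f}  variance = {variance:.3f}")
```

OLS should report the larger squared bias (a straight line cannot capture the sine shape) while the fully grown tree should report the larger variance (its structure changes with every resample), matching the table's contrast.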

## Curse of Dimensionality

As the number of dimensions increases, relative distances between points tend to 0, so distance stops being a meaningful measure of similarity.
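A standard way to make this precise (not stated explicitly in the original notes, and valid for many common data distributions) is the vanishing relative contrast between a query point's farthest and nearest neighbours as the dimension \(d\) grows:

$$
\lim_{d \to \infty} \frac{\text{dist}_\max - \text{dist}_\min}{\text{dist}_\min} = 0
$$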

Distance-based models are affected the most (see the sketch after this list):

- KNN
- K-Means
- Tree-based classification
- SVM?
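A quick way to see the effect is a small simulation (a minimal sketch, assuming uniform random points in the unit hypercube and Euclidean distance; pure NumPy, not from the original notes):

```python
# Minimal sketch: as dimensionality grows, the relative gap between the nearest
# and farthest neighbour of a query point shrinks toward 0.
import numpy as np

rng = np.random.default_rng(0)

for d in [2, 10, 100, 1000]:
    points = rng.random((500, d))                    # 500 points in the unit hypercube
    query = rng.random(d)                            # one random query point
    dists = np.linalg.norm(points - query, axis=1)   # Euclidean distances to all points
    contrast = (dists.max() - dists.min()) / dists.min()
    print(f"d = {d:5d}  relative contrast = {contrast:.3f}")
```

Typically the contrast is large in low dimensions and drops toward 0 as \(d\) grows, which is why nearest-neighbour and clustering methods degrade in high-dimensional spaces.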