Course Outline

1. Understanding classification using nearest neighbors 

  • The kNN algorithm 
  • Calculating distance 
  • Choosing an appropriate k 
  • Preparing data for use with kNN 
  • Why is the kNN algorithm lazy?

2. Understanding naive Bayes 

  • Basic concepts of Bayesian methods 
  • Probability 
  • Joint probability
  • Conditional probability with Bayes' theorem 
  • The naive Bayes algorithm 
  • Classification with naive Bayes 
  • The Laplace estimator
  • Using numeric features with naive Bayes

3. Understanding decision trees 

  • Divide and conquer 
  • The C5.0 decision tree algorithm 
  • Choosing the best split 
  • Pruning the decision tree

4. Understanding classification rules 

  • Separate and conquer 
  • The One Rule algorithm 
  • The RIPPER algorithm 
  • Rules from decision trees

5. Understanding regression 

  • Simple linear regression 
  • Ordinary least squares estimation 
  • Correlations 
  • Multiple linear regression

6. Understanding regression trees and model trees 

  • Adding regression to trees

7. Understanding neural networks 

  • From biological to artificial neurons 
  • Activation functions 
  • Network topology 
  • The number of layers 
  • The direction of information travel 
  • The number of nodes in each layer 
  • Training neural networks with backpropagation

8. Understanding support vector machines 

  • Classification with hyperplanes 
  • Finding the maximum margin 
  • The case of linearly separable data 
  • The case of non-linearly separable data 
  • Using kernels for non-linear spaces

9. Understanding association rules 

  • The Apriori algorithm for association rule learning 
  • Measuring rule interest – support and confidence 
  • Building a set of rules with the Apriori principle

10. Understanding clustering

  • Clustering as a machine learning task
  • The k-means algorithm for clustering 
  • Using distance to assign and update clusters 
  • Choosing the appropriate number of clusters

11. Measuring performance for classification 

  • Working with classification prediction data 
  • A closer look at confusion matrices 
  • Using confusion matrices to measure performance 
  • Beyond accuracy – other measures of performance 
  • The kappa statistic 
  • Sensitivity and specificity 
  • Precision and recall 
  • The F-measure 
  • Visualizing performance tradeoffs 
  • ROC curves 
  • Estimating future performance 
  • The holdout method 
  • Cross-validation 
  • Bootstrap sampling

12. Tuning stock models for better performance 

  • Using caret for automated parameter tuning 
  • Creating a simple tuned model 
  • Customizing the tuning process 
  • Improving model performance with meta-learning 
  • Understanding ensembles 
  • Bagging 
  • Boosting 
  • Random forests 
  • Training random forests
  • Evaluating random forest performance

13. Deep learning

  • Three classes of deep learning
  • Deep autoencoders
  • Pre-trained deep neural networks
  • Deep stacking networks

14. Discussion of specific application areas

Duration: 21 hours
 

Dates are subject to availability and take place between 09:30 and 16:30.
Open Training Courses require 5+ participants.
