- Online
- $995
- Requires a prerequisite course
This course is part of the UBC Certificate in Key Capabilities in Data Science.
This introductory course on machine learning for prediction focuses on regression and classification models. Understand how to map data to the correct model type, evaluate and select models, and communicate and interpret model results to help organizations reduce operating costs, optimize market strategies and identify trends.
By the end of the course, you’ll be able to:
- describe supervised learning and identify what kind of tasks it is suitable for
- explain common machine learning concepts such as classification and regression, training and testing, overfitting, parameters and hyperparameters, and the golden rule
- choose a correct predictive modelling technique (e.g., regression or classification) given the available data
- identify when and why to apply data pre-processing techniques such as scaling and one-hot encoding
- describe at a high level how common machine learning algorithms work, including decision trees and k-nearest neighbours
- use Python and the scikit-learn package to develop an end-to-end supervised machine learning pipeline.
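To make a few of the concepts above concrete, the following is a minimal, illustrative sketch and not course material. It uses only the Python standard library rather than scikit-learn, so the moving parts stay visible. It shows scaling learned from the training data only, a k-nearest neighbours vote and a check against a small held-out test set. The helper names (fit_scaler, scale, knn_predict) and the toy data are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import dist
from statistics import mean, pstdev


def fit_scaler(rows):
    """Learn each column's mean and standard deviation from the training rows only."""
    columns = list(zip(*rows))
    # Fall back to 1.0 if a column is constant, to avoid dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def scale(rows, stats):
    """Standardize each column using previously learned statistics."""
    return [tuple((x - m) / s for x, (m, s) in zip(row, stats)) for row in rows]


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label by majority vote among the k closest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (age in years, income in dollars) -> plan type.
train_X = [(25, 30_000), (30, 32_000), (35, 31_000),
           (28, 90_000), (40, 95_000), (45, 88_000)]
train_y = ["basic", "basic", "basic", "premium", "premium", "premium"]
test_X = [(33, 29_000), (38, 92_000)]
test_y = ["basic", "premium"]

# Scaling statistics come from the training set and are reused for the test set.
stats = fit_scaler(train_X)
train_scaled = scale(train_X, stats)
test_scaled = scale(test_X, stats)

predictions = [knn_predict(train_scaled, train_y, q, k=3) for q in test_scaled]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, accuracy)
```

Without scaling, the income column (tens of thousands) would dominate the distance calculation and age would barely matter, which is one reason pre-processing is applied before similarity-based models. In scikit-learn the same steps are usually expressed with a scaler, an estimator and a pipeline object instead of hand-written helpers.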
Course outline
Week 1 and Week 2
- Module 1: Machine Learning Technology
- Module 2: Decision Trees
- Module 3: Splitting, Cross-Validation and the Fundamental Tradeoff
Week 3 and Week 4
- Module 4: Similarity-Based Approaches to Supervised Learning
- Module 5: Preprocessing Numerical Features, Pipelines and Hyperparameter Optimization
Week 5 and Week 6
- Module 6: Preprocessing Categorical Variables and Sklearn’s ColumnTransformer
- Module 7: Assessment and Measurements
- Module 8: Linear Models
Week 7
- Final Project
How am I assessed?
Each course module includes an auto-graded assignment. In weeks 4 and 7, you take an online 45-minute open-book quiz that covers material from modules 1–4 and 5–8, respectively. At the end of Week 7, you complete a final project using the skills you learned in the course. To pass the course, you must obtain an overall grade of 70% or higher and complete the final project.
Expected effort
Expect to spend 8–12 hours per week completing the weekly modules, auto-graded assignments, open-book quizzes and the final project.
Technology requirements
To take this course with the best experience, we recommend you have access to:
- an email account
- a computer, laptop or tablet
- the latest version of a web browser (or previous major version release)
- a reliable internet connection.
For virtual office hours, you’ll also need:
- a video camera and microphone.
One day before the start of your course, we’ll email you step-by-step instructions for accessing your course.
Requisites
The prerequisite course is Programming in Python for Data Science.
You must complete the prerequisite course before starting Introduction to Machine Learning.
Course format
This course is 100% online and facilitator-supported, with weekly facilitator office hours. You complete course work independently and at your own pace, within deadlines set by your facilitator. Log in to your course anytime to access the modules.
Course virtual office hours (subject to change)
- Mondays, 6:30–7:30 pm Pacific Time
- Wednesdays, 6:30–7:30 pm Pacific Time
Join your facilitator and classmates by video conferencing to discuss course materials and assignments, receive feedback and ask questions.