### 2020-09-21, 09:15–10:45, Room 2

The question that haunts every aspiring data scientist is: where should I begin? Is traditional machine learning still relevant for solving business problems? In this tutorial we will address these questions and take a deep dive into applying some of the most widely used traditional machine learning algorithms to real-life use cases using scikit-learn, mlxtend and pandas.

*“A baby learns to crawl, walk and then run. We are in the crawling stage when it comes to applying machine learning.”*

With the advent of deep learning algorithms a decade ago, the field of data science and machine learning has witnessed renewed zeal and enthusiasm. Today, every firm is eager to hire a data scientist who can derive value from its data, but the key question remains: where should I begin? Various industry leaders are deploying deep learning models; should I do the same? Is traditional machine learning still relevant for solving my business problem?

In this tutorial we will address these questions and take **a deep dive into applying some of the most widely used traditional machine learning algorithms to real-life use cases**. We will use the open source libraries scikit-learn, pandas and mlxtend for this purpose.

The key steps we will employ to tackle each problem are:

1. Understanding the algorithm

2. Importing the data

3. Data wrangling using pandas

4. Machine learning model development using scikit-learn/mlxtend

5. Model performance evaluation
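As a rough illustration of steps 2 to 5, here is a minimal sketch of the workflow for a regression problem. It is not taken from the tutorial notebooks: the synthetic housing table, its column names (`area_sqm`, `age_years`, `price`) and the coefficients used to generate it are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Step 2: "import" the data. A synthetic table stands in for a real file
# (hypothetical columns); in practice this would be e.g. pd.read_csv(...).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "area_sqm": rng.uniform(40, 200, n),
    "age_years": rng.uniform(0, 50, n),
})
df["price"] = (3000 * df["area_sqm"] - 500 * df["age_years"]
               + rng.normal(0, 10_000, n))
df.loc[df.index[:5], "age_years"] = np.nan  # simulate a few missing values

# Step 3: data wrangling with pandas (here: drop incomplete rows).
clean = df.dropna().reset_index(drop=True)

# Step 4: model development with scikit-learn (ordinary least squares).
X = clean[["area_sqm", "age_years"]]
y = clean["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LinearRegression().fit(X_train, y_train)

# Step 5: performance evaluation on the held-out rows.
pred = model.predict(X_test)
r2 = r2_score(y_test, pred)
mae = mean_absolute_error(y_test, pred)
print(f"rows after cleaning: {len(clean)}")
print(f"R^2 = {r2:.3f}, MAE = {mae:.0f}")
```

Holding out a test split before fitting keeps the evaluation in step 5 honest: the scores are computed on rows the model has not seen.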

Each exercise will use a Jupyter notebook-based learning environment.

The workshop session (90 mins) will be divided as follows:

1. Introduction to Machine Learning - 5 mins

2. Why traditional machine learning is still relevant! - 5 mins

3. Exercise #1: Real Estate Valuation using Regression Algorithm (OLS) - 15 mins

4. Exercise #2: Market Basket Analysis using Association Rule Learning Algorithm (Apriori) - 15 mins

5. Exercise #3: Credit Risk Analysis using Instance-based Algorithm (kNN) - 15 mins

6. Exercise #4: Macroeconomic Analysis of Countries using Clustering Algorithm (k-Means) - 15 mins

7. Exercise #5: Credit Risk Analysis using Decision Tree Algorithm (CART) - 15 mins

8. Closing Remarks and Q&A - 5 mins
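Exercises #3 and #5 approach the same kind of problem with two different classifiers. The sketch below shows how such a comparison might look in scikit-learn. It is only an illustration: the data come from `make_classification` rather than a real credit dataset, and the hyperparameters (`n_neighbors=5`, `max_depth=4`) are arbitrary assumptions. scikit-learn's `DecisionTreeClassifier` implements an optimized version of CART.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for a credit risk table.
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    # kNN is distance-based, so features are standardized first.
    "kNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    # Trees split on thresholds and do not need feature scaling.
    "CART": DecisionTreeClassifier(max_depth=4, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # accuracy on held-out data
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Both estimators share the same `fit`/`score` interface, which is what makes swapping algorithms on one problem cheap.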

**Video introduction for tutorial** - Link

**Tutorial Slides** - Link

**Prerequisites**

- Technical: Basic Python Programming
- Software: Python 3.6+

Please install the following Python packages:

`pip install scikit-learn pandas mlxtend matplotlib`
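To confirm that the packages are available before the session, a short check like the following may help. It is a sketch rather than part of the tutorial material, and it assumes Python 3.8 or newer, since `importlib.metadata` is not in the standard library of earlier versions.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as they are passed to pip.
required = ["scikit-learn", "pandas", "mlxtend", "matplotlib"]

status = {}
for name in required:
    try:
        status[name] = version(name)  # installed version string
    except PackageNotFoundError:
        status[name] = None  # package is not installed
    print(f"{name}: {status[name] or 'NOT INSTALLED'}")
```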