01 Introduction & Linear Regression

🎲 Random

[LAV ELI & yellowheart. - Որտեղ ապրում է սիրտը](https://www.youtube.com/watch?v=u4h0biGtqZI)

Well, welcome to the ML part :^)

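📚 Material

- [📺 Introduction & Linear Regression](https://youtu.be/4BfPAkVOBdc), [🎞️ Slides](L01_intro_linear_regression.pdf), [📝 With notes](L01_intro_linear_regression_notes.pdf)
- [📺 Design matrix, normal equation, polynomial regression](https://youtu.be/VmpkybaOBto), [🎞️ Slides](L01b_linear_regression_derivations.pdf), [📝 With notes](L01b_linear_regression_derivations_notes.pdf)
- [📺 Data preprocessing: missing data, categorical encoding, scaling](https://youtu.be/BVe9GbH8tzE), [🎞️ Slides](L01c_data_preprocessing.pdf), [📝 With notes](L01c_data_preprocessing_notes.pdf)

Most of the slides are based on the open materials of [LMU Munich SLDS: Introduction to Machine Learning (I2ML)](https://slds-lmu.github.io/i2ml/).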

🏡 Homework

Two assignments. Assignment 1 is pure implementation (no scikit-learn); Assignment 2 is a practical end-to-end workflow.

Assignment 1 — Linear regression from scratch

Use the dataset [`data_lin_reg.csv`](data/data_lin_reg.csv) (two columns, `x` and `y`). Use only the libraries we have covered so far (numpy, pandas, matplotlib); scikit-learn is not allowed.

1. Load the data with pandas and make a scatter plot.
2. Implement gradient descent for the model $\hat{y} = \theta_0 + \theta_1 x$. Write down the L2 empirical risk $R(\theta) = \frac{1}{n}\sum_i (\hat{y}_i - y_i)^2$ and derive its gradient.
3. Tune the learning rate $\alpha$. Plot iterations vs. risk for a few values of $\alpha$ (one too small, one good, one that diverges).
4. Implement the normal equation using the design matrix $X$ (with the column of ones). Confirm that gradient descent and the closed-form solution give the same $\theta$.
5. Plot the data together with the fitted line and the predictions. Report the final formula.
6. Polynomial regression: extend the design matrix to degree $d$ (e.g. $\theta_0 + \theta_1 x + \theta_2 x^2 + \dots$), fit it with the normal equation, and plot the fit for a few degrees. What happens as $d$ grows?
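
As a rough orientation for the gradient-descent, normal-equation, and polynomial steps above, here is one way the pieces could fit together in NumPy. It runs on synthetic stand-in data (an assumption; swap in `data_lin_reg.csv`), and the helper names `design_matrix`, `risk`, `gradient`, `gradient_descent`, and `normal_equation` are illustrative, not prescribed by the assignment. It relies on the gradient $\nabla R(\theta) = \frac{2}{n} X^\top (X\theta - y)$, which follows from writing the risk above in matrix form.

```python
import numpy as np


def design_matrix(x, degree=1):
    """Columns [1, x, x^2, ..., x^degree]; degree=1 is the plain linear model."""
    return np.vander(x, N=degree + 1, increasing=True)


def risk(X, y, theta):
    """L2 empirical risk: mean squared residual."""
    return np.mean((X @ theta - y) ** 2)


def gradient(X, y, theta):
    """Gradient of the L2 risk: (2/n) * X^T (X theta - y)."""
    return 2.0 / len(y) * X.T @ (X @ theta - y)


def gradient_descent(X, y, alpha=0.1, n_iter=2000):
    """Plain gradient descent; the loop is over iterations, not samples."""
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(n_iter):
        history.append(risk(X, y, theta))
        theta = theta - alpha * gradient(X, y, theta)
    return theta, np.array(history)


def normal_equation(X, y):
    """Solve (X^T X) theta = X^T y without forming an explicit inverse."""
    return np.linalg.solve(X.T @ X, X.T @ y)


# Synthetic stand-in data (assumption): y = 2 + 3x + noise
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = 2.0 + 3.0 * x + rng.normal(scale=0.1, size=200)

X = design_matrix(x, degree=1)
theta_gd, history = gradient_descent(X, y, alpha=0.1, n_iter=2000)
theta_ne = normal_equation(X, y)
print("gradient descent:", theta_gd)
print("normal equation: ", theta_ne)

# The same helpers cover polynomial regression by raising the degree
theta_poly = normal_equation(design_matrix(x, degree=3), y)
print("degree-3 fit:    ", theta_poly)
```

Plotting `history` against the iteration index for several values of `alpha` gives the iterations-vs-risk curves. The values `alpha=0.1` and `n_iter=2000` suit this synthetic data only and will likely need retuning on the real file.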

Additional

- Plot a histogram of the residuals.
- Plot the risk surface (or a $\theta_0$–$\theta_1$ contour) and overlay the gradient-descent trajectory on it.
- Vectorize everything: no Python loops over individual samples.
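
For the risk surface, one possible approach is to evaluate the risk on a grid of $(\theta_0, \theta_1)$ values with broadcasting and to record the gradient-descent iterates along the way. The sketch below assumes synthetic data, an arbitrary grid range, and an arbitrary starting point; all variable names are illustrative.

```python
import numpy as np

# Synthetic stand-in data (assumption): y = 1 + 2x + noise
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=100)

# Grid of candidate parameters (ranges chosen by hand for this toy data)
theta0 = np.linspace(-1, 3, 81)
theta1 = np.linspace(0, 4, 81)
T0, T1 = np.meshgrid(theta0, theta1)  # both of shape (81, 81)

# Broadcasting: predictions for every grid point and sample, shape (81, 81, n)
preds = T0[..., None] + T1[..., None] * x
risk_surface = np.mean((preds - y) ** 2, axis=-1)  # shape (81, 81)

# Gradient descent, storing every iterate for the overlay
X = np.column_stack([np.ones_like(x), x])
theta = np.array([-0.5, 3.5])  # arbitrary starting point
trajectory = [theta.copy()]
for _ in range(200):  # loop over iterations, not samples
    grad = 2.0 / len(y) * X.T @ (X @ theta - y)
    theta = theta - 0.1 * grad
    trajectory.append(theta.copy())
trajectory = np.array(trajectory)  # shape (201, 2)

# Grid point with the smallest risk, for comparison with the last iterate
i, j = np.unravel_index(np.argmin(risk_surface), risk_surface.shape)
grid_min = np.array([T0[i, j], T1[i, j]])
print("grid minimum:", grid_min, "last iterate:", trajectory[-1])
```

With matplotlib, `plt.contour(T0, T1, risk_surface)` followed by `plt.plot(trajectory[:, 0], trajectory[:, 1])` would draw the contour and the path on the same axes.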

Assignment 2 — Practical regression with scikit-learn

Use the dataset [`House_Rent_Dataset.csv`](data/House_Rent_Dataset.csv). Build a full pipeline and interpret the result; don't just call `.fit()`.

1. EDA: load the data, `describe()` it, visualize the target and the feature distributions, and note any problems.
2. Missing data: detect it and decide how to handle it (drop vs. impute), and justify your choice.
3. Encode categorical features: choose One-Hot / Ordinal / Target encoding per feature and justify each. (Remember the trap: don't label-encode an unordered category as if it were ordered.)
4. Scale the features (StandardScaler or MinMaxScaler) and explain when scaling matters for linear regression.
5. Interpret the coefficients: sort them by absolute value. Which features matter most? Does it make sense?

Additional

- Add polynomial features and watch the model start to fit the noise in the training data and generalize worse to held-out data.
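
One way to observe this effect is to compare training and held-out error as the polynomial degree grows. The sketch below uses synthetic one-dimensional data and arbitrarily chosen degrees (both assumptions); on the rent data the same pattern would apply to the preprocessed features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumption): a noisy sine curve
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(60, 1))
y = np.sin(3 * X[:, 0]) + rng.normal(scale=0.3, size=60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

results = {}
for degree in (1, 3, 15):
    # Expand the features to the given degree, then fit ordinary least squares
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    results[degree] = (
        mean_squared_error(y_train, model.predict(X_train)),
        mean_squared_error(y_test, model.predict(X_test)),
    )
    print(f"degree={degree}: train MSE={results[degree][0]:.3f}, "
          f"test MSE={results[degree][1]:.3f}")
```

The training error cannot increase as the degree grows, since each model contains the lower-degree ones. The held-out error is the number to watch for signs of overfitting.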

Helpful links to get started with scikit-learn:

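- [Getting Started with scikit-learn](https://scikit-learn.org/stable/getting_started.html)
- [User Guide: Preprocessing data](https://scikit-learn.org/stable/modules/preprocessing.html) (encoders, scalers, polynomial features)
- [`LinearRegression` API reference](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html)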
