06 Dimensionality Reduction
🎲 Random
TBD
📚 Material
📝 Quiz on this topic (Google Form): TBD
References:
- Turk & Pentland, Eigenfaces for Recognition (1991): the paper that introduced the eigenface approach to face recognition with PCA.
- scikit-learn example: Faces dataset decompositions.
🏡 Homework
Project — Eigenfaces: a face in a handful of numbers 🧀🧀
A 64×64 face is 4096 numbers, but faces live near a much smaller subspace. PCA finds that subspace, and its components — drawn back as images — look like ghostly faces (“eigenfaces”). In this project you’ll compress faces, visualize them in 2-D, and finally recognize them, all with the tools from the lecture.
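To make the "smaller subspace" idea concrete before touching real faces, here is a minimal NumPy sketch on synthetic data. The 20-dimensional latent structure, the noise level, and all variable names are assumptions chosen for illustration; real faces are only approximately low-dimensional.

```python
import numpy as np

rng = np.random.default_rng(509)

# Synthetic stand-in for faces: 400 samples in 4096-D that lie close to a
# 20-D subspace, plus a little per-pixel noise (all values are made up).
n_samples, n_pixels, true_dim = 400, 4096, 20
latent = rng.normal(size=(n_samples, true_dim))
basis = rng.normal(size=(true_dim, n_pixels))
X = latent @ basis + 0.01 * rng.normal(size=(n_samples, n_pixels))

# PCA via SVD of the centered data matrix; rows of Vt are the components.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)

# Fraction of total variance captured by each component, and its running sum.
explained = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained)

# Compress each sample to k numbers, then map back to pixel space.
k = 20
codes = (X - mean) @ Vt[:k].T      # shape (400, 20)
X_hat = codes @ Vt[:k] + mean      # shape (400, 4096)
rel_error = np.linalg.norm(X - X_hat) / np.linalg.norm(X - mean)
```

Here `codes` plays the role of the "handful of numbers" per face, and `rel_error` measures how much is lost by keeping only `k` of them.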
Setup. Data: sklearn.datasets.fetch_olivetti_faces (40 people × 10 images, 64×64, values in [0, 1]). One reproducible notebook, seed 509, ending with a 3–5 sentence conclusion. Save results to an out/ subfolder.
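One possible way to organize the setup is sketched below. The helper names `load_faces` and `flatten_images` and the constants `SEED` and `OUT_DIR` are hypothetical; only `fetch_olivetti_faces` and its `shuffle` / `random_state` arguments come from scikit-learn. The dataset is downloaded on first use, so the loader is defined here but not called.

```python
from pathlib import Path

import numpy as np

SEED = 509                # seed required by the assignment
OUT_DIR = Path("out")     # results subfolder required by the assignment
OUT_DIR.mkdir(exist_ok=True)


def load_faces(seed=SEED):
    """Return (X, y, images): flattened faces, person ids, 64x64 images."""
    # Imported lazily so the rest of the module works without scikit-learn.
    from sklearn.datasets import fetch_olivetti_faces

    faces = fetch_olivetti_faces(shuffle=True, random_state=seed)
    return faces.data, faces.target, faces.images


def flatten_images(images):
    """Flatten an (n, height, width) image stack to an (n, height*width) matrix."""
    images = np.asarray(images)
    return images.reshape(len(images), -1)
```

Calling `load_faces()` in the notebook should give `X` with shape (400, 4096); `flatten_images` is only needed if you start from the 64×64 `images` array instead.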
Tasks.
- Load the faces, show a few, and flatten to (400, 4096).
- Fit PCA; display the mean face and the top eigenfaces (components reshaped to 64×64).
- Plot the scree and cumulative explained variance; pick k for ~95%.
- Reconstruct a face at several k (e.g. 10 / 50 / 150) and plot reconstruction error vs k. How few numbers still look like the person?
- Visualize face space: embed in 2-D three ways (PCA, t-SNE, and UMAP), colored by person (use ~10 people so it stays readable). Which one separates identities best?
- Recognizer: make a stratified train/test split, fit PCA on the training faces, and classify test faces by nearest neighbor in PCA space. Plot accuracy vs number of components, and compare to nearest neighbor on the raw pixels.
- (Callback) Run k-means in PCA space and compare the clusters to the true identities with the adjusted Rand index.
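The numerical parts of these tasks (choosing k, reconstruction error, the nearest-neighbor recognizer, and the k-means callback) could be sketched roughly as follows. This is one possible structure, not a reference solution: the function names, the 30% test split, and the synthetic demo data are assumptions, and plotting and the 2-D embeddings are left out.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

SEED = 509


def components_for_variance(X, target=0.95):
    """Smallest k whose cumulative explained variance ratio reaches `target`."""
    pca = PCA(random_state=SEED).fit(X)
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    return int(np.searchsorted(cumulative, target) + 1)


def reconstruction_error(X, k):
    """Mean squared error after projecting onto k components and back."""
    pca = PCA(n_components=k, random_state=SEED).fit(X)
    X_hat = pca.inverse_transform(pca.transform(X))
    return float(np.mean((X - X_hat) ** 2))


def nn_accuracy(X, y, k=None, test_size=0.3):
    """1-NN test accuracy in k-dim PCA space (raw features if k is None)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=SEED
    )
    if k is not None:
        # Fit PCA on the training split only, to avoid leaking test data.
        pca = PCA(n_components=k, random_state=SEED).fit(X_tr)
        X_tr, X_te = pca.transform(X_tr), pca.transform(X_te)
    clf = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
    return clf.score(X_te, y_te)


def kmeans_ari(X, y, k, n_clusters):
    """Adjusted Rand index between k-means clusters in PCA space and y."""
    Z = PCA(n_components=k, random_state=SEED).fit_transform(X)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=SEED)
    return adjusted_rand_score(y, km.fit_predict(Z))


# Tiny synthetic demo (8 made-up "people", 10 samples each, 256 features),
# standing in for the real faces so the sketch runs without a download.
rng = np.random.default_rng(SEED)
centers = rng.normal(scale=5.0, size=(8, 256))
y_demo = np.repeat(np.arange(8), 10)
X_demo = centers[y_demo] + rng.normal(size=(80, 256))

k95 = components_for_variance(X_demo)
err_small, err_large = reconstruction_error(X_demo, 2), reconstruction_error(X_demo, 20)
acc_raw = nn_accuracy(X_demo, y_demo)
acc_pca = nn_accuracy(X_demo, y_demo, k=7)
ari = kmeans_ari(X_demo, y_demo, k=7, n_clusters=8)
```

On the Olivetti data you would pass the flattened faces and person labels instead of `X_demo` and `y_demo`, loop `nn_accuracy` over a range of `k` for the accuracy plot, and use `n_clusters=40` for the k-means comparison.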
Bonus: whiten the PCA (whiten=True); try the larger LFW faces; add a few of your own photos; vary the train/test split.
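For the whitening bonus, the short sketch below shows what whiten=True changes: each projected component is rescaled to unit variance, so no single direction dominates distance computations. The synthetic data and variable names are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(509)

# Synthetic data whose directions have very unequal variance (made-up values).
X = rng.normal(size=(200, 30)) * np.linspace(10.0, 0.1, 30)

Z_plain = PCA(n_components=5).fit_transform(X)
Z_white = PCA(n_components=5, whiten=True).fit_transform(X)

# Without whitening the leading components keep their large variance;
# with whitening every component has (sample) variance 1.
var_plain = Z_plain.var(axis=0, ddof=1)
var_white = Z_white.var(axis=0, ddof=1)
```

Whether this helps the nearest-neighbor recognizer is an empirical question; comparing accuracy with and without whitening on the same split is one way to check.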