_Objective:_ Analyze a large-scale dataset of fashion images to discover visually consistent style clusters.

* _Dataset:_ StreetStyle-27K.
* _Code:_ demo [here](http://streetstyle.cs.cornell.edu/)

## New dataset: StreetStyle-27K

1. **Photos (100 million)**: collected from Instagram using the [API](https://www.instagram.com/developer/) to retrieve images with the desired location and time.
2. **People (14.5 million)**: two algorithms are run to normalize the body position in each image:
   * [Face++](http://www.faceplusplus.com/) to detect and localize faces.
   * [Deformable Part Model](http://people.cs.uchicago.edu/%7Erbg/latent-release5/) to estimate the visibility of the rest of the body.
3. **Clothing annotations (27K)**: labeled on Amazon Mechanical Turk with quality control, for about $4,000 for the whole dataset.

## Architecture:

A standard GoogLeNet classifier, with [isotonic regression](http://fastml.com/classifier-calibration-with-platts-scaling-and-isotonic-regression/) applied afterwards to correct the bias in the predicted probabilities.

## Unsupervised clustering:

They proceed as follows:

1. Compute feature embeddings for a subset of the overall dataset, selected to be representative across location and time.
2. Apply L2 normalization to the embeddings.
3. Use PCA to keep the components explaining 90% of the variance (165 components here).
4. Cluster the reduced vectors using a [GMM](https://en.wikipedia.org/wiki/Mixture_model#Multivariate_Gaussian_mixture_model) with 400 mixture components, each component representing one style cluster.

They compute fashion clusters per city or for larger regions:

[![screen shot 2017-06-15 at 12 04 06 pm](https://user-images.githubusercontent.com/17261080/27176447-d33fc2dc-51c2-11e7-9191-dbf972ee96a1.png)](https://user-images.githubusercontent.com/17261080/27176447-d33fc2dc-51c2-11e7-9191-dbf972ee96a1.png)

## Results:

Pretty standard techniques, but patched together to produce interesting visualizations.
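The isotonic-regression calibration step mentioned under "Architecture" can be sketched as below. This is a minimal illustration with scikit-learn, not the paper's code: the scores and labels are synthetic stand-ins for a held-out set of classifier outputs.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Hypothetical held-out data: raw classifier scores and true binary labels
# (synthetic stand-ins; the paper calibrates GoogLeNet attribute outputs).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=500)
# Simulate scores that correlate with the label but are miscalibrated.
scores = np.clip(0.3 * labels + 0.4 * rng.random(500), 0.0, 1.0)

# Fit a monotone (non-decreasing) mapping from raw scores to probabilities.
calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(scores, labels)

# Calibrated probabilities stay in [0, 1] and preserve the score ordering.
calibrated = calibrator.predict(scores)
```

In practice the calibrator is fit per attribute on a validation split, then applied to scores on the full photo collection.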
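The four clustering steps above can be sketched with scikit-learn as follows. The embeddings here are random stand-ins for the CNN features, and the toy example uses fewer mixture components than the paper's 400 to keep it fast; everything else follows the described pipeline.

```python
import numpy as np
from sklearn.preprocessing import normalize
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# Hypothetical stand-in for the CNN feature embeddings (the paper uses
# GoogLeNet features; sizes here are made up for illustration).
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 100))

# Step 2: L2-normalize each embedding vector.
embeddings = normalize(embeddings, norm="l2")

# Step 3: PCA keeping enough components to explain 90% of the variance
# (the paper reports 165 components at this threshold on its data).
pca = PCA(n_components=0.90)
reduced = pca.fit_transform(embeddings)

# Step 4: fit a Gaussian mixture model; each component is one style
# cluster. The paper uses 400 components; 20 here keeps the toy run fast.
gmm = GaussianMixture(n_components=20, covariance_type="diag",
                      random_state=0)
cluster_ids = gmm.fit_predict(reduced)
```

Each photo is then assigned to the mixture component with the highest posterior probability, and per-cluster frequencies can be aggregated by city or region to produce the visualizations.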