[link](https://arxiv.org/abs/1703.10593)
_Objective:_ Image-to-image translation that transfers visual attributes between domains using unpaired images.

_Dataset:_ [Cityscapes](https://www.cityscapes-dataset.com/), [CMP Facade](http://cmp.felk.cvut.cz/%7Etylecr1/facade/), [UT Zappos50k](http://vision.cs.utexas.edu/projects/finegrained/utzap50k/) and [ImageNet](http://www.image-net.org/).

_Code:_ [CycleGAN](https://github.com/junyanz/CycleGAN)

## Inner-workings:

Essentially two GANs, one per domain, each with its own generator and discriminator, plus two additional losses (the cycle-consistency losses) that ensure translating an image to the other domain and then back yields an image close to the original.

![](https://cloud.githubusercontent.com/assets/17261080/26717449/bcd8a9cc-477d-11e7-9137-fd277a0ec04f.png)

For the consistency loss they use a pixel-wise L1 norm:

![](https://cloud.githubusercontent.com/assets/17261080/26717733/bc088cdc-477e-11e7-96af-2defa06a1660.png)

## Architecture:

Based on [Perceptual losses for real-time style transfer and super-resolution](https://arxiv.org/pdf/1603.08155.pdf), code available [here](https://github.com/jcjohnson/fast-neural-style). Training employs several tricks (a least-squares GAN loss and a history buffer of generated images) and uses a batch size of 1.

## Results:

Very impressive, and the key point is that no paired images are needed, which makes this trainable on any pair of domains that share an underlying representation.

![](https://cloud.githubusercontent.com/assets/17261080/26717502/f6d1fb7e-477d-11e7-8174-7bdd621cf1b6.png)
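To make the objective in Inner-workings concrete, here is a minimal sketch of the combined generator loss in PyTorch (the official code is in Lua/Torch; the helper `generator_loss` and the names `G`, `F`, `D_X`, `D_Y` are illustrative assumptions, though the least-squares adversarial term and the cycle weight λ = 10 follow the paper):

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()    # pixel-wise L1 for cycle consistency
mse = nn.MSELoss()  # least-squares GAN objective

def generator_loss(G, F, D_X, D_Y, real_x, real_y, lam=10.0):
    # Hypothetical helper, not from the official repo.
    fake_y = G(real_x)  # translate X -> Y
    fake_x = F(real_y)  # translate Y -> X

    # Adversarial terms: each generator tries to make the opposite
    # discriminator classify its fakes as real (target = 1).
    pred_y = D_Y(fake_y)
    pred_x = D_X(fake_x)
    adv = mse(pred_y, torch.ones_like(pred_y)) + mse(pred_x, torch.ones_like(pred_x))

    # Cycle-consistency terms: translating to the other domain and back
    # should reproduce the input, measured with a pixel-wise L1 norm.
    cyc = l1(F(fake_y), real_x) + l1(G(fake_x), real_y)

    return adv + lam * cyc
```

The discriminators are trained separately with the symmetric least-squares loss on real images versus (buffered) generated ones.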
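The generators themselves reuse the residual architecture from the Johnson et al. paper linked above. A sketch of one residual block, assuming PyTorch and the paper's instance normalization (layer choices mirror the paper's description, but this is not the authors' code):

```python
import torch.nn as nn

class ResnetBlock(nn.Module):
    """One generator residual block: two 3x3 convs with instance
    normalization and a skip connection, using reflection padding."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3),
            nn.InstanceNorm2d(dim),
            nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1),
            nn.Conv2d(dim, dim, kernel_size=3),
            nn.InstanceNorm2d(dim),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection
```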