hanoch kremer's profile - ShortScience.org


Efficient Convolutional Network Learning using Parametric Log based Dual-Tree Wavelet ScatterNet
Amarjot Singh and Nick Kingsbury
arXiv e-Print archive - 2017 via Local arXiv
Keywords: cs.LG, stat.ML

Summary by hanoch kremer 4 years ago

ScatterNets incorporate geometric knowledge of images to produce discriminative features that are invariant to translation and rotation, i.e. edge information. The first layers of a CNN end up learning much the same thing. So why not replace those first layers with an equivalent fixed structure and let the optimizer find the best weights for the CNN with its leading edge removed?
The main motivations for replacing the first convolutional, ReLU and pooling layers of the CNN with a two-layer parametric log-based Dual-Tree Complex Wavelet Transform (DTCWT) ScatterNet, an idea covered by a few papers, were:
Despite the success of CNNs, the design and optimal configuration of these networks are not well understood, which makes them difficult to develop.
A fixed front end improves training: the later layers can learn more complex patterns from the start of learning because the edge representations are already present.
The network converges faster because it has fewer filter weights to learn.
My takeaway: a slight reduction in the amount of data necessary for training!
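To make the "fixed front end" idea concrete, here is a minimal sketch of my own, not the authors' code. It assumes the parametric log has the form log(|response| + k) and uses two Sobel filters as hypothetical stand-ins for the DTCWT's oriented complex wavelets, which are not implemented here. Nothing in it is learned; a CNN would be trained on top of `features`.

```python
import math

# Two fixed 3x3 edge filters (Sobel). Stand-ins only: the real front end
# uses DTCWT complex wavelets at several orientations and scales.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def fixed_front_end(image, k=1.0):
    """Fixed edge features: filter, take the magnitude, then log(|.| + k).

    The log form and the parameter name k are assumptions for illustration.
    """
    gx = correlate_valid(image, SOBEL_X)
    gy = correlate_valid(image, SOBEL_Y)
    return [
        [math.log(math.hypot(a, b) + k) for a, b in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


# A 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 1, 1] for _ in range(4)]
features = fixed_front_end(image)  # 2x2 map; no trainable weights involved
```

With k = 1 a perfectly flat image maps to all zeros, so only edges produce a response, which is the representation the later layers get "for free".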

Results on CIFAR-10 and Caltech-101 with 14 self-made CNNs of increasing depth, plus VGG, NIN and WideResNet:
When doing transfer learning (ImageNet), DTSCNN outperformed all of its CNN counterparts by a "useful margin" when fine-tuning with only 1000 examples (balanced over classes). On larger datasets the gap shrinks until the two end up on par. However, when the first layers of VGG and NIN are frozen, as in DTSCNN, NIN is on par with DTSCNN, while VGG outperforms it!
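For anyone who wants to reproduce the small-data setting, a class-balanced subset can be drawn roughly like this. This is a sketch under my own assumptions: the function name `balanced_subset`, the seed and the toy label list are hypothetical, and the paper's exact sampling protocol may differ.

```python
import random
from collections import defaultdict


def balanced_subset(labels, total, seed=0):
    """Return sorted indices with total // n_classes examples per class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    per_class = total // len(by_class)
    rng = random.Random(seed)  # fixed seed so the subset is repeatable
    chosen = []
    for label in sorted(by_class):
        # rng.sample raises ValueError if a class has too few examples
        chosen.extend(rng.sample(by_class[label], per_class))
    return sorted(chosen)


# Toy stand-in for the CIFAR-10 training labels: 10 classes, 5000 each.
labels = [i % 10 for i in range(50000)]
subset = balanced_subset(labels, total=1000)  # 100 indices per class
```

The returned indices can then be used to slice the real training set before fine-tuning.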

DTSCNN converges at a faster rate but reaches the same target with only a minor speedup (a few minutes).

A complexity analysis in terms of weights and operations is missing.
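As a rough idea of what such an analysis would count, here is the standard arithmetic for one convolutional layer. The layer sizes below are hypothetical, picked only for illustration and not taken from the paper.

```python
def conv2d_cost(c_in, c_out, kernel, out_h, out_w, bias=True):
    """Weights and multiply-accumulates (MACs) of one standard conv layer."""
    weights = kernel * kernel * c_in * c_out + (c_out if bias else 0)
    macs = kernel * kernel * c_in * c_out * out_h * out_w
    return weights, macs


# Hypothetical first layer: 3x3 kernels, RGB input, 64 filters, 32x32 output.
weights, macs = conv2d_cost(c_in=3, c_out=64, kernel=3, out_h=32, out_w=32)
print(weights, macs)  # 1792 1769472
```

Note that a fixed front end removes the learned weights of the layers it replaces, but it still costs operations at inference time, so the two counts should be reported separately.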

Datasets: CIFAR-10 and Caltech-101 are a good starting point (a further step with a substantial dataset like COCO would be a plus). For other modalities/domains, please try and let me know.

Great work, but an ablation study is missing, such as comparing full training of WResNet+DTCWT vs. WResNet.

14 citations so far (Cambridge): probably low value for money at the moment.
https://i.imgur.com/GrzSviU.png
