Deep convolutional neural networks have led to breakthrough results in practical feature extraction applications. The mathematical analysis of such networks was initiated by Mallat, 2012. Specifically, Mallat considered so-called scattering networks based on semi-discrete shift-invariant wavelet frames and modulus non-linearities in each network layer, and proved translation invariance (asymptotically in the wavelet scale parameter) and deformation stability of the corresponding feature extractor. The purpose of this paper is to develop Mallat's theory further by allowing for general convolution kernels, or in more technical parlance, general semi-discrete shift-invariant frames (including Weyl-Heisenberg, curvelet, shearlet, ridgelet, and wavelet frames) and general Lipschitz-continuous non-linearities (e.g., rectified linear units, shifted logistic sigmoids, hyperbolic tangents, and modulus functions), as well as pooling through sub-sampling, all of which can be different in different network layers. The resulting generalized network enables extraction of significantly wider classes of features than those resolved by Mallat's wavelet-modulus scattering network. We prove deformation stability for a larger class of deformations than those considered by Mallat, and we establish a new translation invariance result which is of vertical nature in the sense of the network depth determining the amount of invariance. Moreover, our results establish that deformation stability and vertical translation invariance are guaranteed by the network structure per se rather than the specific convolution kernels and non-linearities. This offers an explanation for the tremendous success of deep convolutional neural networks in a wide variety of practical feature extraction applications. The mathematical techniques we employ are based on continuous frame theory.
from cs.AI updates on arXiv.org http://ift.tt/1Jq815O
via IFTTT
No comments:
Post a Comment