We propose a novel approach to the problem of neural network expressivity, which seeks to characterize how structural properties of a neural network family affect the functions it is able to compute. Understanding expressivity is a classical question in the study of neural networks, but it has remained challenging at both a conceptual and a practical level. Our approach is based on an interrelated set of expressivity measures, unified by the new notion of trajectory length, which measures how the output of a network changes as the input sweeps along a one-dimensional path. We show how our framework provides insight both into randomly initialized networks (the starting point for most standard optimization methods) and into trained networks. Our findings can be summarized as follows:
(1) The complexity of the computed function grows exponentially with depth. We design measures of expressivity that capture the non-linearity of the computed function. These measures grow exponentially with the depth of the network architecture, driven by how each successive layer transforms the trajectory traced out by the input (a numerical sketch of this measurement follows the summary).
(2) All weights are not equal (initial layers matter more). We find that trained networks are far more sensitive to their lower (initial) layer weights: they are much less robust to noise in these weights, and they perform better when these weights are well optimized (a sketch of this layer-wise perturbation experiment also follows the summary).
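To make the trajectory length measure concrete, here is a minimal sketch in NumPy: a circle in input space is pushed through a randomly initialized ReLU network, and the Euclidean arc length of its image is recorded after every layer. The architecture, weight scale, and the choice of a circular input path are illustrative assumptions rather than details taken from the abstract; under sufficiently large weight scales the recorded lengths grow rapidly with depth, in line with finding (1).

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def trajectory_length(points):
    """Sum of Euclidean distances between consecutive points on a path."""
    return np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1))

def forward_trajectory_lengths(path, depth, width, weight_scale, rng):
    """Push a 1-D input trajectory through a random ReLU network and
    record the arc length of its image after each layer."""
    h = path
    lengths = []
    for _ in range(depth):
        W = rng.normal(0.0, weight_scale / np.sqrt(h.shape[1]), size=(h.shape[1], width))
        b = rng.normal(0.0, 1.0, size=width)
        h = relu(h @ W + b)
        lengths.append(trajectory_length(h))
    return lengths

rng = np.random.default_rng(0)

# Input trajectory: a unit circle embedded in a `width`-dimensional input space.
width, n_points = 100, 1000
theta = np.linspace(0.0, 2.0 * np.pi, n_points)
circle = np.zeros((n_points, width))
circle[:, 0], circle[:, 1] = np.cos(theta), np.sin(theta)

for depth in [2, 4, 8]:
    lengths = forward_trajectory_lengths(circle, depth, width, weight_scale=2.0, rng=rng)
    print(f"depth {depth}: per-layer trajectory lengths {np.round(lengths, 1)}")
```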
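Finding (2) can be probed with a layer-wise perturbation experiment. The sketch below trains a small NumPy MLP on a synthetic XOR-like task and then adds Gaussian noise to one layer's weights at a time, comparing the resulting accuracy drops; the data, architecture, and training procedure are illustrative assumptions, not the paper's setup. The abstract's finding predicts that noise in the lowest (initial) layer degrades performance the most.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic binary classification data (illustrative only).
n, d = 2000, 2
X = rng.normal(size=(n, d))
y = (X[:, 0] * X[:, 1] > 0).astype(float)  # XOR-like labels

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Three-layer MLP; weights[0] is the lowest (initial) layer.
sizes = [d, 32, 32, 1]
weights = [rng.normal(0, np.sqrt(2.0 / m), size=(m, k)) for m, k in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(k) for k in sizes[1:]]

def forward(X, weights, biases):
    h = X
    for W, b in zip(weights[:-1], biases[:-1]):
        h = relu(h @ W + b)
    return sigmoid(h @ weights[-1] + biases[-1]).ravel()

def accuracy(weights, biases):
    return np.mean((forward(X, weights, biases) > 0.5) == y)

# Train with full-batch gradient descent on the logistic loss.
lr = 0.5
for step in range(2000):
    # Forward pass, keeping activations for backprop.
    acts = [X]
    for W, b in zip(weights[:-1], biases[:-1]):
        acts.append(relu(acts[-1] @ W + b))
    p = sigmoid(acts[-1] @ weights[-1] + biases[-1]).ravel()
    # Backward pass.
    delta = ((p - y) / n)[:, None]
    grads_W, grads_b = [], []
    for i in range(len(weights) - 1, -1, -1):
        grads_W.append(acts[i].T @ delta)
        grads_b.append(delta.sum(axis=0))
        if i > 0:
            delta = (delta @ weights[i].T) * (acts[i] > 0)
    grads_W, grads_b = grads_W[::-1], grads_b[::-1]
    for i in range(len(weights)):
        weights[i] -= lr * grads_W[i]
        biases[i] -= lr * grads_b[i]

base = accuracy(weights, biases)
print("trained accuracy:", base)

# Perturb one layer at a time with Gaussian noise of fixed relative scale
# and compare the accuracy drops across layers.
for i in range(len(weights)):
    drops = []
    for _ in range(20):
        noisy = [W.copy() for W in weights]
        noisy[i] += rng.normal(0, 0.5 * np.std(weights[i]), size=weights[i].shape)
        drops.append(base - accuracy(noisy, biases))
    print(f"layer {i}: mean accuracy drop {np.mean(drops):.3f}")
```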
from cs.AI updates on arXiv.org http://ift.tt/1tznj7m via IFTTT