This paper proposes a novel dynamic neural network model for categorizing complex visual patterns of human action. The Multiple Spatio-Temporal Scales Recurrent Neural Network (MSTRNN) adds recurrent connectivity to a prior model, the Multiple Spatio-Temporal Scales Neural Network (MSTNN). By developing adequate recurrent contextual dynamics, the MSTRNN learns to extract latent spatio-temporal structures from input image sequences more effectively than the MSTNN. Two experiments with the MSTRNN are detailed. The first involves categorizing a set of human movement patterns consisting of sequences of action primitives; the MSTRNN extracts long-range correlations in video images better than the MSTNN does. Time-series analysis of neural activation values obtained from the recurrent structure shows that the MSTRNN accumulates extracted spatio-temporal features that discriminate action sequences. The second experiment requires the model to categorize a set of object-directed actions, and demonstrates that the MSTRNN can learn to extract structural relationships between actions and action-directed objects (ADOs). Analysis of the characteristics used in categorizing both object-directed actions and pantomime actions indicates that the network develops categorical memories by organizing relational structures between each action and its appropriate ADO. Such relational structure may be necessary for categorizing human actions with an adequate ability to generalize.
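The abstract does not give the model's update equations, but the "multiple spatio-temporal scales" family of networks is typically built on leaky-integrator (CTRNN-style) units whose time constants differ across unit groups, so that fast units track fine-grained motion while slow units accumulate longer-range context. The sketch below is a minimal, hypothetical illustration of that mechanism only; the function name, weight shapes, and time-constant values are assumptions for illustration, not the paper's actual MSTRNN architecture (which also includes convolutional structure omitted here).

```python
import numpy as np

def leaky_rnn_step(h_prev, x, W_in, W_rec, tau):
    """One step of a leaky-integrator (CTRNN-style) recurrent unit.

    Hypothetical sketch: larger tau -> slower dynamics (coarse temporal
    scale); smaller tau -> faster dynamics (fine temporal scale).
    """
    u = W_in @ x + W_rec @ np.tanh(h_prev)  # external plus recurrent drive
    return (1.0 - 1.0 / tau) * h_prev + (1.0 / tau) * u

# Two groups of units with different (assumed) time constants, mimicking
# the multiple-timescale idea on a toy input stream.
rng = np.random.default_rng(0)
n_in, n_fast, n_slow = 4, 8, 8
W_in_f = rng.normal(0, 0.1, (n_fast, n_in))
W_rec_f = rng.normal(0, 0.1, (n_fast, n_fast))
W_in_s = rng.normal(0, 0.1, (n_slow, n_in))
W_rec_s = rng.normal(0, 0.1, (n_slow, n_slow))

h_fast = np.zeros(n_fast)
h_slow = np.zeros(n_slow)
for t in range(20):
    x = rng.normal(size=n_in)
    h_fast = leaky_rnn_step(h_fast, x, W_in_f, W_rec_f, tau=2.0)
    h_slow = leaky_rnn_step(h_slow, x, W_in_s, W_rec_s, tau=16.0)
```

Because the slow group changes only by a factor of 1/tau per step, its state integrates input history over a longer horizon, which is the kind of accumulation of discriminative spatio-temporal features the abstract's time-series analysis describes.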
from cs.AI updates on arXiv.org http://ift.tt/1PJbylV