Robust principal component analysis (RPCA) has been widely used for recovering low-rank matrices in many data mining and machine learning problems. It separates a data matrix into a low-rank part and a sparse part. The convex approach has been well studied in the literature. However, state-of-the-art algorithms for the convex approach usually have relatively high complexity because they must compute (partial) singular value decompositions of large matrices. A non-convex approach, AltProj, has also been proposed, with lower complexity and better scalability. Given the true rank $r$ of the underlying low-rank matrix, AltProj has a complexity of $O(r^2dn)$, where $d\times n$ is the size of the data matrix. In this paper, we propose a novel factorization-based model of RPCA, which has a complexity of $O(kdn)$, where $k$ is an upper bound on the true rank. Our method does not need the precise value of the true rank. Extensive experiments show that AltProj works only when $r$ is known precisely in advance: when the rank parameter is set to a value different from the true rank, AltProj cannot fully separate the two parts, whereas our method still succeeds. Even when both work, our method is about 4 times faster than AltProj. Our method can be used as a lightweight, scalable tool for RPCA when the precise value of the true rank is unavailable.
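The abstract does not spell out the algorithm, but the factorization idea it refers to, writing the low-rank part as a product of a $d\times k$ factor and an $n\times k$ factor so that no SVD of the full data matrix is ever needed, can be illustrated with a short sketch. The code below is a minimal, generic alternating scheme, not the paper's method: the function name, the threshold parameter `lam`, the iteration count, and the use of soft-thresholding for the sparse part are all assumptions made for illustration.

```python
import numpy as np

def factorized_rpca_sketch(M, k, lam=0.1, n_iters=50):
    """Illustrative sketch (not the paper's algorithm): approximate the
    d x n matrix M as L + S, where L = U @ V.T has rank at most k and
    S is sparse, via alternating least squares plus soft-thresholding."""
    d, n = M.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((d, k))
    V = rng.standard_normal((n, k))
    S = np.zeros_like(M)
    for _ in range(n_iters):
        R = M - S                              # residual to be explained by the low-rank part
        # Least-squares updates of the two factors; the dominant cost is the
        # O(kdn) products R @ V and R.T @ U, with no SVD of the full matrix.
        U = R @ V @ np.linalg.pinv(V.T @ V)
        V = R.T @ U @ np.linalg.pinv(U.T @ U)
        L = U @ V.T
        # Soft-threshold the remaining residual to update the sparse part.
        E = M - L
        S = np.sign(E) * np.maximum(np.abs(E) - lam, 0.0)
    return L, S

# Toy usage: a rank-2 matrix corrupted by sparse outliers, recovered with
# only an upper bound k = 5 on the rank.
rng = np.random.default_rng(1)
d, n, r = 100, 80, 2
L0 = rng.standard_normal((d, r)) @ rng.standard_normal((r, n))
S0 = (rng.random((d, n)) < 0.05) * 5.0 * rng.standard_normal((d, n))
L_hat, S_hat = factorized_rpca_sketch(L0 + S0, k=5)
```

Each iteration of such a scheme is dominated by $O(kdn)$ matrix products, which is what makes factorization-based formulations attractive compared with methods that require a (partial) SVD per iteration.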
from cs.AI updates on arXiv.org http://ift.tt/2d5THbN
via IFTTT