Programs of fast randomized PCA algorithms for Sparse Data
Latest Update Tips:
1.We simplified the compile script. Run "source compile.sh" and choose the "icc" or "gcc" compiler to build the program.
2.We added notes on the pass parameter "q" to the code. The pass parameter q should be larger than 1, and q passes correspond to (q-2)/2 power iterations.
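The relation between passes and power iterations in the note above can be sketched as a small C helper. The function name is hypothetical and is not part of frpca.c. The sketch also assumes q is even, since (q-2)/2 is an integer only in that case; how the programs treat odd q is not covered here.

```c
/* Hypothetical helper, not taken from frpca.c.
 * Maps a pass count q to the number of power iterations using the
 * relation (q - 2) / 2 from the note above.
 * Assumption: q is even and larger than 1. Any other value returns -1. */
int power_iterations_from_passes(int q)
{
    if (q < 2 || q % 2 != 0) {
        return -1; /* outside the range this sketch handles */
    }
    return (q - 2) / 2;
}
```

For example, q = 2 gives 0 power iterations and q = 10 gives 4.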
1.Main Algorithms
1.matlab/eigSVD.m ---- the Matlab implementation of the algorithm in [1] that computes an economic/truncated singular value decomposition by using eigendecomposition; the parameter k is used for the truncated singular value decomposition
2.matlab/frPCA.m ---- the Matlab implementation of the fast randomized PCA algorithm for sparse data in [1]; the parameter mode is chosen according to the size of the initial data matrix
3.icc/frpca.c ---- the frPCA and frPCAt algorithms in [1], implemented with ICC and OpenMP; the basic rPCA algorithm in [2] is also included in this file.
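The idea of computing an SVD through an eigendecomposition, as eigSVD does, can be written out as a short derivation. This is a sketch under the assumption that $A$ is an $m \times n$ matrix with $m \ge n$ and full column rank. The symbols $B$, $V$, $D$, $S$ and $U$ are notation chosen here and may differ from those in [1] and in eigSVD.m.

$$
B = A^{T}A, \qquad B = V D V^{T}, \qquad S = D^{1/2}, \qquad U = A V S^{-1}
$$

Here $V$ is orthogonal and $D$ is diagonal with positive entries. The factors reproduce $A$, and $U$ has orthonormal columns:

$$
U S V^{T} = A V S^{-1} S V^{T} = A V V^{T} = A, \qquad
U^{T}U = S^{-1} V^{T} A^{T} A V S^{-1} = S^{-1} D S^{-1} = I
$$

The eigendecomposition acts only on the $n \times n$ matrix $B$, which is small when $n \ll m$.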
2.Test Experiments
(1)Compiling the program, with either ICC or GCC, requires Intel MKL [3]. Once everything is prepared, run "source compile.sh" and choose the compiler ("icc" or "gcc") to build the executable. The resulting program is an example that tests the SNAP dataset [4].
(2)"matlab/test.m" is a working example of frPCA that compares the results of frPCA and svds. The i-th singular value of the test case is $\sigma_i=1/\sqrt{i}$.
(4)Matrix indices must begin at 1 when using these programs; otherwise the results may be incorrect.
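If a dataset stores zero-based indices, they need to be shifted before use because of the one-based requirement above. The following C sketch shows one way to do that. The function name and the separate row/column arrays are assumptions made for illustration and do not reflect the input format that frpca.c actually reads.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from frpca.c.
 * Shifts zero-based (row, column) index pairs to one-based, in place.
 * Returns 0 on success. Returns -1 and leaves the arrays unchanged if a
 * pointer is NULL or any index is negative. */
int shift_to_one_based(int *rows, int *cols, size_t nnz)
{
    if (rows == NULL || cols == NULL) {
        return -1;
    }
    /* Validate first so a bad entry never leaves the data half converted. */
    for (size_t i = 0; i < nnz; i++) {
        if (rows[i] < 0 || cols[i] < 0) {
            return -1;
        }
    }
    for (size_t i = 0; i < nnz; i++) {
        rows[i] += 1;
        cols[i] += 1;
    }
    return 0;
}
```

The function should be called only once per dataset. Applying it to indices that are already one-based would shift them again.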
References
[1] Xu Feng, Yuyang Xie, Mingye Song, Wenjian Yu, and Jie Tang, "Fast Randomized PCA for Sparse Data," in Proc. the 10th Asian Conference on Machine Learning (ACML), Beijing, China, Nov. 2018.
[2] N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217-288, 2011.
[4] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. https://snap.stanford.edu/data, June 2014.
[5] F. Maxwell Harper and Joseph A. Konstan. The movielens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 5(4):19, 2016.