
Programs of fast randomized PCA algorithms for Sparse Data

Latest Update Tips:

1.We simplified the compile script. Run "source compile.sh" and choose the "icc" or "gcc" compiler to build the program.

2.We added notes on the pass parameter "q" in the code. The pass parameter q should be larger than 1, and q passes are equivalent to (q-2)/2 power iterations.
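The relation between passes and power iterations can be sketched as a small C helper. The function name `passes_to_power_iterations` is hypothetical and is not part of the repository. It only restates the formula above and rejects values of q that are not larger than 1. For odd q it returns the half-integer value that the formula yields; how the programs treat odd q is not described here.

```c
/* Hypothetical helper, not part of the repository.
 * Converts the pass parameter q into the equivalent number of power
 * iterations using the relation stated above: (q - 2) / 2.
 * Returns -1.0 when q is not larger than 1 (an invalid pass count). */
double passes_to_power_iterations(int q)
{
    if (q < 2) {
        return -1.0;
    }
    return (q - 2) / 2.0;
}
```

For example, `passes_to_power_iterations(4)` gives 1.0, so four passes over the data correspond to one power iteration.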

1.Main Algorithms

1.matlab/eigSVD.m ---- Matlab implementation of the algorithm in [1] that computes the economic/truncated singular value decomposition by using an eigendecomposition. The parameter k is used for the truncated singular value decomposition.

2.matlab/frPCA.m ---- Matlab implementation of the fast randomized PCA algorithm for sparse data in [1]. The parameter mode is chosen according to the size of the input data matrix.

3.icc/frpca.c ---- the frPCA and frPCAt algorithms in [1], implemented with ICC and OpenMP. The basic rPCA in [2] is also included in this file.
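As background for eigSVD, the standard way to obtain an economic SVD from an eigendecomposition is summarized below. This is the textbook relation, written under the assumption that $A \in \mathbb{R}^{m \times n}$ has $m \ge n$ and full column rank; it is not copied from [1], and the Matlab file may differ in details.

$$
A^{T} A = V \Lambda V^{T}, \qquad \Sigma = \Lambda^{1/2}, \qquad U = A V \Sigma^{-1}
$$

With these definitions $U^{T} U = \Sigma^{-1} V^{T} A^{T} A V \Sigma^{-1} = I$ and $A = U \Sigma V^{T}$. A truncated decomposition keeps only the columns of $U$ and $V$ that belong to the largest diagonal entries of $\Sigma$.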

2.Experiments and Testing

(1)The C program requires Intel MKL [3]. Once it is installed, run "source compile.sh" and choose the compiler ("icc" or "gcc") to build the executable program. The resulting program is a test example on the SNAP dataset [4].

(2)"matlab/test.m" is a working example for frPCA that compares the results of frPCA and svds. The i-th singular value of the test case is $\sigma_i=1/\sqrt{i}$.

(3)The SNAP dataset is included in the icc folder, and the MovieLens datasets [5] can be downloaded from https://grouplens.org/datasets/movielens/.

(4)Matrix indices must begin with 1 when using these programs; otherwise the results may be incorrect.
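The indexing requirement in (4) can be handled before the data is passed to the programs. The sketch below assumes the sparse matrix is held as two arrays of row and column indices (a coordinate-style layout). That layout and the function name `shift_zero_based_indices` are assumptions for illustration and are not taken from the repository. If the smallest index is 0, every index is increased by 1; otherwise the arrays are left unchanged.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the repository.
 * rows and cols hold the row and column index of each nonzero entry
 * (an assumed coordinate-style layout). If the smallest index found is 0,
 * all indices are shifted up by 1 so that indexing starts at 1.
 * Returns 1 if a shift was applied, 0 otherwise. */
int shift_zero_based_indices(long *rows, long *cols, size_t nnz)
{
    if (nnz == 0) {
        return 0;
    }

    /* Find the smallest index used in either array. */
    long min_idx = rows[0];
    for (size_t i = 0; i < nnz; i++) {
        if (rows[i] < min_idx) min_idx = rows[i];
        if (cols[i] < min_idx) min_idx = cols[i];
    }

    /* Only the 0-based case is converted; anything else is left alone. */
    if (min_idx != 0) {
        return 0;
    }
    for (size_t i = 0; i < nnz; i++) {
        rows[i] += 1;
        cols[i] += 1;
    }
    return 1;
}
```

Calling the function a second time on the same arrays returns 0, because the indices already start at 1.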

References

[1] Xu Feng, Yuyang Xie, Mingye Song, Wenjian Yu, and Jie Tang, "Fast Randomized PCA for Sparse Data," in Proc. the 10th Asian Conference on Machine Learning (ACML), Beijing, China, Nov. 2018.

[2] N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217-288, 2011.

[3] Intel Parallel Studio XE Cluster Edition for Linux. https://software.intel.com/en-us/intel-parallel-studio-xe, 2018.

[4] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. https://snap.stanford.edu/data, June 2014.

[5] F. Maxwell Harper and Joseph A. Konstan. The MovieLens datasets: History and context. ACM Transactions on Interactive Intelligent Systems (TiiS), 5(4):19, 2016.


XuFengthucs/frPCA_sparse
