This repository was archived by the owner on Jan 26, 2021. It is now read-only.
Windows
Open windows/LightLDA.sln in Visual Studio 2013 and build all the projects.
Linux (Tested on Ubuntu 14.04)
Run $ sh ./build.sh to install the program.
Running LightLDA
We provide some quick guidelines below for your reference. You can get more detailed instructions about the command-line arguments by running $ ./lightlda --help
Preprocess
LightLDA takes a specific binary format as its input, so to run LightLDA you must first convert your data to this format. A tool is provided to convert LibSVM-format data to LightLDA-format data; for simplicity, you can therefore prepare your dataset in LibSVM format first. The following steps assume your dataset is in LibSVM format.
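For reference, a LibSVM-format line is a label followed by whitespace-separated feature:value pairs; used as LDA input, each line is one document and each pair a word id with its count. The sketch below is purely illustrative — the word ids, counts, and labels are made up:

```shell
# Hypothetical two-document dataset in LibSVM format:
# each line is <label> <word_id>:<count> <word_id>:<count> ...
printf '%s\n' \
  '0 3:2 17:1 42:5' \
  '1 3:1 8:4' > input.libsvm
```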
1. Count the dataset's meta information: ./example/get_meta.py input.libsvm output.word_tf_file
2. Split your LibSVM data into several parts.
3. Convert each part from LibSVM format to the binary format used by LightLDA: ./bin/dump_binary input.libsvm.part_id input.word_tf_file part_id
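The split step above does not come with a command. One simple approach is a round-robin split with awk; the toy input, the part count of 2, and the .part_N naming below are assumptions — use whatever part naming your dump_binary invocation expects:

```shell
# Toy input: five hypothetical one-word documents in LibSVM format.
printf '%s\n' '0 1:1' '0 2:1' '0 3:1' '0 4:1' '0 5:1' > input.libsvm

# Round-robin split into 2 parts: line i goes to input.libsvm.part_(i mod 2).
awk '{ print > ("input.libsvm.part_" (NR % 2)) }' input.libsvm

wc -l input.libsvm.part_0 input.libsvm.part_1
```

A round-robin split keeps the parts close to equal in document count, which helps balance work across data blocks.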
Training on single machine
We provide examples illustrating how to use LightLDA to train topic models on a single machine. For instance, run $ ./example/nytimes.ps1 in PowerShell (Windows) or $ ./example/nytimes.sh in Bash (Linux) for a quick start with LightLDA.
Training with distributed setting with MPI
Running MPI-based distributed LightLDA is quite similar to the single-machine setting: just use mpiexec with a machine list file. Run $ mpiexec -machinefile machine_list lightlda lda_arguments.
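The machine list file is plain text with one host per line; a minimal sketch (the hostnames are made up for illustration):

```shell
# Create a hypothetical machine_list naming three worker hosts,
# one hostname (or IP address) per line.
printf '%s\n' 'node01' 'node02' 'node03' > machine_list
```

Each listed host must be reachable by the MPI launcher (typically over passwordless SSH) and should have the lightlda binary and the input data available at the same paths.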