Code for NeurIPS 2019 paper Scalable inference of topic evolution via models for latent geometric structures
The code has been tested on Ubuntu 14.04, 16.04, 18.04 LTS
Three Settings
This repository contains three algorithms for
scalable inference of topic evolution, one for each of
three settings:
Distributed:
texts distributed/grouped in some manner
(e.g., location, categories)
Online (Streaming):
texts arriving in an online/streaming fashion
Distributed & Online (Streaming):
texts arriving in both a distributed and an online fashion
#### Algorithms Matching Three Settings (Setting: Algorithm)
Distributed: DM
Online: SDM
Distributed & Online: SDDM
usage:
Install the required dependencies:
./conf
To run one of the three algorithms, go to the corresponding folder
(distributed for DM, online for SDM, distributed_online for SDDM)
and refer to the README.md file in that folder for algorithm- and setting-specific details.
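As a quick reference, the setting-to-algorithm mapping above can be written as a small lookup. This is only an illustrative sketch: the algorithm abbreviations and folder names come from this README, while `SETTINGS` and `choose_algorithm` are hypothetical names that do not exist in the repository.

```python
# Hypothetical helper, not part of the repository.
# Keys are (distributed, online) flags; values are (algorithm, folder)
# pairs as listed in this README.
SETTINGS = {
    (True, False): ("DM", "distributed"),
    (False, True): ("SDM", "online"),
    (True, True): ("SDDM", "distributed_online"),
}


def choose_algorithm(distributed, online):
    """Return the (algorithm, folder) pair for the given data setting."""
    try:
        return SETTINGS[(bool(distributed), bool(online))]
    except KeyError:
        raise ValueError("at least one of distributed/online must be True")


if __name__ == "__main__":
    algorithm, folder = choose_algorithm(distributed=True, online=True)
    print(algorithm, folder)
```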
### Perplexity:
To compute the perplexity of any of the three algorithms above on a new dataset,
go to the perplexity folder and refer to the README.md file there for details.
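For orientation, the sketch below computes the standard per-word perplexity, exp(-log-likelihood / number of tokens), for a mixture-of-topics model. It is a minimal illustration and not the script in the perplexity folder, which may estimate document-topic weights and likelihoods differently. The function name, argument names, and data layout are all assumptions.

```python
import math


def perplexity(doc_word_counts, doc_topic_weights, topic_word_probs):
    """Per-word perplexity of held-out documents (illustrative sketch).

    doc_word_counts:   list of dicts {word_id: count}, one per document
    doc_topic_weights: list of lists; weights over topics, summing to 1
    topic_word_probs:  topic_word_probs[k][w] = p(word w | topic k)
    """
    total_log_lik = 0.0
    total_words = 0
    for counts, weights in zip(doc_word_counts, doc_topic_weights):
        for word_id, count in counts.items():
            # Word probability under the document's mixture of topics.
            p_word = sum(
                weight * topic[word_id]
                for weight, topic in zip(weights, topic_word_probs)
            )
            total_log_lik += count * math.log(p_word)
            total_words += count
    return math.exp(-total_log_lik / total_words)
```

Lower values indicate a better fit to the held-out text. As a sanity check, a model whose topics are all uniform over a vocabulary of size V has perplexity V.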
### Data:
Two datasets are used: EJC (ejc) and Wiki (wiki). More details on both datasets are given in the paper.
Each dataset folder has 5 subfolders:
_group (dataset partitioned by group)
_time (dataset partitioned by timestamp)
_time_group (dataset partitioned first by time and then by group)
_test (test dataset)
_time_group_meta_data (contains the vocabulary and the mapping between group names and IDs)
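The three partitionings listed above can be illustrated with a toy example. This sketch assumes each document is a dict with hypothetical `time` and `group` fields; the actual on-disk format of the dataset folders is not described here and may differ.

```python
from collections import defaultdict


def partition(docs, key):
    """Group documents into lists according to key(doc)."""
    parts = defaultdict(list)
    for doc in docs:
        parts[key(doc)].append(doc)
    return dict(parts)


# Toy documents; the field names and values are made up for illustration.
docs = [
    {"id": 0, "time": 2001, "group": "physics"},
    {"id": 1, "time": 2001, "group": "biology"},
    {"id": 2, "time": 2002, "group": "physics"},
]

# Analogue of _group: one partition per group.
by_group = partition(docs, lambda d: d["group"])

# Analogue of _time: one partition per timestamp.
by_time = partition(docs, lambda d: d["time"])

# Analogue of _time_group: split by time first, then by group within each time.
by_time_group = {
    t: partition(docs_at_t, lambda d: d["group"])
    for t, docs_at_t in by_time.items()
}
```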
Controls how topics change over time:
small value => sharp change of topics between different time points
tau1
Affects the number of topics:
small tau1 => larger variance between topics => fewer topics:
vectors with a small difference are regarded as different topics
gamma0
Parameter of Beta process