Python 3.8.3 and Pytorch 1.9.0 implementation of the Hierarchical Block Distance Model (HBDM).
We propose a novel graph representation learning method named the Hierarchical Block Distance Model (HBDM). It extracts node embeddings imposing a multiscale block-structure that accounts for homophily and transitivity properties across the levels of the inferred hierarchy. Notably, the HBDM naturally accommodates unipartite, directed, and bipartite networks whereas the hierarchy is designed to ensure linearithmic time and space complexity enabling the analysis of very large-scale networks. We evaluate the performance of our approach on massive networks consisting of millions of nodes. Our HBDM framework significantly outperforming recent scalable approaches in downstream tasks providing superior performance even using ultra low-dimensional embeddings at the same time facilitating direct and hierarchical-aware network visualization and interpretation.
A Millon Node Unipartite Example with Flixster Network
![]() |
---|
Dendrogram - Binary Logarithm over the Sum of Euclidean Distances |
![]() |
![]() |
![]() |
![]() |
![]() |
---|---|---|---|---|
L=1 | L=3 | L=5 | L=8 | 2-D Embedding Space |
A Bipartite Example with GitHub Network
![]() |
---|
Dendrogram - Binary Logarithm over the Sum of Euclidean Distances |
![]() |
![]() |
![]() |
![]() |
---|---|---|---|
L=1 | L=3 | L=6 | 2-D Embedding Space |
conda create -n ${env_name} python=3.8.3
conda activate ${env_name}
pip install -r requirements.txt
Our Pytorch implementation uses the pytorch_sparse package. Installation guidelines can be found at the corresponding Github repository.
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.12.1+cpu.html
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-1.12.1+${CUDA}.html
where ${CUDA} should be replaced by either cu102, cu113, or cu116 depending on your PyTorch installation.
RUN: python main.py
optional arguments:
--epochs number of epochs for training (default: 15K)
--RH number of epochs to rebuild the hierarchy from scratch (default: 25)
--cuda CUDA training (default: True)
--LP performs link prediction (default: True)
--D dimensionality of the embeddings (default: 2)
--lr learning rate for the ADAM optimizer (default: 0.1)
--RE activates random effects (default: True)
--dataset dataset to apply HBDM (default: grqc)
The code has been primarily constructed and optimized for running in a GPU-enabled environment.
N. Nakis, A. Celikkanat, S. Lehmann and M. Mørup, A Hierarchical Block Distance Model for Ultra Low-Dimensional Graph Representations, IEEE Transactions on Knowledge and Data Engineering (TKDE).