Skip to content

puzzlef/louvain-communities-cuda

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Design of CUDA-based Louvain algorithm for community detection.

Community detection involves identifying natural divisions in networks, a crucial task for many large-scale applications. This report presents GVE-Louvain, one of the most efficient multicore implementations of the Louvain algorithm, a high-quality method for community detection. Running on a dual 16-core Intel Xeon Gold 6226R server, GVE-Louvain outperforms Vite, Grappolo, NetworKit Louvain, and cuGraph Louvain (on an NVIDIA A100 GPU) by factors of 50x, 22x, 20x, and 5.8x, respectively, achieving a processing rate of 560M edges per second on a 3.8B-edge graph. Additionally, it scales efficiently, improving performance by 1.6x for every thread doubling. The paper also presents ν-Louvain, a GPU-based implementation. When evaluated on an NVIDIA A100 GPU, ν-Louvain performs only on par with GVE-Louvain, largely due to reduced workload and parallelism in later algorithmic passes. These results suggest that CPUs, with their flexibility in handling irregular workloads, may be better suited for community detection tasks.


Below we plot the time taken by Grappolo (Louvain), NetworKit Louvain, Nido (Louvain), cuGraph Louvain, and ν-Louvain on 13 different graphs. ν-Louvain surpasses Grappolo, NetworKit Louvain, Nido, and cuGraph Louvain by 20×, 17×, 61×, and 5.0× respectively, achieving a processing rate of 405M edges/s on a 2.2𝐵 edge graph.

Below we plot the speedup of ν-Louvain wrt Grappolo, NetworKit Louvain, Nido, and cuGraph Louvain.

Finally, we plot the modularity of communities identified by Grappolo, NetworKit Louvain, Nido, cuGraph Louvain, and ν-Louvain. ν-Louvain on average obtains 1.1%, 1.2%, and 1.3% lower modularity than Grappolo, NetworKit Louvain, and cuGraph Louvain (where cuGraph Louvain runs), but 45% higher modularity than Nido.

Refer to our technical report for more details:
CPU vs. GPU for Community Detection: Performance Insights from GVE-Louvain and ν-Louvain.


Note

You can just copy main.sh to your system and run it.
For the code, refer to main.cxx.



Code structure

The code structure of ν-Louvain is as follows:

- inc/_*.hxx: Utility functions
- inc/**.hxx: Common graph functions
- inc/csr.hxx: Compressed Sparse Row (CSR) data structure functions
- inc/Graph.hxx: Graph data structure
- inc/louvian.hxx: OpenMP-based Louvian algorithm
- inc/louvainCuda.hxx: CUDA-based Louvian algorithm
- inc/hashtableCuda.hxx: CUDA-based hybrid quadratic-double hashtable
- inc/mtx.hxx: Graph (MTX format) file reader
- inc/properties.hxx: Graph property functions (e.g., modularity)
- inc/symmetrize.hxx: Make graph symmetric
- inc/update.hxx: Graph update functions
- inc/main.hxx: Main header
- main.cxx: Experimentation code
- main.sh: Experimentation script
- process.js: Node.js script for processing output logs

Note that each branch in this repository contains code for a specific experiment. The main branch contains code for the final experiment. If the intention of a branch in unclear, or if you have comments on our technical report, feel free to open an issue.



References




ORG

About

Design of CUDA-based Louvain algorithm for community detection.

Resources

License

Stars

Watchers

Forks

Languages