|
| |
Motif listing in large graphs
Abstract
Motifs are small subgraphs (e.g., triangles, four-cycles) whose appearance in nature is much more frequent
than in classical random graphs. Their discovery
(enumeration or listing) plays an important role in various fields. This
project aims to analyze complexity of the involved algorithms and design
techniques that can handle trillion-edge graphs with limited RAM.
Journal Papers
| |
Y. Cui, D. Xiao, D.B.H. Cline, and D.
Loguinov, "Improving I/O Complexity of Triangle Enumeration," IEEE Transactions on Knowledge and Data Engineering,
vol. 34, no. 4, April 2022.
|
PDF |
|
Y. Cui, D. Xiao, and D. Loguinov, "On
Efficient External-Memory Triangle Listing," IEEE
Transactions on Knowledge and Data Engineering, vol. 31, no. 8, August
2019. |
PDF |
Conference
Papers
| |
D. Xiao, Y. Cui, and D. Loguinov,
"Towards Faster External-Memory Graph Computing on Small Neighborhoods,"
IEEE BigData, December 2025 (short paper).
|
PDF,
PPT |
| |
Y. Cui, D. Xiao, D.B.H. Cline, and D. Loguinov, "Improving
I/O Complexity of Triangle Enumeration," IEEE ICDM, November 2017.
|
PDF,
PPT |
| |
D. Xiao, Y. Cui, D.B.H. Cline, and D. Loguinov, "On
Asymptotic Cost of Triangle Listing in Random Graphs," ACM PODS, May 2017. |
PDF,
PPT |
| |
Y. Cui, D. Xiao, and D. Loguinov, "On
Efficient External-Memory Triangle Listing," IEEE ICDM, December 2016. |
PDF,
PPT |
Technical
Reports
| |
D. Xiao, Y. Cui, and D. Loguinov, "Towards Faster External-Memory Graph
Computing on Small Neighborhoods," Texas A&M Technical Report, November
2025.
|
PDF |
| |
Y. Cui, D. Xiao, D.B.H. Cline, and D. Loguinov, "Improving
I/O Complexity of Triangle Enumeration," Texas A&M Technical Report 2017-8-3, August 2017. |
PDF |
| |
D. Xiao, Y. Cui, D.B.H. Cline, and D.
Loguinov, "On Asymptotic Cost of Triangle Listing in Random Graphs,"
Texas A&M Technical Report 2016-9-2, September 2016. |
PDF |
| |
Y. Cui, D. Xiao, and D. Loguinov, "On
Efficient External-Memory Triangle Listing," Texas A&M Technical Report 2016-9-1, September 2016. |
PDF |
Datasets
The next four undirected graphs are used in the ICDM 2016 paper. Each consists of two files -- (source
node, degree) pairs and all adjacency lists dumped one after the other. All node
IDs and degrees are unsigned 4-byte integers, LSB order. The IDs are sequential,
with no gaps. Source nodes and neighbor lists are sorted ascending.
IRLbot domain (14 GB): degree, edges
IRLbot host (53 GB): degree, edges
IRLbot IP (6 GB): degree, edges
Full ClueWeb09 webgraph (358 GB): degree, edges
Trigon source code (can be used for PCF
as well)
|