SPA is a discrete growth process for networks featuring scale-free and overlapping community structure.
It models two phenomena simultaneously: the growth of community structure and the growth of the internal structure of communities.
In the growth of community structure, a vertex joins a community by creating an edge with an existing member.
The joining vertex is a new one with probability p, and the community is a new one with probability q; otherwise, the vertex (community) is chosen preferentially within the existing set of vertices (communities).
In so doing, SPA generates a network whose building blocks are overlapping communities whose sizes follow a scale-free distribution.
The internal structure of communities is determined by an intuitive densification process.
A link is added between two vertices that are members of the same community, at a rate controlled by a third parameter, r.
Higher values of r lead to denser (more connected) communities.
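The two membership rules above can be sketched in a few lines of Python. This is a minimal illustrative sketch, not the reference implementation: the names `spa_step`, `memberships`, and `community_sizes` are ours, the densification step (parameter r) is omitted for brevity, and we use the common trick of sampling uniformly from the list of past membership events to obtain preferential attachment.

```python
import random

def spa_step(memberships, community_sizes, p, q):
    """One growth event of a simplified SPA-like process (illustrative sketch).

    memberships: list of (vertex, community) membership events so far;
    sampling uniformly from it implements preferential attachment, since
    vertices and communities appear in proportion to their past activity.
    community_sizes: dict mapping community id -> number of members.
    """
    n_vertices = len({v for v, _ in memberships})
    n_communities = len(community_sizes)

    # With probability p the joining vertex is brand new;
    # otherwise an existing vertex is chosen preferentially.
    if random.random() < p:
        vertex = n_vertices  # fresh vertex id
    else:
        vertex = random.choice(memberships)[0]

    # With probability q the community is brand new;
    # otherwise an existing community is chosen preferentially.
    if random.random() < q:
        community = n_communities  # fresh community id
    else:
        community = random.choice(memberships)[1]

    memberships.append((vertex, community))
    community_sizes[community] = community_sizes.get(community, 0) + 1
    return vertex, community
```

Seeding the process with a single membership event, e.g. `memberships = [(0, 0)]` and `community_sizes = {0: 1}`, and calling `spa_step` repeatedly grows the community structure.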
The structural properties below only start to illustrate the richness of the SPA process.
- Clustering: average probability that two neighbours of a given vertex are themselves connected.
- Degree assortativity: correlation coefficient between the degree of a given vertex and its neighbours' degrees.
- Maximal coreness: maximum average degree within the most interconnected subset of the network.
- Partition density: the average fraction of possible edges that actually appear within communities.
- Average overlap: similarity of the set of vertices of two randomly selected communities.

Note that the running time explodes for extremal values of p, q, and r.
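As an example of how such properties are measured, the clustering coefficient defined above can be computed directly from an adjacency structure. This is a plain-Python sketch; the adjacency-dict representation and the function name `average_clustering` are ours, chosen for illustration.

```python
def average_clustering(adj):
    """Average local clustering coefficient of an undirected graph.

    adj: adjacency dict {vertex: set of neighbours}.
    For each vertex, the local coefficient is the fraction of pairs of
    its neighbours that are themselves connected; vertices with fewer
    than two neighbours contribute zero.
    """
    total = 0.0
    for v, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # clustering undefined; count as 0
        # Count edges among the neighbours of v (each pair once).
        links = sum(1 for u in nbrs for w in nbrs if u < w and w in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)
```

A triangle yields 1.0 (every pair of neighbours is connected), while a star yields 0.0.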
Since the structural properties vary continuously with the parameters, similar parameters generate similar networks.
Hence, a small subset of qualitatively different networks can give us a good idea of what's going on.
We identified nine representative sets of parameters, shown in the table below.
| p | q | r | Density | Clustering | Degree assortativity | Maximal coreness | Largest community | Highest membership | Partition density | Average overlap |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.10 | 0.05 | 1 | +++ | +++ | --- | +++ | +++ | +++ | +++ | +++ |
| 0.30 | 0.15 | 1 | +++ | ++ | -- | +++ | +++ | +++ | +++ | +++ |
| 0.85 | 0.85 | 1 | + | + | ++ | + | + | + | + | + |
| 0.45 | 0.60 | 1 | ++ | ++ | ++ | ++ | ++ | ++ | ++ | ++ |
| 0.75 | 0.55 | 1 | + | + | + | + | ++ | + | + | ++ |
| 0.05 | 0.85 | 10 | +++ | +++ | +++ | +++ | + | +++ | +++ | + |
| 0.85 | 0.20 | 10 | ++ | + | + | + | ++ | + | + | + |
| 0.75 | 0.10 | 10 | ++ | ++ | - | ++ | +++ | ++ | ++ | ++ |
| 0.70 | 0.80 | 10 | + | +++ | +++ | ++ | + | ++ | ++ | +++ |
The SPA process generates realistic synthetic networks with a known overlapping community structure.
Many algorithms aim to uncover such communities in empirical datasets, and telling them apart calls for good benchmarking tools.
It's a straightforward process.
One simply needs to produce an SPA network, use a detection algorithm to recover its community structure, and compare the results with the known communities (using the Normalized Mutual Information, for instance).
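As a sketch of the comparison step, here is a plain-Python normalized mutual information for hard (non-overlapping) partitions. Benchmarking overlapping communities in practice calls for a generalized NMI (e.g. the Lancichinetti et al. variant); this simplified version only illustrates the idea.

```python
from collections import Counter
from math import log

def nmi(labels_a, labels_b):
    """Normalized mutual information between two hard partitions,
    given as equal-length sequences of cluster labels.

    Returns 1.0 for identical partitions (up to relabeling) and
    0.0 for statistically independent ones.
    """
    n = len(labels_a)
    ca, cb = Counter(labels_a), Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # Mutual information between the two label assignments.
    mi = sum(c / n * log(c * n / (ca[a] * cb[b]))
             for (a, b), c in joint.items())
    # Entropies of each partition, used for normalization.
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    if ha + hb == 0:
        return 1.0  # both partitions are trivial (a single cluster)
    return 2 * mi / (ha + hb)
```

For instance, `nmi([0, 0, 1, 1], [1, 1, 0, 0])` returns 1.0 because the partitions match up to relabeling, while `nmi([0, 0, 1, 1], [0, 1, 0, 1])` returns 0.0.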
Below, we applied state-of-the-art community detection algorithms to the nine representative sets of networks; the results proved quite interesting.
BigCLAM fits a generative model to empirical datasets using a coordinate descent.
COPRA uses the steady state of a label-propagation dynamics to determine the assignment of nodes to communities.
OSLOM optimizes a fitness function that expresses the statistical significance of communities with respect to random fluctuations.
Further ReadingOf course, this list is far from complete, so
and we'll add your algorithm here.
Or you can simply download the software below.
To get a notification when the benchmarking paper is out, enter your email address here.
In the meantime, you may wish to check out the stochastic process itself, as well as our other papers on SPA and its hierarchical variant.
Built by Jean-Gabriel Young and Louis Roy, with the help of Edward Laurence and Laurent Hébert-Dufresne.
Code licensed under MIT, documentation under CC BY 3.0.