Social networks dataset collections

A question often asked in the SNA academic community is where a researcher can find ready-made datasets. As a prepared answer, here is a short list of dataset repositories for social network analysis.

  1. Stanford Large Network Dataset Collection
    Perhaps one of the most famous SNA repositories. Includes huge datasets of Wikipedia, Twitter and Facebook data.
  2. KONECT
    The Koblenz Network Collection contains 281 large network datasets of different types from various fields. A good collection of authorship, coauthorship and citation networks.
  3. Corporate Elites
    Dutch financial social networks from as early as 1902. Twelve datasets, downloadable in CSV and XLS formats.
  4. UCINET datasets
    Some datasets in UCINET format.
    Datasets from different fields, including literature. Some older ones.

There are also smaller lists maintained by individuals: 1, 2. They are probably worth checking as well, as are the Indiana University list and the Pajek datasets.

If you have suggestions for what should be added to the list, please comment.

Facebook network visualisation

There is a simple Facebook data collector called NetVizz, which allows you to download information about the user’s network of friends, network of likes, community members or community activity.

The data obtained can be fed into software such as Gephi, a free program for visualising social network graphs. Besides simply showing the relationships, it offers many layout and display settings and can calculate various parameters specific to social networks.
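To give a feel for what such parameters are, here is a minimal sketch in plain Python, not Gephi's own implementation. It computes two of the simplest ones, the degree of each person and the density of the whole network. The edge list and the names in it are made up for illustration, and the sketch assumes each friendship is listed exactly once.

```python
from collections import defaultdict

# Hypothetical toy friendship list; the names are invented for illustration.
edges = [("ann", "bob"), ("ann", "cat"), ("bob", "cat"), ("cat", "dan")]


def degrees(edge_list):
    """Count how many friendships each person takes part in."""
    deg = defaultdict(int)
    for a, b in edge_list:
        deg[a] += 1
        deg[b] += 1
    return dict(deg)


def density(edge_list):
    """Share of all possible undirected ties that are actually present.

    Assumes every friendship appears once in edge_list.
    """
    nodes = {person for edge in edge_list for person in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(edge_list) / (n * (n - 1) / 2)


deg = degrees(edges)
dens = density(edges)
print(deg)
print(round(dens, 3))
```

In this toy network "cat" has the highest degree (three friends), and four of the six possible friendships exist, so the density is about 0.667.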

Above is the visualisation of my (tiny) personal Facebook network. The points are my friends, and the connections are friendships between them.

An interesting insight from this data is the territoriality of my Facebook network. While the Internet is not a clearly territorial space, personal Facebook networks frequently are. All my connections are in fact grouped by geography. This is more visible in the next graph:

Each differently coloured group corresponds to a different geographical location. The isolated island is one of the conferences I’ve attended.

Here is the social network graph of my brother. He has 411 friends, more than four times my number. Looks impressive:

Here is a visualisation of the like clouds of all 411 of his friends. This is clearly not an appropriate resolution for showing such data:

By the way, in all of the graphs displayed here, the maximum shortest path between two points is 6, just as the six-degrees-of-separation theory predicts.
Strictly speaking, though, the maximum shortest path in every case is two: each graph shows the friends of one person, so any two of them are connected through that person.
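
The difference between the two numbers can be checked with a small sketch. It assumes the exported graph contains only the friends and leaves out the account owner. The chain of five friends and the node name "me" are hypothetical, and the breadth-first search is a plain-Python stand-in for what a tool like Gephi reports as the diameter.

```python
from collections import deque


def build_graph(edge_list):
    """Turn a list of undirected friendships into an adjacency mapping."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def eccentricity(graph, start):
    """Longest shortest-path distance from start to any reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return max(dist.values())


def diameter(graph):
    """Maximum shortest path over all pairs of nodes that are connected."""
    return max(eccentricity(graph, node) for node in graph)


# Hypothetical chain of friends: a - b - c - d - e
friend_edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]

# Graph as exported: friends only, without the account owner.
without_ego = build_graph(friend_edges)

# Same graph with the owner added back and linked to every friend.
ego_edges = [("me", friend) for friend in without_ego]
with_ego = build_graph(friend_edges + ego_edges)

d_without = diameter(without_ego)
d_with = diameter(with_ego)
print(d_without, d_with)
```

Without the owner, the two ends of the chain are four steps apart. Once the owner is added back, no pair of nodes is more than two steps apart.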