ABC: Attentive Betweenness Centrality

ABC: Attentive Betweenness Centrality

Who are the most important people in a network? There are many different measures that describe the importance of a person based on her relationships and where these relationships place her in the network. For example, one well-known measure of betweenness considers how crucial a person is by looking at whether they are on a critical path for others in the network. If a person is on a large number of shortest paths between people in the network, then they are important. For example, in the above example, node A and B are on the shortest path between nodes X and Y. In fact, they are on the shortest path between all nodes in the left and right subgraphs. In essence this means that if X wants to send a message to Y, then A is the best conduit. This gives nodes A and B a crucial advantage. The betweenness measure is based on this idea: it finds the fraction of shortest paths between any pair of nodes that pass through a given node.

The problem with betweenness is that it disregards path that are almost the shortest path. For example, paths that pass through E,C,D are not as short as those that pass through A and B. But, these paths are still valuable. Especially if node A becomes unavailable, they provide a crucial alternate path. However, in betweenness centrality, nodes E,C,D will get almost no credit. To address this problem, we introduce a new measure called ABC centrality. Instead of looking at whether a node lies on the shortest path, it looks at the amount of flow that can pushed through a node. But, the longer the path, lesser the amount of flow. In fact, the flow gets split across all the outgoing edges and reduced by a factor of alpha at each step. To see an example, let’s look at the following figure:

Suppose node A wants to send flow to node B along the rightmost path. The flow will be divided by a half because node A has two outgoing edges. Now, the flow is not divided further. But, at each step, it attenuates by a factor of alpha (a). As a result, total flow reaching B along this path is alpha square divided by 2. Now, we compute the importance of the nodes along this path by looking at how much each node contributed to this flow. This process gives credit to all the nodes for the different paths they are on. ABC betweenness simply adds the credit for a node along all the different paths.

Our algorithm appears in the SocialComm 2012 Conference. We are not the first to recognize the shortcoming of the betweenness measure. In our paper (SocialComm2012_ABC_Centrality.pdf), we show that our algorithm better approximates the original betweenness measure than any of the other proposed algorithms, and at the same time overcoming the new problems introduced by these algorithms. The code for our algorithm can be found here.

For example, for the well known Karate Club dataset, our algorithm captures the nodes at the center of the two main factions very well as shown below.

Plus, the parameter alpha allows us to adjust how much attention we pay to the length of the paths. So, what types of properties do these nodes with high betweenness have? We find that nodes with high ABC scores are more diverse. For example, for the Internet Movie Database (IMDB), the actors with high ABC values also star in movies in many different genres. They tend to act as a bridge or connector between actors who specialize in a specific genre. Here are some of the actors with the highest ABC scores. If you do not know who they are, we guarantee you that you will recognize each actor as soon as you see a picture of him:

Tuesday, July 10, 2012

Software is available at: github.com/rpitrust/prominence/tree/master/AttnCentrality