Inside DP Ruto’s Social Network Engine (Part 1)

Herman Wandabwa
Aug 20, 2021

Human relationships are built around many spheres, the most obvious being family. Beyond blood relatives, however, friendships largely dictate how people relate. Graph theory, a branch of mathematics, mirrors this structure. In a graph, nodes or vertices (singular: vertex) represent people or users, whereas edges represent the relationships between them in a social network [1]. The weight of an edge represents the strength of the relationship. The node, in this case a user, is the fundamental unit of a graph. A node is usually drawn as a labelled circle, and an edge as a line connecting one node/vertex/user to another. This is a good illustration of how social networks work: one user is connected to another, who in turn is connected to other users or groups of users. Graph theory distinguishes two categories of graphs: undirected graphs, which consist of a set of vertices and a set of edges (unordered pairs of vertices), and directed graphs, which consist of a set of vertices and a set of arcs (ordered pairs of vertices).
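To make the directed/undirected distinction concrete, here is a minimal Python sketch of how the same interactions could be stored either way. The user names, weights and helper functions are hypothetical and are not taken from the data analysed in this article.

```python
# Hypothetical interactions as (source_user, target_user, weight) tuples.
interactions = [
    ("alice", "bob", 2),    # alice replied to bob twice
    ("bob", "alice", 1),    # bob replied to alice once
    ("carol", "alice", 3),  # carol mentioned alice three times
]


def build_directed(edges):
    """Directed graph: the ordered pair (source, target) matters."""
    graph = {}
    for source, target, weight in edges:
        graph.setdefault(source, {})
        graph.setdefault(target, {})
        graph[source][target] = graph[source].get(target, 0) + weight
    return graph


def build_undirected(edges):
    """Undirected graph: a pair of users is unordered, so weights are merged."""
    graph = {}
    for source, target, weight in edges:
        pair = frozenset((source, target))
        graph[pair] = graph.get(pair, 0) + weight
    return graph


directed = build_directed(interactions)
undirected = build_undirected(interactions)
print(directed["alice"])                         # alice's outgoing edges only
print(undirected[frozenset(("alice", "bob"))])   # combined alice-bob weight
```

In the directed version, alice-to-bob and bob-to-alice stay separate, which is what a reply or mention network on Twitter needs; the undirected version only records that the two users are connected.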

Political discussions can be equally unifying or divisive, and politicians, the people we love to hate, often bear the reverberations. The modern politician is keen on exploiting online social platforms to expand their network, and Kenya is no different. This article delves into how political content propagates around the Twitter user @WilliamSRuto, the account of the Deputy President of the Republic of Kenya. Without a doubt, @WilliamSRuto is a highly engaging Twitter account, with over 4 million followers. Most of the account’s activity is localized among Kenyans, an observation that can be explained by the theory of homophily.

The pervasive fact of homophily means that the cultural, behavioral, genetic, or material information that flows through networks tends to be localized. Enough of the theory. Let’s get to the exciting bits: unraveling the insights in @WilliamSRuto’s Twitter social graph.

Twitter Data and Overall Metrics

@WilliamSRuto’s following of 4 million suggests that this is an active account on the platform. Responses to the tweets sent by the account between 15/07/2021 and 16/08/2021 were sampled for experimentation purposes. The retweets, likes, replies and related actions on @WilliamSRuto’s tweets form the user network.

Network Interactions: The network had 12,057 interactions within the period, with 3,367 active users. The total number of edges (connections) among the users was 12,055, of which 6,775 were unique and 5,280 duplicated. In addition, there were 58 self-loops, i.e. edges that start and end at the same vertex, which arise when users refer to themselves. Users whose only edges are self-loops are connected to nobody else and are commonly treated as isolates in social network analysis.
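As a rough illustration of how such counts can be derived from an edge list, the sketch below tallies total, unique, duplicated and self-loop edges. It assumes the convention that an edge is "unique" when its (source, target) pair occurs exactly once and "duplicated" otherwise, so that the two add up to the total; the function name and sample edges are hypothetical.

```python
from collections import Counter


def edge_summary(edges):
    """Summarise a list of directed (source, target) interactions."""
    counts = Counter(edges)
    total = len(edges)
    # Assumed convention: an edge is unique if its pair occurs exactly once.
    unique = sum(1 for n in counts.values() if n == 1)
    # Every remaining edge belongs to a pair that occurs more than once.
    duplicated = total - unique
    # A self-loop starts and ends at the same vertex.
    self_loops = sum(1 for source, target in edges if source == target)
    return {
        "total": total,
        "unique": unique,
        "duplicated": duplicated,
        "self_loops": self_loops,
    }


# Hypothetical sample: a->b occurs twice, c refers to itself once.
sample_edges = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "c")]
summary = edge_summary(sample_edges)
print(summary)
```

Under this convention the reported figures are consistent: 6,775 unique plus 5,280 duplicated edges give the 12,055 total.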

Reciprocated Vertex Pair Ratio (RVPR): The RVPR value for the period was 0.017597. This is the proportion of connected vertex pairs whose connection is returned. For example, when an edge from A to B is joined by another edge from B to A, the connection is “reciprocated”. A higher RVPR value denotes more two-way interaction among the users in a network. That is not the case in this network: a few users interacted with each other, but the vast majority did not “reciprocate”.
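The ratio can be sketched in a few lines of Python. This is an illustration under one common definition (reciprocated pairs divided by all connected pairs, ignoring self-loops and duplicate edges), not necessarily the exact computation behind the value above; the sample edges are hypothetical.

```python
def reciprocated_vertex_pair_ratio(edges):
    """Share of connected vertex pairs that have edges in both directions."""
    # Drop self-loops; the set also collapses duplicate edges.
    directed = {(s, t) for s, t in edges if s != t}
    # Each connected pair counts once, whatever the direction.
    connected_pairs = {frozenset(edge) for edge in directed}
    if not connected_pairs:
        return 0.0
    # Every reciprocated pair is seen twice (A->B and B->A), hence // 2.
    reciprocated = sum(1 for s, t in directed if (t, s) in directed) // 2
    return reciprocated / len(connected_pairs)


# Hypothetical sample: only the A-B connection is returned.
sample_edges = [("A", "B"), ("B", "A"), ("A", "C"), ("D", "A")]
rvpr = reciprocated_vertex_pair_ratio(sample_edges)
print(round(rvpr, 3))
```

Here three pairs are connected (A-B, A-C, A-D) and one of them is reciprocated, so the ratio is one third.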

Graph Density: The low level of reciprocation is matched by the graph density value of 0.000729, which is also quite low. Density is the number of edges among a group of vertices divided by the total number possible if everyone were connected to everyone else. A high graph density means that most people are connected to many others. A low graph density, as in this network, means that most people are not connected to many others.
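A minimal sketch of the density calculation for a directed graph follows. It assumes duplicate edges are merged and self-loops ignored, which may differ from the exact settings behind the value above; the function name and sample edges are hypothetical.

```python
def directed_density(edges):
    """Unique directed edges divided by the number of possible directed edges."""
    # Assumption: merge duplicates and ignore self-loops before counting.
    unique_edges = {(s, t) for s, t in edges if s != t}
    vertices = {vertex for edge in edges for vertex in edge}
    n = len(vertices)
    if n < 2:
        return 0.0
    # Each of the n vertices could point at each of the other n - 1.
    return len(unique_edges) / (n * (n - 1))


# Hypothetical sample: 3 vertices and 3 of the 6 possible directed edges.
sample_edges = [("A", "B"), ("B", "A"), ("A", "C")]
density = directed_density(sample_edges)
print(density)
```

With 3,367 users there are 3,367 × 3,366 = 11,333,322 possible directed ties, so even several thousand edges leave the density close to zero.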

Other Key Metrics

The tables below summarize other key metrics identified in the data.

Table 1: Top Domains, Hashtags and their counts in @WilliamSRuto’s network

The domain counts in Table 1 indicate that @WilliamSRuto’s content is being pushed to the user’s network, though not very aggressively. It’s a passive-active approach that works well without overloading the network with external links. Interestingly, the live video broadcasting website tops the list as the domain of choice for this user during this period. The top shares from this domain relate to streamed content.

The top hashtag is #hustlernation, which epitomizes the economic model that the user has embraced in his speeches across the country [2]. The hashtag’s context is visible in the top word-pairs in Table 2, such as inclusive-economic, model-expand and economic-model. The Kenyan population is bound to understand this better.

Table 2: Top Words and Word-Pairs in Tweets

An interactive infographic of word-pairs is below. With all credit, the content propagated during the month touched heavily, and fairly evenly, on the fundamental building blocks of the economy. I would personally assume that this is the economic policy the user promotes.

Table 3: Top Replied to and Top-Mention Counts
Top Replied-to Users in @WilliamSRuto’s network

Obviously, @WilliamSRuto commanded the largest share of replies and mentions, as it is his network. A lookup of the other usernames points to users disseminating political content, more so in the mentions infographic below. This could be an indicator of users comparing @WilliamSRuto with other political actors such as @railaodinga. That comparison will be part of the work in Part 2 of this article, where I’ll try to compare the social content propagation prowess of such users.

Top Mentions by @WilliamSRuto’s network

Before embarking on the last bit of this post, i.e. a quick review of @WilliamSRuto’s social graph, we’ll look at the user’s inner circle, which is by and large the fulcrum for a user whose content is meant to be propagated to the masses on Twitter.

A score called Betweenness Centrality (BC) measures this relevance. At a high level, BC detects how much influence a node/user has over the flow of information in a graph. It is calculated from how often a user sits on the shortest paths between other users, which is highest for people who connect otherwise disconnected groups. “Betweenness” is in effect a “bridge score” that measures how much a user is the only way to get from one part of the network to another. It is a sociological proxy for “influence”, and such users are often called influencers. Note that this score does NOT use follower count, tweet count, or retweet count as an input; it is based on the behavior of other people towards a user. That makes it arguably the best value-for-money measure in cases such as advertising, where an influencer’s worth is evaluated.
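To show what the score rewards, here is a small sketch of Brandes' algorithm for betweenness centrality on an unweighted, undirected graph. It is an illustration only, not the tooling used to produce the figures in this article, and the five-user network is hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm; adjacency maps each node to a set of neighbours."""
    scores = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search counting shortest paths from the source.
        visit_order = []
        predecessors = {node: [] for node in adjacency}
        path_counts = {node: 0 for node in adjacency}
        path_counts[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            visit_order.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
                if distance[neighbour] == distance[node] + 1:
                    path_counts[neighbour] += path_counts[node]
                    predecessors[neighbour].append(node)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        while visit_order:
            node = visit_order.pop()
            for pred in predecessors[node]:
                share = path_counts[pred] / path_counts[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    # In an undirected graph every pair is counted from both ends.
    return {node: score / 2 for node, score in scores.items()}


# Hypothetical network: "bridge" links the a-group to the b-group.
network = {
    "a1": {"a2", "bridge"},
    "a2": {"a1", "bridge"},
    "bridge": {"a1", "a2", "b1"},
    "b1": {"bridge", "b2"},
    "b2": {"b1"},
}
bc = betweenness_centrality(network)
print(sorted(bc.items(), key=lambda item: item[1], reverse=True))
```

The "bridge" user scores highest because every shortest path between the two groups passes through it, even though it has no more connections than it needs. Follower or tweet counts never enter the calculation.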

Who is the Most Influential User in @WilliamSRuto’s Network?

I’m sure a wild guess would point towards the user himself. Indeed, @WilliamSRuto is the most influential user in the network, since practically all the propagated content is about this user. For that reason the user was removed from the data in Table 4 below.

Table 4: Top 20 Influencers in @WilliamSRuto’s network

The same data is replicated in the sunburst visualization below. @oleitumbi remains the closest person to @WilliamSRuto in the user’s network, at least by content propagation. The user has influence to the tune of about 31.1% of the content in @WilliamSRuto’s network, at least for the month in question. A deeper analysis over time is likely to reveal different patterns, something worth looking at in the future.

The second spot in the influencer hierarchy is occupied by @bcollapsed. The user’s bio reads “Government Critic, controversial, unapologetic, looking for answers”. The user joined Twitter in June 2021 and a month later is a super-spreader of @WilliamSRuto-related content. The same narrative applies to @MbuiMumbi, who joined the network in May 2021 and is fourth in the table. It is unlikely that these two users are organic, “natural” users. Either they joined the platform with an agenda to spread content about @WilliamSRuto, or they are working their way up to influencer status on Twitter. For now, the polarity of what they shared is unknown, and that is another area I look forward to delving into in the future. Overall, the difference in influence between the second and the last user in Table 4 is about 2.1%, which isn’t very significant.

The social graph below shows this propagation network. The red lines are the connections (edges) between users (nodes) in the clusters (labelled G1 etc.). @WilliamSRuto’s cluster is the densest, followed by @oleitumbi’s.

Figure 1: @WilliamSRuto’s social graph between 15/07/2021 and 16/08/2021. Image by author


There are three significant weaknesses in this network setup:

  1. Isolated users: There are many isolates in the network, more so around @WilliamSRuto’s cluster. They are likely to miss out on what, for example, @MbuiMumbi or @oleitumbi disseminates unless it is re-shared by @WilliamSRuto, which may not always be the case. This is depicted by the low Reciprocated Vertex Pair Ratio.
  2. Weak inter- and intra-cluster edges: Connections within clusters are weak, less so for G1 to G5. This means content in a cluster is less likely to reach all the users in it. The situation is even worse for inter-cluster connections.
  3. Influence isolation: @oleitumbi is the only user of influence in this collection period. The user is a prime target for account suspension, e.g. if someone reports a policy violation. This is depicted by the low graph density value.
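The first two weaknesses above can be checked with simple counts. The sketch below lists isolates and compares intra-cluster with inter-cluster edges; it assumes cluster labels such as G1 and G2 are already available as a mapping, and all users, edges and labels shown are hypothetical.

```python
def find_isolates(users, edges):
    """Users with no edge to anyone else (self-loops do not count)."""
    connected = set()
    for source, target in edges:
        if source != target:
            connected.update((source, target))
    return sorted(set(users) - connected)


def cluster_edge_counts(edges, cluster_of):
    """Count edges that stay inside a cluster versus edges that cross clusters."""
    intra = inter = 0
    for source, target in edges:
        if source == target:
            continue  # a self-loop connects nobody
        if cluster_of[source] == cluster_of[target]:
            intra += 1
        else:
            inter += 1
    return intra, inter


# Hypothetical users, edges and cluster labels.
users = ["u1", "u2", "u3", "u4", "u5"]
edges = [("u1", "u2"), ("u2", "u3"), ("u3", "u4"), ("u5", "u5")]
cluster_of = {"u1": "G1", "u2": "G1", "u3": "G2", "u4": "G2", "u5": "G1"}

isolated = find_isolates(users, edges)
intra, inter = cluster_edge_counts(edges, cluster_of)
print(isolated, intra, inter)
```

A long list of isolates or a very small inter-cluster count relative to the intra-cluster count would point to the fragility described in the list above.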


The key takeaway from the above analyses is the need to strategize on countering the three weaknesses above. Building resilient social graphs takes time and a lot of effort, and resilience is needed for information to flow to every intended and likely node. I’ll be expanding on the above by:

  1. Extracting users who may be of interest to the clusters in Figure 1.
  2. Analysing sentiment across the clusters and users. Is @WilliamSRuto’s content being received well in the user’s network? What can be changed to suit users?
  3. Detailing step-by-step measures on improving the network’s resilience.

Please feel free to look at my data science profile. If you have any questions or want to collaborate, kindly get in touch via my LinkedIn.


  1. Bollobás, B. (2013). Modern graph theory (Vol. 184). Springer Science & Business Media.
  2. Iraki, X. N. (2021, June 28). Why the ‘Hustler’ narrative has gained traction. The Standard.

