
Local/Global contagion of viral/non-viral information: Analysis of contagion spread in online social networks

  • Alon Bartal ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Validation, Visualization, Writing – original draft, Writing – review & editing

    bartala@gmail.com

    Affiliation Dept. of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States of America

  • Nava Pliskin ,

    Contributed equally to this work with: Nava Pliskin, Oren Tsur

    Roles Conceptualization, Writing – original draft, Writing – review & editing

    Affiliation Dept. of Industrial Engineering and Management, Ben-Gurion University of the Negev, Beer Sheva, Israel

  • Oren Tsur

    Contributed equally to this work with: Nava Pliskin, Oren Tsur

    Roles Conceptualization, Data curation, Formal analysis, Methodology, Writing – original draft, Writing – review & editing

    Affiliation Dept. of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel

Abstract

Contagion in online social networks (OSN) occurs when users are exposed to information disseminated by other users. Studies of contagion are largely devoted to the spread of viral information and to local neighbor-to-neighbor contagion. However, many contagion events can be non-viral in the sense of being unpopular with low reach size, or global in the sense of being exposed to non-adjacent neighbors. This study aims to investigate the differences between local and global contagion and the different contagion patterns of viral vs. non-viral information. We analyzed three datasets and found significant differences between the temporal spreading patterns of local contagion compared to global contagion. Based on our analysis, we can successfully predict whether a user will be infected by either a local or a global contagion. We achieve an F1-score of 0.87 for non-viral information and an F1-score of 0.84 for viral information. We propose a novel method for early detection of the viral potential of an information nugget and investigate the spreading of viral and non-viral information. In addition, we analyze both viral and non-viral contagion of a topic. Differentiating between local versus global contagion, as well as between viral versus non-viral information, provides a novel perspective and better understanding of information diffusion in OSNs.

1 Introduction

Contagion in an Online Social Network (OSN) is typically measured by the tendency of users to perform online activities such as re-posting or sharing of information, or adopting a new behavior after exposure to similar information or behavior respectively. For example, a user’s feed on Twitter or Facebook presents some of the online activities, e.g., posting or sharing, performed by other users whom s/he follows or is friends with. These social online platforms are designed to maximize interaction and engagement, creating an activity network based on a post-reply interaction, re-sharing activity, or “voting”, e.g., “like” on Facebook and “favoriting” (starring) on Twitter. Some OSNs support two types of networks, an explicit social network, and an implicit activity network. For instance, in addition to a social network of Following-Follower relationships, Twitter supports an activity network of who tweets whom [1] and a retweet (RT) activity network whose nodes are tweet authors and edges indicate paths of contagion spreading [2].

Contagion is often viewed by OSN researchers as a result of user exposure to content posted by adjacent neighboring users on the social graph. One way by which information can reach a user is through local exposure to posts by network neighbors whose distance from the user is 1-hop. Such exposure can lead to local information contagion in the form of infection, influence, or adoption. For example, on Twitter, exposure of a user to information can occur through her/his Follower lists [3] and can result in local contagion when a user retweets a message that a neighbor posted [1, 2]. However, contagion can also be triggered by non-local mechanisms [4] like content promotion [5], exposure to external sources such as the mainstream media [6, 7], browsing for information [8], or recommender systems integrated into the social platform [9, 10]. This type of exposure to content, not propagated by the user’s direct neighbors, can lead to global contagion.

Previous studies have considered various aspects contributing to contagion. The temporal order of posted items was considered by [5, 11–14]. Structural and non-structural approaches are used to model contagion spread, and most of them [2, 15, 16] infer the spread of local contagion by users who expose network neighbors to their posted content. Moreover, most of these approaches overlook information spread by non-neighbors, assuming implicitly that information can reach a user only within the network edges. However, global contagion can result from external out-of-network events such as exposure to content on mainstream media [6]. Yet, global contagion has been addressed as an aggregated phenomenon [6, 7] without studying user behavior at a micro-level. A better understanding of contagion spread in OSNs requires accounting for both local and global effects. In addition, most studies of contagion focus on viral content that spreads to numerous users within a short period [6, 17]. While there is no consensus on the number of contagions that make an event “viral”, it is well established that most posts are shared a scant number of times [17–19]. Yet, research on contagion spread of non-viral content is sparse [20].

Whereas most contagion studies model local contagion and spread of viral information, this study investigates local/global contagion spread of viral/non-viral information in three datasets. To detect the depth of the reach of an information nugget, we measure the distance on the Following-relationship network from the source who originated the information, and the distance from the closest infected user to a newly infected user. We found significant differences in the spread of global versus local contagion of viral and non-viral information. Contrary to the common assumption that contagion diffuses from node-to-node, we found that contagion in OSNs also spreads globally beyond social network links. In addition, we found that on the micro-level, viral information spreads faster than non-viral information.

This work makes three main contributions to the vast body of scholarship addressing contagion spread in OSNs. First, our data analysis revealed that local contagion and global contagion are associated with significantly different temporal spreading patterns. Second, we explain the types of contagion spread by considering the time difference from the posting of the original tweet to the time of infection, the number of global and local contagions, and the different distances. Third, based on our data analysis, one can successfully detect whether a user will be infected by either a local or a global contagion with an F1-score of 0.87 for non-viral information and with an F1-score of 0.84 for viral information.

Organization. Section 2 sets the theoretical background for this study, followed by Section 3, which presents the methods used. Section 4 describes the different datasets, followed by the research hypotheses in Section 5. Section 6 presents the results of the data analyses, followed by hypotheses testing in Section 7. Finally, the limitations of this study are briefly discussed and future research directions are proposed in Section 8.

2 Related work

2.1 Structural contagion models

Most contagion models focus on simple contagion and complex contagion [21, 22]. Simple contagion describes a controlled process with a contagion probability being independent of the number of exposures [22]. Complex contagion describes a process that requires multiple exposures to a contagious entity [22] and describes better than simple contagion the spread of ideas, technologies, or a behavior [22, 23]. Most information-spread studies, whether devoted to simple or complex contagion, focus on local contagion as a result of user-to-user exposure [1, 2, 24].

Addressing local contagion with structural modeling, Leskovec et al. [25] studied a recommendation graph and measured the extent to which a user’s activity in recommending a product is contagious and affects the purchase decisions of adjacent neighbors. Similarly, Sun et al. [26] studied the contagion effect on the participation of users in fan pages after some of their neighboring friends have done so, and Bakshy et al. [27] examined contagion in adopting the use of gestures among friends in the Second Life platform. The Linear Threshold Model [28] allows a user’s transition from a non-active to an active state following a similar transition in the participation of a network neighbor. Kleinberg et al. [29] used a structural transition model with a similar contagion definition of user participation and analyzed the propagation of local contagion by allowing a user to activate her/his inactive neighbors. Two-step diffusion models [15, 30], however, posit that a piece of information first spreads globally from the mainstream media to opinion leaders and, only afterward, propagates locally in a node-to-node manner from opinion leaders to a broader population.

The local neighbor-to-neighbor mechanism at the basis of the structural contagion models discussed so far misses the global mechanisms behind exposure of OSN users to varied content by non-neighbors, beyond network structure, that can lead to global contagion [4, 6]. For example, a contagion event like retweeting (RT) a message on Twitter, by re-posting someone else’s tweet [1, 3] and passing interesting pieces of information to followers, is limited neither to viral information that a user is exposed to nor to a local user-to-user mechanism only. Since contagion can occur beyond pairwise interaction, through higher-order structures than neighbor-to-neighbor [4, 6], it is important to consider contagion by all network users when modeling contagion.

Detecting both local and global contagion mechanisms while considering network structure is crucial to better understanding human behavior online, as manifested by user interactions [31]. In the case of Twitter, users are exposed to information posted by non-neighbors via exposure to hashtags [32] as well as to promoted content on a user’s Timeline feed [13], which includes tweets of accounts that a user follows as well as content tweeted by non-neighbors, either purchased by advertisers or ranked as having a large engagement potential [33]. Similarly, Facebook and Reddit allow global exposure via trending topics that appear on a user’s front page [34]. This facilitation of global exposure in OSNs raises the need to consider non-structural contagion modeling, beyond network structure, as discussed next.

2.2 Non-structural contagion models

Non-structural contagion models (e.g. [35]) do not rely on the structure of the network to infer contagion. The Susceptible-Infected-Resistant (SIR) model [36] and the Susceptible-Infectious-Susceptible (SIS) model [36], for example, assume that every individual has the same probability to be infected. In SIR and SIS models, all users thus have the same contact rate, which is indicated by an edge formation in a network. However, contagion in OSNs is not evenly distributed among users [37] and is likely to depend on exposure rates [6], as modeled by the Linear Influence Model [6], which assumes a static network structure in which the infection likelihood of a user is affected by the number of contagious network users [38].

Aiming to better explain contagion spread with non-structural modeling, Wang et al. [39] modeled contagion by mainly focusing on temporal and topological dynamics while integrating a single topological feature that represents the distance between the infecting and the infected users. Global contagion can also result from homophily-driven diffusion by a peer-to-peer influence model [39], where the user interface has a substantial effect on the contagion process [1]. Other studies [12, 40] focus on the linguistic properties of textual information in predicting contagion spread. In recent years, evidence of global contagion due to exposure to external sources like mainstream media was found by [6, 7, 11] among others.

Since this study aims to investigate not only local versus global contagion but also the spread of viral and non-viral information, structural contagion models have been reviewed in the previous sub-section and non-structural contagion models have been reviewed in the present sub-section. Another goal of this study is to go beyond most studies [18, 20, 41–47] which, as discussed next, focus on viral contagion, largely ignoring contagion spread of non-viral information.

2.3 Viral versus non-viral contagion

Analyzing viral contagion, Romero et al. [2] modeled complex contagion of the 500 most frequent hashtags in a Twitter dataset of three billion messages with a median hashtag count of over 93,000 occurrences. Dow et al. [17] studied the spread of one million images among Facebook users, restricting their analysis to viral images shared at least 100 times. The “Yes We Can” slogan used in the 2008 U.S. elections reached over 20 million views, demonstrating yet another analysis of viral contagion [48]. The number of contagion events that make a piece of information viral differs among studies. Gleeson et al. [14] found that the structure of the network and temporal dynamics can explain sub-critical (non-viral) cascades.

One of the lowest thresholds for contagion events was set by Myers et al. [6], who restricted their analysis to messages shared at least 50 times, while Dow et al. [17] restricted their viral-contagion analysis to images shared at least 100 times. Liben-Nowell et al. [42] measured cascades with hundreds of steps, showing that re-sharing contagion activities have the potential of information spreading to millions of users [20].

The reach distribution of contagious information is long-tailed since most information nuggets are shared only a small number of times, if shared at all [18, 49]. Cheng et al. [18], who found that the cascade size distribution of photos posted by users follows a power-law curve with an exponent of α = 2.2, concluded that most information nuggets are scarcely shared further. It is reasonable, therefore, to assume that there exists a contagion mechanism in the few cases in which an information nugget is shared enough times to create a viral contagion.

Several studies [41, 45] have addressed the important task of detecting whether a piece of information will go viral [20] by focusing on information cascades as well as by modeling the contagion spread of hashtags [43], behavioral dynamics [46] or, for example, YouTube views [50]. Other studies tried to predict the size of information cascades by using network features [18, 47]. For instance, Cui et al. [44] applied a logistic model that considers the relative importance of each node given the list of previously infected nodes. Most of these approaches, however, lack the ability to detect virality early.

The scarcity of research regarding contagion spread of non-viral content and regarding global effects on the contagion mechanism has motivated us to distinguish between local versus global contagion and viral versus non-viral information. Complementing and contributing to the body of OSN scholarship about contagion spread of information, we focus on non-viral contagion as a means to better understand both local and global contagion mechanisms and the nature of human behavior online.

3 Methodology

We begin by defining viral and non-viral information nuggets in Section 3.1. Then, in Section 3.2, we outline the methodology developed for detecting local and global contagion spread of a viral or non-viral information nugget in the form of a particular message. Next, in Section 3.3, we generalize the methodology of contagion spread of a particular message to the contagion spread of a topic by defining local and global contagion of viral and non-viral topics. Finally, in Section 3.4, we present an innovative approach for detecting the virality of a tweet in its early stages.

3.1 Defining information virality

According to Dow et al. [17], viral events are manifested by resharing of photos on Facebook at least 100 times. According to Goel et al. [15], rare events are those manifested by resharing tweets on Twitter (0.025% of viral events identified by diffusion trees with at least 100 nodes). Following these authors, we define an information nugget as viral if it is shared at least 100 times and as non-viral if it is shared 10 to 99 times, overlooking information nuggets shared fewer than 10 times since their contagion spread signature is too low to allow studying contagion spread.
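The thresholds above can be written down as a small classification rule. The following is a minimal sketch, not the authors' code; the function name `classify_virality` and the constant names are illustrative assumptions.

```python
from typing import Optional

# Thresholds taken from the definition in the text; the names are hypothetical.
VIRAL_MIN_SHARES = 100      # "shared at least 100 times"
NON_VIRAL_MIN_SHARES = 10   # nuggets shared fewer than 10 times are overlooked


def classify_virality(share_count: int) -> Optional[str]:
    """Return "viral", "non-viral", or None for nuggets that are not studied."""
    if share_count < 0:
        raise ValueError("share_count must be non-negative")
    if share_count >= VIRAL_MIN_SHARES:
        return "viral"
    if share_count >= NON_VIRAL_MIN_SHARES:
        return "non-viral"
    return None  # too few shares to study contagion spread


for count in (5, 10, 99, 100):
    print(count, classify_virality(count))
```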

3.2 Detecting contagion spread of information

Consider a directed social network G = (V, E) as defined in Table 1. A post of User vj ∈ V can result in local contagion of User vi ∈ V who Follows vj, starting a cascade of local contagions. Since contagion is time-dependent, we define it more formally while considering the temporal activities of users.

Let w denote an information nugget, like an original tweet, posted at Time t0 by User v0 ∈ V. Contagion spread, like retweeting, of w at times t1, …, tk by users v1, …, vj ∈ V, along with v0, could be thought of as a temporal activity network GTw = (VTw, ETw) as defined in Table 1. The social network G and the activity network GTw allow us to define next, as summarized in Table 1, local and global contagion events.

Local contagion event.

A contagion event of User vi is local if eji ∈ E and eij ∈ ETw. In other words, vi has shared an original post w after one of the users s/he follows has shared or posted w.

Global contagion event.

A contagion event of User vi is global if eji ∉ E and eij ∈ ETw. In other words, vi has shared a post w before any of the users s/he follows has shared or posted w.

Fig 1 illustrates an example of local and global contagion spread. Consider a network (Fig 1) in which User v0 posts a tweet w at Time t0, exposing Users v2 and v4 who Follow her/him. At Time t1, v2 retweeted w, demonstrating local contagion since s/he follows v0, while v1 retweeted w, demonstrating global contagion since s/he does not follow v0. The retweets of v3 at t2, v4 at t3, and v5 at t4 all demonstrate local contagion events since their opposite edges in ETw exist in E.
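The two definitions can be sketched as a labeling pass over time-ordered retweets. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function `classify_events` and the dictionary layout are hypothetical, retweets with identical timestamps are treated as simultaneous (so they cannot infect each other), and the follow edges of v1, v3, and v5 in the example are assumed, since the text does not spell them out.

```python
from itertools import groupby
from typing import Dict, List, Set, Tuple


def classify_events(
    follows: Dict[str, Set[str]],
    source: str,
    retweets: List[Tuple[float, str]],
) -> Dict[str, str]:
    """Label each retweeter's first retweet as "local" or "global".

    follows[u] is the set of accounts that u follows (social network G).
    retweets holds (time, user) pairs (activity network GTw).
    """
    infected = {source}
    labels: Dict[str, str] = {}
    # Process retweets in time order, one timestamp at a time.
    for _, batch in groupby(sorted(retweets), key=lambda event: event[0]):
        new_users = [user for _, user in batch if user not in infected]
        for user in new_users:
            # Local: the user follows someone who shared w strictly earlier.
            exposed = follows.get(user, set()) & infected
            labels.setdefault(user, "local" if exposed else "global")
        infected.update(new_users)
    return labels


# Hypothetical network loosely mirroring the example in the text.
follows = {
    "v2": {"v0"},
    "v4": {"v0"},
    "v1": set(),    # assumed: follows no infected user
    "v3": {"v2"},   # assumed edge
    "v5": {"v3"},   # assumed edge
}
retweets = [(1, "v2"), (1, "v1"), (2, "v3"), (3, "v4"), (4, "v5")]
labels = classify_events(follows, "v0", retweets)
print(labels)
```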

Fig 1. An activity network with a social network.

Distances are measured on the social network.

https://doi.org/10.1371/journal.pone.0230811.g001

To detect the reach depth of an information nugget w, the distance d is defined in (1):

  (1) d: The distance on G from the source who originated w to a newly infected user.

To detect whether a contagion is local or global, the distance dca is defined in (2):

  (2) dca: The distance on G to a newly infected user from the closest infected user who adopted w or from the user who originated w.

The definitions of the variables are summarized in Table 1.

Returning to Fig 1, consider an example where the retweeting source User v0, and her Follower v3 are the only members of VTw at Time tk−1. Next, global contagion spreads to User v5 at Time tk. The minimal distance from any user in VTw to User v5 before tk is dca(v3, v5) = 2, or dca(v0, v5) = 2.
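Both distances can be computed with a breadth-first search on the social network. The sketch below is only one possible reading of the definitions: it assumes distances are measured along the direction in which information flows (from an account to its followers), and the helper names, the `followers` layout, and the intermediate users v4 and v6 are hypothetical.

```python
import math
from collections import deque
from typing import Dict, Iterable, Set


def bfs_distances(followers: Dict[str, Set[str]], sources: Iterable[str]) -> Dict[str, int]:
    """Hop distance from the nearest source to every reachable user.

    followers[u] is the set of users who follow u, so information
    posted by u can flow to each of them in one hop.
    """
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        user = queue.popleft()
        for follower in followers.get(user, ()):
            if follower not in dist:
                dist[follower] = dist[user] + 1
                queue.append(follower)
    return dist


def reach_distance(followers, source, user):
    """d: distance on G from the originator of w to a newly infected user."""
    return bfs_distances(followers, [source]).get(user, math.inf)


def closest_adopter_distance(followers, infected, user):
    """dca: distance on G from the closest already-infected user."""
    return bfs_distances(followers, infected).get(user, math.inf)


# Hypothetical network: v3 and v4 follow v0; v6 follows v3; v5 follows v4 and v6.
followers = {"v0": {"v3", "v4"}, "v3": {"v6"}, "v4": {"v5"}, "v6": {"v5"}}
d = reach_distance(followers, "v0", "v5")
dca = closest_adopter_distance(followers, {"v0", "v3"}, "v5")
print(d, dca)  # dca > 1 means the infection of v5 is a global contagion
```

A user with no path from any infected account receives an infinite distance, which matches the INF convention used later in the reach-depth analysis.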

Several posts that describe the same subject can be grouped into a topic, and analyzing the contagion spread of a topic instead of the spread of a more specific nugget can uncover more realistic patterns of contagion spread [51].

3.3 Detecting contagion spread of a topic

A topic contagion event occurs when a single User vi posts several original messages about the same subject, with each original message potentially shared by different users.

For example, assume that User v0 (Fig 1) tweeted twice about the same specific topic, the discovery of the Higgs boson particle, described further in Section 4. The first tweet (i) was: #CERN scientists inexplicably present #Higgs #boson findings, and the second tweet (ii) was: New result from #LHC reinforces belief that the particle has been found. User v2, who follows v0, retweeted (i), exposing to the topic a follower, User v3, who does not follow v0. Next, v3 retweeted (ii). This contagion sequence facilitates an event of local topic contagion: v3 was exposed to a topic and to v0, upon being infected by v2 whom s/he follows.

Our definitions of viral topic contagion and non-viral topic contagion are similar to the definitions of particular viral and non-viral tweets suggested in Section 3.1. A topic is viral if its set of original messages was shared at least 100 times and is non-viral if its set of original messages was shared 10 to 99 times. There are two differences between contagion of a particular tweet and contagion of a topic. First, the numbers of retweets of the original messages within a topic that were posted by the same User vi are summed. Second, while users who share a particular message whose origin is User vi are considered infected, a user infected by a topic is one who shares any of vi’s original messages on the same topic. Stated differently, in the case of topic contagion, a user can be directly exposed to one message whose origin is vi but share another message on the same topic whose origin is the same user vi.

Topical contagion is a more realistic scenario as users are infected by a concept rather than by a fixed sequence of characters.
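The two differences between tweet-level and topic-level contagion can be illustrated with a short aggregation sketch. It is a hypothetical illustration rather than the authors' pipeline: the record layout (`author`, `topic`, `retweeters`), the function names, and the example counts are assumptions, and the thresholds are the ones stated in the text.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Key = Tuple[str, str]  # (author, topic)


def aggregate_topics(tweets: List[dict]) -> Tuple[Dict[Key, int], Dict[Key, Set[str]]]:
    """Sum retweets and pool infected users over all original tweets
    that the same author posted on the same topic."""
    totals: Dict[Key, int] = defaultdict(int)
    infected: Dict[Key, Set[str]] = defaultdict(set)
    for tweet in tweets:
        key = (tweet["author"], tweet["topic"])
        totals[key] += len(tweet["retweeters"])     # difference 1: counts are summed
        infected[key].update(tweet["retweeters"])   # difference 2: any message infects
    return dict(totals), dict(infected)


def topic_virality(total_shares: int):
    """Same thresholds as for single tweets: >= 100 viral, 10 to 99 non-viral."""
    if total_shares >= 100:
        return "viral"
    return "non-viral" if total_shares >= 10 else None


# Hypothetical data: two original tweets by v0 on the same topic.
tweets = [
    {"author": "v0", "topic": "higgs", "retweeters": [f"a{i}" for i in range(60)]},
    {"author": "v0", "topic": "higgs", "retweeters": [f"b{i}" for i in range(50)]},
]
totals, infected = aggregate_topics(tweets)
print(totals[("v0", "higgs")], topic_virality(totals[("v0", "higgs")]))
```

In this made-up example each tweet alone would count as non-viral, while the topic as a whole crosses the viral threshold once the retweets are summed.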

For a more formal definition of topic contagion, the social network G is defined as suggested in Section 3.2, and a topical contagion is then defined as follows. A set of original tweets w1p, …, wnp is posted within a selected time interval by User v0 ∈ V about a topic p (denoted by wμp, μ = 1, …, n). The retweeting spread of wμp ∈ W at times t1, …, tk by Users v1, …, vj ∈ V, along with User v0 who originated wμp, forms a temporal activity network GTW, which is laid over G. In GTW, Nodes VTW depict v0 as well as users who retweet any wμp ∈ W at Times ti ≤ tk, and Edges ETW represent retweet relations. These definitions are also summarized in Table 1.

Local topic contagion event.

A topic contagion event of User vi is local if eji ∈ E and eij ∈ ETW. In other words, vi has shared an original post wμp ∈ W after one of the users s/he follows has shared or posted wμp ∈ W.

Global topic contagion event.

A topic contagion event of User vi is global if eji ∉ E and eij ∈ ETW. In other words, wμp ∈ W was not posted by any user that User vi follows, and vi shared wμp ∈ W before any of the users s/he follows has shared or posted wμp.

We define the two types of distances (d and dca) in the same manner as for particular tweets in Section 3.2, as summarized in Table 1.

3.4 Detecting tweet virality in early stages

Our goal is to detect whether a tweet will become viral at its early stages, before it becomes viral. To achieve this goal, we developed the Back-in-time (BIT) approach, which allows us to study the spreading patterns of viral tweets before they became viral and compare them to the spreading patterns of non-viral tweets. To analyze contagion spread of viral information in its early stages, we roll a viral tweet back in time to a point when it was still a non-viral tweet (i.e., having 10 to 99 retweets). At the end of this stage, both rolled-back tweets (BIT-tweets) and non-viral tweets have 10 to 99 retweets, and we compare their spreading patterns, aiming to show that they differ significantly.

Under the BIT approach, a viral tweet is sent back in time and the number of retweets (retweet-count) it had back then is determined via three steps: (i) Learn the retweet-count distribution of non-viral tweets by creating a kernel density estimate (KDE) of their retweet-counts. Accounting for non-viral tweets, the left-most and right-most points of the grid at which the density is estimated are 10 and 99 respectively; (ii) Randomly sample from the KDE a number y that represents a retweet-count and assign it to an original viral tweet. For that purpose, we draw a point xi from the set of points x1, …, xn included in the KDE. Then, we draw a value y associated with xi to ensure that more probable values of non-viral retweet-counts in the distribution are more likely to be randomly sampled; (iii) Roll a viral tweet back in time by keeping the oldest retweets, such that its retweet-count is equal to y.
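The three BIT steps can be sketched in a few lines of standard-library Python. This is a simplified illustration under assumptions, not the authors' implementation: it uses a Gaussian kernel with a fixed, arbitrarily chosen bandwidth, it samples a grid point with probability proportional to its estimated density (one way to make more probable retweet-counts more likely to be drawn), and all function names and example values are hypothetical.

```python
import math
import random
from typing import List, Sequence


def kde_on_grid(counts: Sequence[int], grid: Sequence[int], bandwidth: float) -> List[float]:
    """Step (i): Gaussian kernel density estimate evaluated at each grid point."""
    norm = len(counts) * bandwidth * math.sqrt(2 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in counts) / norm
        for x in grid
    ]


def sample_retweet_count(counts, rng, bandwidth=5.0, low=10, high=99):
    """Step (ii): draw a retweet-count y from the grid [low, high],
    weighting each grid point by its estimated density."""
    grid = list(range(low, high + 1))
    density = kde_on_grid(counts, grid, bandwidth)
    return rng.choices(grid, weights=density, k=1)[0]


def roll_back(retweet_times: Sequence[float], y: int) -> List[float]:
    """Step (iii): keep only the y oldest retweets of a viral tweet."""
    return sorted(retweet_times)[:y]


# Hypothetical inputs: retweet-counts of non-viral tweets and one viral tweet
# whose 500 retweets happened at times 1..500.
rng = random.Random(0)
non_viral_counts = [12, 15, 15, 20, 34, 50, 77]
viral_retweet_times = list(range(500, 0, -1))

y = sample_retweet_count(non_viral_counts, rng)
bit_tweet = roll_back(viral_retweet_times, y)
print(y, len(bit_tweet))
```

After the roll-back, the BIT-tweet has between 10 and 99 retweets, so it can be compared with genuinely non-viral tweets on equal footing.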

Next, we describe the three datasets used in this study.

4 Datasets description

Dataset 1 (DS1) contains local/global contagion as well as viral/non-viral messages and was collected using the Twitter stream API in four steps: (i) Hebrew tweets were collected during the five months prior to the Israeli elections of April 2019, since their political content can potentially go viral and allows us to select, in the next step, a set of active users who posted at least once; (ii) A set C of several hundred active users, containing politicians and journalists, was defined; (iii) All tweets that mention or interact with each ui ∈ C from December 2018 to January 2019 were captured, where each original tweet has the potential to be contagious and initiate a series of contagion events; and (iv) The followers of users ui ∈ C who posted an original tweet were collected.

Dataset 2 (DS2) facilitates studying contagion spread of non-viral information. The Twitter stream API was first used to collect into an initial dataset, during March 2017, all the accounts of users who posted public tweets that mention the president of the United States (US) or a member of the US Congress. Then, the appropriate dataset was curated in five steps: (i) Users who posted a tweet in the initial dataset were included in DS2 if the sizes of the user’s Followers list and Friends list range between 100 (to avoid inactive users) and 1,000 (to avoid celebrities who tend to go viral); (ii) The 200 most recent tweets of each user selected in Step i were collected (as done in [52]), without topic limitation, thus countering the possible bias toward topics about politics in the initial dataset. The most recent tweets were collected since the older a tweet, the more likely it is that its retweet list will be missing a user and possibly bias the results; (iii) To include non-viral tweets only, each tweet with 10 to 100 retweets (RTs) collected in Step ii was kept; (iv) Accounts of users who retweeted an original tweet in Step iii were kept in DS2; and (v) The Followers-lists of users kept in Step iv were collected.

To verify that the retweet counts of non-viral tweets did not grow to be viral, non-viral tweets in DS1 and DS2 were queried using the Twitter API at least a month after their posting date, and only tweets that remained non-viral were considered.

Dataset 3 (DS3) was curated from the dataset described in [53] to study the way topical contagion spreads within a time interval, covering tweets posted from July 1st to July 7th, 2012 about the discovery of the Higgs boson subatomic particle. DS3 contains interactions on the same topic by mentions, replies, retweets, and original tweets with at least one of the following keywords or hashtags: “lhc”, “cern”, “boson”, “higgs”. DS3, curated in three steps, included: (i) Users who posted an original tweet; (ii) The RT list for each original tweet; and (iii) The Following relationships among users selected in the two earlier steps.

DS3 differs from DS1 and DS2 in containing data about contagion spread of a topic instead of data about the spread of a single original tweet. Homophily often explains social relationships [54]. Since viral and non-viral tweets in DS3 are discussing the same topic, users who retweeted them might be socially connected by a Following relationship due to similar topic interests. Therefore, our analyses are based on the combined social network formed by users in DS3 who retweeted or posted viral and non-viral content.

Table 2 summarizes the three analyzed datasets after preprocessing.

In the context of the background and the proposed differentiation approach, we address three hypotheses next.

5 Research hypotheses

Global contagion is not limited to a node-to-node spreading mechanism over time [4] and, hence, we hypothesize:

H1. Local and global contagion have different temporal spreading patterns in an OSN.

Only a small fraction of information goes viral and reaches a large audience. Detecting in the early stage of its life-cycle whether the information would evolve to be viral or not is extremely useful [17]. Due to the difference between viral and non-viral information, expressed by the temporal dynamics of information spread, we hypothesize:

H2. Viral and non-viral information have different local and global contagion spreading patterns, starting at an early stage of the appearance of the information.

A piece of information can spread by a single sharing activity such as a retweet. Several messages on the same subject can be grouped by a topic [2, 55]. Analyzing contagion spread of such a topic can uncover realistic patterns of contagion spread. Hence, we hypothesize:

H3. Topic contagion has similar temporal spreading patterns to contagion spread of information.

6 Results of data analyses

6.1 Analysis of contagion reach depth

This section describes how we detect a contagion and measure its reach. Given a network, local contagion begins at some node and then spreads over the edges. Typically, local contagions are measured on the activity network [56]. However, the social network plays a fundamental role in the dynamics of spreading [53, 57, 58].

We detect a contagion using the activity network (GTw) and measure the depth it reaches on the social network (G). The depth is defined as the largest distance (d) that the information (w) spreads from the user originating the information (Table 1).

To detect the reach depth of an original tweet (w), each user retweeting w was assigned a distance (d) on G and in the absence of a path on G, an infinite distance (INF) was assigned. Instead of INF, a path might exist at larger distances by collecting wider circles of Following-lists but contagion at such larger distances is global and, thus, does not affect interpreting the results.
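Given one distance per retweeting user, the reach depth and the share of retweets at each distance can be summarized as follows. This is a minimal sketch with made-up distances; the function names are hypothetical, and treating INF as excluded from the reach depth is an assumption based on the way the finite depths are reported.

```python
import math
from collections import Counter
from typing import Dict, Sequence


def reach_depth(distances: Sequence[float]) -> float:
    """Largest finite distance d reached by the information (0 if none)."""
    finite = [d for d in distances if d != math.inf]
    return max(finite, default=0)


def distance_shares(distances: Sequence[float]) -> Dict[float, float]:
    """Fraction of retweeting users observed at each distance, INF included."""
    counts = Counter(distances)
    return {d: c / len(distances) for d, c in counts.items()}


# Hypothetical distances of six retweeting users from the source v0.
distances = [1, 1, 1, 2, 3, math.inf]
depth = reach_depth(distances)
shares = distance_shares(distances)
print(depth, shares)
```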

Most contagion events of both viral and non-viral information are local (d = 1). Viral information (Fig 3a) has a contagion reach depth d ≤ 18, whereas non-viral information (Fig 3b), has a shorter contagion reach with depth d ≤ 7.

For both viral information (when d ≤ 9, d ∈ [13, 18]), and non-viral information (when d ≤ 7) in DS1, if a path exists on G, the more distant a user is from the source who originated the information (User v0), the less likely s/he is to retweet w. We observed a trend shift for viral information when d ∈ [10, 12], and found that the more distant a user is from the source User v0, the more likely s/he is to retweet w.

For non-viral information, DS2 presents trends similar to DS1 (Table 3): for d ≤ 9, d ≠ 8 (Fig 3c), if a path exists on G, the more distant a user is from the source User v0, the less likely s/he is to retweet w. For d = 8, users are more likely to be infected than for d = 7. This finding might be attributed to global contagion (e.g., mass media or the Timeline). Similar to DS1, most contagion events in DS2 are local (d = 1).

Table 3. Contagion spread: The larger the distance the less likely a user will retweet.

https://doi.org/10.1371/journal.pone.0230811.t003

To better understand the contagion spread of viral and non-viral information, we next analyze contagion spread by type.

6.2 Analysis of contagion types

This section explains how we detect local and global contagion. As explained in Section 3.2 and demonstrated in Section 6.1, local contagion results when d = 1. However, the mechanism for global contagion is different. Fig 2 demonstrates global contagion where User v0 is the source of w and User v3 retweets w at Time t1 before User v2 retweets w at Time t2. This order of events implies global contagion of User v3.

Fig 2. An activity network with a social network illustrating global contagion.

https://doi.org/10.1371/journal.pone.0230811.g002

To account for such cases, we assigned to each infected node a local or global contagion type as described in Section 3.2. Table 2 shows the percentage of local and global contagions in each dataset. Next, we computed dca (Table 1). Fig 3 depicts the two types of distances, d and dca, for each dataset: DS1 for viral information (Fig 3a) and for non-viral information (Fig 3b); DS2, which contains non-viral information only (Fig 3c and 3d); and DS3 for viral topic spread (Fig 3e) and for non-viral topic spread (Fig 3f).

Fig 3. Distributions of distances, measured on the social network (G) for each of the three analyzed datasets, DS1 to DS3 (Table 2).

https://doi.org/10.1371/journal.pone.0230811.g003

Fig 3 presents the distributions of the d and dca distances, as measured on the social network G, for each of the analyzed datasets DS1 to DS3. The most frequent contagion mechanism is depicted by the highest bar. A distance of dca = 1 indicates the existence of a Following relationship between the closest adopter and an infected user, thus demonstrating local contagion. On the other hand, dca > 1 indicates no Following relationship between the closest adopter and an infected user, thus demonstrating global contagion.
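As an illustration of the dca rule, the sketch below assigns each retweeter the distance to the closest earlier adopter and labels the contagion as local when dca = 1 and global otherwise. The helper names, the toy graph, and the edge direction (u -> v meaning that v follows u) are assumptions made for this example; the graph is not a reproduction of Fig 2.

```python
from collections import deque
import math

def hop_distance(graph, start, target):
    """Shortest hop count from start to target on a directed graph, or math.inf."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, d = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr == target:
                return d + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, d + 1))
    return math.inf

def classify_contagions(graph, adoptions):
    """adoptions: list of (user, time); the earliest entry is the source of w."""
    ordered = sorted(adoptions, key=lambda a: a[1])
    earlier = [ordered[0][0]]               # adopters seen so far, starting with the source
    result = {}
    for user, _ in ordered[1:]:
        dca = min(hop_distance(graph, adopter, user) for adopter in earlier)
        result[user] = (dca, "local" if dca == 1 else "global")
        earlier.append(user)
    return result

# Hypothetical chain v0 -> v1 -> v2 -> v3, in which v3 retweets before v2 does.
G = {"v0": ["v1"], "v1": ["v2"], "v2": ["v3"], "v3": []}
labels = classify_contagions(G, [("v0", 0), ("v1", 1), ("v3", 2), ("v2", 3)])
```

Here v3 retweets while its closest adopter (v1) is two hops away, so it is labeled global, whereas v2 later adopts at one hop from v1 and is labeled local.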

Table 3 summarizes these indications. In both DS1 and DS2, local contagion is the most frequent contagion mechanism (dca = 1), and the closer a user is to an adopter, the more likely s/he is to retweet. It is likely that a user who is retweeting at dca < 8 was exposed to content via network neighbors by manually crawling through the feeds of neighbors and neighbors of neighbors, or through exposure to the Twitter Timeline that presents tweets of accounts that a user follows. These results are in line with findings of previous studies on neighbor-to-neighbor contagion [1, 2, 24] that network members have a significant influence on their neighbors.

For global contagion spread of viral and non-viral information, in both DS1 and DS2, the larger the distance, the less likely a user is to retweet w, up to a threshold distance. Beyond that threshold, namely in DS1 for viral information when dca > 8 (Fig 3a) and for non-viral information when dca > 7 (Fig 3b), as well as in DS2 when dca > 6 (Fig 3d), the trend reverses: the larger the distance, the more likely a user is to retweet w. These findings can be attributed to global contagion resulting from global exposure to external sources. Since non-viral tweets are less likely to be covered by the mainstream media yet their global contagion distances (Fig 3b and 3d) are similar to those of viral tweets (Fig 3a), it is reasonable to conclude that the Twitter Timeline algorithm exposed users to non-viral information. Because users have little available time and limited attention [51], information pushed on one’s Timeline significantly increases the contagion rate [1, 19, 33]. Thus, it is more likely that the global contagion observed in Fig 3b and 3d occurred due to content promoted by the Timeline algorithm rather than due to local neighbors.

Each added local contagion expands the circles of exposed users. Hence, the number of globally exposed users at an arbitrary distance of dca also increases and might lead to global contagion. Aiming to learn the effect of each added local contagion on global contagion spread, we examine next the explanatory power of local contagion on the number of global contagions.

6.3 Explaining contagion spread

Our goal is to explain contagion spread by type, whether local or global, while differentiating between viral and non-viral information. We developed one model for viral information and another model for non-viral information. Given contagion events, we aim to predict whether a user will be infected by a local or by a global contagion.

We measured the following explanatory terms for each individual user at the time of adoption, when s/he retweeted the information.

  1. ΔTo—Time difference from the posting of the original tweet.
    Captures bursty user interactions [59, 60], which can explain contagion spread [61].
  2. GC—Total number of global contagions.
    Captures contagion spread similar to non-structural models (Section 2.2).
  3. d—Distance from the tweeting source (Table 1).
    Accounts for our findings about the likelihood of a user to retweet (Fig 3a, 3b and 3c). Note, a local contagion can occur even when d > 1.

Table 4 presents the results of a logistic regression model fit with least squares, separated by tweet virality, where the outcome variable is local or global contagion, and local contagion is the base parameter. The ΔTo variable yields the greatest explanatory power on the contagion type of users for both viral and non-viral information.
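As a reading aid, a binary logistic model with local contagion as the base category and the three explanatory terms listed above can be written in the following general form. The coefficient symbols are notation introduced here for illustration and are not taken from Table 4.

```latex
\log \frac{P(\text{global})}{P(\text{local})}
  = \beta_0 + \beta_1 \, \Delta T_o + \beta_2 \, GC + \beta_3 \, d
```

Under this form, a positive coefficient raises the odds of global contagion relative to local contagion as the corresponding term grows, and a negative coefficient lowers them.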

Non-viral information. A significant negative coefficient of ΔTo indicates that the larger the time difference from the posting of an original tweet, the less likely a user is to be infected via global contagion and, thus, the more likely local contagion is.

Viral information. The opposite is found: the larger ΔTo, the more likely a user is to be infected by a global contagion. Viral information is persistent in the sense that repeated exposures to information are more likely to have contagion effects [2].

For both models, a significant negative coefficient of the term GC indicates that the more global contagions occur in a network, the less likely another global contagion will occur. This finding can be attributed to one’s reluctance to retweet information from users whom s/he does not know [62]. A single global contagion, e.g. of a user within a clique, can start a cascade of local contagions. Thus, if local contagion spreads faster than global contagion, it reduces the chance of a global contagion by reducing the number of non-adopters. We test this assumption in Section 6.4. Finally, a significant positive coefficient of the term d in both models indicates that the larger the distance from the user originating the tweet, the higher the probability of global contagion.

To estimate model performance, we used 70% of the data in each group (viral and non-viral) for training and the remaining 30% for testing. The performance of each model is summarized by the F1-score. Since an F1-score of 1.0 means perfect precision and recall, the closer the F1-score in Table 4 is to 1.0, the better the model performance.
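For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes it for a binary local/global labeling; the labels are hypothetical, and treating "global" as the positive class is an assumption made for this example.

```python
def f1_score(y_true, y_pred, positive="global"):
    """Harmonic mean of precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0                      # no true positives: precision and recall are zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical held-out labels and predictions (not data from the study).
y_true = ["global", "global", "local", "local", "global"]
y_pred = ["global", "local", "local", "global", "global"]
score = f1_score(y_true, y_pred)
```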

Since the time difference has the largest explanatory power (Table 4), we next present a temporal analysis of the contagion spread of information.

6.4 Temporal analysis of contagion spread

To analyze the patterns of contagion spread, we calculate the time difference between every two consecutive retweets, i.e. inter-retweet times of viral and non-viral tweets. For DS1, Fig 4a presents four empirical cumulative distribution functions (ECDFs) of inter-retweet times that correspond to each contagion type, local versus global, and virality: viral tweets versus non-viral tweets.
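A minimal sketch of the two quantities used here: inter-retweet times computed from consecutive retweet timestamps, and the ECDF evaluated at a given time. The timestamps are hypothetical and expressed in seconds; the function names are introduced for illustration only.

```python
def inter_retweet_times(timestamps):
    """Time differences between every two consecutive retweets."""
    ts = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ts, ts[1:])]

def ecdf(sample, x):
    """Fraction of observations in the sample that are less than or equal to x."""
    return sum(value <= x for value in sample) / len(sample)

# Hypothetical retweet timestamps of a single tweet, in seconds.
retweet_times = [0, 30, 45, 165, 3765]
gaps = inter_retweet_times(retweet_times)
share_within_two_minutes = ecdf(gaps, 120)
```

An ECDF curve that lies above another at a given time means that a larger share of inter-retweet times falls at or below that time, i.e. faster spread.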

Fig 4. ECDFs of inter-retweet times.

The time axis scale is transformed into a log scale.

https://doi.org/10.1371/journal.pone.0230811.g004

For DS1, the two ECDF curves (Fig 4a) for local and global contagion of viral tweets are located above the two ECDFs of non-viral tweets, showing that users are more likely to retweet viral information. Regardless of tweet virality, the local contagion curve is located above the global contagion curve (Fig 4a), showing for DS1 that a user is more likely to retweet information tweeted or retweeted by users whom s/he follows. Unlike DS1, however, the ECDF of global contagion for DS2 (Fig 4b) is located above the ECDF of local contagion up to Point P1, indicating that in DS2 local contagion spreads more slowly than global contagion: DS2 users retweeting due to global contagion respond more quickly than users infected by local contagion. At P1, 88% of the inter-retweet times of users were under 2 hours; afterward, the spread of global contagion is slightly slower than the spread of local contagion.

Since DS2 contains non-viral information, which is less likely to be covered by mainstream media, the Twitter Timeline algorithm likely promoted the information up to Point P1. However, since the algorithm constantly changes [33] and affects the spread of global contagion, it likely stopped promoting the information at Point P1.

Our findings show that Twitter users interact in bursts, in the sense that short periods during which they send several tweets are separated by long periods of reduced activity [60, 63]. For DS1, the results show that local contagion spreads faster than global contagion regardless of tweet virality. Furthermore, regardless of whether it is local or global, contagion of non-viral information spreads faster in DS2 than in DS1 (Table 5).

Table 5. Time when 80% of inter-retweet times of local and global contagion are reached, considering non-viral information.

https://doi.org/10.1371/journal.pone.0230811.t005

The nature of local contagion may explain our findings. Local contagion brings information to a user through neighbors whom s/he trusts, whereas global contagion can be promoted by algorithms. Another explanation involves the nature of the datasets. DS2 contains tweets written in English. Thus, many users worldwide can understand and retweet them as, for example, when external regimes interfere with U.S. politics [64] or in discussions about movies, TV shows, and sports [2]. The high engagement of users with different topics is expressed by short inter-retweet times and calls for a deeper analysis of topic-contagion spread.

6.5 Contagion spread of a topic

This section aims to detect and measure the reach depth of local and global contagion for viral and non-viral topics. Each topic contagion event W in DS3 can contain several original tweets wμ ∈ W about the Higgs boson particle.

Viral events. The percentage of global topic-contagion of viral events (16.9%) in DS3 is lower than the percentage of global contagion of viral original tweets (21.75%) in DS1 (Table 2). In terms of the contagion spreading distance from the tweeting source (d) of viral events (Fig 3e), DS3 presents similar trends to the contagion spread of original tweets in DS1 (Fig 3a), as summarized in Table 3. In DS3, for d ≤ 17, d ≠ 10, if a path exists on G, the more distant a user is from the user originating the information (source User v0), the less likely s/he is to retweet wμ ∈ W.

Both DS1 and DS3 present a shift in trend for d = 10. In DS1, a shift is observed for d ∈ [10, 12], while in DS3 a shift is observed only for d = 10. This trend shift might be attributed to global contagion. Also, local topic-contagion spread of viral events (DS3) and local contagion spread of viral original tweets (DS1) are the most frequent (d = 1).

Considering the distance from the closest adopter (dca in Table 1), viral global topic-contagions in DS3 (Fig 3e) present similar trends (for dca ≤ 9) to those of viral global events in DS1 for dca ≤ 8 (Fig 3a). The larger dca, the less likely a user is to retweet wμ ∈ W in DS3 and, similarly, the less likely a user is to retweet an original tweet w in DS1. This trend is reversed for dca > 9 in DS3 and for dca > 8 in DS1.

Non-viral events. For non-viral events, the percentage of global topic-contagion (Table 2) in DS3 (12.8%) is similar to that in DS1 (12.14%), but smaller than in DS2 (18.29%). The trends of the contagion-spreading distances (d) for non-viral events in DS1 (Fig 3b), in DS2 (Fig 3d), and in DS3 (Fig 3f) are similar. For d ≤ 13 in DS3, similarly to DS1 and DS2 (Table 3), if a path exists on G, the more distant a user is from the source User v0, the less likely s/he is to retweet wμ ∈ W. We also observe that local contagion events (d = 1) are the most frequent ones for non-viral events in DS1, DS2, and DS3. Considering the distance from the closest adopter (dca), global contagion spread presents similar trends in all three datasets (Fig 3b, 3d and 3f). In DS1 and DS3 for dca ≤ 7, the closer a user is to an adopter, the more likely that user is to retweet, while in DS2 the findings are similar for dca ≤ 6.

In all three datasets, local contagion is the most frequent contagion mechanism. The results show similar contagion spreading trends of (i) Viral information and viral topics, and (ii) Non-viral information and non-viral topics. In Fig 3, a contagion at d > 1 does not necessarily indicate node-to-node contagion spread and might be attributed to global contagion as well.

Based on the results of the data analysis in the present section, we test the three hypotheses of this study (Section 5).

7 Hypotheses testing

Building on the results of the previous section, the present section reports the findings of hypothesis testing. Section 7.1 presents the testing of H1, based on the temporal spreading patterns of local and global contagion. Section 7.2 presents the testing of H2, using the findings about local and global contagion spread while, at the same time, considering information virality by using the BIT approach developed in Section 3.4. Finally, Section 7.3 presents the testing of H3 regarding the similarity between the contagion spread patterns of a topic and those of particular tweets.

7.1 H1 testing

To test H1 and reveal whether local and global contagion have different temporal spreading patterns, we compare the ECDFs in Fig 4 above by using the Kolmogorov-Smirnov (KS) D-statistic test [18, 65]. The D-statistic is defined as the maximum distance: D = max(|F1(x) − F2(x)|), where x represents the range of the random variable, and F1 and F2 represent the empirical cumulative distribution functions. The smaller the distance, the more similar the distribution curves and, hence, the more likely the two samples are to come from the same distribution. In the KS-test, a p-value < 0.05 indicates that the samples are not drawn from the same distribution.
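The D-statistic defined above can be computed directly from two samples by evaluating both ECDFs at every observed value and keeping the largest gap. The sketch below uses hypothetical inter-retweet times and covers only the statistic, not the p-value; in practice a statistics package (for example, `scipy.stats.ks_2samp`) would typically supply both.

```python
from bisect import bisect_right

def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: D = max over x of |F1(x) - F2(x)|."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d_max = 0.0
    # The ECDFs only change at observed values, so checking those points suffices.
    for x in sorted(set(s1) | set(s2)):
        f1 = bisect_right(s1, x) / n1   # share of sample1 at or below x
        f2 = bisect_right(s2, x) / n2   # share of sample2 at or below x
        d_max = max(d_max, abs(f1 - f2))
    return d_max

# Hypothetical inter-retweet times (seconds) for local and global contagion.
local_gaps = [10, 20, 30, 40]
global_gaps = [30, 40, 50, 60]
D = ks_statistic(local_gaps, global_gaps)
```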

For DS1, we found support for H1 via four KS-tests for pairs of ECDFs (Fig 4a) and found, as elaborated upon in Table 6, significant differences between the ECDFs (with varying D-statistics and every p-value ≤ 2.2 × 10⁻¹⁶). Thus, we found the temporal spreading patterns of local and global contagion to be significantly different. Also, for DS2, similar to DS1, we found support for H1 via a single KS-test between the ECDFs of local and global contagions (Fig 4b). The KS-test revealed that the two ECDFs are significantly different (D-statistic 0.56, p-value ≤ 2.2 × 10⁻¹⁶), thus uncovering different patterns of user behavior. In sum, the KS-tests for both DS1 and DS2 support H1 by revealing that local and global contagion have significantly different temporal spreading patterns.

Aiming to go beyond examining and explaining local versus global contagions, we next apply the back-in-time (BIT) approach for early detection of viral posts before they become viral.

7.2 H2 testing

Next, to test H2 and reveal whether viral and non-viral information have different local and global contagion spreading patterns, we use the BIT approach on DS1, since DS1 contains both the viral information needed for applying the BIT approach and non-viral information.

Applying the developed BIT approach to DS1, in Step i we created a KDE to estimate the probability density function of the retweet-count based on 1,950 non-viral tweets (Table 2). In Step ii, since the number of original viral tweets was 127 (Table 2), we sampled 127 numbers from the KDE and randomly assigned a BIT retweet-count y to each of the 127 viral tweets. In Step iii, we rolled back in time each of the 127 viral tweets to a point where its retweet-count was y. We repeated Steps ii to iii 100 times. Fig 5 presents for DS1 the retweet-count distribution of non-viral information, the KDE, and the distribution of the KDE samples, i.e. the distribution of the retweet-count of BIT tweets obtained in Step ii.
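Steps i to iii can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the Gaussian kernel, the bandwidth value, the rounding and clamping of sampled counts to the observed non-viral range, and all data values are hypothetical choices made for this example.

```python
import random

def sample_kde(observations, bandwidth, rng):
    """Draw one value from a Gaussian KDE: pick a data point, then add Gaussian noise."""
    center = rng.choice(observations)
    return rng.gauss(center, bandwidth)

def bit_retweet_counts(non_viral_counts, n_viral, bandwidth, rng):
    """Step ii: sample one BIT retweet-count y per viral tweet."""
    low, high = min(non_viral_counts), max(non_viral_counts)
    counts = []
    for _ in range(n_viral):
        y = round(sample_kde(non_viral_counts, bandwidth, rng))
        counts.append(min(max(y, low), high))   # assumption: clamp to the observed range
    return counts

def roll_back(retweet_times, y):
    """Step iii: keep only the first y retweets of a viral tweet."""
    return sorted(retweet_times)[:y]

rng = random.Random(0)                               # fixed seed for repeatability
non_viral_counts = [12, 15, 20, 33, 47, 60, 85]      # hypothetical retweet-counts
viral_cascades = [list(range(0, 500)),               # hypothetical retweet timestamps
                  list(range(0, 1200, 2))]
ys = bit_retweet_counts(non_viral_counts, len(viral_cascades), bandwidth=5.0, rng=rng)
bit_tweets = [roll_back(cascade, y) for cascade, y in zip(viral_cascades, ys)]
```

Repeating the sampling and roll-back with fresh draws corresponds to the repetition of Steps ii to iii described above.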

Fig 5. Retweet-count distribution of BIT and non-viral tweets.

https://doi.org/10.1371/journal.pone.0230811.g005

The BIT-tweets are viral tweets that were rolled back in time to a point when their retweet-count was 10 to 99, the same as the retweet-count for non-viral tweets. We then assigned to each BIT tweet a contagion type. Fig 4c above presents two local and global curves for the BIT tweets and two local and global curves for the non-viral tweets. As depicted in Fig 4c, although the BIT tweets have a smaller retweet count compared to viral tweets, their contagion spread patterns are similar to those of viral tweets and the ECDFs of BIT tweets are located above the ECDFs of non-viral tweets. In addition, local contagion is located above global contagion for both non-viral tweets and BIT tweets.

Support for H2 was found via four KS-tests by analyzing the four curves in Fig 4c based on DS1 for: local non-viral versus local BIT tweets, global non-viral versus global BIT tweets, global non-viral versus local non-viral tweets, and global BIT tweets versus local BIT tweets. As elaborated upon in Table 6, all ECDFs are significantly different from one another (with varying D-statistics and every p-value ≤ 2.2 × 10⁻¹⁶). Finding support for H2 suggests that both the global and the local contagion spreading patterns of the BIT tweets significantly differ from the patterns of non-viral tweets.

7.3 H3 testing

Based on DS1 and DS2 analyses, H1 and H2 testing focused on contagion spread of a single original tweet. However, contagion of an idea or a topic is not limited to a single tweet and can spread by a sequence of tweets discussing the same topic, as in the DS3 dataset, which is devoted to the announcement about the discovery of the Higgs boson particle.

To test H3 and reveal whether topic contagion has temporal spreading patterns similar to those of information contagion, we studied the inter-retweet times of topic contagion spread. Fig 4d presents four ECDFs of inter-retweet times, one for each combination of contagion type (local versus global) and event virality (viral versus non-viral). Similar to the inter-retweet patterns of viral original tweets in DS1 (Fig 4a), Fig 4d shows for DS3 that the two ECDFs for local and global viral events are located above the respective ECDFs for non-viral events. Similar to the inter-retweet patterns of non-viral original tweets in DS1 (Fig 4a), Fig 4d also shows for DS3 that the global contagion curves are located below the local contagion curves up to Point P2. After P2, the global contagion curves slightly exceed the local contagion curves. In DS3, user interaction occurs before, during, and after the announcement of the Higgs boson discovery, and different time-varying dynamics of user activities were found in these three periods [53]. The popular media’s coverage of the events toward the last period of the data collection most likely led to global contagion [53]. In both DS2 and DS3, as depicted in Fig 4b and 4d respectively, the ECDFs for local and global contagion of non-viral events alternate their relative location, at P1 and P2 respectively. For non-viral events, this alternation implies that global contagion spread is sensitive to exposure by global contagion sources such as the mainstream media. As in DS1, and as elaborated in Table 6 for DS3, we found support for H3 by conducting four KS-tests. The results revealed that the four ECDF curves are significantly different (with varying D-statistics and every p-value ≤ 2.2 × 10⁻¹⁶).

8 Conclusions

Most contagion research is bound by modeling local contagion and the spread of viral information only. This study analyzed and compared user behavior in three Twitter datasets and found, contrary to the common assumption that contagion diffuses from node-to-node, that contagion in OSNs also spreads globally beyond social network links. Our finding of significant differences in the spread of global versus local contagion of viral and non-viral information implies different mechanisms of contagion. We also found that the contagion spread of a topic presents similar patterns to the contagion spread of a single information nugget (original tweet). Regarding the spread of global contagion of non-viral events (original tweets or topics), we found that the larger the distance of a user from the closest adopter, the less likely s/he is to be globally infected. For all three datasets, this trend is reversed approximately at a distance of 7 from the closest adopter and at a distance of 8 from the closest adopter for global contagion of viral events. These findings can be explained by user reluctance to search for information via user-to-user page crawling at these distances. Thus, contagion more likely occurs due to content promoted by external sources that facilitate jumps between content pages, like content recommendation algorithms, mass media, or searching for information in the Twitter search box.

We also analyzed the temporal retweeting activity of users and found that viral information spreads faster than non-viral information. In addition, the patterns of local and global contagion of non-viral information depend on the nature of the analyzed dataset, probably because non-viral content is less likely to be covered by mainstream media and is less appealing to retweet. Therefore, its contagion spread depends on content-promoting algorithms such as the Twitter Timeline.

KS-tests revealed that local and global contagion have significantly different spreading patterns, supporting H1. Analysis of inter-retweet times revealed significant differences between the spreading patterns of non-viral and back-in-time tweets, supporting H2. In support of H3, we found that the contagion spreading patterns of viral and non-viral topic contagion events are similar to the contagion spreading patterns of viral and non-viral information. Beyond the contributions of this work via hypothesis testing, another innovative contribution is the development of the back-in-time (BIT) approach, used to analyze the spreading patterns of viral tweets at a point back in time when they had the retweet-count of non-viral tweets.

One limitation of this study is that the size of the Twitter Following-list might have changed during the data-collection period, possibly influencing user exposure to information. To test the change in size of a user’s Following-list, we recollected, two weeks after the initial collection date, the Following-lists of 538 randomly selected users from DS2. Our finding that 80% of users had an absolute difference between their Following-lists of less than 53 reveals that most users had a minor change. Another limitation is that some tweets can become viral after the collection date. Therefore, non-viral tweets were queried using the Twitter API at least a month after the date they were posted, verifying that their retweet counts did not grow to be viral.

Whereas most studies focus on user-to-user contagion, our focus is on how viral and non-viral contagion spread. Since most content is non-viral, its analysis makes a valuable contribution to OSN research. The novelty of this study is in modeling user behavior in an OSN, accounting for local and global effects, and in developing the BIT approach.

References

  1. 1. Hodas NO, Lerman K. The simple rules of social contagion. Scientific reports. 2014;4:4343. pmid:24614301
  2. 2. Romero DM, Meeder B, Kleinberg J. Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter. In: Proceedings of the 20th international conference on World wide web. ACM; 2011. p. 695–704.
  3. 3. O’Reilly T, Milstein S. The twitter book. “O’Reilly Media, Inc.”; 2011.
  4. 4. Bartal A, Pliskin N, Ravid G. Modeling influence on posting engagement in online social networks: Beyond neighborhood effects. Social Networks. 2019;59(1):61–76.
  5. 5. Bakshy E, Messing S, Adamic LA. Exposure to ideologically diverse news and opinion on Facebook. Science. 2015;348(6239):1130–1132. pmid:25953820
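  6. 6. Myers SA, Zhu C, Leskovec J. Information diffusion and external influence in networks. In: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2012. p. 33–41.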
  7. 7. Cha M, Haddadi H, Benevenuto F, Gummadi PK. Measuring user influence in twitter: The million follower fallacy. Icwsm. 2010;10(10-17):30.
  8. 8. Jain A, Lupfer N, Qu Y, Linder R, Kerne A, Smith SM. Evaluating tweetbubble with ideation metrics of exploratory browsing. In: Proceedings of the 2015 ACM SIGCHI Conference on Creativity and Cognition. ACM; 2015. p. 53–62.
  9. 9. Pariser E. The filter bubble: What the Internet is hiding from you. Penguin UK; 2011.
  10. 10. Buettner R. A framework for recommender systems in online social network recruiting: An interdisciplinary call to arms. In: 2014 47th Hawaii International Conference on System Sciences. IEEE; 2014. p. 1415–1424.
  11. 11. Leskovec J, Backstrom L, Kleinberg J. Meme-tracking and the dynamics of the news cycle. In: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2009. p. 497–506.
  12. 12. Tsur O, Rappoport A. Don’t Let Me Be #Misunderstood: Linguistically Motivated Algorithm for Predicting the Popularity of Textual Memes. In: International AAAI Conference on Weblogs and Social Media; 2015.
  13. 13. Wakamiya S, Kawai Y, Aramaki E. Twitter-based influenza detection after flu peak via tweets with indirect information: text mining study. JMIR public health and surveillance. 2018;4(3):e65. pmid:30274968
  14. 14. Gleeson JP, Durrett R. Temporal profiles of avalanches on networks. Nature communications. 2017;8(1):1227. pmid:29089481
  15. 15. Goel S, Anderson A, Hofman J, Watts DJ. The structural virality of online diffusion. Management Science. 2015;62(1):180–196.
  16. 16. Kramer AD, Guillory JE, Hancock JT. Experimental evidence of massive-scale emotional contagion through social networks. Proceedings of the National Academy of Sciences. 2014;111(24):8788–8790.
  17. 17. Dow PA, Adamic LA, Friggeri A. The anatomy of large facebook cascades. In: Seventh international AAAI conference on weblogs and social media; 2013.
  18. 18. Cheng J, Adamic L, Dow PA, Kleinberg JM, Leskovec J. Can cascades be predicted? In: Proceedings of the 23rd international conference on World wide web. ACM; 2014. p. 925–936.
  19. 19. Weng L, Flammini A, Vespignani A, Menczer F. Competition among memes in a world with limited attention. Scientific reports. 2012;2:335. pmid:22461971
  20. 20. Subbian K, Prakash BA, Adamic L. Detecting large reshare cascades in social networks. In: Proceedings of the 26th International Conference on World Wide Web. International World Wide Web Conferences Steering Committee; 2017. p. 597–605.
  21. 21. Karsai M, Iniguez G, Kaski K, Kertész J. Complex contagion process in spreading of online innovation. Journal of The Royal Society Interface. 2014;11(101):20140694.
  22. 22. Min B, San Miguel M. Competing contagion processes: Complex contagion triggered by simple contagion. Scientific reports. 2018;8(1):10422. pmid:29991815
  23. 23. Mønsted B, Sapieżyński P, Ferrara E, Lehmann S. Evidence of complex contagion of information in social media: An experiment using Twitter bots. PloS one. 2017;12(9):e0184148. pmid:28937984
  24. 24. Bakshy E, Rosenn I, Marlow C, Adamic L. The role of social networks in information diffusion. In: Proceedings of the 21st international conference on World Wide Web. ACM; 2012. p. 519–528.
  25. 25. Leskovec J, Adamic LA, Huberman BA. The dynamics of viral marketing. ACM Transactions on the Web (TWEB). 2007;1(1):5.
  26. 26. Sun E, Rosenn I, Marlow CA, Lento TM. Gesundheit! modeling contagion through facebook news feed. In: Third international AAAI conference on weblogs and social media; 2009.
  27. 27. Bakshy E, Karrer B, Adamic LA. Social influence and the diffusion of user-created content. In: Proceedings of the 10th ACM conference on Electronic commerce. ACM; 2009. p. 325–334.
  28. 28. Chen W, Yuan Y, Zhang L. Scalable influence maximization in social networks under the linear threshold model. In: 10th International Conference on Data Mining (ICDM). IEEE; 2010. p. 88–97.
  29. 29. Kleinberg J. Cascading behavior in networks: Algorithmic and economic issues. Algorithmic game theory. 2007;24:613–632.
  30. 30. Katz E. The two-step flow of communication: An up-to-date report on an hypothesis. Public opinion quarterly. 1957;21(1):61–78.
  31. 31. Bartal A, Ravid G. Member Behavior in Dynamic Online Communities: Role Affiliation Frequency Model. IEEE Transactions on Knowledge and Data Engineering. 2019.
  32. 32. Chang H. A new perspective on Twitter hashtag use: Diffusion of innovation theory. Proceedings of the Association for Information Science and Technology. 2010;47(1):1–4.
  33. 33. Koumchatzky N, Andryeyev A. Using Deep Learning at Scale in Twitter’s Timelines; 2017. https://blog.twitter.com/engineering/en_us/topics/insights/2017/using-deeplearning-at-scale-in-twitters-timelines.html.
  34. 34. Richterich A. ‘Karma, precious karma!’Karmawhoring on Reddit and the Front Page’s econometrisation. Journal of Peer Production, 4(1). 2014.
  35. 35. Wang F, Wang H, Xu K. Diffusive logistic model towards predicting information diffusion in online social networks. In: 2012 32nd International Conference on Distributed Computing Systems Workshops. IEEE; 2012. p. 133–139.
  36. Bailey NT, et al. The mathematical theory of infectious diseases and its applications. Charles Griffin & Company Ltd, 5a Crendon Street, High Wycombe, Bucks HP13 6LE; 1975.
  37. Guille A, Hacid H, Favre C, Zighed DA. Information Diffusion in Online Social Networks: A Survey. ACM SIGMOD Record. 2013;42(2):17–28.
  38. Yang J, Leskovec J. Modeling information diffusion in implicit networks. In: 2010 IEEE International Conference on Data Mining. IEEE; 2010. p. 599–608.
  39. Wang W, Zhou H, He K, Hopcroft JE. Learning Latent Topics from the Word Co-occurrence Network. In: National Conference of Theoretical Computer Science. Springer; 2017. p. 18–30.
  40. Tsur O, Rappoport A. What’s in a hashtag?: content based prediction of the spread of ideas in microblogging communities. In: Proceedings of the fifth ACM international conference on Web search and data mining. ACM; 2012. p. 643–652.
  41. Leskovec J, McGlohon M, Faloutsos C, Glance N, Hurst M. Patterns of cascading behavior in large blog graphs. In: Proceedings of the 2007 SIAM international conference on data mining. SIAM; 2007. p. 551–556.
  42. Liben-Nowell D, Kleinberg J. Tracing information flow on a global scale using Internet chain-letter data. Proceedings of the National Academy of Sciences. 2008;105(12):4633–4638.
  43. Yang J, Leskovec J. Patterns of temporal variation in online media. In: Proceedings of the fourth ACM international conference on Web search and data mining. ACM; 2011. p. 177–186.
  44. Cui P, Jin S, Yu L, Wang F, Zhu W, Yang S. Cascading outbreak prediction in networks: a data-driven approach. In: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2013. p. 901–909.
  45. Subbian K, Aggarwal C, Srivastava J. Content-centric flow mining for influence analysis in social streams. In: Proceedings of the 22nd ACM international conference on Information & Knowledge Management. ACM; 2013. p. 841–846.
  46. Yu L, Cui P, Wang F, Song C, Yang S. From micro to macro: Uncovering and predicting information cascading process with behavioral dynamics. In: 2015 IEEE International Conference on Data Mining. IEEE; 2015. p. 559–568.
  47. Wang S, Yan Z, Hu X, Yu PS, Li Z. Burst time prediction in cascades. In: Twenty-Ninth AAAI Conference on Artificial Intelligence; 2015.
  48. Nahon K, Hemsley J, Walker S, Hussain M. Fifteen minutes of fame: The power of blogs in the lifecycle of viral political information. Policy & Internet. 2011;3(1):1–28.
  49. Bild DR, Liu Y, Dick RP, Mao ZM, Wallach DS. Aggregate characterization of user behavior in Twitter and analysis of the retweet graph. ACM Transactions on Internet Technology (TOIT). 2015;15(1):4.
  50. Crane R, Sornette D. Robust dynamic classes revealed by measuring the response function of a social system. Proceedings of the National Academy of Sciences. 2008;105(41):15649–15653.
  51. Lorenz-Spreen P, Mønsted BM, Hövel P, Lehmann S. Accelerating dynamics of collective attention. Nature communications. 2019;10(1):1759. pmid:30988286
  52. Mahmud J, Nichols J, Drews C. Where is this tweet from? inferring home locations of Twitter users. In: Sixth International AAAI Conference on Weblogs and Social Media; 2012.
  53. De Domenico M, Lima A, Mougel P, Musolesi M. The anatomy of a scientific rumor. Scientific reports. 2013;3:2980.
  54. McPherson M, Smith-Lovin L, Cook JM. Birds of a feather: Homophily in social networks. Annual review of sociology. 2001;27(1):415–444.
  55. Cardoso FM, Meloni S, Santanche A, Moreno Y. Topical homophily in online social systems. arXiv preprint arXiv:1707.06525. 2017.
  56. Zhang ZK, Liu C, Zhan XX, Lu X, Zhang CX, Zhang YC. Dynamics of information diffusion and its applications on complex networks. Physics Reports. 2016;651:1–34.
  57. Newman MEJ. The structure and function of complex networks. SIAM review. 2003;45(2):167–256.
  58. Stewart AJ, Mosleh M, Diakonova M, Arechar AA, Rand DG, Plotkin JB. Information gerrymandering and undemocratic decisions. Nature. 2019;573(7772):117–121. pmid:31485058
  59. Barabási AL. The origin of bursts and heavy tails in human dynamics. Nature. 2005;435(7039):207. pmid:15889093
  60. Karsai M, Kaski K, Barabási AL, Kertész J. Universal features of correlated bursty behaviour. Scientific reports. 2012;2:397. pmid:22563526
  61. Karimi F, Holme P. Threshold model of cascades in empirical temporal networks. Physica A: Statistical Mechanics and its Applications. 2013;392(16):3476–3483.
  62. Fischer P, Krueger JI, Greitemeyer T, Vogrincic C, Kastenmüller A, Frey D, et al. The bystander-effect: A meta-analytic review on bystander intervention in dangerous and non-dangerous emergencies. Psychological bulletin. 2011;137(4):517. pmid:21534650
  63. Kalman YM, Ravid G, Raban DR, Rafaeli S. Pauses and response latencies: A chronemic analysis of asynchronous CMC. Journal of Computer-Mediated Communication. 2006;12(1):1–23.
  64. Badawy A, Ferrara E, Lerman K. Analyzing the digital traces of political manipulation: The 2016 Russian interference Twitter campaign. In: 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). IEEE; 2018. p. 258–265.
  65. Leskovec J, Faloutsos C. Sampling from large graphs. In: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM; 2006. p. 631–636.