Despite efforts by major platforms to limit its spread, copies of the widely debunked conspiracy video “Plandemic” continued to multiply and spread largely through niche online conspiracy communities in early May 2020.
The DFRLab used the CrowdTangle API and an R package called CooRNet, developed by Fabio Giglietto, Nicola Righetti, and Luca Rossi, to track the spread of the viral conspiracy through hundreds of Facebook groups.
This document walks through the data analysis portion of the research, providing reproducible code for the key visualizations.
The first step was to get a dataset from CrowdTangle of posts promoting the Plandemic conspiracy that were shared to public Facebook groups and also contained URLs. The goal here was to capture posts that linked either to a copy of the video hosted off Facebook, such as on YouTube or dedicated domains, or to other content that furthered the conspiracy (blog posts, op-eds, etc.).
We created a search for posts containing the Plandemic video in CrowdTangle. Then, we used the CrowdTangle Historical Data feature to get all of the posts from the saved search that contained links and were posted between May 3 and May 10, 2020.
CooRNet is an R package that detects “coordinated link sharing behavior,” which it defines as when public Facebook entities, such as pages and groups, repeatedly share the same links within an unusually short period of time from each other.
What constitutes an “unusually short period” of time is defined by the “coordination interval,” which CooRNet calculates algorithmically. The rationale for using this measure as a proxy for coordination is that different Facebook entities would be unlikely to share the same links as one another within that unusually short period of time on a repeated basis.
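To make the definition concrete, here is a minimal base R sketch of the idea on toy data. It is not CooRNet's implementation: the data frame, the 14-second interval, and the "at least twice" threshold are all assumptions chosen for illustration.

```r
# Toy data (hypothetical): each row is one share of a URL by a Facebook entity
shares <- data.frame(
  url    = c("a", "a", "a", "b", "b", "c", "c"),
  entity = c("G1", "G2", "G3", "G1", "G2", "G1", "G3"),
  time   = as.POSIXct("2020-05-05 12:00:00", tz = "UTC") +
             c(0, 5, 600, 0, 8, 0, 900),
  stringsAsFactors = FALSE
)
coordination_interval <- 14  # seconds; assumed here for illustration

# Pair every share with every other share of the same URL
pairs <- merge(shares, shares, by = "url", suffixes = c("_1", "_2"))
# Keep each unordered pair of distinct entities once
pairs <- pairs[pairs$entity_1 < pairs$entity_2, ]
# Time gap between the two shares, in seconds
pairs$gap <- abs(as.numeric(difftime(pairs$time_1, pairs$time_2, units = "secs")))
# Shares of the same URL that fall within the interval
rapid <- pairs[pairs$gap <= coordination_interval, ]

# Count how often each pair of entities shared a URL this quickly
rapid$pair <- paste(rapid$entity_1, rapid$entity_2, sep = "-")
rapid_counts <- table(rapid$pair)
# Pairs that did so repeatedly (assumed threshold: at least twice)
repeated_pairs <- names(rapid_counts)[rapid_counts >= 2]
repeated_pairs
```

In this toy set, only G1 and G2 share two different URLs within 14 seconds of each other, so they are the only pair flagged. A single fast share is not enough; the repetition is what makes the pattern notable.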
In this analysis, we were not as interested in capturing coordination as we were in mapping the rapid spread of the Plandemic conspiracy through Facebook groups. The analysis below, therefore, is not necessarily evidence of coordination on the part of the disparate Facebook assets; rather, it suggests a pattern of rapid link-sharing related to Plandemic throughout hundreds of different conspiracy communities, demonstrating the conspiracy’s crossover appeal and the shared dynamics among these communities.
We started by following the tutorial available on the CooRNet site to extract a list of entities engaged in rapid link sharing. NOTE: This series of steps, especially the call to get_ctshares, takes a long time because it queries the CrowdTangle API. To speed it up, you can request a rate limit increase from CrowdTangle using this form and include the sleep_time = 1 parameter in get_ctshares to reduce the sleep time between calls.
# From the tutorial:
library(CooRNet)
urls <- get_urls_from_ct_histdata(ct_histdata_csv="/Users/zkharazian/Downloads/2020-05-09-21-40-09-EDT-Historical-Report-plandemic-2020-05-03--2020-05-10.csv")
ctshares <- get_ctshares(urls, "url", "date", sleep_time = 1, clean_urls = TRUE)
output <- get_coord_shares(ctshares, parallel = TRUE, clean_urls = TRUE, keep_ourl_only = TRUE)
get_outputs(output, ct_shares_marked.df = TRUE, highly_connected_g = TRUE, highly_connected_coordinated_entities = TRUE)
We now have a dataframe, highly_connected_coordinated_entities, of entities that repeatedly shared the same URLs within the coordination interval.
Next, we display the top 50 entities, sorted by coord.shares, in an inline table:
# Load DT package for displaying inline tables
library(DT)
# Display inline table of top 50 Facebook groups identified by CooRNet, sorted by coord.shares
datatable(head(highly_connected_coordinated_entities_names, 50), options = list(order = list(list(3, 'desc'))))
But what is the threshold that defines a rapid link share for these highly connected entities? To determine that, we ran the estimate_coord_interval function in CooRNet.
cord_int<-estimate_coord_interval(ctshares, q=0.1, p=0.5)
cord_int
## [[1]]
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 9 14 7399 4251 103950
##
## [[2]]
## [1] "14 secs"
This returned a coordination interval of 14 seconds. A link share between two groups that occurred within 14 seconds is defined as unusually rapid, relative to the rest of the dataset.
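As a rough intuition for how such an interval can be derived, the sketch below follows our reading of the two parameters: p as the fraction of a URL's total shares to be reached, and q as the fraction of quickest URLs to keep. It is a simplified illustration on hypothetical toy data, not the code CooRNet actually runs, and its details may differ from the package.

```r
# Hypothetical toy data: for each URL, the seconds after its first share
# at which each of its shares occurred
share_times <- list(
  url_a = c(0, 4, 10, 3000, 9000),
  url_b = c(0, 20, 30, 40),
  url_c = c(0, 600, 1200, 86400),
  url_d = c(0, 2, 6, 50000)
)
p <- 0.5  # fraction of a URL's total shares to reach
q <- 0.5  # fraction of quickest URLs to keep (the analysis used 0.1;
          # a larger value is assumed here because the toy set is tiny)

# Seconds each URL needed to reach a fraction p of its total shares
time_to_p <- sapply(share_times, function(secs) {
  secs <- sort(secs)
  secs[ceiling(p * length(secs))]
})

# Keep the quickest fraction q of URLs and take the median of their times
n_keep <- ceiling(q * length(time_to_p))
quickest <- sort(time_to_p)[seq_len(n_keep)]
toy_interval <- median(quickest)
toy_interval
```

The point of the sketch is that the threshold is relative: it comes from the fastest-spreading links in the dataset itself rather than from a fixed number chosen in advance.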
One of our outputs, highly_connected_g, was a large igraph object representing a network. The next step was to prepare this network for analysis in Gephi. We obtained summary statistics for degree. In the study of networks, degree is the number of connections a node has to other nodes. For the purposes of our data, nodes were individual Facebook groups, and connections were shares of URLs.
# V() comes from igraph, so the package must be loaded first
library(igraph)
summary(V(highly_connected_g)$degree)
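For readers new to the concept, degree can be computed by hand from an edge list. The following base R sketch uses a hypothetical four-group network; the group names and edges are invented for illustration.

```r
# Hypothetical edge list: each row links two groups that shared the same URL
edges <- data.frame(
  from = c("Group A", "Group A", "Group A", "Group B"),
  to   = c("Group B", "Group C", "Group D", "Group C"),
  stringsAsFactors = FALSE
)

# A node's degree is the number of edges touching it, counted from either end
node_degree <- sort(table(c(edges$from, edges$to)), decreasing = TRUE)
node_degree

# Summary statistics of the degree distribution
summary(as.vector(node_degree))
```

Here Group A has the highest degree because it is connected to all three other groups, while Group D, with a single connection, has the lowest.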
To make the graph less cluttered, we filtered it by deleting all vertices with a degree of less than 1000. This left us with only the most connected Facebook groups.
library(igraph)
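# Drop vertices whose stored degree attribute is below 1000
# (delete_vertices() is the current name for delete.vertices())
g <- delete_vertices(highly_connected_g, V(highly_connected_g)[degree < 1000])
# Export the graph as a GraphML file for Gephi
# (write_graph() is the current name for write.graph())
write_graph(g, file = "g.graphml", format = "graphml")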
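One detail of this filter is easy to miss: it tests the degree value stored as a vertex attribute, which is not recomputed after vertices are removed. The sketch below shows this on a hypothetical toy network with a threshold of 2; the names toy_g, toy_filtered, and out_file are invented for illustration.

```r
library(igraph)

# Hypothetical toy network: a hub linked to four groups, plus one extra edge
edges <- data.frame(
  from = c("Hub", "Hub", "Hub", "Hub", "B"),
  to   = c("B", "C", "D", "E", "C")
)
toy_g <- graph_from_data_frame(edges, directed = FALSE)

# Store degree as a vertex attribute, mirroring the attribute used in the filter
V(toy_g)$degree <- degree(toy_g)

# Drop vertices whose stored degree is below the threshold
# (2 here; the analysis used 1000)
toy_filtered <- delete_vertices(toy_g, V(toy_g)[V(toy_g)$degree < 2])

# The attribute keeps its pre-filter value, while the degree measured
# inside the filtered graph can be lower
data.frame(
  name            = V(toy_filtered)$name,
  stored_degree   = V(toy_filtered)$degree,
  filtered_degree = as.vector(degree(toy_filtered))
)

# Write to a temporary file so the sketch leaves nothing behind
out_file <- tempfile(fileext = ".graphml")
write_graph(toy_filtered, file = out_file, format = "graphml")
```

The hub keeps a stored degree of 4 even though only two of its neighbors survive the filter. The same applies in Gephi: a degree column carried over from R reflects the full network, not the filtered one.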
Our work was done in R for now, and we were ready to move to Gephi.
After exporting the graphml file from R, we imported it into Gephi. The result was something that looked like this:
We turned this network into a meaningful visualization through a series of steps.
First, in the Data Laboratory, we copied the data from the account.name column to Label.
Back in Overview, we started editing the visualization. We sized the nodes by coordinated.share. Groups with a larger number of coordinated shares were represented by larger nodes.
Next, we colored the nodes by degree, an option under the “Ranking” tab.
We then colored the edges by weight, also under “Ranking.” Edges with a higher weight, meaning more shares between the two groups, appear in a darker shade.
After running the Force Atlas 2 algorithm with Scaling = 200, we got this:
And then immediately after, we ran the Noverlap algorithm:
Toggling on node labels and running the Label Adjust algorithm produced a graph like this:
We were then ready to render the graph in Preview. Here are our settings:
Finally, we exported the graph as a PNG. You can do further editing, such as adding a title and annotations, in Inkscape or Adobe Illustrator.