Despite efforts by major platforms to limit its spread, copies of the widely debunked conspiracy video “Plandemic” continued to multiply and spread largely through niche online conspiracy communities in early May 2020.
The DFRLab used the CrowdTangle API and an R package called CooRNet, developed by Fabio Giglietto, Nicola Righetti, and Luca Rossi, to track the spread of the viral conspiracy through hundreds of Facebook groups.
This document walks through the data analysis portion of the research, providing reproducible code for the key visualizations.
The first step was to get a dataset from CrowdTangle of posts promoting the Plandemic conspiracy that were shared to public Facebook groups and also contained URLs. The goal here was to capture posts that linked either to a copy of the video hosted off Facebook, such as on YouTube or dedicated domains, or to other content that furthered the conspiracy (blog posts, op-eds, etc.).
We created a search for posts containing the Plandemic video in CrowdTangle. Then, we used the CrowdTangle Historical Data feature to get all of the posts from the saved search that contained links and were posted between May 3 and May 10, 2020.
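To make the filtering step concrete, here is a minimal sketch in base R of narrowing an export to posts that contain a link and fall inside the study window. The data frame and its column names (`group`, `link`, `created`) are hypothetical stand-ins, not the actual CrowdTangle export schema, and the UTC time zone is an assumption.

```r
# Toy stand-in for a CrowdTangle historical export.
# Column names are hypothetical, not the real export schema.
posts <- data.frame(
  group   = c("Group A", "Group B", "Group C", "Group D"),
  link    = c("https://example.com/video", NA,
              "https://example.org/post", "https://example.com/video"),
  created = as.POSIXct(c("2020-05-02 23:10:00", "2020-05-04 08:00:00",
                         "2020-05-06 12:30:00", "2020-05-10 21:45:00"),
                       tz = "UTC"),
  stringsAsFactors = FALSE
)

# Study window: May 3 through May 10, 2020 (upper bound is exclusive,
# so every post made on May 10 is still included)
window_start <- as.POSIXct("2020-05-03 00:00:00", tz = "UTC")
window_end   <- as.POSIXct("2020-05-11 00:00:00", tz = "UTC")

# Keep only posts that carry a link and were created inside the window
has_link  <- !is.na(posts$link) & nzchar(posts$link)
in_window <- posts$created >= window_start & posts$created < window_end

link_posts <- posts[has_link & in_window, ]
link_posts
```

In this toy data, only the posts from Group C and Group D survive: the Group A post predates the window and the Group B post has no link.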
We now have a dataframe of highly_connected_coordinated_entities that repeatedly shared the same URLs within the coordination interval.
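As an illustration of what "shared the same URLs within the coordination interval" means, the following base R sketch pairs up shares of the same URL and keeps the pairs that happened close together in time. It is a simplified stand-in written for this walkthrough, not CooRNet's implementation; the toy data, the column names, and the 14-second value used here are assumptions.

```r
# Toy share log: which group shared which URL, and when (hypothetical data)
shares <- data.frame(
  url   = c("u1", "u1", "u1", "u2", "u2"),
  group = c("A", "B", "C", "A", "B"),
  time  = as.POSIXct("2020-05-05 12:00:00", tz = "UTC") + c(0, 5, 600, 0, 8),
  stringsAsFactors = FALSE
)

interval <- 14  # assumed coordination interval, in seconds

# Pair every share with every other share of the same URL,
# keeping each unordered pair of groups once
pairs <- merge(shares, shares, by = "url", suffixes = c("_1", "_2"))
pairs <- pairs[pairs$group_1 < pairs$group_2, ]

# A pair counts as coordinated when the two shares are at most `interval` apart
gap <- abs(as.numeric(difftime(pairs$time_2, pairs$time_1, units = "secs")))
coordinated <- pairs[gap <= interval, ]

# Count how many URLs each pair of groups shared in a coordinated way
pair_counts <- aggregate(url ~ group_1 + group_2, data = coordinated, FUN = length)
names(pair_counts)[3] <- "coord_shares"
pair_counts
```

Groups A and B end up with two coordinated shares, while Group C, which shared the first URL ten minutes later, is not linked to either of them.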
And now we’ll display the top 50 entities sorted by coord.shares in an inline table:
# Load DT package for displaying inline tables
library(DT)

# Display inline table of top 50 Facebook groups identified by CooRNet, sorted by coord.shares
datatable(head(highly_connected_coordinated_entities_names, 50),
          options = list(order = list(list(3, 'desc'))))
But what is the threshold that defines a rapid link share for these highly connected entities? To determine that, we ran the estimate_coord_interval function in CooRNet:
cord_int <- estimate_coord_interval(ctshares, q = 0.1, p = 0.5)
cord_int
## [[1]]
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##       0       9      14    7399    4251  103950
##
## [[2]]
## [1] "14 secs"
This returned a coordination interval of 14 seconds. A link share between two groups that occurred within 14 seconds is defined as unusually rapid, relative to the rest of the dataset.
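For intuition about how an interval like this can be derived from the data, here is a rough sketch in base R. It assumes one plausible reading of the two parameters: p is the fraction of a URL's shares that must be reached, and q selects the quickest URLs. This is an approximation written for illustration, not CooRNet's code; the toy data and the helper `time_to_p` are hypothetical, and q is set to 0.5 rather than 0.1 because the toy set has only four URLs.

```r
# Toy data: seconds elapsed since each URL's first share (hypothetical)
shares <- data.frame(
  url  = rep(c("u1", "u2", "u3", "u4"), each = 4),
  secs = c(0, 4, 10, 500,
           0, 20, 30, 900,
           0, 300, 4000, 9000,
           0, 8, 12, 700)
)

p <- 0.5  # fraction of a URL's shares that must be exceeded
q <- 0.5  # quantile of quickest URLs to keep (0.1 in the analysis above)

# Hypothetical helper: time a URL took to pass fraction p of its total shares
time_to_p <- function(secs, p) {
  elapsed <- sort(secs) - min(secs)
  share_frac <- seq_along(elapsed) / length(elapsed)
  min(elapsed[share_frac > p])
}

# One value per URL, then keep only the quickest URLs
per_url <- vapply(split(shares$secs, shares$url), time_to_p, numeric(1), p = p)
cutoff  <- quantile(per_url, probs = q)
fastest <- per_url[per_url <= cutoff]

# The estimated interval is the median among those quickest URLs
coord_interval <- median(fastest)
coord_interval  # 11 seconds for this toy data
```

The point of the design is that the threshold is derived from the dataset itself: "unusually rapid" is defined by how quickly the fastest-spreading URLs moved, not by a fixed number chosen in advance.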
One of our outputs, highly_connected_g, was a large igraph object representing a network. The next step was to prepare this network for analysis in Gephi. We obtained summary statistics for degree. In the study of networks, degree is the number of connections a node has to other nodes. For the purposes of our data, nodes were individual Facebook groups, and connections were shares of URLs.
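To show what degree measures, the sketch below counts connections per node from a small edge list using only base R. The edge list is toy data invented for this example; on the real igraph object, igraph's degree() function reports the same quantity.

```r
# Toy undirected edge list: each row is a connection between two groups
edges <- data.frame(
  from = c("Group A", "Group A", "Group A", "Group B"),
  to   = c("Group B", "Group C", "Group D", "Group C"),
  stringsAsFactors = FALSE
)

# A node's degree is the number of edge endpoints that land on it
endpoints <- c(edges$from, edges$to)
deg <- lengths(split(endpoints, endpoints))

# Most connected groups first, then summary statistics of the distribution
sort(deg, decreasing = TRUE)
summary(deg)
```

Group A has three connections, Groups B and C have two each, and Group D has one.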
To make the graph less cluttered, we filtered it by deleting all vertices with a degree of less than 1,000. This left us with only the most connected Facebook groups.
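library(igraph)

# Drop every vertex with a degree below 1000
# (this relies on the graph carrying a `degree` vertex attribute)
g <- delete_vertices(highly_connected_g, V(highly_connected_g)[degree < 1000])

# Export the graph as a graphml object
write_graph(g, file = "g.graphml", format = "graphml")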
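The effect of a degree threshold can be seen on a small example. This base R sketch uses a toy edge list and a threshold of 2 in place of 1,000; both are assumptions for illustration. It removes low-degree nodes along with every edge that touches them, which is also what deleting vertices from an igraph object does.

```r
# Toy undirected edge list (hypothetical data)
edges <- data.frame(
  from = c("A", "A", "A", "B", "D"),
  to   = c("B", "C", "D", "C", "E"),
  stringsAsFactors = FALSE
)

min_degree <- 2  # stand-in for the 1,000 threshold used on the real graph

# Degree of each node, computed on the full graph
endpoints <- c(edges$from, edges$to)
deg <- lengths(split(endpoints, endpoints))

# Keep nodes at or above the threshold, and only the edges between kept nodes
keep <- names(deg)[deg >= min_degree]
filtered <- edges[edges$from %in% keep & edges$to %in% keep, ]
filtered
```

Node E, with a single connection, is dropped together with its edge to D. Degrees are computed once on the full graph and are not recalculated after the removal.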
Our work was done in R for now, and we were ready to move to Gephi.
After exporting the graphml file from R, we imported it into Gephi. The result was something that looked like this: