A team of researchers has found that a new machine learning (ML) program accurately identifies Covid-19-related conspiracy theories on social media and models how they evolved.
The study showed that misinformation tweets contain more negative sentiment than factual tweets and that conspiracy theories evolve over time, incorporating details from unrelated conspiracy theories as well as real-world events.
“A lot of ML studies related to misinformation on social media focus on identifying different kinds of conspiracy theories,” said researcher Courtney Shelley from Los Alamos National Laboratory in the US.
For the study, published in the journal JMIR Public Health and Surveillance, the team used publicly available, anonymized Twitter data to characterize four Covid-19 conspiracy theory themes and provide context for each through the first five months of the pandemic.
The four themes the study examined were that 5G cell towers spread the virus; that the Bill and Melinda Gates Foundation engineered the virus or otherwise has malicious intent related to Covid-19; that the virus was bioengineered or developed in a laboratory; and that the Covid-19 vaccines, which were then all still in development, would be dangerous.
“We began with a dataset of approximately 1.8 million tweets that contained Covid-19 keywords or were from health-related Twitter accounts,” the researchers said.
“From this body of data, we identified subsets that matched the four conspiracy theories using pattern filtering, and hand-labelled several hundred tweets in each conspiracy theory category to construct training sets,” they added.
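To make the pattern-filtering step concrete, here is a minimal Python sketch of how tweets might be sorted into theme subsets with keyword patterns. The patterns, the theme names and the helper functions `match_themes` and `filter_by_theme` are illustrative assumptions; the researchers' actual filters are not described in this report.

```python
import re

# Hypothetical keyword patterns, one per theme. These are illustrative
# guesses, not the filters used in the study.
THEME_PATTERNS = {
    "5g": re.compile(r"\b5g\b", re.IGNORECASE),
    "gates": re.compile(r"\bbill gates\b|\bgates foundation\b", re.IGNORECASE),
    "lab_origin": re.compile(r"\bbioengineered\b|\blab[- ]?(made|leak|grown)\b", re.IGNORECASE),
    "vaccine": re.compile(r"\bvaccin\w*", re.IGNORECASE),
}


def match_themes(tweet):
    """Return the sorted names of all themes whose pattern occurs in the tweet."""
    return sorted(name for name, pattern in THEME_PATTERNS.items() if pattern.search(tweet))


def filter_by_theme(tweets):
    """Group tweets into per-theme subsets; one tweet may land in several subsets."""
    subsets = {name: [] for name in THEME_PATTERNS}
    for tweet in tweets:
        for name in match_themes(tweet):
            subsets[name].append(tweet)
    return subsets
```

A filter like this only narrows the pool of candidates. Deciding whether a matching tweet actually promotes a conspiracy theory is what the hand-labelling step is for.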
Using the data collected for each of the four theories, the team built random forest models, a form of ML or Artificial Intelligence (AI), that categorized tweets as Covid-19 misinformation or not.
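As a rough illustration of what such a classifier can look like, the sketch below trains a random forest on a handful of made-up tweets using the scikit-learn library. The TF-IDF text features, the toy training examples and all parameter values are assumptions chosen for illustration; this report does not say which features or settings the team used.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Made-up toy examples standing in for hand-labelled tweets
# (1 = misinformation, 0 = not misinformation).
train_texts = [
    "5g towers are spreading the virus",
    "the virus travels through 5g signals",
    "5g masts made everyone sick",
    "new 5g coverage announced for rural areas",
    "health officials say 5g does not spread viruses",
    "wash your hands and keep your distance",
]
train_labels = [1, 1, 1, 0, 0, 0]

# Turn each tweet into TF-IDF word weights, then fit a random forest on them.
model = make_pipeline(
    TfidfVectorizer(lowercase=True),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(train_texts, train_labels)

# Classify an unseen tweet; the result is an array holding a 0 or a 1.
predictions = model.predict(["are 5g masts spreading the virus"])
```

A real training set would hold several hundred labelled tweets per theme, as the researchers describe, and would be evaluated on tweets held out from training.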
Furthermore, the study found that a supervised learning technique could be used to automatically identify conspiracy theories and that an unsupervised learning approach (dynamic topic modelling) could be used to explore changes in word importance among topics within each theory.
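The idea of tracking changes in word importance can be illustrated with a much simpler stand-in than dynamic topic modelling: measuring each word's share of all words used in a given period. The Python sketch below does only that. It is not the method used in the study, and the function names and the period labels are hypothetical.

```python
from collections import Counter


def word_share_by_period(tweets_by_period):
    """Map each period to {word: share of all words used in that period}."""
    shares = {}
    for period, tweets in tweets_by_period.items():
        counts = Counter(word for tweet in tweets for word in tweet.lower().split())
        total = sum(counts.values())
        shares[period] = {w: c / total for w, c in counts.items()} if total else {}
    return shares


def track_word(shares, word):
    """Return (period, share) pairs for one word, in the order periods were given."""
    return [(period, dist.get(word, 0.0)) for period, dist in shares.items()]
```

Given tweets grouped by month, `track_word(shares, "vaccine")` would show whether that word takes up a growing share of the conversation within a theme. A dynamic topic model goes further by grouping words into topics and letting those topics drift over time.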