Using tweets from 2009-2021, researchers have developed a predictive model that can detect extremist users and content related to the militant group ‘Islamic State’ (ISIS).
Their work could help social media companies identify and eventually restrict such accounts in a timelier manner and abate their impact on online communities, they said.
The researchers from the Pennsylvania State University, US, identified potential propaganda messages and their characteristics and developed an image classifier to find the most frequent categories of images attached to tweets about ISIS.
“The Islamic State group and its affiliates, sympathisers and followers continue to manipulate online communities to spread extremist propaganda,” said Younes Karimi, a graduate student at the university pursuing a doctorate in informatics and the first author of the paper published in the journal Social Media Analysis and Mining.
In addition to the ISIS-linked tweets they used for analysis, the researchers collected a dataset of tweets from potential ISIS supporters to investigate those users' recent activities.
According to Karimi, the Islamic State group is increasingly relying on social media to spread propaganda, undermine its rivals and recruit sympathisers, despite countermeasures by websites like X (formerly Twitter) to restrict its online activities.
For the study, the researchers used artificial intelligence (AI) techniques, namely machine learning and natural language processing, to distinguish between users sharing ISIS-related content. Machine learning makes predictions based on past data, while natural language processing involves analysing and manipulating textual data.
ISIS accounts identified before 2015 served as the labelled data for the study’s ISIS users. To identify potential ISIS supporters, the researchers built a user classifier using this older dataset.
The users in the dataset included known members of the Islamic State group and those who retweeted, quoted or mentioned ISIS, said Karimi.
“We believe that users who retweet or quote Islamic State group content are more likely to be affiliates or sympathisers, while those who just mention the content are less likely to be supporters. However, tweets posted by mentioners are still very likely related to ISIS and contain topics similar to ISIS tweets, which make mentioners suitable to be considered as our non-ISIS users and non-trivial counterparts to ISIS users,” said Karimi.
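The labelling scheme Karimi describes, with retweeters and quoters on one side and mentioners on the other, can be illustrated with a small sketch. The article does not say which model the researchers used, so the multinomial Naive Bayes below is an assumption chosen for brevity, and the class name, label names and placeholder tokens are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case whitespace tokenizer; a real pipeline would do far more."""
    return text.lower().split()


class NaiveBayesUserClassifier:
    """Toy multinomial Naive Bayes over each user's concatenated tweets.

    Assumption: this stands in for the study's user classifier, whose
    actual model and features are not described in the article.
    """

    def fit(self, user_texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.class_counts = Counter(labels)      # label -> number of users
        self.vocab = set()
        for text, label in zip(user_texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_users = sum(self.class_counts.values())
        scores = {}
        for label, n_users in self.class_counts.items():
            counts = self.word_counts[label]
            # Laplace (add-one) smoothing over the shared vocabulary.
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_users / total_users)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical placeholder data: one string per user, neutral tokens only.
train_texts = [
    "slogan_x slogan_y topic_a",
    "slogan_x topic_b slogan_y",
    "topic_a report analysis",
    "topic_b news report",
]
train_labels = ["amplifier", "amplifier", "mentioner", "mentioner"]

clf = NaiveBayesUserClassifier().fit(train_texts, train_labels)
print(clf.predict("slogan_x slogan_y"))
print(clf.predict("news report"))
```

Both classes share topic tokens in this toy data, which mirrors the point that mentioners discuss similar subjects and are therefore non-trivial counterparts rather than easy negatives.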
The researchers then analysed the tweets to identify what they referred to as “candidate propaganda.” First, they compared topics used by known Islamic State group accounts prior to 2015 in the old dataset with the content posted after 2015 by potential affiliates and supporters in their recent dataset.
Secondly, the team examined ideology-based words and images, which they said are “often designed to elicit an emotional response and influence a large audience.”
Thirdly, the team studied content involving hashtags. “Supporters and affiliates of the Islamic State group recruited people to retweet hashtags to create trending ideas, such as strong religious references, and curate group messaging to improve the group’s branding and ensure message longevity,” Karimi said.
The team found that the most used hashtags from ISIS included “The Islamic State”, “Caliphate News”, “Urgent”, “The State of the Caliphate” and “ISIS”.
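A ranking of most used hashtags like the one above can be produced with a simple frequency count. The sketch below is illustrative only: the function name, the sample tweets and the choice to count each hashtag once per tweet are assumptions, not details from the study.

```python
import re
from collections import Counter

# \w is Unicode-aware in Python 3, so non-Latin hashtags are matched too.
HASHTAG_RE = re.compile(r"#(\w+)")


def top_hashtags(tweets, n=5):
    """Return the n most frequent hashtags as (tag, tweet_count) pairs."""
    counts = Counter()
    for text in tweets:
        # Count each hashtag once per tweet, case-insensitively.
        counts.update({tag.lower() for tag in HASHTAG_RE.findall(text)})
    return counts.most_common(n)


# Hypothetical, neutral sample data.
sample_tweets = [
    "Breaking update #Urgent #News",
    "Another post #urgent",
    "Unrelated #weather report",
]
print(top_hashtags(sample_tweets, n=1))
```

Counting per tweet rather than per occurrence keeps a single tweet that repeats one hashtag many times from dominating the ranking.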
Karimi said that the longitudinal perspective of the dataset was important because it included data from before and after 2015, when a major crackdown by Twitter removed user accounts and content involving the Islamic State group.
“In response, the extremists had to change their online strategy and move to other platforms, and little is known about their online whereabouts since that crackdown,” said Karimi.
The team said that their approach focussing on users and user content could be employed on other social media platforms as well.
First Published: Jan 27 2024 | 6:40 PM IST