Fighting disinformation with artificial intelligence: fundamentals, advances and challenges

The internet and social media have revolutionised the way news is distributed and consumed. However, the constant flow of massive amounts of content has made it difficult to discern truth from falsehood, especially on online platforms plagued with malicious actors who create and spread harmful stories. Debunking disinformation is costly, which has put artificial intelligence (AI) and, more specifically, machine learning (ML) in the spotlight as a solution to this problem. This work reviews the recent literature on AI and ML techniques to combat disinformation, ranging from automatic classification to feature extraction, as well as their role in creating realistic synthetic content. We conclude that ML advances have mainly focused on automatic classification and have scarcely been adopted outside research labs due to their dependence on limited-scope datasets. Therefore, research efforts should be redirected towards developing AI-based systems that are reliable and trustworthy in supporting humans in early disinformation detection instead of fully automated solutions.


Introduction
Amidst the prevailing post-truth era, people are overwhelmed with an enormous and uninterrupted flow of information, making it difficult to discern reliable material from content that seeks to mislead, whether intentionally (i.e., disinformation) or unintentionally (i.e., misinformation) (Wardle; Derakhshan, 2017). As a result, disinformation poses a significant and wide-ranging threat that can potentially transform any society's political, economic, and cultural fabric, thus eroding the fundamental principles of democratic nations.
While domain experts and fact-checkers may find it relatively easy to disprove hoaxes, more resources are necessary to drive and speed up their work and empower non-specialised citizens and organisations. Hence, the interest in developing technological tools for automatic information verification has grown, particularly in the ever-changing social media environment. Machine learning (ML), a subfield of Artificial Intelligence (AI), has significantly contributed to combating disinformation in recent years. Essentially, ML algorithms can be trained with data to automatically detect patterns indicative of disinformation and then apply these patterns to discern the likely truth or falsehood of unseen content. Deep Learning (DL), a subset of ML algorithms based on neural networks, has proved very useful in multiple domains (LeCun; Bengio; Hinton, 2015) and currently dominates the AI landscape. ML is also the predominant approach to fight disinformation (Xu; Sheng; Wang, 2023), but at the same time it can be used to generate synthetic content, increasing the impact of disinformation (Masood et al., 2022).
ML is a very active, technical, and complex subject, making it difficult for non-specialists to understand and incorporate solutions arising in this field. At the same time, ML researchers must be aware of the multiple facets of a social problem like disinformation. Consequently, the research objective of this paper is to provide a brief, multidisciplinary guide to navigating the recent literature on AI to combat disinformation, with a focus on ML. This paper discusses the effectiveness of AI and ML techniques in detecting and countering disinformation and identifies the challenges and limitations of current approaches. We also suggest research directions for developing trustworthy AI-based systems that can assist humans in the early detection of disinformation.
Disinformation in social and digital media has prevalently spread through text. Therefore, when training ML algorithms, the primary characteristics considered are related to the syntax and content of the messages, including aspects such as syntactic, lexical, stylistic, and semantic features, which fall into the field of natural language processing (NLP). Furthermore, social network analysis (SNA) has researched the topology of disinformation networks. By analysing the network structure and identifying communities, it is possible to identify groups of users who are likely to generate and disseminate harmful content, whether in a coordinated or uncoordinated way. Accordingly, we centre our work on NLP and SNA as the areas of AI more often related to disinformation analysis.
Automated disinformation analysis has been addressed from multiple perspectives. Here we propose an organisation into three overlapping approaches: -disinformation identification by automated classification; -feature extraction to characterise disinformation; and -providing support to fact-checking tasks.
This organisation is consistent with the approaches of the reviewed research works and reflects the historical development of the area: -Disinformation classification. Automated classification is the most straightforward way of disinformation analysis: given a labelled dataset, we can train an ML classification model to distinguish legitimate content. However, this methodology has the drawback that models trained on one domain hardly extend to others. -Feature-based disinformation identification. Feature extraction, in turn, focuses on finding characteristics of disinformation that can be used, manually or automatically, to detect content and communities of interest afterwards. -Hybrid-based fact-checking. Detection of misleading content by specialised journalists has proved very effective for disinformation analysis but is also a bottleneck in the process. This limitation has led to the emergence of a third type of approach known as semi-automated fact-checking.
The remainder of this manuscript is accordingly divided into three parts. The first describes AI techniques and methods used to detect disinformative content. The second focuses on the AI methods proposed in the literature to combat disinformation, including the features used to train these models and how fact-checkers can take advantage of these technological advances. The third one describes the increasing use of AI to generate disinformative content automatically. Finally, we end the paper with a summary of the main findings and the most promising research lines for future work.

Background
ML is a powerful tool within AI that can help to address the growing problem of disinformation by automating the detection and analysis of untrustworthy content. This section provides the reader with a background on ML and an overview of the fundamentals of Natural Language Processing and Social Network Analysis. Readers familiar with AI and ML can skip this section; otherwise, more information can be found in the classical books by Russell and Norvig (2020) and Bishop (2006).

Machine Learning
Machine Learning is a field of AI that encompasses a range of methods, techniques, and tools for building intelligent systems by exploiting large volumes of data related to a specific problem. Specifically, ML falls under the pattern recognition paradigm, i.e., it identifies repeating characteristics in a data sample using statistical and computational processes. These patterns serve two primary functions: making predictions about future events (predictive analysis) and uncovering insights from the data (descriptive analysis). Depending on the learning mode and the process of obtaining patterns, there are three main families of ML techniques: Supervised, Unsupervised, and Reinforcement Learning. Based on artificial neural networks, Deep Learning mainly falls into Supervised Learning, but it can also be applied in Unsupervised and Reinforcement Learning setups. This subsection focuses on Supervised and Unsupervised techniques (including Deep Learning), the most representative ML techniques to combat disinformation.
Supervised Learning seeks to develop models from labelled training data that allow predicting the labels of unseen or future data. Supervised Learning can be classified into two basic categories, depending on the nature of the target variable: classification and regression. In classification, the target variable has a limited number of discrete values. Archetypical methods within this category are Decision Trees, Logistic Regression, Support Vector Machines, and the K-Nearest-Neighbour algorithm. In regression, the target variable is a real number. Some regression algorithms are Linear Regression, Polynomial Regression, Regression Splines, and Regression Trees. Supervised Learning methods are often combined to increase accuracy, yielding ensemble models such as Bagging, Boosting, and Random Forest.
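To make this concrete, below is a minimal classification sketch with scikit-learn, using the classic Iris dataset as a stand-in for any labelled data; the library, dataset, and hyperparameters are illustrative assumptions, not tied to any system reviewed here.

```python
# A minimal Supervised Learning sketch: learn patterns from labelled data and
# predict the labels of unseen data (scikit-learn is assumed to be installed).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)                 # features and discrete labels
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = DecisionTreeClassifier(max_depth=3)         # one of the archetypical methods
clf.fit(X_train, y_train)                         # learn patterns from labelled data
y_pred = clf.predict(X_test)                      # predict labels of unseen data
print(f"Accuracy: {accuracy_score(y_test, y_pred):.2f}")
```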
Unsupervised Learning refers to techniques that deal with unlabelled or unstructured data. The most prevalent technique is clustering, utilised to identify hidden groups within a dataset for descriptive analysis. We have partitional clustering, where clusters are disjoint and typically encompass the entire item set (e.g., the Dbscan and k-means algorithms), and hierarchical clustering, where groups are organised into a hierarchy. Another notable technique within Unsupervised Learning is association rules, which aims to discover dependencies between a set of items in a database. Regarding Deep Learning, the most widely used architectures are: -Convolutional Neural Networks (CNNs), which are specialised neural networks that process data with a regular structure, like images; -Recurrent Neural Networks (RNNs), which process sequential data by allowing feedback loops in the network and work well with time series; and -Transformers, which learn to identify relevant sections of sequences by applying attention models and are very useful with textual data.
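As an illustration of descriptive analysis, the following minimal sketch applies partitional clustering with k-means to synthetic two-dimensional data; no labels are involved, and the groups emerge from the data alone (scikit-learn and the toy data are assumptions for the example).

```python
# A minimal Unsupervised Learning sketch: k-means discovers two hidden groups
# in unlabelled synthetic points.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0, 1, (50, 2)),     # one hidden group
                    rng.normal(5, 1, (50, 2))])    # another hidden group

kmeans = KMeans(n_clusters=2, n_init=10).fit(points)  # no labels used
print(kmeans.labels_[:5])                          # cluster assigned to each point
print(kmeans.cluster_centers_)                     # centre of each discovered group
```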

Natural Language Processing
Natural Language Processing (NLP) involves using computational linguistics techniques to analyse text in a specific language, whether written or spoken (Manning; Schütze, 1999). Before developing an ML model for natural language processing (e.g., a language model), it is crucial to tackle three critical challenges: text preprocessing, feature extraction, and representation.
1) Text preprocessing involves cleaning the text and eliminating unimportant elements so that only useful information remains. The fundamental steps of text preprocessing are tokenisation (the division of the raw text into units), stopword removal (elimination of common words not significant for the analysis) and stemming (heuristic-type rules for cutting off the ends of words or affix removal) or lemmatisation (transformation of words into their base form or lemma).
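The following minimal sketch walks through these preprocessing steps with NLTK, assumed here for illustration; the required NLTK resources must be downloaded first, as noted in the comments.

```python
# A minimal preprocessing sketch with NLTK (an assumption for illustration).
# Requires: nltk.download("punkt"); nltk.download("stopwords"); nltk.download("wordnet")
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer

text = "The senators were debating the new policies yesterday"
tokens = word_tokenize(text.lower())                                 # tokenisation
tokens = [t for t in tokens if t not in stopwords.words("english")]  # stopword removal
print([PorterStemmer().stem(t) for t in tokens])         # stemming (affix cutting)
print([WordNetLemmatizer().lemmatize(t) for t in tokens])  # lemmatisation (base forms)
```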
2) Feature extraction involves identifying and selecting basic features from raw text data suitable to the task. Some of the most widely used techniques for feature extraction are Part-Of-Speech tagging (POS) to identify lexical categories, Named-Entity Recognition (NER) for identifying entities within the text, and bag-of-words to represent linguistic units based on their frequency of occurrence (all three are illustrated in the sketch after this item).
Another more advanced feature extraction technique is Sentiment Analysis (SA), also called Opinion Mining, which aims to automatically grasp a text's sentiments, opinions, emotions, or attitudes (Serrano-Guerrero et al., 2015). It can also include eliciting the author's psychological traits through specific-purpose annotated lexicons (John; Srivastava, 1999; Pennebaker et al., 2015).
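As a rough illustration of these techniques, the sketch below combines spaCy for POS tagging and NER, scikit-learn for a bag-of-words encoding, and NLTK's VADER as a simple lexicon-based sentiment analyser; all are common public tools assumed for the example, not the specific lexicons cited above.

```python
# A minimal feature-extraction sketch (spaCy, scikit-learn and NLTK are assumed;
# the en_core_web_sm model and the vader_lexicon resource must be installed first).
import spacy
from sklearn.feature_extraction.text import CountVectorizer
from nltk.sentiment import SentimentIntensityAnalyzer

nlp = spacy.load("en_core_web_sm")
doc = nlp("The White House denied the outrageous claim on Monday")
print([(t.text, t.pos_) for t in doc])           # POS tags (lexical categories)
print([(e.text, e.label_) for e in doc.ents])    # named entities (NER)

bow = CountVectorizer()                          # bag-of-words frequencies
X = bow.fit_transform(["the claim is false", "the claim is totally false"])
print(bow.get_feature_names_out(), X.toarray())

sia = SentimentIntensityAnalyzer()               # lexicon-based sentiment scores
print(sia.polarity_scores("This outrageous hoax will ruin everything!"))
```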
3) Representation involves creating a numerical encoding of the text so that other ML algorithms can use it. Many techniques exist, but word embeddings are the most widely used today. They are representations of text units in the form of numerical vectors that capture their semantics. Word2Vec (Mikolov et al., 2013) and GloVe (Pennington; Socher; Manning, 2014) are the most used techniques for obtaining embeddings. Public embeddings for common terms, precalculated from massive text sources like Wikipedia, are also available and can be reused in other applications. Once a document is represented as numbers, ML techniques (and particularly Deep Learning methods) can be applied to solve a downstream task (e.g., text classification or text prediction).
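A minimal sketch of training word embeddings with gensim's Word2Vec on a toy corpus follows; in practice, as noted above, embeddings precalculated on massive corpora are typically reused (the library and hyperparameters are assumptions).

```python
# A minimal word-embedding sketch: each word becomes a numerical vector whose
# neighbourhood in the vector space reflects its semantics.
from gensim.models import Word2Vec

corpus = [["fake", "news", "spreads", "fast"],
          ["true", "news", "spreads", "slowly"],
          ["fake", "stories", "mislead", "readers"]]

model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, seed=1)
print(model.wv["fake"][:5])                    # a word as a numerical vector
print(model.wv.most_similar("fake", topn=2))   # nearest words in embedding space
```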
In this regard, Transformer networks with attention mechanisms aim to overcome the limitations of previous methods by learning to hold on to essential parts of the input text (Vaswani et al., 2017). In particular, Large Language Models (LLMs) are neural network-based systems specialised in predicting the next word in a sequence, which can be used for text generation and translation between sequences. A particularly noteworthy LLM is the Generative Pre-trained Transformer (GPT) (Brown et al., 2020). Its recent incarnations, GPT-3 and GPT-4, can generate natural language and perform a wide range of NLP tasks, such as text generation, machine translation, and question-answering (Zhu; Luo, 2022). More recently, a variant of GPT-3 named ChatGPT has been successfully trained through human interaction to engage in realistic conversations (Megahed et al., 2023).
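As a small, reproducible stand-in for these proprietary LLMs, the sketch below generates text with the open GPT-2 model via the Hugging Face transformers library; this is an assumption for illustration, showing only the next-word-prediction principle rather than GPT-3/4-level quality.

```python
# A minimal text-generation sketch with a Transformer LLM (the open GPT-2 model
# stands in for larger, API-only models such as GPT-3 and GPT-4).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Disinformation spreads quickly because",
                   max_new_tokens=20, num_return_sequences=1)
print(result[0]["generated_text"])  # the model continues the prompt word by word
```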

Social Network Analysis (SNA)
Social Network Analysis is the computational field that explores social entities' relationships, patterns, and structures to understand the system, position, and linkage between these actors (Barabási, 2016). SNA uses mathematical and computational methods to analyse data from social media through two different approaches (Aggarwal, 2011; Camacho et al., 2020): -structural analysis (topology of the network, communities, and important nodes); and -content-based analysis (information about social media users, shared content).
Structural analysis focuses on studying the topology of a network by applying graph theory. Often-used structural metrics include local measures like centrality, degree, closeness or betweenness -used for identifying the importance of certain nodes (users) within the network, and global measures such as density, diameter, radius, or transitivity -used to study the global structure of the network. An essential problem in SNA is community detection, which aims to identify sets of more tightly connected nodes (Bedi; Sharma, 2016). The task of community detection is closely related to the clustering problem, so most techniques belong to this broad family of algorithms (Fortunato, 2010). Other approaches are based on the maximisation of modularity, a measure that balances the number of internal and external connections of a community. Some algorithms based on modularity are Newman's greedy method (Newman, 2004) and the Blondel method (Blondel et al., 2008).
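The sketch below computes some of these structural metrics and detects communities by greedy modularity maximisation (in the spirit of Newman's method) with NetworkX, using the classic karate-club toy network as an assumed example.

```python
# A minimal SNA sketch: local and global structural metrics plus community
# detection by greedy modularity maximisation (NetworkX is assumed).
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.karate_club_graph()                        # classic toy social network
print(nx.degree_centrality(G)[0],                 # local: importance of node 0
      nx.betweenness_centrality(G)[0])
print(nx.density(G), nx.diameter(G))              # global: density and diameter
print([sorted(c) for c in greedy_modularity_communities(G)])  # communities
```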
Content-based analysis examines both the content and the connections between nodes, for example, by incorporating text to provide additional context to the network (Cambria; Wang; White, 2014). Content analysis is commonly applied in the following ways: -user profiling, which gathers extra information about the human actors -e.g. behaviour or physical features- in a network (Harrigan et al., 2021); -topic extraction, which identifies the main themes of discussion among a group of nodes (Yin et al., 2012), or the interests of users through their social connections (Wang et al., 2013); -sentiment analysis, which examines the tone of the messages exchanged among the nodes (Camacho et al., 2020).

Disinformation classification with Machine Learning
Supervised Learning is the most widely employed approach for the automatic identification of disinformation. Thereby, the identification of disinformation is usually modelled as a binary classification problem. Given a set of representative features x1, x2, ..., xn of an information item I, the task is to predict whether I is truthful or not, i.e., f(x1, x2, ..., xn) ∈ {true, false}, where f is the function we want to learn from the available data. The combination of the features to obtain f can be done manually or automatically. In the first case, Multi-Criteria Decision Making (MCDM) has been applied to define criteria and probability weights to calculate an information credibility score and rank the candidate solutions (Pasi; De-Grandis; Viviani, 2020). In the second case, DL has been applied to learn the features and the combination weights (Amador; Molina-Solana; Gómez-Romero, 2019; Molina-Solana; Amador; Gómez-Romero, 2018).
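For illustration, a minimal sketch of learning such an f is shown below: a TF-IDF representation of each item's text feeds a logistic regression classifier. The scikit-learn pipeline, toy texts, and labels are assumptions, not a real disinformation dataset or any of the cited systems.

```python
# A minimal sketch of learning f for binary disinformation classification:
# text features are extracted with TF-IDF and combined by logistic regression.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["miracle cure doctors hate", "senate passes budget bill",
         "aliens control the government", "central bank raises interest rates"]
labels = [0, 1, 0, 1]                               # 0 = false, 1 = truthful (toy labels)

f = make_pipeline(TfidfVectorizer(), LogisticRegression())
f.fit(texts, labels)                                # learn f from labelled data
print(f.predict(["secret miracle cure revealed"]))  # likely 0 (false) on this toy model
```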
Nevertheless, disinformation flows in shades of grey, not black and white, rendering a binary classification insufficient. In the literature, we can find more precise definitions of labels to capture the more subtle nuances of disinformation. For example, Wang (2017) proposed a manually labelled dataset with six fine-grained labels where the degree of truthfulness (pants-fire, false, barely true, half-true, mostly true, and true) of thousands of statements was evaluated. Nakamura, Levy and Wang (2020) used a labelling hierarchy of two, three, and six categories for each sample of their multimodal dataset enabling the implementation of classification models at different levels of granularity.
The performance of Supervised Learning depends directly on the quality of the labelled data, which usually represent specific situations, making it difficult to extend the models to other similar domains. This limitation is even more noticeable when applied to the automatic detection of disinformation, since it is challenging to build datasets with enough quality to cover the nuances of disinformation in heterogeneous contexts (Shu et al., 2017). Dataset construction involves: -data extraction, either through APIs provided by platform owners or web scraping methods; and -annotation, which is a manual, time-consuming and error-prone task with little automatic support (Simko et al., 2021).
The Annex includes the datasets used in the literature for testing ML disinformation classification models.
As reported several times (Guo et al., 2020; Meel; Vishwakarma, 2020; Zhang; Ghorbani, 2020), studies that work directly on automatic disinformation detection with Unsupervised Learning are scarce. Some works formulate the automatic identification of disinformation as an anomaly detection problem on social networks, employing an autoencoder as an Unsupervised Learning method (Li et al., 2021); another uses Bayesian statistics to compute the veracity of news and the credibility of their authors. Nevertheless, most studies use Unsupervised Learning in a complementary way to Supervised Learning; that is, they follow a Semi-supervised approach.

Feature-based automated disinformation detection
As explained, methods for disinformation detection need relevant features representative of the news items. Classically, they have been classified into content-based and context-based features (Bondielli; Marcelloni, 2019).
-Content-based features are relevant attributes extracted directly from the data item, usually a text stating or supporting the potential hoax and often associated with several images or videos that reinforce it. -Context-based features refer to data or metadata surrounding the piece of information. This section focuses on various features that can be extracted and used to detect false information.

Natural language processing for stylistic characterisation of messages
Content-based methods use the linguistic features of false information, including syntactic and semantic characteristics that can be obtained by applying NLP techniques (Ruffo et al., 2023). Among syntactic features, we can find POS tags and relevant groups of words (bigrams, trigrams, or n-grams). Semantic features can be obtained through sentiment analysis, opinion mining, topic detection, or encodings with word embeddings.
A specific kind of linguistic feature is style-based features. The rationale behind methods based on them is that ML can capture the distinctive style that malicious actors use to increase the diffusion and acceptance of their content (Zhou; Zafarani, 2020). The style of news text has been formalised and measured in terms of the frequency of morphological patterns and the emotional tone of the discourse. Regarding morphological patterns, in an early study, Afroz, Brennan and Greenstadt (2012) were able to identify false information by analysing the number of syllables and words, vocabulary, grammatical complexity, and POS tags. Misleading content spreaders were also found to use more informal language (Giachanou et al., 2022), e.g., certain patterns in the use of personal pronouns and swear words (Rashkin et al., 2017). Regarding the emotional tone of the discourse, Del-Vicario et al. (2016) showed that the emotional state of social media users is linked to their level of engagement in the community -more activity leads to more negative emotions and vice versa. Accordingly, the use of polarised language patterns is often seen as a sign of message engineering to increase impact by provoking negative emotions in the receiver, such as anger, disgust, or fear (Giachanou; Rosso; Crestani, 2021), and therefore, an indicator of low credibility (Ghanem et al., 2021; Stella; Ferrara; De-Domenico, 2018).
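As a toy illustration of hand-crafted style features of this kind, the sketch below counts informality and emotional-emphasis cues in plain Python; the word lists and features are illustrative assumptions, not those of the cited studies.

```python
# A minimal sketch of hand-crafted style features: pronoun usage, exclamations,
# word length, and shouting (all thresholds and word lists are illustrative).
import re

def style_features(text: str) -> dict:
    words = re.findall(r"[a-zA-Z']+", text.lower())
    pronouns = {"i", "me", "my", "we", "us", "our"}
    return {
        "n_words": len(words),
        "avg_word_len": sum(map(len, words)) / max(len(words), 1),
        "pronoun_rate": sum(w in pronouns for w in words) / max(len(words), 1),
        "exclamations": text.count("!"),
        "all_caps_words": sum(w.isupper() and len(w) > 1 for w in text.split()),
    }

print(style_features("We KNOW the truth! They are lying to us!!"))
```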
Conversely, disinformers can learn style-based features to replicate the writing styles of trustworthy information sources and disguise their actions. This is particularly problematic if language models are used to generate disinformation, which is currently a trend and a challenge. For example, Schuster et al. (2020) showed that NLP models for disinformation identification based on stylistic features work well with human writing. Still, they tend to fail when confronted with synthetic text created by language models trained to replicate trusted media.

Contextual aspects of disinformation in social networks
Contextual features are extracted by considering the relevant data related to an information item, including metadata or other external elements. This information is primarily available in social networks, where context can be connected to the users, their posted messages, or the network (Guo et al., 2020).

Features based on the context of the users
User-based features include the number of posts, number of followers, demographics, whether the account is verified, or the age of the account on the platform. A usual metric built from such profile data is user credibility, which can indicate the likelihood of sharing false information (Shu; Wang; Liu, 2019). Credibility can be obtained from network metadata to analyse whether there is a correlation between a user profile and the publication of false information. Furthermore, user engagement (likes, retweets, and replies) with tweets written by verified users can also be used to assess credibility.
A particularly interesting type of social network user is the bot. Bots are computer programs that carry out autonomous actions, including automatically generating false information and amplifying disinformation during the initial dissemination stages (Ciampaglia et al., 2018). Bots tend to have particular profiles on social networks; e.g., they are usually recent accounts (Davis et al., 2016) with lengthy usernames using weird characters (Oehmichen et al., 2019). Their behaviour also differs from humans' (Ruffo et al., 2023); e.g., they retweet more, get fewer retweets, receive fewer replies and mentions, and publish fewer original tweets (Ferrara et al., 2016). All these features can be obtained from public profiles and the graph of retweets for automatic bot identification, either alone (Des-Mesnards et al., 2022) or combined with message data (Kudugunta; Ferrara, 2018).
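A minimal sketch of such profile-based bot indicators follows; the field names, thresholds, and profile dictionary are hypothetical, since the real features depend on each platform's API.

```python
# A minimal sketch of profile-based bot indicators: account age, username shape,
# and retweet-heavy behaviour (all fields and thresholds are hypothetical).
from datetime import date

def bot_signals(profile: dict, today: date = date(2023, 1, 1)) -> dict:
    age_days = (today - profile["created_at"]).days
    return {
        "recent_account": age_days < 90,
        "long_odd_username": len(profile["username"]) > 15
                             or any(c.isdigit() for c in profile["username"]),
        "retweet_heavy": profile["retweets"] > 5 * max(profile["original_posts"], 1),
    }

print(bot_signals({"username": "freedom_news_4711", "created_at": date(2022, 12, 1),
                   "retweets": 900, "original_posts": 12}))
```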
Disinformation is closely related to the user's personality and mental processes. Given that psychological characteristics regulate behaviour and interaction in the physical world, it is logical to assume that they also impact virtual communities. Psychological traits can influence how individuals interpret and engage with information, increasing the likelihood of spreading false information and toxic narratives. For example, inherently human cognitive biases such as limited reality perception and confirmation bias can increase the likelihood of perceiving fake news as real and thus encourage its dissemination (Shu et al., 2017). Unlike disseminators of accurate information, disinformers have been found to be extroverted, less neurotic, and to show more stress in their tweets (Shrestha; Spezzano, 2022). In contrast, Srinivas, Das and Pulabaigari (2022) suggest that users who spread false political information are neurotic, conservative, and have psychopathic traits. The divergent conclusions of these works are mainly due to the different ways of detecting and measuring psychological traits.

Features based on the context of the messages
Contextual user and message-based features are often not clearly distinguished (Guo et al., 2020) and are even merged. Still, for clarity, we consider separately the context of the posted messages, whose features are different, more dynamic, and more specific than the users' features (Tacchini et al., 2017). Thus, metadata about posts in social networks has mainly been used to increase the effectiveness of another principal feature (Della-Vedova et al., 2018). Likewise, multimedia resources associated with messages have been used to complement ML models, yielding multimodal disinformation analysis (Hangloo; Arora, 2022).
Multimodal analysis has focused to date on images and has been addressed in three main forms: forensic -evaluates whether an image has been subjected to modification or manipulation (Qi et al., 2019)-, contextual -assesses whether the image and the text are consistent (Kang; Hwang; Yu, 2020; Xiong et al., 2023)-, and hybrid -processes the image to extract additional information to be used in the detection model (Giachanou; Zhang; Rosso, 2022; Jing et al., 2023; Khattar et al., 2019; Li; Yao et al., 2022; Singh et al., 2023)-. For example, Zhang, Giachanou and Rosso (2022) combined textual, visual, and contextual information to build the "scene" depicted in the post, obtaining statistically significant differences in the appearance of specific places, weather, and seasons in false and truthful content.
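As one possible way to approximate the contextual form of multimodal analysis, the sketch below scores image-text consistency with the open CLIP model via Hugging Face transformers; the model choice and the image path are assumptions, and the cited systems use their own architectures.

```python
# A possible sketch of contextual multimodal analysis: how well does each
# caption match the image attached to a post? (CLIP is assumed; the image
# path is a placeholder.)
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("post_image.jpg")              # image attached to the post
captions = ["flood in a European city", "political rally in a stadium"]
inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)
print(probs)  # a low score for the post's own caption hints at inconsistency
```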

Features based on the network structure
Network-based features refer to the static structure of the social network, such as central nodes and communities based on users' connections, and the more dynamic propagation of (dis)information, including critical actors, dissemination paths, and infiltration from one community to another (Bondielli; Marcelloni, 2019; Zhou; Zafarani, 2020).
Most works in the literature focus on detecting false information by modelling the information dissemination network, assuming that true and false information have different propagation patterns. This approach is highly effective for stopping the propagation of disinformation, as it prioritises identifying (and removing) disinformative sources over the more costly analysis of individual publications. Specifically, the networking characteristics of users involved in disseminating false information have been investigated through initiatives such as the PAN challenges (Buda; Bolonyai, 2020; Vogel; Meghana, 2020). In addition, modern ML techniques have recently been applied to this topic; e.g., Rath, Salecha and Srivastava (2022) proposed a graph neural network model to identify nodes prone to disseminating false information using network topology and historical user activity data.
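A minimal sketch of propagation-based features follows, computing the depth and breadth of a toy resharing cascade with NetworkX; the cascade edges are illustrative, and real work would build the graph from platform data.

```python
# A minimal sketch of propagation features: depth and breadth of a resharing
# cascade modelled as a tree of "reshared-from" links (edges are illustrative).
import networkx as nx

cascade = nx.DiGraph([("source", "u1"), ("source", "u2"),
                      ("u1", "u3"), ("u3", "u4"), ("u3", "u5")])

depths = nx.single_source_shortest_path_length(cascade, "source")
print("depth:", max(depths.values()))                        # how far it travelled
print("breadth:", max(sum(d == k for d in depths.values())   # widest generation
                      for k in set(depths.values())))
print("spreaders:", [n for n, deg in cascade.out_degree() if deg > 0])
```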

AI-supported fact-checking
Fact-checking is a form of journalism focused on verifying public assertions (Graves; Nyhan; Reifler, 2016). While verifying information is a foundational part of journalism, fact-checking emphasises the relevance of the checking process and the development of methods and tools to do so effectively and transparently. The first proposals to automate online fact-checking appeared more than 15 years ago (Graves, 2018), already highlighting that full automation is practically impossible because of the critical judgment, sensitivity, and experience required to make a decision that is not binary (Arnold, 2020). The fact-checking community acknowledges that the rapid dissemination of false information presents scalability issues -i.e., spreading a lie is far faster than debunking it (Vosoughi; Roy; Aral, 2018)-, but this should not undermine the rigour of the fact-checking process.
Accordingly, the approaches in the literature tend towards AI-supported fact-checking rather than automated fact-checking, and are often labelled as human-in-the-loop systems (La-Barbera; Roitero; Mizzaro, 2022; Shabani et al., 2021; Yang et al., 2021). AI can support fact-checking at different stages of the verification workflow (Guo; Schlichtkrull; Vlachos, 2022; Nakov; Corney et al., 2021): (1) monitoring, recognition, and prioritisation of content susceptible to verification; (2) evaluating whether claims are verifiable or not and topic prioritisation; (3) searching for previous verifications that apply to the same case; (4) retrieval of evidence for further investigation; (5) semi-automated classification into categories (hoax, misleading content, false context, etc.); (6) dissemination of the verifications; and (7) speeding up the writing and documentation of fact-checks.
Proposals in the literature have primarily focused on stages 1-4. For stage 5, the contributions described in Section 3 could be applied, although they show limitations in their applicability to multiple domains, as already described.
Various methods have been proposed for assessing the check-worthiness of claims (stages 1 and 2), either based on ranking claims by score prediction (Kartal; Kutlu, 2023; Nakov; Da-San-Martino et al., 2021) or on classifying claims using specific annotations (Konstantinovskiy et al., 2021). Since automated systems can introduce biases in claim selection, research has pivoted towards tools like news alerts, speech recognition, and translation models to filter claims more effectively (Rashkin et al., 2017).
Detecting previously fact-checked claims, including those verified in other languages or countries, has been addressed with NLP and information retrieval techniques (stages 3 and 4). In the first case, semantic textual similarity has been applied to match new claims with already-verified ones in English (Thorne; Vlachos, 2018) and Spanish (Martín et al., 2022). In the second case, software tools with different levels of intelligence have been developed for evidence retrieval, including structured data extraction, speech recognition, reverse image search, video forensics, or natural language search (Das et al., 2023).
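A minimal sketch of stage 3, matching a new claim against previously fact-checked ones by semantic textual similarity, is shown below using the Sentence-Transformers library; the model name and toy claims are assumptions, not the systems cited above.

```python
# A minimal claim-matching sketch: embed claims and retrieve the most similar
# previously fact-checked one by cosine similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
checked = ["Vaccines cause autism", "The Earth is flat", "5G towers spread viruses"]
new_claim = "A study proves that childhood vaccines lead to autism"

scores = util.cos_sim(model.encode(new_claim), model.encode(checked))[0]
best = int(scores.argmax())
print(checked[best], float(scores[best]))  # closest previously verified claim
```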
The current trend thus leans towards fact-checking assisted by Artificial Intelligence rather than relying solely on fully automated fact-checking. One remarkable tool covering different stages is InVid, a free platform that hosts tools to detect, authenticate, and check the reliability and authenticity of images and videos (https://www.invid-project.eu). The vera.ai project is expected to continue and expand AI-supported verification tools and services in Europe (https://www.veraai.eu).

The challenge of the automatic generation of disinformation
Large Language Models (LLMs), introduced in Section 2.2, are one of the most challenging technologies for the massive generation of textual disinformative content. For example, GPT-3 and ChatGPT can produce synthetic text that can be exploited to spread disinformation in many ways (Solaiman et al., 2019): -to camouflage false content under the guise of real information; -to create bots and web pages amplifying a disinformative discourse; -to elude stylistic checkers, etc.
Additionally, since there is no control over the sources used to train LLMs, much of the content they learn and produce is false and biased (Marcus, 2022). Therefore, it is crucial to develop effective methods for detecting and mitigating the impact of LLM-generated disinformation; unfortunately, attempts to date are still ineffective (Mitchell et al., 2023).
Disinformation is not limited to text format; images, videos, and audio can also be spread, often being even more harmful than text. Face manipulation in images and videos, either partial or total, has been one of the most active areas of research to date and one that poses a significant challenge to fighting disinformation. Full face generation refers to the creation of a completely fake face (Serengil; Ozpinar, 2021) using architectures such as ProGAN (Karras et al., 2018) or StyleGAN (Karras; Laine; Aila, 2019). Partial manipulation, in turn, refers to modifications like face swapping, attribute manipulation (hair, skin tone, eyes, etc.), face re-enactment, and lip-syncing (Tolosana et al., 2020). Conversely, there is a wide range of ML-based techniques for detecting deepfakes. In particular, convolutional neural networks (CNNs) with attention mechanisms have recently been used (Dagar; Vishwakarma, 2022; Rana et al., 2022; Tolosana et al., 2020), but their effectiveness lags behind the advances in deepfake generation and the possibility of manually refining deepfakes in post-production.
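For illustration only, the sketch below defines a tiny CNN-based binary deepfake classifier in Keras; the architecture, input size, and training setup are assumptions, and state-of-the-art detectors are far larger and trained on dedicated deepfake datasets.

```python
# An illustrative CNN-based binary deepfake classifier (toy architecture;
# real detectors are much deeper and trained on labelled real/fake face crops).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input((128, 128, 3)),             # assumed face-crop size
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # P(fake)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()  # training would use labelled real/fake face images
```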

Conclusions and future work
The speed and the amount of available data make it challenging to distinguish trustworthy information from disinformative content that is often disguised as legitimate and appeals to emotions and beliefs. Computational technologies have arisen as suitable tools to address disinformation, but they also have exceptional capabilities to exacerbate the problem through invention and falsification. In this manuscript, we have described the current trends in AI and ML applied to disinformation detection and characterisation, as well as the challenges posed by synthetic text and media generation.
Most of the reviewed proposals perform an a posteriori analysis of disinformative content once it has become impactful, focusing on different features that can be used in automatic classification. While the approaches assume that solutions in specific problems and domains can be extended to others, they strongly depend on the datasets used and the processes to create them. Therefore, there is a need for new, high-quality, and unbiased datasets, particularly in languages other than English. Furthermore, more efforts are required to transfer and evaluate trained models from one domain to another. Similarly, AI-based disinformation analysis tools are not widely available or lack the maturity that non-technological users need.
Early detection of disinformation is crucial to limit the impact of a phenomenon that is otherwise impossible to deter. Therefore, we identify two future research directions for fighting disinformation with AI and ML. The first is the study of patterns of creation and propagation, including paths and ecosystems, to better understand and anticipate the spread of harmful propaganda and conspiracy theories. The second is the application of intelligent technologies to amplify the scope of fact-checks and media literacy, just as disinformers engineer their messages to reach wider audiences.
These initiatives will require creating explainable AI methods that not only provide results but also justify them, as well as facilitating the interplay between technological tools and practitioners with deep domain knowledge, including fact-checkers, experts, and decision-makers. By addressing these challenges, we will progress towards AI-based systems that can detect and combat disinformation more effectively, ultimately contributing to a better-informed society.