Semantic similarity models for automated fact-checking: ClaimCheck as a claim matching tool



Keywords:

Verification, Automated fact-checking, Claim matching, Semantic similarity, Paraphrase models, Disinformation, Artificial intelligence, AI, Algorithms, Software


This article presents the experimental design of ClaimCheck, an artificial intelligence tool for detecting repeated falsehoods in political discourse using a semantic similarity model developed by the fact-checking organization Newtral in collaboration with ABC Australia. The study reviews the state of the art in algorithmic fact-checking and proposes a definition of claim matching. Additionally, it outlines the scheme for annotating similar sentences and presents the results of experiments conducted with the tool.







How to cite

Larraz, I., Míguez, R., & Sallicati, F. (2023). Semantic similarity models for automated fact-checking: ClaimCheck as a claim matching tool. Profesional de la información / Information Professional, 32(3).



Artificial Intelligence