Semantic similarity models for automated fact-checking: ClaimCheck as a claim matching tool
DOI: https://doi.org/10.3145/epi.2023.may.21
Keywords: Verification, Automated fact-checking, Claim matching, Semantic similarity, Paraphrase models, Disinformation, Artificial intelligence, AI, Algorithms, Software
Abstract
This article presents the experimental design of ClaimCheck, an artificial intelligence tool for detecting repeated falsehoods in political discourse using a semantic similarity model developed by the fact-checking organization Newtral in collaboration with ABC Australia. The study reviews the state of the art in algorithmic fact-checking and proposes a definition of claim matching. Additionally, it outlines the scheme for annotating similar sentences and presents the results of experiments conducted with the tool.
License
Copyright 2023 Profesional de la información
This work is licensed under a Creative Commons Attribution 4.0 International license.
Conditions for disseminating articles once they are published
Authors may freely publicize their articles on websites, social networks, and repositories.
However, the following conditions must be respected:
- Only the publisher's version should be made public. We ask that preprints, postprints, and printer's proofs not be published.
- The copy must be accompanied by a specific mention of the publication in which the text appeared, along with a clickable link to the URL: http://revista.profesionaldelainformacion.com
The journal Profesional de la información offers its articles in open access under a Creative Commons BY license.