Artificial intelligence applied to radio news: a case study of automatic segmentation of news items at RNE

DOI:

https://doi.org/10.3145/epi.2021.may.20

Keywords:

RTVE, RNE, Artificial intelligence, Language technologies, Voice-to-text transcription, Radio news, Audiovisual archives, Radio archives, Audiovisual documentation, Radio documentation, Diarization, News segmentation, WER, Word error rate, Precision, Quality control

Abstract

The results of a project on news segmentation at Radio Nacional de España (RNE), carried out by the RTVE Technological Innovation and Media Management areas, are presented. The aim of the project is to apply artificial intelligence to automatically transcribe a radio news program and segment it into its individual news items. Its main goals are to increase the accessibility of the content and to allow its reuse on various platforms and social media. The project was planned in two phases, covering system configuration and service delivery. The minimum quality criteria were defined in advance, both for automatic voice transcription and for news segmentation. For the speech-to-text process, the highest word error rate (WER) allowed was 10%, while the minimum precision required for news segmentation was 85%. System performance in both transcription and segmentation was considered sufficient, although a higher degree of accuracy in news cutting is expected in the coming months. The results show that, although these technologies are quite mature, adjustment and learning processes and human intervention are still necessary.
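As an illustration of the two quality criteria named in the abstract, the following minimal sketch computes a word error rate and a precision value and checks them against the 10% and 85% thresholds. It is not the project's evaluation code: the function names are hypothetical, WER is taken in its usual form (word-level edit distance divided by the number of reference words), and precision is assumed to be the share of detected news items that are correct, which the abstract does not spell out.

```python
# Hypothetical helpers; only the 10% and 85% thresholds come from the abstract.
MAX_WER = 0.10        # highest word error rate allowed for transcription
MIN_PRECISION = 0.85  # minimum precision required for news segmentation


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def precision(true_positives: int, false_positives: int) -> float:
    """Assumed definition: correct detections over all detections."""
    detected = true_positives + false_positives
    if detected == 0:
        raise ValueError("no detections to evaluate")
    return true_positives / detected


def meets_quality_criteria(wer: float, segmentation_precision: float) -> bool:
    """True when both thresholds from the abstract are satisfied."""
    return wer <= MAX_WER and segmentation_precision >= MIN_PRECISION
```

For example, a program transcribed with a WER of 0.08 and segmented with a precision of 0.90 would pass both checks, whereas a WER of 0.12 would fail regardless of segmentation quality.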

Published

2021-06-14

How to Cite

Bazán-Gil, V., Pérez-Cernuda, C., Marroyo-Núñez, N., Sampedro-Canet, P., & De-Ignacio-Ledesma, D. (2021). Artificial intelligence applied to radio news: a case study of automatic segmentation of news items at RNE. Profesional de la información, 30(3). https://doi.org/10.3145/epi.2021.may.20

Section

Non research articles