Applying AI and Cloud for content analysis at the RTVE Archive
by Fondo Documental RTVE – Estrategia Tecnológica RTVE.
The AI and cloud content analysis project was developed in cooperation with the Digital Strategy Area of RTVE as part of a strategic plan for the company’s digital transformation. Through a public tender, RTVE selected the proposal submitted by two Spanish companies, VSN and Etiqmedia. RTVE’s technological aim was to prepare the archive management system to receive automatically generated metadata, to measure the quality and quantity of that metadata, and to design automated workflows running from content selection to metadata integration in the archive system.
The project’s main goal was to automate metadata management, using Artificial Intelligence, for 11,000 hours of content from the RTVE Archive, including material produced by the broadcaster in the 1960s and 1970s. One of the main requirements was that the solution be integrated into a private cloud service provided by a supplier, using either the supplier’s own technology or integrated third-party technology. The project therefore required a MAM (Media Asset Management) tool able to work with AI engines in a private cloud environment. VSN based its proposal on its cloud-ready MAM system, VSNExplorer MAM, working with Etiqmedia’s AI engine.
The project has two phases: a testing phase, carried out from May 2021 to October 2021, and the final one-year implementation, which started in October 2021 and is currently in operation. During the testing phase, RTVE and VSN defined the workflows between teams and completed the equipment installation. This phase also successfully processed 160 hours of audio and video.
The workflow starts with the ingest of content from the RTVE Archive into VSNExplorer MAM. During this phase, the system creates an asset with the media and an XML file containing information about the content provided by the network. Once the content is ready in the system, the Etiqmedia AI engine starts the analysis and generates the content’s metadata. On the audio side, the technology produces a complete text transcription in just a few minutes, including capitalization and accentuation, and it recognizes people, places, events, products, organizations, and dates. It also extracts an automated catalog of the content and its main keywords. On the video side, the AI performs facial recognition and identifies and catalogs the scene, along with the objects, tags, and signs that appear in the footage.
When the analysis is over, all the extracted information is displayed in a single centralized interface in VSNExplorer MAM. Here, RTVE’s professionals can quickly review, check, and edit the results to bring them in line with the correct catalog parameters. Once this check is completed, the system automatically calculates the Word Error Rate (WER) of the transcription by comparing the engine’s output with the cataloger’s correction.
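The WER comparison described above can be illustrated with a short example. WER is conventionally defined as the word-level edit distance between a reference and a hypothesis (substitutions, deletions, and insertions) divided by the number of words in the reference. The following is a minimal sketch, assuming the cataloger’s corrected text is the reference and the engine’s transcription is the hypothesis. The function name, the whitespace tokenization, and the case- and accent-sensitive comparison are assumptions made for illustration; they do not describe how VSNExplorer MAM actually computes the figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    reference  -- the cataloger's corrected transcript (assumed ground truth)
    hypothesis -- the transcript produced by the AI engine
    """
    ref_words = reference.split()   # assumption: plain whitespace tokenization
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (word missing in hypothesis)
                curr[j - 1] + 1,                  # insertion (extra word in hypothesis)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref_words)
```

For example, `word_error_rate("el archivo de rtve", "el archivo rtve")` gives 0.25, because one of the four reference words is missing from the hypothesis. Since the comparison in this sketch is exact, a word transcribed without its accent counts as a substitution.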
After all these steps are completed, VSNExplorer MAM creates several XML files containing all this information, which are sent back to the RTVE Archive. Metadata generation is therefore fully automated and delivers an enormous amount of data within a few minutes; the results only need to be checked and controlled by RTVE’s catalogers. As a result, the process is much more efficient and accurate.
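To make the final export step more concrete, the sketch below shows how reviewed metadata could be serialized to XML with Python’s standard library. It is purely illustrative: the article does not describe the schema of the XML files exchanged with the RTVE Archive, so the function name and every element and attribute name here (`asset`, `transcription`, `entities`, `entity`, `wer`) are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(asset_id, transcript, entities, wer):
    """Serialize reviewed metadata for one asset to an XML string.

    All element and attribute names are hypothetical placeholders,
    not the actual VSNExplorer MAM or RTVE Archive schema.
    entities -- iterable of (entity_type, entity_name) pairs
    """
    root = ET.Element("asset", attrib={"id": asset_id})
    ET.SubElement(root, "transcription").text = transcript

    entities_el = ET.SubElement(root, "entities")
    for entity_type, entity_name in entities:
        entity_el = ET.SubElement(entities_el, "entity", attrib={"type": entity_type})
        entity_el.text = entity_name

    # Store the transcription quality measure next to the metadata it describes.
    ET.SubElement(root, "wer").text = f"{wer:.3f}"

    return ET.tostring(root, encoding="unicode")
```

A call such as `build_metadata_xml("A-001", "texto revisado", [("place", "Madrid")], 0.25)` returns a single XML string that a downstream system could parse back with `ET.fromstring`.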