1. Gender Equality Monitor
David Doukhan | INA | France
Gender Equality Monitor (GEM) is a project initiated in July 2017 at INA (the French National Audiovisual Institute). It aims to automatically describe differences in the representation and treatment of women and men in French-language media such as TV, radio, newspapers and song lyrics collections. The ambition of this project is to analyze several million documents sampled over a period of more than 80 years, in order to produce the most comprehensive description of the representation of men and women in French media.
Automatic audiovisual indexing methods based on recent advances in machine learning (speaker gender detection, face recognition and characterization, speech-to-text and spoken language understanding) are proposed to deal with this challenging amount of data.
A large-scale study of women's and men's speech time was conducted on one million hours of audiovisual material, sampled from 1995 to 2018 across 21 French radio stations and 34 TV channels. Speech time, summarized as the Women Speech Time Percentage (WSTP), was estimated with inaSpeechSegmenter, an open-source machine learning tool built at INA that automatically detects women's and men's voices in audio signals. WSTP variations were analyzed across channels, years, hours, and regions. Key findings include that men spoke twice as much as women on TV and radio in 2018, and three times as much as women before 2004. Only one of the 55 channels considered, a music radio station, has a larger speech time for women than for men. WSTP is lower during high-ratings time slots on private channels. WSTP is higher for channels aimed at a female audience, and lower for sports and cultural thematic channels. Detailed estimates resulting from these analyses have been released as open data, allowing academics, journalists and audiovisual professionals to combine them with other structured information sources (channel governance, ratings, political events…).
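As an illustration of how a WSTP figure can be derived from a gender-labelled segmentation, the sketch below sums speech durations per label. It rests on assumptions that are not stated in this abstract: WSTP is taken to be women's speech time divided by the combined women's and men's speech time, and segments are assumed to be (label, start, stop) tuples with the labels "female" and "male". The function name and the sample data are hypothetical, and the code does not call inaSpeechSegmenter itself.

```python
from typing import Iterable, Optional, Tuple

# Assumed segment layout: (label, start_seconds, stop_seconds).
Segment = Tuple[str, float, float]


def women_speech_time_percentage(segments: Iterable[Segment]) -> Optional[float]:
    """Return WSTP in percent, or None when no speech was detected.

    Assumed definition: 100 * women's speech time / (women's + men's speech time).
    Segments with any other label (music, noise, silence) are ignored.
    """
    female = 0.0
    male = 0.0
    for label, start, stop in segments:
        duration = stop - start
        if label == "female":
            female += duration
        elif label == "male":
            male += duration
    total = female + male
    if total == 0:
        return None
    return 100.0 * female / total


# Hypothetical segmentation: 20 s of men's speech, 10 s of women's speech.
example_segments = [
    ("music", 0.0, 5.0),
    ("male", 5.0, 20.0),
    ("female", 20.0, 30.0),
    ("noEnergy", 30.0, 31.0),
    ("male", 31.0, 36.0),
]

print(round(women_speech_time_percentage(example_segments), 1))  # prints 33.3
```

Under this assumed definition, men speaking twice as much as women corresponds to a WSTP of about 33%, and three times as much to a WSTP of 25%.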
A smaller-scale study was conducted using face detection and gender classification software developed at INA. Facial exposure time was compared to speech time estimates. Early findings suggest that women's visual presence on French TV is larger than women's speaking time on all the channels considered. Additional preliminary work based on speech-to-text and lexicometry provided insights into the words chosen to refer to male and female speakers, showing clear differences between channels. More exhaustive analyses are being carried out to confirm these trends and monitor their evolution over time.
Based on these extremely encouraging results, the GEM project has recently obtained funding from ANR (the French National Research Agency). Starting in January 2020 and for a duration of 42 months, it will involve a transdisciplinary consortium of seven members: two major audiovisual media actors (INA and online streaming platform Deezer), two STEM laboratories specialized in automatic information extraction from text and speech (LIUM, LIMSI) and three humanities laboratories specialized in the study of gender and media (CARISM, LERASS, ENS LYON). The project will be based on three complementary lines of work: formalizing descriptors relevant for the quantification of representation differences between genders, implementing these descriptors using information extraction methods, and carrying out quantitative studies based on the exploitation of the descriptors obtained automatically.