Gender Distribution across Topics in the Top 5 Economics Journals: A Machine Learning Approach

Conde Ruiz, José Ignacio and Ganuza, Juan-José and García, Manu and Puch, Luis A. (2021) Gender Distribution across Topics in the Top 5 Economics Journals: A Machine Learning Approach. [ Documentos de Trabajo del Instituto Complutense de Análisis Económico (ICAE); nº 09, 2021, ISSN: 2341-2356 ]

Abstract

We analyze all the articles published in the top five (T5) Economics journals between 2002 and 2019 in order to find gender differences in their research approach. We implement an unsupervised machine learning algorithm, the Structural Topic Model (STM), so as to incorporate gender document-level meta-data into a probabilistic text model. This algorithm jointly characterizes the set of latent topics that best fits our data (the set of abstracts) and how the documents/abstracts are allocated to each latent topic. Latent topics are mixtures over words, where each word has a probability of belonging to a topic after controlling for journal name and publication year (the meta-data). Thus, the topics may capture research fields but also other, more subtle characteristics related to the way in which the articles are written. Using only data-driven methods, we find that females are unevenly distributed across the estimated latent topics. This finding relies on “automatically” generated built-in data given the contents of the abstracts of the articles in the T5 journals, without any arbitrary allocation of texts to particular categories (such as JEL codes or research areas).
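As a rough illustration of what an uneven distribution across latent topics means, the sketch below averages document-topic proportions within groups defined by a document-level covariate. It is not the STM itself, which estimates topic prevalence as a function of the meta-data inside the model; it is only a post-hoc summary under stated assumptions. The function name `mean_topic_share_by_group`, the group labels "F" and "M", and all numbers are hypothetical toy values and are not taken from the paper.

```python
from collections import defaultdict


def mean_topic_share_by_group(theta, groups):
    """Average document-topic proportions within each group label.

    theta  : list of per-document topic proportions (each row sums to 1)
    groups : list of group labels, one per document (hypothetical coding)
    """
    sums = {}
    counts = defaultdict(int)
    for proportions, group in zip(theta, groups):
        if group not in sums:
            sums[group] = [0.0] * len(proportions)
        for k, p in enumerate(proportions):
            sums[group][k] += p
        counts[group] += 1
    # Divide the accumulated proportions by the number of documents per group
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


# Toy document-topic proportions for four abstracts and three topics
# (illustrative values only, not results from the paper)
theta = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
]
groups = ["F", "F", "M", "M"]  # hypothetical covariate coding

shares = mean_topic_share_by_group(theta, groups)
for group, values in shares.items():
    print(group, [round(v, 2) for v in values])
```

If the groups were evenly distributed across topics, the averaged rows would be close to each other; large gaps in a given column indicate a topic on which one group is over-represented.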


Item Type: Working Paper or Technical Report
Additional Information:

We thank Antonio Cabrales, Pedro Delicado and Nagore Iriberri for helpful comments, and Elvira Alonso for excellent research assistance. We also thank the Editor and two anonymous referees for their suggestions, as well as session participants at the Computing in Economics & Finance Conference, Tokyo (virtual) 2021.
José Ignacio Conde-Ruiz, and Manu García and Luis Puch, respectively, acknowledge the Spanish Ministry of Science and Innovation for financial support through projects PID2019-105499GB-I00 and PID2019-107161GB-C32.
Juan-José Ganuza gratefully acknowledges the financial support from the Spanish Agencia Estatal de Investigación, through the Severo Ochoa Programme for Centres of Excellence in R&D (CEX2019-000915-S), and the Spanish Ministry of Education and Science, through project ECO2017-89240-P.
Corresponding Author: Juan-José Ganuza, Universitat Pompeu Fabra, Ramon Trias Fargas 27, 08005, Spain; E-mail: juanjo.ganuza@gmail.com

Uncontrolled Keywords: Machine Learning; Gender Gaps; Structural Topic Model; Gendered Language; Research Fields.
Subjects: Social sciences > Economics
Social sciences > Economics > Econometrics
JEL: I20, J16, Z13
Series Name: Documentos de Trabajo del Instituto Complutense de Análisis Económico (ICAE)
Volume: 2021
Number: 09
ID Code: 67146
Deposited On: 22 Jul 2021 07:23
Last Modified: 13 Sep 2021 11:27
