Gender Differences in Mathematics Achievement: The Case of a Business School in Spain

Previous research on the gender gap in mathematics indicates that, in the case of Spain, the gap is stable and has even increased in recent years in both primary and secondary education. The objective of this paper is to analyze whether this gender gap persists when the academic level and the way of measuring performance change. A sample of 2,713 undergraduate management students (1,478 female and 1,235 male) was analyzed. The results show that the mathematical level at university entrance is significantly higher in male students, confirming the existence of the aforementioned gap. Nevertheless, throughout the first year of university studies, in most degrees there are no differences between male and female students in terms of their achievement in mathematics. In fact, in those cases in which differences do exist, they occur in the opposite direction, with female students achieving better performance.


Introduction
According to the latest available data, the majority of undergraduate students in the Spanish university system are women, 56% in the 2020 academic year (Instituto Nacional de Estadística, 2021). However, there still seems to be a gender gap when it comes to mathematics achievement. Fuentes De Frutos and Renobell Santaren (2020, p. 72) concluded that "[Spanish male students present better scores in mathematics [six PISA editions, from 2000 to 2015]] than female students in percentages that vary depending on the edition. The gender gap detected is stable and has even increased during the last editions with respect to the previous ones." In fact, the results of TIMSS (Trends in International Mathematics and Science Study) also show higher performance in mathematics in boys in Spain, both in primary and secondary education (UNESCO, 2019). That is, there seems to be a mathematical gap between pre-university students in Spain.
The possible existence of a gender gap in mathematics achievement has been extensively studied, and several meta-analyses suggest that, on average, this gap has disappeared (see Hyde, 2016). However, in the group of high performers, differences still seem to exist: "Overwhelmingly obvious is, however, the persisting trend of more male students in the group of high-achievers for both mathematics and science in many educational systems" (Meinck & Brese, 2019, p. 20). It seems that the differences between genders are larger among the "highest performing portion of the population" (Casey & Ganley, 2021, p. 19). This is consistent with the fact that most data on achievement in mathematics competitions (such as Olympiads) show gender differences in favor of males, although it seems that such differences cannot be explained by gender disparities among top-performing students alone (Steegh et al., 2019). However, the gender gap in the high-performers group "has closed rapidly over time" (Charlesworth & Banaji, 2019, p. 7233). Additionally, some authors point out that "gender differences could vary in direction depending on the mathematics content domain" (Leder & Forgasz, 2018, p. 695).
For undergraduate students, in the area of mathematics only four studies in the meta-analysis of Voyer and Voyer (2014) report higher grades in male students, two studies show no differences by gender, and fourteen articles report more favorable results for females (the mean estimated effect size of these samples measured by Cohen's d (female-male) is 0.12 and the 95% confidence interval is 0.04-0.20). All these studies took place in United States universities. Buchmann and DiPrete (2006) concluded, from the analysis of the National Educational Longitudinal Survey (NELS), that the female advantage in college performance in the USA has become a general pattern. However, as we have already indicated, it seems that a gender gap continues to exist at least among high performers (in some countries), despite the fact that the evidence points to its progressive reduction. And in the specific case of Spain, which we analyze in this paper, Fuentes De Frutos and Renobell Santaren (2020) concluded that the gender gap (in general terms, not only among high performers) not only exists but has even increased.
This gender gap in learning mathematics has important consequences for female representation in higher education, especially in STEM programs, and for women's career development (García-Holgado et al., 2020). There are compelling reasons for promoting gender equality. First, it is a social justice issue (Etzkowitz & Kemelgor, 2001). Second, it is a problem for sustainable development, which should not miss out on women's talent (Leggon, 2010; World Economic Forum, 2020). "Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all" and "Achieve gender equality and empower all women and girls" are included among the 17 Sustainable Development Goals of the 2030 Agenda adopted by the United Nations General Assembly (United Nations, 2015).
The literature highlights the influence of sociocultural factors (Guiso et al., 2008; Kane & Mertz, 2012), so gender equality action plans could diminish these differences (Miyake et al., 2010; Polcuch et al., 2018). Stereotypes of mathematics as a male domain are of particular relevance (UNESCO, 2019), and they worsen the gender gap in different ways (Jouini et al., 2018). Firstly, they affect decision-making and result in gender self-selection bias. Secondly, stereotypes cause low self-esteem among female students, which generates stereotype threat (Osborne, 2001; Spencer et al., 1999; Steele, 1997) and math anxiety (Bieg et al., 2015). On the one hand, Doyle and Voyer's (2016) meta-analysis concluded that stereotype threat manipulations can affect women's performance in mathematics. On the other hand, some papers have reported that female students tend to have higher levels of math anxiety (Baloglu & Kocak, 2006; Núñez-Peña et al., 2015; Stoet et al., 2016), and the meta-analysis developed by Else-Quest et al. (2010) concluded that female students reported higher math anxiety than male students, something that may be due to "stereotyped beliefs about how women should feel about math" (Ramirez et al., 2018, p. 154). However, although anxiety levels may be different between male and female students, the negative relationship between math anxiety and academic performance seems to be similar in both genders, according to the results of the meta-analyses conducted by Zhang et al. (2019), Barroso et al. (2021), and Caviola et al. (2022).
Another element closely related to stereotypes is culture: gender differences are not universal, as there is considerable cross-cultural variability (Reilly et al., 2019). The meta-analysis developed by Ghasemi and Burley (2019) concluded that "gender difference in liking mathematics, confidence in mathematics, and valuing mathematics are very small and negligible in general [although] variations were observed in the magnitude of the differences across different nations." Along the same lines, the meta-analysis by Else-Quest et al. (2010, p. 125) pointed out that "on average, males and females differ very little in mathematics achievement [...]. Yet, these findings of mean similarities in math are qualified by substantial variability across nations." As Charlesworth and Banaji (2019, p. 7233) pointed out, studies conducted by several researchers have shown that there is evidence that gender differences have "mutability based on local contexts." Similarly, the meta-analysis by Li et al. (2018) on elementary and secondary education in China concluded that gender differences vary across school locations (urban vs. rural schools).
Other factors, such as age and educational level, do not seem to have a neutral effect on the mathematics gender gap, as concluded in the meta-analysis conducted by Lindberg et al. (2010). The evolution of the mathematics gender gap with age is studied by Borgonovi et al. (2017) and Borgonovi et al. (2018) by comparing the results of three large-scale cross-national assessments: PISA, TIMSS, and PIAAC (Programme for International Assessment of Adult Competencies). These articles concluded that the gender gap widens when participants become adults, except for the group of high-achievers, and that the cause could be the specialization of genders in different areas of education and the labor market resulting from self-selection bias.
The way of measuring mathematics achievement must also be considered in the study of gender differences. For large cross-national assessments, the use of standardized tests to measure achievement has decided advantages, but they could fail to evaluate objectives such as a comprehensive vision or application of methods in complex situations (Biggs & Tang, 2007; Morales, 1995). According to Ramsden (1992, p. 191), "a greater variety of methods may be administratively inconvenient, but it offers more latitude for students to display their knowledge, and it has the potential to provide a more accurate depiction of each student's achievement." An alternative way to measure achievement is to take course grades. A recent work (Griselda, 2020) concluded that males perform better than females in tests where there is a higher proportion of multiple-choice questions, and standardized tests (used, e.g., in many university admission processes) use exclusively or in large part multiple-choice items. However, when grades are used, instead of standard tests, they frequently favor girls. The meta-analysis by Voyer and Voyer (2014) does find an overall female advantage in scholastic achievement in mathematics. In line with this result, Lyons et al. (2022) concluded that low-stakes measures of mathematics achievement, such as grades, favor female students, while high-stakes measures (such as the admission test) tend to reverse the gap. As Gallagher and Kaufman (2005) pointed out, females regularly get lower scores than males when considering standardized tests of mathematics, although no differences exist in the classroom.
In summary, it appears that the gender gap varies greatly across cultures and countries, depends on how achievement is measured, and could vary depending on the mathematical field, among many other factors. As Leder (2019, p. 303) pointed out, "after four decades of consistent, persistent, and often insightful research on gender and mathematics there seems to be at best limited consensus on the size and direction of gender differences in mathematics performance."

Objectives
The present study analyzes the possible gender gap using the mathematics grades of Business Administration students at a Spanish university. According to the World Economic Forum, Spain occupies the eighth position in the ranking of the most advanced countries in terms of gender parity but must bridge large gaps in wages, income, and the presence of women in managerial positions (World Economic Forum, 2020). Our goal is to test whether the gender gap in mathematics achievement detected by Fuentes De Frutos and Renobell Santaren (2020) for pre-university students in PISA tests remains when the academic level changes (from pre-university students to university students) and the way of measuring performance changes (final grades in mathematics courses instead of standardized tests). Thus, the research question we intend to answer is whether there is a gender gap in mathematics at the level of undergraduate business students. The hypothesis to be tested is that a gender gap in favor of men does not exist when performance in mathematics at the university is measured by grades.
A disadvantage of our approach, that is, measuring mathematics achievement through grades in mathematics courses, is that it is local in scope, because the contents of university courses in mathematics may differ depending on the country or center. This hinders the comparability of results. To overcome this drawback, we report standardized gaps, which allow our results to be integrated into cumulative science from which more general conclusions could be drawn (Borgonovi et al., 2017; Cohen, 1988; Lakens, 2013).
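As an illustration of what a standardized gap looks like in practice, the following minimal sketch computes Cohen's d (female minus male) with a pooled standard deviation and an approximate large-sample 95% confidence interval. The function names and the grades are invented for illustration only, and the interval uses a common textbook approximation that is not necessarily the one applied in this study.

```python
import math
import statistics


def cohens_d(female, male):
    """Standardized mean difference (female - male) using the pooled SD."""
    n_f, n_m = len(female), len(male)
    var_f = statistics.variance(female)  # sample variance (n - 1 denominator)
    var_m = statistics.variance(male)
    pooled_sd = math.sqrt(((n_f - 1) * var_f + (n_m - 1) * var_m) / (n_f + n_m - 2))
    return (statistics.mean(female) - statistics.mean(male)) / pooled_sd


def cohens_d_ci(d, n_f, n_m, z=1.96):
    """Approximate 95% CI based on the large-sample standard error of d."""
    se = math.sqrt((n_f + n_m) / (n_f * n_m) + d ** 2 / (2 * (n_f + n_m)))
    return d - z * se, d + z * se


# Hypothetical grades on a 0-10 scale (assumption: not data from the study).
female_grades = [5.5, 6.0, 4.5, 7.0, 5.0, 6.5]
male_grades = [5.0, 5.5, 4.0, 6.5, 5.0, 6.0]

d = cohens_d(female_grades, male_grades)
low, high = cohens_d_ci(d, len(female_grades), len(male_grades))
print(f"d = {d:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

With this sign convention, a positive d indicates higher grades for female students, which matches the (female-male) convention quoted from Voyer and Voyer (2014).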
To the best of our knowledge, there is little research about the gender gap in mathematics for undergraduate students in Spain when achievement is measured by grades. Therefore, this study contributes a relevant result to the academic literature and provides evidence that the gender gap in mathematics learning is not generalizable.

Participants
The study was carried out at the Comillas Pontifical University, a medium-sized private Spanish university. Currently, the Comillas Pontifical University is included among the best universities in the world (World Class University) according to the QS ranking.
The participants were 2,713 undergraduate management students (1,478 female and 1,235 male), from five different degrees (three single and two double degrees), who entered Comillas Pontifical University between the academic years 2012-2013 and 2017-2018: Business Administration (BA), Bilingual Business Administration in English (BA Bilingual), Business Administration and Law (BA + Law), Business Administration with International Mention (BA International) and Business Administration and International Relations (BA + I. Relations). Table 1 shows the sample composition, as well as the grades obtained in the entrance test and in the two undergraduate subjects considered in this paper.

Procedure
A quantitative research methodology has been chosen, with a non-experimental design structured in four phases: identification of the most relevant variables to explain academic performance in mathematics among university students, gathering of information, exploratory data analysis, and statistical analysis.
In the first stage, a comparison of means for the math level at the entrance to the degree was carried out. As part of its admission process, the Comillas Pontifical University administers a battery of tests, one of which is in mathematics. As it is an identical test for all candidates, it constitutes a more objective indicator of the mathematical level than, for example, the grades obtained in high school, where differences may exist due to the school or province of origin. We need to bear in mind that this mathematical level is measured through an achievement test. In all cases a normality test (Shapiro-Wilk) has been carried out, as well as a test of equality of variances using the Levene (Brown-Forsythe) test. In those cases in which normality is not verified, but equality of variances is, the non-parametric Mann-Whitney-Wilcoxon test has been used, and in those cases in which both conditions are verified, the t-test has been applied.
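The selection rule for the comparison of means can be written as a small decision function. The sketch below is only an illustration: it takes the p-values of the preliminary tests as inputs, and the function name, the 0.05 significance level and the per-group normality check are our assumptions, since the text does not specify them. Combinations not described in the text are flagged rather than guessed.

```python
def choose_mean_comparison(p_normal_female, p_normal_male, p_equal_var, alpha=0.05):
    """Pick the two-sample test from the p-values of the preliminary checks.

    p_normal_*  : Shapiro-Wilk p-values for each group (assumed per-group check)
    p_equal_var : Levene (Brown-Forsythe) p-value for equality of variances
    alpha       : significance level (assumption: 0.05)
    """
    normal = p_normal_female >= alpha and p_normal_male >= alpha
    equal_var = p_equal_var >= alpha
    if normal and equal_var:
        return "t-test"
    if not normal and equal_var:
        return "Mann-Whitney-Wilcoxon"
    # Unequal variances are not covered by the procedure described in the text.
    return "not specified in the text"


# Hypothetical p-values, invented for illustration.
print(choose_mean_comparison(0.40, 0.25, 0.60))
print(choose_mean_comparison(0.001, 0.30, 0.60))
```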
In the second stage, we have employed a state-of-the-art algorithm published in May 2022 for the sensitivity analysis of neural networks (Pizarroso et al., 2022). These researchers have implemented in R a technique that allows estimating sensitivities to input variables in a neural network (NN) model. In a very schematic way, we can define a NN as a combination of processing units, called perceptrons, that generates a map between a set of input variables $x_i$ (independent variables) and an output variable $y$ (dependent variable). For a regression problem such as the one stated in this research, in which only one hidden layer is used, the mathematical expression can be formulated as:

$$y = f(x, w) = w^{(2)}_{0} + \sum_{j=1}^{N} w^{(2)}_{j}\,\varphi\!\left(w^{(1)}_{0j} + \sum_{k} w^{(1)}_{kj}\,x_k\right)$$

In this expression, $N$ is the number of neurons in the hidden layer, $\varphi(\cdot)$ is an activation function (the sigmoid function being the most conventional one), and the weights $w^{(i)}$ of layer $i$ are the parameters that will be estimated using the dataset. Frequently a regularization term (decay) is used to prevent overfitting. For a more detailed explanation, the textbook by Aggarwal (2018) can be consulted. After estimating the parameters $w^{(i)}$, the sensitivity of the dependent variable ($y$) to the independent variables ($x_i$) is calculated. The calculation method is the corresponding partial derivative: $\partial f(x, w) / \partial x_k$.

Although, as Pizarroso et al. themselves point out, there already were other algorithms that facilitated the understanding of NNs, their R package has an important advantage: the interpretation of the results is very similar to that of the models commonly used in the social sciences. The implications are considerable, since it allows NN models to be used for explanatory purposes, and to do so in terms very similar to those of, for example, a regression model. Not only do NNs have higher power than more conventional models, but they also do not require a specific functional specification. Any nonlinear effect, interaction, or change in trend will be identified automatically, without the need for it to be explicitly formulated by the researcher. To give a concrete example from this work, there is some evidence that pre-university performance has a nonlinear effect on university achievement. Therefore, if a regression model is used, this effect should be explicitly included in the model specification. This is unnecessary in the case of a NN: if a nonlinear effect exists, it will be automatically identified by the network. This opens the door to identifying complex effects that have not been previously described in the literature.

The optimal NN in each case has been identified using the grid method, considering between 1 and 10 neurons in the hidden layer, and with a decay between $10^{-7}$ and $10^{-2}$. The selection criterion used was the root-mean-square error (RMSE). In all cases, 10-fold cross-validation has been used to avoid overfitting problems. Figure 1 shows the optimal network identified in all cases (7-1-1), except in the case of Mathematics 2 in BA + Law, where the optimal NN is 7-2-1. However, the difference with a 7-1-1 network is negligible: the 7-2-1 network (RMSE = 0.856 and $R^2$ = 0.272) is very slightly better than a 7-1-1 network with the same penalty (RMSE = 0.855 and $R^2$ = 0.270). So, for simplicity and consistency with the other models, the latter has been chosen also in this group.
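To make the sensitivity calculation concrete, the following minimal sketch evaluates the partial derivatives of a single-hidden-layer network with sigmoid activations. It is written in plain Python rather than with the R package used in the study, and it assumes a linear output unit. The weights, the 2-2-1 architecture and the function names are invented for illustration and are not taken from the fitted models.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, hidden, output):
    """Forward pass. hidden: list of (bias, weights) per hidden neuron;
    output: (bias, weights over hidden neurons); linear output unit (assumption)."""
    out_bias, out_weights = output
    total = out_bias
    for (bias, weights), v in zip(hidden, out_weights):
        total += v * sigmoid(bias + sum(w * xk for w, xk in zip(weights, x)))
    return total


def sensitivities(x, hidden, output):
    """Analytic partial derivatives of the output with respect to each input at x."""
    _, out_weights = output
    grads = [0.0] * len(x)
    for (bias, weights), v in zip(hidden, out_weights):
        a = sigmoid(bias + sum(w * xk for w, xk in zip(weights, x)))
        for k, w in enumerate(weights):
            # Chain rule: output weight * sigmoid derivative * input weight.
            grads[k] += v * a * (1.0 - a) * w
    return grads


# Hypothetical 2-2-1 network (assumption: weights chosen arbitrarily).
hidden_layer = [(0.1, [0.5, -0.3]), (-0.2, [0.8, 0.4])]
output_layer = (0.05, [1.2, -0.7])
point = [0.3, -1.0]

print(sensitivities(point, hidden_layer, output_layer))
```

Evaluating `sensitivities` at every observation yields one slope per observation and per input, which is the raw material for the summary statistics reported in the tables.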
To the best of our knowledge, this is the first time that the NeuralSens algorithm has been applied in the education field. Therefore, we considered it necessary to compare the results obtained with linear regression models, to determine whether there are differences between the two approaches. To this end, we have worked with standardized variables, and in those cases in which problems of heteroscedasticity were detected, robust standard errors have been used. As will be discussed in the next section, both models (linear regression and NN) provide practically identical results except in one case (BA Bilingual degree). In that case, the NN provided a better fit to the data because it had captured a non-linear effect. This confirms the adequacy of interpretable NNs: in the absence of unknown nonlinear effects, that is, if the functional specification of the regression model is perfect, both models will provide virtually identical results. But if there is any unknown effect, as is the case for one of the five degrees studied, the NN will outperform the linear regression model.

Variables
The dependent variables used are the grades obtained in the subjects of Business Mathematics 1 and Business Mathematics 2. Business Mathematics 1 is offered in the first semester of the first year and is structured around four blocks: vector spaces, linear maps, quadratic forms, and theory of the integral. It is one of the most complex subjects in the first year, at least in terms of academic results, which are usually substantially lower than those obtained by students in other subjects. In fact, the average grade (in the first call) in the sample considered is 5.1 out of 10. Business Mathematics 2 is offered in the second semester of the first year and is organized into two different blocks: functions of multiple variables and theory of optimization. In this subject, either because it is taught in the second semester when students have already adapted to the university, or because the content is less complex or more attractive to them, the results are slightly better, with an average grade of 5.4 out of 10. In both cases, the evaluation is based on a series of mid-term exams and a final exam.
Regarding the independent variables, the first one is the students' gender (Gender), the object of study in this paper, a binary variable that takes value 1 for female and 0 for male students. For the control variables, we have followed the criteria of Arroyo-Barrigüete et al. (2020). The first control variables to be considered are pre-university performance indicators, as they have been identified as among the indicators with the greatest predictive capacity in university students, since they synthesize both the student's work capacity and background. So, models include the results from three admission exams carried out by Comillas Pontifical University, focused on students' knowledge in Mathematics (TMathematics), Spanish (TSpanish) and English (TEnglish). Additionally, we have included the EvAU grade (Evaluación para el Acceso a la Universidad, Evaluation for University Access). In the case of regression models, we have also included its quadratic term to capture possible nonlinear effects (Arroyo-Barrigüete et al., 2020). In the Spanish university system, the EvAU is the equivalent of the SAT, in the sense that the score obtained is used by public universities to select their future students. It is composed of the score obtained in a test that evaluates the competence of students in different areas, and the average score in the two years of high school. In the case of private universities, including the Comillas Pontifical University, it is usual to select students using both the grade obtained in EvAU and the grade reached in their own entrance exams.
Another control variable included in the model is whether the student comes from a high school in Madrid (Madrid), the city where the university campus is located. It is a binary variable that takes value 1 for Madrid and 0 otherwise.
In the same way, the high school specialty (HSSpecialty) has been included. It is a binary variable that takes value 1 for science and 0 for social science and humanities. In the Spanish educational system, high school students can choose different branches of specialization. The science major is for students who wish to pursue a STEM-style degree, while the social sciences and humanities major is recommended for those who will choose a degree in those fields, which includes the business administration degree. One of the key differences between the two majors is the content in mathematics, which is significantly more demanding in the science major. Finally, all regression models include the interaction of this variable with the average grade in EvAU (Interaction EVAU-HSS). The reason to include this interaction is the existence of certain evidence indicating that the effect of the EvAU grade on academic performance at university is different depending on the high school specialty (Arroyo-Barrigüete et al., 2020).
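The variable coding described in this section can be summarized in a short sketch that builds the regressors of the regression models for one student. The dictionary layout, the function name, the key names for the derived terms and the example values are hypothetical; only the variable names and the 0/1 codings come from the text. The NN would receive only the seven original inputs, without the quadratic and interaction terms.

```python
def regression_features(student):
    """Build the regression regressors for one student record (illustrative only)."""
    evau = student["EvAU"]
    hss = student["HSSpecialty"]  # 1 = science, 0 = social science and humanities
    return {
        "Gender": student["Gender"],  # 1 = female, 0 = male
        "TMathematics": student["TMathematics"],
        "TSpanish": student["TSpanish"],
        "TEnglish": student["TEnglish"],
        "EvAU": evau,
        "EvAU_squared": evau ** 2,  # quadratic term, regression models only
        "Madrid": student["Madrid"],  # 1 = high school in Madrid, 0 otherwise
        "HSSpecialty": hss,
        "Interaction_EvAU_HSS": evau * hss,  # interaction, regression models only
    }


# Hypothetical standardized scores (assumption: invented values).
example = {
    "Gender": 1, "TMathematics": 0.4, "TSpanish": -0.1, "TEnglish": 0.7,
    "EvAU": 1.5, "Madrid": 0, "HSSpecialty": 1,
}
features = regression_features(example)
print(features)
```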

Crude Analysis
As can be seen in Figure 2, the math level at the entrance to the degree, evaluated according to the score obtained in the math test, is higher in the case of male students, that is, there is a statistically significant difference in favor of male students. Considering the overall mean (4.37), female students are 1.4 times more likely to be below this value (Risk Ratio = 1.4, 95% CI [1.3, 1.5], p-value < .001). This difference is confirmed by the corresponding comparisons of means (Table 2), which show that there are statistically significant differences in all the degrees. That is, in the five degrees, it is the male students who begin their university studies with a higher mathematical level, measured with a standardized test, although Cohen's d indicates a small effect size in all cases. So, at the pre-university level, when mathematical knowledge is measured through an achievement test (specifically, admission tests), the existence of a gap in mathematics is confirmed.
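For readers unfamiliar with the risk ratio, the sketch below shows how such a ratio and its 95% confidence interval can be obtained from a 2x2 table of counts, using the usual log-scale standard error. The counts are invented for illustration and are not the counts behind the figures reported above; the function name is also ours.

```python
import math


def risk_ratio(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of group A versus group B with a log-scale 95% CI."""
    rr = (events_a / n_a) / (events_b / n_b)
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high


# Hypothetical counts of students scoring below the overall mean
# (assumption: 60 of 100 in group A, 40 of 100 in group B).
rr, low, high = risk_ratio(60, 100, 40, 100)
print(f"RR = {rr:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

An interval that excludes 1 indicates that the two groups differ in their probability of falling below the threshold.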

Neural Network Analysis
We will initially work with the case of BA to illustrate, as an example, the NN and regression comparison. Table 3 shows the results for the subject of Mathematics 1 in the BA degree. The mean sensitivity, sensitivity standard deviation and mean squared sensitivity, as proposed by Pizarroso et al. (2022), have been included. In all cases, these statistics refer to the slopes (sensitivities) obtained. Since a slope is obtained for each observation (equivalent to beta in a regression model), a given variable has as many betas as there are observations in the sample. Table 3 shows the mean, standard deviation, and mean square for each of them. With some nuances, the mean value can be compared with the beta obtained in the regression model, which has been included precisely for this purpose.
The results are interpreted according to Pizarroso et al. (2022, p. 12): if both mean and standard deviation are near zero, it indicates that the output is not related to the input (irrelevant variable); if the mean is different from zero and the standard deviation is near zero, it indicates that the output has a linear relationship with the input (relevant variable with a linear effect); finally, if the standard deviation is different from zero, regardless of the value of mean it indicates that the output has a non-linear relationship with the input (relevant variable with a non-linear effect).
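This interpretation rule can be expressed as a short sketch that summarizes the per-observation sensitivities of one variable and classifies it. The tolerance used to decide what counts as "near zero" is our assumption, since no threshold is given, and the mean squared sensitivity is computed here as a plain average of squared values, which may differ from the exact definition used in the R package. The sample sensitivities are invented.

```python
import statistics


def summarize(sens):
    """Mean, standard deviation and mean of squares of per-observation sensitivities."""
    mean = statistics.mean(sens)
    sd = statistics.pstdev(sens)
    mean_sq = statistics.mean([s * s for s in sens])  # assumed definition
    return mean, sd, mean_sq


def interpret(mean, sd, tol=0.05):
    """Apply the three-way rule; tol is a hypothetical 'near zero' threshold."""
    if sd > tol:
        return "relevant, non-linear effect"
    if abs(mean) > tol:
        return "relevant, linear effect"
    return "irrelevant"


# Hypothetical sensitivities for three inputs (assumption: invented values).
samples = {
    "constant slope": [0.30, 0.31, 0.29, 0.30],
    "no effect": [0.01, -0.01, 0.00, 0.00],
    "varying slope": [-0.40, 0.50, -0.30, 0.45],
}
for name, sens in samples.items():
    mean, sd, mean_sq = summarize(sens)
    print(name, "->", interpret(mean, sd))
```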
As can be seen, the results are very similar in both models. The signs of the effects are the same (negative for TSpanish and TEnglish and positive for the rest) and the mean values of the sensitivities in the NN are practically identical to the betas obtained in the regression model. In fact, TSpanish is not significant in the regression model and the NN confirms that it is an irrelevant variable (both mean and standard deviation are near zero). In the regression model, a non-linear effect appears in the EvAU variable (significant quadratic term). We also found that in the NN the sensitivity standard deviation of this variable is relatively high, above that of the rest of the variables except HSSpecialty, which is consistent with the regression results. In fact, the sensitivity standard deviation value for the HSSpecialty variable also suggests a possible non-linear effect. The detailed analysis with NeuralSens is shown in Figure 3 and confirms the results of Table 3, with EvAU and HSSpecialty being the two variables with the greatest impact on the dependent variable (higher mean squared sensitivity). This is exactly the same result as that obtained with the regression model. In this particular case, the use of a NN does not represent a relevant advantage over a conventional regression model, since no strong nonlinear effects or complex interactions have been detected. It simply allows us to confirm that the specification of the regression model was adequate. However, in other cases, such as BA Bilingual, the NN does detect non-linear effects, which leads to relevant differences with respect to the equivalent regression model.
Comparing the results of the NN for BA Bilingual (Table 4) with the corresponding regression models (Tables 6 and 7 in annex 1), important differences can be seen in some variables, gender being one of them.The detailed analysis (Figures 5 and 6 in annex 2) shows a bimodal distribution of the sensitivities of several variables, which may be due to an interaction or to a different effect by tranches.This leads to the gender variable, which according to the regression model was not significant, appearing as a relevant variable in the case of the NN, as will be analyzed below.

Crude Analysis
Crude analysis shows that the mathematical level at university entrance is significantly higher in male students, a result that is consistent with that of Fuentes De Frutos and Renobell Santaren (2020). In other words, at the pre-university level, when mathematical knowledge is measured through an achievement test (specifically, admission tests), the existence of a gap in mathematics is confirmed, which translates into worse results for women on the entrance test.

Mathematics 1
In four out of five degrees, the EvAU grade and the high school specialty are the two variables with the greatest impact. This confirms that previous academic performance has a relevant impact, which is consistent with previous research: the meta-analysis conducted by Richardson et al. (2012), after studying the research results between 1997 and 2010, indicates that high school grades, together with the results of admission tests (SAT/ACT in the United States), showed medium-sized positive correlations with grades obtained at university. Regarding high school specialty, it seems that students from the science major obtain higher grades than those from the social sciences and humanities major. This is interesting because although the content in mathematics is significantly more demanding in the science major, the social sciences and humanities major is designed specifically to pursue the careers analyzed in this paper. That is, its mathematical content is focused on the subjects that will be found in those degrees, which is not the case with the science major, oriented to STEM-style degrees. It is also confirmed that in the Spanish context, the change of residence seems to have a negative effect on academic performance, as the variable "Madrid" is relevant in four out of five degrees. This result is consistent with that of Arroyo-Barrigüete et al. (2020).
For the BA Bilingual degree, there is a small nonlinear effect (bimodal distribution in the sensitivities; see Figures 5 and 6 in annex 2), which points to a different effect by tranches or to an interaction. Since the aim of this paper is to analyze the effect of gender, it is beyond the scope of this research to analyze the reasons for this effect. However, it is worth at least mentioning the fact, as it highlights the greater power of NNs with respect to regression models. In fact, once the non-linear effect has been detected, a simple exercise confirms that there is indeed at least one significant interaction with the Gender variable. Table 5 compares the initial regression model (based on the literature review) with a regression model incorporating the interaction between the numerical variables and Gender (suggested by the NN). The existence of a significant interaction at the 95% level is confirmed, the $R^2$ of the regression model improves, and the beta obtained for gender is now much closer to the NN result. Further investigation of the rest of the possible interactions and differences in slope by tranches would lead to a regression model that would match the NN. This is an example of the advantages of a NN approach, as it allows nonlinear effects and interactions to be detected automatically.
In conclusion, there seems to be a positive effect of the Gender variable in the BA and BA Bilingual degrees, although in the latter case its effect is more complex as there is an interaction with another variable. That is, females achieve better results in Mathematics 1 in these degrees. In the rest of the degrees, Gender does not seem to be relevant, which is consistent with our research hypothesis. This result is also consistent with previous meta-analyses on the performance of undergraduate students in the area of mathematics, such as Voyer and Voyer (2014), with meta-analyses on the child population (Else-Quest et al., 2010; Hyde et al., 2008), and with those on the child and adult population (Lindberg et al., 2010).

Mathematics 2
As in the previous case, in four out of five degrees the EvAU grade and the high school specialty are the two variables with the greatest impact. In relation to the Gender variable, there is a linear and positive effect in BA and BA + I. Relations. In the case of BA Bilingual we detected exactly the same interaction effect as in Mathematics 1, so that the variable Gender has a positive effect, and additionally interacts with the variable TMathematics. In BA + Law and BA International the gender variable has no effect. Therefore, the conclusions are practically identical to those obtained in the case of Mathematics 1, although in this case the advantage in favor of women occurs in three degrees, instead of only two. This small difference may be due to the different contents of the two subjects, since Leder and Forgasz (2018, p. 695) stated that gender differences could vary depending on the mathematics content domain.

Overall Interpretation of the Results
The mathematical level at university entrance (measured with a standardized test) confirms the existence of a gender gap. However, performance in two mathematics subjects during the first year (measured by grades) points to the absence of a gender gap. These results are coherent with the evidence that the gender gap behaves differently when mathematics achievement is measured with achievement tests than when it is measured by scholastic achievement (Voyer & Voyer, 2014). Gallagher and Kaufman (2005) pointed out that females regularly obtain lower scores than males on standardized tests of mathematics, although no differences exist in the classroom. Griselda (2020) concluded that males perform better than females in tests with a higher proportion of multiple-choice questions (such as the test used in the university admission process analyzed in this paper), whereas grades frequently favor girls. Similarly, Lyons et al. (2022) stated that low-stakes measures of mathematics achievement (such as grades) favor female students, while high-stakes measures (such as the admission test) tend to reverse the gap. Therefore, our results are fully consistent with previous research and suggest that the gap detected at entry is a consequence of using standardized tests.
However, given the design of this research, we cannot conclusively state that this is the only relevant factor. For example, it has been proposed that teaching methodologies that tend to reduce math anxiety will favor the performance of female students. Núñez-Peña et al. (2015) pointed out that feedback not only favors students' learning in general but may also reduce the impact of math anxiety on the performance of students with high math anxiety. In this sense, one of the fundamental pillars of the teaching methodology used at the Comillas Pontifical University, specifically in mathematics subjects, is to provide constant feedback. It is therefore an element that can contribute to reducing the gender gap. Additionally, although results are mixed, some studies have found positive effects of same-gender teachers (Delaney & Devereux, 2021). "[F]emale STEM professors not only provide positive role models for women, but they also help to reduce the implicit stereotype that science is masculine in the culture-at-large" (Young et al., 2013, p. 283). Eble and Hu (2019, p. 44) concluded that "positive role models such as female math teachers can counter the harms that exposure to these beliefs [that boys have  (Eble & Hu, 2020). Again, in the degrees analyzed in this paper, math teachers are mostly female (from 60% to 80% depending on the year), providing female students with positive role models for their math performance. Finally, as women enter traditionally male-dominated environments, they may feel as if they do not belong, which can lead them to develop the well-known "impostor effect." In fact, impostor feelings increase in situations where one sex is predominant (Harvey & Katz, 1985, as cited in Wierzchowski, 2019). The fact that the percentages of men and women are very similar in the five degrees analyzed, with a higher percentage of women in some cases, may contribute to mitigating the impostor effect that usually appears in degrees in which women are underrepresented.

Conclusions
The research question we set out to answer in this paper is whether there is a gender gap in mathematics among undergraduate business students. Our conclusion is that such a gap does not exist; that is, there do not seem to be relevant differences in mathematics performance according to gender at the higher education level. At the pre-university level, when mathematical knowledge is measured through an achievement test, a gap exists: male students perform better in mathematics than their female peers. However, throughout the first year of university studies, it has been observed that in most degrees there are no differences between men and women in terms of their performance in mathematics. In fact, in those cases in which differences do exist, they occur in the opposite direction, with female students achieving better performance. In this regard, we can confirm the research hypothesis of the present paper: the gender gap in favor of men does not exist when performance in mathematics at the university level is measured by grades. Therefore, it is possible that the gender gap in mathematics is an artifact derived from several factors, such as the way mathematics achievement is measured, the teaching methodologies, the positive or negative role models provided by instructors, or the underrepresentation of women in some knowledge areas.
The second conclusion of this paper is related to the methodological approach employed. By comparing the results of NeuralSens (Pizarroso et al., 2022) with regression models, we have verified that the interpretable NN approach not only provides results consistent with linear models but also identifies, with surprising ease, nonlinearities that are not simple to detect with more traditional models. This result is in itself valuable for social science research. To the best of our knowledge, this is the first time that the NeuralSens algorithm has been used in education. With this paper, we show that this algorithm, initially designed for engineering problems, has a direct and useful application in the social sciences as well.
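The idea behind a sensitivity-based reading of a NN can be sketched as follows. This is a minimal illustration, assuming a one-hidden-layer tanh network with random, untrained weights; it does not use the NeuralSens package or the models fitted in this paper, and the names `mlp_forward` and `mlp_sensitivities` are hypothetical. The sensitivities are taken here to be the partial derivatives of the output with respect to each input, evaluated at every sample and then summarized.

```python
import numpy as np

def mlp_forward(X, W1, b1, W2, b2):
    """One-hidden-layer tanh network; returns predictions of shape (n,)."""
    H = np.tanh(X @ W1 + b1)          # hidden activations, shape (n, h)
    return H @ W2 + b2

def mlp_sensitivities(X, W1, b1, W2):
    """Partial derivatives d(output)/d(input) per sample, shape (n, d)."""
    H = np.tanh(X @ W1 + b1)          # shape (n, h)
    dH = 1.0 - H ** 2                 # derivative of tanh at each hidden unit
    return (dH * W2) @ W1.T           # chain rule: (n, h) @ (h, d)

# Random weights and inputs, purely illustrative (assumption: no training).
rng = np.random.default_rng(1)
d, h, n = 3, 4, 200                   # inputs, hidden units, samples
W1 = rng.normal(size=(d, h))
b1 = rng.normal(size=h)
W2 = rng.normal(size=h)
b2 = 0.1
X = rng.normal(size=(n, d))

S = mlp_sensitivities(X, W1, b1, W2)

# Per-input summaries of the sensitivity distribution.
summary = {
    "mean": S.mean(axis=0),           # average direction and size of the effect
    "std": S.std(axis=0),             # spread: near zero for a linear effect
    "mean_sq": (S ** 2).mean(axis=0), # overall importance of the input
}
for name, values in summary.items():
    print(name, np.round(values, 3))
```

Under this reading, a mean far from zero with a small standard deviation is consistent with a roughly linear effect, whereas a wide or bimodal distribution of sensitivities for an input suggests a nonlinearity or an interaction, as described for the BA Bilingual degree.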
Finally, we must point out that the main limitation of this study is that we have worked with students from only one university. It is true that the sample size is relatively large, but to guarantee the external validity of these results it would be necessary to carry out a similar analysis with data from other Spanish universities and from different degrees. In this regard, standardized gaps are reported in Annex 3 to facilitate integration in possible future meta-analyses. We believe that it would be very interesting to replicate this work in universities and degrees where there is a greater imbalance between the percentages of male and female students, as is the case in some engineering schools. This would allow us to confirm or refute the results obtained in the present paper and, at the same time, to evaluate the impostor effect, which according to previous literature may play an important role in the gender gap in mathematics. In any case, we consider our results to be notably robust, because of both the sample size and the analysis methodology. Therefore, we can conclude that in our sample there is no gender gap in mathematics when the level of achievement is measured on the basis of the grades obtained.
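For readers who wish to compute a comparable standardized gap on their own data, the following is a minimal sketch. It assumes the standardized gap is a standardized mean difference (Cohen's d with a pooled standard deviation); the exact definition used in Annex 3 is not restated here, and the function name `cohens_d` and the example grades are hypothetical.

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference between groups a and b (pooled SD)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    # Pooled variance from the two unbiased sample variances.
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Made-up grades for two groups (illustration only, not data from the study).
group_a = [6.0, 7.0, 8.0]
group_b = [5.0, 6.0, 7.0]
gap = cohens_d(group_a, group_b)
print("Standardized gap:", gap)
```

A positive value indicates a higher mean in the first group; swapping the two groups changes only the sign.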

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figure 1. NN identified as optimal in all cases.

Figure 2. Distribution of grades in the math admission test (0-10 scale). The vertical lines indicate the mean of each group.

Table 4. Results of the Neural Network Models Using the Grade in Mathematics 1 and Mathematics 2 as Dependent Variables.

Table 5. Results of the Regression Models (Initial and Corrected) and the Neural Network Model Using the Grade in Mathematics 1 as Dependent Variable (BA Bilingual Degree).