On-line version ISSN 0719-0433
Lat. Am. J. Econ. vol.49 no.1 Santiago May 2012
LATIN AMERICAN JOURNAL OF ECONOMICS
VOL 49 NO. 1 (MAY, 2012), 1-35
ESTIMATING PRIVATE RETURNS TO EDUCATION IN MEXICO*
Arnold C. Harberger** Sylvia Guillermo-Peón***
** Department of Economics, 8283 Bunche Hall, UCLA, 405 Hilgard Avenue, Los Angeles, CA 90095-1477. E-mail address: email@example.com.
*** Facultad de Economía, Benemérita Universidad Autónoma de Puebla, Avenida San Claudio y 22 Sur, Col. San Manuel, Ciudad Universitaria, Puebla, Pue., Mexico, 72570 and Visiting Scholar at the UCLA Economics Department. E-mail address: firstname.lastname@example.org.
This study explores the relationship between education and wages in Mexico. It contributes to our understanding of the structure of wages, helping explain individuals' choices concerning education level. First, we estimate the age-earnings functions for each level of education. Then, taking into account some important costs of added years of study, we estimate the net present value of investment in human capital in each of four steps up the educational ladder. We estimate the internal rate of return associated with investment in each successive step considering different scenarios, two of which take into account prospective economic growth and mortality.
JEL classification: I21, J31
Keywords: Returns to education, age-earnings profiles, Mincer equation
Economists have long noted that successive steps of education typically add to the student's future earning power and thus have the characteristics of an investment. This is recognized by the students themselves as well as by their families, and serves as one important motivation for pursuing higher levels of schooling. At the macroeconomic level, the added productivity of a better-educated labor force raises the nation's real domestic product.
Jacob Mincer is a pioneer in the study of the effects of education on income, specifically on how education influences a person's earnings path over a lifetime. Mincer (1958) developed a model to explore how differences in personal income levels are linked to differences in educational levels. But in order to perceive the income difference brought about by a further step up the educational ladder, individuals must forgo income that they otherwise could have earned, and also often end up incurring additional cash expenditures. This pattern of extra costs leading to extra benefits defines an investment profile for each successive step. The main purposes of this paper are first to quantify and then to analyze these profiles.
Of the various attempts to analyze the returns to education in Mexico1, only a few actually estimate what can appropriately be called a rate of return (e.g., Zamudio, 1995, Rojas et al., 2000). The majority of authors wrongly consider the coefficient associated with education (expressed in years or levels) in the Mincer equation (or in another model derived from it) as a rate of return. This is simply a mistake. The rate of return can only be obtained from a profile delineating a full time path of costs and benefits.
Our study compares the age-earnings profiles of adjacent levels of education, for example, high school and college. The benefits of college are measured as the extra earnings of college alumni versus their counterparts who have graduated only from high school. The costs that we measure are the earnings that the high school graduates obtain during the years that the college graduates are studying (forgone earnings, in the standard methodology).
The aim of this paper is to estimate the NPV (net present value) of investment in human capital as well as its internal rate of return (IRR). The estimation procedure uses data from urban areas in each of Mexico's 31 states plus the Federal District. Our first step is to estimate the human capital earnings function based on the Mincer equation. This is done in a single estimating equation, with provision for the age-income profile of each education step to have its own constant term, slope and curvature. As a second step we adjust the resulting estimated age-earnings profiles for different education levels to account for alternative growth rates of real wages and also for mortality. This task enables us to provide estimates of the IRR to education based on a moving picture of the worker's wage income throughout his or her lifetime. The resulting real internal rates of return to middle school education center around 3.5 and 6.5 percent for males and females respectively. For high school the corresponding central rates of return are 6.5 and 8.4 percent and for college level they are 11.5 and 11.4 percent. Finally, for graduate level, the real internal rates of return center around 15.3 percent for males and 15.0 percent for females. The separate rates of return measured for the different states typically fall within a range of ± one percentage point around the cited central values.
Certainly, schooling and education are not synonymous. Education is a much broader concept and includes learning acquired within the family, from one's cohorts, and from participating in a broader society as well as from one's work experience. All these elements work alongside formal schooling to develop the individual's capacities. But formal schooling is the point where society enters most directly and consciously in the preparation of each new generation of young people. It is therefore a matter of interest, both to society as a sponsor and to individuals as participants, to know how additional years in school affect future earning power and productivity.
Several studies in the literature present estimates of what have been called returns to education based on different versions of the Mincer (1974) human capital earnings function, which treats the logarithm of earnings as a linear function of years of schooling. The basic version of the so-called Mincer equation is:
where W is the wage income, EX is the worker's experience (for which age is typically used as the proxy2) and ED is the worker's years of education. This equation gives us the earnings profile for an individual. The coefficient β4 is often wrongly interpreted as the rate of return to education. However, we must point out that β4 cannot be a rate of return simply because it does not take into account any costs associated with education. Based on Mincer's pioneering work (1958, 1974), we present the methodology to estimate the NPV and the IRR of personal investment in education. We consider that the most important cost associated with one's decision to undertake further years of study is the income that one forgoes while studying3.
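Written out, Equation (1) takes the following form. This assumes the standard quadratic-in-experience specification, with coefficients numbered so that β4 multiplies ED as stated above; the constant β1, the ordering of β2 and β3, and the error term u are assumptions made here for illustration.

```latex
\ln W = \beta_1 + \beta_2\,EX + \beta_3\,EX^{2} + \beta_4\,ED + u \tag{1}
```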
The NPV of investment in educational level n can be expressed as:
Wt,n are the annual wage earnings of a worker with educational level n, Wt,n-1 are the annual wage earnings of a worker with the educational level preceding n, I is the retirement age and r is the discount rate.
Note that Wt,n = 0 for the d years spent in school at level n. Hence, the discounted sum of Wt,n-1 over those d years is the present value of the opportunity cost of education at level n (earnings forgone while studying).
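The definitions above can be illustrated with a minimal sketch. It assumes that Equation (2) is the discounted sum of the yearly differences Wt,n - Wt,n-1 up to retirement, with t = 0 taken as the first year of the comparison (an indexing convention chosen here, not taken from the paper). The function names and the earnings figures are hypothetical, and the IRR is found by simple bisection, which presumes a single sign change of the NPV over the search interval.

```python
from typing import Sequence


def npv(with_level: Sequence[float], without_level: Sequence[float], r: float) -> float:
    """Discounted sum of (W_t,n - W_t,n-1); t = 0 is the first year compared (assumed)."""
    if len(with_level) != len(without_level):
        raise ValueError("both profiles must cover the same years")
    return sum(
        (wn - wp) / (1.0 + r) ** t
        for t, (wn, wp) in enumerate(zip(with_level, without_level))
    )


def irr(with_level: Sequence[float], without_level: Sequence[float],
        lo: float = -0.5, hi: float = 1.0, tol: float = 1e-10) -> float:
    """Rate at which the NPV is zero, by bisection on [lo, hi]."""
    f_lo = npv(with_level, without_level, lo)
    f_hi = npv(with_level, without_level, hi)
    if f_lo * f_hi > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = npv(with_level, without_level, mid)
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return (lo + hi) / 2.0


# Hypothetical profiles: d = 3 years of study with zero earnings,
# followed by 40 working years at 130 versus 100 for the lower level.
d, working_years = 3, 40
w_n = [0.0] * d + [130.0] * working_years
w_prev = [100.0] * (d + working_years)

npv_undiscounted = npv(w_n, w_prev, 0.0)   # -300 + 40 * 30 = 900
npv_at_5_percent = npv(w_n, w_prev, 0.05)
rate = irr(w_n, w_prev)
```

In this toy case the first d terms of the sum are exactly the forgone earnings, so the NPV is the discounted earnings gain net of that opportunity cost.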
As specified in Equation (2), in order to obtain an estimate of NPVn we must have estimates of the age-earnings profile. We could simply estimate Wt using Equation (1). However, there are other variables that might influence the earnings profile, and their inclusion in estimating the earnings function can improve our estimates. In particular, the worker's occupation is strongly related to his or her educational level, because education typically qualifies students for an enhanced set of occupations (evidence of this relationship is shown later with the sample data). Econometrically, if the level of education and occupation are correlated and we exclude occupation as an explanatory variable in the human capital earnings function, then we would have biased estimates due to the omission of a relevant variable. The occupation variable also plays an important role in the earnings function as a proxy for ability. This provides another justification for including occupation as a categorical explanatory variable in our specification of the age-earnings profile. The omission of ability as an explanatory variable in the log (wage) equation is a potential source of bias for the education coefficient, an issue widely discussed in the literature. In his 1977 paper, using subsamples of the National Longitudinal Survey of Young Men (data for the U.S.), Griliches showed that when schooling is allowed to be subject to measurement errors and correlated with the disturbance in the earnings function, "the usual conclusion of a significant positive 'ability-bias' in the estimated schooling coefficients is not only not supported but possibly even reversed". Griliches also showed that the magnitude of the relative ability-bias will depend on the magnitudes of the schooling coefficient and the covariance between the schooling and ability variables.
And because the estimated schooling coefficient differs across studies, "there is no reason to expect the relative ability-bias to be constant across different samples or to generalize easily from one study to another and to the population..."4. The estimated ability-bias size in Griliches' study was 0.008 and 0.006 respectively for models using age and experience5 as regressors, which are reasonably low magnitudes6.
The age-earnings profile might also be influenced by gender when average earnings are different for males and females. Therefore, our proposed human capital earnings function is specified as follows:
where w is the monthly wage income reported by the worker, age is the individual's age, gn for n = 2, ..., 5 is a vector of dummy variables taking the value 1 when the worker belongs to the nth education level and zero otherwise, and where we have defined our classification of education levels as follows:
Level 1: Primary School (5 to 7 years of education)
Level 2: Middle School (8 to 10 years of education)
Level 3: High School (11 to 13 years of education)
Level 4: College (14 to 17 years of education)
Level 5: Graduate (more than 17 years of education)
The category of reference for education level is primary school.
age × gn represents the interaction term between age and education level (Garen, 1984). Besides accounting for the influence of occupation on earnings, which is represented as a parallel shift in the function, this equation allows us to incorporate Mincer's statement that differences in education result in differences in the slopes of the earnings profile as well as in the levels of earnings. It is also widely recognized that age-earnings profiles are steeper with higher educational levels (Harberger, 1965).
Oci for i = 1, 2, ..., 21 and i ≠ 19 is a vector of dummy variables indicating occupation where Oci = 1 if the worker has occupation i and zero otherwise (see Table A1 in the appendix for the classification of occupations7). Occupation 19 (8-2: Household Services) was chosen as the category of reference, and G is a categorical variable indicating gender (G = 1 if male and zero if female).
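Collecting the terms defined above, Equation (3) can be written in the following form. This is a reconstruction from the variable definitions rather than a quotation. The symbols α_n and λ_n follow the later discussion of the estimates, while the numbering of the β coefficients, the labels γ_i (occupation) and δ (gender), and the error term u are assumptions made here for illustration.

```latex
\ln w = \beta_1 + \beta_2\,age + \beta_3\,age^{2}
      + \sum_{n=2}^{5} \alpha_n\, g_n
      + \sum_{n=2}^{5} \lambda_n\,(age \times g_n)
      + \sum_{\substack{i=1 \\ i \neq 19}}^{21} \gamma_i\, Oc_i
      + \delta\, G + u \tag{3}
```

In this form the α_n shift the level of the profile for education level n relative to primary school, the λ_n change its slope with age, and the occupation and gender terms act as parallel shifts.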
The model expressed in Equation (3) is estimated using data from the Encuesta Nacional de Ocupación y Empleo 2010 (National Survey of Occupation and Employment, known as ENOE8). We estimated this equation for each of Mexico's 31 states plus the Federal District, using observations from at least one metropolitan area in each state (see Table A2 in the appendix).
When estimating the NPV and IRR of human capital investment, we assume two different scenarios. The first one simply works with the data from our sample. This assumption is equivalent to taking a snapshot of the profiles of many different workers, one from each successive age, at a given time (in this case 2010). The second scenario is more realistic; it incorporates economic growth and mortality into the story, and thus simulates a movie of the worker's income profile over his/her lifetime. In each of these alternative scenarios, we make the following basic assumptions9:
a) each additional year of schooling reduces the length of working and earning life by exactly one year (i.e., all workers have the same retirement age, which is 70);
b) the relevant investment costs are time costs (forgone earnings while studying);
c) each of the profiles we generate is based on the data from a given state.
As stated earlier, our study compares the age-earnings profiles of adjacent levels of education. For example, when comparing high school and college, the costs of college are measured as the earnings that the high school graduates obtain during the years that the college people are studying. We now address some quite obvious questions about these costs that are likely to occur to readers. On the one hand, many students (for example, college students) have earnings streams, typically from part-time jobs, during some or all of their years of study. Such earnings would offset some of the cost (the earnings of the high school group) that we count. From this angle our estimates of the NPV and the IRR of the college investment would be biased downward. On the other hand, an important minority10 of Mexican students attend private schools, thus incurring out-of-pocket costs above and beyond their forgone earnings. Neglecting these costs introduces an upward bias in our measures of NPV and IRR.
Although we are not in a position to assert that these two biases cancel each other out, they certainly work in opposite directions, and neither is likely to be as important as forgone earnings. Only a fraction of students hold part-time jobs, and they average far fewer hours of work than the high school group working full-time from which the forgone earnings of the college students are calculated. So from this point of view, the forgone earnings figure should be reduced, but only by a fraction, to account for part-time work by students. Similarly, cash outlays on tuition and books by those attending private colleges amount to only a fraction of the annual full-time earnings of high school graduates. So our two offsetting biases are each a rather modest fraction (surely well under half) of the forgone earnings we measure. And since these fractional errors have opposite signs, the net effect is even more likely to be small in size, and even of unknown sign11. We must rest our case on these offsetting biases, because our data give us no way to ascertain the relevant amounts of part-time earnings or private school costs needed to incorporate these considerations directly into our quantitative estimates.
2.1 The sample data
Our data source is the ENOE, a survey published quarterly by the Instituto Nacional de Estadística y Geografía (National Institute of Statistics and Geography, or INEGI). Information at the micro level on remunerated and subordinated workers12 for the second quarter of 2010 was used. These data were sorted into 32 groupings, one for each of the 31 states in Mexico plus the Federal District. Most states are represented by a single Metropolitan Area13 (MA). In a few states several metropolitan areas were grouped together. Finally, states without an official MA were represented by urban data from their leading city14 (see Table A2).
The distribution of the sample by occupational category and by education level is presented in Table 1, where we observe that 34% of male workers are industrial workers, while office workers account for the highest proportion of females. For the whole sample, almost 57% of workers are concentrated in occupational categories 4, 5 and 6 (office, industrial and commerce workers respectively). Regarding education levels, 35.5% of male workers have between 8 and 10 years of schooling (middle school), while almost 30% of female workers have between 11 and 13 years of schooling (high school). For the whole sample, almost 24% of workers have education level 4 (college), while barely 2.3% of workers in the sample have reached education level 5 (graduate level).
The relationship between education level and occupation is evidenced in the sample. Table 2 presents information on the distribution of workers by occupation and education level and shows a clear relationship between these two variables. For example, workers with the highest education level (postgraduate studies: level 5) are very likely to be located in occupations 2-1 and 2-2, which are university and college professors and high school teachers respectively (see Table A1 for the definition of the occupations). A smaller proportion of these workers with education level 5 are also hired in occupations 1-1 and 3-2 (professional workers and officials and executive workers respectively). On the other hand, workers with college level education (level 4) are located in occupations 1-1 through 5-1. It is also evident that workers with only primary education (level 1) are basically located in occupations 5-2 through 9-1, while workers with middle school (level 2) are most likely to be found in occupations 4-1 through 9-2, and workers with high school (level 3) can be found in the majority of occupations but only very rarely in occupations 1-1, 2-1 and 2-2.
Phi correlation coefficients15 between education level and occupation are presented in Table A3. Almost all of them are statistically different from zero (at the 5% significance level), and several coefficients are notably high, as is the case for education level 4 (g4) and occupations 1-1, 2-1, 2-3, 3-2 and 4-1, as well as for education level 5 (g5) and occupations 1-1, 2-1, 2-2 and 3-2.
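As an illustration of the statistic, the sketch below computes a phi coefficient for two 0/1 variables from their 2x2 table of counts. It assumes the usual definition (the Pearson correlation of two binary variables); the function name and the toy data, which stand in for an education dummy and an occupation dummy, are hypothetical.

```python
import math
from typing import Sequence


def phi_coefficient(x: Sequence[int], y: Sequence[int]) -> float:
    """Phi coefficient of two 0/1 variables, computed from the 2x2 counts."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    n10 = sum(1 for a, b in zip(x, y) if a == 1 and b == 0)
    n01 = sum(1 for a, b in zip(x, y) if a == 0 and b == 1)
    n00 = sum(1 for a, b in zip(x, y) if a == 0 and b == 0)
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("each variable must take both values 0 and 1")
    return (n11 * n00 - n10 * n01) / denom


# Toy data: 1 = worker has the education level / the occupation, 0 = otherwise.
education_dummy = [1, 1, 1, 0, 0, 0, 0, 0]
occupation_dummy = [1, 1, 0, 0, 0, 0, 0, 1]
phi = phi_coefficient(education_dummy, occupation_dummy)  # 7/15
```

A value near zero indicates that the education level and the occupation are unrelated in the sample, while values near 1 or -1 indicate strong positive or negative association.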
The relationship between education level and occupation shown here leads us to conclude that omitting the vector of occupations would cause bias in the estimated parameters of the age-earnings function. In addition, the distribution of workers by occupation and education level allows us to calculate average fitted values of the wage for each education level, weighted by its corresponding occupation distribution. We calculated the occupation weights for each of the 32 states of Mexico and separately for each gender (recognizing that the distribution of occupations by education level is different between males and females). The resulting average fitted values of the wage were used in estimating the age-earnings profiles for each education level.
3. ESTIMATION RESULTS
3.1 The age-earnings profiles
In estimating the age-earnings function (Equation 3) we employ two alternative procedures: state-by-state regression and pooled regression (specified by Equation A1 in the appendix). The latter allows the intercept and slope to vary by state through the inclusion of state dummies and interaction variables between the state intercept and age. The state-by-state regression procedure, however, seems to provide more evident differences in the slopes of the earnings profiles by state, and also more evident differences in the internal rates of return by state. The summary of the parameter estimates for the 32 regressions is presented in Table A4, where we also discuss some technical econometric issues. This table shows that the estimates for β2 (age) and β3 (age²) are 0.036 and -0.00042 on average, respectively, and they do not vary much across states. The four estimates for λn (age-education interaction) are 0.00125, 0.00604, 0.00921 and 0.01171 on average for education levels 2, 3, 4 and 5 respectively (recall that our category of reference is level 1 = primary school). These latter estimates show substantial variation across states. Given these estimated coefficients, we may expect (on average) that an extra year of experience for an average 30-year-old college-level person would bring a 2% increase in his or her wage, while for an average person of the same age but with middle school education, the extra year of experience would imply an increase of 1.2% (on average across the 32 states). As explained in the previous section, the incorporation of the age × gn interaction term in Equation (3) allows us to capture the differences in the slopes of the age-earnings profiles by educational level. As expected, the lambdas increase as we move up to higher education levels.
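The 2% and 1.2% figures can be reproduced from the average coefficients quoted above. The sketch below assumes they are the derivative of the log wage with respect to age, that is, the age coefficient plus twice the age-squared coefficient times age plus the interaction coefficient for the education level. The constant and function names are hypothetical.

```python
# Average coefficients across the 32 state regressions, as quoted in the text.
BETA_AGE = 0.036
BETA_AGE_SQ = -0.00042
LAMBDA = {1: 0.0, 2: 0.00125, 3: 0.00604, 4: 0.00921, 5: 0.01171}  # level 1 is the reference


def marginal_log_wage_effect(age: float, level: int) -> float:
    """d ln(w) / d age for a worker of the given age and education level."""
    return BETA_AGE + 2.0 * BETA_AGE_SQ * age + LAMBDA[level]


college_30 = marginal_log_wage_effect(30, 4)  # 0.036 - 0.0252 + 0.00921, about 0.020
middle_30 = marginal_log_wage_effect(30, 2)   # 0.036 - 0.0252 + 0.00125, about 0.012
```

Because the age-squared coefficient is negative, the same calculation at older ages gives a smaller increment, which is the concavity of the profile.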
The estimated coefficient associated with gender was 0.16 on average for our 32 samples, and variation is small among states. In all regressions this coefficient was statistically different from zero at a significance level of 5% or lower. We may interpret this particular result as evidence of an important difference in wages by gender. Given the log-linear functional form of the wage function, on average we expect wages to be about 16% higher for males.
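Reading the coefficient directly as a percentage is the usual approximation for a dummy variable in a log-wage equation. As a hedged check, the exact proportional difference implied by a coefficient c is exp(c) - 1, which for a coefficient of 0.16 is somewhat above 16%. The function name below is hypothetical.

```python
import math


def proportional_gap(log_coefficient: float) -> float:
    """Exact proportional wage difference implied by a dummy coefficient in a log-wage equation."""
    return math.exp(log_coefficient) - 1.0


gap = proportional_gap(0.16)  # about 0.1735, i.e. roughly 17%
```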
Table A6 shows the pooled regression results (based on Equation A3). The estimated coefficients for age, age squared, gender, education levels and the interaction terms between age and education levels are very similar to the average of those obtained from the state-by-state regressions. The t statistics and corresponding P-values show evidence that the α's and λ's are statistically different from zero at the 10% significance level or lower (except α2). In analyzing all these results, it is important to keep in mind that our aim is to obtain an accurate forecast of the age-earnings profiles for each education level, and this is the reason why we incorporate the information on occupations.
As explained above, the average fitted values of the wage16 for each education level (using estimated Equation 3) were calculated using the corresponding weights according to the distribution of occupations in each state. For example, in calculating the average fitted wage for education level 5 in Nuevo León, the weights were 0.583 for occupation 1-1, 0.125 for occupation 2-1, 0.125 for occupation 2-2, and 0.166 for occupation 3-2. If the fitted monthly wage values for the corresponding occupations were for example 10,000, 18,000, 13,000 and 25,000 pesos, then the corresponding weighted average of the fitted wage would be 13,875 pesos. This was done for each of the five education levels. Figures 1 and 2 present the estimated average earnings profiles for males and females, using the sample data for Nuevo León for Equation (3). It should be mentioned here that the coefficient associated with gender was 0.17 on average (for all 32 states), implying that on average, wages for males are about 17% higher than those for females (for a given occupation, age, and education level). The difference in peak wages by education level between male and female labor markets can be clearly seen in the graphs.
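The Nuevo León example can be checked with a short calculation. The quoted weights are rounded; the sketch below assumes they correspond to 14, 3, 3 and 4 workers out of 24 (an inference made here because those fractions round to 0.583, 0.125, 0.125 and 0.166 and reproduce the stated 13,875 pesos exactly). The variable and function names are hypothetical.

```python
from fractions import Fraction

# Assumed exact occupation shares for education level 5 (14, 3, 3 and 4 of 24 workers).
weights = {
    "1-1": Fraction(14, 24),
    "2-1": Fraction(3, 24),
    "2-2": Fraction(3, 24),
    "3-2": Fraction(4, 24),
}
# Illustrative fitted monthly wages in pesos, as given in the text.
fitted_wage = {"1-1": 10_000, "2-1": 18_000, "2-2": 13_000, "3-2": 25_000}


def weighted_average(w, values):
    """Sum of weight times value over the occupations; weights are expected to sum to one."""
    return sum(w[k] * values[k] for k in w)


average = weighted_average(weights, fitted_wage)  # exactly 13875

# The rounded weights printed in the text give a slightly lower figure.
rounded_weights = {"1-1": 0.583, "2-1": 0.125, "2-2": 0.125, "3-2": 0.166}
average_rounded = weighted_average(rounded_weights, fitted_wage)  # about 13855
```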
An interesting example of estimated percentage difference in wage for each education level (with respect to the wage earned by those with only primary school) is presented in Table 3 below. These estimates are based on results presented in tables A4 and A6. The reader may observe that, relative to the category of reference, the percentage change in wage due to a change in education level increases as the individual moves up the education ladder. Additionally, the percentage change in wage is an increasing function of experience (age).
Figures 3 and 4 also show how the model captures the difference in slopes between education levels, so that the earnings profile for workers with college (g4) and graduate studies (g5) is steeper than those for lower levels of education.
As also mentioned above, the earnings profiles were estimated under different scenarios: a) no real wage growth and no mortality, which is the case presented above; b) 1% wage growth per year and state mortality rates and c) 2% wage growth per year and state mortality rates. These scenarios were also applied under the pooled regression procedure (the only difference is that here we used nationwide mortality rates instead of state mortality rates).
Now, in estimating the age-earnings profiles to account for real wage growth and mortality, we used the following expression:
Here ŵGR,a is the estimated wage at age a, building in an annual growth rate of real wages equal to GR. ŵa is the average fitted wage obtained from Equation (3) and S(a-14) is the survival rate between age 14 and age17 a. Figures 3 and 4 show the estimated profiles under the alternative growth rates for the real wages of males.
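One way to read this adjustment is sketched below. It assumes that the fitted wage at each age is scaled by compound real wage growth and by the probability of surviving to that age, i.e. fitted wage × (1 + GR)^(a - 14) × S(a - 14). Using age 14 as the base for compounding is an assumption borrowed from the survival rate's starting age, and the function name and the numbers are hypothetical.

```python
def adjusted_wage(fitted_wage: float, age: int, growth_rate: float,
                  survival: float, base_age: int = 14) -> float:
    """Fitted wage scaled by compound growth since base_age and by the survival rate."""
    years = age - base_age
    if years < 0:
        raise ValueError("age must not be below the base age")
    if not 0.0 <= survival <= 1.0:
        raise ValueError("survival must be a probability")
    return fitted_wage * (1.0 + growth_rate) ** years * survival


# Hypothetical case: fitted wage of 10,000 pesos at age 24, 1% real growth,
# and a 0.99 probability of surviving from age 14 to age 24.
example = adjusted_wage(10_000.0, 24, 0.01, 0.99)

# With no growth and certain survival the snapshot profile is returned unchanged.
snapshot = adjusted_wage(10_000.0, 24, 0.0, 1.0)
```

Under this reading, growth raises the later part of the profile while mortality lowers it, so the two adjustments pull the expected wage in opposite directions at older ages.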
Four types of investments in human capital were appraised in this paper:
1) An investment in middle schooling, compared with a person's entering the labor force after primary schooling (moving from education level 1 to 2).
2) An investment in three years of high school, compared with a person's entering the labor force after middle school (moving from education level 2 to 3).
3) An investment in five years of college or university studies, compared with a person's entering the labor force after high school (moving from education level 3 to 4).
4) An investment in three years (on average) of graduate studies, compared with a person's entering the labor force after undergraduate studies (moving from education level 4 to 5).
A summary of the IRRs and NPVs for each human capital investment project under the different scenarios and procedures explained above is presented in tables A7 and A8 in the appendix.
3.2 Interpreting the impacts on IRR and NPV
The most striking overall result of this study is the sharp increase in rates of return and net present values as one moves up the educational ladder. In the "snapshot" exercise, the median rate of return for males increases from 2.13% to 5.86% to 11.26% to 14.27% as one moves up from middle school to high school to college to postgraduate education. The corresponding figures for women are 5.49%, 7.26%, 10.36% and 14.39% (see Table A7). The message here is that there still seems to be substantial room in Mexico for increasing the attendance of young people at the college and postgraduate levels of education. Note too that the dispersions surrounding these estimates are relatively small. The first and third quartiles tell much the same story as the medians. Recall that each median is calculated from 32 separate estimates for Mexico's 31 states plus the Federal District. Statistically, these 32 estimates are independent of each other, and give great credibility to the reliability of the results. The means and standard deviations of the distribution of the 32 state estimates tell the same story. Standard deviations are low, except for the graduate school level in the statewide regressions. This reflects the behavior of outlier states (Baja California with 4.8% at the lower extreme, and at the upper extreme, the Federal District with 20.9% and Colima with 22%) skewing the distribution, as it is not much in evidence in the range between the first and third quartiles.
The low rate of return to middle school education (particularly for males but also for females) is probably the result of two factors. First, those who leave after primary school very likely obtain jobs as unskilled workers of various types, jobs where their productivity is fairly close to that of older workers (i.e., wages do not rise with experience at the unskilled level). And second, it is likely that, as in many other countries (including the U.S.), the educational process in Mexico may at that level be troubled by inefficiencies. This would help explain why the gap between the primary and middle school age-earnings profiles is so small. We are not in a position to judge the Mexican reality in this regard, but our data suggest the wisdom of looking further into this possibility.
3.2.1 Net present values and consumer benefits
The net present value calculations are helpful in two dimensions. First, they reveal that the low return to middle school education is mostly compensated (and even overcome) by the gain at the high school level, and most assuredly by the gains at the college and postgraduate levels. If indeed the rate of return to middle school education is irremediably low, it is still a worthwhile investment (in the fashion of seed capital for a new enterprise), which will be adequately compensated by added returns at subsequent levels. The second notable result from the present value calculations is the relatively high present values that accrue to the college and graduate levels. The figures presented are, as net present values, over and above the costs (of forgone earnings) that are deducted in calculating NPV.
As we move from the snapshot to the growth scenarios, it is also notable that the NPV rises much more rapidly than did the internal rate of return. Some readers may ask why we used a real discount rate as low as 5 percent in this calculation. In the first place, note that we are measuring the returns that accrue to the average worker in each comparison that we make. These are private returns and should be appropriately discounted at the subject's marginal rate of time preference. One metric commonly used for this purpose is the real rate of return that people obtain on their savings (in savings accounts or equity investments) or that they pay on their major debts. Considering that savings accounts pay well below 5 percent in real terms, and that mortgages in Mexico carry a rate not far above this level, we feel that our choice of a 5 percent discount rate is reasonable18. An added motivation for the use of a 5 percent rate lies in the fact that by concentrating only on the effect of education on earnings, we are not counting other very important benefits of education. First is the consumption benefit. The utility of illiterate persons would be increased if they learn to read, even if this were not accompanied by an increase in their wage. At higher education levels, people's utility is raised by their being able to deal with financial and other mathematical problems, by their being able to appreciate literature, art and music, and their being able to communicate with others in richer and more subtle terms.
We do not know how many points should be added to the IRR to account for these neglected benefits, or how many real pesos should be added to the NPV, but certainly these additions should be fairly substantial. Thus, the IRR and NPV (including consumption benefit) are surely greater than our measurements, even though we are unable to quantify the difference.
Before leaving this point, it should be mentioned that there is another important benefit of education, one that is rarely discussed in the literature: Education is highly beneficial in giving people the capacity to better cope with adversity of all kinds. Consider graduate students at U.S. universities. They often live (in pairs or larger groups) in rented apartments. More often than not, they have a car, go to the movies and occasional concerts, take sightseeing trips out of town, etc. The surprising fact is that most of them are able to do all this while living below the U.S. government's official poverty line. How are they able to do this? Because their education has enabled them to use their income wisely: to buy in bulk at sale prices and to avoid being tempted by sharp advertising and social fads that encourage the purchase of expensive but unnecessary items. The effect of education on consumer choice efficiency has also been discussed in Hettich (1972), Michael (1972, 1973), Schultz (1975) and Rauscher (1993). Education at all levels contributes to the capacity to cope with adversity of all kinds, and it should be emphasized that this important consumer benefit comes on top of what we measure.
Additionally, there are other nonmonetary positive effects of education on social behavior, such as crime reduction (Ehrlich, 1975, Usher, 1997, Lochner and Moretti, 2004, and Lochner, 2004), social cohesion (Gradstein, 2002, and Green et al., 2003) and even reduction in gender discrimination (Dougherty, 2005). Education also has a positive effect on health through reduction of mortality rates (Deaton, 2001, Lleras, 2005). Empirical evidence showing that own schooling positively and significantly affects the health of a person and their family members has been analyzed and discussed in the literature (Grossman, 1973, 2005, 2008 and Berger, 1989).
3.2.2. Dealing with selection bias
The standard caveat that should be mentioned in a study like this is that a degree of self-selection takes place at each step up the educational ladder. Those who go on to the next step are likely to be more capable and more highly motivated than those who drop out. Likewise, they are likely to have wealthier parents than the dropouts. This means that the result that we measure may well overstate the benefits that would have been received by the typical dropout, if he or she had continued on to the next stage (Willis and Rosen, 1978; see footnote 19). Recent literature on this topic for Mexico includes some efforts to deal with this issue (see footnote 1), but these attempts were substantially flawed by the requirement that both dropouts and "continuers" still be living with their parents during the years over which their earnings profile is observed. We feel this constraint creates more problems than it solves, and therefore approach the issue from an entirely different angle (see footnote 20).
To deal with the issue of selection bias, we concentrate on a particular, real-world question: How well do we expect our results to predict the likely outcomes for new cohorts of students in Mexico (i.e., for young people who will be climbing the education ladder over the next several years)? If these new cohorts are similar to the 2010 school population in terms of demographic composition and other attributes, our results clearly should be good, unbiased predictors of their expected performance. However, how can we deal with a situation in which school attendance rates are higher among the new cohorts than they were in 2010? Before entering into details, we must inquire into the nature of such an increase in the attendance rate. Our expectation is that the newly added attendees will be, demographically, those "next in line" to enter college or to finish high school, or to continue to middle school after primary school. They should not be considered the average of the dropouts at any stage, but the "most likely to continue" of the dropouts. As such, they should be fairly close to the more disadvantaged of those who were actually present in our 2010 survey.
Accordingly, we look at the states with the lowest average earnings in our study at the beginning of each upward step, and think of them as similar to the new people who are likely to be added to the school-attending population over the next decade or so. To quantify our results, at each level we take the nine states with the lowest average earnings (for males and females). We then calculate their average internal rate of return for the move to the next stage (for example, from primary to middle school). Finally, we compare this figure to the first quartile of the overall state-by-state distribution of IRRs at each stage, and find in every case that the IRR of our average poor state exceeds that first quartile.
We carry out this experiment for four initial education levels (primary, middle, high school and college) and for both males and females. In all cases the average IRR of the low-income states exceeded the first quartile of the overall state distribution. Our reasoning is that the next group to be added to school attendees at each level, over the next decade or so in Mexico, is unlikely to be more disadvantaged than the average person from the typical poor state in 2010. And their prospects should not be worse than the average results that we actually measure for the poor state. When we check those measured results against the overall distribution from all the states, we find that poor-state IRRs are in all cases greater than the first quartile of the state-by-state distribution. Our final step is then to consider that the first quartile is a plausible lower limit that takes selectivity bias into account in a relevant way.
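The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state labels, earnings and IRR figures below are hypothetical placeholders (not values from our tables), the function name `selection_bias_check` is ours, and the "inclusive" quartile convention is one of several possible choices.

```python
from statistics import mean, quantiles

def selection_bias_check(irr_by_state, earnings_by_state, n_poor=9):
    """Compare the mean IRR of the lowest-earning states with the
    first quartile of the full state-by-state IRR distribution."""
    # Rank states by average earnings at the starting education level.
    poorest = sorted(earnings_by_state, key=earnings_by_state.get)[:n_poor]
    poor_mean_irr = mean(irr_by_state[s] for s in poorest)
    # First quartile of all state IRRs (quartile convention is an assumption).
    q1 = quantiles(list(irr_by_state.values()), n=4, method="inclusive")[0]
    return poor_mean_irr, q1, poor_mean_irr > q1

# Hypothetical data for 32 states: NOT the estimates reported in the paper.
states = [f"S{i + 1:02d}" for i in range(32)]
earnings_by_state = {s: 3000 + 100 * i for i, s in enumerate(states)}
irr_by_state = {s: 0.08 + 0.002 * ((7 * i) % 32) for i, s in enumerate(states)}

poor_mean_irr, q1, exceeds = selection_bias_check(irr_by_state, earnings_by_state)
print(f"mean IRR of poorest states: {poor_mean_irr:.4f}")
print(f"first quartile of all IRRs: {q1:.4f}")
print(f"poor-state mean exceeds Q1: {exceeds}")
```

In an actual application the same check would be repeated for each starting education level and separately for males and females.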
3.2.3. Economic growth and mortality
We want to emphasize this feature of our work simply because, in spite of its being such an obvious step to take, it is notably absent from other literature on this subject. The message is simple: a person's life is like a motion picture, so that while steps up the educational ladder take place at one point in time, the rewards come later and are spread over the rest of that person's working lifetime. Even if we lived in a static world without growth, expected mortality should be taken into account in measuring the expected returns to education. But in the real world, we also have sound reasons to expect economic growth to continue (with some cyclical interruptions) into the foreseeable future. The exponential growth rates that we explore (1% and 2% per year) are based on the average experience of countries like the U.S. and Canada over the full span of the 20th century. They are far below the levels achieved by the growth leaders of recent times. In both these senses, we consider them to be reasonably conservative projections of Mexico's likely future experience. We believe it would be unwise for education policymakers in Mexico to base their decisions on an expectation of zero lifelong growth in real wages for the cohorts to be educated over the next four years. The same applies to most other countries. Therefore, we encourage future researchers to take the relatively simple steps of incorporating both expected real wage growth and expected mortality into their calculations of the returns to education, as the moving-picture scenario is much more plausible than the snapshot.
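To illustrate how little extra work this adjustment requires, the following Python sketch computes an internal rate of return with and without real wage growth and mortality. It is a simplified illustration, not our estimation code: the cash-flow profile, the linear survival function and the function names `npv` and `irr` are hypothetical, and we assume for simplicity that every year's net flow (cost or benefit) is scaled by the same growth and survival factors.

```python
def npv(rate, flows, growth=0.0, survival=None):
    """Expected present value of yearly net flows, where flows[t] is the
    earnings differential t years after the decision to continue studying."""
    total = 0.0
    for t, flow in enumerate(flows):
        s = 1.0 if survival is None else survival(t)  # probability of being alive
        total += flow * (1 + growth) ** t * s / (1 + rate) ** t
    return total

def irr(flows, growth=0.0, survival=None, lo=-0.9, hi=1.0, tol=1e-10):
    """Internal rate of return found by bisection on the NPV function."""
    f_lo = npv(lo, flows, growth, survival)
    if f_lo * npv(hi, flows, growth, survival) > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = npv(mid, flows, growth, survival)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return (lo + hi) / 2

# Hypothetical profile: 3 years of forgone earnings, then 45 years of gains.
flows = [-100.0] * 3 + [30.0] * 45

def linear_survival(t):
    # Hypothetical survival curve falling linearly to 0.774 after 51 years.
    return 1.0 - 0.226 * t / 51

irr_snapshot = irr(flows)
irr_growth = irr(flows, growth=0.02)
irr_growth_mortality = irr(flows, growth=0.02, survival=linear_survival)
print(f"snapshot IRR:               {irr_snapshot:.4f}")
print(f"with 2% real wage growth:   {irr_growth:.4f}")
print(f"with growth and mortality:  {irr_growth_mortality:.4f}")
```

Under the uniform-scaling assumption used here, growth at rate g simply raises (1 + IRR) by the factor (1 + g), while mortality lowers the IRR because the benefits arrive later than the costs.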
4. CONCLUDING REMARKS
We would like to emphasize the significance of the results we obtain from our state-by-state data. In relation to the key variables that we estimate (the IRR and the NPV associated with each upward educational step), the 32 independent estimates are closely bunched around their central tendency. This suggests that similar underlying economic forces are at work in the Mexican labor market. It also provides strong reinforcement for our net present values and internal rates of return to investment in successive steps up the education ladder. Our procedures differ from many preceding studies in counting both costs and benefits in our calculations, and in allowing for both real wage growth and expected mortality in developing internal rates of return and net present values. Finally, we present a new approach to dealing with potential selection bias in our estimates. This approach is well adapted to answering relevant questions concerning expected returns to educational investment by the new cohorts of Mexican students that will climb the education ladder in the coming years.
* The authors want to thank the two anonymous referees for valuable comments. Research support for this paper was provided by The National Council on Science and Technology (Consejo Nacional de Ciencia y Tecnología) in Mexico through the Scientific Development Office.
1. Several empirical studies have been carried out to estimate rates of return to education. The most recent literature on this topic for Mexico addresses several problems related to the wage function and rate of return estimation procedures. These problems are basically the endogeneity of education (Sarimaña, 2002; Barceinas, 2003), and selectivity bias (Zamudio, 1995; Ordaz, 2007; Austria and Venegas, 2011). The latter deals with the decision to invest in higher education (college and/or university), as compared with entering the labor force with no higher education. The studies on selectivity bias use information on the parents' characteristics (e.g., parents' years of education and/or parents' income) to represent key selectivity criteria. However, the survey had few or no observations for ages above 30, since it provided this information only for cases where the individual was living in the same household as his or her parents. This greatly reduces the sample, and probably introduces important new biases. One clear indication that the resulting data are not representative is the fact that the resulting earnings profile does not have the normal concave shape (Zamudio, 1995, p. 84).
2. Experience is often defined as years spent in the labor force.
3. The great bulk of our sample is composed of students in Mexico's public school system, where forgone earnings are indeed the relevant cost. Those who study in Mexico's private schools have higher costs but also end up earning higher incomes. We do not have data to actually measure either the proportion of each cohort educated in private schools or the resulting difference in income. However, a simple exercise can shed light on the issue. Suppose that the internal rate of return for public schools (where the private cost is just forgone earnings) is r, and that private-school costs (tuition, etc.) amount to λ percent of forgone earnings. Then, the rate of return on private school education will equal r if the wage differential is λ percent higher than that corresponding to public school education (at the same level). The private school wage differential would have to be more than λ percent higher in order for the private-school rate of return to be higher than r.
4. See Griliches (1977), pp. 4-5.
5. Griliches uses a nonlinear measure of work experience based on independent data on weeks worked since end of schooling or age 14.
6. Card's (2001) model of endogenous schooling deals with two sources of bias: the standard case of ability bias in the OLS estimator, which he shows to be positive, and what he calls comparative advantage bias. The latter arises because people with a higher return to education have an incentive to acquire more schooling, and this also generates an upward bias in the OLS estimator. Card deals with this problem by using instrumental variables estimation (where the instruments are institutional features of the supply side of the education system such as compulsory schooling laws, accessibility of schools, etc.). He also reviews a set of studies that have attempted to measure the effect of education on earnings, and concludes that instrumental variables estimates of the return to schooling typically exceed the corresponding OLS estimates. These findings contradict the usual a priori assumption that OLS methods lead to upward-biased estimates of the true education effect on earnings. "The explanation, proposed by Griliches (1977) and echoed by Angrist and Krueger (1991), is that ability biases in the OLS estimates of the return to schooling are relatively small, and that the gaps between the IV and OLS estimates reflect the downward bias in the OLS estimates attributable to measurement errors."
7. The number of occupations originally defined was 23 (see Table A1). However, information for occupations 3-1 (Government Officials, Superiors and Legislators) and 7-2 (Air Transportation Workers) was practically null for estimation purposes (i.e., only 3 observations for air pilots within our 32 metropolitan area samples). Therefore, we use only 21 occupations in this study. For female workers there were a few occupations for which data were quite scarce and hence not included in the sample for estimation purposes. These were Transportation Workers (occupational category 7) and Army and Police Workers (occupation 9-2) which are occupations with extremely low female representation in Mexico.
8. Encuesta Nacional de Ocupación y Empleo, 2nd quarter, 2010.
9. Mincer (1974).
10. Thirty-three percent of students in college and postgraduate programs attend private universities, colleges or other institutions (Asociación Nacional de Universidades e Instituciones de Educación Superior, 2009).
11. This assumption is similar to the one found in Card (2001, pp. 1130-1131). The author develops a model of endogenous schooling that specifies a lifecycle utility function conditional on schooling and a given consumption profile. An individual's optimal schooling choice and consumption path is found by maximizing the utility function subject to a budget constraint. The latter takes into account that individuals who are in school at time t work part time and have some earnings, and also pay tuition costs. To derive an expression for the optimal schooling choice, Card assumes that part-time earnings are approximately equal to tuition costs.
12. The ENOE uses the term Subordinate and Remunerated Worker to refer to those receiving wages or salaries. It specifically excludes the self-employed and employers. The sample of subordinate and remunerated workers in this paper takes into account only those observations reported by ENOE from urban localities belonging to municipalities within a Metropolitan Area as specified by INEGI (2008a). Also, the 60,231 observations in our sample only include workers reporting they worked at least 25 hours per week.
13. Each metropolitan area includes those urban municipalities specified by INEGI (2008a) that were included in the ENOE survey.
14. A municipality that ENOE considers a self-representative city, i.e., one with enough observations to statistically represent the population of that municipality.
15. The phi correlation coefficient is a measure of association of two binary variables (introduced by Karl Pearson). This measure is similar to the Pearson correlation coefficient in its interpretation.
16. Taking into account the log-linearity of the wage function, the fitted wage was calculated as follows:
17. For example, if, for Mexico as a whole, 774 out of every 1,000 people alive at age 14 would be expected to be alive at age 65, then S(65-14) would be 0.774.
18. Taking Mexican government bonds (Cetes at 28 days, annually compounded rate) as the reference for savings rates, we have (on average) annual nominal rates of 5.5% for 2009 and 4.54% for 2010. With respect to mortgage rates, one of the major banks in Mexico (Bancomer) shows annual nominal rates ranging from 11% to 15.3%, and INFONAVIT (Institute for The National Fund of Workers Housing, one of the largest government-backed mortgage programs in Mexico aimed at financing housing, especially for low-income workers) shows annual nominal rates ranging between 4% and 10%. If we take into account that the annual inflation rate in Mexico was 4.27% on average during the last six years (2006-2010), then we have annual real rates of return close to zero for savings, while for mortgages, the real rates range between zero and 10 percent.
19. According to these authors, "It is well understood that college and high school graduates may have different abilities so that income forgone during college by the former is not necessarily equal to observed earnings of the latter" (pp. 1-2). In addition, "The data support the comparative advantage theory: Those who did not attend college would have earned less as college graduates than those who actually chose to attend. More surprisingly, those who attended college would have earned less as high school graduates than did those who actually chose high school." The comparative advantage theory is also supported by empirical evidence in Garen (1984).
20. Heckman and Li (2003) mention that "conventional approaches to selection and missing data problems do not account for heterogeneity in responses to schooling on which agents select into schooling". They develop a semiparametric framework that accounts for heterogeneity and selection. Using cross-sectional microdata from the 2000 China Urban Household Investment and Expenditure Survey, they show that conventional OLS and IV estimators of the earnings gains to college are downward and upward biased, respectively. The downward bias of the OLS estimates is explained by an important negative selection bias.
Austria, M.A. and F. Venegas Martínez (2011), "Rendimientos privados de la educación superior en México en 2006. Un modelo de corrección del sesgo por autoselección," El Trimestre Económico Vol. LXVIII (2), No. 310, April-June 2011.
Berger, M.C. and J.P. Leigh (1989), "Schooling, self-selection, and health," The Journal of Human Resources, Vol. 24, No. 3: 433-455.
Card, D. (2001), "Estimating the return to schooling: Progress on some persistent econometric problems," Econometrica, Vol. 69, No. 5: 1127-1160.
Deaton, A. and C. Paxson (2001), "Mortality, education, income, and inequality among American cohorts," in Wise, D.A., ed., Themes in the Economics of Aging, Chicago: The University of Chicago Press.
Dougherty, C. (2005), "Why are the returns to schooling higher for women than for men?" The Journal of Human Resources, Vol. 40, No. 4: 969-988.
Ehrlich, I. (1975), "On the relation between education and crime," in Juster, F.T., ed., Education, Income, and Human Behavior, New York: McGraw-Hill.
Friedman, M. (1953), "Choice, chance, and the personal distribution of income," Journal of Political Economy, Vol. LXI, No. 4: 277-290.
Garen, J. (1984), "The returns to schooling: A selectivity bias approach with a continuous choice variable," Econometrica, Vol. 52, No. 5: 1199-1218.
Gradstein, M. and M. Justman (2002), "Education, social cohesion and economic growth," The American Economic Review, Vol. 92, No. 4: 1192-1204.
Green, A., J. Preston, and R. Sabates (2003), Education, equity and social cohesion: A distributional model, London: Center of Research for the Wider Benefits of Learning, Institute of Education.
Griliches, Z. (1977), "Estimating returns to schooling: Some econometric problems," Econometrica, Vol. 45, No. 1: 1-22.
Grossman, M. (1973), "The correlation between health and schooling," Working Paper Series No. 22, National Bureau of Economic Research.
Grossman, M. (2005), "Education and nonmarket outcomes," Working Paper Series No. 11582, National Bureau of Economic Research.
Grossman, M. (2008), "The relationship between health and schooling," Eastern Economic Journal, Vol. 34, No. 3: 281-292.
Harberger, A. (1965), "Investment in men versus investment in machines: The case of India," reprinted in Harberger, A. (1972), Project Evaluation: Collected Papers, Chicago: The University of Chicago Press.
Harberger, A. (2005), "On the process of growth and economic policy in developing countries," PPC Issue Paper No. 13, USAID.
Heckman, J. and X. Li (2003), "Selection bias, comparative advantage and heterogeneous returns to education: Evidence from China in 2000," Working Paper Series No. 9877, National Bureau of Economic Research.
INEGI (2008b), "Clasificación mexicana de ocupaciones (CMO)," Vol. I.
-. (2008c), "Clasificación mexicana de ocupaciones (CMO)," Vol. II.
Lleras Muney, A. (2005), "The relationship between education and adult mortality in the United States," The Review of Economic Studies, Vol. 72, No. 1: 189-221.
Lochner, L. and E. Moretti (2004), "Effect of education on crime: Evidence from prison inmates, arrests, and self-reports," The American Economic Review, Vol. 94, No. 1: 155-189.
Lochner, L. (2004), "Education, work, and crime: A human capital approach," International Economic Review, Vol. 45, No. 3: 811-843.
Michael, R. (1972), The effect of education on efficiency in consumption. New York: Columbia University Press.
-. (1973), "Education in nonmarket production," Journal of Political Economy, Vol. 81, No. 2, Part 1: 306-327.
Mincer, J. (1958), "Investment in human capital and personal income distribution," Journal of Political Economy, Vol. LXVI, No. 4: 281-302.
-. (1974), Schooling, experience and earnings. New York: National Bureau of Economic Research, Columbia University Press.
Ordaz, J.L. (2007), "México: capital humano e ingresos. Retornos a la educación, 1994-2005," Mexico: ECLAC, Agriculture Unit.
Rauscher, M. (1993), "Demand for social status and the dynamics of consumer behavior," The Journal of Socio-Economics, Vol. 22, No. 2: 105-113.
Sarimaña, J.E. (2002), "Rendimiento de la escolaridad en México: una aplicación del método de variables instrumentales para 1998," Gaceta de Economía, Vol. 7, No. 14.
Usher, D. (1997), "Education as a deterrent to crime," The Canadian Journal of Economics, Vol. 30, No. 2: 367-384.
Willis, R. and S. Rosen (1978), "Education and Self-Selection," Working Paper Series No. 249, National Bureau of Economic Research.
Zamudio, A. (1995), "Rendimientos a la educación superior en México: ajuste por sesgo utilizando máxima verosimilitud," Economía Mexicana-Nueva Época, Vol. IV, No. 1.
The Joint Test 1 in Table A5 refers to the null hypothesis that all X's (coefficients associated with the interaction terms between age and education level) are equal to zero. This is a Wald criterion-based test for linear restrictions:
The test statistic has a sampling distribution F(J, N-K), where J is the number of restrictions to be tested (4 in this case), and N and K are the sample size and the number of estimated parameters in the unrestricted regression, respectively. Table A5 shows that at the 10% significance level we reject the null hypothesis in 27 (out of 32) states, and at the 5% significance level we reject it in 24 states.
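For readers who wish to replicate this kind of joint test, the following Python sketch computes the standard textbook Wald-type F statistic for J linear restrictions Rb = q in an OLS regression and checks it against the equivalent form based on restricted and unrestricted residual sums of squares. It is an illustration under stated assumptions only: the data are simulated, the regressor layout (intercept, age, four education dummies, four age-education interactions) is a simplified stand-in for the wage equation, and the function name `wald_f_stat` is ours.

```python
import numpy as np

def wald_f_stat(X, y, R, q):
    """F statistic for H0: R b = q in the OLS regression of y on X."""
    n, k = X.shape
    j = R.shape[0]
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                 # unrestricted OLS estimates
    resid = y - X @ b
    rss_u = resid @ resid
    s2 = rss_u / (n - k)                  # residual variance estimate
    d = R @ b - q                         # discrepancy from the restrictions
    f_stat = d @ np.linalg.solve(R @ XtX_inv @ R.T, d) / (j * s2)
    return f_stat, rss_u

# Simulated data (hypothetical, not the ENOE sample).
rng = np.random.default_rng(0)
n = 500
age = rng.uniform(18, 65, n) / 10         # age measured in decades
level = rng.integers(0, 5, n)             # education level 0..4, level 0 is the base
D = (level[:, None] == np.arange(1, 5)).astype(float)
X = np.column_stack([np.ones(n), age, D, D * age[:, None]])
beta = np.array([1.0, 0.2, 0.1, 0.2, 0.3, 0.4, 0.02, 0.04, 0.06, 0.08])
y = X @ beta + rng.normal(0.0, 0.3, n)

# H0: the four age-education interaction coefficients are all zero (J = 4).
R = np.hstack([np.zeros((4, 6)), np.eye(4)])
q = np.zeros(4)
f_stat, rss_u = wald_f_stat(X, y, R, q)

# Equivalent form: compare restricted and unrestricted sums of squares.
X_r = X[:, :6]
b_r = np.linalg.lstsq(X_r, y, rcond=None)[0]
rss_r = np.sum((y - X_r @ b_r) ** 2)
J, N, K = 4, n, X.shape[1]
f_rss = ((rss_r - rss_u) / J) / (rss_u / (N - K))
print(f"Wald-form F: {f_stat:.4f}, RSS-form F: {f_rss:.4f}")
```

The P-value would then be the upper-tail probability of the F(J, N-K) distribution at the computed statistic, which can be obtained, for example, with `scipy.stats.f.sf(f_stat, J, N - K)`.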
On the other hand, Joint Test 2 in Table A5 refers to the null:
This is a test for J=8 linear restrictions. The F statistic and its corresponding P-value for each state indicate that the null hypothesis is rejected in all reported cases (at 5% or lower significance level).
The pooled regression version of the model allows for different intercepts and slopes and is specified as follows:
Sj, for j = 1, 2, ..., 32 and j ≠ 5, is a vector of dummy variables taking on value 1 if the observation belongs to state j and zero otherwise. State 5 (Chiapas) was chosen as the category of reference.
age × Sj represents the interaction term between age and the state dummy variable.
The estimation results are presented in Table A8. Using Breusch-Pagan and Score tests, we found evidence of heteroscedasticity in all regressions. Therefore, robust standard errors (consistent in the presence of heteroscedastic errors) are reported.
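As an illustration of how heteroscedasticity-robust standard errors can be obtained for a pooled regression with state dummies and age-state interactions, the sketch below uses White's (HC0) sandwich estimator. This is an assumption on our part for the purpose of the example: the text above does not state which robust variant underlies Table A8, the data are simulated with only three hypothetical states, and the function name `ols_white_se` is ours.

```python
import numpy as np

def ols_white_se(X, y):
    """OLS coefficients with White (HC0) heteroscedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b
    meat = (X * e[:, None] ** 2).T @ X    # X' diag(e^2) X
    cov = XtX_inv @ meat @ XtX_inv        # sandwich covariance matrix
    return b, np.sqrt(np.diag(cov))

# Simulated pooled sample with three hypothetical states; state 0 is the reference.
rng = np.random.default_rng(1)
n = 600
age = rng.uniform(18, 65, n) / 10         # age measured in decades
state = rng.integers(0, 3, n)
S = (state[:, None] == np.arange(1, 3)).astype(float)   # dummies for states 1 and 2

# Columns: intercept, age, state dummies, age x state interactions.
X = np.column_stack([np.ones(n), age, S, S * age[:, None]])
beta = np.array([2.0, 0.15, 0.10, -0.05, 0.02, 0.03])
# Error variance rises with age, so the errors are heteroscedastic by construction.
y = X @ beta + rng.normal(0.0, 1.0, n) * 0.1 * age

b, robust_se = ols_white_se(X, y)
for name, coef, se in zip(
    ["const", "age", "S1", "S2", "age x S1", "age x S2"], b, robust_se
):
    print(f"{name:>9}: {coef: .4f}  (robust s.e. {se:.4f})")
```

The state dummies shift the intercept and the interaction terms shift the age slope relative to the reference state, which is what allows the pooled model to have different intercepts and slopes by state.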