The wage-gap decomposition that didn't decompose cleanly

Blinder-Oaxaca decomposition is the standard tool for analysing wage gaps. The procedure separates the observed difference in average wages between two groups (men and women, upper and lower castes, urban and rural workers) into two parts: one explained by differences in observed characteristics such as years of education, experience, occupation, and industry, and an unexplained residual, made up of group differences in the estimated coefficients and intercepts, that is usually labelled as discrimination or unmeasured returns to skill.
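
Written out, one common twofold version of the split looks like the following. The choice of group A's coefficients as the reference is an assumption made here for illustration; pooled or group B coefficients are equally standard and give different shares. Here \bar{y}_g is the mean log wage of group g, \bar{X}_g is the vector of mean characteristics (including a constant), and \hat{\beta}_g is the vector of OLS coefficients from group g's own wage regression.

```latex
\bar{y}_A - \bar{y}_B
  = \underbrace{(\bar{X}_A - \bar{X}_B)'\hat{\beta}_A}_{\text{explained}}
  + \underbrace{\bar{X}_B'(\hat{\beta}_A - \hat{\beta}_B)}_{\text{unexplained}}
```

The second term includes the difference in intercepts, which is where anything the covariates do not capture ends up.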

The procedure is elegant. The headline number it produces — "X per cent of the wage gap is unexplained by observable characteristics, and therefore attributable to discrimination" — is widely cited. Our running observation, after working with NSSO and PLFS data on gender and caste wage gaps over several years, is that the headline is almost always wrong in the same direction.

The residual is not a measure of discrimination. The residual is whatever the model did not have a variable for. In most decompositions we have seen in the Indian context, the residual is between sixty and eighty per cent of the total gap. A finding that says "three-quarters of the gap is unexplained" is not a finding about discrimination. It is a finding about model insufficiency, reported in language that makes the insufficiency disappear.

There are four structural problems, and none of them gets cleaner with bigger samples.

The first is occupational segregation. Women and lower-caste workers in India are concentrated in low-paying occupations — agricultural labour, domestic work, low-paid services. If the decomposition treats occupation as an endowment (a worker-side characteristic), it assumes occupational assignment is exogenous, which is the very thing the wage gap is partly about. If it omits occupation, the residual absorbs the entire effect of segregation. Most published decompositions do one or the other without being explicit about which, and the choice changes the headline by twenty to thirty percentage points. There is no neutral answer.
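
The sensitivity to the occupation choice can be sketched on synthetic data. Everything below is an assumption for illustration: the sample sizes, the returns, the occupation probabilities, and the helper name `twofold` are invented, and the numbers are not PLFS estimates. Returns are identical in both groups by construction, so the whole gap comes from who ends up in the high-paying occupation.

```python
import numpy as np

def twofold(x_a, y_a, x_b, y_b):
    """Twofold split using group A's coefficients as the reference (an assumption)."""
    Xa = np.column_stack([np.ones(len(y_a)), x_a])  # add intercept column
    Xb = np.column_stack([np.ones(len(y_b)), x_b])
    beta_a, *_ = np.linalg.lstsq(Xa, y_a, rcond=None)
    beta_b, *_ = np.linalg.lstsq(Xb, y_b, rcond=None)
    gap = y_a.mean() - y_b.mean()
    explained = (Xa.mean(axis=0) - Xb.mean(axis=0)) @ beta_a
    unexplained = Xb.mean(axis=0) @ (beta_a - beta_b)
    return gap, explained, unexplained

rng = np.random.default_rng(0)
n = 5000

def simulate(p_high_occ):
    """Synthetic workers: same schooling distribution, different occupational access."""
    educ = rng.normal(10, 3, n)
    occ = (rng.random(n) < p_high_occ).astype(float)  # 1 = high-paying occupation
    log_wage = 1.0 + 0.08 * educ + 0.5 * occ + rng.normal(0, 0.3, n)
    return educ, occ, log_wage

educ_a, occ_a, y_a = simulate(0.6)  # group A: 60% in the high-paying occupation
educ_b, occ_b, y_b = simulate(0.2)  # group B: 20%

# Specification 1: occupation treated as an endowment.
gap, expl_with, unexpl_with = twofold(
    np.column_stack([educ_a, occ_a]), y_a,
    np.column_stack([educ_b, occ_b]), y_b,
)
# Specification 2: occupation omitted.
_, expl_without, unexpl_without = twofold(educ_a, y_a, educ_b, y_b)

share_with = expl_with / gap
share_without = expl_without / gap
print(f"explained share, occupation as endowment: {share_with:.2f}")
print(f"explained share, occupation omitted:      {share_without:.2f}")
```

In this stylised setup the same data yield an explained share near one or near zero depending only on the specification. Neither number says whether the occupational sorting itself was fair, which is the point of the paragraph above.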

The second is selection into the labour force. Female labour force participation in India is among the lowest in the world and is non-randomly selected on household characteristics, caste, education, and local labour market conditions. The wages we observe for working women are not the wages for women in general; they are the wages for women whose households allowed or required them to work, in the local labour markets where they could. Heckman corrections require an exclusion restriction (something that affects participation but not wages) that is hard to justify in this setting. Without correction, the decomposition compares non-comparable populations and labels the resulting difference as discrimination.
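
A small simulation makes the non-comparability concrete. It is a sketch under invented assumptions: the participation rule, the threshold, and the positive correlation between participation and the unobserved wage component are all hypothetical. In real data the selection could just as well run the other way, for example when participation is driven by household distress.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Latent log wage offers for the whole (synthetic) population of women.
skill = rng.normal(0.0, 1.0, n)  # unobserved wage component
log_wage = 2.0 + 0.5 * skill

# Hypothetical participation rule: a woman works only if a latent index
# clears a threshold. The index is correlated with the unobserved wage
# component by construction.
index = skill + rng.normal(0.0, 1.0, n)
works = index > 1.0

participation_rate = works.mean()
population_mean = log_wage.mean()
observed_mean = log_wage[works].mean()

print(f"participation rate:           {participation_rate:.2f}")
print(f"mean log wage, all women:     {population_mean:.3f}")
print(f"mean log wage, working women: {observed_mean:.3f}")
```

The survey only ever sees the last number. A decomposition run on it describes the selected subsample, not women in general.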

The third is the quality of skill proxies. Years of schooling is the standard human-capital variable. It does not capture quality of schooling, quality of the institution, field of study, English-language fluency, or networks built through school. In the Indian context, these matter enormously and are systematically correlated with caste, gender, and region. The variable says "twelve years" for both a worker from a government Tamil-medium school in Tiruvarur and a worker from a private English-medium school in south Bengaluru. The wage difference between them is then absorbed by the residual and labelled as discrimination, although a substantial part of it is the model's failure to measure skill.

The fourth is measurement at the data-collection end. PLFS and NSSO wage data have known issues with casual labour, multiple work arrangements, in-kind payments, and seasonal variation. The female sample skews more heavily toward exactly these underreported categories. The wage variable is therefore noisier for women than for men, and the noise is not symmetric: it leans toward under-reporting. Because the decomposition works on group means, error that does not average out to zero shifts the measured gap, and that shift lands in the unexplained term.
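
To see where one-sided reporting error goes, consider a sketch in which true wages follow the same process for both groups and only the women's reported wage is shaded downward. The exponential shortfall, its size, and all other parameters are invented for illustration and are not calibrated to PLFS or NSSO.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000

def ols(x, y):
    """Return covariate means (with constant) and OLS coefficients."""
    X = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X.mean(axis=0), beta

# Same wage process for men and women: no true gap by construction.
educ_m = rng.normal(10, 3, n)
educ_w = rng.normal(10, 3, n)
true_m = 1.0 + 0.08 * educ_m + rng.normal(0, 0.3, n)
true_w = 1.0 + 0.08 * educ_w + rng.normal(0, 0.3, n)

# Hypothetical one-sided reporting error for women only
# (for example, in-kind payments that never reach the wage variable).
shortfall = rng.exponential(0.15, n)
reported_w = true_w - shortfall

xbar_m, beta_m = ols(educ_m, true_m)
xbar_w, beta_w = ols(educ_w, reported_w)

gap = true_m.mean() - reported_w.mean()
explained = (xbar_m - xbar_w) @ beta_m
unexplained = xbar_w @ (beta_m - beta_w)

print(f"measured gap:        {gap:.3f}")
print(f"explained part:      {explained:.3f}")
print(f"unexplained part:    {unexplained:.3f}")
print(f"mean under-report:   {shortfall.mean():.3f}")
```

The unexplained part tracks the average shortfall closely, even though the simulated labour market pays both groups identically.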

A pattern from our own work: gender wage gap decompositions on PLFS 2022-23 data, run with what we considered a defensible specification, produced an explained share of about twenty-five per cent. The "discrimination" residual was three-quarters of the gap. We refused to report the headline that way and rewrote the results section to say what the model was actually telling us: that occupational segregation, selection into participation, and the inadequacy of standard human-capital variables together account for a substantial share of what looks like an unexplained residual. The funder pushed back. The headline they wanted was "X per cent of the wage gap is discrimination," because that headline travels. The headline we provided was harder to translate into a press release. We provided it anyway.

What is the better practice? Report the explained share with the confidence interval the model's standard errors actually support. Be explicit about whether occupation is treated as endowment or omitted, and report sensitivity to that choice. If selection correction is applied, name the exclusion restriction; if it is not, say what the headline number is conditional on. Most importantly, do not call the residual discrimination. Call it the unexplained share. The honest paragraph that follows reads: "this unexplained share contains the effect of any group-level discrimination, plus everything the model did not measure."
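
One way to attach an interval to the explained share is a percentile bootstrap, sketched below. This is an alternative to analytic standard errors, not a description of how any particular study computed them. The function names, the synthetic data, and the reference-coefficient choice are assumptions, and the sketch ignores survey weights and the clustered sampling design, which a real PLFS analysis would need to respect.

```python
import numpy as np

def explained_share(x_a, y_a, x_b, y_b):
    """Explained share of the mean gap, using group A's coefficients as reference."""
    Xa = np.column_stack([np.ones(len(y_a)), x_a])
    Xb = np.column_stack([np.ones(len(y_b)), x_b])
    beta_a, *_ = np.linalg.lstsq(Xa, y_a, rcond=None)
    gap = y_a.mean() - y_b.mean()
    explained = (Xa.mean(axis=0) - Xb.mean(axis=0)) @ beta_a
    return explained / gap

def bootstrap_ci(x_a, y_a, x_b, y_b, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling workers within each group."""
    rng = np.random.default_rng(seed)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        ia = rng.integers(0, len(y_a), len(y_a))  # resample with replacement
        ib = rng.integers(0, len(y_b), len(y_b))
        draws[i] = explained_share(x_a[ia], y_a[ia], x_b[ib], y_b[ib])
    alpha = (1 - level) / 2
    lo, hi = np.quantile(draws, [alpha, 1 - alpha])
    return lo, hi

# Synthetic data: one year of extra schooling for group A plus an
# intercept difference that the covariate cannot explain.
rng = np.random.default_rng(3)
n = 2000
educ_a = rng.normal(11, 3, n)
educ_b = rng.normal(10, 3, n)
y_a = 1.24 + 0.08 * educ_a + rng.normal(0, 0.3, n)
y_b = 1.00 + 0.08 * educ_b + rng.normal(0, 0.3, n)

share = explained_share(educ_a, y_a, educ_b, y_b)
lo, hi = bootstrap_ci(educ_a, y_a, educ_b, y_b)
print(f"explained share: {share:.2f} (95% interval {lo:.2f} to {hi:.2f})")
```

Running the same routine under each occupation specification and reporting the intervals side by side would cover the first two recommendations in one table.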

The methods literature is good on this. The applied literature, including most of the policy-facing applied literature in India, is not. We have come to read decompositions in policy reports by checking the explained share first; when it is below thirty per cent, we read the methods section three times before believing the headline.

Useful references: Jann's Stata Journal paper on the BO decomposition remains the standard practical treatment; PLFS Annual Reports are the primary data source; and Fortin, Lemieux, and Firpo's chapter on decomposition methods in the Handbook of Labor Economics (also circulated as an NBER working paper) is the most thorough methodological reference.