The interview did not happen. The enumerator had walked to the Dalit hamlet at the edge of the village, on the second day of household-level data collection in a peri-urban district in central India. The hamlet was on the sampling frame. The respondent was on the list. The enumerator was a young woman from the regional supervisor's office, trained, scripted, IRB-cleared, on schedule.
A man stopped her at the entrance to the hamlet. He was, the enumerator later understood, a non-Dalit functionary in the village — not a sarpanch, not a member of any official body, but someone whose authority within the local hierarchy was settled enough to act without being challenged. He told her the people in the hamlet would not talk to her, that they were uneducated and would not understand the questions, and that he would arrange a better time. She said she had come from the block office and her schedule did not permit a return visit. He said he would talk to them himself and report back. She walked back to the central village. The next household on the sampling frame, in a non-Dalit settlement, gave the interview easily. The replacement protocol absorbed the gap.
We have heard variants of this story so many times we keep a private taxonomy of them. The enumerator who is told the household is away when they are visibly present. The household head who answers on behalf of the woman who was on the sampling list. The village functionary who insists on sitting in for the duration. The Dalit enumerator refused entry to upper-caste households. The upper-caste enumerator who gets formally polite answers from Dalit households that conceal what the field instrument was designed to elicit. Each is its own variation. None of them appears in the methods section of the report. The methods section says: response rate 92 per cent. The 8 per cent that did not respond is reported, if at all, as "non-availability" or "refusal," neither of which captures what happened.
Caste is not a variable in our standard methods toolkit in the way it is in the social reality of Indian fieldwork. It shapes who answers the door, who agrees to speak, who is allowed to speak, what is said in front of which other family members, and what is said only after the enumerator has been judged sympathetic. None of these dynamics are random. All of them are systematically structured by hierarchy. The data we collect is consistent with the standard sample frame and inconsistent with the underlying population in ways the standard analysis cannot detect.
The first failure mode is doorway refusal mediated by a third party. The respondent does not know the enumerator came; the intermediary turns the enumerator away before the contact is logged. The methods section records this, if at all, as a non-contact at the household level. The respondent who was sent away on someone else's behalf is invisible in the resulting numbers.
The second is who answers when contact is made. In many household surveys we have observed, the household head or the senior male present takes the interview when a female respondent was scheduled. The substitution is sometimes recorded as such; more often the answers are taken at face value as the female respondent's views. Cross-checking through follow-up with the named respondent, when we have done it, has shown the answers to be different and sometimes opposite. The published headline does not reflect this.
The third is enumerator caste and gender. There is good evidence in survey methodology that interviewer characteristics affect responses systematically — the so-called interviewer effects. In the Indian context this is sharper. A Dalit enumerator may be refused entry to non-Dalit households or may be given politely truncated answers. A non-Dalit enumerator at a Dalit household may receive courtesy-laden responses that conceal the experiences the survey was designed to measure. Caste-matched enumerator teams are one corrective, but they require known caste of the respondent at the sampling stage, which is itself ethically and practically fraught. Cross-caste enumerator pairings produce yet another bias pattern. There is no clean configuration.
The fourth is the absent woman. Women in patriarchal households are often physically present during the interview and substantively absent from it. The husband answers, the mother-in-law contributes, the woman herself is briefly consulted on facts and excluded from opinion items. The standard response-rate metric does not record this. We have begun to log, in our own qualitative protocols, who was present during the interview and who spoke; the resulting data on who-spoke-for-whom is often more informative than the survey responses themselves.
What is the corrective? Several things, all imperfect.
Document the social geography of the village before the household visits. A short observation walk with a community-recruited guide identifies the caste-segregated hamlets, the routes that are and are not walkable for which kinds of enumerators, the times of day at which which kinds of respondents are available. This adds half a day to the field schedule and changes what gets achieved on the schedule meaningfully.
Match enumerator characteristics to respondent demographics where the topic warrants it, and report the match rate. If twenty per cent of caste-relevant questions were asked by non-matched enumerators, say so. Do not absorb that variance into the unexplained residual.
Record who was present at every interview and who spoke for each block of items. Add it as a metadata column. The cost is minor; the analytic value, in any survey where social hierarchy matters, is substantial.
When the replacement protocol is invoked, log the reason in enough detail to distinguish "household head was at work" from "intermediary turned the enumerator away" from "respondent declined." These are three different non-responses and they introduce three different biases. Most replacement codes collapse them into one.
None of this is novel. Veteran field researchers in India have known these dynamics for as long as fieldwork has existed. The new thing, we think, is to write them into the methods section rather than the informal handover notes. The sample frame is who we asked. The achieved sample is who answered. The gap between the two is structured. Pretending the gap is random is the methodological choice that protects the headline.
Useful references: Bharti and Roy's IDFC Institute work on caste enumeration in Indian survey practice; the IHDS field manual is unusually candid about field access challenges; and Deshpande's The Grammar of Caste remains the methodological touchstone for caste in applied research.