08 Oct 2023 · 8 min read
Choosing a Sampling Approach for a National Household Survey in Ghana
Ghana's administrative geography, population distribution, and field logistics make sampling for national household surveys a practical problem as much as a statistical one. This is what researchers and programme evaluators need to know before designing the frame.
Designing the sample for a national or multi-regional household survey in Ghana means balancing statistical requirements, administrative geography, the available sampling frames, and the realities of field logistics in a country that spans coastal cities, forest zones, savannah, and remote rural communities. Getting it right at the design stage is cheap; fixing it after fieldwork is expensive or impossible.
What sampling frame should a national household survey in Ghana use?
The standard and most defensible sampling frame for national household surveys in Ghana is the set of Ghana Statistical Service (GSS) Enumeration Areas (EAs) from the most recent Population and Housing Census (PHC). The 2021 PHC EAs provide the most current frame, with population estimates and geographic coordinates that allow stratified selection. Most large-scale surveys in Ghana, including the Ghana Demographic and Health Survey (GDHS), the Ghana Living Standards Survey (GLSS), and donor-funded programme evaluations, use this frame.
The EA frame has known limitations. EAs may not perfectly reflect current population distribution in rapidly urbanising peri-urban areas, particularly in Greater Accra and Ashanti. In some remote areas, EA boundaries and household lists may be outdated. For surveys in areas where the EA frame is known to be weak, supplementary listing may be required — field teams visit selected EAs to update the household list before survey enumeration begins. This adds time and cost but protects the validity of the probability sample.
Project-specific or programme-defined frames (e.g. beneficiary lists, community registries) are appropriate only when the survey's inference is intentionally limited to the programme population. Using a project register as a frame for a nationally representative survey is a design error that will be flagged in any credible peer review.
How does Ghana's administrative geography affect stratification decisions?
Ghana is divided into 16 regions, each further divided into Metropolitan, Municipal, and District Assemblies (MMDAs). For a nationally representative survey, regions are the standard first-level stratification. Within regions, urban/rural stratification is common, using the GSS classification. For programmes targeting specific sub-national areas, additional stratification by district may be appropriate, but only if the sample size supports district-level inference — this is a power calculation question, not a preference.
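The stratification described above can be sketched in a few lines. This is a minimal illustration, not the GSS procedure: it assumes an EA frame held as a list of records with hypothetical keys `ea_id`, `region`, and `urban`, allocates EAs to region-by-urban/rural strata in proportion to the number of EAs in each stratum, and draws EAs with equal probability within strata. Real designs often select EAs with probability proportional to size and may deliberately over-sample small strata.

```python
import random
from collections import defaultdict


def allocate_proportionally(stratum_sizes, n_total):
    """Largest-remainder proportional allocation of n_total EAs across strata."""
    total = sum(stratum_sizes.values())
    quotas = {s: n_total * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n_total - sum(alloc.values())
    # Give the remaining EAs to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


def select_eas(frame, n_total, seed=0):
    """Select EAs within (region, urban) strata.

    frame: list of dicts with hypothetical keys 'ea_id', 'region', 'urban'.
    Returns a dict mapping each stratum to its sorted list of selected EA ids.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for ea in frame:
        strata[(ea["region"], ea["urban"])].append(ea["ea_id"])
    alloc = allocate_proportionally({s: len(ids) for s, ids in strata.items()}, n_total)
    return {s: sorted(rng.sample(ids, alloc[s])) for s, ids in strata.items()}
```

Fixing the seed makes the selection reproducible, which matters when the selection has to be documented or audited later.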
The 2019 regional expansion (from 10 to 16 regions) requires care when comparing new survey data against pre-2019 datasets. Brong-Ahafo became Bono, Bono East, and Ahafo; Northern became Northern, North East, and Savannah; Volta became Volta and Oti; and Western became Western and Western North. Any trend analysis spanning this change must either aggregate to the original 10-region structure or explicitly model the split. Mishandling this is a frequent source of errors in reports using mixed-vintage data.
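Aggregating to the 10-region structure is easiest to get right with an explicit lookup table. The sketch below maps each of the 16 current regions to its pre-2019 parent, including Western North, which was carved out of Western. The function name and the input shape (a dict of counts or weighted totals by region) are assumptions for illustration.

```python
# Current (16-region) name -> pre-2019 (10-region) parent.
PARENT_REGION = {
    "Ahafo": "Brong-Ahafo",
    "Bono": "Brong-Ahafo",
    "Bono East": "Brong-Ahafo",
    "North East": "Northern",
    "Northern": "Northern",
    "Savannah": "Northern",
    "Oti": "Volta",
    "Volta": "Volta",
    "Western": "Western",
    "Western North": "Western",
    "Ashanti": "Ashanti",
    "Central": "Central",
    "Eastern": "Eastern",
    "Greater Accra": "Greater Accra",
    "Upper East": "Upper East",
    "Upper West": "Upper West",
}


def aggregate_to_parent(totals_by_region):
    """Sum counts (or weighted totals) from 16-region data into the 10 parent regions."""
    totals = {}
    for region, value in totals_by_region.items():
        parent = PARENT_REGION[region]  # raises KeyError on an unrecognised name
        totals[parent] = totals.get(parent, 0) + value
    return totals
```

Only additive quantities should be summed this way. Rates and means need to be recomputed from aggregated numerators and denominators rather than averaged across the child regions.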
What is a design effect and why does it matter for sample size calculation?
In a simple random sample (SRS), observations are independent, and standard power calculations apply directly. In a cluster sample — the practical standard for household surveys in Ghana — households within the same EA tend to be more similar to each other than to households in other EAs (shared infrastructure, similar livelihoods, common services). This intra-cluster correlation (ICC, or rho) reduces the effective sample size. The design effect (DEFF or deff) quantifies this reduction: a deff of 2.0 means that your cluster sample gives you the same precision as an SRS of half its size.
For household surveys in Ghana, typical design effects range from 1.3 to 2.5 depending on the indicator, the cluster size, and the degree of geographic clustering of the outcome. A commonly used conservative assumption is deff = 1.5 for a cluster size of 10–15 households per EA. Using a deff of 1.0 (i.e. ignoring clustering) systematically underestimates the required sample size and produces confidence intervals that are too narrow, leading to false precision in reported findings.
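To make the effect on sample size concrete, the sketch below inflates a simple-random-sample requirement by an assumed design effect. It uses the standard formula for estimating a proportion within a given margin of error; the function names and the example inputs are illustrative assumptions, and a real calculation would also allow for non-response.

```python
import math


def srs_sample_size(p, margin, z=1.96):
    """Households needed under SRS to estimate a proportion p within +/- margin."""
    return z ** 2 * p * (1 - p) / margin ** 2


def cluster_sample_size(p, margin, deff, z=1.96):
    """SRS requirement inflated by the design effect, rounded up to whole households."""
    return math.ceil(srs_sample_size(p, margin, z) * deff)


def effective_sample_size(n, deff):
    """SRS-equivalent size of a cluster sample of n households."""
    return n / deff
```

For a proportion near 0.5 and a margin of five percentage points, `cluster_sample_size(0.5, 0.05, 1.0)` gives 385 households, while the same call with `deff=1.5` gives 577. Ignoring clustering in this case would leave the survey roughly a third short of the sample it needs.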
The deff must be estimated from the baseline data and used to inform midline and endline sample size revisions. This is standard practice in multi-wave evaluations, but it requires that the baseline team archive the raw data with EA identifiers — something that is frequently not done.
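One common way to do this from archived baseline data is to estimate the ICC for each key indicator with a one-way ANOVA estimator and convert it to an approximate design effect. The sketch below assumes the data are held as a dict from EA identifier to the list of household outcomes in that EA (a hypothetical layout), and it ignores sampling weights and stratification, both of which also affect the design effect in a real survey.

```python
def anova_icc(clusters):
    """One-way ANOVA estimate of the intra-cluster correlation.

    clusters: dict mapping EA id -> list of numeric household outcomes.
    Needs at least two non-empty EAs and some variation in the outcome.
    """
    groups = [g for g in clusters.values() if len(g) > 0]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-EA and within-EA sums of squares.
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ssw = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    msb = ssb / (k - 1)
    msw = ssw / (n_total - k)
    # Adjusted average cluster size, which allows for unequal EA sizes.
    m0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)
    return (msb - msw) / (msb + (m0 - 1) * msw)


def design_effect(icc, cluster_size):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * icc."""
    return 1 + (cluster_size - 1) * icc
```

The estimate can be slightly negative when households within EAs are no more alike than households across EAs; in planning, such values are usually treated as zero. None of this is possible if the baseline file has been stripped of EA identifiers.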
How many households should be sampled per cluster?
The optimal cluster size (households per EA) balances statistical efficiency against field cost. Increasing the cluster size reduces the number of EAs required (and thus travel cost), but it also increases the design effect, so each additional household in the same EA adds less precision than a household in a new EA. For most Ghana-based household surveys, the standard range is 10–20 households per EA, selected by systematic sampling with a random start from the EA household list.
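Systematic selection with a random start is simple to implement once the EA household list exists. The sketch below is one way to do it, assuming the list is an ordered Python list of household identifiers; it uses a fractional sampling interval so that exactly the requested number of households is selected even when the list length is not a multiple of the cluster size.

```python
import random


def systematic_sample(household_list, n, seed=None):
    """Select n households at a fixed interval after a random start.

    Returns the whole list if it has n or fewer households.
    """
    total = len(household_list)
    if n >= total:
        return list(household_list)
    rng = random.Random(seed)
    interval = total / n              # greater than 1, so selected indices never repeat
    start = rng.random() * interval   # random start in [0, interval)
    indices = [min(int(start + i * interval), total - 1) for i in range(n)]
    return [household_list[i] for i in indices]
```

Recording the seed, or the start and interval, for each EA lets supervisors verify afterwards that enumerators visited the selected households rather than convenient substitutes.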
For programmes requiring high sub-national precision (district-level estimates in sparsely populated regions), a smaller cluster size with more EAs per district is more efficient. For surveys with tight field budgets in remote areas, a larger cluster size per EA may be the only practical option — this should be documented as a design constraint with its implications for precision stated explicitly.
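The trade-off can be quantified with the usual approximation deff = 1 + (m - 1) × ICC, where m is the cluster size. The sketch below is illustrative only: the ICC and the SRS requirement are assumed inputs, and the function name is hypothetical.

```python
import math


def eas_required(n_srs, cluster_size, icc):
    """Return (deff, households, n_eas) for a given cluster size.

    n_srs: households that would be needed under simple random sampling.
    households is rounded up to a whole number of clusters.
    """
    deff = 1 + (cluster_size - 1) * icc
    # Round before taking the ceiling so floating-point noise cannot add a household.
    needed = math.ceil(round(n_srs * deff, 6))
    n_eas = math.ceil(needed / cluster_size)
    return deff, n_eas * cluster_size, n_eas
```

With an assumed ICC of 0.05 and an SRS requirement of 400 households, cluster sizes of 10, 20, and 30 give 58, 39, and 33 EAs, and 580, 780, and 990 households respectively. Larger clusters cut the number of EAs to visit but raise the total number of interviews, which is the constraint the design documentation should state explicitly.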
How do you handle seasonal fieldwork variation in Ghana?
Ghana has two main rainfall seasons (March–July and September–November in the South; May–September in the North), and agricultural labour demand peaks during planting and harvest periods. Survey fieldwork in rural areas during these periods faces higher household absenteeism and reduced cooperation from farming households. For programme evaluations measuring agricultural outcomes, conducting fieldwork in the same season across waves is essential — if the baseline was conducted in the dry season and the endline in the wet season, seasonal variation will contaminate the treatment effect estimate.
For the AfDB GASSLIP sanitation baseline in Greater Accra, fieldwork timing was designed around the market calendar and urban household patterns rather than agricultural seasonality, which is a different set of constraints. Urban household availability in Accra's peri-urban communities is affected by trading hours, shift patterns, and, in low-income areas, the frequency with which household members are away seeking casual employment. Field protocols must anticipate and document the specific access constraints of the survey population.
What are the most common field data quality failures in Ghana household surveys?
The most common field data quality failures are interviewer effects (leading questions, proxy completion without flagging, fabricated responses), inadequate supervision (supervisors covering too many teams across too large a geographic area), and poor instrument translation (questionnaires translated into Twi, Ga, Dagbani, or Ewe in a register that enumerators who are not native speakers of the survey language cannot read or deliver reliably). Each of these is preventable through enumerator training, real-time data quality monitoring (computer-assisted personal interviewing, or CAPI, with logic checks, plus audio recording of a sample of interviews), and back-check surveys.
CAPI deployment — using tablets with offline-capable survey software (ODK Collect, SurveyCTO, KoBoToolbox) — has become standard for most donor-funded surveys in Ghana, and it eliminates a significant share of manual data entry errors. However, CAPI does not automatically eliminate interviewer effects; it makes them easier to detect through metadata (interview duration, GPS coordinates, response pattern analysis), which is why CAPI quality dashboards should be monitored daily during fieldwork, not retrospectively.
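A daily check of this kind does not need to be elaborate. The sketch below flags interviews that are much shorter than the day's typical duration and summarises the share of flagged interviews per enumerator. The record layout (keys `interview_id`, `enumerator`, `duration_min`) and the 50% threshold are assumptions for illustration, not the export format or default of any particular CAPI platform.

```python
from collections import Counter
from statistics import median


def flag_short_interviews(records, min_fraction=0.5):
    """Return ids of interviews shorter than min_fraction of the median duration.

    records: list of dicts with hypothetical keys
    'interview_id', 'enumerator', 'duration_min'.
    """
    typical = median(r["duration_min"] for r in records)
    threshold = typical * min_fraction
    return [r["interview_id"] for r in records if r["duration_min"] < threshold]


def flagged_share_by_enumerator(records, min_fraction=0.5):
    """Share of each enumerator's interviews that were flagged as too short."""
    flagged = set(flag_short_interviews(records, min_fraction))
    totals = Counter(r["enumerator"] for r in records)
    hits = Counter(r["enumerator"] for r in records if r["interview_id"] in flagged)
    return {e: hits[e] / totals[e] for e in totals}
```

A flag is a prompt for a supervisor to listen to the audio or schedule a back-check, not proof of fabrication; short interviews can also reflect small households or legitimate skip patterns.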
Frequently asked questions
- Can we use our programme's beneficiary list as the sampling frame?
- Only if your survey's inference is explicitly limited to programme beneficiaries. A beneficiary list frame cannot support claims about the general population, the comparison group, or any population outside the programme roll. For an evaluation that needs to compare beneficiaries with non-beneficiaries, a probability sample from the GSS EA frame is required.
- How do we account for the 2019 regional reorganisation in trend analysis?
- Aggregate new regions back to their parent 2010 PHC regions for trend comparisons, or explicitly model the split and document the comparability limitation. Never present data from the 16-region structure as directly comparable to 10-region data without a bridging note.
- What survey software does Devtplan use for CAPI deployment?
- Devtplan has deployed ODK Collect, SurveyCTO, and KoBoToolbox on various assignments, and selects the platform based on project requirements (offline reliability, licensing cost, donor requirements, and dataset size). We manage server deployment, form design with built-in logic checks, and daily quality dashboards during fieldwork.
- How long does a national household survey in Ghana typically take to complete?
- A nationally representative survey of 1,500–3,000 households covering 10 or more regions typically requires 6–8 weeks of field enumeration, preceded by 2–3 weeks of enumerator training and CAPI deployment. Data cleaning and analysis add 4–8 weeks depending on complexity. Compressed timelines are possible with a larger field team but increase quality monitoring demands.