Ch. 3: General Analytic Guideline : Bloomberg Philanthropies Data For Health Initiative - NCD Mobile Phone Surveys

Overview

The NCD Mobile Phone Survey Country Report and Fact Sheet are important documents that enable countries to present key findings and facilitate cross-country comparison. The Country Report provides detailed results in the context of each country’s unique NCD surveys. The Fact Sheet is intended to provide an overview of the key findings and highlights of the survey for a broad audience. This document provides general data analysis and reporting guidelines and recommendations for the Country Report and a template for the Fact Sheet.

Reporting Point Estimates and Confidence Intervals

The NCD Mobile Phone Survey employs a complex sampling design; therefore, analysis must account for stratification, multiplicity, naturally occurring clustering, and unequal selection probabilities to obtain valid point estimates, standard errors (SEs), confidence intervals (CIs), and tests of hypotheses. If the sampling design is not accounted for, the variance may be either underestimated (which usually occurs when sampling designs include clustering and unequal probabilities of selection) or overestimated (which can occur with stratification and multiplicity). Report the weighted point estimate along with the lower and upper bounds of the 95% CI. The 95% CI can be calculated from the point estimate and its SE (i.e., lower bound = point estimate − 1.96 × SE; upper bound = point estimate + 1.96 × SE), where the SE is obtained using an appropriate method for variance estimation of complex survey data. The commonly used variance estimation methods supported by statistical software for a two-phase sample design are Successive Difference Replication (SDR) and model-assisted estimation.
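The CI formula above can be illustrated with a minimal Python sketch. The function name and the example values (a prevalence of 0.250 with an SE of 0.020) are hypothetical; the SE is assumed to come from a design-based variance estimation method, which this sketch does not perform.

```python
def wald_ci_95(estimate, se):
    """Return (lower, upper) bounds of a 95% CI: estimate -/+ 1.96 * SE."""
    z = 1.96
    return estimate - z * se, estimate + z * se


# Hypothetical weighted prevalence and its design-based standard error.
prevalence = 0.250
standard_error = 0.020

lower, upper = wald_ci_95(prevalence, standard_error)
print(f"{prevalence:.1%} (95% CI: {lower:.1%}, {upper:.1%})")
# prints: 25.0% (95% CI: 21.1%, 28.9%)
```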

Currently, only two statistical software packages, Stata and R, support these variance estimation methods. Stata provides SDR through the sdr option in the svy module. R provides the twophase function in the survey package, as well as related functionality in the osDesign package.

Reporting Estimates in Subgroups

The suggested tables in Section 6 include the recommended subgroups for reporting NCDs and NCD risk factors. The variables used for classifying subgroups include the following selected demographic characteristics from the core questionnaire:

Sex. Male and female
Age. Four broad age groups (18–29, 30–44, 45–59, and 60 years and older)
However, countries may choose to adjust subgroups based on their specific needs.
Statistical tests are used to determine the significance of differences between subgroups. Differences between point estimates should be considered statistically significant if p<0.05.
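This guideline does not name a specific test. As one possible illustration, the sketch below compares two subgroup estimates with a large-sample z-test, assuming the two subgroups are independent and that each SE was obtained from a design-based variance estimator. The function name and the example values are hypothetical.

```python
import math


def z_test_difference(est_a, se_a, est_b, se_b):
    """Two-sided z-test for the difference between two independent estimates.

    Returns (z, p). Assumes independence and a normal approximation.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = (est_a - est_b) / se_diff
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Hypothetical prevalence estimates for males and females.
z, p = z_test_difference(0.30, 0.020, 0.22, 0.025)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 0.05: {p < 0.05}")
```

For formal reporting, the survey procedures in Stata or R remain the appropriate tools, because they test differences while accounting for the sampling design.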

Evaluating Missing Data

Typically, “don’t know” and “refused” responses are excluded from analysis for each specific indicator. See the Questions and Indicators Manual for specific guidance on addressing missing data for each indicator.

When a sampled person refuses to answer a question, the “refused” response is assigned a value of 999. A “don’t know” response is assigned a value of 888. Failing to identify these types of missing data, or treating the assigned values for “refused” or “don’t know” as real values, will distort analysis results. Therefore, the analyst must recode both the “refused” value (999) and the “don’t know” value (888) as missing values.
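A minimal Python sketch of this recoding step follows. It assumes responses are held in a plain list and uses None to represent a missing value; the function name and the example responses are hypothetical.

```python
REFUSED = 999      # code assigned to a "refused" response
DONT_KNOW = 888    # code assigned to a "don't know" response
MISSING_CODES = {REFUSED, DONT_KNOW}


def recode_missing(values):
    """Replace the 'refused' and 'don't know' codes with None (missing)."""
    return [None if v in MISSING_CODES else v for v in values]


# Hypothetical responses to a yes/no item (1 = yes, 2 = no).
raw = [1, 2, 999, 1, 888, 2]
clean = recode_missing(raw)
valid = [v for v in clean if v is not None]

print(clean)  # [1, 2, None, 1, None, 2]
print(f"{len(valid)} of {len(raw)} responses are usable for this indicator")
```

Without this step, a mean or proportion computed on the raw values would treat 888 and 999 as real answers.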

Missing data may bias the analysis results and some adjustments may be considered. As a general rule, if 10% or less of the data for the main outcome variable for a specific indicator are missing for eligible respondents, continuing analysis without further evaluation or adjustment is usually acceptable (Langkamp, 2010). If, however, more than 10% of the data for an indicator are missing, the analyst may need to further examine respondents and nonrespondents with respect to the main outcome variable and decide whether imputation of missing values or use of adjusted weights is necessary. Note that even if the overall item nonresponse rate is less than 10%, a subgroup item nonresponse rate within the indicator may exceed 10% and need to be further examined for statistical bias.
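The 10% rule can be checked overall and within each subgroup, as in the sketch below. It assumes missing values are already coded as None and computes unweighted rates; whether to use weighted or unweighted rates is not specified here, so that choice is an assumption. The function names and records are hypothetical.

```python
from collections import defaultdict

THRESHOLD = 0.10  # 10% item nonresponse rule of thumb


def nonresponse_rate(values):
    """Unweighted share of eligible respondents with a missing (None) value."""
    if not values:
        raise ValueError("no eligible respondents")
    return sum(v is None for v in values) / len(values)


def rates_by_subgroup(records):
    """records: iterable of (subgroup, value) pairs. Returns {subgroup: rate}."""
    groups = defaultdict(list)
    for subgroup, value in records:
        groups[subgroup].append(value)
    return {g: nonresponse_rate(vals) for g, vals in groups.items()}


# Hypothetical (age group, response) records; None marks a missing response.
records = (
    [("18-29", 1)] * 14 + [("18-29", None)] * 1
    + [("30-44", 0)] * 15
    + [("60+", 1)] * 8 + [("60+", None)] * 2
)

overall = nonresponse_rate([value for _, value in records])
by_group = rates_by_subgroup(records)
flagged = sorted(g for g, r in by_group.items() if r > THRESHOLD)

print(f"overall: {overall:.1%}")              # overall: 7.5%
print(f"subgroups needing review: {flagged}")  # ['60+']
```

In this example the overall rate is below 10%, but one subgroup exceeds it and would need further examination.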

Reporting Small Sample Size

If an unweighted cell sample size or denominator is less than 25, it is recommended to report only unweighted data. The point estimate and 95% CI should be suppressed and replaced with a dash (—) in the cell and an explanatory footnote at the bottom of the table. For example, “— indicates an estimate based on an unweighted sample size of less than 25 and has been suppressed.”
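The suppression rule can be applied when formatting table cells, as in this minimal sketch. The function name, the cell layout, and the example values are hypothetical.

```python
SUPPRESSED = "\u2014"   # the dash shown in place of a suppressed estimate
MIN_UNWEIGHTED_N = 25


def format_cell(estimate, lower, upper, unweighted_n):
    """Return 'estimate (lower, upper)', or a dash if the cell is too small."""
    if unweighted_n < MIN_UNWEIGHTED_N:
        return SUPPRESSED
    return f"{estimate:.1f} ({lower:.1f}, {upper:.1f})"


# Hypothetical percentages with their unweighted denominators.
print(format_cell(12.34, 9.0, 15.7, unweighted_n=25))  # 12.3 (9.0, 15.7)
print(format_cell(12.34, 9.0, 15.7, unweighted_n=24))  # suppressed: prints the dash
```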

Computing Population Counts

Calculating population counts in addition to the prevalence of a health risk factor is often helpful for understanding its direct public health impact or burden. The basic steps for calculating a population count are as follows:

Estimate the unadjusted (crude) prevalence of the NCD or NCD risk factor.
Determine the relevant population totals from the country’s census or equivalent population projection.
Multiply the prevalence estimate of the NCD or NCD risk factor by the corresponding country census population total to obtain an estimate of the number of country citizens with the NCD or NCD risk factor.
Population counts should be reported to the nearest thousand, with a 95% CI computed from the prevalence estimate and the SE.
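The steps above can be sketched in Python as follows. The sketch assumes the census total is a fixed number without sampling error, so the CI for the count is obtained by scaling the bounds of the prevalence CI. The function names and the example figures are hypothetical.

```python
def to_nearest_thousand(x):
    """Round a count to the nearest thousand and return an int."""
    return round(x / 1000) * 1000


def population_count(prevalence, se, population_total):
    """Return (count, lower, upper), each rounded to the nearest thousand.

    prevalence and se are proportions (0 to 1); population_total is treated
    as a known constant.
    """
    lower = prevalence - 1.96 * se
    upper = prevalence + 1.96 * se
    return tuple(
        to_nearest_thousand(p * population_total)
        for p in (prevalence, lower, upper)
    )


# Hypothetical crude prevalence, SE, and adult population total.
count, count_lower, count_upper = population_count(0.25, 0.02, 12_345_678)
print(f"{count:,} (95% CI: {count_lower:,}, {count_upper:,})")
# prints: 3,086,000 (95% CI: 2,602,000, 3,570,000)
```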

Using Statistical Analysis Software Packages

To account for the complex survey design, the sample design information should be explicitly used when producing statistical estimates or undertaking statistical analysis of the NCD Mobile Phone Survey data. The sample weights reflect the unequal probabilities of selection, adjustments for nonresponse, and adjustments to country-specific population sizes. Thus, the proper sample weight and stratification of the design must be incorporated into an analysis to obtain the correct estimates and standard errors of the estimates.
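To show how the sample weights enter a point estimate, the sketch below computes a weighted prevalence for a binary indicator. It produces the point estimate only; standard errors still require the design-based methods available in Stata or R. The function name, outcomes, and weights are hypothetical.

```python
def weighted_prevalence(outcomes, weights):
    """Weighted proportion of a 0/1 outcome, skipping missing (None) outcomes."""
    pairs = [(y, w) for y, w in zip(outcomes, weights) if y is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight <= 0:
        raise ValueError("no valid responses with positive weight")
    return sum(w * y for y, w in pairs) / total_weight


# Hypothetical outcomes (1 = has risk factor) and final sample weights.
outcomes = [1, 0, 1, 0, None]
weights = [100, 300, 200, 400, 250]

weighted = weighted_prevalence(outcomes, weights)
unweighted = sum(y for y in outcomes if y is not None) / 4

print(f"weighted: {weighted:.1%}, unweighted: {unweighted:.1%}")
# prints: weighted: 30.0%, unweighted: 50.0%
```

The gap between the two figures shows why unweighted proportions should not be reported as population estimates.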

Currently, most statistical software programs, such as SAS, SUDAAN, and SPSS, do not offer procedures or modules for analyzing survey data with a two-phase sampling design. However, data from this design can be analyzed using Stata or R. Stata offers the sdr option in the svy module. R offers the twophase function in the survey package, as well as related functionality in the osDesign package. Technical assistance is available for the use of both Stata and R for data analysis. Note that statistical procedures that assume the data come from a simple random sample are usually not appropriate for analyzing survey data with a complex design. Ignoring the complex design can lead to biased estimates and overstated significance levels (Brogan, 1998).
