Quality Health Data


Quality data is critical to assessing the global burden of disease and developing public health initiatives. We live in an era of unprecedented technological advancement, which has provided us with increased access to data. However, just because data has become more available does not mean that all data is accurate and reliable. This article describes quality data, identifies how one can ensure the quality of their data, and discusses the challenges associated with obtaining quality data in resource-poor settings.

What is Quality Data?

Data quality is defined as “the totality of features and characteristics of a data set that bear on its ability to satisfy the needs that result from the intended use of the data.”(1) High-quality data effectively satisfies its intended use in decision making and planning. In a 1994 study, Wang et al. analyzed the different attributes of good-quality data and organized them into fifteen categories.(2)

  1. Access Security: Data must be restricted and kept secure to ensure confidentiality and the protection of civil liberties.
  2. Accessibility: Data must be available or easily retrievable.
  3. Accuracy: Data must be correct and free of errors.
  4. Appropriate Amount of Data: The quantity of data must be appropriate.
  5. Believability: Data must be regarded as true and credible.
  6. Completeness: Data must be sufficient in breadth, depth, and scope for its desired use.
  7. Concise Representation: Data must be represented without being overwhelming.
  8. Ease of Understanding: Data must be clear.
  9. Interpretability: Data must be in appropriate language and units.
  10. Objectivity: Data must be unbiased.
  11. Relevancy: Data must be applicable to the task at hand.
  12. Representational Consistency: Data must be presented in a consistent format.
  13. Reputation: Data must come from a trusted source.
  14. Timeliness: Data should be recorded as quickly as possible and used within a reasonable time period.
  15. Value-Added: Data must provide valuable insight.

High-quality data has many characteristic attributes. The appropriate set of attributes, and the acceptable level of each, may differ depending on the research situation and setting. It is also important to note that many of these attributes are interdependent. For example, data that arrives too late, or takes too long to gather, will no longer be relevant. Similarly, data must be interpretable (in the appropriate language and units) in order to be easily understood.
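Some of these attributes can be expressed as simple measurements. The following is a minimal sketch, not a standard method, that scores a small data set on completeness and timeliness. The records, the field names, and the 30-day delay threshold are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical records: field names and values are invented for illustration.
records = [
    {"id": 1, "birth_date": date(2023, 1, 5), "recorded_on": date(2023, 1, 7), "weight_kg": 3.2},
    {"id": 2, "birth_date": date(2023, 1, 9), "recorded_on": date(2023, 3, 20), "weight_kg": None},
    {"id": 3, "birth_date": date(2023, 2, 1), "recorded_on": date(2023, 2, 2), "weight_kg": 2.9},
]

REQUIRED_FIELDS = ("birth_date", "recorded_on", "weight_kg")
MAX_DELAY_DAYS = 30  # assumed threshold; the acceptable delay depends on the study


def completeness(rows, fields):
    """Fraction of required field values that are present (not None)."""
    total = len(rows) * len(fields)
    present = sum(1 for row in rows for f in fields if row.get(f) is not None)
    return present / total if total else 0.0


def timeliness(rows, max_delay_days):
    """Fraction of dated rows recorded within max_delay_days of the event."""
    dated = [r for r in rows if r.get("birth_date") and r.get("recorded_on")]
    on_time = sum(
        1 for r in dated
        if (r["recorded_on"] - r["birth_date"]).days <= max_delay_days
    )
    return on_time / len(dated) if dated else 0.0


completeness_score = completeness(records, REQUIRED_FIELDS)
timeliness_score = timeliness(records, MAX_DELAY_DAYS)
print(f"completeness: {completeness_score:.2f}")
print(f"timeliness: {timeliness_score:.2f}")
```

Scores like these only cover attributes that can be counted. Attributes such as believability or reputation still require human judgment about the source and setting.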

Ensuring Quality Data

To ensure quality data, the data must be managed correctly from the time of collection until the time of analysis. First, the data must be recorded properly on the survey or questionnaire, avoiding measurement errors and faulty recording. Accurate measurements and responses depend on mutual trust and understanding between participants and research staff: participants who trust the researchers are more likely to provide reliable responses. It is therefore very important that data collection be conducted by local staff who are familiar with the culture and language of the study population. The staff collecting the data should also receive training in culturally sensitive household entry procedures, write legibly, and record responses accurately. Next, the data needs to be verified and analyzed. Data analysis should be conducted by trained personnel and is normally done by entering the data into an electronic database. Programming errors and computer misreads must be avoided during this step. Lastly, post-entry data cleaning and extraction into a data set for analysis must be done carefully, so that the cleaning itself does not introduce errors.(3)
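One way to approach the verification and cleaning steps is to flag suspicious entries for human review instead of correcting them automatically. The sketch below illustrates that idea and is not the procedure used in the cited trial. The field names, plausibility limits, and sample records are assumptions chosen for illustration.

```python
# Hypothetical plausibility rules; the fields and limits are assumptions for
# illustration and would normally be set by the study protocol.
RULES = {
    "age_years": lambda v: isinstance(v, (int, float)) and 0 <= v <= 120,
    "weight_kg": lambda v: isinstance(v, (int, float)) and 0.5 <= v <= 300,
    "sex": lambda v: v in {"F", "M"},
}


def find_problems(record):
    """Return a list of (field, reason) pairs for one entered record."""
    problems = []
    for field, is_valid in RULES.items():
        value = record.get(field)
        if value is None or value == "":
            problems.append((field, "missing"))
        elif not is_valid(value):
            problems.append((field, f"implausible value: {value!r}"))
    return problems


def review_queue(records):
    """Map record id -> problems, keeping only records that need review."""
    queue = {}
    for record in records:
        problems = find_problems(record)
        if problems:
            queue[record["id"]] = problems
    return queue


entered = [
    {"id": "A01", "age_years": 34, "weight_kg": 61.5, "sex": "F"},
    {"id": "A02", "age_years": 340, "weight_kg": 58.0, "sex": "M"},  # likely a keying error
    {"id": "A03", "age_years": 27, "weight_kg": None, "sex": "f"},
]

flagged = review_queue(entered)
for record_id, problems in flagged.items():
    print(record_id, problems)
```

The code only reports problems and never edits a value. The decision about how to correct a record stays with trained staff who can check the original questionnaire, which helps avoid introducing new errors during cleaning.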

Why Quality Data is Important

“Poor and inaccurate information is hampering global aid efforts to improve the lives of the world’s poor.”(4) Good public health decision making depends on accurate and timely statistics and data. It is critical that quality health data be obtained in order to assess the magnitude and distribution of the disease burden, so that programs can be developed to address health needs worldwide. Vital statistics, such as births, deaths, and causes of death, are also critical for addressing health needs and recording progress towards the Millennium Development Goals and other developmental objectives.(5) In a clinical setting, quality data is important because it can improve the care provided. For example, a study on child mental health services showed that 58% of the patients had improved outcomes after a data quality improvement project was implemented.(6) It is equally important that researchers not use inaccurate data for programming or planning purposes, since the associated medical errors can lead to long-term damage or death in patients, as well as economic losses.(7)

Challenges With Obtaining Quality Data in Resource-Poor Settings

Lack of Data

All high-income countries have national civil registration systems that record births and deaths, and these countries generate statistics from those records. Unfortunately, such statistics and registration systems are not usually available in lower-income countries, where premature mortality and infant mortality are highest. “Too many people, especially the poor, are never counted; they are born, and live and die uncounted and ignored. It is a fundamental principle of human rights that every life counts, that every individual matters. If we are to give life to such principles, it is time to start counting everyone.”(8) Developing countries currently account for 99% of unregistered births worldwide, totaling an estimated 48 million unregistered births. Country data on the percentage of births registered showed that the 42 countries with complete birth registration had a mean purchasing power parity of $17,357, while the 57 countries that reported lower levels of birth registration had a mean purchasing power parity of only $2,675.(9) In addition, over half the countries in Africa and Southeast Asia record no data on cause of death.(10)

It is especially important that resource-poor settings begin to create civil registration systems in order to measure vital statistics. Vital statistics, such as birth and death rates and cause of death, are critical for targeting and assessing the effectiveness of public health initiatives.(11) It is also important to obtain quality health data because public health decisions can be driven in wrong directions when whole categories of data are not identified.(12) It is important to note, however, that the value of data lies in their use and not in their collection. Data must not be left unanalyzed, which is often what happens in resource-poor settings.

Lack of Infrastructure

In resource-poor settings, poor roads, political instability, and crime may reduce the completeness and accuracy of the collected data. The lack of technology and data management infrastructure also presents a challenge to data management and collection. For example, specimen collection and processing of bodily fluids is especially dependent on timely and reliable analysis, which is not always possible in resource-poor settings due to the lack of infrastructure and technology. A lack of cell phones and a harsh climate also impede data flow.(13)

Population Demographics

In resource-poor settings, populations are often highly mobile. Thus, problems collecting data may arise when household heads are away for extended periods in search of food, water, or casual work.(14) In addition, cultural and linguistic differences between the research staff and respondents may lead to misreporting. For example, a pilot study in The Gambia asked people to record their spending by using a grid with pictures of items and pictures of currency. This was a linguistically competent initiative because it enabled even the illiterate to participate and record their spending, but it was not culturally accurate. To represent livestock, the researchers drew a cow with one hump. They later realized that few people were recording spending on livestock because the cow was missing a hump.(15) In addition, when researchers from developed countries arrive in a resource-poor country to conduct research, local community members may treat them with skepticism. Thus, if possible, a community advisory board should be consulted to enhance the cultural appropriateness of questionnaires and to introduce the study to the community, thereby reducing community members’ worries and suspicions.(16)


(1) “Background Issues on Data Quality.” Connecting for Health Common Framework. (2006). http://bok.ahima.org/PdfView?oid=63654.

(2) Abate, M. L., Diegert, K. V., & Allen, H. W. (1998). A hierarchical approach to improving data quality. Data Quality, 4(1), 365-369.

(3) Van den Broeck, J., et al. “Maintaining data integrity in a rural clinical trial.” Clinical Trials. 4 (2007): 572-582.

(5) Setel, P. W., Macfarlane, S. B., Szreter, S., Mikkelsen, L., Jha, P., Stout, S., ... & Monitoring of Vital Events (MoVE) writing group. (2007). A scandal of invisibility: making everyone count by counting everyone. The Lancet, 370(9598), 1569-1577.

(6) “Background Issues on Data Quality.” Connecting for Health Common Framework. (2006). http://bok.ahima.org/PdfView?oid=63654.

(7) Ibid.

(8) AbouZahr, C. “Who Counts? 4 The Way Forward.” The Lancet. 370. (2007): 1791-99.

(9) Byass, P. “The Unequal World of Health Data.” PLOS Med. 6.11 (2009).

(10) Setel, P. W., Macfarlane, S. B., Szreter, S., Mikkelsen, L., Jha, P., Stout, S., ... & Monitoring of Vital Events (MoVE) writing group. (2007). A scandal of invisibility: making everyone count by counting everyone. The Lancet, 370(9598), 1569-1577.

(11) Ibid.

(12) Lopez, A. D., Abouzahr, C., Shibuya, K., Gollogly, L., & Cleland, J. (2007). Who Counts? 4 The way forward. Commentary. Lancet (British edition), 370(9601).

(13) Van den Broeck, J., et al. “Maintaining data integrity in a rural clinical trial.” Clinical Trials. 4 (2007): 572-582.

(14) Wiseman, V., Conteh, L., Matovu, F. “Using diaries to collect data in resource-poor settings: questions on design and implementation.” Oxford University Press.

(15) Ibid.

(16) Van den Broeck, J., et al. “Maintaining data integrity in a rural clinical trial.” Clinical Trials. 4 (2007): 572-582.