{"id":80455,"date":"2019-07-03T14:24:35","date_gmt":"2019-07-03T12:24:35","guid":{"rendered":"https:\/\/www.scribbr.nl\/?p=80455"},"modified":"2023-06-22T13:29:55","modified_gmt":"2023-06-22T11:29:55","slug":"reliability-vs-validity","status":"publish","type":"post","link":"https:\/\/www.scribbr.com\/methodology\/reliability-vs-validity\/","title":{"rendered":"Reliability vs. Validity in Research | Difference, Types and Examples"},"content":{"rendered":"

Reliability<\/strong> and validity<\/strong> are concepts used to evaluate the quality of research. They indicate how well a method<\/a>, technique, or test measures something. Reliability is about the consistency<\/span> of a measure, and validity is about the accuracy<\/span> of a measure.<\/p>\n

It\u2019s important to consider reliability and validity when you are creating your research design<\/a>, planning your methods, and writing up your results, especially in quantitative research<\/a>. Failing to do so can lead to several types of research bias<\/a> and seriously affect your work.<\/p>\n\n\n\n\n\n\n\n
Reliability vs validity<\/caption>\n
<\/th>\nReliability<\/span><\/th>\nValidity<\/span><\/th>\n<\/tr>\n<\/thead>\n
What does it tell you?<\/th>\nThe extent to which the results can be reproduced when the research is repeated under the same conditions.<\/td>\nThe extent to which the results really measure what they are supposed to measure.<\/td>\n<\/tr>\n
How is it assessed?<\/th>\nBy checking the consistency of results across time, across different observers, and across parts of the test itself.<\/td>\nBy checking how well the results correspond to established theories and other measures of the same concept.<\/td>\n<\/tr>\n
How do they relate?<\/th>\nA reliable measurement is not always valid: the results might be reproducible<\/a>, but they\u2019re not necessarily correct.<\/td>\nA valid measurement is generally reliable: if a test produces accurate results, they should be reproducible.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n

<\/p>\n

Understanding reliability vs validity<\/h2>\n

Reliability and validity are closely related, but they mean different things. A measurement can be reliable without being valid. However, if a measurement is valid, it is usually also reliable.<\/p>\n

What is reliability?<\/span><\/h3>\n

Reliability refers to how consistently a method measures something.\u00a0If the same result can be consistently achieved by using the same methods under the same circumstances, the measurement is considered reliable.<\/p>\n

You measure the temperature of a liquid sample several times under identical conditions. The thermometer displays the same temperature every time, so the results are reliable.<\/div>\n
A doctor uses a symptom questionnaire<\/a> to diagnose a patient with a long-term medical condition. Several different doctors use the same questionnaire with the same patient but give different diagnoses. This indicates that the questionnaire has low reliability as a measure of the condition.<\/div>\n

What is validity?<\/span><\/h3>\n

Validity refers to how accurately a method measures what it is intended to measure. If research has high validity, that means it produces results that correspond to real properties, characteristics, and variations in the physical or social world.<\/p>\n

High reliability is one indicator that a measurement is valid. If a method is not reliable, it probably isn\u2019t valid.<\/p>\n

If the thermometer shows different temperatures each time, even though you have carefully controlled conditions to ensure the sample\u2019s temperature stays the same, the thermometer is probably malfunctioning, and therefore its measurements are not valid.<\/p>\n

If a symptom questionnaire produces the same diagnosis when answered at different times and with different doctors, this is one indication that it may be a valid measurement of the medical condition.<\/div>\n

However, reliability on its own is not enough to ensure validity. Even if a test is reliable, it may not accurately reflect the real situation.<\/p>\n

The thermometer that you used to test the sample gives reliable results. However, the thermometer has not been calibrated properly, so the result is 2 degrees lower than the true value. Therefore, the measurement is not valid.<\/div>\n
A group of participants take a test designed to measure working memory. The results are reliable, but participants\u2019 scores correlate strongly with their level of reading comprehension. This indicates that the method might have low validity: the test may be measuring participants\u2019 reading comprehension instead of their working memory.<\/div>\n
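The calibration example above can be made concrete with a small Python sketch. It assumes you have repeated readings of a reference sample whose true temperature is known. The function name `summarize_readings` and the readings themselves are invented for illustration. A small spread points to a reliable instrument, while a bias far from zero points to a validity problem.

```python
from statistics import mean, pstdev

def summarize_readings(readings, true_value):
    '''Return (spread, bias) for repeated readings of a known reference value.'''
    spread = pstdev(readings)            # small spread: readings are consistent
    bias = mean(readings) - true_value   # far from zero: readings are systematically off
    return spread, bias

# Hypothetical data: five readings of a sample whose true temperature is 25.0 degrees
readings = [23.0, 23.1, 22.9, 23.0, 23.0]
spread, bias = summarize_readings(readings, true_value=25.0)
print(f'spread={spread:.2f}, bias={bias:.2f}')
```

In this invented data set the readings barely vary, yet they sit about 2 degrees below the reference value, which is the reliable-but-not-valid pattern described in the thermometer example.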

Validity is harder to assess than reliability, but it is even more important. To obtain useful results, the methods you use to collect data<\/a> must be valid: the research must be measuring what it claims to measure. This ensures that your discussion<\/a> of the data and the conclusions<\/a> you draw are also valid.<\/p>\n

How are reliability and validity assessed?<\/h2>\n

Reliability can be estimated by comparing different versions of the same measurement. Validity is harder to assess, but it can be estimated by comparing the results to other relevant data or theory. Methods of estimating reliability and validity are usually split up into different types.<\/p>\n

Types of reliability<\/span><\/h3>\n

Different types of reliability can be estimated through various statistical methods.<\/p>\n\n\n\n\n\n\n\n
Types of reliability<\/a><\/caption>\n
Type of reliability<\/th>\nWhat does it assess?<\/th>\nExample<\/th>\n<\/tr>\n<\/thead>\n
Test-retest reliability<\/a><\/th>\nThe consistency of a measure across time<\/strong>: do you get the same results when you repeat the measurement?<\/td>\nA group of participants complete a questionnaire<\/a> designed to measure personality traits. If they repeat the questionnaire days, weeks or months apart and give the same answers, this indicates high test-retest reliability.<\/td>\n<\/tr>\n
Interrater reliability<\/a><\/th>\nThe consistency of a measure across raters or observers<\/strong>: do you get the same results when different people conduct the same measurement?<\/td>\nBased on an assessment criteria checklist, five examiners submit substantially different results for the same student project. This indicates that the assessment checklist has low interrater reliability (for example, because the criteria are too subjective).<\/td>\n<\/tr>\n
Internal consistency<\/a><\/th>\nThe consistency of the measurement itself<\/strong>: do you get the same results from different parts of a test that are designed to measure the same thing?<\/td>\nYou design a questionnaire to measure self-esteem. If you randomly split the results into two halves, there should be a strong correlation<\/a> between the two sets of results. If the two results are very different, this indicates low internal consistency.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n
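As a rough illustration of the three types in the table, the sketch below estimates each one with a commonly used statistic: a Pearson correlation for test-retest reliability, a split-half correlation with the Spearman-Brown correction for internal consistency, and Cohen's kappa for agreement between two raters. These statistics are one possible choice, not necessarily the ones a given study would use. All function names and data are hypothetical, and the split uses odd and even items rather than a random split to keep the example short.

```python
from collections import Counter
from math import sqrt

def pearson_r(x, y):
    '''Pearson correlation between two equal-length lists of scores.'''
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    '''Correlate odd-item and even-item totals, then apply the Spearman-Brown correction.'''
    odd = [sum(person[0::2]) for person in item_scores]
    even = [sum(person[1::2]) for person in item_scores]
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)

def cohens_kappa(rater_a, rater_b):
    '''Agreement between two raters on categorical labels, corrected for chance.'''
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical data: the same five participants measured at two points in time
time1 = [12, 18, 25, 30, 22]
time2 = [13, 17, 26, 29, 23]
test_retest = pearson_r(time1, time2)

# Hypothetical data: five participants answering a four-item questionnaire
item_scores = [[4, 5, 4, 5], [2, 1, 2, 2], [3, 3, 4, 3], [5, 4, 5, 5], [1, 2, 1, 1]]
split_half = split_half_reliability(item_scores)

# Hypothetical data: two examiners grading the same six projects
rater_a = ['pass', 'pass', 'fail', 'pass', 'fail', 'fail']
rater_b = ['pass', 'fail', 'fail', 'pass', 'fail', 'pass']
kappa = cohens_kappa(rater_a, rater_b)

print(f'test-retest r={test_retest:.2f}, split-half={split_half:.2f}, kappa={kappa:.2f}')
```

Values close to 1 suggest high consistency. What counts as acceptable depends on the field and on the consequences of measurement error, so any cut-off you apply is a judgement call.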

Types of validity<\/span><\/h3>\n

The validity of a measurement can be estimated based on three main types of evidence. Each type can be evaluated through expert judgement or statistical methods.<\/p>\n\n\n\n\n\n\n\n
Types of validity<\/a><\/caption>\n
Type of validity<\/th>\nWhat does it assess?<\/th>\nExample<\/th>\n<\/tr>\n<\/thead>\n
Construct validity<\/a><\/th>\nThe adherence of a measure to existing theory and knowledge<\/strong>\u00a0of the concept being measured.<\/td>\nA self-esteem questionnaire could be assessed by measuring other traits known or assumed to be related to the concept of self-esteem (such as social skills and optimism<\/a>). Strong correlation between the scores for self-esteem and associated traits would indicate high construct validity.<\/td>\n<\/tr>\n
Content validity<\/a><\/th>\nThe extent to which the measurement\u00a0covers all aspects<\/strong> of the concept being measured.<\/td>\nA test that aims to measure a class of students’ level of Spanish contains reading, writing and speaking components, but no listening component. Experts agree that listening comprehension is an essential aspect of language ability, so the test lacks content validity for measuring the overall level of ability in Spanish.<\/td>\n<\/tr>\n
Criterion validity<\/a><\/th>\nThe extent to which the result of a measure corresponds to other valid measures<\/strong> of the same concept.<\/td>\nA survey<\/a> is conducted to measure the political opinions of voters in a region. If the results accurately predict the later outcome of an election in that region, this indicates that the survey has high criterion validity.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n
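The survey example for criterion validity can be sketched as a comparison between the survey's predicted vote shares and the later election result. The mean absolute error used here is one simple way to summarize the gap and is an assumption of this sketch, as are the function name, the party names, and the numbers.

```python
def mean_absolute_error(predicted, actual):
    '''Average absolute gap, in percentage points, between predicted and actual shares.'''
    return sum(abs(predicted[party] - actual[party]) for party in actual) / len(actual)

# Hypothetical vote shares in percent: survey estimate versus election outcome
survey = {'Party A': 41.0, 'Party B': 35.0, 'Party C': 24.0}
election = {'Party A': 43.0, 'Party B': 34.0, 'Party C': 23.0}

error = mean_absolute_error(survey, election)
print(f'mean absolute error: {error:.2f} percentage points')
```

A small error relative to the differences between parties would support the survey's criterion validity. How small is small enough is not something the code can decide for you.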

To assess the validity of a cause-and-effect relationship, you also need to consider internal validity<\/a> (the design of the experiment<\/a>) and external validity<\/a> (the generalizability<\/a> of the results).<\/p>\n

How to ensure validity and reliability in your research<\/span><\/h2>\n

The reliability and validity of your results depend on creating a strong research design<\/a>, choosing appropriate methods and samples, and conducting the research carefully and consistently.<\/p>\n

Ensuring validity<\/span><\/h3>\n

If you use scores or ratings to measure variations in something (such as psychological traits, levels of ability or physical properties), it\u2019s important that your results reflect the real variations as accurately as possible.\u00a0Validity should be considered in the very earliest stages of your research, when you decide how you will collect your data.<\/p>\n