STDEVP is used when the group of numbers being evaluated is complete - it's the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically, this is referred to as "biased." Remembering that the P in STDEVP stands for "population" may be helpful. Since the data set is not a mere sample but consists of ALL the actual values, this standard deviation function can return a more precise result.
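As a quick illustration of the two denominators (this snippet is ours, not taken from the spreadsheets), Python's statistics module draws the same distinction: **pstdev** divides by N, like STDEVP, while **stdev** divides by N - 1, like STDEV.

<code python>
# Sketch of the STDEV vs. STDEVP distinction using Python's statistics module.
# pstdev divides the sum of squared deviations by N (population, like STDEVP);
# stdev divides by N - 1 (sample, like STDEV).
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up values, not from this page

population_sd = statistics.pstdev(data)  # denominator N
sample_sd = statistics.stdev(data)       # denominator N - 1

print(f"STDEVP-style (N):    {population_sd:.4f}")   # 2.0000
print(f"STDEV-style (N - 1): {sample_sd:.4f}")       # 2.1381
</code>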
==== Reliability ====
Reliability in statistics and psychometrics is the overall consistency of a measure. A measure is said to have high reliability if it produces similar results under consistent conditions.

We have included a {{:Reliability.xlsx|Reliability}} spreadsheet that shows examples of how we calculate various reliability metrics as well as biserials.
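The spreadsheet documents the formulas we actually use. Purely as a generic illustration of an internal-consistency metric (an assumption on our part, not necessarily the metric the spreadsheet computes), Cronbach's alpha can be sketched as follows:

<code python>
# Hedged sketch: Cronbach's alpha as one common internal-consistency (reliability) metric.
# Illustrative only - the Reliability spreadsheet documents the calculations actually used.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """item_scores: 2-D array, rows = examinees, columns = items."""
    k = item_scores.shape[1]                         # number of items
    item_vars = item_scores.var(axis=0, ddof=1)      # per-item variances
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of the total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Example: 5 examinees x 4 items of made-up 0/1 item scores
scores = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
])
print(round(cronbach_alpha(scores), 3))   # about 0.696
</code>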
==== Skewness and Kurtosis ====
Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.

Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails, or a lack of outliers.

We have included {{:Skewness.xlsx|Skewness}} and {{:Kurtosis.xlsx|Kurtosis}} spreadsheets that show examples of how we calculate Skewness and Kurtosis.
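For a quick cross-check of these definitions (this snippet is ours; the spreadsheets show the exact formulas, and Excel's **SKEW** and **KURT** apply their own bias corrections, so values may differ slightly), scipy computes both measures directly:

<code python>
# Sample skewness and (excess) kurtosis with scipy - illustrative only.
from scipy.stats import skew, kurtosis

data = [61, 64, 67, 70, 73, 73, 76, 79, 85, 97]   # made-up scores

print("skewness:", skew(data))       # > 0 means a longer right tail
print("kurtosis:", kurtosis(data))   # excess kurtosis; about 0 for a normal distribution
</code>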
==== Adverse Impact ====
Steps for calculating Adverse Impact.
  - At each score, calculate the pass rate percentage for each subgroup: divide the number of persons in each subgroup (ethnicity subgroups and gender subgroups) that had that score by the total number of people in that subgroup.
  - Identify the highest percentage.
  - Divide each of the other percentages by the highest percentage. If the result is less than 80%, that group is adversely impacted (see the sketch after this list).
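A minimal sketch of those three steps (the four-fifths / 80% rule); the group names and counts below are placeholders, not data from this page:

<code python>
# Minimal sketch of the three steps above (the four-fifths / 80% rule).
# Group names and counts are placeholders.
passed = {"Group A": 3, "Group B": 6, "Group C": 5}      # candidates at/above the passpoint
totals = {"Group A": 16, "Group B": 92, "Group C": 111}  # candidates in each subgroup

# Step 1: pass rate per subgroup
rates = {g: passed[g] / totals[g] for g in passed}

# Step 2: highest pass rate
highest = max(rates.values())

# Step 3: impact ratio; below 0.80 flags adverse impact for that subgroup
for group, rate in rates.items():
    ratio = rate / highest
    flag = " *" if ratio < 0.80 else ""
    print(f"{group}: pass rate {rate:.1%}, impact ratio {ratio:.1%}{flag}")
</code>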
If you were to pretend that those were the only three groups – the 1.1% would be the highest percentage rate.
  - If the passpoint were set at a score of 69.8: 0 Caucasian candidates out of 16 passed = 0%; 1 out of 92 Black candidates passed = 1.1%; 1 out of 111 Hispanic candidates passed = 0.9%. (Pretending those are the only 3 groups for demonstrative purposes.)
  - The highest is 1.1%, for Black candidates.
  - Calculate the impact ratios:
    * Caucasian: 0/1.1 = 0% - less than 80%, so there is Adverse Impact (thus the * next to it)
    * Hispanic: 0.9/1.1 = 81.8% - more than 80%, so there is no Adverse Impact (therefore no * next to it)
A second example:
  - For the score of 66.1: 3 out of 16 Caucasian candidates passed = 18.75%; 6 out of 92 Black candidates passed = 6.52%; 5 out of 111 Hispanic candidates passed = 4.50%.
  - The highest percentage is 18.75%, for Caucasian candidates.
  - Calculate the impact ratios (re-checked in the sketch below):
    * Black: 6.52/18.75 = 34.8% - less than 80%, so there is Adverse Impact
    * Hispanic: 4.5/18.75 = 24% - less than 80%, so there is Adverse Impact
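A quick arithmetic check of the two examples (our snippet, using the same counts as above):

<code python>
# Re-deriving the ratios in the two examples above.
# Note: the 81.8% in the first example comes from dividing the rounded figures (0.9 / 1.1);
# the unrounded ratio (1/111) / (1/92) is about 82.9% - either way it is above 80%.
for passpoint, groups in {
    "69.8": {"Caucasian": 0 / 16, "Black": 1 / 92, "Hispanic": 1 / 111},
    "66.1": {"Caucasian": 3 / 16, "Black": 6 / 92, "Hispanic": 5 / 111},
}.items():
    highest = max(groups.values())
    for name, rate in groups.items():
        ratio = rate / highest
        print(f"{passpoint}  {name}: pass rate {rate:.2%}, ratio {ratio:.1%}"
              + ("  *" if ratio < 0.80 else ""))
</code>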
{{:adverseimpact.png?600|}}
==== Alternate Scores ====
We have the following systems loaded with values:
^System^AlternateBase^AlternateMean^AlternateStDev^
|True (Default)|?|-|-|
|Custom|?|0 or Calculated|0 or Calculated|
|Z Scores|?|0|1|
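The exact transformation the system applies (and the role of AlternateBase) is not spelled out on this page; purely to illustrate the idea of rescaling to a target mean and standard deviation, here is the generic linear transform implied by the Z Scores row - an assumption, not the system's documented formula:

<code python>
# Hedged sketch: rescaling raw scores to a target mean and standard deviation.
# This is the generic z-score-based linear transform; the system's actual formula
# (and the role of AlternateBase) is documented elsewhere.
import statistics

def rescale(raw_scores, alternate_mean, alternate_stdev):
    mean = statistics.mean(raw_scores)
    sd = statistics.pstdev(raw_scores)
    return [alternate_mean + alternate_stdev * (x - mean) / sd for x in raw_scores]

raw = [62.5, 66.1, 69.8, 74.2, 81.0]   # made-up raw scores
print(rescale(raw, 0, 1))              # plain z scores (AlternateMean 0, AlternateStDev 1)
print(rescale(raw, 50, 10))            # example of a custom scale
</code>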
|d|{{:stat_biserialcorrected_d.png?|}} \\ The **ABS** function returns the **absolute value** (i.e. the modulus) of the supplied number. The syntax of the function is ABS(number), where the number argument is the numeric value whose modulus you want.|
|Ordinal|{{:stat_biserialcorrected_d.png?|}} \\ The **EXP** function in Excel **calculates** the value of “e” raised to a given power. “e” is the **natural logarithm** base, a constant equal to 2.71828182845904; Excel uses the value 2.718282.|
||{{:stat_biserialcorrected.png?200|}}|
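The corrected-biserial formulas themselves are shown in the screenshots above; for reference only, the two Excel functions they rely on map directly onto Python equivalents:

<code python>
# Python equivalents of the two Excel functions referenced above.
import math

print(abs(-0.37))     # ABS(-0.37) -> 0.37, the modulus of the supplied number
print(math.exp(1))    # EXP(1)     -> 2.718281828459045, e raised to the power 1
print(math.exp(2.5))  # EXP(2.5)   -> e raised to the power 2.5
</code>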