Differences

This shows you the differences between two versions of the page.

stats_concepts [2017/03/13 10:39]
steveclarke
stats_concepts [2018/04/10 09:12] (current)
72.38.112.148
Line 1: Line 1:
 ====== Statistics and Results Concepts ====== ====== Statistics and Results Concepts ======
 ==== Stats Result Set ==== ==== Stats Result Set ====
-CE uses a [[exam_calculations|Calculations]] process to generate Stats Result sets.  A Stats Result set is all of the calculated values for an Exam answer key compared against a set or sets of Registrant responses registered to Exam Sessions (FIXME link).  The Stats Result set contains Candidate performance against the Exam, Sections, Competencies as well as aggregated results for the Exam, Sections, Items and Competencies.  The data in the Stats Result set are used by various reports and exports.  The Stats Result set is stored in the database and can be over written by rerunning the Calculation process/+CE uses a [[exam_calculations|Calculations]] process to generate Stats Result sets.  A Stats Result set is all of the calculated values for an Exam answer key compared against a set or sets of Registrant responses registered to [[session_entry|Exam Sessions]].  The Stats Result set contains Candidate performance against the Exam, Sections and Competencies, as well as aggregated results for the Exam, Sections, Items and Competencies.  The data in the Stats Result set are used by various reports and exports.  The Stats Result set is stored in the database and can be overwritten by rerunning the Calculation process.
  
 ==== Statistics Scope ==== ==== Statistics Scope ====
Line 60: Line 60:
 STDEVP is used when the group of numbers being evaluated is complete - it's the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically,​ this is referred to as "​biased."​ Remembering that the P in STDEVP stands for "​population"​ may be helpful. Since the data set is not a mere sample, but constituted of ALL the actual values, this standard deviation function can return a more precise result. STDEVP is used when the group of numbers being evaluated is complete - it's the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically,​ this is referred to as "​biased."​ Remembering that the P in STDEVP stands for "​population"​ may be helpful. Since the data set is not a mere sample, but constituted of ALL the actual values, this standard deviation function can return a more precise result.
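The difference between the two denominators can be checked with a minimal sketch in Python. The sample values are made up for illustration, and the standard library functions ''pstdev'' and ''stdev'' stand in for the spreadsheet functions; this is not CE's own code.

```python
import statistics

# Hypothetical scores, used only to illustrate the two denominators.
scores = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population standard deviation (the STDEVP case): the sum of squared
# deviations is divided by N.
population_sd = statistics.pstdev(scores)

# Sample standard deviation: the sum of squared deviations is divided
# by N - 1, so the result is slightly larger for the same data.
sample_sd = statistics.stdev(scores)

print(population_sd)
print(sample_sd)
```

For this data set the sum of squared deviations is 32, so the population value is the square root of 32/8 and the sample value is the square root of 32/7.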
 ==== Reliability ==== ==== Reliability ====
-We have included a Reliability.xlsx spreadsheet that shows examples of how we calculate various reliability metrics as well as biserials.+Reliability in statistics and psychometrics is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions. 
 + 
 +We have included a {{:Reliability.xlsx|Reliability}} ​spreadsheet that shows examples of how we calculate various reliability metrics as well as biserials.
 ==== Skewness and Kurtosis ==== ==== Skewness and Kurtosis ====
-We have included a Skewness and Kurtosis.xlsx spreadsheet that shows examples of how we calculate Skewness and Kurtosis.+Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.
 + 
 +Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails, or lack of outliers. 
 + 
 +We have included a {{:Skewness.xlsx|Skewness}} and {{:Kurtosis.xlsx|Kurtosis}} spreadsheet that shows examples of how we calculate Skewness and Kurtosis. 
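As a rough illustration of the two definitions above, the following Python sketch computes moment-based skewness and excess kurtosis. It assumes the population (divide by N) definitions; the spreadsheets may use sample-adjusted formulas instead, so the numbers are not guaranteed to match them. The function name is hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) using population moments.

    Assumption: moments are divided by N; sample-adjusted formulas
    give slightly different results.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; moments are undefined")
    skewness = m3 / m2 ** 1.5
    # Subtracting 3 makes a normal distribution score 0.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


symmetric = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])
right_tailed = skewness_and_excess_kurtosis([1, 1, 1, 1, 10])
print(symmetric)
print(right_tailed)
```

The symmetric data set has zero skewness and negative excess kurtosis (light tails), while the data set with a single large outlier has positive skewness.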
 + 
 +==== Adverse Impact ==== 
 +Steps for calculating Adverse Impact. 
 + 
 +  - At each score, calculate the pass rate percentage for each subgroup: divide the number of persons in each subgroup (ethnicity subgroups and gender subgroups) that had that score by the total number of people in that subgroup. 
 +  - Identify the highest percentage. 
 +  - Divide each of the other percentages by the highest percentage. If the result is less than 80%, then that group is adversely impacted.  
 + 
 + 
 +For example, using the sample at the bottom of this document, the following subset contains 16 Caucasian, 92 Black and 111 Hispanic candidates. (The total number of people in each subgroup acts as the denominator for step 1.)  
 + 
 +If you pretend that those are the only three groups, 1.1% is the highest pass rate in the first example below.  
 + 
 +  - If the passpoint were set at a score of 69.8, 0 out of 16 Caucasian candidates passed = 0%; 1/92 Black passed = 1.1%; 1/111 Hispanic passed = 0.9%.  (pretending those are the only 3 groups for demonstrative purposes) 
 +  - The highest is 1.1% for Blacks 
 +  - Divide each of the other pass rates by the highest: 
 +    * Caucasian 0/1.1 = 0% - less than 80%, so there is Adverse Impact (thus the * next to it) 
 +    * Hispanic 0.9/1.1 = 81.8% - more than 80%, so there is no Adverse Impact (therefore no * next to it) 
 + 
 +A second example:  
 +  - For the score of 66.1, 3 out of 16 Caucasian passed = 18.75%; 6 out of 92 Black passed = 6.52%; 5 out of 111 Hispanic passed = 4.50% 
 +  - Highest percentage is 18.75% for Caucasian 
 +  - Divide each of the other pass rates by the highest: 
 +    * Black = 6.52/18.75 = 34.8% - less than 80%, so there is Adverse Impact 
 +    * Hispanic = 4.5/18.75 = 24% - less than 80%, so there is Adverse Impact 
 + 
 +{{:​adverseimpact.png?​600|}}
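The three steps and both worked examples above can be sketched in Python as follows. The function name, the dictionary layout and the handling of the case where nobody passes are assumptions made for illustration; this is not CE's implementation.

```python
def adverse_impact(passed, totals, threshold=0.8):
    """Return {group: (ratio_to_highest, is_adversely_impacted)}.

    passed and totals map a subgroup name to a head count.
    """
    # Step 1: pass rate for each subgroup.
    rates = {group: passed[group] / totals[group] for group in totals}
    # Step 2: the highest pass rate.
    highest = max(rates.values())
    result = {}
    for group, rate in rates.items():
        # Step 3: ratio of each pass rate to the highest one.
        # Assumption: if no group passes at all, no group is flagged.
        ratio = rate / highest if highest > 0 else 1.0
        result[group] = (ratio, ratio < threshold)
    return result


totals = {"Caucasian": 16, "Black": 92, "Hispanic": 111}
first = adverse_impact({"Caucasian": 0, "Black": 1, "Hispanic": 1}, totals)
second = adverse_impact({"Caucasian": 3, "Black": 6, "Hispanic": 5}, totals)

for group, (ratio, flagged) in second.items():
    print(group, round(ratio * 100, 1), "*" if flagged else "")
```

Because the sketch works with unrounded pass rates, the Hispanic ratio in the first example comes out as 92/111 (about 82.9%) rather than the 81.8% obtained from the rounded percentages; the conclusion of no Adverse Impact is the same.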
  
 ==== Alternate Scores ==== ==== Alternate Scores ====
Line 79: Line 112:
 We have the following systems loaded with values: We have the following systems loaded with values:
 ^System^AlternateBase^AlternateMean^AlternateStDev^ ^System^AlternateBase^AlternateMean^AlternateStDev^
-|rue (Default)|?​|-|-|+|True (Default)|?​|-|-|
 |Custom|?|0 or Calculated|0 or Calculated| |Custom|?|0 or Calculated|0 or Calculated|
 |Z Scores|?​|0|1| |Z Scores|?​|0|1|
Line 101: Line 134:
 (zScore * AlternateStDev + AlternateMean) * AlternateBase / 100 (zScore * AlternateStDev + AlternateMean) * AlternateBase / 100
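A minimal Python sketch of the formula above. The function and parameter names are hypothetical, and the input values (in particular an AlternateBase of 100, which the table lists as ''?'') are assumptions chosen only to make the arithmetic easy to follow.

```python
def alternate_score(z_score, alternate_stdev, alternate_mean, alternate_base):
    """Apply (zScore * AlternateStDev + AlternateMean) * AlternateBase / 100."""
    return (z_score * alternate_stdev + alternate_mean) * alternate_base / 100


# Z Scores system from the table: AlternateMean 0, AlternateStDev 1.
# The base of 100 is an assumed value.
z_system = alternate_score(1.5, alternate_stdev=1, alternate_mean=0, alternate_base=100)

# A hypothetical Custom system with mean 50 and standard deviation 10.
custom_system = alternate_score(1.5, alternate_stdev=10, alternate_mean=50, alternate_base=100)

print(z_system)
print(custom_system)
```

With a base of 100 the Z Scores system returns the z-score unchanged, and the custom system maps a z-score of 1.5 to 65.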
  
 +
 +=== Custom TScore Calculation ===
 +CE can be set up to generate TScores via custom methods.  One example of a custom method involves setting Alternate Mean and Alternate Standard Deviation values for all competencies on the Exam.  The Competency TScores are then aggregated and an Exam Alternate Mean and Exam Alternate Standard Deviation are applied to create an exam TScore.  See [[exam_edit_pass#Alternate Competency Values|Alternate Competency Values]].
  
 ==== Biserial Calculations ==== ==== Biserial Calculations ====
Line 134: Line 170:
 |d|{{:stat_biserialcorrected_d.png?|}} \\ The **ABS** function returns the **absolute value** (i.e. the modulus) of any supplied number. The syntax of the function is ABS(number), where the number argument is the numeric value that you want the modulus of.| |d|{{:stat_biserialcorrected_d.png?|}} \\ The **ABS** function returns the **absolute value** (i.e. the modulus) of any supplied number. The syntax of the function is ABS(number), where the number argument is the numeric value that you want the modulus of.|
 |Ordinal|{{:stat_biserialcorrected_d.png?|}} \\ The **EXP** function in Excel **calculates** the value of “e” raised to a given power. “e” is a constant approximately equal to 2.71828182845904, the base of the **natural logarithm**. When it comes to the value of e, Excel uses a value of 2.718282.| |Ordinal|{{:stat_biserialcorrected_d.png?|}} \\ The **EXP** function in Excel **calculates** the value of “e” raised to a given power. “e” is a constant approximately equal to 2.71828182845904, the base of the **natural logarithm**. When it comes to the value of e, Excel uses a value of 2.718282.|
-||{{:​stat_biserialcorrected.png?​|}}|+||{{:​stat_biserialcorrected.png?​200|}}|
  
  
Line 140: Line 176:
  
 ==== Term Calculations ==== ==== Term Calculations ====
-Term Calculations are run the through the [[term_entry|Term Entry]] tool.  They are used to aggregate results across multiple separate weighted [[stats_concepts#statsid|Stat Set]].  Candidate results are then ranked by their overall performance across the weighted Stats Sets.  +Term Calculations are run through the [[term_entry|Term Entry]] tool.  They are used to aggregate results across multiple separate, weighted [[stats_concepts#statsid|Stat Results Sets]].  Candidate results are then ranked by their overall performance across the weighted Stats Sets.  
  
 As an example, a Terms Calculation can be run on a 40% weighted midterm Stats Set along with a 60% weighted final stats set.  This will produce a Term Stats set aggregating on the two source Stats sets. As an example, a Terms Calculation can be run on a 40% weighted midterm Stats Set along with a 60% weighted final stats set.  This will produce a Term Stats set aggregating on the two source Stats sets.
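The 40%/60% example can be sketched in Python as below. It assumes the aggregate is a plain weighted sum of each candidate's score in the source Stats sets and that ranking is by descending aggregate; the candidate names, scores and function name are made up, and CE's actual aggregation may differ.

```python
def term_results(weighted_sets):
    """Aggregate weighted scores and rank candidates.

    weighted_sets is a list of (weight, {candidate: score}) pairs.
    Returns a list of (rank, candidate, aggregate), best first.
    """
    totals = {}
    for weight, scores in weighted_sets:
        for candidate, score in scores.items():
            totals[candidate] = totals.get(candidate, 0.0) + weight * score
    ordered = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(rank, name, total) for rank, (name, total) in enumerate(ordered, start=1)]


# Hypothetical source Stats sets.
midterm = {"A": 80, "B": 90, "C": 70}
final = {"A": 90, "B": 70, "C": 75}

ranking = term_results([(0.4, midterm), (0.6, final)])
for rank, name, total in ranking:
    print(rank, name, round(total, 2))
```

Candidate B leads the midterm, but the heavier weight on the final puts candidate A first overall.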