Differences

This shows you the differences between two versions of the page.

stats_concepts [2017/03/06 13:23]
steveclarke [Biserial Calculations]
stats_concepts [2018/04/10 09:12] (current)
72.38.112.148
Line 1: Line 1:
 ====== Statistics and Results Concepts ====== ====== Statistics and Results Concepts ======
 +==== Stats Result Set ====
 +CE uses a [[exam_calculations|Calculations]] process to generate Stats Result sets.  A Stats Result set is all of the calculated values for an Exam answer key compared against a set or sets of Registrant responses registered to [[session_entry|Exam Sessions]].  The Stats Result set contains Candidate performance against the Exam, Sections, and Competencies, as well as aggregated results for the Exam, Sections, Items, and Competencies.  The data in the Stats Result set are used by various reports and exports.  The Stats Result set is stored in the database and can be overwritten by rerunning the Calculation process.
 +
 ==== Statistics Scope ==== ==== Statistics Scope ====
 CE at its base level tracks each Candidate's performance against each Item.  This is the base scope of statistics CE calculates.  CE then aggregates those Candidate-Item statistics to generate higher-scoped statistics, like Candidate-Section statistics or Candidate-Exam statistics.  CE also aggregates the Candidate statistics to create Exam-scoped statistics such as Exam-Competency or Exam-Section statistics. CE at its base level tracks each Candidate's performance against each Item.  This is the base scope of statistics CE calculates.  CE then aggregates those Candidate-Item statistics to generate higher-scoped statistics, like Candidate-Section statistics or Candidate-Exam statistics.  CE also aggregates the Candidate statistics to create Exam-scoped statistics such as Exam-Competency or Exam-Section statistics.
Line 5: Line 8:
 ==== StatsID ==== ==== StatsID ====
 For each calculation against an exam, a unique set of statistics is created and stored in the database. ​ This unique set of statistics is keyed by the StatsID. For each calculation against an exam, a unique set of statistics is created and stored in the database. ​ This unique set of statistics is keyed by the StatsID.
 +
 ==== Include In Stats ==== ==== Include In Stats ====
 Include in Stats is used to indicate whether a Candidate is auditing the Exam (false) or taking the Exam (true). ​ If the Candidate is auditing the Exam, the Candidate has their statistics calculated, but their statistics are not aggregated up to Exam statistics. Include in Stats is used to indicate whether a Candidate is auditing the Exam (false) or taking the Exam (true). ​ If the Candidate is auditing the Exam, the Candidate has their statistics calculated, but their statistics are not aggregated up to Exam statistics.
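As a rough illustration of the scoping and Include In Stats rules above, the following Python sketch rolls hypothetical Candidate-Item points up to Candidate-Exam totals, then averages only the Candidates flagged for inclusion. The record layout and the names `item_points` and `include_in_stats` are assumptions made for this example, not CE's actual schema.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical Candidate-Item records: (candidate, item, points earned).
item_points = [
    ("cand_a", "item_1", 1), ("cand_a", "item_2", 0),
    ("cand_b", "item_1", 1), ("cand_b", "item_2", 1),
    ("cand_c", "item_1", 0), ("cand_c", "item_2", 0),
]

# Hypothetical flag: False means the Candidate is auditing the Exam.
include_in_stats = {"cand_a": True, "cand_b": True, "cand_c": False}


def candidate_exam_totals(records):
    """Aggregate Candidate-Item points into Candidate-Exam totals."""
    totals = defaultdict(int)
    for candidate, _item, points in records:
        totals[candidate] += points
    return dict(totals)


def exam_mean(totals, include):
    """Average Candidate totals, skipping Candidates who are auditing."""
    return mean(score for cand, score in totals.items() if include[cand])


totals = candidate_exam_totals(item_points)      # every Candidate gets a total
exam_average = exam_mean(totals, include_in_stats)  # auditors are left out
```

The auditing Candidate (`cand_c`) still receives a Candidate-Exam total, but that total does not contribute to the Exam-level average.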
Line 56: Line 60:
 STDEVP is used when the group of numbers being evaluated is complete - it's the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically,​ this is referred to as "​biased."​ Remembering that the P in STDEVP stands for "​population"​ may be helpful. Since the data set is not a mere sample, but constituted of ALL the actual values, this standard deviation function can return a more precise result. STDEVP is used when the group of numbers being evaluated is complete - it's the entire population of values. In this case, the 1 is NOT subtracted and the denominator for dividing the sum of squared deviations is simply N itself, the number of observations (a count of items in the data set). Technically,​ this is referred to as "​biased."​ Remembering that the P in STDEVP stands for "​population"​ may be helpful. Since the data set is not a mere sample, but constituted of ALL the actual values, this standard deviation function can return a more precise result.
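The difference between the two denominators can be seen with Python's standard `statistics` module, where `pstdev` divides by N (as STDEVP does) and `stdev` divides by N - 1. The scores below are made-up values used only for illustration.

```python
from statistics import pstdev, stdev

# Made-up raw scores; treated first as a full population, then as a sample.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(scores)  # sum of squared deviations divided by N
sample_sd = stdev(scores)       # sum of squared deviations divided by N - 1
```

For the same data the population value is always the smaller of the two, because its denominator is larger.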
 ==== Reliability ==== ==== Reliability ====
-We have included a Reliability.xlsx spreadsheet that shows examples of how we calculate various reliability metrics as well as biserials.+Reliability in statistics and psychometrics is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions. 
 + 
 +We have included a {{:Reliability.xlsx|Reliability}} ​spreadsheet that shows examples of how we calculate various reliability metrics as well as biserials.
 ==== Skewness and Kurtosis ==== ==== Skewness and Kurtosis ====
-We have included a Skewness and Kurtosis.xlsx spreadsheet that shows examples of how we calculate Skewness and Kurtosis.+Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. 
 + 
 +Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails, or lack of outliers. 
 + 
 +We have included a {{:Skewness.xlsx|Skewness}} and {{:Kurtosis.xlsx|Kurtosis}} spreadsheet that shows examples of how we calculate Skewness and Kurtosis. 
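The definitions above can be sketched in Python using plain population moments. This is an assumption for illustration only: the page does not state which formulas CE uses, and spreadsheet functions such as Excel's SKEW and KURT apply small-sample corrections, so values in the linked spreadsheet may differ.

```python
from statistics import fmean


def central_moment(values, k):
    """k-th central moment of the data, treating it as a full population."""
    m = fmean(values)
    return fmean([(v - m) ** k for v in values])


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 10]
```

Here `skewness(symmetric)` is zero, while `skewness(right_tailed)` is positive because the single large value stretches the right tail.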
 + 
 +==== Adverse Impact ==== 
 +Steps for calculating Adverse Impact. 
 + 
 +  - At each score, calculate the pass rate percentage for each subgroup: divide the number of persons in each subgroup (ethnicity subgroups and gender subgroups) that had that score by the total number of people in that subgroup.
 +  - Identify the highest percentage.
 +  - Divide each of the other percentages by the highest percentage; if the result is less than 80%, then that group is adversely impacted. 
 + 
 + 
 +For example, using the sample at the bottom of this document, the following subset contains 16 Caucasian, 92 Black, and 111 Hispanic candidates. (The total number of people in each subgroup acts as the denominator for step 1.) 
 + 
 +If you were to pretend that those were the only three groups, 1.1% would be the highest pass rate percentage. 
 + 
 +  - If the passpoint were set at a score of 69.8, 0 out of 16 Caucasian candidates passed = 0%; 1/92 Black passed = 1.1%; 1/111 Hispanic passed = 0.9%.  (Pretending those are the only 3 groups for demonstrative purposes.)
 +  - The highest is 1.1%, for Black candidates
 +  - Divide each of the other pass rates by the highest:
 +    * Caucasian 0/1.1 = 0% - less than 80%, so there is Adverse Impact (thus the * next to it)
 +    * Hispanic 0.9/1.1 = 81.8% - more than 80%, so there is no Adverse Impact (therefore no * next to it)
 + 
 +A second example:  
 +  - For the score of 66.1, 3 out of 16 Caucasian passed = 18.75%; 6 out of 92 Black passed = 6.52%; 5 out of 111 Hispanic passed = 4.50%
 +  - The highest percentage is 18.75%, for Caucasian candidates
 +  - Divide each of the other pass rates by the highest:
 +    * Black = 6.52/18.75 = 34.8% - less than 80%, so there is Adverse Impact
 +    * Hispanic = 4.5/18.75 = 24% - less than 80%, so there is Adverse Impact
 + 
 +{{:​adverseimpact.png?​600|}}
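The three steps above can be sketched in Python as follows. The function name `adverse_impact` and its dictionary inputs are hypothetical, chosen only for this example; the counts are the ones from the two worked examples.

```python
def adverse_impact(passed, totals, threshold=0.8):
    """Return {group: (ratio, flagged)} using the 80% rule described above.

    ratio is the group's pass rate divided by the highest pass rate;
    flagged is True when that ratio falls below the threshold.
    """
    # Step 1: pass rate for each subgroup.
    rates = {group: passed[group] / totals[group] for group in totals}
    # Step 2: the highest pass rate.
    highest = max(rates.values())
    result = {}
    for group, rate in rates.items():
        if highest == 0:
            # Nobody passed in any group, so no ratio can be formed.
            result[group] = (None, False)
            continue
        # Step 3: compare each rate with the highest one.
        ratio = rate / highest
        result[group] = (ratio, ratio < threshold)
    return result


totals = {"Caucasian": 16, "Black": 92, "Hispanic": 111}
at_69_8 = adverse_impact({"Caucasian": 0, "Black": 1, "Hispanic": 1}, totals)
at_66_1 = adverse_impact({"Caucasian": 3, "Black": 6, "Hispanic": 5}, totals)
```

Because this sketch divides unrounded rates, the Hispanic ratio at 69.8 comes out near 82.9% rather than the 81.8% obtained from the rounded percentages; the conclusion (no Adverse Impact) is the same.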
  
 ==== Alternate Scores ==== ==== Alternate Scores ====
Line 75: Line 112:
 We have the following systems loaded with values: We have the following systems loaded with values:
 ^System^AlternateBase^AlternateMean^AlternateStDev^ ^System^AlternateBase^AlternateMean^AlternateStDev^
-|rue (Default)|?​|-|-|+|True (Default)|?​|-|-|
 |Custom|?|0 or Calculated|0 or Calculated| |Custom|?|0 or Calculated|0 or Calculated|
 |Z Scores|?​|0|1| |Z Scores|?​|0|1|
Line 97: Line 134:
 (zScore * AlternateStDev + AlternateMean) * AlternateBase / 100 (zScore * AlternateStDev + AlternateMean) * AlternateBase / 100
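The formula above translates directly into a small Python function. The parameter values in the example calls, including an AlternateBase of 100 and a mean of 50 with a standard deviation of 10, are assumptions for illustration; the table above does not give the base values.

```python
def alternate_score(z_score, alternate_stdev, alternate_mean, alternate_base):
    """Rescale a z-score: (z * AlternateStDev + AlternateMean) * AlternateBase / 100."""
    return (z_score * alternate_stdev + alternate_mean) * alternate_base / 100


# Assumed settings: mean 50, standard deviation 10, base 100.
rescaled = alternate_score(1.5, 10, 50, 100)

# With mean 0 and standard deviation 1, the z-score passes through unchanged.
unchanged = alternate_score(1.5, 1, 0, 100)
```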
  
-FIXME add section ​on biserials+ 
 +=== Custom TScore Calculation === 
 +CE can be set up to generate TScores via custom methods.  One example of a custom method involves setting Alternate Mean and Alternate Standard Deviation values for all Competencies on the Exam.  The Competency TScores are then aggregated, and an Exam Alternate Mean and Exam Alternate Standard Deviation are applied to create an Exam TScore.  See [[exam_edit_pass#Alternate Competency Values|Alternate Competency Values]].
  
 ==== Biserial Calculations ==== ==== Biserial Calculations ====
Line 129: Line 168:
 |sdTest|Standard Deviation of Candidate raw point totals on all Included Items on Exam.| |sdTest|Standard Deviation of Candidate raw point totals on all Included Items on Exam.|
 |NormInv|Returns the inverse of the normal cumulative distribution for the specified mean and standard deviation. \\ NormInv(p, meanTest, sdTest)| |NormInv|Returns the inverse of the normal cumulative distribution for the specified mean and standard deviation. \\ NormInv(p, meanTest, sdTest)|
- +|d|{{:stat_biserialcorrected_d.png?|}} \\ The **ABS** function returns the **absolute value** (i.e. the modulus) of the supplied number. Its syntax is ABS(number), where the number argument is the numeric value whose modulus you want.|
 +|Ordinal|{{:stat_biserialcorrected_d.png?|}} \\ The **EXP** function in Excel returns “e” raised to the power of the supplied number. “e” is a constant approximately equal to 2.71828182845904, the base of the **natural logarithm**; it is often shown rounded to 2.718282.|
 +||{{:​stat_biserialcorrected.png?​200|}}|
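The NormInv entry in the table above can be approximated in Python with the standard library's `statistics.NormalDist`. The wrapper name `norm_inv` is hypothetical; this only sketches the inverse normal lookup and is not the full biserial calculation, whose formula is given in the images.

```python
from statistics import NormalDist


def norm_inv(p, mean_test, sd_test):
    """Inverse normal CDF for probability p (0 < p < 1), like NormInv(p, meanTest, sdTest)."""
    return NormalDist(mu=mean_test, sigma=sd_test).inv_cdf(p)


# Half of a normal distribution lies below its mean.
median_score = norm_inv(0.5, 70, 10)

# The 97.5th percentile of the standard normal distribution.
upper_z = norm_inv(0.975, 0, 1)
```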
  
  
Line 136: Line 176:
  
 ==== Term Calculations ==== ==== Term Calculations ====
-Term Calculations are run the through the [[term_entry|Term Entry]] tool.  They are used to aggregate results across multiple separate weighted [[stats_concepts#statsid|Stat Set]].  Candidate results are then ranked by their overall performance across the weighted Stats Sets.  +Term Calculations are run through the [[term_entry|Term Entry]] tool.  They are used to aggregate results across multiple separate weighted [[stats_concepts#statsid|Stat Results Sets]].  Candidate results are then ranked by their overall performance across the weighted Stats Sets.  
  
 As an example, a Term Calculation can be run on a 40% weighted midterm Stats Set along with a 60% weighted final Stats Set.  This will produce a Term Stats Set aggregating the two source Stats Sets. As an example, a Term Calculation can be run on a 40% weighted midterm Stats Set along with a 60% weighted final Stats Set.  This will produce a Term Stats Set aggregating the two source Stats Sets.
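The 40%/60% example can be sketched in Python as a weighted sum followed by a ranking. This assumes the aggregation is a simple weighted sum of each Candidate's score; the page does not spell out CE's exact method, and the names and scores below are hypothetical.

```python
def term_scores(weighted_sets):
    """Combine [(weight, {candidate: score}), ...] into one weighted score per Candidate."""
    combined = {}
    for weight, scores in weighted_sets:
        for candidate, score in scores.items():
            combined[candidate] = combined.get(candidate, 0.0) + weight * score
    return combined


def rank_candidates(combined):
    """Order Candidates from highest to lowest combined score (ties not handled specially)."""
    ordered = sorted(combined.items(), key=lambda pair: pair[1], reverse=True)
    return [candidate for candidate, _score in ordered]


# Made-up scores for two source Stats Sets.
midterm = {"cand_a": 80, "cand_b": 90}
final = {"cand_a": 95, "cand_b": 70}

term = term_scores([(0.4, midterm), (0.6, final)])
ranking = rank_candidates(term)
```

In this example `cand_a` ranks first despite the lower midterm score, because the final carries the larger weight.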