Caveon Webinar Series Lessons Learned at NCSA and ITC July 2014
Caveon Webinar Series - Will the Real Cloned Item Please Stand Up? final
www.caveon.com 1
May I Have Your Attention, Please:
Would the Real Cloned Item Stand Up?
Tara Williams, Chief Editor
Jennifer Miller, Data Forensics Coordinator
10/21/2015
Caveon Webinar Series
Cloning Trends, Challenges & Solutions
Tara Williams
Caveon Secure Exam Development and Support (C-SEDS)
Agenda for Today
• Cloning 101
• Trends in Cloning
• Challenges and Solutions
• Using Clones to Deter & Detect Cheating
Common Test Development Woes
• Security
• High price of developing items
• Restricted access to qualified SMEs and item writers
Cloning as Solution
• Increase security
• Expand item bank
• Lengthen shelf life of an exam
• Control costs
• Protect SME/item writer time
Definition of Cloning
Trends in Cloning
1. Moving from reactive to proactive
Program managers have heard the stats: Items will likely be exposed after 3 weeks of publication (Maynes, COTS, 2014). Preparation is key.
2. Secure item design
– Design that deters: using item design to create items with ‘regenerative’ properties (chameleon items, DOMC items)
– Design that detects (chameleon items)
Trends in cloning – Pro-active
The story of sourdough…
A baker begins by creating a “starter” of flour and water.
Over a period of days, the starter grows and multiplies as wild yeast grows. The baker continues to feed it more water and more flour.
Once the starter is fully developed, bakers can preserve it, increasing its longevity by removing a portion of it and then adding more flour and water to it. This is a continuous process. Subtract a portion, add a portion.
Trends in cloning – Pro-active
Maintaining exam health is similar
• Build your bank over a period of time, adding “the right ingredients” (i.e., accuracy review, first edit, final edit, etc.) to get it just right.
• Once it is developed, you can preserve its longevity by removing items, and then adding cloned items (Maynes, “Rapid Republication,” COTS, 2014).
Trends in Cloning – Design that deters & detects
Chameleon
A rectangular ________ with dimensions ___ & ___ meters has an area of:
– Garden, 15, 13, 195
– Playground, 15, 15, 225
– Swimming pool, 15, 11, 165
– Pavilion, 13, 13, 169
• Every item has the same set of answer choices: 195, 225, 165, 169
(Slide taken from Maynes, COTS, 2014)
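The chameleon template on this slide can be sketched in code. What follows is a minimal, hypothetical Python illustration: the names `TEMPLATE`, `ChameleonVariant`, and `render` are invented here and do not come from any Caveon tool. Each variant fills the blanks with different values, while every variant shows the same four answer choices, so only the key moves.

```python
from dataclasses import dataclass

# Stem wording taken from the slide's example item.
TEMPLATE = "A rectangular {shape} with dimensions {a} & {b} meters has an area of:"

# Every variant shows the same answer choices, so the items look alike.
CHOICES = [195, 225, 165, 169]


@dataclass(frozen=True)
class ChameleonVariant:
    """One fill-in of the template (hypothetical structure)."""
    shape: str
    a: int
    b: int

    @property
    def key(self) -> int:
        """The correct answer is computed from the variables: area = a * b."""
        return self.a * self.b


VARIANTS = [
    ChameleonVariant("garden", 15, 13),
    ChameleonVariant("playground", 15, 15),
    ChameleonVariant("swimming pool", 15, 11),
    ChameleonVariant("pavilion", 13, 13),
]


def render(variant: ChameleonVariant) -> str:
    """Return the stem plus the shared answer choices as plain text."""
    if variant.key not in CHOICES:
        raise ValueError("variant key must be one of the shared choices")
    stem = TEMPLATE.format(shape=variant.shape, a=variant.a, b=variant.b)
    options = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(CHOICES))
    return f"{stem}\n{options}"


if __name__ == "__main__":
    for v in VARIANTS:
        print(render(v))
        print(f"Key: {v.key}\n")
```

A candidate who memorized "195" from a stolen garden item gains nothing on the pavilion variant, because the same option list appears but a different choice is keyed.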
Vision Reality
What are practical developmental considerations when implementing cloning?
Development Challenges (by Level)
Industry
Program
Exam
Item
Industry-Level Challenge
• Lack of industry-wide standardization of nomenclature
Solution: Naming, Defining, Determining Use Cases
Type: Major Cosmetic Surgery (CS)
Definition: A general cloning strategy that uses surface-level changes to detect or deter cheating:
– swapping key words in options
– altering key words in stem/options
– altering syntax of stem/options
– altering paragraphing in stem
Use Cases: Very conducive to non-SME cloners. Often combined with other strategies. Good for expanding item bank.
Type: Chameleon (CH)
Definition: A cloning strategy that is especially effective at detecting cheating. The cloner attempts to make the clone look so similar to the original that a cheater will guess the original item’s key rather than the new key.
Use Cases: Moderately conducive to non-SME cloners. Good for detecting cheating.
Type: Flip Key (FK) / Flip Stem (FS)
Definition: A cloning strategy in which the key is flipped into the stem or the stem is flipped into a key or option.
Use Cases: Moderately conducive to non-SME cloners. A good way to deter cheating.
Type: New Correct Answer (NC) + 1 New Distractor
Definition: A cloning strategy in which the key is replaced with another correct answer. The cloner then replaces at least 1 distractor, as well.
Use Cases: Not conducive to non-SME cloners, as it requires knowledge of how many potential correct answers to the question exist. A good way to deter cheating and expand the item bank.
Type: Hybrid Clone (HC)
Definition: Any combination of the first four types of cloning.
Use Cases: Clones often rely on multiple strategies.
Program-level Challenge
• Untrained SMEs / managers / editors
• Outdated item development tools
Solution
• Importance of training all stakeholders
• Having realistic (cost & time) expectations that reflect limitations and capabilities of development tool
Exam-level challenges
• Items that lack alignment to course material
• Items that lack rationale statements
Solution
• Make two things standard as part of your development process:
– audit of item alignment to course material
– inclusion of effective item rationales
Item-level challenges
• “Inflexible” language (i.e. not conducive to synonym swaps)
• “Inflexible” syntax (i.e. not conducive to paraphrasing and restructuring)
• Items that lack variables that can easily be swapped in and out
Solutions
1. Design-driven
SMEs are required / encouraged to create chameleon item type, not just MC, MS, etc.
Considerations:
• Requiring vs. encouraging
• Demands experienced SMEs
• Not all content readily lends itself to this format
2. Process-driven
Item Writing → Content Reviewer flags “clone-able” items → Flagged items cloned by SME or non-SME → First Edit
Determining when a clone is a clone
• Aligned to objective
• Sufficiently altered to prevent a candidate from using pre-knowledge to gain an unfair advantage
But what does “sufficiently altered” look like?
When is a clone a clone?
Sample Item
If one person shines a flashlight (F1) at you from 10m away, and a second person shines an identical flashlight (F2) at you from 20m away, what is the relative amount of light you measure from the two flashlights?
*)F1 appears 4 times brighter than F2
1)F1 appears 2 times brighter than F2
2)F1 appears 10 times brighter than F2
3)F1 and F2 appear equally bright
4)F2 appears 4 times brighter than F1
[Source: http://frigg.physastro.mnsu.edu/~eskridge/astr101/sample.html]
Clone – Draft 1
If one person shines a stage light (S1) at you from 10m away, and a second person shines an identical stage light (S2) at you from 20m away, what is the relative amount of light you measure from the two stage lights?
*)S1 appears 4 times brighter than S2
1)S1 appears 2 times brighter than S2
2)S1 appears 10 times brighter than S2
3)S1 and S2 appear equally bright
4)S2 appears 4 times brighter than S1
[Source: http://frigg.physastro.mnsu.edu
When is a clone a clone?
Clone – Draft 1
If one person shines a stage light (S1) at you from 10m away, and a second person shines an identical stage light (S2) at you from 20m away, what is the relative amount of light you measure from the two stage lights?
*)S1 appears 4 times brighter than S2
1)S1 appears 2 times brighter than S2
2)S1 appears 10 times brighter than S2
3)S1 and S2 appear equally bright
4)S2 appears 4 times brighter than S1
[Source: http://frigg.physastro.mnsu.edu/~eskridge/astr101/sample.html]
Clone – Draft 2
If a lighting engineer shines a stage light (S1) at an actor from 20m away, and a second lighting engineer shines an identical stage light (S2) at the same actor from 10m away, what is the relative amount of light you measure from the two stage lights?
*)S1 appears 4 times brighter than S2
1)S1 appears 2 times brighter than S2
2)S1 appears 10 times brighter than S2
3)S1 and S2 appear equally bright
4)S2 appears 4 times brighter than S1
[Source: http://frigg.physastro.mnsu.edu/~eskridge/astr101/sample.html]
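The physics behind the sample item is the inverse-square law, a standard result rather than something stated on the slides. A short worked check, writing $B$ for apparent brightness and $d$ for distance (symbols introduced here for illustration only):

```latex
B \propto \frac{1}{d^{2}}
\qquad\Longrightarrow\qquad
\frac{B_{10\,\mathrm{m}}}{B_{20\,\mathrm{m}}}
  = \left(\frac{20\,\mathrm{m}}{10\,\mathrm{m}}\right)^{2}
  = 4
```

An identical source at 10m therefore appears 4 times brighter than one at 20m. In the sample item and in Draft 1, the light at 10m is F1/S1. Draft 2 swaps the distances (S1 at 20m, S2 at 10m), so by the same ratio S2 is the brighter source there. That makes the distance swap a substantive change rather than a cosmetic one.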
Key Points Summarized
There are exciting possibilities with cloning. Implementing them requires developmental considerations:
– standardizing nomenclature
– training personnel
– having realistic expectations of dev tool
– making course alignment audit and rationales standard in dev process
– understanding “clone-ability”
Cloning & Deterring Cheating
Jennifer Miller
Caveon Data Forensics
Using Cloned Items to Deter & Detect Cheating
• Objective 1: Devalue Braindump Content by Confusing and Misdirecting Users
• Objective 2: Detect use of braindump content through data forensics
**Assumes cloned items are designed such that they are comparable in difficulty and measurement properties to original items.
Devalue Braindump Content by Confusing and Misdirecting Braindump Users
• Salient + Trivial facts = Memorable Misdirection
Trivial facts should be modifiable without changing measurement properties of the items.
What happens with Clones
• Cloning method is critical
• Change stem/do not change answer
– If the new item looks like the old, the braindumpers will use the old answer
– Effect: the item is still compromised
• Change stem/change answer
– The item will not likely be compromised
– Most effective: change stem slightly so different choice is correct
Example
Original: Yosemite Sam Drilling Company is installing a 200 ft, 2-inch diameter groundwater monitoring well at the Solvang Gold Mine. What is the recommended inside diameter (in inches) of the PVC well casing they should use to install the well?
A. 1.94 inches
B. 2.47 inches
C. 2.07 inches
D. 2.32 inches
Cloned: Yosemite Sam Drilling Company is installing a 20 ft, 2-inch diameter groundwater monitoring well at the Solvang Gold Mine. What is the recommended inside diameter (in inches) of the PVC well casing they should use to install the well?
A. 1.94 inches
B. 2.47 inches
C. 2.07 inches
D. 2.32 inches
Difference has no impact on legitimate test takers. They don’t know the difference.
Simulated Braindump Users
• Item parameters from live data, 50 items, cut-score of 33
• Pass rate=94%
[Figure: Scores for Braindump Users. Score distribution; x-axis ticks from 0 to 48 in steps of 3, y-axis ticks from 0 to 18,000; cut-score marked.]
Simulated Braindump Users
• 50% random replacement with indistinguishable cloned items
• Scoring with NEW Answer Key
• Pass rate = 0%: 100% protection
[Figure: After Republication with Swapping. Score distribution; x-axis ticks from 0 to 48 in steps of 3, y-axis ticks from 0 to 18,000; cut-score marked.]
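The before-and-after effect on these two slides can be approximated with a small simulation. This is a sketch under loudly simplified assumptions, not the model behind the slides, which used item parameters from live data. The `recall` probability, the four-option guessing rule, the "at or above the cut-score" pass rule, and all function names are hypothetical. A simulated braindump user gives the memorized original key with probability `recall` and otherwise guesses; on a cloned item the original key is assumed to be wrong.

```python
import random

N_ITEMS = 50      # from the slide
CUT_SCORE = 33    # from the slide
N_OPTIONS = 4     # assumption: four-option multiple choice


def simulate_score(cloned: set, recall: float, rng: random.Random) -> int:
    """Score one simulated braindump user against the NEW answer key."""
    score = 0
    for item in range(N_ITEMS):
        if rng.random() < recall:
            # Memorized original key: correct only if the item was not cloned.
            score += (item not in cloned)
        else:
            # No recall: guess uniformly among the options.
            score += (rng.random() < 1 / N_OPTIONS)
    return score


def pass_rate(clone_fraction: float, recall: float = 0.9,
              n_users: int = 10_000, seed: int = 0) -> float:
    """Fraction of simulated braindump users at or above the cut-score."""
    rng = random.Random(seed)
    n_cloned = round(N_ITEMS * clone_fraction)
    passed = 0
    for _ in range(n_users):
        # Each user sees a fresh random selection of cloned items.
        cloned = set(rng.sample(range(N_ITEMS), n_cloned))
        passed += (simulate_score(cloned, recall, rng) >= CUT_SCORE)
    return passed / n_users
```

Comparing `pass_rate(0.0)` with `pass_rate(0.5)` contrasts the two conditions. The exact figures depend on the assumed `recall` and are not expected to reproduce the slide's 94% baseline.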
Detect Braindump Usage through Data Forensics
• Invalidate Scores and/or Disciplinary Action
• Determine When to Re-publish Items
Invalidate Scores/Disciplinary Action
• Old (original) items vs New (cloned) Items
Exam Form 1: 100% Original Items
Exam Form 2: 50% Original Items, 50% Cloned Items
STOLEN!!
Invalidate Scores/Disciplinary Action
• First! Objective 1 has been met! Devalue braindump content. Users will fail 50% of items from confusion and misdirection.
• Second! Detect braindump usage:
“Flawed” Answer Key: Score exams twice
Exam Form 2: 50% Original Items, 50% Cloned Items
Answer Key 1 (“Flawed”): Based on answers to original items
Answer Key 2: Based on correct answers to Form 2 items
Braindump users will answer predictably since they have access to original content.
Invalidate Scores/Disciplinary Action
Three Groups:
• Those who score well on the “flawed” answer key
• Those who score well on the new answer key
• Those who score poorly on both answer keys
Invalidate Scores/Disciplinary Action
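• Score well on the “flawed” answer key: BRAINDUMP USERS
• Score well on the new answer key: Prepared Test Takers
• Score poorly on both answer keys: Unprepared Test Takers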
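Scoring each exam twice and sorting test takers into the three groups can be sketched as follows. This is a minimal illustration, not Caveon's data forensics method. The function names and the `high` threshold for "scores well" are hypothetical; a real program would base the decision on statistical evidence rather than a fixed cutoff.

```python
def score(responses: list, key: list) -> int:
    """Count responses that match the given answer key."""
    return sum(r == k for r, k in zip(responses, key))


def classify(responses: list, flawed_key: list, new_key: list,
             high: int) -> str:
    """Label a test taker by scoring the same responses against both keys.

    `high` is a hypothetical threshold for "scores well".
    """
    flawed = score(responses, flawed_key)
    correct = score(responses, new_key)
    if flawed >= high and flawed > correct:
        # Matches the original (stolen) answers better than the real key.
        return "possible braindump user"
    if correct >= high:
        return "prepared"
    return "unprepared"


if __name__ == "__main__":
    # Toy six-item form: the last three items are clones with new keys.
    flawed_key = ["A", "B", "C", "D", "A", "B"]
    new_key = ["A", "B", "C", "A", "B", "C"]
    print(classify(flawed_key, flawed_key, new_key, high=5))
    print(classify(new_key, flawed_key, new_key, high=5))
```

Because half of the items are unchanged, a prepared test taker still matches part of the "flawed" key. Restricting the comparison to the cloned items alone would separate the groups more sharply.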
Determine When to Re-publish Items
• Difference between mean scores of groups approaches zero – can no longer distinguish between the groups
• Overall pass rate goes back up
**Sophisticated publication and republication strategies can be designed to maximize detections while also minimizing development costs.
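The first republication signal can be expressed as a simple check. The slides do not specify which groups are compared or how close to zero the difference must be, so the two score lists and the `tolerance` value below are assumptions, and `should_republish` is a hypothetical name.

```python
from statistics import mean


def should_republish(group_a_scores, group_b_scores, tolerance=1.0):
    """Return True when the groups' mean scores are within `tolerance`.

    `tolerance` (in score points) is a hypothetical setting; the slides do
    not say how small the difference must be.
    """
    if not group_a_scores or not group_b_scores:
        raise ValueError("both groups need at least one score")
    return abs(mean(group_a_scores) - mean(group_b_scores)) < tolerance
```

In practice this check would be run on recent testing windows alongside the overall pass rate, the second signal listed on the slide.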
Thank You!
Tara Williams
Chief Editor, Secure Exam Development &
Follow Caveon on Twitter @caveon
Check out our blog www.caveon.com/blog
LinkedIn Group “Caveon Test Security”
Jennifer Miller
Data Forensics Coordinator