NQC Presentation On Validation And Moderation

WWW.WERC.VU.EDU.AU

Enhancing Comparability of Standards through Validation and Moderation

A study funded by the National Quality Council

Shelley Gillis, Berwyn Clayton, Andrea Bateman


Rationale

• Some key stakeholders have raised concerns about the quality and consistency of assessments being undertaken by RTOs; that is, about the comparability of standards.


Aim

• To develop a series of products that would:

  • Improve the consistency in assessment decisions within VET;

  • Increase the level of confidence in industry in assessment in VET;

  • Increase awareness of, and consistency in, the application of reasonable adjustments in making assessment decisions;

  • Increase capability in RTOs to demonstrate compliance with AQTF 2007 Essential Standards for Registration, Standard 1.


Products

• Guide for Developing Assessment Tools

• Code of Professional Practice for Validation and Moderation

• Implementation Guide: Validation and Moderation

http://www.nqc.tvetaustralia.com.au/nqc_publications


Changes to the AQTF User Guide

• Validity

• Reliability

• Assessment tool

• Validation

• Moderation


• The Guide for Developing Assessment Tools


Essential Characteristics of an Assessment Tool

• An assessment tool includes the following components:

  • The learning or competency unit(s) to be assessed

  • The target group, context and conditions for the assessment

  • The tasks to be administered to the candidate

  • An outline of the evidence to be gathered from the candidate

  • The evidence criteria used to judge the quality of performance (i.e., the assessment decision making rules)

  • The administration, recording and reporting requirements


Ideal Characteristics

• The context

• Competency mapping

• The information to be provided to the candidate

• The evidence to be collected from the candidate

• Decision making rules

• Range and conditions

• Materials/resources required

• Assessor intervention

• Reasonable adjustments

• Validity evidence

• Reliability evidence

• Recording requirements

• Reporting requirements


Competency Mapping

• The components of the Unit(s) of Competency that the tool should cover should be described. This could be as simple as a mapping exercise between the components within a task (e.g., each structured interview question) and the components within a Unit or cluster of Units of Competency. The mapping will help determine the sufficiency of the evidence to be collected as well as the content validity.
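A mapping exercise of this kind could be sketched in a few lines of Python. This is only an illustration: the component codes (`PC1.1`, etc.), the question labels and the `coverage_report` helper are hypothetical and do not come from the Guide.

```python
# Hypothetical components of a Unit of Competency (codes are illustrative only).
unit_components = {"PC1.1", "PC1.2", "PC2.1", "PC2.2"}

# Each task component (e.g. a structured interview question) lists the
# unit components it is intended to gather evidence for.
task_mapping = {
    "Q1": {"PC1.1"},
    "Q2": {"PC1.2", "PC2.1"},
    "Q3": {"PC1.1", "PC9.9"},  # PC9.9 is deliberately not part of the unit
}

def coverage_report(unit_components, task_mapping):
    """Summarise how well the task components cover the unit components."""
    if not unit_components:
        raise ValueError("unit_components must not be empty")
    covered = set().union(*task_mapping.values())
    return {
        # Unit components with no task gathering evidence for them.
        "uncovered": sorted(unit_components - covered),
        # Codes referenced by a task that are not in the unit at all.
        "unmapped": sorted(covered - unit_components),
        # Share of unit components addressed by at least one task.
        "coverage": len(unit_components & covered) / len(unit_components),
    }

report = coverage_report(unit_components, task_mapping)
print(report)
```

An uncovered component points to a possible gap in sufficiency and content validity; an unmapped code suggests the task may be assessing something outside the unit.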


Decision Making Rules

• The rules to be used to:

  • Check evidence quality (i.e., the rules of evidence)

  • Judge how well the candidate performed according to the standard expected

  • Synthesise evidence from multiple sources to make an overall judgement


Reasonable Adjustments

• This section should describe the guidelines for making reasonable adjustments to the way in which evidence of performance is gathered without altering the expected performance standards (as outlined in the decision making rules).


Validity Evidence

• Validity is concerned with the extent to which an assessment decision about a candidate, based on the performance by the candidate, is justified. Establishing validity requires determining the conditions that weaken the truthfulness of the decision, exploring alternative explanations for good or poor performance, and feeding them back into the assessment process to reduce errors when making inferences about competence.

• Evidence of validity (such as face, construct, predictive, concurrent, consequential and content) should be provided to support the use of the assessment evidence for the defined purpose and target group of the tool.


Reliability Evidence

• Reliability is concerned with how much error is included in the evidence.

• If using a performance based task that requires professional judgement of the assessor, evidence of reliability could include providing evidence of:

• The level of agreement between two different assessors who have assessed the same evidence of performance for a particular candidate (i.e., inter-rater reliability).

• The level of agreement of the same assessor who has assessed the same evidence of performance of the candidate, but at a different time (i.e., intra-rater reliability).

• If using objective test items (e.g., multiple choice tests), then other forms of reliability should be considered, such as the internal consistency of a test (i.e., internal reliability) as well as the equivalence of two alternative assessment tasks (i.e., parallel forms).
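As a rough illustration of inter-rater reliability evidence, the sketch below computes simple percent agreement and Cohen's kappa for two assessors who judged the same evidence. Cohen's kappa is not named in the slides; it is assumed here as one common chance-corrected agreement statistic. The judgement labels ("C" for competent, "NYC" for not yet competent) and the data are invented for the example.

```python
from collections import Counter

def percent_agreement(a, b):
    """Proportion of candidates on which the two assessors agree."""
    if len(a) != len(b) or not a:
        raise ValueError("judgement lists must be non-empty and equal in length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Agreement corrected for the agreement expected by chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of each assessor's marginal proportions, summed over labels.
    p_expected = sum(counts_a[k] * counts_b[k] for k in set(a) | set(b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both assessors used a single identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)

# Invented judgements by two assessors on the same eight pieces of evidence.
assessor_1 = ["C", "C", "NYC", "C", "NYC", "C", "C", "NYC"]
assessor_2 = ["C", "C", "NYC", "NYC", "NYC", "C", "C", "C"]

print(percent_agreement(assessor_1, assessor_2))
print(cohens_kappa(assessor_1, assessor_2))
```

The same functions could be applied to one assessor's judgements of the same evidence at two different times to give an indication of intra-rater reliability.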


Examples

Write, Say, Do, Create

Portfolio, Interview, Observation, Product


Quality Checks

• Panel

• Pilot

• Trial


A Code of Professional Practice for Validation and Moderation


Assessment Quality Management

• Quality Assurance

• Quality Control

• Quality Review


Assessment Quality Management

Quality Assurance (Input approach)

Examples include:

• Industry competency standards as the benchmarks for assessment

• National assessment principles

• Minimum qualifications for assessors (i.e., TAA40404)

• Development of a Professional Code of Practice

• Standardisation of reporting formats

• Assessment Guidelines and Policy Documents

• Benchmark examples of varying levels of performance

• Assessment tool banks

• Common assessment tasks

• Exemplar assessment tools

• Panelling, piloting and/or trialling of assessment tools

• Professional development programs/workshops for assessors

Quality Control (Outcome approach)

Examples include:

• Moderation in which adjustments to assessor judgements are made to overcome differences in the difficulty of the assessment tool and/or severity of the judgement

Quality Review (Retrospective approach)

Examples include:

• Monitoring and auditing of registered training organisations

• Review and validation of assessment tools, processes and outcomes to identify future improvements

• Follow-up surveys with key stakeholders (e.g., student destination surveys, employer feedback on how well the assessment outcomes predicted workplace performance)


Validation Versus Moderation

• Assessment quality management type

  • Validation: Quality review

  • Moderation: Quality control

• Primary purpose

  • Validation: Continuous improvement

  • Moderation: Bring judgements and standards into alignment

• Timing

  • Validation: On-going, yet most powerful post assessment

  • Moderation: Prior to the finalisation of candidate results

• Focus

  • Validation: Assessment tools; and candidate evidence, including assessor judgements (desirable only)

  • Moderation: Assessment tools; and candidate evidence, including assessor judgements (mandatory)

• Type of approaches

  • Validation: Assessor partnerships; consensus meetings; external (validators or panels)

  • Moderation: Consensus meetings; external (moderators or panels); statistical

• Outcomes

  • Validation: Recommendations for future improvements

  • Moderation: Recommendations for future improvements; and adjustments to assessor judgements (if required)


Focus - Tool

• Has clear, documented evidence of the procedures for collecting, synthesising, judging and recording outcomes (i.e., to help improve the consistency of assessments across assessors [inter-rater reliability]).

• Has evidence of content validity (i.e., whether the assessment task(s), as a whole, represent the full range of knowledge and skills specified within the Unit(s) of Competency).

• Reflects work-based contexts, specific enterprise language and job-tasks and meets industry requirements (i.e., face validity).

• Adheres to the literacy and numeracy requirements of the Unit(s) of Competency (construct validity).

• Has been designed to assess a variety of evidence over time and contexts (predictive validity).

• Has been designed to minimise the influence of extraneous factors (i.e., factors that are not related to the unit of competency) on candidate performance (construct validity).


Focus - Tool

• Has clear decision making rules to ensure consistency of judgements across assessors (inter-rater reliability) as well as consistency of judgements within an assessor (intra-rater reliability).

• Has clear instructions on how to synthesise multiple sources of evidence to make an overall judgement of performance (inter-rater reliability).

• Has evidence that the principles of fairness and flexibility have been adhered to.

• Has been designed to produce sufficient, current and authentic evidence.

• Is appropriate in terms of the level of difficulty of the task(s) to be performed in relation to the skills and knowledge specified within the relevant Unit(s) of Competency.

• Has outlined appropriate reasonable adjustments that could be made to the gathering of assessment evidence for specific individuals and/or groups.

• Has adhered to the relevant organisational assessment policy.


Focus - Judgement

• Check whether the judgement was too harsh or too lenient by reviewing samples of judged candidate evidence against the:

  • Requirements set out in the Unit(s) of Competency;

  • Benchmark samples of candidate evidence at varying levels of achievement (including borderline cases); and the

  • Assessment decision making rules specified within the assessment tools.

• Desirable for validation, mandatory for moderation


Types of Approaches – Assessor Partnerships

• Validation only

• Informal, self-managed, collegial

• Small group of assessors

• May involve:

  • Sharing, discussing and/or reviewing one another’s tools and/or judgements

• Benefit

  • Low costs, personally empowering, non-threatening

• Weakness

  • Potential to reinforce misconceptions and mistakes


Types of Approaches - Consensus

• Typically involves assessors reviewing their own and their colleagues’ assessment tools and judgements as a group

• Can occur within and/or across organisations

• Strength

  • Professional development, networking, promotes collegiality and sharing

• Weakness

  • Less quality control than external and statistical approaches as they can also be influenced by local values and expectations

  • Requires a culture of sharing


Types of Approaches - External

• Types

  • Site visit versus central agency

• Strengths

  • Offer authoritative interpretations of standards

  • Improve consistency of standards across locations by identifying local bias and/or misconceptions (if any)

  • Educative

• Weakness

  • Expensive

  • Less control than statistical


Types of Approaches - Statistical

• Limited to moderation

• Yet to be pursued at the national level in VET

• Requires some form of common assessment task at the national level

• Adjusts the level and spread of RTO-based assessments to match the level and spread of the same candidates’ scores on a common assessment task

• Maintains RTO-based rank ordering but brings the distribution of scores across groups of candidates into alignment

• Strength

  • Strongest form of quality control

• Weakness

  • Lacks face validity, may have limited content validity
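One way to picture the adjustment of level and spread is a simple linear rescaling, sketched below. This is an assumption made for illustration, not a procedure given in the slides: "level" is taken to be the mean and "spread" the standard deviation, and the function name `moderate_scores` and the scores are invented.

```python
from statistics import mean, pstdev

def moderate_scores(rto_scores, common_scores):
    """Rescale RTO-based scores to the mean and spread of the same
    candidates' scores on a common assessment task (linear scaling)."""
    if len(rto_scores) != len(common_scores) or len(rto_scores) < 2:
        raise ValueError("need two equal-length lists with at least two candidates")
    rto_mean, rto_sd = mean(rto_scores), pstdev(rto_scores)
    common_mean, common_sd = mean(common_scores), pstdev(common_scores)
    if rto_sd == 0:
        # All RTO scores identical: there is no spread to rescale.
        return [common_mean for _ in rto_scores]
    scale = common_sd / rto_sd
    # A non-negative scale factor keeps the RTO-based rank ordering intact.
    return [common_mean + (score - rto_mean) * scale for score in rto_scores]

# Invented scores for four candidates from one RTO.
rto = [60, 70, 80, 90]      # RTO-based assessment results
common = [50, 55, 65, 70]   # same candidates on the common assessment task

adjusted = moderate_scores(rto, common)
print([round(score, 1) for score in adjusted])
```

In this sketch each candidate keeps their position within the RTO group; only the group's overall level and spread are brought into line with the common task.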


Summary of major distinguishing features

• Validation is concerned with quality review whilst moderation is concerned with quality control;

• The primary purpose of moderation is to help achieve comparability of standards across organisations whilst validation is primarily concerned with continuous improvement of assessment practices and outcomes;

• Whilst validation and moderation can both focus on assessment tools, moderation requires access to judged (or scored) candidate evidence. The latter is only desirable for validation;

• Both consensus and external approaches to validation and moderation are possible. Moderation can also be based upon statistical procedures whilst validation can include less formal arrangements such as assessor partnerships; and

• The outcomes of validation are in terms of recommendations for future improvement to the assessment tools and/or processes; whereas moderation may also include making adjustments to assessor judgements to bring standards into alignment, where determined necessary.


Principles

• Transparent

• Representative

• Confidential

• Educative

• Equitable

• Tolerable


Transparent

The purpose, process and implications of validation and/or moderation should be transparent to all relevant stakeholders.

This principle can be enhanced if:

• The purpose, approach and potential outcomes are made explicit to assessors.

• The approach to be implemented is clearly delineated and communicated to relevant stakeholders.

• The justification for the outcomes recommended (validation) and/or imposed (moderation) is clearly documented and made available to assessors.


Representative

It is not possible or necessary to validate and/or moderate every possible assessment tool or piece of candidate evidence within an RTO at one time. A representative sample should therefore be used to validate and moderate assessment tools and judgements. A properly selected representative sample can identify any issues with assessment practices and decisions.

This principle can be enhanced if:

• A sampling framework is designed in which risk indicators are identified that may impact on the assessment process and/or outcomes, and such indicators are targeted for selection; and

• There is an element of random selection.
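A sampling framework that combines risk-targeted and random selection could look something like the sketch below. The function name `select_sample`, the tool identifiers and the idea of passing risk-flagged items as a set are all assumptions made for illustration; how risk indicators are defined is left to the RTO.

```python
import random

def select_sample(items, at_risk, n_random, seed=None):
    """Return (targeted, randomly_chosen): every risk-flagged item,
    plus a random draw from the remaining items."""
    targeted = [item for item in items if item in at_risk]
    remaining = [item for item in items if item not in at_risk]
    rng = random.Random(seed)  # a seed makes the draw repeatable for audit purposes
    n = min(n_random, len(remaining))
    return targeted, rng.sample(remaining, n)

# Hypothetical assessment tools; two are flagged by risk indicators
# (for example, a new assessor or a high-stakes unit).
items = [f"tool-{i}" for i in range(1, 11)]
at_risk = {"tool-2", "tool-7"}

targeted, randomly_chosen = select_sample(items, at_risk, n_random=3, seed=1)
print(targeted, randomly_chosen)
```

Keeping the two parts of the sample separate makes it easy to report which items were selected because of a risk indicator and which were selected by chance.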


Confidential

Information regarding individuals (i.e., assessors and candidates) and providers must be treated with sensitivity and discretion. Confidentiality should be observed in relation to the identity of the assessors (i.e., those who developed the assessment tools and/or made the judgements) and candidates (i.e., those whose evidence is submitted in the process). This allows the validation and/or moderation process to focus on the quality of the assessment tools and the assessment judgements rather than the individuals involved.

This principle can be enhanced if:

• De-identified samples of candidates’ work and assessors’ tools are used.

• The outcomes of the process are given in a private, supportive environment.


Educative

Validation and/or moderation should form an integral rather than separate part of the assessment process. It should provide constructive feedback, which leads to continuous improvement across the organisation.

This principle can be enhanced if:

• The process is supportive and positive for assessors, validators and/or moderators.

• The process and outcomes provide the basis for individuals as well as organisations to monitor and reflect on their own practice.

• The rationales behind recommendations for alterations and/or adjustments are made explicit to assessors.

• Recommendations for improvement to the assessment tool and/or decision making process are succinct, constructive and explicit.

• Professional development support is available for assessors.


Equitable

Validation and/or moderation must be demonstrably fair, equitably applied and unbiased.

This principle can be enhanced if:

• There are clear and effective policies and mechanisms for the appeal or review of moderation outcomes by key stakeholders, in circumstances in which an appeal or review is appropriate.

• Confidentiality of evidence can be assured.

• The process is sensitive to assessor and candidate diversity and has no inherent biases.


Tolerable

Any assessment includes a margin of error. The way in which evidence is gathered and interpreted against the standards will vary. The challenge is to limit the variation to acceptable proportions. Validation and/or moderation enables the variation to be identified and limited to what is tolerable.

This principle can be enhanced if:

• Benchmark samples of borderline cases are used as points of reference.

• Exemplar tools are made available to assessors as well as validators/moderators.

• A risk assessment has been undertaken of the implications of a false positive judgement (i.e., assessing someone as competent when in actual fact they are not yet competent) and a false negative judgement (i.e., assessing someone as not yet competent when in actual fact the person is competent).
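To give a feel for how false positive and false negative judgements might be counted, the sketch below compares an assessor's judgements with a set of reference judgements. Treating a panel or moderator decision as the reference is an assumption for this example, as are the labels ("C", "NYC"), the data and the function name `error_rates`.

```python
def error_rates(assessor, reference, competent="C"):
    """Compare assessor judgements with reference judgements and
    return the false positive and false negative rates."""
    if len(assessor) != len(reference):
        raise ValueError("judgement lists must be the same length")
    pairs = list(zip(assessor, reference))
    # False positive: judged competent, but the reference says not yet competent.
    false_pos = sum(a == competent and r != competent for a, r in pairs)
    # False negative: judged not yet competent, but the reference says competent.
    false_neg = sum(a != competent and r == competent for a, r in pairs)
    n_not_yet = sum(r != competent for r in reference)
    n_competent = sum(r == competent for r in reference)
    return {
        "false_positive_rate": false_pos / n_not_yet if n_not_yet else 0.0,
        "false_negative_rate": false_neg / n_competent if n_competent else 0.0,
    }

# Invented judgements on six pieces of candidate evidence.
assessor = ["C", "C", "NYC", "C", "NYC", "C"]
reference = ["C", "NYC", "NYC", "C", "C", "C"]

rates = error_rates(assessor, reference)
print(rates)
```

Which of the two rates matters more depends on the implications identified in the risk assessment, for example where a false positive could create a safety risk in the workplace.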


Associate Professor Dr Shelley Gillis

Deputy Director

Work-based Education Research Centre

Victoria University

Email: [email protected]

Phone: 0432 756 638

Andrea Bateman

Director

Education Consultant

Bateman Giles Pty Ltd

Email: [email protected]

Phone: 0418 585 754

Berwyn Clayton

Director

Work-based Education Research Centre

Victoria University

Email: [email protected]

Phone: 0411 138 205
