Evaluation
Direction and guidance for evaluating initiatives across the department to improve education outcomes for all students, and support the effective, efficient, appropriate and transparent use of public resources.
Audience
All department staff.
Version | Date | Description of changes | Approved by |
---|---|---|---|
V02.0.0 | 06/12/2024 | Revised to provide clearer, more detailed instructions on how and by whom evaluations should be conducted, in accordance with the department's and NSW Treasury's requirements. | Executive Director, Policy and Evidence, CESE |
V01.0.0 | 26/07/2024 | Under the 2023 Policy and procedure review program, new policy document with consolidated instructions previously provided in the Evaluation Policy and Evaluation framework. | Executive Director, Policy and Evidence, Centre for Education Statistics and Evaluation (CESE) |
About the policy
The NSW Treasury Policy and Guidelines: Evaluation (PDF 9 MB) require that all NSW Government agencies monitor and evaluate their programs, both ongoing and new, to assess their achievement of intended outcomes and benefits to the people of NSW.
All programs should undergo some form of evaluation, monitoring, and/or periodic review.
This should happen even if the program is not identified in the department’s annual evaluation schedule or assessed as a priority based on the Treasury guidelines.
This policy document applies to evaluations that are initiated by department program owners.
It also includes evaluations required by:
- NSW Treasury
- funding bodies
- external program requirements.
These procedures relate to the Enterprise management policy.
Definitions
Term | Definition |
---|---|
Administrative data | Information that organisations collect as part of their ongoing management and operations. Examples of administrative data in the department include attendance and suspension records, and student assessment data. |
Compliant evaluation | Some evaluations must meet the department's and NSW Treasury's requirements, depending on the purpose of the evaluation output (section 2.1). These are referred to as compliant evaluations for the purposes of this policy. |
Evaluation | A rigorous, systematic, transparent and objective process for making judgements about the implementation, impacts and merits or worth of a program, usually in relation to its effectiveness, efficiency and appropriateness. |
Evidence-based decision-making | Using high quality, rigorous methods to gather and interpret data, and/or to critically assess existing research as the basis for making decisions. |
Mixed methods | A combination of qualitative and quantitative research and/or evaluation approaches. |
Initiatives | A set of activities managed together over a sustained period that aim to deliver benefits for communities. The term 'initiative' is sometimes used interchangeably with 'policy', 'project', 'program' or 'strategy'. Initiatives may include one or more projects that aim to deliver a specific product or output and achieve a strategic outcome within a specific timeframe and budget. |
Qualitative data | Information that usually refers to perceptions of phenomena such as thoughts, observations, feelings, opinions and/or lived experiences, and which is not easily reduced to numbers. Qualitative data helps us answer questions about the 'what', 'how' and 'why' of a phenomenon, rather than questions of 'how many' or 'how much'. |
Quantitative data | Information that can be expressed as numbers. This allows for various forms of analysis, including descriptive statistics (like averages, counts, percentages and differences) to summarise the data, and inferential statistics to draw conclusions about a larger population from a smaller sample. Both types of analyses are instrumental in understanding large datasets, identifying trends over time, and exploring differences across groups. |
Roles and responsibilities
Secretary:
- approves evaluations that are conducted or commissioned within the department.
Senior executive staff:
- senior executives of the department (Deputy Secretaries, executive directors and directors) ensure compliance with this policy.
Deputy Secretary, Education and Skills Reform:
- approves the annual evaluation schedule to be submitted to the Cabinet Standing Committee on Expenditure each financial year.
Executive Director, Policy and Evidence, Centre for Education Statistics and Evaluation (CESE):
- implements, monitors and reviews this policy.
Program owners (executive directors):
- ensure that evaluations are designed and conducted according to the requirements of this policy and that evaluations are properly resourced by their business unit.
What needs to be done
In these procedures, 'compliant evaluation' refers to an evaluation that complies with both department and NSW Treasury requirements and can therefore be used to determine a program's overall effectiveness (its achievement of intended outcomes) or efficiency (value for money).
Performance measurement activities such as monitoring, 'deliverology', post-occupancy evaluation, or formative research (such as pilot studies) can complement an evaluation but, on their own, do not satisfy the evaluation requirements and recommendations outlined in the Treasury guidelines; the same applies to tracking key performance metrics. The sections below set out the requirements for a compliant evaluation according to its purpose (why), the types of evaluation to be undertaken (how), and who leads or conducts the evaluation (by whom).
1. Identify priority initiatives for evaluation
Program owners must identify any priority initiatives that will need to be included in the department’s annual evaluation schedule. Initiatives should undergo an evaluation if they either:
- are central to the achievement of department, state or national priorities (for example, an election commitment or the NSW Plan for Public Education)
- involve large-scale investment (cost more than $10 million over the lifetime of the initiative).
Initiatives that are wholly or partly funded by other government agencies or non-government organisations must be evaluated and included in the department’s annual evaluation schedules if the department is the lead agency.
Evaluations within the department are approved by the Secretary in a process that is administered and overseen by the Centre for Education Statistics and Evaluation (CESE). Compliance with this policy contributes to whether an evaluation is approved.
2. Prepare for an evaluation
2.1. Determine the purpose of an evaluation (why)
Program owners must first decide how they will need to use their evaluation findings. This will determine whether a compliant evaluation is required, or if performance measurement activities are sufficient.
Possible purposes for evaluations include:
- supplementing submissions to the Cabinet Standing Committee on Expenditure as a part of a new policy proposal
- justifying the expansion, reform or discontinuation of an existing initiative
- meeting reporting requirements from external funding bodies (for example, Commonwealth, NSW Treasury)
- supporting decision-making for the initiative.
For compliant evaluations, program owners must fulfil both department (this policy) and NSW Treasury (Policy and Guidelines: Evaluation) requirements. These requirements determine what sort of data collection or evaluation activities are appropriate, depending on the purpose of the planned evaluation.
A high-level monitoring and evaluation plan should be included in business cases to the Cabinet Standing Committee on Expenditure for new policy proposals or recurrent proposals to expand or reform an existing initiative. The plan should be appropriate to the initiative’s size, priority and risk. The NSW Treasury Policy and Guidelines: Submission of Business Cases (PDF 518 KB) recommends that initiatives:
- with a lifetime cost between $10 million and $50 million should focus on providing evidence of costs, outcomes and benefits
- over $50 million should look at providing evidence of outcomes and net social benefits, as well as assess value for money.
The NSW Treasury NSW Government Business Case Guidelines (PDF 1 MB) together with the NSW Treasury Policy and Guidelines provide more information on evaluation planning for business case development.
Program owners must conduct or plan for compliant evaluations when they need to achieve one or more of the following:
- fulfil external reporting requirements (for example, NSW Treasury, Commonwealth funding bodies)
- support an application for new policy proposal funding, or other state or federal funding for existing programs.
Program owners can conduct other performance measurement activities when they need to achieve one or more of the following:
- provide insights for program development/scaling a program (formative research)
- supplement a compliant evaluation.
If the goal is to support decision-making for the initiative or to generate insights for program improvement, either a compliant evaluation or performance measurement activities may be suitable, as illustrated in the sketch below.
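As a reading aid only, the mapping above can be expressed as a short sketch. The function and the purpose labels are hypothetical and illustrative; they do not replace the judgement described in this section.

```python
# Hypothetical summary of section 2.1: which evaluation approach suits a
# stated purpose. Purpose labels are illustrative, not departmental terms.

REQUIRES_COMPLIANT = {"external reporting", "funding application"}
PERFORMANCE_OK = {"formative research", "supplement a compliant evaluation"}
EITHER = {"decision support", "program improvement"}

def suitable_approaches(purpose: str) -> list[str]:
    """Return the evaluation approach(es) suited to a stated purpose."""
    if purpose in REQUIRES_COMPLIANT:
        return ["compliant evaluation"]
    if purpose in PERFORMANCE_OK:
        return ["performance measurement"]
    if purpose in EITHER:
        return ["compliant evaluation", "performance measurement"]
    raise ValueError(f"Unrecognised purpose: {purpose!r}")

print(suitable_approaches("external reporting"))
# -> ['compliant evaluation']
```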
2.2. Determine the scale of the evaluation
Once the purpose of and need for a compliant evaluation have been identified, the program owner should determine the scale of their initiative. The department's scale categories are adapted from the three-scale model in the Treasury guidelines and, as outlined in Table 1, are based on the initiative's:
- size – cost of the initiative over a 4-year period
- strategic priority – whether the initiative is an election commitment
- risk – whether the initiative has external reporting requirements or is delivered in partnership with another agency (reputational risk).
When setting up initiatives that involve multiple projects or policies, program owners must first determine the scale and plan the evaluation for the initiative as a whole.
The process for categorising department initiatives by scale is outlined in Table 1 and Figure 1 below.
Table 1 Scales
Scale | Characteristics |
---|---|
Scale A | Size: Costs more than $250 million over 4 years OR Strategic priority: Is a government commitment (for example, an election commitment) OR Risk: Is delivered in partnership with another government agency |
Scale B | Strategic priority: Is not a government commitment AND EITHER Size: Costs $50 million up to $250 million over 4 years OR Risk: Has external reporting requirements (Commonwealth or state) OR Risk: Is a Tier 1 or Tier 2 project under the Major Projects Policy for Government Businesses (TPP18-05; TPG22-12) |
Scale C | Size: Costs less than $50 million over 4 years AND Strategic priority: Is not a government commitment AND Risk: Is not a Tier 1 or Tier 2 project under the Major Projects Policy for Government Businesses (TPP18-05; TPG22-12) |
Figure 1. Flowchart to determine the scale of department initiatives
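For illustration only, the decision logic in Table 1 and Figure 1 can be sketched as a small function. The function name, its inputs and the boolean simplification of the Tier 1/Tier 2 check are assumptions for demonstration, not part of the policy.

```python
# Illustrative sketch of the Table 1 / Figure 1 decision logic.
# All names and inputs are hypothetical; dollar figures follow Table 1.

def determine_scale(cost_over_4_years: float,
                    is_government_commitment: bool,
                    delivered_with_other_agency: bool,
                    has_external_reporting: bool,
                    is_tier_1_or_2_major_project: bool) -> str:
    """Classify an initiative as scale A, B or C per Table 1."""
    # Scale A: size, strategic priority or partnership risk applies.
    if (cost_over_4_years > 250_000_000
            or is_government_commitment
            or delivered_with_other_agency):
        return "A"
    # Scale B: not a government commitment, and a size or risk criterion applies.
    if (cost_over_4_years >= 50_000_000
            or has_external_reporting
            or is_tier_1_or_2_major_project):
        return "B"
    # Scale C: under $50 million, no commitment, no Tier 1 or Tier 2 risk.
    return "C"

# Example: a $60m initiative with Commonwealth reporting requirements.
print(determine_scale(60_000_000, False, False, True, False))  # -> "B"
```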
3. Comply with evaluation requirements
Program owners must ensure that their evaluation complies with department and NSW Treasury requirements at the outset of the initiative.
Program owners and evaluators must consider the following:
- the independence of the evaluation (by whom)
- the types of evaluation to be undertaken (how)
- the resourcing of the evaluation.
3.1. Ensure an evaluation is independent (by whom)
Evaluators are responsible for evaluation activities and outputs, including the final content of evaluation reports. Evaluation reports must reflect the evaluators’ findings and conclusions and must not be amended without their agreement.
Treasury provides guidance (refer to Evaluation Workbook V. Evaluation plan: Use the right expertise [PDF 119 KB]) on the level of independence required for evaluations, based on the initiative's scale and characteristics. Evaluations of high-cost, high-priority initiatives (scales A and B) must:
- be conducted by evaluators external to the division delivering the initiative
- have independent governance mechanisms in place.
Table 2 outlines the independence requirements for initiatives as categorised by their scale.
Table 2 Independence requirements
Scale | Who must conduct the evaluation? | Who should be included in the governance? |
---|---|---|
Scale A | CESE, if the initiative is not in the same division as CESE (that is, does not report to the same Deputy Secretary) and CESE has not played a role in developing or delivering the initiative; OR an evaluation unit that is not in the same division as the program team and has not played a role in developing or delivering the initiative; OR an external supplier that has not played a role in developing or delivering the initiative | External stakeholders (for example, academic or industry experts) should be included in the governance group. The methodology and deliverables should be reviewed by independent subject matter experts (such as an external supplier or academic). |
Scale B | CESE; OR an evaluation unit that is not in the same division as the program team (that is, does not report to the same Deputy Secretary) and has not played a role in developing or delivering the initiative; OR an external supplier that has not played a role in developing or delivering the initiative | If the evaluation is conducted by an evaluation unit in the same division as the initiative (reporting to the same Deputy Secretary), independent governance mechanisms must be in place. |
Scale C | CESE; OR an external supplier that has not played a role in developing or delivering the initiative; OR program owners or internal evaluation teams | No governance requirements apply to scale C. Note: evaluations should still be conducted with the right mix of evaluation expertise, independence from program owners, and impartiality. |
3.2. Specify the types of evaluation (how)
Program owners must specify the types of evaluations to be included in their plan. These include:
- process or implementation evaluation
- outcome or impact evaluation
- economic evaluation.
These evaluation types are explained further below, with examples of questions.
Program owners of scale A or B initiatives (refer to Table 1 above) must plan for and undertake both a process and an outcome evaluation to ensure compliance.
Economic evaluations are recommended for initiatives of any scale when the outcomes can be monetised. Process and outcome evaluations are recommended for scale C initiatives.
Process or implementation evaluation investigates how programs are delivered, describing the current conditions and identifying issues that may support or hinder success. The evaluation assesses:
- whether activities are being implemented as intended
- which aspects of a program are working well
- what could be improved to inform adjustments to service delivery.
Examples of process evaluation questions include:
- Is the program being implemented as planned?
- How well is the program operating?
- What are the barriers or facilitators to implementing program activities?
- Which program activities are meeting the needs of participants and other key stakeholders?
Outcome or impact evaluation determines whether a program has met its stated objectives. This evaluation type also considers:
- the intended and unintended effects of a program
- whether the program works for particular populations and under what circumstances.
Outcome evaluation requires certain conditions to produce robust findings:
- sufficient time to show an effect
- reliable data that enables comparison with a baseline and/or a control group
- measurable outcomes
- an adequately sized and representative sample.
Examples of outcome evaluation questions include:
- Did the program meet its stated objectives?
- What difference did the program make?
- Who has benefited from the program, how, and under what circumstances?
- Are there any unintended consequences for participants or stakeholders?
Economic evaluation identifies, measures and values the costs and benefits of a program, assigning a value to a program's inputs and outcomes. A quality economic evaluation can therefore only be conducted when a program is producing reliable outcome data.
There are various methods for economic evaluation; however, the Treasury guidelines emphasise:
- cost-benefit analysis, which should be used when assessing the net social benefits and value for money of a program, and is particularly suitable for large, complex or risky programs
- cost-effectiveness analysis, which should be used when assessing the value for money of a program where it is not feasible to quantify or monetise benefits.
Examples of economic evaluation questions include:
- What has been the ratio of costs to benefits for the initiative?
- Is the initiative cost effective relative to alternatives?
3.3. Ensure evaluations are appropriately resourced
Executives who oversee an evaluation should:
- ensure that the evaluation and monitoring activities are appropriately resourced
- consider what is feasible and realistic to achieve within time and budget constraints.
Costs for evaluation should form part of a program's overall budget. Based on the Treasury guidelines, the total cost of evaluation, monitoring and other data collection activities for an initiative should be 1% to 5% of the total program budget over the period that the evaluation is being conducted. The cost of evaluation and monitoring includes:
- the personnel commitment from within the department
- the value of contracts with professional services providers
- funding to other teams within the department to support these activities.
For example, a program that costs $100 million over 4 years should not spend more than $5 million on evaluation and monitoring or other collections over the same period.
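As a worked illustration of the 1% to 5% band, the snippet below computes the recommended range; the function is a hypothetical aid, not a departmental tool.

```python
# Illustrative only: compute the Treasury-recommended 1% to 5% band for
# evaluation, monitoring and other data collection spend.

def evaluation_budget_band(total_program_budget: float) -> tuple[float, float]:
    """Return the (minimum, maximum) recommended evaluation spend."""
    return 0.01 * total_program_budget, 0.05 * total_program_budget

low, high = evaluation_budget_band(100_000_000)  # $100 million over 4 years
print(f"${low:,.0f} to ${high:,.0f}")  # -> $1,000,000 to $5,000,000
```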
4. Follow the general principles of evaluation
Program owners and evaluators should follow the principles outlined in Table 3 when planning for an evaluation.
Table 3 General principles of evaluation
Principle | Action and resources |
---|---|
Plan your evaluation early | Ideally, plan evaluations during the program design stage. Embed periodic program evaluation, including before, during and after program implementation, into program design. Supporting resource: the NSW Treasury Evaluation Policy and Guidelines supports implementation of the Treasury requirements and provides advice and resources for planning an evaluation. |
State the purpose clearly | Evaluation stakeholders should have a clear understanding of an evaluation's purpose and how findings are intended to be used. This includes knowing why the evaluation matters, to whom the findings will be important and why, and an awareness of the evaluation's context. |
Ensure evaluation is rigorous and methodologically sound | Evaluations should be rigorous and methodologically sound. Robust evaluation design incorporates the methods most appropriate to that evaluation, including quantitative and/or qualitative methods. Section 3.2 of this policy outlines the types of evaluation conducted as part of a compliant evaluation. |
Comply with ethical and data security requirements | Ethical considerations and data security requirements must be incorporated into the design and conduct of evaluations when applicable. |
Ensure effective governance and oversight | Evaluations should have effective governance structures and processes to ensure oversight of the evaluation design, implementation and reporting. |
Ensure the appropriate mix of evaluation expertise | Conduct evaluations with the right mix of evaluation expertise, independence from program owners and impartiality. Form evaluation teams based on principles of diversity, inclusion and equity. Identify and actively involve stakeholders in the evaluation process so that the definition of outcomes, activities and outputs, and of what is important to measure in assessing program success, is determined collaboratively. Stakeholders play an important role in interpreting evaluation outputs and formulating recommendations. Program owners (or the project team) must work closely with the evaluation team (or those stakeholders with evaluation expertise) and brief them on the program area to avoid misinterpretation of project aims, objectives and stakeholders. |
Ensure evaluation conduct and findings are transparent | The conduct of evaluations should be open to scrutiny. Systematically record and report comprehensive information on all aspects of the evaluation, including methods, analyses and conclusions. Explicitly justify, and clearly distinguish, factual findings and conclusions from value judgements and recommendations. |
Conduct evaluations in a timely manner | Evaluations should be timely and strategic so they can influence decision-making. Providing valid, reliable information requires balancing technical and time requirements with practical considerations, to ensure the evaluation supports evidence-based decision-making. |
Use findings for decision-making | Evaluation findings should inform decisions about programs, including whether to expand, reform or discontinue an initiative (section 2.1). |
Evaluations must be carried out in accordance with the department’s Re-imagining Evaluation: A Culturally Responsive Evaluation Framework.
This framework highlights the importance of centring Aboriginal students, their families, and communities at the heart of evaluation methodology and processes.
Evaluators should recognise, respect and be responsive to the cultural values of participants, communities and the settings in which the program intends to create impact.
This includes collaborating with community in evaluation design and honouring the principles of Aboriginal family sovereignty and Aboriginal data sovereignty.
5. Submit scale A or B evaluations for approval
Program owners must submit their proposals for scale A or B evaluations (refer to the three-scale model, section 2.2) to the department’s evaluation approvals process by email to cese.evaluation.corro@det.nsw.edu.au.
CESE will assess applications and provide a briefing to the Secretary’s office for approval. For additional information about the evaluation approvals process, please refer to the CESE intranet resources page (staff only). Scale A or B evaluations not approved by the Secretary will not proceed.
6. Publish the findings
Staff must make outputs from a compliant evaluation publicly available, except where there is an overriding public interest against disclosure, in line with the Government Information (Public Access) Act 2009. Before publication, staff must also account for cross-government agreements, such as the memorandum of understanding with NSW Health, to ensure sensitive information is not released.
Evaluation reports should be released in a range of forums, such as the CESE Evaluation evidence bank or the initiative's website. Send requests for publication in the CESE evaluation repository to info@cese.nsw.gov.au.
Record-keeping requirements
- Program owners, evaluators and delegated staff must keep records in line with the department’s Records management procedures.
Supporting tools, resources and related information
- NSW Government – Evaluation Toolkit
- NSW Treasury Evaluation Policy and Guidelines – Evaluation workbooks
- NSW Treasury Evaluation Policy and Guidelines – Resources
- Re-imagining Evaluation: A Culturally Responsive Evaluation Framework
- Knowledge platform – BetterEvaluation
- NSW legislation – Government Information (Public Access) Act 2009
- NSW Treasury – NSW Government Business Case Guidelines TPP18-06 (PDF 1 MB)
- NSW Treasury – NSW Government Guide to Cost-Benefit Analysis TPG23-08 (PDF 2 MB)
- NSW Treasury – Policy and Guidelines: Evaluation TPG22-22 (PDF 9 MB)
Policy contact
02 7814 0357
info@cese.nsw.gov.au
The Executive Director, Policy and Evidence, CESE monitors the implementation of this policy, regularly reviews its contents to ensure relevance and accuracy, and updates it as needed.