Claudio Alberti is a Policy Analyst in the OECD Development Co-operation Directorate. Rachel Lineham is Evaluation Advisor and Methods Innovation Lead at the UK Foreign, Commonwealth & Development Office.
Around the world, development co-operation budgets are under strain. The OECD projects a 9-17% drop in official development assistance (ODA) in 2025, on top of a 9% fall in 2024. By 2027, ODA levels could revert to those of 2020. In this environment, development stakeholders must prove value for money and show results.
At the same time, the scale and complexity of development challenges are increasing. Solid evidence on what works, where, for whom, and why is essential to make limited resources go further. Yet decision makers face a flood of information, making it hard to identify what truly drives impact. Traditional evaluation approaches cannot keep up with the demand for timely, actionable insights.
This is where artificial intelligence (AI) is beginning to change the landscape. By rapidly processing vast and complex data sets, identifying patterns, and synthesising evidence at scale, AI can help evaluators and policy makers understand what to prioritise. For governments with shrinking budgets and rising needs, this presents both strategic opportunities and new responsibilities.
Here’s what policy makers need to know.
How AI can help evaluate development projects
As development budgets tighten, the demand for high-quality evidence is growing faster than traditional evaluation approaches — such as manual reviews, field-based data collection, and synthesis processes — can deliver. Evaluators must make sense of vast, complex, and often fragmented information streams while responding to rapidly changing contexts.
AI can help bridge the gap between evidence demand and delivery. It offers strong potential throughout the evaluation cycle, from automating systematic reviews to enhancing data analysis. Large language models (LLMs) are well suited to these tasks: they can process and analyse complex datasets, identify patterns, and generate insights that human evaluators might overlook. These tools can accelerate evidence synthesis and help decision makers focus on what matters most, whether assessing project effectiveness, identifying emerging risks, or spotting replicable success factors.
Recent international collaboration has begun to explore how this potential can be responsibly harnessed. The UK Foreign, Commonwealth and Development Office (FCDO) and Global Affairs Canada recently convened experts from across sectors to explore how AI could reshape evaluation and knowledge management. In June 2025, representatives from academia, international organisations and civil society met to explore ways to leverage AI to produce better and fairer evidence. These discussions produced the Cape Town Consensus: joint commitments to advance the use of AI, with a focus on equity and ethics.
Setting AI up for success
Early AI adopters are already demonstrating what this could look like. Finland’s Ministry for Foreign Affairs has launched OpenEval, an AI-assisted platform for accessing evaluative evidence. The UN Sustainable Development Group’s System-Wide Evaluation Office (SWEO) created an AI tool to map and summarise evaluations. To ensure that learnings from these and other programmes are widely shared, the Network on Development Evaluation of the OECD Development Assistance Committee (EvalNet) is documenting use cases from across the development co-operation system. Each case study includes guidance to enable the replication of successful initiatives.
Early test cases, including the above, suggest that three criteria must be met for AI to successfully support evaluation efforts:
1. A foundation of interdisciplinary collaboration. Building effective tools requires considerable upfront investment and collaboration among evaluators, information technology specialists, AI experts, and evidence users. This helps ensure that tools are not only technically sound but also aligned with evaluation needs and ethical standards. Private sector involvement is essential to promote the development of products that meet the quality and ethical standards required in development co-operation, alongside government efforts to provide an enabling policy environment and ensure that innovation aligns with public priorities.
2. Sensitivity to context. AI tools must be sensitive to local realities, including biases and linguistic diversity. AI-powered evidence synthesis platforms, often trained on biomedical databases and English-language publications, tend to overlook research from the Global South and have limited capacity to process non-Western languages. Their reliance on written resources in Western languages also makes it difficult to capture oral evidence, which is vital for integrating indigenous knowledge. Under-represented sectors and stakeholders should be more closely involved in developing these tools, and the early involvement of evaluators can help ensure these nuances are not lost in algorithmic processes. This not only strengthens equity in the evidence ecosystem but also fosters an enabling environment for locally led evaluations.
3. Trust in the tools. Without trust in how AI handles data and generates its findings, uptake will remain limited. Data from a 2024 EvalNet survey, conducted with the UK FCDO and Global Affairs Canada, show that mistrust remains a major barrier to AI adoption, whether due to data safety concerns or doubts about the quality of outputs. Building trust requires shared standards for AI use, ethical guidelines, and clear governance frameworks. Initiatives such as the OECD Principles for Trustworthy AI, the first intergovernmental standards on AI, are central to responsible adoption.
When it comes to development evaluation, automation must complement, not replace, critical thinking, contextual knowledge, and stakeholder engagement. This is not necessarily straightforward: integrating AI into evidence ecosystems requires leadership, sound governance and attention to ethics. Here OECD guidance on Applying Evaluation Criteria Thoughtfully provides a useful reference point, reminding us that technology must serve evaluative principles such as relevance, effectiveness, and inclusion. If used wisely, AI can make it easier for policy makers to find the evidence they need to make informed decisions. At a time of shrinking development budgets and growing development needs, this represents a critical opportunity.