Evaluation in International Organizations

January 24, 2024

Evaluation has been part of the management policy of international organizations since they were founded. In earlier times, evaluation seemed to grow out of a concern for auditing. The function of external auditors has been to reassure Member States that funds have not been misused.

It is no coincidence that in many organizations, the evaluation function is in the same organizational location as internal audit. This is the case with the United Nations Secretariat and specialized agencies like the IAEA. One of the organizational concerns is to ensure that evaluations are independent. In the World Bank, the Independent Evaluation Group (IEG, formerly the Operations Evaluation Department) is not linked with internal audit and, moreover, does not report to the President of the Bank. To ensure its independence, it reports directly to the Board of Directors and its head cannot be re-appointed. In the United Nations, the Under-Secretary-General for Internal Oversight Services is appointed by the Secretary-General, following consultations with Member States, and approved by the General Assembly, and has a five-year, non-renewable term that overlaps that of the Secretary-General. Two of the readings describe the organization and functioning of evaluation in the UN and the IAEA.

This is consistent with one of the purposes of evaluation, which is to show that the organization's work produces results, and if it does not, that this is detected and corrected. For the demonstration to be credible, the evaluators have to be credible as well and one means is to make them organizationally independent.

Evaluation in the programming and budgeting cycle

While evaluation has been an integral element of program planning and budgeting in the United Nations system, its integration has been somewhat difficult historically. In the United Nations proper it is part of the Regulations and Rules on Program Planning, Budgeting, Monitoring and Evaluation that have been in place for well over 20 years (and were revised in 2002 and again in 2016). The difficulty with that system was that the evaluations took place too late to have a reasonable effect on programming. (Typically, the next program budget is produced when the current one has just begun, and evaluations have to be made after the period has ended. Thus an evaluation can only affect programming for two biennia later.) It was also difficult to determine what to evaluate and as a result, the evaluations had little effect.

This has changed with the institution of results-based programming and budgeting in organizations of the UN system. This did not mean that the organizations previously were not interested in results. The results-based approach means that the organizations have to indicate what outcomes and impact they expect their programs and projects to have. Its purpose is to move the focus from inputs and outputs to what the outputs are supposed to produce in the way of changes. Moreover, it is intended to remove the incentive for governments to micromanage international organizations by giving managers flexibility in exchange for clear accountability. In this system, evaluation is critical: it validates whether the promised outcomes have actually been achieved. If they have not, managers must be held responsible.

Something has gone terribly wrong

One function of evaluation is to derive lessons from disasters. This is a kind of reactive evaluation. A classic case is the evaluation of what went wrong in Rwanda, where the peacekeeping process failed to prevent the genocide. This was undertaken by an independent commission. As a result of this and other incidents, the Secretary-General commissioned a larger evaluation of peacekeeping operations, under Ambassador Lakhdar Brahimi. The Brahimi report made far-reaching recommendations. Even before these, an evaluation conducted by the OIOS Evaluation Unit had found serious problems in the planning and start-up of peacekeeping operations.

Examples of this use of evaluation abound: the UNHCR commissioned evaluations of a number of its operations, as did the World Food Programme. The IAEA, in the wake of the First Gulf War, did an evaluation of its inspection procedures. In all of these cases, the evaluations were intended to recommend changes that would avoid problems in the future.

Something has gone wonderfully right

Most evaluations in the United Nations system are not reactions to what has gone wrong, but are intended to show what has gone right, learn from it and apply it to improved performance in the future. The motivation, for professionals in the organizations, is to prove that their approach, or issue, is the correct one.

My own involvement in United Nations evaluation began with that concern. After I completed my assignment in Venezuela, I returned to the United States to finish my dissertation and then spent three years teaching at the University of Washington. I rejoined the United Nations in the unit that had backstopped the Venezuela project. The unit, the Regional and Community Development Section of the Social Development Division of the Department of Economic and Social Affairs, mostly managed technical assistance to countries trying to use a community development approach to integrated rural development. It had discovered two things: the approach was very difficult to implement successfully because of the complex variables involved, and there was very little evidence of its success, mostly because the programs did not undertake evaluations.

Recalling the lessons I had learned from the evaluation of the Venezuelan agrarian reform, namely that if good monitoring and evaluation data could be acquired and applied, program effectiveness could be increased, I convinced the head of the section and the principal technical advisor for projects that we should try to build an evaluation component into all of the programs we supported with our technical assistance projects.

The intent was for us to help provide the programs with the tools to demonstrate their effectiveness and thus both protect them and help attract additional funding. For us, the outcome would be that the demonstrated success of the programs would show why our projects, and therefore our approach, were desirable for the United Nations.

Here, however, we had to confront the first hurdle that any international evaluation has to face. We could not evaluate our projects if our counterparts did not evaluate their programs. Our effect was indirect. We could provide inputs into a government program in the form of expertise, training and equipment, but the result depended on what the government agency concerned did with them.

Over the three years from 1971 to 1974, I undertook missions to projects in Brazil, Paraguay, Venezuela, Panama, Mexico and Egypt to help rural and community development programs design evaluation systems. I became convinced that, if these were introduced, the programs would be able to detect problems in time to correct them and demonstrate their results. I was also convinced that a successful evaluation system required three things: sound empirical methodology, an ability to produce findings quickly and a relatively low cost.

Eventually, the results of this field work were put together, with the help of colleagues in the Section, in the manual Systematic Monitoring and Evaluation of Integrated Development Programmes, which I am using as a kind of text for this course.

Most international development agencies have had to deal with the fundamental problem: they cannot themselves evaluate the programs they are assisting. Instead, they must induce their national counterparts to do so. This is why most of the main agencies have developed evaluation material and training courses, which we are using in this course.

Types of evaluation in international organizations

In the end, there are a number of types of evaluation done by international organizations. They are distinguished by the extent to which they can be done by the organizations themselves.

At the top are evaluations of what the organizations do directly. These determine the direct result of outputs or services produced by an organization. The easiest type (conceptually) is the evaluation of whether the policy analysis done leads to decisions by Member States at the intergovernmental level. Such evaluations are complicated by the fact that, in the orthodoxy of international organizations, secretariats are supposed to be invisible: States do things, and the secretariats are "just there". In fact, secretariats play a significant role in structuring the kinds of decisions on which Member States reach agreement.

An example is an evaluation done by the UN's OIOS of the effectiveness of support provided by the Department of Economic and Social Affairs to the Economic and Social Council. The question was what role the various reports produced (at a certain cost) on substantive matters played in the decisions taken by the Council. The evaluation showed that, depending on the way in which the process unfolded, the Secretariat work could make decisions easier. One of the empirical indicators was the proportion of recommendations made in Secretariat reports that eventually were adopted in Council resolutions.
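To make that indicator concrete, here is a minimal sketch of how such a proportion could be computed. It is an illustration only, not the method OIOS used: the Recommendation record, its fields, the adoption_rate function and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recommendation:
    """One recommendation made in a Secretariat report (hypothetical record)."""
    report: str    # made-up report identifier, for illustration only
    adopted: bool  # True if a Council resolution later took it up


def adoption_rate(recs: List[Recommendation]) -> Optional[float]:
    """Share of recommendations that ended up in Council resolutions.

    Returns None when there are no recommendations, because the
    proportion is undefined in that case.
    """
    if not recs:
        return None
    adopted = sum(1 for r in recs if r.adopted)
    return adopted / len(recs)


# Invented sample data: four recommendations, three of them adopted.
sample = [
    Recommendation("Report A", True),
    Recommendation("Report A", False),
    Recommendation("Report B", True),
    Recommendation("Report B", True),
]

print(adoption_rate(sample))
```

The hard part in a real evaluation is not the arithmetic but deciding, case by case, whether a resolution actually "adopted" a recommendation. That judgment has to be made and recorded before any proportion is calculated.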

At the next level are evaluations of whether the kinds of practices promoted by organizations are adopted by national counterparts. These are less direct results, but are important. An example in the IAEA is the use of certain techniques of reactor safety that the Agency recommends to national nuclear authorities. If the techniques are sound and the Agency promotion methods are effective, these techniques will come into common use.

I have led an independent evaluation of the evaluation function of the International Labour Organization, which shows how that organization uses evaluations in both programs and projects. I led a similar evaluation of the Global Programme against Money-Laundering (GPML) of the United Nations Office on Drugs and Crime. It also connected programs and projects to deal with a complex international problem.

At the most distant level are the evaluations done by national counterparts of programs in which the international organization has a part. These are the most difficult, both conceptually and operationally. For the national counterpart, the evaluations must be able to show how well the program is doing in a national environment that is itself complex. For the international organization, this means building in elements that can show the role the international input played in the program's success or failure.

In all cases, the key to success in evaluations is how they are planned. The first question is about validity: do the evaluations measure what they purport to measure? The answer lies in determining what should be measured, which means determining what is really expected from the program being evaluated.

That is our second theme this week.
