EVALUATING TRAINING EFFECTIVENESS: AN INTEGRATED PERSPECTIVE IN MALAYSIA
Lim Guan Chong Master of Business Administration (Finance)
International Graduate School of Management Division of Business and Enterprise University of South Australia
Submitted on this 5th of August in the year 2005 for the partial requirements of the degree of Doctor of Business Administration UNIVERSITY OF SOUTH AUSTRALIA
University of South Australia DOCTOR OF BUSINESS ADMINISTRATION
PORTFOLIO SUBMISSION FORM
Name: Lim Guan Chong
Student Id No: 0111487H
Dear Sir/Madam
To the best of my knowledge, the portfolio contains all of the candidate's own work completed under my supervision, and is worthy of examination.
I have approved for submission the portfolio that is being submitted for examination.
Signed:
Dr Travis Kemp/Professor Dr Leo Ann Mean
Date
Supported by:
Dr Ian Whyte, Chair: Doctoral Academic Review Committee, International Graduate School of Business
Date
DBA Portfolio Declaration
I hereby declare that this paper submitted in partial fulfillment of the DBA degree is my own work and that all contributions from any other persons or sources are properly and duly cited. I further declare that it does not constitute any previous work whether published or otherwise. In making this declaration I understand and acknowledge any breaches of the declaration constitute academic misconduct which may result in my expulsion from the program and/or exclusion from the award of the degree.
Signature of candidate:
Lim Guan Chong
Date: 5th August 2005
TABLE OF CONTENTS
Portfolio Submission Form
Portfolio Declaration
Acknowledgements
Overview
1 Research Paper 1: Methodological Issues In Measuring Training Effectiveness
1.1 Abstract
1.2 Introduction
1.3 Approaches to Training Evaluation
1.3.1 Discrepancy Evaluation Model
1.3.2 Transaction Model
1.3.3 Goal-Free Model
1.3.4 Systemic Evaluation
1.3.5 Quasi-Legal Approach
1.3.6 Art Criticism Model
1.3.7 Adversary Model
1.3.8 Contemporary Approaches - Stufflebeam's Improvement-Oriented Evaluation (CIPP) Model, 1971
1.3.9 Cervero's Continuing Education Evaluation, 1984
1.3.10 Kirkpatrick Model, 1959a, 1959b, 1960a, 1960b, 1976, 1979, 1994, 1996a, 1996b, 1998
1.4 Critical Review
1.5 Future Research
1.5.1 The Transfer Component
1.5.2 Evaluating Beyond the 4 Levels
1.5.3 Incorporating Competency-based Approach into Training Evaluation
1.5.4 Multi-Rater System in Training Evaluation
1.6 Conclusion
1.7 References for Paper One
2 Research Paper 2: Evaluating Training Effectiveness: An Empirical Study of Kirkpatrick Model of Evaluation in the Malaysian Training Environment For The Manufacturing Sector
2.1 Abstract
2.2 Introduction
2.3 Training Practices in Malaysia
2.4 Practice of Evaluation in Training
2.5 Training Evaluation Practices In Malaysia
2.6 Methodology of Study
2.6.1 Questionnaire Construction
2.6.2 The Sample and Sampling
2.6.3 Questionnaire Responses
2.7 Findings and Discussion
2.8 Limitations of Study
2.9 Conclusion
2.10 References for Paper Two
2.11 Appendix A: The Questionnaire for Research Paper Two
3 Research Paper 3: Multi-rater Feedback For Training And Development: An Integrated Perspective
3.1 Abstract
3.2 Introduction
3.3 The Use of Multi-rater Feedback
3.4 The Effectiveness Of Multi-rater Feedback For Development
3.5 The Effectiveness Of Multi-rater Feedback For Appraisal
3.6 The Variation Of Multi-rater Feedback Information
3.7 Multi-rater Feedback Practices In Malaysia
3.8 Integrating Multi-rater Feedback With Developmental Tool
3.9 Multi-rater Feedback: Process Consultation As A Developmental Tool
3.10 Micro Perspective Of Conversation Theory In Process Consultation
3.11 An Integrated Approach for Post Multi-rater Feedback Development
3.12 Conclusion
3.13 References for Paper Three
Acknowledgements
I am sincerely grateful to my supervisors, Dr Travis Kemp and Professor Leo Ann Mean, who have been so supportive, taking the time to look through my papers and giving me tremendously useful feedback and suggestions. My thanks also go to my spouse, Linda Liew Mei Ling, who acted as my research assistant and gave her late nights and thoughtful moral support throughout this endeavor. Special thanks are reserved for my friends who acted as my proofreaders and never let me produce less than the best I had to offer. In particular, my sincerest thanks go to my respondents, relatives, family and other parties who have supported me along the way and helped me find the time to complete my thesis. Finally, my utmost appreciation goes to the University of South Australia, International Graduate School of Management, for its support and enthusiasm for achieving excellence in education.
Lim Guan Chong
Overview
Most organizations recognize that training must be a worthwhile effort: it must yield returns in labour productivity. Yet evaluation is possibly the least developed aspect of the training cycle. This research portfolio examines the effectiveness of Kirkpatrick's four levels of evaluation, with emphasis on assessing the methodology from a training perspective.
Evaluating training is typically linked with measuring change and quantifying the degree of change that leads to performance. Measuring the gains in organizational effectiveness that result from training interventions is probably the most difficult task in training evaluation.
This research portfolio, as a partial fulfillment of the requirement of the degree of Doctor of Business Administration, develops a series of ideas that expand on traditional approaches to training evaluation. The research portfolio is divided into three papers.
Paper 1 critically reviews the methodological problems faced when adopting the evaluation model developed by Donald L Kirkpatrick in 1959. A series of industry studies shows little application of this approach. The literature provides little understanding of the transfer-of-learning component when the Kirkpatrick model is used to determine training effectiveness. Most current researchers find that the future of research on training evaluation lies in the effectiveness with which learned skills are transferred. Through an anatomy of this classical theory, this research portfolio aims to address the model's weaknesses by re-focusing on the transfer of learning as a major key to unlocking the model's practicality and validity.
Paper 2 adopts a survey method to track the history, rationale, objectives, implementation and evaluation of training initiatives in the Malaysian manufacturing sector. It uses survey research to triangulate reliable and convincing findings.
The research looks at the extent to which the Kirkpatrick model is practised in the Malaysian manufacturing sector. This paper reports on the practice of Kirkpatrick's four levels of evaluation and the effectiveness of this evaluation model within that sector.
Paper 3 is on the effective use of the multi-rater feedback system in providing multi-source information and creating self-awareness based on individual strengths and weaknesses. One underlying rationale for such systems is their potential impact on the individual's self-awareness, which is thought to enhance performance at the development stage.
This paper serves as a conceptual paper, which studies how multi-rater feedback could effectively lead to a successful developmental process through process consultation in the context of the Malaysian training environment. Over the years, a training evaluation culture has not been properly developed in Malaysia. A comprehensive approach is necessary for organizations to see the benefits of conducting pre-training analysis. This should be followed by an effective development plan so that a comprehensive training approach can be instilled in the Malaysian environment.
The process consultant holds the key to effective development process by using a multi-rater assessment as a pre-training gap analysis. Process consultation provides the opportunity to 'check and balance' the degree of learning and development activities through reflections, problem solving capabilities and application of theory throughout the developmental process. Good conversation was introduced as an intervention tool to complement double loop learning during process consultation.
This portfolio systematically discusses the issue of training evaluation faced by the Malaysian manufacturing sector. It proposes that an integrated approach, comprising preliminary and post assessment using multi-rater feedback, followed by a developmental process using process consultation and complemented by good conversation as an intervention tool, may serve as a rational balance between the financial outlay on training and the development outcome.
Research Paper 1
METHODOLOGICAL ISSUES IN MEASURING THE TRAINING EFFECTIVENESS
Lim Guan Chong Master of Business Administration (Finance) University of Hull
International Graduate School of Management University of South Australia
Methodological Issues in Measuring Training Effectiveness Lim Guan Chong International Graduate School of Management University of South Australia
1.1 Abstract
This literature review examines the effectiveness of, and the methodological issues related to, Kirkpatrick's four-level model of evaluation and its application to training. The paper first examines the extent to which Kirkpatrick's evaluation model has been used by organizations to measure learning outcomes, reactions towards development, transfer of learning, change of behavior and return on investment after training. Research was conducted to determine the weaknesses of this model that most practitioners face. This classical theory was then examined in order to address those weaknesses by re-focusing on the transfer of learning as a key to unlocking the model's practicality and validity.
1.2 Introduction
Training evaluation is regarded as an important human resource development strategy. However, there seems to be widespread agreement that systematic evaluation is the least well executed training activity. Chen and Rossi (1992) commented that the evaluation knowledge found in the literature has not been fully utilized in program evaluation. This suggests that training evaluation has not been culturally embedded in most organizations. One reason could be that companies lack the knowledge to conduct training evaluation. A second is that the available training evaluation models do not provide a complete approach to effective training evaluation. This is further evidenced by a study on the benefits of training in Britain, which revealed that 85 percent of British companies make no attempt to assess the benefits gained from undertaking training (HMSO, 1989).
Since evaluation started in the area of education, most of the early definitions were in that area. Tyler (1949) was the first researcher to define evaluation as a process of determining to what extent the educational objectives are actually being realized by the curriculum and instruction. The early researchers emphasized the attainment of objectives as an important element in determining the effectiveness of any program. This is reflected in the study by Steel (1970), who compared the effectiveness of a program with its cost. Boyle and Jahns (1970) defined evaluation as the determination of the extent to which the desired objectives have been attained or the amount of movement that has been made in the desired direction. Provus (1971) went further and conceptualized the need for a certain standard of performance as an objective-based criterion for judging the success of the program. His model compares this preset standard with what actually exists. Noe (2000) defined training evaluation as the process of analyzing the outcomes needed to determine whether training was effective. Goldstein and Ford (2002), however, were of the opinion that evaluation is a systematic collection of descriptive and judgemental information necessary to make effective training decisions related to the selection, adoption, value, and modification of various activities.
After many in-depth studies of training evaluation, and with high expectations of cost-effectiveness from training, the term evaluation has been given a broader perspective: it no longer focuses only on achieving program objectives but mainly covers the methodological element of evaluation (Brinkerhoff, 1988; Goldstein, 1986; Junaidah, 2001; Shadish & Reichardt, 1987; Stufflebeam & Shinkfield, 1985). The goal-based process now forms only part of the overall evaluation process, unlike in the past, when researchers used one preferred methodological principle to assess the degree to which training had attained its goals. With the availability of a wider range of philosophical principles and scientific methodologies, many social scientists emphasized scientific rigor in their evaluation models, and this is reflected in their definition of the field (Junaidah, 2001). The evaluation models of these social scientists primarily involve the application of scientific methodologies to study the effectiveness of programs. These evaluators emphasized the importance of experimental designs (Goldstein & Ford, 2002), quantitative measures (Rossi & Freeman, 1993) and qualitative assessment (Wholey, Hatry & Newcomer, 1994). Contemporary social scientists such as Cascio (1989), Mathieu and Leonard (1987), Morrow, Jarrett and Rupinski (1997) and Tesoro (1998) even adopted utility analysis in evaluating the worth and effectiveness of programs.
In brief, the concept of evaluation has two distinct definitions: the congruent and the contemporary (Junaidah, 2001). The congruent definition is more concerned with meeting the desired objectives. It is a process of collecting information, judging the worth or value of the program and ensuring that training objectives are met. The contemporary definition of evaluation places emphasis on scientific investigation to facilitate decision-making. Stufflebeam (1971) described evaluation as the process of delineating, obtaining and providing useful information for judging decision alternatives. This shift can be seen in the evolution from the models of the early 1970s to the current contemporary evaluation models.
1.3 Approaches to Training Evaluation
Evaluation in its modern form has developed from attempts to improve the educational process (Bramley, 1996). Evaluating the effectiveness of people became popular at about the same time as scientific management, and school officials began to see the possibility of applying these concepts to school improvement (Bramley, 1996). Tyler's (1949) model is generally considered an early prominent evaluation model; it was designed to compare the value of progressive high-school curricula with that of more conventional ones (Stufflebeam & Shinkfield, 1985).
Tyler (1949) introduced the Basic Principles of Curriculum and Instruction, which is organized around four main concerns:
What educational purposes should the organization seek to attain?
How can learning experiences be selected that are likely to be useful in achieving these purposes?
How can the selected learning experiences be organized for effective instruction?
How can the effectiveness of these learning experiences be evaluated?
Tyler laid the foundation for an objective-based style of evaluation. Objectives were seen as critical because they were the source for planning, guiding the instruction and preparing the test and measurement procedures. Tyler's objective-based evaluation model concentrates on clearly stated objectives and shifts the evaluation from an appraisal of students to an appraisal of programs. He defined evaluation as assessing the degree of attainment of the program objectives. Decisions made on any program had to be based on the goal congruence between the objectives and the actual outcomes of the program (Stufflebeam & Shinkfield, 1985).
1.3.1 Discrepancy Evaluation Model
The Discrepancy Evaluation Model, developed by Provus (1971), is used in situations where a program is examined through its development stages, with the understanding that each stage (which Provus defines as design, installation, process, product and cost-benefit analysis) is measured against a set of performance standards (objectives). The cost-benefit analysis identifies the potential benefits of the training before it is carried out. The expected behaviours which result from the training are agreed upon between the trainer and the trainees. The analysis also establishes training objectives, which are defined as changes in work behaviour and increased levels of organizational effectiveness (Bramley & Kitson, 1994). The program developers have certain performance standards in mind regarding how the program should work and how to identify whether it is working. The discrepancies observed between the standards and the developed design are communicated back to the relevant parties for review or further corrective action. A discrepancy evaluator's role is to determine the gap between what is and what should be. This model helps evaluators to make decisions based on the difference between preset standards and what actually exists (Boulmetis & Dutwin, 2000).
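As a purely illustrative sketch of the comparison described above, the gap between "what should be" and "what is" can be computed stage by stage. The 0-100 scoring scale, the tolerance parameter, the class and function names, and all of the numbers are assumptions introduced here for illustration; Provus's model does not prescribe them.

```python
from dataclasses import dataclass

# Stage names follow the five stages attributed to Provus in the text.
STAGES = ("design", "installation", "process", "product", "cost-benefit analysis")


@dataclass
class StageReview:
    stage: str       # one of STAGES
    standard: float  # preset performance standard (hypothetical 0-100 score)
    actual: float    # what the evaluator observed, on the same scale

    @property
    def discrepancy(self) -> float:
        # Gap between "what should be" and "what is"; positive means a shortfall.
        return self.standard - self.actual


def stages_needing_review(reviews, tolerance=0.0):
    """Return the stages whose shortfall exceeds the tolerance."""
    return [r.stage for r in reviews if r.discrepancy > tolerance]


# Illustrative numbers only; they are not taken from any study.
reviews = [
    StageReview("design", standard=80, actual=85),
    StageReview("installation", standard=75, actual=60),
    StageReview("process", standard=70, actual=70),
]
print(stages_needing_review(reviews))  # ['installation']
```

In this sketch only the installation stage falls short of its standard, so it is the one that would be referred back to the relevant parties for review or corrective action.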
Provus's Discrepancy Evaluation Model can be considered an extension of Tyler's earlier objective-based model, in that a set of performance standards must be derived to serve as the objectives on which the evaluation of the program is based. Furthermore, the model may also be viewed as having properties of both formative and summative evaluation (Boulmetis & Dutwin, 2000). The design stage comprises the needs analysis and program planning stages; installation and process are parts of the implementation stage, where formative evaluation is done; and the product and cost-benefit analysis stages comprise a summative evaluation stage.
Formative evaluation focuses on the process criteria to provide further information for understanding the training system so that the intended objectives are achieved (Goldstein & Ford, 2002). Brown, Werner, Johnson and Dunne (1999) note several potential benefits of formative evaluation. The program can be assessed halfway through to see whether it is on track, whether it is being performed effectively, and whether the activities are meeting the needs of the training. The evaluator determines the extent to which the program is running as planned, measures the program's progress in attaining the stated goals, and provides recommendations for improvement. The evaluation findings in these reports and the monitoring data could even be used to end a program in midstream (Goldstein & Ford, 2002). Unlike formative evaluation, summative evaluation is fairly stable and does not allow adjustments during the program cycle. Summative evaluation involves evaluating and determining whether the program has experienced any unplanned effects. It helps organizational decision makers decide whether to use the program again or improve it in some way. Campbell (1988) distinguishes between two types of summative evaluation: the first simply questions whether a particular training program produces the expected outcome, while the second compares and investigates the benefits and viability of programmed instruction procedures. Comparing the two showed that programmed instruction produces quicker mastery of the subject, but the eventual level of learning retention is the same with either technique (Campbell, 1988).
Provus's Discrepancy Evaluation Model provides information for establishing measures of training success by determining whether the actual content of the training material would develop knowledge, skill and ability (KSA) and eventually lead to successful job performance. However, too many subjective issues exist, especially in setting up the performance criterion. The chosen criterion is based on the relevance of three components (knowledge, skill and ability) which are necessary to succeed in the training and eventually on the job. Because modern approaches examine training programs with a multitude of measures, including participant reactions, learning, performance, and organizational objectives, training evaluators need to view the performance criteria as multidimensional (Goldstein & Ford, 2002). Training can best be evaluated by examining many independent performance dimensions. However, the relationship between measures of success should be closely scrutinized because the inconsistencies that occur often provide important insights into training procedures (Goldstein & Ford, 2002). Decisions and feedback processes depend on the availability of all sources of information. There are many different dimensions along which the performance criteria can vary. Issues such as the relevance and reliability of the criterion are important to consider should one wish to adopt the discrepancy evaluation model. There are several considerations in the evaluation of the performance criteria. These include acceptability to the organization, the networks and coalitions that can be built between trainees, and realistic measures (Goldstein & Ford, 2002).
Responsive approaches, as used in the goal-free model, are better evaluative approaches because there is considerable variation in what the objectives of a program are thought to be. Responsive approaches are a form of action research which involves the stakeholders in the data collection process (Bramley, 1996). The intention is not to attribute causality, but to gain a sense of the value of the program from different perspectives. The term "responsive evaluation" was first used by Stake (1977) to describe a strategy in which the evaluator is less concerned with the objectives of the program than with its effect in relation to the concerns of interested parties, namely the stakeholders.
The responsive approach involves protracted negotiations with a wide range of stakeholders in constructing the report. It is thus more likely to reflect their reality and be useful for them. However, the underlying philosophy of responsive evaluation is different from that of the goal-based approach. Evaluators are seen as subjective partners, and the evaluation is based upon a joint collaborative effort which results in findings being constructed rather than revealed by the investigation. Truth is a matter of consensus among informed parties. Facts have no meaning except within some value framework. Phenomena can only be understood in the context in which they are studied; generalization is not possible.
The suggested method intends to achieve progressive focus by giving more attention to emerging issues rather than seeking the truth. Legge (1984) introduced a model similar to goal-free evaluation which evaluates planned organizational change. The evaluation is a joint, collaborative process, which results in something constructed rather than revealed by the investigation. Legge (1984) suggests that instead of attempting evaluation as thoroughly monitored research, a contingency approach should be adopted. The contingency approach is used to decide which approach is more appropriate or best matches the functional requirements of the evaluation exercise. Campbell (1988) suggested that the internal validity of the scientific approach may not be so crucial. To increase internal validity, the legitimate stakeholders should agree on the evaluation approach. The emphasis on internal validity in the scientific approach frequently implies controlling key aspects of the context and many organizational variables. This may lead to rather simplified information which clients find difficult to use because it does not reflect their perception of organizational reality. Due to this strong polarity between practitioners and academics, not many responsive evaluations have been described in the training literature (Bramley, 1996).
1.3.2 Transaction Model
The Transaction Model developed by Stake (1977) affords a concentration of activity among the evaluator, participants and the project staff (Madaus, Scriven & Stufflebeam, 1986). This model combines monitoring with process evaluation through regular feedback sessions between evaluator and staff. The evaluator uses a variety of observational and interview techniques to obtain information, and the findings are shared with all the relevant parties to improve the overall program. The evaluator participates in, and contributes to, project activities. Besides striving for objectivity, evaluators in the transaction model also draw on subjective judgement. This model may have a goal-free or a goal-based orientation. Findings are shared with the staff of all the projects in order to improve both individual and overall projects (Boulmetis & Dutwin, 2000).
1.3.3 Goal-Free Model
Unlike early models, the goal-free model developed by Michael Scriven involves methodological studies and processes (Popham, 1974). The evaluation model examines how the program is performing and how the program could address the needs of the client population. Program goals are not the criteria on which evaluation is based. Instead, it is a data-gathering process which studies actual happenings and evaluates the effectiveness of the program in meeting the client's needs. The evaluator has no preconceived notions regarding the outcome of the program (as opposed to the goal-based model). Categories of evaluation emerge naturally from the evaluator's actual observation. Once the data have been collected, the evaluator attempts to draw conclusions about the impact of the program in addressing the needs of the stakeholders.
However, this model has its weakness in terms of its subjective measures. Some hold that the evaluator must be an expert in the relevant field, while others say that no expertise is better (Rossi & Freeman, 1993). Some researchers argue that an evaluator who is not familiar with the nuances, ideologies and standards of a particular professional area will presumably not be biased when observing and collecting data on the activities of a program. They maintain, for example, that a person who is evaluating a program to train dental assistants should not be a person trained in the dental profession. Other researchers allege that a person who is not aware of the nuances, ideologies and standards of the dental profession may miss a good deal of what is important to the evaluation. Both sides agree that an evaluator must attempt to be an unbiased observer, be adept at observation and be capable of using multiple data collection methods (Wholey, Hatry & Newcomer, 1994). This remains a topic of debate among many experts. Scriven suggested using two goal-free evaluators, each working independently, to address preconceived issues and reduce possible bias in evaluation (Scriven, 1991).
A study by O'Leary (1972) illustrates the importance of considering other dimensions of the criteria. She used a program of role-playing and group problem-solving sessions with hard-core unemployed women. At the conclusion of the program, the trainees had developed positive changes in attitude toward themselves. However, it also turned out that these changes were not accompanied by positive attitudes toward their tedious and structured jobs. These trainees apparently raised their levels of aspiration and subsequently sought employment in a work setting consistent with their newly found expectations. The trainees were thus leaving the job as well as experiencing positive changes in attitude. There are many other cases in which collecting a variety of criteria related to the objectives is the only way to evaluate the training program effectively (Goldstein & Ford, 2002). Goal-based evaluation has lost ground during the last 20 years because of the growing conviction that evaluation is actually a political process and that the various values held in society are not represented by an evaluative process which implies that a high degree of consensus is possible (Bramley, 1996).
Further studies by Parlette and Hamilton (1977) rejected the classical evaluation system, which focuses on an objective reality assumed to be equally relevant to all stakeholders, and acknowledged the diversity posed by different interest groups. They suggested "illuminative evaluation", which is concerned with description and interpretation rather than with measurement and prediction.
1.3.4 Systemic Evaluation
Systemic evaluation analyses the effectiveness of the whole system and enhances the interfaces between the sub-systems in such a way as to increase the effectiveness of the system. That is what the "system approach" sets out to do (Rossi & Freeman, 1993). The most comprehensive purpose of systemic evaluation is to find out to what extent training has contributed to the business plans of various parts of the organization and to consider whether the projected benefits outweigh the likely cost of training.
The main questions which this strategy sets out to answer are (Bramley, 1996):
Is the program reaching the target population?
Is it effective?
How much does it cost?
Is it cost effective?
These questions are used to derive facts about the program, for example by defining the size of the target population and working out the proportion that has attended the training, rather than opinions on whether useful learning has taken place. Effectiveness is difficult to measure, as the word may mean different things to different people. The model also seems to measure the quantity rather than the quality of what is being done.
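The factual side of these questions reduces to simple arithmetic, which can be sketched as follows. The function names, the example figures and the choice of cost per trainee as the cost measure are hypothetical and are introduced here only to illustrate the calculation; they are not taken from Bramley (1996).

```python
def coverage(attended: int, target_population: int) -> float:
    """Proportion of the target population that attended the training."""
    if target_population <= 0:
        raise ValueError("target_population must be positive")
    return attended / target_population


def cost_per_trainee(total_cost: float, attended: int) -> float:
    """Average cost of training one attendee."""
    if attended <= 0:
        raise ValueError("attended must be positive")
    return total_cost / attended


def benefits_outweigh_cost(projected_benefit: float, total_cost: float) -> bool:
    """True when the projected benefit exceeds the cost of the training."""
    return projected_benefit > total_cost


# Illustrative figures only, in an unspecified currency.
print(coverage(attended=120, target_population=400))        # 0.3
print(cost_per_trainee(total_cost=60_000, attended=120))     # 500.0
print(benefits_outweigh_cost(90_000, total_cost=60_000))     # True
```

None of these figures says anything about whether useful learning took place, which is the limitation the paragraph above points out.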
In the systems analysis model, the evaluator looks at the program in a systematic manner, studying the input, throughput and output (Rivlin, 1971). Inputs are elements that come into the system (i.e. clients, staff, facilities and resources). Throughput consists of things that occur as the program operates, for example, activities, client performance, staff performance, and adequacy of resources such as money, people and space. Output is the result of program-staff effectiveness, adequacy of activities, etc. The evaluator mainly examines the program's efficiency in light of these categories.
1.3.5 Quasi-Legal Approach
Quasi-legal evaluation operates in the manner of a court of inquiry. Witnesses are called to testify and tender evidence. Great care is taken to hear a wide range of evidence (opinions, values and beliefs) collected from the program. This approach is used mainly to evaluate social programs rather than to formally evaluate training or development activities. Quasi-legal evaluation was reported to be flawed by Porter and McKibbin (1988) in the area of management education in the USA. The substantial information received from stakeholders was analysed by a small group of professors from a business school. The students were basically satisfied with the qualification they had obtained and found the course worthwhile and useful. However, the researchers objected that young graduates who attend MBA courses have never worked in an organization and thus do not understand the sort of issues which should be the basic discussion material of MBA courses. A similar problem arose with Constable and McCormick's (1987) report on the demand for and supply of management education and training in the UK. The researchers found that judgement by insufficiently impartial judges in the quasi-legal approach may be irrelevant, biased or inconclusive (Bramley, 1996).
1.3.6 Art Criticism Model In the Art Criticism Model developed by Eisner (1997), the evaluator is a qualified expert in
the nuances of the program and becomes the expert judge of the program's operation. The success of this model depends heavily upon the evaluator's judgment. The intended outcome
may come in the form of critical reflection and/or improved standards. This model could be
used when a program wishes to conduct a critical review of its operation prior to applying for funding or accreditation.
1.3.7 Adversary Model
In Owen's Adversary Model, the evaluator facilitates a jury that hears evidence from
individuals on particular program aspects (Madaus, Scriven & Stufflebeam, 1986). The jury uses multiple criteria to "judge" evidence and make decisions on what has happened. This model can be used when there are different views of what is actually happening in a program, such as arguments for and against program components.
1.3.8 Contemporary Approaches - Stufflebeam's Improvement-Oriented Evaluation (CIPP) Model, 1971
Stufflebeam considers that the most important purpose of evaluation is not to prove but to improve (Stufflebeam & Shinkfield, 1985). The four basic types of evaluation in this model are context (C), input (I), process (P) and product (P). Context evaluation defines the relevant environment and identifies training needs and opportunities relating to specific problems. Input evaluation provides information to determine how resources can be used most efficiently to meet program objectives. The results of input
evaluation are often seen as policies, budgets, schedules, proposals and procedures. Process evaluation provides feedback to individuals responsible for implementation. It is accomplished through providing information for preplanned decisions during implementation and describing what actually occurs. This includes reaction sheets, rating scales and content analysis. Ultimately, product evaluation measures and interprets the attainment of program
goals. Contemporary approaches could take place both during and after the program with the aim to improve program evaluation by expanding the scope of evaluation through its four basic types of evaluation (Madaus, Scriven & Stufflebeam, 1986). The CIPP model was conceptualized as a result of attempts to evaluate projects that had been
funded through the Elementary and Secondary Education Act of 1965 (Stufflebeam, 1983). To conduct a CIPP model evaluation, the evaluator needs to design preliminary plans and deal with a wide
range of choices pertaining to evaluation. This requires collaboration between clients and evaluators as a primary source for identifying the interest of the various stakeholders.
1.3.9 Cervero's Continuing Education Evaluation, 1984
In his book "Effective continuing education for professionals", Cervero suggested seven categories of evaluation questions organized around seven criteria to determine
whether the programs were worthwhile (Cervero, 1988). The seven criteria are (a) program design and implementation, (b) learner participation, (c) learner satisfaction, (d) learner knowledge, skills and attitudes, (e) application of learning after the program, (f) impact of application of learning and (g) program characteristics associated with outcomes. Program design and implementation is concerned with what was planned, what was actually
implemented and the congruence between the two. Factors such as the activities of learners and instructors and the adequacy of the physical environment for facilitating learning are common questions which are asked in this category.
Learner participation has both quantitative and qualitative dimensions. The quantitative dimension deals with evaluative questions that are most commonly asked in any formal
program. The data is not used to infer answers in the other categories. Qualitative data is collected in an anecdotal fashion by unobtrusively observing the proceedings of the educational activities.
Learner satisfaction is concerned with the participants' reaction and is collected according to various dimensions, such as content, educational process, instructor's performance, physical environment and cost.
Learner knowledge, skills and attitudes focus on changes in the learner's cognitive, psychomotor and affective goals. Normally, the evaluator will adopt a pen and paper test to
judge the effectiveness of these categories. Application of learning addresses the degree of skill transfer to the actual work place. The impact of application of learning focuses on the second-order effects, which means the transfer and impact on the public (Cervero, 1988).
Program characteristics are associated with the outcome of the program. There are two kinds of evaluative questions: the implementation questions and the outcome questions. Implementation questions are useful for determining what happened before and during the program. Outcome questions are useful for determining what occurred as a result of the program.
The seven categories in this model are not viewed as a hierarchy (Junaidah, 2001). Cervero's ideas have several antecedents in the evaluation literature. His framework was influenced by
Kirkpatrick's (1959) and Tyler's (1949) models. It is considered to be a comprehensive model as it covers all the stages from program design to program outcome. However, this evaluation model may be viewed as too tedious to implement due to its complexity. The author is too immersed in getting facts about the entire process and ignores the efficiency of the whole evaluation process. This makes the model more summative than formative in nature.
1.3.10 The Kirkpatrick Model, 1959a, 1959b, 1960a, 1960b, 1976, 1979, 1994, 1996a, 1996b, 1998
One of the most widely used models for classifying the levels of evaluation, used by Barclays
Bank PLC, Reeves in 1996 and others, was developed by Kirkpatrick. His model looks at four levels of evaluation, from the basic reaction of the participants to the training to its impact on the organization. The intermediary levels examine what people learned from the
training and whether learning has affected their behaviour on the job. Level one (Level 1) concerns itself with the most immediate reaction of participants and is easily measured by
simple questionnaires after the training. Level two (Level 2) is harder to measure and is concerned with measuring what people understood and how they were able to demonstrate
their learning in the work environment. Level two (Level 2) can be measured by pen and paper tests or through job simulations. Level three (Level 3) looks at the changes in people's behaviour towards the job. For example, after a writing skills course, did the individual make fewer grammatical and spelling errors and were their memos easier to understand? Level
four (Level 4) measures the "result" gained from the training. It focuses on the impact of the training on the organization rather than the individual.
Kirkpatrick (1959) developed this coherent evaluation model by producing what was thought to be a hierarchical system of evaluations which indicates effectiveness through:
Level 1 (Reaction)
Level 2 (Learning)
Level 3 (Behaviour)
Level 4 (Results)
Kirkpatrick's (1994) Training Evaluation Model
Reaction: How did the participants react to the training?
Learning: What information and skills were gained?
Behaviour: How have participants transferred knowledge and skills to their jobs?
Results: What effect has training had on the organization and the achievement of its objectives? (Timely and quality performance appraisals are a corporate goal.)

Kirkpatrick was the first researcher to develop a coherent evaluation strategy by producing what was thought to be a hierarchy of evaluations, which would indicate benefit (Plant & Ryan, 1994).
Level 1: Reaction Evaluation
Kirkpatrick proposed the use of a post-course evaluation form to quantify the reactions of
trainees. Evaluation at this level is associated with the terms "happiness sheet" or "smile sheet" because reaction information is usually obtained through a participatory questionnaire administered near or at the end of a training program (Smith, 1990). Studies on evaluation mechanisms have shown that such evaluation sheets are not held in high esteem, despite their general use by trainers of many organizations and in institutions
of higher learning (Bramley, 1996; Clegg, 1987; Love, 1991; Rae, 1986). Clegg (1987) found that training evaluation was conducted for 75 percent of training programs done in
organizations. A study by Dawson (1993) found that Level 1 evaluation sheets were ubiquitous.
Level 2: Learning Evaluation
The learning level is concerned with measuring the learning principles, facts, techniques and skills presented in a program (Kirkpatrick, 1994). Tyler (2002) found that 32 percent of companies in America have carried out post-training evaluation on Level 2.
Another study, conducted by Mathews, Ueno, Kekale, Repka, Pereira and Silva (2001) on 450 companies in the UK, Portugal and Finland, focused on training quality and training evaluation. It showed that 40 percent of UK companies, 31 percent of Finnish companies and 51 percent of Portuguese companies conduct formal assessment of learning of the principles, facts, skills and attitudes which were specified as training objectives. This level evaluates the knowledge, skills development and attitudinal changes that have
taken place. Examination of both knowledge and attitudinal outcomes is important to increase coverage of training impacts because the pattern of change can vary between the pre-test and post-test (Basadur, Graen & Scandura, 1986; Kraiger, Ford & Salas, 1993). Researchers have either assessed change before and after a program (Basadur et al., 1986; Bretz & Thompsett, 1992), or looked merely at the post-training attainment score
(Davis & Mount, 1984; Warr & Bunce, 1995). Measures of learning should be objective, with quantifiable indicators of how new requirements are understood and absorbed. This data is used to confirm that participant learning has occurred as a result of the training initiative (Phillips & Stone, 2002).
Level 3: Behavioural Evaluation
Job performance after training is referred to as behaviour by Kirkpatrick (1959, 1976)
and transfer by Alliger, Tannenbaum, Bennett, Traver and Shotland (1997). Level 3 evaluates the extent to which the "transfer" of knowledge, skills and attitudes has
occurred. Tyler (2002) reported that only 9 percent of American industries have carried out post-training evaluation at this level. The focal point is on performance at work after
a program. It is essential to record before and after performance, but sometimes self-reports are obtained if information is unavailable to an evaluator (Wexley & Baldwin, 1986). Level 3 evaluation determines the extent of change in behaviour that has taken place and how this behaviour would be transferred to the workplace. It further encourages one to take into account the possible factors in the job environment that could prevent the application of the newly learned knowledge and skills, since a positive climate is important for transfer.
Level 4: Results Evaluation
The evaluation of a particular training program becomes more complex as one progresses
through each level of the Kirkpatrick model. Results can be defined as the final results that occurred because the participants attended the training program. These include increased production, improved quality, increased sales and productivity, higher profits and return on investment. Level 4 evaluation observes changes in the performance criteria (i.e. key results areas) of organizational effectiveness. This level anticipates the gains the organization can expect from a training event. This level of evaluation is made more difficult as organizations often demand that the explanation be given in financial terms with measurable quantifiers (Redshaw, 2001).
Over the past 30 years, since Kirkpatrick's first idea was published in 1959, much debate has been recorded on this model. Despite criticism, the Kirkpatrick model is still the one most generally accepted by academics (Blanchard & Thacker, 1999; Dionne, 1996; Kirkpatrick, 1996a; 1996b; 1998; Phillips, 1991). However, research conducted in the United States has
suggested that US organizations generally have not adopted all of Kirkpatrick's 4-level evaluation (Geber, 1995; Holton, 1996). This is especially true for the last two, more
difficult, levels of Kirkpatrick's hierarchy (Geber, 1995). In a survey of training in the USA, Geber (1995) reported that for companies with 100 or more employees, only 62 percent
assessed behavioural change. Geber's (1995) results also indicated that only 47 percent of US companies assess the impact of training on organizational outcomes. This poses a good
research question about the model's methodology and it forms the basis for epistemological studies around the methodology.
Kirkpatrick's work has received a great deal of attention within the field of training evaluation (Alliger & Janek, 1989; Blanchard & Thacker, 1999; Campion & Campion, 1987; Connolly, 1988; Dionne, 1996; Geber, 1995; Hamblin, 1974; Holton, 1996; Kirkpatrick, 1959; 1960; 1976; 1979; 1994; 1996a; Newstrom, 1978; Phillips, 1991). His concept calls
for four levels of evaluation namely reaction, learning, behaviour and results. His four levels of training effectiveness stimulated a number of supportive and conflicting models of varying
levels of sophistication (Alliger & Janek, 1989; Campion & Campion, 1987). There are models and methods that incorporate financial analyses of training impact (Swanson &
Holton, 1999). However, Warr, Allan and Birdi (1999) conducted a longitudinal study of the first three levels of training evaluation. The study correlated the following: relationships between evaluation levels, individual and organizational predictors of each level and the
differential predictions of attainment vs change score. The study showed that immediate and delayed learning were predicted by the trainee's motivation, confidence and use of learning
strategies. The researchers highlighted that it is preferable to measure training outcomes in terms of change from pre-test to post-test, rather than merely through attainment (post-test) scores (Warr, Allan & Birdi, 1999).
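The distinction between attainment scores and change scores can be illustrated with a minimal sketch. The test scores below are hypothetical and are not drawn from Warr, Allan and Birdi (1999); the function name change_scores is likewise an illustrative assumption.

```python
from statistics import mean

# Hypothetical knowledge-test scores (0-100) for five trainees.
pre_test = [40, 55, 70, 35, 60]
post_test = [65, 70, 75, 60, 80]


def change_scores(pre, post):
    """Return the per-trainee change from pre-test to post-test."""
    if len(pre) != len(post):
        raise ValueError("pre and post must cover the same trainees")
    return [after - before for before, after in zip(pre, post)]


# Attainment looks only at the post-test result.
mean_attainment = mean(post_test)

# Change compares each trainee with his or her own starting point.
changes = change_scores(pre_test, post_test)
mean_change = mean(changes)

print("Mean attainment:", mean_attainment)
print("Change per trainee:", changes)
print("Mean change:", mean_change)
```

In this invented data set the third trainee has one of the highest attainment scores but the smallest change, which is the kind of difference an attainment-only measure would hide.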
A review of the most popular procedures used by US companies to evaluate their training programs showed that over half (52 percent) use assessments about participants' satisfaction
with the training. Seventeen percent assessed application of the trained skills to the job and 13 percent evaluated changes in organizational performance following the training. Five percent tested for skill acquisition immediately after training, while 13 percent of American companies carried out no systematic evaluation of their training programs (Mann &
Robertson, 1996). Many of these procedures reflect Kirkpatrick's four levels of reaction, learning, behaviour and results, which will be discussed further.
More than 50 available evaluation models use the framework of the Kirkpatrick model (Phillips,
1991). Currently, the majority of employee training is evaluated at Level 1. Evaluation at Level 1 is associated with the terms smile sheet or happiness sheet, because reaction
information is usually obtained through a participatory questionnaire administered near the
end or at the end of a training program (Smith, 1990). The specific indication of the smile
sheet or happiness sheet is enjoyment of the training, perceptions of its usefulness and its perceived difficulty (Warr & Bunce, 1995).
Phillips and Stone (2002) enhanced the popularity of the Kirkpatrick model by inserting a fifth level into the existing 4-level model, though they further argued the inadequacy of this
model in capturing the return on investment aspect of the training outcome. Phillips and Stone's (2002) 5-level evaluation model was seen as an extension of Kirkpatrick's 4-level evaluation model as different companies have their own definition of pay-offs to measure the
training results. Return on investment compares the training's monetary benefits with the cost of the training, so that the true value of the training to the organization can be assessed. Converting data to monetary values is the first phase in putting training initiatives on the
same level as other investments that organizations make (Phillips, 2002). However, return on investment cannot be used to cover other variables that may affect the results (i.e. culture, productivity, etc.). Kirkpatrick (1994) refuted this idea by claiming that there are many ways to measure training results. This raises the question of whether training evaluation should be viewed only as a measure of financial benefits. Lewis and Thornhill (1994) are of the opinion that there should be 5 levels of evaluation, measuring the effects of training on the department (i.e. Level 4) and its effects on
the whole organization (i.e. Level 5). Lewis and Thornhill (1994) emphasized the need to look at the values and culture of the organization as the variables to measure training effectiveness.
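The comparison of monetary benefits with training cost can be sketched as follows. This is one common formulation of return on investment (net benefits divided by costs, expressed as a percentage), offered as an assumption rather than as the exact procedure of Phillips and Stone (2002); the function names and all figures are hypothetical.

```python
def roi_percent(monetary_benefits, training_costs):
    """Net benefits divided by costs, expressed as a percentage."""
    if training_costs <= 0:
        raise ValueError("training_costs must be positive")
    net_benefits = monetary_benefits - training_costs
    return net_benefits / training_costs * 100


def benefit_cost_ratio(monetary_benefits, training_costs):
    """Total benefits per unit of cost."""
    if training_costs <= 0:
        raise ValueError("training_costs must be positive")
    return monetary_benefits / training_costs


# Hypothetical programme: benefits already converted to monetary values.
benefits = 150_000
costs = 100_000

print("ROI (%):", roi_percent(benefits, costs))
print("Benefit-cost ratio:", benefit_cost_ratio(benefits, costs))
```

The sketch also shows the limitation raised above: only outcomes that have been converted to monetary values enter the calculation, so variables such as organizational culture are left out.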
In recent times others have tried to make the system easier to deal with. Warr et al. (1999) came up with the context, input, reaction and outcome (CIRO) evaluation system, with the context part going some way towards front-loading the evaluation and partly towards
mirroring the Kirkpatrick model. Dyer (1994) proposed an evaluation system that suits all organizations, irrespective of size or diversity of operation. It is a system that is relatively easy to come to terms with and can be implemented at all the hierarchical stages of an
organization. It fits the individual and it fits the whole organization. The system puts
Kirkpatrick's evaluation system against a mirror. The benefits of using Kirkpatrick's Mirror should be self-evident to anyone involved in management. Application of the paradigm allows the individual to become more business focused, and if adopted universally should provide efficient and effective training throughout any organization (Dyer, 1994).
A different model was used in a study by Shireman (1991) on the evaluation of a hospital-based health education program. The study adopted the CIPP model in examining the type of evaluation which was being conducted in the hospital. A structured questionnaire was sent to a stratified random sample of 160 hospitals of four different sizes in four mid-western states. The results showed that 48 percent of the respondents reported that product evaluations were usually done and less than 25 percent reported that other types of evaluation (i.e. context, input, process) were done. The product evaluation is outcome-based and quite similar to Kirkpatrick's end process evaluation. Both types of evaluation require appropriate data collection activities.
The Kirkpatrick model has been used by most researchers as an initial framework for generating evaluation models. This paper addresses the methodological issues surrounding the taxonomy of
the Kirkpatrick model as an area for epistemological study. The theoretical and empirical literature on the Kirkpatrick model will be critically evaluated and further research opportunities will be outlined.
1.4 Critical Review
Phillips (1991) concluded that out of more than 50 evaluation models available, the evaluation framework that most training practitioners used is the Kirkpatrick model. Though the model seems to have weathered well, it has also limited our thinking on training evaluation and possibly hindered our ability to conduct meaningful training evaluation (Bernthal, 1995). More than ever, training evaluation must demonstrate improved performance and financial
results. But in reality, according to Garavaglia (1993), training evaluation often assessed whether the immediate objectives have been met; specifically, how many items were answered correctly on the post-test. Some based their evaluation only on trainee reaction, the first level of the Kirkpatrick model developed in 1959 (Brinkerhoff, 1988). Such information gave organizations no basis for making strategic business decisions (Davidove & Schroeder, 1992). Most practitioners are familiar with Kirkpatrick's 4-level evaluation model but many never seemed to get beyond Levels 1 and 2 (Regalbutto, 1992). Numerous organizations have adapted the model presented by Kirkpatrick to suit their own situations; the solution seems to cause the growth of generic models (Dyer, 1994).
Kirkpatrick called for a definite approach to the evaluation model. All 4 levels must be measured to ensure effectiveness of the whole evaluation system since each level provides different kinds of evidence.
This view was supported by Hamblin (1974), who suggested that reaction leads to learning and learning leads to change in behaviour, which subsequently leads to changes in the organization. He further stated that the chain can be broken at any link and that having a positive
reaction is necessary to create positive learning. According to Bramley and Kitson (1994), there is not much evidence to support this linkage. Further research carried out by Alliger and Janek (1989) found only 12 articles which attempted to correlate the various levels
advocated by Kirkpatrick. Although there are problems of external validity with such a small data set, the tentative conclusion was that there was no relationship between reaction and the
other three levels of evaluation criteria. A correlation study run on these four levels of evaluation showed insignificant results. A literature search based on Kirkpatrick's name yielded 55 articles, but only 8 described evaluation results and none described
correlations between levels (Toplis, 1993). This suggests that good reactions did not predict learning, behaviour or results.
A series of industrial surveys conducted in the last 30 years show little application of all 4 levels of the Kirkpatrick model. Surveys conducted since 1970 showed that most industrial trainers rely on student reaction, fewer on test learning and almost none on test application and benefit (Brandenburg, 1982; Plant & Ryan, 1994; Raphael & Wagner, 1972). In the last 20 years, a number of writers claimed to have performed a full Kirkpatrick evaluation; however, the linkages described in connecting the training event with the outcome are subjective and tenuous (Salinger & Deming, 1982; Sauter, 1980).
A survey conducted by the Bureau of National Affairs and American Society of Training and Development (ASTD) in 1969 using questionnaires indicated that most of the companies conducted Level 1 evaluation and unsystematic approaches to Level 2 evaluation (Raphael &
Wagner, 1972). The survey indicated that problems of evaluation at higher levels were mainly due to a lack of understanding of the approach used. The Kirkpatrick model seems to offer a one-size-fits-all solution to measure training effectiveness. However, there has been little evidence of the contribution and reliability of this model despite great industrial emphasis in this area.
The Kirkpatrick model focuses mainly on immediate outcomes rather than the process leading to
the results. The following questions were never successfully addressed, even though the improvement of these processes is the main force behind effectiveness (Murk, Barrett & Atchade, 2000):
- How a person's motivation level affects learning behaviour
- The degree of superiors' support after the training
- The extent to which training interventions were appropriate for meeting needs
- Longer-term effects of the training, the pay-off in determining a course's overall impact and cost-effectiveness
- The conduciveness of the training environment
An empirical study by Warr, Allan and Birdi (1999) showed that external processes like increasing confidence and motivation levels of trainees as well as use of certain learning
strategies are important contributing factors towards training effectiveness. A 2-day training course was studied on 23 occasions over a 7-month period in the Institute of Work Psychology, UK. Technicians who attended the training courses which involved operating electronic tools were asked to complete a knowledge test questionnaire on arrival and at the end of the course. A follow up questionnaire was mailed to the trainees one month later.
More than 70 percent of the respondents returned the questionnaire. The questionnaire was designed to capture what the researchers defined as third factors (i.e. confidence, perception, motivation, learning strategies, age, etc.). The results showed a non-significant correlation
between reactions towards the course and job behaviour. Perceptions of course difficulty were significantly negatively associated with frequency of use of equipment. Correlations between Level 2 and Level 3 evaluations were small. Learning scores and changes in those scores (Level 2) were strongly predicted by trainees' specific reactions to the course, but those reactions were not significantly associated with later job behaviour (Level 3) (Warr, Allan & Birdi, 1999).
Alliger and Janek (1989) carried out a meta-analysis of studies where reaction measures had been
related to measures of learning (11 studies) and changes in behaviour (9 studies). They found that positive reactions did not predict learning gains better than negative ones (the average
correlation between reactions and amount of learning was .02), nor were they any better at predicting changes in behaviour after the program (the average correlation was .07).
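A correlation of this kind can be sketched for a single study as follows. The data are hypothetical and the function pearson_r is an illustrative helper. A meta-analysis such as the one described above averages coefficients across many studies, a step this sketch does not attempt.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("samples must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical Level 1 data: reaction ratings on a 1-5 scale.
reaction_ratings = [4.5, 3.0, 4.0, 2.5, 5.0, 3.5]
# Hypothetical Level 2 data: learning gain (post-test minus pre-test).
learning_gains = [10, 12, 8, 11, 9, 13]

r = pearson_r(reaction_ratings, learning_gains)
print("Reaction vs learning gain: r =", round(r, 2))
```

A coefficient close to zero would indicate that favourable reactions tell the evaluator little about how much was learned.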
Bramley and Kitson (1994) asserted that measuring learning is problematic because designing a reliable measuring instrument is difficult and the necessary skills are often not available. Grove and Ostroff (1990) pointed out that training directors often do not possess the essential
skills to conduct training evaluation. This could be part of the reason why companies are reluctant to evaluate their training effectiveness.
Though Kirkpatrick's traditional assessment methods are widely used for Level 1 and 2
evaluations, the benefits of collecting data at each level are unclear. This uncertainty may result in organizations failing to evaluate training completely or selecting forms of evaluation
that may not be reliable. Inadequacies in the Kirkpatrick model at each level force one to look for other possible measures. Therefore, one may argue that to make the Kirkpatrick model definite, a more detailed assessment method must be conducted at each level to ensure practicality, validity and applicability (Mann & Robertson, 1996).
Mann and Robertson (1996) undertook to investigate the utility of various methods used in
evaluating training programs. Twenty-nine subjects were selected from a three-day training seminar for the European National Run in Geneva, Switzerland. The seminar was a computer training event (on e-mail and the Internet) for youth workers, and trainees were asked to complete training evaluation forms before and after the training program and by post one
month later. Sixteen people returned this final questionnaire. Each questionnaire contained three sets of questions designed to measure knowledge, attitudes and self-efficacy. The results showed doubt over the value of the data received from reaction and learning levels.
Recommendations were made based on the following findings:
- Measuring learning (Level 2) as a method of evaluating training effectiveness is important. The study showed that not all of what is learned immediately after training is retained one month later. This indicates that the practitioner should be aware of the short-term nature of training effectiveness.
- To ensure a more realistic evaluation at Level 2, one must be cautious about the pre- and post-course evaluation method proposed by Kirkpatrick. The time frame for learning to take place was never specified. An appropriate measuring model is necessary to determine the extent to which learning has taken place. In other words, the Kirkpatrick model lacks longitudinal considerations.
- Measuring changes in learning through data collection as prescribed by Kirkpatrick (in absolute terms) had no value in predicting how well a person can perform the skills attained from the training after a one-month period.
- A positive attitude has no bearing on how well a person can perform a trained task after a month. Reaction evaluations that show a positive attitude have no direct linkage to performance.
- However, individual self-efficacy did not decrease over time. Empirical studies have shown that self-efficacy correlates with actual performance (Kraiger, Ford & Salas, 1993). One might look at the possibility of measuring self-efficacy instead of reaction evaluation. In other words, self-efficacy offers more tangible results than reaction evaluation.
The failure of the Kirkpatrick model in Level 3 and Level 4 evaluation is due to the lack of a defined framework and of specific tools appropriate for measuring transfer of learning since its first introduction 40 years ago. It is necessary, at the most basic level, to have a body of case studies from which generalizations can be drawn and hypotheses formed. However, this body of information has not been published (Bramley & Kitson, 1994).
The issue here is whether or not the knowledge taught during training is being transferred or
demonstrated by the trainees on the job. The transfer component of training evaluation was examined by Olsen (1998) in a study conducted in 1996. Transfer is evidence of whether what has been learned is actually being used on the job for which it was intended.
The survey asked questions regarding how Kirkpatrick's 4-level evaluations were performed, what percentage of payroll was spent on training, how much training was actually transferred
to the job and what specific items would enhance the level of transfer. A content analysis was carried out on the 138 survey comments received on how the respondents made estimates of the percentage of transfer value they reported. Follow up interviews were also undertaken to provide additional clarification on responses and record impressions and opinions about
the data collection. The results showed that the percentage of transfer depended on the type of training. Technical training showed the best rate of transfer, whereas soft skills (interpersonal) do not transfer as readily and are not easily observed. Transfer is not so readily apparent in the affective work areas (Olsen, 1998).
Bramley (1996) offered an explanation why evaluation is not being carried out at the
behaviour and results levels. Traditionally, most trainers use individual and educational models of the training process. These models have their limitations, as the emphasis is on encouraging individuals to learn something rather than on finding uses (if any) for the learning.
1.5 Future Research
Bramley and Kitson (1994) argued that the problems of evaluation at Levels 3 and 4 were not
well understood because not enough evaluation of this kind has been carried out. This is because effective measurement methods for Levels 3 and 4 are not available and setting up the criteria for measuring these two levels is time-consuming. It is apparent that the incompleteness of the Kirkpatrick model lies in its Levels 3 and 4 of evaluation.
1.5.1 The Transfer Component
The transfer component is a potential area for future research. Transfer of training can be defined as the application of knowledge, skills and attitudes learned from training on the job and the subsequent maintenance of them over a certain period of time (Baldwin & Ford, 1988;
Xiao, 1996). This process does not appear to have received much attention since most organizations were apparently looking primarily at Level 1 and Level 2 evaluations. Early studies lacked a theoretical framework to guide these investigations (Baldwin & Ford, 1988).
A survey conducted by Cheng and Ho (1998) revealed that there were inconsistent findings
on the variables that promised positive training transfer. The main intention of further research is to develop common variables that are critical to different training and transfer situations, including the establishment of common scales or instruments that can be used in different research settings.
The current approach, which uses variables such as individual ability, motivation and environmental favourability, has had a profound effect on training transfer research (Noe & Schmitt, 1986). However, this approach raises the question of application, because individual differences (e.g. self-efficacy and locus of control) are expected to exert considerable influence on the transfer outcome (Cheng & Ho, 1998).
A longitudinal study would be a better way of measuring the effectiveness of the transfer of learning. It is argued that trainees who show similar levels of transfer performance shortly after training may differ substantially in the long run (Kraiger, Ford & Salas, 1993). Therefore, another major aspect of transfer research is to examine the level of newly acquired knowledge, skills or behaviour retained in the transfer setting after a longer period of time. For example, research should record changes in the level of skill proficiency as a function of time after training.
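As an illustration only, the idea of recording skill proficiency as a function of time after training can be sketched in a few lines of Python. The trainee names, the 0-100 proficiency scale and the follow-up months below are hypothetical values invented for this sketch; they are not data from any of the studies cited.

```python
from statistics import mean

# Hypothetical proficiency scores (0-100), keyed by months elapsed after training.
# Both trainees look identical at month 0 but diverge in the long run.
scores = {
    "trainee_a": {0: 80, 3: 78, 12: 75},
    "trainee_b": {0: 80, 3: 79, 12: 52},
}


def retention_ratio(series, baseline_month=0):
    """Express every score as a fraction of the score at the baseline month."""
    baseline = series[baseline_month]
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return {month: score / baseline for month, score in sorted(series.items())}


def mean_retention_by_month(all_scores):
    """Average the retention ratio across trainees for each follow-up month."""
    per_trainee = [retention_ratio(series) for series in all_scores.values()]
    months = sorted(set().union(*(ratios.keys() for ratios in per_trainee)))
    return {m: mean(r[m] for r in per_trainee if m in r) for m in months}


if __name__ == "__main__":
    for name, series in scores.items():
        print(name, retention_ratio(series))
    print("mean retention:", mean_retention_by_month(scores))
```

A short-term follow-up alone would not distinguish the two hypothetical trainees in this sketch; only the later measurement reveals the difference in retention, which is the argument for a longitudinal design.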
1.5.2 Evaluating Beyond the 4 Levels

In considering the above studies, an effective evaluation should measure beyond reaction, learning, behaviour and results. Lewis and Thornhill (1994) suggested that an effective training evaluation needs to be integrated with and matched to the culture of the organization. This integrated, culturally related approach is advocated because it minimizes the risk of not meeting the objectives of training at the input stage, as well as evaluating reactions and impact at the outcome stage. This brings a more strategic approach to identifying and prioritizing training needs in relation to organizational objectives.
To justify the training evaluation results, we may consider Brinkerhoff's (1987) criticism of the Kirkpatrick model, which concentrates only on the outcome of training. This is further supported by Bernthal (1995), who found it necessary to look for a broader linkage between training and the organizational context. Bernthal (1995) introduced the training-impact tree method for measuring organizational context. This is done by listing the barriers to training and the factors that facilitate training next to their associated values and practices, which are aligned with the organization's objectives.
Although the Kirkpatrick model focuses on the attainment of tangible outcomes, the question of measuring intangible outcomes related to training effectiveness must not be ignored. Kirkpatrick (1994) revisited his 4-level evaluation model and stated that as long as the evidence collected is beyond a reasonable doubt, one should be satisfied with it. Perhaps an experienced training practitioner may want to explore the possibility of integrating the absolute 4-level evaluation model with other process models. As a result, the gap that exists between short-term and long-term measures of training evaluation may be minimized. Future research may build upon deriving an integrated model that would complement both absolute and process evaluation of training effectiveness.
1.5.3 Incorporating Competence-based Approach into Training Evaluation

The aim of future research is to develop a comprehensive training evaluation by incorporating the absolute Kirkpatrick model with the competence-based process. The competence-based assessment system could be used in collecting sufficient evidence to determine whether individuals are performing competently in their jobs.
Strebler, Robinson and Heron (1997) identified two different meanings of the term competency: behaviours that an individual needs to perform a job, and minimum standards of performance. The term competency has been used to refer to both meanings, that is, behaviours and performance standards. Competence-based assessment helps to provide a behaviourist framework for learning in training evaluation. A behaviourist approach to learning provides simpler tasks for the trainer and clarity of outcome for the learner (Hoffmann, 1999). Another definition of competencies is the quality of outcome, which may be used to evaluate gains in productivity or efficiency in the workplace as a result of training (Strebler et al., 1997).
Sternberg and Kolligian (1990) further defined competency as the underlying attributes of a person, such as their knowledge, skills or abilities. This definition focuses on the inputs that individuals require in order to produce competent performances. This is aligned with the traditional training evaluation approach of measuring the knowledge, skills and abilities of a person after training. Rowe (1995) suggested that competence-based assessment, which evaluates the whole process of learning, should consist of:
Objective: The trainer should exhibit clear learning objectives and methods for obtaining those objectives.
Evidence: Evidence must be provided to indicate competent performance.
Observation: An assessor looks out for competent performance.
Peers' Comments: Comments are obtained from work colleagues, peers and customers.
The key point is that a competence-based model supplements knowledge-based achievements. Programs will be designed so that competence-based models build on knowledge-based achievement. In this way knowledge supports work, learning supports skill and theory supports practice (Rowe, 1995).
The competence-based method would be able to assess whether knowledge and skills learned are being effectively applied in the workplace and whether the trainee can now be described as competent after completion of a training program.
This integrated model could also be used prior to designing a training program in order to establish development needs and to determine training program content.
1.5.4 Multi-Rater Feedback System in Training Evaluation

There does not appear to be a distinct individual who founded or invented this process. According to Moses, Hollenbeck and Sorcher (1993), the term multi-rater feedback is misleading because it suggests a newly discovered concept, whereas they argue that the perceptions of people have been available for as long as there have been people to observe them.
Nowack (1993) presents a useful summary of some of the reasons for the increased use of multi-rater feedback in organizations:
The need for a cost-effective alternative to assessment centers;
The increasing availability of assessment software capable of summarizing data from multiple sources into customized feedback reports;
The need for continuous measurement of improvement efforts;
The need for job-related feedback for employees affected by career plateauing; and
The need to maximize employee potential in the face of technological change, competitive challenges and increased workforce diversity.
From the organizational perspective, multi-rater feedback can be used solely for developmental purposes. Romano (1994) and Atwater et al. (1993) found that the most common use is in the area of training and development. The overall net effect of training and development should enhance organizational performance.
From the individual perspective, the feedback is invaluable because it comes from numerous sources, providing multiple perspectives and opinions. Each opinion and perspective may provide relevant yet different feedback (Atwater et al., 1993; Hazucha et al., 1993; Tornow, 1993). This form of feedback can increase the reliability, fairness and acceptance of the data by the person being rated (London, Wohlers & Gallagher, 1990). This occurs because the feedback is received from multiple sources and not just from one rater.
One of the advantages of using multi-rater feedback is that it provides the opportunity for individuals who are being assessed to compare their self perceptions against the perceptions of others regarding their behaviour (Rosti & Shipper, 1998).
The difference in perspective between the rater and the ratee is not treated as an error but is a source of information which can enhance personal learning. Ratees can learn from the discrepancy between self rating and the rating of others.
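The discrepancy between self rating and the rating of others can be made concrete with a small calculation. The sketch below is illustrative only: the behaviour names, the 1-5 rating scale and the choice of a simple mean of the other raters are assumptions made for this example, not a scoring method prescribed by the authors cited.

```python
from statistics import mean


def self_other_gap(self_rating, other_ratings):
    """Return the self rating minus the mean rating given by other raters."""
    if not other_ratings:
        raise ValueError("at least one rating from another rater is required")
    return self_rating - mean(other_ratings)


def gap_profile(self_ratings, others_ratings):
    """Compute the self-other gap for every behaviour the ratee rated."""
    return {
        behaviour: self_other_gap(rating, others_ratings[behaviour])
        for behaviour, rating in self_ratings.items()
    }


# Hypothetical ratings on a 1-5 scale for two behaviours.
self_ratings = {"listening": 4, "delegating": 3}
others_ratings = {
    "listening": [2, 3, 3, 4],   # peers, subordinates and supervisor
    "delegating": [4, 4, 3, 5],
}

if __name__ == "__main__":
    for behaviour, gap in gap_profile(self_ratings, others_ratings).items():
        label = "over-rates self" if gap > 0 else "under-rates self" if gap < 0 else "agrees"
        print(f"{behaviour}: gap {gap:+.2f} ({label})")
```

In this sketch a positive gap marks a behaviour where the ratee's self perception exceeds the perception of others, and a negative gap the reverse. Either sign is treated as information for learning rather than as measurement error.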
The use of multi-rater feedback provides a natural method for both enhancing the learning of participants and improving the evaluation process. Feedback is seen as a critical element in effecting change (Bennis, Benne & Chin, 1969). Multi-rater feedback could serve as an unfreezing process in Lewin's (1948) model of change. This would enhance the ratee's learning by creating doubts about the ratee's current performance standard and providing an opportunity for prospective development. Most training evaluation models emphasize the absolute outcome of training. However, multi-rater feedback involves a change process in which the resultant behaviour reinforces past performance and also provides an opening for future learning. Thus, collecting multi-rater feedback before and after training will enhance learning and provide at least part of the data needed to evaluate training.
Moses et al. (1993) provide the following criticisms of multi-rater feedback:
It relies on generalized traits as there is a limited or non-existent frame of reference for making rater/observer judgments.
It is based on an individual's memory, which can often be incomplete descriptions of past performance.
The observer may be unable to interpret behaviours.
It relies on the instrument designers' scoring system, factor analysis or data collection methods to interpret the information for the participant.
The main argument of Moses et al. (1993) is that multi-rater feedback is based on other people's observations and that such observations are often incomplete descriptions of past performance because the observer does not know what to look for. The unresolved issue is which behaviours to study. Multi-rater feedback has been used to identify the behaviour of effective managers, but there is a lack of sufficient definitional detail to study managerial proficiency or the effectiveness of training (Morrison & McCall, 1978; Schriesheim & Kerr, 1977). Yulk (1994) argued that further refinement of these constructs is needed by identifying the specific skills which make up each construct. Hence, developing each construct and establishing its validity is important prior to training.
Multi-rater feedback has been found to be widely used in managerial and leadership development programs (Cacioppe, 1998; Cacioppe & Albrecht, 2000; Garavan, Morley & Flynn, 1997; McCauley & Moxley, 1996; Thach, 2002). However, its usage in other fields needs further research and exploration. This is further supported by Rosti and Shipper (1998) in their study on the impact of training in a management development program based on multi-rater feedback.
1.6 Conclusion

It is widely acknowledged that the Kirkpatrick evaluation model has provided the most basic thoughts on training evaluation throughout this decade. However, the application of Kirkpatrick's 4-level evaluation model by industry appears to be incomplete. No significant success has been identified from the use of the 4-level evaluation model by the majority of organizations that have conducted training evaluations.
Based on this literature review, it may be concluded that the Kirkpatrick model has not reached a stage of clarity that allows in-depth training evaluation to be carried out. His model gives training managers an idea of what a systematic approach to training evaluation is; however, the training measurement methods were not well explored or detailed.
While training has been conceptualized as a continually evolving process, the existing literature appears to have failed to provide adequate strategies for organizations wanting to evaluate the immediate, as well as the long-term, effectiveness and value of their training efforts.
At face value, the literature shows that the full Kirkpatrick evaluation strategy is being widely applied; however, more detailed analysis found that none were able to demonstrate Level 4 evaluation and of those who claimed evaluation at Levels 2 or 3, none were able to demonstrate a systematic approach to the problem.
Arguably, the dilemma in adopting Kirkpatrick's taxonomy as a comprehensive and integrated approach to evaluation lies in both the qualitative and quantitative attempts that may or may not provide good phenomenological studies. Further analysis of the method shows considerable confusion as to what is, or is not, a valid indicator for evaluation. Clearly, there has been little change in the level of confidence in the reliability of training evaluation, notwithstanding greater emphasis on this key organizational development process.
The weaknesses of the Kirkpatrick model have created opportunities for future research on incorporating competencies and the multi-rater feedback approach into the long-term evaluation of training.
These weaknesses have also opened up opportunities for further research on the transfer of learning, especially studies of its longitudinal and application effects.
1.7 References for Paper One

Alliger, G.M. & Janek, E.A. 1989, 'Kirkpatrick's levels of training criteria: thirty years later', Personnel Psychology, vol. 42, pp. 331-342.
Alliger, G.M., Tannenbaum, S.I., Bennett, W., Traver, H. & Shotland, A. 1997, 'A meta-analysis of the relations among training criteria', Personnel Psychology, vol. 50, pp. 341-358.
Atwater, L., Roush, P. & Fishthal, A. 1993, The Impact of Upward Feedback on Self and Follower Ratings of Leaders, Centre for Creative Leadership, New York.
Baldwin, T.T. & Ford, J.K. 1988, 'Transfer of training: a review and directions for future research', Personnel Psychology, vol. 41, pp. 63-105.
Basadur, M., Graen, G.B. & Scandura, T.A. 1986, 'Training effects on attitudes toward divergent thinking among manufacturing engineers', Journal of Applied Psychology, vol. 71, pp. 612-617.
Bennis, W.G., Benne, K.D. & Chin, R. 1969, The Planning of Change, 2nd edn, Holt, Rinehart & Winston, New York.
Bernthal, P.R. 1995, 'Education that goes the distance', Training and Development, vol. 49, no. 9, pp. 41.
Blanchard, P.N. & Thacker, J.W. 1999, Effective Training, Systems, Strategies and Practices, Prentice Hall Publisher, New Jersey.
Blanchard, P.N., Thacker, J.W. & Way, S.A. 2000, 'Training evaluation: perspectives and evidence from Canada', International Journal of Training and Development, vol. 4, no. 4, pp. 295-303.
Boulmetis, J. & Dutwin, P. 2000, The ABCs of Evaluation: Timeless Techniques for Program and Project Managers, Jossey-Bass Publisher, San Francisco.
Boyle, P.G. & Jahns, I. 1970, 'Program development and evaluation' in Handbook of adult education, eds Smith, R.M., Aker, G.F. & Kidd, J.E., Macmillan Company, New York, pp. 70.
Bramley, P. & Kitson, B. 1994, 'Evaluating training against business criteria', Journal of European Industrial Training, vol. 18, no. 1, pp. 10-14.
Bramley, P. 1996, Evaluating Training Effectiveness, McGraw-Hill, Maidenhead and New York.
Brandenburg, D. 1982, 'Training evaluation: what is the current status?', Training and Development Journal, pp. 14-19.
Bretz, R.D. & Thompsett, R.E. 1992, 'Comparing traditional and integrative learning methods in organizational training programs', Journal of Applied Psychology, vol. 77, pp. 941-951.
Brinkerhoff, R.O. 1987, Achieving results from training, Jossey-Bass Publisher, San Francisco.
Brinkerhoff, R.O. 1988, 'An integral evaluation model for human resource development', Training and Development Journal, vol. 42, no. 2, pp. 66-68.
Brown, K.G., Werner, M.N., Johnson, L.A. & Dunne, J.T. 1999, Formative evaluation in Industrial/Organization Psychology: further attempts to broaden training evaluation, presented at a symposium on training evaluation: advances and new directions for research and practice, Society of Industrial and Organizational Psychology, Atlanta.
Cacioppe, R. 1998, 'An integrated model and approach for the design of effective leadership development programs', Leadership and Organization Development Journal, vol. 19, no. 1, pp. 44-53.
Cacioppe, R. & Albrecht, S. 2000, 'Using 360-degree feedback and the integral model to develop leadership and management skills', Leadership and Organization Development Journal, vol. 21, no. 8, pp. 390-404.
Campbell, J.P. 1988, Training Design for Performance Improvement, in Productivity in Organizations, eds Campbell, J.P. & Campbell, R.J., Jossey-Bass Publisher, San Francisco.
Cascio, W.F. 1989, Using utility analysis to assess training outcomes, in Training and Development in Organizations, ed. I.L. Goldstein, Jossey-Bass, San Francisco.
Cervero, R.M. 1988, Effective Continuing Education for Professionals, Jossey-Bass Publisher, San Francisco.
Campion, M.A. & Campion, J.E. 1987, 'Evaluation of an interview skills training program in a natural field setting', Personnel Psychology, vol. 40, no. 4, pp. 675-91.
Chen, H.T. & Rossi, P.H. 1992, Using Theory to Improve Program and Policy Evaluations, Greenwood Press, Westport, CT.
Cheng, E. & Ho, D. 1998, 'The effects of some attitudinal and organizational factors on transfer outcome', Journal of Managerial Psychology, vol. 13, no. 5/6, pp. 309-317.
Clegg, W.H. 1987, 'Management training evaluation: an update', Training and Development Journal, vol. 41, no. 2, pp. 65-71.
Connolly, M.S. 1988, 'Integrating evaluation, design and implementation', Training and Development Journal, vol. 42, no. 2, pp. 20-23.
Constable, J. & McCormick, R. 1987, The Making of British Managers, BIM, CBI, London.
Davidove, A.E. & Schroeder, P.A. 1992, 'Demonstrating ROI of training', Training and Development Journal, vol. 46, no. 8, pp. 70-71.
Davis, B.L. & Mount, M.K. 1984, 'Effectiveness of performance appraisal training using computer assisted instruction and behaviour modeling', Personnel Psychology, vol. 37, pp. 439-452.
Dawson, R.P. 1993, Model of evaluations of equal opportunities training in local government with special reference to women, unpublished PhD thesis, South Bank University, London.
Dionne, P. 1996, 'The evaluation of training activities: a complex issue involving different stakes', Human Resource Development Quarterly, vol. 7, pp. 279-86.
Dyer, S. 1994, 'Kirkpatrick's mirror', Journal of European Industrial Training, vol. 18, no. 5, pp. 31-32.
Eisner, E.W. 1997, The Enlightened Eye: Qualitative Inquiry and the Enhancement of Educational Practice, 2nd edn, Merrill, New York.
Garavaglia, L.P. 1993, 'How to ensure transfer of training', Training & Development Journal, vol. 47, no. 10, pp. 63-68.
Garavan, T.N., Morley, M. & Flynn, M. 1997, '360-degree feedback: its role in employee development', Journal of Management Development, vol. 16, no. 2, pp. 134-147.
Geber, B. 1995, 'Does your training make a difference? Prove it!', Training and Development Journal, vol. 3, pp. 27-34.
Goldstein, L.I. 1986, Training in Organizations: Needs Assessment, Development and Education, Cole Publishing Company, California.
Goldstein, L.I. & Ford, J.K. 2002, Training in Organizations: Needs Assessment, Development and Evaluation, Thomson Learning, Wadsworth, Canada.
Grove, E.A. & Ostroff, C. 1990, Program evaluation, in Developing Human Resources, eds Wexley, K. & Hinnicks, J., BNA Books, Washington D.C.
Hamblin, A.C. 1974, Evaluation and Control of Training, McGraw-Hill Publisher, New York.
Hazucha, J.F., Hezlett, S.A. & Schneider, R.J. 1993, 'The impact of 360-degree feedback on management skills development', Human Resource Management, vol. 32, pp. 325-351.
HMSO 1989, Training in Britain: A Study of Funding, Activity and Attitudes, Her Majesty's Stationery Office, London.
Hoffmann, T. 1999, 'The meanings of competency', Journal of European Industrial Training, vol. 23, no. 6, pp. 275-285.
Holton, E.F. III 1996, 'The flawed four-level evaluation model', Human Resource Development Quarterly, vol. 7, pp. 5-21.
Junaidah, H. 2001, 'Training evaluation: clients' roles', Journal of European Industrial Training, vol. 25, no. 7, pp. 374-379.
Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction', Journal of American Society for Training and Developing, vol. 13, pp. 3-9.
Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning', Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.
Kirkpatrick, D.L. 1960a, 'Techniques for evaluating training programs: part 3 - behaviour', Journal of American Society for Training and Developing, vol. 14, no. 1, pp. 13-18.
Kirkpatrick, D.L. 1960b, 'Techniques for evaluating training programs: part 4 - results', Journal of American Society for Training and Developing, vol. 14, no. 2, pp. 28-32.
Kirkpatrick, D.L. 1976, Evaluation of Training, Training and Development Handbook: A guide to human resource development, 2nd edn, Craig, R.L.O., McGraw-Hill Publisher, New York.
Kirkpatrick, D.L. 1979, 'Techniques for evaluating training programs', Training and Development Journal, vol. 33, pp. 78-92.
Kirkpatrick, D.L. 1994, Evaluating Training Programs: The Four Levels, Berrett-Koehler Publishers, San Francisco.
Kirkpatrick, D.L. 1996a, 'Great ideas revisited', Training and Development Journal, vol. January, pp. 54-59.
Kirkpatrick, D.L. 1996b, 'Invited reaction: reaction to Holton article', Human Resource Development Quarterly, vol. 7, pp. 23-24.
Kirkpatrick, D.L. 1998, Evaluating Training Programs: The Four Levels, Berrett-Koehler Publishers, San Francisco.
Kraiger, K., Ford, J.K. & Salas, E. 1993, 'Application of cognitive, skill-based and affective theories of learning outcomes to new methods of training evaluations', Journal of Applied Psychology, vol. 78, no. 2, pp. 311-328.
Legge, K. 1984, Evaluating Planned Organizational Change, Academic Press, London.
Lewin, K. 1948, Resolving social conflicts, Harper & Bros Publishers, New York, NY.
Lewis, P. & Thornhill, A. 1994, 'The evaluation of training: an organizational culture approach', Journal of European Industrial Training, vol. 18, no. 8, pp. 25-32.
London, M., Wohlers, A.J. & Gallagher, P. 1990, '360-degree feedback surveys: a source of feedback to guide management development', Journal of Management Development, vol. 9, pp. 17-31.
Love, A.J. 1991, Internal Evaluation: Building Organizations From Within, Sage Publication, California, CA.
Madaus, G.F., Scriven, M.S. & Stufflebeam, D.L. 1986, Evaluation Models: Viewpoints on Educational and Human Services Evaluation, Kluwer-Nijhoff Publishing, Boston.
Mann, S. & Robertson, I.T. 1996, 'What should training evaluation evaluate?', Journal of European Industrial Training, vol. 20, no. 9, pp. 14-20.
Mathieu, J.E. & Leonard, R.L. Jr. 1987, 'Applying utility concepts to a training program in supervisory skills: a time-based approach', Academy of Management Journal, vol. 30, pp. 316-335.
Mathews, B.P., Ueno, A., Kekale, T., Repka, M., Pereira, Z.L. & Silva, G. 2001, 'Quality training: needs and evaluation-findings from a European survey', Total Quality Management, vol. 12, no. 4, pp. 483-490.
McCauley, C.D. & Moxley, R.S. Jr. 1996, Developmental 360: How Feedback Can Make Managers More Effective, Jossey-Bass Publisher, San Francisco.
Morrison, A.M. & McCall, J.D. 1978, Feedback to Managers: A Comprehensive Review of Twenty-four Instruments, Centre for Creative Leadership, Greensboro, NC.
Morrow, C.C., Jarrett, M.Q. & Rupinski, M.T. 1997, 'An investigation of the effect and economic utility of corporate-wide training', Personnel Psychology, vol. 50, pp. 91-119.
Moses, J., Hollenbeck, G.P. & Sorcher, M. 1993, 'Other people's expectations', Human Resource Management, vol. 32, Summer Fall.
Murk, P., Barrett, A. & Atchade, P. 2000, 'Diagnostic techniques for training and education: strategies for marketing and economic development', Journal of Workplace Learning, vol. 12, no. 7, pp. 296-306.
Noe, R.A. & Schmitt, N. 1986, 'The influence of trainee attitudes on training effectiveness: test of a model', Personnel Psychology, vol. 39, pp. 497-523.
Noe, R.A. 2000, Employee Training and Development, McGraw-Hill Publisher, New York.
Nowack, K. 1993, '360-degree feedback: the whole story', Training and Development Journal, vol. 47, no. 1, pp. 69-73.
Newstrom, J.W. 1978, 'The problem of incomplete evaluation of training', Training and Development Journal, vol. 32, no. 11, pp. 22-24.
O'Leary, V.E. 1972, 'The Hawthorne effect in reverse: effects of training and practice on individual and group performance', Journal of Applied Psychology, vol. 56, pp. 491-494.
Olsen, J.H. Jr. 1998, 'The evaluation and enhancement of training transfer', International Journal of Training and Development, vol. 2, no. 1, pp. 61-75.
Parlette, M. & Hamilton, D. 1977, 'Evaluation as a new approach to the study of innovative programmes', in Beyond the Numbers Game, eds Hamilton, D. et al., Macmillan, London.
Phillips, J.J. 1991, Handbook of Training Evaluation and Measurement Methods, Gulf Publishing Company, Houston, TX.
Phillips, J.J. 2002, Return on Investment in Training and Performance Improvement Programs, 2nd edn, Butterworth-Heinemann, Woburn, MA.
Phillips, J.J. & Stone, R.D. 2002, How to Measure Training Results, A Practical Guide to Tracking the Six Key Indicators, McGraw-Hill Publisher, New York.
Plant, R.A. & Ryan, R.J. 1994, 'Who is evaluating training?', Journal of European Industrial Training, vol. 18, no. 5, pp. 27-30.
Popham, W.J. 1974, Evaluation in Education: Current Applications, Berkeley, McCutchan, California.
Porter, L. & McKibbin, L. 1988, Future of Management Education and Development: Drift or Thrust Into the 21st Century?, McGraw-Hill Publisher, New York.
Provus, M. 1971, Discrepancy Evaluation, Berkeley, McCutchan, California.
Rae, L. 1986, How to Measure Training Effectiveness, Gower Publications, Aldershot, London.
Raphael, M. & Wagner, E. 1972, 'Training surveys surveyed', Training and Development Journal, vol. 26, pp. 10-14.
Redshaw, B. 2001, 'Evaluating organizational effectiveness', Measuring Business Excellence, vol. 5, no. 1, pp. 16-18.
Regalbutto, G.A. 1992, 'Targeting the bottom line', Training and Development Journal, vol. 46, no. 4, pp. 29-32.
Rivlin, A.M. 1971, Systematic Thinking for Social Action, Brookings Institution, Washington.
Romano, C. 1994, 'Conquering the fear of feedback', Human Resource Focus, vol. 71, no. 3.
Rossi, P.H. & Freeman, H.E. 1993, Evaluation: A Systematic Approach, 5th edn, Sage Publication, California.
Rosti, R.T. Jr. & Shipper, F. 1998, 'A study of the impact of training in a management development program based on 360 feedback', Journal of Managerial Psychology, vol. 13, no. 1/2, pp. 77-89.
Rowe, C. 1995, 'Incorporating competence into the long term evaluation of training and development', Industrial and Commercial Training, vol. 27, no. 2, pp. 3-9.
Salinger, R. & Deming, R. 1982, 'Practical strategies for evaluating education', Training and Development Journal, vol. 4, pp. 20-29.
Sauter, J. 1980, 'Purchasing public sector executive development', Training and Development Journal, vol. 34, no. 4, pp. 92-98.
Schriesheim, C.A. & Kerr, S. 1977, 'Theories and measurement of leadership: a critical appraisal of present and future directions', in Leadership: The Cutting Edge, eds Hunt, J.G. & Larson L.L., Southern Illinois University Press, Carbondale, IL.
Scriven, M. 1991, Evaluation Thesaurus, Sage Publication, Newbury Park, California.
Shadish, W.R. & Epstein, R. 1987, 'Patterns of program evaluation practice among members of the evaluation research society and evaluation network', Evaluation Review, vol. 11, no. 5, pp. 555-590.
Shadish, W.R. & Reichardt, C.S. 1987, 'Evaluation studies', Evaluation Review, vol. 12, pp. 13-30.
Shireman, J.A.R. 1991, Utilization of program evaluation for decision making regarding hospital based patient/client focused health education programs, doctoral dissertation, University of Iowa, dissertation abstracts international, 52/12A, AA C9212928.
Smith, A.J. 1990, 'Evaluation of management training subjectivity and the individual', Journal of European Industrial Training, vol. 14, no. 1, pp. 12-15.
Stake, R. 1977, 'Responsive evaluation', in Beyond the Number Game, eds Hamilton, D., Jenkins, D., King, C., MacDonald, B. & Parlett, H.M., Macmillan, London.
Steel, S. 1970, 'Program evaluation: a broader definition', Journal of Extension, vol. 13, pp. 13-20.
Sternberg, R. & Kolligian, J. Jr. 1990, Competence Considered, Yale University Press, New Haven, CT.
Strebler, M., Robinson, D. & Heron, P. 1997, 'Getting the best out of your competencies', Institute of Employment Studies, University of Sussex, Brighton.
Stufflebeam, D.L. 1971, Education Evaluation: Decision Making, by the PDK national study committee on education, Itasca, Ill.: F.E. Peacock Publisher Inc, Boston.
Stufflebeam, D.L. 1983, 'The CIPP model for program evaluation', in Evaluation Models, eds Madaus, G.F., Scriven, M.S. & Stufflebeam, D.L., Kluwer-Nijhoff Publishing, Boston, pp. 117-141.
Stufflebeam, D.L. & Shrinkfield, J.A. 1985, Systematic evaluation, Kluwer Nijhoff Publishing, Boston.
Swanson, R.A. & Holton, E.F. 1999, Results: How to Assess Performance, Learning And Perceptions in Organizations, Berrett-Koehler Publishers, San Francisco.
Tesoro, F. 1998, 'Implementing an ROI measurement process at Dell Computer', Performance Improvement Quarterly, vol. 11, pp. 103-114.
Thach, E.C. 2002, 'The impact of executive coaching and 360-feedback on leadership effectiveness', Leadership and Organization Development Journal, vol. 23, no. 4, pp. 205-214.
Toplis, J. 1993, 'Training evaluation reflections on the first steps', European Work Organization Psychology, vol. 2, no. 2, pp. 146-152.
Tornow, W.W. 1993, 'Perceptions or reality, is multiple-perspective measurement a means or an end?', Human Resource Management, vol. 32, no. 2 & 3, pp. 209-408.
Tyler, R.W. 1949, Basic Principle of Curriculum and Instruction, University of Chicago Press, Chicago.
Tyler, R.W. 2002, 'Evaluating evaluations', Human Resource Magazine, vol. June, pp. 85-93.
Warr, P. & Bunce, K. 1995, 'Employee age and voluntary development activity', International Journal of Training and Development, vol. 2, pp. 190-204.
Warr, P., Allan, C. & Birdi, K. 1999, 'Predicting three levels of training outcome', Journal of Occupational and Organizational Psychology, vol. 72, pp. 351-375.
Wexley, K.N. & Baldwin, T.T. 1986, 'Post-training strategies for facilitating positive transfer: an empirical exploration', Personnel Psychology, vol. 29, pp. 503-520.
Wholey, J.S., Hatry, H.P. & Newcomer, K.E. 1994, Handbook of Practical Program Evaluation, Jossey-Bass Publisher, San Francisco.
Xiao, J. 1996, 'The relationship between organizational factors and the transfer of training in the electronics industry in Shenzhen, China', Human Resource Development Quarterly, vol. 7, no. 1, pp. 55-73.
Yulk, G.A. 1994, Leadership in Organizations, 2nd edn, Englewood Cliffs, Prentice Hall Publisher, New Jersey.
Research Paper 2
EVALUATING TRAINING EFFECTIVENESS: AN EMPIRICAL STUDY OF KIRKPATRICK MODEL OF EVALUATION IN THE MALAYSIAN TRAINING ENVIRONMENT FOR THE MANUFACTURING SECTOR
Lim Guan Chong Master of Business Administration (Finance) University of Hull
International Graduate School of Management University of South Australia
2.1 Abstract

This research adopted an empirical approach to track the history, rationale, objectives and implementation of training evaluation initiatives in Malaysia's manufacturing sector. Since the establishment of the Human Resource Development Fund, training activities in Malaysia have increased. The majority of Malaysian organizations that conduct training are doubtful about how training activities can add value to organizational performance and justify their training investment. This research provides an understanding of the training evaluation culture within the Malaysian manufacturing sector and of the effectiveness of Kirkpatrick's 4-level evaluation model as applied to that sector.
2.2 Introduction
The Malaysian government is committed to education, training and human resource development. The government recognizes the importance of human resource development in its quest to achieve fully developed nation status. This commitment has translated into the establishment and growth of training practice in the country.
Previously the sole provider of training, the government has adopted the policy of involving private enterprises in all aspects of training. Training has become crucial to the development of capital-intensive and value-added industries. Apart from involving enterprises to make training more market-driven, there is a need for enterprises to share the burden of training. In the Seventh Malaysia Plan, the private sector was expected to play a more active role in upgrading the qualifications and skills of its workers (Junaidah, 2001).
2.3 Training Practices in Malaysia
Training activities within Malaysian companies lag behind those in countries such as Singapore, Japan and Korea. Training activities in Malaysia are mainly conducted by large multinational companies. The International Labour Organization's study in 1997 showed that Malaysia ranked 12th in terms of providing in-company training (Junaidah, 2001).
The Malaysian government passed a new Act of Parliament entitled the Human Resources Development Act in 1992 to encourage and stimulate the private sector to introduce training and development for its employees (HRDC, 1992). The objective of this Act is to set aside accumulated funds to promote training activities within organizations. Under this Act, companies with more than 50 employees must contribute 1 percent of their total staff's monthly salary to the Ministry of Human Resources through the Human Resources Development Council (HRDC). The fund, known as the Human Resources Development Fund (HRDF), was launched in January 1993. The government set up the HRDC to manage this fund by systematically identifying training needs and approving relevant training programs required by organizations. The levy is partially refunded to the respective organizations under special schemes known as the Training Aid Scheme and the Approved Training Program (ATP) Scheme once the training program is completed. The policy lays down the parameters for a human resource oriented development strategy that is designed to mobilize national effort to increase technological capabilities and competitiveness as well as create a highly skilled, productive, disciplined and efficient workforce. This strategy would aid Malaysia's transition into an industrialized economy. Private sector companies are also expected to enhance their training activities by utilizing the HRDF and participating in skill development programs run by the state governments (MEPU, 1996). Since the establishment of the HRDC, how has the Malaysian manufacturing sector gained from the training conducted? Information on how training benefits organizations would help the Malaysian government to chart the progress and expected time frame needed for Malaysia to transform into an industrialized economy.
The need to develop a highly trained workforce is evident from the increase of more than 200 management consulting and training institutions, professional associations and management
schools operating in Malaysia (Arthur Anderson & Co, 1991). The number of employees who return to formal education and training has increased consistently since 1972 (Ahmad, 1998). The government set up the National Institute of Public Administration Malaysia (INTAN) which is responsible for training government employees in administration and management (Junaidah, 2001).
There are some real difficulties in assessing the full extent of skill development for government training in Malaysia even after conducting evaluation (Mirza & Juhary, 1995).
Firstly, much skill development takes place in the private sector. Most skills, even those involving advanced manual skills, are acquired on the job. Secondly, skill development during employment tends to be demand-driven (Pillai, 1994). Workers gain experience on the job and upgrade their skills when they are exposed to a higher skill level. A study by Pillai and Othman (1994) showed that the budget for training and education in Malaysia has increased by 40 percent. Company emphasis has been on improving the quality of training to help develop a competent labour force that improves the competitiveness of the industrial sector in Malaysia. This new demand will force employers to further develop employee competencies. Saiyadain (1995) found that as many as 82.6 percent of organizations sponsored their managers for training, and on average these organizations spent 4.65 percent of the managerial payroll on training managers. This suggests that the number of knowledge workers and new knowledge-based opportunities can be expected to increase dramatically in the next few years.
2.4 The Practice of Evaluation in Training
Although the methodology of evaluating training effectiveness may look fair, it could make it difficult to express rational criticism. A survey by Wagel (1977) found that 75 percent of companies had no formal method for evaluating training effectiveness. In a subsequent
survey by Easterby-Smith (1985), the result showed that out of 15 organizations with 320 300,000 employees, only one conducted some form of evaluation on a regular basis which was a post-course questionnaire. According to Rowe (1992), although every training manual
pays lip service to evaluation, it is notoriously difficult to carry out effectively. The extensive survey by Plant and Ryan (1994) served to further underline the lack of widespread sophistication in evaluation. They point to budget cutting and economic pressures as being possible explanations. A recent study by Blanchard, Thacker and Way (2000) of 202 organizations in Canada reported that more than half of the organizations are not comprehensively evaluating their training.
According to Carnevale and Schulz (1990), the American Society for Training and Development (ASTD) research indicated that the most popular reasons for evaluation are to gather information to help decision makers improve the training process and facilitate
participants' job performance. This explains why the outcome-based Kirkpatrick model is so widely used. Evaluation also helps measure the degree of improvement in application and assesses how well the learner achieves the established goals (Attkinsson, Sorenson, Hargreaves & Hororwitz, 1978).
For the past 30 years the Kirkpatrick model has been considered the most prominent training evaluation model (Bernthal, 1995). Phillips (1991) concluded that, out of more than 50 evaluation models available, the evaluation framework that most training practitioners use is
the Kirkpatrick model. It is easy to find firms that practice training evaluation. However, most firms only conduct post course evaluation using Kirkpatrick's Level 1 evaluation.
Another important purpose for training evaluation is to meet the accountability requirements of funding groups or clients (Rossi & Freeman, 1993). The demand for accountability has
been the major impetus for program evaluation since the 1980s. Fiscal constraints have increased the competition among companies' activities for available dollars and raised the question of value for money from their activities (Ruthman & Mowbray, 1983).
Training evaluation is more than a set of empirical methods governed solely by the standards
of social science. Judgments on the quality of program evaluation must also be based on criteria that are meaningful both to immediate users and the larger system in which the program is embedded (Corday & Lipsey, 1986).
Phillips (1991) stated that when it comes to training evaluation, there still appears to be more talk than action. In many organizations, training evaluation is either ignored or approached in an unsystematic manner. Previous literature (Davidove & Schroeder, 1992; Shelton & Alliger, 1993; Smith, 1990) demonstrated that training evaluation is unsystematic and based on simple means. Gutek (1988) stated that there was little or no demand on the part of the
organization to seriously evaluate a training program. Most organizations evaluate their training programs by emphasizing one or more levels of the Kirkpatrick model (Chen & Rossi, 1992). The researchers, however, commented that evaluation knowledge found in the literature is not being fully utilized in evaluation practices.
Admittedly it is difficult to completely ascertain a training program's effectiveness. What works at a particular time at a particular training location with a group of participants may not necessarily work as well when transferred to another time, setting and group (Junaidah, 2001).
Bramley and Kitson (1994) asserted that measuring learning is problematic because it is
difficult to design a reliable measuring instrument. Few people possess the necessary skills to evaluate training. Grove and Ostroff (1990) mentioned that training directors often do not possess the necessary skills
to conduct training evaluation. However, Bramley (1996) mentioned that the lack of training evaluation skills could be due to the methodological weakness embedded within the Kirkpatrick model of evaluation.
In addition to the unavailability of a reliable measuring instrument, Barron (1996) commented that management does not demand evaluation because it believes that training will be reflected in an employee's work performance. The research by Smith and Piper (1990) supported this view and showed that trainers openly said, "We do just what we are asked to do: deliver training. We do not do what we are not asked to do: improve human performance in the workplace". Smith and Piper (1990) also mentioned this as one of the reasons for providing training but not evaluation. The research found that their clients did not request an evaluation. This could be the reason why training providers do not evaluate their products.
Research by the ASTD in 1990 showed that most companies now conduct some form of
evaluation of their training programs. Practitioners tend to use different methodology and approaches. In examining evaluation methods in business-education partnerships, Erickson (1991) found that there is little standardization in the methodology. Shadish and Epstein (1987) conducted a study to look at program evaluations among members of the Evaluation
Research Society and Evaluation Network. They found that practitioners had different methodologies as well as different assumptions about evaluation. In their study, three patterns of practices emerged from the evaluation practices which they labeled the academic pattern, decision-driven pattern and the outcome pattern.
Heneman and Schurab (1986) stated that the evaluation of training programs in practice differs from the theory and models in the literature. Many authors commented that once participants leave the training setting, program providers seldom attempt to determine the effect of their program. Indeed, the word evaluation raises all sorts of emotional defense reactions. Such a response indicates a low level of commitment among training professionals toward evaluation. Most of the time, the practices are informal, unsystematic and based on one popular model. However, in the study by Junaidah (2001) on Malaysian training evaluation practices, it was found that evaluation was moderately formal, comprehensive and systematic but could be further improved. Nevertheless, it is uncertain whether this so-called comprehensive approach to training evaluation is within the taxonomy of the Kirkpatrick framework. Currently, there is little literature on the evaluation system within the Malaysian context.
2.5 Training Evaluation Practices in Malaysia
Validation of training effectiveness and of the benefits of training and development programs has gained importance in the public and private sectors in Malaysia. The Malaysian government places great emphasis on program evaluation and appointed two federal agencies to be responsible for evaluation. They are the National Institute of Evaluation and the Evaluation
Unit at the Prime Minister's Department. This unit is responsible for evaluating special
governmental projects and programs (Maimunah, 1990). Another evaluating body is the Publication and Consultancy Bureau which carries out evaluation for government training.
Three types of evaluation are currently practiced in the agency: formal training evaluation using standard evaluation questionnaires, oral evaluation in the form of informal discussions, and informal evaluation conducted during training (Junaidah, 2001).
The reasons why Malaysian organizations do not evaluate training may lie in the inability to develop relevant measuring tools or the difficulty in determining which performance outcomes are attributed to training.
The rise in the awareness of training evaluation during the Malaysian economic downturn in 1997 has increased the pressure for organizations to justify the investment cost placed on
training (Junaidah, 2001). Organizations realized that training must be a worthwhile effort and this raises the need for measuring training effectiveness. Evaluating training
effectiveness does not seem to be the culture of most organizations in Malaysia. Thousands of training programs have been conducted in Malaysia since the rise of the HRDF (Mirza & Juhary, 1995). However, effectiveness in terms of productivity, skills improvement, increase in performance standards and return on investment is still unknown. Training should be evaluated to learn the weaknesses of the training program. The criteria selected for evaluation should reveal the improvement in the participants' work performance.
The need for greater quality management during the economic downturn forced Malaysian companies to upgrade their International Organization for Standardization (ISO) certification to ISO 9001:2000, which emphasizes documenting the training evaluation process. Companies that pursued this latest version of ISO are required to justify their training efforts and money spent by linking skill development with the quality philosophy of the company. As organizations pursue the latest version of ISO, evaluating training ranks high among top management as a means of justifying training investment (Junaidah, 2001). The opportunity cost of foregoing training commitment has become extremely high. More than ever, training evaluation must demonstrate improved performance and financial results. As the investment in training is costly, it is understandable why top managers wish to see value for money and demand justification for training cost. Training providers need to show clients that they are getting good returns on their investment in training. This demand for accountability has been the major impetus for training in the past few years (Junaidah, 2001).
Most organizations in Malaysia have sufficient training facilities. Most managers are sponsored to attend training programs on production, general management and human
resources management for an average of 2 days (Mirza & Juhary, 1995). On average organizations spend 4.65 percent of the managerial payroll on training (Saiyadain, 1995). The measurement of training effectiveness varies from organization to organization. A few
organizations have developed systematic plans to follow up on training. The top management's attitude towards training has been identified as a critical factor in effective
operationalization of training (Mirza & Juhary, 1995). In organizations where the top and middle management have been perceived to be supportive, training seems to have contributed
to the overall growth. But how far the evaluation process has been conducted to prove the growth is still questionable. In order to improve the overall effectiveness of training, all organizations should undertake training evaluation effectively. As mentioned by Brinkerhoff (1988), training needs to adopt evaluations and measuring systems that can improve the
feedback mechanism in order to build their response capacity. A system of pre course evaluation followed by post course evaluation may help in setting relevant expectations for improvement.
A serious gap in the Malaysian training context is the insufficient information on the number,
nature and content of training facilities in the country. The skill-level at which the output would fit into the labour market is not known while the syllabus, duration and quality of
training vary from one agency to another. This is due to the lack of collaboration and consultation between industry and training institution. The quality of training is not up to the mark. Trainees have theoretical knowledge but little practical experience (Pillai, 1994).
There has been limited study on training evaluation practices in Malaysia. A training evaluation research by Shamsuddin (1995) was on the contextual factors associated with evaluation practices of selected adult and continuing education providers in Malaysia. According to him even though the management directed an evaluation to be conducted, it was
only for a narrow purpose. It was used to demonstrate program success by showing how good the training was and how many people received the training, which is merely Level 1
evaluation. The wider purpose of program evaluation such as measuring the acquired learning (Level 2), program impact (Level 3) and cost effectiveness (Level 4) was not the
management priority. According to Shamsuddin (1995), the clients were not aggressive stakeholders who cared and demanded accountability from the training providers. Their behaviour and characteristics did not push the training provider to examine the real effect of the programs in terms of learning gain and program effectiveness.
Besides Shamsuddin's (1995) study, four other studies conducted locally included the
element of evaluation practice. The first study by Hamid, Mohd, Muhamad and Ismail (1987) asked 235 organizations if management education in Malaysia significantly provides
candidates with a set of skills. Organizations found that 67.6 percent of management programs offered by local universities and colleges are too theoretical. Out of 121 respondents, 60.3 percent indicated that training is important while the rest felt the contrary. This study focused on reaction evaluation (Level 1) to study the participants' satisfaction level towards the overall programs.
Another study conducted by Asma (1994) examined the design of training practices of four training providers in Malaysia and found that the evaluation practiced by the trainers did not conform to any theory and that most of the evaluations used were ad hoc and informal.
Mirza and Juhary (1995) conducted a study on local and multinational organizations and found that in the majority of these organizations, although managers who return from training may write a report, no formal systematic mechanism exists to assess how well they are utilizing their training in the organizations. The research further found that organizations only encouraged participants to apply learning at work but made no effort to find out what caused the change. The result indicates that measuring training effectiveness is not widely practiced. Organizations feel that if learning does not take place, it would show in the next appraisal report. Participants who have learned something are assumed to have applied it, so tracking changes in performance is considered unnecessary.
Mirza and Juhary's (1995) study also revealed that most organizations in Malaysia evaluate training effectiveness on a superficial level. Some encourage their managers to try out new ideas while others do not show the same kind of support. Unfortunately for most companies, measuring training effectiveness may not be practiced organization-wide. This is because measuring training effectiveness has never been a policy in most organizations. Lack of support by most department heads is deterring most organizations from carrying out post-training evaluation. Most organizations felt that if they had a more supportive top management they could have established systems for measuring training effectiveness.
The most recent study was by Junaidah (2001) on training evaluation practices by training
institutions in Malaysia. The study showed moderately formal training evaluation practices by Malaysian training practitioners. However, the researcher was uncertain whether these training practitioners applied the taxonomy of Kirkpatrick model in training evaluation practices.
Generally, training evaluation practices in Malaysia are either not done or if done, do not
follow any theory suggested in the literature. There is a paucity of detailed evidence of direct causal links between investment in training and the resultant return in the form of increased
performance. Brandenburg (1982) suggested that training practitioners tended not to conduct evaluation or, if they did, relied heavily on soft-information evaluation methods and did not disseminate the results widely. Pauzi (1985) felt that part of the problem lies in the attitude of the top management who do not show full commitment to the evaluation process.
A further study is needed to examine current training evaluation practices in Malaysia and to understand recent developments in this practice. It is important to understand training effectiveness in Malaysia, and it is worthwhile to analyze the training evaluation process undertaken in the country. This study would contribute to the existing body of knowledge as current information on training evaluation is inadequate. Since a large number of professional associations, private consultants and management schools in universities are organizing training programs in Malaysia, the results of the study would indicate areas where training evaluation could be practiced for different training programs.
2.6 Methodology of Study
The most recent surveys of training and evaluation practices in Malaysia were conducted by Hamid et al. (1987), Asma (1994), Mirza and Juhary (1995), Shamsuddin (1995) and Junaidah (2001). The dearth of published materials on training and development activities of managers in Malaysia has prompted this study.
This explorative study was conducted to understand the evaluation culture and the
extensiveness of training evaluation practices in Malaysia. The lack of baseline information prevented the evaluation of transfer learning. This prompted the use of an empirical approach in this study. The study evaluates the perceptual effects on both management and non-management levels of training programs in the manufacturing sector. This survey asked about the level of training evaluation performed, the percentage of payroll spent on training, the
impediments to training and the percentage of training transferred to the job. Follow up interviews were also undertaken to provide additional clarification and interpretation on responses and enabled impressions and opinions about the data to be recorded accurately.
2.6.1 Questionnaire Construction
A comprehensive survey of the literature was done to find out the degree of training evaluation being conducted by training practitioners in Malaysia. The survey questions asked the degree to which training evaluation practices were conducted in Malaysia based on Kirkpatrick's 4-level evaluation model (Kirkpatrick, 1959a, 1959b, 1960a, 1960b, 1976, 1979). Examples of questions are:
Reaction: How did the participants react to the training?
Learning: What information and skills were gained?
Behavior: How have participants transferred knowledge and skills to their jobs?
Results: What effect has training had on the organization and achievement of its objectives?
The instrument was designed primarily based on the published work of Blanchard, Thacker
and Way (2000) with modifications based on the Malaysian training environment. The modifications to the Blanchard et al. questionnaire include rephrasing and simplifying
question structure to suit local linguistic understanding. Words which were ambiguous or misunderstood were replaced. These modifications were applied in order to encourage a more
accurate response. Care was taken to ensure that simple and clear questions were used to
seek information on significant areas of training evaluation activity in Malaysia. The questionnaire can be found in Table 4.
The questionnaire consists of 34 questions. There are 8 questions in Level 1, 5 in Level 2, 13
in Level 3 and 8 in Level 4. Level 3 was constructed with the most questions as it asked about practices for measuring transfer learning. Practitioners could use a variety of assessment to measure transfer learning hence the survey questions require detailed practices undertaken by practitioners.
The questions in the questionnaire were randomly sorted to avoid bias caused by the order of the questions. The survey questions used a 5-point Likert scale to permit good scale discrimination.
A panel of experts which consisted of training professionals from the Malaysia Institute of
Management was used to evaluate the items in the questionnaire. Extensive pilot testing was undertaken by the training professionals to ensure that the questions were easily understood.
Internal consistency was determined using Cronbach's alpha. The Cronbach's alpha coefficient was 0.8458.
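The internal consistency check described above can be illustrated with the standard Cronbach's alpha formula, alpha = k/(k-1) x (1 - sum of item variances / variance of respondents' total scores), where k is the number of items. The following is a minimal sketch only: the function name `cronbach_alpha` and the small response matrix are hypothetical and are not the survey data, and the study itself used SPSS rather than this code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(responses[0])  # number of questionnaire items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 5-point Likert answers: 4 respondents x 3 items (not survey data).
sample = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 3],
]
alpha = cronbach_alpha(sample)
print(round(alpha, 4))
```

Values closer to 1 indicate that the items vary together; a coefficient such as the 0.8458 reported here is conventionally read as acceptable internal consistency.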
2.6.2 The Sample and Sampling
To improve effectiveness and efficiency in terms of time and resources, a purposeful sampling technique was employed. The sample comprised manufacturing-based companies found in the HRDC Directory. The HRDC Directory listed approximately 5000 organizations but only 40 percent of those listed are manufacturing-based companies. The questionnaires were sent to 2000 manufacturing-based companies with more than 50 employees. The questionnaires were posted between December 2003 and January 2004. The questionnaires were addressed to the Personnel and Human Resources Managers of the organizations. A self-addressed stamped envelope was enclosed to maintain anonymity on the return of the completed questionnaires through the postal service.
2.6.3 Questionnaire Response
The questionnaires were posted to 2000 manufacturing organizations in Malaysia found in
the HRDC Directory. The appeal highlighted the focus of the study, i.e. training evaluation activities that relate to the benefits of training.
Of the 2000 questionnaires posted, 94 were returned with a note that the organizations had closed down or had moved to a new address. This reduced the original sample of 2000 to 1906. Reminder notes were sent out three weeks after the first posting in order to encourage a greater response rate. However, only 109 completed questionnaires were returned. The overall lack of organizational response may be attributed to a variety of causes: low interest, lack of time to respond, current restructuring of the organization, unavailable contact person, and outdated addresses.
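The response counts above can be turned into response rates with a short calculation. The sketch below uses only the figures stated in the text (2000 posted, 94 undeliverable, 109 completed); the variable names are illustrative. It suggests that the 5.5 percent quoted in Section 2.7 corresponds to the questionnaires posted, while the rate against deliverable questionnaires is slightly higher.

```python
# Counts taken from the text; variable names are illustrative only.
posted = 2000        # questionnaires mailed out
undeliverable = 94   # returned because the firm had closed or moved
completed = 109      # completed questionnaires received

deliverable = posted - undeliverable              # effective sample
rate_posted = 100 * completed / posted            # rate against all posted
rate_deliverable = 100 * completed / deliverable  # rate against deliverable

print(f"deliverable sample: {deliverable}")
print(f"response rate (posted): {rate_posted:.2f} percent")
print(f"response rate (deliverable): {rate_deliverable:.2f} percent")
```

Either way the response rate is low, which limits how far the findings can be generalized beyond the responding organizations.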
2.7 Findings and Discussion
Data were analysed using SPSS for Windows (Version 13). Statistical significance was accepted at the 0.05 level. A total of 109 completed questionnaires, about 5.5 percent of the 2000 posted, were returned. Part 1 of the questionnaire gathered information on the background of the companies. It was found that out of the 109 companies, 46 percent are multinational companies while 54 percent are Malaysian companies. Part 2 of the questionnaire gathered information on the organization's commitment to training. The results are shown in Table 1.
Table 1. Commitment to Training
Commitment to Training | Statistics (n = 109)
Does your organization conduct training programs for employee development? | Yes = 100 percent; No = 0 percent
Does your organization conduct training needs analysis before conducting any training programs? | Yes = 41.3 percent; No = 58.7 percent; Multinational = 39; Malaysian companies = 6
What type of training is conducted by your organization? |
Management e.g. leadership, supervisory, managing change, communication, human relations and interpersonal skills | 45.9 percent
Organization Specific e.g. training programs related to policies, values, cultures, goals and objectives of the whole organization | 18.9 percent
Technical e.g. quality, productivity, product training, IT training, accounting system and job related training | 64.3 percent
Personal Improvement e.g. motivation, time management, self development, managing self, presentation skills and business communication skills | 24.2 percent
Others | 0 percent
A total of 41.3 percent of organizations agreed that a training needs analysis was conducted prior to conducting any training program. The rest of the organizations conduct training in response to organizational problems such as low productivity, a morale problem or a crisis, and this training is frequently not coordinated with other functions of the organization. The lack of baseline information prevents evaluation, and no meaningful comparison of the participant's performance before and after training can occur.
The results indicate that 64.3 percent of organizations organized technical training. A large number of organizations felt the need to upgrade the technical competence of their employees in the areas of quality, productivity, product training, IT training, accounting system and job
related training. Of all the organizations interviewed, 65 percent reported that they have
extended their range of products during the last two years and 88 percent had made changes to machinery and equipment.
Management training was ranked second at 45.9 percent, followed by personal improvement at 24.2 percent. Nearly half of the organizations are thus concerned with management training. Many feel that skills such as leadership, supervision, managing change, communication, human relations and interpersonal skills are needed for management development. Although organization specific training is an emerging area, only about 18.9 percent of the organizations feel the need to impart training in this field.
Table 2 shows the level of evaluation conducted on management and non-management training by the organization.
Table 2. Training Evaluation Practices in Organization
Training Evaluation Practices in Organization | Statistics (n = 109)
Level 1 reaction evaluation | 35 percent
Level 2 learning evaluation | 25 percent
Level 3 behavioural evaluation | 16.5 percent
Level 4 results evaluation | 11 percent
No training evaluation practices | 12.5 percent
The results indicate that out of 109 companies, 35 percent of organizations conducted Level 1 evaluation by measuring the participant's reactions towards the training program while 25 percent of the organizations conducted Level 2 evaluation by measuring the participant's degree of learning as the result of the training initiatives.
Only 16.5 percent of organizations
conducted Level 3 evaluation by measuring the changes in the participant's behaviour
towards the job after each training program. A further 11 percent of organizations quantified the results of training and calculated the return on investment in training, which is classified as Level 4 evaluation. The remaining 12.5 percent of organizations have never conducted training evaluation after a training program. The results indicate that more
than half of the organizations do not evaluate their training at the behavioural or the results
levels. One reason is that the training function is sometimes seen as an isolated and peripheral function, which is not truly integrated into the job setting (Olsen, 1998).
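The statement that more than half of the organizations do not evaluate at the behavioural or results levels can be quantified from the shares in Table 2. The sketch below is illustrative only and assumes that the five categories are mutually exclusive, which the source does not state explicitly but which is consistent with the shares summing to 100 percent; the dictionary name is hypothetical.

```python
# Shares of organizations (percent) as reported in Table 2.
# Assumption: each organization is counted in exactly one category.
table2 = {
    "Level 1": 35.0,
    "Level 2": 25.0,
    "Level 3": 16.5,
    "Level 4": 11.0,
    "No evaluation": 12.5,
}

total = sum(table2.values())
behaviour_or_results = table2["Level 3"] + table2["Level 4"]
neither = total - behaviour_or_results

print(f"total: {total} percent")
print(f"Level 3 or Level 4: {behaviour_or_results} percent")
print(f"neither Level 3 nor Level 4: {neither} percent")
```

Under this assumption, roughly seven in ten responding organizations stop at reaction or learning evaluation, or do not evaluate at all.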
The means and standard deviations of the four levels of training evaluation for all 109 companies are shown in Table 3.
Table 3: Means and Standard Deviations of the Four Levels of Training Evaluation

Level      Mean ± SD
Level 1    3.63 ± 0.62
Level 2    3.41 ± 0.62
Level 3    3.26 ± 0.63
Level 4    2.99 ± 0.68
Note: Likert scale: where 5 = strongly agree; 4 = agree; 3 = neutral; 2 = disagree; 1 = strongly disagree
The majority of the organizations agree that they conduct Level 1 evaluation after each training program: the average for Level 1 evaluation is 3.63. The average score for Level 2 evaluation is 3.41, which indicates that some companies conduct Level 2 evaluation selectively, mostly for technical training. The average for Level 3 evaluation is 3.26, which indicates that measuring behavioral changes on the job after training is not popular among these manufacturing organizations. This could be due to the unavailability of specific tools to measure subjective changes in behavior. The average score for Level 4 evaluation is 2.99, indicating that the majority of these manufacturing organizations do not conduct results evaluation. This result was further confirmed by an interview in which it was mentioned that the benefits of training are not easily measured in quantitative terms and that most benefits cannot be measured immediately.
The means and standard deviations of the 34 questions in the instrument for all 109 companies are shown in Table 4.
Table 4: Means and Standard Deviations of 34 Questions in the Instrument

Level 1 - Reaction Evaluation
No.  Mean  SD     Item
1    4.12  0.689  Departmental heads conducted collective opinions from participants with regards to the training program conducted
2    3.03  0.934  Evaluate perceptions of participants on key benefits and value arising from training
3    4.03  0.724  Conduct training environmental audit to track participants satisfaction after training
4    4.38  0.862  Focus on perception of trainees towards the training program
5    2.74  0.908  Measure trainers competency and credibility after each training program
6    3.89  0.715  Most training programs conduct post course reaction evaluation after training
7    4.20  0.815  Always make an effort to ask participants whether they enjoy attending the training programs
8    2.67  1.021  Measure the accuracy of the training program in addressing the exact requirement of the job

Level 2 - Learning Evaluation
No.  Mean  SD     Item
1    3.69  0.641  Allow participants to write down what they have learned which might be useful for their work
2    4.28  0.703  Conduct pen and paper test for measuring the amount of knowledge gained from a training program
3    3.41  0.912  Administer a test before and after training with regards to the knowledge gained from a training program
4    2.98  1.090  Identify the principles, facts and techniques learned by participants
5    2.69  0.932  Participants were asked if there were any barriers preventing them from using what they have learned

Level 3 - Behavioral Evaluation
No.  Mean  SD     Item
1    2.89  0.909  Develop performance-based tests as part of the training evaluation
2    3.04  0.089  Assess the level of transfer of learning to the job
3    3.23  0.994  Measure the success rate of participants performing each item learned
4    3.43  0.745  Define an action plan for participants and evaluate the implementation success rate
5    3.93  1.079  Identify specific skill improvement as a result of a training program
6    3.77  0.931  Measure positive changes in personnel efficiency and effectiveness after training
7    3.51  1.099  Measure the behavior changes resulting from the training program
8    3.28  1.141  Organize the trainer's follow up session to track the participant's behavioral change after training
9    2.62  1.062  Use observation techniques to monitor changes of behavior and attitudes resulting from the training program
10   2.71  0.703  Conduct work performance evaluation in the workplace after training
11   3.32  0.773  Observing and documenting the practice of knowledge and skills learned by the trainee into the workplace
12   2.84  0.842  Assess the increase in knowledge and skills as well as attitude change of trainees
13   3.79  0.952  Conduct a preview session with your trainee to specify the expected objectives to achieve from the training

Level 4 - Results Evaluation
No.  Mean  SD     Item
1    2.56  0.721  Measure the level of productivity before and after a training program
2    2.91  0.668  Link effectiveness of training to financial benefit
3    3.10  0.711  Conduct cost-benefit analysis on training programs conducted
4    3.35  0.823  Measuring the worthiness of attending training in terms of cost and time away from work
5    2.82  0.913  Measure the tangible cost in terms of reduced cost and improved quality after training
6    2.71  0.793  Calculate the cost of training and its impact towards organization improvements
7    3.24  0.894  Compare the cost of training program with benefits obtained from it
8    3.18  0.615  Finding evidence of direct links between training investment and returns from training
Note: Likert scale: where 5 = strongly agree; 4 = agree; 3 = neutral; 2 = disagree; 1 = strongly disagree
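The link between the item scores in Table 4 and the level scores in Table 3 can be checked with a short sketch. It assumes each level score is the unweighted average of its item means, which the paper does not state explicitly, and it groups the "barriers" item under Level 2, following the discussion of the Level 2 results. The variable names are illustrative.

```python
from statistics import mean

# Item means from Table 4, grouped by evaluation level.
# Assumption: a level score is the plain average of its item means.
item_means = {
    "Level 1": [4.12, 3.03, 4.03, 4.38, 2.74, 3.89, 4.20, 2.67],
    "Level 2": [3.69, 4.28, 3.41, 2.98, 2.69],
    "Level 3": [2.89, 3.04, 3.23, 3.43, 3.93, 3.77, 3.51,
                3.28, 2.62, 2.71, 3.32, 2.84, 3.79],
    "Level 4": [2.56, 2.91, 3.10, 3.35, 2.82, 2.71, 3.24, 3.18],
}

# Level means as reported in Table 3.
table3_means = {"Level 1": 3.63, "Level 2": 3.41, "Level 3": 3.26, "Level 4": 2.99}

level_means = {level: mean(scores) for level, scores in item_means.items()}

for level, value in level_means.items():
    print(f"{level}: recomputed {value:.2f}, Table 3 reports {table3_means[level]:.2f}")
```

Under this assumption the recomputed averages agree with Table 3 to within 0.01, the small gaps being attributable to rounding of the item means.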
The results indicate that Level 1 evaluation (reaction) is the most widely practised form of training evaluation. A high mean score of 4.38 indicates that the majority of Malaysian manufacturing companies focus on the perception of trainees towards the training program. Managers play an active role in conducting Level 1 evaluation by collecting opinions from participants with regards to the training program conducted. Measuring the accuracy of a training program in addressing the exact requirement of the job is the least practised, as indicated by a low mean score of 2.67.
Among Level 2 practices, pen and paper tests of the knowledge gained from a training program are the most popular among these manufacturing companies, as shown by the mean score of 4.28. The lowest mean score for Level 2 evaluation, 2.69, indicates that organizations seldom ask participants whether there were any barriers which prevented them from using what they have learned.
Level 3 evaluation is modestly practised by manufacturing companies in Malaysia. The highest mean score of 3.93 indicates that the majority of these manufacturing companies identified specific skill improvement as a result of a training program. The use of observation techniques to monitor changes of attitude and behaviour as a result of the training program shows the lowest mean score of 2.62.
The apparent lack of practice in Level 4 evaluation (results) is probably due to the effort and potential complexities involved, which entail much more work. This is reflected in the survey result, which indicates low interest in conducting cost-benefit analysis of training by these organizations. Measuring the worthiness of attending training in terms of cost and time away from work showed a mean score of 3.35. This is regarded as one of the most popular practices of Level 4 evaluation by these organizations. Calculating the costs of training and its impact towards organization improvements showed one of the lowest mean scores, 2.71.
Independent t-tests were used to test for significant differences in the four levels of training evaluation conducted by multinational and Malaysian companies. Significant differences were found at Level 1, Level 2, Level 3 and Level 4 between multinational companies (N = 50) and Malaysian companies (N = 59) at p < 0.05 (see Table 5).
Table 5: Summary of t-tests of the four levels of training for multinational companies and Malaysian companies

Company          Level 1         Level 2         Level 3         Level 4
                 (Mean ± SD)     (Mean ± SD)     (Mean ± SD)     (Mean ± SD)
Multinational    3.78 ± 0.52     3.69 ± 0.56     3.63 ± 0.48     3.50 ± 0.53
Malaysian        3.49 ± 0.67     3.17 ± 0.57     2.94 ± 0.56     2.56 ± 0.46
t-value          2.635 *         4.758 *         6.794 *         9.838 *
*p < 0.05

The results indicate that the majority of multinational companies operating in Malaysia have a clearer objective of what ought to be done and have enshrined this in mission statements on
training. These multinational companies provide training and development for all employees in all areas of operations with expensive investment and serious attempts to produce a competent and quality workforce. The results show that multinational companies judge
training effectiveness as their immediate reaction to training evaluation. These multinational companies applied formal and systematic procedures and processes to assess training effectiveness as compared to Malaysian companies.
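The t-values in Table 5 can be approximated from the reported summary statistics alone. The sketch below assumes an equal-variance (pooled) independent-samples t-test; the paper does not say which variant was used, and because the means and standard deviations in the table are rounded, the recomputed values differ slightly from the reported ones. The function name and data layout are illustrative.

```python
import math

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Student's t for two independent groups, assuming equal variances."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err

N_MULTINATIONAL, N_MALAYSIAN = 50, 59

# (multinational mean, SD), (Malaysian mean, SD), t-value reported in Table 5
table5 = {
    "Level 1": ((3.78, 0.52), (3.49, 0.67), 2.635),
    "Level 2": ((3.69, 0.56), (3.17, 0.57), 4.758),
    "Level 3": ((3.63, 0.48), (2.94, 0.56), 6.794),
    "Level 4": ((3.50, 0.53), (2.56, 0.46), 9.838),
}

t_values = {}
for level, ((m1, s1), (m2, s2), reported) in table5.items():
    t_values[level] = pooled_t(m1, s1, N_MULTINATIONAL, m2, s2, N_MALAYSIAN)
    print(f"{level}: recomputed t = {t_values[level]:.3f}, reported t = {reported}")
```

With 50 + 59 - 2 = 107 degrees of freedom, the two-tailed 5 percent critical value is roughly 1.98, so every level clears the threshold and the gap widens steadily from Level 1 to Level 4.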
The results show that the majority of Malaysian companies did not conduct Level 3 and
Level 4 evaluations. Most Malaysian companies seem to lack the formal mechanism to
assess training effectiveness. The results of the t-tests were further confirmed by interviews which suggested relatively mild commitment of top management to training and some resistance by middle management to the function of training in Malaysian companies. Training seems to be a low priority area and training evaluation is conducted on an ad hoc basis. Part of this could be because identifying individual performance improvement after training is regarded as a tedious and lengthy process.
Only six Malaysian companies have conducted training needs analysis prior to conducting training. This result was further confirmed by an interview in which it was mentioned that managers at times send employees to training programs just to fill the quota. These employees are not the intended participants of the training program and would return without learning much. Since they do not have much commitment to learning after training, Level 2, Level 3 or Level 4 evaluation cannot take place. This trend is shown in fewer Malaysian companies practising Level 2, Level 3 and Level 4 evaluation as compared to multinational companies.
2.8 Limitations of Study
The number of respondents was relatively low. Even though the majority of manufacturing companies that have more than 50 employees are registered with the Human Resource Development Council (HRDC), the actual number of organizations that actively participate in training and development is rather low.
Out of the 2000 manufacturing based organizations listed in the HRDC Directory, less than 40 percent conducted at least one training program in a year (HRDC, 2003). Details of companies that do not participate in training were not disclosed by HRDC, because HRDC does not want training providers to seek out organizations with high unused funds. As the details of the organizations that do not conduct training programs were not known, it was decided to send the survey to all 2000 manufacturing based organizations. Hence, the majority of these organizations, which do not conduct training, could not answer the questionnaire.
The success of the study depends on the willingness of respondents to cooperate. Some may not see the value in participation while others may view the topic as sensitive or irrelevant. Despite reminder notes sent out three weeks after the first posting to encourage a greater response rate, the number of respondents remained relatively low. A comparison between respondents and non-respondents would have been helpful. Unfortunately, data were not available for making such comparisons in this study.
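The scale of the response problem described above can be illustrated with a short calculation. It assumes that the 109 companies analysed all come from the mailing to 2000 organizations and that every respondent conducts training; neither is stated outright, and the constant names are illustrative.

```python
# Figures taken from the text; how they combine is an assumption of this sketch.
SURVEYS_SENT = 2000          # organizations listed in the HRDC Directory
RESPONSES = 109              # companies included in the analysis
ACTIVE_SHARE_UPPER = 0.40    # "less than 40 percent" conducted training in a year

# Response rate against the whole mailing list.
overall_rate = RESPONSES / SURVEYS_SENT

# Upper bound on the number of organizations able to answer the questionnaire.
active_upper_bound = SURVEYS_SENT * ACTIVE_SHARE_UPPER

# Because fewer than this many organizations were active, the response rate
# among active organizations is at least this value.
rate_among_active_floor = RESPONSES / active_upper_bound

print(f"Overall response rate: {overall_rate:.2%}")
print(f"Response rate among active organizations: at least {rate_among_active_floor:.2%}")
```

Read this way, the overall response rate is about 5.5 percent, while the rate among organizations that actually conduct training is at least about 13.6 percent.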
2.9 Conclusion
The Kirkpatrick model has been considered one of the most prominent models of evaluation practised in Malaysia. However, the application of the four levels of evaluation in Malaysia is not well adopted. This study reveals that training evaluation carried out by most organizations in Malaysia is mainly to judge trainees' reactions. A culture of "fill in one of these before you go" typically pervades training evaluation. Most organizations lack the formal and systematic mechanisms to assess training effectiveness. Many companies remain blissfully unaware of how much they spend on training and of whether it is effective or not. Indeed, even the use of expensive external trainers does not appear to trigger detailed evaluations.
The majority of Malaysian organizations show little or no interest in conducting training evaluation and have even less interest in the results of evaluation as a method of evaluating effectiveness. Some find evaluation difficult as it is almost impossible to determine which participant efforts are attributable to training and which are not.
Although the Kirkpatrick model frames evaluation in terms of training outcomes, most practitioners do not know what evaluation criteria to look for. This confusion about the actual outcomes possibly hinders the ability to conduct Level 3 and Level 4 evaluations meaningfully.
Hence this research gap shows the opportunity to examine in detail the specific outcomes required from training and the transfer component of training. This study will determine what strategies might be most helpful in maximizing the transfer of learning and in constructing an appropriate model for evaluation.
2.10 References for Paper Two
Ahmad, R.H. 1998, 'Educational development and reformation in Malaysia: past, present and future', Journal of Educational Administration, vol. 36, no. 5, pp. 462-475.
Attkinson, C.C., Sorenson, J.E., Hargreaves, W.A. & Hororwitz, M.J. 1978, Evaluation of Human Service Programs, Academic Press, London.
Arthur Anderson & Co. 1991, Professional Services in Malaysia, Arthur Anderson & Co., Kuala Lumpur, Malaysia.
Asma, A. 1994, Training design development: The practice of four development agencies in Malaysia, Unpublished Ph.D. dissertation, University Pertanian Malaysia, Serdang.
Barron, T. 1996, 'A new wave in training funding', Training and Development Journal, vol. 50, no. 5, pp. 28-32.
Bernthal, P.R. 1995, 'Evaluation that goes the distance', Training and Development, vol. 49, no. 9, pp. 41-49.
Blanchard, P.N., Thacker, J.W. & Way, S.A. 2000, 'Training evaluation: perspectives and evidence from Canada', International Journal of Training and Development, vol. 4, no. 4, pp. 295-303.
Bramley, P. 1996, Evaluating Training Effectiveness, McGraw-Hill, Maidenhead and New York.
Bramley, P. & Kitson, B. 1994, 'Evaluating training against business criteria', Journal of European Industrial Training, vol. 18, no. 1, pp. 10-14.
Brandenburg, D.C. 1982, 'Training evaluation: what's the current status', Training and Development Journal, vol. 36, pp. 28-29.
Brinkerhoff, O.R. 1988, 'An integrated evaluation model of HRD', Training and Development Journal, vol. 42, no. 2, pp. 66-8.
Carnevale, A.P. & Schulz, E.R. 1990, 'Return on investment: according to training', Training and Development Journal, vol. 44, no. 7, pp. 1-32.
Chen, H.T. & Rossi, P.H. 1992, Using Theory to Improve Program and Policy Evaluations, Greenwood Press, Westport, CT.
Davidove, A.E. & Schroeder, P.A. 1992, 'Demonstrating ROI of training', Training and Development Journal, vol. 46, no. 8, pp. 70-71.
Erickson, M.R.C. 1991, Business-education partnerships: a study of evaluation methods, Doctoral dissertation, the George Washington University, dissertation abstracts international, vol. 52/07A, AAC9133008.
Easterby-Smith, M. 1985, 'Training course evaluation from an end to a means', Personnel Management, vol. April, pp. 25-27.
Gutek, S.P. 1988, 'Training program evaluation: an investigation of perceptions and practice in non-manufacturing business organizations', doctoral dissertation, Western Michigan University, Kalamazoo, MI, dissertation abstracts international, vol. 49/05a, AAC8811388.
Groove, E.A. & Ostroff, C. 1990, 'Program evaluation', in Developing Human Resource, eds Wexley, K. & Himicks, J., BNA Books, Washington D.C.
Heneman, H.G. & Schurab, D.P. 1986, Human Resource Management, Irwin, Illinois.
Hamblin, A.C. 1974, Evaluation and Control of Training, McGraw-Hill, New York.
Hamid, A.A., Mohd, S., Muhamad, A.H. & Ismail, Z. 1987, Management Education in Malaysia, in Developing managers in Asia, eds Tan Jing Hee & You Poh Seng, Addison-Wesley, Singapore.
Human Resource Development Council 1992, Human Resource Development Act 1992, Ministry of Human Resource, Kuala Lumpur, Malaysia.
Human Resource Development Council 2003, Ministry of Human Resource, Kuala Lumpur, Malaysia.
Junaidah, H. 2001, 'Training evaluation: clients' role', Journal of European Industrial Training, vol. 25, no. 7, pp. 374-379.
Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction', Journal of American Society for Training and Developing, vol. 13, pp. 3-9.
Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning', Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.
Kirkpatrick, D.L. 1960a, 'Techniques for evaluating training programs: part 3 - behaviour', Journal of American Society for Training and Developing, vol. 14, no. 1, pp. 13-18.
Kirkpatrick, D.L. 1960b, 'Techniques for evaluating training programs: part 4 - results', Journal of American Society for Training and Developing, vol. 14, no. 2, pp. 28-32.
Kirkpatrick, D.L. 1976, Evaluation of Training, Training and Development Handbook: A guide to human resource development, 2nd edn, Craig, R.L.O., McGraw-Hill Publisher, New York.
Kirkpatrick, D.L. 1979, 'Techniques for evaluating training programs', Training and Development Journal, vol. 33, pp. 78-92.
Malaysia Economic Planning Unit 1996, Seventh Malaysia Plan: 1996-2000, Government Printer, Kuala Lumpur, Malaysia.
Maimunah, I. 1990, Extension: Implication to Community Development, 2nd edn, Dewan Bahasa and Pustaka, Kuala Lumpur, Malaysia.
Mirza, S.S. & Juhary, H.A. 1995, Managerial training and development in Malaysia, Malaysian Institute of Management, Malaysia.
Olsen, J.H. 1998, 'The evaluation and enhancement of training transfer', International Journal of Training and Development, vol. 2, no. 1, pp. 61-75.
Pauzi, M. 1985, 'Training nuisance, 12th ARTDO International Conference', Petaling Jaya, Malaysia, 22-27 July.
Phillips, J.J. 1991, Handbook of Training Evaluation and Measurement Methods, Gulf Publishing Company, Houston, TX.
Pillai, P. 1994, Industrial Training in Malaysia: Challenge and Response, ISIS Publication, Setiakawan Printers Sdn Bhd, Malaysia.
Pillai, P. & Othman, R. 1994, 'Learning to work, working to learn', Institute of Strategic and Institutional Studies, Kuala Lumpur.
Plant, R.A. & Ryan, R.J. 1994, 'Who is evaluating training?', Journal of European Industrial Training, vol. 18, no. 5, pp. 27-30.
Rowe, C. 1992, 'How useful was it? The problem of evaluating in-house training programs', Industrial and Commercial Training, vol. 24, no. 7, pp. 14-18.
Rossi, P.H. & Freeman, H.E. 1993, Evaluation: A Systematic Approach, 5th edn, Sage Publications, California.
Ruthman, L. & Mowbray, G. 1983, Understanding Program Evaluation, Sage Publication, London.
Saiyadain, M.S. 1995, 'Perceptions of sponsoring managers, training organizations, and top management attitude toward training', Malaysian Management Review, vol. 30, no. 4, pp. 69-74.
Shadish, W.R. & Epstein, R. 1987, 'Patterns of program evaluation practice among members of the evaluation research society and evaluation network', Evaluation Review, vol. 11, no. 5, pp. 555-590.
Shamsuddin, A. 1995, 'Contextual factors associated with evaluation practices of selected adult and continuing education providers in Malaysia', unpublished PhD dissertation, University of Georgia, Athens, GA.
Shelton, S. & Alliger, G. 1993, 'Who's afraid of level of evaluation?', Training and Development Journal, vol. 47, no. 6, pp. 43-46.
Smith, A. 1990, 'Evaluation of management training subjectivity and the individual', Journal of European Industrial Training, vol. 14, no. 1, pp. 12-15.
Smith, A.J. & Piper, J.A. 1990, 'The tailor-made training maze: a practitioner's guide to evaluation', Journal of European Industrial Training, vol. 14, no. 8, pp. 2-24.
Wagel, H.W. 1977, 'Evaluating management development and training programmes', Personnel Management, vol. 54, no. 4.
2.11 Appendix A The Questionnaire for Research Paper Two
This survey is about ....
Enormous resources of time, money, and energy are invested in every imaginable kind of training and development program. Little effort is invested in discovering how well those processes work, how they might be improved or, indeed, whether they work at all. It is important for organizations that use training and development activities to seek practical ways of evaluating those activities.
With greater emphasis by the Ministry of Human Resources since the enactment of the Human Resources Development Act, 1992, there is a need to improve the effectiveness of training activities in Malaysia in order to achieve greater productivity among the workforce. However, effective evaluation requires the examination of training outcomes at several levels of evaluation. This research study is designed to examine to what extent the Malaysian manufacturing sector has carried out training evaluation and how these organizations have benefited from the training event.
The information you provide will help us better understand the quality and effectiveness of the training evaluation systems that have so far been carried out within the Malaysian context. Because you are the one who can give us a correct picture of how you experience conducting training evaluation, I wish to invite you to participate in this research study. The results will be presented in an aggregate and untraceable manner.
If you have any enquiry about this research or the questionnaire, feel free to contact me, Lim Guan Chong, at No. 54, Jalan SS2167, 47300 Petaling Jaya, Selangor Darul Ehsan, or my cell phone 019-4781553, or my e-mail
[email protected]. You can also contact my supervisors, to verify this survey and my doctoral candidateship: Dr. Travis Kemp (e-mail:
[email protected]) or Professor Dr. Leo Ann Mean (e-mail:
[email protected]).
Part 1: Tell us about your Organization
Name of organization:
Type of company:
Multinational
Malaysian companies
Nature of business:
Manufacturing
Service
Others, please specify
Part 2: Commitment to Training
Does your organization conduct training programs (in-house training programs, public programs and on-the-job training) for employee development?
Yes
No
Does your organization conduct training needs analysis before conducting any training programs?
Yes
No
What types of training programs are conducted by your organization?
Management (e.g. leadership, supervisory, managing change, human relation and interpersonal skills, communication)
Organizational specific (e.g. training programs related to whole organization policies, values, culture, goals and objectives)
Technical (e.g. quality, productivity, product training, IT training, accounting system and job related training)
Personal Improvement (e.g. motivation, time management, self development, managing self, presentation skills and business communication skills)
Others. Please specify
Part 3: Training Evaluation Practices
Instructions: Please indicate your agreement and disagreement that truly represents the practice in your organization on a scale of 5 (strongly agree), 4 (agree), 3 (neutral), 2 (disagree) to 1 (strongly disagree), to express your view.

Response scale for each item: 5 = Strongly Agree; 4 = Agree; 3 = Neutral; 2 = Disagree; 1 = Strongly Disagree

Training Evaluation Practices
1. Most training programs conduct post course reaction evaluation after training
2. Always make an effort to ask participants whether they enjoy attending the training programs
3. Departmental heads conducted collective opinions from participants with regards to the training program conducted.
4. Participants were asked if there were any barriers preventing them from using what they have learned
5. Allow participants to write down what they have learned which might be useful for their work
6. Define an action plan for participants and evaluate the implementation success rate
7. Conduct pen and paper test for measuring the amount of knowledge gained from a training program
8. Administer a test before and after training with regards to the knowledge gained from a training program.
9. Develop performance-based tests as part of the training evaluation
10. Identify specific skill improvement as a result of a training program
11. Measure positive changes in personnel efficiency and effectiveness after training
12. Measure the behaviour changes resulting from the training program
13. Conduct a preview session with your trainee to specify the expected objectives to achieve from the training
14. Organize the trainer's follow up session to track the participant's behavioural change after training
15. Measuring the worthiness of attending training in terms of cost and time away from work
16. Use observation techniques to monitor changes of behaviour and attitudes resulting from the training program.
17. Measure the level of productivity before and after a training program
18. Link effectiveness of training to financial benefit
19. Conduct cost-benefit analysis on training programs conducted
20. Evaluate perceptions of participants on key benefits and value arising from training
21. Identify the principles, facts and techniques learned by participants
22. Measure the tangible cost in terms of reduced costs and improved quality after training
23. Measure the accuracy of the training program in addressing the exact requirement of the job
24. Measure the success rate of participants performing each item learned
25. Conduct training environmental audit to track participants satisfaction after training
26. Measure productivity improvement after each training
27. Calculate the cost of training and its impact towards organization improvement
28. Conduct work performance evaluation in the workplace after training
29. Measure focus on perception of trainees towards the training program
30. Assess the increase in knowledge and skills as well as attitude change of trainees
31. Compare the cost of training program with benefits obtained from it
32. Observing and documenting the practice of knowledge and skills learned by the trainee into the workplace.
33. Assess the level of transfer of learning to the job
34. Measure trainers competency and credibility after each training program
35. Finding evidence of direct links between training investment and returns from training.
Thank you for your participation!
Research Paper 3
MULTI-RATER FEEDBACK FOR TRAINING AND DEVELOPMENT: AN INTEGRATED PERSPECTIVE
Lim Guan Chong Master of Business Administration (Finance) University of Hull
International Graduate School of Management University of South Australia
3.1 Abstract
This paper looks at the difference between success and failure of multi-rater feedback in enhancing employee self awareness and encouraging them to engage in development
programs. Multi-rater feedback is basically used as an unfreezing process in which employees are motivated to rethink their behaviour and its impact on others. Multi-rater feedback provides employees with good data from multiple perspectives, encouraging
openness to listen and accept self weaknesses for development. A comprehensive integral model encompassing process consultation and good conversation is used to facilitate
effective development after multi-rater feedback. Process consultation provides a description of prescriptive approach to help the employees recognize and accept responsibility for the
difference in perception. The flexibility of process consultation should be enhanced by integrating good conversation to promote ideal communication and interaction between the process consultant and employees which will eventually build trust for open learning and
development. Post multi-rater feedback is introduced to assess the degree of performance improvement which resulted from the development program.
3.2 Introduction
Multi-rater or 360-degree feedback has gained wide acceptance and usage to support the development of leadership and management skills (Cacioppe & Albrecht, 2000). The multi-rater feedback process provides comprehensive feedback collected from the people around ratees in the workplace. One underlying rationale for such a system is its potential impact on the target ratee's self awareness: increasing self awareness is thought to enhance development (Ashford, 1993; Mount, Judge, Scullen, Sytsma & Hezlett 1998). According to Van Veslor et al. (1993), the number of multi-rater feedback instruments has increased significantly in the past 15 years. It is estimated that American companies spent $152 million in 1992 on this form of feedback for development (Hoffman, 1995). Multi-rater feedback was first introduced to the UK in the early 1990s, and has spread quickly across public and private sector organizations (Fletcher & Baldry, 2000). The spread is based on the perceived benefits of a fairer and more accurate representation of performance, which creates development and learning potential that can consequently motivate changes in behaviour. In a review of 20 organizations responding to the delivery of multi-rater feedback, London and Smither (1995) found that 40 percent of the respondents always linked multi-rater feedback to specific developmental activity. According to Moses, Hollenbeck and Sorcer (1993), there does not appear to be a distinct individual who founded or invented the multi-rater feedback process. They argued that the term multi-rater feedback has been mistaken for a newly discovered concept, as perceptions of people have been available as long as there were people around to observe them.
3.3 The Use of Multi-rater Feedback
The implementation of multi-rater feedback varies among organizations. The widespread adoption of multi-rater feedback and other multi-source feedback is based on the perceived benefits of a fairer and more accurate representation of performance, because it offers a more rounded assessment of the individual, not just the top-down perspective of conventional appraisal. It is an empowering mechanism, which allows subordinates to exert some influence over the way they are managed. The same is true for peers, who can think back and improve a colleague's role performance as a team member. Moreover, the multi-rater feedback system provides a natural method for both enhancing learning and improving performance. As the complexity of job functions increases in the workplace, it is crucial for employees to receive feedback from a variety of constituencies and not only through the traditional superior-subordinate appraisal approach. This feedback facilitates self awareness by enabling participants to compare their own perceptions of their skills and personal style with the perceptions of important observers in their work environment. Multi-rater systems are assumed to improve performance by increasing self awareness through diversified information from multi-rater feedback (Borman, 1997). Ratees who receive feedback or appraisal on their performance from a variety of sources will be responsible for improving current job performance by continuing to add value to the organization's needs and being prepared for the future.
Continuing development has been the key priority for most organizations in keeping the
workforce updated with on-going technological changes. Measuring and improving worker performance has become increasingly important for organizations to stay competitive. According to Nowack (1993), the increased use of multi-rater feedback in organizations was mainly due to the increasing need for continuous measurement of improvement efforts; the need for job-related feedback for employees affected by career plateauing; and the need to maximize employee potential in the face of technological changes, competitive challenges and increased workforce diversity.
Multi-rater feedback has also been seen to increase reliability, fairness and acceptance of the data by the person being rated (London, Wohlers & Gallagher, 1990). This is because feedback is received from multiple sources and not just from one. A study conducted in an American company showed that only 3.9 percent of staff felt that feedback should come solely from the superior, while 94.8 percent felt that feedback should come from both superior and co-workers (Cacioppe & Albrecht, 2000). This result indicates that there is an on-going trend in American companies to have performance feedback from multiple sources despite the fact that there may always be variation in terms of perceptual differences between
the self and others on the feedback results. Although a multi-rater feedback system provides the ratee with greater information as a base for development, Tornow and London (1998) suggested that multiple feedback sources require balancing, as multiple sources may potentially offer conflicting viewpoints to the ratee. However, the differences in perspective between the rater and the ratee should not be treated as an assessment error. This is further supported by Ashford (1993), who found that multi-rater feedback is important to the ratee because the information could further stimulate the ratee's cognitive reactions, which would likely influence subsequent behavioral changes. A multiple feedback system is a source of information, which can enhance personal learning by providing the opportunity to ratees who are being assessed to compare their self-perceptions against the perceptions of others
regarding their behaviour. Multi-rater feedback is simply a set of performance-related information which is essential for learning and development.
Bennis, Benne and Chin (1969) are of the view that multi-rater feedback is a critical element in affecting change in performance evaluation. According to Zemke and Zemke (1995), adults undertake learning experiences when they see a need to acquire a new or different skill or knowledge. Multi-rater feedback provides the opportunity for open communication between rater and ratee to discuss the ratee's past behaviour and weaknesses, encouraging openness to hearing and accepting feedback. Such feedback and open communication are instrumental for an unfreezing process in which the ratee is motivated to rethink his or her weaknesses and strengths (Shipper & John, 1992). McCauley and Moxley (1996) also viewed multi-rater feedback as an instrument in an unfreezing process, in which ratees will have a chance to rethink their previous and current behaviour based on the discrepancies of the results between self and others and how their weaknesses would affect others. Hence, conducting multi-rater feedback before and after training provides an avenue for the training provider to evaluate performance changes. This shows that the feedback received can be used both as reinforcement of past learning and as an opportunity for future learning (Rosti & Shipper, 1998).
Evidence from various settings has demonstrated an association between self awareness and
performance outcome (Fletcher, 1997). It has been found that high self awareness is related to high performance ratings in various aspects (Atwater, Ostroff, Yammarino & Fleenor, 1998; Bass & Yammarino, 1991; Furnham & Stringfield, 1994). Nasby (1989) is of the opinion that ratees with high self awareness are more able to integrate feedback from various sources into their self-perception in order to reach a higher performance outcome. Ashford (1984) found that people with low self awareness are more likely to ignore or discount feedback about them
and will have a negative attitude towards work. This indicates that a highly self-aware ratee is likely to exert self-positivism to accept feedback and show self-motivation for improvement (Fletcher & Bailey, 2003).
However, according to Greguras, Ford and Brutus (2003), further research needs to be conducted to investigate the effectiveness of multi-rater feedback systems in increasing one's
self awareness. Such understanding will lead to eventual improvement in one's performance, as ratees may react differently to the different sources of information they receive. Information obtained from different sources would have different effects on the ratee's self awareness. If the feedback proves to be true, how would the ratee react to this perceptual reality? Conway and Huffcutt (1997) commented that although different raters present different information, not much research has explored how the ratee attends to, integrates, and uses the information from the various raters. Although a multi-rater feedback system increases one's self awareness and leads to further improvement, a further specific mechanism needs to be included (Hazucha, Hezlett & Schneider, 1993; Reilly, Smither & Vasilopoulos, 1996; Walker &
Smither, 1999). The specific mechanism refers to identification of appropriate personality traits, skills or competency needed by the ratees; establishing an appropriate feedback rating
approach and selection of relevant raters. This proposed mechanism must be embedded within the feedback system prior to the implementation. This will increase ratee readiness in accepting the multi-rater feedback and lead to individual self awareness for further
improvement. Many studies have been conducted to discover specific interventions of the multi-rater feedback mechanism.
3.4 The Effectiveness of Multi-rater Feedback for Development
Fletcher and Bailey (2003) found that multi-rater feedback provides the opportunity for the
ratee and raters to agree on the level of competence that is needed. Church (1997) also supports this view and suggested that multi-rater feedback provides both the rater and ratee
with the opportunity to agree on the development needs of a required performance standard, competency and skills necessary for the ratee.
Both the rater and ratee would have an
opportunity to clarify their respective expectations in order to develop a psychological
contract or agreement. The ratee would be more focused on what is needed by the people working around them, and raters would have a better understanding of the ratee's strengths and weaknesses. In certain multi-rater feedback systems this is known as a gap analysis process, and other multi-rater literature refers to it as congruence-d (Warr & Bourne, 1999). Edward (1993, 1994) stated that d is the score difference between the ratee and other raters. However, Fletcher and Bailey (2003) commented that telling a ratee the d score is of no use unless the rater can provide specific and meaningful information to reduce the gap between the ratee's and raters' scores. Congruence-d is obtained by subtracting the average score of the other raters from the self-rating for each feedback questionnaire item, and dividing that value by the standard deviation of the raters' and ratee's scores (Warr & Bourne, 1999). The level of self awareness is signified by the d score. If the d score is equal to 0, this signifies complete
agreement between the self and the others' ratings on all items. Disagreement between ratee's and raters' ratings is generally reflected in low correlations between ratings from different sources. This shows that different rater sources actually provide different information (Conway and Huffcutt, 1997; Harris and Schaubroeck, 1988). Studies by Ashford (1993) and Brutus, London and Martineau (1999) on the relative impact of different rater sources showed that raters have different implications for the development of the ratee. The studies discovered that subordinate ratings had the largest impact on goal selection, followed by peers and superior. This shows that the selection of information from different rater sources is important for ratees in deciding which rater source is most qualified and which feedback is important for further improvement (Kluger & DeNisi, 2000). Mount (1984) supports the validity of subordinate ratings and indicated that the majority of ratees show approval for subordinate ratings for developmental purposes (Bernardin, Dahmus & Redmon, 1993; Facteau et al., 1997, 1998; London et al., 1990; McEvoy, 1990).
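The congruence-d calculation described above (Warr & Bourne, 1999) can be illustrated with a minimal sketch. The source does not specify exactly how the standard deviation is pooled, so this sketch assumes the sample standard deviation of the ratee's and raters' scores taken together for each item; the function names and the ratings shown are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev


def congruence_d(self_rating, other_ratings):
    """Per-item congruence-d: (self - mean(others)) / SD of all scores.

    Assumption: the SD is the sample standard deviation of the ratee's and
    raters' scores pooled together for one questionnaire item.
    """
    all_scores = [self_rating, *other_ratings]
    spread = stdev(all_scores)
    gap = self_rating - mean(other_ratings)
    if spread == 0:
        # Every score is identical, so there is no gap to standardise.
        return 0.0
    return gap / spread


def mean_congruence_d(self_ratings, others_by_item):
    """Average the per-item d scores across all questionnaire items."""
    return mean(congruence_d(s, o) for s, o in zip(self_ratings, others_by_item))


# Hypothetical data: three items on a 1-5 scale, four other raters per item.
self_ratings = [4, 3, 5]
others_by_item = [[4, 4, 4, 4], [2, 3, 4, 3], [3, 3, 4, 2]]

for item, (s, o) in enumerate(zip(self_ratings, others_by_item), start=1):
    print(f"item {item}: d = {congruence_d(s, o):.2f}")
print(f"overall d = {mean_congruence_d(self_ratings, others_by_item):.2f}")
```

Under these assumptions a d of 0 indicates agreement between the self-rating and the average of the other raters, a positive d indicates that the ratee rates himself or herself higher than others do, and a negative d indicates the opposite.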
A further study by Greguras, Ford and Brutus (2003) of 213 managers used a policy-capturing design that allowed factors (i.e. lead others, general administrative performance, building working relationship and overall performance) to be manipulated. The study showed that superior ratings would be weighted more heavily than peer or subordinate ratings for the
ability to lead others, general administrative performance, building working relationship and
overall performance. Ratees will attend more to peer ratings than subordinate ratings for general administration of roles and responsibilities because peers are more likely to
understand the ratee's duties, which are similar to their own. However, a study by Atwater, Roush and Fischthal (1995) showed that ratees attend more to subordinate ratings than to peer ratings for the ability to lead others, as subordinates have first-hand experience with the ratee's leadership behaviour. Selection of feedback information is tied closely to the ratee's perception. Ratee development success is therefore closely related to the ratee's perception of the source of information. Users should consider whether multi-rater feedback is best used for development or only provides different dimensions of reference for the ratees.
3.5 The Effectiveness of Multi-rater Feedback for Appraisal
There are debates on the use of multi-rater feedback for appraisal and development (Bracken, Dalton, Jako, McCauley & Pollman, 1997). According to London et al. (1990) and Antonioni (1994), respondents will answer questions differently if the feedback is for appraisal purposes. A study by London and Smither (1995) showed that 40 percent of people who provided multi-rater feedback ratings said they would have altered those ratings if the company planned to use them for evaluation or appraisal. According to McEvoy and Buller (1987), ratees view the process as most useful when used for development as opposed to appraisal. London and Beatty (1993) found evidence to support this. They reported that 34 percent of the respondents in their study would rate their superior differently if the feedback were shared with their superior. Hence, there is still an element of fear that keeps individuals from appraising their superiors honestly. Further studies should be carried out to determine the capability of multi-rater feedback that is used in performance appraisal. Few researchers agree that multi-rater feedback is useful solely for developmental purposes, as it is also widely used in managerial and leadership development programs (Cacioppe, 1998; Cacioppe and Albrecht, 2000; Garavan,
Morley & Flynn, 1997; McCauley & Moxley, 1996; Thach, 2002). O'Reilly (1994) suggested that when multi-rater feedback is used for development purposes, scores from
raters do not vary much. However, this was not the case for formal performance appraisals.
3.6 The Variation of Multi-rater Feedback Information
A study by Kluger and DeNisi (1996) on the effectiveness of multi-rater feedback interventions showed that only one-third actually yielded positive improvements in performance. There is an urgent need to take a closer look at the effectiveness of multi-rater feedback in performance development. Feedback is invaluable to the ratee as it comes from multiple sources and provides multiple perspectives. Each opinion or perspective may provide relevant yet different feedback for the ratee to focus upon (Atwater & Yammarino, 1993; Hazucha et al., 1993; Tornow, 1993). Ghorpade (2000) commented that having more information does not necessarily mean a higher accuracy rate, and that information provided by just a superior is not necessarily biased. If the source does not have an opportunity to observe the ratee's behaviour, or does not recognize the requirements of a particular
performance dimension, feedback from the source may be inaccurate for the ratee's
development. Therefore the quality of ratings from different sources for a particular dimension should be assessed (Kluger & DeNisi, 1996). London and Smither (1995) stated that ratings provided by different raters are likely to be inconsistent, which may create much confusion and disagreement on the results and may not support future development. According to Moses et al. (1993), multi-rater feedback relies solely on the instrument scoring
system or data collection methods to interpret the information for ratees. Moses et al. (1993) argued that multi-rater feedback is based on people's observations and the observer may not
know what behaviour to look for. If the primary purpose of multi-rater feedback is to identify developmental opportunities, then a set of competent performance behaviours has to be identified and communicated to all raters prior to the process. This would enable the rater
to understand the required habits, behaviors or styles so that a proper and fair judgment
towards the ratee's performance is ensured. The rater's feedback is important and may have an impact on the ratee's subsequent developmental priorities.
Factors that can distort the rater's feedback, such as perception bias, cultural issues and gender, should also be given special attention (Cacioppe & Albrecht, 2000). An example of perception bias is the assumption that a man will show better leadership than a woman. A study of three organizations, with a total of over 20,000 employees, showed that there was a positive correlation between performance and age
until the age of 45 (Cacioppe & Albrecht, 2000). The study indicates that raters are likely to stereotype younger ratees as performing better than older ratees, or older ratees may have
better experience compared to younger ratees. However, by looking at the cultural dimension, Leslie, Gryskiewicz and Dalton (1998) argued that multi-rater feedback might not
necessarily be well accepted by cultures in certain countries. Some cultures do not subscribe to the same notion that feedback is valuable and can guide manager development. For instance, cultures such as the French may place more value on lineage or social class than
developing managers. People from different cultures may find it a shock to be asked for personal information regarding their superiors. American managers find it difficult to get those who report directly to them to give negative feedback (Wilson et al., 1996). Another example is the Asian value of face-saving, where a request for the information needed in multi-rater feedback may come across as offensive (Wilson et al., 1996). An organization that wishes to conduct multi-rater feedback needs to take a closer look at the age, culture and gender of the raters and ratee, as these may affect the effectiveness of the multi-rater feedback process.
Honey and Mumford (1982) reflected that in the event of self assessment, most managers are
poor reflectors. They prefer to charge on with new ideas rather than look backwards and reflect on how things might have gone better. However, according to Waldman, Atwater and Antonioni (1998), individuals who rated themselves higher are likely to have higher self-esteem and self-concept. Disagreement over the result could be a threat to the ratee's self-esteem and weaken their motivation for further development. Special caution needs to be taken in designing multi-rater feedback in order to minimize the potential for the ratee to become pessimistic and to ensure that the ratee's self-image is converted to productive behavioural
change (Wood, Allen, Pillenger & Kahn, 1999). The feedback process should be designed as a tool to ensure effective interpretation of information received from multi-rater feedback to stimulate individual and organization improvement in attaining strategic business objectives
(Heisler, 1996). Information from multi-rater feedback is mainly used for developing people but, increasingly, it is being used for strategic planning in training and development (Romano, 1994; Atwater et al., 1993). Research conducted with 48,000 participants indicated that multi-rater feedback could successfully contribute to the effectiveness of training and development (Cacioppe & Albrecht, 2000).
3.7 Multi-rater Feedback Practices in Malaysia
In the Malaysian training environment, multi-rater feedback could be used as one of the assessment models for training and development. Alimo-Metcalf (1998) commented that multi-rater feedback should only be used in the context of assessment for development. Payne (1998) supported the view that multi-rater feedback could be a potentially powerful
and even dangerous tool. Therefore it should be confined to the developmental arena and used by people who know what they are doing.
Training evaluation and assessment practices in Malaysia are still considered at an
elementary stage. A study conducted by Zakaria and Rodzhan (1993) on 94 manufacturing and service organizations in Malaysia found that only 44 percent of respondent organizations conducted formal training. Of those who conducted formal training, 23 per cent did not conduct any training needs assessment. The main reason was lack of expertise to perform
assessment. Among these respondents, the main source of information for training needs assessment was the problems faced by their organizations. This evidence shows the lack of attention given to transference of skills in training evaluation and feedback. Therefore, it is wise to establish and instill the right approach to training and development, as jobs today are increasingly complex and the traditional method of having a superior rate a subordinate's performance is inadequate in giving quality information to improve performance and skills. The training culture in Malaysia has been indirectly influenced by multinational companies operating in Malaysia. This is supported by a survey by Wan Aziz (1994), which showed that the majority of multinational companies operating in Malaysia brought in their own training culture. A survey by Zakaria and Rodzhan (1993) of 108 manufacturing companies suggested that about 67 percent of the multinational companies interviewed conducted general and specific
training programs for all levels of staff. These multinational companies in Malaysia need to conduct training because they require highly-skilled manpower who are able to operate new
and sophisticated machinery or research product improvement. A study by Wan Aziz (1994) on 120 companies showed that 55.6 percent of Malaysian-owned companies conduct training. This shows that Malaysian-owned companies are emulating the training culture of multinational companies in order to cope with a challenging environment. Research by Junaidah (1999) showed that Malaysian companies feel discouraged when undertaking training, as they are not able to mark the progress of development after training. The main reason may lie in their inability to see the tangible benefits of training (Saiyadain & Juhary, 1995). The majority of Malaysian companies conduct training needs assessment on a general basis. Zakaria and Rodzhan (1993) found that only 16 per cent of Malaysian companies indicated that their training needs assessment was based on the strategic plan of the
organization. This indicates a lack of strategic orientation in the way training was conducted in Malaysian companies. Components of training and development in an organization need to cohere with one another in supporting organization strategy.
During the pre-training stage, a needs assessment is crucial in identifying relevant skills needed by the individual to contribute to the strategic objectives of the company. Multi-rater
feedback would complement the needs analysis by providing the ratee with multi-source feedback for further development. The ratee will be given an opportunity to understand their strengths and weaknesses from a different source and focus on reducing weaknesses and maintaining strengths. In Malaysia, training needs assessments are not conducted by measuring
individual skill deficiency but through general perceptions of a few top executives in the whole department or organization. Organizations see training as an organizational need
rather than an individual need. Mirza and Juhary (1995) found that training organizations in Malaysia offered training programs that were not relevant to the needs of the organizations and were too theoretical, one-shot with no follow-up and not interactive. Organizations have neither the professional competence nor the resources to identify training needs and mount relevant training programs. Mirza and Juhary (1995) indicated that the stated flaws of training could be attributed to the partial training culture brought by multinational companies
in Malaysia. They commented that assessment is difficult; it is almost impossible to determine which employee weakness can be addressed by training. The culture of conducting training evaluation among Malaysian companies was simply not popular or encouraging.
According to June and Rozhan (2000), given no proper pre and post training evaluation, the organization will be constrained in its ability to link training with strategic objectives. It would be difficult for the training and development to have a meaningful impact on organizational effectiveness. Their study also provided the argument that multi-rater assessment is not practiced by Malaysian-owned companies for development. If Malaysian
companies wish to conduct complete training and proper assessment, it is wise to use multi-rater feedback in the training needs assessment so that organizations would be more focused in the development process and able to measure its effectiveness. Shipper and John (1992) found that multi-source information may be a mechanism for open communication among diverse groups to establish a proper psychological contract and clarify expectations. This is supported by Luthans and Farner's (2002) study, which used the Kirkpatrick (1994) training evaluation framework integrated with multi-rater feedback on 409 expatriate workers from 49 multinational companies to examine whether the transfer of learning to the job was well received. The same training evaluation framework may be applied to local managers working in multinational companies in Malaysia who are not clear about the cultures brought in by expatriates and the expectations of their foreign counterparts within the company. Therefore, multi-rater feedback, which has been described as a needs analysis process, will clarify the ratee's expectations with the people working around them (Fletcher & Bailey, 2003).
Instilling multi-rater feedback as part of the pre-training needs analysis in Malaysia can bring practical benefits for the organization by focusing on a particular behaviour or key competency that is necessary for employee development. Employees will also have the chance to audit their self-perception against that of others through this self awareness mechanism, which will result in higher work performance (Atwater et al., 1998; Bass & Yammarino, 1991; Furnham & Stringfield, 1994). The information received from the multi-rater feedback would impact on the targeted individual's self awareness and lead to the achievement of agreed developmental needs (Fletcher & Bailey, 2003). Indeed, research has confirmed that the use of multi-rater feedback is one of the best methods to promote ratees' self awareness of their strengths and skill deficiencies (Hagberg, 1996; Rosti & Shipper, 1998; Shipper & Dillard, 2000). Multi-rater feedback has been defined as an information gathering process from relevant observers and is linked to specific business needs or objectives. Therefore, multi-rater feedback refers to the practice of providing an employee with perceptions of his or her performance competencies from numerous sources (Cacioppe & Albrecht, 2000). By reviewing different perceptions of their performance competencies, ratees can confirm their strengths as well as identify their blind spots, habits, behaviours and styles, which may have
an adverse impact on others and their developmental priorities. This process helps a ratee to focus on and develop performance competencies through a well-structured development process. Waldman et al. (1998) were concerned about the lack of research examining the effectiveness of multi-rater feedback on the performance developmental cycle.
3.8 Integrating Multi-rater Feedback with Development Tool
Organizations need to look at development as a continuous process by incorporating a development model in the multi-rater feedback system. If the purpose of having multi-rater feedback is not clear and not integrated with the developmental systems, it will come across as a passing trend. This is shown by Judge and Cowell (1997), who used executive coaching as a development process after conducting multi-rater assessment. The study showed that the combination of multi-rater feedback and individual coaching as a developmental
process increased leadership development effectiveness by 60 percent. This was based on the direct report and peer post-survey feedback. Another study by Heisler (1996a) used a comprehensive combined model which integrated multi-rater feedback with the leadership
and management skills development process. This approach was applied to a sample of 304 superiors and more than 1000 subordinates. The result showed that the ratees felt an increase of ownership towards their personal and professional development. Ratees reported improved communication and interaction with their superiors, peers and subordinates
(Heisler, 1996a). Effective communication and interaction between raters and ratee will reduce possible multi-rater feedback drawbacks.
The developmental process of multi-rater feedback involves a great deal of cognitive complexity and acknowledgement of the validity and legitimacy of the feedback. It also requires balancing multiple or conflicting perspectives and balancing a sense of self with the
larger context and role requirements. There should be some mechanism to address the discrepancy between the ratee's and rater's feedback in order to make it into a coherent developmental tool. The identified discrepancies can be used to assist ratees in developing
their personal action plan for development. Research is needed to clarify and validate the most effective design concept for developing the ratee after multi-rater feedback.
3.9 Multi-rater Feedback: Process Consultation as a Development Tool
Process consultation is an ideal support tool for development (Schein, 1997). The process consultation session should be conducted after multi-rater feedback so that it turns out to be a very positive experience, regardless of discrepancies in the results. Process consultation is an ongoing development system approach in which a skilled third party (the process consultant) works with ratees and helps them learn about their competency gap from the multi-rater feedback process. The process consultant should emphasize the ratee's strengths and improve on what the ratee does best, not what he or she does worst. Nevertheless, if different world-views arise between the process consultant and the ratee (client), the process consultant may use non-directive techniques in order to help the client recognize and accept responsibility for the deficiency (Hall, Otazo & Hollenbeck, 1999; Judge & Cowell, 1997; Thach & Heinselman, 1999).
Process consultation is based on the idea that ownership of the issues of concern remains with the ratee, who has actively participated in defining the key issues resulting from multi-rater
feedback and formulating a solution that is culturally appropriate (Schein, 1987). The role of the consultant revolves around facilitation and engaging in a helpful relationship with the
ratee, rather than simply being a provider of expertise. The process consultant's role is more nondirective and questioning as he or she gets the groups to solve their own problems
(French & Bell, 1999). This approach increases the likelihood of confronting the most pressing issues and helps the ratee benefit from the problem-solving skills needed for ongoing organizational change.
Schein (1987) commented that process consultation is not one single thing the process consultant does, but a set of paramount goals that the process consultant helps the ratees (clients) achieve in order to change and resolve key issues of concern through different interventions. Although information on the stages of change (Lovelady, 1989) and the focus of intervention (Fagenson & Burke, 1990) is important, it reveals too little about the specific activities that process consultants engage in, and the skills they need to accomplish them successfully. Schein (1987) concluded that process consultants make interventions in the following order: agenda setting, feedback of observations or other data, counseling and coaching, and structural suggestions if any. During the process consultation, ratees (clients) who wish to change their traditional practices and behaviors need to be given the opportunity to reflect on a wide range of meaningful feedback. Without reflection, any attempt to change the ratee's behaviour or performance is mere lip service. Process consultants will also be given the opportunity to reflect on their feelings, thoughts and perceptions regarding the ratee's development. Through the reflection process, process consultants will be able to evaluate the degree of reaction and learning of the ratee. This is supported by Kolb's (1984) learning theory, which states that an individual will learn effectively if he or she is able to reflect on the feedback received. It is important to take a closer look at a process consultant's actual intervention role, which involves intertwining events, issues, thoughts, emotions and human interactions. Schein (1987) and Weisbord (1988) showed appreciation for the complex role and behavioural
repertoire required by the process consultant. Most research does not distinguish between the different settings and contexts for consultancy practice (Chapman, 1998). The question arises as to whether process consultants engage in different activities in their work within the organization. If so, what particular skills are required for them to successfully develop a ratee (client)? This question is of interest to many people, including the process consultants themselves. Chapman (1998) asserted that a successful facilitation process requires building
emotional ties between the process consultant and client through good communication and
interpersonal skills. According to Kirkpatrick (1959), adults must be motivated to learn. Hence, through effective communication and interpersonal interactions, the development of a psychological contract and emotional ties between both parties will motivate them to participate in the development plan (Wolfe & Kolb, 1984).
3.10 Micro Perspective of Conversation Theory in Process Consultation
Pask's (1975) work in developing a human learning system through conversation theory may be used to enhance facilitation between the process consultant and client. Conversation theory is a framework for intervention analysis called the conversations model developed by
Ledington in 1989. The elements of the framework are individuals, groups or organizations
that formed. The framework is used to manage an intervention and the intervener is free to construct a social group or community. Conversation is a means of knowledge acquisition and is a process in gaining self-understanding and mutual understanding. It is also a way to achieve predetermined objectives by using specific strategies (Navarro, 2001). The specific strategies could be used in a fair manner by trying to genuinely convince the other participants in a good conversation session.
The conversation model would be able to guide process consultants on how to go through various intervention strategies so that the client is stimulated to tell his or her story with minimal disruption of either the process or content. This can be done if every learning conversation is followed by reflections by both the process consultant and client. Pask's (1976a) conversation theory mentioned that reflection would bring about a desired emergent behaviour which shows what the participants have learned and achieved, how the participants have interacted interpersonally, and what the participants need to learn in the future.
Pask's (1976a) conversation theory contemplated the phenomenon of human learning as the result of an emergent process of conversation such as linguistic interaction based on
conscious, conceptual resonance between several P-individuals. These P-individuals can be distinct points of view within a biological individual, different biological individuals or even specific groups of them. A P-individual is an effective participant in a conversation, which connects with many P-individuals. He suggested the existence of a close relationship between these two aspects: P-individuals as perspectives and P-individuals as participants.
According to Pask's (1976a) conversation theory, the process of communication can also be considered as a P-individual: a strict conversation is a prototypical P-individual. There are three different conceptual aspects coexisting in the idea of a P-individual: the concept of a cognitive perspective, the concept of a participant (in a conversation) and that of a whole
conversation. The conversation is a P-individual, and so are the participants who converse with each other (Pask, 1961).
Hence, good conversation is an important concept derived from Pask's conversation theory that forms the basis for effective process consultation by encouraging the process consultant and his or her client to reproduce new behaviour through mutual information transfer and a network of
concepts. Pask (1975) used letter notation to explain his conversation process: A (the process consultant) is conscious with B (the client), and both commit themselves to some dependency or relationship T (the agreed course of action). The commitment of A and B to T is sought because this supposedly leads to desired
outcomes. In an analogous manner, the performance of a client in a conversation will potentially involve the whole personality and not just the epistemic resource. Pask (1975) did not consider information and conversation as a pre-selection of interaction but as a consequence of the emergence of new realities when a given system interacts with other
systems. This emergence is due to the synchronization effect of the two systems.
The creation of the learning context requires self awareness as well as a social context for intentional interaction (Black & Mendenhall, 1990). The learning context can be facilitated by developing good conversation between the process consultant and his or her client. Good conversation creates a form of conversation between the process consultant and client where norms of discourse are developed consensually, values and assumptions can be surfaced and
tested, and all voices can be heard (Schuurman & Veermans, 2001). Through good conversation the process consultant can enhance transfer learning and proceed with the development process for his or her client.
Good conversation creates a cycle of effective transaction between the process consultant and
client who come into conversation and learn from each other. Research on stereotyping has found an association between the level of self-acceptance a client feels and the tendency to
stereotype or accept others (Adorno, Frenkel-Brunswik, Levinson & Stanford, 1950; Rubin, 1967). Therefore, we expect that as clients increasingly accept themselves, they are more able to let go of their prejudices and stereotypes of the process consultant. When clients are fully free to speak, and feel they are genuinely being heard, the affirmation they experience
enhances self-acceptance. This enables them to listen more completely and allows for the synergistic cycle of being heard and experiencing increased self-acceptance (Rogers, 1970). Through good conversation, the possibility of stereotyping will be minimized, allowing the process consultant to play his or her role more effectively.
3.11 An Integrated Approach for Post Multi-rater Feedback Development
Post multi-rater feedback development starts with a contact client, also known as the ratee, with
whom the process consultant meets concerning his or her performance deficiencies. Whether or not that client admits to owning the performance deficiencies that are to be worked on during development, the process consultant would not want to be prematurely
perceived as an expert. The process consultant would want the client to feel helped after a
few meetings. The client should feel that every conversation is helpful especially during early interactions.
According to Schein (1997), the process consultant and client have something to learn from each other during the development process. He came up with eight general principles to
improve the flexibility of process consultation. They are: always be helpful, always deal with reality, access your ignorance, everything you do is an intervention, it is the client who owns
the problem, go with the flow, be prepared for surprises and learn from them, share the
problem. These eight general principles govern the process consultant's roles and relationship with the client. Chapman (1998) said that a process consultant should adopt flexible consulting roles. Some clients may need a mentor and adviser on general management matters as much as they require a facilitator and project manager. He further commented that good process consultants help to identify the real issues and challenges facing the organization as well as discuss a tailor-made process for constructive change.
Although Schein's eight general principles were intended to enhance the flexibility of the process consultant's role, they do not mention how an effective dialogue session could be established between the process consultant and his or her client. Vygotsky (1978) mentioned that a psychological contract could be established through effective dialogue and that it will help the process consultant and client reach higher levels of understanding. The establishment of a psychological contract would be an opportunity for the process consultant and client to learn important new things about a situation when they explore it together. According to Schein (1997), the client owns the problem and has to live with the
consequences of the problem and the solution. Therefore the consultant must not take the problem away from the client, because the client is the best person to understand and appreciate what the next best steps would be. The client's involvement depends on his or her willingness to openly discuss the issues being faced and on the trust placed in the consultant. Sometimes the client hides the real problem because he or she is testing the consultant to determine whether the relationship is characterized by sufficient trust to reveal what may be
very intimate and personal information. Trust building therefore requires greater 'good conversation' between both parties to explore their commitment and intention. Learning about good conversations and adjusting our responses to different individuals, groups and issues appropriately, can have a dramatic impact on outcomes for individuals,
teams and whole organizations. Any significant human learning is not just cognitive information-processing but also moral and aesthetic co-construction of parts of our life-world
(Boyd, 2001). The process consultant and client may have conflicting views and feelings if both parties hold strongly to their beliefs or worldview. The resolution to this conflicting belief or worldview demands that new realities be generated through synchronisation of
perceptual differences (Navarro, 2001). Reflections provide an avenue for both parties to understand each other's views and learn from each other's differences. The process consultant may use reflections to help his or her client to focus on one behavioral change they
would like to make as a result of their experience: "What do you want to work on, and are you willing to make a commitment to change?" This gives their reflection an action component which is often beneficial.
Learning occurs in two forms: single-loop and double-loop (Argyris, 1994). Single-loop learning asks a one-dimensional question to elicit a one-dimensional answer. Double-loop
learning takes an additional step, or more often than not, several additional steps. It turns the
question back on the questioner. It asks what the media call follow-ups. A double-loop process might also ask why the current setting was chosen in the first place. Because double-loop learning depends on questioning one's own assumptions and behaviour, this apparently benevolent strategy is actually anti-learning (Argyris, 1994). Admittedly, being considerate
and positive can contribute to the solution of single-loop problems, for example cutting costs. But it will never help people figure out why they lived with problems for years, why they covered up, why they were so good at pointing to the responsibility of others and so slow to focus on their own. The notion of good conversation expands the phenomenon of ideal
speech to include ideal listening and promote interaction. For ideal listening to occur, the individuals must feel secure and accepting enough of themselves to be open to new possibilities. Enhanced self-acceptance can contribute to the possibility of valuing the
diversity of others. Thus, the responsibility of the process consultant includes nurturing clients' self-acceptance and inspiring a sense of personal power among people around them. This framework promotes a deep commitment to empathetic interaction between process consultant and client to construct a shared reality as a common setting for development pathway (Navarro, 2001).
In good conversation theory, learning is approached from an inside-out perspective based on personal experience (Hunt, 1987). Under the process consultation practice, individual
personal experience is required to be reflected on. Reflection pinpoints and dramatizes what individuals have learned and achieved, how individuals have interacted interpersonally, and
what individuals need to learn in the future. Through valuing each person's individual experience, the uniqueness of every person is assumed and considered a resource. With the more typical outside-in approach to learning, the dissimilarity of each person is considered a
problem to be solved (Hunt, 1987). The point of departure for learning in 'good conversation' is not only the presumption that each individual is different, but that diversity is an inherent resource. In this consensual and self-reflective process, as more and more diverse reflections become fully heard within the group, the values and perspectives of each member influence others, and the process of mutual socialization evolves.
The process consultant must become the reflective practitioner cum learner besides helping the individual benefit from the double loop processes (Argyris & Schon, 1978). They must diagnose the issues and take action to improve the practice by involving themselves directly
and fully, preparing to investigate such experiences from as many different perspectives as
possible and patterning their observations into meanings through reflection. Documentation of the agreed proposed course of action was not mentioned by Schein (1997). In the absence of documented agreement between the process consultant and client, commitment in fulfilling
the course of action is unlikely to happen. Schuurman and Veermans (2001) derived two classes of consequences from conversation theory: the weak consequence and the strong consequence. The weak consequence stresses observation, the strong consequence stresses
control. The weak consequence concentrates on record keeping of the experimental subject, closed conversation and topic of exchanges. The strong consequence notes the records of agreements derived from the outcome of negotiations between two parties. Both these consequences will bring about total commitment between the process consultant and the client. The documentation process that records the reflections made between the two parties during the learning process creates an obligation to be fulfilled. Besides this, the record keeping will also provide both parties with a view of their progression towards their development goals.
Pask (1961, 1965, 1975a, 1975b, 1976a, 1976b) introduced both the object-language and meta-language to explain the required exchanges during the learning process. He stressed the need for researchers to distinguish object-language and meta-language with any of the learning interfaces (Schuurman & Veermans, 2001). The object-language comprises a system of expression (i.e. sentences during conversation) belonging to the object of study. These sentences should be internal expressions of the object, that is, they should reflect properties of the object, and these expressions should conform to well-defined rules (in this case, the developmental pathway undertaken by the process consultant and the client). Within the meta-language, a new object-language can be proposed. If the new object-language fits the purpose (i.e. learning objective) better than the original object-language then it can be
replaced. However, this is only possible if the process consultant and/or the client knows what to replace. Keeping apart object-language and meta-language allows revisions to be
tracked. This is a crucial prerequisite for systematic inquiry (De Zeeuw, 1995). The record-keeping process is key to making this a success (Schuurman &
Veermans, 2001). Pask (1965) considered the interaction between object-language and meta-language pivotal in learning and human performance in general. Pask observed object-language and meta-language interactions so as to study how conversations are punctuated by
agreements (including agreements to disagree). According to Pask, the researcher needs to
keep proper record of the interaction between object-language and meta-language and to
mark all agreements. The agreements serve as controlled conversation of true hard data. He argued that psychological experiments start with basic meta-language interactions: the experimenter and experimental subject have to agree on their respective roles (Schuurman & Veermans, 2001). The meta-language interactions that Pask strongly advocated should serve as the whole basis of process consultation sessions where emergent behaviour for learning is likely to happen.
However, there are a few drawbacks to the conversation theory where the theory itself actually eliminates some basic traits of the human mind, human interaction and ignores other
aspects of human reality (Navarro, 2001). Another factor that prevents a straightforward application of conversation theory to the study of social realities is its strong dependence on Pask's theory rather than on the study of real social life, specifically of human interactions. To address the weakness of conversation theory in a real social environment and natural conversation situation, the strength of the theory depends on its ability to bring specific aspects of the world into sharp
focus. Massaro and Cowan (1993) suggested that in building a community of good conversation, people are required to put themselves in the shoes of others and to empathise if they are to arrive at consensually developed norms. Through empathy for others, they can begin to understand, bring life, feelings and even accommodate
for the consequences of each other's norms distinct from themselves. It has to be an attempt to truly see the world as the other sees it, understand the real life situation of the other and adopt other's perspectives and values (Massaro & Cowan, 1993). One assumption of good conversation is its essential dynamic quality and process, resisting a tendency to control for predictability. This form of conversation implicitly and explicitly sets the conditions for valuing individuals or organization through the integration of affective and cognitive modes
of experience and learning (Argyris, 1994). In this initial phase of the process, the process
consultant's role is particularly important. As part of the norm creation process, the process consultant needs to be continually modeling a respectful and inclusive approach throughout to foster a safe, receptive space for the conversation to unfold (Argyris, 1994).
3.12 Conclusion
Although research on multi-rater feedback assessment indicates that different rater sources provide different information, the multi-rater feedback technique is still useful at the preliminary stage to provide information or create self awareness of individual strengths, weaknesses or
blind spots. One underlying rationale for such systems is their potential impact on the target individual's self awareness, since increasing self awareness is thought to enhance performance. This paper provides a concept of how multi-rater feedback can lead to a successful developmental process through process consultation in Malaysia. Through the years, a training evaluation culture has not been properly practised in Malaysia; hence it is recommended that a proper approach be used to enable organizations to see the benefits of holding a pre-training needs analysis and adopting an effective development approach, so that a comprehensive training and effective development approach could be instilled in the
Malaysian environment. Hence, the process consultant holds the key to effective development process using multi-rater assessment as a pre-training gap analysis.
Process consultation provides the opportunity to check and balance the degree of learning and development activities through reflection, problem solving capabilities and application of theories throughout the developmental process. The flexibility of process consultation should be enhanced by integrating conversation theory using good conversation and documentation
of pre-agreed commitment of action known as reflection. This will promote ideal communication and interaction between the process consultant and client, which will
eventually build trust for open learning and development. Good conversation is an important intervention tool that has potential for applying effective human communication, decision making, and policy making in the development process through single-loop and double-loop learning.
Multi-rater feedback approach also gathers information from various sources, in order to evaluate the level of transfer learning of an individual at the end of the development stage of
process consultation. It is recommended that an integrated and comprehensive model be adopted, comprising a preliminary multi-rater feedback assessment followed by a developmental process using process consultation and good conversation, in an effort to facilitate transfer learning to the organization.
3.13 References for Paper Three
Adorno, T.W., Frenkel-Brunswik, E., Levinson, D.J. & Stanford, R.N. 1950, The Authoritarian Personality, Harper and Brothers, New York.
Alimo-Metcalf, B. 1998, 'Editorial 360-degree assessment and feedback', Professional Forum, vol. 6, no. 1, pp. 16-18.
Antonioni, D. 1994, 'Designing an effective 360-degree appraisal feedback system', Personnel Psychology, vol. 47, pp. 349-356.
Argyris, C. & Schon, D. 1978, Organizational Learning: A Theory of Action Perspective, Addison-Wesley, Reading, MA.
Argyris, C. 1994, 'Good conversation that blocks learning', Managerial Excellence, Harvard Business Review, vol. 15, pp. 303-317.
Ashford, S.J. 1984, 'Self-assessments in organizations: a literature review and integrative model', Research in Organizational Behaviour, vol. 11, pp. 133-174.
Ashford, S.J. 1993, 'The feedback environment: an exploratory study of cue use', Journal of Organizational Behaviour, vol. 14, pp. 201-224.
Atwater, L.E. & Yammarino, F.J. 1993, 'Personal attributes as predictors of superiors' and subordinates' perceptions of military academy leadership', Human Relations, vol. 46, pp. 645-668.
Atwater, L.E., Ostroff, C.M., Yammarino, F.J. & Fleenor, J.W. 1998, 'Self-other agreement: does it really matter?', Personnel Psychology, vol. 51, no. 3, pp. 577-598.
Atwater, L.E., Roush, P. & Fischthal, A. 1995, 'The influence of upward feedback on self- and follower ratings of leadership', Personnel Psychology, vol. 48, pp. 35-49.
Bass, B.M. & Yammarino, F.J. 1991, 'Congruence of self and others' leadership ratings of naval officers for understanding successful performance', Applied Psychology: An International Review, vol. 40, no. 4, pp. 437-454.
Bennis, W.G., Benne, K.D. & Chin, R. 1969, The Planning of Change, 2nd edn, Holt, Rinehart and Winston, New York, NY.
Bernardin, H.J., Dahmus, S.A. & Redmon, G. 1993, 'Attitudes of first line supervisors towards subordinate appraisals', Human Resource Management, vol. 32, pp. 315-324.
Black, J.S. & Mendenhall, M. 1990, 'Cross-cultural training effectiveness: a review and a theoretical framework for future research', Academy of Management Review, vol. 15, no. pp. 113-136.
Borman, W.C. 1997, '360-degree ratings: an analysis of assumptions and research agenda for evaluating their validity', Human Resource Management Review, vol. 7, pp. 315-324.
Boyd, G. 2001, 'Reflections on the conversation theory of Gordon Pask', Kybernetes, vol. 30, no. 5/6, pp. 560-570.
Bracken, D.W., Dalton, M.A., Jako, R., McCauley, C.D. & Pollman, V.A. 1997, Should 360-degree Feedback Be Used Only for Developmental Purposes?, Center for Creative Leadership, Greensboro, NC.
Brutus, S., London, M. & Martineau, J. 1999, 'The impact of 360-degree feedback on planning for career development', Journal of Management Development, vol. 18, pp. 676-693.
Cacioppe, R. 1998, 'An integrated model and approach for the design of effective leadership development programs', Leadership and Organization Development Journal, vol. 9, no. 1, pp. 44-53.
Cacioppe, R. & Albrecht, S. 2000, 'Using 360-degree feedback and the integral model to develop leadership and management skills', Leadership and Organization Development Journal, vol. 21, no. 8, pp. 390-404.
Cacioppe, R. & Albrecht, S. 2000, 'Differing perceptions of managers: behaviours using the holon leadership-management model', in Parry, K. edn., forthcoming.
Chapman, J. 1998, 'Do process consultants need different skills when working with nonprofits?', Leadership and Organization Development Journal, vol. 19, no. 4, pp. 211-215.
Church, A.H. 1997, 'Managerial self awareness in high performing individuals in organizations', Journal of Applied Psychology, vol. 82, pp. 281-292.
Conway, J.M. & Huffcutt, A.I. 1997, 'Psychometric properties of multi-source performance ratings: a meta-analysis of subordinate, supervisor, peer and self-ratings', Human Performance, vol. 10, pp. 331-360.
De Zeeuw, G. 1995, 'Values, science and the quest for demarcation', System Research, pp. 15-24.
Edwards, J.R. 1993, 'Problems with the use of profile similarity indices in the study of congruence in organizational research', Personnel Psychology, vol. 46, pp. 641-65.
Edwards, J.R. 1994, 'The study of congruence in organizational behaviour research: critique and proposed alternative', Organizational Behaviour and Human Decision Processes, vol. 58, pp. 51-100.
Facteau, J.D., Facteau, C.I., McGonigle, T.P. & Fredholm, R.I. 1997, Characteristics of feedback and managers' reactions in multi-source appraisal systems, paper presented at the 12th annual conference of the Society of Industrial and Organizational Psychology, St. Louis, MO.
Fagenson, E. & Burke, W. 1990, 'Organization development practitioners' activities and interventions in organizations during the 1980s', Journal of Applied Behavioural Science, vol. 26, no. 3, pp. 285-297.
Fletcher, C. 1997, 'Self awareness: a neglected attribute in selection and assessment?', International Journal of Selection and Assessment, vol. 5, no. 3, pp. 183-187.
Fletcher, C. & Baldry, C. 2000, 'A study of individual differences and self awareness in the context of multi-source feedback', Journal of Occupational and Organizational Psychology, vol. 73, pp. 303-319.
Fletcher, C. & Bailey, C. 2003, 'Assessing self awareness: some issues and methods', Journal of Managerial Psychology, vol. 18, no. 5, pp. 395-404.
French & Bell 1999, Organization Development: Behavioural Science Interventions for Organization Improvement, 6th edn, Prentice-Hall Publisher, New Jersey.
Furnham, A. & Stringfield, P. 1994, 'Correlates of self and subordinate ratings of managerial practices as a correlate of supervisor evaluation', Journal of Occupational and Organizational Psychology, vol. 67, no. 1, pp. 57-67.
Garavan, T.N., Morley, M. & Flynn, M. 1997, '360-degree feedback: its role in employee development', Journal of Management Development, vol. 16, no. 2, pp. 134-147.
Ghorpade, J. 2000, 'Managing five paradoxes of 360-degree feedback', Academy of Management Executive, vol. 14, pp. 140-50.
Greguras, G.J., Ford, J.M. & Brutus, S. 2003, 'Manager's attention to multi-source feedback', Journal of Management Development, vol. 22, no. 4, pp. 345-361.
Hagberg, R. 1996, 'Identify and help executives in trouble', Human Resource Magazine, vol. 41, no. 8, pp. 88-92.
Hall, D., Otazo, K. & Hollenbeck, G. 1999, 'Behind closed doors: what really happens in executive coaching', Organizational Dynamics, vol. 27, no. 3, pp. 39-58.
Harris, M.M. & Schaubroeck, J. 1988, 'A meta-analysis of self-supervisor, self-peer and peer-supervisor ratings', Personnel Psychology, vol. 41, pp. 43-62.
Hazucha, J. Fr., Hezlett, S.A. & Schneider, R.J. 1993, 'The impact of 360-degree feedback on management skills development', Human Resource Management, vol. 32, pp. 325-351.
Heisler, W.J. 1996a, '360-degree feedback: an integrated perspective', Career Development International, vol. 1, no. 3, pp. 20-23.
Hoffman, R. 1995, 'Ten reasons you should be using 360-degree feedback', Human Resource Management Magazine, vol. 40, no. 4, pp. 82-86.
Honey, P. & Mumford, A. 1982, Manual of Learning Styles, Honey Publication, Maidenhead.
Hunt, D.E. 1987, Beginning With Ourselves, Brookline Books, Cambridge, MA.
Judge, W. & Cowell, J. 1997, 'The brave new world of executive coaching', Business Horizons, vol. 40, no. 4, pp. 71.
Junaidah, H. 1999, Training Management: A Malaysian Perspective, Prentice-Hall Publisher, Pearson Education, Malaysia.
June, M.L.P. & Rodzhan, O. 2000, 'Management training and development practices of Malaysian organizations', Journal of the Malaysian Institute of Management, Malaysian Management Review, vol. 35, no. 2, pp. 77-85.
Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction', Journal of American Society for Training and Developing, vol. 13, pp. 3-9.
Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning', Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.
Kirkpatrick, D.L. 1994, Evaluating Training Programs: The Four Levels, Berrett-Koehler Publishers, San Francisco.
Kluger, A.N. & Denisi, A.D. 1996, 'The effects of feedback interventions on performance: historical review, a meta-analysis and a preliminary feedback intervention theory', Psychological Bulletin, vol. 119, pp. 254-284.
Kluger, A.N. & Denisi, A.D. 2000, 'Feedback effectiveness: can 360-degree appraisals be improved?', Academy of Management Executive, vol. 14, pp. 129-139.
Kolb, D. 1984, Experiential Learning, Prentice-Hall Publisher, New Jersey.
Leslie, J., Gryskiewicz, N. & Dalton, M. 1998, 'Understanding cultural influences on the 360-degree feedback process', in Maximizing the Value of 360-degree Feedback: A Process for Successful Individual and Organization Development, eds Tornow, W. & London, M., Jossey-Bass, San Francisco, pp. 196-216.
London, M. & Beatty, R.W. 1993, '360-degree-feedback as a competitive advantage', Human Resource Management, vol. 2-3, pp. 353-372.
London, M. & Smither, J.W. 1995, 'Can multi-source feedback change perceptions of goal accomplishment, self evaluations and performance related outcomes? Theory-based applications and directions for research', Personnel Psychology, vol. 48, pp. 803-839.
London, M., Wholers, A.J. & Gallagher, P. 1990, '360-degree feedback surveys: a source of feedback to guide management development', Journal of Management Development, vol. 9, pp. 17-31.
Lovelady, L. 1989, 'The process of organization development: a reformulated model of the change process, Part 1', Management Decision, vol. 27, no. 4, pp. 143-154.
Luthans, K.W. & Farner, S. 2002, 'Expatriate development: the use of 360-degree feedback', Journal of Management Development, vol. 21, no. 10, pp. 780-793.
Massaro, D.W. & Cowan, N. 1993, 'Information processing models: microscopes of the mind', Annual Review of Psychology, vol. 44, pp. 383-425.
McCauley, C.D. & Moxley, R.S. Jr. 1996, Development 360: How Feedback Can Make Managers More Effective, Jossey-Bass Publisher, San Francisco.
McEvoy, G.M. 1990, 'Public sector managers' reactions to appraisals by subordinates', Public Personnel Management, vol. 19, pp. 201-212.
McEvoy, G.M. & Buller, P.F. 1987, 'User acceptance of peer appraisals in an industrial setting', Personnel Psychology, vol. 40, pp. 785-797.
Mirza, S.S. & Juhary, H.A. 1995, Managerial Training and Development in Malaysia, Malaysian Institute of Management, Malaysia.
Mount, M.K. 1984, 'Psychometric properties of subordinate ratings of managerial performance', Personnel Psychology, vol. 37, pp. 687-701.
Mount, M.K., Judge, T.A., Scullen, S.E., Sytsma, M.R. & Hezlett, S.A. 1998, 'Trait, rater, and level effects in 360-degree performance ratings', Personnel Psychology, vol. 51, pp. 557-576.
Moses, J., Hollenbeck, G.P. & Sorcer, M. 1993, 'Other people's expectations', Human Resource Management, vol. 32, Summer Fall.
Nasby, W. 1989, 'Private self-consciousness, self awareness and the reliability of self-reports', Journal of Personality and Social Psychology, vol. 56, no. 6, pp. 950-957.
Navarro, P. 2001, 'The limits of social conversation', Kybernetes, MCB University Press, vol. 30, no. 5/6, pp. 771-788.
Nowack, K. 1993, '360-degree feedback: the whole story', Training and Development Journal, vol. 47, no. 1, pp. 69-73.
O'Reilly, B. 1994, '360-degree feedback can change your life', Fortune Magazine, vol. 130, no. 8, pp. 93-97.
Pask, G. 1961, An Approach to Cybernetics, Hutchinson, London.
Pask, G. 1965, Inleiding tot de Cybernetica, Het Spectrum, Utrecht.
Pask, G. 1975a, The Cybernetics of Human Learning and Performance, Hutchinson, London.
Pask, G. 1975b, Conversation, Cognition and Learning: A Cybernetic Theory and Methodology, Elsevier, Amsterdam.
Pask, G. 1976a, Conversation Theory: Applications in Education and Epistemology, Elsevier, Amsterdam and New York.
Pask, G. 1976b, Revisions in the foundations of cybernetics and general systems theory as a result of research in education, epistemology and innovation (mostly in man-machine systems), proceedings of the 8th International Congress on Cybernetics, Namur, vol. 6, no. 11, September, pp. 83-109.
Payne, T. 1988, 'Editorial 360-degree assessment and feedback', International Journal of Selection and Assessment, vol. 6, no. 1.
Reilly, R.R., Smither, J.W. & Vasilopoulos, N.J. 1996, 'A longitudinal study of upward feedback', Personnel Psychology, vol. 49, pp. 599-612.
Rogers, C.R. 1970, Encounter Groups, Harper and Row, New York.
Romano, C. 1994, 'Conquering the fear of feedback', Human Resource Focus, vol. 71, no. 3.
Rosti, R.T., Jr & Shipper, F. 1998, 'A study of the impact of training in a management development program based on 360-degree feedback', Journal of Managerial Psychology, vol. 13, pp. 77-89.
Rubin, I. 1967, 'The reduction of prejudice through laboratory training', Journal of Applied Behavioural Science, vol. 3, no. 1.
Saiyadain, M.S. & Juhary, A. 1995, 'Managerial training and development in Malaysia', Journal of the Malaysia Institute of Management, Management Review, vol. 5, pp. 23-36.
Schein, E.H. 1987, Process Consultation, vol. 2, Addison-Wesley, MA.
Schein, E.H. 1997, 'The concept of "client" from a process consultation perspective', Journal of Organizational Change Management, vol. 10, no. 3, pp. 202-216.
Schuurman, J.G. & Veermans, K. 2001, 'Conversation and research', Kybernetes, vol. 30, no. 7/8, pp. 881-890.
Shipper, F. & Dillard, J.E. Jr. 2000, 'A study of impending derailment and recovery of middle managers across career stages', Human Resource Management, vol. 39, no. 4, pp. 331-345.
Shipper, F. & John, J. 1992, 'Employees' feedback: its use for management development and the results in a government organization', in Proceedings of Symposium on Productivity and Quality Improvement with a Focus on Government, ed. Fargher, J.S., Industrial Engineering and Management Press, Washington, DC.
Thach, E.C. 2002, 'The impact of executive coaching and 360-degree feedback on leadership effectiveness', Leadership and Organization Development Journal, vol. 23, no. 4, pp. 205-214.
Thach, L. & Heinselman, T. 1999, 'Executive coaching defined', Training and Development Journal, vol. 53, pp. 34-39.
Tornow, W.W. 1993, 'Perceptions or reality: is multi-perspective measurement a means or an end?', Human Resource Management, vol. 32, no. 2/3, pp. 221-230.
Tornow, W.W. & London, M. 1998, Maximizing the Value of 360-Degree Feedback: A Process for Successful Individual and Organizational Development, Jossey-Bass Publisher, San Francisco.
Van Velsor, E., Taylor, S. & Leslie, J.B. 1993, 'An examination of the relationship among self-perception accuracy, self awareness, gender and leaders' effectiveness', Human Resource Management, vol. 32, summer/fall, no. 2/3, pp. 249-263.
Vygotsky, L.S. 1978, Mind in Society, Harvard University Press, Cambridge.
Waldman, D.A., Atwater, L.E. & Antonioni, D. 1998, 'Has 360-degree feedback gone amok?', Academy of Management Executive, vol. 12, no. 2, pp. 86-94.
Walker, A.G. & Smither, J.W. 1999, 'A five-year study of upward feedback: what managers do with their results matters', Personnel Psychology, vol. 52, pp. 393-423.
Wan Aziz, W.A. 1994, 'Transnational corporations and human resource development', Personnel Review, vol. 23, no. 5, pp. 50-69.
Warr, P. & Bourne, A. 1999, 'Factors influencing two types of congruence and similarity as related to interpersonal evaluation in manager-subordinate dyads', Academy of Management Journal, vol. 23, pp. 320-30.
Weisbord, M. 1988, 'Towards a new practice theory of OD: notes on sharpshooting and moviemaking', in Research in Organizational Change and Development, eds Pasmore, W. & Woodman, R., JAI Press, Greenwich, CT, vol. 2, pp. 59-96.
Wilson, M.S., Hoppe, M.H. & Sayles, R.S. 1996, Managing Across Cultures: A Learning Framework, Centre for Creative Leadership, Greensboro, NC.
Wolfe, D.M. & Kolb, D.A. 1984, 'Career development, personal development and experiential learning', in Organization Psychology: Readings on Human Behaviours in Organizations, 4th edn, Prentice-Hall, NJ.
Wood, R., Allen, T., Pillenger, T. & Kahn, N. 1999, '360-degree feedback: theory, research and practice', in Human Resource Strategies: An Applied Approach, eds Travaglione, T. & Marshall, V., McGraw-Hill, Sydney.
Zakaria, I. & Rodzhan, O. 1993, Human resource development practice in the manufacturing sector in Malaysia: an empirical assessment, Paper Presented at the Seminar on Human Resource Management, Faculty of Business Management, University Kebangsaan Malaysia.
Zemke, R. & Zemke, S. 1995, 'Adult learning: what do we know for sure?', Training and Development Journal, vol. 32, pp. 31-37.