2013; 35: 1027–1041
The Multiple Mini-Interview (MMI) for student selection in health professions training – A systematic review
ALLAN PAU1, KAMALAN JEEVARATNAM2, YU SUI CHEN1, ABDOUL AZIZ FALL1, CHARMAINE KHOO1 & VISHNA DEVI NADARAJAH1
1International Medical University, Malaysia, 2Royal College of Surgeons in Ireland, Perdana University, Malaysia
Background: The Multiple Mini-Interview (MMI) has been used increasingly for selection of students to health professions programmes.
Objectives: This paper reports on the evidence base for the feasibility, acceptability, reliability and validity of the MMI.
Data sources: CINAHL and MEDLINE.
Study eligibility criteria: All studies testing the MMI on applicants to health professions training.
Study appraisal and synthesis methods: Each paper was appraised by two reviewers. Narrative summary findings on feasibility, acceptability, reliability and validity are presented.
Results: Of the 64 citations identified, 30 were selected for review. The modal MMI consisted of 10 stations, each lasting eight minutes and assessed by one interviewer. The MMI was feasible, i.e. it did not require more examiners, did not cost more, and interviews were completed over a short period of time. It was acceptable, i.e. fair, transparent, free from gender, cultural and socioeconomic bias, and did not favour applicants with previous coaching. Its reliability was reported to be moderate to high, with Cronbach’s alpha = 0.69–0.98 and G = 0.55–0.72. MMI scores did not correlate with traditional admission tools scores, were not associated with pre-entry academic qualifications, were the best predictor of OSCE performance and were statistically predictive of subsequent performance at medical council examinations.
Conclusions: The MMI is reliable, acceptable and feasible. The evidence base for its validity against future medical council exams is growing with reports from longitudinal investigations. However, further research is needed on its acceptability in different cultural contexts and its validity against future clinical behaviours.
Practice points
• The Multiple Mini-Interview (MMI) is a highly structured student selection method designed to resemble the Objective Structured Clinical Examination (OSCE).
• It usually consists of 10 stations, each lasting eight minutes and assessed by one interviewer.
• It is practically feasible, i.e. it does not require more examiners than the panel interview, does not cost more, and the interviews can be completed over a short period of time.
• It has been reported to be acceptable, i.e. fair, transparent, free from gender, cultural and socio-economic bias, and not favouring applicants with previous coaching.
• Performance at the MMI does not correlate with traditional admission tools scores and is not associated with pre-entry academic qualifications. It is the best predictor of OSCE performance and is statistically predictive of subsequent performance at medical council examinations.
Med Teach Downloaded from informahealthcare.com by International Medical Univ on 01/23/14 For personal use only.
Correspondence: Allan Pau, 126, Jalan Jalil Perkasa 19, 57000 Kuala Lumpur, Malaysia; email: [email protected]
ISSN 0142–159X print/ISSN 1466–187X online/13/121027–15 © 2013 Informa UK Ltd. DOI: 10.3109/0142159X.2013.829912
Introduction
Admissions to health professions training programmes are high-stakes decisions. The panel or board interview is commonly used to aid this decision (Edwards et al. 1990), although the evidence suggests that it has limited ability to predict academic or clinical performance in health care disciplines (Goho & Blackman 2006). Furthermore, interviewer and interviewee characteristics may represent a complex mix of factors that could have a major impact on the interview process, hence reducing the validity and reliability of student selection. For example, Dixon et al. (2002), in their review of the panel interview, commented that structure and scoring anchors affect its reliability and validity. Wilkinson et al. (2008) argued that panel interviews have little predictive value, added that the “threat” of an interview may even dissuade some potential applicants, and concluded that GPA (grade point average from the student’s pre-entry qualification) has the best predictive value for academic performance.
Structuring the interview has been reported to enhance its acceptability and reliability (Patrick et al. 2001). The Multiple Mini-Interview (MMI) is a highly structured student selection method designed to resemble the Objective Structured Clinical Examination (OSCE) (Eva et al. 2004c). In the MMI, applicants rotate every few minutes around a series of stations designed to assess the non-cognitive attributes that are currently assessed in the panel interview, although cognitive skills may also be assessed. The scenarios or questions are flexible and can be moulded to assess the attributes that best meet the requirements of the programme and school. The stations are designed in such a way that they do not require or assess specific learned knowledge, but rather evaluate a candidate’s ability to work logically through a problem and express ideas clearly (Eva et al. 2004c). Each station involves reading a scenario and then discussing one’s viewpoint with one or more interviewers, who may be faculty or lay persons, role-playing with an actor, or completing a specific task. Candidates may be scored for critical thinking ability, communication skills, and attitudes towards certain ethical or social dilemmas. The MMI, therefore, allows a wide sampling of candidates’ competencies in order to gain a more accurate picture of their overall ability.
Increasingly the MMI has been adopted as the preferred student selection method in the health professions, including medicine, dentistry, pharmacy and veterinary science, with a corresponding increase in research on its acceptability, feasibility, reliability and validity. The purpose of this review is to:
(1) identify the common features of the MMI with regard to the number of stations, number of assessors per station, and the time at each station;
(2) assess the feasibility of implementing the MMI across different contexts with regard to time and resources needed;
(3) assess acceptability of the MMI for different study populations as well as interviewers; and
(4) assess the reliability and validity of the MMI.
Methods
As far as we could gather, a review protocol did not exist. We followed the PICO structure to formulate the following search question: In populations of applicants to health professions training, is the MMI a feasible, acceptable, reliable and valid tool for selection of applicants? We searched the Medline and EBSCOHost databases using the following search strategy: (MMI OR Multiple mini interview OR Multiple mini interviews) AND (reliability OR validity OR feasibility OR acceptability OR acceptance OR predictive validity OR construct validity).
In selecting the studies, we included all studies that tested the MMI on applicants to health professions training, such as medicine, dentistry, pharmacy and veterinary science. Studies using postgraduate applicants as their samples were also included. Our outcomes of interest pertained to some measure of feasibility, which included cost, time and human resources; acceptability, which included the interviewees’ and interviewers’ perspectives; reliability; and validity. Only published papers in the English language were included.
The citations generated were reviewed by two authors independently and those that reported on the MMI were identified. Those that reported on other educational topics were excluded. Case studies were excluded, but descriptive surveys and longitudinal studies were included. Both qualitative and quantitative studies were also included. Data were extracted onto a pre-designed table (Appendix A1), and included year of publication, country of study, sample characteristics and size, MMI features, and findings on feasibility, acceptability, reliability and validity. The extracted data were discussed among members of the research team, and appropriate findings were drawn.
Results
Of the 64 citations identified and reviewed by the research team, 27 were rejected because they were not related to education or the MMI. The abstracts of the remaining 37 titles were screened by the research team, of which seven were excluded: five were not on the MMI, one was not on education, and one was a case study (Figure 1). Of the 30 papers selected for review, 24 were cross-sectional studies, three were cross-sectional with qualitative design, and three were longitudinal studies.
Characteristics of studies reviewed
Of the 30 studies reviewed, 20 were carried out on medical school applicants alone, of which ten were in Canada, five in Australia, three in the UK and two in the USA. The remaining studies were conducted on residency applicants (4/30), veterinary applicants (2/30), pharmacy students (1/30), medical and dental applicants (1/30), health science applicants (1/30), and medical and residency applicants (1/30). In total 19 of the 30 studies were carried out in Canada with six in Australia, three in the UK and two in the USA.
Features of the MMI
The number of stations used in the studies reviewed ranged from 4 to 12, with 10 studies using a 10-station MMI, 6 using 12, 5 using 8, and the remaining 9 using 4, 7, 9 or 11 stations. Fourteen of the studies used one assessor per station, 4 used 2 assessors, and the remaining 12 did not report the number of assessors per station. Most studies used faculty as assessors, while some used a combination of faculty and community practitioners (Hecker & Violato 2011) and others included students (Brownell et al. 2007). The time at each station ranged from 5 to 15 min, with a mode of 8 min. Eleven studies reported using 8-min stations, five using 7-min, three using 10-min, one using 5-min and one using 15-min stations. Two studies tested the effect of different lengths of time at stations, one comparing eight and six minutes (Cameron & Mackeigan 2012) and the other eight and five minutes (Dodson et al. 2009). Seven did not report the time at each station. The modal MMI therefore has 10 stations, each lasting eight minutes and rated by one assessor.
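The difference between the most common station length and the mean station length can be checked directly from the counts reported above. The following is a minimal Python sketch, not part of the original review; the function names are hypothetical, and only the 21 studies that reported a single fixed station length are included (the two studies comparing two lengths and the seven that did not report a length are left out).

```python
from collections import Counter

# Station length in minutes -> number of studies, as tallied in the text above.
# Assumption: only the 21 studies reporting one fixed length are counted.
duration_counts = Counter({8: 11, 7: 5, 10: 3, 5: 1, 15: 1})


def modal_value(counts):
    """Return the value reported by the largest number of studies."""
    return counts.most_common(1)[0][0]


def weighted_mean(counts):
    """Return the mean value, weighting each value by its study count."""
    total_studies = sum(counts.values())
    return sum(value * n for value, n in counts.items()) / total_studies


modal_duration = modal_value(duration_counts)
mean_duration = weighted_mean(duration_counts)
print(f"mode = {modal_duration} min, mean = {mean_duration:.2f} min")
```

In this tally the mode and the mean are close but not identical, which is why the summary figures for the typical MMI are best read as modal values.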
Records identified through database searching (n = 64)
↓
Records after duplicates removed (n = 64)
↓
Records screened (n = 64) → 27 records excluded: 24 not on education, 3 not on MMI
↓
Full-text articles assessed for eligibility (n = 37) → 7 full-text articles excluded: 1 not on education, 5 not on MMI, 1 was a case study
↓
Studies included in qualitative synthesis (n = 30)
Figure 1. Flow chart outlining selection of papers for review.
Feasibility
Three studies reported on the feasibility of the MMI. Two reported that it did not require more examiners than the panel interview, did not cost more, and that the interviews could be completed over a short period of time (Brownell et al. 2007; Finlayson & Townson 2011). The third reported that it provided a positive experience for interviewers as well as applicants (Eva et al. 2004c).
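The examiner comparison can be made concrete with a small worked example. The Python sketch below is illustrative only: the function names are hypothetical, it uses the scenario considered in the Discussion of this review (a 10-station circuit of eight minutes per station with one assessor each, against 40-minute panel interviews with two assessors), and it assumes no changeover time between stations or interviews.

```python
import math


def mmi_staffing(stations, minutes_per_station, assessors_per_station=1):
    """Assessor time for one MMI circuit.

    Every station is occupied by a different applicant at any moment, so one
    full circuit interviews as many applicants as there are stations.
    """
    circuit_minutes = stations * minutes_per_station
    assessors = stations * assessors_per_station
    return {
        "applicants": stations,
        "assessors": assessors,
        "session_minutes": circuit_minutes,
        "assessor_minutes": assessors * circuit_minutes,
    }


def panel_staffing(applicants, minutes_per_interview, panel_size, session_minutes):
    """Assessor time for parallel panel interviews within a fixed session."""
    interviews_per_panel = session_minutes // minutes_per_interview
    panels = math.ceil(applicants / interviews_per_panel)
    assessors = panels * panel_size
    return {
        "applicants": applicants,
        "panels": panels,
        "assessors": assessors,
        "assessor_minutes": assessors * session_minutes,
    }


mmi = mmi_staffing(stations=10, minutes_per_station=8)
panel = panel_staffing(
    applicants=mmi["applicants"],
    minutes_per_interview=40,
    panel_size=2,
    session_minutes=mmi["session_minutes"],
)
print(mmi["assessor_minutes"], panel["assessor_minutes"])
```

Under these assumptions both formats need ten assessors for 80 minutes to interview ten applicants, so any difference in running cost would come from who the assessors are and from set-up costs rather than from assessor time.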
Acceptability
Of the 30 studies reviewed, 14 reported on the acceptability of the MMI. Some authors reported that the MMI was acceptable to interviewees and interviewers because it was perceived as fair (Razack et al. 2009), transparent (Uijtdehaage et al. 2011) and as providing opportunities for the interviewees to regain composure if they had problems with a previous station (Kumar et al. 2009). Positive experience for both applicants and examiners has also been reported (Eva et al. 2004c). Acceptability was also judged in terms of freedom from gender and cultural bias (Brownell et al. 2007), from socio-economic disadvantage (Uijtdehaage et al. 2011) and from any benefit of previous coaching (Griffin et al. 2008). Griffin et al. (2008) reported that previous coaching, as disclosed by applicants, had no effect on UMAT or MMI scores. Applicants who had previous MMI experience improved their subsequent performance in the same stations but not in new stations. Preference for station length differed between interviewers and interviewees, with the former judging six minutes to be “just right” and eight minutes to be “a bit long”, and the latter preferring a longer time (Cameron & Mackeigan 2012). One study reported that graduate candidates outperformed school-leavers (Dowell et al. 2012) while another reported no difference between graduate and school-leaver applicants (O’Brien et al. 2012).
Acceptability of the MMI was compared to that of the panel or standard interview by O’Brien et al. (2011) for graduate and school-leaver applicants to 4-year and 5-year medical training programmes. The 5-year candidates, generally school-leaver applicants, reportedly felt that the MMI gave a more accurate picture of their abilities and that the panel interview was more difficult. In contrast, the 4-year candidates felt the MMI was more difficult.
Reliability
Eighteen studies reported on the reliability of the MMI. Intra-station reliability was reported to reach 0.98 by Lemay et al. (2007). The inter-item reliability (i.e. the internal consistency of the three scores assigned within any one station) and the inter-rater reliability within stations have also been reported to be very high by Dore et al. (2010). However, Finlayson & Townson (2011) conducted a 4-station MMI, each station lasting 15 min, and reported inter-rater reliability ranging from 0.50 to 0.69 for three stations, and 0.10 for one station. Generally the reported reliability ranged from moderate (Roberts et al. 2008) to acceptable (Dore et al. 2010) to high (Lemay et al. 2007), with Cronbach’s alpha ranging from 0.69 to 0.98. However, Finlayson & Townson (2011) reported inter-station reliability ranging from 0.45 to 0.47. Other researchers have also reported low inter-station correlations (Lemay et al. 2007).
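For readers less familiar with the statistic, Cronbach’s alpha for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The Python sketch below is a minimal illustration of that standard formula, not the procedure used in any of the studies reviewed; the function name is hypothetical and the scores are invented to represent three item scores given to five applicants within one station.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of scores.

    `scores` is a list of rows, one row per applicant, each row holding that
    applicant's k item scores (k >= 2). Sample variances are used throughout.
    """
    k = len(scores[0])
    item_variances = [variance([row[i] for row in scores]) for i in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented example data: five applicants, three item scores within one station.
station_scores = [
    [5, 6, 5],
    [3, 4, 4],
    [6, 6, 7],
    [2, 3, 2],
    [4, 5, 5],
]
alpha = cronbach_alpha(station_scores)
print(f"alpha = {alpha:.2f}")
```

Because the three invented item scores rise and fall together across applicants, alpha is high here; items that ranked applicants differently would pull it down, which is the pattern expected between stations that test different attributes.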
Using generalisability analysis, Hecker & Violato (2011) reported a G coefficient of 0.79 for seven stations with two assessors. A decision study (D-study) indicated that G = 0.81 can be achieved from ten stations with one assessor. Similarly, in Dore et al.’s (2010) study, G = 0.55 to 0.72 for seven stations increased to G = 0.64 to 0.79 with 10 stations in a D-study. Reliability between faculty and community veterinarians has also been reported (Hecker & Violato 2011), although community assessors’ ratings were reportedly less consistent than faculty’s (Eva et al. 2004b).
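The effect of adding stations can be approximated with the Spearman-Brown prophecy formula. The Python sketch below is an assumption-laden simplification, not the D-study method used by the authors cited: it treats stations as the only source of error, so it cannot represent a simultaneous change in the number of assessors (as in the Hecker & Violato example), and the function names are hypothetical.

```python
def single_station_reliability(g, n_stations):
    """Back out the reliability of one station from an n-station coefficient."""
    return g / (n_stations - (n_stations - 1) * g)


def project_reliability(g, n_from, n_to):
    """Spearman-Brown projection of a coefficient from n_from to n_to stations."""
    r1 = single_station_reliability(g, n_from)
    return n_to * r1 / (1 + (n_to - 1) * r1)


# Seven-station range quoted above, projected to ten stations.
for g_seven in (0.55, 0.72):
    g_ten = project_reliability(g_seven, n_from=7, n_to=10)
    print(f"G(7) = {g_seven:.2f} -> projected G(10) = {g_ten:.2f}")
```

Under this single-facet assumption the projected ten-station values round to 0.64 and 0.79, in line with the D-study range quoted for Dore et al. (2010).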
Validity
Content validity. The validity of the MMI was discussed in 17 of the 30 studies. One key observation was that the MMI scores did not correlate with traditional admission tools scores such as the personal interview (r = 0.185), undergraduate grades (r = 0.317), simulated tutorial (r = 0.227) and autobiographical sketch (r = 0.170) (Eva et al. 2004c). Other studies reported that the MMI did not correlate with pre-entry academic qualifications, such as the GPA scores (Hecker et al. 2009), pre-pharmacy average (PPA) (r = 0.025) or Pharmacy College Admission Test (PCAT) (r = 0.042) (Cameron & Mackeigan 2012), GAMSAT (coefficient = 0.04) and UK Clinical Aptitude Test (UKCAT) (coefficient = 0.00) (O’Brien et al. 2011). However, positive associations with certain cognitive skills, such as the GAMSAT scores for “Reasoning in Humanities and Social Sciences” (r = 0.26) and “Written Communication” (r = 0.26) (Roberts et al. 2008), and cognitive reasoning skills (Roberts et al. 2009), have been reported, as well as correlation with an autobiographical submission focusing on ethical decision making (r = 0.65) (Dore et al. 2006). The MMI was not reported to be associated with emotional intelligence (Yen et al. 2011).
Predictive validity. For medical students, MMI performance at admission was the best predictor of subsequent OSCE as well as clerkship performance (Eva et al. 2004a). Validity against future non-cognitive assessment was investigated by Eva et al. (2009), who reported that MMI performance at admission was statistically significantly predictive of performance at future examinations, such as the percentage of stations passed in the MCCQE (Medical Council of Canada Qualifying Examination) Part II.
However, a cross-sectional study investigating the association between MMI performance of medical residency applicants and their MCCEE (Medical Council of Canada Evaluating Examination) and MCCQE I scores reported low, non-significant correlations, and also non-significant correlation with MCCQE II scores (Hofmeister et al. 2009). In a more recent study, Eva et al. (2012) reported that better MMI performance at entry to medical school was predictive of higher MCCQE scores.
Discussion
This review assessed the evidence base for the acceptability, feasibility, reliability and validity of the MMI in the selection of health professions students. The key findings were that the MMI was (i) practically feasible in terms of efficient utilisation of time, costs and human resources when compared to the panel interview; (ii) generally acceptable to both interviewees and interviewers; (iii) generally reliable with acceptable Cronbach’s alpha and G-coefficient values; and (iv) predictive of future performance in certain aspects of medical council examinations.
The feasibility of implementing the MMI depends on the financial and human resources available in addition to infrastructure requirements and other miscellaneous expenses. Expertise is also necessary in developing the stations and conducting the interviews. Therefore the initial preparatory costs to develop the MMI are likely to be high (Rosenfeld et al. 2008). The costs are also dependent on the number of stations in a circuit and the length of each station, and whether faculty, students or lay persons are used as assessors. Consider an example of a 10-station circuit of eight minutes per station and one assessor per station. During one circuit of 80 min, ten applicants can be interviewed. Compare this to a traditional interview of 40 min per interview with a panel of two assessors. During 80 min, ten applicants can be interviewed by five panels. The MMI may employ non-faculty assessors such as students and lay persons. Potentially the difference in the running costs can be minimal, and the costs may even be lower when using the MMI. The feasibility of running an MMI is likely to be context dependent and should be tested in a pilot before implementation.
The acceptability of the MMI has been investigated from the perspectives of the interviewees as well as the assessors. Most studies have reported that the MMI is acceptable to both in terms of its fairness and that it does not disadvantage applicants on cultural, gender or socio-economic grounds. By contrast, studies of panel or board interviewers have suggested that they are often biased in relation to an applicant’s sex, race, appearance, similarity to the interviewer, and contrast to other interviewees (Edwards et al. 1990).
This bias is reduced by making the interview as structured as possible (Patrick et al. 2001). The MMI, which has been likened to the OSCE for student selection (Eva et al. 2004c), is designed to be highly structured at each station. In addition, the multiple-station design means that an interviewee disadvantaged at one station may not experience similar disadvantages at the other stations. This may explain its acceptability to interviewees. The use of qualitative research methods has yielded helpful insights into the acceptability of the MMI. For example, Kumar et al. (2009) identified that the scenario-based nature of the MMI made responses harder to rehearse and coach, and indeed it has been reported that performance at the MMI is not associated with self-reported previous coaching (Griffin et al. 2008), and therefore the MMI does not disadvantage applicants with no access to coaching.
Most universities that have adopted the MMI have been graduate entry programmes in Canada. It had been envisaged that graduate applicants would prefer the MMI and perform better. However, a study comparing the MMI and panel interview (O’Brien et al. 2011) reported that pre-university applicants preferred the MMI more than graduate applicants did. This suggests the potential acceptability of the MMI to medical schools that generally recruit pre-university candidates. However, this depends on the design and content
of the MMI stations. The acceptability of the MMI is also likely to be context specific, especially in terms of the attributes and values that are being tested. Again, piloting of the MMI to gather views and perceptions of interviewees and assessors should be carried out prior to implementation.
An impetus for the development of the MMI has been that the traditional panel interview lacks reliability (Kreiter et al. 2004). Interpretation of reliability in the studies reviewed was not always clearly explained. For example, it is expected that intra-station and inter-rater reliability would be high and inter-station reliability low (Lemay et al. 2007; Dore et al. 2010) since different stations may test different attributes. Most studies reported that reliability was moderate (Roberts et al. 2008; Yen et al. 2011), presumably between stations, or high (Hecker et al. 2009), but generally acceptable or satisfactory (Eva et al. 2004c; Dowell et al. 2012). However, reliability would appear to be associated with the number of stations or interviewers (Hecker & Violato 2011), and the content of each station (Lemay et al. 2007). Within an MMI, multiple stations may be designed to assess the same attributes, resulting in high reliability, or very different attributes, resulting in low reliability. Future research on reliability should perform factor analysis to identify which stations assess the same attributes (Lemay et al. 2007), and reliability should be reported for groups of stations assessing the same attributes, or between groups of stations assessing different attributes.
A number of studies carried out generalisability analysis and decision studies to quantify the amount of error caused by each factor, such as number of items per station, number of raters per station, number of stations (Hecker & Violato 2011), and length of time per station (Dodson et al. 2009). This allows computation of overall reliability of the MMI considering all sources of variance (Dore et al.
2010), and would be a more appropriate measure of reliability. The reliability of the MMI has generally been reported to be acceptable. This has been recognised by the Ottawa 2010 Conference in a consensus statement on assessment for selection for the health care professions (Prideaux et al. 2011). Interviewer subjectivity is the largest source of measurement error, suggesting that interviewer training could be helpful (Roberts et al. 2008).
Most studies reported that MMI performance was not associated with pre-entry academic qualifications such as GPA, MCAT and GAMSAT scores. This suggests that the MMI is capable of testing non-cognitive attributes, such as professionalism (Hofmeister et al. 2009), legal, ethical and organisational skills (Eva et al. 2009), motivation, interest in medicine, decision making skills, ability to debate a complex issue (O’Brien et al. 2011), empathy, moral and ethical reasoning, motivation and preparedness to study medicine, teamwork and leadership, honesty and integrity (Till et al. 2013), and advocacy, ambiguity, collegiality and collaboration, cultural sensitivity, responsibility and reliability (Lemay et al. 2007).
The evidence for its predictive validity is somewhat lacking. Our review has identified three longitudinal reports that supported the validity of the MMI in predicting future clinical performance as well as performance in medical council examinations (Eva et al. 2004a, 2009, 2012). The most
recent report by Eva et al. (2012) provided strong evidence for the predictive validity of the MMI. When the MCCQE scores were compared between candidates who had been accepted to and rejected from medical training on the basis of their MMI performance, those accepted, who had higher MMI scores, were reported to achieve higher mean MCCQE Part I and Part II scores. Further evidence for its predictive validity will no doubt emerge as the schools that have implemented the MMI begin to report follow-up of their students.
The findings of this review should be considered in the context of its limitations. Our main limitation was that our search was limited to PubMed and we did not attempt to retrieve the grey literature. It is possible that the grey literature may highlight publication bias, as the literature reviewed tended to point clearly to the feasibility, acceptability and reliability of the MMI. The literature reviewed has been published out of Canada, the USA, the United Kingdom and Australia. A more extensive search may yield evidence from more diverse contexts. However, the MMI can be considered to be still in its infancy and it is unlikely to have been implemented in many parts of the world.
Conclusion
The MMI is acceptable and feasible, although this is likely to be context specific. The evidence available has been generated from high-income countries and there is no literature on its implementation elsewhere. Its reliability is acceptable. However, the evidence for its validity against assessment of non-cognitive attributes is limited. Further longitudinal research is needed to contribute to the evidence base for the validity of the MMI.
Declaration of interest: The authors report no declarations of interest.
Notes on contributors
ALLAN PAU, BDS, MSc, PhD, FDSRCSEd, is Professor in Dental Public Health and the MMI project lead at IMU; carried out the literature search, reviewed the selected papers, performed the analysis and drafted the manuscript.
KAMALAN JEEVARATNAM, DAHP, DVM, MMedSc, PhD, is Senior Lecturer in Physiology, with a special interest in cardiology research and medical education; contributed to reviewing the selected papers and helped to draft the manuscript.
YU SUI CHEN, BSc, MSc, PhD, is Associate Professor in Physiology and Associate Dean of Students Services, with a special interest in small group learning; contributed to reviewing the selected papers.
ABDOUL AZIZ FALL, BSc, MSc, PhD, is Lecturer in Psychology; contributed to reviewing the selected papers.
CHARMAINE KHOO, BA, is Head of Admissions; contributed to reviewing the selected papers.
VISHNA DEVI NADARAJAH, BSc (Hons), PhD, is Professor in Biochemistry and Associate Dean for Teaching and Learning, with a special interest in health professions education research; helped to draft the manuscript.
References
Brownell K, Lockyer J, Collin T, Lemay J. 2007. Introduction of the multiple mini interview into the admissions process at the University of Calgary: Acceptability and feasibility. Med Teach 29(4):394–396.
Cameron AJ, Mackeigan LD. 2012. Development and pilot testing of a multiple mini-interview for admission to a pharmacy degree program. Am J Pharm Educ 76(1):10.
Dixon M, Wang S, Calvin J, Dineen B, Tomlinsen E. 2002. The panel interview: A review of empirical research and guidelines for practice. Pub Personnel Manag 31(3):397–428.
Dodson M, Crotty B, Prideaux D, Carne R, Ward A, de Leeuw E. 2009. The multiple mini-interview: How long is long enough? Med Educ 43(2):168–174.
Dore KL, Kreuger S, Ladhani M, Rolfson D, Kurtz D, Kulasegaram K, Cullimore AJ, Norman GR, Eva KW, Bates S, et al. 2010. The reliability and acceptability of the multiple mini-interview as a selection instrument for postgraduate admissions. Acad Med: J Assoc Am Med Colleges 85(10 Suppl):S60–S63.
Dore KL, Hanson M, Reiter HI, Blanchard M, Deeth K, Eva KW. 2006. Medical school admissions: Enhancing the reliability and validity of an autobiographical screening tool. Acad Med: J Assoc Am Med Colleges 81(10):S70–S73.
Dowell J, Lynch B, Till H, Kumwenda B, Husbands A. 2012. The multiple mini-interview in the U.K. context: 3 years of experience at Dundee. Med Teach 34(4):297–304.
Edwards JC, Johnson EK, Molidor JB. 1990. The interview in the admission process. Acad Med 65(3):167–177.
Eva KW, Reiter HI, Rosenfeld J, Norman GR. 2004a. The ability of the multiple mini-interview to predict pre-clerkship performance in medical school. Acad Med: J Assoc Am Med Colleges 79(10 Suppl):S40–42.
Eva KW, Reiter HI, Rosenfeld J, Norman GR. 2004b. The relationship between interviewers’ characteristics and ratings assigned during a multiple mini-interview. Acad Med: J Assoc Am Med Colleges 79(6):602–609.
Eva KW, Reiter HI, Rosenfeld J, Trinh K, Wood TJ, Norman GR. 2012. Association between a medical school admission process using the multiple mini-interview and national licensing examination scores. JAMA: J Am Med Assoc 308(21):2233–2240.
Eva KW, Reiter HI, Trinh K, Wasi P, Rosenfeld J, Norman GR. 2009. Predictive validity of the multiple mini-interview for selecting medical trainees. Med Educ 43(8):767–775.
Eva KW, Rosenfeld J, Reiter HI, Norman GR. 2004c. An admissions OSCE: The multiple mini-interview. Med Educ 38(3):314–326.
Finlayson HC, Townson AF. 2011. Resident selection for a physical medicine and rehabilitation program: Feasibility and reliability of the multiple mini-interview. Am J Phys Med Rehabil/Assoc Acad Physiat 90(4):330–335.
Goho J, Blackman A. 2006. The effectiveness of academic admission interviews: An exploratory meta-analysis. Med Teach 28(4):335–340.
Griffin B, Harding DW, Wilson IG, Yeomans ND. 2008. Does practice make perfect? The effect of coaching and retesting on selection tests used for admission to an Australian medical school. Med J Aus 189(5):270–273.
Griffin B, Wilson I. 2012. Associations between the big five personality factors and multiple mini-interviews. Adv Health Sci Educ 17(3):377–388.
Hecker K, Donnon T, Fuentealba C, Hall D, Illanes O, Morck DW, Muelling C. 2009. Assessment of applicants to the veterinary curriculum using a multiple mini-interview method. J Vet Med Educ 36(2):166–173.
Hecker K, Violato C. 2011. A generalizability analysis of a veterinary school multiple mini interview: Effect of number of interviewers, type of interviewers, and number of stations. Teach Learn Med 23(4):331–336.
Hofmeister M, Lockyer J, Crutcher R. 2009. The multiple mini-interview for selection of international medical graduates into family medicine residency education. Med Educ 43(6):573–579.
Hofmeister M, Lockyer J, Crutcher R. 2008. The acceptability of the multiple mini interview for resident selection. Fam Med 40(10):734–740.
Jerant A, Griffin E, Rainwater J, Henderson M, Sousa F, Bertakis KD, Fenton JJ, Franks P. 2012. Does applicant personality influence multiple mini-interview performance and medical school acceptance offers? Acad Med: J Assoc Am Med Colleges 87(9):1250–1259.
Kreiter CD, Yin P, Solow C, Brennan RL. 2004. Investigating the reliability of the medical school admissions interview. Adv Health Sci Educ: Theory Pract 9(2):147–159.
Kulasegaram K, Reiter HI, Wiesner W, Hackett RD, Norman GR. 2010. Non-association between neo-5 personality tests and multiple mini-interview. Adv Health Sci Educ: Theory Pract 15(3):415–423.
Kumar K, Roberts C, Rothnie I, du Fresne C, Walton M. 2009. Experiences of the multiple mini-interview: A qualitative analysis. Med Educ 43(4):360–367.
Lemay JF, Lockyer JM, Collin VT, Brownell AK. 2007. Assessment of non-cognitive traits through the admissions multiple mini-interview. Med Educ 41(6):573–579.
Moreau K, Reiter H, Eva KW. 2006. Comparison of aboriginal and non-aboriginal applicants for admissions on the multiple mini-interview using aboriginal and non-aboriginal interviewers. Teach Learn Med 18(1):58–61.
O’Brien A, Young C, Lomax A. 2012. A novel student-selected component in medical admissions. Med Educ 46(5):510–511.
O’Brien A, Harvey J, Shannon M, Lewis K, Valencia O. 2011. A comparison of multiple mini-interviews and structured interviews in a UK setting. Med Teach 33(5):397–402.
Patrick LE, Altmaier EM, Kuperman S, Ugolini K. 2001. A structured interview for medical school admission, phase 1: Initial procedures and results. Acad Med 76(1):66–71.
Prideaux D, Roberts C, Eva K, Centeno A, Mccrorie P, Mcmanus C, Patterson F, Powis D, Tekian A, Wilkinson D. 2011. Assessment for selection for the health care professions and specialty training: Consensus statement and recommendations from the Ottawa 2010 conference. Med Teach 33(3):215–223.
Razack S, Faremo S, Drolet F, Snell L, Wiseman J, Pickering J. 2009. Multiple mini-interviews versus traditional interviews: Stakeholder acceptability comparison. Med Educ 43(10):993–1000.
Roberts C, Rothnie I, Zoanetti N, Crossley J. 2010. Should candidate scores be adjusted for interviewer stringency or leniency in the multiple mini-interview? Med Educ 44(7):690–698.
Roberts C, Walton M, Rothnie I, Crossley J, Lyon P, Kumar K, Tiller D. 2008. Factors affecting the utility of the multiple mini-interview in selecting candidates for graduate-entry medical school. Med Educ 42(4):396–404.
Roberts C, Zoanetti N, Rothnie I. 2009. Validating a multiple mini-interview question bank assessing entry-level reasoning skills in candidates for graduate-entry medicine and dentistry programmes. Med Educ 43(4):350–359.
Rosenfeld JM, Reiter HI, Trinh K, Eva KW. 2008. A cost efficiency comparison between the multiple mini-interview and traditional admissions interviews. Adv Health Sci Educ: Theory Pract 13(1):43–58.
Till H, Myford C, Dowell J. 2013. Improving student selection using multiple mini-interviews with multifaceted Rasch modeling. Acad Med: J Assoc Am Med Colleges 88(2):216–223.
Uijtdehaage S, Doyle L, Parker N. 2011. Enhancing the reliability of the multiple mini-interview for selecting prospective health care leaders. Acad Med: J Assoc Am Med Colleges 86(8):1032–1039.
Wilkinson D, Zhang J, Byrne GJ, Luke H, Ozolins IZ, Parker MH, Peterson RF. 2008. Medical school selection criteria and the prediction of academic performance. Med J Aus 188(6):349–354.
Yen W, Hovey R, Hodwitz K, Zhang S. 2011. An exploration of the relationship between emotional intelligence (EI) and the multiple mini-interview (MMI). Adv Health Sci Educ: Theory Pract 16(1):59–67.
Canada
Canada
Canada
Canada
Eva et al. (2004a,c)
Eva et al. (2004b,c)
Dore et al. (2006)
Country
Eva et al. (2004c)
Title of paper
9 (3 with faculty, 3 with community, 3 with both)
12
Aims and hypothesis: To assess the consistency of ratings by faculty members to community members
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 54
Aims and hypothesis: (1) To investigate association between autobiographical submission (ABS) and MMI scores
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 30
NS
2
1
10
Aims and hypothesis: (1) To assess the relationship between pre-clerkship performance and the MMI and traditional admission tools
Study design: Longitudinal
Study population: Medical applicants
Sample size: 45 enrolled students (of 117 applicants)
1
Sample size
NS
8
8
8
Assessors per station
Time/station (min)
10
Study design Study population
No. of stations
Aims and hypothesis: (1) To develop a multiple sample approach to the personal interview using an innovative admission protocol. (2) To report the results from two studies using this protocol.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 117
Aims and hypothesis
No gender differences. Positive experience for applicants and examiners.
Acceptability
Appendix A1. Review of papers reporting on the MMI’s feasibility, acceptability, reliability and validity.
Med Teach Downloaded from informahealthcare.com by International Medical Univ on 01/23/14 For personal use only.
Feasibility
Number of stations more important than number of interviewers or station for reliability. Community members’ rating less consistent compared to faculty
MMI scores do not correlate with traditional admission tools score.
The MMI is a reliable tool (Cronbach’s α = 6.5)
The new ABS scoring method correlated better with the MMI relative to the traditional method
1. MMI was the best predictor for OSCE performance 2. GPA was the most consistent predictor of performance on MCQ examination of medical knowledge
Validity
Reliability
Country
Study design Study population
Sample size
Aims and hypothesis: (1) To study the acceptability of MMI by applicants and interviewers. (2) To study feasibility
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 281
(ABS scores obtained from raters evaluating one ABS question across multiple candidates before scoring the next ABS question (horizontal assessment), compared to evaluating all questions for one candidate (vertical assessment). ABS scores also obtained off-site and on-site.)
Country: Canada
Aims and hypothesis: To determine whether the MMI is a barrier to aboriginal applicants
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 5 self-declared aboriginal applicants and 7 general-pool applicants.
Brownell et al. (2007)
Canada
Moreau et al. (2006)
Title of paper
Aims and hypothesis
10
11
No. of stations
81 total (27 faculty, 27 students, 27 community)
11 (6 aboriginal raters and 5 non-aboriginal raters)
8
Assessors per station
Time/station (min)
Acceptability
NS
Feasibility
Validity
Feasibility: Does not require more examiners, does not cost more, and interviews completed over a short period of time.
Acceptability: High number of applicants willing to participate in MMI again. Free of gender and cultural bias. Sufficient time to present ideas.
Neither aboriginal-specific rater training nor aboriginal rater assignment is required to ensure a level playing field for the assessment of applicants’ personal qualities
Reliability
Canada
Canada
Australia
Australia
Lemay et al. (2007)
Hofmeister et al. (2008)
Griffin et al. (2008)
Roberts et al. (2008)
Aims and hypothesis: To establish whether interviewers’ decisions are reliable and valid
Study design: Cross-sectional
Study population: Medical applicants
485
12
Aims and hypothesis: (1) To investigate the acceptability of the MMI to family medicine interviewers and International Medical Graduates (IMGs) applicants
Study design: Cross-sectional with qualitative
Study population: Residency applicants (IMGs)
Sample size: 71
Aims and hypothesis: (1) To assess effect of coaching on UMAT and MMI scores. (2) To assess effect of previous MMI experience on MMI score
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 287 (17 with previous MMI experience)
8
9
10
Aims and hypothesis: (1) To assess the reliability and validity of the MMI for non-cognitive attributes. (2) To assess if MMI stations distinguish between different non-cognitive attributes. (3) To assess if the MMI discriminates between those accepted and those placed on the waitlist. (4) To assess association between applicant sociodemographic characteristics and admission/waitlist status.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 281
155 total
NS
1
1 per station
7
NS
8
8 min
Candidate performance on one question does not correlate strongly with another, demonstrating the importance of context specificity.
Coaching had no effect on UMAT or MMI scores
High levels of satisfaction. 8 min acceptable. Free from gender and cultural bias.
Socio-demographic differences were not associated with status on acceptance or waiting lists.
Reliability: Moderately reliable. Interviewer subjectivity is largest source of measurement error.
Validity: Small positive correlation with GAMSAT section scores for ‘Reasoning in Humanities and Social Sciences’ and ‘Written Communication’.
Previous MMI experience improved subsequent performance in same stations but not in new stations
Reliability: Cronbach’s alpha for each station ranged from 0.97–0.98. Low correlations between stations.
Validity: Factor analysis suggested each station assessed different attributes. Significant differences in MMI scores between those accepted and those on the waitlist (based on weightings of 40% for the file review, 48% for the MMI, and 12% for the invigilated essay).
100
Country: Australia (although interviews were also carried out in Canada)
Aims and hypothesis: (1) To explore participants’ satisfaction with the MMI. (2) To explore perceptions of MMI strengths and weaknesses.
Study design: Cross-sectional with qualitative focus groups
Study population: Medical applicants
Sample size: 442 applicants and 70 interviewers completed questionnaire. 37 interviewers into 6 focus groups.
Cross-sectional Medical applicants
Kumar et al. (2009)
(1) To study the perception of both applicants and evaluators about MMI
Canada
122
Sample size
Razack et al. (2009)
Aims and hypothesis: (1) To assess reliability, validity and acceptability of MMI
Study design: Cross-sectional
Study population: Veterinary applicants
Study design Study population
Canada
Country
Hecker et al. (2009)
Title of paper
Aims and hypothesis
1 to 2
1
8
2
7
8
8
Assessors per station
Time/station (min)
10
7 (2 tested cognitive knowledge and psychomotor skills
No. of stations
Feasibility
Validity
Reliability: Reliability was high (Cronbach’s α > 0.8).
Validity: The MMI score not associated with GPA. Factor analyses suggested MMI stations valid in measuring ethical and moral values. GPA score loaded to academic ability score. Age loaded to inter-personal ability.
Reliability
MMI format enhanced the building of rapport. Gave chance to redeem a ‘bad first impression with one person’ and to regain their composure. Scenario-based nature made it harder for rehearsal and coaching of responses. MMI format ‘very stressful’ because ‘have to change thought processes and scenarios every 7 min’, but also viewed as a useful skill in the workplace. Lack of opportunity to discuss specific personal qualities.
MMI is enjoyable. Allows competitiveness. Fair. Provides opportunities to show strength
Acceptable to both interviewers and interviewees
Acceptability
Canada
Canada
Australia
Australia
Canada
Eva et al. (2009)
Hofmeister et al. (2009)
Roberts et al. (2009, 2010)
Dodson et al. (2009)
Kulasegaram et al. (2010)
175
686
Aims and hypothesis: (1) To assess association between personality measures and MMI performance. (2) To determine components of the big five that are related to non-cognitive skills as assessed by the MMI.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 152 (28.3%) of 538 people completed the Neo-5 personality test
Aims and hypothesis: To determine whether MMI stations can be shortened
Study design: Cross-sectional
Study population: Medical applicants
Aims and hypothesis: (1) To assess the extent to which MMI contributes to entry-level cognitive reasoning skills and professionalism
Study design: Cross-sectional
Study population: Medical and dental applicants
71
Study population: Medical applicants and resident doctors
Sample size: 29 residents who participated in the MMI as residents (PG sample) and 34 graduates who participated in the MMI as medical school applicants (UG sample)
9
Aims and hypothesis: (1) To assess validity against compulsory examinations, including the AIMG OSCE, the MCCEE, and MCCQE I and MCCQE II scores
Study design: Cross-sectional
Study population: Residency applicants
Aims and hypothesis: (1) To test the stability of performance of MMI. (2) To correlate MMI scores and Part II of MCCQE. (3) To correlate measures of cognitive and non-cognitive competencies.
Study design: Longitudinal
12
10
8
12
for PG sample, 12 for UG sample
1
81 total
1 per station (207 interviewers in total)
1 per station (33 total)
1
8
5 stations scored at 8 min, 5 scored at 5 and 8 min
7
8
10
Reducing the duration of stations from 8 to 5 min conserves resources with minimal effect on applicant ranking and test reliability
At time of application to medical school: GPA and MMI scores were inversely correlated. MMI predictive of percentage of stations passed in the MCCQE Part II in PG sample. MMI only significant predictor of percentage of stations passed in the MCCQE Part II in UG sample.
No statistically significant correlation was found between personality factors and the MMI.
MMI is able to assess cognitive reasoning skills, as well as non-cognitive attributes
Low, non-significant correlations with OSCE overall MCCEE and MCCQE I scores. A higher non-significant correlation with MCCQE II scores
Reliability of any one station is consistently low. The average performance score across 12 stations has a median reliability of 0.73.
UK
Canada
Finlayson & Townson (2011)
O’Brien et al. (2011, 2012)
Canada
Yen et al. (2011)
Country
Canada
Dore et al. (2010)
Title of paper
16
Aims and hypothesis: (1) To evaluate reliability, feasibility and acceptability. (2) To assess its concurrent validity compared to the standard interviews (SIs). (3) To assess relationship between performance and socio-demographics. (4) To assess whether aptitude tests (GAMSAT and UKCAT) predicted performance in either the SIs or MMIs.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 21 applicants from 4-year MBBS, 26 from 5-year MBBS (out of 350 interviewed by panel)
Aims and hypothesis: (1) To determine feasibility of MMI. (2) To determine inter-rater reliability
Study design: Cross-sectional
Study population: Residency applicants
196
Aims and hypothesis: (1) To compare MMI scores with Bar-On EQ-i emotional intelligence. (2) To investigate whether EQ-i can serve as proxy measure to MMI
Study design: Cross-sectional
Study population: Health Science applicants
Sample size
484
Study design Study population
Aims and hypothesis: (1) To assess reliability of the MMI. (2) To assess acceptability by candidates and interviewers
Study design: Cross-sectional
Study population: Residency applicants
Aims and hypothesis
2
NS
8
NS
1 to 2
5
15
NS
NS
Assessors per station
Time/station (min)
4
8
7
No. of stations
No difference in MMI scores between the 5- and 4-year streams, males and females, or different age groups. The 5-year candidates felt that the MMI gave a more accurate picture of their abilities and that the SI was more difficult. The 4-year candidates felt the MMI was more difficult.
Acceptability reported by high percentage of candidates and interviewers
Acceptability
Process feasible, time and cost efficient.
Feasibility
Reliability
MMI not correlated to EQ-i at the total or subscale level. EQ-i cannot discriminate between accepted and not-accepted students.
Validity
Reliability: Cronbach’s α was 0.69 for the 4-year stream, and 0.73 for the 5-year stream.
Validity: No significant difference between the SIs and MMI scores in the MBBS 4 candidates whereas the MBBS 5 candidates performed better on the MMI.
Inter-rater reliability from 0.50 to 0.69 for three stations, 0.10 for one. Interstation reliability from 0.45 to 0.47.
MMI moderately reliable (0.75), presumably between stations.
Overall reliability 0.55 to 0.72. D-study estimated increase to 0.64–0.79 with 10 stations.
Aims and hypothesis: (1) To assess the effect of the number of MMI stations, and the type and number of interviewers on reliability
Study design: Cross-sectional
Study population: Veterinary applicants
Canada
2012
Canada
Hecker & Violato (2011)
Dowell et al. (2012)
Cameron & Mackeigan (2012)
Aims and hypothesis: (1) To assess the feasibility and acceptability of the MMI. (2) To determine optimal station duration. (3) To assess the discriminant validity of the MMI. (4) To assess the reliability of the overall MMI score.
Study design: Cross-sectional with qualitative
Study population: Pharmacy 1st year students
Sample size: 30 (8 students, 7 faculty, 15 practitioners)
Aims and hypothesis: (1) To describe progression from a traditional interview to a full-scale MMI. (2) To report on its psychometric properties. (3) To report views of applicants and assessors.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 452 from 2009 cohort and 477 from 2010 cohort
103
Aims and hypothesis: (1) To assess reliability of the MMI. (2) To assess acceptability by candidates and interviewers
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 154 (76 in 2009 cohort, 78 in 2010 cohort)
USA
Uijtdehaage et al. (2011)
10
10
7
12
30 total
1
2 per station (1 faculty, 1 community vet)
28 total
7
10
Graduates/mature candidates outperformed UK school-leavers and overseas candidates. Acceptable to all parties.
No difference based on sex or socio-economic disadvantage. Transparent, uniform and fair.
Time/station (min): 8 min at 5 stations and 6 min at 5 stations
Acceptability: Most interviewers judged 6 min ‘just right’ and 8 min ‘a bit long.’ Applicants had opposite opinion.
NS
Reliability: Station scores by student interviewers higher than those of faculty or practitioners. Overall score reliability was high.
Validity: Station scenarios had face validity for candidates and interviewers. Correlations with pre-pharmacy average (PPA) and Pharmacy College Admission Test (PCAT) composite were negligible.
Reliability was satisfactory and consistent. Students were less lenient, and made more use of the full range of the rating scales and were just as reliable as staff.
Preliminary reliability in 2009 was lower than reported in previous studies. Improved in 2010 after an easy station was replaced with a more challenging one and a new scoring rubric
Reliability adequate with 7 stations, 2 raters (G = 0.79). Similar reliability estimated for 10 stations, 1 rater (G = 0.81). Community vets and faculty demonstrated adequate level of agreement
USA
Country
Eva et al. (2012)
Canada
Griffin & Wilson (2012)
Australia
Jerant et al. (2012)
Title of paper
Study design Study population
Sample size
Aims and hypothesis: To investigate whether the MMI would yield good prediction of national licensing examination scores.
Study design: Longitudinal
Study population: Medical applicants
Sample size: 751 interviewees matched to MCCQE scores
Aims and hypothesis: (1) To assess the association between personality factors and MMI performance. (2) To assess the associations of personality factors and MMI performance with medical school acceptance offers.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 444
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 868 over 3 years
Aims and hypothesis
12
10
No. of stations
1
1
8
10
Assessors per station
Time/station (min)
Acceptability
Feasibility
Reliability
Candidates accepted by higher MMI scores achieved higher mean MCCQE part I scores compared to rejected candidates. Higher MMI scores also associated with higher total mean MCCQE part II scores.
Associated with extraversion, conscientiousness and agreeableness. Correlated with interpersonal understanding. Unrelated to logical reasoning ability, non-verbal reasoning, or past academic performance
Higher extraversion correlated with better MMI performance. Those offered acceptance more likely to have higher MMI scores, higher extraversion and agreeableness scores.
Validity
Till et al. (2013)
UK
Aims and hypothesis: To explore (1) whether the MMI can reliably differentiate candidates at variable levels of non-cognitive ability, (2) whether three different groups of examiners exhibited any systematic differences in their rating patterns, and (3) how correcting for those differences would affect candidates’ scores and selection.
Study design: Cross-sectional
Study population: Medical applicants
Sample size: 452 from 2009 Dundee cohort
No. of stations: 10
1
7
Reliability: Examiners, candidates, and stations accounted for about 32% of the total variance, whereas the non-cognitive ability of the candidates accounted for 16%, about 4% reflected in differences in station difficulty.
Validity: MMI able to separate candidates into four levels of non-cognitive ability.