Valid and Reliable Survey Instruments to Measure Burnout, Well-Being, and Other Work-Related Dimensions

A key organizational strategy for improving clinician well-being is to measure it, develop and implement interventions, and then re-measure it. A variety of dimensions of clinician well-being can be measured, including burnout, engagement, and professional satisfaction. Below is a summary of established tools to measure work-related dimensions of well-being. Each tool has advantages and disadvantages, and some are more appropriate for specific populations or settings. This information is being provided by the Research, Data, and Metrics Working Group of the National Academy of Medicine Action Collaborative on Clinician Well-Being and Resilience.

Scroll below for an overview of each validated instrument to assess work-related dimensions of well-being.

Valid and Reliable Survey Instruments to Measure Burnout

Purpose

To measure burnout in individuals who work with people (human services and medical professionals).

Format/Data Source

Maslach Burnout Inventory – Human Services Survey for Medical Personnel (MBI-HSS MP) is a 22-item survey that covers 3 areas: Emotional Exhaustion (EE), Depersonalization (DP), and low sense of Personal Accomplishment (PA).  Each subscale includes multiple questions with frequency rating choices of Never, A few times a year or less, Once a month or less, A few times a month, Once a week, A few times a week, or Every day.

Date

Measure released in 1981.

Data Analysis

It is preferred to examine relationships with subscale scores as continuous variables and outcomes.  Investigators often dichotomize results into burnout – non-burnout but there is no accepted standard definition.1  A common approach considers individuals as presenting at least one symptom of burnout if they have high scores on either the EE (total score of 27 or higher) or DP (total score of 10 or higher) subscales.  Evidence indicates that high scores on these subscales can distinguish clinical burnout from the non-burned out2  because this approach identifies individuals whose degree of burnout places them at increased risk of potentially serious personal and professional consequences.3-8  An alternative approach considers individuals to have burnout if they have a high EE score plus either a high DP score or a low PA score (PA score less than 33).1  
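The two dichotomization approaches described above can be sketched in code. This is a minimal illustration, assuming subscale totals have already been computed; the `MbiScores` container and the function names are hypothetical and not part of the MBI manual, and the cut-offs are the conventions quoted above rather than a validated standard.

```python
from dataclasses import dataclass

# Cut-offs quoted in the text; conventions, not an accepted standard definition.
HIGH_EE = 27  # Emotional Exhaustion total score of 27 or higher
HIGH_DP = 10  # Depersonalization total score of 10 or higher
LOW_PA = 33   # Personal Accomplishment total score less than 33


@dataclass
class MbiScores:
    """Subscale totals for one respondent (hypothetical container)."""
    ee: int
    dp: int
    pa: int


def has_burnout_symptom(s: MbiScores) -> bool:
    """Common approach: a high score on either the EE or the DP subscale."""
    return s.ee >= HIGH_EE or s.dp >= HIGH_DP


def has_burnout_alternative(s: MbiScores) -> bool:
    """Alternative approach: high EE plus either high DP or low PA."""
    return s.ee >= HIGH_EE and (s.dp >= HIGH_DP or s.pa < LOW_PA)
```

The two rules can disagree for the same respondent (for example, high EE alone satisfies the first rule but not the second), which is one reason analyzing subscale scores as continuous variables is preferred.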

Development and Testing

The instrument was developed following exploratory research with interview and questionnaire data, testing in a variety of health and service occupations, and factor and confirmatory data analysis.  Reliability coefficients, test-retest reliability, convergent validity, and discriminant validity among human services professionals are summarized in the manual.10

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

Substantial data 11,12 supports associations between burnout as measured using the MBI and health care related outcomes (e.g., medical error,5,7,8 malpractice,13 suboptimal patient care practices,14 physician turnover and early retirement,15,16 and lower medical knowledge 17), suboptimal professionalism,4,18 and personal outcomes (e.g., alcohol abuse,19-21 suicidal ideation,3,6 and motor vehicle incidents 22).  From a health system characteristics perspective, associations have been found between burnout and practice setting, work hours, clerical burden, and specialty.11,23-26

Country of Origin

United States of America

Past or Validated Applications

  • Participant age: adults
  • Population: human services/helping professionals (e.g., teachers, social workers, police officers), including physicians, residents/fellows, medical students, and nurses
    •  National benchmark data available:
      • US physicians: Yes 24
      • US residents/fellows: Yes 27
      • US medical students: Yes 27
      • General population: Yes 24
  • Setting: workers in human service/helping professions

Cost

Individual Report – $15; Group Report – $200. Instrument is proprietary. Permission can be obtained through www.mindgarden.com.

Notes

  • Multiple language translations are available
  • The MBI-General Survey (MBI-GS) is a 16-item assessment applicable to more general, non-social jobs as well.10
  • The MBI-Human Services Survey (MBI-HSS) is a 22-item assessment, applicable to human services jobs, e.g. clergy, police, therapists, social workers, medical, etc.10

References

  1. Dyrbye LN, West CP, Shanafelt TD. Defining burnout as a dichotomous variable. Journal of General Internal Medicine 2009;24:440.
  2. Schaufeli WB, Hoogduin K, et al. On the clinical validity of the Maslach Burnout Inventory and the burnout measure. Psychol Health 2001;16:565-582.
  3. Dyrbye LN, Thomas MR, Massie FS, et al. Burnout and suicidal ideation among US medical students. Ann Intern Med 2008;149:334-341.
  4. Dyrbye LN, Massie FS, Jr., Eacker A, et al. Relationship between burnout and professional conduct and attitudes among US medical students. JAMA 2010;304:1173-1180.
  5. Shanafelt TD, Balch CM, Bechamps G, et al. Burnout and medical errors among American surgeons. Annals of Surgery 2010;251:995-1000.
  6. Shanafelt TD, Balch CM, Dyrbye LN, et al. Suicidal ideation among American surgeons. Arch Surg 2011;146:54-62.
  7. West C, Huschka M, Novotny P, et al. Association of perceived medical errors with resident distress and empathy: A prospective longitudinal study. JAMA 2006;296:1071-1078.
  8. West CP, Tan AD, Habermann TM, Sloan JA, Shanafelt TD. Association of resident fatigue and distress with perceived medical errors. JAMA 2009;302:1294-1300.
  9. Leiter MP & Maslach C. (2016). Latent burnout profiles: A new approach to understanding the burnout experience. Burnout Research, 3, 89-100.
  10. Maslach C, Jackson SE, Leiter MP (2018) Maslach Burnout Inventory: Manual 4th Menlo Park, CA: Mind Garden, Inc.
  11. Dyrbye LN, Shanafelt TD, Sinsky CA, Cipriano PF, et al. Burnout among health care professionals: A call to explore and address this underrecognized threat to safe, high-quality care. NAM Perspectives Discussion Paper, National Academy of Medicine, Washington DC. https://nam.edu/burnout-among-health-care-professionals-a-call-to-explore-and-address-this-underrecognized-threat-to-safe-high-quality-care
  12. Wallace JE, Lemaire JB, Ghali WA. Physician wellness: a missing quality indicator. Lancet 2009;374:1714-1721.
  13. Balch CM, Oreskovich MR, Dyrbye LN, et al. Personal consequences of malpractice lawsuits on American surgeons. J Am Coll Surg 2011;213:657-667.
  14. Shanafelt TD, Bradley K, Wipf J, Back A. Burnout and self-reported patient care in an Internal Medicine residency program. Ann Intern Med 2002;136:358-367.
  15. Shanafelt TD, Mungo M, Schmitgen J, et al. Longitudinal Study Evaluating the Association Between Physician Burnout and Changes in Professional Work Effort. Mayo Clinic Proceedings 2016;91:422-431.
  16. Shanafelt TD, Dyrbye LN, West CP, Sinsky C. Potential Impact of Burnout on the US Physician Workforce. Mayo Clin Proc 2016;91:1667-
  17. West C, Shanafelt TD, Kolars J. Quality of life, burnout, educational debt, and medical knowledge among internal medicine residents. JAMA 2011;306:952-960.
  18. Dyrbye LN, West CP, Satele D, Boone S, Sloan J, Shanafelt TD. A national study of medical students’ attitudes toward self-prescribing and responsibility to report impaired colleagues. Acad Med 2015;90:485-493.
  19. Jackson ER, Shanafelt TD, Hasan O, Satele D, Dyrbye LN. Burnout and alcohol abuse/dependence among U.S. medical students. Acad Med 2016;91:1251-1256.
  20. Oreskovich MR, Kaups KL, Balch CM, et al. The prevalence of alcohol use disorders among American surgeons. Arch Surg 2012;147:168-174.
  21. Oreskovich MR, Shanafelt TD, Dyrbye LN, et al. The prevalence of substance use disorders in American physicians. Am J Addictions 2015;1:30-38.
  22. West CP, Tan AD, Shanafelt TD. Association of resident fatigue and distress with occupational blood and body fluid exposures and motor vehicle incidents. Mayo Clinic Proc 2012;87:1138-1144.
  23. Shanafelt TD, Dyrbye LN, Sinsky C, et al. Relationship between clerical burden and characteristics of the electronic environment with physician burnout and professional satisfaction. Mayo Clin Proc 2016;91:836-848.
  24. Shanafelt TD, Hasan O, Dyrbye LN, et al. Changes in burnout and satisfaction with work-life balance in physicians and the general US working population between 2011 and 2014. Mayo Clin Proc 2015;90:1600-1613.
  25. Shanafelt TD, Balch CM, Bechamps GJ, et al. Burnout and career satisfaction among American surgeons. Annals of Surgery 2009;250:463-471.
  26. Shanafelt TD, Gorringe G, Menaker R, et al. Impact of organizational leadership on physician burnout and satisfaction. Mayo Clinic Proceedings 2015;90:432-440.
  27. Dyrbye LN, West CP, Satele D, et al. Burnout among U.S. medical students, residents, and early career physicians relative to the general U.S. population. Acad Med 2014;89:443-451

Purpose

To measure burnout in any occupational group.

Format/Data Source

The Oldenburg Burnout Inventory (OLBI) is a 16-item survey with positively and negatively framed items that covers 2 areas: exhaustion (physical, cognitive, and affective aspects) and disengagement from work (negative attitudes toward work objects, work content, or work in general).1  There are multiple questions for each of these subscales, and responses are in the form of a 4-point Likert scale from strongly agree (1) to strongly disagree (4).

Date

Measure released in 2002.

Measure Item Mapping

  • Exhaustion: 2, 4, 5, 8, 10, 12,14,16
  • Disengagement: 1, 3, 6, 7, 9, 11, 13,15 

Data Analysis

Each burnout dimension is treated separately as a continuous variable. 
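As a rough sketch of treating each dimension as a continuous variable, the item mapping above can be turned into two subscale scores. The source does not say whether subscales are summed or averaged, nor which items are reverse scored, so this example assumes a mean score and assumes that responses have already been recoded so that a higher number means more burnout. The function name is hypothetical.

```python
from statistics import mean

# Item numbers per subscale, as listed in the item mapping above.
EXHAUSTION_ITEMS = (2, 4, 5, 8, 10, 12, 14, 16)
DISENGAGEMENT_ITEMS = (1, 3, 6, 7, 9, 11, 13, 15)


def olbi_subscale_means(responses: dict[int, int]) -> dict[str, float]:
    """Return the mean item score for each OLBI dimension.

    `responses` maps item number (1-16) to a 1-4 rating. ASSUMPTION: ratings
    are already recoded so that higher values indicate more burnout.
    """
    missing = set(range(1, 17)).difference(responses)
    if missing:
        raise ValueError(f"missing items: {sorted(missing)}")
    if any(not 1 <= responses[i] <= 4 for i in range(1, 17)):
        raise ValueError("ratings must be between 1 and 4")
    return {
        "exhaustion": mean(responses[i] for i in EXHAUSTION_ITEMS),
        "disengagement": mean(responses[i] for i in DISENGAGEMENT_ITEMS),
    }
```

Keeping the two scores separate, rather than collapsing them into one total, mirrors the guidance that each dimension is analyzed on its own.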

Development and Testing

Developed in response to the MBI not having negatively worded items, and based on the job demands-resources model, in which job demands are primarily related to exhaustion and job resources are primarily related to disengagement.2,3  A two-factor structure has been confirmed in a sample of Dutch workers,3 Dutch physicians,1 and US workers,4 whereas a four-factor model (exhaustion, energy, disengagement, and engagement) was supported in a study of Chinese nurses.5  There is some evidence of convergent validity of the OLBI with a shortened (16-item) version of the MBI-GS in a sample of 2431 US workers 4 and in a sample of Chinese nurses, though the convergent validity data suggest positively worded items should be dropped.5  In a study of 232 Greek employees, the bivariate correlation between OLBI-exhaustion and MBI-GS-emotional exhaustion was 0.6, and the bivariate correlation between OLBI-disengagement and MBI-GS depersonalization was 0.6.3  In a study of 528 South African employees working in construction, the bivariate correlation between OLBI-exhaustion and MBI-GS-emotional exhaustion was 0.6, and the bivariate correlation between OLBI-disengagement and MBI-GS depersonalization was 0.37.6

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

Existing data is limited as a majority of studies have included small samples of physicians and other health care providers, and have mostly been conducted outside of the United States.  Studies in Swedish nurses and other Swedish public health professionals suggest that OLBI scores predict intent of turnover and lower self-reported mastery of occupational skills.7-9  Correlations have also been reported between OLBI scores and self-rated health (n=342 Swedish medical students 10 and n=290 medical residents 11).  In a longitudinal sample of 186 Swedish medical students, end of medical school OLBI-exhaustion and worries about their future endurance/competence predicted 6-10 month postgraduate OLBI-exhaustion.12

Country of Origin

Germany

Past or Validated Applications

  • Participant age: adults
  • Population: any occupational group
    • National benchmark data not available for US physicians, medical students, or general population.
  • Setting: any 

Cost

$0. Instrument publicly available in appendix of article.6

Notes

  • Multiple language translations are available

References

  1. The Oldenburg Burnout Inventory: A good alternative to measure burnout and engagement. 2007. (Accessed August 14, 2017, at https://www.researchgate.net/publication/46704152_The_Oldenburg_Burnout_Inventory_A_good_alternative_to_measure_burnout_and_engagement.)
  2. Demerouti E, Bakker AB, Nachreiner F, Schaufeli WB. A model of burnout and life satisfaction amongst nurses. Journal of Advanced Nursing 2000;32:454-64.
  3. Demerouti E, Bakker AB, Nachreiner F, Schaufeli WB. The job demands-resources model of burnout. Journal of Applied Psychology 2001;86:499-512.
  4. Halbesleben JRB, Demerouti E. The construct validity of an alternative measure of burnout: Investigating the English translation of the Oldenburg Burnout Inventory. Work & Stress 2005;19:208-20.
  5. Qiao H, Schaufeli W. The convergent validity of four burnout measures in a chinese sample: A confirmatory factor-analytic approach. Applied Psychology 2011;60:87-111.
  6. Demerouti E, Mostert K, Bakker AB. Burnout and work engagement: a thorough investigation of the independency of both constructs. Journal of Occupational Health Psychology 2010;15:209-22.
  7. Rudman A, Gustavsson P, Hultell D. A prospective study of nurses’ intentions to leave the profession during their first five years of practice in Sweden. International Journal of Nursing Studies 2014;51:612-24.
  8. Rudman A, Gustavsson JP. Burnout during nursing education predicts lower occupational preparedness and future clinical performance: a longitudinal study. International Journal of Nursing Studies 2012;49:988-1001.
  9. Peterson U, Bergstrom G, Demerouti E, Gustavsson P, Asberg M, Nygren A. Burnout levels and self-rated health prospectively predict future long-term sickness absence: a study among female health professionals. Journal of Occupational & Environmental Medicine 2011;53:788-93.
  10. Dahlin M, Joneborg N, Runeson B. Performance-based self-esteem and burnout in a cross-sectional study of medical students. Med Teach 2007;29:43-8.
  11. Anagnostopoulos F, Demerouti E, Sykioti P, Niakas D, Zis P. Factors associated with mental health status of medical residents: a model-guided study. Journal of Clinical Psychology in Medical Settings 2015;22:90-109.
  12. Dahlin M, Fjell J, Runeson B. Factors at medical school and work related to exhaustion among physicians in their first postgraduate year. Nord J Psychiatry 2010;64:402-8. 

Purpose

To measure burnout in any occupational group.

Format/Data Source

Single-item. Stem and response items vary in publications.  The following item was utilized in Dolan et al.5: “Overall, based on your definition of burnout, how would you rate your level of burnout?”  Response options are (1) “I enjoy my work, I have no symptoms of burnout,” (2) “Occasionally I am under stress and I don’t always have as much energy as I once did, but I don’t feel burned out,” (3) “I am definitely burning out and have one or more symptoms of burnout, such as physical and emotional exhaustion,” (4) “The symptoms of burnout that I am experiencing won’t go away.  I think about frustration at work a lot,” and (5) “I feel completely burned out and often wonder if I can go on. I am at a point where I may need some changes or may need to seek some sort of help.”

Date

Measure released in 1981.

Measure Item Mapping

N/A

Data Analysis

Often dichotomized as no symptoms of burnout (score of 2 or less) vs. 1 or more symptoms (score of 3 or more). These cut-off scores were not established based on validity evidence.
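The dichotomization above amounts to a single comparison. The sketch below assumes responses are coded 1 to 5 in the order of the response options listed earlier; the function names are hypothetical, and the cut-off is the commonly used one, which was not established based on validity evidence.

```python
def single_item_has_symptoms(score: int) -> bool:
    """Return True when the response indicates 1 or more burnout symptoms."""
    if score not in (1, 2, 3, 4, 5):
        raise ValueError("score must be an integer from 1 to 5")
    # A score of 2 or less is treated as no symptoms; 3 or more as symptomatic.
    return score >= 3


def symptom_prevalence(scores: list[int]) -> float:
    """Share of respondents classified as having burnout symptoms."""
    flags = [single_item_has_symptoms(s) for s in scores]
    return sum(flags) / len(flags)
```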

Development and Testing

In a sample of 5400 VA employees, the correlation between responses to the single-item measure and a single MBI-EE item (item 8: “I feel burned out from my work”) was r = 0.79.  Compared to the single MBI-EE item, the single-item measure had a sensitivity of 83.2%, a specificity of 87.4%, and an AUC of 0.93.5  In a separate sample of 307 physicians, the single item correlated modestly with MBI-EE score (r = 0.64).6  In a third study, single-item responses in a sample of 308 rural physicians and advanced practice providers correlated with full MBI EE and DP domain scores (Spearman’s r = .72 and .41, p<.0001).  In multivariable models, the single item predicted high EE (but neither low EE nor low/high DP) as measured by the MBI.  In this sample, the 2 original MBI items (item 8: “I feel burned out from my work” and item 10: “I have become more callous toward people since I took this job”) correlated better with their respective parent subscales (Spearman’s r = .89 and .81, p <.0001).  The summary from that study was that the single item predicts high levels of EE but not low EE or DP, and that it is not effective at capturing individuals who have evidence of burnout in the depersonalization or personal accomplishment domains.7

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

In a study of 422 primary care physicians, single item burnout characterization was associated with lower satisfaction, greater time pressure, poor work control, and intent to leave the medical practice on univariate analysis.8  No relationship was found between burnout and quality of care, as measured by chart review of 1419 patients.  In a related study involving 426 primary care physicians structural equation modeling found significant and small to modest path coefficients between stress, satisfaction, and single item burnout and between single item burnout and self-reported medical error and suboptimal patient care practices.9

Country of Origin

United States of America

Past or Validated Applications

  • Participant age: adults
  • Population: Physicians
    • National benchmark data not available for US physicians, medical students, or general population. Some data in VA primary care, including 1769 providers5
  • Setting: any health care setting

Cost

$0. Publicly available.5

References

  1. Veninga RL, Spradley JP. The Work/Stress Connection: How to Cope With Job Burnout. Boston: Little Brown; 1981.
  2. Freeborn D. Satisfaction, commitment, and psychological well-being among HMO physicians. West J Med 2001;174:13-8.
  3. Schmoldt RA, Freeborn DK, HD K. Physician Burnout: recommendations for HMO managers. HMO Pract 1994;8.
  4. Williams ES, Manwell LB, Konrad TR, Linzer M. The relationship of organizational culture, stress, satisfaction, and burnout with physician-reported error and suboptimal patient care: results from the MEMO study. Health Care Management Review 2007;32:203-12.
  5. Dolan ED, Mohr D, Lempa M, et al. Using a single item to measure burnout in primary care staff: a psychometric evaluation. Journal of General Internal Medicine 2015;30:582-7.
  6. Rohland BM, Kruse GR, Rohrer JE. Validation of a single-item measure of burnout against the Maslach Burnout Inventory among Physicians. Stress and Health 2004;20:75-9.
  7. Waddimba AC, Scribani M, Nieves MA, Krupa N, May JJ. Validation of single-item screening measures for provider burnout in a rural health care network. Eval Health Prof 2015;39:215-25.
  8. Rabatin J, Williams E, Baier Manwell L, Schwartz MD, Brown RL, Linzer M. Predictors and Outcomes of Burnout in Primary Care Physicians. Journal of Primary Care & Community Health 2016;7:41-3.

Purpose

To measure burnout in any occupational group.

Format/Data Source

Copenhagen Burnout Inventory is a 19-item survey with positively and negatively framed items that covers 3 areas: personal (degree of physical and psychological fatigue and exhaustion), work (degree of physical and psychological fatigue and exhaustion related to work), and client-related (or a similar term such as patient, student, etc.) burnout.  There are multiple questions for each of these subscales and responses are in the form of either always, often, sometimes, seldom, and never/almost never or to a very high degree, to a high degree, somewhat, to a low degree, and to a very low degree.

Date

Measure released in 2005.

Measure Item Mapping

  • Overall physical and psychological fatigue: 6 items
  • Physical and psychological fatigue related to work: 7 items
  • Client-related burnout: 6 items

(Questions are to be mixed with questions on other topics to avoid stereotyped response patterns)

Data Analysis

Each dimension is separately treated as a continuous variable.  The response options are recoded into scores of 100, 75, 50, 25, and 0.  Next, items within the subscale are averaged, with one item reverse scored.  Higher scores indicate a higher degree of burnout.  The possible score range for all scales is 0-100.  In one study, investigators chose a score of 50 or higher to indicate burnout as a dichotomous variable.1  In a separate study, investigators chose scores of 25 or lower, 25 to 50, and higher than 50 to categorize low, intermediate, and high burnout.2  These cut-off scores were not established based on validity evidence.
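The recode-then-average procedure above can be sketched as follows. The source does not identify which item is reverse scored, so the position of that item is passed in as a parameter; the function names are hypothetical, and the cut-off of 50 is the one chosen in a single study rather than a validated threshold.

```python
from statistics import mean

# Recoding of the two response formats into 100, 75, 50, 25, and 0.
SCORES = {
    "always": 100, "often": 75, "sometimes": 50,
    "seldom": 25, "never/almost never": 0,
    "to a very high degree": 100, "to a high degree": 75, "somewhat": 50,
    "to a low degree": 25, "to a very low degree": 0,
}


def cbi_subscale_score(answers, reverse_index=None):
    """Average the recoded items of one CBI subscale (range 0-100).

    `reverse_index` is the position of the reverse-scored item within
    `answers`, if the subscale has one (assumed to be supplied by the caller).
    """
    scores = [SCORES[a.lower()] for a in answers]
    if reverse_index is not None:
        scores[reverse_index] = 100 - scores[reverse_index]
    return mean(scores)


def cbi_burnout_flag(score, cutoff=50):
    """Dichotomize using the cut-off of 50 or higher chosen in one study."""
    return score >= cutoff
```

For example, four answers of "often", "sometimes", "often", "sometimes" recode to 75, 50, 75, 50 and average to 62.5, which falls above the cut-off of 50.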

Development and Testing

Developed with a framework that characterizes the core of burnout as fatigue and exhaustion, which are attributed to specific domains in a person’s life (personal, work-related, and client-related).  In a sample of 1914 individuals from seven different workplaces CBI scales had high internal reliability, scores correlated with SF-36 scales, and scores predicted future sickness absence, intention to quit, and sleep problems.3

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

Existing data is limited as a majority of studies have included small samples of physicians and other health care providers, and have mostly been conducted outside of the United States.  In terms of potential health care related outcomes, CBI scores have been associated with lower perceptions of quality of care (psychosocial care, diagnosis/therapy, quality assurance, diagnostic and therapeutic errors in a study of 1311 German surgeons),4 nurse turnover intention (in a study of 159 ICU nurses in Iran),5 self-reported sick absences (prospective study of 824 Danish workers in human service sectors),6 and sickness days, sleep problems, use of pain killers, and intention to quit work (prospective study of 1914 Danish employees in the human service sector).7  In terms of personal outcomes, CBI scores predicted the WHO-Five Well-Being Index score among 317 Canadian residents,8 and antidepressant treatment, especially among men (prospective study of 2936 Danish employees).2  From a health system characteristics perspective, associations have been found between CBI score and job strain, over-commitment, and low social support (Taiwanese health care professionals)9 and between practice setting and recent reorganization at work (598 Norwegian midwives).1

Country of Origin

Denmark

Past or Validated Applications

  • Participant age: adults
  • Population: any occupational group
    • National benchmark data not available for US physicians, medical students, or general population.
  • Setting: any

Cost

$0. Publicly available in Table S1 1 and https://nfa.dk/da/Vaerktoejer/Sporgeskemaer/Sporgeskema-til-maaling-af-udbraendthed/Copenhagen-Burnout-Inventory-CBI 

Notes

  • Multiple language translations are available

References

  1. Henriksen L, Lukasse M. Burnout among Norwegian midwives and the contribution of personal and work-related factors: A cross-sectional study. Sexual & reproductive healthcare: official journal of the Swedish Association of Midwives 2016;9:42-7.
  2. Madsen IEH, Lange T, Borritz M, Rugulies R. Burnout as a risk factor for antidepressant treatment – a repeated measures time-to-event analysis of 2936 Danish human service workers. J Psychiatr Res 2015;65:47-52.
  3. Kristensen TS, Borritz M, Villadsen E, Christensen KB. The Copenhagen Burnout Inventory: A new tool for the assessment of burnout. Work Stress 2005;19:192-207.
  4. Klein J, Grosse Frie K, Blum K, von dem Knesebeck O. Burnout and perceived quality of care among German clinicians in surgery. International Journal for Quality in Health Care 2010;22:525-30.
  5. Shoorideh FA, Ashktorab T, Yaghmaei F, Alavi Majd H. Relationship between ICU nurses’ moral distress with burnout and anticipated turnover. Nursing Ethics 2015;22:64-76.
  6. Borritz M, Rugulies R, Christensen KB, Villadsen E, Kristensen TS. Burnout as a predictor of self-reported sickness absence among human service workers: prospective findings from three year follow up of the PUMA study. Occupational & Environmental Medicine 2006;63:98-106.
  7. Kristensen TS, Borritz M, Villadsen E, Christensen KB. The Copenhagen Burnout Inventory: A new tool for the assessment of burnout. Work Stress 2005;19:192-207.
  8. Kassam A, Horton J, Shoimer I, Patten S. Predictors of Well-Being in Resident Physicians: A Descriptive and Psychometric Study. Journal of Graduate Medical Education 2015;7:70-4.
  9. Chou L-P, Li C-Y, Hu SC. Job stress and burnout in hospital employees: comparisons of different medical professions in a regional hospital in Taiwan. BMJ Open 2014;4:e004185.

     

Valid and Reliable Survey Instruments to Measure Composite Well-Being

Purpose

To measure burnout and professional fulfillment in physicians.

Format/Data Source

The Stanford Professional Fulfillment Index (PFI) is a 16-item survey that covers burnout (work exhaustion and interpersonal disengagement) and professional fulfillment. Response options are on a five-point Likert scale (“not at all true” to “completely true” for professional fulfillment items and “not at all” to “extremely” for work exhaustion and interpersonal disengagement items.)

Date

Measure published in 2018.

Measure Item Mapping

  • Professional fulfillment: items 1-6
  • Work exhaustion: items 7-10
  • Interpersonal disengagement: items 11-16

Data Analysis

Items are scored 0 to 4. Each dimension is treated as a continuous variable. Scale scores are calculated by averaging the item scores of all the items within the corresponding scale. Scale scores can then be multiplied by 25 to create a scale range from 0 to 100. A higher score on the professional fulfillment scale is more favorable. In contrast, higher scores on the work exhaustion or interpersonal disengagement scales are less favorable. Dichotomous burnout categories are determined from the average item score (range 0 to 4) of all 10 burnout items (work exhaustion and interpersonal disengagement), using a cut-point of 1.33. Dichotomous professional fulfillment is recommended at an average item score cut-point of >3.0.
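The scoring steps above can be sketched as follows. The function and key names are hypothetical. The text gives a burnout cut-point of 1.33 without stating the direction of the comparison, so treating an average of 1.33 or higher as burnout is an assumption of this example.

```python
from statistics import mean

# Item numbers per scale, from the item mapping above.
PROFESSIONAL_FULFILLMENT = range(1, 7)       # items 1-6
WORK_EXHAUSTION = range(7, 11)               # items 7-10
INTERPERSONAL_DISENGAGEMENT = range(11, 17)  # items 11-16

BURNOUT_CUT = 1.33     # cut-point from the text; ">=" is an ASSUMPTION
FULFILLMENT_CUT = 3.0  # the text recommends an average item score > 3.0


def score_pfi(items: dict[int, int]) -> dict[str, object]:
    """Score one respondent; `items` maps item number (1-16) to a 0-4 rating."""
    def scale(numbers):
        return mean(items[n] for n in numbers)

    fulfillment = scale(PROFESSIONAL_FULFILLMENT)
    exhaustion = scale(WORK_EXHAUSTION)
    disengagement = scale(INTERPERSONAL_DISENGAGEMENT)
    # Overall burnout averages all 10 burnout items, not the two scale means.
    burnout = scale(range(7, 17))
    return {
        "fulfillment_0_100": fulfillment * 25,
        "exhaustion_0_100": exhaustion * 25,
        "disengagement_0_100": disengagement * 25,
        "burnout_average": burnout,
        "has_burnout": burnout >= BURNOUT_CUT,
        "is_fulfilled": fulfillment > FULFILLMENT_CUT,
    }
```

Averaging the 10 burnout items directly matters because the two burnout scales have different numbers of items (4 and 6), so the mean of the two scale means would weight items unevenly.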

Development and Testing

The PFI was developed for use in physicians.1  Development involved input from members of a physician wellness committee (n>30) and two national physician wellness experts. The efficacy of the PFI has been evaluated in a sample of 185 residents and 65 practicing physicians. Principal components analysis of data from this sample justified the three PFI subscales of professional fulfillment, work exhaustion, and interpersonal disengagement. In a subsample of 100 responders who had stable sleep-related impairment scores over a 2-3 week period, test-retest reliability estimates were 0.82 for professional fulfillment (α = 0.91), 0.80 for work exhaustion (α = 0.86), 0.71 for interpersonal disengagement (α = 0.92), and 0.80 for overall burnout (α = 0.92). The correlation between the PFI work exhaustion subscale score and Maslach Burnout Inventory emotional exhaustion subscale score was 0.72. The correlation between the PFI interpersonal disengagement score and Maslach Burnout Inventory depersonalization subscale score was 0.59. The correlation between the PFI professional fulfillment score and Maslach Burnout Inventory personal accomplishment subscale score was 0.46. Compared to the Maslach Burnout Inventory, the PFI burnout scale sensitivity and specificity in identifying those with burnout was 72% and 84%, respectively, and AUC was 0.85. PFI scales also correlated in the expected directions with Patient-Reported Outcomes Measurement Information System (PROMIS) sleep-related impairment, depression, and anxiety scores, and with World Health Organization Quality of Life-BREF scores. PFI scales demonstrated sufficient sensitivity to detect expected effects of a two-point (range 8-40) change in PROMIS sleep-related impairment.1

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

In the study of 250 resident and practicing physicians PFI work exhaustion and interpersonal disengagement had small (r=.15 and .33, respectively) but statistically significant correlations with scores on a 4-item medical error scale (internal consistency reliability estimate α =.62). Mean medical error scale scores were higher among those physicians with burnout (as classified using the PFI) in comparison to those without burnout. The Cohen’s d effect size difference in self-reported medical errors for high versus low burnout classified using the PFI was 0.55.1

Country of Origin

USA

Past or Validated Applications

  • Participant age: adults
  • Population: physicians
    • Benchmark data are available for practicing U.S. physicians and residents from the authors.
  • Setting: any health care setting 

Cost

Publicly available in article. No cost for non-profit organizations using the PFI for research or program evaluation. Cost for commercial use or use by for-profit organizations depends on application; contact the Stanford Risk Authority at wellness.surveyteam@TheRiskAuthority.com. 

References

1. Trockel M, Bohman B, Lesure E, et al. A Brief Instrument to Assess Both Burnout and Professional Fulfillment in Physicians: Reliability and Validity, Including Correlation with Self-Reported Medical Errors, in a Sample of Resident and Practicing Physicians. Acad Psychiatry. 2018;42(1):11-24.

Purpose

To identify distress in a variety of dimensions (burnout, fatigue, low mental/physical quality of life, depression, anxiety/stress).1-5

Format/Data Source

7- or 9-item instrument with yes/no response categories.

Date

Measure released in 2010.

Measure Item Mapping

N/A

Data Analysis

A total score is calculated by adding the number of ‘yes’ responses.  In a sample of physicians, medical students, and US workers, every one point increase in score resulted in a step-wise increased probability of distress and risk for adverse personal and professional consequences.  For the 7-item version, score range is 0 to 7, and threshold score to identify individuals in distress is 4 or higher for medical students, 5 or higher for residents, 4 or higher for practicing physicians, and 2 or higher for other US workers.  In the expanded 9-item version, the original 7 items are scored in a traditional manner, with responses to meaning in work and satisfaction with work-life balance items resulting in 1 point being added or subtracted,1 resulting in a score range of -2 to 9.
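Scoring of the 7-item version can be sketched as a count of ‘yes’ responses compared against a group-specific threshold. The function names and group labels are hypothetical. The 9-item adjustment is left out because the text does not specify which responses add or subtract a point.

```python
# Threshold scores for the 7-item version, as quoted in the text above.
DISTRESS_THRESHOLDS = {
    "medical student": 4,
    "resident": 5,
    "practicing physician": 4,
    "other US worker": 2,
}


def wbi7_score(yes_answers: list[bool]) -> int:
    """Total score: the number of 'yes' responses among the 7 items."""
    if len(yes_answers) != 7:
        raise ValueError("the 7-item version needs exactly 7 responses")
    return sum(bool(answer) for answer in yes_answers)


def in_distress(score: int, group: str) -> bool:
    """Compare a 7-item total against the threshold for the respondent's group."""
    return score >= DISTRESS_THRESHOLDS[group]
```

The same total can therefore lead to different classifications: a score of 4 meets the threshold for a medical student or practicing physician but not for a resident.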

Development and Testing

The 7-item Well-Being Index (WBI) was originally designed to be used in medical students.4,5  Development involved input from experts, correlation analysis from previously administered assessments, and a multi-step validation process.  After initial development in a sample of 2230 medical students, the efficacy of the WBI was confirmed in a separate sample of 2682 medical students.  At a threshold score of 4 or higher, the WBI’s specificity for identifying medical students with severe distress ranged from 88-91% with sensitivity of 59-93%.4  The WBI was validated in a national sample of 7560 US residents in 2012.3  At a threshold score of 5 or higher, the index’s specificity for identifying residents with low mental QOL, high fatigue, or recent suicidal ideation was 84%.  The score also stratified residents’ self-reported medical errors.  The WBI was also validated in a national sample of 6994 US physicians.  At a threshold score of 4 or higher, the index’s specificity for identifying physicians with low mental QOL, high fatigue, or recent suicidal ideation was 86%.2  The score also stratified career satisfaction, reported intent to leave the current practice, and self-reported medical errors.  In 2014, the 7-item WBI was tested in a sample of 5392 US workers and 6880 US physicians, and the 9-item WBI was developed and tested.1  The 9-item version was created in an effort to identify individuals who were thriving, and included items exploring satisfaction with work-life integration and meaning in work, both of which may mitigate the relationship between job-related stress and psychological distress.1  The 9-item WBI predicted low and high QOL, high fatigue, recent suicidal ideation, and burnout in both samples.  The area under the curve of the 7-item and the 9-item versions for identifying burnout was 0.84 and 0.85 in the physician sample, respectively.

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

National studies have found associations between WBI scores and health care related outcomes (e.g., medical error, physician turnover) and personal outcomes (e.g., fatigue, recent suicidal ideation).1-5

Country of Origin

USA

Past or Validated Applications

  • Patient age: adults
  • Population: any occupational group
    • National benchmark data are available for US physicians, residents, medical students, and the general population, with national benchmarks soon to be available for US nurses and advanced practice providers.
  • Setting: any  

Cost

The WBI is free for research use and for use in quality improvement efforts by nonprofit organizations. An interactive version of the index that provides personalized feedback to individuals and links to national resources is also free for individual use. The organizational version of the interactive WBI that provides individualized feedback, links to local and national resources, and organization level reports is also available but requires a fee for use. Access to the tool and information regarding cost and permission to use the tool is available at https://www.mededwebs.com/well-being-index

References

  1. Dyrbye LN, Satele D, Shanafelt T. Ability of a 9-Item Well-Being Index to Identify Distress and Stratify Quality of Life in US Workers. J Occup Environ Med 2016;58:810-7.
  2. Dyrbye LN, Satele D, Sloan J, Shanafelt TD. Utility of a brief screening tool to identify physicians in distress. J Gen Intern Med 2013;28:421-7.
  3. Dyrbye LN, Satele D, Sloan J, Shanafelt TD. Ability of the Physician Well-Being Index to identify residents in distress. J Grad Med Educ 2014;6:78-84.
  4. Dyrbye LN, Schwartz A, Downing SM, Szydlo DW, Sloan JA, Shanafelt TD. Efficacy of a brief screening tool to identify medical students in distress. Acad Med 2011;86:907-14.
  5. Dyrbye LN, Szydlo DW, Downing SM, Sloan JA, Shanafelt TD. Development and preliminary psychometric properties of a well-being index for medical students. BMC Medical Education 2010;10:8.

Valid and Reliable Survey Instruments to Measure Depression and Suicide Risk

Purpose

To measure major depression and suicidal ideation.

Format/Data Source

The Patient Health Questionnaire-9 (PHQ-9) is the self-report component of the PRIME-MD (Primary Care Evaluation of Mental Disorders) inventory1. For each of the 9 DSM-5 (Diagnostic and Statistical Manual of Mental Disorders [Fifth Edition]) depressive symptoms, participants indicate whether, during the previous 2 weeks, the symptom has bothered them “not at all,” for “several days,” for “more than half the days,” or “nearly every day.” Suicidal ideation is screened for with item 9 of the PHQ-9 (i.e., “Thoughts that you would be better off dead, or hurting yourself in some way” over the past 2 weeks). A positive response to this item increases the cumulative risk for a suicide attempt and suicide completion over the next year by 10- and 100-fold, respectively2.

Date

Measure released in 1999.

Measure Item Mapping

One item each for:

  1. Interest
  2. Mood
  3. Sleep
  4. Energy
  5. Appetite
  6. Self-worth
  7. Concentration
  8. Psychomotor slowing or activation
  9. Suicidal ideation

Data Analysis

The PHQ-9 is most often used as a continuous measure, with scores for individual items summed to produce a composite depressive symptom score from 0 to 27. Cut points of 5, 10, 15 and 20 represent mild, moderate, moderately severe and severe levels of depressive symptoms, respectively. The PHQ-9 can also be used as a diagnostic algorithm to make a probable diagnosis of major depressive disorder (MDD)3.
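
A minimal sketch of the continuous scoring and severity cut points described above is shown below. It assumes the conventional 0 to 3 coding of the four response options (implied by the 0 to 27 range across 9 items) and treats any non-zero answer to item 9 as a positive response. The function name, the returned field names, and the "below mild cut point" label are hypothetical. The diagnostic algorithm for MDD is not implemented.

```python
# Assumed 0-3 coding of the four PHQ-9 response options.
PHQ9_RESPONSE_VALUES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}

# Cut points and severity labels from the Data Analysis paragraph above,
# listed in ascending order.
PHQ9_CUT_POINTS = [
    (5, "mild"),
    (10, "moderate"),
    (15, "moderately severe"),
    (20, "severe"),
]


def score_phq9(responses):
    """Sum the 9 item scores and map the total to a severity band."""
    if len(responses) != 9:
        raise ValueError("The PHQ-9 requires exactly 9 responses")
    values = [PHQ9_RESPONSE_VALUES[r.strip().lower()] for r in responses]
    total = sum(values)
    severity = "below mild cut point"  # illustrative label for totals under 5
    for cut_point, label in PHQ9_CUT_POINTS:
        if total >= cut_point:
            severity = label
    return {
        "total": total,
        "severity": severity,
        # Item 9 (suicidal ideation) is the last item; any non-zero answer
        # is treated here as a positive response (an assumption).
        "item9_positive": values[8] > 0,
    }


if __name__ == "__main__":
    example = ["several days"] * 8 + ["not at all"]
    print(score_phq9(example))
```

Reporting the item 9 flag separately from the total keeps suicidal ideation visible even when the composite score falls in a low severity band.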

Development and Testing

PHQ-9 scores ≥10 have a sensitivity and specificity of 88% for major depressive disorder3,4. The PHQ-9 performs similarly across sex3,5, age6, and racial/ethnic groups7-9. Importantly for longitudinal assessments, the PHQ-9 shows high sensitivity to change over time5,10. Compared to other available depression measures, the PHQ-9 is relatively short and demonstrates good validity, sensitivity and specificity in both clinical and non-clinical populations4,11. Further, the PHQ-9 is the primary depression instrument utilized by large health care providers such as the U.S. Department of Veterans Affairs and the National Health Service, and is the instrument that web users are taken to after a Google search for “clinical depression”12,13 (https://www.blog.google/products/search/learning-more-about-clinical-depression-phq-9-questionnaire/). The widespread use of the PHQ-9 ensures a range of normative data for comparison.

Links to Outcomes or Health System Characteristics Related to Health Care Professionals

In physicians, PHQ-9 scores have been associated with medical errors, work hours, and productivity10,14,15.

Country of Origin

United States of America

Past or Validated Applications

  • Patient age: Adolescents, adults, and older adults
  • Population: any occupational group
    • From meta-analyses, comparison data are available for the general population, medical students (N=10,386),16 and resident physicians (N=3,756)17
  • Setting: any 

Cost

$0. Available at: http://www.phqscreeners.com/sites/g/files/g10016261/f/201412/PHQ-9_English.pdf

Notes

  • Multiple language translations are available.

Alternate Depression Measure

The abbreviated 2-item PHQ-2 instrument has been developed for situations where administration of the full PHQ-9 is not feasible. The PHQ-2 is composed of the first two items of the PHQ-9 (assessing low mood and loss of interest) and subjects receive a score between 0 and 3 on each item18. With a composite score range from 0 to 6, scores of ≥2 or ≥3 have been considered a positive screen for depression depending on the study. A positive PHQ-2 screen for depression correlates well with positive screens on the PHQ-9 and other longer depression instruments19. Further, the PHQ-2 has generally shown moderate to good sensitivity to detect clinical depression. However, the specificity of the PHQ-2 has been variable across studies and low in many studies20,21. Thus, the PHQ-2 is most accurately viewed as a screening tool for depression rather than a diagnostic instrument22.
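
The PHQ-2 screen can be sketched in the same way. This example assumes each item is already coded 0 to 3 and treats a total at or above the chosen cutoff (2 or 3, depending on the study) as a positive screen. The function and parameter names are hypothetical.

```python
def score_phq2(interest, mood, cutoff=3):
    """Return (total, positive_screen) for the two PHQ-2 items.

    Each item is scored 0-3, so the total ranges from 0 to 6. The cutoff
    is study-dependent; only the two values discussed above are accepted.
    """
    for value in (interest, mood):
        if value not in (0, 1, 2, 3):
            raise ValueError("Each PHQ-2 item must be scored 0, 1, 2, or 3")
    if cutoff not in (2, 3):
        raise ValueError("Cutoff must be 2 or 3")
    total = interest + mood
    return total, total >= cutoff


if __name__ == "__main__":
    print(score_phq2(1, 1))            # evaluated against a cutoff of 3
    print(score_phq2(1, 1, cutoff=2))  # same responses, lower cutoff
```

Because the lower cutoff flags more respondents, the choice of cutoff should be fixed before data collection and reported with the results. A positive result is a prompt for further assessment, not a diagnosis.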

References

  1. Spitzer RL, Kroenke K, Williams JB. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. Primary Care Evaluation of Mental Disorders. Patient Health Questionnaire. JAMA. Nov 10 1999;282(18):1737-1744.
  2. Simon GE, Rutter CM, Peterson D, et al. Does response on the PHQ-9 Depression Questionnaire predict subsequent suicide attempt or suicide death? Psychiatr Serv. Dec 1 2013;64(12):1195-1202.
  3. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med. Sep 2001;16(9):606-613.
  4. Lowe B, Spitzer RL, Grafe K, et al. Comparative validity of three screening questionnaires for DSM-IV depressive disorders and physicians’ diagnoses. J Affect Disord. Feb 2004;78(2):131-140.
  5. Lowe B, Unutzer J, Callahan CM, Perkins AJ, Kroenke K. Monitoring depression treatment outcomes with the patient health questionnaire-9. Medical care. Dec 2004;42(12):1194-1201.
  6. Klapow J, Kroenke K, Horton T, Schmidt S, Spitzer R, Williams JB. Psychological disorders and distress in older primary care patients: a comparison of older and younger samples. Psychosom Med. Jul-Aug 2002;64(4):635-643.
  7. Huang FY, Chung H, Kroenke K, Delucchi KL, Spitzer RL. Using the Patient Health Questionnaire-9 to measure depression among racially and ethnically diverse primary care patients. J Gen Intern Med. Jun 2006;21(6):547-552.
  8. Huang FY, Chung H, Kroenke K, Spitzer RL. Racial and ethnic differences in the relationship between depression severity and functional status. Psychiatr Serv. Apr 2006;57(4):498-503.
  9. Barthel D, Barkmann C, Ehrhardt S, Schoppen S, Bindt C, International CDSSG. Screening for depression in pregnant women from Côte d'Ivoire and Ghana: Psychometric properties of the Patient Health Questionnaire-9. J Affect Disord. Nov 15 2015;187:232-240.
  10. Sen S, Kranzler HR, Krystal JH, et al. A prospective cohort study investigating factors associated with depression during medical internship. Arch Gen Psychiatry. Jun 2010;67(6):557-565.
  11. Williams JW, Jr., Pignone M, Ramirez G, Perez Stellato C. Identifying depression in primary care: a literature synthesis of case-finding instruments. General hospital psychiatry. Jul-Aug 2002;24(4):225-237.
  12. Lewis H, Adamson J, Atherton K, et al. CollAborative care and active surveillance for Screen-Positive EldeRs with subthreshold depression (CASPER): a multicentred randomised controlled trial of clinical effectiveness and cost-effectiveness. Health Technol Assess. Feb 2017;21(8):1-196.
  13. Scherrer JF, Salas J, Schneider FD, et al. Characteristics of new depression diagnoses in patients with and without prior chronic opioid use. J Affect Disord. Mar 1 2017;210:125-129.
  14. Kalmbach DA, Arnedt JT, Song PX, Guille C, Sen S. Sleep Disturbance and Short Sleep as Risk Factors for Depression and Perceived Medical Errors in First-Year Residents. Sleep. 2017;40(3).
  15. Rosen T, Zivin K, Eisenberg D, Guille C, Sen S. The Cost of Depression-Related Presenteeism in Resident Physicians. Acad Psychiatry. Dec 18 2017.
  16. Rotenstein LS, Ramos MA, Torre M, et al. Prevalence of Depression, Depressive Symptoms, and Suicidal Ideation Among Medical Students: A Systematic Review and Meta-Analysis. JAMA. 2016;316(21):2214-2236.
  17. Mata DA, Ramos MA, Bansal N, et al. Prevalence of Depression and Depressive Symptoms Among Resident Physicians: A Systematic Review and Meta-analysis. JAMA. 2015;314(22):2373-2383.
  18. Kroenke K, Spitzer RL, Williams JB. The Patient Health Questionnaire-2: validity of a two-item depression screener. Medical care. Nov 2003;41(11):1284-1292.
  19. Yu X, Stewart SM, Wong PT, Lam TH. Screening for depression with the Patient Health Questionnaire-2 (PHQ-2) among the general population in Hong Kong. J Affect Disord. Nov 2011; 134(1-3): 444-447.
  20. Arroll B, Goodyear-Smith F, Crengle S, et al. Validation of PHQ-2 and PHQ-9 to screen for major depression in the primary care population. Annals of family medicine. Jul-Aug 2010;8(4):348-353.
  21. Manea L, Gilbody S, Hewitt C, et al. Identifying depression with the PHQ-2: A diagnostic meta-analysis. J Affect Disord. Oct 2016; 203:382-395.
  22. Wilson R, Agius M. Is there good evidence that the two Questions in PHQ-2 are useful questions to use in order to screen for depression? Psychiatria Danubina. Sep 2017; 29(Suppl 3):232-235.

Frequently Asked Questions

Are there clear advantages for using one instrument versus another? For example, does use of MBI better enable comparisons with results from previous studies of U.S. physicians or in other specialties?

The Maslach Burnout Inventory is the gold standard for research purposes. Use of the full MBI allows for scores to be compared to results from previous studies of U.S. physicians (Mayo Clinic Proc, December 2015;90(12):1600-1613). The full MBI is 22 items long and therefore may not be practical in all settings. The use of 2 single items from the Maslach Burnout Inventory is the second best option: item 8 (“I feel burned out from my work”) and item 10 (“I have become more callous toward people since I took this job”) correlate strongly with the emotional exhaustion and depersonalization subscale scores, and concurrent validity has also been demonstrated (J Gen Intern Med 2012;27:1445-52. J Gen Intern Med 2009;24:1318-21.)

Are there some advantages to combining survey questions on burnout with those to assess some aspects of well-being, such as meaning of work?

Organizations interested in measuring physician well-being could consider a variety of dimensions including burnout, stress, fatigue, satisfaction, and quality of life. Various instruments are available to measure these domains. Instruments with national benchmark data and shown to correlate with patient satisfaction, safety, quality measures, productivity, turnover, and other outcomes of interest are preferred. A review of self-reported measures for assessing well-being has been published in BMJ (Linton M, Dieppe P, Medina-Lara A. Review of 99 self-report measures for assessing well-being in adults: exploring dimensions of well-being and developments over time. BMJ Open 2016;6:e010641. doi: 10.1136/bmjopen-2015-010641.)

Commonly used instruments, their number of items, whether national benchmarks exist for US physicians, and whether scores have been shown to correlate with relevant outcomes can also be found in the Table on page 6 of the article “Executive Leadership and Physician Well-Being: Nine Organizational Strategies to Promote Engagement and Reduce Burnout” by Drs. Shanafelt and Noseworthy (Mayo Clin Proc. 2017 Jan;92(1):129-146. doi: 10.1016/j.mayocp.2016.10.004).

Conducting a survey to measure burnout or other dimensions of distress is the first step to managing the problem. Including questions on such surveys that explore key drivers of burnout, such as meaning in work, workload, work efficiency, social support at work, control/flexibility, work-life balance, and organizational culture and values, can provide a starting point for conversation and action.

Are there best practices for combining different instruments?

Surveys are an important research methodology. The most important step is to use instruments that have acceptable reliability and validity, have national benchmark data to help with interpreting results, and are preferably associated with outcomes of interest. Using the entire instrument with exactly the same instructions and response categories is critical. If there are plans to repeat the survey over time, the instruments should follow one another in the same order each time. It is acceptable to include more than one instrument in a survey. Just be sure to keep the instructions that precede the items and the response categories the same as in the original instrument.

Survey-related books:

  • Neumann, W. L. (2003). Social Research Methods: Qualitative and Quantitative Approaches. Boston, MA: Allyn & Bacon.
  • Salant, P. & Dillman, D. A. (1994). How to Conduct Your Own Survey. New York, NY: John Wiley & Sons.
  • Cohen, L. Manion, L. and Morrison, L. Research Methods in Education (Fifth Edition), London, RoutledgeFalmer, 2000.
  • Dillman, D. A. Mail and Internet Surveys, The Tailored Design Method. 2nd Edition. Wiley, 2000.
  • Fink, A. and Kosecoff, J. How to Conduct Surveys: A Step-by-Step Guide. Sage, 1985.

Is there an optimal number or upper limit on the number of questions to maximize response rates?

There are many factors that influence response rate, including the interest of your sample population in the topic and survey length. Short, simple questionnaires typically have better response rates than long, complex surveys. Factors beyond the number of questions also influence how long a survey takes to complete. Surveys that take less than 10 minutes for an individual to complete tend to have better response rates. Before administering a survey to a large group of individuals, it is advisable to pilot test the instrument with a smaller cohort. Doing so can help identify errors and ensure the success of the project.

For more details see:

  • “Increasing response rates to postal questionnaires: systematic review” by Edwards P, Roberts I, Clark M, DiGuiseppi C, Pratap S, Wentz R, and Kwan I published in BMJ.com 2002;324:1183.
  • “Physician response to surveys. A review of the literature”. Kellerman SE, Herold J. Am J Prev Med, 20(1), 61-67, 2001.
  • “How to Conduct Surveys. A Step by Step Guide.” Fink A. 4th Edition, Sage Publications.

Are there unique considerations for inclusion of depression-related questions in surveys?

When assessing sensitive topics such as depression, suicide, and substance abuse, there is potential tension between the desire to directly help respondents exhibiting signs of pathology and the need to maintain respondent confidentiality. Given the continued stigma around mental health among clinicians, ensuring confidentiality is critical both to respondents and to collecting accurate results. One compromise solution is to take all participants to a new screen at the end of the survey that provides a) general information about depression, b) encouragement to seek help if experiencing depressive symptoms or suicidal thoughts and behaviors (STB), and c) information on resources for mental health services by state, including a suicide hotline.
