The Gerontologist
The Gerontologist 42:71-80 (2002)
© 2002 The Gerontological Society of America

Measurement of Quality of Care and Quality of Life at the End of Life

Virginia P. Tilden, DNSc, RN, FAAN (a, b); Susan Tolle, MD, FACP (a); Linda Drach, MPH (a, b); and Susan Hickman, PhD (a, b)

(a) Center for Ethics in Health Care, Oregon Health & Science University, Portland, OR
(b) School of Nursing, Oregon Health & Science University, Portland, OR

Correspondence: Virginia P. Tilden, DNSc, RN, FAAN, Oregon Health & Science University, 3181 S.W. Sam Jackson Park Road, SN-ORD, Portland, OR 97201-3098. E-mail: tildenv@ohsu.edu.


    Abstract
Purpose: Consumers and providers demand better indicators for quality of care and quality of life at the end of life. This article presents recommendations for advancing the science of measurement at end of life. Design and Methods: The authors reviewed the extant literature and applied the Institute of Medicine's conceptual framework for national health care quality to end-of-life care and research. Results: Ten recommendations were developed, charting a course for research that will improve the quality of care delivered and, consequently, the quality of life experienced at life's end. Implications: Measurement bridges the conceptual and operational levels of scientific research, clinical care, and quality improvement. Although a large amount of psychometric groundwork has been laid in the field of end-of-life research, the next wave of studies will ideally take measurement at end of life to a higher level of rigor and precision.

Key Words: Quality improvement • Recommendations • Measurement • End of life

Although substantial progress has been made in the past decade on the psychometric measurement of variables central to end-of-life care, demand is intense for better indicators of quality of care and quality of life at the end of life. Consumers and providers alike continue to indict the American way of death as fragmented, expensive, and insensitive to patient and family preferences, despite substantial nationwide improvements in care for dying patients. The need for valid and reliable measures is heightened by the demand to demonstrate improvements objectively, as evidenced by the pressure of numerous quality-indicators projects.

Complex challenges complicate psychometric measurement at the end of life whether for research or quality improvement. Challenges include difficulties in defining end-of-life time periods to delineate the denominator for statistical analyses; controlling for extraneous influences or other interactions on the variability of constructs; minimizing subject burden while maximizing robustness of a scale; and using proxies as respondents for a patient population that is largely incapacitated at the final stage. Challenges of measurement are especially intense with elderly persons, where frailty, dementias, and diminished vision and hearing, among other factors, affect validity, reliability, and utility of measures. These issues and recommendations for advancing the science of measurement at the end of life shown in Table 1 are the focus of this article.


Table 1. Recommendations for Advancing the Science of Measurement at the End of Life

 

    Purpose of Measurement at the End of Life
Psychometric measurements of the dying experience and of quality of care are needed to produce descriptive or predictive information for three main purposes: (a) empirical research for evidence that eventually guides clinical practice or informs policy; (b) clinical assessment of an individual patient's status or trajectory; and (c) quality-of-care outcomes for quality improvement and accountability. Although there is much consensus about these purposes, the present state of development and refinement of measures to achieve these goals is fairly rudimentary. Many measures of discrete variables related to the terminal stage of various diseases are available in the literature. Although useful, they do not comprehensively index either the dying experience or the full scope of clinical care. More comprehensive approaches have been taken by several research teams, notably the investigators of the "Study to Understand Prognoses and Preferences for Outcomes and Risks of Treatment" (SUPPORT Principal Investigator [SUPPORT], 1995) and the contributors to the Toolkit of Instruments to Measure End-of-life Care project (Teno 2000).

SUPPORT
The SUPPORT study led the way in the late 1980s with then state-of-the-art measurement approaches that included data from medical records and in-depth serial interviews with patients and/or surrogates and attending physicians. Outcome measures of SUPPORT were changes in five indicators before and after an intervention: (a) timing of the written Do Not Resuscitate order; (b) patient/surrogate-physician agreement on cardiopulmonary resuscitation preferences; (c) days in the intensive care unit (ICU), coma, and mechanical ventilation before death; (d) pain; and (e) hospital resource use. More than 100 articles and abstracts have been published about the study's methods, findings, and implications for policy change. No other single study on end-of-life care has had such far-reaching impact, and measurement insights from the SUPPORT study have helped inform the field's subsequent measurement approaches.

Toolkit
Major groundwork in measurement has been laid by the contributors and sponsors of the Toolkit project, which began with a series of multidisciplinary conferences spanning 1996–2000. White papers addressed a psychometric environmental scan, and recommendations were formulated (Teno 2000). Numerous publications of this large body of work document its longitudinal development and impressive scope. A major goal of the Toolkit project has been to assess and collect measures that are clinically meaningful, administratively manageable, and psychometrically sound (Teno, Byock, and Field 1999). The emphasis of the Toolkit is on measures related to the domains central to end-of-life care, including quality of life, pain, functional status, sense of emotional well-being, sense of spiritual fulfillment, satisfaction with quality of care, and other self-report and/or subjective aspects of dying. The Toolkit website (www.chcr.brown.edu/pcoc/toolkit.htm) provides the most comprehensive central repository of available measures that focus on an array of end-of-life variables and is invaluable for investigators embarking on research or quality improvement projects.


    National Focus on Quality Indicators
Although measurement has long played an essential role in health care for empirical research and clinical assessment, its emphasis in the national quality-indicators initiative is more recent. Several groups have cited a need for end-of-life, quality-indicators projects (Foundation for Accountability 1998; Lynn 1997; Medicare Payment Advisory Commission 1999; Morrison, Siu, Leipzig, Cassel, and Meier 2000; Schoeni 2000). Lynn, Schall, Milne, Nolan, and Kabcenell (2000) have reported two innovative quality improvement collaboratives sponsored by the Institute for Healthcare Improvement and the Center to Improve Care of the Dying. To measure quality improvements, these multiteam, multisite projects generally used frequency measures of rates and proportions, which is a good but basic start on the work of developing measures for quality improvement.

The Institute of Medicine's (IOM) report, Envisioning the National Health Care Quality Report (Hurtado, Swift, and Corrigan 2001), calls for a systematic approach to quality assessment and improvement for America's health care system, a system it calls deficient in its ability to translate clinical knowledge into practice and to consistently deliver care that is safe, timely, efficient, and in tune with consumer perspective. The IOM calls for a national focus on measurement of quality indicators as an approach to "crossing the quality chasm" (Institute of Medicine, Committee on Quality of Health Care in America 2001). The IOM Committee approached its quality-indicators task with a multistep process (as adapted in Table 2).


Table 2. Process of the IOM in Envisioning a National Health Care Quality Report

 
IOM first selected a conceptual framework as an organizing schema. The conceptual framework contained both desirable components of health care quality (including safety, effectiveness, timeliness, and patient-centeredness) and a consumer perspective of needs over the life cycle (including staying healthy, getting better, living with illness or disability, and coping with the end of life). Second, the Committee specified criteria for selecting measures of the variables in the framework, with concern for the importance of what is being measured, the scientific soundness of the measure, and the feasibility of using the measure. Third, the Committee defined criteria for potential sources of data for a national health care quality data set, such as availability, credibility, reliability, and timeliness of a data source, among others. The Committee suggested that existing data sources, such as the Medical Expenditure Panel Survey or the Consumer Assessment of Health Plans Survey, should be complemented by the development of new tools to achieve a comprehensive health information infrastructure. Furthermore, the Committee cautioned that quality reports should concentrate on process and outcomes of care as opposed to structure of care, because structure and quality of care have consistently proven to be only weakly associated. Although the IOM report applies to a full range of health care domains, it can be applied to a single area, such as end-of-life care. Table 3 shows the application of the IOM's process to the domain of end-of-life care.


Table 3. Application of the IOM Process to End-of-Life Care

 
Step 1: Develop Conceptual Frameworks for the End of Life
Conceptual frameworks portray variables and predict relationships between and among variables. The greater the specificity of variables and relationships, the more testable the framework becomes, and therefore, the more useful for directing interventions. A substantial amount of good work has identified the most important variables in quality care at end of life (e.g., Byock 1999; Curtis et al. 2001; Emanuel and Emanuel 1998; Field and Cassel 1997; Fowler, Coppola, and Teno 1999; Higginson 1998; Lynn 1997; Patrick, Engelberg, and Curtis 2001; Singer, Martin, and Kelner 1999; Steinhauser, Christakis, et al. 2000; Steinhauser, Clipp, et al. 2000; Stewart, Teno, Patrick, and Lynn 1999; Teno 1999). Overall, health status and patient/family satisfaction are umbrella components. Within these, there is consensus about the most important variables: good management of both physical and psychological symptoms; timely, negotiated decision making and avoidance of inappropriate prolongation of dying; achieving a sense of control and completion; communication with loved ones; and spiritual and existential transcendence.

However, much less research has been done on the interrelationships (e.g., path analyses of predictive, hierarchical, and causal relationships) among and between variables at the end of life. One proposed (but untested) model of predictive relationships (Emanuel and Emanuel 1998) specifies an initial set of variables as fixed characteristics of the patient (clinical status, sociodemographic characteristics). Fixed characteristics influence the second set of variables: modifiable dimensions of the patient's experience. Here, the variables of concern match the domains that other researchers have identified (physical symptoms, psychological and cognitive symptoms, social relationships and support, economic demands and caregiving needs, hopes and expectations, and spirituality and existential beliefs). Modifiable dimensions of the patient's experience have bidirectional relationships with the third set of variables: care-system interventions [family and friend interventions, microsocial interventions (e.g., support groups), macrosocial interventions (e.g., Medicare), medical-provider interventions, and health care institution interventions (e.g., hospice)]. Care-system interventions both influence and are influenced by the patient and family, and these interactions influence the outcome, which is the overall dying process, experienced by the patient and family as a good or a bad death. Strengths of this model of relationships are its broad scope and inclusiveness, but these are also weaknesses, in that a model this broad is largely untestable. Overall, present conceptual frameworks are well developed at the level of domain specification, but generally are untested for predictive, causal, and dependent relationships, leading to the following recommendation.

Recommendation 1
Studies should be done that test conceptual models using sophisticated statistical analyses so that pathways among predictive, causal, and dependent relationships of key variables in end-of-life care can be determined.

Several critiques of present conceptual frameworks can be offered. First, whether the domains grouped as "modifiable dimensions" are really modifiable is debatable, and distinguishing between modifiable and nonmodifiable dimensions is a challenge in quality-indicators projects. Expecting improvement in outcomes not affected by interventions confounds quality-indicator and accountability projects. Dying is fraught with unavoidable circumstances and medical events, such as the loss of a family caregiver or a painful bowel obstruction, which inevitably erode the desired ideal scenario (Morrison et al. 2000; Patrick, Engelberg, and Curtis 2001). On the other hand, the fact that so much regional variability is noted in many variables suggests a good amount of modifiability (J. Lynn, personal communication, August 21, 2001).

Second, conceptual frameworks to date have not addressed the issue of "symptom clusters," that is, the degree to which symptoms are inextricably interactive, where any single symptom is largely codependent on changes in other symptoms, such as the codependence of pain and fatigue (Miaskowski and Lee 1999) or fatigue and depression (Meek et al. 2000). Measurement of symptoms to date has tended to index each symptom separately and generally has missed these important dependent relationships. Thus, more precision in conceptual frameworks regarding complex symptom interactions will help advance future measures that unpack symptom clusters. These two issues of present conceptual frameworks lead to the following recommendation.

Recommendation 2
Studies should be done that expand and refine conceptual frameworks to (a) clarify which domains are modifiable; and (b) address symptom clusters, specify covariate relationships between symptoms, and develop or modify measures that can most effectively index symptoms within clusters.

Step 2: Measure Selection
Psychometric attributes of measures are well understood and therefore only briefly mentioned here as an obvious starting point for any discussion of measure selection. Reliability, validity, feasibility/utility, and precision constitute the core psychometric performance indicators. As measures are developed and tested, evidence about these aspects of measure performance is reported. Accumulating evidence, in turn, influences the degree to which measures are favorably judged and, therefore, used by other researchers, both in populations for which they were intended as well as adapted for similar or even dissimilar populations. Thus, measures undergo a sort of Darwinian development, with improvements in the good measures occurring over time while the weaker measures are passed over. Understandably, investigators are much more likely to publish reports about measures when psychometric evidence is favorable than when it is not, which should make other investigators cautious about selecting a tool that has had few reports over time regarding its psychometric performance.

One aspect of measurement efforts for the end of life is the degree to which measurement precision is needed. In a strict sense, precision is similar to "sensitivity/specificity" as used in the biomedical sciences, in which, in the context of disease detection, sensitivity is the likelihood that a patient with a given disease will have a positive test result (Cromwell, Weibell, and Pfeiffer 1980), and specificity is the likelihood that a patient without the disease will have a negative test result. In end-of-life research and clinical care, many variables require a high level of measurement precision; for example, pain, because accurate detection is essential for good management. Other variables can tolerate a much grosser estimate. For example, in the SUPPORT study, economic burden on the family was estimated from five straightforward questions posed to families. There was no attempt to validate family assets, and doing so would likely have been a waste of measurement effort, because the policy implications of financial burden to families were clear from data gathered by these five questions.
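
To make the sensitivity/specificity analogy concrete, the following is a minimal sketch of how the two proportions are computed from paired true-status and test-result data. The function name and the ten-patient data set are hypothetical illustrations and are not drawn from SUPPORT or any other study cited here.

```python
def sensitivity_specificity(truth, test):
    """Return (sensitivity, specificity) for paired boolean sequences.

    truth[i] is True when patient i actually has the condition;
    test[i] is True when the screening measure flags patient i.
    """
    if len(truth) != len(test):
        raise ValueError("truth and test must have the same length")
    tp = sum(1 for t, r in zip(truth, test) if t and r)          # true positives
    fn = sum(1 for t, r in zip(truth, test) if t and not r)      # false negatives
    tn = sum(1 for t, r in zip(truth, test) if not t and not r)  # true negatives
    fp = sum(1 for t, r in zip(truth, test) if not t and r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one patient with and one without the condition")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical data: five patients with significant pain, five without.
has_pain = [True, True, True, True, True, False, False, False, False, False]
screen_positive = [True, True, True, True, False, False, False, False, True, False]

sens, spec = sensitivity_specificity(has_pain, screen_positive)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

In this made-up example the screen misses one of five patients in pain and falsely flags one of five without pain. For a variable such as pain, where detection drives management, the missed case is the more costly error.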

As described previously, the Toolkit is a comprehensive repository of measures related to key variables at the end of life and an excellent starting point for investigators searching for measures. An annotated bibliography of the Toolkit lists measures available in 11 domains. Some domains have received a great deal of measurement attention, such as pain and quality of life. Obviously, those two domains have been areas of high measurement interest much more broadly than at the end of life. For example, there are hundreds of measures of quality of life in the biomedical and psychosocial literature, and many have been adapted for end-of-life populations. Adaptation is necessary because quality-of-life measures for general populations tend to be overly long and therefore burdensome to terminally ill subjects, to contain numerous items not appropriate for an end-of-life population, and to index values that typically are not held by persons facing imminent death. Thus, a large number of measures of discrete end-of-life variables already exist. Many are very useful, although a varying amount of psychometric work has been done on them, which leads to the following recommendation:

Recommendation 3
Studies should be done that contribute psychometric evaluation to existing end-of-life measures, as well as studies that propose to develop new measures, along with rigorous psychometric evaluation of new measures.

Adapting existing tools.
Should the field's primary investment be in developing new measures rather than continuing to use already developed tools? Our research team would take a stand that, when possible, it is time-efficient and cost-effective to build on those existing tools that already benefit from some psychometric work, even when an intended application diverges from the original purpose. For our own series of studies on family member reports of pain and symptom management, we were unable to find a measure of patient pain that had been designed specifically for proxy respondents after patient death and had evidence of psychometric strength. We specifically wanted information about the patient's last week of life in community settings (home or nursing home) and therefore selected family proxies as most informed and reliable. Rather than start from scratch with a new measure that would lack information about psychometric properties, we chose to adapt (with permission) a well-established tool for symptom distress, the Global Distress Index scale of the Memorial Symptom Assessment Scale (MSAS-GDI) by Portenoy and colleagues (1994). The MSAS-GDI has been used extensively in cancer populations, and many investigators have reported evidence of its strong psychometric properties.

We modified the MSAS-GDI in two ways. First, because the MSAS-GDI was developed to measure symptom distress for cancer patients, we examined the items for appropriateness in noncancer populations. Overall, we found the items suitable for other diagnoses with the exception of the lack of a dyspnea item, which we added. Adding an item, of course, changes the score range and also can influence other psychometric properties, such as internal consistency reliability, a coefficient that is sensitive to number of items. Thus, one caution when adding or deleting items to an existing scale is that doing so alters scoring and item analysis, making it essential to report psychometric analysis of the changed tool when reporting one's own study and potentially complicating later comparison of scores across studies. Second, the MSAS-GDI was designed as a patient-administered clinical tool and therefore required us to make minor wording changes for retrospective reports by family proxies and for telephone administration. As a pilot test for our modified form of the tool, we administered it to 103 family caregivers who met the criterion of being involved in the decedent's care and decision making in the final month of life. Family respondents found the items easy to comprehend, and a majority had no difficulty reporting whether a symptom had been present or not, thus supporting the approach of using family proxies for this information. Item and scale analysis showed reasonably good psychometric properties of the family version of the MSAS-GDI. Good distribution of scores across anchors relieved our concerns about potential restriction of range. The scale demonstrated good internal consistency reliability (α = .82). The average item-total correlation was r = .49, and the average interitem correlation was r = .30, suggesting that items were moderately correlated with the overall total scale and with each other. We concluded that the Family MSAS-GDI could become a useful alternative to the original scale when the desired source of data is family proxies (Hickman, Tilden, and Tolle 2001). Thus, adapting an existing tool can be a more efficient approach to measurement than undertaking the arduous process of developing a new measure, leading to the following recommendation.

Recommendation 4
Studies should be done that appropriately adapt existing measures to extend the utility of already available measures.
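
The item and scale statistics reported above for the adapted scale (internal consistency reliability, average item-total correlation, and average interitem correlation) can be computed along the following lines. This is a minimal sketch, not the authors' analysis: the function names and the small four-item, six-respondent data set are hypothetical, the reliability coefficient is assumed to be Cronbach's alpha, and the item-total correlation shown is the uncorrected form (each item is included in the total), which may differ from what was used in the study.

```python
from itertools import combinations
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def total_scores(items):
    """Sum each respondent's scores across items (items are aligned by respondent)."""
    return [sum(resp) for resp in zip(*items)]


def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / variance(total_scores(items)))


def mean_item_total_r(items):
    """Average uncorrected item-total correlation."""
    totals = total_scores(items)
    return mean(pearson(item, totals) for item in items)


def mean_interitem_r(items):
    """Average correlation over all distinct item pairs."""
    return mean(pearson(a, b) for a, b in combinations(items, 2))


# Hypothetical 0-4 distress ratings: one list per item, six respondents each.
items = [
    [0, 1, 2, 3, 4, 2],
    [1, 1, 2, 3, 3, 2],
    [0, 2, 1, 4, 3, 2],
    [1, 0, 3, 2, 4, 1],  # stands in for an added item
]

alpha_without_added_item = cronbach_alpha(items[:3])
alpha_with_added_item = cronbach_alpha(items)
print(f"alpha, 3 items: {alpha_without_added_item:.3f}")
print(f"alpha, 4 items: {alpha_with_added_item:.3f}")
print(f"mean item-total r: {mean_item_total_r(items):.3f}")
print(f"mean interitem r:  {mean_interitem_r(items):.3f}")
```

Comparing the two alpha values shows why psychometric analysis of a modified tool needs to be reported afresh: adding or deleting a single item changes the coefficient as well as the score range.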

New comprehensive measures.
Several teams of investigators have made important measurement progress by developing and testing multivariate scales that comprehensively index the end-of-life experience and/or quality of care. For example, Byock and Merriman (1998) designed their Missoula-VITAS Quality of Life Index for prospective patient administration to index symptoms, function, interpersonal issues, well-being, and transcendence (spirituality and existential meaning), all components of quality of life at the end of life. Teno, Clarridge, Casey, Edgman-Levitan, and Fowler (2001) report on another noteworthy effort of comprehensive assessment, a retrospective quality-indicators tool for use in telephone interviews with bereaved family members. This tool taps family perceptions of problems that occurred related to decision making, coordination, control and respect, physical comfort, surrogates' emotional comfort, and self-efficacy. The tool avoids items geared toward satisfaction with care because, typically, data from satisfaction items show restriction of range and acquiescence bias, likely due in part to low expectations of doing better (Teno 1999). Instead, this research team used a strategy that avoids the pitfalls of satisfaction items and results in problem scores, which measure the opportunity for clinicians to improve. Problem scores bypass limitations in reliability and validity typical of evaluation items and yield data useful to quality-indicators appraisal. Thus, promising new comprehensive measures for both prospective and retrospective assessment of quality care and subjects' experiences have been reported, and more are needed.

Recommendation 5
Studies should develop multidimensional, comprehensive measures that capture an adequate range of end-of-life domains and control for threats to validity (e.g., response sets, acquiescence bias).

Cultural appropriateness of measures.
Researchers have noted cultural variations in end-of-life experiences and treatment preferences (e.g., Krakauer et al. 1998). Lack of cultural equivalence in the meaning and experience of dying introduces another serious challenge to end-of-life measurement (Corless, Nicholas, and Nokes 2001). Using research instruments with ethnically diverse samples often stretches the measures beyond the samples in which they were initially tested. Cross-cultural researchers urge investigators to consider that variables can have different overall meaning in different cultures. For example, the term "pain" can carry different nuances and shades, ranging from strict physical pain to an emotion-laden state that might better be termed "suffering" (Corless et al. 2001). Furthermore, translation of instruments introduces other potential problems. Literal translation of an instrument is likely to fall short of cultural equivalence, because the latter relies as much on tone and approach as on verbal equivalence (Skevington, Bradshaw, and Saxena 1999). Also, appropriateness of format can be an issue. Distinction among anchors on a Likert-type scale (such as among excellent, good, fair, and poor) may be difficult for those not as familiar with making judgments or thinking in a hierarchical fashion. Tools already developed for multinational research (e.g., Skevington et al. 1999) may be useful to end-of-life investigators and should be examined. Thus, future development of culturally sensitive, and often culture-specific, measures of end-of-life variables is warranted and leads to the following recommendation.

Recommendation 6
Studies are needed that will develop, or adapt and test, culture-specific measures of variables of end-of-life care and that take into account the unique variations of ethnic and cultural groups.

Step 3A: Determine Settings and Timing of Data Collection
Settings of Care
Differences among settings of care potentially confound end-of-life data. Obviously, acuity levels differ across settings, data collection is more difficult in some settings, and rates of refusals vary [e.g., intensive care unit (ICU) families are more likely to decline than hospice families], which will limit the representativeness of samples to setting rather than to an overall end-of-life population. Related to this are the vastly different profiles of cause of death, with various causes of death being associated with different settings.

Lunney, Lynn, and Hogan (2002) used Medicare data on elderly decedents to cluster causes of death into four trajectories of dying: sudden death (e.g., hemorrhage), progressive terminal conditions (e.g., cancer), organ system failures (e.g., chronic obstructive lung disease), and age-related frailty. Large variations in these four trajectories complicate measurement in several ways. The sudden death group offers neither time nor access for measurement. The organ system failure group has long periods of relatively good functioning interspersed with intermittent acute exacerbations, which may respond well to aggressive rescue efforts and lead to subsequent periods of relatively high functioning and good quality of life. Thus, patients in this group generally are not defined as dying. Furthermore, the huge variability in their symptom severity and overall quality of life means that any single measurement effort would be little more than an unstable snapshot. Measurement in the frailty group is severely impeded by common problems of frail elderly persons, such as delirium, dementia, and sensory impairments, which make it difficult to collect reliable data. It is not always clear when frailty patients are in terminal decline until after death because of unpredictable courses of decline. Only progressive terminal conditions, such as metastatic cancers, lend themselves to relatively clear definition, relatively stable variables for measurement, and relatively easy access to subjects. Consequently, end-of-life researchers have tended to overstudy hospice patients, 80% of whom have cancer, and understudy other groups, particularly those in the frailty and the organ system failure groups.

Timing of Data Collection
Teno and Coppola (1999), Lunney and colleagues (2002), and others (e.g., Donaldson and Field 1998) have described the serious problem of defining the time period called the end of life, that is, defining who qualifies for inclusion in end-of-life studies. Some manage the problem of sample definition by sampling at marker events, such as transfer to hospice or making a decision to withdraw treatment (Donaldson and Field 1998).

A timing issue for clinical assessment is that fluctuations in end-of-life symptoms often lead to misinterpretation of scores as stable rather than as unreliable snapshots (Meek et al. 2000). End-of-life symptoms require "state" measures (vs. trait measures); state measures must be robust enough to differentiate true differences in subjects over time from random fluctuations (Deyo and Centor 1986). Also, missing data on symptom measures are unlikely to be random (an underlying assumption of measurement theory; Nunnally 1978), but instead are tied to exacerbations of symptoms and thus are systematic rather than random errors.
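
One simple way to look for the systematic missingness described above is to compare symptom severity at the previous assessment between subjects who went on to miss the next assessment and those who completed it. The sketch below is an illustrative descriptive check under assumed data, not a method taken from the cited studies; the function name, the record layout, and the scores are all hypothetical.

```python
from statistics import mean


def prior_severity_by_missingness(records):
    """Compare prior-visit severity for missed vs. completed follow-ups.

    records is a list of (prior_score, current_score) pairs, where
    current_score is None when the follow-up assessment was missed.
    Returns (mean prior score when missed, mean prior score when completed).
    """
    missed = [prior for prior, current in records if current is None]
    completed = [prior for prior, current in records if current is not None]
    if not missed or not completed:
        raise ValueError("need at least one missed and one completed follow-up")
    return mean(missed), mean(completed)


# Hypothetical 0-10 symptom scores at two consecutive assessments.
records = [
    (2, 3), (7, None), (1, 1), (8, None),
    (3, 2), (6, 7), (9, None), (2, 2),
]

mean_missed, mean_completed = prior_severity_by_missingness(records)
print(f"mean prior severity, follow-up missed:    {mean_missed:.1f}")
print(f"mean prior severity, follow-up completed: {mean_completed:.1f}")
```

A markedly higher prior severity among those with missing follow-ups, as in this made-up data set, would be consistent with missingness tied to symptom exacerbation rather than to chance.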

Findings are limited regarding the best time to interview family members in retrospective studies. Studies that examined the effects of different time intervals between patient death and interview of next of kin have found that even long time periods, such as several years, have little negative effect on family data (Cartwright, Hockey, and Anderson 1973; Jacobs and Burke 1991; Poe, McLaughlin, Powell-Griner, Parsons, and Robinson 1991). Regarding willingness of families to participate, Brock, Holmes, Foley, and Holmes (1992) found response rates to be identical for interviews conducted at 2 and 3 months after patient death.

Recommendation 7
Studies should be done that address measurement issues related to (a) settings of care, (b) differences in the dying trajectories, (c) fluctuations in symptom exacerbations, and (d) validity of proxy data at different time points.

Step 3B: Determine Data Sources
Data sources include potentially useful national data sets, as well as primary data from sources directly involved in the final months, including patients, families, and clinicians. Challenges in measurement accompany all of these sources.

National Data Sets
National data sets contain indicators potentially useful for quality assessment of the end of life. However, imposing new questions on established data sets requires consideration of the purpose of the data set and its relative strengths and limitations (Iezzoni 1997). Existing national data sets are appealing because they spare a vulnerable patient-family population the burden of additional data collection and lend themselves to research or quality questions of a national scope. For example, the Minimum Data Set (MDS), the National Mortality Followback Survey (NMFS), and Medicare Hospice Claims Data contain relevant indicators for end-of-life care (Teno 2001). However, data collection generally suffers from a narrow perspective, such as that of admitting staff, and items have not been tailored to the critical quality questions in end-of-life care. Past versions of the periodic NMFS (of 1% of adult U.S. deaths in a given year) have collected data regarding access to care, health care utilization, and functional status in the last year of life, but have not tapped the domains most central to the experience of dying, such as symptom management, patient/family satisfaction, negotiated decision making, or continuity of care. For future iterations of the NMFS, Teno (2001) has called for the addition of questions addressing quality of end-of-life care, which would yield valuable data for end-of-life researchers, but apparently no plans for these iterations have been made.

Pain management may be one area for which national data sets are particularly useful. Fries, Simon, and Morris (2001) used the MDS pain assessment scale to examine pain prevalence among almost 35,000 Michigan nursing home residents and found it to be highly predictive of the Visual Analogue Scale pain scores, considered the gold standard for a summary assessment of pain. Moreover, they maintained that the MDS pain scale was easier to administer than the Visual Analogue Scale and was a simple way to summarize the reported presence and intensity of pain using routinely collected nursing home MDS data (Fries et al. 2001). However, the MDS has other methodological problems, because it was developed with a focus on restoring function for reimbursement purposes and not on the special needs of the dying. Pain is measured by the admitting staff of each long-term care facility. Teno, Weitzen, Wetle, and Mor (2001) reported substantial differences in the identification of residents' pain frequency and severity by individual nursing homes (between 8% and 49% of residents reporting daily pain). This variation could reflect differences in case mix, inconsistent pain management, or inadequate pain assessment. Thus, nursing homes can appear to be better at managing pain, when in reality they are failing to assess and accurately diagnose pain.

Medicare claims data can give information on charges, reimbursement, hospitalizations, ICD-9 codes, and enrollment in hospice for those older than age 65. In The Dartmouth Atlas of Health Care (Wennberg and Cooper 1999), an annual publication, Medicare data are used to assess several indices of quality of care in the last 6 months of life, including rates of in-hospital deaths, ICU admissions, physician contact, and reimbursements for inpatient care. The Dartmouth Atlas provides data about hospital referral regions across the country, permitting useful comparisons by state. However, because the Medicare claims data lack information on disease severity and patient preferences, they yield only a narrow slice of information relevant to quality indicators for end-of-life care.

Another opportunity presented by national data sets comes from the U.S. Drug Enforcement Administration data from ARCOS/DADS (Automation of Reports and Consolidated Orders System/Diversion Analysis and Detection System). This source gives data on each opioid distributed by pharmacies per 100,000 persons, and thus allows state-to-state comparisons on pain management. The major weakness of this data set is that opioids are not segregated as to purpose; that is, medications for postoperative pain, chronic pain, and end-of-life pain are grouped together. Thus, trends in pain management for the dying are easily obscured by trends in other, larger pain populations.
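
The way an aggregate per-100,000 rate can hide a trend in a small subgroup can be shown with simple arithmetic. The following sketch uses invented quantities and an invented state population; the category names and both helper functions are hypothetical and do not reflect the actual structure of ARCOS/DADS data, which, as noted, does not separate opioids by purpose.

```python
def rate_per_100k(amount, population):
    """Quantity distributed per 100,000 persons."""
    if population <= 0:
        raise ValueError("population must be positive")
    return amount * 100_000 / population


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) * 100 / old


population = 1_000_000  # hypothetical state population, same in both years

# Hypothetical grams of one opioid distributed, split by (unobserved) purpose.
year1 = {"postoperative": 6000, "chronic": 3500, "end_of_life": 500}
year2 = {"postoperative": 6000, "chronic": 3500, "end_of_life": 750}

rate1 = rate_per_100k(sum(year1.values()), population)
rate2 = rate_per_100k(sum(year2.values()), population)

subgroup_change = percent_change(year1["end_of_life"], year2["end_of_life"])
aggregate_change = percent_change(rate1, rate2)

print(f"end-of-life use changed by {subgroup_change:.1f}%")
print(f"statewide rate changed by  {aggregate_change:.1f}%")
```

With these invented figures, a 50% rise in end-of-life opioid use moves the statewide rate by only 2.5%, a shift that ordinary year-to-year variation in the larger categories could easily swamp.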

Recommendation 8
Studies should be done that assess validity of available national data sets as sources of data about end-of-life care, test innovative approaches to their use, and control for limitations and potential biases of existing data sets.

Dying patients as respondents.
Although the patient is the best source of data about his or her end-of-life experience (Cohen and Mount 1992; Stewart et al. 1999), there are obvious barriers to prospective research with dying patients. Interviewing patients close to death may be unfeasible or unfairly burdensome (Wenger et al. 1994). The stress of data collection can exacerbate the symptoms being measured, thus eroding the reliability of the scores obtained. Ethical concerns can prevent researchers from approaching patients during the last days, and patients, if asked, may be unwilling or simply unable to participate because of dementia, delirium, or coma (Hardy, Edmonds, Turner, Rees, and A'Hern 1999; SUPPORT, 1995). Even when patients can be interviewed, the unpredictability of death means that patients in any given study may be interviewed 1 month, 1 week, or 1 day before death, thus introducing time-until-death as an uncontrolled variable. For example, patients with congestive heart failure, even 2 days before death, have a 50% chance of living 2 more months (Fox et al. 1999). Because the time-until-death endpoint can only be known in retrospect, families and clinicians most familiar with the patient's experience are often considered the most appropriate sources of data.

Surrogates as proxy respondents.
Proxies have been used widely in research involving older adults (Neumann, Araki, and Gutterman 2000) and those experiencing deteriorating health (Magaziner, Simonsick, Kashner, and Hebel 1988), as well as in retrospective studies of end of life (SUPPORT, 1995; Teno et al. 2001; Tolle, Tilden, Rosenfeld, and Hickman 2000). The use of proxy respondents is a practical and cost-effective way to capture a wide spectrum of data on the end-of-life experience, but their use raises concerns about reliability and validity. Overall, findings have been mixed about the reliability of proxy data. Clinicians tend to underestimate pain (Rankin and Snider 1984; Teske, Daut, and Cleeland 1983) and quality of life (Sprangers and Aaronson 1992). Findings are inconsistent about proxy estimates of patients' emotional problems. Nurses were noted to overestimate patients' levels of anxiety and depression (Holmes and Eburn 1989; Jennings and Muhlenkamp 1981), whereas physician ratings of patients' affective states sometimes overestimate and sometimes underestimate emotional symptoms (Brody 1980; Grandi et al. 1990).

Family ratings are most concordant with the patient when measuring an area of experience that is concrete and observable, such as functioning, physical health, and cognitive status, and less concordant in assessments of psychological and/or emotional well-being (Neumann et al. 2000; Sprangers and Aaronson 1992). Studies that examine both proxy and patient data collected in the same timeframe show similarly mixed results. Agreement between responses of family caregivers and patients is good for description of symptoms, adequacy of support, and preferred place of death (Spiller and Alexander 1993), as well as for assistance with activities of daily living, awareness of diagnosis, and evaluations of care (Field, Douglas, Jagger, and Dand 1995). However, family caregivers rate patients' emotional state as worse than do patients (Spiller and Alexander 1993), and agreement on pain is poor, with proxies either underestimating or overestimating it (Field et al. 1995).
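
Concordance of this kind is often summarized with a chance-corrected agreement statistic. The sketch below uses Cohen's kappa as one common choice; the source does not state which statistics the cited studies used, and the paired ratings and the function name `cohens_kappa` are hypothetical.

```python
from collections import Counter


def cohens_kappa(patient, proxy):
    """Chance-corrected agreement between two paired lists of
    categorical ratings (one rating per patient-proxy pair)."""
    if len(patient) != len(proxy) or not patient:
        raise ValueError("ratings must be paired and non-empty")
    n = len(patient)
    # Proportion of pairs in which patient and proxy gave the same rating.
    observed = sum(a == b for a, b in zip(patient, proxy)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    patient_counts = Counter(patient)
    proxy_counts = Counter(proxy)
    expected = sum(patient_counts[c] * proxy_counts[c]
                   for c in patient_counts) / (n * n)
    if expected == 1:
        # Both raters used one identical category; kappa is formally
        # undefined, and treating it as 1.0 here is an assumption.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical paired ratings of whether the patient had pain on a given day.
patient = ["pain", "pain", "none", "none", "pain", "none", "pain", "none"]
proxy = ["pain", "none", "none", "none", "pain", "pain", "pain", "none"]

kappa = cohens_kappa(patient, proxy)
print(f"kappa = {kappa:.2f}")
```

Here the raters agree on 6 of 8 pairs (75%), but half of that agreement would be expected by chance, so kappa is 0.5. Raw percent agreement alone would overstate how well the proxy tracks the patient.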

Some family members tend to be more reliable proxies than others. In general, spouses, children, and other close family members tend to be capable proxies. However, proxy reports can be influenced by caregiver stress, burden, or frail health, which can negatively influence proxy ratings of a patient's functional performance and psychosocial health (Long, Sudha, and Mutran 1998; Neumann et al. 2000; Rothman, Hedrick, Bulcroft, Hickam, and Rubenstein 1991; Zanetti, Geroldi, Frisoni, Bianchetti, and Trabucchi 1999). Frail health and cognitive impairments can complicate efforts to collect valid data from elderly proxies. Visual and hearing deficits in elderly proxy respondents are common (Desai, Pratt, Lentzner, and Robinson 2001), presenting special measurement challenges to researchers, who must adapt tools and methods to compensate and facilitate full participation. Finally, because caregivers often change as death approaches, the person who knows the most about the patient's last week of life may not be the same person who provided care in the preceding months (Fowler et al. 1999).

There are various explanations for the divergence in ratings between patients and proxies. First, patients and proxies do not have the same perceptions, having experienced the same events from different vantage points. In the case of retrospective studies, proxy reports on the end-of-life experience include the death itself, which is information the patient, prospectively, does not have. Second, difference in timing of interviews and difference in interpretation of questions by different respondents can affect concordance. Third, patients and proxies have different motivations and interests that affect their responses; for example, patient answers may be influenced by denial or concerns about self-presentation (Sprangers and Aaronson 1992), whereas retrospective proxy responses are likely colored by bereavement. Despite limitations, however, the literature does support the use of proxy respondents at the end of life, with good reliability for many variables. It is also important to acknowledge that family perceptions have value beyond their proxy function, and the family experience is itself a variable of interest. This leads to the following recommendation.

Recommendation 9
Studies should be done that test the validity of data from proxies. Because questions about the reliability and validity of proxy data may be a function of earlier and more rudimentary measures, the quality of proxy data should be judged with newer and more sensitive measures.

Locating family proxies.
Finding family proxies can be a challenge, especially in large samples that are not linked to records of specific health care organizations. Although an agency-based approach to sampling family respondents is logical to answer questions of quality improvement for system performance, in research, random sampling offers larger, more representative samples without the bias of convenience sampling. In our research using systematic random sampling of decedents, next of kin are identified through death certificates. We have called state health departments in every state and discovered that a majority of states' death certificates (78%) include the next-of-kin's mailing address. However, death certificates in Oregon and 10 other states list only the next-of-kin's name and relationship to the deceased. In our case, this has necessitated developing an array of case-finding strategies to locate an appropriate family respondent (Tilden, Drach, Tolle, Rosenfeld, and Hickman 2002). Our case-finding strategies have become remarkably effective over a series of three large studies: "Barriers to Improving Care of the Dying" (Tolle, Tilden, Hickman, Rosenfeld, and Halvor 1999; Tolle, Rosenfeld, Tilden, and Park 1999; Tolle, Tilden, Rosenfeld, et al. 2000), "Trends in Pain Management" (Tolle, Tilden, Hickman, and Rosenfeld 2000), and "Family Perceptions of Community-Based Dying" (Tilden, Tolle, and Drach 2000). Our most recent R01 study of statewide community-based dying had a 72% rate of locating eligible family respondents from only the next-of-kin's name. Thus, it is possible to achieve relatively representative random samples from the general population.
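
Systematic random sampling of a list of decedents can be sketched as follows. This is the textbook form of the technique (a random start followed by a fixed sampling interval) and is not necessarily the exact procedure used in the studies cited above; the record identifiers, the sample size, and the function name `systematic_sample` are hypothetical.

```python
import random


def systematic_sample(records, sample_size, rng=None):
    """Select `sample_size` records at a fixed interval after a random start."""
    rng = rng or random.Random()
    n = len(records)
    if not 0 < sample_size <= n:
        raise ValueError("sample_size must be between 1 and len(records)")
    step = n / sample_size       # sampling interval (may be fractional)
    start = rng.random() * step  # random start within the first interval
    picks = []
    for i in range(sample_size):
        # Guard against floating-point rounding past the end of the list.
        index = min(int(start + i * step), n - 1)
        picks.append(records[index])
    return picks


# Hypothetical ordered list of death certificate identifiers.
decedents = [f"certificate-{i:04d}" for i in range(1000)]

# A seeded generator makes the illustration repeatable.
sample = systematic_sample(decedents, 10, rng=random.Random(42))
print(sample)
```

Because every record has the same chance of selection and the picks are spread evenly through the list, the sample reflects the ordering of the source file (for example, date of death) rather than the convenience of any one agency.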

Recommendation 10
Studies should propose innovative sampling plans that will achieve large and representative samples of family proxies.


    Conclusions
 
Measurement bridges the conceptual and operational levels of scientific research, clinical care, and quality improvement. A large amount of psychometric groundwork has been laid in the field of end-of-life research, and outstanding investigators and sophisticated investigator teams are poised to take measurement to the next level of rigor and precision. Recommendations in this article are offered as suggestions for that work.

The National Institutes of Health has exerted major leadership in the country toward improving end-of-life care through research initiatives. Under a multi-institute collaborative agreement, several research funding programs have already been released to the scientific community: "Management of Symptoms at the End of Life" (National Institutes of Health [NIH], 1997), "Research on Care at the End of Life" (NIH 1999), and "Quality of Life for Individuals at the End of Life" (NIH 2000). These, combined with research initiatives by private foundations, have encouraged and supported a wave of empirical research on quality of life at the end of life and on improving end-of-life care.

We applaud this support and encourage more within a full range of support mechanisms, from pilot/feasibility mechanisms through comprehensive research centers. Just as the IOM has attended to measurement for crossing the quality chasm in health care, so quality of care and quality of life at the end of life will be enhanced by the continued improvement of measures.

Received for publication September 29, 2001. Accepted for publication April 23, 2002.

