The Gerontologist 44:689-692 (2004)
© 2004 The Gerontological Society of America

Interrater Reliability of the Outcomes and Assessment Information Set: Results From the Field

Elizabeth A. Madigan, PhD, RN1, and Richard H. Fortinsky, PhD2

Correspondence: Address correspondence to Elizabeth A. Madigan, PhD, RN, Associate Dean for International Health, Associate Professor, Frances Payne Bolton School of Nursing, Case Western Reserve University, 10900 Euclid Avenue, Cleveland, OH 44106-4904. E-mail: Madigan@case.edu


    Abstract
The Outcomes and Assessment Information Set (OASIS) is now used extensively for regulatory, reimbursement, research, and clinical purposes in home health care. However, little is known about the interrater reliability of OASIS items based on assessments from home-health-agency clinicians. Therefore, we evaluated OASIS item interrater reliability among 88 patients from 21 agencies. Of 25 items studied, all except 2 had weighted kappa values of κ ≥ 0.60. We conclude that OASIS item interrater reliability was highly adequate in this study, but we recommend routine interrater-reliability evaluation by agencies to maximize the quality of OASIS data nationally.


The Outcomes and Assessment Information Set (OASIS) was mandated for use in home-health-care agencies by the Centers for Medicare and Medicaid Services in 1999 for two primary purposes. The first purpose was to establish a uniform database for quality-management activities within the home-health-care industry, primarily at the agency level (Shaughnessy et al., 2002), known as outcome-based quality improvement and outcome-based quality management (OBQI/OBQM). The second purpose was to use a uniform set of clinical, functional, and service-related variables to develop a case-mix-adjustment algorithm, the Home Health Resource Group (HHRG), for the Medicare home-health-care prospective payment system (PPS), which was implemented in October 2000. Thus, OASIS data are critically important for home-health-care-agency compliance with federal regulation and reimbursement initiatives (Madigan, 2002; Madigan, Tullai-McGuinness, & Fortinsky, 2002, 2003).

In addition, because OASIS data are collected on all home-health-care patients insured by the Medicare and Medicaid programs, they represent a potentially valuable resource for investigators interested in conducting home-health-care research (Fortinsky, Garcia, Sheehan, Madigan, & Tullai-McGuinness, 2003). However, to our knowledge, there is little published information regarding the reliability of OASIS data. Therefore, our purpose in this study was to determine the interrater reliability of OASIS data as collected by home-health-care clinicians in the field, focusing on items used for the home health care PPS and for quality management.

Reported Reliability of OASIS Data
Given the multiple uses of OASIS data, a critical question is whether two clinicians will assign the same response to an OASIS item for the same patient. Questionnaire or clinical assessment items are commonly assumed to have adequate interrater reliability if the value of the reliability coefficient (weighted kappa for OASIS items with more than two possible response categories) is ≥ 0.60 (Landis & Koch, 1977). Crisler, Shaughnessy, and colleagues (Crisler et al., 2002; Shaughnessy et al., 2002) reported interrater reliability results from 66 patients from five home-health agencies, in which two nurses specifically hired and trained for the reliability exercise (not agency employees) performed OASIS assessments within 24 hr of each other. Of the 38 OASIS items used to calculate health-status outcomes in OBQI reports, all but 2 items had coefficients greater than 0.60, and 25 items had coefficients exceeding 0.70.
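The threshold described above can be sketched in a few lines of Python. This is only an illustration, not part of the reported analysis: the function names `landis_koch_label` and `meets_study_threshold` are hypothetical, the band labels follow the commonly cited Landis and Koch (1977) benchmarks, and the 0.60 cutoff is the one stated in this article. Note that under the published bands a value of exactly 0.60 falls at the top of the "moderate" range, whereas the criterion used here treats values ≥ 0.60 as adequate.

```python
def landis_koch_label(kappa):
    """Return the Landis and Koch (1977) descriptive band for a kappa value."""
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie between -1 and 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


def meets_study_threshold(kappa, threshold=0.60):
    """Apply the adequacy criterion used in this article (kappa >= 0.60)."""
    return kappa >= threshold


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to exercise each branch.
    for value in (0.41, 0.60, 0.70, 0.85):
        print(value, landis_koch_label(value), meets_study_threshold(value))
```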

In the only other known published study on OASIS data reliability, Madigan and Fortinsky (2000) studied the intrarater reliability of selected items. The same home-health-care clinician completed OASIS items as part of a routine home visit and then completed an OASIS form for the same patient within 48 hr, using recall of the patient's condition and non-OASIS clinical documentation as guides. This modified approach based on 22 admission and 15 discharge assessments yielded kappa scores ranging from 0.41 to 1.0.

In summary, few studies have published results on the reliability of OASIS data, despite their increasingly common use for home-health-care policy and research purposes. No known studies have reported interrater reliability of OASIS data when two home-health-care clinicians employed by agencies are used as the OASIS data collectors. This study is intended to fill this important knowledge gap.


    Methods
We conducted this interrater reliability study as part of a larger prospective cohort study designed to examine determinants of and relationships between Medicare home-health-care patient outcomes and resource-use patterns. Further details about the larger study have been published elsewhere (Fortinsky et al., 2003).

To obtain interrater reliability data, we instructed each participating agency to send two health care professionals (raters) into the home at the same time for up to five home visits made to any patients enrolled in the larger study. During these selected visits, the first rater completed an OASIS form according to the protocol for agency purposes. The second rater independently and concurrently completed an OASIS form upon observing the patient and the first rater. We instructed raters not to change their answers on the basis of any discussions held after they completed their OASIS forms.

Sample
The final interrater reliability sample includes 88 patients from 21 participating agencies, each of whom had one interrater reliability assessment. Patients in the interrater reliability sample had a mean age of 77.7 years (SD = 8.24); 66% were female and all were White. These demographic characteristics were nearly identical to the larger study sample of 1,284 patients in terms of age and gender, although 93% of the larger study sample was White. Of the 88 interrater reliability assessments, 50 were completed at admission, 32 at discharge, and 6 at resumption of care following a hospitalization. To preserve maximum sample size, we pooled all 88 assessments for analyses in this study.

Analysis
We used weighted kappas for the interrater reliability cases. Weighted kappas are preferred because they take into account not only chance agreement, as with the original kappa (Cohen, 1968), but also the extent of disagreement (Agresti, 1990). In other words, when the two raters differ in their classification of the participant, ratings separated by only 1 point receive higher agreement scores than those separated by 2 or more points. We also report the percentage of agreement for comparability with other reports. Finally, we report weighted kappas only for items for which we had more than 30 interrater reliability cases. Some items had fewer than 30 cases because the item was associated with a particular condition, such as a wound, or with data collected only at specific times (transfer or discharge).
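To make the weighting idea concrete, the following Python sketch computes a weighted kappa and the percentage of agreement for two raters. It is a minimal illustration under stated assumptions, not the software used in this study: the article does not say whether linear or quadratic weights were applied, so the hypothetical function `weighted_kappa` supports both, and the ratings in the example are invented for demonstration only.

```python
from collections import Counter


def weighted_kappa(rater_a, rater_b, categories, scheme="linear"):
    """Weighted kappa for two raters scoring the same cases on an ordered scale.

    categories: the ordered list of possible response values.
    scheme: "linear" or "quadratic" disagreement weights (an assumption;
            the article does not state which scheme was used).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    if len(categories) < 2:
        raise ValueError("at least two categories are required")
    if scheme not in ("linear", "quadratic"):
        raise ValueError("scheme must be 'linear' or 'quadratic'")

    index = {c: i for i, c in enumerate(categories)}
    k, n = len(categories), len(rater_a)
    power = 1 if scheme == "linear" else 2

    def disagreement(i, j):
        # 0 for identical ratings, 1 for ratings at opposite ends of the scale.
        return (abs(i - j) / (k - 1)) ** power

    pairs = Counter((index[a], index[b]) for a, b in zip(rater_a, rater_b))
    marg_a = Counter(index[a] for a in rater_a)
    marg_b = Counter(index[b] for b in rater_b)

    # Observed and chance-expected weighted disagreement.
    observed = sum(disagreement(i, j) * c / n for (i, j), c in pairs.items())
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - observed / expected


def percent_agreement(rater_a, rater_b):
    """Percentage of cases on which the two raters gave identical ratings."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)


if __name__ == "__main__":
    # Invented ratings on a 0-2 scale for six patients.
    first = [0, 1, 2, 2, 1, 0]
    second = [0, 1, 2, 1, 1, 0]
    print(round(weighted_kappa(first, second, [0, 1, 2]), 2))
    print(round(percent_agreement(first, second), 1))
```

In this invented example the raters differ on one patient by a single point, which gives a linear weighted kappa of about 0.80; had they differed by 2 points on that patient, the coefficient would be lower, which is the property described above.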

A number of OASIS items allow multiple responses. For example, for Behaviors Demonstrated, each behavior is noted as yes or no. In the analysis, we computed the average kappa and percentage of agreement for the six items that make up the Behaviors Demonstrated OASIS item. OASIS items selected for this analysis include those used to determine HHRG classification for the PPS and those items used in the OBQI report described herein. Some items appear in both the OBQI report and the HHRG report. In these cases, the items are included in both tables to facilitate understanding (see Table 1).
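The averaging step for a multiple-response item can be sketched as follows. This is an illustrative outline only: the helper names, the sub-item labels, and the yes/no ratings are hypothetical, and only two sub-items are shown rather than the six that make up Behaviors Demonstrated. Because each sub-item is dichotomous, the sketch uses the unweighted kappa, which is what a weighted kappa reduces to with two categories.

```python
def cohen_kappa(rater_a, rater_b):
    """Unweighted kappa for two raters on a nominal (here yes/no) item."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be paired and non-empty")
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    if p_expected == 1:
        # Happens when both raters give the same single answer for every case.
        raise ValueError("kappa is undefined when there is no variation")
    return (p_observed - p_expected) / (1 - p_expected)


def summarize_multi_response(sub_items):
    """Average kappa and percentage of agreement across yes/no sub-items.

    sub_items maps a sub-item label to a (rater_a, rater_b) pair of lists.
    """
    kappas, agreements = [], []
    for rater_a, rater_b in sub_items.values():
        kappas.append(cohen_kappa(rater_a, rater_b))
        matches = sum(a == b for a, b in zip(rater_a, rater_b))
        agreements.append(100 * matches / len(rater_a))
    return sum(kappas) / len(kappas), sum(agreements) / len(agreements)


if __name__ == "__main__":
    # Hypothetical yes (1) / no (0) ratings for four patients.
    behaviors = {
        "behavior_1": ([1, 0, 1, 0], [1, 0, 1, 0]),
        "behavior_2": ([1, 1, 0, 0], [1, 0, 0, 0]),
    }
    mean_kappa, mean_agreement = summarize_multi_response(behaviors)
    print(mean_kappa, mean_agreement)
```

One design point worth noting: if neither rater ever marks a behavior as present, kappa for that sub-item is undefined, so this sketch raises an error rather than silently including it in the average. How such sub-items were handled in the study is not stated.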


Table 1. Interrater Reliability for OASIS Items Used in OBQI.

 

    Results
Results of the interrater reliability analyses are presented in Tables 1 and 2. Using the parameters established by Landis and Koch (1977), we defined substantial interrater reliability as kappa scores of κ ≥ .60. Table 1 shows that all OASIS items used in OBQI and HHRG reports achieved substantial interrater reliability. A number of items used in OBQI reports or for HHRG scoring had insufficient sample sizes because they are completed only for subgroups of patients (e.g., patients transferred from home or those who have urinary incontinence or wounds); therefore, we did not compute kappa scores for these items.


Table 2. Interrater Reliability of Selected Items Used in HHRG Scoring.

 

    Conclusions
In this study we found that the OASIS items used in OBQI reports and HHRG scoring had substantial interrater reliability when rated independently by two clinicians employed by Medicare-certified home-health agencies. There were no items with values less than .60, and most items had values higher than .70. These findings suggest that the reliability of these OASIS items is sufficient for use in research and for regulatory and reimbursement purposes. Further examination of interrater reliability for the less commonly used OASIS items is needed.

Our findings regarding high levels of interrater reliability among most OASIS items studied are consistent with those reported elsewhere (Crisler et al., 2002; Shaughnessy et al., 2002), even though our evaluation methods differed. Our study included two raters, both employed by participating home-health agencies, observing the same OASIS assessment session in the home at the same time. In contrast, the comparative studies (Crisler et al., 2002; Shaughnessy et al., 2002) involved two raters, both research nurses not employed by participating home-health agencies, conducting OASIS assessments at different times, usually on different days. We acknowledge that both evaluation approaches have strengths and limitations. Although our approach captures interrater reliability among actual home-health staff conducting OASIS assessments as part of an actual home visit, we recognize that it might yield more favorable results because both staff members were present in the patients' homes at the same time. For example, simultaneous ratings minimize the potential sources of instability in reported and recorded data, such as changes in patient status, the manner in which the OASIS-driven questions are asked by raters, and the manner in which patients reply to clinical and functional questions from raters.

In addition, our findings are similar to interrater reliability findings on the Minimum Data Set, which is another federally mandated data set used with older adults residing in nursing homes. Research by Hawes and colleagues (1995) evaluated interrater reliability for 123 nursing home residents by using two raters who independently evaluated patients on two separate occasions within a 7-day time frame. Similar to our findings, the reliability coefficients between the raters were adequate or better than adequate, with the functional status domain items showing the highest levels of interrater reliability.

In conclusion, the value of this study is that it tested the interrater reliability of OASIS items critically important to current home-health-care practice and policy by using actual home-health-care staff as raters. To our knowledge, results from this type of OASIS item-reliability evaluation have not been previously reported. Although we believe that evidence is clearly building in support of the reliability of OASIS data, we also contend that, in order to sustain this level of reliability, Medicare-certified home-health agencies throughout the United States should routinely evaluate the reliability of their clinical staff in recording OASIS data.


    Footnotes
 
Funding for this study was provided by the National Institute of Nursing Research under Grant R01 NR05081 (R. H. Fortinsky, Principal Investigator).

1 Frances Payne Bolton School of Nursing, Case Western Reserve University, Cleveland, OH.

2 Center on Aging, University of Connecticut Health Center, Farmington.

Decision Editor: Linda S. Noelker, PhD

Received for publication August 7, 2003. Accepted for publication February 16, 2004.





