Correspondence: Address correspondence to Sara J. Czaja, Dept. of Psychiatry and Behavioral Sciences, University of Miami School of Medicine, 1695 N.W. 9th Avenue, Miami, FL 33136. E-mail: jsharit{at}miami.edu
| Abstract |
|---|
Key Words: Ecological validity, Real-world tasks, Task analysis
| Introduction |
|---|
The goal of this article is to demonstrate how techniques such as task analysis and simulation can be used to develop representative tasks and enhance the ecological validity of research protocols. We begin with a brief discussion of methodological issues currently being raised within the cognitive aging literature. Next, we define characteristics of ecologically valid research and demonstrate the application of our approach using examples from research projects designed to examine aging and the performance of real-world computer-based work tasks. Finally, we discuss how findings from applied research studies can be translated into solutions for real-world problems and at the same time advance theory regarding aging and functional performance.
| Background |
|---|
In fact, within the domain of cognitive aging there has been a recent emphasis on understanding everyday cognition or the performance of older adults on problems encountered in everyday life. It has also been argued that understanding everyday cognition requires an ecological research approach and that traditional psychometric measures of cognition may have little relevance to the competence or functioning of older adults on tasks such as medication management, job activities, or learning to use novel technology (e.g., Denny, 1989; Salthouse, 1994; Willis & Schaie, 1986).
Findings regarding age and work performance provide a cogent illustration of the dilemma regarding the usefulness of psychometric measures. Most studies have shown that component processes of cognition, when assessed by traditional psychometric measures, decline with age (Park, 1992). Furthermore, there appears to be a relationship between these measures of ability and job performance (e.g., Schmidt & Hunter, 1992). Thus, on the basis of these findings one would expect to find age-related declines in work performance. In fact, there is little evidence that older workers are less productive than younger workers (Czaja, 2001). One explanation is that the ability measures are not tapping the abilities needed for particular jobs. Another explanation is that the measures fail to capture the complexity of work situations. For many types of work tasks, older people are able to use their expertise and contextual support to compensate for age-related declines in abilities. Use of compensatory strategies is not possible when performing psychometric tests that are relatively novel and place a heavy emphasis on test taking and academic skills. In addition, most work environments do not require performance at maximum levels, as is often the case with laboratory tasks. Consequently, there may be a discrepancy between findings regarding performance capabilities in laboratory versus work settings.
To overcome these issues, investigators (e.g., Allaire & Marsiske, 1999; Diehl, Willis, & Schaie, 1995; Salthouse, Hambrick, Lukas, & Dell, 1996) have attempted to use a more ecological research approach by examining performance on tasks designed to simulate situations encountered in everyday life. Diehl and associates used a measure of daily problem solving, the Observed Tasks of Daily Living (OTDL), to examine problem-solving competence in three domains: medication adherence, food preparation, and use of the telephone. They found that the performance of older adults on the OTDL was correlated with measures of basic abilities and that the OTDL showed convergent validity with a paper-and-pencil test of everyday problem solving.
In a similar vein, Salthouse and colleagues (1996) used a synthetic work task, SYNWORK1 (Elsmore, 1994), to gain insight into potential individual differences in the performance of a variety of jobs. SYNWORK1 is designed to capture the dynamic aspects of a complex work activity and consists of four distinct subtasks that are performed singly or simultaneously. The subtasks include a simple recall task, a self-paced task requiring concentration and monitoring, and tasks that require reacting to visual and auditory information. One problem with this approach, however, is that the synthetic work tasks may have limited face validity and thus may not capture relevant aspects of actual jobs. In addition, outcomes from studies using these types of tasks offer little guidance for designing intervention strategies for actual work environments.
A second approach is to observe performance in actual work settings. However, as pointed out by Salthouse and colleagues (1996), the process of observation and assessment may be intrusive and disruptive and not well received by workers. Furthermore, it is difficult to maintain scientific rigor and control. An alternative approach is to conduct a detailed analysis of real-world work tasks and build actual simulations of these tasks so that they can be examined in research environments. The advantage of this approach is that the important elements of a real-world problem can be investigated under controlled conditions. Potential problems with this approach are that the task analysis and development of the simulations can be very time consuming, and there may be challenges associated with scaling aspects of real-world situations into platforms that are feasible for experimental settings. In addition, there may be limited generalizability with respect to the research findings. However, simulations can be designed to represent generic systems, and findings can be generalized to a class of problems or situations.
| Ecologically Valid Research |
|---|
The concept of ecological validity has a relatively long history within the disciplines of applied experimental psychology and human factors engineering and ergonomics (see Fisk & Kirlik, 1996; Hoc, 2001). For example, Gibson (1966) emphasized the need to study natural stimuli when investigating issues related to the acquisition of reading skills, and Chapanis (1988) discussed a number of issues related to ecological validity in a classic article concerned with the limited generalizability of research findings. More recently, investigators (e.g., Vicente, 2002; Vicente & Rasmussen, 1990) have been concerned with the development of "ecological interfaces."
It is important to note that the use of ecologically valid research techniques does not preclude advancing theory. The literature is replete with examples of how researchers addressing real-world problems have contributed to the development of theoretical models of human performance. Rogers and Fisk's (e.g., Rogers, Fisk, Mead, Walker, & Cabrera, 1996) work on training has contributed to our knowledge of skill acquisition, and Drury's (1975, 1982) early research on industrial inspection has advanced our understanding of visual search. In fact, a requirement of ecologically valid research is that it builds on and advances theory. Proponents of this approach also maintain that the development of interventions must be guided by theoretical principles. The importance of combining theory, derived from basic research, with data from more applied research has been discussed at length by Kantowitz (1992) and Fisk and Kirlik (1996).
Achieving ecological validity requires consideration of a number of issues. One of these concerns the nature of the research problem. Generally, research that is ecologically valid focuses on real-world practical problems such as issues related to work, driving, health care, training, or interface design. The research questions might include attempting to understand individual differences in the performance of a task, for instance, age differences in driving ability or how older adults adapt to technological advances in the workplace. They may also encompass issues related to training or the efficacy of an intervention or design solution. For example, Rogers and Fisk have conducted a number of studies (e.g., Rogers, Fisk, Mead, Walker, & Cabrera, 1996) to identify how best to train older adults to use new technologies, and Park and colleagues (e.g., Park, Morrell, Frieske, & Kincaid, 1992) are interested in understanding the efficacy of various devices in improving the medication adherence of older people. On a more theoretical level, they are also interested in understanding how cognitive factors influence adherence performance (Brown & Park, 2003).
A second requirement of ecologically valid research, critical to obtaining answers to questions regarding real-world issues, is the inclusion of criterion tasks that capture the relevant features of real-world tasks and environments. By minimally compromising the multidimensional complexity associated with task demands, ecologically valid tasks maximize the opportunity for use of contextual support and are thus more representative of true performance effects. As previously discussed, examining performance on laboratory tasks is insufficient when attempting to derive solutions to practical problems. This is not to suggest that the use of these types of measures does not help provide insight into the source or nature of individual differences in performance. For example, understanding the link between component cognitive abilities, aging, and driving is important to understanding why older people perform at different levels than younger adults and to the development of theoretically driven design solutions. In fact, generalizing research findings from one situation to another requires describing, at a more abstract level, how the situations are similar (Hoc, 2001). Thus, recognizing how findings from studies of visual inspection can be useful to driving requires an understanding of how the more abstract abilities of selective attention and visual scanning are important to the performance of both tasks. However, sole reliance on laboratory measures is rarely sufficient to gain insight into the types of problems older people encounter while driving or how contextual factors such as weather or traffic affect driving performance.
Closely linked to the requirement for representative tasks is the selection of representative performance measures. Dependent measures need to have relevance to the practical problem being investigated. In recent years within human factors engineering, a great deal of emphasis has been placed on the study of human error, which has traditionally been one of the most commonly used measures of human performance. This emphasis is due in large part to the consequences of human error in complex systems such as aviation, energy production, and health care and to the recognition that the majority of human errors are due to discrepancies between the demands of the system and human performance capabilities (Wickens & Hollands, 2000). Part of this effort has been devoted to the development of taxonomies to describe the nature of errors, as understanding the types of errors that occur has important implications for system design. Most important, understanding the fact that humans make errors or having knowledge concerning the quantity of such errors is insufficient in terms of developing design solutions. Instead, what is needed is an understanding of the types of errors that people make and the contextual factors that may contribute to the likelihood of error occurrence. In addition to errors, other performance variables such as measures that reflect preference, workload, confusion, or quality of performance may be useful. The choice of measures, however, should depend on the nature of the research question and the task and environment of interest.
In addition to consideration of tasks and outcome measures, it is also important to consider subject populations. Specifically, it is important to ensure that these populations are representative of people who are performing the task, interacting with the system, or operating within the environment of interest. For example, if the concern is developing design guidelines for nursing homes, consideration should be given to the needs of the staff, residents, and family members/visitors. Similarly, designers of websites need to be aware of the characteristics of a potentially broad base of users including people of various age groups, skill levels, ethnicity, socioeconomic status, and physical and cognitive ability. Issues related to recruitment of diverse populations are discussed by Warren-Findlow, Prohaska, and Freedman (2003). Finally, ecologically valid research should include mechanisms for translation of findings into solutions for practical problems and for disseminating (see Farkas & Jette, 2003) and implementing these solutions in real-world settings (see Ball, Wadley, Edwards, & Roenker, 2003).
The following section discusses techniques that can be used to achieve ecological validity. This is followed by case studies that demonstrate how we have applied these techniques in examining age-related differences in the performance of computer-based work tasks.
| Designing Ecologically Valid Research Protocols |
|---|
Once representative tasks are identified, the properties of these tasks must be delineated. Generally, a task can be described as a set of goal-directed activities that have a starting point and a stopping point. A description of the task must include action requirements (physical and cognitive), performance criteria, equipment and support materials, and so forth. Task analysis represents a formalized process for decomposing and describing an activity. Using this approach, activities are decomposed according to goals (the overall intent with respect to changing the environment or system of interest), subgoals, and activities (Annett & Duncan, 2000). Consideration must also be given to the context within which the task is performed. Given that environmental factors shape behaviors, it is important to understand the main characteristics and constraints of the actual situation. There are a number of methods available for collecting task analysis data, including observation, interviews, task participation, and questionnaires.
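The decomposition into goals, subgoals, and activities described above can be pictured as a small tree. The following is only a minimal sketch: the `TaskNode` class and the example entries are hypothetical illustrations and are not taken from any actual task analysis.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskNode:
    """One node in a hierarchical task decomposition (hypothetical structure)."""
    name: str
    kind: str  # "goal", "subgoal", or "activity"
    children: List["TaskNode"] = field(default_factory=list)

    def activities(self) -> List[str]:
        """Return the names of all activities under this node, in order."""
        if self.kind == "activity":
            return [self.name]
        leaves: List[str] = []
        for child in self.children:
            leaves.extend(child.activities())
        return leaves


# Illustrative decomposition only; the entries are assumptions, not study data.
example = TaskNode("Resolve a member query", "goal", [
    TaskNode("Locate the relevant record", "subgoal", [
        TaskNode("Select a file subsystem from the main menu", "activity"),
        TaskNode("Enter the member identifier", "activity"),
    ]),
    TaskNode("Document the resolution", "subgoal", [
        TaskNode("Record the critical pieces of information", "activity"),
    ]),
])

for step in example.activities():
    print(step)
```

Listing the leaf activities in order gives a flat description of what a participant must actually do, which is one possible starting point for specifying a simulation.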
Once the requirements of a task are understood, a simulation or model of the task can be developed so that it can be investigated within research environments. Although task simulations do not offer the full complexity of real-world field testing, they do offer the advantage of being able to observe performance under controlled conditions and the flexibility of being able to vary conditions and task requirements. For example, use of a driving simulator captures many of the elements of on-the-road driving but, at the same time, allows control over factors such as traffic conditions, weather, and driving course. The two important characteristics of a simulation are fidelity or realism and comprehensiveness or the extent to which the real-world situation is reproduced (Meister, 1990). The scale of the simulation depends to a large extent on the nature of the research question and issues related to feasibility and cost. Feasibility relates to time constraints and technical/staff requirements as well as potential demands for research participants (e.g., the amount of training required to learn a task).
To illustrate these techniques, we discuss and contrast two case studies, based on projects within our Roybal Center. The tasks used in these studies were simulated real-world tasks performed by customer service representatives in the health insurance and retail industries. In both cases, the tasks involved interacting with technology to navigate complex information databases to respond to queries or requests for information from customers. These tasks were chosen in view of the fact that customer service tasks are representative of technology-based tasks commonly performed across a wide variety of industries and that more than 3.3 million workers in the United States are customer service representatives (Communications Workers of America, 2000). Furthermore, the incidence of telecommuting, the focus of the second example, is increasing. In 1995, at least three million Americans were telecommuting for purposes of work, and this number is expected to increase by 20% per year (Nickerson & Landauer, 1997).
Customer service jobs emphasize information search and retrieval activities, and most of these jobs involve reliance on computer-based information databases. These types of jobs are expected to proliferate given the increased focus on a service-oriented economy and the availability of cheaper and more powerful technologies for storage and organization of information. In addition to being representative of tasks currently being performed in the work sector, these types of jobs are more amenable to part-time work than most traditional jobs, are characterized by minimal physical demands, and offer the potential for social interaction. These characteristics make these types of jobs particularly suitable for older adults.
The following case studies contrast the development of two different simulations corresponding to two different types of customer service representative jobs. Both simulations exemplify the tension that exists between capturing contextual variables and constraints that operate in the real world and the scaling needed to investigate these jobs in more controlled research environments.
Case Study 1: Simulation of a Service Representative in a Health Insurance Company
The first case study pertains to a job performed by service representatives of a large health insurance company. The task involved responding to calls from members of the health insurance plans offered by the company (Czaja, Sharit, Ownby, Roth, & Nair, 2001). The worker provides answers or performs actions regarding issues that members have concerning their health insurance and documents the resolution of these queries and actions. The information needed to perform the job is stored largely in computer databases; however, some of the information is derived from hard-copy sources.
Adopting an ecological perspective to development of the task simulation involved performing a task analysis of the actual job, which required close collaboration with the insurance company. We performed extensive interviews with employees and managers, observed a variety of workers performing the actual task, and reviewed relevant documentation and company records. Analysis of this information enabled us to develop problem scenarios for the experimental sessions that were representative of the types of queries and issues that service representatives encounter during a typical workday. The sample queries matched the relative frequency with which these questions occurred in actual work situations, thereby enhancing the realism of the simulation. Interaction with company personnel also enabled us to develop a model of the computer information system and interface and of the hard-copy manual of materials that the workers used as reference support. The computer information system consisted of four file subsystems, listed in a main menu, that were linked to particular types of information requests. The reference manual included detailed information on the distinction between the company's two major health plans and the benefits associated with these plans, sections pertaining to listings of service and reason codes (pertaining to services for which claims were being requested and reasons for why claims were denied), and case announcement briefs that contained information related to the specific contracts that the employers of members negotiated with the insurance company.
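One way to make a set of sample queries match the relative frequency observed in the field is weighted random sampling. The sketch below assumes hypothetical query categories and proportions; the function name `sample_queries` and all values are illustrative and do not come from the study.

```python
import random
from typing import Dict, List


def sample_queries(relative_frequency: Dict[str, float],
                   n_queries: int,
                   seed: int = 0) -> List[str]:
    """Draw query categories in proportion to their observed relative frequency."""
    rng = random.Random(seed)  # fixed seed so a session script can be reproduced
    categories = list(relative_frequency)
    weights = [relative_frequency[c] for c in categories]
    return rng.choices(categories, weights=weights, k=n_queries)


# Hypothetical categories and proportions, for illustration only.
observed = {"claim status": 0.5, "benefit coverage": 0.3, "change of address": 0.2}
session_script = sample_queries(observed, n_queries=20, seed=42)
print(session_script)
```

With a small number of queries the sampled proportions will only approximate the target frequencies, so a fixed quota per category may be preferable when exact matching matters.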
We also developed a database of fictitious subscribers, family members, and physicians. To enhance the realism to the study participants, names and addresses were chosen that were representative of the local community. The computer screen configurations were highly similar to those used in the actual work situation; for example, fields where particular alphanumeric codes were to be entered were located in the same screen areas (Figure 1).
By mapping this work scenario into an experimental paradigm, the information-seeking and retrieval activities by the subjects could be captured and measured under controlled conditions. Finally, the simulation was designed so that we could capture real-time performance data. Our measures reflected those used by the actual company and, of course, were also linked to our research hypotheses. For example, to evaluate the degree to which older adults deviated from a maximally efficient search path, a measure of navigational efficiency was computed. This measure enabled us to test the hypothesis that older adults have a greater tendency than younger people to become lost in the computer database's space. We also included measures of work output, consistent with the company's concerns regarding amount of work completed during a given day.
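The text does not give the formula used for navigational efficiency. As a minimal sketch under that caveat, one plausible definition is the ratio of the number of selections on the shortest path to the number of selections the participant actually made; the function name and the screen labels below are assumptions made for illustration.

```python
from typing import List


def navigational_efficiency(optimal_path: List[str],
                            observed_path: List[str]) -> float:
    """Ratio of the shortest path length to the observed path length.

    A value of 1.0 means the participant followed a maximally efficient
    path; lower values indicate extra selections such as detours or
    backtracking. This definition is an assumption, not the study's measure.
    """
    if not optimal_path or not observed_path:
        raise ValueError("both paths must contain at least one selection")
    return min(1.0, len(optimal_path) / len(observed_path))


# Hypothetical screen names, for illustration only.
optimal = ["Main menu", "Claims", "Claim detail"]
observed = ["Main menu", "Benefits", "Main menu", "Claims", "Claims", "Claim detail"]
print(navigational_efficiency(optimal, observed))
```

A ratio of this kind is easy to compare across queries of different depth, although it does not distinguish between types of detours.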
Ecological validity also presumes that scaling the actual work task into an experimental session does not appreciably alter the multidimensional complexity of the task and hence the demands placed on the subject. However, feasibility with respect to constructing the simulation, training demands for the participants, and the time required for the experimental sessions are also critical concerns. In our case, we attempted to scale back the job into critical elements by selecting queries and features of the information databases that represented a large proportion of the variability associated with real-world queries.
In developing this simulation, we encountered two issues related to ecological validity that are worth noting. The first concerned the mode in which queries were handled. In the real-world version of this task, customer queries are handled exclusively by telephone, whereas in the simulated task the majority of these requests for information were presented visually on printed cards. Given that this task was performed for 3 hr a day over a 3-day period by 117 study participants, the logistics of providing the participants with ongoing fictitious customer calls and analyzing performance based on the transcripts of these phone conversations rendered this approach infeasible. However, to capture some element of this characteristic, the participants were also required to respond to telephone queries generated by the experimenter at four different points in time on each of the 3 days the task was performed. Correlational analysis based on measures derived from transcripts of the phone-based interactions indicated that performance with the form-based method of presenting queries was a valid predictor of phone-based performance (Czaja et al., 2001).
The second major concern was the documentation of phone conversations. In the real world, service representatives are required to document, in a free-text format, all phone conversations. The risk in incorporating this feature in the simulation is that a large portion of the participant's time may be consumed by composing these written responses rather than by information-seeking behavior, especially in view of the ethnic diversity of the sample (many of our participants were nonnative English speakers). Thus, a documentation system was developed that only required the participant to provide critical pieces of information related to the query (or to the request for an action such as informing the company of a change of address) that was just processed. This information provided a measure of correctness in responding to the query, as well as a measure of the quality (i.e., degree) of the documentation.
Case Study 2: Simulation of a Telecommuting Customer Service Representative
As in the previous case, the participants in this ongoing study assume the role of customer service representative. However, in contrast to the previous case, this task simulates work at a fictitious company, which is referred to as Media Products, Inc. The company is Web-based and sells products related to computers and accessories, as well as related products such as digital cameras and software. Like many Internet-based companies (e.g., Amazon.com), Media Products prefers that customers with problems or questions correspond with service representatives via E-mail. Likewise, customer service representatives use E-mail to respond to these questions and complaints. Because this job can be performed from any location where a computer with an Internet connection is available, these jobs are often referred to as telecommuting jobs. Generally, these types of jobs may offer many benefits to older people, including the ability to work at home and the ability to maintain control over the pace and duration of the work task.
Our interest was in developing a general simulation model of a telecommuting task. Again, we relied heavily on the use of task analysis to develop the simulation. Initially, we analyzed companies that sell similar products through the Internet. In some cases, companies provided listings of common problems and questions asked by customers (e.g., questions regarding product features, methods of payment, dealing with defects). We also conducted phone interviews with several individuals who were performing telecommuting work for a small Internet-based company that sold various software products.
This information served as the basis for the development of a database that consisted of three primary sections: policies and procedures, products, and customer and order information. The policies and procedures section contains rule-based information and consists of 10 submenus. The product database contains 6 submenus pertaining to different product categories, and the customer and order information section contains a listing of customers and information related to their orders. The latter two sections of this database were configured according to traditional row versus column structures. The participant's task is to sequentially open E-mails from a listing of 40 E-mails in an E-mail inbox window and respond to the customers' inquiries (Figure 2).
Whereas in Case 1 the screen configurations and, more generally, the information environment were modeled after an existing company, the use of a prototype telecommuting task allowed more flexibility in constructing the interface. Specifically, we were able to address the needs of older adults by incorporating features that minimized working memory demands. These features included a split-screen format that kept the E-mail visible while information relevant to it was being searched for (Figure 2) and a "history" window, available on request, that displayed all selections made to that point.
Another compromise to real-world activities in the interest of research objectives concerned the nature of the E-mails. Although these E-mail letters are fictitious, as were the queries discussed in Case 1, an important objective of this study is to track learning on this task. Following 1 day of training and practice, each participant performs this task over a 4-day period, with two 2-hr sessions per day and 40 E-mails per session. They are instructed to reply to as many E-mails as possible during each session. The E-mails are constructed to capture a number of distinctions that were relevant to the research objectives, including simple (one answer/selection) versus complex problems, problems that require limiting search activities to one database section (e.g., rules and policies) versus problems that require making selections across the different sections of the database, and E-mails that are succinct versus E-mails that contain excessive text. Each of the 40 E-mails had eight versions that were used across the eight sessions. Although the E-mails across sessions were worded differently and sent from different customers, the information constituting a correct E-mail response was essentially the same. For example, in a different version of a given E-mail the customer may have inquired about a particular feature of a different product, but one that was in the same product category. This level of control afforded by the laboratory environment was essential for assessing task learning.
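The text states that each of the 40 E-mails had eight versions used across the eight sessions but does not say how versions were assigned to sessions. The sketch below assumes a simple rotation, so that every E-mail appears in a different version in each session; the rotation rule and all names are illustrative assumptions rather than the study's procedure.

```python
from typing import List

N_EMAILS = 40    # E-mails per session, as described in the text
N_SESSIONS = 8   # two sessions per day over four days
N_VERSIONS = 8   # versions available for each E-mail


def version_for(email_index: int, session_index: int) -> int:
    """Pick a version by rotation (an assumed rule, not the study's)."""
    return (email_index + session_index) % N_VERSIONS


def build_schedule() -> List[List[int]]:
    """Return schedule[session][email] = version index shown in that session."""
    return [
        [version_for(email, session) for email in range(N_EMAILS)]
        for session in range(N_SESSIONS)
    ]


schedule = build_schedule()
# Each E-mail is seen in all eight versions exactly once across the sessions.
for email in range(N_EMAILS):
    versions_seen = {schedule[session][email] for session in range(N_SESSIONS)}
    assert versions_seen == set(range(N_VERSIONS))
```

Because the information needed for a correct response is essentially the same across versions, a schedule of this kind keeps task content constant while preventing participants from simply recognizing a previously seen E-mail.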
In principle, arrangements could have been made to have this task performed in the homes of the participants, thereby providing for a more realistic simulation of the telecommuting work environment. However, this would have compromised experimental control. Finally, although face validity from the participants' perspective is not generally acknowledged as a measure of ecological validity, anecdotal reports by the subjects concerning their study experiences can provide important feedback regarding this issue to researchers. Thus we are conducting exit interviews with the participants regarding their perceptions of the task. Participants routinely noted that they felt like they were involved in a real work experience. A number of the participants who were retired or unemployed have actually inquired into obtaining these jobs. We believe that ecologically valid simulations of work tasks should evoke real experiences of work by study participants.
| Conclusions |
|---|
The success of using an ecologically valid research approach depends on capturing critical elements of tasks, environments, and behaviors. Toward this end, techniques such as task analysis and simulation can be used to help achieve ecologically valid design. The Roybal Centers have provided an opportunity and support for meeting this goal. However, there are challenges associated with conducting this type of research, and more emphasis needs to be directed toward the development of research protocols that enable the generalization of research findings to real-world settings. As noted by Fleishman and Quaintance (1984), the absence of information on tasks and natural situations is the reason why much of the current literature on human performance has resulted in few principles and guidelines for solutions to real-world problems.
| Footnotes |
|---|
2 Department of Industrial Engineering, University of Miami, Coral Gables, FL.
Received for publication July 18, 2002. Accepted for publication September 4, 2002.