FOTO is primarily interested in measuring patients' physical functional status, or ability, through patient self-report, for the purpose of obtaining an outcome. FOTO selected functional status because the majority of patients receiving rehabilitation therapies receive therapy to improve a deficit in functional status. Patient-centered outcome measures were selected to adequately capture the breadth of health concepts associated with a patient's perception of their functional status, which is the most important factor influencing their actual physical functioning and participation in daily activities. In addition, patient-reported outcome measures are responsive to individual patient preferences, needs and values, and help ensure that patients' values guide clinical treatment decisions. Patient self-report measures are the gold standard for measuring outcomes to determine the value of care from rehabilitation services. Patient self-report outcomes have been consistently endorsed by the World Health Organization, the Institute of Medicine, the National Quality Forum, the US Department of Health, the Centers for Medicare and Medicaid Services, and by all current national and international clinical practice guidelines.
Measuring outcomes and collecting data in today's busy rehabilitation environments can be time consuming and hectic. FOTO recognized early that clinicians and clinic managers had a strong interest in ways of collecting precise outcome measures efficiently. In 1999 FOTO began to explore the development of computer adaptive testing (CAT) to replace paper and pencil surveys. FOTO's goals for developing CAT outcome measures in outpatient rehabilitation were to make data collection more efficient, reduce the number of 'irrelevant' questions administered to specific patients, and improve measure sensitivity to change. In 2002 FOTO implemented a general physical Functional Health Status CAT and mental health outcome surveys. Based on positive customer feedback and suggestions to refine item selection to specific patient impairments and to improve measure sensitivity, FOTO began in 2005 to develop and implement body part specific CAT outcome measures for patients with orthopedic impairments. Recently published results have supported the reduced patient burden and efficiency of assessing functional status using the FOTO CAT measures. On average it takes a patient 1-2 minutes to complete a FOTO CAT functional status survey.
FOTO currently has 1) seven body specific orthopedic CAT outcome measures for shoulder, elbow/wrist/hand, lumbar, hip, knee, ankle/foot, and general orthopedic (i.e., cervical, thoracic, ribs, craniomandibular) and 2) a general medical CAT outcome measure for complex medical and neurological patients. Recently, new CAT measures were developed and published for orthopedic neck conditions and pelvic floor dysfunction. Additional CAT measures for specific neurological and medical conditions, such as stroke, vestibular/balance disorders, and lymphedema, are on FOTO’s current development agenda.
For a patient reported outcome measure to be valid between patient groups, the difficulty level represented by the items (questions) of that measure needs to be perceived in a similar manner by the different patient groups (e.g., females vs. males, different age groups, etc.). Additionally, as FOTO expanded into countries outside of the U.S., i.e., Israel and more recently Canada, FOTO realized that cultural differences may influence the way patients answer survey questions. It is possible that observed differences in survey functional status (FS) scores across cultural groups represent differences in item translation or cultural perceptions rather than a true difference in functional status. Differential item functioning (DIF) analysis addresses this aspect of validity by estimating measurement error related to such group and cross cultural differences. When DIF is present, the validity of measurement estimates of FS is decreased. Published data to date reported negligible DIF impact in the adaptive test of functional status for knee impairments between 1) patients in the U.S. who speak English and patients in Israel who speak Hebrew and 2) patients in Israel who speak Hebrew or Russian. Results supported the validity of the translation of knee FS items into Hebrew and Russian. Future testing of cross cultural DIF in multiple languages for all of FOTO’s body specific measures is being planned.
Clinicians frequently need to choose between different outcome measures when assessing a patient’s functional status during everyday clinical practice. Traditionally, therapists most often select a patient self-report outcome measure (PRO) based on familiarity, such as the Oswestry questionnaire (ODQ) or the Lower Extremity Functional Scale (LEFS). However, the process of collecting outcomes data is evolving from paper and pencil surveys to surveys administered by computer adaptive testing (CAT). Research has recently stimulated interesting discussions about the clinical strengths and weaknesses of different methods of survey administration during everyday clinical practice.
To assist therapists with selecting an outcome measure, FOTO has 1) investigated a head to head psychometric comparison of discriminating ability between the Oswestry questionnaire and FOTO’s Lumbar CAT (LCAT) measure, 2) developed CAT administered outcome surveys specifically examining all items from the LEFS, and 3) developed cross walk tables between scores from FOTO’s body specific CAT measures and scores from other traditionally used tools such as the DASH.
Discriminating ability of functional status (FS) estimates from FOTO’s Lumbar computer adaptive test (LCAT) and the Oswestry (ODQ) measure was estimated using relative validity, calculated by dividing F values from LCAT and ODQ analyses of covariance for important risk-adjustment variables. Minimal clinically important improvement (MCII) was estimated using receiver-operating-characteristic analyses by quartiles of intake FS values, and areas under the curves were compared. Published results indicated that both measures had similar psychometric characteristics. However, the LCAT FS estimates tended to be more discriminating than ODQ FS estimates, and MCII cut scores by quartile of intake FS favored the LCAT. In addition, given the need to be efficient and precise in estimating measures of FS, results favor the LCAT because the time required to complete FOTO’s LCAT is less than 2 minutes, compared to the estimated 5 or more minutes required for patients to complete and clinicians to score the Oswestry questionnaire.
FOTO applied advanced mathematics and published CAT administered outcome surveys specifically developed using the items in the LEFS. Put simply, FOTO developed a CAT from the LEFS. Hart and Stratford examined 1772 patients with lower extremity impairments including hip (n=444), knee (n=949) and ankle/foot (n=379). Their results supported 3 body specific CATs from the LEFS, i.e., hip, knee, and ankle/foot. The CATs were 70% more efficient than asking the patient to complete all 20 items in the paper and pencil LEFS, while maintaining psychometric precision. Greater efficiency translates to reduced time for patients to complete the FS measure. Reduced patient burden, i.e., saving time, in fast-paced and often hectic outpatient environments is a strong clinical advantage for assessing functional status without reducing measurement precision.
FOTO is developing cross walk tables between the scores from FOTO’s body specific CAT measures and traditional outcome tools, for example between the FOTO Shoulder CAT and the DASH. If a clinician used the FOTO shoulder FS CAT and wanted to know what the score would have been had the patient completed the DASH, the cross walk table is a mathematically accurate tool to equate the 2 measures. For example, if FOTO’s shoulder CAT score (scaled 0-100) is 30 points, the DASH score (scaled 0-100) would be 21 points.
Reliability. Internal consistency (person reliability) was calculated utilizing IRT methods. "Person reliability" is equivalent to the traditional "test" reliability of Cronbach’s alpha. Published person reliability results were 0.92, 0.96, 0.96, 0.96, 0.97, and 0.95 for lumbar, knee, hip, ankle/foot, shoulder, and elbow/wrist/hand respectively.
Validity. Known group construct validity methods were used to assess the ability of the CAT generated functional status (FS) measures to discriminate groups of patients. Validity was tested using one-way ANCOVAs with functional status change as the dependent variable and intake FS as the covariate, with one ANCOVA for each risk-adjustment variable as the independent variable. Post hoc Scheffé analyses were run for significant main factors of each independent variable. The independent variables assessed included intake FS, age, symptom acuity, surgical history, condition complexity and prior exercise history. Overall the results were clinically logical and supported the known group construct validity for FOTO body part specific measures. For example, with the FOTO hip FS measure, patients who were older, had more chronic symptoms, had more surgeries, had more comorbidities, or did not exercise prior to receiving rehabilitation reported worse (i.e., lower) discharge FS than other patients on each independent variable after controlling for intake FS.
Sensitivity. Sensitivity was tested using two distribution-based approaches. First, effect size statistics were estimated as follows: (discharge FS minus intake FS)/(intake FS standard deviation). Second, we assessed minimal detectable change (MDC), which is defined as change greater than measurement error. MDCs were based on the average measurement error (standard error or SE) at 10 levels of intake functional status, which represented conditional standard errors of measurement (CSEM). For each of the 10 scale ranges, we estimated the average SE using the intake data and multiplied that average SE by 1.96 times the square root of 2 to obtain the MDC at the 95% confidence level. The proportion of patients with change scores equal to or greater than the MDC at the 95% confidence level was reported. Overall the results supported the sensitivity of FOTO body part specific measures. For example, for the FOTO Lumbar CAT measure, 66% of patients attained FS change scores equal to or greater than the MDC at the 95% confidence level.
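The two distribution-based calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not FOTO's implementation: the function names, the sample scores, and the average SE of 3 points are hypothetical, and a single MDC is applied to all patients here, whereas the text describes a separate MDC for each of 10 ranges of intake FS.

```python
import math
from statistics import mean, stdev


def effect_size(intake, discharge):
    """Mean FS change divided by the standard deviation of intake FS."""
    return (mean(discharge) - mean(intake)) / stdev(intake)


def mdc95(average_se):
    """MDC at the 95% confidence level: average SE x 1.96 x sqrt(2)."""
    return average_se * 1.96 * math.sqrt(2)


def proportion_meeting_mdc(intake, discharge, mdc):
    """Share of patients whose FS change is equal to or greater than the MDC."""
    changes = [d - i for i, d in zip(intake, discharge)]
    return sum(change >= mdc for change in changes) / len(changes)


# Hypothetical scores on the 0-100 FS scale, for illustration only.
intake_fs = [40, 50, 60]
discharge_fs = [45, 70, 80]
mdc = mdc95(3.0)  # assumes an average SE of 3 FS points in this scale range
share = proportion_meeting_mdc(intake_fs, discharge_fs, mdc)
```

With these made-up numbers, two of the three patients change by at least the MDC, so `share` is two thirds.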
Responsiveness. Responsiveness was tested using an anchor-based approach by calculating the proportion of patients whose FS change was equal to or greater than the minimal clinically important improvement (MCII), which is change considered important to the patient. We used the global rating of change (GROC) scale described by Jaeschke et al as the comparison standard. Patients were dichotomized by their GROC scores as patients who did not improve (i.e., GROC scores < 3) versus patients who improved (i.e., GROC scores ≥ 3). We estimated MCII (1) using all patients regardless of intake FS measure, and (2) because change is dependent on baseline FS measures, using patients grouped by quartile of baseline FS scores. Area under the receiver operating characteristic (ROC) curve (AUC), SE and 95% CI were used to describe the ROC results. The percent of patients whose FS change was equal to or greater than MCII was calculated. Overall the results supported the responsiveness of FOTO body part specific measures. For example, for the FOTO Lumbar and Knee CAT measures, 70% of patients had FS change equal to or greater than MCII.
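As a rough illustration of the anchor-based approach, the sketch below dichotomizes patients at GROC ≥ 3, computes the AUC, and picks a cut score for FS change. The text does not state how the MCII cut score was chosen from the ROC curve; maximizing sensitivity + specificity - 1 (Youden's index) is an assumption made here, and all names and data are hypothetical.

```python
def split_by_anchor(changes, groc, threshold=3):
    """Separate FS change scores into improved (GROC >= threshold) and not improved."""
    improved = [c for c, g in zip(changes, groc) if g >= threshold]
    not_improved = [c for c, g in zip(changes, groc) if g < threshold]
    return improved, not_improved


def auc(improved, not_improved):
    """Area under the ROC curve: probability that an improved patient's
    FS change exceeds a non-improved patient's, with ties counted as half."""
    wins = sum((p > n) + 0.5 * (p == n) for p in improved for n in not_improved)
    return wins / (len(improved) * len(not_improved))


def mcii_cut_score(improved, not_improved):
    """Cut score maximizing sensitivity + specificity - 1 (assumed criterion)."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(improved + not_improved)):
        sensitivity = sum(p >= cut for p in improved) / len(improved)
        specificity = sum(n < cut for n in not_improved) / len(not_improved)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut


# Hypothetical FS change scores and GROC ratings, for illustration only.
fs_change = [2, 4, 5, 12, 15, 20]
groc_scores = [1, 0, 2, 3, 5, 4]
improved, not_improved = split_by_anchor(fs_change, groc_scores)
area = auc(improved, not_improved)
cut = mcii_cut_score(improved, not_improved)
```

Running the same functions separately on patients grouped by quartile of baseline FS would give the quartile-specific estimates mentioned in the text.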
Risk adjustment is an essential statistical method to control for the influence of confounders such as the diversity and complexity of patients attending outpatient rehabilitation clinics. For example, differences in outcomes between your patients and those of other providers may be due to the fact that your patients received superior treatment, or may simply be due to differences in the characteristics of the patients you are managing, such as the patient’s age, gender, duration of symptoms or medical complexity. Without appropriate risk adjustment, patient functional status outcomes cannot be interpreted in a meaningful manner when comparing outcomes between different patients and providers. Observational data must be risk adjusted in order to identify effective interventions used during everyday clinical care. Despite best efforts to risk adjust, the reality is that there may be unmeasured or unknown factors over which therapists have very little control, such as patient motivation or adequate financial support to access care. Although risk adjustment has limitations, i.e., not all important factors are measured or known, the benefits of risk adjusting far outweigh the potential risks of not risk adjusting observational outcomes data.
FOTO developed models for predicting discharge functional status. The models were generated by applying multiple linear regression statistical methods controlling for important patient characteristics known to influence the prediction of the patient’s function. FOTO’s initial risk adjusted prediction model was first published for the CMS funded Pay-for-Performance grant research project (Hart and Connolly 2006).
FOTO currently risk adjusts for 1) six non-modifiable patient variables i.e., gender, age, duration of symptoms, surgical history, medical comorbidities, payer type and 2) two modifiable patient variables i.e., intake severity of functional status and fear avoidance beliefs of physical activity.
In addition to the independent variables included in FOTO’s model mentioned above, other treatment processes, variables, and statistical methods not currently included in FOTO’s risk adjusted model have been published. For example, to help control for potential non-random variation in certain treatment processes, Resnik et al (Phys Ther 2008) developed advanced analyses using several types of hierarchical linear statistical models. The purpose of the advanced modeling techniques was to account for nonrandom clustering of (1) patients nested within physical therapists only (physical therapist being the random factor), (2) patients nested within clinics only (clinic being the random factor), and (3) a multilevel model with patients nested within physical therapists who were nested within clinics. In simple terms, the models helped to explain the potential influence of why some patients are referred to certain therapists and why certain therapists work at specific clinics, and the effect that patient and therapist nesting have on predicting outcomes.
FOTO’s risk adjustment is a fluid process in that other variables and statistical analyses identified in the research will be tested. Depending on test results, new variables may be included or existing variables deleted from FOTO’s current model. FOTO’s goal is to produce the most parsimonious model for predicting patients’ functional status outcomes.
Patient’s Functional Status Score. A functional status score is produced when the patient completes the FOTO measure (administered by computer adaptive testing or a paper and pencil survey). The functional status score is continuous and transformed to a linear metric using a modern statistical approach named Item Response Theory (IRT). Scores generally range from 0 (low function) to 100 (high function). The survey is standardized, and the scores are validated for the measurement of function for a specific body part impairment: lumbar, hip, knee, ankle/foot, shoulder, elbow/wrist/hand, and general orthopedic (i.e., cervical, thoracic, ribs, craniomandibular).
Patient’s Functional Status Change Score. A functional status change score is calculated by subtracting the Patient’s Functional Status Score at Admission from the Patient’s Functional Status Score at Discharge.
Predicted Functional Status Change Score. Functional Status Change Scores for patients are risk adjusted using multiple linear regression methods that include the following independent variables: Patient’s Functional Status Score at Admission, patient age, symptom acuity, surgical history, gender, number of comorbidities, payer type, and level of fear-avoidance. The Patient’s Functional Status Change Score is the dependent variable. The statistical regression produces a Risk-Adjusted Predicted Functional Status Change Score.
Risk-adjusted Functional Status Change Residual Score. The difference between the actual change and predicted change scores (after risk adjustment) is the residual score and should be interpreted as the unit of functional status change different than predicted given the risk-adjustment variables of the patient being treated. As such, the residual score represents risk-adjusted change corrected for patient characteristics. Residual scores of zero (0) or greater (> 0) should be interpreted as functional status change scores that were predicted or better than predicted given the risk-adjustment variables of the patient. Residual scores less than zero (< 0) should be interpreted as functional status change scores that were less than predicted given the risk-adjustment variables of the patient.
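The relationship between the change score, the predicted change score, and the residual score can be sketched as follows. This is a simplified illustration under stated assumptions, not FOTO's regression model: it fits ordinary least squares with only two of the eight risk-adjustment variables (intake FS and age), the helper names are hypothetical, and the data are invented so that the fit is exact.

```python
def fit_ols(predictors, outcomes):
    """Ordinary least squares via the normal equations (X'X)b = X'y.
    An intercept column is added; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, outcomes))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]


def predict_change(coefficients, patient):
    """Risk-adjusted predicted FS change for one patient."""
    return coefficients[0] + sum(b * v for b, v in zip(coefficients[1:], patient))


# Hypothetical historical data: (intake FS, age) and observed FS change.
history = [(30, 40), (50, 60), (70, 35), (40, 70), (60, 50)]
observed_change = [14.0, 4.0, -1.5, 7.0, 1.0]
model = fit_ols(history, observed_change)

# A new patient: intake FS 50, age 60, discharge FS 60.
actual_change = 60 - 50                      # discharge minus admission
predicted = predict_change(model, (50, 60))  # risk-adjusted prediction
residual = actual_change - predicted         # > 0 means better than predicted
```

In this made-up case the patient gained 10 points where about 4 were predicted, so the residual of about 6 reads as functional status change better than predicted for a patient with these characteristics.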
Aggregated risk-adjusted residual scores: The average of residual scores from a provider (clinician or clinic). The aggregated scores are used to make comparisons between providers.
Step 1. The patient completes FOTO’s functional status survey at admission, which generates the Patient’s Functional Status Score at Admission (Intake),
Step 2. The patient completes FOTO’s functional status survey at or near the time of treatment Discharge, which generates the Patient’s Functional Status Score at Discharge,
Step 3. The Patient’s Functional Status Change Score (raw, non-risk-adjusted) is generated
Step 4. A Risk-adjusted Predicted Functional Status Change Score is generated using a regression equation
Step 5. A Residual Score is generated for each patient.
Steps 1-5 above and Step 6
Step 6. The average residual scores per clinician and/or clinic are calculated, and scores for all clinicians/clinics in the database are ranked. The quality score is the percentile of the clinician’s and/or clinic’s ranking. Based on reliability analyses at the provider level, FOTO recommends that clinicians have a minimum of 10 patients per year for lumbar, elbow/wrist/hand, and general orthopedic; 20 for knee, ankle/foot, and shoulder; and 30 for hip. Clinics should have a minimum of 10 patients per therapist per year for small clinics, or 40 patients per year for larger clinics (5 or more clinicians), in order to obtain stable estimates of provider performance.
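Step 6 can be sketched as a small aggregation over patient residual scores. The code below is illustrative only: the function names and data are hypothetical, and because the text does not define the percentile formula, the percent of providers with a mean residual at or below the provider's own is used here as an assumption. The `min_patients` parameter stands in for the minimum sample sizes recommended above.

```python
from collections import defaultdict


def provider_mean_residuals(records, min_patients=10):
    """Average residual per provider, dropping providers with too few patients.
    `records` is an iterable of (provider_id, residual_score) pairs."""
    by_provider = defaultdict(list)
    for provider_id, residual in records:
        by_provider[provider_id].append(residual)
    return {
        provider_id: sum(values) / len(values)
        for provider_id, values in by_provider.items()
        if len(values) >= min_patients
    }


def percentile_ranks(means):
    """Percent of providers whose mean residual is at or below each provider's
    (an assumed definition of the percentile ranking)."""
    values = list(means.values())
    return {
        provider_id: 100.0 * sum(v <= m for v in values) / len(values)
        for provider_id, m in means.items()
    }


# Hypothetical residual scores; a threshold of 2 keeps the example small.
records = [("A", 2.0), ("A", 4.0), ("B", -1.0), ("B", -3.0),
           ("C", 0.0), ("C", 1.0), ("D", 5.0)]
means = provider_mean_residuals(records, min_patients=2)
ranks = percentile_ranks(means)
```

Provider D is excluded for having a single patient, which mirrors the point that small samples do not give stable estimates of provider performance.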
FOTO has published extensive work on the topic of functional staging to assist clinicians in interpreting FS CAT measurement scores in a manner that is clinically meaningful. For example, a patient with low back pain was seen for therapy and had an intake FS score of 33 on a scale from 0 (lowest function) to 100 (highest function). What does a score of 33 mean with regard to the patient’s current physical capabilities? A score of 33 provides only a general sense of the patient’s status, which is insufficient when a clinician wants to understand the patient’s actual functional abilities. To improve clinically meaningful interpretations of the FOTO CAT FS measurement scores, FOTO developed and published body specific functional classification systems; the one for patients with low back pain is referred to as the Back Pain Function Classification System (BPFCS).
The BPFCS is based on both the International Classification of Functioning, Disability and Health (ICF) framework of activities performed and a clinically logical hierarchical progression of functional stages paralleling the numeric FS scores from 0 (“low FS”) to 100 (“high FS”). The BPFCS assigns a numeric score from the FOTO CAT FS measure to one of five functional stages capturing the patient’s current physical abilities. The higher the stage, the higher the patient’s perceived functional capabilities. The 5 hierarchical functional levels are: (1) is exceedingly limited in the ability to perform easy, routine functions; (2) exhibits extreme difficulty performing usual work or household activities; (3) exhibits moderate difficulty performing usual work or household activities; (4) exhibits little difficulty performing usual work or household activities and hobbies; and (5) back to normal life performing rigorous daily activities. Each stage outlines the patients’ perceived levels of difficulty in performing their usual ADL, work, and/or recreational activities across a wide range of physical tasks. To date, functional staging classification models have been published for the following FOTO specific body part CAT FS measures: shoulder, lumbar, hip, knee, and ankle/foot impairments.
The debates and evidence regarding how to identify a clinical expert have been ongoing for many years. The debates have centered around which clinician characteristics identify a clinical expert e.g., years of clinical experience, therapist’s age, specialty certification, or recognition by peers. Although a consensus on how to define a clinical expert has not been agreed upon, research using FOTO’s extensive national database has contributed extensively to the topic of defining a physical therapy clinical expert. Resnik et al (Phys Ther 2003) published a novel approach using the clinicians’ outcome data for classifying therapists as clinical experts. When defining expertise based on outcomes, Resnik and other researchers reported that years of experience, specialty certification, and types of manual therapeutic interventions were not important traits in describing clinical experts. Instead, their findings supported a growing body of evidence that certain other clinician characteristics are more useful for explaining what an expert is. For instance, therapists classified as expert had a patient-centered approach to care, enhanced collaborative clinical reasoning, and promoted patient self-care and empowerment. Therapist-patient interaction and relationships appear to be associated with better patient outcomes.
Extensive work has been published using FOTO data to examine 1) the inter-rater reliability and outcomes of the McKenzie method for evaluating and managing patients, 2) the prevalence of centralization and directional preference as subgroup classification categories within the McKenzie classification system for patients with low back pain attending physical therapy outpatient settings, 3) the association between centralization and directional preference and pain and functional status outcomes, and 4) a head-to-head comparison between the prevalence of subgroups using clinical prediction rules and the McKenzie system. In numerous articles examining the FOTO database, Werneke et al reported that the prevalence of centralization and directional preference varied considerably depending on the duration of patients’ symptoms and age. For example, older patients and patients with chronic pain had a lower prevalence rate for centralization than younger patients with acute or subacute pain. Despite the association between prevalence rates for centralization and directional preference and the patient’s age and symptom acuity, if centralization occurred during the assessment, consistent and positive associations were reported between these signs and symptoms and better outcomes. In addition, Werneke et al reported that subgroup classifications were not mutually exclusive between different classification paradigms; these findings were consistent with results published by many other authors using different patient samples.
Managing patients with musculoskeletal impairments from a psychosocially informed perspective has been recommended as evidence-based practice by all recent clinical practice guidelines. Using FOTO’s extensive database, George et al 2011 studied the scope of the prevalence and impact of depressive symptoms on outcomes for patients attending physical therapy for a wide variety of different anatomical regions. George et al reported that depressive symptoms had a consistent detrimental influence on outcomes and recommended screening patients, regardless of anatomical impairment, for depressive symptoms in physical therapy outpatient settings. In addition, 2 recent studies using FOTO data by Hart et al 2011 and Werneke et al 2011 not only recommended screening by physical therapists for psychosocial distress during the initial evaluation but also serially throughout the treatment episode. Both authors reported that intake psychosocial risk classification changes from high or medium risk to low risk over the course of the treatment episode. Change in psychosocial risk explained more of the variance in outcomes compared to the psychosocial risk screened at baseline. FOTO offers clinicians many optional psychosocial screening surveys for routine practice including: STarT, depressive symptoms, somatization, catastrophizing, self-efficacy, and fear-avoidance beliefs.
FOTO data have been used to examine the effectiveness of therapeutic interventions using both observational and randomized controlled trial designs. The relationship between patient outcomes and treatment processes and interventions was clearly shown by Deutscher et al (Arch Phys Med Rehabil 2009) in a large prospective, observational cohort study using data from Israel’s large FOTO patient database. Deutscher et al examined outcomes from 22,019 adult patients seeking treatment for lumbar spine, knee, cervical spine, or shoulder impairments at any of Maccabi Healthcare Services’ 54 community based outpatient physical therapy (PT) clinics in 2005-2008. Functional status (FS) data were collected at intake and discharge from therapy, using FOTO’s body part specific CAT patient reported surveys. Associations between demographic and health characteristics at intake and treatment process variables with discharge FS were evaluated using multivariable linear regression methods. After controlling for patient characteristics, the following treatment processes were found to be significantly associated with discharge FS: good compliance with attendance and home exercise program (for all impairments), and waiting time between referral and initiation of PT (lumbar impairments). The following treatment interventions were positively associated with discharge FS: joint mobilization (cervical and knee), stabilization exercises (lumbar), proprioceptive exercises (knee), passive movements (shoulder), group exercise (cervical), and stretching exercises (shoulder). The following treatment interventions were negatively associated with discharge FS: shortwave therapy (knee, shoulder), therapeutic ultrasound (shoulder), cold packs (knee), group exercise (knee and shoulder), and neural mobilization (shoulder).
Although this study did not examine hip, ankle/foot, or elbow/wrist/hand patients, we believe that the study’s results provide initial evidence of the relationship between treatment processes, therapists’ interventions, and patient outcomes.
In a single-blind randomized clinical trial using FOTO data, Saban et al examined the effectiveness of different interventions commonly used to treat plantar heel pain syndromes. The authors reported that treatment consisting of deep massage therapy to the posterior calf muscles and neural mobilization with a self-stretch exercise program was significantly more effective in treating plantar heel pain than ultrasound with a self-stretch treatment program.
Pay-for-Performance (P4P) is an alternative value-based payment method proposed for rehabilitation services to replace expensive traditional fee for service payment approaches. The aim of implementing P4P would be to align financial incentives with the implementation of rehabilitation care processes based on best clinical practices and the achievement of better patient outcomes. If financial incentives could be aligned with best practices and patient outcomes, alternative payment methods could be developed for rehabilitation services designed to compensate rehabilitation therapists based on the value of their patient care. In order not to penalize clinicians for treating the sickest patients, the P4P model would require sophisticated risk adjustment and the use of psychometrically sound outcome measures. The P4P vision of aligning financial incentives with value of service has been endorsed by the Institute of Medicine, the Centers for Medicare and Medicaid Services, and the National Quality Forum, as well as stakeholders such as the American Physical Therapy Association.
FOTO was the first to 1) design a P4P model funded by a CMS grant, 2) demonstrate the feasibility of implementing a pay-for-performance process in outpatient physical and occupational therapy, 3) provide information to Medicare concerning payment policy for outpatient physical and occupational therapy, 4) discuss implications for the development of an alternative payment method for rehabilitation services, and 5) publish the P4P results on Medicare’s website (Hart & Connolly 2006). Other payers are currently testing FOTO’s P4P model in different regions of the US.