7.3 Specific Outcome Measures

Author: Michael Lamont

As the resources and skills available for evaluations are often limited, it is important to decide at the planning stage of a programme what will be measured and why (Dugdill & Stratton, 2007). For the purpose of these guidelines we have grouped outcome measures into three categories:

  • Physical outcome measures
  • Psychological outcome measures
  • Social functioning outcome measures

Physical outcome measures

There are a variety of methods available for measuring sport/physical activity levels but there is no gold standard (Welk, 2002).

Key parameters for physical activity measurement include:

  • Frequency (when/how often does it occur)
  • Intensity (how hard is the activity)
  • Time (the duration)
  • Type of activity (walking, running, swimming etc.).

Dugdill & Stratton (2007) stated the challenge is selecting a measure that is valid, accurate and reliable, defining a valid instrument as one that measures what it purports to measure and a reliable measure as one that produces stable and repeatable results when used under the same conditions.

Self-report tools

Self-report tools are among the most commonly used outcome measures as they are less time-consuming than other measures and are easily administered. Self-report tools can take the form of diaries, questionnaires or interviews. They can be completed either by the participant or by another person significant in the participant’s life, e.g. a parent, carer, sibling or child.

A person’s ability to understand a survey question, and to accurately recall and communicate their activity pattern, will affect the reliability and validity of their self-report. The most reliable tools therefore tend to be 3-day or 7-day recall tools (e.g. a participant recalling and communicating their level of physical activity participation over the past 3 or 7 days) (Dugdill & Stratton, 2007). These self-report tools are recommended as they have shown adequate reliability and validity in large populations (Welk, 2002).

International Physical Activity Questionnaire Revised (IPAQ): The IPAQ (Booth, 2002) comprises a set of 4 questionnaires. The purpose of the questionnaires is to provide common instruments that can be used to obtain internationally comparable data on health–related physical activity.

Global Physical Activity Questionnaire (GPAQ) (“WHO: GPAQ Questionnaire”): The GPAQ covers several components of physical activity including intensity, duration, and frequency.

The Active Lives Questionnaire (“Sport England”, 2015): The Active Lives Questionnaire collects data on demographics, geography and engagement in sport/physical activity and takes roughly 15 minutes to complete.

The Physical Activity Questionnaire for Older Children (PAQ-C) and Adolescents (PAQ-A) (Crocker et al., 1997): The PAQ-C and PAQ-A are self-administered 7-day recall questionnaires, assessing general levels of physical activity in 9 to 15 year old children using 10 questions. There are no valid questionnaires for children under the age of 9.

The Borg Scale of Perceived Exertion (Borg, 1982): The Borg RPE scale is a self-report rating scale assessing a participant’s level of perceived exertion during physical activity.

Heart rate monitors

Heart rate monitors usually come in the form of a belt that fits around the chest, detects electrical impulses from the heart and converts these to beats per minute. These data are either stored in the belt or transmitted to a receiver in the form of a wristwatch. Heart rate monitors can be programmed to record heart rate second by second or minute by minute (the recording interval is called an epoch) continuously for weeks. The main advantage of heart rate monitoring is the relatively low participant burden and the ease with which data are collected and analysed. The instruments require a PC for collected data to be downloaded. Although these tools provide objective measures of physical activity, the cost and technical expertise required for use and data analysis may restrict the feasibility of use for evaluation purposes.


Accelerometers

These small devices are usually placed on the waistband (or on the wrist for wheelchair users) and record the vertical (uni-axial) or vertical, horizontal and diagonal (tri-axial) acceleration of the body. These accelerations are then converted to gravitational counts per epoch duration. These instruments can record in second-by-second or minute-by-minute epochs (Dugdill & Stratton, 2007). As with heart rate monitors, the cost and technical expertise required for the use and data analysis of accelerometers may restrict the feasibility of their use for evaluation purposes.
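As an illustration of what an epoch means in practice, the following minimal sketch sums second-by-second counts into minute-by-minute epochs. The function name and the example values are hypothetical and are not taken from any particular device or manufacturer software.

```python
from typing import List


def rebin_counts(counts_per_second: List[int], epoch_seconds: int = 60) -> List[int]:
    """Sum second-by-second counts into longer epochs (60 s by default).

    A final, incomplete epoch is still summed and returned.
    """
    if epoch_seconds <= 0:
        raise ValueError("epoch_seconds must be positive")
    return [
        sum(counts_per_second[i:i + epoch_seconds])
        for i in range(0, len(counts_per_second), epoch_seconds)
    ]


# Hypothetical recording: three minutes of 1-second epochs (made-up values).
example = [2] * 60 + [0] * 60 + [5] * 60
minute_counts = rebin_counts(example)
print(minute_counts)  # [120, 0, 300]
```

Shorter epochs keep more detail about brief bursts of activity, while longer epochs are easier to store and summarise.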


Pedometers

Pedometers provide information on walking. A person’s individual data such as stride length, body weight and age can be input into some pedometers. The incorrect input of stride length is arguably the largest cause of error in estimating physical activity energy expenditure and distances covered during walking. The best use of pedometers is for recording steps, and pedometers should always be manually checked for counts by using a calibrated shaker table or by hand (by counting each shake 1, 2, 3 etc. and checking against the device) (Dugdill & Stratton, 2007). For representative data to be obtained, it is advised that participants wear a pedometer for 3 days (Tudor-Locke et al., 2005).
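The manual check described above can be summarised as a percentage difference between the hand count and the device reading. This is a minimal sketch only; the function name and the example numbers are hypothetical, and no acceptable error threshold is implied.

```python
def count_error_percent(manual_count: int, device_count: int) -> float:
    """Percentage by which the device reading differs from the manual count.

    Negative values mean the pedometer under-counted.
    """
    if manual_count <= 0:
        raise ValueError("manual_count must be positive")
    return 100 * (device_count - manual_count) / manual_count


# Hypothetical check: 100 shakes counted by hand, 97 registered by the device.
print(count_error_percent(100, 97))  # -3.0
```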

Pedometers are a low-cost method of generating accurate and reliable data (depending on the quality of the pedometer; Schneider et al., 2004). The daily target for physical activity is 10,000 steps per day (Tudor-Locke and Bassett, 2004) for persons without physical disabilities. However, 15,000 and 12,000 steps have been recommended for male and female adolescents, respectively (Tudor-Locke et al., 2004). The key aspect for an activity intervention is not necessarily the debate over the number of steps but whether total steps increase as a result of engaging in the intervention. Recent pedometer evaluations in schools have suggested that pedometers work as motivational tools (Butcher et al., 2007) and stimulate increases in physical activity (Dugdill & Stratton, 2007).
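A simple way to look at whether steps increase is to compare mean daily steps before and after an intervention. The sketch below assumes three days of wear at each time point; the step values and function names are made up for illustration, and only the 10,000-step figure comes from the text above.

```python
from statistics import mean
from typing import Sequence

DAILY_TARGET = 10_000  # adult target cited in the text above


def mean_daily_steps(daily_steps: Sequence[int]) -> float:
    """Average steps per day over the days the pedometer was worn."""
    if not daily_steps:
        raise ValueError("at least one day of data is required")
    return float(mean(daily_steps))


def step_change(baseline: Sequence[int], follow_up: Sequence[int]) -> float:
    """Change in mean daily steps from baseline to follow-up."""
    return mean_daily_steps(follow_up) - mean_daily_steps(baseline)


# Hypothetical participant: three days of wear before and after a programme.
baseline = [6000, 7000, 6500]
follow_up = [8000, 8500, 7500]

print(step_change(baseline, follow_up))             # 1500.0
print(mean_daily_steps(follow_up) >= DAILY_TARGET)  # False
```

In this made-up case the participant remains below the daily target but has increased their steps, which is the outcome of interest for the intervention.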

Berg balance scale (BBS) (Berg, Wood-Dauphinée, Williams & Maki, 1992)

The BBS is a qualitative measure that assesses balance through the performance of functional activities such as reaching, bending, transferring and standing. It incorporates most components of postural control: sitting and transferring safely between chairs; standing with feet apart, feet together, in single-leg stance, and with feet in the tandem Romberg position with eyes open or closed; and reaching and stooping down to pick something off the floor. Each item is scored on a 5-point scale, ranging from 0 to 4, each grade with well-established criteria. Zero indicates the lowest level of function and 4 the highest level of function. The total score ranges from 0 to 56. The BBS is reliable (both inter- and intratester) and has concurrent and construct validity.
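The scoring arithmetic can be sketched as follows. An item range of 0 to 4 with a maximum total of 56 implies 14 items, which is the assumption made here; the function name and example scores are hypothetical, and the sketch does not replace the published scoring criteria.

```python
from typing import Sequence

N_ITEMS = 14        # implied by a 0-4 item range and a 0-56 total
MAX_ITEM_SCORE = 4


def bbs_total(item_scores: Sequence[int]) -> int:
    """Sum the item scores after checking their number and range."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if not isinstance(score, int) or not 0 <= score <= MAX_ITEM_SCORE:
            raise ValueError(f"item scores must be whole numbers from 0 to 4, got {score!r}")
    return sum(item_scores)


# Hypothetical assessment: ten items scored 4 and four items scored 3.
print(bbs_total([4] * 10 + [3] * 4))  # 52
```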

Systematic observation

Systematic observation involves a trained person observing and coding predetermined physical activity behaviours of participants over a set period of time, e.g. sitting, walking, running etc. The SPACES system (Systematic Pedestrian and Cycling Environmental Scan, Pikora et al., 2002) is an example of a comprehensive observation tool used to assess walking and cycling. Systematic observation requires observers to have undertaken specific training and can be used to assess participants in real time or from video recordings. Although it provides valid data, the time and specific training required for this technique may reduce its feasibility for evaluation purposes.

Psychological outcome measures

The Warwick-Edinburgh Mental Well-being Scale (WEMWBS) (Tennant et al., 2007): The WEMWBS was developed to enable the monitoring of mental wellbeing in the general population and the evaluation of projects, programmes and policies which aim to improve mental wellbeing. SWEMWBS is a shortened version of WEMWBS. This is a 7 item scale for which item scores need transforming.

Rosenberg Self-esteem Scale (RSES) (Rosenberg, 1965): The RSES is a 10-item scale that measures global self-worth by measuring both positive and negative feelings about the self. The scale is believed to be uni-dimensional. All items are answered using a 4-point Likert scale format ranging from strongly agree to strongly disagree.
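Scales that mix positively and negatively worded items are usually scored by reversing the negative items before summing. The sketch below is a generic illustration under stated assumptions: the intermediate response labels, the 1 to 4 point values and the set of negatively worded item numbers are all placeholders, and the published scoring instructions for the scale should be followed in practice.

```python
from typing import Sequence, Set

# Assumed labels and point values for a 4-point Likert format.
LIKERT_POINTS = {
    "strongly agree": 4,
    "agree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}


def score_likert_scale(responses: Sequence[str], negative_items: Set[int]) -> int:
    """Sum item scores, reversing items listed in negative_items (numbered from 1)."""
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        points = LIKERT_POINTS[answer.strip().lower()]
        if item_number in negative_items:
            points = 5 - points  # reverse the score: 4 -> 1, 3 -> 2, 2 -> 3, 1 -> 4
        total += points
    return total


# Hypothetical respondent who answers "agree" to all 10 items.
responses = ["agree"] * 10
negative_items = {2, 5, 6, 8, 9}  # placeholder item numbers for illustration
print(score_likert_scale(responses, negative_items))  # 25
```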

General Self-Efficacy Scale (GSE) (Schwarzer & Jerusalem, 2010): This scale was created to assess a general sense of perceived self-efficacy, with the aim of predicting coping with daily hassles as well as adaptation after experiencing all kinds of stressful life events. The scale is designed for the general adult population, including adolescents. Persons below the age of 12 should not be tested. The measure has been used internationally with success for two decades. It is suitable for a broad range of applications. It can be taken to predict adaptation after life changes, but it is also suitable as an indicator of quality of life at any point in time.

The Life Satisfaction Questionnaire (LISAT) (Fugl-Meyer et al., 1991): The LISAT is a self- or interviewer-administered rating scale, taking approximately 5 minutes to administer. The LISAT-9 has 9 items; one is a global item for ‘life as a whole’ and 8 are domain-specific items for ‘vocational situation’, ‘financial situation’, ‘leisure’, ‘contact friends’, ‘sexual life’, ‘activities of daily living’, ‘family life’, and ‘partnership relationship’. The LISAT-11 has 11 items, which include the same items as the LISAT-9 but with two additions evaluating ‘physical health’ and ‘psychological health’.

World Health Organization Quality of Life Instrument (WHOQOL-BREF) (WHO, 1998): The WHOQOL-BREF instrument comprises 26 items, which measure the following broad domains: physical health, psychological health, social relationships, and environment. The WHOQOL-BREF is a shorter version of the original instrument that may be more convenient for use in large research studies or clinical trials. The questionnaire captures many subjective aspects of quality of life (QOL) and is one of the best known instruments for cross-cultural comparisons of QOL and is available in many languages.

The Beck Depression Inventory (BDI) (Steer, Beck, Brown, 1996): BDI is a 21-item self-reporting questionnaire for evaluating the severity of depression in normal and psychiatric populations. A shorter version of the questionnaire, the BDI Fast Screen for Medical Patients (BDI-FS), is available for primary care use. That version contains seven self-reported items each corresponding to a major depressive symptom in the preceding 2 weeks.

Evaluation of Social Functioning

The New Philanthropy Capital’s Outcomes Map: Personal and Social Well-being (Copps and Plimmer, 2013).

In this NPC publication, Copps & Plimmer (2013) defined personal and social well-being as a person’s state of mind, relationship with the world around them, and the fulfilment they get from life. It can be understood as how people feel and how they function, both on a personal and a social level, and how they evaluate their lives as a whole. It is linked to a range of other outcomes, including mental health.

Copps & Plimmer (2013) divided the measurement of personal and social well-being into 3 categories:

  1. Feelings about self.
  2. Relationships with family and friends.
  3. Perception and connectedness to the community.

1. Improved feelings of self

Examples of valid outcome measures are discussed in the Psychological outcome measures section. Other examples include:

  • The Self-concept Scale (10-items) (Marsh, 1992).
  • The Resilience Scale (14-item) (Wagnild and Young, 1987).
  • The Children’s Society’s Wellbeing Index (Rees, Goswami & Bradshaw, 2010).

2. Improved relationships with family and friends

Examples of valid outcome measures include:

The Multidimensional Students’ Life Satisfaction Scale (MSLSS) (Huebner, 2001)

The MSLSS is designed to provide a profile of children’s life satisfaction across key domains. The 40-item scale is completed by children and young people and captures information on five domains:

  • Family (7 items)
  • Friends (9 items)
  • School (8 items)
  • Living Environment (9 items)
  • Self (7 items)

There is also a 6-item Brief Multidimensional Students’ Life Satisfaction Scale.

The Friendship Scale (Hawthorne 2006)

This short, user-friendly 6 item scale measures 6 of the 7 important dimensions that contribute to social isolation and its opposite, social connection.

Lubben Social Network Scale–Revised (LSNS-R) (Lubben et al., 2002)

The LSNS-R is designed to gauge social isolation in older adults by measuring perceived social support received from family, friends and mutual supports (e.g. neighbours), including confidant relationships. The tool has an abbreviated version (LSNS-6) and an expanded version (LSNS-18) and takes approximately 5-10 minutes to administer.

UCLA Loneliness Scale – Revised (Russell, Peplau, & Cutrona, 1980)

This 20-item scale is designed to measure a person’s subjective feelings of loneliness and social isolation.

3. Improved perceptions of and connectedness to the community

Copps & Plimmer (2013) defined this as a person feeling part of a meaningful community or communities, feeling connected to the environment around them, and feeling included and involved. Approaches to measuring these aspects of well-being tend to be survey-based and depend on the responses of individuals to questions about their feelings and perceptions. Many of the tools tend to be very similar and are often derived from the same research base but differ slightly in length and emphasis. Overall, there is no firm consensus on what the best tools are.

In practice, where they are in use, well-being approaches tend to be combined with specific measures tailored to the intervention. As in many areas of measurement, there remains a skills gap in analysing and interpreting data. There is a clear need to create tools that are practical and can be applied by non-experts (Copps & Plimmer, 2013).


References

Berg, K. Wood-Dauphinee, S. Williams, J. I. & Maki, B. (1992). Measuring balance in the elderly: Validation of an instrument. Can. J. Pub. Health, supplement 2, S7-11.

Borg, G. A. (1982). Psychophysical bases of perceived exertion. Med sci sports exerc, 14(5), 377-381.

Butcher, Z. Fairclough, S. Stratton, G. & Richardson, D. (2007) The Effect of Feedback and Information on Children’s Pedometer Step Counts at School. Pediatric Exercise Science, 19(1).

Copps, J. & Plimmer, D. (2013). Outcomes Map: Personal and Social Well-being. NPC, retrieved from: https://www.thinknpc.org/?s=OUTCOMES+MAP%3A+PERSONAL+AND+SOCIAL+WELL-BEING.

Dugdill, L. & Stratton, G. (2007). Evaluating sport and physical activity interventions: a guide for practitioners. University of Salford.

Fugl-Meyer, A. R. Eklund, M. & Fugl-Meyer, K. S. (1991). Vocational rehabilitation in northern Sweden. III. Aspects of life satisfaction. Scandinavian journal of rehabilitation medicine, 23(2), 83-87.

Hawthorne, G. (2006). Measuring social isolation in older adults: development and initial validation of the friendship scale. Social Indicators Research, 77(3), 521-548.

Huebner, E. S. (2001). Manual for the multidimensional students’ life satisfaction scale. SC: University of South Carolina (unpublished paper provided by the author).

Lubben, J. (2002). Lubben Social Network Scale–Revised. Retrieved from: https://instruct.uwo.ca/kinesiology/9641/Assessments/Social/LSNS-R.html

Sportanddev.org, (2013). Retrieved from: https://www.sportanddev.org/en...

Sport England, (2015). The Active Lives Questionnaire. Retrieved from: http://www.activelivessurvey.o...

Sport England, (2017). Review of Evidence on the Outcomes of Sport and Physical Activity – A Rapid Evidence Review. Retrieved from: https://www.sportengland.org/media/11719/sport-outomes-evidence-review-report.pdf.

Pikora, T. J. Bull, F. C. L. Jamrozik, K. Knuiman, M. Giles-Cortie, B. & Donovan, R. J. (2002). Developing a reliable audit instrument to measure the physical activity environment for physical activity. American Journal of Preventive Medicine, 23 (3), 187-194.

Sport for Development Coalition, (2015). Sport for Development outcomes and measurement framework. Retrieved from: https://londonfunders.org.uk/s...

Steer, R.A., Beck A.T. & Garrison B (1986). Applications of the Beck Depression Inventory. In: Sartorius N, Ban TA, eds. Assessment of Depression. Geneva, Switzerland: World Health Organization, 121–142.

Stratton, G. Ridgers, N.D. Gobbii, R. & Tocque, K. (2005) Physical Activity Exercise, Sport and Health: Regional Mapping for the North-West. Retrieved from: www.nwph.net/pad/accessed.

Schwarzer, R. & Jerusalem, M. (2010). The general self-efficacy scale (GSE). Anxiety, Stress, and Coping, 12, 329-345.

RE-AIM, (2014). RE-AIM as a planning tool. Retrieved from: http://www.re-aim.org/re-aim-as-a-planning-tool/.

Rosenberg, M. (1965). Rosenberg self-esteem scale (RSE). Acceptance and commitment therapy. Measures package, 61, 52.

Russell, D. Peplau, L. A. & Cutrona, C. E. (1980). The revised UCLA Loneliness Scale: Concurrent and discriminant validity evidence. Journal of personality and social psychology, 39(3), 472-480.

Tennant, R. Hiller, L. Fishwick, R. Platt, S. Joseph, S. Weich, S. & Stewart-Brown, S. (2007). The Warwick-Edinburgh mental well-being scale (WEMWBS): development and UK validation. Health and Quality of life Outcomes, 5(1), 63.

World Health Organization: GPAQ Questionnaire. Retrieved 12.11.2018 from: http://www.who.int/ncds/surveillance/steps/GPAQ/en/

Welk, G. J. (2002). Physical activity assessments for health-related research. Human Kinetics.

World Health Organization, (1998). World Health Organization Quality of Life Instrument (WHOQOL-BREF). Retrieved from: http://www.who.int/substance_abuse/research_tools/whoqolbref/en/.