The Interface Usage Skills Test: An Open Source Tool for Quantitative Evaluation in Real Time for Clinicians and Researchers
Abstract
Assistive machines offer people with limited mobility the opportunity to live more independently. However, operating these machines poses risks to the safety of the human operator as well as the surrounding environment. Thus, proper user training is an essential step towards independent control and functional use of assistive machines. The human operator can use a variety of control interfaces to issue control signals to the device, depending on their residual mobility and level of injury. Proficiency in operating the interface of choice precedes skill in operating the assistive machine. In this systems paper, we present an open source tool for automatically and objectively quantifying user skill in operating various interface devices.
I INTRODUCTION
Assistive machines enable people with motor impairments or other mobility deficits to achieve a higher level of independence, community and social participation, and quality of life. Although these devices expand a person’s capabilities, if handled without proper training they can also pose safety hazards to the human using the device as well as to those around them. Thus, many individuals who could benefit from an assistive machine such as a powered wheelchair are barred from using one because of inadequate control proficiency [1].
Proficiency in handling the interfacing device precedes proficiency in handling the assistive machine. The prescription of the most appropriate interface device for an individual based on their preferences and capabilities, the adjustment of that interface to the person’s specific needs and abilities, and their subsequent training in the safe and skilled use of the interface are all integral to preparing the human to operate an assistive machine with greater success. For both clinicians and researchers to develop better-informed strategies and technologies that overcome the operational challenges still faced by current and potential powered wheelchair (PW) users, it is imperative to have a standard tool for the quantitative assessment of interface usage skill. However, there is a lack of objective quantitative measurements of interface use. Considering the specific case of a PW, there is a lack of objective quantitative assessment of the wheelchair user’s navigational control skills at each step of the process of gaining access to and using a PW: from the introduction of the individual to a PW by a wheelchair professional, to the selection of an appropriate interface in a wheelchair seating clinic, to ongoing training with a physical therapist. Clinicians use qualitative and subjective observations and their previous experience to gauge which settings and interfaces are suitable for the patient. Therapists commonly use the Wheelchair Skills Test (WST) [2], the current clinical standard for measuring wheelchair skills, to evaluate a patient’s performance by assigning a discrete capacity score, again through subjective observations. The System Usability Scale (SUS) is another commonly used tool, completed by the user, to assess the subjective usability of various products [3]. However, this tool does not provide a quantitative measure of the user’s skill in using the specific product. Access to analytics of the person’s interface usage skill can help identify areas of deficit, so that clinicians can provide better informed and targeted training and therapy toward the ultimate goal of improving functional assistive machine usage.
There is currently no standard assessment tool for evaluating a user’s ability to control an interface for driving an assistive device. We fill this gap with the work described in this systems paper. We present a real-time analytics suite, packaged as an app that can run on any computer platform, with a free beta release on the Android Play Store (source code: https://github.com/argallab/InterfaceUsageSkillsTest.git). The tool can be used by clinicians without additional training and provides on-the-fly statistics on multiple interface usage measures. Raw data are also stored, and can be used for a more detailed analysis of custom interface usage characteristics. We also contribute hardware specifications for an adaptor that allows common commercial interfaces used for assistive machines to communicate over Bluetooth with our assessment tool.
First, we cover a brief background of the relevant literature in Section II. We then provide a detailed description of our Interface Skills Test software and hardware evaluation tools in Sections III and IV, with a discussion of preliminary results. Implications for clinical and research use are covered in Section V. We conclude with our proposal for future work in Section VI.
II Background
In this section, we present a summary of related work on assistive machine skills assessment tools and characterizing interface usage.
II-A Powered Wheelchair Skill Measures
In the domain of functional rehabilitation, outcome measures span a wide range from kinesthetic and neurophysiological [4] measures of patients to global outcomes in terms of overall function and community reintegration [5].
The Wheelchair Skills Test (WST) is the state of the art in clinical evaluation of an individual’s ability to drive a PW, and is an intermediate-level outcome that lies between the two extremes cited above [6]. This measure consists of various tasks, including ascending and descending slopes and navigating through doorways. For each task, an observing therapist assigns a capacity and confidence score based on completion of the task and subjective safety. A score of 0 indicates failure to complete the task, 1 indicates that the user had some difficulty completing the task, and 2 indicates that they accomplished the task without any difficulty. This measure does not capture details of the exact difficulties the person experienced in completing specific tasks, such as control smoothness. Although the WST is a powerful assessment tool, its delivery is subject to clinician training [7]. The Powered Mobility Clinical Driving Assessment (PMCDA) is another assessment tool that is likewise observational [8].
Currently, there are no assessments that consider how an individual completes the skill in terms of objective measures of safety, such as distance to barriers, speed, or smoothness profiles [9]. Furthermore, in the assistive robotics domain, there is no standard way to assess user skill in operating a teleoperation interface. The most common performance measures used to evaluate the efficacy of assistive robotic systems are task completion time and tracking errors, or subjective questionnaires that do not assess objective user skill [10, 11, 12].
All of these assessments require a specific physical setup and, more importantly, provide qualitative observations of wheelchair driving skill rather than a quantitative analysis of interface usage. We propose that by looking one level of abstraction lower, at the interface operation itself, we can better identify areas of control deficit, which will in turn lead to improved overall wheelchair driving skill.
II-B Interface Usage Characterization
To our knowledge, there is no standard assessment tool for characterizing interface usage. Research studies of interface characterization have focused mainly on the neuromuscular and physical response of the human during manual control to study human sensing and response characteristics [13, 14, 15].
In the field of assistive technology, clinicians have been surveyed on the usefulness and adequacy of powered wheelchair control interfaces for their patients [16]. The results provide subjective evidence of steering and maneuvering difficulty, and of a need to integrate robot autonomy into conventional powered wheelchairs. However, the results do not provide quantitative information on interface usage, which could be exploited by assistive autonomy. Novel interface technologies have been developed for those with severe motor impairments, including an isometric joystick [17]. The authors compared task completion time and accuracy of the novel interface to a conventional hand-held joystick within a control population, but not an end-user population. In another work, the authors introduce a novel interface and compare user precision and performance between the target spinal cord injured (SCI) group and an uninjured control group, but do not compare performance against commercially available interfaces [18].
In the human-robot interaction domain, a study investigated the influence of video game usage on human-robot team performance [19]. The authors classify interfaces based on their inputs and outputs and use the information to create a framework for systematically evaluating interfaces in the Human Robot Interaction (HRI) domain. Prior work in remote teleoperation has also studied the effect of time delay and communication channel degradation on the quality of teleoperation and manual control [20].
In this work, we contribute an assessment tool for the quantitative characterization of interface usage.
III Interface Skills Test
In prior work, we introduced a variety of metrics to characterize interface usage skill [21]. In this work, we expand upon these measures and design an open source application, usable by anyone familiar with a smart phone, tablet, or computer, that computes the measures online in real time and stores the data within a user profile. No training is needed to use this tool, and analytical graphs are available to the clinician and researcher, as well as to the human operator of the assistive machine. In this section, we introduce the Interface Skills Test and describe the key variables that impact interface usage, the configurable settings, and the outcome measures.
III-A Tasks
The assessment consists of a series of tasks that evaluate qualities of the human input, including speed, precision, and stability, all necessary for a human to properly command an assistive machine. This is done through two distinct tasks: a command following task and a trajectory following task. The tasks are designed in a simulated environment so that uncertainties from real-world dynamics do not corrupt the interface usage performance measures.
Individual user profiles can be made and the tasks can be selected via a menu (Fig. 2(a)). Each task can be reconfigured as described in Section III-C.
III-A1 Command Following
The command following task is designed to uncover a patient’s ability to respond to a visual command stimulus in terms of response accuracy, speed, and stability (Fig. 1(a)). In this task, a white arrow (the command prompt) appears on the screen pointing in different directions in a random and balanced sequence. The default direction settings include the four cardinal and four inter-cardinal angles. The length of the arrow also changes in a random and balanced order, to measure how well the human can scale inputs to the prompted command. The human is instructed to issue a command of the same direction and magnitude (if the interfacing device allows for scaling) as soon as they see the command prompt, and to continue issuing the command uninterrupted for the duration of the prompt. The blue arrow is the feedback of the actual command issued by the human.
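To make the randomization concrete, below is a minimal sketch, in Python, of how a balanced and shuffled prompt sequence over directions and magnitudes could be generated. The constant names and default values are illustrative assumptions; the released app's implementation may differ.

```python
import itertools
import random

# Illustrative defaults: four cardinal and four inter-cardinal directions,
# two arrow lengths, a fixed repetition count, and a 1-2 s prompt duration.
DIRECTIONS_DEG = [0, 45, 90, 135, 180, 225, 270, 315]
MAGNITUDES = [0.5, 1.0]          # fraction of full-scale arrow length
REPETITIONS_PER_TARGET = 20      # assumed: repetitions per (direction, magnitude) pair
PROMPT_DURATION_S = (1.0, 2.0)   # range of time each prompt stays visible

def generate_prompt_sequence(rng=random):
    """Return a balanced, randomly ordered list of command prompts."""
    prompts = []
    for direction, magnitude in itertools.product(DIRECTIONS_DEG, MAGNITUDES):
        for _ in range(REPETITIONS_PER_TARGET):
            prompts.append({"direction_deg": direction,
                            "magnitude": magnitude,
                            "duration_s": rng.uniform(*PROMPT_DURATION_S)})
    rng.shuffle(prompts)  # random order, but balanced counts per target
    return prompts
```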
III-A2 Trajectory Following
The trajectory following task is designed to evaluate how well the human is able to follow a predefined path, evaluating signal integrity in terms of smoothness, the ability to give corrections, and the directness of the human command. Trajectory following can be thought of as the inherent ability to generate commands to follow waypoints while using visual feedback for correction. The ability to follow a trajectory with a single known goal, without interference from wheelchair dynamics or external sources of noise, aims to uncover how a person’s intended goal may differ from the signal they output through the interface. The task consists of controlling the motion of a 2D simulated wheelchair (the yellow and red pentagon shape in Fig. 1(b)) along a predefined path. The path is demarcated with goal posts that indicate the direction of motion. The trajectory begins with a square path, followed by a curved path. Only the path in the immediate vicinity of the wheelchair is visible at any given moment (as in Fig. 1(b)). The patient is instructed to stay within the bounds of the clearly marked path and to avoid going into the out-of-bounds grey area. The square and curved paths are designed such that they contain the basic commands covered by general interfacing devices used for 2D assistive machines. The square path consists of two forward, two backward, two left-turn, and two right-turn segments. The curved path consists of two long arcs and two small arcs.
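Because the simulated wheelchair is driven directly by the interface signal, with no real-world dynamics or noise, the simulation step can be very simple. The sketch below assumes a basic unicycle model; the app's actual kinematics are not specified here, so treat this as an illustration only.

```python
import math

def step_sim_wheelchair(x, y, heading, v_cmd, w_cmd, dt=0.02):
    """Advance a simple 2D unicycle-model wheelchair by one time step.

    x, y    : current position in task units
    heading : current orientation in radians
    v_cmd   : commanded linear speed (from the interface)
    w_cmd   : commanded turn rate (from the interface)
    dt      : simulation step in seconds
    """
    x_new = x + v_cmd * math.cos(heading) * dt
    y_new = y + v_cmd * math.sin(heading) * dt
    heading_new = heading + w_cmd * dt
    return x_new, y_new, heading_new
```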


III-B Key Variables
A multitude of variables impact the human’s interface usage characteristics while operating an assistive machine. We group these variables into four distinct categories [22]:
• Operational Variables. Factors such as the dynamics of the device they are controlling (e.g., rear-wheel vs. mid-wheel drive wheelchair), the mechanics of the control interface, as well as what information is available to the human and how.
• Environmental Variables. Factors such as temperature, whether the task is indoors or outdoors, or noise.
• Internal Variables. Factors such as internal motivation, training and skill, fatigue, and stress that affect human internal states.
• Procedural Variables. Features of the experimental design, such as the instructions for the given task and the order of task presentation.
The effect of procedural variables is minimized in the Interface Skills Test by standardizing the instructions for each task within the app. To control for the remaining variables, the clinician, test taker, or test giver can input information using questionnaires prior to and after a given task, as described in the following section.
III-C Configurable Settings
The assessment tool is designed with a variety of configurable input settings, accessible through a user-friendly GUI. These consist of information relating to operational variables, environmental variables, and the human’s internal variables, as well as settings pertaining to how the tasks are presented to the human test taker.
The controlled covariates for documenting the human’s internal state are collected through a series of Likert-type questionnaires administered directly within the app. Documenting these variables is not necessary for running the assessment, but it is important and recommended to keep track of these as they may inform larger trends in the human’s interface usage skill characteristics.
• Fatigue: Measured using the Fatigue Scale, an 11-item Likert-type questionnaire [23].
• Motivation: Measured using the Intrinsic Motivation Inventory (IMI) [24].
• Workload: Measured using the raw NASA-TLX, a shortened version of the full NASA-TLX assessment [25].
• Stimulant consumption: Text entry question.
• Confidence: A 5-point rating scale question on how confident the human is in their ability to use the interfacing device.
• Stress: Measured via the Perceived Stress Questionnaire [26].
Other control variables can also be documented that monitor environmental and operational variables:
• The interface used during the test.
• How often the interface is used daily by the patient.
Independent variables within the command following task that can be reconfigured by the clinician and test taker include the following (Fig. 2(b)); a configuration sketch follows the list:
• Set of target control commands. This includes the choice of prompting only the directions, or also the magnitudes, of the commands. The default is the four cardinal and four inter-cardinal angles.
• Number of times each target command is prompted. The default is set to 20.
• Range of time each target prompt is displayed. This is the amount of time each command prompt is visible and the time the user has to respond. The default range is 1 to 2 seconds.
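A minimal sketch of what such a configuration record could look like is shown below. The field names and types are illustrative assumptions; only the default values are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class CommandFollowingConfig:
    """Illustrative reconfigurable settings for the command following task."""
    # Target commands: directions in degrees; an empty magnitude list would
    # mean direction-only prompts for interfaces that cannot scale commands.
    target_directions_deg: List[float] = field(
        default_factory=lambda: [0, 45, 90, 135, 180, 225, 270, 315])
    target_magnitudes: List[float] = field(default_factory=lambda: [0.5, 1.0])
    prompts_per_target: int = 20                               # default: 20
    prompt_duration_range_s: Tuple[float, float] = (1.0, 2.0)  # default: 1-2 s
```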





III-D Outcome Measures and Scoring
One of the main contributions of our work is that the assessment is scored using strict closed-form equations, so the results do not depend on qualitative observations of the test taker’s performance. Additionally, outcome measures are calculated while the assessment is being administered, and a summary of the performance statistics is available immediately after the conclusion of a test trial, accessed via the GUI as seen in Figure 2.
The outcome measures available immediately after the command following task include the following (see the computation sketch after this list):
• Average response delay: The average of all time differences between target command prompts and the first instance when the patient issues the correct command,
$\bar{d} = \frac{1}{N}\sum_{i=1}^{N}(t^{r}_{i} - t^{0}_{i})$,
where $N$ is the total number of target command prompts, and $u^{*}_{i}$ is the $i$-th target command prompt, which begins at time $t^{0}_{i}$ and lasts for a duration of $T_{i}$. $u(t)$ is the patient command at time $t$, and $t^{r}_{i}$ is the time when this command first comes within tolerance of the target prompt. The clock restarts for each target command prompt. (When the dimensionality of $u$ is greater than 1, the norm $\|\cdot\|$ is used to compute the difference $\|u(t) - u^{*}_{i}\|$.)
• Average successful response percent: The percentage of command prompts to which the patient is able to respond successfully,
$P_{s} = \frac{100}{N}\sum_{i=1}^{N} s_{i}$,
where $s_{i} \in \{0, 1\}$ is a tracking index that equals 1 if the patient command comes within tolerance of $u^{*}_{i}$ at any point during the prompt, and 0 otherwise.
• Average settling time: The time it takes until the patient continually issues the prompted command within an allowable tolerance of $\epsilon$,
$\bar{t}_{s} = \frac{1}{N}\sum_{i=1}^{N}(t^{s}_{i} - t^{0}_{i})$,
where $t^{s}_{i}$ is the time from which the human command remains within tolerance.
• Initial response accuracy: How close on average the first within-tolerance response is to the target prompt,
$a_{0} = \frac{1}{N}\sum_{i=1}^{N}\|u(t^{r}_{i}) - u^{*}_{i}\|$,
where $u(t^{r}_{i})$ is the patient command at time $t^{r}_{i}$.
• Average settled accuracy: How close on average the within-tolerance response is to the target command after having settled,
$a_{s} = \frac{1}{N}\sum_{i=1}^{N}\frac{1}{t^{f}_{i} - t^{s}_{i}}\int_{t^{s}_{i}}^{t^{f}_{i}}\|u(t) - u^{*}_{i}\|\,dt$,
where $u(t)$, for $t \in [t^{s}_{i}, t^{f}_{i}]$, is the within-tolerance response once settled and until the end of the trial at $t^{f}_{i} = t^{0}_{i} + T_{i}$.
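For concreteness, the sketch below computes the response-delay and success-percentage measures from a uniformly sampled command log. The data structure and field names are illustrative assumptions, not the app's internal representation.

```python
import numpy as np

def command_following_metrics(prompts, tolerance):
    """Compute average response delay and success percentage.

    prompts : list of dicts, one per target prompt, each with
              't'      -- 1D array of sample times (s), zero at prompt onset
              'u'      -- (n, d) array of patient commands at those times
              'u_star' -- length-d target command for this prompt
    tolerance : a response counts when ||u(t) - u_star|| <= tolerance
    """
    delays, successes = [], []
    for p in prompts:
        err = np.linalg.norm(np.asarray(p['u']) - np.asarray(p['u_star']), axis=1)
        within = np.flatnonzero(err <= tolerance)
        if within.size:                       # first within-tolerance sample
            delays.append(p['t'][within[0]])
            successes.append(1.0)
        else:
            successes.append(0.0)
    avg_delay = float(np.mean(delays)) if delays else float('nan')
    success_pct = 100.0 * float(np.mean(successes))
    return avg_delay, success_pct
```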
The outcome measures available immediately after the trajectory following task include the following (see the sketch after this list):
• Average stability: Measured as the dimensionless jerk of the patient trajectory,
$J = \frac{T^{3}}{v_{max}^{2}}\int_{0}^{T}\left(\frac{d^{2}v(t)}{dt^{2}}\right)^{2} dt$,
where $v(t)$ is the speed, $T$ is the total trial time, and $v_{max}$ is the maximum of $v(t)$.
• Average speed: The average speed during the trajectory following task,
$\bar{v} = \frac{1}{K}\sum_{k=1}^{K}\frac{\ell_{k}}{t^{f}_{k} - t^{0}_{k}}$,
where $\ell_{k}$ is the Euclidean distance between the start and end position for a straight path segment, or the arc length for a curved path segment, $K$ is the number of path segments, and $t^{0}_{k}$ and $t^{f}_{k}$ are respectively the start and end times of a given path segment traversal.
• Percentage of time out of bounds: The percentage of time during the trajectory following tasks when the simulated wheelchair is outside the indicated path barriers,
$P_{out} = \frac{100}{n\,\Delta t}\sum_{j}\left(t^{in}_{j} - t^{out}_{j}\right)$,
where $t^{out}_{j}$ and $t^{in}_{j}$ are the times the 2D wheelchair went out of and came back within bounds, respectively, $n$ is the number of samples in the trajectory, and $\Delta t$ is the sampling period.
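The trajectory measures can be computed in a similar way from the sampled speed trace and an out-of-bounds flag per sample. The sketch below assumes a uniform sampling period and uses the speed-based dimensionless-jerk form given above; it is an illustration, not the app's exact implementation.

```python
import numpy as np

def dimensionless_jerk(speed, dt):
    """Speed-based dimensionless jerk over one trial (lower is smoother)."""
    speed = np.asarray(speed, dtype=float)
    total_time = dt * (speed.size - 1)
    accel = np.gradient(speed, dt)      # d v / dt
    jerk = np.gradient(accel, dt)       # d^2 v / dt^2
    integral = np.sum(jerk ** 2) * dt   # rectangle-rule approximation
    return (total_time ** 3 / np.max(speed) ** 2) * integral

def percent_time_out_of_bounds(out_of_bounds_flags):
    """Fraction of samples flagged as out of bounds, as a percentage."""
    flags = np.asarray(out_of_bounds_flags, dtype=bool)
    return 100.0 * flags.mean()
```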
Additionally, the detailed raw data used to calculate the summary statistics for each test condition are stored as an SQL file that can be accessed for further analysis at any time.
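Because the raw data are stored in an SQL file, they can be pulled into any analysis environment. For example, a minimal sketch using Python's built-in sqlite3 module; the database file name, table, and column names here are hypothetical:

```python
import sqlite3

# Hypothetical schema: one row per sample with trial id, timestamp, and command.
conn = sqlite3.connect("interface_skills_test.db")
rows = conn.execute(
    "SELECT trial_id, t, cmd_x, cmd_y FROM command_samples WHERE user_id = ?",
    ("patient_001",),
).fetchall()
conn.close()
```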
III-E Practicality, Usability, and Reliability
In terms of practicality, we designed our assessment tool to require minimal equipment, which reduces cost, space requirements, and set-up time. The only equipment needed is the person’s own electric assistive machine (e.g., powered wheelchair), a tablet or any other device able to run the assessment application, and our interfacing device (Sec. IV) if the interface is not already Bluetooth capable.
In terms of usability, we find that the full assessment can be completed in one session lasting between 30 and 60 minutes with the default settings, without taxing the patient or the experimenter. The duration can be adjusted to be longer based on the configurable independent variables, as deemed appropriate by the experimenter. Simple and clear instructions for each task are contained within the app itself, so that the tester can easily administer the assessment. There have been no adverse incidents with the assessment tool, as all tasks are simulation-based. The start and end of each test are also clearly defined.
Our motivation was to design an assessment tool that produces outcome measures that are repeatable, consistent, precise, and immune to test-giver bias. The outcome measures are calculated automatically, which preserves scoring reliability across various test givers and sessions for a single patient profile. For example, to calculate the total percentage of time the simulated wheelchair is outside of the allowed bounds during the trajectory following task, we use definitive geometries (Fig. 3) with boundaries designed to account for human field-of-view limitations.
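As an illustration of such a definitive geometric test, the sketch below flags a wheelchair position as out of bounds when its distance to the nearest path segment exceeds half the corridor width. This is a simplified stand-in for the actual boundary geometry of Fig. 3, with hypothetical function names.

```python
import numpy as np

def point_to_segment_distance(p, a, b):
    """Shortest distance from point p to the 2D line segment a-b."""
    p, a, b = map(np.asarray, (p, a, b))
    ab = b - a
    denom = float(np.dot(ab, ab))
    t = 0.0 if denom == 0.0 else float(np.clip(np.dot(p - a, ab) / denom, 0.0, 1.0))
    return float(np.linalg.norm(p - (a + t * ab)))

def is_out_of_bounds(position, segments, corridor_width):
    """True if the position lies outside every segment's corridor."""
    half_width = corridor_width / 2.0
    return all(point_to_segment_distance(position, a, b) > half_width
               for a, b in segments)
```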
IV Hardware Connection
Some modern powered wheelchairs have Bluetooth-enabled interfaces that can connect to digital devices such as computers, smart phones, and tablets. However, many PWs still lack this capability. We have designed a multi-interfacing device that connects to various common interfaces used for the control of powered wheelchairs, such as joysticks, switch-based headarrays, and sip/puff interfaces. Our device serves as a bridge that transmits signals from the control interface over Bluetooth, where they can be detected by any device running the Interface Skills Test app. With our open source design, any PW can be used to measure interface usage skill.
This work was inspired by the Freedom Wing adaptor by AbleGamers [27]. The novel contribution of our work is to replace the wired connection with Bluetooth and to allow for various types of interface connections.
The hardware required is minimal and includes off-the-shelf components. A Raspberry Pi (Model B+ was tested) acts as a bridge by relaying commands from the assistive interface to our app over Bluetooth; the source code for the device is available at https://github.com/argallab/WheelchairBLEGamepad.git. The Raspberry Pi emulates an Xbox 360 controller, converting commands from the assistive interface into button presses and joystick values on the controller. An additional PiCAN2 board is used to allow CAN Bus interface connections to the Raspberry Pi, which is required for some R-Net type interface devices. A diagram of the hardware bridge is shown in Figure 4. The hardware connection was tested with R-Net switch-based headarray and joystick interfaces on Permobil M3 and F3 Corpus powered wheelchairs.
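As a rough sketch of the bridge's input side, the code below reads frames from the CAN bus and maps them to emulated gamepad axis values. It assumes the python-can library with a SocketCAN interface on the PiCAN2; the frame ID, byte layout, and the emit_gamepad_axes helper are placeholders, not the actual R-Net protocol or the released bridge code.

```python
import can

JOYSTICK_FRAME_ID = 0x02000100   # placeholder: real R-Net frame IDs differ

def emit_gamepad_axes(x, y):
    """Stub for forwarding axis values to the Bluetooth gamepad emulation layer."""
    print(f"axes: x={x:+.2f}, y={y:+.2f}")

def read_joystick_loop():
    """Read CAN frames from the PiCAN2 and map them to gamepad axis values."""
    bus = can.interface.Bus(channel="can0", bustype="socketcan")
    try:
        while True:
            msg = bus.recv(timeout=1.0)
            if msg is None or msg.arbitration_id != JOYSTICK_FRAME_ID:
                continue
            # Placeholder decoding: first two payload bytes as signed x/y deflection.
            x = int.from_bytes(msg.data[0:1], "big", signed=True) / 127.0
            y = int.from_bytes(msg.data[1:2], "big", signed=True) / 127.0
            emit_gamepad_axes(x, y)
    finally:
        bus.shutdown()
```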

V Discussion
The goal of this assessment tool is to improve interface training and wheelchair navigation performance by documenting initial and subsequent interface usage skill and characteristics via reliable, repeatable, and objective outcome measures. Reliable and objective measurement instruments are needed not just for providing informed care to patients, but also to assist in testing research hypotheses, comparing outcomes, and developing new technologies.
The scoring for all tests is digitized and analytical, so the outcome measures are not subject to experimenter bias.
This assessment also makes possible the study of other interesting research questions. For example, there is potential in using this tool to identify how long-term therapy affects interface usage skill. The tool also allows for identifying how various key variables affect different qualities of the human input during interface usage. Furthermore, the tool may aid in deciding, with evidence-based measurements, which interfaces and which settings are more suitable for a particular individual. There is also the potential to evaluate whether various autonomous robotics assistance interventions improve or degrade the patient’s interface usage skill.
VI Conclusion and Future Work
In this systems paper, we presented the Interface Skills Test: an assessment tool for evaluating various qualities of interface usage. We anticipate that with automated outcome measure calculations, this assessment tool can aid in understanding a patient’s interface usage skill and diagnosing appropriate solutions to overcome deficiencies. This contribution can potentially improve the quality of clinical care and also allow robotics researchers to design customized and intelligent assistance algorithms.
The current version of this assessment tool is limited to 2D assistive machines. In future work, we will expand to cover higher-dimensional machines such as robotic arms. Additionally, future iterations will add tasks and measures that capture reachability and operating range.
We have currently tested the hardware bridge with R-Net controllers. Our next iteration will expand to other common wheelchair controller types. We plan to evaluate test-retest reliability, content validity, and usefulness to clinicians via a within-subject study.
ACKNOWLEDGMENT
This material is based upon work supported by the National Science Foundation under Grant IIS-1552706. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
References
- [1] Dahlia Kairy, Paula W. Rushton, Philippe Archambault, Evelina Pituch, Caryne Torkia, Anas El Fathi, Paula Stone, François Routhier, Robert Forget, Louise Demers, Joelle Pineau, and Richard Gourdeau. Exploring powered wheelchair users and their caregivers’ perspectives on potential intelligent power wheelchair use: A qualitative study. International Journal of Environmental Research and Public Health, 11:2244–2261, 2 2014.
- [2] Dalhousie University. Wheelchair Skills Test (WST) 5.1 form.
- [3] Aaron Bangor, Philip T Kortum, and James T Miller. An empirical evaluation of the system usability scale. Intl. Journal of Human–Computer Interaction, 24(6):574–594, 2008.
- [4] Michael L Boninger, Rory A Cooper, Mark A Baldwin, Sean D Shimada, and Alicia Koontz. Wheelchair pushrim kinetics: body weight and median nerve function. Archives of physical medicine and rehabilitation, 80(8):910–915, 1999.
- [5] SL Wood-Dauphinee, MA Opzoomer, Jack Ivan Williams, B Marchand, and Walter O Spitzer. Assessment of global function: The reintegration to normal living index. Archives of physical medicine and rehabilitation, 69(8):583–590, 1988.
- [6] R. Lee Kirby, Janneke Swuste, Debbie J. Dupuis, Donald A. MacLeod, and Randi Monroe. The wheelchair skills test: A pilot study of a new outcome measure. Archives of Physical Medicine and Rehabilitation, 83:10–18, 2002.
- [7] Edward Giesbrecht. Wheelchair skills test outcomes across multiple wheelchair skills training bootcamp cohorts. International Journal of Environmental Research and Public Health, 19(1):21, 2022.
- [8] Jorge L. Candiotti, Deepan C. Kamaraj, Brandon Daveler, Cheng Shiu Chung, Garrett G. Grindle, Rosemarie Cooper, and Rory A. Cooper. Usability evaluation of a novel robotic power wheelchair for indoor and outdoor navigation. Archives of Physical Medicine and Rehabilitation, 100:627–637, 4 2019.
- [9] Francois Routhier, Claude Vincent, Johanne Desrosiers, and Sylvie Nadeau. Mobility of wheelchair users: a proposed performance assessment framework. Disability and rehabilitation, 25(1):19–34, 2003.
- [10] Ahmetcan Erdogan and Brenna D Argall. The effect of robotic wheelchair control paradigm and interface on user performance, effort and preference: an experimental assessment. Robotics and Autonomous Systems, 94:282–297, 2017.
- [11] Dan R. Olsen and Michael A. Goodrich. Metrics for evaluating human-robot interactions. NIST Performance Metrics for Intelligent Systems Workshop, pages 507–527, 2003.
- [12] M a Goodrich, E R Boer, J W Crandall, R W Ricks, and M L Quigley. Behavioral entropy in human-robot interaction. Proceedings of Performance Metrics for Intelligent Systems, pages 24–26, 2004.
- [13] Thomas B Sheridan and William R Ferrell. Man-machine systems; Information, control, and decision models of human performance. the MIT press, 1974.
- [14] D Kleinman, Sheldon Baron, and W Levison. A control theoretic approach to manned-vehicle systems analysis. IEEE Transactions on Automatic Control, 16(6):824–832, 1971.
- [15] JD Burchfield, JI Elkind, and DC Miller. On the optimal behavior of the human controller: A pilot study comparing the human controller with optimal control models. Bolt Beranek and Newman Inc., Rept, 1532, 1967.
- [16] L Fehr, W E Langbein, and S B Skaar. Adequacy of power wheelchair control interfaces for persons with severe disabilities: a clinical survey. Journal of rehabilitation research and development, pages 353–60, 2000.
- [17] Rory A. Cooper, Donald M. Spaeth, Daniel K. Jones, Michael L. Boninger, Shirley G. Fitzgerald, and Songfeng Guo. Comparison of virtual and real electric powered wheelchair driving using a position sensing joystick and an isometric joystick. Medical Engineering and Physics, pages 703–708, 2002.
- [18] Ali Farshchiansadegh, Farnaz Abdollahi, David Chen, Mei-Hua Lee, Jessica Pedersen, Camilla Pierella, Elliot J. Roth, Ismael Seanez Gonzalez, Elias B. Thorp, and Ferdinando A. Mussa-Ivaldi. A body machine interface based on inertial sensors. Proceedings of Engineering in Medicine and Biology Society, 2014.
- [19] Justin Richer and Jill L Drury. A video game-based framework for analyzing human-robot interaction: characterizing interface design in real-time interactive multimedia applications. In Proceedings of the Conference on Human-Robot Interaction, 2006.
- [20] SA GMV. Teleoperation with time delay: a survey and its use in space robotics.
- [21] Mahdieh Nejati Javaremi, Michael Young, and Brenna D. Argall. Interface operation and implications for shared-control assistive robots. In Proceedings of the International Conference on Rehabilitation Robotics, 2019.
- [22] Duane T McRuer and Henry R Jex. A review of quasi-linear pilot models. IEEE transactions on human factors in electronics, (3):231–249, 1967.
- [23] Trudie Chalder, G Berelowitz, Teresa Pawlikowska, Louise Watts, S Wessely, D Wright, and EP Wallace. Development of a fatigue scale. Journal of psychosomatic research, 37(2):147–153, 1993.
- [24] Basic Psychological Needs and Physical Education. Intrinsic Motivation Inventory (IMI).
- [25] S. G. Hart. NASA-Task Load Index (NASA-TLX); 20 years later. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 50(9):904–908, 2006.
- [26] Herbert Fliege, Matthias Rose, Petra Arck, Otto B Walter, Rueya-Daniela Kocalevent, Cora Weber, and Burghard F Klapp. The perceived stress questionnaire (psq) reconsidered: validation and reference values from different clinical and healthy adult samples. Psychosomatic medicine, 67(1):78–88, 2005.
- [27] Bill. AbleGamers & ATMakers team up for the FreedomWing!, Jan 2020.