As medical innovators, we all want to support the design of safe and usable medical devices. Yet in a healthcare environment undergoing rapid, unprecedented change, where standard tools and processes are harder to apply, that aim is not without its challenges.
At PDD, we have experienced first-hand the impact of the COVID-19 pandemic when testing and assessing medical devices and services in healthcare settings. Although some activities can still be done with appropriate mitigations and safeguards in place, others simply can’t.
On a positive note, those constraints have also sparked innovation, with many players in the industry embracing alternative methodologies, such as remote usability testing (where simulated testing happens through a digital platform rather than in a market research facility), to continue to drive ideas forward.
From our experience, the results of using these alternative methods have been encouraging, if somewhat limited. For some products and services, like a mobile phone application or a software platform, remote usability testing is likely to work well. For others, such as testing the physical characteristics or assembly of large, complex devices that depend on supporting infrastructure, less so.
Importantly, while the move to remote methods is a welcome and necessary adaptation in these difficult times, it is also worth remembering that the practice is still relatively new and untested. Because adoption was driven by an immediate need, no one has yet been able to assess whether remote methods will generate the quality of evidence that we need in healthcare, or whether they are the most appropriate methods to use in the first place. As we continue to navigate the pandemic in the coming months, addressing those issues will be key to introducing safe and reliable medical technology.
A case in point is certification, whether in the form of CE marking or US regulatory approval. All regulatory bodies require a dossier of evidence and/or argumentation showing that the technology we want to release is safe and effective, all of it underpinned by a risk-management process.
There are many angles to making that case, but when it comes to usability, the aim is to obtain evidence that the potential for use-related error has been removed or reduced to an acceptable level, with suitable mitigations in place. One method to generate this evidence is to have people try out the technology and show that the mitigations are working, ideally in as real a context as possible – maybe even in the real healthcare environment. But this isn’t always practical, especially not at this moment in time.
This is why device developers often choose to collect data in a simulated context. For example, when testing a dialysis machine, we don’t perform a real dialysis treatment; we take the treatment out of the equation but keep the rest of the test environment as close to reality as possible. Whilst questions of representativeness will always arise, this way of testing has become a de facto standard and is significantly better than exposing people to unproven technology during a clinical treatment.
To some extent, remote usability testing is an extension of this approach – whilst the test environment is likely to be less representative as we test a product at a distance, the remote set-up may bring other benefits in terms of scale, flexibility and safety.
So, can remote usability testing substitute for face-to-face simulated use testing? Where previously we would have put medical technology through a summative usability validation using simulated use testing, can we now do a similar thing through remote methods?
Unfortunately, there is no simple answer to any of these questions. As with any development scenario, we need to consider both our ability to use the test method effectively and the quality of the evidence it is likely to provide. In other words, we need to pick the right tool for the job and weigh the ‘pros’ and ‘cons’ depending on what we want to achieve.
In other industries where safety is paramount, like defence or aerospace, practitioners often come up against these questions and routinely consider the type of data a test method generates and its suitability.
Figure 1 (below) shows the top-level properties that can be used to reflect on the strength of the data that we use to support a safety argument. For example, certain types of objective data may be preferable to subjective data. We might ask basic questions about the replicability of the data that gets generated – there may be uncontrolled variables as a result of using a remote method that do not exist in a more controlled environment. We also need to understand the intrinsic limitations of a test method – as well as the potential benefit.
| Property | Definition | Human Factors application |
| --- | --- | --- |
| Coverage | The extent to which the claim is actually addressed by the evidence. | For Human Factors Engineering, we claim: “The device has been found to be safe and effective for the intended users, uses and use environments”. Does the evidence provide sufficient coverage for this claim? |
| Scope | The extent to which our method can be expected to address the claim. | Is the evidence relevant? |
| Trustworthiness | The likelihood that the evidence is free from errors. | Our ability to detect error and/or degree of review. |
| Replicability | The ease with which the evidence can be replicated. | The potential to generate a consistent result when the trial is repeated. |
| Reinforcement | The extent to which multiple items of evidence support the same aspects of a claim. | Is there corroboration – for example, similar findings across different test methods? |
| Independence | The diversity of the evidence, as well as the different tools and methods used to obtain it. | We typically collect multiple lines of evidence. To what extent could these different lines influence each other? |
| User-Defined Importance | The additional weighting placed upon certain types of evidence by legislation, standards or client preference. | For Human Factors, this relates to our ability to satisfy standards and/or guidance – for example, that provided by a health authority or regulator. |
Figure 1: Nature of Evidence
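One way to make these properties actionable is to rate each candidate test method against them before committing to a study design. The sketch below is a minimal illustration in Python: the property names follow Figure 1, but the 1–5 scale, the example ratings and the method labels are invented for illustration, not drawn from any real assessment.

```python
from dataclasses import dataclass
from typing import Dict

# Evidence properties from Figure 1 (Nature of Evidence).
PROPERTIES = [
    "coverage", "scope", "trustworthiness", "replicability",
    "reinforcement", "independence", "user_defined_importance",
]

@dataclass
class EvidenceAssessment:
    """Rates one test method against the Figure 1 properties.

    Ratings use a hypothetical 1 (weak) to 5 (strong) scale.
    """
    method: str
    ratings: Dict[str, int]

    def summary(self) -> str:
        # List each property with its rating (0 if not yet assessed).
        rows = [f"{p}: {self.ratings.get(p, 0)}" for p in PROPERTIES]
        return self.method + "\n  " + "\n  ".join(rows)

# Illustrative, invented ratings for two study set-ups:
remote = EvidenceAssessment("remote usability test", {
    "coverage": 3, "scope": 3, "trustworthiness": 2, "replicability": 2,
    "reinforcement": 4, "independence": 4, "user_defined_importance": 3,
})
facility = EvidenceAssessment("facility-based simulated use", {
    "coverage": 4, "scope": 4, "trustworthiness": 4, "replicability": 4,
    "reinforcement": 3, "independence": 3, "user_defined_importance": 4,
})

print(remote.summary())
print(facility.summary())
```

Even a simple record like this forces the trade-offs into the open: under these invented ratings, the remote study scores well on reach-related properties while scoring lower on trustworthiness and replicability, and that gap becomes visible rather than implicit.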
Let’s take remote usability as an example. If we look at coverage, remote methods can provide a greater catchment area – we can pull in an international cohort in ways that would not be possible with a facility-based study. On the other hand, task coverage may be limited – for example, we might not be able to test certain properties of a device when observational data is difficult to collect using a camera (or webcam).
Remote studies can also limit the extent to which we can create a proxy of the use environment – unless a facility is used, it may be hard to consider the relationship with other pieces of equipment. They may also offer less control – we can’t control or capture data in quite the same way when we are not in the room with the participant. This last point may also have an impact on the trustworthiness of the data – freedom from errors (as a result of the test method) might already be limited by testing away from the hospital context, but could be called into further question by a video link of limited fidelity, for example.
Standard process requires us to understand interactions between users, the tools they use and the environments where they live and work. As healthcare innovators, we need to consider the usability of the solutions that get introduced and mitigate the potential for use-related error. This is not a nice-to-have, but a recognised and mandated safety process that applies throughout the design and development of new solutions.
So, is the push towards remote usability testing allowing us to satisfy these objectives?
We would argue that with any change in methodology we need to consider what we are trying to achieve, why we are doing what we are doing, to what extent it is proven and to what extent it makes sense.
It is not good enough to simply collect evidence along the lines of what has occurred in the past; we need to reflect on the quality and suitability of that evidence as we look to the future. Post-COVID, two things have fundamentally changed: firstly, it has become harder (although not impossible) to collect certain types of evidence; secondly, the burden of evidence may have changed – for example, we may need to collect a higher standard of evidence to compensate for the increased pressure that the healthcare context has come under and the changes to the way in which healthcare is delivered.
Our responsibility as innovators is therefore twofold: is the evidence we are collecting good evidence? And is the amount of detail and rigour proportionate to the current challenge? It could be that such considerations occurred during the move to remote methods, but it may also be that the move itself was driven by practicality rather than best practice.
Although the only organisations that can properly answer these questions are those that control the release of medical equipment onto the market, I would recommend a degree of caution in substituting what was previously physical testing (i.e. facility-based simulated testing) with remote methods.
It is important that medical equipment is safe to use and reaches the market as quickly as possible, but we also need to make sure that testing is done rigorously, to the highest standards of safety.
Ultimately, there is no doubt that remote usability testing opens up huge opportunities in healthcare innovation. To make the most of what it can offer, we must celebrate its advances but also, crucially, understand its limitations.