With every year that passes, our relationship with information technology becomes more complex, and our dependence on it deeper. Technology is our great ally, promising greater efficiency and productivity. It also promises greater safety for our patients. Yet this relationship can be a brittle one. We can quickly cross a safety gap from a comfortable place where everything works well to one where the limits of technology introduce new risks. Whether through a computer network failure, a system software patch, or a user accidentally clicking on the wrong patient name, it is surprisingly easy to move from safe to unsafe. As the footprint of technology across our health services has grown, so too, by extrapolation, has the associated risk of technology-related harm to patients.1 It is the potential abruptness of this transition to increased risk of harm, this lack of graceful degradation in performance, and the silence that accompanies degradation, that remain unsolved challenges to the effective use of information technology in healthcare.

In this special focus issue, we present studies that help us better understand the nature of the safety challenge in informatics. In the decade since the first paper describing the unintended negative consequences of health IT appeared in this journal,2 many investigators across many nations have worked to characterise IT-related problems and their resulting harms. This literature is summarised in a systematic review by Kim et al., who find, across 34 published studies, that we now have a fairly robust understanding of the types of problems that can be encountered, their effects on care delivery, and their potential consequences for patients.3 What is still lacking, however, are data on the true frequency, scale, and severity of these problems. The authors apply a simple theoretical framework, the information value chain, both to help explain why some problems are more likely than others to produce harm and to provide a template for reporting problems and their consequences.

Some reporting tools already exist to help benchmark aspects of clinical information system safety. Chaparro and colleagues report on the safety performance of computerized physician order entry (CPOE) across 41 US pediatric hospitals, using the Leapfrog CPOE tool.4 Their data raise concerns: on average, only 62% of potential medication errors were detected by CPOE, with a range of 23% to 91%. The better news is that performance appeared to improve by about 4% per year, suggesting that feeding back the results of such tests may prompt system improvement.

Key to patient safety, whether technology-associated or not, is the development of a safety culture. In their paper on safety huddles, Menon et al. explore how daily discussions to identify and manage safety risks in healthcare, including those associated with electronic health records, appear to create a collective situational awareness of safety issues, a prerequisite to responding to such problems.5

So, what does it mean to use information technology safely in healthcare? First, we would hope that errors in technology design, construction, and implementation are few. Many other safety-critical industries face similar challenges, and we need not reinvent what already works well. Part of technology's brittleness comes from the clash between the “work as imagined” encoded in software and the complexity and fluidity of “work as done” in practice.6 Workarounds and errors can arise when users try to bend systems to their true work needs, either because those needs have changed or because designer and user fundamentally misunderstand what the work looks like. Embracing the ‘missing’ information about user needs that is left in plain sight when users create workarounds should give designers the insight needed to craft more adaptive and localisable technologies.7

Performability (performance-related reliability) is a concept from computer engineering that describes graceful degradation in system performance as components start to fail.8 Performable systems are relatively resilient in the face of failure because performance does not fall off a cliff when things go wrong, and degradation can be made visible, giving users valuable warning that they are working in a higher-risk context. While performability is likely already a design property of infrastructure such as computer networks and databases, it is not yet a design feature of other parts of the system. We should be engineering performability into workflows, user interfaces, and socio-technical procedures such as the return to paper during a computer downtime. Harnessing redundancy in design can increase the likelihood of graceful degradation under system stress and improve patient safety.9
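For readers with a software background, the idea of visible, graceful degradation can be illustrated with a minimal sketch. All names here are hypothetical, chosen for illustration rather than drawn from any particular clinical system: when a live lookup fails, the sketch falls back to cached data and flags the degraded mode to the user, instead of failing silently or abruptly.

```python
# A minimal sketch of "performability" in a clinical workflow (hypothetical
# names): on failure of a live service, degrade to a cached, clearly flagged
# result rather than failing silently or abruptly.
from dataclasses import dataclass

@dataclass
class Result:
    allergies: list   # the clinical data returned
    degraded: bool    # True when served from a stale cache
    warning: str      # message shown to the user in degraded mode

class AllergyLookup:
    def __init__(self, live_source, cache):
        self.live_source = live_source  # callable: patient_id -> list
        self.cache = cache              # dict: patient_id -> list

    def get(self, patient_id):
        try:
            data = self.live_source(patient_id)
            self.cache[patient_id] = data  # refresh cache on success
            return Result(data, degraded=False, warning="")
        except ConnectionError:
            # Graceful, *visible* degradation: serve cached data and warn
            # the user that they are working in a higher-risk context.
            cached = self.cache.get(patient_id, [])
            return Result(cached, degraded=True,
                          warning="Live record unavailable; showing cached data")
```

The essential design point is the `warning` field: degradation is not only tolerated but announced, so clinicians know when the system is operating in a reduced state.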

We also have much to learn from success as well as from failure. Patient safety researchers are increasingly emphasising the need to study both Safety-1 (when things go wrong) and Safety-2 (when things go right).10 Understanding why similar events lead to harm in one setting but not in another, or what triggers a ‘great catch’, can be highly instructive.11

Informatics safety science is still in its infancy. Over the last decade we have focussed mainly on making the case for taking the safety of information technology seriously, by documenting what can go wrong and demonstrating the consequences that such problems bring. We must not forget that technology has, overall, improved patient safety despite the new risks, and that our clinical users have high expectations of technology performance. The next decade of safety research must move to building systems that can effectively detect problems as they unfold, and to ensuring that such systems are designed to be as resilient as possible in the face of inevitable stresses and failures. Risk can never be completely engineered out of our systems, but it can be effectively managed and minimised. There remains much to do.