Discussion Paper

Meeting the Moment: Addressing Barriers and Facilitating Clinical Adoption of Artificial Intelligence in Medical Diagnosis

Introduction

Clinical diagnosis is essentially a data curation and analysis activity through which clinicians seek to gather and synthesize enough pieces of information about a patient to determine their condition. The art and science of clinical diagnosis dates to ancient times, with the earliest diagnostic practices relying primarily on clinical observations of a patient’s state, coupled with methods of palpation and auscultation (Mandl and Bourgeois, 2017; Berger, 1999). Following a period of stagnation in clinical diagnostic practices, the 17th through 19th centuries marked a period of discovery that transformed modern clinical diagnostics, with the advent of the microscope, laboratory analytic techniques, and more precise physical examination and imaging tools (e.g., the stethoscope, ophthalmoscope, X-ray, and electrocardiogram) (Walker, 1990). These foundational achievements, among many others, laid the groundwork for modern clinical diagnostics. However, the volume and breadth of data for which clinicians are responsible has exponentially grown, generating challenges for human cognitive capacity to assimilate.

Computerized diagnostic decision support (DDS) tools emerged to alleviate the burden of data overload, enhance clinicians’ decision-making capabilities, and standardize care delivery processes. DDS tools are a subcategory of clinical decision support (CDS) tools, with the distinction that DDS tools focus on diagnostic functions, whereas CDS tools more broadly can offer diagnostic, treatment, and/or prognostic recommendations. Debuting in the 1970s and 1980s, expert-based DDS tools such as MYCIN, Iliad, and Quick Medical Reference operated by encoding then-current knowledge about diseases through a series of codified rules, which rendered a diagnostic recommendation (Miller and Geissbuhler, 2007). While these early DDS tools initially achieved pockets of success, the promise of many of these tools diminished as several shortcomings became evident. Most prominently, the capacity of data collection and the complexity of knowledge representation prevented accurate representation of the pathophysiological relationships between a disease and treatments. Programmed with a limited set of information and decision rules, several expert-based DDS tools could not generalize to all settings and cases. Some suffered from performance issues as well, often struggling to generate a result or yielding an errant diagnosis. Moreover, users were frustrated. Since these tools existed outside of the main clinical information systems, clinicians had to reenter a long list of information to use them, which created significant friction in their workflows. Similarly, updating the knowledge base of a DDS system often required cumbersome manual entry. Finally, there was a lack of incentives to drive adoption. Thus, provider acceptance remained low, and expert-based DDS tools faded from use (Miller, 1994).

The revitalization of the field of artificial intelligence (AI), the ability of computer algorithms to perform tasks that typically require human intelligence, offers an opportunity to augment human diagnostic capabilities and address the limitations of expert-based DDS tools (Yu, Beam, and Kohane, 2018). Current AI techniques possess not only remarkable processing power, speed, and ability to link and organize large volumes of multimodal data, but also the ability to learn and adjust based on novel inputs, building upon previous knowledge to generate new insights. For this reason, AI approaches, specifically machine learning (ML), are especially well suited to the problems of clinical diagnosis: shortening the time to disease detection, improving diagnostic accuracy, and reducing medical errors. By doing so, AI diagnostic decision support (AI-DDS) tools could reduce the cognitive burden on providers, mitigate burnout, and further enhance care quality.

While contemporary AI-DDS tools are more sophisticated than their expert-based predecessors, concerns about their development, interoperability, workflow integration, maintenance, sustainability, and workforce requirements remain, hampering the adoption of AI-DDS tools. Additionally, the “black box” nature of some AI systems poses liability and reimbursement challenges that can affect provider trust and adoption. This paper examines the key factors related to the successful adoption of AI-DDS tools, organized into four domains: reason to use, means to use, method to use, and desire to use. Additionally, the paper discusses the crosscutting issues of bias and equity as they relate to provider trust and adoption of these tools. Addressing biases and inequities perpetuated by AI tools is paramount to preventing the widening of disparities experienced by certain populations and to engendering confidence and trust among clinicians who are responsible for providing care to these populations. To conclude, the authors discuss the policy implications around the adoption of AI-DDS systems and propose action priorities for providers, health systems leaders, legislators, and policy makers to consider as they engage in collaborative efforts to advance the longevity and success of these tools in supporting safe, effective, efficient, and equitable diagnosis.

A Primer on AI-Diagnostic Decision Support Tools

AI-DDS tools come in various forms, use myriad AI techniques (see Table 1), and can be applied to a growing number of conditions and clinical disciplines. In this paper, the authors focus on adoption factors as they relate to assistive AI-DDS tools. Unlike autonomous AI tools, which operate independently from a human, assistive AI tools involve a human to some degree in the analysis and decision-making process (see Figure 1) (Bitterman, Aerts, and Mak, 2020). The authors in this paper focus on AI-DDS tools designed to support health care professionals in decision-making processes, rather than consumer-facing tools in which a layperson interacts with an AI-DDS system.

Current AI-DDS tools reflect artificial narrow intelligence (ANI), i.e., the application of high-level processing capabilities on a single, predetermined task, as opposed to artificial general intelligence (AGI), which refers to human-level reasoning and problem-solving skills across a broad range of domains. AI-aided diagnostic tools are designed to address specific clinical issues related to a prescribed range of clinical data. They do not (and are not intended to) comprise omniscient, science-fiction-like algorithmic interfaces that can span all disease contexts. Ultimately, the purpose of AI-DDS tools is to augment provider expertise and patient care rather than dictate it.

Generally, assistive AI-DDS tools currently use a combination of computer vision and ML techniques such as deep learning, working to identify complex non-linear relationships between features of image, video, audio, in vitro, and/or other data types, and anatomical correlates or disease labels. The authors highlight a few representative examples below.

Most prominently, assistive AI-DDS tools can be found in the field of diagnostic imaging, given the highly digital and increasingly computational nature of the field. In fact, radiology boasts more Food and Drug Administration (FDA)-authorized (that is, cleared or approved) AI tools than any other medical specialty (Benjamens et al., 2020). A well-studied algorithm within the cardiac imaging space is HeartFlow FFRCT. Trained on large amounts of computed tomography (CT) scans, this algorithm employs deep learning to create a precise 3D visualization of a patient’s heart and major vessels to assist in the detection of arterial blockage (Heartflow, 2014). Deep learning methods can also be applied to gauge minute variations in cardiac features such as ventricle size and cardiac wall thickness to make distinctions between hypertrophic cardiomyopathy and cardiac amyloidosis – two conditions which have similar clinical manifestations and can often be misdiagnosed (Duffy et al., 2022). Within oncology, ML techniques in the form of computer-aided detection systems have been used since the 1990s to support early detection of breast cancer (Fenton et al., 2007; Nakahara et al., 1998). Since then, the FDA has approved several AI-based cancer detection tools to help detect anomalies in breast, lung, and skin images, among others (Shen et al., 2021; Ray and Gupta, 2020; Ardila et al., 2019). Many of these models have been shown to improve diagnostic accuracy and prediction of cancer development well before onset (Yala et al., 2019).

Beyond imaging, AI applications include the early recognition of sepsis, one of the leading causes of death worldwide. Electronic health record (EHR)-integrated decision tools such as Hospital Corporation of America (HCA) Healthcare’s Sepsis Prediction and Optimization Therapy (SPOT) and the Sepsis Early Risk Assessment (SERA) algorithm developed in Singapore draw on a vast repository of structured and unstructured clinical data to identify signs and symptoms of sepsis up to 12 – 48 hours sooner than traditional methods. In this regard, natural language processing (NLP) of unstructured clinical notes is particularly promising. NLP helps to discern information from a patient’s social history, admission notes, and pharmacy notes to supplement findings from blood results, creating a richer picture of a person’s risk for sepsis (Goh et al., 2021; HCA Healthcare Today, 2018). However, there are significant concerns about the clinical utility and generalizability of these tools across different geographic settings (Wong et al., 2021).

In the fields of mental health and neuropsychiatry, AI-DDS tools hold potential for combining multimodal data to uncover pathological patterns of psychosocial behavior that may facilitate early diagnosis and intervention. For instance, the FDA recently authorized marketing of an AI-based diagnostic aid for autism spectrum disorder (ASD) developed by Cognoa, Inc. As a departure from deep learning and convolutional neural networks (CNNs), the Cognoa algorithm is based on random forest decision trees. It integrates information from three sources to provide a binary prediction of ASD diagnosis:

a brief parent questionnaire regarding child behavior completed via mobile app,
key behaviors identified in videos of child behaviors, and
a brief clinician questionnaire.

The tool has demonstrated safety and efficacy for ASD diagnosis in children ages 18 months to five years, performing at least as well as conventional autism screening tools (Abbas et al., 2020). There have also been promising demonstrations of AI for diagnosing depression, anxiety, and posttraumatic stress disorder (Lin et al., 2022; Khan et al., 2021; Marmar et al., 2019).

AI-DDS systems are also becoming increasingly common in the field of pathology, particularly in vitro AI-DDS tools. Akin to the radiological examples, AI techniques can analyze blood and tissue samples for the presence of diagnostic biomarkers and characterize cell or tissue morphology. For example, a model developed by PreciseDx uses CNNs to calculate the density of Lewy-type synucleinopathy, a biomarker of early Parkinson’s disease, in the peripheral nerve tissue of saliva glands (Signaevsky et al., 2022).

Facilitating Provider Adoption of AI-Diagnostic Decision Support Tools

Despite the significant potential AI-DDS tools hold in augmenting medical diagnosis, these tools may fail to achieve wide clinical uptake if there is insufficient clinical acceptance. A particularly telling example is that of the early expert-based DDS tools (the forerunners to modern AI-DDS systems), which disappointed provider expectations because of a host of usability and performance issues, as discussed in the Introduction.

However, the deficiencies of these early expert-based DDS tools are instructive for facilitating the adoption of contemporary AI-DDS tools. Additionally, lessons learned from implementing current non-AI-based DDS tools, that is, systems that generate recommendations by matching patient information to a digital clinical knowledge base, can offer insight. The authors of this paper present a model for understanding the key drivers of clinical adoption of AI-DDS tools by health systems and providers alike, drawing from these historical examples and the current discourse around AI, as well as notable frameworks of human behavior (Ajzen, 1985; Ajzen, 1991). This model focuses on eight major determinants across four interrelated core domains, and the issues covered within each domain are as follows (see Figure 2):

Domain 1: Reason to use explores the alignment of incentives, market forces, and reimbursement policies that drive health care investment in AI-DDS.
Domain 2: Means to use reviews the data and human infrastructure components as well as the requisite technical resources for deploying and maintaining these tools in a clinical environment.
Domain 3: Method to use discusses the workflow considerations and training requirements to support clinicians in using these tools.
Domain 4: Desire to use considers the psychological aspects of provider comfort with AI, such as the extent to which the tools alleviate clinician burnout, provide professional fulfillment, and engender overall trust. This section also examines medicolegal challenges, one of the biggest hurdles to fostering provider trust in and the adoption of AI-DDS.

Domain 1: Reason to Use

At the outset, the adoption and scalability of a given AI-DDS tool are driven by two simple but critical factors that dictate the fate of nearly any novel technology being introduced into a health setting. The first factor is the ability of a tool to address a pressing clinical need and improve patient care and outcomes (alignment with providers’ and health systems’ missions). Considering that these tools require sufficient financial investment for deployment and maintenance, the second factor is the tool’s affordability both to the patient and health system, including the incentives for the provider, patient, and health system to justify the costs of acquiring the tool and investments needed to implement it. The issues related to Alignment with Health Care Missions and those related to Incentives and Reimbursement are, in practice, deeply intertwined and codependent. However, for the purposes of the discussion that follows, the authors have separated the two for clarity, emphasizing the logistical and technical steps relevant to Incentives and Reimbursement.

Alignment with Health Care Missions

AI-DDS tools must facilitate the goals and core objectives of the health care institution and care providers they serve, although the specific impetus and pathway for AI-DDS tool adoption can vary by organization. For instance, risk prediction and early diagnosis AI-DDS tools being developed and implemented by the Veterans Health Administration (VHA) – the largest integrated health care system in the United States – were initiated by governmental mandates and congressional acts requiring VHA to improve specific patient outcomes in this population (i.e., the Comprehensive Addiction and Recovery Act) (114th Congress, 2016b). Such initiatives, mandated on a national level, benefit immensely because the VHA is a nationalized health care service, capable of deploying resources in an organized fashion and on a large scale. Another pathway by which these tools can be introduced into clinical settings is through private AI developers collaborating with academic health centers or other independent health systems. These collaborations can result in the creation of novel AI-DDS tools or the customization of “off-the-shelf” commercial tools. A recent example of this type of partnership is Anumana, Inc., a newly founded health technology initiative between Nference (a biomedical start-up company) and Mayo Clinic focused on leveraging AI for early diagnosis of heart conditions based on ECG data (Anumana, 2022). In this context, the AI-DDS development process may be geared toward a given health system’s specific needs or strategic missions. However, this does not necessarily preclude its broader utility in other health systems.

A useful framework for evaluating the necessity and utility of AI-DDS tools relates to the Quintuple Aim of health care – better outcomes, better patient experiences, lower costs, better provider experiences, and more equitable care (Matheny et al., 2019). Given the link between patient outcomes and provider experience, it is also important to establish and validate the accuracy of new AI-DDS tools at the start of the adoption process and throughout its use. However, there are often discrepancies between AI-DDS developers’ scope and the realities of clinical practice, resulting in tools that can be either inefficient or only tangentially useful. To reassure providers that their tools are optimized for clinical effectiveness, health system leaders must be committed to regular evaluations of AI-DDS models and performance, as well as efficient communication with developers and companies to update algorithms based on changes like diagnosis prevalence and risk-factor profiles. As algorithms are deployed, and their output is presented to providers in EHR systems, special attention must be paid to the information design and end-user experience to optimize providers’ ability to extract key information and act on it efficiently (Tadavarthi et al., 2020). Another critical step in proving robust clinical utility of an AI-DDS tool will be to demonstrate low burden of unintended harms and consequences with use of a given tool (i.e., high sensitivity and high specificity) (Unsworth et al., 2022). The degree to which provider reasoning impacts the AI-DDS will also play a role in this regard. Finally, in implementing care plans based in part on AI-DDS output, all care team members must be coordinated in their response and long-term follow-up roles (see Domain 2: Means to Use for discussion about requisite resources and roles to accomplish these tasks).
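
The accuracy terms used above can be made concrete with a short calculation. The following is a minimal sketch, assuming the tool produces a binary output and that a confirmed reference diagnosis is available for each audited case; the function name and the sample data are hypothetical and do not come from any specific AI-DDS product.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(predicted: Sequence[bool],
                            actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired tool outputs and reference diagnoses."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    # Sensitivity: share of diseased patients the tool flags.
    # Specificity: share of non-diseased patients the tool clears.
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical audit sample: 4 diseased and 6 non-diseased patients.
actual = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, actual)
```

Repeating a calculation of this kind on recent cases at regular intervals is one simple way a health system could notice performance changes that follow shifts in diagnosis prevalence or risk-factor profiles.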

Incentives and Reimbursement

Many health care systems operate on razor-thin financial margins (Kaufman Hall & Associates, 2022). Moving forward, robust insurance reimbursement programs for the purchase and use of AI-DDS tools will be critical to promoting greater adoption by providers and health systems (Chen et al., 2021). However, incentive structures and payer reimbursement protocols for AI-DDS tools are in their nascent stages. Furthermore, insurance dynamics, including for AI-DDS systems, are particularly complex in the U.S., due in part to the heterogeneity of potential payers that range from governmental entities to private insurers to self-insured employers.

In the current fee-for-service environment, a general trend is for the Centers for Medicare and Medicaid Services (CMS), the federal agency that is the nation’s largest health care payer, to be the first to establish payment structures for new technologies and for private payers to then emulate the standards set by CMS (Clemens and Gottlieb, 2017). In determining whether to reimburse the use of a novel AI-DDS tool (and to what extent), a primary consideration for payers, regardless of type, is whether the technology in question pertains to a condition or illness that falls under the coverage benefits of the organization. For instance, an AI-DDS system may be deemed a complementary or alternative health tool, which may fall outside the scope of many insurance plans and, therefore, be ineligible for reimbursement. If the AI-DDS tool is indeed related to a benefit covered by an insurer (for examples of AI-DDS tools currently reimbursed by U.S. Medicare, see Parikh and Helmchen, 2022), developers must provide payers with an adequate evidentiary basis for the utility and safety of the new tool. For this assessment, payers often require data similar to what the FDA would require for premarket approval of a device, for example, clinical trial data showing effectiveness (clinical validity and utility) or other solid evidence that clinical use of the tool improves health care outcomes (Parikh and Helmchen, 2022). Developers bringing new DDS systems to market through FDA’s other market authorization pathways, such as 510(k) clearance or de novo classification, may lack such data and need to generate additional evidence of safety and effectiveness to satisfy payers’ data requirements (Deverka and Dreyfus, 2014). Ongoing post-marketing surveillance to verify the clinical safety and effectiveness of new AI-DDS tools thus is important not only to support the FDA’s continuing safety oversight but also as a source of data to support payers’ evaluation processes.

Experts in health care technology assessment highlight two components of AI-DDS evaluation that are of particular interest to payers: potential algorithm bias and product value. Payers must be convinced that a given AI-DDS will perform accurately and improve outcomes in the specific populations they serve. As described later in this paper, algorithm bias can arise with the use of non-representative clinical data in AI-DDS algorithm development and testing and may lead to suboptimal performance in disparate patient populations based on geographic or socioeconomic factors, as well as in historically marginalized populations (e.g., the elderly and disabled, homeless/displaced populations, and LGBTQ communities). To avoid such biases, monitoring and local validation need to be incorporated into reimbursement frameworks. With regard to product value, payers may weigh the potential clinical benefits of an AI-DDS tool relative to standard diagnostic approaches against the logistical and workflow disruptions that introducing and integrating a new tool into health systems may cause (Tadavarthi et al., 2020; Parikh and Helmchen, 2022). Furthermore, payers can also seek assurance of long-term technical support from algorithm developers.
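
The local validation described above can be illustrated by comparing performance across the populations a payer or health system serves. This is a minimal sketch under stated assumptions: the group labels, the records, and the 0.8 sensitivity floor are all hypothetical, and a real evaluation would need far larger samples and clinically chosen thresholds.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each record is (group label, tool flagged disease, disease confirmed).
Record = Tuple[str, bool, bool]


def sensitivity_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Compute sensitivity separately for each group that has confirmed cases."""
    detected = defaultdict(int)
    confirmed = defaultdict(int)
    for group, predicted, actual in records:
        if actual:
            confirmed[group] += 1
            if predicted:
                detected[group] += 1
    return {group: detected[group] / n for group, n in confirmed.items()}


def groups_below(per_group: Dict[str, float], floor: float) -> List[str]:
    """List groups whose sensitivity falls below a chosen floor."""
    return sorted(g for g, s in per_group.items() if s < floor)


# Hypothetical confirmed cases from two sites serving different populations.
records = (
    [("site_A", True, True)] * 4
    + [("site_B", True, True)] * 2
    + [("site_B", False, True)] * 2
)
per_group = sensitivity_by_group(records)
flagged = groups_below(per_group, floor=0.8)  # 0.8 is an illustrative floor only
```

In this toy sample the tool detects every confirmed case at the first site but only half at the second, which is the kind of gap that monitoring built into a reimbursement framework would be meant to surface.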

Although there are no direct reimbursement channels for many types of AI-DDS tools, within the scope of CMS payment systems, there are currently two primary mechanisms through which AI-DDS services can be reimbursed. The first is that CMS reimburses physician office payments through the Medicare Physician Fee Schedule (MPFS). Within MPFS, payment details are specified via the Current Procedural Terminology (CPT), maintained by the American Medical Association (AMA). CPT codes denote different procedures and services provided in the clinic. New AI-CDS/DDS systems that receive approval for reimbursement by CMS may be assigned a CPT code, as was done in 2020 for IDx-DR, an autonomous AI tool for the diagnosis of diabetic retinopathy (Digital Diagnostics, 2022). The second CMS mechanism is through the Inpatient Prospective Payment System (IPPS) for hospital inpatient services. Within IPPS, the Diagnosis Related Groups (DRG) coding system describes bundles of procedures and services provided to clusters of medically similar patients. Novel AI-DDS tools can be reimbursed in the context of a DRG via a mechanism known as the New Technology Add-on Payment (NTAP). NTAP, created to encourage the adoption of promising new health technologies, provides supplemental payment to a hospital for using a given new technology in the context of a broader care plan that may be covered in the original DRG (Chen et al., 2021).

As AI-DDS systems become more prevalent, sophisticated, and integrated into broader diagnostic workflows, distinguishing their specific role in the diagnostic process and ascribing specific reimbursement values to an algorithm may become difficult. AI-DDS tools may fare better and enjoy greater adoption under value-based payment frameworks, where efficiency and overall quality of care are incentivized rather than individual procedures (Chen et al., 2021).

Domain 2: Means to Use

Paramount to establishing the value proposition is ensuring that clinical environments are properly equipped to support and sustain the implementation of AI-DDS tools. This consists of two interrelated elements: (a) the data and computing infrastructure required to collect and clean health care data, develop and validate an AI algorithm at the point of care, and perform routine maintenance and troubleshooting of technical problems in a high-throughput environment; and (b) the human and operational resources needed to conduct these technical functions so clinicians can seamlessly interface with these tools.

Infrastructure

Building the necessary infrastructure to deploy AI-DDS relies on developing the hardware and software capabilities to support a range of functions beginning with data processing and curation. Concurrent with developing and implementing a working AI-DDS pipeline, several health IT infrastructure and data flow steps are required to support the implementation and sustainment of an AI-DDS tool. The first point of entry into the pipeline is data ingestion. This step requires linking a data producer, such as an MRI machine, into a data collection and processing workflow to maintain and represent the data in a way that can be leveraged by an AI-DDS algorithm. Many AI-DDS systems currently in use are “locked,” which means that the algorithms are static. However, in the case of a continuous learning/adaptive AI system, in which the system continuously ingests new data to update the algorithm in “real-time,” ingestion could be performed on a fixed schedule (e.g., every day or every month) or in response to a trigger. The next consideration is determining where and how the raw data is stored (e.g., enterprise data warehouse [EDW] versus a data lake). In practice, these considerations are constrained by, first, the specific clinical problem being addressed and, second, the extent to which the available resources can accommodate the complexity of the pipeline. An EDW, which contains structured, filtered data for specific uses, may be preferred for operational analysis, whereas a data lake house, which is a large repository of raw data for purposes yet to be specified, may be selected by institutions seeking to perform deep research analysis. While model development is a distinct step in building an AI pipeline, it is nonetheless interdependent with deployment considerations. For example, an institution seeking to build analytic tools that are robust to future changes in imaging (e.g., adding a new MRI machine) may opt for the more flexible architecture of a data lake house instead of a traditional EDW. This, in turn, creates dependency cascades, since the choice of data storage changes the order and extent to which data cleaning and other pre-processing pipelines are implemented. Thus, AI-DDS development and implementation choices are both business operations and data science decisions, since their steps are codependent.
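
The distinction between locked and adaptive systems, and between scheduled and triggered updates, can be sketched in a few lines. This is an illustrative sketch only: the class name `UpdatePolicy`, its fields, and the thresholds are hypothetical and do not describe any particular vendor's pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class UpdatePolicy:
    """Hypothetical policy describing when an AI-DDS model may ingest new data."""
    locked: bool = True                       # locked algorithms never update
    interval: Optional[timedelta] = None      # fixed schedule, e.g. daily
    new_record_trigger: Optional[int] = None  # update once this many records accrue

    def should_update(self, last_update: datetime, now: datetime,
                      new_records: int) -> bool:
        if self.locked:
            return False
        # Scheduled update: enough time has passed since the last one.
        if self.interval is not None and now - last_update >= self.interval:
            return True
        # Triggered update: enough new data has accumulated.
        if (self.new_record_trigger is not None
                and new_records >= self.new_record_trigger):
            return True
        return False


start = datetime(2024, 1, 1)
adaptive = UpdatePolicy(locked=False, interval=timedelta(days=1),
                        new_record_trigger=500)
due_by_schedule = adaptive.should_update(start, start + timedelta(days=2), 0)
due_by_trigger = adaptive.should_update(start, start + timedelta(hours=1), 600)
not_due = adaptive.should_update(start, start + timedelta(hours=1), 10)
locked_never = UpdatePolicy().should_update(start, start + timedelta(days=30), 10**6)
```

Keeping the policy separate from the model code makes the update cadence an explicit, reviewable decision, which fits the point that these are business operations choices as much as data science ones.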

Some clinical problems may require more frequent data updates or “data meals” to ensure that adaptive AI systems can appropriately address rapidly evolving issues with a nascent foundation of data. For instance, a COVID-19 diagnostic model at the beginning of the pandemic might have been built around admission vital signs and complete blood count (CBC) results. However, as knowledge about the natural history of the illness progressed, the model may have evolved to include additional data types such as erythrocyte sedimentation rates (ESR), chest X-ray (CXR) images, and metabolic panel data. In many hospital systems, adding the ESR values is not particularly challenging from a data ingestion standpoint because this data originates from the same system that provides the CBC values. However, the addition of CXR images is challenging because it requires working with another department – radiology, in this instance – and interfacing with another information system (picture archiving and communication system [PACS]). Finally, extending predictions from a single outcome at a discrete point in time (i.e., cross-sectional analysis) to multiple predictions or ones relying on time series data can impact upstream choices for data ingestion pipelines.

It is also important to consider that health care AI needs to be deployed in clinical workflows. In these settings, the demand for near real-time data can result in added hardware complexity, expense, and risk. Notably, for most AI-DDS systems, raw data is insufficient; high-quality data that has been curated and annotated is required for robust algorithm training. At a minimum, redundant storage and processing cores capable of model training and validation are essential. While the granular technical requirements are specific to the algorithm employed, the amount and type of data (e.g., images vs. audio vs. text) that institutions seeking to implement AI-DDS tools must handle may necessitate access to storage on the terabyte and potentially petabyte scale. However, not all data need to be available for real-time access. Furthermore, while discussion of data privacy and security is beyond the scope of this section, there are numerous Health Insurance Portability and Accountability Act (HIPAA)-compliant cloud solutions that could address the issues of availability of real-time data access and storage. These issues should be carefully considered in an institution’s data plan when seeking to develop and deploy AI-DDS tools.
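
A back-of-the-envelope calculation shows how quickly storage needs reach the scale mentioned above. Every figure in this sketch is a hypothetical assumption chosen for illustration (study volume, average study size, retention period, and number of redundant copies); actual values vary widely by institution and modality.

```python
def storage_terabytes(studies_per_year: int, avg_study_megabytes: int,
                      years: int, copies: int = 2) -> float:
    """Estimate total storage in decimal terabytes (1 TB = 10**12 bytes)."""
    total_bytes = studies_per_year * avg_study_megabytes * 10**6 * years * copies
    return total_bytes / 10**12


# Hypothetical institution: 200,000 imaging studies per year at 150 MB each,
# retained for 5 years with one redundant copy (2 copies in total).
estimate_tb = storage_terabytes(200_000, 150, years=5)
```

Under these assumed numbers the estimate is 300 TB, or 0.3 PB, before accounting for derived data such as annotations and model artifacts.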

Another major consideration beyond storage is processing power, particularly for model development and model updating. The types and number of specific chipsets that would be most beneficial should be determined by expert consultation once there is some understanding of the clinical use case and the amount and type of medical data involved. Due to their computational requirements, deep learning-based models might require the use of graphics processing units (GPUs), which, in contrast to central processing units (CPUs), can process many operations in parallel across a large number of cores. While such models could be run on conventional CPUs, efficiency may be reduced by several orders of magnitude depending on model complexity, resulting in models that take weeks to train rather than hours.

Finally, with respect to deployment, it is essential that there is a local solution permitting any mission-critical AI-DDS tools to continue to function at times when internet connectivity is disrupted. Previously, these “downtime” events were often limited to a few hours or days. However, in the age of hospitals becoming an increasing target for ransomware attacks, some planning should be made for what to do if a downtime event lasts weeks or months.
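
One way to picture a local solution for downtime is a scoring call that falls back to an on-premises model when the network service cannot be reached. This is a sketch under stated assumptions: both scoring functions, the feature names, and the returned scores are hypothetical placeholders, and a real deployment would also need to log the fallback and manage the local model's version.

```python
from typing import Callable, Dict, Tuple

Features = Dict[str, float]
Scorer = Callable[[Features], float]


def score_with_fallback(features: Features, remote_score: Scorer,
                        local_score: Scorer) -> Tuple[float, str]:
    """Try the networked model first; use the local copy if it is unreachable."""
    try:
        return remote_score(features), "remote"
    except (ConnectionError, TimeoutError):
        # Downtime event: keep the mission-critical tool running locally.
        return local_score(features), "local"


def unreachable_remote(features: Features) -> float:
    """Hypothetical stand-in for a cloud scoring service during an outage."""
    raise ConnectionError("network is down")


def working_remote(features: Features) -> float:
    """Hypothetical stand-in for the same service when it is available."""
    return 0.9


def simple_local(features: Features) -> float:
    """Hypothetical stand-in for an on-premises copy of the model."""
    return 0.42


patient = {"heart_rate": 112.0, "temperature_c": 38.6}
during_outage = score_with_fallback(patient, unreachable_remote, simple_local)
normal_operation = score_with_fallback(patient, working_remote, simple_local)
```

Returning the source alongside the score lets downstream displays indicate when a result came from the fallback path, which matters if an outage lasts weeks rather than hours.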

With respect to software needs, the ability of models to run on mobile devices is becoming increasingly important. As such, the ability to either securely log on to a hospital’s server or perform the computations for an AI-DDS on a mobile device is becoming the industry standard, rather than a bespoke one-off requirement for providers enthusiastic about technology. The extent to which health systems should invest in such technology depends on the amount and type of data, the complexity and efficiency of AI/ML models, and the clinical scenario the AI-DDS is addressing. To illustrate, consider an AI-DDS that predicts the need for hospital admission based on data collected by a traveling wound care nurse who checks capillary blood glucose and uploads a picture of a patient’s worsening extremity wound. All of this can now be done on a mobile device: the nurse takes the picture and runs the model at the point of care using an application on the device.

Another key consideration for deployment of AI-DDS tools is system interoperability. This issue can be conceptualized from many different “pain points”. One occurs at the data ingestion stage, as discussed previously. This may be due to incompatible EHR systems (e.g., the hospital’s inpatient system uses Cerner, but the outpatient clinics use Epic), which cannot “speak” to one another. Alternatively, a health system may have hospitals that use the same EHR, but the EHRs do not share a common data storage repository. Although everyone uses the same PACS system, pulling imaging data from hospitals A, B, and C requires accessing one server, while pulling data from hospitals X, Y, and Z across the state requires accessing a different server, an issue of interoperability related to information exchange. A second ingestion scenario would require harmonization of different sensors into the same repository. For example, the hospital may use multiple types of point-of-care glucose monitors. The workflow workaround is often that the bedside technician looks at the monitor reading and then types it into the EHR. However, if this data needed to be transitioned into an automatically collected format, there may need to be different integrations for each type of glucose monitor. A second “pain point” occurs in the data cleaning stage, known as the data curation stage. Consider the ramifications of a hospital changing from reporting hemoglobins to hematocrits or traditional troponins to high sensitivity troponins. While this makes little difference at the bedside, it has the potential to signifi cantly complicate AI/ML modeling if the change is not recognized and a standardized process for addressing the inconsistency is not developed. Although a hospital’s primary focus should be on selecting tools that enhance value for patients, some attention should be devoted to considering how these tools may impact AI-DDS pipelines. 
As the reliance on cyber-physical systems grows, health systems should plan to mitigate how physical equipment upgrades change AI/ML data ingestion and use pipelines. Usually, such changes have a trivial effect on overall model performance; however, they can significantly impact the time and effort required to pre-process data. The most efficient approach would be to include members on the AI-DDS team who have expertise in cyber-physical systems and extract, transform, and load (ETL) data pipelines.
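The need to recognize a reporting change before it reaches a model can be illustrated with a small ingestion check. The following is a minimal sketch, not a description of any particular health system's pipeline: the test codes, the `ANALYTE_FAMILY` mapping, and the `LabResult` record are all hypothetical. Rather than converting between assays, which are not interchangeable, it only surfaces the date on which the reported test for an analyte family changes, so that a person can decide how to handle it.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical local test codes mapped to an analyte family. A real site
# would maintain this mapping against its own laboratory catalog.
ANALYTE_FAMILY = {
    "HGB": "red_cell_mass",
    "HCT": "red_cell_mass",
    "TROP_I": "troponin",
    "HS_TROP_I": "troponin",
}


@dataclass(frozen=True)
class LabResult:
    test_code: str
    value: float
    unit: str
    collected: date


def find_reporting_changes(results):
    """Return (family, old_code, new_code, date) tuples, in time order,
    for each point where the test code reported for a family changes."""
    changes = []
    last_code = {}
    for r in sorted(results, key=lambda r: r.collected):
        family = ANALYTE_FAMILY.get(r.test_code)
        if family is None:
            # Unknown codes are surfaced for review, not silently dropped.
            changes.append(("unmapped", None, r.test_code, r.collected))
            continue
        previous = last_code.get(family)
        if previous is not None and previous != r.test_code:
            changes.append((family, previous, r.test_code, r.collected))
        last_code[family] = r.test_code
    return changes
```

A check of this kind does not resolve the inconsistency; it only makes the change visible early enough for the team to agree on a standardized way of handling it.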

In addition, ensuring providers can readily access AI-DDS tools is critical to adoption. Successfully deploying an AI-DDS tool requires optimizing the multitude of human and software factors involved in the patient care workflow. However, as a preliminary consideration, the essential task is building infrastructure that keeps clinicians from devising workarounds. There is ample evidence that clinicians will avoid using or develop workarounds for poorly-tailored solutions or requirements that are perceived as being foisted on them and otherwise constitute yet another inefficiency in an already inefficient system. Regarding software, developers must be prepared to ensure that the tool can be used and viewed on both desktop and mobile devices and potentially by provider-facing and patient-facing versions of the EHR software. Transitioning between these various contexts should be seamless and, more importantly, provide the same information.

Resources

Apart from the data and computational infrastructure necessary to develop, implement, and maintain a health care AI-DDS solution, there are also significant human capital requirements. Practices and health systems often lack the required human resources to run a minimum data infrastructure that can support AI-powered applications. Key requirements include, but are not limited to, frontline IT staff, data architects, and AI-machine learning specialists to understand the context of use and tailor the solution to be fit for purpose. The infrastructure also requires information security and data privacy officers, legal and industrial contract officers for business and data use agreements, and IT educators to train and retrain providers and staff.

To ensure sustainable and safe integration of AI-DDS tools into clinical care, it is crucial that the tools meet the clinical needs of the institution while also maintaining alignment with best practice guidelines, which change over time (Sutton et al., 2020). This requires a governance process in the health care system, with time investments from executive leadership and sponsorship as well as committee and oversight mechanisms to provide regular review (Kawamanto et al., 2018). Direct clinical champions must also have dedicated time to interface between front-line clinicians and the leadership, informatics, and data science teams. These models and tools need to be assessed for accuracy in the local environment and modified and updated if they do not perform as expected. Lastly, they must be surveilled over time and checked regularly to ensure performance maintenance.
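The ongoing surveillance described above can be sketched in a few lines. This is an illustrative example under stated assumptions, not a recommended monitoring standard: the choice of sensitivity as the metric, the `tolerance` threshold, and the function names are all hypothetical, and a real program would track several metrics chosen by the governance committee.

```python
def sensitivity(pairs):
    """Sensitivity from (predicted_positive, actual_positive) boolean pairs.
    Returns None when there are no actual positives to evaluate."""
    pairs = list(pairs)
    true_pos = sum(1 for pred, actual in pairs if pred and actual)
    false_neg = sum(1 for pred, actual in pairs if not pred and actual)
    total = true_pos + false_neg
    return true_pos / total if total else None


def flag_degraded_periods(by_period, baseline, tolerance=0.05):
    """Return the period labels whose sensitivity fell more than
    `tolerance` below the locally validated baseline."""
    flagged = []
    for period, pairs in by_period.items():
        observed = sensitivity(pairs)
        if observed is not None and observed < baseline - tolerance:
            flagged.append(period)
    return flagged
```

Periods returned by `flag_degraded_periods` would be candidates for committee review, not automatic triggers for retraining.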

One of the major challenges in effectively deploying AI in health care is managing implementation and maintenance costs. Nationally, non-profit hospital systems report an average profit margin of around 6.5% (North Carolina State Health Plan and Johns Hopkins Bloomberg School of Public Health, 2021). These relatively slim margins encourage health care systems to be conservative in investing in unproved or novel technologies. Robust analysis of cost savings and cost estimates in the deployment of AI in health care is still lagging, with only a small number of articles found in recent systematic reviews, most of which focus on specific cost elements (Wolff et al., 2020). In general, industry estimates the overall cost of development and implementation of such tools can range from $15,000 to $1 million, depending on the complexity of the system and integration with workflow (Sanyal, 2021).

Another challenge is the tension between hiring a health care technology firm to develop or adapt the algorithms and tools into a health care environment versus hiring and supporting internal staff, which could cost between $600 and $1,550 a day (Luzniak, 2021). Even when much of the core data science expertise is hired into a system, data scientists spend about 45% of their time on data cleaning (GlobeNewswire, 2020). Because familiarity and ongoing business intelligence and clinical operations needs require managing data, many systems choose to hire internally for a portion of their infrastructure needs, which require a continued injection of capital.

Domain 3: Method to Use

Operationalizing and scaling innovations within the health care delivery system is costly and challenging. This is partly due to the heterogeneity of clinical workflows across and within organizations, medical specialties, patient populations, and geographic areas. Thus, AI-DDS tools must contend with this heterogeneity by plugging into key process steps that are universally shared. However, a weakness that limits options for reshaping physician workflows is the still nascent implementation science for deploying interventions that change provider behavior as well as the non-modularity and non-modifiability of extant, sometimes antiquated point-of-care software, including EHRs (Mandl and Kohane, 2012).

Coupled with workflow challenges is the issue of developing and deploying these tools in a manner that improves efficiency of practice and frees up cognitive and emotional space for providers to interact with their patients. The risk of unsuccessful systems interfering with or detracting from the diagnostic process, through user interface distractions or data obfuscation, exists and must be guarded against. In addition, extensive user training, both onboarding and ongoing, and an equally nimble educational infrastructure are necessary to ensure technical proficiency.

Workflow

AI-DDS tools must be effectively integrated into clinical workflows to impact patient care. Unfortunately, many integrations of AI solutions into clinical care fail to improve outcomes because context-specific factors limit efficacy when tools are diffused across sites. Although numerous details are crucial to integrating AI/ML tools into practice, three key insights have emerged from experiences integrating these tools at various locations and from literature reviews of the AI clinical care translation process (Kellogg et al., 2022; Sendak et al., 2020a; Yang et al., 2020; He et al., 2019; Wiens et al., 2019; Kawamoto, 2005).

First, health systems looking to use AI-DDS tools must recognize the factors that shape adoption and be willing to restructure roles and responsibilities to allow these tools to function optimally. The current state of health information technology centers workflows around the EHR, and AI tools often automate tasks that historically required manual data entry or review. Similarly, AI tools often codify clinical expertise and can prompt concern from clinicians who value autonomy (Sandhu et al., 2020). To navigate these complexities, health systems may need to develop new workflows that change clinical roles and responsibilities, including new ways for interdisciplinary teams to respond to AI alerts. For example, an increasing number of AI tools require staff in a remote, centralized setting to support bedside clinical teams (Escobar et al., 2020; Sendak et al., 2020b). Many hospitals already benefit from more manual remote, interdisciplinary support through services such as cardiac telemetry, eICU, and overnight teleradiology. Similarly, AI can decentralize the location of specialized services. For example, instead of diabetic retinopathy screening requiring a visit to a retina specialist, Digital Diagnostics now hosts automated AI machines at grocery stores (Digital Diagnostics, 2019).

Second, health systems must closely examine the unique impacts of AI integration on different stakeholders along the care continuum and balance stakeholder interests. This is a key facet in establishing the value proposition for the introduction of a new AI-DDS tool. Experience in AI integration reveals that “predictive AI tools often deliver the lion’s share of benefits to the organization, not to the end user” (Kellogg et al., 2022). Predictive AI tools often identify events before they happen, meaning the optimal setting for AI use is upstream of the setting typically affected by the event. For example, patients with sepsis die in the hospital and often in intensive care units, but timely intervention to prevent complications must occur within the emergency department (ED). Similarly, patients with end-stage renal disease often present to the ED to initiate dialysis, but preventive interventions must occur in primary care. Project leaders looking to integrate AI into workflows must map out value streams, and if value is captured by downstream stakeholders in a different setting, project leaders must identify other opportunities to create value for end users. One approach is to identify “how a tool can help the intended end users fix problems they face in their day-to-day work” (Kellogg et al., 2022). For example, when a team of cardiologists and vascular surgeons aimed to reduce unnecessary hospital admissions for patients with low-risk pulmonary embolisms (PEs), ED clinicians initially pushed back. Scheduling outpatient follow-up for a low-risk PE had historically been challenging, so the specialists offered to coordinate care for patients identified by the AI/ML tool and block off outpatient appointments to ensure timely follow-up, allowing both the tool and the clinicians to operate as efficiently as possible (Vinson et al., 2022).

Third, workflows should be continuously monitored and adapted to optimize the labor effort required to effectively use AI tools. For example, when a chronic kidney disease algorithm was implemented on a Duke Health Medicare population of over 50,000 patients, many patients identified by the algorithm as high risk for dialysis were already on dialysis or seeing a nephrologist outside of Duke (Sendak et al., 2017). Early intervention was no longer as relevant for these patients, so the team agreed to establish a new pre-rounding process by which a nurse filtered out patients already impacted by the outcome of interest. Conversely, after months of manually reviewing alerts for patients identified by an AI tool as high risk of inpatient mortality, the lead nurse felt confident that the algorithm identified appropriate patients (Braier et al., 2020). The team agreed to remove the manual review step and directly automate emails to hospitalist attendings to consider goals of care conversations. Lastly, there must also be feedback loops with end users to ensure that the AI tool continues to be appropriately used. For example, hospitalists using the inpatient mortality tool inquired about using the tool to triage patients to intensive care units. Similarly, nurses responding to sepsis alerts began asynchronously messaging clinicians in the ED through the EHR rather than calling and talking directly with providers. These changes in communication approach and intended use may seem subtle but can undermine the validity of the tool and potentially harm patients. To avoid drift in workflow or use of AI tools, project leaders should clearly document algorithms and regularly train staff on appropriate use (Sendak et al., 2020c).

Efficiency of Practice

The impact of AI-DDS tools and systems on the cognitive and clerical burdens of health care providers remains unclear. Successful tools would ideally reduce both burdens by delivering just-in-time diagnostic assistance in the most unobtrusive manner to providers while minimizing clerical tasks that might be generated by their use (e.g., extra clicks, menu navigation, more documentation). Experience with traditional CDS systems has shown that these tools are significantly more likely to be used if they are integrated into EHRs instead of existing as stand-alone systems. However, integration alone is insufficient. How that integration is executed – from the design of the user interfaces to the way alerts and notifications are displayed (e.g., triggers, cadence) or handled (e.g., non-interruptive versus interruptive alert) – is critical to practice efficiency and, ultimately, provider acceptance and adoption.

One major impediment is the high degree of difficulty integrating new software with vendor EHR products. Most integrations are “one-offs,” and, therefore, the technology fails to diffuse broadly. The 21st Century Cures Act (“Cures Act”) specifies a new form of health IT interoperability underpinning the redesign of provider-facing applications as modular components that can be launched within the context of the EHR, and which may be instrumental in delivering AI capabilities to the point of care (114th Congress, 2016a). The Cures Act and the federal rule that implements interoperability provisions require that EHRs have an application programming interface (API) granting access to patient records “with no special effort” (Wu et al., 2021; HHS, 2020). “APIs are how modern computer systems talk to each other in standardized, predictable ways. The Substitutable Medical Applications, Reusable Technologies (SMART) on Fast Healthcare Interoperability Resource (FHIR) API, required under the rule, enables researchers, clinicians, and patients to connect applications to the health system across EHR platforms” (Wu et al., 2021). Top EHR vendors have all incorporated common API standards (“SMART on FHIR”) into their products, creating a substantial opportunity for innovation in software and data-assisted health care delivery. Illustrative of the transformative potential of the integration of AI-DDS with EHRs is Apple’s decision to use the SMART API to connect its Health App to EHRs at over 800 health systems, giving 200 million Americans the option to acquire standardized and computable copies of their medical record data on their phones. The implementation science underpinning translation of machine learning to practice is nascent, however. Cultivating support for standards is driving an emerging ecosystem of substitutable apps, which can be added to or deleted from EHRs (like apps on a smartphone can). 
Such apps offer an opportunity to deliver the output of diagnostic algorithms within the provider workflow during an EHR session within a patient context (Barket and Johnson, 2021; Kensaku et al., 2021; Khalifa et al., 2021).
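To make the role of a standardized API more concrete, the following minimal sketch shows how an app might form a FHIR search request for a patient's laboratory observations and read values out of the returned Bundle. It is illustrative only: the server base URL, patient identifier, and function names are hypothetical, and the OAuth 2.0 authorization step that a SMART on FHIR launch requires is omitted entirely.

```python
from urllib.parse import urlencode


def observation_search_url(base_url, patient_id, loinc_code):
    """Build a FHIR search URL for one patient's Observation resources
    that carry the given LOINC code."""
    params = {
        "patient": patient_id,
        "code": f"http://loinc.org|{loinc_code}",  # system|code token
    }
    return f"{base_url.rstrip('/')}/Observation?{urlencode(params)}"


def extract_values(bundle):
    """Return (timestamp, value, unit) tuples for Observation entries
    in a FHIR Bundle (given as a dict) that report a valueQuantity."""
    rows = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        quantity = resource.get("valueQuantity")
        if resource.get("resourceType") == "Observation" and quantity is not None:
            rows.append((
                resource.get("effectiveDateTime"),
                quantity.get("value"),
                quantity.get("unit"),
            ))
    return rows
```

Because the request and response shapes are defined by the standard, not by the vendor, the same two functions could in principle be pointed at any conformant EHR endpoint, which is the substitutability the paragraph above describes.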

EHR alert fatigue is a widespread and well-studied phenomenon among providers that has been linked to avoidable medical errors and burnout (Ommaya et al., 2018). How the introduction of AI-DDS systems into next-generation EHRs might affect alert fatigue and the provider experience is unclear. Successful deployment of these AI-DDS tools likely requires use of both human factors engineering and informatics principles, as the problem arises from the technology and how busy humans interact with it. Diagnostic outputs provided by the DDS should be specific, and clinically inconsequential information should be reduced or eliminated. Outputs should be tiered according to severity, with any alternative diagnoses presented in a way that directs providers to clinically important data. Alerts must be designed with human factors principles in mind (e.g., format, content, legibility, placement, colors). Only the most important, high-level, or severe alerts should be made interruptive.
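The tiering principle described above can be expressed as a simple routing rule. The sketch below is a hypothetical illustration, not a clinical standard: the three severity levels, the default thresholds, and the names `Severity` and `route_alerts` are assumptions chosen for the example.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


def route_alerts(findings, interrupt_at=Severity.HIGH,
                 suppress_below=Severity.MODERATE):
    """Split (label, Severity) findings into three lists:
    interruptive alerts, passive (non-interruptive) notices, and
    suppressed items, each ordered from most to least severe."""
    ordered = sorted(findings, key=lambda f: f[1], reverse=True)
    interruptive = [label for label, sev in ordered if sev >= interrupt_at]
    passive = [label for label, sev in ordered
               if suppress_below <= sev < interrupt_at]
    suppressed = [label for label, sev in ordered if sev < suppress_below]
    return interruptive, passive, suppressed
```

Keeping the thresholds as parameters reflects that the appropriate cutoff for an interruptive alert is a local design decision informed by human factors testing.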

While thoughtful human-centered design can facilitate adoption to an extent, some degree of health care provider training will be required to ensure the necessary competencies to use AI-based DDS tools. The rapid pace of technological change requires such educational infrastructure to be equally nimble. Training opportunities must be integrated across undergraduate medical education, graduate medical education, and continuing medical education. To the extent that some AI-DDS tools are designed to support collaborative team workflows, interprofessional and multidisciplinary training is also necessary. While competencies surrounding the use of AI-DDS systems are still evolving and yet to be established, the authors of this paper have identified the following core areas as essential:

Foundational knowledge (“What is this tool?”);
Critical appraisal (“Should I use this tool?”);
Clinical decision-making (“When should I use this tool?”);
Technical use (“How should I use this tool?”);
Addressing unintended consequences (“What are the side effects of this tool and how should I manage them?”)

For foundational knowledge, health care providers need to understand the fundamentals of AI, how AI-DDS are created and evaluated, their critical regulatory and medicolegal issues, and the current and emerging roles of AI in health care. For critical appraisal, providers need to be able to evaluate the evidence behind AI-DDS systems and assess their benefits, harms, limitations, and appropriate uses via validated evaluation frameworks for health care AI. For clinical decision-making, providers need to identify the appropriate indications for and incorporate the outputs of AI-DDS into decision-making such that effectiveness, value, and fairness are enhanced. For technical use, providers need to perform the tasks critical to operating AI-based DDS in a way that supports efficiency, builds mastery, and preserves or augments patient-provider relationships. To address unintended consequences, providers need to anticipate and recognize the potential adverse effects of AI-DDS systems and take appropriate actions to mitigate or address them. Determining how to integrate this education into an already crowded training space, whether extra certification or credentialing is required for providers to use AI-DDS, and how institutions can adapt to rapidly changing training needs on the frontlines remain open questions.

Domain 4: Desire to Use

Ultimately, the success of AI-DDS tools in optimizing health system performance is dependent on the desire of clinicians to incorporate these tools into routine practice. Indeed, the factors discussed in the previous three core domain sections are crucial variables in the “desire to use” calculus. Additionally, it is important to attend to psychological factors, such as addressing how these tools can facilitate professional fulfillment among providers, including mitigating burnout. The other indispensable element within the desire to use core domain is trust. Clinicians must be able to trust that these tools can deliver quality care outcomes for their patients without creating harm or error and align with both patients’ and clinicians’ ethics and values.

Professional Fulfillment

Continued alignment of AI technology with the element of the Quintuple Aim to improve the work-life balance of health care professionals remains an indispensable aspect of the potential success and adoption of AI tools. Health care providers report high levels of professional burnout, partially attributable to EHRs and related technologies (Melnick et al., 2020). Generally, for every one hour spent with patients, providers spend another two hours in front of their computers (Colligan et al., 2016). The exponential rise in digital work since the COVID-19 pandemic began has exacerbated burnout and amplified some providers’ deeply rooted reluctance to adopt new technologies (Lee et al., 2022). Successful AI-DDS tools will need to overcome this hesitancy and tap into positive sources of fulfillment for providers, including facilitating professional pride, autonomy, and security; reassessing or expanding their scope of practice; and augmenting their sense of proficiency and mastery.

A major contributing source of professional fulfillment is the strength of the patient-provider relationship. As discussed, AI-DDS tools hold the potential to greatly improve diagnostic accuracy and reduce medical errors. If seamlessly integrated, they could also unburden providers of rote tasks, enabling them to allocate more attention to engaging and establishing meaningful bonds with patients. However, by deferring certain higher-order data analysis and synthesis tasks – functions traditionally within the scope of providers – to an AI-based system, providers may experience a sense of detachment from their work. There also is concern that AI systems could erode the patient-provider relationship if patients begin to preferentially value the diagnostic recommendation of an AI system. While the personal qualities of interacting with a human might be preferred, some believe that AI’s ability to emulate human conversation (via chatbots or conversational agents) could eventually supplant providers (Goldhahn et al., 2018). However, it should be noted that this concern only applies to autonomous systems, and the assistive systems this paper focuses on involve, by definition, a health care professional in the workflow.

As observed in previous cycles of AI diffusion, potential threats to job security have negatively impacted provider receptivity to AI. Anxiety has been particularly acute in certain specialties, such as radiology, where in 2016, speculation arose that radiologists would be irrelevant in five years (Hinton, 2016). However, instead of replacing providers, AI in radiology has assumed an assistive role, supporting providers in sorting, highlighting, and prioritizing key findings that might otherwise be missed (Parakh, 2019). Therefore, to foster the adoption of AI-DDS, it is important to uphold the paradigm of augmented intelligence – in which these tools enhance human cognition, and the human is ultimately the arbiter of the action recommended. A key element of this is to empower providers to co-exist in an increasingly digital world through skill-building and instilling trust and transparency in AI systems. It is also important to reconsider expectations about provider roles and responsibilities. With the potential of increased practice efficiency, AI-DDS tools may expand provider bandwidth and purview. In this regard, providers could see patients in greater numbers, through multiple media, and in geographically distant areas.

Despite increasingly sophisticated AI algorithms, it is imperative to value the human qualities that can correct or counteract the shortcomings of AI systems. For instance, biased algorithms struggle with diagnosing melanoma in darker-skinned patients (Krueger, 2022). Having a provider carefully review and assess results produced or interpreted by an AI tool is essential to avoiding a missed or erroneous diagnosis in this case. Above all, provider involvement is critical in shared decision-making. Even in circumstances when an AI-DDS tool is highly accurate, providers are indispensable in helping patients select the right course of treatment based on their health goals and preferences.

Trust

Trust within human-AI-diagnostic partnerships requires a human willingness to be vulnerable to an AI system. Trust overall is a complex concept and trust in technology is equally complex (Lankton et al., 2015). A human user may distrust an AI-DDS tool whose recommendations go against their intuitive conclusions, especially if that person has professional training and significant experience. A user may also distrust AI-DDS recommendations if the user finds something faulty with the development process of the tool, such as inadequate testing or a lack of process transparency. Another potential impediment can include concern that the tool’s development and use is motivated by profits over people or a lack of professional values alignment (Rodin and Madsbierg, 2021). Individual clinician and health care organizational governance and standards setting for various AI tools also remain unclear, which may further inhibit trust. Drivers of trust, on the other hand, can include positive past experiences with a particular manufacturer or service provider, seamless interoperability of a new application with an existing suite of tools from a familiar and currently trusted company or product, or company reputation among the professional health care community (Adiekum et al., 2018; Benjamin, 2021; European Commission, 2019).

In this section of the paper, the authors focus on two significant sources of distrust with AI-DDS products as especially relevant to the adoption of AI-DDS by clinicians:

bias (real or perceived) and
liability.

Providers may be concerned that AI-DDS tools underperform in care for certain patients, especially marginalized populations, as AI trained on biased data can produce algorithms that reproduce these biases. However, it is critical to recognize that bias has multiple sources. It could arise, for example, if the data used to train the AI did not adequately represent all population subgroups that eventually will rely on the AI-DDS tool. It is crucial to ensure that training data are as inclusive and diverse as the intended patient populations, and that deficiencies in the training data are frankly disclosed. Using all-male training data for a tool intended for use only in males to detect a male health condition would not result in bias, but using all-male data would cause bias in tools intended for more general use. Other bias types could exist, for example, if AI tools are trained using real-world data incorporating systemic deficiencies in past health care. For example, if doctors in the past systematically underdiagnosed kidney disease in Black patients, the AI can “learn” that bias and then underdiagnose kidney disease in future Black patients. Thus, it is crucial to design and monitor AI tools with a lens toward preventing, detecting, and correcting bias and disclosing limitations of the resulting AI-DDS tools.
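One simple way to detect the kind of underperformance described above is to compare a tool's sensitivity across patient subgroups. The following is a minimal sketch under stated assumptions: the record layout, the choice of sensitivity as the metric, and the function names are hypothetical, and a real audit would examine additional metrics and account for small subgroup sizes.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Compute sensitivity per subgroup.
    records: iterable of (group, predicted_positive, actual_positive).
    Groups with no actual positives are omitted from the result."""
    true_pos = defaultdict(int)
    false_neg = defaultdict(int)
    for group, predicted, actual in records:
        if actual:
            if predicted:
                true_pos[group] += 1
            else:
                false_neg[group] += 1
    groups = set(true_pos) | set(false_neg)
    return {g: true_pos[g] / (true_pos[g] + false_neg[g]) for g in groups}


def sensitivity_gap(records):
    """Difference between the best and worst subgroup sensitivity."""
    rates = sensitivity_by_group(records)
    return max(rates.values()) - min(rates.values()) if rates else 0.0
```

A large gap does not by itself identify the source of the bias, whether unrepresentative training data or historical underdiagnosis, but it indicates where closer review and disclosure are warranted.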

Complicating this issue is the fact that it can be very difficult to understand the inner workings of many AI-DDS algorithms. The terms “transparency” and “explainability” can have various technical meanings in different contexts, but this paper conceives them broadly to denote that the user of an AI tool, such as a health care professional, would be able to understand the underlying basis for its recommendations and how it arrived at them. It can be challenging, and at times impossible, to understand how an AI arrives at its output and to determine whether the tool in question problematically replicates social biases in its predictions. Furthermore, developers rarely reveal the underlying data sets used to train AI-DDS algorithms, making it difficult for providers to ascertain if a particular product is trained to reflect their patient populations. There may also be tension between the AI-DDS purchasing decisions made by hospital leadership and the providers affiliated with the institutions, with the perception that hospital leadership is “imposing” use of specific AI-DDS algorithms on the providers.

To foster trust among clinician users, a regulatory framework that prospectively aims to prevent injuries (see discussion in Tools to Promote Trust), coupled with mechanisms to assign accountability and compensate patients if problematic outcomes occur, must exist. Because AI-DDS tools sit at the intersection of technology and clinical practice, there are two potential avenues for compensating patient injuries through the American tort system. The first is medical malpractice, which implies that the ultimate responsibility for problematic clinical decisions rests with the provider. The second is product liability, which implies that the responsibility for problematic clinical decisions rests instead with the developer and manufacturer of the AI-DDS tool.

Currently, the dividing line appears to be whether an independent professional, such as an end-user provider, could review the recommendations from an AI tool and understand how it arrived at them. As commentators note:

The Cures Act parses the product/practice regulatory distinction as follows: Congress sees it as a medical practice issue (instead of a product regulatory issue) to make sure health care professionals safely apply CDS [clinical decision support] software recommendations that are amenable to independent professional review. In that situation, safe and effective use of CDS software is best left to clinicians and to their state practice regulators, institutional policies, and the medical profession. When CDS software is not intended to be independently reviewable by the health care provider at the point of care, there is no way for these bodies to police appropriate clinical use of the software. In that situation, the Cures Act tasks the FDA with overseeing its safety and effectiveness. Doing so has the side effect of exposing CDS software developers to a risk of product liability suits (Evans and Pasquale, 2022).

This distinction is a workable and sensible one, reflecting the limitations of the average provider’s abilities to evaluate new AI-DDS tools. It would be helpful to educate providers and hospital administrators on the dividing line between explainable CDS tools, which allow health care providers to understand and challenge the basis for algorithmic decision-making, on the one hand, and “black box” algorithms, for which the basis of algorithmic decision-making is obscure, on the other. This distinction carries implications for liability insofar as courts may hesitate to hold providers accountable for “black box” tools that precluded the possibility of provider control. Providers who hesitate to adopt AI-DDS out of fear of medical malpractice liability may find that distinction comforting and trust-building. For patient injuries arising when AI-DDS systems are in use, policymakers and courts may wish to consider shifting the balance of liability from the current norm (which focuses almost entirely on medical malpractice) to one that also includes product liability in situations where the AI tool, rather than the provider, appears primarily at fault. This shift could further encourage trust and desire to use these tools among providers and would incentivize developers to design algorithms and select training data with a view to minimizing poor outcomes.

Product liability generally arises when a product inflicts “injuries that result from poor design, failure to warn about risks, or manufacturing defects” (Maliha et al., 2021). Product liability, to date, has only been applied in limited and inconsistent fashion to software in general and to health care software in particular (Brown and Miller, 2014). For example, in Singh v. Edwards Lifesciences Corp, the court permitted a jury to award damages against a developer because its software resulted in a catheter malfunctioning (CaseText, 2009b). On the other hand, in Mracek v. Bryn Mawr Hospital, a court rejected via summary judgment the plaintiff’s argument that product liability should be imposed when the da Vinci surgical robot malfunctioned in the course of a radical prostatectomy (CaseText, 2009a). Further complicating the product liability landscape, the Supreme Court concluded in Riegel v. Medtronic that devices going through the FDA premarket approval process, as opposed to other market authorization pathways such as 510(k) clearance, can enjoy certain protection against state product liability cases (CaseText, 2008). Thus, available redress for patients can vary depending on the market authorization pathway for the specific AI tool. The conflicting and limited case law in this area suggests that there is room to explore an expanded product liability landscape for AI-DDS software. One clear point from prior case law is that clinicians will bear the brunt of liability for injuries that occur when using AI-DDS tools “off-label” (e.g., using a tool that warns it is only intended for use on one patient population on a different population). This fact may help incentivize AI tool developers to disclose limitations of their training data since doing so can shift liability to providers who venture beyond the tool’s intended use.

It is also important to note that opening the door to product liability suits does not foreclose the potential for medical malpractice suits against providers who use AI-DDS tools. A provider who relies on AI-DDS tools in good faith could still face medical malpractice liability if their actions fall below the generally accepted standard of care for use of such tools or if the AI-DDS tool is used “off-label” (e.g., using an AI-DDS tool developed for one type of MRI interpretation on another type of MRI image) (Price et al., 2019). Overall, courts are reluctant to excuse physician liability, allowing malpractice claims to proceed against physicians even in cases where:

there was a mistake in the medical literature or an intake form;
a pharmaceutical company failed to warn of a therapy’s adverse effect; or
there were errors by system technicians or manufacturers (Maliha et al., 2021).

These cases, taken together, suggest that providers cannot simply point to an AI-DDS error as a shield from medical malpractice liability.

Eventually, widespread adoption of AI-DDS could open the door for medical malpractice liability for providers who do not incorporate these tools into their practice, i.e., “failure to use”. Physicians, specifically, open themselves to medical malpractice liability when they fail to deliver care at the level of a competent physician of their specialty (Price et al., 2019). Currently, the standard of care does not include relying on AI-DDS tools. But as more and more providers incorporate AI-DDS tools into their practice, that standard may shift. Once the use of AI-DDS is considered part of the standard of care, medical malpractice liability will create a strong incentive for all providers to rely on these tools, regardless of their personal views on appropriateness.

Tools to Promote Trust

Two of the most impactful mechanisms to promote trust in AI-DDS among clinicians (and thus improve their desire to use these tools) would be to further refine the existing regulatory landscape for AI-DDS tools and to promote collaborations between stakeholders. This section of the paper explores avenues to promote trust.

To minimize concerns about liability, nuanced, thoughtful regulation and governance from all levels of the U.S. government – federal, state, and local – can reassure providers that they can trust available AI-DDS tools and move forward with implementation. A key factor affecting clinicians’ willingness to adopt AI-DDS tools is likely whether the tools will receive a rigorous, data-driven review of safety and effectiveness by the FDA before moving into clinical use. A potential concern is that some, but not necessarily all, AI-DDS software is subject to FDA medical device regulation under the Cures Act. It remains difficult for providers to intuit whether a given type of AI-DDS tool is or is not likely to have received oversight under FDA’s medical device regulations. Uncertainty about which tools will receive FDA oversight – and which marketing authorization process the FDA may require (e.g., premarket approval, 510(k), or de novo classification) – likely fuels provider discomfort with using AI-DDS tools.

A key source of this uncertainty, at present, is that the Cures Act addresses the scope of the FDA’s power to regulate various types of medical software but does not itself define or use the terms DDS or CDS software (114th Congress, 2016a; 21 U.S. Code § 360j, 2017). As used in this paper, AI-DDS tools broadly refer to computer-based tools, driven by AI algorithms, that use clinical knowledge and patient-specific health information to inform health care providers’ diagnostic decision-making processes (see Table 1), with DDS tools being a subset of CDS tools more generally. This paper thus follows the definition provided by the Office of the National Coordinator for Health Information Technology (ONC), which stresses that CDS tools “provide … knowledge and person-specific information, intelligently filtered or presented at appropriate times, to enhance health and health care” (ONC, 2018). The FDA has used this ONC definition when discussing how CDS software is broadly understood (FDA, 2019b). Central to the ONC definition, and this paper, is the notion that DDS and CDS tools combine general medical “knowledge” with patient-specific information to produce recommended diagnoses. With AI-DDS systems, that knowledge can include inferences generated internally by an AI/ML algorithm.

The Cures Act authorizes the FDA to regulate only some of the software that might fit into the broader, more common conception of AI-DDS systems just described. Thus, FDA lacks authority to regulate all of the tools that clinicians might think of as being DDS/CDS tools. The Cures Act expressly excludes five categories of medical software from the definition of a “device” that FDA can regulate (114th Congress, 2016a [21 U.S.C. § 360j(o)(1), 2017]). One of these exclusions places restrictions on FDA’s power to regulate CDS and DDS software (114th Congress, 2016a [21 U.S.C. § 360j(o)(1)(E)]). Box 1 shows the specific wording of the relevant Cures Act exclusion.

Looking at the basic exclusion in Box 1, the first two conditions, (i) and (ii), describe CDS and DDS software without using those names. The third condition, shown at (iii), bears on the concept this paper refers to as explainability, again without using that term. When all three conditions are met, this passage of the Cures Act creates a potential exclusion from FDA regulation for CDS/DDS software that meets the criterion for explainability set out in condition (iii) of Box 1. This exclusion, however, is subject to the two exceptions shown at the bottom of Box 1.

The first exception – the saving clause – confirms the FDA’s power to regulate many types of software whose function supports diagnostic testing, such as software used in the bioinformatics pipeline for genomic testing. Before the Cures Act, FDA’s medical device authority included oversight covering both in vitro diagnostic devices (which support clinical laboratory testing of biospecimens) and in vivo devices (such as X-rays and MRI machines that produce images of tissues within a patient’s body). FDA has long regulated software embedded in diagnostic hardware devices, for example, software internal to sequencing analyzers and MRI machines. The saving clause confirms FDA’s power to regulate “stand-alone” diagnostic software that is not necessarily part of a hardware device but processes signals from in vitro and in vivo testing devices.

This power is crucial in light of the modern trend for many clinical laboratories to use third-party software service providers and vendors for data analysis supporting complex diagnostic tests, such as genomic tests (Curnutte et al., 2014). In vitro diagnostic testing by clinical laboratories is subject to the Clinical Laboratory Improvement Amendments of 1988 (CLIA) regulations (100th Congress, 1988). The CLIA framework focuses on the quality of clinical laboratory services but does not provide an external, data-driven regulatory review of the safety and effectiveness of tests used in providing those services, nor does it evaluate the software laboratories use when analyzing and interpreting test results. FDA’s authority to regulate stand-alone diagnostic software positions FDA to oversee clinical laboratory software, even in situations where FDA exercises discretion and declines to regulate an underlying laboratory-developed test (Evans et al., 2020). In a 2019 draft guidance document, circulated for comment purposes only, the FDA noted that “bioinformatics products used to process high volume ‘omics’ data (e.g., genomics, proteomics, metabolomics) process a signal from an in vitro diagnostic (IVD) and are generally not considered to be CDS” tools (FDA, 2019b). The saving clause clarifies that FDA can regulate such software, even in situations where it might technically be considered CDS software falling within the basic exclusion in Box 1 (114th Congress, 2016a [21 U.S.C. § 360j(o)(1)(E)]).

Much of the AI-DDS software providers use in clinical health care settings would not fall under the saving clause (see Box 1), which seems directed at software processing signals from diagnostic devices as part of the workflow for producing finished diagnostic test reports and medical images. However, there is some ambiguity. An example would be an AI-DDS tool that analyzes several of a patient’s gene variants along with the patient’s reported symptoms, clinical observations, treatment history, and environmental exposures to recommend a diagnosis to a clinician. It is unclear if the fact that the tool processes gene variant data means that it is “processing a signal from an IVD device” and thus FDA-regulated, or if the saving clause only applies when the signal is directly fed to the software as part of the clinical laboratory workflow. Without knowing how the FDA interprets the breadth of the saving clause, it is hard for clinicians to understand what is and is not regulated.

Assuming the saving clause does not apply, AI-DDS tools are generally excluded from FDA regulation if they meet all three of the conditions listed at (i)-(iii) in Box 1. The first two conditions are fairly straightforward, but it is still not clear how the FDA plans to assess whether the third condition, bearing on the concept of explainability, has been met. How, precisely, the FDA will decide whether an AI-DDS tool is “intended” to be “for the purpose of” “enabling [a] health care professional to independently review the basis for [its] recommendations” (see Box 1) is unknown. The FDA’s regulation on the “Meaning of intended uses” offers insight into the range of direct and circumstantial evidence the agency can consider when assessing objective intent (FDA, 2017b [21 C.F.R. § 801.4]). Yet how the agency will apply those principles in the specific context of AI/ML software tools is not clear.
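The screening logic described in the preceding paragraphs can be summarized as a small decision function. The following is a minimal sketch under stated assumptions: the class, field, and function names are hypothetical, each condition is reduced to a yes/no input even though the text notes that condition (iii) and the breadth of the saving clause remain unsettled, and the second exception in Box 1 is not modeled. It illustrates the order of the checks and is not a statement of how the FDA applies the statute.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftwareFunction:
    """Hypothetical yes/no summary of one software function (illustrative only)."""
    processes_device_signal: bool     # saving clause: processes a signal from an in vitro or in vivo testing device
    meets_condition_i: bool           # Box 1, condition (i)
    meets_condition_ii: bool          # Box 1, condition (ii)
    enables_independent_review: bool  # Box 1, condition (iii), the explainability criterion


def potentially_excluded(fn: SoftwareFunction) -> bool:
    """Return True if the function may fall within the basic exclusion.

    Assumption: the saving clause is checked first, and the exclusion
    applies only when all three conditions are met.
    """
    if fn.processes_device_signal:
        # The saving clause keeps such software within FDA's device authority.
        return False
    return (
        fn.meets_condition_i
        and fn.meets_condition_ii
        and fn.enables_independent_review
    )


# An explainable tool that does not process device signals.
explainable_tool = SoftwareFunction(False, True, True, True)
# A "black box" tool: conditions (i) and (ii) met, but not (iii).
black_box_tool = SoftwareFunction(False, True, True, False)

print(potentially_excluded(explainable_tool))
print(potentially_excluded(black_box_tool))
```

In this simplified framing, the hard questions discussed above sit entirely in how the boolean inputs are determined, which is exactly where clinicians currently lack guidance.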

Without greater clarity on these matters, clinicians lack a sense of whether a given type of AI-DDS tool usually is, or usually is not, subject to FDA oversight or what FDA’s oversight process entails. Almost six years after the Cures Act, FDA’s approach for regulating AI/ML CDS/DDS software remains a work in progress, leaving uncertainties that can erode clinicians’ confidence when using these tools. Through two rounds of draft guidance (in 2017 and 2019), the FDA solicited public comments to clarify its approach to regulating CDS/DDS tools. A final guidance on Clinical Decision Support Software appears on the list of “prioritized device guidance documents the FDA intends to publish during FY2022” (October 1, 2021 – September 30, 2022) (FDA, 2021c). As this paper went to press in September 2022, the final guidance was not yet available, but the authors hope it may clarify these and other unresolved questions around the regulation of CDS/DDS tools.

Unfortunately, guidance documents – whether draft or final – have no binding legal effect and do not establish clear, enforceable legal rights and duties on which software developers, clinicians, state regulators, and members of the public can rely. There is fairly wide scholarly agreement that the use of guidance as a regulatory tool can be appropriate for emerging technologies where knowledge is rapidly evolving and flexibility is warranted, but there can be long-term costs when agencies choose to rely on guidance and voluntary compliance instead of promulgating enforceable regulations (Wu, 2011; Cortez, 2014). FDA’s Digital Innovation Action Plan (FDA, 2017a; Gottlieb, 2017) and its Digital Health Software Precertification (Pre-Cert) Program (FDA, 2021b) both acknowledge that its traditional premarket review process for moderate and higher-risk devices is not well suited for “the faster iterative design, development, and type of validation used for software-based medical technologies” (FDA, 2017a). The FDA’s 2021 AI/ML Action Plan envisions incorporating ongoing post-marketing monitoring and updating of software tools after they enter clinical use (FDA, 2021a). This may leave health care providers in the uncomfortable position of using tools that may be modified even after the FDA clears them for clinical use and potentially facing liability if patient injuries occur. Also, it implies that vendors and developers of AI/ML tools will need access to real-world clinical health care data to support ongoing monitoring of how the tools perform in actual clinical use.

Future reliance on post-marketing monitoring offers an example of why regulating via non-binding guidance documents can create long-term problems. The HIPAA Privacy Rule contains an exception that lets HIPAA-covered health care providers, such as hospitals, share data with device manufacturers to help them meet their FDA regulatory compliance obligations (for example, to help manufacturers comply with the FDA’s adverse-event reporting requirements) (HHS, 2003). Unfortunately, when FDA regulates manufacturers by means of guidance documents and other non-mandatory programs, this important HIPAA pathway for accessing data may be unavailable, because guidance documents create no enforceable legal obligations. To maximize software developers’ access to real-world evidence for post-marketing monitoring and updating of AI-DDS tools, the FDA will ultimately need to set binding regulatory requirements (for example, for developers to monitor for racial, gender, or other biases in the post-market period). Related concerns surround the future development of state law, including both state regulations and tort law. Safe clinical use of AI-DDS tools will ultimately require state-level medical practice regulations and common law addressing issues such as appropriate staffing for, and use of, AI-DDS tools in clinical settings. To foster optimal development of state law, it is helpful to have federal regulations providing a stable demarcation between the FDA’s role versus that of the states. Federal guidance documents, due to their nonbinding nature and ease of revision, may not meet this need. The FDA’s current heavy reliance on guidance documents and voluntary measures may be appropriate in the early years as AI-DDS tools emerge as a new technology, but the agency should stay mindful of the need to promulgate regulations whenever appropriate and feasible.
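To make the idea of post-market bias monitoring more concrete, the sketch below computes a tool's sensitivity (the share of true cases it flags) separately for each patient subgroup and reports the largest gap between groups. This is an illustrative assumption about what such monitoring could look like, not a method prescribed by the FDA or described in the sources cited here; the function names, record layout, and synthetic data are all hypothetical.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Compute per-group sensitivity.

    records: iterable of (group, has_condition, tool_flagged) tuples,
    a hypothetical layout for confirmed real-world cases.
    Groups with no confirmed cases are omitted from the result.
    """
    confirmed = defaultdict(int)  # confirmed cases per group
    detected = defaultdict(int)   # confirmed cases the tool flagged
    for group, has_condition, tool_flagged in records:
        if has_condition:
            confirmed[group] += 1
            if tool_flagged:
                detected[group] += 1
    return {group: detected[group] / confirmed[group] for group in confirmed}


def largest_gap(rates):
    """Difference between the best and worst per-group rate."""
    return max(rates.values()) - min(rates.values())


# Synthetic example: 4 confirmed cases in each of two groups.
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, True), ("group_a", True, False),
    ("group_b", True, True), ("group_b", True, True),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", False, False),  # not a confirmed case, so it is ignored
]

rates = sensitivity_by_group(records)
print(rates)               # {'group_a': 0.75, 'group_b': 0.5}
print(largest_gap(rates))  # 0.25
```

A check of this kind depends on developers having access to outcome-labeled real-world data, which is the access problem the HIPAA discussion above describes.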

Apart from the regulatory framework, another mechanism to instill trust is through increased and consistent collaboration among developers, ethicists, and clinical diagnosticians during various phases of the AI lifecycle. Early innovation in the process of AI pre-market design, testing, clinical application, and post-market oversight resulted in fragmented and siloed professional stakeholder groups with different goals, expertise, ethical frameworks, and paradigms of professionalism and professional accountability. While a great deal of health care professional ethical attention, input, and engagement has been integrated into AI use and application in the post-market phase, there has been an important gap in full integration of professional end-user partnership within the AI tool development process needed to build trustworthy AI tools.

Numerous AI and digital health ethical frameworks have been published as part of the concerted effort to build trustworthy human-AI partnerships. For example, the European Commission’s Ethics Guidelines for Trustworthy AI is a foundational work on the topic, with seven key requirements:

Human agency and oversight,
Technical robustness and safety,
Privacy and data governance,
Transparency,
Diversity, non-discrimination and fairness,
Environmental and societal well-being, and
Accountability (European Commission, 2019).

Additionally, over 40 different U.S. technology companies and venture capital firms have signed on to a Responsible Innovations Charter, with similar key principles:

Innovating intentionally,
Operating with accountability and transparency,
Advancing inclusive prosperity,
Building sustainably,
Respecting people,
Championing diversity, and
Promoting healthy societies (Responsible Innovation Labs, 2022).

The American Medical Association has developed policies and frameworks for practicing diagnosticians to govern and assess AI integration into clinical practice (Crigger et al., 2022). Essentially, the structured assessment aids the clinician in ascertaining: whether a tool is beneficial to patient outcomes; whether a tool appears to work; and whether a tool appears to work for their patients. These guidelines, along with several global government-produced assessments for organizational leaders, provide a systematic and structured assessment for providers to select and utilize trustworthy and beneficial AI for their practice.

Ensuring and Promoting Health Equity in the Deployment of AI-Assisted Diagnostic Tools

In addition to facilitating uptake and overcoming barriers to the adoption of AI-DDS tools elucidated in this review, being cognizant of the implications for equity throughout the life cycle of these tools and making a consistent effort to address past, current, and potential equity issues are critical to preventing widening disparities in health care delivery. While there is excitement about, and demonstrated benefit from, bringing AI-DDS tools into clinical practice, poor data quality, prevalent biases in health care, and a lack of structural supports available to end users jeopardize progress toward achieving health equity and fuel ongoing uncertainties and hesitancies about adopting these tools.

AI/ML algorithms are often developed using limited data samples that may not represent the people they are meant to impact (Zou and Schiebiner, 2021). Furthermore, social determinants of health data are generally not well captured in data sets used to train these algorithms. Data elements derived from diverse sources that could help provide a more holistic view of the patient may not be available to certain care settings due to the limitations of EHR systems, data privacy concerns, a lack of data standardization, and financial constraints on the part of health systems to obtain large data sets (Zusterzeel et al., 2022; Alami et al., 2020). Inaccurate representation in training, testing, and validation data sets also results in the development of flawed models. Models not accurately trained in the context that they are intended for may also have difficulty performing when there is a shift in population demographics (Singh et al., 2020).
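One simple way to surface the representativeness problem described above is to compare each group's share of the training data with its share of the population the tool is meant to serve. The following is a minimal sketch under stated assumptions: the function names, group labels, counts, and the 5-percentage-point threshold are all hypothetical and chosen only for illustration.

```python
def representation_gaps(training_counts, population_shares):
    """Return (training share - population share) for each group.

    training_counts: mapping of group -> number of training records.
    population_shares: mapping of group -> share of the target
    population (values expected to sum to roughly 1.0).
    A negative gap means the group is underrepresented in training data.
    """
    total = sum(training_counts.values())
    if total == 0:
        raise ValueError("training_counts must contain at least one record")
    return {
        group: training_counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


def underrepresented(gaps, threshold=0.05):
    """Groups whose training share falls short by more than `threshold`."""
    return sorted(group for group, gap in gaps.items() if gap < -threshold)


# Synthetic illustration only; these numbers are not from any real data set.
training_counts = {"group_a": 800, "group_b": 150, "group_c": 50}
population_shares = {"group_a": 0.60, "group_b": 0.25, "group_c": 0.15}

gaps = representation_gaps(training_counts, population_shares)
print(underrepresented(gaps))  # ['group_b', 'group_c']
```

A share comparison like this does not capture missing social determinants data or measurement bias, so it is best read as a first screen rather than evidence that a model will perform equitably.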

AI tools rely on human interaction from their inception to deployment, and AI algorithms can replicate explicit and implicit biases in human decision-making in health care settings (Char et al., 2018). Inherent discrimination occurring within care delivery can be challenging to predict and uncover, and biases could easily transfer over into the design and use of AI algorithms (Leslie et al., 2021; Char et al., 2018). For example, the biases of developers, researchers, and designers can manifest early in the development phase if they choose target variables and proxies for those variables without considering upstream social determinants of health and related confounders (Leslie et al., 2021). Along with the data collection issues summarized above, other data extraction and measurement errors due to biases built into physical devices can negatively influence care decisions and perpetuate inequities (Leslie et al., 2021; Zou and Schiebiner, 2021). The pulse oximeter is one example: the device reads the oxygen saturation in a patient’s blood using infrared and red light signaling that interacts with skin pigmentation, so its results vary with skin color (Zou and Schiebiner, 2021). Previous studies have shown that patients with darker skin received less accurate oxygen readings than White patients (Leslie et al., 2021; Zou and Schiebiner, 2021). These data are fed into algorithms to assist with decision-making, and clinicians may unintentionally accept results and act on flawed recommendations, affecting the ability of patients to acquire needed care, such as supplementary oxygen (Zou and Schiebiner, 2021; Rajkomar et al., 2018).

In addition to the adverse effects of incorrect data usage and biases, the absence of infrastructure to support equitable AI in developing and deploying AI-DDS tools will ultimately widen disparities. The digital gap perpetuates inequities through many social factors that may intertwine, including a lack of broadband internet access across regions and an inability to purchase up-to-date and well-equipped devices (Ramsetty and Adams, 2020). For example, AI tools extracting data from EHR systems may be more prevalent in larger health care organizations in well-resourced cities than in small rural hospitals or physician practices, which have fewer resources and less expertise readily available (Goldfarb and Teodoridis, 2022; Reisman, 2017). The associated financial costs for EHR implementation continue to be a primary barrier to the adoption of AI-DDS tools (Goldfarb and Teodoridis, 2022). AI algorithms applied to clinical settings that disproportionately serve populations that experience a form of privilege (e.g., wealthy populations) marginalize groups that do not actively seek care in the same settings (DeCamp and Lindvall, 2020; Rajkomar et al., 2018). Nevertheless, data collection issues persist in settings with EHR systems due to the lack of compatibility between these systems and certain providers serving different hospitals and health care facilities, further contributing to data silos and insufficiently informed AI tools (Goldfarb and Teodoridis, 2022).

Path Forward – Policy Implications and Action Priorities

Fostering provider adoption of novel AI-DDS systems will require broad infrastructural support, beginning with robust tool evaluations by health systems and payers, clear commitments from health systems and developers to regular monitoring and updating of algorithms, and training care teams to effectively interpret and implement changes based on AI-DDS outputs. Developers, payers, health systems, and providers are becoming increasingly aware of potential biases in AI algorithms and their deployment. Data representativeness and robust model training must be a top priority in algorithm development to increase trust and adoption among all relevant stakeholders. Data integrity and reliability are at the very core of sound algorithm development, yielding better prospects for provider adoption of those algorithms. Therefore, collaborative efforts aimed at curating rich and multimodal patient data – including crucial social determinants information – will be paramount. Such efforts need to be coupled with robust and consistent standards for data access, sharing, harmonization, and interoperability, while simultaneously prioritizing data privacy and security to ultimately drive excellent model development. In a similar vein, boosting provider comfort and adoption may also depend on model transparency. Providing health care teams with key parameters driving an AI-DDS output that can serve as modifiable targets for patient outcome improvement may facilitate greater adoption. To conclude, this paper presents key action priorities in each of the four domains related to provider adoption of AI-DDS tools outlined in this paper:

Domain 1: Reason to Use

Establishing clear impetus to incorporate novel AI-DDS tools into health systems is contingent on a given tool’s clinical efficacy, specifically as it relates to a health system’s target population, and affordability, both to the health system and patient. Developers, payers, health systems, and providers are becoming increasingly aware of potential biases in AI algorithms and their deployment. Data representativeness and robust model training and testing must be the top priority in algorithm development in efforts to increase trust and adoption among all relevant stakeholders.
Collaborative efforts among multiple health care systems aimed at curating rich and multimodal patient data – including essential social determinants information – will be paramount. Such efforts need to be coupled with robust and consistent standards for data access, sharing, and interoperability, while simultaneously prioritizing data privacy and security, to ultimately drive excellent model development.
In addition to ensuring robust clinical utility, algorithm developers must design AI-DDS tools to integrate seamlessly into existing care team infrastructures, ensuring that their product value is not diminished by logistical inefficiency and cognitive burden.

Domain 2: Means to Use

Policy makers and payers should consider using reimbursement to create a sustainable environment for the adoption and continual use of AI-DDS tools and to further promote capital infrastructure investments by health systems to facilitate this goal.
If consensus-based standards do not emerge, ensuring interoperability could require a “top-down” regulatory approach. For instance, the United States Office of the National Coordinator for Health Information Technology (ONC) could develop health IT certification criteria that assess the ability of EHR systems to support data lifecycles. However, given the nascent understanding of ideal workflows and life cycles, standardization at this time is likely premature.
Policy makers and payers should consider using incentives to encourage the use of evidence-based AI-DDS in clinical practice. As per prior payment models, if adoption is sufficient and the evidence of improved processes and outcomes becomes established, AI-DDS tools may become the standard of care in specific clinical scenarios.

Domain 3: Method to Use

Public and private research funders should increase focus and funding opportunities to advance the still nascent implementation science of AI-DDS, for example, through RFPs that focus on integrating AI-DDS into clinical workflows and health IT systems and its impact on the behaviors of clinical teams.
Institutions of medical education and accreditation organizations should review emerging competencies for the use of AI-DDS and consider how to integrate these into the current training and certification ecosystem to adapt to the rapidly changing needs of the clinical front line.
Professional societies, trade associations, and health care quality organizations should identify diagnostic centers of excellence that specialize in AI-DDS to facilitate the surfacing and effective diffusion of best practices through interdisciplinary learning networks and capacity-building programs.
Software and algorithm designers of point-of-care AI-DDS for providers and patients at home should leverage the public SMART on FHIR and SMART/HL7 Bulk FHIR APIs regulated under the ONC 21st Century Cures Act Rule, so that algorithms can be widely and uniformly integrated into care across EHR vendor products and other IT tools.
Regulators should monitor, for example through the 21st Century Cures Act EHR Reporting Program, EHR vendor implementation of public FHIR APIs to ensure their turnkey use by apps made accessible at the point of care.
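As a small illustration of what uniform integration through FHIR APIs means in practice, the sketch below builds a standard FHIR REST search request for a patient's pulse oximetry observations. It is a sketch under stated assumptions: the base URL and patient identifier are hypothetical placeholders, the helper name is invented for this example, and the SMART authorization step (an OAuth 2.0 flow that yields an access token) is omitted. The `patient` and `code` search parameters and the `system|code` token format come from the FHIR search specification; `59408-5` is the LOINC code for oxygen saturation by pulse oximetry.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def observation_search_url(base_url, patient_id, code_system, code):
    """Build a FHIR search URL for Observation resources.

    The same request shape works against any conformant FHIR server,
    which is what lets one algorithm run across EHR vendor products.
    """
    query = urlencode({
        "patient": patient_id,
        "code": f"{code_system}|{code}",  # FHIR token search: system|code
    })
    return f"{base_url.rstrip('/')}/Observation?{query}"


# Hypothetical server and patient; neither refers to a real system.
url = observation_search_url(
    "https://fhir.example.org/r4/",
    "example-patient-id",
    "http://loinc.org",
    "59408-5",
)
print(url)

# Decoding the query string recovers the original search parameters.
params = parse_qs(urlsplit(url).query)
print(params["code"][0])
```

In a real deployment the request would carry an `Authorization: Bearer` header obtained through the SMART launch sequence, and the set of resources an app may read would be limited by the scopes granted.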

Domain 4: Desire to Use

Professional societies, trade associations, and health care quality organizations should center AI-related efforts to promote clinician well-being through human-centered design in AI technology, aligned with the work-life balance of health care professionals outlined in the Quintuple Aim.
The FDA should offer guidance and/or other communications, specifically tailored to health care providers tasked with using AI-DDS tools, to aid their understanding of the types of software that are – and are not – likely to receive FDA oversight under 21 U.S.C. § 360j(o)(1)(E). Specifically, it will be imperative to clarify how broadly the agency construes the saving clause for “software that processes signals…”, and the agency’s approach for assessing whether software is “intended … for the purpose … of enabling” a health care professional to independently review the basis of its recommendations. Encouraging clinicians to trust these tools may require helping them develop an intuitive grasp of the FDA’s role and its jurisdictional limits.
The FDA should continue to explore the special considerations affecting design, validation review, market authorization, and post-marketing oversight for AI-DDS tools, offering timely guidance while recognizing that, over the long term, notice-and-comment rulemaking may offer advantages over the continued use of guidance documents, for example, to enhance developers’ access to HIPAA-protected real-world data for use in regulatory compliance activities and to provide needed clarity and stability to foster development of state regulations and common law addressing clinical use of AI-DDS systems.
Professional medical, nursing, and other health care societies should develop clinical practice guidelines for AI system applications.
The FDA, CDC, and ONC should ensure transparent, publicly accessible reporting of flaws, malfunctions, safety incidents, and patient harm related to AI-DDS tools.
Software developers should integrate human clinical diagnosticians at all phases of software development, design, validation, implementation, and iterative improvements.

AI-DDS systems are becoming increasingly prevalent, sophisticated, and reliable. Across medical specialties, these tools demonstrate potential to make the clinical diagnostic process more efficient and accurate, ultimately improving patient outcomes. Focused efforts to create equitable and robust AI-DDS algorithms, streamline integration of new AI-DDS tools into clinical workflows, and train health care providers to effectively use such tools – coupled with strong regulatory oversight and financial incentives – will optimize the likelihood that innovative, clinically impactful AI-DDS systems are adopted and used responsibly by health care providers to the ultimate benefit of their patients.

References

21 U.S. Code § 360j. 2017. General provisions respecting control of devices intended for human use. Available at: https://www.law.cornell.edu/uscode/text/21/360j (accessed July 26, 2022).
100th Congress. 1988. Public Law 100-578, 102 STAT. 2903. Available at: https://www.govinfo.gov/content/pkg/STATUTE-102/pdf/STATUTE-102-Pg2903.pdf (accessed July 27, 2022).
114th Congress. 2016a. H.R. 34 – 21st Century Cures Act. Available at: https://www.congress.gov/bill/114th-congress/house-bill/34/text (accessed July 26, 2022).
114th Congress. 2016b. S.524 – Comprehensive Addiction and Recovery Act of 2016. Available at: https://www.congress.gov/bill/114th-congress/senate-bill/524/text (accessed July 27, 2022).
Abbas, H., F. Garberson, S. Liu-Mayo, E. Glover, and D. P. Wall. 2020. Multi-modular AI Approach to Streamline Autism Diagnosis in Young Children. Scientific Reports 10(5014). https://doi.org/10.1038/s41598-020-61213-w.
Abdulkareem, M., and S. E. Petersen. 2021. The Promise of AI in Detection, Diagnosis, and Epidemiology for Combating COVID-19: Beyond the Hype. Frontiers in Artificial Intelligence. https://doi.org/10.3389/frai.2021.652669.
Adjekum, A., A. Blasimme, and E. Vayena. 2018. Elements of Trust in Digital Health Systems: Scoping Review. Journal of Medical Internet Research 20(12):e11254. https://doi.org/10.2196/11254.
Aggarwal, N., M. Ahmed, S. Basu, J. J. Curtin, B. J. Evans, M. E. Matheny, S. Nundy, M. P. Sendak, C. Shachar, R. U. Shah, and S. Thadaney-Israni. 2020. Advancing Artificial Intelligence in Health Settings Outside the Hospital and Clinic. NAM Perspectives. Discussion Paper, National Academy of Medicine, Washington, DC. https://doi.org/10.31478/202011f.
Ajzen, I. 1991. The theory of planned behavior. Organizational Behavior and Human Decision Processes 50(2):179-211. https://doi.org/10.1016/0749-5978(91)90020-T.
Ajzen, I. 1985. From Intentions to Actions: A Theory of Planned Behavior. In Action Control, edited by J. Kuhl and J. Beckmann. Berlin: Springer. pp. 11-39.
Alami, H., P. Lehoux, Y. Auclair, M. de Guise, M. P. Gagnon, J. Shaw, D. Roy, R. Fleet, M. A. Ag Ahmed, and J. P. Fortin. 2020. Artificial Intelligence and Health Technology Assessment: Anticipating a New Level of Complexity. Journal of Medical Internet Research 22(7). https://doi.org/10.2196/17707.
Anumana. 2022. About Us. Available at: https://www.anumana.ai/aboutus (accessed May 12, 2022).
Ardila, D., A. P. Kiraly, S. Bharadwaj, B. Choi, J. J. Reicher, L. Peng, D. Tse, M. Etemadi, W. Ye, G. Corrado, D. P. Naidich, and S. Shetty. 2019. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nature Medicine 25(6):954-961. https://doi.org/10.1038/s41591-019-0447-x.
Barker, W., and C. Johnson. 2021. The Ecosystem of Apps and Software Integrated with Certified Health Information Technology. Journal of the American Medical Informatics Association 28(11):2379-2384. https://doi.org/10.1093/jamia/ocab171.
Benjamins, R. 2021. A choices framework for the responsible use of AI. AI and Ethics 1(1):49-53. https://doi.org/10.1007/s43681-020-00012-5.
Benjamens, S., P. Dhunnoo, and B. Meskó. 2020. The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database. npj Digital Medicine 3(118). https://doi.org/10.1038/s41746-020-00324-0.
Berger, D. 1999. A brief history of medical diagnosis and the birth of the clinical laboratory. Part 1—Ancient times through the 19th century. MLO: Medical Laboratory Observer 31(7):28-30, 32, 34-40. Available at: https://pubmed.ncbi.nlm.nih.gov/10539661/ (accessed July 26, 2022).
Bitterman, D. S., H. J. W. L. Aerts, and R. H. Mak. 2020. Approaching autonomy in medical artificial intelligence. The Lancet Digital Health 2(9):e447-e449. https://doi.org/10.1016/S2589-7500(20)30187-4.
Brajer, N., B. Cozzi, M. Gao, M. Nichols, M. Revoir, S. Balu, J. Futoma, J. Bae, N. Setji, A. Hernandez, and M. Sendak. 2020. Prospective and External Evaluation of a Machine Learning Model to Predict In-Hospital Mortality of Adults at Time of Admission. JAMA Network Open 3(2):e1920733. https://doi.org/10.1001/jamanetworkopen.2019.20733.
Brown, S. H., and R. A. Miller. 2014. Legal and regulatory issues related to the use of clinical software in health care delivery. In Clinical Decision Support, 2nd edition, edited by R. A. Greenes. New York: Elsevier.
CaseText. 2009a. Mracek v. Bryn Mawr Hospital. 610 F Supp 2d, 401 (ED Pa 2009). Available at: https://casetext.com/case/mracek-v-bryn-mawr-hosp-2 (accessed July 26, 2022).
CaseText. 2009b. Singh v. Edwards Lifesciences. Available at: https://casetext.com/case/singh-v-edwards-lifesciences (accessed July 27, 2022).
CaseText. 2008. Riegel v. Medtronic, Inc. Available at: https://casetext.com/case/riegel-v-medtronic-inc-3 (accessed September 16, 2022).
Char, D., N. Shah, and D. Magnus. 2018. Implementing machine learning in health care – addressing ethical challenges. New England Journal of Medicine 378(11):981-983. https://doi.org/10.1056/NEJMp1714229.
Chen, M. M., L. P. Golding, and G. N. Nicola. 2021. Who Will Pay for AI? Radiology: Artificial Intelligence 3(3). https://doi.org/10.1148/ryai.2021210030.
Clemens, J., and J. D. Gottlieb. 2017. In the Shadow of a Giant: Medicare’s Influence on Private Physician Payments. Journal of Political Economy 125(1):1-39. https://www.journals.uchicago.edu/doi/10.1086/689772.
Cortez, N. 2014. Regulating Disruptive Innovation. Berkeley Technology Law Journal 29:175-218. http://dx.doi.org/10.2139/ssrn.2436065.
Crigger, E., K. Reinbold, C. Hanson, A. Kao, K. Blake, and M. Irons. 2022. Trustworthy Augmented Intelligence in Health Care. Journal of Medical Systems 46(12). https://doi.org/10.1007/s10916-021-01790-z.
Curnutte, M. A., K. L. Frumovitz, J. M. Bollinger, A. L. McGuire, and D. J. Kaufman. 2014. Development of the clinical next-generation sequencing industry in a shifting policy climate. Nature Biotechnology 32(10):980-982. https://doi.org/10.1038/nbt.3030.
DeCamp, M., and C. Lindvall. 2020. Latent bias and the implementation of artificial intelligence in medicine. Journal of the American Medical Informatics Association 27(12):2020-2023. https://doi.org/10.1093/jamia/ocaa094.
Deverka, P. A., and J. C. Dreyfus. 2014. Clinical Integration of Next Generation Sequencing: Coverage and Reimbursement Challenges. Journal of Law, Medicine & Ethics 42:22-41. https://doi.org/10.1111/jlme.12160.
Digital Diagnostics. 2022. IDx-DR. Available at: https://www.digitaldiagnostics.com/products/eyedisease/idx-dr/ (accessed July 26, 2022).
Digital Diagnostics. 2019. Autonomous AI diagnostics launch in retail health clinics. November 19. Available at: https://www.digitaldiagnostics.com/newsroom/autonomous-ai-diagnostics-launch-in-retail-healthclinics/ (accessed May 11, 2022).
Duffy, G., P. P. Cheng, N. Yuan, B. He, A. C. Kwan, M. J. Shun-Shin, K. M. Alexander, J. Ebinger, M. P. Lundgren, F. Rader, D. H. Liang, I. Schnittger, E. A. Ashley, J. Y. You, J. Patel, R. Witteles, S. Cheng, and D. Ouyang. 2022. High-Throughput Precision Phenotyping of Left Ventricular Hypertrophy with Cardiovascular Deep Learning. JAMA Cardiology 7(4):386-395. https://doi.org/10.1001/jamacardio.2021.6059.
Escobar, G. J., V. X. Liu, A. Schuler, B. Lawson, J. D. Greene, and P. Kipnis. 2020. Automated Identification of Adults at Risk for In-Hospital Clinical Deterioration. New England Journal of Medicine 383(20):1951-1960. https://doi.org/10.1056/NEJMsa2001090.
European Commission. 2019. Ethics Guidelines for Trustworthy AI. High-level expert group on Artificial Intelligence. Available at: https://www.aepd.es/sites/default/files/2019-12/ai-ethics-guidelines.pdf (accessed on March 10, 2022).
Evans, B., and P. Ossorio. 2018. The Challenge of Regulating Clinical Decision Support Software After 21st Century Cures. American Journal of Law & Medicine 44(2-3):237-251. https://doi.org/10.1177/0098858818789418.
Evans, B., and F. Pasquale. 2022. Product Liability Suits for AI/ML Software. In The Future of Medical Device Regulation: Innovation and Protection, edited by I. G. Cohen, T. Minssen, W. N. Price II, C. Robinson, and C. Shachar. London: Cambridge University Press. pp. 22-46.
Evans, B. J., G. Javitt, R. Hall, M. Robertson, P. Ossorio, S. M. Wolf, T. Morgan, and E. W. Clayton. 2020. How Can Law and Policy Advance Genomic Analysis and Interpretation for Clinical Care? Journal of Law, Medicine, and Ethics 48 (Supp 1):44-68. https://doi.org/10.1177/1073110520916995.
Fenton, J. J., S. H. Taplin, P. A. Carney, L. Abraham, E. A. Sickles, C. D’Orsi, E. A. Berns, G. Cutter, E. Hendrick, W. E. Barlow, and J. G. Elmore. 2007. Influence of Computer-Aided Detection on Performance of Screening Mammography. New England Journal of Medicine 356:1399-1409. https://doi.org/10.1056/NEJMoa066099.
U.S. Food and Drug Administration (FDA). 2021a. Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. Available at: https://www.fda.gov/media/145022/download (accessed on May 11, 2022).
FDA. 2021b. Digital Health Software Precertification (Pre-Cert) Program. Available at: https://www.fda.gov/medical-devices/digital-health-center-excellence/digital-health-software-precertification-precert-program (accessed on May 11, 2022).
FDA. 2021c. CDRH Proposed Guidances for Fiscal Year 2022 (FY2022). Available at: https://www.fda.gov/medical-devices/guidance-documents-medical-devices-and-radiation-emitting-products/cdrh-proposed-guidances-fiscal-year-2022-fy2022 (accessed on September 14, 2022).
FDA. 2019a. Clinical Decision Support Software: Draft Guidance for Industry and Food and Drug Administration Staff. pp. 28. Available at: https://www.fda.gov/media/109618/download (accessed on May 11, 2022).
FDA. 2019b. Clinical Decision Support Software: Draft Guidance for Industry and Food and Drug Administration Staff. Available at: https://www.fda.gov/media/109618/download (accessed on May 11, 2022).
FDA. 2017a. Digital Health Innovation Action Plan. Available at: https://www.fda.gov/downloads/MedicalDevices/DigitalHealth/UCM568735.pdf (accessed on May 11, 2022).
FDA. 2017b. CFR – Code of Federal Regulations Title 21. Available at: https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?fr=801.4 (accessed September 16, 2022).
FDA. 2016. Step 3: Pathway to Approval. Available at: https://www.fda.gov/patients/device-development-process/step-3-pathway-approval (accessed May 15, 2022).
GlobeNewswire. 2020. Anaconda Releases 2020 State of Data Science Survey Results. Available at: https://www.globenewswire.com/news-release/2020/06/30/2055578/0/en/Anaconda-Releases-2020-State-of-Data-Science-Survey-Results.html (accessed May 11, 2022).
Goh, K. H., L. Wang, A. Y. K. Yeow, H. Poh, K. Li, J. J. L. Yeow, and G. Y. H. Tan. 2021. Artificial intelligence in sepsis early prediction and diagnosis using unstructured data in healthcare. Nature Communications 12. https://doi.org/10.1038/s41467-021-20910-4.
Goldfarb, A., and F. Teodoridis. 2022. Why is AI adoption in health care lagging? Brookings, March 9. Available at: https://www.brookings.edu/research/why-is-ai-adoption-in-health-care-lagging/ (accessed May 17, 2022).
Goldhahn, J., V. Rampton, and G. A. A. Spinas. 2018. Could artificial intelligence make doctors obsolete? BMJ 363:k4563. https://doi.org/10.1136/bmj.k4563.
Gottlieb, S. 2019. FDA Announces New Steps to Empower Consumers and Advance Digital Healthcare. Available at: https://www.fda.gov/news-events/fda-voices/fda-announces-new-steps-empower-consumers-and-advance-digital-healthcare (accessed May 11, 2022).
He, J., S. L. Baxter, J. Xu, J. Xu, X. Zhou, and K. Zhang. 2019. The practical implementation of artificial intelligence technologies in medicine. Nature Medicine 25:30-36. https://doi.org/10.1038/s41591-018-0307-0.
U.S. Department of Health and Human Services (HHS). 2020. 21st Century Cures Act: Interoperability, Information Blocking, and the ONC Health IT Certification Program. Available at: https://www.federalregister.gov/d/2020-07419 (accessed July 26, 2022).
HHS. 2003. Disclosures for Public Health Activities. Available at: https://www.hhs.gov/hipaa/for-professionals/privacy/guidance/disclosures-public-health-activities/index.html (accessed September 16, 2022).
Heartflow. 2014. Heartflow Secures De Novo Clearance from the U.S. Food and Drug Administration for Breakthrough FFRCT Technology. Available at: https://www.heartflow.com/newsroom/heartflowsecures-de-novo-clearance/ (accessed March 15, 2022).
Hinton, G. 2016. On Radiology. Available at: https://www.youtube.com/watch?v=2HMPRXstSvQ (accessed May 15, 2022).
Kaufman Hall & Associates. 2022. National Hospital Flash Report. Available at: https://www.kaufmanhall.com/sites/default/files/2022-03/National-Hospital-Flash-Report-March-2022.pdf (accessed May 25, 2022).
Kawamoto, K., P. V. Kukhareva, C. Weir, M. C. Flynn, C. J. Nanjo, D. K Martin, P. B Warner, D. E. Shields, S. Rodriguez-Loya, R. L. Bradshaw, R. C. Cornia, T. J. Reese, H. S. Kramer, T. Taft, R. L. Curran, K. L. Morgan, D. Borbolla, M. Hightower, W. J. Turnbull, M. B. Strong, W. W. Chapman, T. Gregory, C. H. Stipelman, J. H. Shakib, R. Hess, J. P. Boltax, J. P. Habboushe, F. Sakaguchi, K. M. Turner, S. P. Narus, S. Tarumi, W. Takeuchi, H. Ban, D. W. Wetter, C. Lam, T. J. Caverly, A. Fagerlin, C. Norlin, D. C. Malone, K. A. Kaphingst, W. K. Kohlmann, B. S. Brooke, and G. Del Fiol. 2021. Establishing a multidisciplinary initiative for interoperable electronic health record innovations at an academic medical center. JAMIA Open 4(3). https://doi.org/10.1093/jamiaopen/ooab041.
Kawamoto, K. 2005. Improving clinical practice using clinical decision support systems: a systematic review of trials to identify features critical to success. BMJ 330(7494):765. https://doi.org/10.1136/bmj.38398.500764.8F.
Kellogg, K. C., M. Sendak, and S. Balu. 2022. AI on the Front Lines. MIT Sloan Management Review. Available at: https://sloanreview.mit.edu/article/ai-on-the-front-lines/ (accessed May 11, 2022).
Kensaku, K., P. Kukhareva, C. Weir, M. Flynn, C. Nanjo, D. Martin, P. B. Warner, D. E. Shields, S. Rodriguez-Loya, R. L. Bradshaw, R. C. Cornia, T. J. Reese, H. S. Kramer, T. Taft, R. L. Curran, K. L. Morgan, D. Borbolla, M. Hightower, W. J. Turnbull, M. B. Strong, W. W. Chapman, T. Gregory, C. H. Stipelman, J. H. Shakib, R. Hess, J. P. Boltax, J. P. Habboushe, F. Sakaguchi, K. M. Turner, S. P. Narus, S. Tarumi, W. Takeuchi, H. Ban, D. W. Wetter, C. Lam, T. J. Caverly, A. Fagerlin, C. Norlin, D. C. Malone, K. A. Kaphingst, W. K. Kohlmann, B. S. Brooke, and G. Del Fiol. 2021. Establishing a Multidisciplinary Initiative for Interoperable Electronic Health Record Innovations at an Academic Medical Center. JAMIA Open 4(3):ooab041. https://doi.org/10.1093/jamiaopen/ooab041.
Khalifa, A., C. Mason, H. Garvin, M. Williams, G. Del Fiol, B. Jackson, S. Bleyl, G. Alterovitz, and S. Huff. 2021. Interoperable Genetic Lab Test Reports: Mapping Key Data Elements to HL7 FHIR Specifications and Professional Reporting Guidelines. Journal of the American Medical Informatics Association 28(12):2617–25. https://doi.org/10.1093/jamia/ocab201.
Khan, N. S., M. S. Ghani, and G. Anjum. 2021. ADAMsense: Anxiety-displaying activities recognition by motion sensors. Pervasive and Mobile Computing 78(21). https://doi.org/10.1016/j.pmcj.2021.101485.
Krueger, L. 2022. Clinical decision-making bias in darker skin types: a prospective survey study identifying diagnostic bias in decision to biopsy. Abstract presented at 18th Skin of Color Society Scientific Symposium, March 24, 2022. Boston, MA.
Lankton, N. K., D. H. McKnight, and J. Tripp. 2015. Technology, Humanness, and Trust: Rethinking Trust in Technology. Journal of the Association for Information Systems 16(10):880-918. https://doi.org/10.17705/1jais.00411.
Lee, P., A. Abernethy, D. Shaywitz, A. V. Gundlapalli, J. Weinstein, P. M. Doraiswamy, K. Schulman, and S. Madhavan. 2022. Digital Health COVID-19 Impact Assessment: Lessons Learned and Compelling Needs. NAM Perspectives. Discussion Paper, National Academy of Medicine, Washington, DC. https://doi.org/10.31478/202201c.
Lee, Y., Y. S. Kim, D.-I. Lee, S. Jeong, G.-H. Kang, Y. S. Jang, W. Kim, H. Y. Choi, J. G. Kim, and S.-H. Choi. 2022. The application of a deep learning system developed to reduce the time for RT-PCR in COVID-19 detection. Scientific Reports 12(1234). https://doi.org/10.1038/s41598-022-05069-2.
Leslie, D., A. Mazumder, A. Peppin, M. K. Wolters, and A. Hagerty. 2021. Does “AI” stand for augmenting inequality in the era of COVID-19 healthcare? BMJ 372:1-5. https://doi.org/10.1136/bmj.n304.
Lin, D., T. Nazreen, T. Rutowski, Y. Lu, A. Harati, E. Shriberg, P. Chlebek, and M. Aratow. 2022. Feasibility of a Machine Learning-Based Smartphone Application in Detecting Depression and Anxiety in a Generally Senior Population. Frontiers in Psychology 13. https://doi.org/10.3389/fpsyg.2022.811517.
Luzniak, K. 2021. “What’s the cost of artificial intelligence in healthcare?” Neoteric, December 16. Available at: https://neoteric.eu/blog/whats-the-cost-of-artificial-intelligence-in-healthcare/ (accessed May 9, 2022).
Mäkelä, K., M. I. Mäyränpää, H. K. Sihvo, P. Bergman, E. Sutinen, H. Ollila, R. Kaarteenaho, and M. Myllärniemi. 2021. Artificial intelligence identifies inflammation and confirms fibroblast foci as prognostic tissue biomarkers in idiopathic pulmonary fibrosis. Human Pathology 107:58-68. https://doi.org/10.1016/j.humpath.2020.10.008.
Maliha, G., S. Gerke, I. G. Cohen, and R. B. Parikh. 2021. Artificial Intelligence and Liability in Medicine: Balancing Safety and Innovation. Milbank Quarterly 99(3):629-647. https://doi.org/10.1111/1468-0009.12504.
Mandel, J. C., D. A. Kreda, K. D. Mandl, I. S. Kohane, and R. B. Ramoni. 2016. SMART on FHIR: A Standards-Based, Interoperable Apps Platform for Electronic Health Records. Journal of the American Medical Informatics Association 23(5):899-908. https://doi.org/10.1093/jamia/ocv189.
Mandl, K. D., and F. T. Bourgeois. 2017. The Evolution of Patient Diagnosis: From Art to Digital Data-Driven Science. JAMA 318(19):1859-1860. https://doi.org/10.1001/jama.2017.15028.
Mandl, K. D., and I. S. Kohane. 2017. A 21st-Century Health IT System – Creating a Real-World Information Economy. New England Journal of Medicine 376(20):1905-1907. https://doi.org/10.1056/NEJMp1700235.
Mandl, K. D., and I. S. Kohane. 2012. Escaping the EHR Trap – The Future of Health IT. New England Journal of Medicine 366(24):2240-2242. https://doi.org/10.1056/NEJMp1203102.
Mandl, K. D., J. C. Mandel, S. N. Murphy, E. V. Bernstam, R. L. Ramoni, D. A. Kreda, J. M. McCoy, B. Adida, and I. S. Kohane. 2012. The SMART Platform: Early Experience Enabling Substitutable Applications for Electronic Health Records. Journal of the American Medical Informatics Association 19(4):597-603. https://doi.org/10.1136/amiajnl-2011-000622.
Marmar, C. R., A. D. Brown, M. Qian, E. Laska, C. Siegel, M. Li, D. Abu-Amara, A. Tsiartas, C. Richey, J. Smith, B. Knoth, and D. Vergyri. 2019. Speech-based markers for posttraumatic stress disorder in US veterans. Depression and Anxiety 36:607-616. https://doi.org/10.1002/da.22890.
Matheny, M., S. Thadaney Israni, M. Ahmed, and D. Whicher, Editors. 2019. Artificial Intelligence in Health Care: The Hope, the Hype, the Promise, the Peril. NAM Special Publication, National Academy of Medicine, Washington, DC.
Melnick, E. R., L. N. Dyrbye, C. A. Sinsky, M. Trockel, C. P. West, L. Nedelec, M. A. Tutty, and T. Shanafelt. 2020. The association between perceived electronic health record usability and professional burnout among US physicians. Mayo Clinic Proceedings 95(3):476-487. https://doi.org/10.1016/j.mayocp.2019.09.024.
Miller, R. A. 1994. Medical Diagnostic Decision Support Systems – Past, Present, and Future: A Threaded Bibliography and Brief Commentary. Journal of the American Medical Informatics Association 1(1):8-27. https://doi.org/10.1136/jamia.1994.95236141.
Miller, R. A., and A. Geissbuhler. 2007. Diagnostic Decision Support Systems. In Clinical Decision Support Systems: Theory and Practice, 2nd edition, edited by K. J. Hannah and M. J. Ball. New York, NY: Springer Science.
Nakahara, H., K. Namba, A. Fukami, R. Watanabe, M. Mizutani, T. Matsu, S. Nishimura, S. Jinnouchi, S. Nagamachi, T. Ohnishi, S. Futami, L. G. Flores, M. Nakahara, and S. Tamura. 1998. Computer-Aided Diagnosis (CAD) for Mammography: Preliminary Results. Breast Cancer 5:401-405. https://doi.org/10.1007/BF02967438.
North Carolina State Health Plan and Johns Hopkins Bloomberg School of Public Health. 2021. North Carolina Hospitals: Charity Care Case Report. Available at: https://s3.documentcloud.org/documents/21094171/download-1.pdf (accessed May 11, 2022).
Office of the National Coordinator for Health Information Technology (ONC). 2018. Clinical Decision Support. Available at: https://www.healthit.gov/topic/safety/clinical-decision-support (accessed September 14, 2022).
Ommaya, A. K., P. F. Cipriano, D. B. Hoyt, K. A. Horvath, P. Tang, H. L. Paz, M. S. DeFrancesco, S. T. Hingle, S. Butler, and C. A. Sinsky. 2018. Care-Centered Clinical Documentation in the Digital Environment: Solutions to Alleviate Burnout. NAM Perspectives. Discussion Paper, National Academy of Medicine, Washington, DC. https://doi.org/10.31478/201801c.
Parakh, A., H. Lee, J. H. Lee, B. H. Eisner, D. V. Sahani, and S. Do. 2019. Urinary stone detection on CT images using deep convolutional neural networks: Evaluation of model performance and generalization. Radiology: Artificial Intelligence 1(4). https://doi.org/10.1148/ryai.2019180066.
Parikh, R. B., and L. A. Helmchen. 2022. Paying for artificial intelligence in medicine. npj Digital Medicine 5(63):1-5. https://doi.org/10.1038/s41746-022-00609-6.
Price, W. N., S. Gerke, and I. G. Cohen. 2019. Potential liability for physicians using artificial intelligence. JAMA 322(18):1765. https://doi.org/10.1001/jama.2019.15064.
Rajkomar, A., M. Hardt, M. Howell, G. Corrado, and M. Chin. 2018. Ensuring Fairness in Machine Learning to Advance Health Equity. Annals of Internal Medicine 169:866-872. https://doi.org/10.7326/M18-1990.
Ramsetty, A., and C. Adams. 2020. Impact of the digital divide in the age of COVID-19. Journal of the American Medical Informatics Association 27(7):1147-1148. https://doi.org/10.1093/jamia/ocaa078.
Ray, A., A. Gupta, and A. Al. 2020. Skin Lesion Classification with Deep Convolutional Neural Network: Process Development and Validation. Journal of Medical Internet Research Dermatology (1):e18438. https://doi.org/10.2196/18438.
Reisman, M. 2017. EHRs: The Challenge of Making Electronic Data Usable and Interoperable. P&T 42(9):572-575. Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5565131/ (accessed May 23, 2022).
Responsible Innovation Labs. 2022. Charter. Available at: https://www.rilabs.org/charter (accessed on June 30, 2022).
Ridgely, M. S., and M. D. Greenberg. 2012. Too many alerts, too much liability: sorting through the malpractice implications of drug-drug interaction clinical support. Saint Louis University Journal of Health Law & Policy 5(2):257-295. Available at: https://scholarship.law.slu.edu/jhlp/vol5/iss2/4 (accessed July 27, 2022).
Rodin, J., and S. Madsbjerg. 2021. Making Money Moral: How a New Wave of Visionaries Is Linking Purpose and Profit. Philadelphia: Wharton School Press.
Sandhu, S., A. L. Lin, N. Brajer, J. Sperling, W. Ratliff, A. D. Bedoya, S. Balu, C. O’Brien, and M. P. Sendak. 2020. Integrating a Machine Learning System into Clinical Workflows: Qualitative Study. Journal of Medical Internet Research 22(11). https://doi.org/10.2196/22421.
Sanyal, S. 2021. How much does artificial intelligence cost in 2021? Analytics Insights. Available at: https://www.analyticsinsight.net/how-much-does-artificial-intelligence-cost-in-2021/ (accessed May 11, 2022).
Sendak, M. P., J. D’Arcy, S. Kashyap, M. Gao, M. Nichols, K. Corey, W. Ratliff, and S. Balu. 2020a. A Path for Translation of Machine Learning Products into Healthcare Delivery. European Medical Journal Innovations. https://doi.org/10.33590/emjinnov/19-00172.
Sendak, M. P., W. Ratliff, D. Sarro, E. Alderton, J. Futoma, M. Gao, M. Nichols, M. Revoir, F. Yashar, C. Miller, K. Kester, S. Sandhu, K. Corey, N. Brajer, C. Tan, A. Lin, T. Brown, S. Engelbosch, K. Anstrom, M. C. Elish, K. Heller, R. Donohoe, J. Theiling, E. Poon, S. Balu, A. Bedoya, and C. O’Brien. 2020b. Real-World Integration of a Sepsis Deep Learning Technology into Routine Clinical Care: Implementation Study. JMIR Medical Informatics 8(7):e15182. https://doi.org/10.2196/15182.
Sendak, M. P., M. Gao, N. Brajer, and S. Balu. 2020c. Presenting machine learning model information to clinical end users with model facts labels. npj Digital Medicine 3(4). https://doi.org/10.1038/s41746-020-0253-3.
Sendak, M. P., S. Balu, and K. A. Schulman. 2017. Barriers to Achieving Economies of Scale in Analysis of EHR Data: A Cautionary Tale. Applied Clinical Informatics 8(3):826-831. https://doi.org/10.4338/ACI-2017-03-CR-0046.
Shen, Y., F. E. Shamout, J. R. Oliver, J. Witowski, K. Kannan, J. Park, N. Wu, C. Huddleston, S. Wolfson, A. Millet, R. Ehrenpreis, D. Awal, C. Tyma, N. Samreen, Y. Gao, C. Chhor, S. Gandhi, C. Lee, S. Kumari-Subaiya, C. Leonard, R. Mohammed, C. Moczuski, J. Altabet, J. Babb, A. Lewin, B. Reig, L. Moy, L. Heacock, and K. J. Geras. 2021. Artificial intelligence system reduces false-positive findings in the interpretation of breast ultrasound exams. Nature Communications 12(5645). https://doi.org/10.1038/s41467-021-26023-2.
Signaevsky, M., B. Marami, M. Prastawa, N. Tabish, M. A. Iida, X. F. Zhang, M. Sawyer, I. Duran, D. G. Koenigsberg, C. H. Bryce, L. M. Chahine, B. Mollenhauer, S. Mosovsky, L. Riley, K. D. Dave, J. Eberling, C. S. Coffey, C. H. Adler, G. E. Serrano, C. L. White III, J. Koll, G. Fernandez, J. Zeineh, C. Cordon-Cardo, T. G. Beach, and J. F. Crary. 2022. Antemortem detection of Parkinson’s disease pathology in peripheral biopsies using artificial intelligence. Acta Neuropathologica Communications 10(21). https://doi.org/10.1186/s40478-022-01318-7.
Singh, R. P., G. L. Hom, M. D. Abramoff, J. P. Campbell, and M. F. Chiang. 2020. Current Challenges and Barriers to Real-World Artificial Intelligence Adoption for the Healthcare System, Provider, and the Patient. Translational Vision Science & Technology 9(2):1-6. https://doi.org/10.1167/tvst.9.2.45.
HCA Healthcare Today. 2018. SPOT: How HCA Healthcare is “sniffing out” sepsis early. Available at: https://hcahealthcaretoday.com/2018/09/10/spot-how-hca-is-sniffing-out-sepsis-early/ (accessed May 1, 2022).
Sutton, R. T., D. Pincock, D. C. Baumgart, D. C. Sadowski, R. N. Fedorak, and K. I. Kroeker. 2020. An overview of clinical decision support systems: benefits, risks, and strategies for success. npj Digital Medicine 3(17). https://doi.org/10.1038/s41746-020-0221-y.
Syrowatka, A., M. Kuznetsova, A. Alsubai, A. L. Beckman, P. A. Bain, K. J. Thomas Craig, J. Hu, G. P. Jackson, K. Rhee, and D. W. Bates. 2021. Leveraging artificial intelligence for pandemic preparedness and response: a scoping review to identify key use cases. npj Digital Medicine 4(96). https://doi.org/10.1038/s41746-021-00459-8.
Tadavarthi, Y. B. V., E. Krupinski, A. Prater, J. Gichoya, N. Safdar, and H. Trivedi. 2020. The State of Radiology AI: Considerations for Purchase Decisions and Current Market Offerings. Radiology: Artificial Intelligence 2(6). https://doi.org/10.1148/ryai.2020200004.
Unsworth, H., V. Wolfram, B. Dillon, M. Salmon, F. Greaves, X. Liu, T. MacDonald, A. K. Denniston, V. Sounderajah, H. Ashrafian, A. Darzi, C. Ashurst, C. Holmes, and A. Weller. 2022. Building an evidence standards framework for artificial intelligence-enabled digital health technologies. The Lancet Digital Health 4(4):e216-e217. https://doi.org/10.1016/S2589-7500(22)00030-9.
Vinson, D. R., S. D. Casey, P. L. Vuong, J. Huang, D. W. Ballard, and M. E. Reed. 2022. Sustainability of a Clinical Decision Support Intervention for Outpatient Care for Emergency Department Patients With Acute Pulmonary Embolism. JAMA Network Open 5(5):e2212340. https://doi.org/10.1001/jamanetworkopen.2022.12340.
Walker, H. K. 1990. The Origins of the History and Physical Examination. In Clinical Methods: The History, Physical, and Laboratory Examinations, 3rd edition, edited by H. K. Walker, W. D. Hall, and J. W. Hurst. Boston, MA: Butterworths.
Wiens, J., S. Saria, M. Sendak, M. Ghassemi, V. X. Liu, F. Doshi-Velez, K. Jung, K. Heller, D. Kale, M. Saeed, P. N. Ossorio, S. Thadaney Israni, and A. Goldenberg. 2019. Do no harm: a roadmap for responsible machine learning for health care. Nature Medicine 25(9):1337-1340. https://doi.org/10.1038/s41591-019-0548-6.
Wolff, J., J. Pauling, A. Keck, and J. Baumbach. 2020. The Economic Impact of Artificial Intelligence in Health Care: Systematic Review. Journal of Medical Internet Research 22(2):e16866. https://doi.org/10.2196/16866.
Wong, A., E. Otles, J. P. Donnelly, A. Krumm, J. McCullough, O. DeTroyer-Cooley, J. Pestrue, M. Phillips, J. Konye, C. Penoza, M. Ghous, and K. Singh. 2021. External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients. JAMA Internal Medicine 181(8):1065-1070. https://doi.org/10.1001/jamainternmed.2021.2626.
Wu, A. C., C. Graif, S. G. Mitchell, J. Meurer, and K. D. Mandl. 2021. Creative Approaches for Assessing Long-term Outcomes in Children. Pediatrics 148(Suppl1):s25-s32. https://doi.org/10.1542/peds.2021-050693F.
Wu, T. 2011. Agency Threats. Duke Law Journal 60(8):1841-1857. Available at: https://scholarship.law.duke.edu/dlj/vol60/iss8/4 (accessed September 26, 2022).
Wynants, L., B. Van Calster, G. S. Collins, R. D. Riley, G. Heinze, E. Schuit, M. M. J. Bonten, D. L. Dahly, J. A. Damen, T. P. A. Debray, V. M. T. de Jong, M. De Vos, P. Dhiman, M. C. Haller, M. O. Harhay, L. Henckaerts, P. Heus, M. Kammer, N. Kreuzberger, A. Lohmann, K. Luijken, J. Ma, G. P. Martin, D. J. McLernon, C. L. Andaur Navarro, J. B. Reitsma, J. C. Sergeant, C. Shi, N. Skoetz, L. J. M. Smits, K. I. E. Snell, M. Sperrin, R. Spijker, E. W. Steyerberg, T. Takada, I. Tzoulaki, S. M. J. van Kuijk, B. C. T. van Bussel, I. C. C. van der Horst, F. S. van Royen, J. Y. Verbakel, C. Wallisch, J. Wilkinson, R. Wolff, L. Hooft, K. G. M. Moons, and M. van Smeden. 2020. Prediction models for diagnosis and prognosis of COVID-19: Systematic review and critical appraisal. BMJ 369:m1328. https://doi.org/10.1136/bmj.m1328.
Yala, A., C. Lehman, T. Schuster, T. Portnoi, and R. Barzilay. 2019. A Deep Learning Mammography-based Model for Improved Breast Cancer Risk Prediction. Radiology 292(1):60-66. https://doi.org/10.1148/radiol.2019182716.
Yang, Z., C. Silcox, M. Sendak, S. Rose, D. Rehkopf, R. Phillips, L. Peterson, M. Marino, J. Maier, S. Lin, W. Liaw, I. A. Kakadiaris, J. Heintzman, I. Chu, and A. Bazemore. 2022. Advancing primary care with Artificial Intelligence and Machine Learning. Healthcare (Amsterdam, Netherlands) 10(1). https://doi.org/10.1016/j.hjdsi.2021.100594.
Yu, K., A. L. Beam, and I. S. Kohane. 2018. Artificial intelligence in healthcare. Nature Biomedical Engineering 2:719–731. https://doi.org/10.1038/s41551-018-0305-z.
Zou, J., and L. Schiebinger. 2021. Ensuring that biomedical AI benefits diverse populations. eBioMedicine 67:1-6. https://doi.org/10.1016/j.ebiom.2021.103358.
Zusterzeel, R., B. A. Goldstein, B. J. Evans, T. Roades, K. Mercon, and C. Silcox. 2022. Evaluating AI-Enabled Clinical Decision and Diagnostic Support Tools Using Real-World Data. Duke Margolis Center for Health Policy. Available at: https://healthpolicy.duke.edu/publications/evaluating-ai-enabled-clinical-decision-and-diagnostic-support-tools-using-real-world (accessed October 13, 2022).

Suggested Citation

Adler-Milstein, J., N. Aggarwal, M. Ahmed, J. Castner, B. Evans, A. Gonzalez, C. A. James, S. Lin, K. Mandl, M. Matheny, M. Sendak, C. Shachar, and A. Williams. 2022. Meeting the Moment: Addressing Barriers and Facilitating Clinical Adoption of Artificial Intelligence in Medical Diagnosis. NAM Perspectives. Discussion Paper, National Academy of Medicine, Washington, DC. https://doi.org/10.31478/202209c.

DOI

https://doi.org/10.31478/202209c

Author Information

Julia Adler-Milstein, PhD, is Professor of Medicine and Director of the Center for Clinical Informatics and Improvement Research (CLIIR) at the University of California-San Francisco. Nakul Aggarwal, BS, is an MD-PhD candidate at the University of Wisconsin-Madison. Mahnoor Ahmed, MEng, is an Associate Program Officer at the National Academy of Medicine. Jessica Castner, PhD, RN-BC, is President of Castner Incorporated and Editor-in-Chief of the Journal of Emergency Nursing. Barbara J. Evans, PhD, JD, is Professor of Law and Stephen C. O’Connell Chair at the University of Florida. Andrew A. Gonzalez, MD, JD, MPH, is Associate Director for Data Science and Research Scientist at Regenstrief Institute. Cornelius A. James, MD, is a Clinical Assistant Professor in the Departments of Internal Medicine, Pediatrics, and Learning Health Sciences at the University of Michigan. Steven Lin, MD, is Clinical Associate Professor of Medicine at Stanford University. Kenneth D. Mandl, MD, MPH, is Director of the Computational Health Informatics Program (CHIP) at Boston Children’s Hospital. Michael E. Matheny, MD, MS, MPH, is Co-Director of the Center for Improving the Public’s Health through Informatics at Vanderbilt University. Mark P. Sendak, MD, MPP, is Population Health and Data Science Lead at the Duke Institute for Health Innovation at Duke University. Carmel Shachar, JD, MPH, is Executive Director of the Petrie-Flom Center for Health Law Policy, Biotechnology, and Bioethics at Harvard Law School. Asia Williams, MPH, is an Associate Program Officer at the National Academy of Medicine.

Jessica Castner is the current NAM Nurse Scholar-in-Residence; Andrew Gonzalez is the current NAM Gilbert S. Omenn fellow; Steven Lin is the current NAM James C. Puffer, M.D./American Board of Family Medicine fellow; and Julia Adler-Milstein and Kenneth D. Mandl are members of the National Academy of Medicine.

Conflict of Interest Disclosure

Jessica Castner discloses receiving grants and fees from the National Institutes of Health, fees from the Emergency Nurses Association, and serving as co-chair of the American Thoracic Society’s Health Policy Committee on Terrorism and Inhalation Disasters section. Barbara Evans discloses receiving grants from the National Institutes of Health. Steven Lin discloses serving as VP of Health Sciences for Codex Health, where he is a paid consultant; and receiving grants administered through Stanford University from Amazon, American Academy of Family Physicians, American Board of Family Medicine, Center for Professionalism and Value in Health Care, DeepScribe, Google Health, Omada Health, Predicta Med, Quadrant Technologies, Soap Health, Society of Teachers of Family Medicine, UCSF, and Verily. Kenneth Mandl discloses that his laboratory receives sponsored research funding from Quest Diagnostics; and that Boston Children’s Hospital receives corporate philanthropic support for his laboratory from SMART Advisory Committee members, which include the American Medical Association, BMJ Group, Eli Lilly and Company, Google Cloud, Hospital Corporation of America, Microsoft, Optum, Cambia Health Solutions, Quest Diagnostics, and Humana. Mark Sendak discloses that he is co-inventor of technology licensed from Duke University to Cohere Med, Inc. and Clinetic, Inc.; and that he holds equity in Clinetic, Inc. Carmel Shachar discloses that she is a member of Advarra’s Institutional Research Board.

Acknowledgements

This paper benefitted from the insights of Matthew Diamond, U.S. Food and Drug Administration; Maryellen Giger, University of Chicago; Brian Gurbaxani, Centers for Disease Control and Prevention; and Christina Silcox, Duke University.

Sections of the paper were developed based on the thoughtful input of Clifford Goodman, PhD, Lewin Group; Vivian Lee, MD, PhD, MBA, Verily; and Suzanne Tamang, PhD, Stanford University and Veterans Affairs.

Jessica Castner acknowledges the support of the American Academy of Nursing, the American Nurses Association, and the American Nurses Foundation.

Additional Information

DISCLAIMER

The views expressed in this paper are those of the authors and not necessarily of the authors’ organizations, the National Academy of Medicine (NAM), or the National Academies of Sciences, Engineering, and Medicine (the National Academies). The paper is intended to help inform and stimulate discussion. It is not a report of the NAM or the National Academies. Copyright by the National Academy of Sciences. All rights reserved.