FDA Unveils 10 Guiding Principles for AI and ML Device Development

Good Machine Learning Practice for Medical Device Development: Guiding Principles



The U.S. Food and Drug Administration (FDA) unveiled a list of ‘guiding principles’ last week intended to promote the safe and effective development of medical devices that use artificial intelligence (AI) and machine learning (ML). The FDA, together with its U.K. and Canadian counterparts, released the principles to lay the foundation for ‘Good Machine Learning Practice’. AI and ML technologies have great potential to transform healthcare by deriving new and important insights from the vast amounts of data generated during the delivery of healthcare every day. These technologies use software algorithms that learn from real-world data and apply what they learn in situations where it can be advantageous. 

The agency said that as the field of AI/ML-enabled medical devices evolves, Good Machine Learning Practice and consensus standards become all the more important. But why are these guiding principles vital? According to the FDA, AI and ML technologies have great potential to advance healthcare, but their complexity also presents unique considerations. The agencies therefore drew up a total of 10 principles that identify areas where international standards organizations and other bodies can collaborate and work toward Good Machine Learning Practice. The principles can also be used to tailor and adapt good practices from other fields where doing so benefits health tech. 

The guiding principles may thus serve three purposes: adopting good practices that have proven themselves in other sectors, tailoring those practices to medical technology and healthcare, and creating new practices specific to the sector. Let’s see what these principles are. 

Multi-disciplinary expertise is leveraged throughout the total product life cycle: An in-depth understanding of a model’s intended integration into clinical workflow, the desired benefits, and the associated patient risks helps ensure that ML-enabled medical devices are safe and effective. 

Good software engineering and security practices are implemented: Model design implements good software engineering practices, data management, data quality assurance, and robust cybersecurity practices. 

Clinical study participants and data sets are representative of the intended patient population: Data collection protocols should ensure that the relevant characteristics of the intended patient population (such as age, gender, sex, race, and ethnicity) and measurement inputs are sufficiently represented in a sample of adequate size in the clinical study and in the training and test datasets. 
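To make this representativeness check concrete, here is a minimal sketch of how a development team might flag subgroups that are underrepresented relative to the intended patient population. The function name, the `"group"` field layout, and the 50% threshold are illustrative assumptions, not part of the guidance:

```python
def underrepresented_subgroups(dataset, population_shares, min_ratio=0.5):
    """Flag subgroups whose share in the dataset falls below min_ratio
    times their share in the intended patient population.

    dataset: list of dicts, each with a "group" key (assumed layout).
    population_shares: maps group name -> expected fraction of the
    intended patient population (illustrative inputs).
    """
    n = len(dataset)
    flags = {}
    for group, expected in population_shares.items():
        observed = sum(1 for record in dataset if record["group"] == group) / n
        if observed < expected * min_ratio:
            flags[group] = {"observed": observed, "expected": expected}
    return flags

# Toy example: the intended population is 50/30/20 across three groups,
# but group "C" is nearly absent from the collected dataset.
dataset = [{"group": "A"}] * 55 + [{"group": "B"}] * 40 + [{"group": "C"}] * 5
flags = underrepresented_subgroups(dataset, {"A": 0.5, "B": 0.3, "C": 0.2})
```

Here only group “C” would be flagged (5% observed versus 20% expected), signaling that more data collection is needed before the dataset can claim to represent the intended population.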

Training datasets are independent of test sets: Training and test datasets are selected and maintained to be appropriately independent of one another. 
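In clinical data, independence usually means splitting at the patient level rather than the record level, so that no patient contributes data to both sets. A small sketch under that assumption (the `patient_id` field and split fraction are hypothetical):

```python
import random

def patient_level_split(records, test_fraction=0.2, seed=42):
    """Split records into train/test by patient ID so that no patient
    appears in both sets. Record layout (a "patient_id" key per dict)
    is an illustrative assumption."""
    patient_ids = sorted({r["patient_id"] for r in records})
    rng = random.Random(seed)
    rng.shuffle(patient_ids)
    n_test = max(1, int(len(patient_ids) * test_fraction))
    test_ids = set(patient_ids[:n_test])
    train = [r for r in records if r["patient_id"] not in test_ids]
    test = [r for r in records if r["patient_id"] in test_ids]
    return train, test

# Toy dataset: 10 patients, 3 records each.
records = [{"patient_id": i // 3, "value": i} for i in range(30)]
train, test = patient_level_split(records)
train_ids = {r["patient_id"] for r in train}
test_ids = {r["patient_id"] for r in test}
assert train_ids.isdisjoint(test_ids)  # no patient leaks across the split
```

A naive per-record shuffle would scatter each patient’s records across both sets, letting the model “memorize” patients and inflating test performance, which is exactly the dependence this principle warns against.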

Selected reference datasets are based upon the best available methods: Accepted, best available methods for developing a reference dataset ensure that clinically relevant and well-characterized data are collected and that the limitations of the reference dataset are understood. 

Model design is modified according to the available data and reflects the intended use of the device: Model design is suited to the available data and supports the active mitigation of known risks such as performance degradation, overfitting, and security risks. 
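One common way to actively mitigate the overfitting risk this principle mentions is early stopping against a held-out validation set. A minimal, framework-agnostic sketch; the `fit_step` and `val_loss` callables are hypothetical stand-ins for a real training loop, not any particular library’s API:

```python
def train_with_early_stopping(fit_step, val_loss, max_epochs=100, patience=3):
    """Run fit_step once per epoch, and stop when validation loss has
    not improved for `patience` consecutive epochs. Returns the best
    validation loss seen (illustrative sketch)."""
    best, best_epoch = float("inf"), 0
    for epoch in range(max_epochs):
        fit_step(epoch)
        loss = val_loss()
        if loss < best:
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # validation loss plateaued: stop before overfitting worsens
    return best

# Toy usage: a validation-loss curve that bottoms out and then creeps up.
losses = iter([1.0, 0.8, 0.7, 0.72, 0.73, 0.74])
calls = []
best = train_with_early_stopping(lambda e: calls.append(e),
                                 lambda: next(losses))
```

In this toy run the loop stops after six epochs with a best loss of 0.7, instead of continuing to train on a model whose validation performance is already degrading.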

Focus is placed on the performance of the human-AI team: Model outputs are addressed with emphasis on the performance of the human-AI team rather than just the performance of the model in isolation. 

Testing demonstrates device performance under clinically relevant conditions: Considerations include the intended patient population, important subgroups, the clinical environment and use by the human-AI team, and potential confounding factors. 

Users are provided with clear, essential information: Users are given ready access to clear, contextually relevant information appropriate for the intended audience. 

Deployed models are monitored for performance, and re-training risks are managed: Deployed models can be monitored in real-world use with a focus on maintaining or improving safety and performance. And when models are periodically or continually re-trained after deployment, appropriate controls help manage risks such as overfitting, unintended bias, or degradation.
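The real-world monitoring this final principle calls for can be as simple as tracking rolling accuracy and flagging the device for human review when it drops. A minimal sketch; the window size and alert threshold here are assumed values for illustration, not FDA requirements:

```python
from collections import deque

class PerformanceMonitor:
    """Rolling-window monitor for a deployed model's real-world
    performance (illustrative sketch)."""

    def __init__(self, window=100, alert_threshold=0.90):
        self.outcomes = deque(maxlen=window)  # True = correct prediction
        self.alert_threshold = alert_threshold

    def record(self, prediction, ground_truth):
        self.outcomes.append(prediction == ground_truth)

    def accuracy(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else None

    def needs_review(self):
        # Flag for human review when rolling accuracy drops below threshold.
        acc = self.accuracy()
        return acc is not None and acc < self.alert_threshold

# Toy usage: 7 correct and 3 incorrect predictions in a 10-item window.
mon = PerformanceMonitor(window=10, alert_threshold=0.8)
for pred, truth in [(1, 1)] * 7 + [(1, 0)] * 3:
    mon.record(pred, truth)
```

With accuracy at 0.7 against a 0.8 threshold, the monitor raises its review flag, prompting the kind of human follow-up and re-training controls the principle describes.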