Federated Learning

Federated learning is an emerging technology that keeps healthcare data decentralized and protects patients’ privacy.

Artificial intelligence and its applications are helping healthcare in many valuable ways. From diagnosing diseases at an early stage to assisting in surgery and using robots to stay close to people in critical situations, AI has made a significant mark on the world of medicine. Unfortunately, it also poses a major risk to the privacy of patient data. Researchers and scientists across the globe are working on federated learning, a recently proposed machine learning method, to address patients’ data privacy concerns.

Gaining access to large volumes of data is the first challenge that needs a solution. Healthcare data in particular contains private details of patients’ health conditions, so it is difficult to share and critical to protect. IBM Watson, one of the best-known applications of AI in healthcare, was reported to have recommended, during a simulation, a drug that could have killed a cancer patient. The incident reportedly happened because IBM Watson had been trained on a limited set of ‘hypothetical’ cancer cases instead of real patient data. The mishap suggests that training an AI system on too little data is dangerous in healthcare. However, training AI on datasets pooled from multiple sources raises the risk of a data privacy breach. To tackle this challenge, both the healthcare and technology sectors are working on federated learning to protect patient data.

Federated learning is a machine learning method that trains on a decentralized dataset. Models are trained on the client side, so each client’s data stays private, and the knowledge from these nodes is aggregated to learn a global model. The interesting part is that the data are kept private and never transmitted to any other node. Instead, the parameters of the global model are shared with the clients, and once local training is done, the updated parameters are sent back to the server for aggregation. This approach is widely seen as a way to learn from sensitive data without having to hand it over. Researchers have carried out a couple of trials of federated learning, which turned out to be successful.
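The share, train locally, and aggregate loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the method used in any of the studies mentioned in this article: it assumes a toy linear model, three simulated "hospital" clients, and weighted averaging of parameters (commonly called federated averaging). All function names and the synthetic data are hypothetical.

```python
import random

def local_train(weights, data, lr=0.05, epochs=20):
    """Train on one client's private data, starting from the global weights."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y          # prediction error for y = w*x + b
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)                          # only parameters leave the client

def federated_average(updates, sizes):
    """Average client parameters, weighted by each client's sample count."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (w, b)

def run_federated(clients, rounds=50):
    """Server loop: broadcast the global model, collect updates, aggregate."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_train(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights

def make_client(rng, n, lo, hi):
    """Synthetic private dataset (assumption): y = 3x + 1 plus small noise."""
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
# Each client covers a different input range and has a different sample count.
clients = [
    make_client(rng, 40, 0.0, 1.0),
    make_client(rng, 10, 1.0, 2.0),
    make_client(rng, 25, -1.0, 0.0),
]
w, b = run_federated(clients)
print(f"global model: w={w:.3f}, b={b:.3f}")
```

In this sketch the raw `(x, y)` pairs never leave `local_train`; the server only sees model parameters. A real deployment would typically add secure communication and further protections on top of this, which the sketch omits.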

 

Using federated learning to train AI algorithms

The research was conducted by researchers at the London Medical Imaging & AI Centre for Value Based Healthcare, along with partner organisations such as Nvidia and Owkin and fourteen other institutions. The paper, published in npj Digital Medicine, describes the solutions federated learning may provide for the future of digital health, and the challenges that arise around the quality, governance, and security of patient data. It proposes federated learning, a machine learning technique that trains an algorithm across multiple decentralised data sources, as a way to use large volumes of clinical data securely and help realise the full potential of machine learning in healthcare. The paper also highlights that, because existing medical data sits in silos with restricted access, a federated learning model could be the key to realising the potential of AI.

 

Testing federated learning on healthcare data

Researchers from the Samsung Advanced Institute of Health Sciences & Technology, Sungkyunkwan University, Seoul, along with researchers at Samsung Medical Centre and the Department of Intelligent Precision Healthcare Convergence, Sungkyunkwan University, Suwon, evaluated federated learning in a realistic setting. The team implemented federated learning in Python using a client-server architecture and deployed the software to Amazon Web Services. The Modified National Institute of Standards and Technology (MNIST), Medical Information Mart for Intensive Care-III (MIMIC-III), and electrocardiogram (ECG) datasets were used to evaluate its performance. Federated learning demonstrated reliable performance even when the data distribution was imbalanced, skewed, or extreme, reflecting the real-life scenario in which data distributions differ from hospital to hospital.
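To make "skewed" distributions concrete, the sketch below shows one common way to simulate them when testing federated learning: splitting a labelled dataset so that each client only ever sees a few classes. This is an illustrative assumption, not the partitioning scheme used by the study; the function name `partition_by_label_skew` and its parameters are hypothetical.

```python
import random
from collections import Counter

def partition_by_label_skew(labels, num_clients, classes_per_client, seed=0):
    """Split sample indices so each client holds only a few classes."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # Group sample indices by class and shuffle within each class.
    by_class = {c: [i for i, y in enumerate(labels) if y == c] for c in classes}
    for idxs in by_class.values():
        rng.shuffle(idxs)
    # Assign classes to clients round-robin.
    client_classes = [
        [classes[(k * classes_per_client + j) % len(classes)]
         for j in range(classes_per_client)]
        for k in range(num_clients)
    ]
    # A class held by several clients is divided evenly among them.
    holders = Counter(c for cc in client_classes for c in cc)
    cursor = {c: 0 for c in classes}
    parts = []
    for cc in client_classes:
        part = []
        for c in cc:
            share = len(by_class[c]) // holders[c]
            part.extend(by_class[c][cursor[c]:cursor[c] + share])
            cursor[c] += share
        parts.append(part)
    return parts

# Toy example: 1000 samples with 10 balanced classes (MNIST-like labels).
labels = [i % 10 for i in range(1000)]
parts = partition_by_label_skew(labels, num_clients=5, classes_per_client=2)
for k, part in enumerate(parts):
    print(k, sorted(set(labels[i] for i in part)), len(part))
```

Each simulated client ends up with a very different label distribution from the overall dataset, which mimics hospitals that see different patient populations.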