Jessie A Ellis
Oct 08, 2025 19:10
NVIDIA’s federated AI models, using FLARE and BioNeMo, improve protein localization predictions, crucial for biology and drug discovery, while maintaining data privacy.
In a significant advancement for biology and drug discovery, NVIDIA has introduced a method for predicting protein properties using federated AI models, as detailed by Holger Roth. This approach leverages NVIDIA FLARE and the BioNeMo Framework to predict the subcellular localization of proteins, a critical factor in understanding cellular processes and identifying therapeutic targets.
Federated Learning Approach
Federated learning lets researchers collaboratively train AI models without transferring sensitive data between institutions. The NVIDIA FLARE tutorial demonstrates how to fine-tune the ESM-2nv model to classify proteins by their subcellular localization. The model works from embeddings of protein sequences, and the data is drawn from datasets like those in the study “Light Attention Predicts Protein Location from the Language of Life.”
Data and Training Process
The data follows the biotrainer standard: FASTA files in which each entry includes a sequence, a training/validation split assignment, and one of ten location classes, such as Nucleus or Cell Membrane. This setup presents a real-world classification challenge, well suited to federated learning applications.
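To make the data layout concrete, here is a minimal Python sketch of reading such a file. It assumes a biotrainer-style header of the form `>ID TARGET=<class> SET=<split> VALIDATION=<True|False>`. The sample sequences, the IDs, and the helper names `parse_fasta` and `_make_record` are hypothetical and chosen only for illustration; the tutorial's own loading code may differ.

```python
from collections import Counter

# Hypothetical sample; the header field names are an assumed biotrainer-style layout.
SAMPLE_FASTA = """\
>Seq1 TARGET=Nucleus SET=train VALIDATION=False
MKTAYIAKQRQISFVKSHFSRQ
>Seq2 TARGET=Cell_membrane SET=train VALIDATION=True
MSDNGPQNQRNAPRITFGGPSD
>Seq3 TARGET=Nucleus SET=train VALIDATION=False
MADEEKLPPGWEKRMSRSSGRV
"""


def _make_record(header, seq_lines):
    """Turn one header line plus its sequence lines into a record dict."""
    seq_id, *fields = header.split()
    # Keep only KEY=VALUE tokens; anything else in the header is ignored.
    attrs = dict(f.split("=", 1) for f in fields if "=" in f)
    # Entries flagged VALIDATION=True go to the validation split.
    split = "val" if attrs.get("VALIDATION") == "True" else attrs.get("SET")
    return {
        "id": seq_id,
        "sequence": "".join(seq_lines),
        "label": attrs.get("TARGET"),
        "split": split,
    }


def parse_fasta(text):
    """Parse FASTA text into a list of records (id, sequence, label, split)."""
    records, header, seq_lines = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                records.append(_make_record(header, seq_lines))
            header, seq_lines = line[1:], []
        else:
            seq_lines.append(line)
    if header is not None:
        records.append(_make_record(header, seq_lines))
    return records


records = parse_fasta(SAMPLE_FASTA)
label_counts = Counter(r["label"] for r in records)
print(label_counts)
```

Counting labels per site in this way is a quick check of how unevenly the location classes are distributed before federated training starts.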
Researchers can run the Federated Protein Property Prediction tutorial in a Jupyter Lab environment using the BioNeMo Framework in Docker. NVIDIA FLARE handles the federated training: each site trains locally and shares only model updates, so raw data stays on site. The FedAvg method then aggregates these updates to form a global model.
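The aggregation step can be illustrated with a small, framework-free sketch of federated averaging, in which each site's parameters are weighted by the number of local training examples. This is not the NVIDIA FLARE API. The function name `fed_avg`, the parameter names, and the example counts are all hypothetical, and real model weights would be tensors rather than short Python lists.

```python
def fed_avg(site_updates):
    """Weighted average of per-site parameters.

    site_updates: list of (weights, num_examples) pairs, where weights maps
    a parameter name to a flat list of floats. All sites are assumed to
    share the same parameter names and shapes.
    """
    total = sum(n for _, n in site_updates)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    first_weights = site_updates[0][0]
    global_weights = {}
    for name, values in first_weights.items():
        acc = [0.0] * len(values)
        for weights, n in site_updates:
            # Each site contributes in proportion to its share of the data.
            for i, v in enumerate(weights[name]):
                acc[i] += v * n / total
        global_weights[name] = acc
    return global_weights


# Hypothetical updates from two sites with different amounts of local data.
site_a = ({"classifier.weight": [1.0, 2.0], "classifier.bias": [0.0]}, 100)
site_b = ({"classifier.weight": [3.0, 4.0], "classifier.bias": [1.0]}, 300)

global_model = fed_avg([site_a, site_b])
print(global_model)
```

Because site B holds three times as many examples as site A in this toy case, the averaged parameters land closer to site B's values. Only the parameter values and example counts cross site boundaries; the protein sequences themselves are never exchanged.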
Training and Results
The team fine-tuned the ESM-2nv model, balancing predictive accuracy with computational efficiency. Key steps included data splitting, federated averaging, and visualization using TensorBoard. This setup allowed both local and federated training to be monitored in real time.
Results showed that federated training outperformed local models, raising average accuracy across sites from 78.8% to 81.7%, a gain of 2.9 percentage points. This demonstrates how federated learning can improve model performance by drawing on data from multiple institutions.
Advantages of BioNeMo and FLARE
Integrating BioNeMo with FLARE offers benefits beyond protein localization: it supports AI development across the scientific community while maintaining data privacy. The approach fosters collaboration, allowing each site to contribute to a more robust model.
With these advancements, NVIDIA positions itself at the forefront of collaborative AI in life sciences. The future of AI in this field is collaborative, and the tools provided by NVIDIA FLARE and BioNeMo are paving the way for new discoveries in healthcare and biotechnology.
For more details, visit the official NVIDIA blog.