Abstract
A large-scale, high-quality annotated dataset is crucial for building a powerful deep learning model. However, creating such a dataset raises significant privacy and security concerns, especially when it involves multiple organizations. Ensuring high-quality annotations is also expensive and time-consuming. Federated Learning (FL) has emerged as a new paradigm of distributed learning, enabling multiple participants to collaboratively build a powerful model without the need to collect the participants' local data. Despite its great potential, FL faces several challenges, including supporting heterogeneous local models across clients and dealing with label noise, a problem that has not yet been extensively addressed in the literature, even for conventional centralized learning (CL) scenarios.
To tackle the challenge of model heterogeneity in FL, we propose a novel rule-based collaborative learning framework. This framework empowers each participant to select a customized local learning model that aligns with the characteristics of its local training data, as well as its storage and computing capabilities. To address the challenge of label noise, we first propose a simple yet effective noise-robust learning method for CL scenarios. This method mitigates the negative impact of noisy labels for datasets with up to 50% label noise, without introducing significant extra computational cost. We then extend this method to FL scenarios by designing a two-stage learning process, which maintains effective performance even when local participants have limited samples. Empirical comparisons with state-of-the-art methods on well-known benchmark datasets validate the effectiveness and superiority of the proposed models.