Federated Learning for Cross-Carrier Insurance Fraud Detection: Secure Multi-Institutional Collaboration
Keywords:
Insurance Fraud Detection, Federated Learning, Multi-Carrier Collaboration, Data Privacy, Differential Privacy, Secure Multiparty Computation, Homomorphic Encryption, Data Governance, Privacy-Preserving Mechanisms, Fraud Analytics, Horizontal and Vertical Partitioning, Data Confidentiality, Model Collaboration Framework, Label Accuracy, Cross-Institutional Learning, Fraud Prevention, Privacy-by-Design, Data Minimization, Secure Data Sharing, Law-Enforcement Optimization.
Abstract
Insurance fraud affects all players, individual customers and insurers alike, yet insurance companies are limited in their ability to detect it. Pooling data across multiple insurance carriers may enable federated machine-learning models that improve fraud-detection performance without compromising data confidentiality and privacy. The proposed system is based on horizontally or vertically partitioned federated-learning architectures, information-protection techniques such as differential privacy, and privacy-preserving mechanisms including secure multiparty computation and homomorphic encryption. Relevant data governance controls, especially those aligned with data minimization and so-called data-access principles, address the key privacy concerns in a multi-institutional collaboration environment. A more effective fraud-detection model derived from data contributed by multiple carriers may significantly enhance each institution's detection capability and help allocate law-enforcement and investigation resources optimally. Other factors that may limit insurers' fraud-detection capability include missing data, insufficient data, noisy data, and mislabeled samples. Ivory-Segatta and others proposed methods to mitigate these concerns by creating a model collaboration framework for a voice-recognition task. Their framework provided an easy-to-use, secure, and reliable environment for cross-collaboration learning between multiple parties to boost label-accuracy sampling for voice-recognition applications.
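
To make the horizontally partitioned setting concrete, the following is a minimal sketch of one federated round, not the system's actual implementation. It assumes a logistic-regression fraud classifier, plain federated averaging of per-carrier updates, and norm clipping with Gaussian noise as a stand-in for differential privacy. The function names and the parameters `lr`, `max_norm`, and `noise_std` are hypothetical, the noise level is not calibrated to any formal privacy budget, and secure multiparty computation and homomorphic encryption are omitted.

```python
import math
import random

def local_update(weights, features, labels, lr=0.1):
    """One SGD pass of logistic regression on a carrier's private claims.

    Returns only the weight delta; raw claims never leave the carrier.
    """
    w = list(weights)
    for x, y in zip(features, labels):
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))          # predicted fraud probability
        for j, xj in enumerate(x):
            w[j] -= lr * (p - y) * xj           # gradient step on log-loss
    return [wn - wo for wn, wo in zip(w, weights)]

def clip(delta, max_norm):
    """Scale a delta so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(d * d for d in delta))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [d * scale for d in delta]

def federated_round(weights, carriers, max_norm=1.0, noise_std=0.0, rng=None):
    """Average clipped carrier deltas, add Gaussian noise, update the model.

    carriers: list of (features, labels) pairs, one per carrier (assumed layout).
    noise_std: illustrative noise multiplier, not a calibrated privacy parameter.
    """
    rng = rng or random.Random(0)
    clipped = [clip(local_update(weights, X, y), max_norm) for X, y in carriers]
    n = len(clipped)
    avg = [sum(col) / n for col in zip(*clipped)]
    noisy = [a + rng.gauss(0.0, noise_std * max_norm / n) for a in avg]
    return [w + d for w, d in zip(weights, noisy)]
```

In this sketch each carrier shares only a clipped weight delta, which reflects the data-minimization idea above: the coordinator sees bounded model updates rather than claim records. A vertically partitioned setup, where carriers hold different features for the same entities, would need a different protocol and is not covered here.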


