ModReduce: A Multi-Knowledge Distillation Framework

Author's Department

Journalism & Mass Communication Department

Fifth Author's Department

Computer Science & Engineering Department


All Authors

Yahya Abbas, Abdelhakim Badawy, Mohamed Mahfouz, Samah Hussein, Samah Ayman, Hesham M. Eraqi, Cherif Salama

Document Type

Research Article

Publication Title

International Journal of Computing and Digital Systems

Publication Date

1-1-2025

DOI

10.12785/ijcds/1571065777

Abstract

Deep neural networks have achieved revolutionary results in several domains; however, they require extensive computational resources and a large memory footprint. Knowledge distillation research aims to enhance the performance of smaller models by transferring knowledge from larger networks. This knowledge can be categorized into three main types: response-based, feature-based, and relation-based. Prior works explored distilling one or two knowledge types; we hypothesize that distilling all three enables a more comprehensive transfer of information and improves the student's accuracy. In this paper, we propose ModReduce, a unified knowledge distillation framework that combines all three knowledge types through a mix of offline and online knowledge distillation. ModReduce incorporates the best currently available distillation method for each knowledge type to train a better student; as such, it can be updated with novel methods as they become available. During training, three students each learn a single knowledge type from the teacher using offline distillation before leveraging online distillation to teach each other what they have learned, akin to peer learning, where different students excel in different aspects of a subject under teacher guidance and then help each other master it. During inference, only the best-performing student (selected by validation accuracy during training) is used, so no additional inference cost is introduced. Extensive experimentation on 15 different teacher-student architecture pairs demonstrated that ModReduce produces a student that outperforms state-of-the-art methods with an average relative improvement of up to 48.29%, without additional inference cost. The source code is available at https://github.com/Yahya-Abbas/ModReduce.
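The abstract outlines the training recipe: three students each distill one knowledge type from the teacher offline, while an online mutual-learning term lets them teach one another. Below is a minimal, illustrative PyTorch-style sketch of such a combined objective. It is not the authors' implementation (see the linked GitHub repository for the official code); the temperature T, the weights alpha and beta, the specific per-type losses, and the assumption that each model returns (logits, features) are all placeholders chosen for illustration.

import torch
import torch.nn.functional as F

T = 4.0  # softmax temperature for distillation (hypothetical value)

def response_loss(s_logits, t_logits):
    # Response-based KD: KL divergence between softened output distributions.
    return F.kl_div(F.log_softmax(s_logits / T, dim=1),
                    F.log_softmax(t_logits / T, dim=1),
                    reduction="batchmean", log_target=True) * T * T

def feature_loss(s_feat, t_feat):
    # Feature-based KD: match intermediate representations (plain MSE here;
    # assumes features were already projected to a common shape).
    return F.mse_loss(s_feat, t_feat)

def relation_loss(s_feat, t_feat):
    # Relation-based KD: match pairwise similarities between batch samples.
    def sim(x):
        x = x.flatten(1)
        return F.normalize(x @ x.t(), dim=1)
    return F.mse_loss(sim(s_feat), sim(t_feat))

def mutual_loss(logits_list):
    # Online (peer) distillation: each student learns from the others' outputs.
    n, total = len(logits_list), 0.0
    for i in range(n):
        for j in range(n):
            if i != j:
                total = total + response_loss(logits_list[i],
                                              logits_list[j].detach())
    return total / (n * (n - 1))

def modreduce_step(teacher, students, x, y, alpha=0.5, beta=0.5):
    # students = [response_student, feature_student, relation_student];
    # each model is assumed to return (logits, features).
    with torch.no_grad():
        t_logits, t_feat = teacher(x)
    outs = [s(x) for s in students]
    logits = [o[0] for o in outs]
    offline = (response_loss(logits[0], t_logits)
               + feature_loss(outs[1][1], t_feat)
               + relation_loss(outs[2][1], t_feat))
    ce = sum(F.cross_entropy(l, y) for l in logits)
    return ce + alpha * offline + beta * mutual_loss(logits)

At inference time, only the student with the best validation accuracy would be kept, so the extra students add training cost but no deployment cost.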
