Text summarization can be categorized into two approaches: extractive and abstractive. The extractive approach selects highly meaningful sentences from the source document to form a summary, while the abstractive approach interprets the original document and generates a summary in its own words. Summary generation, whether extractive or abstractive, has been studied with statistical, graph-based, and deep-learning approaches. Deep learning has achieved promising performance compared with the classical approaches, and the evolution of neural networks such as the attention-based Transformer architecture opens potential areas for improving summarization. The introduction of Transformers and the BERT encoder model has advanced the performance of many downstream NLP tasks, including summarization. The objective of this thesis is to study the performance of deep learning-based models on text summarization through a series of experiments, and to propose “SqueezeBERTSum”, a summarization model fine-tuned with the SqueezeBERT encoder, which achieves competitive ROUGE scores, retaining 98% of the original BERT model’s performance with roughly 49% fewer trainable parameters.


School of Sciences and Engineering


Computer Science & Engineering Department

Degree Name

MS in Computer Science

Graduation Date

Winter 1-31-2022

Submission Date


First Advisor

Ahmed Rafea

Committee Member 1

Mohamed Moustafa

Committee Member 2

Mohsen Rashwan



Document Type

Master's Thesis

Institutional Review Board (IRB) Approval

Not necessary for this item