In this thesis, we propose several MAC protocols based on three Reinforcement Learning (RL) approaches: Q-Learning, Deep Q-Network (DQN), and Deep Deterministic Policy Gradient (DDPG). We exploit the primary user (PU) feedback, in the form of ARQ and CQI bits, to enhance the performance of the secondary user (SU) MAC protocols. This use of PU feedback can be applied on top of any sensing-based SU MAC protocol. Our proposed model relies on two main pillars: an infinite-state Partially Observable Markov Decision Process (POMDP) that models the system dynamics, and a queuing-theoretic model for the PU queue. The states represent whether a packet has been delivered from the PU's queue and the PU channel state. The proposed RL access schemes are designed to learn the best SU access probabilities without prior knowledge of the environment, by exploring and exploiting discrete and continuous action spaces based on the last observed PU feedback. The proposed schemes achieve better results than conventional methods under more realistic assumptions, which is a major advantage of our proposed MAC protocols.


Electronics & Communications Engineering Department

Degree Name

MS in Electronics & Communication Engineering

Graduation Date

Winter 1-31-2021

Submission Date


First Advisor

Dr. Karim Seddik

Committee Member 1

Dr. Ayman El-Ezaby

Committee Member 2

Dr. Ahmed Khattab

Committee Member 3

Dr. Sherif Abdel Azeem


86 p.

Document Type

Master's Thesis

Institutional Review Board (IRB) Approval

Approval has been obtained for this item

Ehab_Maged_ElGuindy_signature.pdf (457 kB)