As vast amounts of unstructured data become available digitally, computer-based methods are needed to extract relevant and meaningful information. Named entity recognition (NER) is the task of identifying text spans that mention named entities and classifying them into predefined categories. Despite the existence of numerous well-established NER methods, the biomedical domain remains under-studied. The objective of this research is to identify an efficient technique for NER on biomedical data. To this end, we investigate deep learning approaches, namely the pre-trained BERT model and its variants SciBERT and BioBERT. Because preprocessing the data before training influences model performance, we also experiment with several preprocessing rules and monitor their effect on performance. Our model outperforms vanilla BERT and BioBERT, achieving a precision of 66.20%, a recall of 98.96%, and an F1 score of 79.33%.
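The reported scores can be related to one another with a short calculation. The sketch below assumes that F1 is the standard harmonic mean of precision and recall; the abstract does not state the averaging scheme, and the function name `f1_score` is only illustrative, not taken from the thesis code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when the model makes no correct predictions.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, converted from percentages to fractions.
reported_precision = 0.6620
reported_recall = 0.9896

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 79.33%"
```

Under that assumption the reported figures are mutually consistent. The gap between recall and precision suggests that the model finds nearly all entity mentions while also producing a sizeable share of false positives.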
School of Sciences and Engineering
Computer Science & Engineering Department
MS in Computer Science
Institutional Review Board (IRB) Approval
Not necessary for this item
Guirguis, M. (2023). Named Entity Recognition from Biomedical Text [Master's Thesis, the American University in Cairo]. AUC Knowledge Fountain.
Guirguis, Maged. Named Entity Recognition from Biomedical Text. 2023. American University in Cairo, Master's Thesis. AUC Knowledge Fountain.