Novel DNA ligases from the Red Sea brine pools: Cloning, expression, in silico characterization and comparative thermostability

Iyanu Oduwole

Abstract

Extreme physicochemical conditions such as high temperature, high salinity, and the presence of heavy metals characterize some of the Red Sea brine pool environments. We screened two Red Sea brine pools, Atlantis II (ATII) and Discovery Deep (DD), and one interface layer (Kebrit Deep, KB) to identify novel DNA ligases with potentially extreme biochemical properties. Furthermore, we performed an in silico comparative thermostability study by examining the stabilizing role of proline and arginine residues in the loop conformations and exposed regions of ligase sequences from metagenomic assemblies of different extreme environments, including the Red Sea metagenomes. A sequence-based metagenomics approach was used to identify putative DNA ligase sequences in the Red Sea brine pool and interface layer metagenomes downloaded from the NCBI database. A total of 6,148,453 metagenomic reads were assembled using MEGAHIT, generating 783,176 contigs. A concatenated HMM, built from the raw HMMs of the ATP-dependent and NAD+-dependent ligase domains available in the Pfam database, was used to scan the ORFs predicted from the contigs. A total of 18 ORFs were identified, and two of them, LigATL1 (ATP type) from ATII and LigKDU4 (NAD+ type) from KB, were selected for synthesis, phylogenetic study, and further preliminary characterization. LigATL1 was cloned, expressed, and partially purified. Additionally, ligase sequences from psychrophilic, mesophilic, thermophilic, and hyperthermophilic environments were retrieved from the NCBI database for a comparative thermostability study with some of the putative Red Sea ligase sequences. The 22 retrieved ligase sequences were divided into five groups by closest taxonomic affiliation. The ConSurf and DisEMBL servers were used to analyze proline (Pro) and arginine (Arg) residues in the exposed/buried regions and in the loop and hot-loop regions, respectively, of the putative ligases (retrieved and Red Sea).
The putative LigATL1 showed 38% identity to an ATP-dependent DNA ligase from an Erysipelotrichaceae bacterium, while LigKDU4 showed 60% identity to an NAD+-dependent DNA ligase from a Candidatus Marinimicrobia bacterium. The phylogenetic analysis suggests that LigATL1 belongs to the LigD (ATP type) family, while LigKDU4 belongs to the LigA (NAD+ type) family. LigATL1 was modeled with 100% confidence using human DNA ligase bound to adenylated, nicked DNA as a template, and the model superimposed with the highest similarity (template modeling, TM, score = 1.0) onto the thermostable DNA ligase from S. solfataricus. LigKDU4 was modeled with 100% confidence using E. coli DNA ligase bound to adenylated, nicked DNA, and the model likewise superimposed with the highest similarity (TM score = 1.0) onto the thermostable t2 filiform DNA ligase. In vitro functional assays and biochemical characterization are still required to confirm both enzyme activity and thermostability. In the comparative thermostability analysis, many ligase sequences from thermophilic or hyperthermophilic environments had more Pro and Arg residues in both the exposed and the hot-loop regions than those from mesophilic and psychrophilic environments. The highest numbers of buried Pro and Arg residues were found in ligase sequences from psychrophilic environments in almost all groups. Two of the five putative ligase sequences selected from the thermophilic ATII environment had more hot-loop and fewer buried Pro and Arg residues than the other sequences in their respective groups. LigKDU4 (MLK) had more hot-loop and exposed Arg residues than the other sequences in its group, which is unusual compared with the Arg results for the other groups. This comparative study may offer insight into improving the thermal stability of enzymes in general.
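
The per-region residue counting behind the comparative analysis can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: it assumes that the per-residue annotations (for example, exposed or buried calls, or loop and hot-loop calls) have already been exported as a label string with one character per residue, and the function name, label characters, and toy sequence are all hypothetical.

```python
from collections import Counter


def count_by_region(sequence, labels, residues=("P", "R")):
    """Count selected residues (default: Pro and Arg) per region label.

    `labels` is assumed to hold one character per residue of `sequence`,
    e.g. 'e' for exposed and 'b' for buried (hypothetical encoding).
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")
    counts = {}
    for aa, region in zip(sequence.upper(), labels):
        if aa in residues:
            counts.setdefault(region, Counter())[aa] += 1
    # Convert Counters to plain dicts for easy comparison and printing.
    return {region: dict(c) for region, c in counts.items()}


# Toy input only; this is not a real ligase sequence or real server output.
toy_sequence = "MPRKPLARGP"
toy_labels = "eebbeebeeb"
toy_counts = count_by_region(toy_sequence, toy_labels)
print(toy_counts)
```

Applying the same function to each sequence in a taxonomic group would give per-region Pro and Arg counts that can be compared across psychrophilic, mesophilic, thermophilic, and hyperthermophilic sources.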