Hyper-Variable Spike Protein of Omicron Corona Virus and Its Differences with Alpha and Delta Variants: Prospects of RT-PCR and New Vaccine

Abstract
The NCBI SARS-CoV-2 Database was analyzed between November and December 2021 to decipher the spread of Delta corona virus variants in the USA and to compare them with the highly transmissible new omicron variant that recently originated in South Africa. At that time, Delta variants of the B.1.617.2 and AY.103 lineages, carrying the spike protein mutations L452R, T478K and P681R and a two-amino-acid deletion (F157/R158), were predominant in the USA and had superseded the deadly outbreaks of the B.1.1.7 Alpha variant, which carries deletions of H69, V70 and Y145 as well as the highly transmissible mutations N501Y and D614G. Interestingly, the omicron variant has six immune-escape deletions (H69, V70, V143, Y144, Y145 and L212) as well as 29 mutations in the spike protein, including the most deadly N501Y (Y498 in omicron) and D614G (G611 in omicron). This indicated that the omicron variant originated by combination among the B.1.1.7, AY.X and B.1.617.2 lineages. A unique three-amino-acid (EPE) insertion at position 215 of the spike protein was detected, which compensates for the six deletions and suggests further recombination events. Three serine residues were mutated at amino acids 371 (S→L; L368 in omicron), 373 (S→P; P370 in omicron) and 375 (S→F; F372 in omicron), but this was compensated by serine gains at 446 (G→S; S443 in omicron) and 496 (G→S; S493 in omicron) in the RBD of the omicron virus. A three-amino-acid (ERS) deletion at position 30 of the N-protein acts as another signature of the omicron virus. The omicron variant has fewer mutations in the 5' two-thirds of the genome, which codes for the ORF1ab polyprotein, but carries the dominant P4715L mutation in the RNA-dependent RNA polymerase. However, the overall amino acid composition, aliphatic index and instability index were found to be fairly constant, although the hydrophobicity plot showed some differences between the spike proteins of the Wuhan and omicron corona viruses.
A BLAST search detected 20-nt and 19-nt perfect matches of the hyper-variable 22957-22977 nt region, comprising amino acids 488-493 (NH2-PLRSYS-CO2H) of the omicron spike protein, with ch-2 of Seladonia tumulorum and ch-16 of Steromphala cineraria, respectively. A primer set designed from the RBD of the spike gene did not detect the omicron genome in a BLAST search, but primers from the constant regions of the genome worked well. Such hyper-variation in the spike protein suggested that a DNA vaccine or mRNA vaccine using the spike gene of corona virus may not protect efficiently against omicron virus infection and attenuated whole corona virus vaccine will be

Severe COVID-19 is more common in adults aged ~70 years with co-morbidities such as diabetes, cardiovascular disease and chronic respiratory disease. Case fatality rates differed across countries, possibly because of differing demographic composition and the types of control measures taken in different countries to stop viral spread [15]. According to the 2020 database, the major clades of SARS-CoV-2 were identified and named Clade G (variant of the spike protein, S-D614G), Clade V (variant of the ORF3a-encoded protein, NS3-G251V), Clade GR (S-D614G + N-G204R) and Clade S (variant ORF8-L84S) [16,17]. The number of SARS-CoV-2 variants rose many fold in late 2020, and at least three variants of concern (B. At present, at least ten vaccines have been used to vaccinate about 70% of the world population. A vaccine is usually a protein or synthetic peptide from the corona virus that can elicit humoral antibody (IgG) as well as a T-cell-mediated ability to destroy the virus. Attenuated or killed corona virus (Covaxin, Bharat Biotech, India) is also used, as in pox vaccination. Because genetic information in cells flows from DNA to RNA to protein, scientists have exploited DNA vaccines as well as RNA vaccines for protection against corona virus. Russia (Sputnik V) and England (Oxford + AstraZeneca, also manufactured by the Serum Institute of India) use S gene DNA vaccines delivered by adenovirus vectors (Ad26 and Ad5 for Sputnik V; a chimpanzee adenovirus for Oxford + AstraZeneca), whereas the USA (Moderna, Pfizer) and Germany's BioNTech use S gene mRNA vaccines [19]. Most companies used the spike protein (S gene), the corona virus protein that binds the ACE-2 receptor of human and animal lung cells.
Mutations can greatly increase virus transmission, as in the case of the D614G and N501Y mutations [16]. Further, decreased vaccine utility (protection against virus) was reported with immune escape (T cell immunity), as in the case of the deletions of amino acids 69, 70 and 145 in the alpha corona virus (B.1.1.7) [20]. Mutations such as L452R, E484K and others in the RBD greatly lower the efficacy with which serum antibodies from earlier corona patients neutralize mutant viruses [21]. At present, deletion of F157 and R158 in the AY.X and B.1.617.2 Delta variants, together with the D614G mutation, has increased transmission and further lowered vaccine efficacy. Very recently, new mutations distinct to the omicron virus were found to lower vaccine utility and increase the transmission rate, but confirmed immunological data are yet to come [22][23][24]. Here we study the spread of the omicron virus at the molecular level, specifically in the USA, by analyzing the NCBI Virus Database between 20th November and 25th December 2021 using free software available on the internet.

Database analysis
We used only the NCBI (www.ncbi.nlm.gov) SARS-CoV-2 database, as it gave multi-alignment data for up to 500 sequences. Such alignment showed that most sequences were incomplete, and these were analyzed separately. Complete sequences were checked for comparable sequences by looking at the red lines that mark mutations; most were AY.X variant corona viruses. Omicron viruses have many red lines between 21000 and 28000 nt, the region coding for the hyper-variable spike protein and other structural proteins (Figure 1C). Thus, we screened many sequences to obtain the few omicron virus sequences useful for analysis with the freely available Multalin and CLUSTAL Omega software. Alignment of the spike protein (1273 aa) took 2-3 minutes with Multalin, whereas alignment of the 30 kb RNA genome took 30-40 minutes with CLUSTAL Omega. Because most sequences were of the AY.X Delta type (~85%) and some of the Delta B.1.617.2 type (~10%), we documented data only for the omicron sequences (0.5-5%) deposited between 23rd November and 24th December 2021. The date of sequence deposition in the NCBI Database, the author's name and the collection date of the virus were used to analyze sequence sets. At first we detected only one omicron corona virus sequence in such a search. We then BLAST-searched the hyper-variable region (60 nt) and obtained three more omicron sequences: one US-originated sequence with an incomplete spike protein and two complete sequences, one each from Canada and Belgium. However, from December 6, more and more omicron viruses were detected in the database. Our guess is that the sequencing primers used for the Wuhan, Alpha and Delta variants did not work well with the standard kits available. More and more complete omicron virus sequences will be deposited afterwards.
Figure legend (fragment): Red denotes mutant amino acids; underlining marks well-characterized mutations that enhance viral transmission (G611 and Y498 here).
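The omicron-numbered residues quoted in this paper (for example G611 and Y498) differ from the Wuhan numbering (D614G, N501Y) by a fixed offset that follows from the six deletions and the three-residue insertion listed in the abstract. The short Python sketch below illustrates that bookkeeping. It is not the alignment procedure used in this study: the function name is hypothetical, and it assumes that the EPE insertion shifts every residue numbered above 215.

```python
from bisect import bisect_left
from typing import Optional

# Values as stated in the abstract (Wuhan numbering); not an independent annotation.
DELETED = [69, 70, 143, 144, 145, 212]  # six single-residue deletions, sorted
INSERT_SITE = 215                        # EPE insertion site (assumption: shifts residues > 215)
INSERT_LEN = 3                           # E, P, E


def wuhan_to_omicron(pos: int) -> Optional[int]:
    """Map a Wuhan spike residue number to omicron numbering.

    Returns None when the residue itself is one of the deleted positions.
    """
    if pos in DELETED:
        return None
    # Each deletion upstream of pos moves the residue one position down.
    shift = -bisect_left(DELETED, pos)
    # The insertion moves every downstream residue three positions up.
    if pos > INSERT_SITE:
        shift += INSERT_LEN
    return pos + shift


for wuhan_pos in (371, 373, 375, 446, 496, 501, 614):
    print(wuhan_pos, "->", wuhan_to_omicron(wuhan_pos))
```

For residues downstream of the insertion the net shift is -6 + 3 = -3, which reproduces the pairs given in the abstract (501/498, 614/611, 371/368, 446/443 and so on).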
We have no access to the GISAID database; the accession number of the first omicron virus genome (29684 bp) there is EPI_ISL_6640916, with collection date 11-11-2021 and submission date 23-11-2021. Parts of the multi-alignments are presented with the different omicron signatures. During review of the paper we analyzed the database again and found that a very large number of omicron sequences had been deposited in the last week of December 2021 and the first week of January 2022.

Results
We first detected an omicron strain (B.     The last day of our analysis covered sequences deposited on 24-12-2021. We found by multi-alignment that Nickerson, et al., deposited many omicron sequences originating in Washington, with accession numbers OM003743 (19-12- We collected all 141 suspected omicron sequences of 1270 aa length (complete + incomplete) and found 21 complete sequences. Multi-alignment revealed three mutations in the omicron sequences (D212Y in UHO53537; R343K in UHO53468/91 and UHO53648; and A698V in UHO53131, UGO96815 and UGO96803). So, up to 24th December 2021, the database penetration of complete + incomplete omicron sequences was 0.0379% (total sequences deposited from December 6 to December 24, 2021: 371,307), and for complete sequences it fell further to 0.0056%. This indicated that it was very hard to obtain omicron virus sequences that were complete and authentic (Figure 6). We wanted to learn more signatures of the omicron corona virus. We found no changes in the furin cleavage site of the S protein (Figure 8) (Hoffmann, et al., 2020). When we analyzed the N-protein sequences from omicron viruses, we detected a unique three-amino-acid deletion (ERS) at position 30; the multi-alignment data are presented in Figure 9. Interestingly, the N-protein had point mutations such as P13L, R203K and G204R, and in some sequences D343G. Similarly, two additional new mutations were detected in the M-protein (D3G, A63T) (Figure 10), but no mutation was found in the ORF3a protein (data not shown). The small structural E protein (75 aa) of the omicron virus has one mutation (T9I; data not shown). Nevertheless, omicron virus transmission is rapidly increasing in 90 countries; in some US states it is about 10%, whereas in South Africa it is now 90%. Thus, the rate of omicron transmission has increased in the UK and Germany, whereas about 400 patients were detected in India.
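The database penetration figures quoted above are simple ratios of sequence counts. The following Python sketch recomputes them from the counts given in the text (141 omicron sequences, 21 complete sequences, 371,307 total deposits); the function and constant names are hypothetical and serve only as an arithmetic check.

```python
# Counts taken from the text (deposits of 6-24 December 2021).
TOTAL_DEPOSITED = 371_307   # all SARS-CoV-2 sequences deposited in the period
OMICRON_ALL = 141           # complete + incomplete omicron sequences
OMICRON_COMPLETE = 21       # complete omicron sequences


def penetration_percent(count: int, total: int) -> float:
    """Share of `count` in `total`, expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


print(f"all omicron:      {penetration_percent(OMICRON_ALL, TOTAL_DEPOSITED):.4f}%")
print(f"complete omicron: {penetration_percent(OMICRON_COMPLETE, TOTAL_DEPOSITED):.4f}%")
```

The computed values agree with the 0.0379% and 0.0056% reported above when the fourth decimal place is truncated rather than rounded.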
Deaths have already been reported in England and the USA, although omicron disease appeared to be less virulent, with oxygen support and hospitalization unnecessary.
Surprisingly, on 25-12-2021 a huge amount of omicron data was deposited in the NCBI SARS-CoV-2 Database, and we counted more than 216 omicron sequences. However, in our analysis of 27-12-2021 we did not find the usual multi-alignment pattern for the omicron virus (Figure 1C). Ultimately we discovered that the heavy red lines now appeared in the Delta variants (reversed) because of the huge deposition of omicron sequences from 25-12-2021 onwards. We then BLAST-searched the 60 nt hyper-variable region of the spike gene (22894 5'-AAC TGA AAT CTA TCA GGC CGG TAA CAA ACC TTG TAA TGG TGT TGC AGG TTT TAA TTG TTA-3' 22953) and found 3815 possible omicron sequences, instead of the only four (4) found in the BLAST search of 08-12-2021. Such data were astonishing, and a huge spread of corona virus was evident in the USA from the second week of December 2021. In the analyses of 29-12-2021 and 31-12-2021 we found a mixed trend of alignment, suggesting that our method still works well. Omicron penetration likely increased to 1.2% by the end of December 2021.
Next we analyzed the importance of S gene mutations for corona virus diagnostics. Many RT-PCR kits utilize S gene primers, and some kits appeared unable to give RT-PCR data from the S gene region. We determined the sequence variation in the omicron virus compared with the Wuhan 2019 strain. The data are presented in Figure 11, where two or more regions of the genome are shown. We made 10 primer sets using the NCBI Primer Design software, one of which was located in the S gene. BLAST analysis found that the forward primer (F3 = 5'-23518GAC TAA GTC TCA TCG GCG GG23537-3') would not hybridize to the omicron genome, but the reverse primer (5'-24130CCC ACA TGA GGG ACA AGG AC24111-3') did well. Old primers designed for the Wuhan strain could hardly identify the S gene of omicron variants, indicating that new primer designs are necessary to track omicron transmission using S gene primers. This is an example to be aware of when RT-PCR with old primers is used to detect the spread of the omicron virus. The BLAST matches of the hyper-variable spike region to the Seladonia tumulorum and Steromphala cineraria genomes are shown in Figure 12. We do not know why the viral sequence shows such similarity to the genomes of lower eukaryotes such as a fly or a mollusc. However, such information could be of interest to evolutionary biologists.
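The primer check described above can be illustrated with a minimal Python sketch. It uses exact string matching as a simplified stand-in for the BLAST search used in this study, so it reports only perfect matches. The two primer sequences are those given in the text; the template sequences are toy constructs built for illustration, not real Wuhan or omicron genomes, and the single-base change in the toy variant is hypothetical.

```python
from typing import List

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_sites(template: str, primer: str, is_reverse: bool = False) -> List[int]:
    """Return 0-based start positions of exact primer matches on the plus strand.

    A reverse primer anneals to the plus strand, so its reverse complement is
    what appears in the plus-strand template.
    """
    target = reverse_complement(primer) if is_reverse else primer.upper()
    template = template.upper()
    hits = []
    start = template.find(target)
    while start != -1:
        hits.append(start)
        start = template.find(target, start + 1)
    return hits


# Primer sequences from the text (spaces removed).
FORWARD_F3 = "GACTAAGTCTCATCGGCGGG"
REVERSE = "CCCACATGAGGGACAAGGAC"

# Toy templates for illustration only; NOT real genome sequences.
toy_reference = "AAAA" + FORWARD_F3 + "TTTTTTTT" + reverse_complement(REVERSE) + "CCCC"
# Hypothetical variant with one base changed inside the forward primer site.
toy_variant = toy_reference.replace(FORWARD_F3, "GACTAAGTCTCATCGGCGTG")

for name, template in (("reference", toy_reference), ("variant", toy_variant)):
    print(name,
          "forward:", primer_sites(template, FORWARD_F3),
          "reverse:", primer_sites(template, REVERSE, is_reverse=True))
```

Exact matching is stricter than real primer annealing, which tolerates some mismatches, so a missing hit here only flags a primer for closer inspection.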
We then wanted to know how the omicron virus S protein could remain functional and stable despite six amino acid deletions, a three-amino-acid insertion and many mutations, while gaining the capacity for higher transmission than the alpha and delta variants. When we analyzed the amino acid composition of the spike proteins, we found no gross changes in omicron compared with the Wuhan and Alpha strains. The data are presented in Figure 13 (www.expasy.org/cgi-bin/portparam). Very minor changes were noticed and are boxed for Arginine, Asparagine, Glutamine, Phenylalanine and Serine (Figure 11). Acidic amino acids    Figure 14. There are some differences, as shown by the green box.
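The amino acid composition and aliphatic index compared above were obtained from the ExPASy server. As a rough illustration of what those two quantities are, the Python sketch below computes them directly from a protein sequence. It is not the server's code; the function names are hypothetical, and the aliphatic index uses the standard formula of Ikai (1980), AI = X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], with X in mole percent. The instability index is omitted because it requires a dipeptide weight table.

```python
from collections import Counter
from typing import Dict


def composition_percent(protein: str) -> Dict[str, float]:
    """Mole percent of each amino acid (one-letter codes) in the sequence."""
    protein = protein.upper()
    if not protein:
        raise ValueError("empty sequence")
    counts = Counter(protein)
    return {aa: 100.0 * n / len(protein) for aa, n in sorted(counts.items())}


def aliphatic_index(protein: str) -> float:
    """Aliphatic index after Ikai (1980), from mole percent of A, V, I and L."""
    comp = composition_percent(protein)
    return (comp.get("A", 0.0)
            + 2.9 * comp.get("V", 0.0)
            + 3.9 * (comp.get("I", 0.0) + comp.get("L", 0.0)))


# Toy input: the six-residue stretch PLRSYS quoted in the abstract.
# A real comparison would use the full-length spike protein sequences.
toy_peptide = "PLRSYS"
print(composition_percent(toy_peptide))
print(round(aliphatic_index(toy_peptide), 2))
```

Running both functions on the Wuhan and omicron spike sequences and comparing the outputs residue by residue would give the kind of comparison summarized in Figure 13.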

Discussion and Conclusion
The omicron variant of corona virus is spreading rapidly around the world, including the USA, UK, Australia and India. We analyzed the NCBI database and conclude that the omicron virus was spreading rapidly in the United States of America and that the spike protein of the omicron corona virus was very stable, although it had more deletions and mutations than the alpha and delta variants. Recently, a paper was published on the phylogenetics of African omicron corona virus using the GISAID database; the accession number of the first omicron virus genome (29684 bp) was EPI_ISL_6640916, with collection date 11-11-2021 and submission date 23-11-2021 [25]. However, we have no access to that database. The molecular mechanisms of the 26 new, uncharacterized spike protein mutations are not yet known, but these changes coincided with higher transmission, and the six combined amino acid deletions surely increased immune escape [26]. We analyzed an enormous amount of data from the NCBI SARS-CoV-2 database for November-December 2021 and detected about 141 omicron sequences (deposited up to 24-12-2021). Most of these sequences were incomplete, and we finally obtained about 21 authentic omicron spike protein sequences (Figure 6). On Christmas day Howard D, et al., deposited about 216 omicron sequences; most were incomplete because of ambiguity in the insertion sequences, but there were enough complete spike protein sequences to analyze. It was extremely difficult to obtain an authentic omicron spike protein sequence during the last week of November 2021, when omicron reports were mounting in the media. In January 2022, omicron sequences outnumbered Delta variant sequences. The question arises whether such hyper-variable differences in the omicron spike protein could be involved in the failure of corona vaccines made from the S gene. Our analysis suggested that some hydrophobic regions remained similar, so partial protection is surely possible.
One study indicated that only 20% and 24% of BNT162b2 vaccine recipients had detectable neutralizing antibody against the omicron variants HKU691 and HKU344-R346K, respectively, while none of the Coronavac recipients had a detectable neutralizing antibody titre against either omicron isolate. The omicron variant thus escapes neutralizing antibodies elicited by BNT162b2 or Coronavac [27]. Using an animal model, Starr TN, et al., showed that the antibodies S2H97 and S2E12 bound with high affinity across all sarbecovirus clades to a cryptic epitope and prophylactically protected hamsters from viral challenge [28]. A further study suggested that the E484K mutation evaded antibody neutralization elicited by infection or vaccination, and that this was further enhanced by the K417N and N501Y mutations [29]. A very similar conclusion was confirmed with antibodies raised against deletion mutants of the RBD of the spike protein [30]. Wang R, et al., [20,21] and others (2021) [33][34][35]. Thus, all natural mutations have allosteric effects that drive either interspecies transmission or escape from antibody neutralization [36]. The omicron virus has already been transmitted in 90 countries and will likely be a threat to humanity [37]. Omicron may be ten times more contagious than the original virus and twice as infectious as the delta variant. We have shown that omicron viruses have greatly affected many US states, including CA, NY, CO, MN and NJ. Further, omicron may be twice as likely as the delta variant to escape current vaccines [38]. Wang R, et al., identified fast-growing RBD mutations such as N439K, S477N, S477R and N501T that enhanced RBD-ACE2 binding. The L452R mutation in the spike reduces its interaction with antibodies against the Wuhan corona virus. Similarly, the mutations E484K and K417N found in South African variants and L452R and E484Q found in Indian variants could be responsible for such reduced antibody interaction [21].
A preprint by Miller NL, et al., disclosed that the omicron variant showed increased antibody escape due to mutations in class 3 and 4 antibody epitopes of the spike protein, as well as enhanced transmissibility via disruption of the ligand-receptor interface [26]. Finally, the molecular biology of the omicron virus has only just begun to define the functions of its genetic changes [25]. Although only mild symptoms of fever, cold and pneumonia have been reported, delmicron (Delta + Omicron) has created havoc in the world [39]. The drug remdesivir, which targets the RNA-dependent RNA polymerase, has some benefit in controlling corona virus spread, and some immune drugs have also been discovered [40]. We have pinpointed the differences in the omicron spike protein, yet the omicron spike protein appeared very stable in its interaction with the ACE-2 receptor. However, genomic mutations may affect RT-PCR (Figure 11), and thus new RT-PCR primers from conserved regions were presented. More omicron virus sequences will now become available in the NCBI SARS-CoV-2 Database.