This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Bioinformatics and Biotechnology, is properly cited. The complete bibliographic information, a link to the original publication on https://bioinform.jmir.org/, as well as this copyright and license information must be included.
Emergence of the new SARS-CoV-2 variant B.1.1.529 worried health policy makers worldwide due to a large number of mutations in its genomic sequence, especially in the spike protein region. The World Health Organization (WHO) designated this variant as a global variant of concern (VOC), which was named “Omicron.” Following Omicron’s emergence, a surge of new COVID-19 cases was reported globally, primarily in South Africa.
The aim of this study was to understand whether Omicron had an epidemiological advantage over existing variants.
We performed an in silico analysis of the complete genomic sequences of Omicron available on the Global Initiative on Sharing Avian Influenza Data (GISAID) database to analyze the functional impact of the mutations present in this variant on virus-host interactions in terms of viral transmissibility, virulence/lethality, and immune escape. In addition, we performed a correlation analysis of the relative proportion of the genomic sequences of specific SARS-CoV-2 variants (in the period from October 1 to November 29, 2021) with matched epidemiological data (new COVID-19 cases and deaths) from South Africa.
Compared with the current list of global VOCs/variants of interest (VOIs), as per the WHO, Omicron bears more sequence variation, specifically in the spike protein and host receptor-binding motif (RBM). Omicron showed the closest nucleotide and protein sequence homology with the Alpha variant for the complete sequence and the RBM. The mutations were found to be concentrated primarily in the spike region (n=28-48) of the virus. Further mutational analysis showed enrichment for mutations that decrease binding affinity to the angiotensin-converting enzyme 2 receptor and receptor-binding domain protein expression, and for mutations that increase the propensity for immune escape. An inverse correlation of Omicron with the Delta variant was noted (r=–0.99,
In silico analysis of viral genomic sequences suggests that the Omicron variant has greater immune-escape ability than existing VOCs/VOIs, including Delta, but lower virulence/lethality than other reported variants. Omicron's greater capacity for immune escape was a likely reason for the resurgence in COVID-19 cases and its rapid rise as the globally dominant strain. Being more infectious but less lethal than the existing variants, Omicron could have plausibly led to widespread unnoticed new, repeated, and vaccine breakthrough infections, raising the population-level immunity barrier against the emergence of new lethal variants. The Omicron variant could have thus paved the way for the end of the pandemic.
A new variant of SARS-CoV-2 (lineage B.1.1.529) was reported from Botswana, South Africa, and multiple other countries [
Owing to the heterogeneity of previous infections and vaccination coverage across the global population, there has been significant ambiguity in reports on the epidemiological properties of Omicron [
We performed an in silico analysis of the complete genomic sequences of the Omicron BA.1 variant available on the Global Initiative on Sharing Avian Influenza Data (GISAID) platform [
The SARS-CoV-2 genomic sequence for the Omicron variant and other global VOCs/variants of interest (VOIs) were downloaded from the EpiCoV database of GISAID [
Mutational analysis on the genomic sequences was performed, and the 3D structure of the spike protein with amino acid changes in Omicron was generated using the CoVsurver app provided by GISAID [
Furthermore, the functional impact of the mutations present at the RBD of the variants was assessed using an open analysis pipeline developed by Starr et al [
The epidemiological correlates of the Omicron variant were assessed based on the comparative analysis of the genomic sequences from GISAID [
An expected (
For the analysis of epidemiological data, statistical tests were performed to evaluate intergroup differences among SARS-CoV-2 variants in Microsoft Excel 2019 and the R statistical package version 4.2.2. The normality of the data was examined using the Shapiro-Wilk test. Pearson (
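As an illustration of the correlation step only, the following is a minimal Python sketch of the Pearson coefficient. It is not the authors' code (the analysis above was run in Microsoft Excel and R), the function name `pearson_r` is hypothetical, and the two series are made-up placeholder values rather than study data. The Shapiro-Wilk normality check is not reproduced here because it needs a statistics library.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two equal-length numeric series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder proportions per time window (NOT the study data)
omicron_share = [0.0, 0.1, 0.4, 0.8, 0.95]
delta_share = [0.9, 0.8, 0.5, 0.15, 0.03]

r = pearson_r(omicron_share, delta_share)
print(f"r = {r:.2f}")
```

A strongly negative coefficient for two such series simply reflects that one proportion falls as the other rises; it says nothing by itself about statistical significance.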
Approval from the institutional ethics committee was not required, as only publicly available, open-access databases were used for this study.
A total of 3604 genomic sequences of Omicron from 54 countries were uploaded on GISAID up to December 10, 2021 (see Figure S1 in
Compared to the current list of global VOCs/VOIs (as per the WHO), Omicron showed more sequence variation, specifically in the spike protein (nucleotides 21,563-25,384; amino acids 1-1273), including the receptor-binding motif (RBM; nucleotides 22,869-23,089 and amino acids 438-508), where the differences were most prominent (Table S1 in
Three-dimensional structure of the Omicron (BA.1) spike glycoprotein in interaction with human angiotensin-converting enzyme 2 (ACE2), showing key amino acid substitutions. (Data source: CoVsurver app from GISAID [
The analysis of Omicron’s genomic and protein sequence homology with the reference strain and current global VOCs/VOIs (as per the WHO) showed the highest similarity of Omicron with the Alpha variant for the complete sequence as well as for the RBM. However, for the spike protein, the highest similarity of the complete nucleotide and protein sequences was noted with the Beta and Delta variants, respectively (see Table S1 in
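For readers unfamiliar with how sequence similarity of this kind is commonly expressed, the sketch below computes percent identity over a pair of already aligned sequences. This is an illustrative assumption, not the tool used in the study: the function name `percent_identity` is hypothetical, the two short sequences are toy strings, and alignment programs differ in how they count gapped columns.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of aligned columns with identical residues.

    Columns where both sequences carry a gap are skipped; a gap opposite a
    residue counts as a mismatch (one of several possible conventions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not SARS-CoV-2 data)
reference = "ACGTACGTAC"
variant = "ACGTTCG-AC"

identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity")
```

The same function works for nucleotide or protein alignments, since it only compares characters column by column.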
Multiple clusters of closely spaced mutations were noted across the sequence, which were most densely located in the spike protein region, particularly in its S1 subunit, including the host RBM (
The mutational landscape in SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage: BA.1). The analysis of the mutations present at the RBD using a deep mutational scanning pipeline by Starr et al [
Predicted impact of receptor-binding motif variations in the SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage: BA.1) on interactions with the host.a
ACE2b binding site mutations | ACE2 binding (Δlog10 KD appc,d) | Protein expression (Δlog mean MFIe,f) | ACE2 contact with SARS-CoV-2 | RSAg bound | SARS-CoV-1 amino acid | RaTG13 amino acid | GD Pangolin-CoV amino acid |
G339D | 0.06 | 0.30 | false | 0.47 | G | G | G |
S371L | –0.14 | –0.61 | false | 0.46 | S | S | S |
S373P | –0.08 | –0.22 | false | 0.48 | F | S | S |
S375F | –0.55 | –1.81 | false | 0.48 | S | S | S |
K417N | –0.45 | 0.10 | true | 0.19 | V | K | R |
N440K | 0.07 | –0.12 | false | 0.68 | N | H | N |
G446S | –0.20 | –0.40 | true | 0.55 | T | G | G |
S477N | 0.06 | 0.06 | false | 0.76 | G | S | S |
T478K | 0.02 | 0.02 | false | 0.48 | K | K | T |
E484A | –0.07 | –0.23 | false | 0.50 | P | T | E |
Q493R | –0.09 | –0.06 | true | 0.10 | N | Y | Q |
G496S | –0.63 | 0.12 | true | 0.04 | G | G | G |
Q498R | –0.06 | 0.10 | true | 0.00 | Y | Y | H |
N501Y | 0.24 | –0.14 | true | 0.03 | T | D | N |
Y505H | –0.71 | 0.16 | true | 0.12 | Y | H | Y |
aBased on the study of Starr et al [
bACE2: angiotensin-converting enzyme 2.
cKD app: apparent dissociation constant.
dA positive Δlog10 KD app value relative to the unmutated SARS-CoV-2 receptor-binding domain (3.9 × 10^−11 M) indicates stronger binding.
eMFI: mean fluorescence intensity.
fPositive Δlog MFI values relative to the unmutated SARS-CoV-2 receptor-binding domain indicate increased expression.
gRSA: relative solvent accessibility.
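The direction of the predicted effects in the table above can be tallied with a short script. The sketch below is illustrative only: the values are transcribed from the table, the helper names are hypothetical, and the conversion to an apparent dissociation constant assumes that Δlog10 KD app is defined as log10(KD app, unmutated) minus log10(KD app, mutant), which is consistent with footnote d (positive means stronger binding) but is not stated explicitly in this text.

```python
# (mutation, delta_log10_KD_app, delta_log_MFI), transcribed from the table above
RBM_MUTATIONS = [
    ("G339D", 0.06, 0.30), ("S371L", -0.14, -0.61), ("S373P", -0.08, -0.22),
    ("S375F", -0.55, -1.81), ("K417N", -0.45, 0.10), ("N440K", 0.07, -0.12),
    ("G446S", -0.20, -0.40), ("S477N", 0.06, 0.06), ("T478K", 0.02, 0.02),
    ("E484A", -0.07, -0.23), ("Q493R", -0.09, -0.06), ("G496S", -0.63, 0.12),
    ("Q498R", -0.06, 0.10), ("N501Y", 0.24, -0.14), ("Y505H", -0.71, 0.16),
]

# Apparent KD of the unmutated receptor-binding domain, from footnote d (in M)
WT_KD_APP_M = 3.9e-11


def tally_by_sign(values):
    """Count how many values are negative (decrease), positive (increase), or zero."""
    counts = {"decrease": 0, "increase": 0, "neutral": 0}
    for v in values:
        if v < 0:
            counts["decrease"] += 1
        elif v > 0:
            counts["increase"] += 1
        else:
            counts["neutral"] += 1
    return counts


def mutant_kd_app(delta_log10_kd):
    """Apparent KD of a mutant, ASSUMING delta = log10(KD_wt) - log10(KD_mut)."""
    return WT_KD_APP_M * 10 ** (-delta_log10_kd)


binding_tally = tally_by_sign([b for _, b, _ in RBM_MUTATIONS])
expression_tally = tally_by_sign([e for _, _, e in RBM_MUTATIONS])

# Fold change in apparent KD for Y505H under the assumed sign convention
y505h_fold_weaker = mutant_kd_app(-0.71) / WT_KD_APP_M

print(binding_tally)
print(expression_tally)
print(f"Y505H: about {y505h_fold_weaker:.1f}-fold higher apparent KD")
```

Under these assumptions, two-thirds of the listed substitutions individually lower predicted ACE2 binding, while the effects on expression are more evenly split. The tally treats each substitution independently and does not model combined effects.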
Lineage comparison between Omicron and other global variants of concern/interest. Only mutations with >75% prevalence in at least one lineage are shown. (Data source: outbreak.info, based on the SARS-CoV-2 genomic sequences uploaded in GISAID until December 6, 2021).
Mutations in SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage BA.1) spike protein influencing immune escape via antibody recognition sites and/or antigenic drift.a
Mutation | Frequency (%)b | Remarks | Reference |
H69del | 20.35 | H69del+V70del have 2-fold higher infectivity compared to the wild type. H69del+V70del-containing viruses showed reduced neutralization sensitivity to mAbc COVA1-21, targeting an as-yet-undefined epitope outside the RBDd | [ |
V70del | 20.37 | H69del+V70del have 2-fold higher infectivity compared to the wild type. H69del+V70del-containing viruses showed reduced neutralization sensitivity to mAb COVA1-21, targeting an as-yet-undefined epitope outside the RBD | [ |
V143del | 0.12 | N/Ae | [ |
Y144del | 20.94 | Decreased sensitivity to convalescent sera | [ |
Y145del | 2.33 | Decreased sensitivity to convalescent sera | [ |
G339D | 0.01 | N/A | [ |
S371L | 0.00 | N/A | [ |
S373P | 0.01 | N/A | [ |
S375F | 0.00 | N/A | [ |
K417N | 0.83 | N/A | [ |
N440K | 0.17 | N/A | [ |
G446S | 0.01 | N/A | [ |
S477N | 1.31 | S477N was also resistant to neutralization by the human convalescent sera tested in this study, but not to vaccine-elicited sera | [ |
E484A | 0.02 | N/A | [ |
Q493R | 0.01 | N/A | [ |
G496S | 0.01 | N/A | [ |
Q498R | 0.00 | N/A | [ |
N501Y | 24.11 | Associated with increased transmissibility and increased affinity for human ACE2f receptor | [ |
H655Y | 2.25 | N/A | [ |
aBased on the genomic sequences of Omicron uploaded on GISAID [
bAmong all SARS-CoV-2 genomic sequences uploaded on GISAID [
cmAb: monoclonal antibody.
dRBD: receptor-binding domain.
eN/A: not applicable.
fACE2: angiotensin-converting enzyme 2.
Mutations in SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage BA.1) spike protein influencing receptor binding.a
Mutation | Frequency (%)b | Effect on virus-host interactions | Remarks | Reference |
G339D | 0.01 | Increased RBDc expression | N/Ad | [ |
S371L | 0.00 | Increased ACE2e binding | N/A | [ |
S373P | 0.01 | Increased RBD expression | N/A | [ |
K417N | 0.83 | Increased RBD expression | N/A | [ |
N440K | 0.17 | Increased ACE2 binding | N/A | [ |
S477N | 1.31 | Increased ACE2 binding/ increased RBD expression | S477N was also resistant to neutralization by the human convalescent sera tested in this study, but not to vaccine-elicited sera | [ |
T478K | 52.56 | Increased ACE2 binding/increased RBD expression | Decreased sensitivity to convalescent sera | [ |
Q493R | 0.01 | Host change | N/A | [ |
G496S | 0.01 | Increased RBD expression | N/A | [ |
N501Y | 24.11 | Increased ACE2 binding/host change | Associated with increased transmissibility and increased affinity for human ACE2 receptor | [ |
Y505H | 0.00 | Increased RBD expression | N/A | [ |
D614G | 98.51 | Increased infectivity | Lower cycle threshold values were observed in G614 infections, indicating a higher viral load | [ |
aBased on the genomic sequences of Omicron uploaded on GISAID [
bAmong all SARS-CoV-2 genomic sequences uploaded on GISAID [
cRBD: receptor-binding domain.
dN/A: not applicable.
eACE2: angiotensin-converting enzyme 2.
Mutations in SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage BA.1) spike protein influencing viral oligomerization interfaces.a
Mutations | Frequency (%)b | Remarks | Reference |
S371L | 0.00 | N/Ac | [ |
S373P | 0.01 | N/A | [ |
S375F | 0.00 | N/A | [ |
K417N | 0.83 | N/A | [ |
S477N | 1.31 | S477N was also resistant to neutralization by the human convalescent sera tested in this study, but not to vaccine-elicited sera | [ |
Q493R | 0.01 | N/A | [ |
N501Y | 24.11 | Associated with increased transmissibility and increased affinity for human ACE2d receptor | [ |
Y505H | 0.00 | N/A | [ |
N764K | 0.01 | N/A | [ |
D796Y | 0.08 | N/A | [ |
N856K | 0.00 | N/A | [ |
Q954H | 0.00 | N/A | [ |
N969K | 0.00 | N/A | [ |
L981F | 0.00 | N/A | [ |
aBased on the genomic sequences of Omicron uploaded on GISAID [
bAmong all SARS-CoV-2 genomic sequences uploaded on GISAID [
cN/A: not applicable.
dACE2: angiotensin-converting enzyme 2.
Mutations in SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage BA.1) spike protein influencing host adaptation and other mechanisms.a
Mutations | Frequency (%)b | Effect on virus-host interactions | Remarks | Reference |
A67V | 0.36 | Unknown | N/Ac | [ |
T95I | 21.32 | Unknown | N/A | [ |
G142D | 33.40 | Unknown | N/A | [ |
Q954H | 0.00 | Host adaptation (cell culture) | N/A | [ |
N211del | 0.02 | Unknown | N/A | [ |
L212I | 0.01 | Unknown | N/A | [ |
ins214EPE | 0.00 | Unknown | N/A | [ |
H655Y | 2.25 | Host adaptation (cats); spike glycoprotein fusion efficiency | N/A | [ |
N679K | 0.09 | Unknown | N/A | [ |
P681H | 22.73 | Unknown | P681H mutation at the S1/S2 site of the SARS-CoV-2 spike protein may increase its cleavability by furin-like proteases, but this does not translate into increased virus entry or membrane fusion | [ |
T547K | 0.00 | Unknown | N/A | [ |
N856K | 0.00 | Ligand binding | N/A | [ |
aBased on the genomic sequences of Omicron uploaded on GISAID [
bAmong all SARS-CoV-2 genomic sequences uploaded on GISAID [
cN/A: not applicable.
Mutations in SARS-CoV-2 variant B.1.1.529 (Omicron, sublineage BA.1) outside of the spike protein.a
Mutations | Frequency (%)b | Effect on virus-host interactions | Remarks | References |
Envelope (E) T9I | 0.09 | Viral oligomerization interfaces | N/Ac | [ |
M D3G | 0.08 | Unknown | N/A | [ |
M Q19E | 0.00 | Unknown | N/A | [ |
M A63T | 0.01 | Unknown | N/A | [ |
N P13L | 0.63 | Antigenic drift | P13L variant in B*27:05-restricted CD8+ nucleocapsid epitope, showing complete loss of responsiveness to the T-cell lines evaluated | [ |
N E31del | 0.00 | Unknown | N/A | [ |
N R32del | 0.00 | Unknown | N/A | [ |
N G204R | 26.20 | Unknown | N/A | [ |
NSP3 K38R | 0.01 | Unknown | N/A | [ |
NSP3 S1265del | 0.02 | Unknown | N/A | [ |
NSP3 L1266I | 0.02 | Unknown | N/A | [ |
NSP3 A1892T | 0.00 | Unknown | N/A | [ |
NSP4 T492I | 47.76 | Viral oligomerization interfaces | N/A | [ |
NSP5 P132H | 0.01 | Unknown | N/A | [ |
NSP6 L105del | 0.02 | Unknown | N/A | [ |
NSP6 S106del | 24.74 | Unknown | N/A | [ |
NSP6 G107del | 24.74 | Unknown | N/A | [ |
NSP6 I189V | 0.03 | Unknown | N/A | [ |
NSP12 P323L | 96.69 | Viral oligomerization interfaces | N/A | [ |
NSP14 I42V | 0.00 | Viral oligomerization interfaces | N/A | [ |
aBased on the genomic sequences of Omicron uploaded on GISAID [
bAmong all SARS-CoV-2 genomic sequences uploaded on GISAID [
cN/A: not applicable.
A total of 4224 SARS-CoV-2 genomic sequences (Delta, n=999; Omicron, n=2937; and others, n=288) were uploaded on GISAID from South Africa during the study period. For the complete duration of the study, Delta correlated negatively with the number of new COVID-19 cases (
Epidemiological correlates of Omicron and Delta variant genomic sequences reported on GISAID from South Africa for the period of October 1 to December 10, 2021. The proportions of the Delta and Omicron variants among the total SARS-CoV-2 genomic sequences were correlated with the new COVID-19 cases and deaths in the study period (3-day sum of each variable). A sharp change in dominance from Delta to Omicron was observed after the report of the first Omicron case (November 5, 2021). The rise of Omicron cases paralleled the increase in new COVID-19 cases. In comparison, the Delta variant showed a fall in the same period. Notably, there was no increase in the number of deaths after the emergence of Omicron. (Data sources: GISAID and Worldometer).
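The two quantities plotted in this figure, 3-day sums and variant proportions, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the windows are taken as consecutive and non-overlapping (the text does not say whether rolling windows were used), and the daily counts are made-up placeholders. Only the three sequence totals are taken from the Results above.

```python
def three_day_sums(daily):
    """Sum consecutive, non-overlapping 3-day windows; a trailing partial window is dropped."""
    usable = len(daily) - len(daily) % 3
    return [sum(daily[i:i + 3]) for i in range(0, usable, 3)]


def variant_share(variant_counts, total_counts):
    """Proportion of a variant per window; None where no sequences were reported."""
    return [v / t if t else None for v, t in zip(variant_counts, total_counts)]


# Sequence totals reported for South Africa over the study period
TOTALS = {"Delta": 999, "Omicron": 2937, "Others": 288}
grand_total = sum(TOTALS.values())
overall_share = {name: n / grand_total for name, n in TOTALS.items()}

# Hypothetical daily sequence counts (NOT the study data)
daily_omicron = [0, 1, 2, 5, 9, 14, 30, 41, 55]
daily_all = [20, 22, 19, 25, 24, 28, 40, 48, 60]

omicron_3d = three_day_sums(daily_omicron)
all_3d = three_day_sums(daily_all)
omicron_share_3d = variant_share(omicron_3d, all_3d)

print({name: round(share, 3) for name, share in overall_share.items()})
print([round(s, 3) for s in omicron_share_3d])
```

Summing over short windows smooths day-to-day reporting noise before the proportions are correlated with case and death counts binned the same way.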
Our analysis of the SARS-CoV-2 genomic sequences and epidemiological data from South Africa yields multiple observations regarding host-virus interactions, which may help to predict the further epidemiological potential of the Omicron variant. We found that compared to the current list of global VOCs/VOIs (as per the WHO), Omicron bears more sequence variation, specifically in the spike protein and RBM. Omicron showed the closest nucleotide and protein sequence homology with the Alpha variant. Further, the mutational analysis showed enrichment for mutations that decrease ACE2-binding affinity and RBD protein expression but increase the propensity for immune escape. The analysis of the viral genomic sequences and epidemiological data from South Africa showed an inverse correlation between Omicron and Delta variant infections, with Delta declining as Omicron rose. There was a steep rise in the number of new COVID-19 cases in parallel with the increase in the proportion of Omicron since the report of the first case; however, the incidence of deaths did not increase.
Our analyses showed that among the existing VOCs and VOIs, Omicron bears the highest homology of the complete sequence and RBM (nucleotide and protein sequences) with the Alpha variant (Table S1 in
As Omicron bears key mutations from multiple existing VOCs/VOIs, and its sequence homology with them is only approximate rather than indicative of direct descent, numerous recombination events between variants inside hosts are a more plausible explanation for its origin.
It will be pertinent to explore the evolutionary mechanisms involved in accumulating such a large number of mutations in Omicron. Speculations were raised that the long-term persistence of SARS-CoV-2 infection in an immunocompromised host could be a probable mechanism behind the origin of Omicron [
Our analysis shows that Omicron accumulated multiple closely spaced mutations at the RBM with ACE2 (
Many of the mutations, especially in the nonspike regions, are linked with viral oligomerization, synthesis, and packaging of the ribonucleic acid core (
Most spike mutations (18/32) in Omicron have occurred at the known antibody recognition sites (
A nonspike mutation in the nucleocapsid (N) protein (P13L) present in Omicron (
The overall evidence supports Omicron’s very high immune-escape ability [
The analysis of the SARS-CoV-2 genomic sequences from South Africa indicates that Omicron gained an advantage in terms of transmissibility over the Delta variant (
Further, death, which is considered a strong indicator of virulence/lethality, showed a negative correlation (r=–0.04,
An approximately 2.4 (2.0-2.7) times higher transmissibility was suggested with Omicron compared to the Delta variant in the South African population [
Notably, the presence of an immunological barrier in the population imparted by the recent COVID-19 wave mediated by the Delta variant could be a likely reason for this variant’s fall in new cases [
We analyzed a limited number of genomic sequences and epidemiological data from specific geographical regions affected by Omicron. Further, the relative frequency of specific lineage-characterizing mutations in the Omicron variant may have varied since the study’s inception. Both of these limitations may have an impact on the quality of the results.
In silico analysis of viral genomic sequences suggests that the Omicron variant has greater immune-escape ability than the existing VOCs/VOIs, including Delta, but lower virulence/lethality than other reported variants. Omicron's greater capacity for immune escape was a likely reason for the resurgence in COVID-19 cases and its rapid rise to global dominance. Being more infectious but less lethal than the existing variants, Omicron could have plausibly led to widespread unnoticed new, repeated, and vaccine breakthrough infections, raising the population-level immunity barrier against the emergence of new lethal variants. The Omicron variant could have thus paved the way for the end of the pandemic.
Global spread of the SARS-CoV-2 Omicron variant (Figure S1). Nucleotide and protein sequence homology of Omicron (BA.1) with wild-type SARS-CoV-2 and other global variants of concern/interest (Table S1).
ACE2: angiotensin-converting enzyme 2
GISAID: Global Initiative on Sharing Avian Influenza Data
NCBI: National Center for Biotechnology Information
PANGO: Phylogenetic Assignment of Named Global Outbreak
PCR: polymerase chain reaction
RBD: receptor-binding domain
RBM: receptor-binding motif
VOC: variant of concern
VOI: variant of interest
WHO: World Health Organization
Primary data used for this study are publicly available on the GISAID database [
AK, GK, and PD collected samples and analyzed data. AK wrote the first draft. AA and HNS performed the statistical analysis. MAF, SK, RKN, RKJ, CS, MK, PP, KS, KK, and SNP reviewed and edited the paper. All authors consented to submit the final draft.
None declared.