https://bioinform.jmir.org/issue/feed JMIR Bioinformatics and Biotechnology 2023-01-10T09:30:04-05:00 JMIR Publications editor@jmir.org Open Journal Systems Unless stated otherwise, all articles are open-access distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work ("first published in the Journal of Medical Internet Research...") is properly cited with original URL and bibliographic citation information. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included. Methods, web-based platforms, open data and open software tools for big data analytics, machine learning-based predictive models using genomic and imaging data, and information retrieval in biology and medicine. https://bioinform.jmir.org/2024/1/e52059/ Machine Learning Models for Prediction of Maternal Hemorrhage and Transfusion: Model Development Study 2024-02-05T09:00:04-05:00 Homa Khorrami Ahmadzia, Alexa C Dzienny, Mike Bopf, Jaclyn M Phillips, Jerome Jeffrey Federspiel, Richard Amdur, Madeline Murguia Rice, Laritza Rodriguez <strong>Background:</strong> Current postpartum hemorrhage (PPH) risk stratification is based on traditional statistical models or expert opinion. Machine learning could optimize PPH prediction by allowing for more complex modeling. <strong>Objective:</strong> We sought to improve PPH prediction and compare machine learning and traditional statistical methods. <strong>Methods:</strong> We developed models using the Consortium for Safe Labor data set (2002-2008) from 12 US hospitals. The primary outcome was a transfusion of blood products or PPH (estimated blood loss of ≥1000 mL). The secondary outcome was a transfusion of any blood product. 
Fifty antepartum and intrapartum characteristics and hospital characteristics were included. Logistic regression, support vector machines, multilayer perceptron, random forest, and gradient boosting (GB) were used to generate prediction models. The area under the receiver operating characteristic curve (ROC-AUC) and area under the precision/recall curve (PR-AUC) were used to compare performance. <strong>Results:</strong> Among 228,438 births, 5760 (3.1%) women had a postpartum hemorrhage, 5170 (2.8%) had a transfusion, and 10,344 (5.6%) met the criteria for the transfusion-PPH composite. Models predicting the transfusion-PPH composite using antepartum and intrapartum features had the best positive predictive values, with the GB machine learning model performing best overall (ROC-AUC=0.833, 95% CI 0.828-0.838; PR-AUC=0.210, 95% CI 0.201-0.220). The most predictive features in the GB model predicting the transfusion-PPH composite were the mode of delivery, oxytocin incremental dose for labor (mU/minute), intrapartum tocolytic use, presence of anesthesia nurse, and hospital type. <strong>Conclusions:</strong> Machine learning offers higher discriminability than logistic regression in predicting PPH. The Consortium for Safe Labor data set may not be optimal for analyzing risk due to strong subgroup effects, which decreases accuracy and limits generalizability. 2024-02-05T09:00:04-05:00 https://bioinform.jmir.org/2023/1/e45370/ User and Usability Testing of a Web-Based Genetics Education Tool for Parkinson Disease: Mixed Methods Study 2023-08-30T09:15:08-04:00 Noah Han, Rachel A Paul, Tanya Bardakjian, Daniel Kargilis, Angela R Bradbury, Alice Chen-Plotkin, Thomas F Tropea <strong>Background:</strong> Genetic testing is essential to identify research participants for clinical trials enrolling people with Parkinson disease (PD) carrying a variant in the glucocerebrosidase (<i>GBA</i>) or leucine-rich repeat kinase 2 (<i>LRRK2</i>) genes. 
The limited availability of professionals trained in neurogenetics or genetic counseling is a major barrier to increased testing. Telehealth solutions to increase access to genetics education can help address issues around counselor availability and offer options to patients and family members. <strong>Objective:</strong> As an alternative to pretest genetic counseling, we developed a web-based genetics education tool focused on <i>GBA</i> and <i>LRRK2</i> testing for PD called the Interactive Multimedia Approach to Genetic Counseling to Inform and Educate in Parkinson’s Disease (IMAGINE-PD) and conducted user testing and usability testing. The objective was to conduct user and usability testing to obtain stakeholder feedback to improve IMAGINE-PD. <strong>Methods:</strong> Genetic counselors and PD and neurogenetics subject matter experts developed content for IMAGINE-PD specifically focused on <i>GBA</i> and <i>LRRK2</i> genetic testing. Structured interviews were conducted with 11 movement disorder specialists and 13 patients with PD to evaluate the content of IMAGINE-PD in user testing and with 12 patients with PD to evaluate the usability of a high-fidelity prototype according to the US Department of Health and Human Services Research-Based Web Design & Usability Guidelines. Qualitative data analysis informed changes to create a final version of IMAGINE-PD. <strong>Results:</strong> Qualitative data were reviewed by 3 evaluators. Themes were identified from feedback data of movement disorder specialists and patients with PD in user testing in 3 areas: content such as the topics covered, function such as website navigation, and appearance such as pictures and colors. Similarly, qualitative analysis of usability testing feedback identified additional themes in these 3 areas. Key points of feedback were determined by consensus among reviewers considering the importance of the comment and the frequency of similar comments. 
Refinements were made to IMAGINE-PD based on consensus recommendations by evaluators within each theme at both user testing and usability testing phases to create a final version of IMAGINE-PD. <strong>Conclusions:</strong> User testing for content review and usability testing have informed refinements to IMAGINE-PD to develop this focused, genetics education tool for <i>GBA</i> and <i>LRRK2</i> testing. Comparison of this stakeholder-informed intervention to standard telegenetic counseling approaches is ongoing. <strong>Trial Registration:</strong> 2023-08-30T09:15:08-04:00 https://bioinform.jmir.org/2023/1/e44700/ Secure Comparisons of Single Nucleotide Polymorphisms Using Secure Multiparty Computation: Method Development 2023-07-18T09:30:30-04:00 Andrew Woods, Skyler T Kramer, Dong Xu, Wei Jiang <strong>Background:</strong> While genomic variations can provide valuable information for health care and ancestry, the privacy of individual genomic data must be protected. Thus, a secure environment is desirable for a human DNA database such that the total data are queryable but not directly accessible to involved parties (eg, data hosts and hospitals) and that the query results are learned only by the user or authorized party. <strong>Objective:</strong> In this study, we provide efficient and secure computations on panels of single nucleotide polymorphisms (SNPs) from genomic sequences as computed under the following set operations: union, intersection, set difference, and symmetric difference. <strong>Methods:</strong> Using these operations, we can compute similarity metrics, such as the Jaccard similarity, which could allow querying a DNA database to find the same person and genetic relatives securely. We analyzed various security paradigms and show metrics for the protocols under several security assumptions, such as semihonest, malicious with honest majority, and malicious with a malicious majority. 
<strong>Results:</strong> We show that our methods can be used practically on realistically sized data. Specifically, we can compute the Jaccard similarity of two genomes when considering sets of SNPs, each with 400,000 SNPs, in 2.16 seconds with the assumption of a malicious adversary in an honest majority and 0.36 seconds under a semihonest model. <strong>Conclusions:</strong> Our methods may help adopt trusted environments for hosting individual genomic data with end-to-end data security. 2023-07-18T09:30:30-04:00 https://bioinform.jmir.org/2023/1/e43906/ Mutations of SARS-CoV-2 Structural Proteins in the Alpha, Beta, Gamma, and Delta Variants: Bioinformatics Analysis 2023-07-14T10:00:06-04:00 Saima Rehman Khetran, Roma Mustafa <strong>Background:</strong> COVID-19 and Middle East Respiratory Syndrome are two pandemic respiratory diseases caused by coronavirus species. The novel disease COVID-19 caused by SARS-CoV-2 was first reported in Wuhan, Hubei Province, China, in December 2019, and became a pandemic within 2-3 months, affecting social and economic platforms worldwide. Despite the rapid development of vaccines, there have been obstacles to their distribution, including a lack of fundamental resources, poor immunization, and manual vaccine replication. Several variants of the original Wuhan strain have emerged in the last 3 years, which can pose a further challenge for control and vaccine development. <strong>Objective:</strong> The aim of this study was to comprehensively analyze mutations in SARS-CoV-2 variants of concern (VoCs) using a bioinformatics approach toward identifying novel mutations that may be helpful in developing new vaccines by targeting these sites. <strong>Methods:</strong> Reference sequences of the SARS-CoV-2 spike (YP_009724390) and nucleocapsid (YP_009724397) proteins were compared to retrieved sequences of isolates of four VoCs from 14 countries for mutational and evolutionary analyses. 
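As a rough plaintext illustration of the set operations and the Jaccard similarity named in the Methods above, the following minimal Python sketch may help. It is not the secure multiparty protocol described in the study: nothing here is encrypted or secret-shared, and the function name and SNP identifiers are hypothetical.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """Return |a ∩ b| / |a ∪ b| for two sets of SNP identifiers."""
    union = a | b
    if not union:
        # Convention chosen for this sketch: two empty panels are identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical SNP panels (identifiers are made up for illustration).
genome_a = {"rs123", "rs456", "rs789", "rs1011"}
genome_b = {"rs456", "rs789", "rs2022"}

print(sorted(genome_a | genome_b))  # union
print(sorted(genome_a & genome_b))  # intersection
print(sorted(genome_a - genome_b))  # set difference
print(sorted(genome_a ^ genome_b))  # symmetric difference
print(jaccard_similarity(genome_a, genome_b))  # 2 shared / 5 total = 0.4
```

In a secure setting, the same quantities would be computed without any party seeing the other's SNP set in the clear; the arithmetic being protected is the ratio shown here.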
Multiple sequence alignment was performed and phylogenetic trees were constructed by the neighbor-joining method with 1000 bootstrap replicates using MEGA (version 6). Mutations in amino acid sequences were analyzed using the MultAlin online tool (version 5.4.1). <strong>Results:</strong> Among the four VoCs, a total of 143 nonsynonymous mutations and 8 deletions were identified in the spike and nucleocapsid proteins. Multiple sequence alignment and amino acid substitution analysis revealed new mutations, including G72W, M2101I, L139F, 209-211 deletion, G212S, P199L, P67S, I292T, and substitutions with unknown amino acid replacement, reported in Egypt (MW533289), the United Kingdom (MT906649), and other regions. The variants B.1.1.7 (Alpha variant) and B.1.617.2 (Delta variant), characterized by higher transmissibility and lethality, harbored the amino acid substitutions D614G, R203K, and G204R with higher prevalence rates in most sequences. Phylogenetic analysis among the novel SARS-CoV-2 variant proteins and some previously reported β-coronavirus proteins indicated that either the evolutionary clade was weakly supported or not supported at all by the β-coronavirus species. <strong>Conclusions:</strong> This study could contribute toward gaining a better understanding of the basic nature of SARS-CoV-2 and its four major variants. The numerous novel mutations detected could also provide a better understanding of VoCs and help in identifying suitable mutations for vaccine targets. Moreover, these data offer evidence for new types of mutations in VoCs, which will provide insight into the epidemiology of SARS-CoV-2. 
2023-07-14T10:00:06-04:00 https://bioinform.jmir.org/2023/1/e48631/ Introducing JMIR Bioinformatics and Biotechnology: A Platform for Interdisciplinary Collaboration and Cutting-Edge Research 2023-06-12T09:00:55-04:00 Ece Dilber Gamsiz Uzun <i>JMIR Bioinformatics and Biotechnology</i> supports interdisciplinary research and welcomes contributions that push the boundaries of bioinformatics, genomics, artificial intelligence, and pathology informatics. 2023-06-12T09:00:55-04:00 https://bioinform.jmir.org/2023/1/e40673/ Genomic Insights Into the Evolution and Demographic History of the SARS-CoV-2 Omicron Variant: Population Genomics Approach 2023-06-12T09:00:11-04:00 Kritika M Garg, Vinita Lamba, Balaji Chattopadhyay <strong>Background:</strong> A thorough understanding of the patterns of genetic subdivision in a pathogen can provide crucial information that is necessary to prevent disease spread. For SARS-CoV-2, the availability of millions of genomes makes this task analytically challenging, and traditional methods for understanding genetic subdivision often fail. <strong>Objective:</strong> The aim of our study was to use population genomics methods to identify the subtle subdivisions and demographic history of the Omicron variant, in addition to those captured by the Pango lineage. <strong>Methods:</strong> We used a combination of an evolutionary network approach and multivariate statistical protocols to understand the subdivision and spread of the Omicron variant. We identified subdivisions within the BA.1 and BA.2 lineages and further identified the mutations associated with each cluster. We further characterized the overall genomic diversity of the Omicron variant and assessed the selection pressure for each of the genetic clusters identified. <strong>Results:</strong> We observed concordant results, using two different methods to understand genetic subdivision. 
The overall pattern of subdivision in the Omicron variant was in broad agreement with the Pango lineage definition. Further, 1 cluster of the BA.1 lineage and 3 clusters of the BA.2 lineage revealed statistically significant signatures of selection or demographic expansion (Tajima’s D<−2), suggesting the role of microevolutionary processes in the spread of the virus. <strong>Conclusions:</strong> We provide an easy framework for assessing the genetic structure and demographic history of SARS-CoV-2, which can be particularly useful for understanding the local history of the virus. We identified important mutations that are advantageous to some lineages of Omicron and aid in the transmission of the virus. This is crucial information for policy makers, as preventive measures can be designed to mitigate further spread based on a holistic understanding of the variability of the virus and the evolutionary processes aiding its spread. 2023-06-12T09:00:11-04:00 https://bioinform.jmir.org/2023/1/e43665/ Decision of the Optimal Rank of a Nonnegative Matrix Factorization Model for Gene Expression Data Sets Utilizing the Unit Invariant Knee Method: Development and Evaluation of the Elbow Method for Rank Selection 2023-06-06T09:00:09-04:00 Emine Guven <strong>Background:</strong> There is a great need to develop a computational approach to analyze and exploit the information contained in gene expression data. The recent utilization of nonnegative matrix factorization (NMF) in computational biology has demonstrated the capability to derive essential details from large amounts of data, in particular gene expression microarrays. A common problem in NMF is finding the proper rank (r), that is, the number of factors in the reduced representation, but no agreement exists on which technique is most appropriate to utilize for this purpose. Thus, various techniques have been suggested to select the optimal value of the factorization rank (r). 
<strong>Objective:</strong> In this work, a new metric for rank selection is proposed based on the elbow method, which was methodically compared against the cophenetic metric. <strong>Methods:</strong> To decide the optimum rank (r), this study focused on the unit invariant knee (UIK) method of the NMF on gene expression data sets. Since the UIK method requires an extremum distance estimator that is eventually employed for inflection and identification of a knee point, the proposed method finds the first inflection point of the curvature of the residual sum of squares of the proposed algorithms using the UIK method on gene expression data sets as a target matrix. <strong>Results:</strong> Computation was conducted for the UIK task using gene expression data of acute lymphoblastic leukemia and acute myeloid leukemia samples. Consequently, the distinct results of NMF were subjected to comparison on different algorithms. The proposed UIK method is easy to perform, fast, free of a priori rank value input, and does not require initial parameters that significantly influence the model’s functionality. <strong>Conclusions:</strong> This study demonstrates that the elbow method provides a credible prediction for both gene expression data and for precisely estimating simulated mutational processes data with known dimensions. The proposed UIK method is faster than conventional methods, including metrics utilizing the consensus matrix as a criterion for rank selection, while achieving significantly better computational efficiency without visual inspection of the curves. Finally, the suggested rank tuning method based on the elbow method for gene expression data is arguably theoretically superior to the cophenetic measure. 
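To make the idea of picking a rank from the elbow of a residual curve concrete, here is a minimal Python sketch. It uses a common chord-distance heuristic (the point farthest from the straight line joining the first and last points of the curve), which is a stand-in and not the UIK extremum distance estimator used in the study. The function name and the residual sum of squares values are hypothetical.

```python
def knee_by_chord_distance(xs, ys):
    """Return the x value whose point lies farthest from the chord
    joining the first and last points of the curve (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least 3 points and equal-length inputs")
    x0, y0 = xs[0], ys[0]
    dx, dy = xs[-1] - x0, ys[-1] - y0
    best_index, best_score = 0, -1.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        # Cross-product magnitude is proportional to the distance from the
        # chord; the argmax is unchanged if either axis is rescaled.
        score = abs(dy * (x - x0) - dx * (y - y0))
        if score > best_score:
            best_index, best_score = i, score
    return xs[best_index]


# Hypothetical residual sum of squares for candidate NMF ranks 1..8.
ranks = [1, 2, 3, 4, 5, 6, 7, 8]
rss = [100, 60, 30, 15, 12, 10, 9, 8]
print(knee_by_chord_distance(ranks, rss))  # 4
```

Because only the position of the maximum matters, the selected rank does not depend on the units of the residual axis, which is the property the knee-based approach relies on to avoid visual inspection.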
2023-06-06T09:00:09-04:00 https://bioinform.jmir.org/2023/1/e37306/ The Identification of Potential Drugs for Dengue Hemorrhagic Fever: Network-Based Drug Reprofiling Study 2023-05-09T09:45:03-04:00 Praveenkumar Kochuthakidiyel Suresh, Gnanasoundari Sekar, Kavya Mallady, Wan Suriana Wan Ab Rahman, Wan Nazatul Shima Shahidan, Gokulakannan Venkatesan <strong>Background:</strong> Dengue fever can progress to dengue hemorrhagic fever (DHF), a more serious and occasionally fatal form of the disease. Indicators of serious disease arise about the time the fever begins to reduce (typically 3 to 7 days following symptom onset). There are currently no effective antivirals available. Drug repurposing is an emerging drug discovery process for rapidly developing effective DHF therapies. Through network pharmacology modeling, several US Food and Drug Administration (FDA)-approved medications have already been researched for various viral outbreaks. <strong>Objective:</strong> We aimed to identify potentially repurposable drugs for DHF among existing FDA-approved drugs for viral attacks, symptoms of viral fevers, and DHF. <strong>Methods:</strong> Using target identification databases (GeneCards and DrugBank), we identified human–DHF virus interacting genes and drug targets against these genes. We determined hub genes and potential drugs with a network-based analysis. We performed functional enrichment and network analyses to identify pathways, protein-protein interactions, tissues where the gene expression was high, and disease-gene associations. <strong>Results:</strong> Analyzing virus-host interactions and therapeutic targets in the human genome network revealed 45 repurposable medicines. Hub network analysis of host-virus-drug associations suggested that aspirin, captopril, and rilonacept might efficiently treat DHF. Gene enrichment analysis supported these findings. 
According to a Mayo Clinic report, using aspirin in the treatment of dengue fever may increase the risk of bleeding complications, but several studies from around the world suggest that thrombosis is associated with DHF. The human interactome contains the genes <i>prostaglandin-endoperoxide synthase 2</i> (<i>PTGS2</i>), <i>angiotensin converting enzyme</i> (<i>ACE</i>), and <i>coagulation factor II, thrombin</i> (<i>F2</i>), which have been documented to have a role in the pathogenesis of disease progression in DHF, and our analysis of most of the drugs targeting these genes showed that the hub gene module (human-virus-drug) was highly enriched in tissues associated with the immune system (<i>P</i>=7.29 × 10<sup>–24</sup>) and human umbilical vein endothelial cells (<i>P</i>=1.83 × 10<sup>–20</sup>); this group of tissues acts as an anticoagulant barrier between the vessel walls and blood. KEGG analysis showed an association with genes linked to cancer (<i>P</i>=1.13 × 10<sup>–14</sup>) and the advanced glycation end products–receptor for advanced glycation end products signaling pathway in diabetic complications (<i>P</i>=3.52 × 10<sup>–14</sup>), which indicates that DHF patients with diabetes and cancer are at risk of higher pathogenicity. Thus, gene-targeting medications may play a significant part in limiting or worsening the condition of DHF patients. <strong>Conclusions:</strong> Aspirin is not usually prescribed for dengue fever because of bleeding complications, but it has been reported that using aspirin in lower doses is beneficial in the management of diseases with thrombosis. Drug repurposing is an emerging field in which clinical validation and dosage identification are required before the drug is prescribed. Further retrospective and collaborative international trials are essential for understanding the pathogenesis of this condition. 
2023-05-09T09:45:03-04:00 https://bioinform.jmir.org/2023/1/e42421/ The Differentially Expressed Genes Responsible for the Development of T Helper 9 Cells From T Helper 2 Cells in Various Disease States: Immuno-Interactomics Study 2023-02-23T09:15:03-05:00 Manoj Khokhar, Purvi Purohit, Ashita Gadwal, Sojit Tomo, Nitin Kumar Bajpai, Ravindra Shukla <strong>Background:</strong> T helper (Th) 9 cells are a novel subset of Th cells that develop independently from Th2 cells and are characterized by the secretion of interleukin (IL)-9. Studies have suggested the involvement of Th9 cells in variable diseases such as allergic and pulmonary diseases (eg, asthma, chronic obstructive airway disease, chronic rhinosinusitis, nasal polyps, and pulmonary hypoplasia), metabolic diseases (eg, acute leukemia, myelocytic leukemia, breast cancer, lung cancer, melanoma, pancreatic cancer), neuropsychiatric disorders (eg, Alzheimer disease), autoimmune diseases (eg, Graves disease, Crohn disease, colitis, psoriasis, systemic lupus erythematosus, systemic scleroderma, rheumatoid arthritis, multiple sclerosis, inflammatory bowel disease, atopic dermatitis, eczema), and infectious diseases (eg, tuberculosis, hepatitis). However, there is a dearth of information on its involvement in other metabolic, neuropsychiatric, and infectious diseases. <strong>Objective:</strong> This study aims to identify significant differentially altered genes in the conversion of Th2 to Th9 cells, and their regulating microRNAs (miRs) from publicly available Gene Expression Omnibus data sets of the mouse model using in silico analysis to unravel various pathogenic pathways involved in disease processes. 
<strong>Methods:</strong> Using differentially expressed genes (DEGs) identified from 2 publicly available data sets (GSE99166 and GSE123501) we performed functional enrichment and network analyses to identify pathways, protein-protein interactions, miR-messenger RNA associations, and disease-gene associations related to significant differentially altered genes implicated in the conversion of Th2 to Th9 cells. <strong>Results:</strong> We extracted 260 common downregulated, 236 common upregulated, and 634 common DEGs from the expression profiles of data sets GSE99166 and GSE123501. Codifferentially expressed ILs, cytokines, receptors, and transcription factors (TFs) were enriched in 7 crucial Kyoto Encyclopedia of Genes and Genomes pathways and Gene Ontology. We constructed the protein-protein interaction network and predicted the top regulatory miRs involved in the Th2 to Th9 differentiation pathways. We also identified various metabolic, allergic and pulmonary, neuropsychiatric, autoimmune, and infectious diseases as well as carcinomas where the differentiation of Th2 to Th9 may play a crucial role. <strong>Conclusions:</strong> This study identified hitherto unexplored possible associations between Th9 and disease states. 
Some important ILs, including <i>CCL1</i> (chemokine [C-C motif] ligand 1), <i>CCL20</i> (chemokine [C-C motif] ligand 20), <i>IL-13</i>, <i>IL-4</i>, <i>IL-12A</i>, and <i>IL-9</i>; receptors, including <i>IL-12RB1</i>, <i>IL-4RA</i> (interleukin 9 receptor alpha), <i>CD53</i> (cluster of differentiation 53), <i>CD6</i> (cluster of differentiation 6), <i>CD5</i> (cluster of differentiation 5), <i>CD83</i> (cluster of differentiation 83), <i>CD197</i> (cluster of differentiation 197), <i>IL-1RL1</i> (interleukin 1 receptor-like 1), <i>CD101</i> (cluster of differentiation 101), <i>CD96</i> (cluster of differentiation 96), <i>CD72</i> (cluster of differentiation 72), <i>CD7</i> (cluster of differentiation 7), <i>CD152</i> (cytotoxic T lymphocyte–associated protein 4), <i>CD38</i> (cluster of differentiation 38), <i>CX3CR1</i> (chemokine [C-X3-C motif] receptor 1), <i>CTLA2A</i> (cytotoxic T lymphocyte–associated protein 2 alpha), <i>CTLA28</i>, and <i>CD196</i> (cluster of differentiation 196); and TFs, including <i>FOXP3</i> (forkhead box P3), <i>IRF8</i> (interferon regulatory factor 8), <i>FOXP2</i> (forkhead box P2), <i>RORA</i> (RAR-related orphan receptor alpha), <i>AHR</i> (aryl-hydrocarbon receptor), <i>MAF</i> (avian musculoaponeurotic fibrosarcoma oncogene homolog), <i>SMAD6</i> (SMAD family member 6), <i>JUN</i> (Jun proto-oncogene), <i>JAK2</i> (Janus kinase 2), <i>EP300</i> (E1A binding protein p300), <i>ATF6</i> (activating transcription factor 6), <i>BTAF1</i> (B-TFIID TATA-box binding protein associated factor 1), <i>BAFT</i> (basic leucine zipper transcription factor), <i>NOTCH1</i> (neurogenic locus notch homolog protein 1), <i>GATA3</i> (GATA binding protein 3), <i>SATB1</i> (special AT-rich sequence binding protein 1), <i>BMP7</i> (bone morphogenetic protein 7), and <i>PPARG</i> (peroxisome proliferator–activated receptor gamma), were able to identify significant differentially altered genes in the conversion of Th2 to Th9 cells. 
We identified some common miRs that could target the DEGs. The scarcity of studies on the role of Th9 in metabolic diseases highlights the lacunae in this field. Our study provides the rationale for exploring the role of Th9 in various metabolic disorders such as diabetes mellitus, diabetic nephropathy, hypertensive disease, ischemic stroke, steatohepatitis, liver fibrosis, obesity, adenocarcinoma, glioblastoma and glioma, malignant neoplasm of stomach, melanoma, neuroblastoma, osteosarcoma, pancreatic carcinoma, prostate carcinoma, and stomach carcinoma. 2023-02-23T09:15:03-05:00 https://bioinform.jmir.org/2023/1/e42700/ SARS-CoV-2 Omicron Variant Genomic Sequences and Their Epidemiological Correlates Regarding the End of the Pandemic: In Silico Analysis 2023-01-10T09:30:04-05:00 Ashutosh Kumar, Adil Asghar, Himanshu N Singh, Muneeb A Faiq, Sujeet Kumar, Ravi K Narayan, Gopichand Kumar, Prakhar Dwivedi, Chetan Sahni, Rakesh K Jha, Maheswari Kulandhasamy, Pranav Prasoon, Kishore Sesham, Kamla Kant, Sada N Pandey <strong>Background:</strong> Emergence of the new SARS-CoV-2 variant B.1.1.529 worried health policy makers worldwide due to a large number of mutations in its genomic sequence, especially in the spike protein region. The World Health Organization (WHO) designated this variant as a global variant of concern (VOC), which was named “Omicron.” Following Omicron’s emergence, a surge of new COVID-19 cases was reported globally, primarily in South Africa. <strong>Objective:</strong> The aim of this study was to understand whether Omicron had an epidemiological advantage over existing variants. 
<strong>Methods:</strong> We performed an in silico analysis of the complete genomic sequences of Omicron available on the Global Initiative on Sharing Avian Influenza Data (GISAID) database to analyze the functional impact of the mutations present in this variant on virus-host interactions in terms of viral transmissibility, virulence/lethality, and immune escape. 
In addition, we performed a correlation analysis of the relative proportion of the genomic sequences of specific SARS-CoV-2 variants (in the period from October 1 to November 29, 2021) with matched epidemiological data (new COVID-19 cases and deaths) from South Africa. <strong>Results:</strong> Compared with the current list of global VOCs/variants of interest (VOIs), as per the WHO, Omicron bears more sequence variation, specifically in the spike protein and host receptor-binding motif (RBM). Omicron showed the closest nucleotide and protein sequence homology with the Alpha variant for the complete sequence and the RBM. The mutations were found to be primarily condensed in the spike region (n=28-48) of the virus. Further mutational analysis showed enrichment for the mutations decreasing binding affinity to angiotensin-converting enzyme 2 receptor and receptor-binding domain protein expression, and for increasing the propensity of immune escape. An inverse correlation of Omicron with the Delta variant was noted (r=–0.99, <i>P</i><.001; 95% CI –0.99 to –0.97) in the sequences reported from South Africa postemergence of the new variant, subsequently showing a decrease. There was a steep rise in new COVID-19 cases in parallel with the increase in the proportion of Omicron isolates since the report of the first case (74%-100%). By contrast, the incidence of new deaths did not increase (r=–0.04, <i>P</i>>.05; 95% CI –0.52 to 0.58). <strong>Conclusions:</strong> In silico analysis of viral genomic sequences suggests that the Omicron variant has more remarkable immune-escape ability than existing VOCs/VOIs, including Delta, but reduced virulence/lethality than other reported variants. The higher power for immune escape for Omicron was a likely reason for the resurgence in COVID-19 cases and its rapid rise as the globally dominant strain. 
Being more infectious but less lethal than the existing variants, Omicron could have plausibly led to widespread unnoticed new, repeated, and vaccine breakthrough infections, raising the population-level immunity barrier against the emergence of new lethal variants. The Omicron variant could have thus paved the way for the end of the pandemic. 2023-01-10T09:30:04-05:00