<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.0 20040830//EN" "http://dtd.nlm.nih.gov/publishing/2.0/journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.0" xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">JBB</journal-id>
      <journal-id journal-id-type="nlm-ta">JMIR Bioinform Biotech</journal-id>
      <journal-title>JMIR Bioinformatics and Biotechnology</journal-title>
      <issn pub-type="epub">2563-3570</issn>
      <publisher>
        <publisher-name>JMIR Publications</publisher-name>
        <publisher-loc>Toronto, Canada</publisher-loc>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="publisher-id">v4i1e43665</article-id>
      <article-id pub-id-type="pmid"/>
      <article-id pub-id-type="doi">10.2196/43665</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Original Paper</subject>
        </subj-group>
        <subj-group subj-group-type="article-type">
          <subject>Original Paper</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Decision of the Optimal Rank of a Nonnegative Matrix Factorization Model for Gene Expression Data Sets Utilizing the Unit Invariant Knee Method: Development and Evaluation of the Elbow Method for Rank Selection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="editor">
          <name>
            <surname>Uzun</surname>
            <given-names>Ece</given-names>
          </name>
        </contrib>
      </contrib-group>
      <contrib-group>
        <contrib contrib-type="reviewer">
          <name>
            <surname>Özkan</surname>
            <given-names>Selin</given-names>
          </name>
        </contrib>
        <contrib contrib-type="reviewer">
          <name>
            <surname>Banf</surname>
            <given-names>Michael</given-names>
          </name>
        </contrib>
        <contrib contrib-type="reviewer">
          <name>
            <surname>Staffini</surname>
            <given-names>Alessio</given-names>
          </name>
        </contrib>
      </contrib-group>
      <contrib-group>
        <contrib id="contrib1" contrib-type="author" corresp="yes">
          <name name-style="western">
            <surname>Guven</surname>
            <given-names>Emine</given-names>
          </name>
          <degrees>MSc, PhD</degrees>
          <xref rid="aff1" ref-type="aff">1</xref>
          <address>
            <institution>Department of Biomedical Engineering</institution>
            <institution>Düzce University</institution>
            <addr-line>College of Engineering, Main Campus, M-2 Building, #202</addr-line>
            <addr-line>Düzce, 81620</addr-line>
            <country>Turkey</country>
            <phone>90 5388733459</phone>
            <email>emine.guven33@gmail.com</email>
          </address>
          <ext-link ext-link-type="orcid">https://orcid.org/0000-0001-9324-0879</ext-link>
        </contrib>
      </contrib-group>
      <aff id="aff1">
        <label>1</label>
        <institution>Department of Biomedical Engineering</institution>
        <institution>Düzce University</institution>
        <addr-line>Düzce</addr-line>
        <country>Turkey</country>
      </aff>
      <author-notes>
        <corresp>Corresponding Author: Emine Guven <email>emine.guven33@gmail.com</email></corresp>
      </author-notes>
      <pub-date pub-type="collection">
        <year>2023</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>6</day>
        <month>6</month>
        <year>2023</year>
      </pub-date>
      <volume>4</volume>
      <elocation-id>e43665</elocation-id>
      <history>
        <date date-type="received">
          <day>19</day>
          <month>10</month>
          <year>2022</year>
        </date>
        <date date-type="rev-request">
          <day>19</day>
          <month>12</month>
          <year>2022</year>
        </date>
        <date date-type="rev-recd">
          <day>5</day>
          <month>2</month>
          <year>2023</year>
        </date>
        <date date-type="accepted">
          <day>28</day>
          <month>4</month>
          <year>2023</year>
        </date>
      </history>
      <copyright-statement>©Emine Guven. Originally published in JMIR Bioinformatics and Biotechnology (https://bioinform.jmir.org), 06.06.2023.</copyright-statement>
      <copyright-year>2023</copyright-year>
      <license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/4.0/">
        <p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Bioinformatics and Biotechnology, is properly cited. The complete bibliographic information, a link to the original publication on https://bioinform.jmir.org/, as well as this copyright and license information must be included.</p>
      </license>
      <self-uri xlink:href="https://bioinform.jmir.org/2023/1/e43665" xlink:type="simple"/>
      <abstract>
        <sec sec-type="background">
          <title>Background</title>
          <p>There is a great need to develop a computational approach to analyze and exploit the information contained in gene expression data. The recent utilization of nonnegative matrix factorization (NMF) in computational biology has demonstrated the capability to derive essential details from large amounts of data, in particular gene expression microarrays. A common problem in NMF is finding the proper rank (r) of the reduced representation, but no agreement exists on which technique is most appropriate for this purpose. Thus, various techniques have been suggested to select the optimal factorization rank (r).</p>
        </sec>
        <sec sec-type="objective">
          <title>Objective</title>
          <p>In this work, a new metric for rank selection is proposed based on the elbow method, which was methodically compared against the cophenetic metric.</p>
        </sec>
        <sec sec-type="methods">
          <title>Methods</title>
          <p>To decide the optimal rank (r), this study focused on the unit invariant knee (UIK) method applied to NMF of gene expression data sets. Since the UIK method requires an extremum distance estimator that is eventually employed for inflection and identification of a knee point, the proposed method finds the first inflection point of the curvature of the residual sum of squares of the proposed algorithms using the UIK method on gene expression data sets as a target matrix.</p>
        </sec>
        <sec sec-type="results">
          <title>Results</title>
          <p>Computation was conducted for the UIK task using gene expression data of acute lymphoblastic leukemia and acute myeloid leukemia samples. The results of NMF were then compared across different algorithms. The proposed UIK method is easy to perform, fast, free of a priori rank value input, and does not require initial parameters that significantly influence the model’s functionality.</p>
        </sec>
        <sec sec-type="conclusions">
          <title>Conclusions</title>
          <p>This study demonstrates that the elbow method provides a credible prediction both for gene expression data and for precisely estimating simulated mutational processes data with known dimensions. The proposed UIK method is faster than conventional methods, including metrics utilizing the consensus matrix as a criterion for rank selection, while achieving significantly better computational efficiency without visual inspection of the curves. Finally, the suggested rank tuning method based on the elbow method for gene expression data is arguably theoretically superior to the cophenetic measure.</p>
        </sec>
      </abstract>
      <kwd-group>
        <kwd>gene expression data</kwd>
        <kwd>nonnegative matrix factorization</kwd>
        <kwd>rank factorization</kwd>
        <kwd>optimal rank</kwd>
        <kwd>unit invariant knee method</kwd>
        <kwd>elbow method</kwd>
        <kwd>consensus matrix</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec sec-type="introduction">
      <title>Introduction</title>
      <p>Nonnegative matrix factorization (NMF) algorithms have been advanced for the application fields of bioinformatics, artificial intelligence [<xref ref-type="bibr" rid="ref1">1</xref>], signal processing systems [<xref ref-type="bibr" rid="ref2">2</xref>], and music signal processing systems [<xref ref-type="bibr" rid="ref3">3</xref>]. Lee and Seung [<xref ref-type="bibr" rid="ref4">4</xref>] formulated a parts-based illustrated algorithm to solve the problem of the NMF puzzle. Furthermore, various algorithms have been established to develop a solution to the NMF problem depending on the field [<xref ref-type="bibr" rid="ref5">5</xref>-<xref ref-type="bibr" rid="ref8">8</xref>].</p>
      <p>Several approaches have been developed for clustering samples, mutational processes, and gene expression levels that share similar expression motifs [<xref ref-type="bibr" rid="ref4">4</xref>,<xref ref-type="bibr" rid="ref9">9</xref>-<xref ref-type="bibr" rid="ref11">11</xref>]. However, cancer analysis and classification based on genomic data offers a more powerful method that approaches the sensitivity of advanced computational techniques for tackling problems such as modeling multiple, heterogeneous populations and reducing the number of variables (genes or mutations). Consequently, selecting a small number of discriminatory features from thousands of features helps in building successful classification systems [<xref ref-type="bibr" rid="ref12">12</xref>-<xref ref-type="bibr" rid="ref14">14</xref>]. Although neural networks are prone to overfitting if the examined structure is noisy, as in the case of tumor expression profiling [<xref ref-type="bibr" rid="ref15">15</xref>], Pal et al [<xref ref-type="bibr" rid="ref12">12</xref>] suggested a variation of a multilayer perceptron network for biomarker identification. Nevertheless, these approaches are severely limited in capturing the full structure inherent in the data. Moreover, they generally highlight the dominant patterns in a data set and cannot detect different signatures with a universal standard. Thus, an unbiased technique is needed that determines the number of clusters without visual inspection and can be implemented in a computational program.</p>
      <p>A common problem in conventional multivariate data analysis methods such as factor analysis (FA), principal component analysis (PCA), cluster analysis, and NMF is to detect the proper number (r) of factors, principal components, clusters, and ranks, respectively. Item redundancy is common in long questionnaires such as those used in a pilot questionnaire study, arguing for the utilization of FA and the variance inflation factor on a lifestyle questionnaire. Staffini et al [<xref ref-type="bibr" rid="ref16">16</xref>] concluded that both methods are acceptable for item reduction; however, both of these techniques might produce distinct features as an outcome.</p>
      <p>The aim of this study was to utilize the unit invariant knee (UIK) method for obtaining related biological and molecular correlations in gene expression data. The UIK method is used to capture structures essential to the data and to offer biological understanding by systematizing both the features and samples. The approach is based on a “knee point” and its unit invariant estimation using the extremum distance estimator method introduced by Christopoulos [<xref ref-type="bibr" rid="ref17">17</xref>]. In this regard, NMF decomposes the gene expression data set into fragments of evocative features such as metagenes and mutational signatures. When applying this method to conventional factorization techniques such as PCA or FA with World Values Survey Wave 5 United States data [<xref ref-type="bibr" rid="ref18">18</xref>], certain factors (elements) clearly explained the questionnaire responses (1=“Not at all like me”...6=“Very much like me”) [<xref ref-type="bibr" rid="ref19">19</xref>,<xref ref-type="bibr" rid="ref20">20</xref>].</p>
      <p>Therefore, given an NMF method and a data set (a target matrix), tens of thousands of genes can be analyzed in terms of a small number of signatures. Gene expression patterns of samples can then be studied to determine the expression motifs of the signatures. The signatures define an interesting decomposition of genes, analogous to the motifs of Hutchins et al [<xref ref-type="bibr" rid="ref10">10</xref>], in which the first value is selected where the residual sum of squares (RSS) curve presents an inflection point. The machinery of the UIK method can then be used to detect this inflection, and the expression motifs define a robust clustering of samples.</p>
      <p>In this study, the elbow technique was considered for model selection utilizing alternative parsing and its robustness was evaluated [<xref ref-type="bibr" rid="ref19">19</xref>,<xref ref-type="bibr" rid="ref21">21</xref>]. The idea behind this approach is to develop an unbiased computable optimization point of the RSS curve that can then be used to select tuning parameters. The UIK method has proven to be useful for a variety of models, from classifying recordings of echolocation to a decision of predictive models for soil carbon at the field scale [<xref ref-type="bibr" rid="ref22">22</xref>,<xref ref-type="bibr" rid="ref23">23</xref>], but has not been used for NMF on genetic data to date. The advantage of the UIK method relative to the cophenetic measure method [<xref ref-type="bibr" rid="ref24">24</xref>,<xref ref-type="bibr" rid="ref25">25</xref>], as another NMF rank estimation measure, is that UIK yields a closed-form formula that can provide greater insight and computational speed in simulations, which can then be applied for selecting the rank of NMF for real high-dimensional hyperspectral data.</p>
      <p>Finally, this study applies the combination of NMF and the UIK method (designated the uikNMF method) to simplify cancer classification tasks by clustering tumor samples and mutational signature data sets. This enables illustrating numerous robust decompositions of genetic and mutational signatures from experimental and simulated data sets.</p>
    </sec>
    <sec sec-type="methods">
      <title>Methods</title>
      <sec>
        <title>NMF Approach</title>
        <p>Given a target matrix V<sup>m×n</sup>, NMF identifies nonnegative matrices N<sup>m×r</sup> and M<sup>r×n</sup> (ie, with all entries ≥0) that yield the matrix decomposition:</p>
        <disp-formula>V ≈ NM <bold>(1)</bold></disp-formula>
        <p>In practice, N is typically viewed as the basis or metagenes matrix, whereas M is the mixture coefficient matrix, which holds the metagene expression profiles. The factorization rank is chosen such that r≤min(m,n). The goal behind this selection is to summarize and split the information contained in V into r factors (ie, the columns of N). The factors can be obtained, for instance, by requiring N and M to be nonnegative and minimizing the reconstruction error:</p>
        <disp-formula>||V – NM||<sub>F</sub>, subject to N ≥ 0, M ≥ 0 <bold>(2)</bold></disp-formula>
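        <p>As a minimal sketch of how the reconstruction error in Equation 2 can be minimized, the following Python listing applies the multiplicative update rules commonly attributed to Lee and Seung to a small matrix. It is illustrative only and is not the implementation of the R “NMF” package used in this study; the function names (nmf, frobenius_error), the iteration count, the random initialization, and the small constant eps that guards against division by zero are assumptions introduced here.</p>
<preformat>
```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, n, m):
    """Return ||V - NM||_F, the reconstruction error of Equation 2."""
    nm = matmul(n, m)
    return sum((v[i][j] - nm[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5


def nmf(v, r, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative V (m x n) into nonnegative N (m x r) and M (r x n)."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    n = [[rng.random() + eps for _ in range(r)] for _ in range(rows)]
    m = [[rng.random() + eps for _ in range(cols)] for _ in range(r)]
    for _ in range(iters):
        # M update: M * (N^T V) / (N^T N M), element-wise.
        nt = transpose(n)
        num, den = matmul(nt, v), matmul(matmul(nt, n), m)
        m = [[m[k][j] * num[k][j] / (den[k][j] + eps) for j in range(cols)]
             for k in range(r)]
        # N update: N * (V M^T) / (N M M^T), element-wise.
        mt = transpose(m)
        num, den = matmul(v, mt), matmul(n, matmul(m, mt))
        n = [[n[i][k] * num[i][k] / (den[i][k] + eps) for k in range(r)]
             for i in range(rows)]
    return n, m


# Hypothetical 4 x 3 target matrix of rank 2 (row 4 = 2 * row 1 + row 3).
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [3.0, 5.0, 7.0]]
n, m = nmf(v, r=2)
print(frobenius_error(v, n, m))
```
</preformat>
        <p>Because each update multiplies nonnegative quantities, N and M stay nonnegative throughout, which is why no explicit projection step is needed in this sketch.</p>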
        <p>In this case, we consider a gene expression data set characterized by the expression levels of <italic>m</italic> genes (probes) by <italic>n</italic> samples of unique tissues, cells, cell lines, time points, or experiments. The number <italic>m</italic> of genes usually ranges from hundreds to thousands, and the number <italic>n</italic> of experiments or patients is typically about 100 for gene expression research. The gene expression data set is represented by an expression matrix <italic>V</italic> of size <italic>m</italic>×<italic>n</italic>, whose rows consist of the expression levels of <italic>m</italic> genes and columns consist of <italic>n</italic> samples.</p>
        <p>The aim is to identify a small number of metagenes, each defined as a positive linear combination of the <italic>m</italic> genes in the target matrix <italic>V</italic>. The gene expression motif of each sample is then described as a positive linear combination of these metagenes. To obtain a dimensional reduction of the microarray data and evaluate the distinctions among samples, NMF was implemented utilizing R statistical environment version 3.6.3 with the “NMF” package [<xref ref-type="bibr" rid="ref26">26</xref>].</p>
      </sec>
      <sec>
        <title>Cophenetic Measure</title>
        <p>In the framework of classification analyses, Brunet et al [<xref ref-type="bibr" rid="ref9">9</xref>] suggested utilizing the <italic>cophenetic correlation coefficient</italic> as a measure of cluster stability. Furthermore, the cophenetic measure was proposed as one of the metrics utilizing the consensus matrix as a criterion for rank selection [<xref ref-type="bibr" rid="ref25">25</xref>]. Treating the entries of the consensus matrix as a similarity measure, the cophenetic correlation coefficient is defined as the correlation between the sample distances induced by the consensus matrix and the cophenetic distances obtained by its hierarchical clustering.</p>
      </sec>
      <sec>
        <title>Proposed UIK Method</title>
        <p>Hutchins et al [<xref ref-type="bibr" rid="ref10">10</xref>] demonstrated how the variation in the RSS of the estimated matrix resulting from NMF analysis provides a robust estimate of the proper number of elements (r). They employed Lee and Seung’s [<xref ref-type="bibr" rid="ref4">4</xref>] algorithm and selected r as the value at which the plot of the RSS presents its first inflection point. In practice, the factorization rank r can be estimated with a considerably smaller number of iterations, typically 20-30 runs for each value of r. In contrast, an optimal NMF interpretation requires a couple of hundred random restarts, which is computationally costly.</p>
        <p>For instance, a fundamental step for any unsupervised algorithm is to determine the optimal number of clusters (k) into which the data may be clustered [<xref ref-type="bibr" rid="ref27">27</xref>]. The <italic>elbow method</italic> is one of the most popular methods to determine the optimal number of such components [<xref ref-type="bibr" rid="ref17">17</xref>,<xref ref-type="bibr" rid="ref18">18</xref>]. The UIK methodology for identifying the knee (elbow) point of a curve has proven advantageous in a wide variety of studies for locating the optimal number of “components” on a scree plot of k-means, PCA, FA, and NMF [<xref ref-type="bibr" rid="ref27">27</xref>-<xref ref-type="bibr" rid="ref32">32</xref>].</p>
        <p>The method is typically invoked as uik(x,y), where x is the vector of ranks, components, clusters, or factors and y is the corresponding vector of RSS values [<xref ref-type="bibr" rid="ref10">10</xref>,<xref ref-type="bibr" rid="ref22">22</xref>,<xref ref-type="bibr" rid="ref33">33</xref>]. In regression analysis, the term mean squared error (MSE) is sometimes used to refer to the unbiased estimate of error variance (ie, the RSS divided by the degrees of freedom). Ulfarsson and Solo [<xref ref-type="bibr" rid="ref34">34</xref>] proposed a metric for rank selection in NMF by selecting the tuning parameters of an unbiased computable estimator of the MSE [<xref ref-type="bibr" rid="ref25">25</xref>]. Thus, as illustrated in <xref rid="figure1" ref-type="fig">Figure 1</xref>, the aim is to find the inflection at which r matches the proper factorization rank by utilizing the “elbow point,” which is essentially the point where a steeply decreasing or increasing curve begins to turn “flat enough” [<xref ref-type="bibr" rid="ref19">19</xref>,<xref ref-type="bibr" rid="ref20">20</xref>,<xref ref-type="bibr" rid="ref22">22</xref>,<xref ref-type="bibr" rid="ref33">33</xref>,<xref ref-type="bibr" rid="ref35">35</xref>]. Furthermore, this study considered the function of the rank factorization curve and used the function uik() from the R package <italic>inflection</italic> to select the optimal rank [<xref ref-type="bibr" rid="ref33">33</xref>,<xref ref-type="bibr" rid="ref36">36</xref>,<xref ref-type="bibr" rid="ref37">37</xref>]. The uik() function detects the factorization rank at which the curve begins to climb faster (start point) and the point beyond which the curve flattens out (ending point), which are generally known as the <italic>knee points</italic> of a curve (<xref rid="figure1" ref-type="fig">Figure 1</xref>). In <xref rid="figure1" ref-type="fig">Figure 1</xref>, the emergence of factorization rank for the Golub et al [<xref ref-type="bibr" rid="ref38">38</xref>,<xref ref-type="bibr" rid="ref39">39</xref>] gene expression data set is shown on the rank survey plot. The optimal rank of the RSS plot lies between the knee points detected by the uik() function of the R package <italic>inflection</italic> on the curve to which the cumulative rank factorization belongs.</p>
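        <p>To make the idea of an elbow point concrete, the following Python listing sketches a simple chord-distance heuristic: it returns the rank whose point on the RSS curve lies farthest from the straight line joining the first and last points. This is a simplified stand-in chosen for illustration and is not the algorithm implemented by uik() in the <italic>inflection</italic> package; the function name knee_by_chord_distance and the RSS values are hypothetical.</p>
<preformat>
```python
def knee_by_chord_distance(x, y):
    """Return the x value whose point lies farthest from the chord joining
    the first and last points of the curve (a simple elbow heuristic)."""
    x0, y0, x1, y1 = x[0], y[0], x[-1], y[-1]
    dx, dy = x1 - x0, y1 - y0
    norm = (dx * dx + dy * dy) ** 0.5
    # Perpendicular distance of every point from the chord.
    dist = [abs(dy * (xi - x0) - dx * (yi - y0)) / norm for xi, yi in zip(x, y)]
    best = max(range(len(x)), key=lambda i: dist[i])
    return x[best]


# Hypothetical rank survey: RSS drops steeply up to rank 4, then flattens.
ranks = [2, 3, 4, 5, 6, 7, 8]
rss = [100.0, 60.0, 35.0, 30.0, 27.0, 25.0, 24.0]
knee = knee_by_chord_distance(ranks, rss)
print(knee)  # prints 4 for these values
```
</preformat>
        <p>The selected index does not change when the RSS values are multiplied by a positive constant, because every distance is scaled by the same factor; this mirrors, in a limited way, the unit invariance that motivates the UIK method.</p>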
        <fig id="figure1" position="float">
          <label>Figure 1</label>
          <caption>
            <p>(A) Rank survey plots of the residual sum of squares (RSS) and (B) cophenetic coefficient curves versus factorization rank. The factorization rank ranges from 3 to 37. The aim is to decide whether the optimal rank factorization is very rigid by simple visual inspection. (C) The function of factorization rank is selected as the emergence rank of the RSS survey. The rank range between knee points is detected by the uik() function of the R package "inflection" at the curve of the cumulative rank units. The best fit is determined using a linear regression model.</p>
          </caption>
          <graphic xlink:href="bioinform_v4i1e43665_fig1.png" alt-version="no" mimetype="image" position="float" xlink:type="simple"/>
        </fig>
      </sec>
      <sec>
        <title>Cross-validation</title>
        <p>This study used cross-validation to select an optimal number of latent elements in NMF. The goal of NMF is to obtain low-dimensional N and M with all nonnegative elements by minimizing the reconstruction error ||V – NM||<sup>2</sup>. Leaving out a single entry of V (eg, <bold>V<sub>ab</sub></bold>) and implementing NMF on the resulting matrix may produce a different result from that obtained with the full matrix. In other words, N and M are found by minimizing the reconstruction error over all nonmissing entries:</p>
        <disp-formula>
        ∑<sub>ij≠ab</sub>(V<sub>ij</sub> – [NM]<sub>ij</sub>)<sup>2</sup> <bold>(3)</bold>
        </disp-formula>
        <p>Consequently, the left-out element V<italic><sub>ab</sub></italic> can be predicted by calculating [<bold>NM</bold>]<sub>ab</sub> and then determining the prediction error as:</p>
        <disp-formula>
        E(ab) = (V<italic><sub>ab</sub></italic> – [<bold>NM</bold>]<sub>ab</sub>)<sup>2</sup> <bold>(4)</bold>
        </disp-formula>
        <p>One can repeat this process by leaving out each entry V<italic><sub>ab</sub></italic> one at a time and summing the prediction error over all <italic>a</italic> and <italic>b</italic>. This leads to the predicted residual sum of squares (PRESS) value, defined as E(r) = ∑<sub>ab</sub>E(ab), which strongly depends on the rank r. The value of r at which the prediction error E(r) attains its minimum is taken as the “optimal rank.”</p>
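        <p>The leave-one-out procedure of Equations 3 and 4 can be sketched in Python as follows. The listing is a hedged illustration rather than the code used in this study: it assumes that missing entries are handled by a 0/1 mask inside multiplicative updates, and the names masked_nmf and press, the iteration count, and the toy matrix are all hypothetical.</p>
<preformat>
```python
import random


def masked_nmf(v, mask, r, iters=300, seed=0, eps=1e-9):
    """Fit nonnegative N (m x r) and M (r x n) using only entries with mask 1."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    n = [[rng.random() + 0.1 for _ in range(r)] for _ in range(rows)]
    m = [[rng.random() + 0.1 for _ in range(cols)] for _ in range(r)]

    def product():
        return [[sum(n[i][k] * m[k][j] for k in range(r)) for j in range(cols)]
                for i in range(rows)]

    for _ in range(iters):
        nm = product()  # fixed while M is updated
        for k in range(r):
            for j in range(cols):
                num = sum(mask[i][j] * n[i][k] * v[i][j] for i in range(rows))
                den = sum(mask[i][j] * n[i][k] * nm[i][j] for i in range(rows))
                m[k][j] *= num / (den + eps)
        nm = product()  # fixed while N is updated
        for i in range(rows):
            for k in range(r):
                num = sum(mask[i][j] * m[k][j] * v[i][j] for j in range(cols))
                den = sum(mask[i][j] * m[k][j] * nm[i][j] for j in range(cols))
                n[i][k] *= num / (den + eps)
    return n, m


def press(v, r, **kwargs):
    """Sum the squared prediction errors E(ab) over all left-out entries."""
    rows, cols = len(v), len(v[0])
    total = 0.0
    for a in range(rows):
        for b in range(cols):
            # Mask out the single entry (a, b), as in Equation 3.
            mask = [[0 if (i, j) == (a, b) else 1 for j in range(cols)]
                    for i in range(rows)]
            n, m = masked_nmf(v, mask, r, **kwargs)
            pred = sum(n[a][k] * m[k][b] for k in range(r))
            total += (v[a][b] - pred) ** 2  # Equation 4
    return total


# Hypothetical rank-1 matrix: each row is a multiple of (1, 2, 3).
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]
press_by_rank = {r: press(v, r) for r in (1, 2)}
print(press_by_rank)
```
</preformat>
        <p>The rank with the smallest PRESS value would then be taken as the optimal rank. The cost grows with the number of entries because one factorization is fitted per left-out entry.</p>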
        <p>Since the NMF must be repeated for each left-out value and might also be difficult to code (depending on the target matrix entries and how easily NMF can be implemented with missing values), this can be a computationally expensive procedure. For instance, in PCA, one can avoid this by leaving out entire rows of V, which speeds up the computation [<xref ref-type="bibr" rid="ref40">40</xref>]. All the traditional cross-validation rules apply here. In particular, two techniques can help speed up the procedure: leaving out multiple entries at a time instead of a single entry, and repeating the computation on bootstrapped sets of entries instead of looping over all the entries.</p>
        <p>Note that various techniques have been developed to select the optimal factorization rank. For example, Brunet et al [<xref ref-type="bibr" rid="ref9">9</xref>] suggested taking the first value of r at which the cophenetic coefficient starts decreasing, whereas Frigyesi et al [<xref ref-type="bibr" rid="ref11">11</xref>] considered the smallest value at which the decrease in the RSS is lower than the decrease in the RSS obtained from random data. The aim of this study was to determine which approach performs better in estimating the latent factors under different NMF algorithms.</p>
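        <p>As an illustration of the cophenetic rule attributed above to Brunet et al, the following Python listing returns the first rank after which the cophenetic coefficient decreases. The function name first_rank_before_drop and the coefficient values are hypothetical and serve only to show the selection logic.</p>
<preformat>
```python
def first_rank_before_drop(ranks, coph):
    """Return the first rank whose cophenetic coefficient is followed by a decrease."""
    for r, current, following in zip(ranks, coph, coph[1:]):
        if current > following:
            return r
    return ranks[-1]  # no decrease observed in the surveyed range


# Hypothetical cophenetic coefficients for ranks 2 to 6.
ranks = [2, 3, 4, 5, 6]
coph = [0.990, 0.995, 0.970, 0.960, 0.950]
selected = first_rank_before_drop(ranks, coph)
print(selected)  # prints 3 for these values
```
</preformat>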
      </sec>
      <sec>
        <title>Gene Expression Data Set</title>
        <p>This study illustrates the utilization of NMF based on the UIK method to select the optimal rank on the RSS curve with a leukemia gene expression data set (esGolub) to simplify the classification of cancer subtypes [<xref ref-type="bibr" rid="ref38">38</xref>,<xref ref-type="bibr" rid="ref41">41</xref>,<xref ref-type="bibr" rid="ref42">42</xref>]. This data set has been used in several previous studies on NMF and is included in the NMF package [<xref ref-type="bibr" rid="ref9">9</xref>,<xref ref-type="bibr" rid="ref26">26</xref>,<xref ref-type="bibr" rid="ref43">43</xref>], packed into an ExpressionSet object [<xref ref-type="bibr" rid="ref39">39</xref>]. To achieve biologically meaningful results, we used the entire gene expression data set including 5000 features for 38 leukemia samples. The difference between acute myelogenous leukemia and acute lymphoblastic leukemia (ALL) has been noted. ALL is further separated into two subtypes: T-cell and B-cell ALL.</p>
        <p>Furthermore, this data set has served as a touchstone in cancer classification at the molecular, histological, and stage levels [<xref ref-type="bibr" rid="ref38">38</xref>,<xref ref-type="bibr" rid="ref44">44</xref>]. In this study, this data set was reprocessed to compare several clustering techniques regarding their effectiveness and stability in recovering other differentially expressed genes (DEGs) and associated pathways. Before the NMF procedure, dimension reduction is recommended for larger gene expression data sets by nonspecific criteria based on the characteristics of the expression estimates (ie, the mean threshold of variance and genes with the smallest average variances) [<xref ref-type="bibr" rid="ref45">45</xref>].</p>
        <p>For example, by looking at the NMF rank survey plot of RSS in <xref rid="figure1" ref-type="fig">Figure 1</xref>, we want to decide how many basis vectors we should keep to obtain the optimal rank of the target (original) matrix. To achieve such a task, an unbiased technique for deciding the number of clusters without visual interpretation that is simultaneously capable of utilizing a computational program is needed.</p>
      </sec>
      <sec>
        <title>Simulated Mutational Processes Data</title>
        <p>The simulated mutational process data were obtained from Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] and are publicly available as a MATLAB file on SigProfiler [<xref ref-type="bibr" rid="ref47">47</xref>]. They identified the handful of functional processes for a group of 100 simulated cancer genomes based on the repeatability of their signatures and low error for reconstructing the original catalogs. The data set was generated by employing 10 mutational processes with different signatures (motifs), each with 96 mutation types, and adding Poisson noise. The 96 mutation types correspond to the six substitution subtypes (C:G to A:T, C:G to G:C, C:G to T:A, T:A to A:T, T:A to C:G, and T:A to G:C) combined with their immediate 5′ and 3′ sequence context.</p>
        <p>Analyses were performed utilizing the R programming language. Before the procedure, the low-quality genes with an inadequate number of reads were eliminated and gene expression values were converted to a logarithmic scale. The data set (<xref ref-type="table" rid="table1">Table 1</xref>) was then normalized by computing the averages of each sample in R. The <italic>NMF</italic> R package was used to draw plots of rank surveys using the plot() function [<xref ref-type="bibr" rid="ref48">48</xref>]. Rank survey analysis was performed to compare the optimal rank with distinct methods using the <italic>inflection</italic> package’s uik() and check_curve() functions [<xref ref-type="bibr" rid="ref36">36</xref>]. The readMat() function of the R.matlab package [<xref ref-type="bibr" rid="ref49">49</xref>] was used to import the simulated mutational processes data (<xref ref-type="table" rid="table1">Table 1</xref>) from the MATLAB file into the R environment (see Supplementary Data S1 in <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref>).</p>
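        <p>The preprocessing steps described above (filtering, logarithmic transformation, and normalization by sample averages) can be sketched as follows. The analyses in this study were performed in R; this Python listing is only an illustration, and the filtering threshold min_total, the use of log2(x + 1), and the choice to divide each sample by its mean are assumptions, since the exact settings are not stated in the text.</p>
<preformat>
```python
import math


def preprocess(counts, min_total=10):
    """Filter, log-transform, and column-normalize a genes x samples matrix."""
    # Drop genes (rows) whose total count falls below the assumed threshold.
    kept = [row for row in counts if sum(row) >= min_total]
    # log2(x + 1) maps zero to zero and keeps every value nonnegative.
    logged = [[math.log2(x + 1) for x in row] for row in kept]
    # Divide each sample (column) by its mean so that every column averages to 1.
    means = [sum(col) / len(col) for col in zip(*logged)]
    return [[x / mu for x, mu in zip(row, means)] for row in logged]


# Hypothetical counts for 3 genes (rows) and 3 samples (columns).
counts = [[0, 1, 0], [10, 20, 30], [5, 50, 500]]
normalized = preprocess(counts)
print(len(normalized))  # prints 2: the first gene is filtered out
```
</preformat>
        <p>Dividing by the sample mean, rather than subtracting it, keeps all entries nonnegative, which NMF requires of its target matrix.</p>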
        <table-wrap position="float" id="table1">
          <label>Table 1</label>
          <caption>
            <p>Gene expression and simulated mutational data sets.</p>
          </caption>
          <table width="1000" cellpadding="5" cellspacing="0" border="1" rules="groups" frame="hsides">
            <col width="450"/>
            <col width="230"/>
            <col width="320"/>
            <thead>
              <tr valign="top">
                <td>Data set</td>
                <td>Size</td>
                <td>Samples</td>
              </tr>
            </thead>
            <tbody>
              <tr valign="top">
                <td>esGolub gene expression</td>
                <td>5000×38</td>
                <td>38</td>
              </tr>
              <tr valign="top">
                <td>Mutational processes</td>
                <td>96×100</td>
                <td>100</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
      </sec>
    </sec>
    <sec sec-type="results">
      <title>Results</title>
      <sec>
        <title>Applications of NMF Based on the UIK Method</title>
        <sec>
          <title>Leukemia (esGolub) Data Set</title>
          <p>The present results are based on the NMF package of Gaujoux and Seoighe [<xref ref-type="bibr" rid="ref26">26</xref>] combined with the technique introduced by Hutchins et al [<xref ref-type="bibr" rid="ref10">10</xref>] (<xref rid="figure1" ref-type="fig">Figure 1</xref>). However, as shown in <xref rid="figure2" ref-type="fig">Figure 2</xref>, this study also tested the “brunet” and “nsNMF” algorithms to illustrate notable differences. It is important to emphasize that there is no ground truth in the experimental data examined herein. Consequently, it is not possible to demonstrate beyond reasonable doubt that the proposed approach operates effectively on the experimental data set. As indicated in <xref rid="figure2" ref-type="fig">Figure 2</xref>, the uik() function selects the optimal rank between the point where the curve starts to decline faster (start point) and the point beyond which the curve flattens out (ending point), which are generally known as the knee points of a curve (<xref rid="figure1" ref-type="fig">Figure 1</xref>). The UIK method identified 15 components for the brunet algorithm, whereas it detected 14 latent factors for the nsNMF algorithm as the best representation of the whole esGolub data set.</p>
          <p>By simply looking at the cophenetic correlation or RSS plots in <xref rid="figure3" ref-type="fig">Figure 3</xref>A, one can confirm that the optimal factorization rank is 3. For performance reasons, the factorization of the submatrix esGolub [1:200] was initially performed with only 10 runs for each rank value. As demonstrated in <xref rid="figure3" ref-type="fig">Figure 3</xref>B, the optimal rank estimated by the UIK method was validated by comparing it with Gaujoux’s estimates for the esGolub subdata set [<xref ref-type="bibr" rid="ref50">50</xref>] (also see Supplementary Data S2 in <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref>). Consensus methods converged on a rank of 3, replicating the result of Brunet et al [<xref ref-type="bibr" rid="ref9">9</xref>], in which it was proposed that 3 factors yielded a more complete understanding of the esGolub data set with 200 features from 38 leukemia samples.</p>
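          <p>To make the idea of a knee point on an RSS curve concrete, the following is a minimal sketch only. It does not reproduce the uik() function of the R inflection package; instead, it uses a simpler stand-in heuristic that picks the rank whose point lies farthest from the straight line joining the two ends of the curve, after rescaling both axes to the unit interval. The function name and the RSS values are hypothetical and are not taken from the esGolub analysis.</p>
          <preformat>
```python
def elbow_by_chord_distance(ranks, rss):
    """Return the rank whose (rank, RSS) point lies farthest from the chord
    joining the first and last points of the curve.

    This is a simple elbow heuristic used for illustration; it is NOT the
    UIK algorithm implemented by inflection::uik() in R.
    """
    if len(ranks) != len(rss) or 3 > len(ranks):
        raise ValueError("need at least 3 paired (rank, rss) values")

    x0, y0 = ranks[0], rss[0]
    dx = float(ranks[-1] - x0)
    dy = float(rss[-1] - y0)
    if dx == 0 or dy == 0:
        raise ValueError("curve end points must differ on both axes")

    best_rank, best_gap = ranks[0], -1.0
    for r, v in zip(ranks, rss):
        # Rescale both axes to [0, 1] so the answer does not depend on units.
        u = (r - x0) / dx
        w = (v - y0) / dy
        # In rescaled coordinates the chord is the line w = u, so |w - u|
        # is proportional to the distance of the point from the chord.
        gap = abs(w - u)
        if gap > best_gap:
            best_rank, best_gap = r, gap
    return best_rank


# Hypothetical RSS values for ranks 2 to 6 (made up for illustration).
ranks = [2, 3, 4, 5, 6]
rss = [100.0, 60.0, 52.0, 47.0, 43.0]
print(elbow_by_chord_distance(ranks, rss))  # 3
```
          </preformat>
          <p>Rescaling the axes before measuring the distance keeps the selected rank unchanged when the RSS values are expressed in different units, which is the property that a unit-invariant knee estimate is meant to provide.</p>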
          <fig id="figure2" position="float">
            <label>Figure 2</label>
            <caption>
              <p>Application of the unit invariant knee (UIK) method on different algorithms: (A) “Brunet” and (B) “nsNMF.” The optimal rank identified by UIK is 15 for the Brunet algorithm, whereas UIK applied to the nsNMF algorithm yields 14 as the optimal rank, similar to the “Lee” algorithm.</p>
            </caption>
            <graphic xlink:href="bioinform_v4i1e43665_fig2.png" alt-version="no" mimetype="image" position="float" xlink:type="simple"/>
          </fig>
          <fig id="figure3" position="float">
            <label>Figure 3</label>
            <caption>
              <p>(A) Estimation of the optimal rank. Nonnegative matrix factorization (NMF) survey plot of quality measures obtained for factorization ranks from 2 to 6 by running the target matrix esGolub [1:200] 10 times. (B) The factorization rank is selected from the residual sum of squares (RSS) survey curve. For example, over the rank range of 2 to 6, the knee point is detected at 3 by the uik() function of the R inflection package. Overall, the UIK estimate was consistent with previous results.</p>
            </caption>
            <graphic xlink:href="bioinform_v4i1e43665_fig3.png" alt-version="no" mimetype="image" position="float" xlink:type="simple"/>
          </fig>
        </sec>
        <sec>
          <title>Simulated Mutational Process Data</title>
          <p>It is difficult to identify the factorization rank of the simulated data from the cophenetic coefficient curve (<xref rid="figure4" ref-type="fig">Figure 4</xref>A). Moreover, the factorization rank cannot be decided simply by observing the cophenetic correlation (<xref rid="figure4" ref-type="fig">Figure 4</xref>A) and the RSS (<xref rid="figure4" ref-type="fig">Figure 4</xref>B) plots. Nevertheless, the UIK method successfully validated the results of Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] and calculated 10 mutational signatures for the simulated data. From the perspective of Frigyesi et al [<xref ref-type="bibr" rid="ref11">11</xref>], <xref rid="figure4" ref-type="fig">Figure 4</xref>B further demonstrates that the actual optimal value of r=10 with the UIK method evaluates the ability of each value of the rank to classify the samples into the same number of classes, which could be smaller than the cophenetic measure (<xref rid="figure4" ref-type="fig">Figure 4</xref>A). Despite a decline in the cophenetic correlation coefficient value for r=5, 8, 10, the clusters are stationary and reflected as robust by Brunet et al [<xref ref-type="bibr" rid="ref9">9</xref>], which produces unmeaningful results that match the actual signatures. Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] considered that the biological significance of the 10th cluster, for r=10, is less clear with the cophenetic measure. The sharp decrease in the cophenetic correlation coefficient at r=13 indicates that substantially less stability is achieved using more than 10 clusters. Since this approach does not always provide a clear and consistent cutoff for the choice of r, Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] utilized the average silhouette width of the <italic>N</italic> clusters as a measure of reproducibility for the whole solution. Here, the UIK estimate was validated against the actual signatures reported by Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] (see Supplementary Data S3-S4 in <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref>).</p>
          <fig id="figure4" position="float">
            <label>Figure 4</label>
            <caption>
              <p>(A) It is complicated to locate the optimal rank with the cophenetic correlation coefficient approach. (B) However, the unit invariant knee (UIK) method can facilitate this decision more quickly and more accurately, which agrees with the number of signatures detected by Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>]. RSS: residual sum of squares.</p>
            </caption>
            <graphic xlink:href="bioinform_v4i1e43665_fig4.png" alt-version="no" mimetype="image" position="float" xlink:type="simple"/>
          </fig>
        </sec>
      </sec>
    </sec>
    <sec sec-type="discussion">
      <title>Discussion</title>
      <sec>
        <title>Principal Results</title>
        <p>The novel finding of this study is that the UIK method can be applied to select optimal ranks based on the RSS curve over the factorization ranks of the NMF technique. First, this study applied the UIK method to the Golub et al [<xref ref-type="bibr" rid="ref38">38</xref>] data set and simulated mutational process data [<xref ref-type="bibr" rid="ref46">46</xref>,<xref ref-type="bibr" rid="ref47">47</xref>]; the method does not require averaging out the results from different runs of the nmf() function [<xref ref-type="bibr" rid="ref50">50</xref>] or considering the variance between each run.</p>
        <p>Second, the UIK method precisely estimates the rank of simulated data with known dimensions. The UIK technique requires no a priori rank parameter and no initial parameter settings that considerably affect performance. Finally, this method was tested on gene expression data deconvolution, achieving optimal rank estimation.</p>
        <p>The proposed uikNMF technique was tested on both experimental gene expression and simulated mutational processes data sets. Moreover, our recent study applying the UIK technique to NMF revealed the genetic links of type 2 diabetes (T2D) that could lead to the development of Alzheimer disease (AD) [<xref ref-type="bibr" rid="ref51">51</xref>]. The study extracted the most significant genes, or so-called “metagenes,” using the elbow method in T2D data, which may be helpful for gaining insight into the mechanism of AD and the development of related therapeutics.</p>
        <p>This study further shows that the UIK method provides a credible prediction for gene expression data and precisely estimates the rank of simulated data with known dimensions. The proposed UIK method, which estimates the optimal rank from the first inflection point of the RSS curve, is theoretically superior or equivalent to existing implementations and software. All analyses were performed in R, and the code is freely available.</p>
        <p>Future work includes packaging the UIK method for NMF rank estimation into a single function to accommodate analyses of gene expression, mutational processes, and other biological data sets at the molecular level.</p>
      </sec>
      <sec>
        <title>Limitations</title>
        <p>The analysis has some limitations; in particular, other NMF packages or software for gene expression research were not tested. This study demonstrates that the UIK method provides a credible prediction for gene expression data. However, it was simply assumed that other packages implement the same NMF algorithms, such that the RSS and residual curves would be approximated in the same way and the UIK method would yield the same optimal ranks.</p>
      </sec>
      <sec>
        <title>Comparison With Prior Work</title>
        <p>One of the arguments related to the choice of rank is to remove noise and recover the signatures [<xref ref-type="bibr" rid="ref52">52</xref>]. However, when it comes to NMF, the choice of noise is not obvious as the noisy version of the target matrix must be nonnegative as well, which suggests that injected noise may also introduce bias [<xref ref-type="bibr" rid="ref53">53</xref>]. In addition, the selection of the noise distribution is yet another hyperparameter that is not obvious to select. To handle the noise issue, it is suggested to use gene expression data sets (ie, microarrays) with low-quality reads and genes with a very low number of reads removed before DEGs analysis. The DEGs would then be used as the target matrix for the uikNMF method, as previously demonstrated with T2D gene expression data [<xref ref-type="bibr" rid="ref51">51</xref>].</p>
        <p>Several methods have been developed to select the optimal factorization rank [<xref ref-type="bibr" rid="ref50">50</xref>]. For example, Brunet et al [<xref ref-type="bibr" rid="ref9">9</xref>] proposed taking the first value of r at which the cophenetic coefficient starts to decrease, whereas Frigyesi et al [<xref ref-type="bibr" rid="ref11">11</xref>] considered the smallest value of r at which the decrease in the RSS is lower than the decrease in the RSS obtained from random data. The aim of this study was to develop a method for deciding which approach performs better in estimating the number of latent factors under different NMF algorithms.</p>
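        <p>The two selection rules described above can be sketched as follows. This is an illustrative reading of the verbal descriptions, not the code of Brunet et al or Frigyesi et al. The function names, the convention of returning the rank immediately before the qualifying step, and all numeric values are assumptions made for this example.</p>
        <preformat>
```python
def brunet_rank(ranks, cophenetic):
    """First rank at which the cophenetic coefficient starts decreasing.

    Assumed convention: return the rank just before the first drop.
    Returns None if the coefficient never decreases.
    """
    for i in range(len(ranks) - 1):
        if cophenetic[i] > cophenetic[i + 1]:
            return ranks[i]
    return None


def frigyesi_rank(ranks, rss_data, rss_random):
    """Smallest rank at which the RSS decrease on the real data is lower
    than the RSS decrease obtained on randomized data.

    Assumed convention: the decrease is measured from one rank to the next,
    and the rank at the start of that step is returned.
    Returns None if the data always improve faster than random data.
    """
    for i in range(len(ranks) - 1):
        drop_data = rss_data[i] - rss_data[i + 1]
        drop_random = rss_random[i] - rss_random[i + 1]
        if drop_random > drop_data:
            return ranks[i]
    return None


# Hypothetical survey values for ranks 2 to 6 (made up for illustration).
ranks = [2, 3, 4, 5, 6]
cophenetic = [0.990, 0.995, 0.970, 0.960, 0.950]
rss_data = [100.0, 60.0, 52.0, 47.0, 43.0]
rss_random = [100.0, 90.0, 80.0, 70.0, 60.0]

print(brunet_rank(ranks, cophenetic))             # 3
print(frigyesi_rank(ranks, rss_data, rss_random))  # 3
```
        </preformat>
        <p>Both rules scan the survey curves from the lowest rank upward and stop at the first rank that meets their condition, so the result depends on the range of ranks surveyed and, for the second rule, on how the random reference data are generated.</p>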
      </sec>
      <sec>
        <title>Conclusions</title>
        <p>This study demonstrates that the elbow method provides a credible prediction for gene expression data and precisely estimates the rank of simulated mutational processes data with known dimensions. The suggested UIK method is faster than conventional methods that use the consensus matrix as a benchmark for rank choice, while achieving considerably better computational efficiency without visual inspection of the curves. It is further argued that the suggested rank tuning method based on the elbow method with gene expression data is theoretically superior to the cophenetic measure. Lastly, the proposed method could be applied to other types of gene expression data sets to reveal the most significant genes (so-called “metagenes”) in various diseases, including T2D and other metabolic diseases, and may further be helpful for understanding the underlying mechanism of AD and related neurological disorders.</p>
      </sec>
    </sec>
  </body>
  <back>
    <app-group>
      <supplementary-material id="app1">
        <label>Multimedia Appendix 1</label>
        <p>Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] simulated mutational signatures data summary (Supplementary Data S1). Implementation of the comparison of Gaujoux estimates of the esGolub subdata set with the unit invariant knee (UIK) method (Supplementary Data S2). The rank survey plot of Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] simulated mutational signatures data (Supplementary Data S3). Application of the UIK method on Alexandrov et al [<xref ref-type="bibr" rid="ref46">46</xref>] simulated mutational signatures data (Supplementary Data S4).</p>
        <media xlink:href="bioinform_v4i1e43665_app1.pdf" xlink:title="PDF File  (Adobe PDF File), 988 KB"/>
      </supplementary-material>
    </app-group>
    <glossary>
      <title>Abbreviations</title>
      <def-list>
        <def-item>
          <term id="abb1">AD</term>
          <def>
            <p>Alzheimer disease</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb2">ALL</term>
          <def>
            <p>acute lymphoblastic leukemia</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb3">DEG</term>
          <def>
            <p>differentially expressed gene</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb4">FA</term>
          <def>
            <p>factor analysis</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb5">MSE</term>
          <def>
            <p>mean squared error</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb6">NMF</term>
          <def>
            <p>nonnegative matrix factorization</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb7">PCA</term>
          <def>
            <p>principal component analysis</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb8">PRESS</term>
          <def>
            <p>predicted residual sum of squares</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb9">RSS</term>
          <def>
            <p>residual sum of squares</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb10">T2D</term>
          <def>
            <p>type 2 diabetes</p>
          </def>
        </def-item>
        <def-item>
          <term id="abb11">UIK</term>
          <def>
            <p>unit invariant knee</p>
          </def>
        </def-item>
      </def-list>
    </glossary>
    <notes>
      <sec>
        <title>Data Availability</title>
        <p>The Golub gene expression [<xref ref-type="bibr" rid="ref38">38</xref>] and simulated mutational processes [<xref ref-type="bibr" rid="ref46">46</xref>] data sets are publicly available. The data and related RStudio code supporting the findings of the article are available in <xref ref-type="supplementary-material" rid="app1">Multimedia Appendix 1</xref>.</p>
      </sec>
    </notes>
    <fn-group>
      <fn fn-type="conflict">
        <p>None declared.</p>
      </fn>
    </fn-group>
    <ref-list>
      <ref id="ref1">
        <label>1</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Laurberg</surname>
              <given-names>H</given-names>
            </name>
          </person-group>
          <article-title>Non-negative matrix factorization: theory and methods. PhD thesis</article-title>
          <source>Institut for Elektroniske Systemer, Aalborg University Denmark</source>
          <year>2008</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://vbn.aau.dk/ws/portalfiles/portal/316444854/HLA_thesis.pdf">https://vbn.aau.dk/ws/portalfiles/portal/316444854/HLA_thesis.pdf</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref2">
        <label>2</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Kameoka</surname>
              <given-names>H</given-names>
            </name>
            <name name-style="western">
              <surname>Ono</surname>
              <given-names>N</given-names>
            </name>
            <name name-style="western">
              <surname>Kashino</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Sagayama</surname>
              <given-names>S</given-names>
            </name>
          </person-group>
          <article-title>NMF: A new sparse representation for acoustic signals</article-title>
          <year>2009</year>
          <conf-name>IEEE International Conference on Acoustics, Speech and Signal Processing</conf-name>
          <conf-date>April 19-24, 2009</conf-date>
          <conf-loc>Taipei, Taiwan</conf-loc>
          <pub-id pub-id-type="doi">10.1109/icassp.2009.4960364</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref3">
        <label>3</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Cantisani</surname>
              <given-names>G</given-names>
            </name>
            <name name-style="western">
              <surname>Essid</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Richard</surname>
              <given-names>G</given-names>
            </name>
          </person-group>
          <article-title>Neuro-steered music source separation with EEG-based auditory attention decoding and contrastive-NMF</article-title>
          <year>2021</year>
          <conf-name>IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021</conf-name>
          <conf-date>June 6-11, 2021</conf-date>
          <conf-loc>Toronto, ON</conf-loc>
          <pub-id pub-id-type="doi">10.1109/icassp39728.2021.9413841</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref4">
        <label>4</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Lee</surname>
              <given-names>DD</given-names>
            </name>
            <name name-style="western">
              <surname>Seung</surname>
              <given-names>HS</given-names>
            </name>
          </person-group>
          <article-title>Learning the parts of objects by non-negative matrix factorization</article-title>
          <source>Nature</source>
          <year>1999</year>
          <month>10</month>
          <day>21</day>
          <volume>401</volume>
          <issue>6755</issue>
          <fpage>788</fpage>
          <lpage>791</lpage>
          <pub-id pub-id-type="doi">10.1038/44565</pub-id>
          <pub-id pub-id-type="medline">10548103</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref5">
        <label>5</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Ramanarayanan</surname>
              <given-names>V</given-names>
            </name>
            <name name-style="western">
              <surname>Katsamanis</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Narayanan</surname>
              <given-names>S</given-names>
            </name>
          </person-group>
          <article-title>Automatic data-driven learning of articulatory primitives from real-time MRI data using convolutive NMF with sparseness constraints</article-title>
          <year>2011</year>
          <conf-name>INTERSPEECH 2011, 12th Annual Conference of the International Speech Communication Association</conf-name>
          <conf-date>August 27-31, 2011</conf-date>
          <conf-loc>Florence, Italy</conf-loc>
          <pub-id pub-id-type="doi">10.21437/interspeech.2011-16</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref6">
        <label>6</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Zhu</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Soldevila</surname>
              <given-names>F</given-names>
            </name>
            <name name-style="western">
              <surname>Moretti</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>d'Arco</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Boniface</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Shao</surname>
              <given-names>X</given-names>
            </name>
            <name name-style="western">
              <surname>de Aguiar</surname>
              <given-names>HB</given-names>
            </name>
            <name name-style="western">
              <surname>Gigan</surname>
              <given-names>S</given-names>
            </name>
          </person-group>
          <article-title>Large field-of-view non-invasive imaging through scattering layers using fluctuating random illumination</article-title>
          <source>Nat Commun</source>
          <year>2022</year>
          <month>03</month>
          <day>18</day>
          <volume>13</volume>
          <issue>1</issue>
          <fpage>1447</fpage>
          <pub-id pub-id-type="doi">10.1038/s41467-022-29166-y</pub-id>
          <pub-id pub-id-type="medline">35304460</pub-id>
          <pub-id pub-id-type="pii">10.1038/s41467-022-29166-y</pub-id>
          <pub-id pub-id-type="pmcid">PMC8933547</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref7">
        <label>7</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Zhang</surname>
              <given-names>Y</given-names>
            </name>
            <name name-style="western">
              <surname>Du</surname>
              <given-names>N</given-names>
            </name>
            <name name-style="western">
              <surname>Ge</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Jia</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Zhang</surname>
              <given-names>A</given-names>
            </name>
          </person-group>
          <article-title>A collective NMF method for detecting protein functional module from multiple data sources</article-title>
          <year>2012</year>
          <conf-name>BCB '12: ACM Conference on Bioinformatics, Computational Biology and Biomedicine</conf-name>
          <conf-date>October 8-10, 2012</conf-date>
          <conf-loc>Orlando, Florida</conf-loc>
          <pub-id pub-id-type="doi">10.1145/2382936.2383053</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref8">
        <label>8</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Ye</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Toyoda</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Ohtsuki</surname>
              <given-names>T</given-names>
            </name>
          </person-group>
          <article-title>Blind source separation on non-contact heartbeat detection by non-negative matrix factorization algorithms</article-title>
          <source>IEEE Trans Biomed Eng</source>
          <year>2020</year>
          <month>2</month>
          <volume>67</volume>
          <issue>2</issue>
          <fpage>482</fpage>
          <lpage>494</lpage>
          <pub-id pub-id-type="doi">10.1109/tbme.2019.2915762</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref9">
        <label>9</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Brunet</surname>
              <given-names>J</given-names>
            </name>
            <name name-style="western">
              <surname>Tamayo</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Golub</surname>
              <given-names>TR</given-names>
            </name>
            <name name-style="western">
              <surname>Mesirov</surname>
              <given-names>JP</given-names>
            </name>
          </person-group>
          <article-title>Metagenes and molecular pattern discovery using matrix factorization</article-title>
          <source>Proc Natl Acad Sci U S A</source>
          <year>2004</year>
          <month>03</month>
          <day>23</day>
          <volume>101</volume>
          <issue>12</issue>
          <fpage>4164</fpage>
          <lpage>4169</lpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://europepmc.org/abstract/MED/15016911"/>
          </comment>
          <pub-id pub-id-type="doi">10.1073/pnas.0308531101</pub-id>
          <pub-id pub-id-type="medline">15016911</pub-id>
          <pub-id pub-id-type="pii">0308531101</pub-id>
          <pub-id pub-id-type="pmcid">PMC384712</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref10">
        <label>10</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Hutchins</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Murphy</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Singh</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Graber</surname>
              <given-names>J</given-names>
            </name>
          </person-group>
          <article-title>Position-dependent motif characterization using non-negative matrix factorization</article-title>
          <source>Bioinformatics</source>
          <year>2008</year>
          <month>12</month>
          <day>01</day>
          <volume>24</volume>
          <issue>23</issue>
          <fpage>2684</fpage>
          <lpage>2690</lpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://europepmc.org/abstract/MED/18852176"/>
          </comment>
          <pub-id pub-id-type="doi">10.1093/bioinformatics/btn526</pub-id>
          <pub-id pub-id-type="medline">18852176</pub-id>
          <pub-id pub-id-type="pii">btn526</pub-id>
          <pub-id pub-id-type="pmcid">PMC2639279</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref11">
        <label>11</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Frigyesi</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Höglund</surname>
              <given-names>M</given-names>
            </name>
          </person-group>
          <article-title>Non-negative matrix factorization for the analysis of complex gene expression data: identification of clinically relevant tumor subtypes</article-title>
          <source>Cancer Inform</source>
          <year>2008</year>
          <month>05</month>
          <day>29</day>
          <volume>6</volume>
          <fpage>275</fpage>
          <lpage>292</lpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://journals.sagepub.com/doi/abs/10.4137/cin.s606?url_ver=Z39.88-2003&amp;rfr_id=ori:rid:crossref.org&amp;rfr_dat=cr_pub%20%200pubmed"/>
          </comment>
          <pub-id pub-id-type="doi">10.4137/cin.s606</pub-id>
          <pub-id pub-id-type="medline">19259414</pub-id>
          <pub-id pub-id-type="pmcid">PMC2623306</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref12">
        <label>12</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Pal</surname>
              <given-names>NR</given-names>
            </name>
            <name name-style="western">
              <surname>Aguan</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Sharma</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Amari</surname>
              <given-names>S</given-names>
            </name>
          </person-group>
          <article-title>Discovering biomarkers from gene expression data for predicting cancer subgroups using neural networks and relational fuzzy clustering</article-title>
          <source>BMC Bioinformatics</source>
          <year>2007</year>
          <month>01</month>
          <day>06</day>
          <volume>8</volume>
          <issue>1</issue>
          <fpage>5</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-8-5"/>
          </comment>
          <pub-id pub-id-type="doi">10.1186/1471-2105-8-5</pub-id>
          <pub-id pub-id-type="medline">17207284</pub-id>
          <pub-id pub-id-type="pii">1471-2105-8-5</pub-id>
          <pub-id pub-id-type="pmcid">PMC1770936</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref13">
        <label>13</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Tsai</surname>
              <given-names>Y</given-names>
            </name>
            <name name-style="western">
              <surname>Lin</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Tseng</surname>
              <given-names>GC</given-names>
            </name>
            <name name-style="western">
              <surname>Chung</surname>
              <given-names>I</given-names>
            </name>
            <name name-style="western">
              <surname>Pal</surname>
              <given-names>NR</given-names>
            </name>
          </person-group>
          <article-title>Discovery of dominant and dormant genes from expression data using a novel generalization of SNR for multi-class problems</article-title>
          <source>BMC Bioinformatics</source>
          <year>2008</year>
          <month>10</month>
          <day>09</day>
          <volume>9</volume>
          <issue>1</issue>
          <fpage>425</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-425"/>
          </comment>
          <pub-id pub-id-type="doi">10.1186/1471-2105-9-425</pub-id>
          <pub-id pub-id-type="medline">18842155</pub-id>
          <pub-id pub-id-type="pii">1471-2105-9-425</pub-id>
          <pub-id pub-id-type="pmcid">PMC2620271</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref14">
        <label>14</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Akçay</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Güven</surname>
              <given-names>E</given-names>
            </name>
            <name name-style="western">
              <surname>Afzal</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Kazmi</surname>
              <given-names>I</given-names>
            </name>
          </person-group>
          <article-title>Non-negative matrix factorization and differential expression analyses identify hub genes linked to progression and prognosis of glioblastoma multiforme</article-title>
          <source>Gene</source>
          <year>2022</year>
          <month>05</month>
          <day>25</day>
          <volume>824</volume>
          <fpage>146395</fpage>
          <pub-id pub-id-type="doi">10.1016/j.gene.2022.146395</pub-id>
          <pub-id pub-id-type="medline">35283227</pub-id>
          <pub-id pub-id-type="pii">S0378-1119(22)00214-1</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref15">
        <label>15</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Bicciato</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Luchini</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Di Bello</surname>
              <given-names>C</given-names>
            </name>
          </person-group>
          <article-title>PCA disjoint models for multiclass cancer analysis using gene expression data</article-title>
          <source>Bioinformatics</source>
          <year>2003</year>
          <month>03</month>
          <day>22</day>
          <volume>19</volume>
          <issue>5</issue>
          <fpage>571</fpage>
          <lpage>578</lpage>
          <pub-id pub-id-type="doi">10.1093/bioinformatics/btg051</pub-id>
          <pub-id pub-id-type="medline">12651714</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref16">
        <label>16</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Staffini</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Fujita</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Svensson</surname>
              <given-names>AK</given-names>
            </name>
            <name name-style="western">
              <surname>Chung</surname>
              <given-names>U</given-names>
            </name>
            <name name-style="western">
              <surname>Svensson</surname>
              <given-names>T</given-names>
            </name>
          </person-group>
          <article-title>Statistical methods for item reduction in a representative lifestyle questionnaire: pilot questionnaire study</article-title>
          <source>Interact J Med Res</source>
          <year>2022</year>
          <month>03</month>
          <day>18</day>
          <volume>11</volume>
          <issue>1</issue>
          <fpage>e28692</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://www.i-jmr.org/2022/1/e28692/"/>
          </comment>
          <pub-id pub-id-type="doi">10.2196/28692</pub-id>
          <pub-id pub-id-type="medline">35302507</pub-id>
          <pub-id pub-id-type="pii">v11i1e28692</pub-id>
          <pub-id pub-id-type="pmcid">PMC8976253</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref17">
        <label>17</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Christopoulos</surname>
              <given-names>D</given-names>
            </name>
          </person-group>
          <article-title>Developing methods for identifying the inflection point of a convex/concave curve</article-title>
          <source>arXiv</source>
          <year>2012</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://arxiv.org/abs/1206.5478">https://arxiv.org/abs/1206.5478</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref18">
        <label>18</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Inglehart</surname>
              <given-names>R</given-names>
            </name>
            <name name-style="western">
              <surname>Haerpfer</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Moreno</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Welzel</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Kizilova</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Diez-Medrano</surname>
              <given-names>J</given-names>
            </name>
          </person-group>
          <article-title>World Values Survey Round Five. Country-Pooled Datafile Version</article-title>
          <source>WVS Database</source>
          <year>2005</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://www.worldvaluessurvey.org/WVSDocumentationWV5.jsp">https://www.worldvaluessurvey.org/WVSDocumentationWV5.jsp</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref19">
        <label>19</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Christopoulos</surname>
              <given-names>D</given-names>
            </name>
          </person-group>
          <article-title>Introducing unit invariant knee (UIK) as an objective choice for elbow point in multivariate data analysis techniques</article-title>
          <source>SSRN Journal</source>
          <year>2016</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3043076">https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3043076</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref20">
        <label>20</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Cattell</surname>
              <given-names>RB</given-names>
            </name>
          </person-group>
          <article-title>The scree test for the number of factors</article-title>
          <source>Multivariate Behav Res</source>
          <year>1966</year>
          <month>04</month>
          <day>01</day>
          <volume>1</volume>
          <issue>2</issue>
          <fpage>245</fpage>
          <lpage>276</lpage>
          <pub-id pub-id-type="doi">10.1207/s15327906mbr0102_10</pub-id>
          <pub-id pub-id-type="medline">26828106</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref21">
        <label>21</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Islam</surname>
              <given-names>SA</given-names>
            </name>
            <name name-style="western">
              <surname>Díaz-Gay</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Wu</surname>
              <given-names>Y</given-names>
            </name>
            <name name-style="western">
              <surname>Barnes</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Vangara</surname>
              <given-names>R</given-names>
            </name>
            <name name-style="western">
              <surname>Bergstrom</surname>
              <given-names>EN</given-names>
            </name>
            <name name-style="western">
              <surname>He</surname>
              <given-names>Y</given-names>
            </name>
            <name name-style="western">
              <surname>Vella</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Wang</surname>
              <given-names>J</given-names>
            </name>
            <name name-style="western">
              <surname>Teague</surname>
              <given-names>JW</given-names>
            </name>
            <name name-style="western">
              <surname>Clapham</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Moody</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Senkin</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Li</surname>
              <given-names>YR</given-names>
            </name>
            <name name-style="western">
              <surname>Riva</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Zhang</surname>
              <given-names>T</given-names>
            </name>
            <name name-style="western">
              <surname>Gruber</surname>
              <given-names>AJ</given-names>
            </name>
            <name name-style="western">
              <surname>Steele</surname>
              <given-names>CD</given-names>
            </name>
            <name name-style="western">
              <surname>Otlu</surname>
              <given-names>B</given-names>
            </name>
            <name name-style="western">
              <surname>Khandekar</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Abbasi</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Humphreys</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Syulyukina</surname>
              <given-names>N</given-names>
            </name>
            <name name-style="western">
              <surname>Brady</surname>
              <given-names>SW</given-names>
            </name>
            <name name-style="western">
              <surname>Alexandrov</surname>
              <given-names>BS</given-names>
            </name>
            <name name-style="western">
              <surname>Pillay</surname>
              <given-names>N</given-names>
            </name>
            <name name-style="western">
              <surname>Zhang</surname>
              <given-names>J</given-names>
            </name>
            <name name-style="western">
              <surname>Adams</surname>
              <given-names>DJ</given-names>
            </name>
            <name name-style="western">
              <surname>Martincorena</surname>
              <given-names>I</given-names>
            </name>
            <name name-style="western">
              <surname>Wedge</surname>
              <given-names>DC</given-names>
            </name>
            <name name-style="western">
              <surname>Landi</surname>
              <given-names>MT</given-names>
            </name>
            <name name-style="western">
              <surname>Brennan</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Stratton</surname>
              <given-names>MR</given-names>
            </name>
            <name name-style="western">
              <surname>Rozen</surname>
              <given-names>SG</given-names>
            </name>
            <name name-style="western">
              <surname>Alexandrov</surname>
              <given-names>LB</given-names>
            </name>
          </person-group>
          <article-title>Uncovering novel mutational signatures by de novo extraction with SigProfilerExtractor</article-title>
          <source>Cell Genom</source>
          <year>2022</year>
          <month>11</month>
          <day>09</day>
          <volume>2</volume>
          <issue>11</issue>
          <fpage>100179</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://linkinghub.elsevier.com/retrieve/pii/S2666-979X(22)00124-0"/>
          </comment>
          <pub-id pub-id-type="doi">10.1016/j.xgen.2022.100179</pub-id>
          <pub-id pub-id-type="medline">36388765</pub-id>
          <pub-id pub-id-type="pii">S2666-979X(22)00124-0</pub-id>
          <pub-id pub-id-type="pmcid">PMC9646490</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref22">
        <label>22</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Tabak</surname>
              <given-names>MA</given-names>
            </name>
            <name name-style="western">
              <surname>Murray</surname>
              <given-names>KL</given-names>
            </name>
            <name name-style="western">
              <surname>Reed</surname>
              <given-names>AM</given-names>
            </name>
            <name name-style="western">
              <surname>Lombardi</surname>
              <given-names>JA</given-names>
            </name>
            <name name-style="western">
              <surname>Bay</surname>
              <given-names>KJ</given-names>
            </name>
          </person-group>
          <article-title>Automated classification of bat echolocation call recordings with artificial intelligence</article-title>
          <source>Ecol Inform</source>
          <year>2022</year>
          <month>05</month>
          <volume>68</volume>
          <fpage>101526</fpage>
          <pub-id pub-id-type="doi">10.1016/j.ecoinf.2021.101526</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref23">
        <label>23</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Saurette</surname>
              <given-names>DD</given-names>
            </name>
            <name name-style="western">
              <surname>Berg</surname>
              <given-names>AA</given-names>
            </name>
            <name name-style="western">
              <surname>Laamrani</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Heck</surname>
              <given-names>RJ</given-names>
            </name>
            <name name-style="western">
              <surname>Gillespie</surname>
              <given-names>AW</given-names>
            </name>
            <name name-style="western">
              <surname>Voroney</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Biswas</surname>
              <given-names>A</given-names>
            </name>
          </person-group>
          <article-title>Effects of sample size and covariate resolution on field-scale predictive digital mapping of soil carbon</article-title>
          <source>Geoderma</source>
          <year>2022</year>
          <month>11</month>
          <volume>425</volume>
          <fpage>116054</fpage>
          <pub-id pub-id-type="doi">10.1016/j.geoderma.2022.116054</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref24">
        <label>24</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Maisog</surname>
              <given-names>JM</given-names>
            </name>
            <name name-style="western">
              <surname>DeMarco</surname>
              <given-names>AT</given-names>
            </name>
            <name name-style="western">
              <surname>Devarajan</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Young</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Fogel</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Luta</surname>
              <given-names>G</given-names>
            </name>
          </person-group>
          <article-title>Assessing methods for evaluating the number of components in non-negative matrix factorization</article-title>
          <source>Mathematics</source>
          <year>2021</year>
          <month>11</month>
          <day>02</day>
          <volume>9</volume>
          <issue>22</issue>
          <fpage>2840</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://europepmc.org/abstract/MED/35694180"/>
          </comment>
          <pub-id pub-id-type="doi">10.3390/math9222840</pub-id>
          <pub-id pub-id-type="medline">35694180</pub-id>
          <pub-id pub-id-type="pii">2840</pub-id>
          <pub-id pub-id-type="pmcid">PMC9181460</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref25">
        <label>25</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Muzzarelli</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Weis</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Eickhoff</surname>
              <given-names>SB</given-names>
            </name>
            <name name-style="western">
              <surname>Patil</surname>
              <given-names>KR</given-names>
            </name>
          </person-group>
          <article-title>Rank selection in non-negative matrix factorization: systematic comparison and a new MAD metric</article-title>
          <year>2019</year>
          <conf-name>International Joint Conference on Neural Networks (IJCNN)</conf-name>
          <conf-date>July 14-19, 2019</conf-date>
          <conf-loc>Budapest, Hungary</conf-loc>
          <pub-id pub-id-type="doi">10.1109/ijcnn.2019.8852146</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref26">
        <label>26</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Gaujoux</surname>
              <given-names>R</given-names>
            </name>
            <name name-style="western">
              <surname>Seoighe</surname>
              <given-names>C</given-names>
            </name>
          </person-group>
          <article-title>A flexible R package for nonnegative matrix factorization</article-title>
          <source>BMC Bioinformatics</source>
          <year>2010</year>
          <month>07</month>
          <day>02</day>
          <volume>11</volume>
          <issue>1</issue>
          <fpage>367</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-11-367"/>
          </comment>
          <pub-id pub-id-type="doi">10.1186/1471-2105-11-367</pub-id>
          <pub-id pub-id-type="medline">20598126</pub-id>
          <pub-id pub-id-type="pii">1471-2105-11-367</pub-id>
          <pub-id pub-id-type="pmcid">PMC2912887</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref27">
        <label>27</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Marutho</surname>
              <given-names>D</given-names>
            </name>
            <name name-style="western">
              <surname>Handaka</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Wijaya</surname>
              <given-names>E</given-names>
            </name>
            <collab>Muljono</collab>
          </person-group>
          <article-title>The determination of cluster number at k-mean using elbow method and purity evaluation on headline news</article-title>
          <year>2018</year>
          <conf-name>International Seminar on Application for Technology of Information and Communication</conf-name>
          <conf-date>September 21-22, 2018</conf-date>
          <conf-loc>Semarang, Indonesia</conf-loc>
          <pub-id pub-id-type="doi">10.1109/isemantic.2018.8549751</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref28">
        <label>28</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Et-taleby</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Boussetta</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Benslimane</surname>
              <given-names>M</given-names>
            </name>
          </person-group>
          <article-title>Faults detection for photovoltaic field based on k-means, elbow, and average silhouette techniques through the segmentation of a thermal image</article-title>
          <source>Int J Photoenergy</source>
          <year>2020</year>
          <month>12</month>
          <day>04</day>
          <volume>2020</volume>
          <fpage>6617597</fpage>
          <pub-id pub-id-type="doi">10.1155/2020/6617597</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref29">
        <label>29</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Liu</surname>
              <given-names>Z</given-names>
            </name>
            <name name-style="western">
              <surname>Tan</surname>
              <given-names>V</given-names>
            </name>
          </person-group>
          <article-title>Rank-one NMF-based initialization for NMF and relative error bounds under a geometric assumption</article-title>
          <source>IEEE Trans Signal Process</source>
          <year>2018</year>
          <volume>65</volume>
          <issue>18</issue>
          <fpage>4717</fpage>
          <lpage>4731</lpage>
          <pub-id pub-id-type="doi">10.1109/ita.2018.8503169</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref30">
        <label>30</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Bandyopadhyay</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Thakur</surname>
              <given-names>SS</given-names>
            </name>
            <name name-style="western">
              <surname>Mandal</surname>
              <given-names>JK</given-names>
            </name>
          </person-group>
          <article-title>Product recommendation for e-commerce business by applying principal component analysis (PCA) and K-means clustering: benefit for the society</article-title>
          <source>Innovations Syst Softw Eng</source>
          <year>2020</year>
          <month>08</month>
          <day>25</day>
          <volume>17</volume>
          <issue>1</issue>
          <fpage>45</fpage>
          <lpage>52</lpage>
          <pub-id pub-id-type="doi">10.1007/s11334-020-00372-5</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref31">
        <label>31</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Moltu</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Stefansen</surname>
              <given-names>J</given-names>
            </name>
            <name name-style="western">
              <surname>Svisdahl</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Veseth</surname>
              <given-names>M</given-names>
            </name>
          </person-group>
          <article-title>Negotiating the coresearcher mandate - service users' experiences of doing collaborative research on mental health</article-title>
          <source>Disabil Rehabil</source>
          <year>2012</year>
          <volume>34</volume>
          <issue>19</issue>
          <fpage>1608</fpage>
          <lpage>1616</lpage>
          <pub-id pub-id-type="doi">10.3109/09638288.2012.656792</pub-id>
          <pub-id pub-id-type="medline">22489612</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref32">
        <label>32</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Vollmer Dahlke</surname>
              <given-names>D</given-names>
            </name>
            <name name-style="western">
              <surname>Fair</surname>
              <given-names>K</given-names>
            </name>
            <name name-style="western">
              <surname>Hong</surname>
              <given-names>YA</given-names>
            </name>
            <name name-style="western">
              <surname>Beaudoin</surname>
              <given-names>CE</given-names>
            </name>
            <name name-style="western">
              <surname>Pulczinski</surname>
              <given-names>J</given-names>
            </name>
            <name name-style="western">
              <surname>Ory</surname>
              <given-names>MG</given-names>
            </name>
          </person-group>
          <article-title>Apps seeking theories: results of a study on the use of health behavior change theories in cancer survivorship mobile apps</article-title>
          <source>JMIR Mhealth Uhealth</source>
          <year>2015</year>
          <month>03</month>
          <day>27</day>
          <volume>3</volume>
          <issue>1</issue>
          <fpage>e31</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://mhealth.jmir.org/2015/1/e31/"/>
          </comment>
          <pub-id pub-id-type="doi">10.2196/mhealth.3861</pub-id>
          <pub-id pub-id-type="medline">25830810</pub-id>
          <pub-id pub-id-type="pii">v3i1e31</pub-id>
          <pub-id pub-id-type="pmcid">PMC4393507</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref33">
        <label>33</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Revilla-Martín</surname>
              <given-names>N</given-names>
            </name>
            <name name-style="western">
              <surname>Budinski</surname>
              <given-names>I</given-names>
            </name>
            <name name-style="western">
              <surname>Puig-Montserrat</surname>
              <given-names>X</given-names>
            </name>
            <name name-style="western">
              <surname>Flaquer</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>López-Baucells</surname>
              <given-names>A</given-names>
            </name>
          </person-group>
          <article-title>Monitoring cave-dwelling bats using remote passive acoustic detectors: a new approach for cave monitoring</article-title>
          <source>Bioacoustics</source>
          <year>2020</year>
          <month>09</month>
          <day>17</day>
          <volume>30</volume>
          <issue>5</issue>
          <fpage>527</fpage>
          <lpage>542</lpage>
          <pub-id pub-id-type="doi">10.1080/09524622.2020.1816492</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref34">
        <label>34</label>
        <nlm-citation citation-type="confproc">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Ulfarsson</surname>
              <given-names>MO</given-names>
            </name>
            <name name-style="western">
              <surname>Solo</surname>
              <given-names>V</given-names>
            </name>
          </person-group>
          <article-title>Tuning parameter selection for nonnegative matrix factorization</article-title>
          <year>2013</year>
          <conf-name>2013 IEEE International Conference on Acoustics, Speech and Signal Processing</conf-name>
          <conf-date>May 26-31, 2013</conf-date>
          <conf-loc>Vancouver, BC</conf-loc>
          <pub-id pub-id-type="doi">10.1109/icassp.2013.6638936</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref35">
        <label>35</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Glogoza</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Urbach</surname>
              <given-names>J</given-names>
            </name>
            <name name-style="western">
              <surname>Rosborough</surname>
              <given-names>TK</given-names>
            </name>
            <name name-style="western">
              <surname>Olet</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>St Hill</surname>
              <given-names>CA</given-names>
            </name>
            <name name-style="western">
              <surname>Smith</surname>
              <given-names>CS</given-names>
            </name>
            <name name-style="western">
              <surname>Tierney</surname>
              <given-names>DM</given-names>
            </name>
          </person-group>
          <article-title>Tablet vs. station-based laptop ultrasound devices increases internal medicine resident point-of-care ultrasound performance: a prospective cohort study</article-title>
          <source>Ultrasound J</source>
          <year>2020</year>
          <month>04</month>
          <day>16</day>
          <volume>12</volume>
          <issue>1</issue>
          <fpage>18</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://europepmc.org/abstract/MED/32300979"/>
          </comment>
          <pub-id pub-id-type="doi">10.1186/s13089-020-00165-8</pub-id>
          <pub-id pub-id-type="medline">32300979</pub-id>
          <pub-id pub-id-type="pii">10.1186/s13089-020-00165-8</pub-id>
          <pub-id pub-id-type="pmcid">PMC7163000</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref36">
        <label>36</label>
        <nlm-citation citation-type="web">
          <article-title>inflection-package: Finds the inflection point of a curve R package</article-title>
          <source>RDRR</source>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://rdrr.io/cran/inflection/">https://rdrr.io/cran/inflection/</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref37">
        <label>37</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Christopoulos</surname>
              <given-names>DT</given-names>
            </name>
          </person-group>
          <article-title>Reliable computations of knee point for a curve and introduction of a unit invariant estimation</article-title>
          <source>ResearchGate</source>
          <year>2014</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://www.researchgate.net/publication/268977798_Reliable_computations_of_knee_point_for_a_curve_and_introduction_of_a_unit_invariant_estimation">https://www.researchgate.net/publication/268977798_Reliable_computations_of_knee_point_for_a_curve_and_introduction_of_a_unit_invariant_estimation</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref38">
        <label>38</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Golub</surname>
              <given-names>TR</given-names>
            </name>
            <name name-style="western">
              <surname>Slonim</surname>
              <given-names>DK</given-names>
            </name>
            <name name-style="western">
              <surname>Tamayo</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Huard</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Gaasenbeek</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Mesirov</surname>
              <given-names>JP</given-names>
            </name>
            <name name-style="western">
              <surname>Coller</surname>
              <given-names>H</given-names>
            </name>
            <name name-style="western">
              <surname>Loh</surname>
              <given-names>ML</given-names>
            </name>
            <name name-style="western">
              <surname>Downing</surname>
              <given-names>JR</given-names>
            </name>
            <name name-style="western">
              <surname>Caligiuri</surname>
              <given-names>MA</given-names>
            </name>
            <name name-style="western">
              <surname>Bloomfield</surname>
              <given-names>CD</given-names>
            </name>
            <name name-style="western">
              <surname>Lander</surname>
              <given-names>ES</given-names>
            </name>
          </person-group>
          <article-title>Molecular classification of cancer: class discovery and class prediction by gene expression monitoring</article-title>
          <source>Science</source>
          <year>1999</year>
          <month>10</month>
          <day>15</day>
          <volume>286</volume>
          <issue>5439</issue>
          <fpage>531</fpage>
          <lpage>537</lpage>
          <pub-id pub-id-type="doi">10.1126/science.286.5439.531</pub-id>
          <pub-id pub-id-type="medline">10521349</pub-id>
          <pub-id pub-id-type="pii">7911</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref39">
        <label>39</label>
        <nlm-citation citation-type="web">
          <article-title>Golub ExpressionSet</article-title>
          <source>NMF R Project</source>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://nmf.r-forge.r-project.org/esGolub.html">https://nmf.r-forge.r-project.org/esGolub.html</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref40">
        <label>40</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Ilin</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Raiko</surname>
              <given-names>T</given-names>
            </name>
          </person-group>
          <article-title>Practical approaches to principal component analysis in the presence of missing values</article-title>
          <source>J Mach Learn Res</source>
          <year>2010</year>
          <volume>11</volume>
          <fpage>1957</fpage>
          <lpage>2000</lpage>
        </nlm-citation>
      </ref>
      <ref id="ref41">
        <label>41</label>
        <nlm-citation citation-type="book">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Park</surname>
              <given-names>PJ</given-names>
            </name>
          </person-group>
          <person-group person-group-type="editor">
            <name name-style="western">
              <surname>Shoemaker</surname>
              <given-names>JS</given-names>
            </name>
            <name name-style="western">
              <surname>Lin</surname>
              <given-names>SM</given-names>
            </name>
          </person-group>
          <article-title>Gene expression data and survival analysis</article-title>
          <source>Methods of microarray data analysis</source>
          <year>2005</year>
          <publisher-loc>Boston, MA</publisher-loc>
          <publisher-name>Springer</publisher-name>
          <fpage>21</fpage>
          <lpage>34</lpage>
        </nlm-citation>
      </ref>
      <ref id="ref42">
        <label>42</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Ramaswamy</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Tamayo</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Rifkin</surname>
              <given-names>R</given-names>
            </name>
            <name name-style="western">
              <surname>Mukherjee</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Yeang</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Angelo</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Ladd</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Reich</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Latulippe</surname>
              <given-names>E</given-names>
            </name>
            <name name-style="western">
              <surname>Mesirov</surname>
              <given-names>JP</given-names>
            </name>
            <name name-style="western">
              <surname>Poggio</surname>
              <given-names>T</given-names>
            </name>
            <name name-style="western">
              <surname>Gerald</surname>
              <given-names>W</given-names>
            </name>
            <name name-style="western">
              <surname>Loda</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Lander</surname>
              <given-names>ES</given-names>
            </name>
            <name name-style="western">
              <surname>Golub</surname>
              <given-names>TR</given-names>
            </name>
          </person-group>
          <article-title>Multiclass cancer diagnosis using tumor gene expression signatures</article-title>
          <source>Proc Natl Acad Sci U S A</source>
          <year>2001</year>
          <month>12</month>
          <day>18</day>
          <volume>98</volume>
          <issue>26</issue>
          <fpage>15149</fpage>
          <lpage>15154</lpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://europepmc.org/abstract/MED/11742071"/>
          </comment>
          <pub-id pub-id-type="doi">10.1073/pnas.211566398</pub-id>
          <pub-id pub-id-type="medline">11742071</pub-id>
          <pub-id pub-id-type="pii">211566398</pub-id>
          <pub-id pub-id-type="pmcid">PMC64998</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref43">
        <label>43</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Friedman</surname>
              <given-names>N</given-names>
            </name>
            <name name-style="western">
              <surname>Kaminski</surname>
              <given-names>N</given-names>
            </name>
          </person-group>
          <article-title>Statistical methods for analyzing gene expression data for cancer research</article-title>
          <source>Ernst Schering Res Found Workshop</source>
          <year>2002</year>
          <issue>38</issue>
          <fpage>109</fpage>
          <lpage>131</lpage>
          <pub-id pub-id-type="doi">10.1007/978-3-662-04747-7_6</pub-id>
          <pub-id pub-id-type="medline">12060998</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref44">
        <label>44</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Haferlach</surname>
              <given-names>T</given-names>
            </name>
            <name name-style="western">
              <surname>Kohlmann</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Bacher</surname>
              <given-names>U</given-names>
            </name>
            <name name-style="western">
              <surname>Schnittger</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Haferlach</surname>
              <given-names>C</given-names>
            </name>
            <name name-style="western">
              <surname>Kern</surname>
              <given-names>W</given-names>
            </name>
          </person-group>
          <article-title>Gene expression profiling for the diagnosis of acute leukaemia</article-title>
          <source>Br J Cancer</source>
          <year>2007</year>
          <month>02</month>
          <day>26</day>
          <volume>96</volume>
          <issue>4</issue>
          <fpage>535</fpage>
          <lpage>540</lpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://europepmc.org/abstract/MED/17146476"/>
          </comment>
          <pub-id pub-id-type="doi">10.1038/sj.bjc.6603495</pub-id>
          <pub-id pub-id-type="medline">17146476</pub-id>
          <pub-id pub-id-type="pii">6603495</pub-id>
          <pub-id pub-id-type="pmcid">PMC2360048</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref45">
        <label>45</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Hackstadt</surname>
              <given-names>AJ</given-names>
            </name>
            <name name-style="western">
              <surname>Hess</surname>
              <given-names>AM</given-names>
            </name>
          </person-group>
          <article-title>Filtering for increased power for microarray data analysis</article-title>
          <source>BMC Bioinformatics</source>
          <year>2009</year>
          <month>01</month>
          <day>08</day>
          <volume>10</volume>
          <issue>1</issue>
          <fpage>11</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-10-11"/>
          </comment>
          <pub-id pub-id-type="doi">10.1186/1471-2105-10-11</pub-id>
          <pub-id pub-id-type="medline">19133141</pub-id>
          <pub-id pub-id-type="pii">1471-2105-10-11</pub-id>
          <pub-id pub-id-type="pmcid">PMC2661050</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref46">
        <label>46</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Alexandrov</surname>
              <given-names>L</given-names>
            </name>
            <name name-style="western">
              <surname>Nik-Zainal</surname>
              <given-names>S</given-names>
            </name>
            <name name-style="western">
              <surname>Wedge</surname>
              <given-names>D</given-names>
            </name>
            <name name-style="western">
              <surname>Campbell</surname>
              <given-names>P</given-names>
            </name>
            <name name-style="western">
              <surname>Stratton</surname>
              <given-names>M</given-names>
            </name>
          </person-group>
          <article-title>Deciphering signatures of mutational processes operative in human cancer</article-title>
          <source>Cell Rep</source>
          <year>2013</year>
          <month>01</month>
          <day>31</day>
          <volume>3</volume>
          <issue>1</issue>
          <fpage>246</fpage>
          <lpage>259</lpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://linkinghub.elsevier.com/retrieve/pii/S2211-1247(12)00433-0"/>
          </comment>
          <pub-id pub-id-type="doi">10.1016/j.celrep.2012.12.008</pub-id>
          <pub-id pub-id-type="medline">23318258</pub-id>
          <pub-id pub-id-type="pii">S2211-1247(12)00433-0</pub-id>
          <pub-id pub-id-type="pmcid">PMC3588146</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref47">
        <label>47</label>
        <nlm-citation citation-type="web">
          <article-title>SigProfiler</article-title>
          <source>MathWorks</source>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://www.mathworks.com/matlabcentral/fileexchange/38724-sigprofiler">https://www.mathworks.com/matlabcentral/fileexchange/38724-sigprofiler</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref48">
        <label>48</label>
        <nlm-citation citation-type="web">
          <article-title>NMF: algorithms and framework for nonnegative matrix factorization (NMF)</article-title>
          <source>CRAN R project</source>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://cran.r-project.org/web/packages/NMF/">https://cran.r-project.org/web/packages/NMF/</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref49">
        <label>49</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Bengtsson</surname>
              <given-names>H</given-names>
            </name>
            <name name-style="western">
              <surname>Jacobson</surname>
              <given-names>A</given-names>
            </name>
            <name name-style="western">
              <surname>Riedy</surname>
              <given-names>J</given-names>
            </name>
          </person-group>
          <article-title>R.matlab: Read and Write MAT Files and Call MATLAB from Within R</article-title>
          <source>CRAN R project</source>
          <year>2018</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://cran.r-project.org/web/packages/R.matlab/index.html">https://cran.r-project.org/web/packages/R.matlab/index.html</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref50">
        <label>50</label>
        <nlm-citation citation-type="web">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Gaujoux</surname>
              <given-names>R</given-names>
            </name>
          </person-group>
          <article-title>An introduction to NMF package Version 0</article-title>
          <source>R Project for Statistical Computing</source>
          <year>2014</year>
          <access-date>2023-05-05</access-date>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="http://nmf.r-forge.r-project.org/vignettes/NMF-vignette.pdf">http://nmf.r-forge.r-project.org/vignettes/NMF-vignette.pdf</ext-link>
          </comment>
        </nlm-citation>
      </ref>
      <ref id="ref51">
        <label>51</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Rousseeuw</surname>
              <given-names>PJ</given-names>
            </name>
          </person-group>
          <article-title>Silhouettes: a graphical aid to the interpretation and validation of cluster analysis</article-title>
          <source>J Comput Appl Math</source>
          <year>1987</year>
          <month>11</month>
          <volume>20</volume>
          <fpage>53</fpage>
          <lpage>65</lpage>
          <pub-id pub-id-type="doi">10.1016/0377-0427(87)90125-7</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref52">
        <label>52</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Afzal</surname>
              <given-names>M</given-names>
            </name>
            <name name-style="western">
              <surname>Alharbi</surname>
              <given-names>KS</given-names>
            </name>
            <name name-style="western">
              <surname>Alzarea</surname>
              <given-names>SI</given-names>
            </name>
            <name name-style="western">
              <surname>Alyamani</surname>
              <given-names>NM</given-names>
            </name>
            <name name-style="western">
              <surname>Kazmi</surname>
              <given-names>I</given-names>
            </name>
            <name name-style="western">
              <surname>Güven</surname>
              <given-names>E</given-names>
            </name>
          </person-group>
          <article-title>Revealing genetic links of type 2 diabetes that lead to the development of Alzheimer's disease</article-title>
          <source>Heliyon</source>
          <year>2023</year>
          <month>01</month>
          <volume>9</volume>
          <issue>1</issue>
          <fpage>e12202</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://linkinghub.elsevier.com/retrieve/pii/S2405-8440(22)03490-9"/>
          </comment>
          <pub-id pub-id-type="doi">10.1016/j.heliyon.2022.e12202</pub-id>
          <pub-id pub-id-type="medline">36711310</pub-id>
          <pub-id pub-id-type="pii">S2405-8440(22)03490-9</pub-id>
          <pub-id pub-id-type="pmcid">PMC9876837</pub-id>
        </nlm-citation>
      </ref>
      <ref id="ref53">
        <label>53</label>
        <nlm-citation citation-type="journal">
          <person-group person-group-type="author">
            <name name-style="western">
              <surname>Lin</surname>
              <given-names>X</given-names>
            </name>
            <name name-style="western">
              <surname>Boutros</surname>
              <given-names>PC</given-names>
            </name>
          </person-group>
          <article-title>Optimization and expansion of non-negative matrix factorization</article-title>
          <source>BMC Bioinformatics</source>
          <year>2020</year>
          <month>01</month>
          <day>06</day>
          <volume>21</volume>
          <issue>1</issue>
          <fpage>7</fpage>
          <comment>
            <ext-link ext-link-type="uri" xlink:type="simple" xlink:href="https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3312-5"/>
          </comment>
          <pub-id pub-id-type="doi">10.1186/s12859-019-3312-5</pub-id>
          <pub-id pub-id-type="medline">31906867</pub-id>
          <pub-id pub-id-type="pii">10.1186/s12859-019-3312-5</pub-id>
          <pub-id pub-id-type="pmcid">PMC6945623</pub-id>
        </nlm-citation>
      </ref>
    </ref-list>
  </back>
</article>
