Linagliptin: Materials and methods

    2022-05-12


    Discussion
    Genetic analyses conducted here were based on specific features of PhyChem indexes for nt dimers, extracted to generate a numerical sequence representation that was used to build models for distinguishing HVR1 variants between CIP and MIP. This data representation has been shown to capture accurately the complexity of the intra-host HVR1 variants for the identification of important viral traits (Lara et al., 2017). Additionally, it removes the need for computationally expensive multiple sequence alignment (Ma et al., 2003). It also generates a feature vector of equal size for all sequences regardless of their length, which eliminates the need to set criteria for exclusion or inclusion of sequences or sequence positions in the data analysis and reduces the data dimensionality. All these properties are especially important for the analysis of massive NGS data and, when combined with statistical and machine-learning techniques, allow a wide range of the feature space to be explored effectively (Fig. 1, Fig. 5).
    NGS has fostered the rapid development of new bioinformatics tools, computational pipelines and software, which are rapidly becoming important for public health surveillance (Gwinn et al., 2017; Khudyakov, 2012; MacCannell, 2016; Rossi et al., 2015). However, there are many challenges in developing such new computational frameworks (MacCannell, 2016). The existing technology does not allow sequencing of intra-host variants of the entire bacterial or viral genome. Short reads generated by the available NGS technologies must be assembled into whole-genome intra-host variants, but computational approaches for such assembly are not yet efficient enough to reconstruct sufficiently representative intra-host populations (Posada-Cespedes et al., 2017).
Owing to this limitation, NGS is frequently used to generate consensus sequences of pathogen genomes accompanied by the distribution of heterogeneity along the sequence (Posada-Cespedes et al., 2017). The alternative approach is to focus on small genomic regions, for which the population of intra-host variants can be easily reconstructed from short NGS reads (Skums et al., 2012; Westbrooks et al., 2008). Although this approach does not generate sequences covering the entire genome, it produces data for a more accurate assessment of intra-host heterogeneity and epistatic connectivity of the sequenced region. Epistatic connectivity is rich with information on many biological traits that have fundamental clinical and public health significance (Khudyakov, 2010; Lara and Khudyakov, 2012; Skwark et al., 2017).
Selection of a short genomic region for analysis is not trivial. One approach, which we have used here, is to select genomic regions encoding intrinsically disordered proteins (IDPs) or protein regions (IDPRs), which are crucial regulatory molecules (Chakrabortee et al., 2016; Wright and Dyson, 2015). In HCV, IDPs and IDPRs have diverse and important biological functions (Dolan et al., 2015; Fan et al., 2014; Macdonald and Harris, 2004). IDPRs found in HBV, HCV and HEV have been shown to have extensive epistatic connectivity across viral proteins (Campo et al., 2011; Lara et al., 2014b; Lara et al., 2011b). HCV HVR1 encodes an IDPR (Fan et al., 2014; Kong et al., 2013). HVR1 epistasis has been shown to be associated with HCV drug resistance (Aurora et al., 2009; Lara et al., 2011a; Lara et al., 2011b), virulence (Lara and Khudyakov, 2012; Lara et al., 2014a), host ethnicity and gender (Lara et al., 2011a), and stages of HCV infection (Astrakhantseva et al., 2011; Lara et al., 2017). Epistatic connectivity defined by coevolution among genomic and protein sites is a fundamental genetic trait (Campo et al., 2008; Khudyakov, 2010).
There are many approaches to measuring coevolution among genomic sites (Campo et al., 2011; Campo et al., 2008; Lara et al., 2017). Here, we employed DAC’ (Liu et al., 2015), which was previously used for the detection of recent HCV infections from NGS HVR1 data (Lara et al., 2017).
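To make the idea of an alignment-free, fixed-length representation concrete, the following is a minimal sketch of a dinucleotide auto-covariance feature vector. It assumes that the measure referred to above can be illustrated as the covariance of a per-dimer index between dimers separated by a given lag; the exact formulation used in the study is not reproduced here. The function name `dimer_autocovariance`, the parameter `max_lag`, and the two index tables in `TOY_INDEXES` are hypothetical placeholders, and the toy indexes are not published PhyChem values.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
DIMERS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]

# Toy stand-ins for physicochemical indexes of nt dimers.
# ASSUMPTION: these are illustrative placeholders, NOT real PhyChem values.
TOY_INDEXES = {
    "gc_count": {d: float(sum(b in "GC" for b in d)) for d in DIMERS},
    "purine_count": {d: float(sum(b in "AG" for b in d)) for d in DIMERS},
}


def dimer_autocovariance(seq, indexes=TOY_INDEXES, max_lag=3):
    """Return a fixed-length feature vector for one nucleotide sequence.

    For each index and each lag from 1 to max_lag, the feature is the mean
    product of mean-centered index values of dimers that are `lag` positions
    apart. The vector length is len(indexes) * max_lag, whatever the
    sequence length, so no alignment is needed.
    """
    seq = seq.upper()
    # Overlapping dimers: a sequence of length L yields L - 1 dimers.
    dimers = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if len(dimers) <= max_lag:
        raise ValueError("sequence too short for the requested max_lag")

    features = []
    for name in sorted(indexes):  # fixed order keeps vectors comparable
        table = indexes[name]
        # Characters outside A, C, G, T raise KeyError in this sketch.
        values = [table[d] for d in dimers]
        mean = sum(values) / len(values)
        centered = [v - mean for v in values]
        for lag in range(1, max_lag + 1):
            n = len(centered) - lag  # number of dimer pairs at this lag
            cov = sum(centered[i] * centered[i + lag] for i in range(n)) / n
            features.append(cov)
    return features
```

Calling `dimer_autocovariance` on two HVR1 variants of different lengths returns vectors of the same size, which is the property that lets such features be passed directly to statistical or machine-learning models without sequence alignment.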