Abstract
Type 2 diabetes (T2D) is a rapidly growing worldwide scourge, and identifying its genetic contributors is vital. However, the analysis of multiple disease-contributing factors and their combined interactions remains difficult with traditional approaches. Topological data analysis (TDA) reveals the shape of a data set and facilitates clustering by determining which components lie close to one another. Applied to single-nucleotide polymorphism (SNP) data, TDA can therefore generate a network that reveals the genetic relatedness of individuals, and from that network it can derive multiple ordered sub-groups, ranging from one with a low concentration of patients to one with a high concentration. Because T2D pathogenesis is widely accepted to be affected by multiple genetic factors, we performed TDA on T2D data from the Korea Association REsource (KARE) project, a population-based genome-wide association study of the Korean adult population. Because the KARE data contain follow-up information on T2D incidence, we compared each individual's T2D status at baseline with that individual's status ten years later. For the TDA network-driven sub-groups, ordered by prevalence, we compared the ten-year T2D incidence rates among individuals who were initially free of T2D. We found that incidence rates increased significantly across the ordered sub-groups and were linearly correlated with prevalence (p-value = 0.006914). Our results demonstrate the usefulness of TDA for identifying genetic contributors (e.g., SNPs) and their interrelationships in the pathology of complex diseases.
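The sub-group comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the TDA step has already assigned each individual a sub-group label, and the `Person` record, the function names, and the toy cohort are all hypothetical. For each sub-group it computes baseline prevalence and the follow-up incidence among individuals who were free of T2D at baseline, then measures how linearly the two are related with a plain Pearson correlation (the abstract does not state which test produced its p-value).

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Person:
    subgroup: int        # hypothetical label from a TDA network sub-group
    t2d_baseline: bool   # T2D status at baseline
    t2d_followup: bool   # T2D status at the ten-year follow-up


def subgroup_rates(people):
    """Return {subgroup: (baseline prevalence, follow-up incidence)}."""
    rates = {}
    for g in sorted({p.subgroup for p in people}):
        members = [p for p in people if p.subgroup == g]
        # Incidence is defined only over individuals free of T2D at baseline.
        at_risk = [p for p in members if not p.t2d_baseline]
        prevalence = sum(p.t2d_baseline for p in members) / len(members)
        incidence = (
            sum(p.t2d_followup for p in at_risk) / len(at_risk)
            if at_risk
            else float("nan")
        )
        rates[g] = (prevalence, incidence)
    return rates


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def make_group(g, n_baseline, n_new, n_healthy):
    """Build a toy sub-group; the counts are invented for illustration only."""
    return (
        [Person(g, True, True) for _ in range(n_baseline)]
        + [Person(g, False, True) for _ in range(n_new)]
        + [Person(g, False, False) for _ in range(n_healthy)]
    )


# Toy cohort (NOT KARE data): three sub-groups with rising prevalence.
cohort = make_group(0, 1, 1, 8) + make_group(1, 2, 2, 6) + make_group(2, 4, 3, 3)

rates = subgroup_rates(cohort)
prevalences = [rates[g][0] for g in sorted(rates)]
incidences = [rates[g][1] for g in sorted(rates)]
r = pearson_r(prevalences, incidences)
```

With real data, a significance test on `r` (or a trend test across the ordered sub-groups) would be the natural next step; that choice is left open here because the abstract does not specify it.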
Original language | English |
---|---|
Pages (from-to) | 44-60 |
Number of pages | 17 |
Journal | International Journal of Data Mining and Bioinformatics |
Volume | 22 |
Issue number | 1 |
Publication status | Published - 2019 |
Bibliographical note
Publisher Copyright: © The Author(s) 2019.
Keywords
- KARE
- Korea association resource
- Network
- Single-nucleotide polymorphism
- Sub-group analysis
- Topological data analysis
- Type 2 diabetes