Topological data analysis can extract sub-groups with high incidence rates of Type 2 diabetes

Hyung Sun Kim, Chahngwoo Yi, Yongkang Kim, Uhnmee Park, Woong Kook, Bermseok Oh, Hyuk Kim, Taesung Park

Research output: Contribution to journal, Article, peer-review

1 Citation (Scopus)

Abstract

Type 2 diabetes (T2D) is increasing rapidly worldwide, and identifying its genetic contributors is vital. However, analyzing multiple disease-contributing factors and their combined interactions remains difficult with traditional approaches. Topological Data Analysis (TDA) characterizes the shape of a data set and facilitates clustering analysis by determining which components are close to each other. Applied to Single-Nucleotide Polymorphism (SNP) data, TDA can thus generate a network that reveals the genetic relatedness of specific individuals, and can derive multiple ordered sub-groups, ranging from one with a low patient concentration to one with a high patient concentration. Since it is widely accepted that T2D pathogenesis is affected by multiple genetic factors, we performed TDA on T2D data from the Korea Association REsource (KARE) project, a population-based, genome-wide association study of the Korean adult population. Because KARE data contain follow-up information on T2D incidence, we compared each individual's T2D status at baseline with their status ten years later. For the TDA network-driven sub-groups, ordered by prevalence, we compared the ten-year T2D incidence rate among individuals initially without T2D. We found that the ordered sub-groups had significantly increasing incidence rates that were linearly correlated with prevalence (p-value = 0.006914). Our results demonstrate the usefulness of TDA for identifying both genetic contributors (e.g., SNPs) and their interrelationships in the pathology of complex diseases.
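The abstract does not say how the sub-group comparison was computed. As a rough illustration only, the Python sketch below shows one way per-sub-group prevalence (baseline cases over all members) and ten-year incidence (new cases over members without T2D at baseline) could be tabulated and then correlated. The function names, the record layout, and the toy records are all hypothetical. They are not KARE data, and this is not the authors' pipeline. The sketch also stops at the Pearson coefficient and does not reproduce the reported p-value.

```python
from math import sqrt


def subgroup_rates(records):
    """Return {subgroup: (prevalence, incidence)}.

    records: iterable of (subgroup, t2d_at_baseline, t2d_at_followup).
    This layout is an assumption made for illustration.
    """
    counts = {}
    for group, at_baseline, at_followup in records:
        c = counts.setdefault(group, {"n": 0, "cases": 0, "at_risk": 0, "new": 0})
        c["n"] += 1
        if at_baseline:
            c["cases"] += 1          # prevalent case at baseline
        else:
            c["at_risk"] += 1        # free of T2D at baseline
            if at_followup:
                c["new"] += 1        # incident case during follow-up
    rates = {}
    for group, c in counts.items():
        prevalence = c["cases"] / c["n"]
        incidence = c["new"] / c["at_risk"] if c["at_risk"] else float("nan")
        rates[group] = (prevalence, incidence)
    return rates


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy records (hypothetical, NOT study data): three sub-groups of four people.
toy_records = (
    [("A", True, True)] + [("A", False, False)] * 3
    + [("B", True, True)] * 2 + [("B", False, True), ("B", False, False)]
    + [("C", True, True)] * 3 + [("C", False, True)]
)

rates = subgroup_rates(toy_records)
ordered = sorted(rates.values())            # order sub-groups by prevalence
prevalences = [p for p, _ in ordered]
incidences = [i for _, i in ordered]
r = pearson_r(prevalences, incidences)
```

In this toy input the incidence rises in step with prevalence by construction, so the coefficient only confirms that the bookkeeping works; it says nothing about the study's actual effect size.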

Original language: English
Pages (from-to): 44-60
Number of pages: 17
Journal: International Journal of Data Mining and Bioinformatics
Volume: 22
Issue number: 1
Publication status: Published - 2019

Bibliographical note

Publisher Copyright:
© The Author(s) 2019.

Keywords

  • KARE
  • Korea association resource
  • Network
  • Single-nucleotide polymorphism
  • Sub-group analysis
  • Topological data analysis
  • Type 2 diabetes
