Paper

Peer-reviewed, international journal
April 28, 2019

Parallel repulsive logic regression with biological adjacency.

Biostatistics (Oxford, England)
  • Daisuke Yoneoka
  • Cindy Im
  • Yutaka Yasui

Language
English
Publication type
Research paper (academic journal)
DOI
10.1093/biostatistics/kxz011

Logic regression, an extension of generalized linear models with Boolean combinations of binary variables as predictors, is a useful tool in exploring interactions among single-nucleotide polymorphisms (SNPs) in genome-wide association studies. However, since the search space defined by all possible combinations of SNPs, their complements, and logical operators in Boolean expressions can be exceedingly large in such studies, objective function optimization is slow and likely to be trapped in many local solutions, resulting in model over-fitting. We introduce a new search algorithm, parallel repulsive logic regression (PRLR), to efficiently estimate parameters of a logic regression to find a best model within the large space of SNP interactions by incorporating: (i) relevant biological adjacency matrix between SNPs to define similarity of estimation paths or trees, which are derived from physical SNP positions on chromosomes and/or memberships in biological gene pathways; and (ii) two repulsive forces to counter the similarity between and within estimation paths considered in parallel, which are introduced as penalty terms in the objective function. We compare our method's performance for identifying biologically-meaningful SNP interactions through simulations and with real genetic-epidemiological data. PRLR's detection-accuracy measures outperform existing approaches, especially in terms of positive predictive value and sensitivity for detecting SNP-SNP interactions.
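The abstract describes two ingredients: Boolean combinations of binary SNP indicators used as predictors, and penalty terms that push parallel estimation trees apart when they are biologically similar. The following is a minimal sketch of that idea, not the authors' implementation. The tuple-based tree encoding, the function names (`eval_logic`, `snps_in`, `tree_similarity`, `penalized_objective`), the mean-adjacency similarity, the weight `lam`, and the toy data are all assumptions made for illustration; the abstract does not give the actual form of the penalties, and only a between-tree penalty is sketched here.

```python
# Hypothetical encoding of a logic tree as nested tuples:
#   ("snp", j), ("not", e), ("and", a, b), ("or", a, b)

def eval_logic(expr, row):
    """Evaluate a Boolean expression on one sample's binary SNP vector."""
    op = expr[0]
    if op == "snp":
        return bool(row[expr[1]])
    if op == "not":
        return not eval_logic(expr[1], row)
    left, right = eval_logic(expr[1], row), eval_logic(expr[2], row)
    return (left and right) if op == "and" else (left or right)

def snps_in(expr):
    """Collect the SNP indices that appear anywhere in a tree."""
    if expr[0] == "snp":
        return {expr[1]}
    return set().union(*(snps_in(e) for e in expr[1:]))

def tree_similarity(a, b, adjacency):
    """Assumed similarity: mean adjacency over all SNP pairs across two trees."""
    pairs = [(i, j) for i in sorted(snps_in(a)) for j in sorted(snps_in(b))]
    return sum(adjacency[i][j] for i, j in pairs) / len(pairs)

def penalized_objective(loss, trees, adjacency, lam):
    """Model loss plus an assumed between-tree repulsion penalty."""
    repulsion = sum(
        tree_similarity(trees[i], trees[j], adjacency)
        for i in range(len(trees))
        for j in range(i + 1, len(trees))
    )
    return loss + lam * repulsion

# Toy data: 4 samples x 3 binary SNP indicators (illustrative only).
genotypes = [[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0]]
# Toy adjacency: SNPs 0 and 1 share a gene or pathway, SNP 2 stands alone.
adjacency = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]

tree_a = ("and", ("snp", 0), ("not", ("snp", 1)))  # SNP0 AND NOT SNP1
tree_b = ("or", ("snp", 1), ("snp", 2))            # SNP1 OR SNP2

# Each tree yields one binary predictor column for a GLM.
pred_a = [eval_logic(tree_a, row) for row in genotypes]
pred_b = [eval_logic(tree_b, row) for row in genotypes]

# The loss value 2.0 is a placeholder for a fitted model's deviance.
score = penalized_objective(2.0, [tree_a, tree_b], adjacency, lam=0.5)
```

In this sketch, trees that draw on adjacent SNPs receive a larger penalty, so a search over several trees at once is nudged toward exploring different regions of the interaction space. How the published method weights and combines its two repulsive forces is specified in the paper itself.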

Links
DOI
https://doi.org/10.1093/biostatistics/kxz011
PubMed
https://www.ncbi.nlm.nih.gov/pubmed/31030217
