Papers

International journal
November 18, 2020

Risk prediction for malignant intraductal papillary mucinous neoplasm of the pancreas: logistic regression versus machine learning.

Scientific reports
  • Jae Seung Kang
  • Chanhee Lee
  • Wookyeong Song
  • Wonho Choo
  • Seungyeoun Lee
  • Sungyoung Lee
  • Youngmin Han
  • Claudio Bassi
  • Roberto Salvia
  • Giovanni Marchegiani
  • Cristopher L Wolfgang
  • Jin He
  • Alex B Blair
  • Michael D Kluger
  • Gloria H Su
  • Song Cheol Kim
  • Ki-Byung Song
  • Masakazu Yamamoto
  • Ryota Higuchi
  • Takashi Hatori
  • Ching-Yao Yang
  • Hiroki Yamaue
  • Seiko Hirono
  • Sohei Satoi
  • Tsutomu Fujii
  • Satoshi Hirano
  • Wenhui Lou
  • Yasushi Hashimoto
  • Yasuhiro Shimizu
  • Marco Del Chiaro
  • Roberto Valente
  • Matthias Lohr
  • Dong Wook Choi
  • Seong Ho Choi
  • Jin Seok Heo
  • Fuyuhiko Motoi
  • Ippei Matsumoto
  • Woo Jung Lee
  • Chang Moo Kang
  • Yi-Ming Shyr
  • Shin-E Wang
  • Ho-Seong Han
  • Yoo-Seok Yoon
  • Marc G Besselink
  • Nadine C M van Huijgevoort
  • Masayuki Sho
  • Hiroaki Nagano
  • Sang Geol Kim
  • Goro Honda
  • Yinmo Yang
  • Hee Chul Yu
  • Jae Do Yang
  • Jun Chul Chung
  • Yuichi Nagakawa
  • Hyung Il Seo
  • Yoo Jin Choi
  • Yoonhyeong Byun
  • Hongbeom Kim
  • Wooil Kwon
  • Taesung Park
  • Jin-Young Jang

Volume
10
Issue
1
Start page
20140
End page
20140
Language
English
Publication type
Research paper (scientific journal)
DOI
10.1038/s41598-020-76974-7

Most models for predicting malignant intraductal papillary mucinous neoplasms of the pancreas have been developed using logistic regression (LR) analysis. Our study aimed to develop risk prediction models using machine learning (ML) and LR techniques and to compare their performance. This was a multinational, multi-institutional, retrospective study. Clinical variables including age, sex, main duct diameter, cyst size, mural nodule, and tumour location were considered for model development (MD). After the data were divided into an MD set and a test set (2:1), the best ML and LR models were developed by training on the MD set with tenfold cross-validation. The test areas under the receiver operating characteristic curve (AUCs) of the two models were calculated using the independent test set. A total of 3,708 patients were included. Across 200 repetitions, the stacked ensemble algorithm was the most frequently chosen ML model, and the variable combination containing all variables was the most frequently chosen LR model. After 200 repetitions, the mean AUCs of the ML and LR models were comparable (0.725 vs. 0.725). The ML and LR models thus performed comparably, but the LR model was more practical than its ML counterpart because of its convenience in clinical use and simple interpretability.
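The evaluation protocol described in the abstract (a random 2:1 split into a development set and a test set, a test AUC per split, and a mean AUC over repeated splits) can be illustrated with a minimal Python sketch. This is not the authors' code: the data are synthetic, the logistic regression is a plain gradient-descent fit written for illustration, the tenfold cross-validation used for model selection is omitted, and the function names (`make_synthetic`, `fit_logistic`, `repeated_split_auc`) and the default of 20 repetitions (the study used 200) are assumptions made here.

```python
import math
import random


def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Illustrative logistic regression fitted by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(model, X):
    w, b = model
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


def make_synthetic(n, seed=0):
    """Synthetic two-feature data set (NOT patient data)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if x[0] + 0.5 * x[1] + rng.gauss(0, 1) > 0 else 0 for x in X]
    return X, y


def repeated_split_auc(X, y, repetitions=20, seed=0):
    """Mean test AUC over repeated random 2:1 development/test splits."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    aucs = []
    for _ in range(repetitions):
        rng.shuffle(idx)
        cut = (2 * len(idx)) // 3  # two thirds for model development
        dev, test = idx[:cut], idx[cut:]
        model = fit_logistic([X[i] for i in dev], [y[i] for i in dev])
        scores = predict(model, [X[i] for i in test])
        aucs.append(auc([y[i] for i in test], scores))
    return sum(aucs) / len(aucs)


if __name__ == "__main__":
    X, y = make_synthetic(300)
    print(f"mean test AUC: {repeated_split_auc(X, y):.3f}")
```

Averaging over many random splits, rather than reporting a single split, reduces the dependence of the reported AUC on one particular partition of the data, which is presumably why the study repeated the procedure 200 times.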

Links
DOI
https://doi.org/10.1038/s41598-020-76974-7
PubMed
https://www.ncbi.nlm.nih.gov/pubmed/33208887
PubMed Central
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7676251
Identifiers
  • DOI : 10.1038/s41598-020-76974-7
  • PubMed ID : 33208887
  • PubMed Central article ID : PMC7676251
