MISC

November 1, 2018

Porting DDalphaAMG solver to K computer

  • Ken-Ichi Ishikawa
  • Issaku Kanamori

Language
Publication type
Institutional technical report, technical report, preprint, etc.

We port the Domain-Decomposed alpha-AMG (DDalphaAMG) solver to the K computer.
The system has 82,944 nodes in total, each with 8 cores, 16 GB of memory, and a
theoretical peak of 128 GFlops. Its features, as many as 256 registers per core
and a byte-per-Flop ratio as large as 0.5, require different tuning from other
machines. To use more registers, we change some of the data structures and
rewrite the matrix-vector operations with intrinsics. Performance improves by
more than a factor of two for twelve solves including the setup. The efficiency
is still about 5% after the optimization, which is lower than the 22% of a
previously tuned mixed-precision solver for the K computer. The throughput,
however, is more than two times better for a physical point configuration.
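
The hardware figures quoted in the abstract can be combined into a few derived numbers. The sketch below is illustrative arithmetic only and is not taken from the report. It assumes that "efficiency" means sustained Flops divided by theoretical peak, and that the 0.5 byte/Flop ratio refers to memory bandwidth relative to peak; all names in the code are hypothetical.

```python
# Figures quoted in the abstract.
PEAK_GFLOPS_PER_NODE = 128.0
CORES_PER_NODE = 8
TOTAL_NODES = 82_944
BYTES_PER_FLOP = 0.5


def derived_figures():
    """Return quantities implied by the quoted figures.

    Assumption: efficiency = sustained Flops / theoretical peak Flops.
    Assumption: byte/Flop = memory bandwidth / theoretical peak Flops.
    """
    return {
        # Peak per core, in GFlops.
        "peak_gflops_per_core": PEAK_GFLOPS_PER_NODE / CORES_PER_NODE,
        # Whole-system peak, converted from GFlops to PFlops.
        "system_peak_pflops": PEAK_GFLOPS_PER_NODE * TOTAL_NODES / 1e6,
        # Memory bandwidth per node implied by the byte/Flop ratio, in GB/s.
        "bandwidth_gb_per_s": PEAK_GFLOPS_PER_NODE * BYTES_PER_FLOP,
        # Sustained GFlops per node at the two quoted efficiencies.
        "sustained_gflops_at_5_percent": 0.05 * PEAK_GFLOPS_PER_NODE,
        "sustained_gflops_at_22_percent": 0.22 * PEAK_GFLOPS_PER_NODE,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, efficiency and throughput measure different things: efficiency is the fraction of peak that is sustained, whereas throughput reflects time to solution, so a solver with lower efficiency can still finish sooner.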

Links
arXiv
http://arxiv.org/abs/arXiv:1811.00355
URL
http://arxiv.org/abs/1811.00355v1