Paper

Peer-reviewed
January 28, 2018

Parallelized software offloading of low-level communication with user-level threads

ACM International Conference Proceeding Series
  • Wataru Endo
  • Kenjiro Taura

First page
289
Last page
298
Language
English
Publication type
Research paper (international conference proceedings)
DOI
10.1145/3149457.3149475
Publisher
Association for Computing Machinery

Although recent HPC interconnects are assumed to achieve low latency and high bandwidth communication, in practical terms, their performance is often bounded by the network software stacks rather than the underlying hardware because message processing requires a certain amount of computation in CPUs. To exploit the hardware capacity, some existing communication libraries provide an interface for parallelizing accesses to network endpoints with manual hints. However, with growing core counts per node in modern clusters, it is increasingly difficult for users to efficiently handle communication resources in multi-threading environments. We implemented a low-level communication library that can automatically schedule communication requests by offloading them to multiple dedicated threads via lockless circular buffers. To enhance the efficiency of offloading, we developed a novel technique to dynamically change the number of offloading threads using a user-level thread library. We demonstrate that our offloading architecture exhibits better performance characteristics in microbenchmark results than the existing approaches.

Links
DOI
https://doi.org/10.1145/3149457.3149475
URL
http://dblp.uni-trier.de/db/conf/hpcasia/hpcasia2018.html#conf/hpcasia/EndoT18
