Paper

Peer-reviewed
March 1, 2018

Argobots: A Lightweight Low-Level Threading and Tasking Framework

IEEE Transactions on Parallel and Distributed Systems
  • Sangmin Seo
  • Abdelhalim Amer
  • Pavan Balaji
  • Cyril Bordage
  • George Bosilca
  • Alex Brooks
  • Philip Carns
  • Adrian Castello
  • Damien Genet
  • Thomas Herault
  • Shintaro Iwasaki
  • Prateek Jindal
  • Laxmikant V. Kale
  • Sriram Krishnamoorthy
  • Jonathan Lifflander
  • Huiwei Lu
  • Esteban Meneses
  • Marc Snir
  • Yanhua Sun
  • Kenjiro Taura
  • Pete Beckman

Volume
29
Issue
3
Start page
512
End page
526
Language
English
Publication type
Research paper (scientific journal)
DOI
10.1109/TPDS.2017.2766062
Publisher
IEEE Computer Society

In the past few decades, a number of user-level threading and tasking models have been proposed in the literature to address the shortcomings of OS-level threads, primarily with respect to cost and flexibility. Current state-of-the-art user-level threading and tasking models, however, either are too specific to applications or architectures or are not as powerful or flexible. In this paper, we present Argobots, a lightweight, low-level threading and tasking framework that is designed as a portable and performant substrate for high-level programming models or runtime systems. Argobots offers a carefully designed execution model that balances generality of functionality with providing a rich set of controls to allow specialization by end users or high-level programming models. We describe the design, implementation, and performance characterization of Argobots and present integrations with three high-level models: OpenMP, MPI, and colocated I/O services. Evaluations show that (1) Argobots, while providing richer capabilities, is competitive with existing simpler generic threading runtimes; (2) our OpenMP runtime offers more efficient interoperability capabilities than production OpenMP runtimes do; (3) when MPI interoperates with Argobots instead of Pthreads, it enjoys reduced synchronization costs and better latency-hiding capabilities; and (4) I/O services with Argobots reduce interference with colocated applications while achieving performance competitive with that of a Pthreads approach.

Links
DOI
https://doi.org/10.1109/TPDS.2017.2766062
URL
http://dblp.uni-trier.de/db/journals/tpds/tpds29.html#journals/tpds/SeoABBBBCCGHIJK18
