Paper

Peer-reviewed, international journal
October 1, 2019

Nucleotide Archival Format (NAF) enables efficient lossless reference-free compression of DNA sequences.

Bioinformatics (Oxford, England)
  • Kirill Kryukov
  • Mahoko Takahashi Ueda
  • So Nakagawa
  • Tadashi Imanishi

Volume
35
Issue
19
Start page
3826
End page
3828
Language
English
Publication type
DOI
10.1093/bioinformatics/btz144

SUMMARY: DNA sequence databases use compression such as gzip to reduce the required storage space and network transmission time. We describe Nucleotide Archival Format (NAF), a new file format for lossless reference-free compression of FASTA and FASTQ-formatted nucleotide sequences. NAF's compression ratio is comparable to that of the best DNA compressors, while its decompression is dramatically faster. We compared our format with the DNA compressors DELIMINATE and MFCompress, and with the general-purpose compressors gzip, bzip2, xz, brotli and zstd.

AVAILABILITY AND IMPLEMENTATION: The NAF compressor and decompressor, as well as the format specification, are available at https://github.com/KirillKryukov/naf. The format specification is in the public domain. The compressor and decompressor are open source under the zlib/libpng license, free for nearly any use.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
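
As a rough illustration of the kind of comparison the summary mentions, the sketch below measures the compression ratio of three of the named general-purpose compressors (gzip, bzip2, xz) on a FASTA file, using only the Python standard library. It is not the paper's benchmark: the input is a synthetic random sequence, the helper names `make_fasta` and `compression_ratios` are made up for this example, and brotli, zstd, DELIMINATE, MFCompress and NAF itself are left out because they need software beyond the modules used here.

```python
import bz2
import gzip
import lzma
import random


def make_fasta(n_bases: int = 20000, seed: int = 0) -> bytes:
    """Build a synthetic single-record FASTA (illustrative data, not from the paper)."""
    rng = random.Random(seed)
    seq = "".join(rng.choice("ACGT") for _ in range(n_bases))
    # Wrap the sequence at 60 characters per line, a common FASTA convention.
    lines = [seq[i:i + 60] for i in range(0, n_bases, 60)]
    return (">synthetic_record\n" + "\n".join(lines) + "\n").encode("ascii")


def compression_ratios(data: bytes) -> dict:
    """Return original_size / compressed_size for each compressor."""
    compressors = {
        "gzip": gzip.compress,
        "bzip2": bz2.compress,
        "xz": lzma.compress,
    }
    return {name: len(data) / len(fn(data)) for name, fn in compressors.items()}


fasta = make_fasta()
ratios = compression_ratios(fasta)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")

# "Lossless" means every compressor must reproduce the input byte for byte.
assert gzip.decompress(gzip.compress(fasta)) == fasta
assert bz2.decompress(bz2.compress(fasta)) == fasta
assert lzma.decompress(lzma.compress(fasta)) == fasta
```

Random sequence is harder to compress than real genomes, which contain repeats, so the ratios printed here say little about real data; the point is only the shape of such a measurement.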

Links
DOI
https://doi.org/10.1093/bioinformatics/btz144
PubMed
https://www.ncbi.nlm.nih.gov/pubmed/30799504
PubMed Central
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6761962
Identifiers
  • DOI : 10.1093/bioinformatics/btz144
  • ISSN : 1367-4803
  • PubMed ID : 30799504
  • PubMed Central article ID : PMC6761962
