October 1, 2019
Nucleotide Archival Format (NAF) enables efficient lossless reference-free compression of DNA sequences.
Bioinformatics (Oxford, England)
- Volume: 35
- Issue: 19
- First page: 3826
- Last page: 3828
- Language: English
- Publication type:
- DOI: 10.1093/bioinformatics/btz144
SUMMARY: DNA sequence databases use compression such as gzip to reduce the required storage space and network transmission time. We describe Nucleotide Archival Format (NAF), a new file format for lossless reference-free compression of FASTA- and FASTQ-formatted nucleotide sequences. The NAF compression ratio is comparable to that of the best DNA compressors, while decompression is dramatically faster. We compared our format with the DNA compressors DELIMINATE and MFCompress, and with the general-purpose compressors gzip, bzip2, xz, brotli and zstd.
AVAILABILITY AND IMPLEMENTATION: The NAF compressor and decompressor, as well as the format specification, are available at https://github.com/KirillKryukov/naf. The format specification is in the public domain. The compressor and decompressor are open source under the zlib/libpng license, free for nearly any use.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
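The kind of comparison the abstract mentions can be illustrated with a minimal sketch that measures lossless compression ratios of three of the general-purpose compressors named above (gzip, bzip2, xz) on a FASTA-formatted sequence. This is not the authors' benchmark and it does not use NAF itself: the synthetic random sequence, the helper names `make_fasta` and `compare`, and the default compression settings are all assumptions made for illustration, and only the Python standard library is used.

```python
import bz2
import gzip
import lzma
import random


def make_fasta(n_bases=20000, seed=0, width=60):
    """Build a synthetic FASTA record (hypothetical test data, not real DNA)."""
    rng = random.Random(seed)
    seq = "".join(rng.choice("ACGT") for _ in range(n_bases))
    # Wrap the sequence at a fixed line width, as FASTA files commonly do.
    lines = [seq[i:i + width] for i in range(0, n_bases, width)]
    return (">synthetic_sequence\n" + "\n".join(lines) + "\n").encode("ascii")


# Standard-library compressors, each as a (compress, decompress) pair.
COMPRESSORS = {
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "xz": (lzma.compress, lzma.decompress),
}


def compare(data):
    """Return {compressor name: compression ratio} for the given bytes."""
    results = {}
    for name, (compress, decompress) in COMPRESSORS.items():
        packed = compress(data)
        # Lossless check: the round trip must reproduce the input exactly.
        assert decompress(packed) == data
        results[name] = len(data) / len(packed)
    return results


if __name__ == "__main__":
    for name, ratio in compare(make_fasta()).items():
        print(f"{name}: {ratio:.2f}x")
```

A uniformly random sequence is a poor stand-in for real genomes, which contain repeats that compressors can exploit, so the ratios printed here say nothing about the results reported in the paper.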
- Link information
- ID information
- DOI: 10.1093/bioinformatics/btz144
- ISSN: 1367-4803
- PubMed ID: 30799504
- PubMed Central article ID: PMC6761962