1.12
Download the source code here: htslib-1.12.tar.bz2.
(The "Source code" downloads are generated by GitHub and are incomplete as they are missing some generated files.)
Features and Updates
-
Added experimental CRAM 3.1 and 4.0 support. (#929)
These should not be used for long-term data storage, as the specification still needs to be ratified by GA4GH and may be subject to changes in format. (This is highly likely for 4.0.) However, it may be tested using:
test/test_view -t ref.fa -C -o version=3.1 in.bam -p out31.cram
For smaller but slower files, try varying the compression profile with an additional -o small. Profile choices are fast, normal, small and archive, and can be applied to all CRAM versions.
-
Added a general filtering syntax for alignment records in SAM/BAM/CRAM readers. (#1181, #1203)
An example to find chromosome spanning read-pairs with high mapping quality: 'mqual >= 30 && mrname != rname'
To find significant sized deletions: 'cigar =~ "[0-9]{2}D"' or 'rlen - qlen > 10'.
To report duplicates that aren't part of a "proper pair": 'flag.dup && !flag.proper_pair'
More details are in the samtools.1 man page under "FILTER EXPRESSIONS".
-
The knet networking code has been removed. It only supported the http and ftp protocols, and a better and safer alternative using libcurl has been available since release 1.3. If you need access to ftp:// and http:// URLs, HTSlib should be built with libcurl support. (#1200)
-
The old htslib/knetfile.h interfaces have been marked as deprecated. Any code still using them should be updated to use hFILE instead. (#1200)
-
Added an introspection API for checking some of the capabilities provided by HTSlib. (#1170) Thanks also to John Marshall for contributions. (#1222)
- hfile_list_schemes: returns the number of schemes found
- hfile_list_plugins: returns the number of plugins found
- hfile_has_plugin: checks if a specific plugin is available
- hts_features: returns a bit mask with all available features
- hts_test_feature: tests if a feature is available
- hts_feature_string: returns a string summary of enabled features
-
Made performance improvements to the probaln_glocal method, which speeds up mpileup BAQ calculations. (#1188)
- Caching of reused loop variables and removal of loop invariants.
- Code reordering to remove instruction latency.
- Other refactoring and tidyups.
-
Added a public method for constructing a BAM record from the component pieces. Thanks to Anders Kaplan. (#1159, #1164)
-
Added two public methods, sam_parse_cigar and bam_parse_cigar, as part of a small CIGAR API (#1169, #1182). Thanks to Daniel Cameron for input. (#1147)
-
HTSlib, and the included htsfile program, will now recognise the old RAZF compressed file format. Note that while the format is detected, HTSlib is unable to read it. It is recommended that RAZF files are uncompressed with gunzip before using them with HTSlib. Thanks to John Marshall (#1244); and Matthew J. Oldach who reported problems with uncompressing some RAZF files (samtools/samtools#1387).
-
The S3 plugin now has options to force the address style. It will recognise the addressing_style and host_bucket entries in the respective AWS .credentials and s3cmd .s3cfg files. There is also a new HTS_S3_ADDRESS_STYLE environment variable. Details are in the htslib-s3-plugin.7 man page. (#1249)
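The CIGAR API entry above concerns converting a textual CIGAR string into BAM's packed form, in which each operation is stored as (length << 4) | op_code with op codes 0 to 8 standing for MIDNSHP=X. The following is a minimal standalone sketch of that encoding, not HTSlib's implementation of sam_parse_cigar or bam_parse_cigar. The helper name cigar_pack_sketch and its signature are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* BAM operation codes 0..8 correspond to these characters, in order. */
static const char CIGAR_OPS[] = "MIDNSHP=X";

/*
 * Hypothetical helper (not an HTSlib function): pack a textual CIGAR such
 * as "10M2D5M" into BAM form, where each element is (length << 4) | op_code.
 * Writes at most max_ops elements to out.
 * Returns the number of operations, or -1 on malformed input or overflow.
 */
int cigar_pack_sketch(const char *text, uint32_t *out, size_t max_ops)
{
    assert(text != NULL && out != NULL);

    /* SAM uses "*" to mean that no CIGAR is available. */
    if (strcmp(text, "*") == 0)
        return 0;

    size_t n = 0;
    const char *p = text;

    while (*p != '\0') {
        /* Each operation must start with a decimal length. */
        if (!isdigit((unsigned char)*p))
            return -1;

        uint32_t len = 0;
        while (isdigit((unsigned char)*p)) {
            uint32_t digit = (uint32_t)(*p - '0');
            /* The length field is only 28 bits wide. */
            if (len > ((1u << 28) - 1 - digit) / 10)
                return -1;
            len = len * 10 + digit;
            p++;
        }

        /* The length must be followed by a known operation character. */
        const char *op = (*p != '\0') ? strchr(CIGAR_OPS, *p) : NULL;
        if (op == NULL || n == max_ops)
            return -1;

        out[n++] = (len << 4) | (uint32_t)(op - CIGAR_OPS);
        p++;
    }
    return (int)n;
}
```

With this sketch, "10M2D5M" packs into three elements, the second of which carries length 2 and op code 2 (D). Real code should call the HTSlib functions and consult their documentation in htslib/sam.h for the exact signatures and memory handling.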
Build changes
These are compiler, configuration and makefile based changes.
-
Added new Makefile targets for the applications that embed HTSlib and want to run its test suite or clean its generated artefacts. (#1230, #1238)
-
The CRAM codecs are now obtained via the htscodecs submodule, hence when cloning it is now best to use git clone --recursive. In an existing clone, you may use git submodule update --init to obtain the htscodecs submodule checkout.
-
Updated CI test configuration to recurse HTSlib submodules. (#1359)
-
Added Cirrus-CI integration as a replacement for Travis, which was phased out. (#1175; #1212)
-
Updated the Windows image used by Appveyor to 'Visual Studio 2019'. (#1172; fixed #1166)
-
Fixed a buglet in configure.ac, exposed by release 2.70 of autoconf. Thanks to John Marshall. (#1198)
-
Fixed plugin linking on macOS, to prevent symbol conflict when linking with a static HTSlib. Thanks to John Marshall. (#1184)
-
Fixed a clang++9 error in cram_io.h. Thanks to Pjotr Prins. (#1190)
-
Introduced $(ALL_CPPFLAGS) to allow for more flexibility in setting the compiler flags. Thanks to John Marshall. (#1187)
-
Added 'fall through' comments to prevent warnings issued by Clang on intentional fall through case statements, when building with the -Wextra flag. Thanks to John Marshall. (#1163)
-
Non-configure builds now define _XOPEN_SOURCE=600 to allow them to work when the gcc -std=c99 option is used. Thanks to John Marshall. (#1246)
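To illustrate why the _XOPEN_SOURCE definition matters: under a strict mode such as gcc -std=c99, C libraries such as glibc typically hide POSIX declarations unless a feature-test macro requests them. The sketch below is an assumption-laden illustration rather than HTSlib code. The helper duplicate_label is hypothetical, and strdup() was picked simply because it is a POSIX function that is not part of ISO C99.

```c
/* The feature-test macro must be defined before the first system header. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: duplicate a string with strdup(), which is declared
 * by POSIX rather than by ISO C99.
 * The caller owns the returned buffer and must free() it.
 */
char *duplicate_label(const char *label)
{
    assert(label != NULL);
    return strdup(label);
}
```

HTSlib's non-configure builds set the macro through the build flags rather than in each source file; defining it at the top of the file, as here, is just the most compact way to show the effect in a single translation unit.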
Bug fixes
-
Fixed VCF #CHROM header parsing to only separate columns at tab characters. Thanks to Sam Morris for reporting the issue. (#1237; fixed samtools/bcftools#1408)
-
Fixed a crash reported in bcf_sr_sort_set, which expects REF to be present. (#1204; fixed samtools/bcftools#1361)
-
Fixed a bcf synced reader bug when filtering with a region list, and the first record for a chromosome had the same position as the last record for the previous chromosome. (#1254; fixed samtools/bcftools#1441)
-
Fixed a bug in the overlapping logic of mpileup, dealing with iterating over CIGAR segments. Thanks to @wulj2 for the analysis. (#1202; fixed #1196)
-
Fixed a tabix bug that prevented setting the correct number of lines to be skipped in a region file. Thanks to Jim Robinson for reporting it. (#1189; fixed #1186)
-
Made bam_itr_next an alias for sam_itr_next, to prevent it from crashing when working with htsFile pointers. Thanks to Torbjörn Klatt for reporting it. (#1180; fixed #1179)
-
Fixed the bgzf_idx_flush assertion, which is checked once per outgoing multi-threaded block, to accommodate situations where a single record spans multiple blocks. Thanks to @lacek. (#1168; fixed samtools/samtools#1328)
-
Fixed an assumption that pthread_t is a non-structure type; POSIX permits it to be a structure. Thanks also to John Marshall and Anders Kaplan. (#1167, #1153, #1153)
Fixed the minimum offset of a BAI index bin, to account for unmapped reads. Thanks to John Marshall for spotting the issue. (#1158; fixed #1142)
-
Fixed the CRLF handling in the sam_parse_worker method. Thanks to Anders Kaplan. (#1149; fixed #1148)
-
Included unistd.h and errno.h directly in HTSlib files, as opposed to including them indirectly, via third party code. Thanks to Andrew Patterson (#1143) and John Marshall (#1145).
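The #CHROM header fix listed above comes down to treating only the tab character as a column separator, so that a column name containing spaces stays in one piece. A minimal standalone sketch of tab-only splitting follows. It is not the HTSlib parser: the helper split_on_tabs is hypothetical, and the sample name "sample A" used in the usage note is invented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not HTSlib code): split a header line in place at
 * tab characters only, so spaces inside a column name are preserved.
 * Stores up to max_fields pointers in fields and returns the total number
 * of tab-separated columns found, which may exceed max_fields.
 */
size_t split_on_tabs(char *line, char **fields, size_t max_fields)
{
    assert(line != NULL && fields != NULL);

    size_t n = 0;
    char *start = line;

    for (;;) {
        char *tab = strchr(start, '\t');
        if (n < max_fields)
            fields[n] = start;
        n++;
        if (tab == NULL)
            break;
        *tab = '\0';      /* terminate this column */
        start = tab + 1;  /* the next column begins after the tab */
    }
    return n;
}
```

Given a line made of the nine fixed VCF columns followed by a tab and "sample A", this returns 10 columns, with the last one left as "sample A" rather than being split at the space.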