Commit 4c1acb8

Fix issue created in PR#1848
We shouldn't be attempting to create embedded reference sequences for CRAM containers with reads mapped to chr -1 (i.e. unmapped). We don't permit embed_ref in multi-ref mode and it's nonsensical for entirely unmapped data.

The only real fix needed here is "c->ref_id < 0" just before calling cram_generate_reference(), but checking elsewhere can sidestep other potential issues and we have safety in checking in more than one place.

Credit to OSS_Fuzz
Fixes oss-fuzz issue 372547397
1 parent ca92061 commit 4c1acb8
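
The core guard described in the commit message can be illustrated with a minimal, self-contained sketch. It is not htslib code: `toy_container`, `toy_generate_reference()`, `toy_try_embed()` and `generate_calls` are hypothetical stand-ins, and the real `cram_container` has many more fields. The sketch only shows how placing `ref_id < 0` on the left of `||` prevents the reference generator from ever running for unmapped or multi-ref containers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for htslib's cram_container;
 * only the field relevant to this sketch is modeled. */
typedef struct {
    int ref_id;   /* >= 0: a single reference; < 0: unmapped or multi-ref */
} toy_container;

/* Counts how often the (hypothetical) generator was actually invoked. */
static int generate_calls = 0;

/* Hypothetical stand-in for cram_generate_reference(); always succeeds. */
static int toy_generate_reference(const toy_container *c) {
    (void)c;
    generate_calls++;
    return 0;
}

/* Returns 1 if an embedded reference was built, 0 if the caller
 * should fall back to encoding without a reference. */
static int toy_try_embed(const toy_container *c) {
    assert(c != NULL);
    /* || short-circuits: the generator is skipped when ref_id < 0. */
    if (c->ref_id < 0 || toy_generate_reference(c) < 0)
        return 0;
    return 1;
}
```

Under these assumptions, calling `toy_try_embed()` on a container with `ref_id == -1` returns 0 and leaves `generate_calls` untouched, while a container with a non-negative `ref_id` reaches the generator.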

2 files changed, +10 -2 lines changed
cram/cram_encode.c

+8 -2
@@ -1842,7 +1842,7 @@ int cram_encode_container(cram_fd *fd, cram_container *c) {
 // Don't try embed ref if we repeatedly fail
 pthread_mutex_lock(&fd->ref_lock);
 int failed_embed = (fd->no_ref_counter >= 5); // maximum 5 tries
-if (!failed_embed && c->embed_ref == -2) {
+if (!failed_embed && c->embed_ref == -2 && c->ref_id >= 0) {
     hts_log_warning("Retrying embed_ref=2 mode for #%d/5", fd->no_ref_counter);
     fd->no_ref = c->no_ref = 0;
     fd->embed_ref = c->embed_ref = 2;
@@ -1921,6 +1921,12 @@ int cram_encode_container(cram_fd *fd, cram_container *c) {
         // Do not confuse with fd->ref_free which is a pointer to a
         // reference string to free.
         c->ref_free = 1;
+    } else {
+        // Double check for broken input. We shouldn't have
+        // embedded references enabled for unmapped data, but our
+        // data could be broken.
+        embed_ref = 0;
+        no_ref = c->no_ref = 1;
     }
 }
 c->ref_seq_id = c->ref_id;
@@ -1967,7 +1973,7 @@ int cram_encode_container(cram_fd *fd, cram_container *c) {
 
 // Embed consensus / MD-generated ref
 if (embed_ref == 2) {
-    if (cram_generate_reference(c, s, r1) < 0) {
+    if (c->ref_id < 0 || cram_generate_reference(c, s, r1) < 0) {
         // Should this be a permanent thing via fd->no_ref?
         // Doing so means we cannot easily switch back again should
         // things fix themselves later on. This is likely not a
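
The retry gate changed in the first hunk above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the htslib implementation: `toy_fd`, `toy_container`, `toy_maybe_retry_embed()` and `TOY_MAX_EMBED_TRIES` are hypothetical names, the field names merely mirror those in the diff, and the `fd->ref_lock` mutex handling is omitted. The values 5 and -2 are copied from the diff as-is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified state. Field names mirror the diff, but this
 * is not the real cram_fd / cram_container layout. */
typedef struct {
    int no_ref_counter;  /* failed embed attempts so far */
    int embed_ref;
    int no_ref;
} toy_fd;

typedef struct {
    int ref_id;          /* < 0: unmapped or multi-ref container */
    int embed_ref;
    int no_ref;
} toy_container;

enum { TOY_MAX_EMBED_TRIES = 5 };  /* "maximum 5 tries" in the diff */

/* Returns 1 if embed_ref=2 mode was re-enabled, 0 otherwise.
 * Locking is deliberately left out of this sketch. */
static int toy_maybe_retry_embed(toy_fd *fd, toy_container *c) {
    assert(fd != NULL && c != NULL);
    int failed_embed = (fd->no_ref_counter >= TOY_MAX_EMBED_TRIES);
    /* The ref_id >= 0 term is the condition added by this commit. */
    if (!failed_embed && c->embed_ref == -2 && c->ref_id >= 0) {
        fd->no_ref = c->no_ref = 0;
        fd->embed_ref = c->embed_ref = 2;
        return 1;
    }
    return 0;
}
```

In this model a container with `ref_id == -1` never re-enters embed mode, whatever the counter says, and a mapped container stops retrying once `no_ref_counter` reaches 5.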

cram/cram_io.c

+2
@@ -4954,6 +4954,8 @@ int cram_write_SAM_hdr(cram_fd *fd, sam_hdr_t *hdr) {
 hts_log_warning("NOTE: the CRAM file will be bigger "
                 "than using an external reference");
 pthread_mutex_lock(&fd->ref_lock);
+// Best guess. It may be unmapped data with broken
+// headers, in which case this will get ignored.
 fd->embed_ref = 2;
 pthread_mutex_unlock(&fd->ref_lock);
 break;
