Annotating Breakend Points (<BND>) #111
Comments
I haven't looked at the data yet, but just to clarify:
in BED format will not overlap:
because the VCF is 1-based and the BED is 0-based.
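The coordinate point above can be checked with a small sketch. It assumes the usual conventions (VCF POS is 1-based; BED intervals are 0-based and half-open) and uses the position and BED line from this issue. The helper names `vcf_pos_to_bed` and `overlaps` are hypothetical and only illustrate the arithmetic; they are not vcfanno's internal overlap logic.

```python
def vcf_pos_to_bed(pos, ref_len=1):
    """Convert a 1-based VCF POS to a 0-based, half-open BED (start, end)."""
    start = pos - 1
    return start, start + ref_len


def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


# VCF record at POS 261425 (1-based) with a single-base REF
query = vcf_pos_to_bed(261425)

# BED line from the issue: 1  261425  261426  LP000Test
bed_as_given = (261425, 261426)

# The same position written in BED (0-based, half-open) coordinates
bed_shifted = (261424, 261425)

print(query)                          # (261424, 261425)
print(overlaps(query, bed_as_given))  # False
print(overlaps(query, bed_shifted))   # True
```

Under these assumptions, a BED line written with the VCF position as its start sits one base to the right of the variant, so a strict overlap test fails.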
Hi Brentp, thanks for the response; I agree with your argument. For that reason I converted my BED to 1-based, with the coordinate as shown. But it still does not annotate all the BND events that have the same coordinates in our cohort. It works when I edit the ALT-ID to INS:[chr4:190113797[GA and add END=261426; then I get all the BNDs with exactly the same coordinates as in my cohort. This small makeshift edit works for me, but if possible, could this be fixed in the vcfanno code?
Can you post a 1-line VCF (with header) and a 1-line BED that demonstrate the problem?
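The makeshift edit described above could be scripted roughly as follows. This is only a sketch of the END part of the workaround: it prepends END=POS+1 to the INFO column of records tagged SVTYPE=BND that have no END key. The function name `add_end_to_bnd` is hypothetical, the ALT rewrite is deliberately left out, and whether adding END alone is enough for vcfanno is not confirmed in this thread.

```python
def add_end_to_bnd(line):
    """Prepend END=POS+1 to INFO for SVTYPE=BND records that lack END.

    Takes one VCF line and returns it without a trailing newline.
    Header lines and non-BND records are returned unchanged.
    """
    line = line.rstrip("\n")
    if line.startswith("#"):
        return line
    fields = line.split("\t")
    if len(fields) < 8:
        return line  # not a complete VCF data line
    info_items = fields[7].split(";")
    keys = {item.split("=", 1)[0] for item in info_items}
    if "SVTYPE=BND" in info_items and "END" not in keys:
        end = int(fields[1]) + 1  # assumption: END = POS + 1, as in Variant-2
        fields[7] = f"END={end};{fields[7]}"
    return "\t".join(fields)


# Shortened version of Variant-1 from this issue (INFO trimmed for brevity)
record = "\t".join([
    "1", "261425", "MantaBND:58922:1:10:0:0:0:1", "A",
    "[chr4:190113797[GA", "292", "PASS", "SVTYPE=BND;SVINSLEN=1",
])
print(add_end_to_bnd(record).split("\t")[7])
```

Applying the function a second time leaves the record unchanged, because the END key is then already present.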
Hi,
I am trying to annotate breakend (BND) structural variants called by Illumina Manta, but vcfanno does not seem to annotate them because it does not recognize the ALT-ID tags for BND types. The files are attached in vcf_error.zip. Contents of the zipped file:
test.vcf: List of example query variants
gnomad_test.bed.gz: List of example variants with their coordinates and ID that can be used to annotate the query vcf file
vcfanno_bed.conf.toml: Configuration file
The attached test.vcf file contains two variants:
Variant-1:
1 261425 MantaBND:58922:1:10:0:0:0:1 A [chr4:190113797[GA 292 PASS SVTYPE=BND;MATEID=MantaBND:58922:1:10:0:0:0:0;SVINSLEN=1;SVINSSEQ=G;BND_DEPTH=106;MATE_BND_DEPTH=42;AC=1;AN=2;CSQT=1|AP006222.1|ENST00000441866.2|transcript_variant GT:FT:GQ:PL:PR:SR 0/1:PASS:292:342,0,999:37,1:65,19
and Variant-2:
1 261425 MantaBND:58922:1:10:0:0:0:1 A 292 PASS END=261426;SVTYPE=BND;MATEID=MantaBND:58922:1:10:0:0:0:0;SVINSLEN=1;SVINSSEQ=G;BND_DEPTH=106;MATE_BND_DEPTH=42;AC=1;AN=2;CSQT=1|AP006222.1|ENST00000441866.2|transcript_variant GT:FT:GQ:PL:PR:SR 0/1:PASS:292:342,0,999:37,1:65,19
These two variants represent breakend (BND) events. Variant-1 is the true variant without any modification; Variant-2 is the same as Variant-1 but with the ALT-ID changed to (can also be or DUP:TANDEM etc.) and the endpoint tag END=261426 added to the line. I am trying to annotate with the gnomad_test.bed.gz file (attached here), which has exactly the same coordinates as this variant:
1 261425 261426 LP000Test
The configuration file is also attached: vcfanno_bed.conf.toml
I ran the command:
vcfanno -p 4 -ends -permissive-overlap vcfanno_bed.conf.toml test.vcf
The resulting annotation for Variant-1:
-- Not annotated with LP000Test
Whereas for Variant-2:
-- Annotated with LP000Test.
It seems the problem is that vcfanno cannot recognize or find the ALT-ID and END point of the BND variant type.
Is there a way this can be fixed in vcfanno? It would help a lot when I compute internal overlap with our large cohort (n>800 samples). Currently, many of these BND types are missed because of this, which affects the overall interpretation.
Thanks for your help.
vcfanno_error.zip