Question about +/- in final_decomposition.tsv #19

865699871 · 2022-03-21T11:46:12Z

In the study by Altemose et al. (Complete genomic and epigenetic maps of human centromeres), CHM13 Cen1 contains a 1.7 Mb inversion inside the active α HOR array (Fig. 2a). We ran StringDecomposer on the Cen1 active α HOR array. However, all entries in final_decomposition.tsv are marked +. Can StringDecomposer mark the strand (+ / -) of each sequence?

TanyaDvorkina · 2022-03-21T13:17:19Z

Hi,

Thank you for your interest in StringDecomposer!
In our TSV files, the +/- at the end of each row refers to the "reliability" of the alignment (see the Quick start section for more information about the output). This field is needed for monomer-to-read alignment only.

The strand is indicated by a ' at the end of the monomer name.
Consider two rows in the final TSV file:
ref mon 1 171 99
ref mon' 172 343 99

The second row shows that monomer mon is aligned to the reverse strand with identity 99.
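
As a rough illustration of the convention above (not the project's own code), here is a minimal Python sketch that reads rows shaped like the two example rows and turns the trailing ' into an explicit +/- strand. The function names `parse_strand` and `annotate_strand` are hypothetical, and the sketch assumes the first five whitespace-separated fields are sequence name, monomer name, start, end, and identity, as in the example; a real final_decomposition.tsv has additional columns, which are ignored here. Coordinates are passed through unchanged.

```python
def parse_strand(monomer_name):
    """Split a monomer name into (base_name, strand).

    Assumes the convention described above: a trailing apostrophe
    means the monomer aligned to the reverse strand.
    """
    if monomer_name.endswith("'"):
        return monomer_name[:-1], "-"
    return monomer_name, "+"


def annotate_strand(lines):
    """Return (seq, start, end, monomer, identity, strand) tuples.

    Only the first five whitespace-separated fields are used
    (an assumption based on the example rows); any further columns
    are ignored, and rows with fewer fields are skipped.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue
        seq, mon, start, end, identity = fields[:5]
        name, strand = parse_strand(mon)
        records.append((seq, int(start), int(end), name, float(identity), strand))
    return records


rows = [
    "ref mon 1 171 99",
    "ref mon' 172 343 99",
]
for record in annotate_strand(rows):
    print(record)
```

For an inversion like the one in Cen1, one would then expect a run of consecutive records with strand "-" inside the array.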

We understand that this representation of the strand is a bit misleading, and we are going to add a BED-file representation of StringDecomposer output in the next release.
For now, you can use our internal script convert2bed.py to convert the StringDecomposer final TSV file to a BED file.

If this doesn't help, please don't hesitate to ask further questions!

Best,
Tanya

865699871 · 2022-03-21T13:32:39Z

Thank you for your response!

865699871 closed this as completed Mar 21, 2022

865699871 reopened this Mar 21, 2022
