captus paralog filter - references added in the wrong direction #12

EdBiffin · 2024-06-28T00:53:35Z

Dear Edgardo, I've noticed that when reference sequences are added to alignments prior to informed paralog filtering, in some cases they are added in the reverse direction relative to the extracted sequences in the alignment. I'm using a custom reference file comprising the sequences that were used for probe design, mostly sourced from 1KP and Phytozome. The references were generated by clustering with CD-HIT (longest sequence per cluster at a specified identity). I've attached an example alignment and the references for that gene. I'm using v1.01. Any advice would be greatly appreciated.
AT1G03750.fna.txt
AT1G03750.references.txt

edgardomortiz · 2024-08-02T05:41:03Z

Dear Ed,

Sorry for the really late reply, I was in the chaos of moving countries. I see that in your reference file the sequences are all in different reading frames, so Captus might be having trouble translating them consistently.

Captus translates the references in all six reading frames, then selects the reading frame that produces the fewest internal stop codons. If there is a tie between two reading frames, it will prefer a positive reading frame. So maybe these are not CDS?
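The frame selection described above can be sketched roughly as follows. This is not Captus's actual code, just a minimal standard-library illustration of the rule as stated here. The function names (`best_frame`, `internal_stops`, etc.) are made up for the example, the standard genetic code is assumed, and breaking ties between two positive frames by taking the lowest frame number is my own assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def internal_stops(protein):
    """Count stop codons, ignoring one at the very last position."""
    return protein[:-1].count("*")


def best_frame(seq):
    """Return the frame (+1..+3 or -1..-3) with the fewest internal stops.

    Ties go to a positive frame (as described in the thread), then to the
    lowest frame number (an assumption made for this sketch).
    """
    candidates = []
    for strand, s in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            stops = internal_stops(translate(s[offset:]))
            candidates.append((stops, strand * (offset + 1)))
    return min(candidates, key=lambda c: (c[0], c[1] < 0, abs(c[1])))[1]


seq = "ATGGCTTTAGTTACCGGTTACAAA"
print(best_frame(seq))                      # forward-oriented input
print(best_frame(reverse_complement(seq)))  # same sequence, flipped
```

With short or partial-exon references, several frames can have zero internal stops. In that situation a rule like this picks a positive frame even when the sequence is really reverse-oriented, which may be what is happening with these references.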

If you don't care about obtaining the amino acid format from the alignment step, and you are sure all sequences are in the same direction, you could provide the reference to Captus as miscellaneous DNA (-d AT1G03750.references.fasta).

If the amino acid output is necessary, then I would suggest verifying that these are translatable (preferably in reading frame 1), or at least translatable in a consistent reading frame across all sequences.

Let me know if this helps!

Edgardo

EdBiffin · 2024-08-05T23:12:40Z

Dear Edgardo, thanks for your reply, much appreciated. These are the sequences used for probe design, in some cases only partial exons, so that largely explains the issue. Hope your move went well, and hoping that you continue to develop Captus. We're finding that it plugs a lot of gaps that are issues in other pipelines.

Ed

