Can a custom reference dataset be a probe set? #20

vincianem · 2025-01-15T11:13:53Z

Hi,

I have just started working with capture data, and I wish to use Captus to process them.

I have two datasets: one with flowering plants (Teucrium) and one with sea animals (Octocorallia). For the flowering plants it is simple, as Captus comes bundled with Mega353, but I wonder how to process the Octocorallia data. Can a probe set be used as a target file if I properly format the sequence names?

I am new to the Captus pipeline and this type of data, so please correct me if I've gotten something wrong.

Best wishes,
Vinciane

edgardomortiz · 2025-01-15T17:55:01Z

Hi @vincianem

You can provide any lineage set from the BUSCO database (https://busco-data.ezlab.org/v5/data/lineages/): just download the tar.gz file and provide its path to Captus for the extraction step with -n.

Now, if you have a custom probe set, you must provide the sequences (the full locus sequences, e.g. CDS) from which the probes (120 bp segments) were derived.

I hope this helps. Do not hesitate to ask me if something is not clear.
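
As a quick sanity check on the point above, the following minimal Python sketch flags sequences in a FASTA file that are no longer than a single probe, which would suggest the file still contains probes rather than full locus sequences. It is not part of Captus; the function names `read_fasta` and `probe_like` and the example records are hypothetical, and the 120 bp threshold is only the probe length mentioned in this thread.

```python
from io import StringIO

PROBE_LENGTH = 120  # probe size mentioned in this thread; adjust for your kit


def read_fasta(handle):
    """Yield (name, sequence) pairs from an open FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def probe_like(handle, probe_length=PROBE_LENGTH):
    """Return names of sequences that are no longer than one probe."""
    return [name for name, seq in read_fasta(handle) if len(seq) <= probe_length]


# Hypothetical example: one probe-sized record and one longer locus sequence.
example = StringIO(">probe_1\n" + "A" * 120 + "\n>locus_full\n" + "ACGT" * 100 + "\n")
print(probe_like(example))
```

For a real file, pass `open("targets.fasta")` (a placeholder path) instead of the `StringIO` example. An empty list means no sequence is probe-sized; it does not by itself prove that the sequences are complete loci.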

Edgardo

vincianem · 2025-01-16T12:09:51Z

Hi @edgardomortiz

Thank you for your answer!

We used the octocoral v.2 probe set, but the target file was not made available with it. How would one proceed to create a robust target file from the probe set? For example, how did you create SeedPlantsPTD?

I also have a question regarding the ploidy level. I have at least 2n and 4n in my dataset. How is variation in ploidy taken into account in Captus?

Best wishes,
Vinciane

edgardomortiz · 2025-01-16T15:37:50Z

Regarding the octocoral v2 probe set, I am not familiar with it but perhaps you can contact the authors or maybe the file is available as supplementary material with the paper where it was published? About ploidy, Captus can recover any number of divergent copies of a single locus, as long as they are different enough to be assembled as separate contigs.

In the case of the plastome proteins I downloaded all the plastome proteins available in GenBank and then clustered and manually curated the clusters.

Edgardo

vincianem · 2025-02-06T14:28:48Z

Thank you for your answers @edgardomortiz.

No target file was provided with the octocoral v.2 paper. We have contacted the authors but have not heard back from them yet. In the meantime, I used genome assemblies to produce my own target file using Phyluce.

I wonder, should loci for which I have multiple sequences be clustered in the final FASTA file? I formatted the sequence names according to the Captus manual.

That's great for the ploidy!

Vinciane

edgardomortiz · 2025-02-17T05:05:30Z

Hi @vincianem ,

Sorry for the delay, the last few weeks have been extremely busy. You can have multiple sequences per locus in your target file (if I understood your question correctly; if not, please don't hesitate to ask again!)

By the way, Captus design can help with designing probes and targets. If you want to try it, I could help.
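
To check how many sequences each locus has in a target file, a small Python sketch like the one below can group the FASTA header names by locus. It assumes, as an illustration only, that names follow a `sequenceName-locusName` pattern with the locus after the last hyphen; please confirm the exact convention in the Captus manual. The function name `group_by_locus` and the example headers are hypothetical.

```python
from collections import defaultdict


def group_by_locus(names, separator="-"):
    """Map each locus name to the list of source names that carry it.

    Assumption: the locus name is whatever follows the LAST separator.
    A name without the separator is treated as a locus with an empty source.
    """
    loci = defaultdict(list)
    for name in names:
        source, _, locus = name.rpartition(separator)
        loci[locus].append(source)
    return dict(loci)


# Hypothetical header names, without the leading ">".
headers = ["speciesA-locus0001", "speciesB-locus0001", "speciesA-locus0002"]
for locus, sources in group_by_locus(headers).items():
    print(locus, len(sources), sources)
```

Loci listed with more than one source are the ones represented by multiple sequences in the target file.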

Edgardo

vincianem · 2025-02-17T11:55:03Z

Hi @edgardomortiz,

No worries, Captus worked fine with the first target file I designed with Phyluce. That said, I have two more target files to create using existing sets of probes, genomes and transcriptome assemblies. I would gladly give Captus design a try. Where can I find information on how Captus design works?

Best wishes, Vinciane
