Can a custom reference dataset be a probe set? #20
Comments
Hi @vincianem, You can provide any lineage set from the BUSCO database (https://busco-data.ezlab.org/v5/data/lineages/): just download the tar.gz file and provide its path to Captus for the extraction step. Now, if you have a custom probe set, you must provide the sequences (full locus sequences, e.g. CDS) from which the probes (120 bp segments) were derived. I hope this helps; do not hesitate to ask me if something is not clear. Edgardo
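To illustrate why the probes themselves are not a substitute for the full locus sequences, here is a minimal Python sketch of how 120 bp probes could be tiled across a locus. The function name `tile_probes`, the 60 bp step, and the toy sequence are assumptions made for illustration only; they are not taken from Captus or from any published probe design.

```python
import random


def tile_probes(locus_seq, probe_len=120, step=60):
    """Cut a locus sequence into overlapping probe-sized windows.

    probe_len matches the 120 bp segments mentioned above; the step
    (tiling density) is a hypothetical value chosen for this sketch.
    """
    if len(locus_seq) <= probe_len:
        return [locus_seq]
    last_start = len(locus_seq) - probe_len
    starts = list(range(0, last_start + 1, step))
    # Make sure the end of the locus is covered by a final window.
    if starts[-1] != last_start:
        starts.append(last_start)
    return [locus_seq[s:s + probe_len] for s in starts]


# Toy 300 bp "CDS" (random bases, not real data).
random.seed(0)
cds = "".join(random.choice("ACGT") for _ in range(300))
probes = tile_probes(cds)

for i, probe in enumerate(probes):
    print(f"probe_{i}: {len(probe)} bp")
```

Each probe is only a short window of the locus, so a target file built from the full sequences gives the extraction step the whole locus to match against rather than isolated fragments.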
Thank you for your answer! We used the octocoral v.2 probe set, but the target file was not made available with the probe set. How would one proceed to create a robust target file from the probe set? For example, how did you proceed to create the SeedPlantsPTD? I also have a question regarding the ploidy level. I have at least 2n and 4n in my dataset. How is variation in ploidy taken into account in Captus? Best wishes,
Regarding the octocoral v2 probe set, I am not familiar with it, but perhaps you can contact the authors, or maybe the file is available as supplementary material with the paper where it was published. About ploidy, Captus can recover any number of divergent copies of a single locus, as long as they are different enough to be assembled as separate contigs. In the case of the plastome proteins, I downloaded all the plastome proteins available in GenBank and then clustered and manually curated the clusters. Edgardo
Thank you for your answers @edgardomortiz. There was no target file provided with the octocoral v.2 paper. We have contacted the authors but have not heard back from them yet. In the meantime, I used genome assemblies to produce my own target file using Phyluce. I wonder, should loci for which I have multiple sequences be clustered in the final FASTA file? I formatted the sequence names according to the Captus manual. That's great for the ploidy! Vinciane
Hi @vincianem, Sorry for the delay, the last few weeks have been extremely busy. You can have multiple sequences per locus in your target file (if I understood your question correctly; if not, please don't hesitate to ask again!). By the way, Captus design can help with designing probes and targets. If you want to try it, I could help. Edgardo
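As a quick sanity check before running a target file with several sequences per locus, the records can be grouped by locus name. The Python sketch below assumes, purely for illustration, that each FASTA header has the form `source-locus`, with the locus name after the last `-`; this separator and the toy records are assumptions, so please check the Captus manual for the naming rules that actually apply.

```python
import io
from collections import defaultdict


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the name.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def group_by_locus(records, sep="-"):
    """Group records by the text after the last `sep` (assumed locus name)."""
    loci = defaultdict(list)
    for name, seq in records:
        # If `sep` is absent, rpartition leaves the whole name as the locus.
        source, _, locus = name.rpartition(sep)
        loci[locus].append((source, seq))
    return dict(loci)


# Toy target file (made-up names and sequences).
toy_fasta = io.StringIO(
    ">speciesA-locus1\nACGT\n"
    ">speciesB-locus1\nACGA\n"
    ">speciesA-locus2\nGGCC\nTTAA\n"
)
loci = group_by_locus(read_fasta(toy_fasta))

for locus, seqs in sorted(loci.items()):
    print(locus, "has", len(seqs), "reference sequence(s)")
```

A locus that shows up with several entries is expected here, since multiple sequences per locus are allowed; a locus that shows up under an unexpected name usually points to a header formatting problem.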
Hi @edgardomortiz, No worries, Captus worked fine with the first target file I designed with Phyluce. That said, I have two more target files to create using existing sets of probes, genomes and transcriptome assemblies. I would gladly give Captus design a try. Where can I find information on how Captus design works? Best wishes, Vinciane
Hej Hej,
I have just started working with capture data, and I wish to use Captus to process them.
I have two datasets: one with flowering plants (Teucrium) and one with sea animals (Octocorallia). For the flowering plants it is simple, as Captus comes bundled with Mega353, but I wonder how to proceed for the Octocorallia. Can a probe set be used as a target file if I properly format the sequence names?
I am new to the Captus pipeline and this type of data, so please correct me if I've gotten something wrong.
Best wishes,
Vinciane