Empty results when Using taxator-tk for binning contigs against RefSeq bacterial genomes #68
Comments
Hi @FarzanehRah, this is hard to debug with sparse information, but let's give it a try!
Observations
Questions
Please answer these questions so that we can find the problem together:
The most reasonable explanation, as far as I can tell at the moment, is that the aligner does not generate any hits, so the alignments file is empty and the classification is empty as well. I haven't used these specific alignment parameters with last, but they might simply be too stringent. Did you try the blast workflow? It works similarly, but index building is much faster and memory usage is much lower.
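To check the "no hits" explanation directly, one can count how many non-comment lines the aligner actually wrote. The following is a minimal sketch, not part of taxator-tk: the function name `count_alignment_records` is hypothetical, and it assumes that header and comment lines in the alignment output start with `#`, so that any remaining non-blank line is alignment content.

```python
from pathlib import Path


def count_alignment_records(path):
    """Count lines that are neither blank nor '#' comments.

    Assumption: comment/header lines in the aligner output start with '#'.
    A missing file is treated the same as an empty one.
    """
    p = Path(path)
    if not p.exists():
        return 0
    count = 0
    with p.open() as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


if __name__ == "__main__":
    import sys

    # Usage: python count_alignments.py <alignment file> [...]
    for name in sys.argv[1:]:
        print(name, count_alignment_records(name))
```

A count of 0 for the alignments file would support the explanation that the empty classification comes from the alignment step rather than from taxator itself.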
Hi Johannes, thank you so much, I really appreciate your prompt responses and valuable suggestions.
Work Done and Challenges Encountered:
Responses to Your Questions:
For your second observation, I should mention that the first part is from the
Thanks again for your time.
As you can see, the aligner doesn't generate any hits for taxator to work with. That is why the results are empty. Possibly there is an issue with the parallel wrapper around lastal, or the sensitivity is simply not high enough with the parameters used.
I would take a few sequences of the query for a test run with both last and blast, to find the right alignment parameters. Start with the single-thread mode for testing. You can also run plain lastal with your parameters against the constructed refpack to see whether that also produces no alignments. I would also give NCBI blast another try, to verify.
To work with more recent aligners, you can always go to the binary folder in the taxator-tk installation and replace the binaries with more recent versions. This has always worked fine for last, but blast has proven less backward compatible and might need some tweaking. In any case, I update the binaries from time to time, so any feedback would help me provide an updated taxator-tk runtime.
Finally, you could also use alignment in protein space, which should be quite sensitive over larger phylogenetic distances. taxator-tk includes a sample blastp pipeline with built-in ORF detection etc. That mode requires a protein database, if I remember correctly.
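The suggestion to test with a few query sequences can be sketched as a small helper that copies the first records of a FASTA file into a new file. This is only an illustration: the function name `take_first_records` and the file names in the usage example are hypothetical, and it assumes a plain-text FASTA file in which each record starts with a `>` header line.

```python
def take_first_records(fasta_in, fasta_out, n=10):
    """Copy the first n FASTA records from fasta_in to fasta_out.

    Returns the number of records actually written, which is smaller
    than n if the input holds fewer records.
    """
    written = 0
    keep = False  # becomes True once the first header line is seen
    with open(fasta_in) as src, open(fasta_out, "w") as dst:
        for line in src:
            if line.startswith(">"):
                if written == n:
                    break  # enough records copied, stop reading
                written += 1
                keep = True
            if keep:
                dst.write(line)
    return written


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    copied = take_first_records("contigs.fasta", "contigs.subset.fasta", n=20)
    print("records copied:", copied)
```

The resulting subset is small enough to run through last and blast in single-thread mode with different parameters, which keeps each test run short.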
Hi, for_taxator_contigs.txt
Hi,
I want to use taxator-tk to bin my contigs against RefSeq bacterial genomes. To do so, I created a refpack from all bacterial genomes (25,124,056 sequences, 346 GB). Since the indexing time was very long, after several attempts I used the following options to create the index:
-P64 -uRY16 -c -i 100
Then I added the -m 100 option to the lastal command in binning-last.bash, but the results are empty:
bash.err:
bash.out:
Do you have any suggestions or advice for this issue?
Thank you in advance