No error during align-DNA failure #252
Comments
This issue should really be in the align-DNA repository; this isn't an issue with the metapipeline itself.
True, should I copy it over and remove this one?
You should be able to
@alkaZeltser: can you try directly running Also:
I could.. but Paul told me to document the issue and move on :D
I'm using what the metapipeline is pointing to, which is
Here is the csv file generated by the metapipeline for my test sample:
I'm getting the following error when I test on F16 or F32 using v9.0.0 or the current branch:
When checking the Nextflow HTML report, there are no "failed" tasks. However, there is 1 "aborted" task for @yashpatel6 : did something change in
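The failed-versus-aborted distinction in the report can also be checked from a script. The following is a minimal sketch, assuming the run also wrote a Nextflow trace file as tab-separated text with `name`, `status` and `exit` columns (this thread does not say whether tracing was enabled). The function name `unfinished_tasks` and the sample rows are made up for illustration and are not taken from the runs in this issue.

```python
import csv
import io


def unfinished_tasks(trace_text):
    """Return (name, status, exit) for each task that did not finish cleanly.

    Assumes a tab-separated trace with 'name', 'status' and 'exit' columns.
    """
    finished = {"COMPLETED", "CACHED"}
    reader = csv.DictReader(io.StringIO(trace_text), delimiter="\t")
    return [
        (row["name"], row["status"], row["exit"])
        for row in reader
        if row["status"] not in finished
    ]


# Hypothetical trace rows, for illustration only.
SAMPLE_TRACE = (
    "task_id\tname\tstatus\texit\n"
    "1\texample_task_a\tCOMPLETED\t0\n"
    "2\texample_task_b\tABORTED\t-\n"
)

if __name__ == "__main__":
    for name, status, exit_code in unfinished_tasks(SAMPLE_TRACE):
        print(f"{name}: status={status} exit={exit_code}")
```

Filtering on anything other than a clean finish, rather than only on FAILED, means an aborted task like the one described above would still be listed.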
I believe this is related to #229. When it fails during Spark, it seems like Spark isn't able to return the corresponding error message back to the main process, resulting in no error message in the log.
Describe the issue
I am running some new samples through the metapipeline and trying to test various partitions (F72/F32/F16) to find the minimum requirement for my dataset. I found no issues when running on F72; however, the other two partitions result in a failure during the align-DNA process. The pipeline stops and errors out, but no descriptive error message from BWA-MEM is returned, so troubleshooting is difficult. The failure occurred about 5 hours into F32 alignment and 12 hours into F16 alignment. No completed BAMs were returned.
The test sample I'm using is from the recently registered
/hot/data/PRAD/PRAD0000068
It is a single germline WGS sample (not tumor-normal pair).
More info here: https://github.com/uclahs-cds/dataset-register-file/pull/116
From successfully completed F72 test runs, I know that the aligned BAM of this sample is 110G - quite large.
I suspect this is a resource issue, but would be nice to get a definitive error message from the aligner on why it stops.
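One way a resource failure can leave empty output is when the child process is killed by a signal (for example SIGKILL from the kernel's out-of-memory killer) before it can write anything. That is only a plausible explanation here, not something confirmed in this thread. The sketch below shows how a wrapper could at least report the exit status in that case. It is not the pipeline's code, the names `describe_exit` and `run_and_report` are hypothetical, and it assumes a POSIX system.

```python
import signal
import subprocess
import sys


def _signal_name(number):
    """Return the signal's name, or None if the number is not a known signal."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return None


def describe_exit(returncode):
    """Turn a child's return code into a human-readable reason."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # subprocess reports death-by-signal as a negative return code.
        name = _signal_name(-returncode) or f"signal {-returncode}"
        return f"killed by {name}"
    if returncode > 128:
        # Shells report a signal-killed child as 128 + signal number.
        name = _signal_name(returncode - 128)
        if name is not None:
            return f"exited with status {returncode} (possibly {name})"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd, and on failure print why it stopped even if it printed nothing."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    reason = describe_exit(proc.returncode)
    if proc.returncode != 0:
        print(f"{cmd[0]}: {reason}", file=sys.stderr)
        if not proc.stderr.strip():
            print("(the command wrote nothing to stderr)", file=sys.stderr)
    return proc.returncode, reason


if __name__ == "__main__":
    # Stand-in child process; a real wrapper would run the aligner command.
    run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"])
```

A status of 137 (128 + 9) or a return code of -9 would point at SIGKILL, which is worth cross-checking against the node's kernel log before concluding the job ran out of memory.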
Error messages in logs:
"Command output:" and "Command error:" are empty lines.
/hot/user/nzeltser/project-disease-ProstateTumor-PRAD-000110-URGGermlineWGS/script/run-metapipeline.sh
/hot/project/disease/ProstateTumor/PRAD-000110-URGGermlineWGS/meta-pipeline/output/EDRN-Zeltser-PRAD-LPUV/test/EZPRLPUV-test-F16.log
/hot/project/disease/ProstateTumor/PRAD-000110-URGGermlineWGS/meta-pipeline/output/EDRN-Zeltser-PRAD-LPUV/test/EZPRLPUV-test-F32.log
/hot/project/disease/ProstateTumor/PRAD-000110-URGGermlineWGS/meta-pipeline/input/EDRN-Zeltser-PRAD-LPUV/EDRN-Zeltser-PRAD-LPUV_meta-pipeline_F16.config
/hot/project/disease/ProstateTumor/PRAD-000110-URGGermlineWGS/meta-pipeline/input/EDRN-Zeltser-PRAD-LPUV/EDRN-Zeltser-PRAD-LPUV_meta-pipeline_F32.config
/hot/project/disease/ProstateTumor/PRAD-000110-URGGermlineWGS/meta-pipeline/output/EDRN-Zeltser-PRAD-LPUV/test/68/a854a5e349c2880341b4165070e5db
/hot/project/disease/ProstateTumor/PRAD-000110-URGGermlineWGS/meta-pipeline/output/EDRN-Zeltser-PRAD-LPUV/test/54/a283d82752cf67f95b7f3514c1443b
To Reproduce
Expected behavior
I don't actually expect this size of a sample to complete on an F16 node, maaaybe an F32, but I do expect an error message telling me why it failed.