The possibility of starting from HiFi ccs reads #192

Open
OceanLyu opened this issue Nov 5, 2024 · 0 comments

OceanLyu commented Nov 5, 2024

Thanks for developing RagTag! It's a really nice tool!

I have approximately 4,500,000 HiFi CCS reads with an average length of around 15 kb, and I want to assemble a genome using a related species as the reference. Can the RagTag pipeline start directly from these CCS reads, treating them as contigs, for example by first correcting the reads against the reference sequences? Are there any suggestions for parameter settings?

I have actually tried this, but the run seems to be stuck: more than 2 days have passed since it started building the index. The minimap2 alignment also does not seem to have used all of the input reads, as shown below:

Sun Nov  3 16:55:33 2024 --- VERSION: RagTag v2.1.0
Sun Nov  3 16:55:33 2024 --- CMD: ragtag.py correct -t 28 -u /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ragtag_from_hifiasm/../ref/GCF_016881025.1_HiC_Itri_2_genomic.fna /home/wsq_pkuhpc/lustre2/user/lhy/gsgenome/01.CCS/merged_hifi_css.fasta -o /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ragtag_from_hifiasm/pipe_from_ccs
Sun Nov  3 16:55:33 2024 --- INFO: Mapping the query genome to the reference genome
Sun Nov  3 16:55:33 2024 --- INFO: Running: minimap2 -x asm5 -t 28 /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ref/GCF_016881025.1_HiC_Itri_2_genomic.fna /home/wsq_pkuhpc/lustre2/user/lhy/gsgenome/01.CCS/merged_hifi_css.fasta > /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ragtag_from_hifiasm/pipe_from_ccs/ragtag.correct.asm.paf 2> /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ragtag_from_hifiasm/pipe_from_ccs/ragtag.correct.asm.paf.log
Sun Nov  3 17:09:14 2024 --- INFO: Finished running : minimap2 -x asm5 -t 28 /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ref/GCF_016881025.1_HiC_Itri_2_genomic.fna /home/wsq_pkuhpc/lustre2/user/lhy/gsgenome/01.CCS/merged_hifi_css.fasta > /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ragtag_from_hifiasm/pipe_from_ccs/ragtag.correct.asm.paf 2> /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ragtag_from_hifiasm/pipe_from_ccs/ragtag.correct.asm.paf.log
Sun Nov  3 17:09:14 2024 --- INFO: Reading whole genome alignments
Sun Nov  3 17:16:14 2024 --- INFO: Filtering and merging alignments
[fai_load] build FASTA index.
cat ragtag.correct.asm.paf.log
[M::mm_idx_gen::29.199*1.59] collected minimizers
[M::mm_idx_gen::30.907*2.29] sorted minimizers
[M::main::30.907*2.29] loaded/built the index for 7132 target sequence(s)
[M::mm_mapopt_update::32.944*2.21] mid_occ = 68
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 7132
[M::mm_idx_stat::34.810*2.15] distinct minimizers: 194702736 (93.45% are singletons); average occurrences: 1.191; average spacing: 10.688; total length: 2478965571
[M::worker_pipeline::42.582*5.19] mapped 33670 sequences
[M::worker_pipeline::48.354*7.73] mapped 33693 sequences
[M::worker_pipeline::53.611*9.66] mapped 33535 sequences
[M::worker_pipeline::58.514*11.17] mapped 33449 sequences
[M::worker_pipeline::63.280*12.42] mapped 33429 sequences
[M::worker_pipeline::68.571*13.58] mapped 33486 sequences
[M::worker_pipeline::73.416*14.50] mapped 33520 sequences
[M::worker_pipeline::78.347*15.32] mapped 33270 sequences
[M::worker_pipeline::83.302*16.06] mapped 33454 sequences
[M::worker_pipeline::88.234*16.71] mapped 33442 sequences
[M::worker_pipeline::93.416*17.30] mapped 33397 sequences
[M::worker_pipeline::98.395*17.83] mapped 33468 sequences
[M::worker_pipeline::103.501*18.31] mapped 33541 sequences
[M::worker_pipeline::108.463*18.73] mapped 33538 sequences
[M::worker_pipeline::113.743*19.13] mapped 33754 sequences
[M::worker_pipeline::118.879*19.50] mapped 33813 sequences
[M::worker_pipeline::123.632*19.81] mapped 34451 sequences
[M::worker_pipeline::128.714*20.11] mapped 34606 sequences
[M::worker_pipeline::134.153*20.40] mapped 33830 sequences
[M::worker_pipeline::139.270*20.66] mapped 33663 sequences
[M::worker_pipeline::144.806*20.92] mapped 33614 sequences
[M::worker_pipeline::149.634*21.13] mapped 33640 sequences
[M::worker_pipeline::154.517*21.34] mapped 33759 sequences
[M::worker_pipeline::159.723*21.53] mapped 32968 sequences
[M::worker_pipeline::164.650*21.71] mapped 28767 sequences
[M::worker_pipeline::169.953*21.88] mapped 28581 sequences
[M::worker_pipeline::175.178*22.05] mapped 28755 sequences
[M::worker_pipeline::180.306*22.20] mapped 28677 sequences
[M::worker_pipeline::185.740*22.35] mapped 28655 sequences
[M::worker_pipeline::190.780*22.48] mapped 28816 sequences
[M::worker_pipeline::195.958*22.62] mapped 28694 sequences
[M::worker_pipeline::200.621*22.73] mapped 28656 sequences
[M::worker_pipeline::205.888*22.85] mapped 28551 sequences
[M::worker_pipeline::210.921*22.97] mapped 28630 sequences
[M::worker_pipeline::216.056*23.07] mapped 28662 sequences
[M::worker_pipeline::221.110*23.18] mapped 28572 sequences
[M::worker_pipeline::226.093*23.27] mapped 28705 sequences
[M::worker_pipeline::231.082*23.37] mapped 28614 sequences
[M::worker_pipeline::236.044*23.46] mapped 28517 sequences
[M::worker_pipeline::241.195*23.54] mapped 28618 sequences
[M::worker_pipeline::246.158*23.63] mapped 28571 sequences
[M::worker_pipeline::251.323*23.71] mapped 28651 sequences
[M::worker_pipeline::256.565*23.78] mapped 28574 sequences
[M::worker_pipeline::261.490*23.85] mapped 28595 sequences
[M::worker_pipeline::266.719*23.93] mapped 28612 sequences
[M::worker_pipeline::271.615*23.99] mapped 28534 sequences
[M::worker_pipeline::276.596*24.06] mapped 28596 sequences
[M::worker_pipeline::281.468*24.12] mapped 28568 sequences
[M::worker_pipeline::286.590*24.18] mapped 28506 sequences
[M::worker_pipeline::291.486*24.24] mapped 28705 sequences
[M::worker_pipeline::296.737*24.30] mapped 28556 sequences
[M::worker_pipeline::301.750*24.35] mapped 28554 sequences
[M::worker_pipeline::306.969*24.40] mapped 28593 sequences
[M::worker_pipeline::311.944*24.46] mapped 28442 sequences
[M::worker_pipeline::317.117*24.51] mapped 28559 sequences
[M::worker_pipeline::322.174*24.56] mapped 28460 sequences
[M::worker_pipeline::327.369*24.60] mapped 28458 sequences
[M::worker_pipeline::332.449*24.65] mapped 28451 sequences
[M::worker_pipeline::337.482*24.69] mapped 28362 sequences
[M::worker_pipeline::342.467*24.73] mapped 28421 sequences
[M::worker_pipeline::347.595*24.78] mapped 28418 sequences
[M::worker_pipeline::352.562*24.82] mapped 28353 sequences
[M::worker_pipeline::357.924*24.86] mapped 28436 sequences
[M::worker_pipeline::363.196*24.90] mapped 28382 sequences
[M::worker_pipeline::368.039*24.93] mapped 28393 sequences
[M::worker_pipeline::373.143*24.97] mapped 28417 sequences
[M::worker_pipeline::378.299*25.00] mapped 28350 sequences
[M::worker_pipeline::383.375*25.03] mapped 28316 sequences
[M::worker_pipeline::388.662*25.07] mapped 28465 sequences
[M::worker_pipeline::393.913*25.10] mapped 28502 sequences
[M::worker_pipeline::399.091*25.13] mapped 28628 sequences
[M::worker_pipeline::403.969*25.16] mapped 28645 sequences
[M::worker_pipeline::409.048*25.19] mapped 28894 sequences
[M::worker_pipeline::414.272*25.22] mapped 28904 sequences
[M::worker_pipeline::419.351*25.25] mapped 29059 sequences
[M::worker_pipeline::424.450*25.27] mapped 28771 sequences
[M::worker_pipeline::429.490*25.30] mapped 28607 sequences
[M::worker_pipeline::434.512*25.33] mapped 28508 sequences
[M::worker_pipeline::439.872*25.35] mapped 28557 sequences
[M::worker_pipeline::445.126*25.38] mapped 28498 sequences
[M::worker_pipeline::450.113*25.40] mapped 28527 sequences
[M::worker_pipeline::455.105*25.43] mapped 28643 sequences
[M::worker_pipeline::460.128*25.45] mapped 28517 sequences
[M::worker_pipeline::465.112*25.47] mapped 28563 sequences
[M::worker_pipeline::470.274*25.50] mapped 28490 sequences
[M::worker_pipeline::475.353*25.52] mapped 28438 sequences
[M::worker_pipeline::480.818*25.54] mapped 28600 sequences
[M::worker_pipeline::485.792*25.55] mapped 28527 sequences
[M::worker_pipeline::491.032*25.57] mapped 28655 sequences
[M::worker_pipeline::495.999*25.59] mapped 28483 sequences
[M::worker_pipeline::501.014*25.61] mapped 28594 sequences
[M::worker_pipeline::506.221*25.63] mapped 28586 sequences
[M::worker_pipeline::511.279*25.65] mapped 28575 sequences
[M::worker_pipeline::516.387*25.67] mapped 28847 sequences
[M::worker_pipeline::521.596*25.68] mapped 29727 sequences
[M::worker_pipeline::526.898*25.70] mapped 29674 sequences
[M::worker_pipeline::531.847*25.72] mapped 29648 sequences
[M::worker_pipeline::536.976*25.73] mapped 29679 sequences
[M::worker_pipeline::542.105*25.75] mapped 29753 sequences
[M::worker_pipeline::547.288*25.77] mapped 29589 sequences
[M::worker_pipeline::552.040*25.79] mapped 29710 sequences
[M::worker_pipeline::557.318*25.80] mapped 29628 sequences
[M::worker_pipeline::562.295*25.82] mapped 29751 sequences
[M::worker_pipeline::567.413*25.83] mapped 29700 sequences
[M::worker_pipeline::572.441*25.85] mapped 29557 sequences
[M::worker_pipeline::577.703*25.86] mapped 29700 sequences
[M::worker_pipeline::582.735*25.88] mapped 29569 sequences
[M::worker_pipeline::587.718*25.90] mapped 29672 sequences
[M::worker_pipeline::593.071*25.91] mapped 29716 sequences
[M::worker_pipeline::598.297*25.92] mapped 29650 sequences
[M::worker_pipeline::603.672*25.93] mapped 29529 sequences
[M::worker_pipeline::608.674*25.94] mapped 29635 sequences
[M::worker_pipeline::613.772*25.96] mapped 29631 sequences
[M::worker_pipeline::618.995*25.97] mapped 29669 sequences
[M::worker_pipeline::624.039*25.98] mapped 29587 sequences
[M::worker_pipeline::629.068*25.99] mapped 29659 sequences
[M::worker_pipeline::634.670*26.00] mapped 29678 sequences
[M::worker_pipeline::639.341*26.02] mapped 29524 sequences
[M::worker_pipeline::644.364*26.03] mapped 29673 sequences
[M::worker_pipeline::649.658*26.04] mapped 29441 sequences
[M::worker_pipeline::654.492*26.05] mapped 29582 sequences
[M::worker_pipeline::659.826*26.06] mapped 29601 sequences
[M::worker_pipeline::664.714*26.07] mapped 29631 sequences
[M::worker_pipeline::669.955*26.08] mapped 29535 sequences
[M::worker_pipeline::675.257*26.10] mapped 29466 sequences
[M::worker_pipeline::680.039*26.11] mapped 29646 sequences
[M::worker_pipeline::685.096*26.12] mapped 29536 sequences
[M::worker_pipeline::690.383*26.13] mapped 29601 sequences
[M::worker_pipeline::695.650*26.14] mapped 29583 sequences
[M::worker_pipeline::700.892*26.14] mapped 29610 sequences
[M::worker_pipeline::706.057*26.15] mapped 29380 sequences
[M::worker_pipeline::711.015*26.16] mapped 29496 sequences
[M::worker_pipeline::716.026*26.18] mapped 29563 sequences
[M::worker_pipeline::721.534*26.18] mapped 29657 sequences
[M::worker_pipeline::726.695*26.19] mapped 29733 sequences
[M::worker_pipeline::731.991*26.20] mapped 29816 sequences
[M::worker_pipeline::737.421*26.21] mapped 29801 sequences
[M::worker_pipeline::742.518*26.21] mapped 29820 sequences
[M::worker_pipeline::747.650*26.22] mapped 29811 sequences
[M::worker_pipeline::752.642*26.23] mapped 29637 sequences
[M::worker_pipeline::757.913*26.24] mapped 29648 sequences
[M::worker_pipeline::763.001*26.25] mapped 29620 sequences
[M::worker_pipeline::767.937*26.26] mapped 29710 sequences
[M::worker_pipeline::772.878*26.27] mapped 29674 sequences
[M::worker_pipeline::778.160*26.28] mapped 29644 sequences
[M::worker_pipeline::783.695*26.28] mapped 29638 sequences
[M::worker_pipeline::788.732*26.29] mapped 29640 sequences
[M::worker_pipeline::794.036*26.30] mapped 29614 sequences
[M::worker_pipeline::799.076*26.31] mapped 29693 sequences
[M::worker_pipeline::804.052*26.31] mapped 29705 sequences
[M::worker_pipeline::809.066*26.32] mapped 29680 sequences
[M::worker_pipeline::814.180*26.33] mapped 29790 sequences
[M::worker_pipeline::819.261*26.34] mapped 29732 sequences
[M::worker_pipeline::820.537*26.34] mapped 8244 sequences
[M::main] Version: 2.26-r1175
[M::main] CMD: minimap2 -x asm5 -t 28 /home/wsq_pkuhpc/lustre2/user/lhy/results/genome_asm/homo_asm/ref/GCF_016881025.1_HiC_Itri_2_genomic.fna /home/wsq_pkuhpc/lustre2/user/lhy/gsgenome/01.CCS/merged_hifi_css.fasta
[M::main] Real time: 820.813 sec; CPU: 21612.236 sec; Peak RSS: 9.932 GB
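
To check whether all reads were actually used, the per-batch counts in the log can be summed and compared with the number of records in the input FASTA. As far as I understand, minimap2 prints one "mapped N sequences" line per mini-batch of input (batch size is set in bases by `-K`), so roughly 30,000 reads per line would not by itself mean that reads were skipped. Below is a minimal sketch, assuming the log format shown above; the function names are hypothetical and not part of RagTag or minimap2.

```python
import re
import sys

# Matches minimap2 progress lines such as:
# [M::worker_pipeline::42.582*5.19] mapped 33670 sequences
MAPPED_RE = re.compile(r"\[M::worker_pipeline::[^\]]*\] mapped (\d+) sequences")


def count_mapped(lines):
    """Sum the per-batch sequence counts reported in a minimap2 log."""
    total = 0
    for line in lines:
        match = MAPPED_RE.search(line)
        if match:
            total += int(match.group(1))
    return total


def count_fasta_records(lines):
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders): python check_reads.py LOG FASTA
    log_path, fasta_path = sys.argv[1], sys.argv[2]
    with open(log_path) as log_file:
        mapped = count_mapped(log_file)
    with open(fasta_path) as fasta_file:
        records = count_fasta_records(fasta_file)
    print(f"sequences processed by minimap2: {mapped}")
    print(f"records in FASTA:                {records}")
```

If the two numbers match, minimap2 read every sequence in the file, and the slow step is more likely the downstream processing of several million query sequences than the alignment itself.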

Thanks in advance for your help!
Lyu
