Pangenome script benchmark #11

Open
ccbaumler opened this issue Feb 27, 2024 · 0 comments

The time -v output for the non-abund pangenome database curation suggests that the resources should be set to ~20GB and 2hrs per database.

        Command being timed: "./make-pangenome-sketches.py /home/baumlerc/2022-database-covers/dbs/gtdb-rs214-k21.zip -t /home/baumlerc/2022-database-covers/dbs/gtdb-rs214.lineages.sqldb -o test-no-abund.zip -k 21 -r species --csv the-csv-file-you-desire_no-abund.csv"
        User time (seconds): 4469.89
        System time (seconds): 566.98
        Percent of CPU this job got: 83%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 1:39:58
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 17040560
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 96200
        Minor (reclaiming a frame) page faults: 88823161
        Voluntary context switches: 5672989
        Involuntary context switches: 60363
        Swaps: 0
        File system inputs: 51679104
        File system outputs: 5437344
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

When the script is run with all the bells and whistles (the same command with -a added), the time -v output suggests that the Slurm resources should be set to >40GB and >2hrs.

        Command being timed: "./make-pangenome-sketches.py /home/baumlerc/2022-database-covers/dbs/gtdb-rs214-k21.zip -t /home/baumlerc/2022-database-covers/dbs/gtdb-rs214.lineages.sqldb -o test.zip -k 21 -r species -a --csv the-csv-file-you-desire.csv"
        User time (seconds): 5800.88
        System time (seconds): 717.47
        Percent of CPU this job got: 87%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 2:04:42
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 39818112
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 96751
        Minor (reclaiming a frame) page faults: 162501748
        Voluntary context switches: 5673639
        Involuntary context switches: 67568
        Swaps: 0
        File system inputs: 51784920
        File system outputs: 5634248
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
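
For anyone repeating this benchmark, here is a minimal sketch of how the two figures that matter (maximum resident set size and elapsed wall clock time) could be pulled out of a time -v report and turned into Slurm --mem and --time values. The function names and the 20% headroom factor are assumptions made for illustration, not part of the pangenome scripts. It also assumes the "kbytes" reported by time are units of 1024 bytes.

```python
import math
import re


def parse_time_v(report):
    """Return (peak_rss_kbytes, wall_seconds) from a GNU `time -v` report."""
    rss_kb = int(
        re.search(r"Maximum resident set size \(kbytes\): (\d+)", report).group(1)
    )
    wall = re.search(
        r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\): ([\d:.]+)", report
    ).group(1)
    # The elapsed field is h:mm:ss or m:ss, so fold it from left to right.
    wall_seconds = 0.0
    for part in wall.split(":"):
        wall_seconds = wall_seconds * 60 + float(part)
    return rss_kb, wall_seconds


def suggest_slurm(rss_kb, wall_seconds, headroom=1.2):
    """Suggest (--mem, --time) strings; `headroom` is an assumed safety margin."""
    mem_gib = math.ceil(rss_kb * headroom / 1024**2)
    minutes = math.ceil(wall_seconds * headroom / 60)
    hours, mins = divmod(minutes, 60)
    return f"{mem_gib}G", f"{hours:02d}:{mins:02d}:00"


if __name__ == "__main__":
    # Figures copied from the two runs reported above.
    no_abund = (
        "Elapsed (wall clock) time (h:mm:ss or m:ss): 1:39:58\n"
        "Maximum resident set size (kbytes): 17040560\n"
    )
    with_abund = (
        "Elapsed (wall clock) time (h:mm:ss or m:ss): 2:04:42\n"
        "Maximum resident set size (kbytes): 39818112\n"
    )
    for label, report in [("no-abund", no_abund), ("abund", with_abund)]:
        mem, limit = suggest_slurm(*parse_time_v(report))
        print(f"{label}: --mem={mem} --time={limit}")
```

With the assumed 20% margin this gives 20G and 02:00:00 for the non-abund run, and 46G and 02:30:00 for the run with -a, which is in line with the estimates above.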