Commit dcdd654
Real Storage asv perf POC (#2165)
#### Reference Issues/PRs

The main idea of this change set is to provide a repeatable set of performance measurements against persistent storage, initially AWS S3 and later any other storage that we support.

Different types of storage have different performance characteristics, so using one and the same asv benchmark test for all of them is far from optimal. Persistent storages provide a way to set up the data needed for the tests once (especially for read tests, i.e. those that do not modify anything). At the same time, they should isolate that data from the other tests that do writes, by providing a temporary storage area for their use.

The approach to building this kind of test setup is first to separate the logic for setting up the libraries and symbol data from the actual asv test. In other words, each test should use another class that is specialized in setting up data, making some checks, and wiping the data out if necessary. Having a setup class that is not part of the asv test provides a way to set up the environment on demand, wipe it out, and so on, in a controlled manner. This class can easily facilitate that logic for all types of storage. In this model the asv test becomes simple to write. Most importantly, the tests are now ASV _independent_; they can be quickly moved to another tool if needed. It also gives the opportunity to reuse parts of the pre-setup logic for tests outside of asv itself. In other words, the persistent part can be shared across different types of tests.

Adding a new storage type later (Azure etc.) is very easy (provided that it is a shared type of storage similar to AWS):
- extend the logic in the base class ...
- inherit from the class of the primary implementation of the logic (AWS)
- that's it

#### What does this implement or fix?

#### Any other comments?

#### Checklist

<details>
  <summary>Checklist for code changes...</summary>

- [ ] Have you updated the relevant docstrings, documentation and copyright notice?
- [ ] Is this contribution tested against [all ArcticDB's features](../docs/mkdocs/docs/technical/contributing.md)?
- [ ] Do all exceptions introduced raise appropriate [error messages](https://docs.arcticdb.io/error_messages/)?
- [ ] Are API changes highlighted in the PR description?
- [ ] Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?
</details>

---------

Co-authored-by: Georgi Rusev <Georgi Rusev>
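The setup/test separation described above can be sketched roughly as follows. This is an illustrative sketch only: the class and method names (`StorageSetup`, `setup_shared_data`, `temp_prefix`, etc.) are hypothetical, not the actual API introduced by this PR.

```python
class StorageSetup:
    """Hypothetical setup class: owns all storage logic, independent of ASV.

    Read-only seed data lives under a shared prefix and is created once;
    tests that write get an isolated temporary prefix they can wipe.
    """

    def __init__(self, shared_prefix: str):
        self.shared_prefix = shared_prefix
        self.seeded = False

    def setup_shared_data(self) -> None:
        # Seed read-only data once per storage (expensive, done out of band).
        self.seeded = True

    def temp_prefix(self, job_id: str) -> str:
        # Isolated location for writing tests, avoiding cross-contamination.
        return f"{self.shared_prefix}/tmp_{job_id}"

    def wipe_temp(self, job_id: str) -> None:
        # Delete only the temporary area, never the shared seed data.
        pass


class ReadBenchmarks:
    """The asv test itself stays trivial: all storage logic is delegated."""

    def setup(self):
        self.storage = StorageSetup("ci_tests/_github_runner_")
        self.storage.setup_shared_data()

    def time_read(self):
        # Read from the shared, pre-seeded data; no setup logic here.
        pass
```

Because `ReadBenchmarks` contains no storage-specific code, the same `StorageSetup` class can back non-asv tests as well, and supporting a new shared storage (e.g. Azure) only requires a new subclass.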
1 parent 0b58a88 commit dcdd654

File tree

10 files changed: +2279 −17 lines


.github/actions/set_persistent_storage_env_vars/action.yml (+6 −1)

```diff
@@ -7,6 +7,7 @@ inputs:
   aws_access_key: {required: true, type: string, description: The value for the AWS Access key}
   aws_secret_key: {required: true, type: string, description: The value for the AWS Secret key}
   strategy_branch: {default: 'ignore', type: string, description: a unique combination of the parameters for the given job strategy branch, e.g. linux_cp36}
+  shared_storage_prefix: {default: 'none', type: string, description: a prefix string that will be used for persistent storage}
 runs:
   using: "composite"
   steps:
@@ -19,7 +20,11 @@ runs:
       echo "ARCTICDB_PERSISTENT_STORAGE_UNIQUE_ID=${{ github.ref_name }}_${{ github.run_id }}" >> $GITHUB_ENV
       echo "ARCTICDB_PERSISTENT_STORAGE_STRATEGY_BRANCH=${{ inputs.strategy_branch }}" >> $GITHUB_ENV
       # This is the top level path for all test, this is where to write data that should be shared between jobs (e.g. seed job)
-      echo "ARCTICDB_PERSISTENT_STORAGE_SHARED_PATH_PREFIX=ci_tests/${{ github.ref_name }}_${{ github.run_id }}" >> $GITHUB_ENV
+      if [ "${{ inputs.shared_storage_prefix }}" == "none" ]; then
+        echo "ARCTICDB_PERSISTENT_STORAGE_SHARED_PATH_PREFIX=ci_tests/${{ github.ref_name }}_${{ github.run_id }}" >> $GITHUB_ENV
+      else
+        echo "ARCTICDB_PERSISTENT_STORAGE_SHARED_PATH_PREFIX=ci_tests/${{ inputs.shared_storage_prefix }}" >> $GITHUB_ENV
+      fi
       # This is a path that should be used for specific job and its tests to avoid cross contamination and race conditions
       echo "ARCTICDB_PERSISTENT_STORAGE_UNIQUE_PATH_PREFIX=ci_tests/${{ github.ref_name }}_${{ github.run_id }}_${{ inputs.strategy_branch }}" >> $GITHUB_ENV
       # S3 Specific
```
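The prefix selection added above can be mirrored in plain Python to make the behaviour explicit. The helper name `shared_path_prefix` is hypothetical; it only reproduces the bash `if` in the action: when `shared_storage_prefix` is left at its `'none'` default, the prefix is derived from the ref name and run id, otherwise the caller-supplied prefix wins.

```python
def shared_path_prefix(ref_name: str, run_id: str,
                       shared_storage_prefix: str = "none") -> str:
    """Mirror of the action's bash logic (illustrative helper, not real code).

    'none' is a sentinel meaning "no override": fall back to a
    per-run path so concurrent workflow runs cannot collide.
    """
    if shared_storage_prefix == "none":
        return f"ci_tests/{ref_name}_{run_id}"
    return f"ci_tests/{shared_storage_prefix}"
```

This is why `benchmark_commits.yml` passes `shared_storage_prefix: "_github_runner_"`: every benchmark run reuses one stable shared path instead of re-seeding data per run.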

.github/workflows/analysis_workflow.yml (+18 −1)

```diff
@@ -9,6 +9,14 @@ on:
         description: Tag of the ArcticDB development image to use for benchmark and code coverage flows
         type: string
         default: latest
+      suite_to_run:
+        description: Run LMDB suite or REAL storage
+        type: choice
+        options:
+          - 'LMDB'
+          - 'REAL'
+        default: 'LMDB'
+
 
   schedule: # Schedule the job to run at 12 a.m. daily
     - cron: '0 0 * * *'
@@ -50,6 +58,7 @@ jobs:
       run_all_benchmarks: ${{ inputs.run_all_benchmarks || false }}
       run_on_pr_head: ${{ github.event_name == 'pull_request_target' }}
       dev_image_tag: ${{ inputs.dev_image_tag || 'latest' }}
+      suite_to_run: ${{ inputs.suite_to_run || 'LMDB'}}

   publish_benchmark_results_to_gh_pages:
     name: Publish benchmark results to gh-pages
@@ -106,6 +115,7 @@ jobs:

   run-asv-check-script:
+    name: Executes asv tests checks
     timeout-minutes: 120
     runs-on: ubuntu-latest
     container: ghcr.io/man-group/arcticdb-dev:latest
@@ -141,6 +151,13 @@ jobs:
           cmake -P cpp/CMake/CpuCount.cmake | sed 's/^-- //' | tee -a $GITHUB_ENV
         env:
           CMAKE_BUILD_PARALLEL_LEVEL: ${{vars.CMAKE_BUILD_PARALLEL_LEVEL}}
+
+      - name: Set persistent storage variables
+        uses: ./.github/actions/set_persistent_storage_env_vars
+        with:
+          bucket: "arcticdb-asv-real-storage"
+          aws_access_key: "${{ secrets.AWS_S3_ACCESS_KEY }}"
+          aws_secret_key: "${{ secrets.AWS_S3_SECRET_KEY }}"

       - name: Install ASV
         shell: bash -el {0}
@@ -152,7 +169,7 @@ jobs:

       - name: Build project for ASV
         run: |
-          python -m pip install -ve .
+          python -m pip install -ve .[Testing]

       - name: Run ASV Tests Check script
         run: |
```

.github/workflows/benchmark_commits.yml (+9 −1)

```diff
@@ -6,6 +6,7 @@ on:
       commit: {required: true, type: string, description: commit hash that will be benchmarked}
       run_on_pr_head: {required: false, default: false, type: boolean, description: Specifies if the benchmark should run on PR head branch}
       dev_image_tag: {required: false, default: 'latest', type: string, description: Tag of the ArcticDB development image}
+      suite_to_run: {required: true, type: string, default: 'LMDB', description: Default benchmark on 'LMDB' storage (or 'REAL' storage)}
 jobs:
   start_ec2_runner:
     uses: ./.github/workflows/ec2_runner_jobs.yml
@@ -62,6 +63,7 @@ jobs:
           aws_access_key: "${{ secrets.AWS_S3_ACCESS_KEY }}"
           aws_secret_key: "${{ secrets.AWS_S3_SECRET_KEY }}"
           strategy_branch: "${{ inputs.commit }}"
+          shared_storage_prefix: "_github_runner_"

       - name: Install ASV
         shell: bash -el {0}
@@ -76,7 +78,13 @@ jobs:
         shell: bash -el {0}
         run: |
           git config --global --add safe.directory .
-          python -m asv run -v --show-stderr ${{ inputs.commit }}^!
+          if [ "${{ github.event.inputs.suite_to_run }}" == "REAL" ]; then
+            SUITE='^(real_).*'
+          else
+            SUITE='^(?!real_).*'
+          fi
+          echo "selection SUITE=$SUITE"
+          python -m asv run -v --show-stderr --bench $SUITE ${{ inputs.commit }}^!

       - name: Benchmark against master
         if: github.event_name == 'pull_request_target' && inputs.run_all_benchmarks == false
```
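The `--bench` patterns in the hunk above rely on the convention that real-storage benchmark modules are named with a `real_` prefix; asv matches `--bench` as a Python regex, so `^(?!real_).*` uses a negative lookahead to select everything *except* those. A quick check of how the two patterns partition benchmark names (the names below are made-up examples, not actual benchmarks from this PR):

```python
import re

# Hypothetical benchmark names following the real_ prefix convention.
benchmarks = [
    "real_s3_read.ReadBench.time_read",
    "basic_functions.BasicFunctions.time_read",
]

# SUITE='^(real_).*'  -> only real-storage benchmarks
real_suite = [b for b in benchmarks if re.match(r"^(real_).*", b)]

# SUITE='^(?!real_).*' -> everything else (the LMDB suite)
lmdb_suite = [b for b in benchmarks if re.match(r"^(?!real_).*", b)]
```

Every benchmark falls into exactly one of the two suites, which is what lets a single workflow input (`suite_to_run`) switch between LMDB and real-storage runs.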
