`edgeless_benchmark` is a tool that helps function developers and designers of orchestration algorithms by automating the performance evaluation of a population of workflows under controlled conditions.
The tool supports different arrival models and workflow types.
Arrival models (option `--arrival-model`):
| Arrival model | Description |
|---|---|
| `poisson` | Inter-arrival times between consecutive workflows and workflow lifetimes are both exponentially distributed. |
| `incremental` | One new workflow arrives every inter-arrival time, with constant lifetime. |
| `incr-and-keep` | Workflows with constant lifetimes are added incrementally until the warm-up period finishes, then kept until the end of the experiment. |
| `single` | A single workflow that lasts for the entire experiment. |
| `trace` | The arrival and end times of workflows are read from a file specified with `--workload-trace`, one workflow per line in the format `arrival,end_time`. |
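As an illustration, the `poisson` model (and the `trace` file format it can feed) can be sketched in a few lines of Python; the rate and lifetime values below are arbitrary:

```python
import random

def poisson_arrivals(rate, mean_lifetime, duration, seed=42):
    """Draw (arrival, end_time) pairs with exponentially distributed
    inter-arrival times and lifetimes, as in the poisson model."""
    rng = random.Random(seed)  # fixed seed for repeatable experiments
    events, t = [], 0.0
    while True:
        t += rng.expovariate(rate)
        if t >= duration:
            break
        events.append((t, t + rng.expovariate(1.0 / mean_lifetime)))
    return events

# one workflow per line in the format arrival,end_time, as accepted
# by the trace model via --workload-trace
events = poisson_arrivals(rate=1.0, mean_lifetime=5.0, duration=30.0)
trace = "\n".join(f"{arrival:.3f},{end:.3f}" for arrival, end in events)
```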
Workflow types (option `--wf-type`):
| Workflow type | Description | Application metrics | Template |
|---|---|---|---|
| `none` | No workflow is created. This option is meant only for testing/troubleshooting. | None | N |
| `single` | A single function. | -- | N |
| `matrix-mul-chain` | A chain of functions, each performing the multiplication of two matrices of 32-bit floating point random numbers at each invocation. | workflow,function | Y |
| `vector-mul-chain` | A chain of functions, each multiplying an internal random matrix of 32-bit floating point numbers by the input vector received from the caller. | workflow,function | Y |
| `map-reduce` | A workflow consisting of a random number of stages, where each stage is composed of a random number of processing blocks. Before moving to the next stage, the output from all the processing blocks of the previous stage must be received. | workflow | Y |
| `json-spec` | The workflow specified in the given JSON file. The string `@WF_ID` in the file is substituted with a sequential identifier of the workflow. | -- | N |
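For `json-spec` workflows, the `@WF_ID` substitution means that a single JSON file can drive many distinct workflow instances. A minimal sketch of the mechanism, with hypothetical field names:

```python
import json

# a hypothetical json-spec fragment; the real file is a full workflow spec
template = '{"name": "client-@WF_ID", "output": "wf@WF_ID-queue"}'

# every new workflow gets a sequential identifier substituted for @WF_ID
specs = [json.loads(template.replace("@WF_ID", str(wf_id))) for wf_id in range(3)]
```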
For all the workflow types with Y in the template column, a template can be generated by specifying `--wf-type "NAME;template"`.
For example, running:

```shell
target/debug/edgeless_benchmark --wf-type "map-reduce;template" > map_reduce.json
```

generates a template in `map_reduce.json`, which can then be loaded with:

```shell
target/debug/edgeless_benchmark --wf-type "map-reduce;map_reduce.json"
```
The duration of the experiment is configurable via a command-line option, as are the seed used to generate pseudo-random numbers (to enable repeatable experiments) and the duration of a warm-up period.
The collection of performance metrics relies on Redis: an instance must be reachable by `edgeless_benchmark` at the URL specified with the command-line argument `--redis-url` (e.g., `--redis-url redis://127.0.0.1:6379`).
If this argument is not specified, then `edgeless_benchmark` creates the workload as specified without saving any performance metrics.
When used to collect performance metrics, one EDGELESS node with the `metrics-collection` resource provider is also needed; it can be created by following the instructions in the step-by-step example below.
The command `edgeless_benchmark` and the ε-ORC both support saving run-time events during the execution, for the purpose of creating a dataset from a benchmark run.
For `edgeless_benchmark`, this option is enabled by specifying a non-empty value for the option `--dataset-path`, which defines the path prefix of the dataset files.
The dataset files are encoded in comma-separated values (CSV) format, with the first row of each file containing the column names.
Each entry is prepended with additional fields, which can be specified with `--additional-fields`, corresponding to the additional header specified with `--additional-header`.
The output files are overwritten unless the `--append` option is provided.
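As an illustration of the resulting layout (the values below are made up), a file written with `--additional-fields "a,b"` and `--additional-header "h_a,h_b"` would contain rows like these:

```python
import csv, io

# hypothetical sample of application_metrics.csv: the additional fields
# ("a", "b") come first, then entity,identifier,value,timestamp
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["h_a", "h_b", "entity", "identifier", "value", "timestamp"])
writer.writerow(["a", "b", "w", "wf0", "19.58", "1700000000.123456789"])
writer.writerow(["a", "b", "f", "fun-0", "3.14", "1700000000.500000000"])
content = buf.getvalue()
```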
For the ε-ORC, a Redis proxy must be enabled, and an optional additional section `[proxy.dataset_settings]` must be added, whose fields have the same meaning as the corresponding `edgeless_benchmark` command-line options above (see the step-by-step example below).
The dataset files produced are the following:

| Filename | Format | Produced by |
|---|---|---|
| `health_status.csv` | timestamp,node_id,node_health_status | ε-ORC |
| `capabilities.csv` | timestamp,node_id,node_capabilities | ε-ORC |
| `mapping_to_instance_id.csv` | timestamp,logical_id,node_id1,physical_id1,... | ε-ORC |
| `performance_samples.csv` | metric,identifier,value,timestamp | ε-ORC |
| `application_metrics.csv` | entity,identifier,value,timestamp | `edgeless_benchmark` |
Notes:
- The timestamp format is always A.B, where A is the Unix epoch in seconds and B is the fractional part in nanoseconds.
- All the identifiers (node_id, logical_id, and physical_id) are UUIDs.
- The field entity in the application metrics can be `f` (function) or `w` (workflow).
- Check the difference between application metrics and performance samples in the orchestration documentation.
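A small helper for parsing such timestamps in a post-processing script might look like this (a sketch, assuming the fractional part is written as up to nine decimal digits):

```python
def parse_timestamp(ts):
    """Convert an "A.B" timestamp (A: Unix epoch in seconds, B: the
    fractional part in nanoseconds) into a float number of seconds."""
    secs, _, frac = ts.partition(".")
    # pad to nine digits so a shortened fractional part keeps its
    # decimal meaning (e.g. "5" -> 500 ms, not 5 ns)
    return int(secs) + int(frac.ljust(9, "0")) / 1e9
```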
We assume that the repository has been downloaded and compiled in debug mode (see building instructions) and that a local instance of Redis is running (see online instructions).
First, build the `vector_mul.wasm` bytecode:

```shell
target/debug/edgeless_cli function build functions/vector_mul/function.json
```
Then, create the configuration files:

```shell
target/debug/edgeless_inabox -t
```
Modify the `[proxy]` section of `orchestrator.toml` as follows:

```toml
[proxy]
proxy_type = "Redis"
redis_url = "redis://127.0.0.1:6379"

[proxy.dataset_settings]
dataset_path = "dataset/myexp-"
append = true
additional_fields = "a,b"
additional_header = "h_a,h_b"
```
And create the directory where the dataset files will be saved:

```shell
mkdir dataset
```
Modify the configuration of `node.toml` so that it includes the following:

```toml
[resources.metrics_collector_provider]
collector_type = "Redis"
redis_url = "redis://localhost:6379"
provider = "metrics-collector-1"
```

Then, in one shell, start the EDGELESS in-a-box:

```shell
target/debug/edgeless_inabox
```
Then create the JSON file specifying the characteristics of the vector-mul-chain workflow:

```shell
cat << EOF > vector_mul_chain.json
{
  "min_chain_length": 5,
  "max_chain_length": 5,
  "min_input_size": 1000,
  "max_input_size": 2000,
  "function_wasm_path": "functions/vector_mul/vector_mul.wasm"
}
EOF
```
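Each function in the chain multiplies an internal random square matrix by the vector it receives and passes the result on. A plain-Python sketch of that computation (the actual functions run as WASM; the sizes here are kept small for illustration):

```python
import random

def make_vector_mul(size, seed):
    """One link of the chain: holds a random square matrix and
    multiplies it by the vector received from the caller."""
    rng = random.Random(seed)
    matrix = [[rng.random() for _ in range(size)] for _ in range(size)]
    def apply(vector):
        return [sum(row[i] * vector[i] for i in range(size)) for row in matrix]
    return apply

# a chain of length 5, matching min/max_chain_length above; the real
# input sizes are drawn between min_input_size and max_input_size
chain = [make_vector_mul(4, seed) for seed in range(5)]
v = [1.0] * 4
for function in chain:
    v = function(v)
```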
In another shell, run the following benchmark, which lasts 30 seconds:
```shell
target/debug/edgeless_benchmark \
  --redis-url redis://127.0.0.1:6379 \
  -w "vector-mul-chain;vector_mul_chain.json" \
  --dataset-path "dataset/myexp-" \
  --additional-fields "a,b" \
  --additional-header "h_a,h_b" \
  --append
```
The `dataset` directory now contains all the files in the table above, starting with the prefix `myexp-`.
An example of a post-processing script is included:

```shell
% DATASET=dataset/myexp-application_metrics.csv python documentation/examples-app-metrics.py
the average latency of wf6 was 33.23 ms
the average latency of wf4 was 69.67 ms
the average latency of wf0 was 19.58 ms
the average latency of wf1 was 50.98 ms
the average latency of wf2 was 54.84 ms
the average latency of wf5 was 72.22 ms
```
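The logic of such a script can be approximated as follows; this sketch assumes the column layout from the dataset table above, with any additional fields prepended to each row:

```python
import csv
from collections import defaultdict

def average_latency(path):
    """Average the workflow-level samples in application_metrics.csv.

    Assumes entity,identifier,value,timestamp are the last four columns,
    with any --additional-fields columns coming before them.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row with the column names
        for row in reader:
            entity, identifier, value, _timestamp = row[-4:]
            if entity == "w":  # keep only workflow entities
                sums[identifier] += float(value)
                counts[identifier] += 1
    return {wf: sums[wf] / counts[wf] for wf in sums}
```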