Hello! The issue below refers to the genesys project.
I wanted to use the SimDIT framework for performance analysis of training custom DNN architectures. A JSON file describing the entire network in detail is required, and currently only ResNet50 (for ImageNet classification) is provided as a template (for example, SimDIT/DNN_Spec_Training/ResNet50_training_Spec_Locked.json).
I tried generating such a specification file for ResNet18 on the same dataset using the compile_benchmark.py script with different configuration/architecture files. The attempt gave the following error trace:
Number fusion layers: 32
Number fusion layers: 32
Traceback (most recent call last):
File "compile_benchmark.py", line 138, in <module>
compile_benchmark(fname,
File "compile_benchmark.py", line 99, in compile_benchmark
program.compile(verbose=verbose, finalize=True, stop_stage=stop_stage)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/compiler/program.py", line 1166, in compile
codelets = self.instantiate_all_codelets(node_sequence, verbose=verbose)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/compiler/program.py", line 878, in instantiate_all_codelets
cdlt = self.instantiate_codelet(n)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/compiler/program.py", line 371, in instantiate_codelet
cdlt_template = self.get_template_through_mapping(node)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/compiler/program.py", line 359, in get_template_through_mapping
raise RuntimeError(f"Unable to match node operation to codelet with the same name:\n"
RuntimeError: Unable to match node operation to codelet with the same name:
Node operation: reduce_sum
Input shape dimensions: [4]
Output shape dimensions: [1]
Codelet: reduce_sum
Codelet shapes: [2, 1]
I also made a brief attempt at converting existing configuration files to accept a training directive (for example, by adding the line "TRAINING": true to genesys/examples/genesys/configs/benchmark_8x8.json), without success. This produced a different error trace:
Traceback (most recent call last):
File "compile_benchmark.py", line 138, in <module>
compile_benchmark(fname,
File "compile_benchmark.py", line 55, in compile_benchmark
program, _ = compile_full_model(model_name,
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/examples/genesys/genesys_network_sim.py", line 321, in compile_full_model
program = compile_genesys(model_name,
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/examples/genesys/genesys.py", line 308, in compile_genesys
graph = run_srdfg_passes(graph, def_cfg, batch_size=batch_size,
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/codelets-0.1.0-py3.8.egg/codelets/examples/genesys/genesys.py", line 227, in run_srdfg_passes
graph = fusion_pass(graph)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/polymath-0.1.0-py3.8.egg/polymath/srdfg/passes/__init__.py", line 229, in __call__
initialized_node = self.initialize_pass(gcpy, self.ctx)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/polymath-0.1.0-py3.8.egg/polymath/srdfg/passes/dnn_passes.py", line 286, in initialize_pass
self.fuse_layers(graph, fused_nodes, pf)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/polymath-0.1.0-py3.8.egg/polymath/srdfg/passes/dnn_passes.py", line 392, in fuse_layers
self.topological_insert(graph, node)
File "/home/balkon00/anaconda3/envs/verigood_ml/lib/python3.8/site-packages/polymath-0.1.0-py3.8.egg/polymath/srdfg/passes/dnn_passes.py", line 402, in topological_insert
assert all([i.name in graph.nodes for i in node.inputs])
AssertionError
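For reference, the configuration change described above can be sketched as the small script below. It is only an illustration of the edit, not something taken from the genesys repository: the helper name `enable_training` and the output file name in the usage comment are hypothetical, and the only key touched is the "TRAINING" entry mentioned earlier.

```python
import json
from pathlib import Path


def enable_training(src, dst):
    """Copy the JSON config at `src` to `dst` with "TRAINING" set to true.

    Hypothetical helper for illustration; it does not call any genesys API.
    """
    cfg = json.loads(Path(src).read_text())
    cfg["TRAINING"] = True  # the directive added by hand in the attempt above
    Path(dst).write_text(json.dumps(cfg, indent=2))
    return cfg


# Example usage (the destination file name is an assumption):
# enable_training(
#     "genesys/examples/genesys/configs/benchmark_8x8.json",
#     "genesys/examples/genesys/configs/benchmark_8x8_training.json",
# )
```

Writing to a separate file keeps the original benchmark configuration untouched, so the inference and training runs can be compared side by side.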
I ensured that all required tools/packages were installed correctly.
How can I generate the DNN training specification file correctly?