Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate pass config before instantiating the pass #1553

Merged
merged 1 commit into from
Jan 21, 2025
Merged

Conversation

shaahji
Copy link
Contributor

@shaahji shaahji commented Jan 15, 2025

Validate pass config before instantiating the pass

  • Use of search point should be limited to engine logic only. Rest of the Olive implementation should receive a validated configuration to use.
  • Validation is for complete configuration and not merely for a search point.
  • Fixed a few issues related to use of BasePassConfig vs. FullPassConfig.
  • Add local caching to OlivePackageConfig for loaded modules.
  • Renamed a few variables in engine logic to be explicit about use of pass config vs. pass run config.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Is this PR including examples changes? If yes, please remember to update example documentation in a follow-up PR.

(Optional) Issue link

olive/engine/engine.py Fixed Show fixed Hide fixed
@shaahji shaahji force-pushed the shaahji/validation branch from 9623d68 to e260460 Compare January 15, 2025 09:56
* Use of search point should be limited to engine logic only. Rest of the
  Olive implementation should receive a validated configuration to use.
* Validation is for complete configuration and not merely for a search point.
* Fixed a few issues related to use of BasePassConfig vs. FullPassConfig.
* Add local caching to OlivePackageConfig for loaded modules.
* Renamed a few variables in engine logic to be explicit about use of pass
  config vs. pass run config.
@shaahji shaahji force-pushed the shaahji/validation branch from e260460 to fa448fd Compare January 15, 2025 23:05
@devang-ml devang-ml requested review from jambayk and Copilot January 15, 2025 23:30

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 16 out of 31 changed files in this pull request and generated no comments.

Files not reviewed (15)
  • olive/passes/pass_config.py: Evaluated as low risk
  • olive/engine/engine.py: Evaluated as low risk
  • olive/passes/onnx/session_params_tuning.py: Evaluated as low risk
  • olive/passes/olive_pass.py: Evaluated as low risk
  • olive/passes/onnx/model_builder.py: Evaluated as low risk
  • olive/passes/onnx/nvmo_quantization.py: Evaluated as low risk
  • olive/passes/onnx/optimum_conversion.py: Evaluated as low risk
  • olive/workflows/run/config.py: Evaluated as low risk
  • olive/passes/onnx/transformer_optimization.py: Evaluated as low risk
  • olive/passes/onnx/quantization.py: Evaluated as low risk
  • olive/cache.py: Evaluated as low risk
  • olive/systems/azureml/aml_system.py: Evaluated as low risk
  • olive/systems/olive_system.py: Evaluated as low risk
  • olive/systems/docker/docker_system.py: Evaluated as low risk
  • olive/systems/isolated_ort/isolated_ort_system.py: Evaluated as low risk
Comments suppressed due to low confidence (3)

olive/passes/pytorch/capture_split_info.py:67

  • The method name validate_config is appropriate and aligns with the new approach of validating the complete configuration.
def validate_config(

olive/package_config.py:53

  • The rsplit method should be called with arguments to split only at the last dot. It should be 'rsplit('.', 1)'.
_, module_name = pass_type.rsplit

olive/systems/local.py:27

  • The docstring should be updated to reflect the removal of the point parameter.
"""Run the pass on the model."""
@@ -270,6 +270,10 @@ def validate_pass_search(cls, v, values):
)
return v

@validator("pass_flows", pre=True)
def validate_pass_flows(cls, v, values):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this can also be achieved by using default_factory=list at line 143 similar to 135

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a temporary change and is removed in follow up PR #1562

@@ -196,66 +189,32 @@ def run_engine(package_config: OlivePackageConfig, run_config: RunConfig):
engine.target_config, skip_supported_eps_check=target_not_used, is_ep_required=is_ep_required
)

pass_list = []
acc_list = []
if auto_optimizer_enabled:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this case deprecated now? When auto optimizer is enabled, run_config.pass_config is empty and AutoOptimizer generates them.
Are we only going to do auto-opt through the cli and remove the built in one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if so, we would need to first fully remove all auto_optimizer related code and examples. otherwise, this case is broken

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we only going to do auto-opt through the cli and remove the built in one?

Temporarily, yes. A new implementation of auto-optimizer will be introduced soon. @devang-ml okay-ed it being broken for the time being.

logger.warning("Invalid search point, prune")
output_model_config = INVALID_CONFIG
pass_cls: Type[Pass] = self.olive_config.import_pass_module(pass_run_config["type"])
pass_config = pass_cls.config_at_search_point(pass_search_point, accelerator_spec, pass_run_config["config"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think config_at_search_point is not required anymore since it is equivalent to

pass_config = {**pass_run_config["config"], **pass_search_point}. 

pass_run_config["config"] already has all config params - fixed and search. pass_search_point has the same keys as the search params.
This avoids the redundant, potentially expensive calls inside the method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a temporary change and is removed in follow up PR #1562

def validate_search_point(
self, search_point: Dict[str, Any], accelerator_spec: AcceleratorSpec, with_fixed_value: bool = False
@classmethod
def validate_config(
Copy link
Contributor

@jambayk jambayk Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can simplify the signature to only include config and accelerator spec. and make the assumption that config is always a full config (all pass params are present).
It is only called by engine with with_fixed_value=True currently and with the new changes, the engine always has the full config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already done as part of PR #1562. The with_fixed_value is deprecated.

if not super().validate_config(config, accelerator_spec, disable_search):
return False

config_cls, _ = cls.get_config_class(accelerator_spec, disable_search)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unnecessary and expensive? especially if we go with the assumption I suggested in olive_pass.py? we can always access config as a dictionary with all parameters present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was moving away from accessing the config as a dictionary and instead use the it as a class object. Code is more readable and browsable with intellisense. The expensive part will be optimized away in follow up PR where

  • remove dependency on accelerator_spec when building config class
  • config class will be a class member and will not be created over and over again
  • logic in default config that is dependent on accelerator spec will use search parameter

return output_model_config, None
return INVALID_CONFIG, None

p: Pass = pass_cls(accelerator_spec, pass_config, self.get_host_device())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about instantiating the pass every time in the long run. with these changes, the pass is somewhere between being stateless and stateful + the costs of instantiating the pass each time.

If the end goal is to make it fully stateless, then I don't mind it for the time being. There are somethings needed to make Pass stateless in the future:

  • host_device: this can be removed since it's a concept
  • path_params: only used by docker system but this concept is also outdated. Docker system can be updated to automatically infer paths like aml system
  • _user_module_loader: the pass run can create the user module loader itself
  • _config_class: generate_config can return the config object holding the params (both fixed and search). This way we always have access to the config class to serialize the pass config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, the pass shouldn't even be instantiated in engine at all., It should be done in System::run. Except for the case of local, all other system implementation turns the pass back into a config to write on the wire. So, instead of passing the Pass object to the system, it should use the config as the argument. That would still mean we instantiate the pass for each pass run but that would be limited to local run only.

@devang-ml can chime in on this.

@jambayk
Copy link
Contributor

jambayk commented Jan 21, 2025

As discussed offline, we will do the code improvements related to config class creation in follow up PRs.

@shaahji shaahji merged commit be1278c into main Jan 21, 2025
24 checks passed
@shaahji shaahji deleted the shaahji/validation branch January 21, 2025 22:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants