instrument NADE commands using metric spans #1844

sfc-gh-mchok · 2024-11-08T19:35:04Z

Pre-review checklist

I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
I've added or updated automated unit tests to verify correctness of my new code.
I've added or updated integration tests to verify correctness of my new code.
I've confirmed that my changes are working by executing CLI's commands manually on MacOS.
I've confirmed that my changes are working by executing CLI's commands manually on Windows.
I've confirmed that my changes are up-to-date with the target branch.
I've described my changes in the release notes.
I've described my changes in the section below.

Changes description

Added metric spans to telemetry payload being sent
Instrumented native app commands with metric spans API
Added integ tests to verify the proper spans are being sent in telemetry for example commands (see tests_integration/nativeapp/test_metrics.py for examples of traces being sent)
Refactored test files to fit with convention, combining metrics counters + spans tests

An example trace of snow app run looks like the following (truncated for brevity):

{
        'message': {
            'command_execution_time': 30.307388332905248,
            'counters': {
                'features.global.pdf_templates': 1,
                'features.global.templates_processor': 1,
                'features.app.snowpark_processor': 0,
                'features.app.post_deploy_scripts': 1,
                'features.app.event_sharing': 0,
                'features.app.event_sharing_warning': 0,
                'features.app.event_sharing_error': 0
            },
            'spans': {
                'completed_spans': [{
                    'id': 'a45d527defc44719a0b0d8e8ab9d02b5',
                    'name': 'action.app.deploy',
                    'start_time': 0.6444029589183629,
                    'parent': None,
                    'parent_id': None,
                    'execution_time': 27.891165500041097,
                    'error': None,
                    'span_count_in_subtree': 10,
                    'span_depth': 1,
                    'trimmed': False
                }, {
                    'id': '8dd033f84f4f481b8542fff24a759ccf',
                    'name': 'action.app_pkg.deploy',
                    'start_time': 0.644967291969806,
                    'parent': 'action.app.deploy',
                    'parent_id': 'a45d527defc44719a0b0d8e8ab9d02b5',
                    'execution_time': 11.048065917100757,
                    'error': None,
                    'span_count_in_subtree': 7,
                    'span_depth': 2,
                    'trimmed': False
                }, {
                    'id': '088ae6c6ac0947e9ac8796fe83e60dd9',
                    'name': 'bundle',
                    'start_time': 0.6450087090488523,
                    'parent': 'action.app_pkg.deploy',
                    'parent_id': '8dd033f84f4f481b8542fff24a759ccf',
                    'execution_time': 0.0043289999011904,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 3,
                    'trimmed': False
                }, {
                    'id': 'ae49cdc6fe944579a057fa4850d92c1d',
                    'name': 'artifact_processors',
                    'start_time': 0.6495900419540703,
                    'parent': 'action.app_pkg.deploy',
                    'parent_id': '8dd033f84f4f481b8542fff24a759ccf',
                    'execution_time': 0.00230620801448822,
                    'error': None,
                    'span_count_in_subtree': 2,
                    'span_depth': 3,
                    'trimmed': False
                }, {
                    'id': 'b83c2257cedd41a38a89bc79b60f0214',
                    'name': 'templates_processor',
                    'start_time': 0.6496683340519667,
                    'parent': 'artifact_processors',
                    'parent_id': 'ae49cdc6fe944579a057fa4850d92c1d',
                    'execution_time': 0.0022297080140560865,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 4,
                    'trimmed': False
                }, {
                    'id': '399ba61144824d25861401573f9447c4',
                    'name': 'sync_deploy_root_with_stage',
                    'start_time': 2.6214112089946866,
                    'parent': 'action.app_pkg.deploy',
                    'parent_id': '8dd033f84f4f481b8542fff24a759ccf',
                    'execution_time': 5.49098225007765,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 3,
                    'trimmed': False
                }, {
                    'id': '483b5fe50c1c4e6ea4e81598a0bf926d',
                    'name': 'post_deploy_hooks',
                    'start_time': 8.112657208926976,
                    'parent': 'action.app_pkg.deploy',
                    'parent_id': '8dd033f84f4f481b8542fff24a759ccf',
                    'execution_time': 3.1565516660921276,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 3,
                    'trimmed': False
                }, {
                    'id': 'cd065a3c3148457d82d45f9b62d3ec62',
                    'name': 'validate_setup_script',
                    'start_time': 11.27043924992904,
                    'parent': 'action.app_pkg.deploy',
                    'parent_id': '8dd033f84f4f481b8542fff24a759ccf',
                    'execution_time': 0.422550291987136,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 3,
                    'trimmed': False
                }, {
                    'id': '99829e4b459b48fb92e111c951f7f58c',
                    'name': 'update_app_object',
                    'start_time': 11.693080459022895,
                    'parent': 'action.app.deploy',
                    'parent_id': 'a45d527defc44719a0b0d8e8ab9d02b5',
                    'execution_time': 16.84247687482275,
                    'error': None,
                    'span_count_in_subtree': 2,
                    'span_depth': 2,
                    'trimmed': False
                }, {
                    'id': '7f79381f00f442bc8d499e7950ca9ac0',
                    'name': 'post_deploy_hooks',
                    'start_time': 22.447485291864723,
                    'parent': 'update_app_object',
                    'parent_id': '99829e4b459b48fb92e111c951f7f58c',
                    'execution_time': 6.088065165793523,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 3,
                    'trimmed': False
                }, {
                    'id': '71938e6500ac427db621280c229bdf4e',
                    'name': 'get_snowsight_url_for_app',
                    'start_time': 28.535641624825075,
                    'parent': None,
                    'parent_id': None,
                    'execution_time': 1.8380501249339432,
                    'error': None,
                    'span_count_in_subtree': 1,
                    'span_depth': 1,
                    'trimmed': False
                }],
                'num_spans_past_depth_limit': 0,
                'num_spans_past_total_limit': 0
            }
        }

src/snowflake/cli/_plugins/nativeapp/entities/application.py

tests_integration/nativeapp/test_feature_metrics.py

sfc-gh-mchok · 2024-11-11T22:21:17Z

tests_integration/nativeapp/test_command_metric_counters.py

file renamed from test_feature_metrics since it was going to be too large with the spans, and also converted the tests to use the factories

Generally the test directory mirrors the layout of the source directory so that tests are easy to locate. Breaking down the files like this because of some arbitrary size constraint makes things easier to manage, and makes it easier for us to review the changes because we can no longer see the diff. Please follow the convention instead.

i don't follow this comment, did you mean to say it makes it harder if we split the test file?

That's fair, I merged the new integ tests and also the old unit tests to test_metrics.py in their respective folders, and also adjusted the codeowners file accordingly, let me know if this works for you!

sfc-gh-mchok · 2024-11-11T22:25:56Z

tests_integration/nativeapp/test_command_metric_counters.py

+    [
+        (
+            "app deploy",
+            lambda: ProjectV11Factory(


This is the best I could come up with to have parametrization with factories, if anyone has a better approach I'd be interested

what do we need parametrized tests at all? I don't think I can read this anymore. I think overall we make way, way too frequent use of parametrized tests in our codebase, but that's a different story.

if the arguments to the factories are the same, you can parametrize just on the class and instantiate it in the test

@pytest.mark.parametrize("factory_class", [FooFactory, BarFactory]) def test_factory(factory_class): obj = factory_class(arg=123)

I ended up getting rid of all the parametrization and separating the tests out since the factories didn't have the same arguments, I think it's more clear and separated this way, let me know what you think

sfc-gh-bdufour · 2024-11-12T13:19:16Z

description + pre-review checklist aren't optional, please fill them in

sfc-gh-bdufour · 2024-11-12T13:23:50Z

tests_integration/nativeapp/test_command_metric_counters.py

Generally the test directory mirrors the layout of the source directory so that tests are easy to locate. Breaking down the files like this because of some arbitrary size constraint makes things easier to manage, and makes it easier for us to review the changes because we can no longer see the diff. Please follow the convention instead.

sfc-gh-bdufour · 2024-11-12T13:24:43Z

src/snowflake/cli/_plugins/nativeapp/codegen/compiler.py

@@ -88,7 +88,7 @@ def compile_artifacts(self):
        if not self._should_invoke_processors():
            return

-        with cc.phase("Invoking artifact processors"):
+        with cc.phase("Invoking artifact processors", span_name="compile_artifacts"):


there's no compilation happening. How about simply "artifact_processors"? We should probably also have individual processors as sub-spans.

sfc-gh-bdufour · 2024-11-12T13:25:43Z

src/snowflake/cli/api/cli_global_context.py

@@ -213,6 +214,26 @@ def get_cli_context() -> _CliGlobalContextAccess:
    return _CliGlobalContextAccess(get_cli_context_manager())


+def start_cli_metrics_span(span_name: str):


I don't think "start" makes sense here, since this both starts and stops the span at the appropriate time. Let's workshop the name of this + the start_span method it calls.

+1, I feel @span would suffice

Changed both the method name and the decorator name to just span, let me know if this works for you

sfc-gh-bdufour · 2024-11-12T13:28:01Z

src/snowflake/cli/_plugins/nativeapp/entities/application.py

@@ -420,6 +421,7 @@ def get_objects_owned_by_application(self) -> List[ApplicationOwnedObject]:
            ).fetchall()
            return [{"name": row[1], "type": row[2]} for row in results]

+    @start_cli_metrics_span("create_or_upgrade_app")


spans shouldn't be implementation-specific, but rather represent a logic step in the overall operation being performed.

Adjusted the names and locations of all the spans to better encapsulate this, let me know if they work for you!

sfc-gh-bdufour · 2024-11-12T13:28:54Z

src/snowflake/cli/_plugins/nativeapp/entities/application_package.py

@@ -915,6 +916,7 @@ def validate_setup_script(
            if validation_result["status"] == "FAIL":
                raise SetupScriptFailedValidation()

+    @start_cli_metrics_span("get_validation_result")


ditto, let's not name things according to the implementation. That can change over time.

sfc-gh-bdufour · 2024-11-12T13:30:59Z

src/snowflake/cli/api/console/abc.py

@@ -68,6 +68,7 @@ def phase(
        self,
        enter_message: str,
        exit_message: Optional[str] = None,
+        span_name: Optional[str] = None,


not sure if this is truly needed, since we can use multiple context managers on the same line. Thoughts?

I forgot about that feature, I adjusted these instances, thanks for the reminder

sfc-gh-bdufour · 2024-11-12T13:31:33Z

src/snowflake/cli/api/entities/utils.py

@@ -304,6 +308,7 @@ def validation_item_to_str(item: dict[str, str | int]):
    return s


+@start_cli_metrics_span("drop_generic_object")


sfc-gh-bdufour · 2024-11-12T13:32:43Z

tests_integration/nativeapp/test_command_metric_counters.py

+    [
+        (
+            "app deploy",
+            lambda: ProjectV11Factory(


what do we need parametrized tests at all? I don't think I can read this anymore. I think overall we make way, way too frequent use of parametrized tests in our codebase, but that's a different story.

… un-parametrize tests, merge test files to keep with covention

sfc-gh-mchok · 2024-11-12T20:08:38Z

description + pre-review checklist aren't optional, please fill them in

My mistake, I usually do it at the same time as when I make the PR but I started it as a draft this time and forgot to do it when I set it as ready for review, should be populated now!

sfc-gh-mchok · 2024-11-12T20:54:16Z

src/snowflake/cli/api/metrics.py

-        init=False, default_factory=lambda: time.monotonic()
-    )
+    # start time of the step from a performance counter in order to calculate execution time
+    _start_time: float = field(init=False, default_factory=time.perf_counter)


Ran into failing integ tests where the ordering of the spans was not deterministic, did some more research and found that time.perf_counter should just be a better version of time.monotonic with good granularity on Windows, so replaced those calls here

I don't get that part - Why is time determining the order? The most accurate way is first call goes first. Or are we dealing with multithreads here?

On Windows, time.monotonic() does not have as much precision, so multiple spans would end up having the same start time, so when we sort them by start time (to make it so that the first call goes first) the order ends up not being the same as the order as when we start the spans. This is fixed with time.perf_counter()

weird that the sorting function wasn't stable though. IMHO regardless of this "fix", we shouldn't rely on the uniqueness of timestamps and switch to a stable sort regardless.

sfc-gh-mchok · 2024-11-12T21:41:25Z

tests/api/test_metrics.py

-def mock_time_monotonic():
-    with mock.patch("time.monotonic", side_effect=count()) as mocked_time_monotonic:
-        yield mocked_time_monotonic
+def test_metrics_no_counters():


All counter tests moved from test_metrics_counters.py to consolidate the files and keep to the convention

sfc-gh-mchok · 2024-11-12T21:42:20Z

tests_integration/nativeapp/test_metrics.py

+
+@pytest.mark.integration
+@mock.patch("snowflake.connector.telemetry.TelemetryClient.try_add_log_to_batch")
+def test_feature_counters_v2_post_deploy_not_available_in_bundle(


counter tests moved from test_feature_metrics.py to consolidate them in one file and keep to the convention, and converted to use factories

src/snowflake/cli/_plugins/nativeapp/entities/application_package.py

src/snowflake/cli/_plugins/nativeapp/entities/application.py

…ent-nade-cmds

sfc-gh-bdufour · 2024-11-13T21:49:46Z

src/snowflake/cli/_plugins/nativeapp/artifacts.py

@@ -649,6 +650,7 @@ def resolve_without_follow(path: Path) -> Path:
    return Path(os.path.abspath(path))


+@span("build_initial_bundle")


why not just bundle ?

In my head bundle is the higher level container that does both this step (populating the deploy root) and then applying all the processors on it, but if we're fine just having the first step be called bundle then that's good with me

sfc-gh-bdufour · 2024-11-13T21:51:02Z

src/snowflake/cli/_plugins/nativeapp/entities/application.py

@@ -355,6 +355,7 @@ def post_deploy_hooks(self) -> list[PostDeployHook] | None:
        model = self._entity_model
        return model.meta and model.meta.post_deploy

+    @span("deploy_app")


Maybe we should consider some kind of standardized scheme for actions + entities, maybe something like action.app.deploy ?

src/snowflake/cli/_plugins/nativeapp/entities/application.py

src/snowflake/cli/_plugins/nativeapp/entities/application_package.py

sfc-gh-bdufour · 2024-11-13T21:56:38Z

src/snowflake/cli/_plugins/nativeapp/entities/application_package.py

@@ -544,6 +546,7 @@ def _bundle(self):
        compiler.compile_artifacts()
        return bundle_map

+    @span("deploy_app_package")


would rather do this on the action for consistency

sfc-gh-bdufour · 2024-11-13T21:57:34Z

src/snowflake/cli/api/console/console.py

@@ -68,7 +68,11 @@ def _format_message(self, message: str, output: Output) -> Text:
        return text

    @contextmanager
-    def phase(self, enter_message: str, exit_message: Optional[str] = None):
+    def phase(


you're no longer touching this, I'd revert the file entirely

sfc-gh-bdufour · 2024-11-13T22:00:53Z

src/snowflake/cli/api/entities/utils.py

@@ -76,6 +76,7 @@ def _get_stage_paths_to_sync(
    return stage_paths


+@span("sync_remote_and_local_files")


this is one of the rare cases where I'd use the actual function name, since it's exactly the logical concept we're timing.

draft PR for feedback on instrumentation

a2a833e

sfc-gh-mchok self-assigned this Nov 8, 2024

sfc-gh-mchok added 3 commits November 8, 2024 14:35

remove changes on loggers

46a3028

Merge branch 'main' into mchok-metric-spans-instrument-nade-cmds

cf3a339

extract enum value for proper key to message dict

d26af10

sfc-gh-fcampbell reviewed Nov 11, 2024

View reviewed changes

src/snowflake/cli/_plugins/nativeapp/entities/application.py Outdated Show resolved Hide resolved

tests_integration/nativeapp/test_feature_metrics.py Outdated Show resolved Hide resolved

sfc-gh-mchok added 2 commits November 11, 2024 17:19

write custom decorator for spans, add tests

6b3d12e

Merge branch 'main' into mchok-metric-spans-instrument-nade-cmds

ca990d4

sfc-gh-mchok commented Nov 11, 2024

View reviewed changes

sfc-gh-mchok marked this pull request as ready for review November 11, 2024 22:21

sfc-gh-mchok requested review from a team as code owners November 11, 2024 22:21

sfc-gh-mchok commented Nov 11, 2024

View reviewed changes

sfc-gh-bdufour reviewed Nov 12, 2024

View reviewed changes

rename spans as logical steps, remove extra param from console phase,…

4edd8b2

… un-parametrize tests, merge test files to keep with covention

sfc-gh-mchok added 2 commits November 12, 2024 15:27

adjust tests for new spans

f99a275

time.monotonic() -> time.perf_counter() for proper ordering on windows

db00d94

sfc-gh-mchok commented Nov 12, 2024

View reviewed changes

add span for getting snowsight url since it is a blind spot

84209f0

sfc-gh-mchok commented Nov 12, 2024

View reviewed changes

Merge branch 'main' into mchok-metric-spans-instrument-nade-cmds

506bae2

sfc-gh-fcampbell reviewed Nov 13, 2024

View reviewed changes

src/snowflake/cli/_plugins/nativeapp/entities/application_package.py Outdated Show resolved Hide resolved

sfc-gh-fcampbell reviewed Nov 13, 2024

View reviewed changes

src/snowflake/cli/_plugins/nativeapp/entities/application.py Show resolved Hide resolved

sfc-gh-mchok added 4 commits November 13, 2024 11:59

Merge branch 'main' into mchok-metric-spans-instrument-nade-cmds

7846c43

Merge remote-tracking branch 'origin' into mchok-metric-spans-instrum…

ec44e0c

…ent-nade-cmds

add event sharing counters from merge conflicts to new file

d5e2376

add span for create_or_upgrade_app to address blind spot

15955f7

sfc-gh-bdufour reviewed Nov 13, 2024

View reviewed changes

sfc-gh-mchok added 5 commits November 14, 2024 09:33

Merge branch 'main' into mchok-metric-spans-instrument-nade-cmds

1b1d99a

Merge branch 'main' into mchok-metric-spans-instrument-nade-cmds

a32a21b

adjust span names for consistency

b77d135

add docstring for entity action span decorator

cef9f19

fix spans bundle test

8a45192

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

instrument NADE commands using metric spans #1844

instrument NADE commands using metric spans #1844

sfc-gh-mchok commented Nov 8, 2024 •

edited

Loading

sfc-gh-mchok Nov 11, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-fcampbell Nov 12, 2024 •

edited

Loading

sfc-gh-mchok Nov 12, 2024

sfc-gh-mchok Nov 11, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-fcampbell Nov 12, 2024

sfc-gh-mchok Nov 12, 2024

sfc-gh-bdufour commented Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-melnacouzi Nov 12, 2024

sfc-gh-mchok Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-mchok Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-mchok Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-bdufour Nov 12, 2024

sfc-gh-mchok commented Nov 12, 2024

sfc-gh-mchok Nov 12, 2024 •

edited

Loading

sfc-gh-melnacouzi Nov 14, 2024

sfc-gh-mchok Nov 14, 2024

sfc-gh-bdufour Nov 14, 2024 •

edited

Loading

sfc-gh-mchok Nov 12, 2024

sfc-gh-mchok Nov 12, 2024

sfc-gh-bdufour Nov 13, 2024

sfc-gh-mchok Nov 14, 2024

sfc-gh-bdufour Nov 13, 2024

sfc-gh-bdufour Nov 13, 2024

sfc-gh-bdufour Nov 13, 2024

sfc-gh-bdufour Nov 13, 2024

		@@ -213,6 +214,26 @@ def get_cli_context() -> _CliGlobalContextAccess:
		return _CliGlobalContextAccess(get_cli_context_manager())


		def start_cli_metrics_span(span_name: str):

		@@ -304,6 +308,7 @@ def validation_item_to_str(item: dict[str, str \| int]):
		return s


		@start_cli_metrics_span("drop_generic_object")

		@@ -649,6 +650,7 @@ def resolve_without_follow(path: Path) -> Path:
		return Path(os.path.abspath(path))


		@span("build_initial_bundle")

		@@ -76,6 +76,7 @@ def _get_stage_paths_to_sync(
		return stage_paths


		@span("sync_remote_and_local_files")

instrument NADE commands using metric spans #1844

Are you sure you want to change the base?

instrument NADE commands using metric spans #1844

Conversation

sfc-gh-mchok commented Nov 8, 2024 • edited Loading

Pre-review checklist

Changes description

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sfc-gh-fcampbell Nov 12, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sfc-gh-bdufour commented Nov 12, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sfc-gh-mchok commented Nov 12, 2024

sfc-gh-mchok Nov 12, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sfc-gh-bdufour Nov 14, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sfc-gh-mchok commented Nov 8, 2024 •

edited

Loading

sfc-gh-fcampbell Nov 12, 2024 •

edited

Loading

sfc-gh-mchok Nov 12, 2024 •

edited

Loading

sfc-gh-bdufour Nov 14, 2024 •

edited

Loading