chore: add APP_INSIGHTS_ID to image build #1266

Open
wants to merge 2 commits into base: main

Conversation

alexcastilio
Contributor

@alexcastilio alexcastilio commented Jan 24, 2025

Description

  • Add APP_INSIGHTS_ID to image build to allow monitoring through heartbeat during test execution.
  • Add test to configuration packages (for both operator and daemonset)
  • Make heartbeat interval configurable
    • New keys in deploy/standard/manifests/controller/helm/retina/values.yaml

Related Issue

If this pull request is related to any issue, please mention it here. Additionally, make sure that the issue is assigned to you before submitting this pull request.

Checklist

  • I have read the contributing documentation.
  • I signed and signed-off the commits (git commit -S -s ...). See this documentation on signing commits.
  • I have correctly attributed the author(s) of the code.
  • I have tested the changes locally.
  • I have followed the project's style guidelines.
  • I have updated the documentation, if necessary.
  • I have added tests, if applicable.

Screenshots (if applicable) or Testing Completed

A manual test was also done to validate that the configuration propagates from the Helm chart values.yaml file to both the operator and the agent. The configured value was printed to the log:
image
image

Additional Notes

Add any additional notes or context about the pull request here.


Please refer to the CONTRIBUTING.md file for more information on how to contribute to this project.

Signed-off-by: Alex Castilio dos Santos <[email protected]>
@alexcastilio alexcastilio force-pushed the alexcastilio/app-insights-in-image branch 3 times, most recently from fd1f6a8 to e175158 Compare January 24, 2025 14:39
@alexcastilio alexcastilio marked this pull request as ready for review January 24, 2025 14:40
@alexcastilio alexcastilio requested a review from a team as a code owner January 24, 2025 14:40
@alexcastilio alexcastilio force-pushed the alexcastilio/app-insights-in-image branch from e175158 to 585155c Compare January 24, 2025 14:52
Signed-off-by: Alex Castilio dos Santos <[email protected]>
@alexcastilio alexcastilio force-pushed the alexcastilio/app-insights-in-image branch from 585155c to b82c68c Compare January 24, 2025 15:18
@@ -35,5 +37,10 @@ func GetConfig(cfgFileName string) (*OperatorConfig, error) {
		return nil, fmt.Errorf("error unmarshalling config: %w", err)
	}

	// If unset, default telemetry interval to 5 minutes.
	if cfg.TelemetryInterval == 0 {
		cfg.TelemetryInterval = 5 * time.Minute
Member

let's keep to 15min, and since this is a configmap change we'll need to notify a few people

https://github.com/microsoft/retina/pull/1266/files#diff-754f5ed2c6297aa0f46f6627521cda46a66d540bcdc9713857a6a84f7fc7358aL52

Contributor

time to add CODEOWNERS?

Collaborator

I think it would be better if a zero value TelemetryInterval means no telemetry/the previous behavior. That keeps any changed behavior intentionally opt-in, backcompat, and obvious

Contributor Author

@alexcastilio alexcastilio Jan 27, 2025

@matmerr currently it is 15 min for the daemonset heartbeat. This config is for operator, which is 5 min. No change in default values:
https://github.com/microsoft/retina/pull/1266/files#diff-3707c2e5d521fb2df733befba56a5cf6044f7a0d08e8a2ffd1b8ce2a2351731dL59

@rbtr currently a different key in the yaml will enable telemetry:

Do you think the enableTelemetry key should be removed, keeping only telemetryInterval (with telemetry disabled when the interval is set to 0)?

Member

@huntergregory is there a specific reason operator heartbeat is so frequent at 5min?

Member

I think it would be better if a zero value TelemetryInterval means no telemetry/the previous behavior. That keeps any changed behavior intentionally opt-in, backcompat, and obvious

@rbtr in this specific case I think reading telemetryInterval as 0 may require a user to jump to the docs. I could see it being interpreted either as "disabled, don't send telemetry at all" or as "send telemetry in a continuous stream with a 0 second delay". I kinda prefer keeping the whole telemetry functionality behind the explicit enableTelemetry bool

Collaborator

if there's a separate enable flag that bypasses this entirely, no objection. let's just make sure that if this is unset it defaults to the previous behavior

@alexcastilio alexcastilio requested a review from rbtr January 29, 2025 17:19
@@ -9,3 +9,4 @@ metricsIntervalDuration: "10s"
# used to export telemetry to AppInsights
telemetryEnabled: true
dataAggregationLevel: "low"
telemetryInterval: "15m"
Member

we can be careful, but is it possible to add a test around the behavior if someone just uses 15 instead of 15m?

4 participants