Remove experiment tracking #2202

Open
6 of 7 tasks
astrojuanlu opened this issue Nov 22, 2024 · 9 comments

@astrojuanlu
Member

astrojuanlu commented Nov 22, 2024

After a discussion in which we considered keeping Experiment Tracking as-is, doing more development work (for example #2152), removing it (#1831), and making it optional (#2079), we ended up picking complete removal with a heavy heart. There isn't nearly enough user traction, and we aren't confident we can make the feature good enough to make people switch over from the alternatives.

I think we can proceed without a deprecation warning. This would be a breaking release.

As a follow-up, we'll also hold a retrospective to learn from this experience.

@tynandebold
Member

Can I work on this, at least removing the frontend part?

cc @rashidakanchwala

@astrojuanlu
Member Author

I have no problem with that @tynandebold!

@astrojuanlu
Member Author

@rashidakanchwala will make the call

@Huongg
Contributor

Huongg commented Nov 26, 2024

After discussing with @tynandebold, here are the three subtasks that need to be completed under this ticket:

  1. Remove Front-End Components: This includes the UI, e2e tests, and Apollo queries. Estimate: 1–3 points
  2. Remove Back-End Components: This includes GraphQL Strawberry, the SQL session store, and associated tests. Estimate: 3–5 points
  3. Update Documentation and References: Remove any references to ET, including in Medium articles and YouTube content, and draft a release announcement to address the breaking changes. Estimate: 1–3 points

All of this should be done in a single feature branch so that the work can easily be reused in the future.

@ankatiyar
Contributor

While reviewing kedro-org/kedro-starters#262, I was wondering if we need the spaceflights-pandas-viz and the spaceflights-pyspark-viz starters and the "kedro-viz" tool in the project creation flow at all?

The difference between spaceflights-pandas and spaceflights-pyspark-viz is a "reporting" pipeline AND the experiment tracking stuff in the settings.py. Kedro-Viz is in the requirements of all spaceflights starters by default. The "reporting" pipeline is not a part of the project unless you select --tools=viz --example=y. With experiment tracking gone, we could probably get rid of the extra tool and starters? 🤔

@ravi-kumar-pilla
Contributor

> The difference between spaceflights-pandas and spaceflights-pyspark-viz is a "reporting" pipeline AND the experiment tracking stuff in the settings.py. Kedro-Viz is in the requirements of all spaceflights starters by default. The "reporting" pipeline is not a part of the project unless you select --tools=viz --example=y. With experiment tracking gone, we could probably get rid of the extra tool and starters? 🤔

That is a good call. I think that without experiment tracking, we can actually remove both starters. However, that would also affect Kedro-Viz's visibility.

@ravi-kumar-pilla
Contributor

Hi Team,

Since we are removing tracking datasets, we also need to work on the following:

  1. Move the kedro-catalog JSON schema to kedro-datasets (kedro#4258). The JSON schema is already present in kedro-datasets; it needs to be removed from kedro (kedro/static/jsonschema).
  2. Since the JSON schema is part of kedro-datasets, the schema files need to be renamed to reflect kedro-datasets releases instead of kedro-catalog (i.e., kedro-datasets-x.x.json instead of kedro-catalog-x.xx.json). I will create a new ticket if the team feels this is needed. As Nok mentioned, kedro-vscode will not rely on these JSON files in the future, so there is also the question of whether we should remove them completely.

Neither of the above issues is a blocker for the parent ticket, but it would be great to get clarity on these JSON schema files.

Thank you

@astrojuanlu
Member Author

We are working on a communications plan and a retrospective.

Huongg moved this from In Progress to In Review in Kedro-Viz on Jan 13, 2025

@rashidakanchwala
Contributor

Migration from Kedro-Viz Native Experiment Tracking to Kedro-Mlflow

| Kedro-Viz Dataset Type | MLflow Dataset Type | Configuration Details |
|---|---|---|
| tracking.MetricsDataset | MlflowMetricDataset | No additional configuration needed. |
| tracking.JSONDataset | MlflowArtifactDataset | Wrap within MlflowArtifactDataset and configure as json.JSONDataset. |
| plotly.PlotlyDataset | MlflowArtifactDataset | Wrap within MlflowArtifactDataset. |
| plotly.JSONDataset | MlflowArtifactDataset | Wrap within MlflowArtifactDataset. |
| matplotlib.MatplotlibWriter | MlflowArtifactDataset | Wrap within MlflowArtifactDataset. |
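
To make the mapping above concrete, here is a minimal sketch of how a single catalog entry (represented as a plain Python dict) might be rewritten. The helper `migrate_entry` and the `WRAPPED_TYPES` table are hypothetical names introduced only for illustration. The fully qualified class paths and the nested `dataset` key are assumptions about kedro-mlflow's catalog syntax, which may differ between versions, so please check them against the kedro-mlflow documentation before relying on this.

```python
# Assumed fully qualified kedro-mlflow class paths (verify for your version).
METRIC_TYPE = "kedro_mlflow.io.metrics.MlflowMetricDataset"
ARTIFACT_TYPE = "kedro_mlflow.io.artifacts.MlflowArtifactDataset"

# Old Kedro-Viz dataset type -> type of the dataset wrapped by the artifact dataset.
# tracking.JSONDataset becomes a plain json.JSONDataset, as stated in the table.
WRAPPED_TYPES = {
    "tracking.JSONDataset": "json.JSONDataset",
    "plotly.PlotlyDataset": "plotly.PlotlyDataset",
    "plotly.JSONDataset": "plotly.JSONDataset",
    "matplotlib.MatplotlibWriter": "matplotlib.MatplotlibWriter",
}


def migrate_entry(entry: dict) -> dict:
    """Return a migrated copy of one catalog entry (hypothetical helper)."""
    old_type = entry["type"]
    if old_type == "tracking.MetricsDataset":
        # Per the table: no additional configuration needed.
        return {"type": METRIC_TYPE}
    if old_type in WRAPPED_TYPES:
        # Keep the original options, but nest them under the artifact dataset.
        # The "dataset" key name is an assumption about kedro-mlflow's syntax.
        inner = {**entry, "type": WRAPPED_TYPES[old_type]}
        return {"type": ARTIFACT_TYPE, "dataset": inner}
    # Anything else is not part of experiment tracking; leave it untouched.
    return dict(entry)


if __name__ == "__main__":
    old = {"type": "tracking.JSONDataset", "filepath": "data/09_tracking/params.json"}
    print(migrate_entry(old))
```

The sketch only reshapes configuration; it does not import Kedro or kedro-mlflow, so it can be run on a loaded catalog dict before writing the result back to the catalog file.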
