add memray example #1768

Open: wants to merge 3 commits into base: master
2 changes: 2 additions & 0 deletions docs/integrations/index.md
@@ -26,6 +26,8 @@ Flytekit functionality. For comparison, these plugins can be thought of like
- Run analytical queries using DuckDB.
* - {doc}`Great Expectations </auto_examples/greatexpectations_plugin/index>`
- Validate data with `great_expectations`.
* - {doc}`Memray </auto_examples/memray_plugin/index>`
- `memray`: Memory profiling with memray.
* - {doc}`MLFlow </auto_examples/mlflow_plugin/index>`
- `mlflow`: the open standard for model tracking.
* - {doc}`Modin </auto_examples/modin_plugin/index>`
27 changes: 27 additions & 0 deletions examples/memray_plugin/Dockerfile
@@ -0,0 +1,27 @@
FROM python:3.11-slim-bookworm
LABEL org.opencontainers.image.source=https://github.com/flyteorg/flytesnacks

WORKDIR /root
ENV VENV=/opt/venv
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV PYTHONPATH=/root

WORKDIR /root

ENV VENV=/opt/venv
# Virtual environment
RUN python3 -m venv ${VENV}
ENV PATH="${VENV}/bin:$PATH"

# Install Python dependencies
COPY requirements.in /root
RUN pip install -r /root/requirements.in

# Copy the actual code
COPY . /root

# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
ARG tag
ENV FLYTE_INTERNAL_IMAGE=$tag
20 changes: 20 additions & 0 deletions examples/memray_plugin/README.md
@@ -0,0 +1,20 @@
(memray_plugin)=

# Memray Profiling

```{eval-rst}
.. tags:: Integration, Profiling, Observability
```

Memray tracks and reports memory allocations, both in Python code and in compiled extension modules.
The Memray profiling plugin enables memory tracking at the Flyte task level and renders the resulting Memray report (for example, a flamegraph or table) in Flyte Deck.

First, install the Memray plugin:

```bash
pip install flytekitplugins-memray
```

```{auto-examples-toc}
memray_example
```
Empty file.
66 changes: 66 additions & 0 deletions examples/memray_plugin/memray_plugin/memray_example.py
@@ -0,0 +1,66 @@
# %% [markdown]
# (memray_example)=
#
# # Memray Profiling Example
# Memray tracks and reports memory allocations, both in Python code and in compiled extension modules.
# The Memray profiling plugin enables memory tracking at the Flyte task level and renders the resulting Memray report (for example, a flamegraph or table) in Flyte Deck.
# %%
from flytekit import workflow, task, ImageSpec
from flytekitplugins.memray import memray_profiling
import time

# %% [markdown]
# First, we use `ImageSpec` to construct a container image that contains the dependencies for the
# tasks we want to profile:
# %%
image = ImageSpec(
    name="memray_demo",
    packages=["flytekitplugins_memray"],
    registry="<your_cr_registry>",
)


# %% [markdown]
# Next, we define a dummy function that generates data in memory without releasing it:
# %%
def generate_data(n: int):
    leak_list = []
    for _ in range(n):  # Arbitrary large number for demonstration
        large_data = " " * 10**6  # 1 MB string
        leak_list.append(large_data)  # Keeps appending without releasing
        time.sleep(0.1)  # Slow down the loop to observe memory changes
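
For a quick local sanity check of how much memory this loop retains, independent of Flyte and Memray, the same allocation pattern can be measured with the standard library's `tracemalloc`. This is a minimal sketch under stated assumptions: `retained_bytes` is a hypothetical helper, the sleep is dropped, and the list is returned so it stays alive while it is measured. `tracemalloc` only sees allocations made through Python's allocators, so it is a stand-in for illustration, not a substitute for Memray.

```python
import tracemalloc


def generate_data(n: int) -> list[str]:
    """Same allocation pattern as the example, minus the sleep."""
    leak_list = []
    for _ in range(n):
        leak_list.append(" " * 10**6)  # 1 MB string per iteration
    return leak_list  # returned so the caller keeps the data alive


def retained_bytes(n: int) -> int:
    """Hypothetical helper: bytes still allocated after generate_data(n) returns."""
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        data = generate_data(n)  # hold a reference so nothing is freed
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    assert len(data) == n
    return after - before


if __name__ == "__main__":
    print(f"retained for n=5: {retained_bytes(5) / 1e6:.1f} MB")
```

With the default `n=500` used by the workflow, this pattern holds roughly 500 MB at the end of the loop, which is worth keeping in mind when choosing task resource requests.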


# %% [markdown]
# Example of profiling the memory usage of `generate_data()` via the Memray `table` HTML reporter:
# %%
@task(container_image=image, enable_deck=True)
@memray_profiling(memray_html_reporter="table")
def memory_usage(n: int) -> str:
    generate_data(n=n)

    return "Well"
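
Python applies decorators bottom-up, so here `@memray_profiling` wraps the plain function first and `@task` then wraps the result. As a rough mental model only, and not the plugin's actual implementation, a profiling decorator of this shape can be sketched with `functools.wraps` and the standard library's `tracemalloc`. The names `profile_peak_memory` and `LAST_PEAK_BYTES` are hypothetical, and `tracemalloc` stands in for Memray.

```python
import functools
import tracemalloc
from typing import Any, Callable

# Hypothetical registry: peak traced bytes recorded per function name.
LAST_PEAK_BYTES: dict[str, int] = {}


def profile_peak_memory(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical stand-in for a profiling decorator: records the peak traced memory of each call."""

    @functools.wraps(fn)  # keep the wrapped function's name and docstring
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        tracemalloc.start()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs whether fn returned or raised, so the profile is always recorded.
            _, peak = tracemalloc.get_traced_memory()
            LAST_PEAK_BYTES[fn.__name__] = peak
            tracemalloc.stop()

    return wrapper


@profile_peak_memory
def memory_usage(n: int) -> str:
    data = [" " * 10**6 for _ in range(n)]  # n strings of 1 MB each
    return f"allocated {len(data)} strings"


if __name__ == "__main__":
    print(memory_usage(n=3))
    print(f"peak: {LAST_PEAK_BYTES['memory_usage'] / 1e6:.1f} MB")
```

Because the wrapper preserves the function's metadata, an outer decorator still sees the original name, which is why stacking a profiling decorator underneath another decorator is a workable pattern in general.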


# %% [markdown]
# Example of profiling memory leaks in `generate_data()` via the Memray `flamegraph` HTML reporter:
# %%


@task(container_image=image, enable_deck=True)
@memray_profiling(trace_python_allocators=True, memray_reporter_args=["--leaks"])
def memory_leakage(n: int) -> str:
    generate_data(n=n)

    return "Well"


# %% [markdown]
# Put everything together in a workflow.
# %%


@workflow
def wf(n: int = 500):
    memory_usage(n=n)
    memory_leakage(n=n)
1 change: 1 addition & 0 deletions examples/memray_plugin/requirements.in
@@ -0,0 +1 @@
flytekitplugins-memray