add time series tutorial #1738
Open: cosmicBboy wants to merge 2 commits into master from time-series-tutorial
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
FROM python:3.8-slim-buster | ||
LABEL org.opencontainers.image.source https://github.com/flyteorg/flytesnacks | ||
|
||
WORKDIR /root | ||
ENV VENV /opt/venv | ||
ENV LANG C.UTF-8 | ||
ENV LC_ALL C.UTF-8 | ||
ENV PYTHONPATH /root | ||
|
||
# This is necessary for opencv to work | ||
RUN apt-get update && apt-get install -y libsm6 libxext6 libxrender-dev ffmpeg build-essential curl | ||
|
||
WORKDIR /root | ||
|
||
ENV VENV /opt/venv | ||
# Virtual environment | ||
RUN python3 -m venv ${VENV} | ||
ENV PATH="${VENV}/bin:$PATH" | ||
|
||
# Install Python dependencies | ||
COPY requirements.in /root | ||
RUN pip install -r /root/requirements.in | ||
RUN pip freeze | ||
|
||
# Copy the actual code | ||
COPY . /root | ||
|
||
# This tag is supplied by the build script and will be used to determine the version | ||
# when registering tasks, workflows, and launch plans | ||
ARG tag | ||
ENV FLYTE_INTERNAL_IMAGE $tag |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
(time_series_modeling)= | ||
|
||
# Time Series Modeling | ||
|
||
```{eval-rst} | ||
.. tags:: Advanced, MachineLearning | ||
``` | ||
|
||
Time series data is fundamentally different from Independent and Identically | ||
Distributed (IID) data, which is commonly used in many machine learning tasks. | ||
Here are a few key differences: | ||
|
||
1. **Temporal Dependency**: In time series data, observations are ordered | ||
chronologically and exhibit temporal dependencies. Each data point is related | ||
to its past and future values. This sequential nature is crucial for | ||
forecasting and trend analysis. In contrast, IID data assumes that each | ||
observation is independent of others. | ||
2. **Non-stationarity**: Time series often display trends, seasonality, or cyclic | ||
patterns that evolve over time. This non-stationarity means that statistical | ||
properties like mean and variance can change, making analysis more complex. IID | ||
data, by definition, is drawn from a single fixed distribution, so these | ||
properties do not change from one observation to the next. | ||
3. **Autocorrelation**: Time series data frequently shows autocorrelation, where | ||
an observation is correlated with its own past values. This feature is essential | ||
for many time series models but is not the case for IID data. | ||
4. **Importance of Order**: The sequence of observations in time series data is | ||
critical and cannot be shuffled without losing information. In IID data, the | ||
order of observations is assumed to be irrelevant. | ||
5. **Inference is Focused on Forecasting**: Time series analysis often aims to | ||
predict future values based on historical patterns, whereas many machine | ||
learning tasks with IID data focus on classification or regression without | ||
a temporal component. | ||
6. **Specific Modeling Techniques**: Time series data often calls for | ||
models such as ARIMA, Prophet, or RNNs that can capture temporal | ||
dynamics. These models are rarely applied to IID data. | ||
|
||
Understanding these differences is crucial for selecting appropriate analysis | ||
methods and interpreting results in time series modeling tasks. | ||
|
||
Below are examples demonstrating how to use Flyte to train time series models. | ||
|
||
## Examples | ||
|
||
```{auto-examples-toc} | ||
neural_prophet | ||
``` |
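As an illustration of the autocorrelation and ordering points made in the README above, here is a minimal standard-library sketch. The synthetic hourly series, the helper names `lag_autocorrelation` and `chronological_split`, and the 80/20 split are assumptions chosen for the example and are not part of this pull request.

```python
import math
import random


def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values)
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def chronological_split(values, train_fraction=0.8):
    """Split without shuffling so the test set lies strictly in the future."""
    cutoff = int(len(values) * train_fraction)
    return values[:cutoff], values[cutoff:]


# Hypothetical hourly series: a daily sine cycle plus a little noise.
rng = random.Random(0)
series = [math.sin(2 * math.pi * t / 24) + rng.gauss(0, 0.1) for t in range(240)]

# The same values in random order, which destroys the temporal structure.
shuffled = series[:]
rng.shuffle(shuffled)

ordered_acf = lag_autocorrelation(series)
shuffled_acf = lag_autocorrelation(shuffled)
train, test = chronological_split(series)

print(f"lag-1 autocorrelation, ordered:  {ordered_acf:.3f}")
print(f"lag-1 autocorrelation, shuffled: {shuffled_acf:.3f}")
print(f"train size: {len(train)}, test size: {len(test)}")
```

The ordered series should show a lag-1 autocorrelation close to 1, while the shuffled copy should sit near 0. This is why time series evaluation usually holds out the most recent observations instead of a random sample.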
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
flytekit>=1.7.0 | ||
wheel | ||
matplotlib | ||
flytekitplugins-deck-standard |
Empty file.
116 changes: 116 additions & 0 deletions
examples/time_series_modeling/time_series_modeling/neural_prophet.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
# %% [markdown] | ||
# # Train a Neural Prophet Model | ||
# | ||
# This script demonstrates how to train a time series forecasting model | ||
# using the [NeuralProphet](https://neuralprophet.com/) library. | ||
|
||
# %% [markdown] | ||
# ## Imports and Setup | ||
# | ||
# First, we import necessary libraries to run the training workflow. | ||
|
||
import pandas as pd | ||
from flytekit import Deck, ImageSpec, current_context, task, workflow | ||
from flytekit.types.file import FlyteFile | ||
|
||
# %% [markdown] | ||
# ## Define an ImageSpec | ||
# | ||
# For reproducibility, we create an `ImageSpec` object with required packages | ||
# for our tasks. | ||
|
||
image = ImageSpec( | ||
    name="neuralprophet", | ||
    packages=[ | ||
        "neuralprophet", | ||
        "matplotlib", | ||
        "ipython", | ||
        "pandas", | ||
        "pyarrow", | ||
    ], | ||
    # This registry is for a local flyte demo cluster. Replace this with your | ||
    # own registry, e.g. `docker.io/<username>/<imagename>` | ||
    registry="localhost:30000", | ||
) | ||
|
||
# %% [markdown] | ||
# ## Data Loading Task | ||
# | ||
# This task loads the time series data from the specified URL. In this case, | ||
# we use a hard-coded URL for a sample dataset from the NeuralProphet data repository. | ||
|
||
URL = "https://github.com/ourownstory/neuralprophet-data/raw/main/kaggle-energy/datasets/tutorial01.csv" | ||
|
||
|
||
@task(container_image=image) | ||
def load_data() -> pd.DataFrame: | ||
    return pd.read_csv(URL) | ||
|
||
|
||
# %% [markdown] | ||
# ## Model Training Task | ||
# | ||
# This task trains the NeuralProphet model on the loaded data. | ||
# We train the model at an hourly frequency for ten epochs. | ||
|
||
|
||
@task(container_image=image) | ||
def train_model(df: pd.DataFrame) -> FlyteFile: | ||
    from neuralprophet import NeuralProphet, save | ||
|
||
    working_dir = current_context().working_directory | ||
    model = NeuralProphet() | ||
    model.fit(df, freq="H", epochs=10) | ||
    model_fp = f"{working_dir}/model.np" | ||
    save(model, model_fp) | ||
    return FlyteFile(model_fp) | ||
|
||
|
||
# %% [markdown] | ||
# ## Forecasting Task | ||
# | ||
# This task loads the trained model, makes predictions, and visualizes the | ||
# results using a Flyte Deck. | ||
|
||
|
||
@task( | ||
    container_image=image, | ||
    enable_deck=True, | ||
) | ||
def make_forecast(df: pd.DataFrame, model_file: FlyteFile) -> pd.DataFrame: | ||
    from neuralprophet import load | ||
|
||
    model_file.download() | ||
    model = load(model_file.path) | ||
|
||
    # Create a new dataframe reaching 365 periods into the future for our | ||
    # forecast (hours here, given the freq="H" used in training); | ||
    # n_historic_predictions=True also includes predictions over the historic data | ||
    df_future = model.make_future_dataframe( | ||
        df, | ||
        n_historic_predictions=True, | ||
        periods=365, | ||
    ) | ||
|
||
    # Predict the future | ||
    forecast = model.predict(df_future) | ||
|
||
    # Plot on a Flyte Deck | ||
    fig = model.plot(forecast) | ||
    Deck("Forecast", fig.to_html()) | ||
|
||
    return forecast | ||
|
||
|
||
# %% [markdown] | ||
# ## Main Workflow | ||
# | ||
# Finally, this workflow orchestrates the entire process: loading data, | ||
# training the model, and making forecasts. | ||
|
||
|
||
@workflow | ||
def main() -> pd.DataFrame: | ||
    df = load_data() | ||
    model_file = train_model(df) | ||
    forecast = make_forecast(df, model_file) | ||
    return forecast |
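A note on the forecast horizon in `make_forecast`: assuming `periods` is counted in steps of the training frequency, `periods=365` with `freq="H"` reaches roughly fifteen days ahead, not a year. The standard-library sketch below shows the arithmetic. The helper `future_timestamps` and the last observed timestamp are hypothetical and do not come from the dataset or from NeuralProphet.

```python
from datetime import datetime, timedelta


def future_timestamps(last_observed, periods, step=timedelta(hours=1)):
    """Return `periods` timestamps following `last_observed`, spaced by `step`."""
    return [last_observed + step * (i + 1) for i in range(periods)]


# Hypothetical final timestamp of the training data (an assumption).
last = datetime(2021, 12, 31, 23)

# 365 hourly steps, mirroring periods=365 at an hourly frequency.
future = future_timestamps(last, periods=365)

# Express the horizon in days to make the scale obvious.
horizon_days = (future[-1] - last) / timedelta(days=1)
print(f"first forecast step: {future[0]}")
print(f"last forecast step:  {future[-1]}")
print(f"horizon in days:     {horizon_days:.2f}")
```

If a one-year horizon at hourly resolution is what the tutorial intends, `periods` would need to be about 24 times larger under the same assumption.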