continuous syncing service #83

Open
mccanne opened this issue May 29, 2022 · 1 comment

@mccanne
Collaborator

mccanne commented May 29, 2022

Syncing from Kafka and running ETL should operate as a continuous service, so we don't need to poll and recompute progress state on each run.

Step 1 is to get from-kafka to provide a continuous service where it listens on each configured topic and syncs data as it arrives. There should be two parameters to drive commits: a data limit and a timeout. A commit happens once the accumulated data exceeds the data limit; when data has arrived but does not exceed the limit, the timeout triggers processing instead.
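To make the two commit triggers concrete, here is a minimal sketch of the buffering logic for a single topic. It is not from-kafka's actual code: `CommitBatcher`, `max_bytes`, and `timeout_s` are hypothetical names, the clock is passed in explicitly to keep the example deterministic, and it assumes the timeout is measured from the first uncommitted record and that reaching the limit exactly also triggers a commit.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CommitBatcher:
    """Buffers records for one topic and decides when to commit them.

    Hypothetical sketch: names and semantics are assumptions, not from-kafka's API.
    """
    max_bytes: int    # the "data limit" parameter
    timeout_s: float  # the "timeout" parameter
    _records: List[bytes] = field(default_factory=list, init=False)
    _size: int = field(default=0, init=False)
    _first_arrival: Optional[float] = field(default=None, init=False)

    def add(self, record: bytes, now: float) -> Optional[List[bytes]]:
        """Buffer a record; return a batch to commit if the data limit is reached."""
        if self._first_arrival is None:
            # Start the timeout clock at the first uncommitted record (assumption).
            self._first_arrival = now
        self._records.append(record)
        self._size += len(record)
        if self._size >= self.max_bytes:
            return self._flush()
        return None

    def poll(self, now: float) -> Optional[List[bytes]]:
        """Return a batch to commit if buffered data has waited past the timeout."""
        if self._first_arrival is not None and now - self._first_arrival >= self.timeout_s:
            return self._flush()
        return None

    def _flush(self) -> List[bytes]:
        """Hand back the buffered records and reset all state."""
        batch, self._records = self._records, []
        self._size = 0
        self._first_arrival = None
        return batch
```

A consumer loop would call `add` for each record it receives and `poll` whenever it wakes up without data, committing any batch that either call returns.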

Step 2 is to automate ETL based on from-kafka commits. Here the service runs continuously, and whenever data arrives that could be consumed by an ETL, that ETL's logic is run automatically. This way, we don't need to run ETLs on a polling loop; they run only when they have new data to process.
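One way to picture Step 2 is a dispatcher that is notified on every commit and runs only the ETLs that consume the committed topic. This is a rough sketch under assumptions, not a proposed design: `EtlDispatcher`, `register`, and `on_commit` are hypothetical names, and an ETL is modeled as a plain callable that takes the committed batch.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# An ETL is modeled as a function over a committed batch (assumption).
Etl = Callable[[List[bytes]], None]


class EtlDispatcher:
    """Runs ETLs in response to commits instead of on a polling loop.

    Hypothetical sketch: not part of any existing API.
    """

    def __init__(self) -> None:
        self._etls: Dict[str, List[Etl]] = defaultdict(list)

    def register(self, topic: str, etl: Etl) -> None:
        """Declare that `etl` consumes data committed for `topic`."""
        self._etls[topic].append(etl)

    def on_commit(self, topic: str, batch: List[bytes]) -> int:
        """Run every ETL registered for `topic`; return how many ran."""
        if not batch:
            return 0  # nothing new to process, so no ETL runs
        etls = self._etls.get(topic, [])
        for etl in etls:
            etl(batch)
        return len(etls)
```

Wiring this to the sync side would mean calling `on_commit` right after each successful commit, so an ETL is never invoked without new data.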

@philrz
Contributor

philrz commented Oct 5, 2022

Step 1 is complete, and Step 2 remains to be done.

