generic topic to pool sync #26

Open
mccanne opened this issue Oct 6, 2021 · 0 comments

Collaborator

mccanne commented Oct 6, 2021

Zinger should have options to run more generically, syncing a Kafka topic to a pool without strictly enforcing the sequential offset field in the Kafka meta record. In this mode we would sync just the Kafka key/value, or just the value, without creating a meta record. It should also be possible to run with auto-commit of consumer offsets, so that multiple processes can sync in parallel to the same data lake (where strict ordering does not matter).
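To make the three shapes concrete, here is a rough sketch of how a consumed message might be turned into a pool record under each option. Everything in it is hypothetical: the `Message` type, the `shape` function, the mode names, and the layout of the `kafka` meta record are assumptions for illustration, not Zinger's actual types or flags.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Message:
    """Hypothetical stand-in for one consumed Kafka message."""
    topic: str
    partition: int
    offset: int
    key: Any
    value: Any


def shape(msg: Message, mode: str) -> Any:
    """Build the record that would be loaded into the pool.

    The mode names are made up for this sketch:
      "meta"      -> key/value plus a kafka meta record (assumed layout)
      "key-value" -> key and value only, no meta record
      "value"     -> the value alone
    """
    if mode == "value":
        return msg.value
    if mode == "key-value":
        return {"key": msg.key, "value": msg.value}
    if mode == "meta":
        return {
            "kafka": {
                "topic": msg.topic,
                "partition": msg.partition,
                "offset": msg.offset,
            },
            "key": msg.key,
            "value": msg.value,
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

In the two generic modes nothing in the pool ties a record back to its Kafka offset, which is why ordering can no longer be enforced from the pool side.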

The auto-commit approach creates a window where a consumer offset could be committed for data that never made it into the lake, so that data would be dropped. We should think through how we might use explicit commits, performed after the pool commit, with some way of recovering from a crash if the pool has been committed to but the topic's commit offset has not been updated. Maybe we need a Kafka meta field after all to make this work.
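A minimal in-memory sketch of the "pool commit first, topic commit second" ordering, with recovery driven by an offset stored alongside each record. It assumes a single partition and a single consumer, and the `Pool`, `Topic`, and `sync_once` names and the `kafka_offset` field are all hypothetical; it does not use a real Kafka client or the lake API, and it does not address the parallel-sync case.

```python
class Pool:
    """In-memory stand-in for a lake pool; each commit is treated as atomic."""

    def __init__(self):
        self.records = []

    def commit(self, batch):
        self.records.extend(batch)

    def last_offset(self):
        # Highest Kafka offset recorded in the pool, or -1 if the pool is empty.
        return max((r["kafka_offset"] for r in self.records), default=-1)


class Topic:
    """In-memory stand-in for one Kafka partition and one consumer group."""

    def __init__(self, values):
        self.values = list(values)
        self.committed = 0  # next offset to read, as the consumer group sees it

    def read_from(self, offset, limit):
        end = min(offset + limit, len(self.values))
        return [(o, self.values[o]) for o in range(offset, end)]


def sync_once(topic, pool, batch_size=2, crash_before_topic_commit=False):
    """Load one batch; return False when there is nothing left to read."""
    # Recovery: if the pool is ahead of the topic's committed offset, a
    # previous run died between the two commits, so resume from the pool.
    start = max(topic.committed, pool.last_offset() + 1)
    batch = topic.read_from(start, batch_size)
    if not batch:
        return False
    # 1. Commit to the pool, carrying the offset as a meta field.
    pool.commit([{"kafka_offset": o, "value": v} for o, v in batch])
    if crash_before_topic_commit:
        return True  # simulate a crash between the two commits
    # 2. Only then advance the topic's committed offset.
    topic.committed = batch[-1][0] + 1
    return True
```

Calling `sync_once(topic, pool, crash_before_topic_commit=True)` once and then looping on `sync_once(topic, pool)` until it returns `False` should leave each value in the pool exactly once, because the restart reads the resume point from the pool rather than from the stale topic offset. Without the stored offset there is nothing to compare against, which is the argument for keeping some meta field.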
