Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Draft] Allow setting ClientOptions for all datafusion.object_store contexts #1083

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

nathschmidt
Copy link

@nathschmidt nathschmidt commented Mar 23, 2025

Which issue does this PR close?

Closes #1082

Rationale for this change

This change adds a PyClientOptions wrapper around object_store.ClientOptions so that it can be instantiated and then used from Python.

What changes are included in this PR?

Introduces a new PyClientOptions struct and necessary wiring to use it.

Are there any user-facing changes?

New optional user-facing arguments for setting ClientOptions.


Given this is my first change - and I don't usually work in Rust - looking to get some confirmation on approach and acceptability of this PR.

Outstanding TODO's:

  • Get some confirmation on overall approach
  • Add necessary wiring for all contexts
  • Add some simple tests

#[pyclass(name = "ClientOptions", module = "datafusion.store", subclass)]
#[derive(Debug, Clone)]
pub struct PyClientOptions {
pub inner: ClientOptions,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Main question I have is does this need to be Arc<ClientOptions> rather than just ClientOptions?

@nathschmidt nathschmidt changed the title Allow setting ClientOptions for all datafusion.object_store contexts [Draft] Allow setting ClientOptions for all datafusion.object_store contexts Mar 23, 2025
@kylebarron
Copy link
Contributor

This is somewhat related to #899.

Up for discussion with maintainers but I would personally argue for adopting my project https://docs.rs/pyo3-object_store, which exposes all possible client options automatically, and would give less of a maintenance burden here. You can see the exported client API, which matches these python docs .

@nathschmidt
Copy link
Author

@kylebarron - this seems ideal to me. I'm happy to have a go at it - potentially as a separate API to ease any compatibility woes (i.e. deprecate datafusion.store, replace with datafusion.obstore or something - though tbh given datafusion-python is mostly major releases it should be an acceptable API change so long as there are updated examples).

While we're still on version 0.0.11 of object_store I can try and put together a demo branch with some cargo hackery. Unsure if there's a timeline on getting an official datafusion release depending on 0.0.12 out yet that'd obviously be a blocker to actually merging.

@kylebarron
Copy link
Contributor

kylebarron commented Mar 24, 2025

I think the current exports are datafusion.object_store? So we could deprecate that and use datafusion.store instead?

I've been thinking perhaps it makes sense to cut an 0.3 release of pyo3-object_store that uses object_store 0.11, so that projects (including my own) can start integrating pyo3-object_store.

It's mostly the python-facing obstore API that actually needed 0.12

@kylebarron
Copy link
Contributor

I published new versions of pyo3-object_store for pyo3 0.23 and object_store 0.11: https://github.com/developmentseed/obstore/tree/main/pyo3-object_store#version-compatibility

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow setting ClientOptions for datafusion.object_store contexts
2 participants