Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add support for SparkConnect #539

Open
wants to merge 35 commits into
base: main
Choose a base branch
from
Open

feat: add support for SparkConnect #539

wants to merge 35 commits into from

Conversation

razvan
Copy link
Member

@razvan razvan commented Mar 21, 2025

Description

Part of: #284

See issue above for implemented features.

Definition of Done Checklist

  • Not all of these items are applicable to all PRs, the author should update this template to only leave the boxes in that are relevant
  • Please make sure all these things are done and tick the boxes

Author

Preview Give feedback

Reviewer

Preview Give feedback

Acceptance

Preview Give feedback

@razvan razvan moved this to Development: In Progress in Stackable Engineering Apr 7, 2025
@razvan razvan moved this from Development: In Progress to Development: Waiting for Review in Stackable Engineering Apr 8, 2025
@razvan razvan requested a review from a team April 8, 2025 08:20
@razvan razvan marked this pull request as ready for review April 8, 2025 08:20
@maltesander maltesander self-requested a review April 8, 2025 08:27
@maltesander maltesander moved this from Development: Waiting for Review to Development: In Review in Stackable Engineering Apr 8, 2025
@razvan
Copy link
Member Author

razvan commented Apr 8, 2025

pub const CONNECT_GRPC_PORT: i32 = 15002;
pub const CONNECT_UI_PORT: i32 = 4040;

pub const DUMMY_SPARK_CONNECT_GROUP_NAME: &str = "default";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is defined in common.rs as well.

Comment on lines 113 to 104
scs: &v1alpha1::SparkConnectServer,
config: &v1alpha1::ExecutorConfig,
pi: &ResolvedProductImage,
config_map: &ConfigMap,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a fan of the abbreviations. I can live with scs but pi is probably not how you want to refer an image :) ?

Comment on lines 404 to 415
ServicePort {
name: Some(String::from("grpc")),
port: CONNECT_GRPC_PORT,
..ServicePort::default()
},
ServicePort {
name: Some(String::from("http")),
port: CONNECT_UI_PORT,
..ServicePort::default()
},
]),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grpc and http are used as well in build_deployment as well, constant?

Comment on lines +67 to +77
# TODO: cannot use the PySpark job from the ConfigMap because it breaks
# with a "iceberg.SparkWrite$WriterFactory" ClassNotfoundException.
# Use the app bundled within spark-connect-client instead
# "/app/simple-connect-app.py",
#
command:
[
"/usr/bin/python",
"/stackable/spark-connect-examples/python/simple-connect-app.py",
"sc://spark-connect-server",
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What needs to be done to fix this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Development: In Review
Development

Successfully merging this pull request may close these issues.

2 participants