Multiple Java Containers Crash on MAC M4 with OS 15.2 or higher #2066

Open
vmtocloud opened this issue Feb 21, 2025 · 6 comments
Labels
bug Something isn't working

Comments

@vmtocloud

vmtocloud commented Feb 21, 2025

Bug Report

Which version of the demo you are using?
Latest

Symptom

A clear and concise description of what the bug is.

When the OpenTelemetry Demo app is deployed with the open-telemetry-demo Helm chart to Kubernetes running on an Apple Mac M4 with macOS 15.2 or higher, all pods should deploy and reach the Running state. Instead, several Java-based pods crash.

What do you expect to see?

All pods in a running state

Please describe the actual behavior experienced.

The following pods all crash-loop with a Java Runtime error in the logs: adservice, accountingservice, checkoutservice, frauddetectionservice, kafka, and opensearch.

Could you provide the minimum required steps to resolve the issue you're seeing?

The root cause seems to be related to how the Java Virtual Machine (JVM) interacts with the M4's architecture within the Docker environment. A workaround involves disabling the Scalable Vector Extension (SVE) in the JVM by using the -XX:UseSVE=0 option. This can be applied when running the Java application or when configuring Docker images.

When building a Docker image, the JAVA_TOOL_OPTIONS environment variable can be set within the Dockerfile:

FROM <base_image>
ENV JAVA_TOOL_OPTIONS="-XX:UseSVE=0"
# ... rest of the Dockerfile

Users are currently unable to use the Helm chart because the images from ghcr.io/open-telemetry/ have not yet been updated. Any help is appreciated! If this is the wrong place, apologies; let me know and I will post in the correct GitHub repository.
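
Until updated images are published, one possible stopgap is to inject the flag into workloads that are already deployed instead of rebuilding images. The sketch below only prints the `kubectl set env` commands so they can be reviewed first. The namespace `otel-demo`, the deployment names, and the helper `print_sve_workaround_cmds` are assumptions for illustration; `kubectl get deployments,statefulsets` shows the real names and kinds. It uses `_JAVA_OPTIONS` (the variable that appears in the demo's docker-compose.yaml) rather than `JAVA_TOOL_OPTIONS`, on the assumption that an image may already set `JAVA_TOOL_OPTIONS` and an override would replace that value. This has not been tested on an M4.

```shell
#!/bin/sh
# Sketch: print (not run) kubectl commands that add the SVE workaround
# to the environment of existing deployments.
# ASSUMPTION: the namespace and deployment names are hypothetical.

print_sve_workaround_cmds() {
  namespace="$1"
  shift
  # Every remaining argument is treated as a deployment name.
  for deployment in "$@"; do
    printf 'kubectl set env deployment/%s -n %s _JAVA_OPTIONS=-XX:UseSVE=0\n' \
      "$deployment" "$namespace"
  done
}

# Prints one command per deployment; nothing is applied to the cluster.
print_sve_workaround_cmds otel-demo adservice kafka
```

After reviewing the printed commands, piping the output to `sh` would apply them. Workloads that run as StatefulSets would need `statefulset/<name>` in place of `deployment/<name>`.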

We will close this issue if:

  • The steps you provided are complex.
  • If we can not reproduce the behavior you're reporting.

Additional Context

I have not tested the above solution as I do not have a Mac M4. I have folks on my team that are happy to test this if there are new images pushed.

@vmtocloud vmtocloud added the bug Something isn't working label Feb 21, 2025
@vmtocloud vmtocloud changed the title from "Multiple Java Containers Crash on MAC M4 with OS 15,2 or higher" to "Multiple Java Containers Crash on MAC M4 with OS 15.2 or higher" Feb 21, 2025
@julianocosta89
Member

Hello @vmtocloud 👋🏽

Thanks for reporting this!
Actually @CharlieTLe has already merged a PR to fix this last month, but it seems this is not being considered when building the images on our workflows.

Also, I was checking the bug issue and this seems to be resolved 🤔

I wonder if we should just change the base image to a newer specific version of temurin.

@vmtocloud
Author

vmtocloud commented Feb 21, 2025

Hi @julianocosta89

Thank you for the quick response. The bug issue was resolved, but its resolution is the same workaround I described above, and it has to be part of the image build:
Workaround: -XX:UseSVE=0

@puckpuck
Contributor

Right now we have a fix for when you build the images locally. We don't include this option in our standard build workflow for released images, which is what K8s will use.

Does that option have any downside on non-Arm-based macOS systems?

@hegerchr

Hi 👋,

I ran into the same issue on my Mac M4 last week while trying to deploy the demo in a local Minikube cluster.

I tried to fix it by adding the workaround -XX:UseSVE=0 in different places (Dockerfile, docker-compose.yml, opentelemetry-demo.yaml) and building the images locally in Minikube's Docker environment, so that the images would be available inside Minikube.

However, I have not managed to make it work yet. I'm not an experienced user of Kubernetes or Docker, and building and running local images seems complex. I was hoping to find advice on how to do it in the OpenTelemetry demo docs, without success.

Could we also add instructions to the docs on how to build and run the local images?

@hegerchr

I have found this in the file docker-compose.yaml

# Workaround on OSX for https://bugs.openjdk.org/browse/JDK-8345296
- _JAVA_OPTIONS

for the AdService (line 74), Kafka (line 668), and OpenSearch (line 810) services. I don't know whether the value -XX:UseSVE=0 is missing by design. I added the value to each:

# Workaround on OSX for https://bugs.openjdk.org/browse/JDK-8345296
- _JAVA_OPTIONS=-XX:UseSVE=0

and used the docker compose command from the docs

docker compose up --force-recreate --remove-orphans --detach

to start the demo. The AdService, Kafka and OpenSearch do not crash anymore.
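
On the value-less `_JAVA_OPTIONS` entry: a Compose `environment` item without a value is filled from the shell that runs `docker compose`, so the flag can also be supplied from the host without editing the file. The sketch below sets it only on Arm hosts. This is based on the assumption, worth verifying, that `UseSVE` is an AArch64-specific JVM flag that a JVM on another architecture may reject. The helper name `sve_java_options` is hypothetical.

```shell
#!/bin/sh
# Sketch: export _JAVA_OPTIONS on Arm hosts only, so that the value-less
# "- _JAVA_OPTIONS" entries in docker-compose.yaml pick it up from the host.
# ASSUMPTION: -XX:UseSVE is only meaningful for AArch64 JVMs.

sve_java_options() {
  # $1 is the machine architecture as reported by `uname -m`.
  case "$1" in
    arm64|aarch64) printf '%s\n' '-XX:UseSVE=0' ;;
    *)             printf '\n' ;;
  esac
}

opts="$(sve_java_options "$(uname -m)")"
if [ -n "$opts" ]; then
  export _JAVA_OPTIONS="$opts"
fi

# Then start the demo with the command from the docs:
# docker compose up --force-recreate --remove-orphans --detach
```

With this approach the compose file stays unchanged, and hosts on other architectures start the containers without the extra option.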

@vmtocloud
Author

> Right now we have a fix for when you build the images locally. We don't include this option in our standard build workflow for released images, which is what K8s will use.
>
> Does that option have any downside on non-Arm-based macOS systems?

@puckpuck I was wondering the same thing
