[v313] Cache eviction issue with multiple requests and unequal page/requested size #18674
Comments
Do you mind adding more environment info or the error message?
Thanks for the quick response. Here's some more environment information. Alluxio is running with one master and one worker node on the same system, attached to a local HDFS, with the following configuration:

alluxio-site.properties
alluxio.master.mount.table.root.ufs=hdfs://127.0.0.1:9000/
alluxio.dora.client.ufs.root=hdfs://127.0.0.1:9000/
alluxio.worker.tieredstore.levels=0

alluxio-env.sh
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

The error comes from an environment where TPC-DS is run with Apache Hadoop as the storage medium for the queried data. Apache Spark executes the queries and Alluxio caches the queried data. When a query fails between Spark and Alluxio, the following is output:
Spark error
Alluxio error (left some additional debug logs around it for context if needed)
@jja725 Hello, do you need any more information for this issue? Could it be possible to fix this through
How about setting a bigger cache size in alluxio-site.properties?
Try to allocate more cache. Modify
Increasing it does help a little in terms of how many queries are able to complete, but it doesn't completely solve the issue. Only once the cache size gets to around the size of the dataset are all the queries able to complete.
Alluxio Version
v313
Describe the bug
Two problems: first, multiple write requests can try to evict the same page from the cache, and only one of them can succeed. Second, the size of the evicted page is not checked against the requested size, so the eviction may not free enough space for the write request.
A TPC benchmark for big data on Hadoop, using Alluxio as a caching layer.
Multiple queries arriving at the same time during the throughput query stage of the benchmark end up failing due to the cache eviction issues described above.
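To make the two problems concrete, here is a minimal sketch in Python. It is a toy model, not Alluxio's actual code: `ToyPageCache`, `put_naive`, and `put_checked` are hypothetical names invented for illustration. `put_naive` evicts a single victim without comparing the freed bytes to the requested size. `put_checked` shows one possible remedy, under the assumption that victim selection and removal can share a single lock: it keeps evicting until the request fits, so two writers can never pick the same victim.

```python
import threading
from collections import OrderedDict


class ToyPageCache:
    """Hypothetical in-memory page cache; NOT Alluxio's implementation."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.pages = OrderedDict()  # page_id -> size in bytes, oldest first
        self.lock = threading.Lock()

    def put_naive(self, page_id, size):
        """Evicts at most one victim and never checks how much it freed."""
        if page_id in self.pages:
            self.used -= self.pages.pop(page_id)
        if self.used + size > self.capacity and self.pages:
            _, freed = self.pages.popitem(last=False)
            self.used -= freed
        if self.used + size > self.capacity:
            return False  # the single eviction did not free enough space
        self.pages[page_id] = size
        self.used += size
        return True

    def put_checked(self, page_id, size):
        """Selects and removes victims under one lock until the request fits."""
        if size > self.capacity:
            return False  # can never fit, so do not evict anything
        with self.lock:
            if page_id in self.pages:
                self.used -= self.pages.pop(page_id)
            while self.used + size > self.capacity:
                _, freed = self.pages.popitem(last=False)
                self.used -= freed
            self.pages[page_id] = size
            self.used += size
            return True


def demo():
    """Fills each cache with four 25-byte pages, then writes a 60-byte page."""
    naive = ToyPageCache(100)
    checked = ToyPageCache(100)
    for i in range(4):
        naive.put_naive(f"small-{i}", 25)
        checked.put_checked(f"small-{i}", 25)
    return naive.put_naive("large", 60), checked.put_checked("large", 60)


if __name__ == "__main__":
    print(demo())
```

In `demo()`, the naive write fails because evicting one 25-byte page leaves only 25 bytes free for a 60-byte request, while the checked write succeeds after evicting three pages. Holding one lock for the whole eviction is the simplest way to rule out duplicate victims in a sketch. A real cache would likely need finer-grained locking to keep throughput acceptable.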
Please let me know if additional information is required to get this fixed. Thanks!