H2OResponseError: SHAP Summary Error #16462

Open
panyan716 opened this issue Dec 17, 2024 · 4 comments
@panyan716

H2OResponseError Traceback (most recent call last)
Cell In[12], line 1
----> 1 aml.explain(test)

File D:\ZhuanYe\python\Lib\site-packages\h2o\explanation\_explain.py:3285, in explain(models, frame, columns, top_n_features, include_explanations, exclude_explanations, plot_overrides, figsize, render, qualitative_colormap, sequential_colormap, background_frame)
3283 result["shap_summary"]["plots"] = H2OExplanation()
3284 for shap_model in shap_models:
-> 3285 result["shap_summary"]["plots"][shap_model.model_id] = display(shap_summary_plot(
3286 shap_model,
3287 **_custom_args(
3288 plot_overrides.get("shap_summary_plot"),
3289 frame=frame,
3290 figsize=figsize,
3291 background_frame=background_frame
3292 )))
3294 # PDP
3295 if "pdp" in explanations:

File D:\ZhuanYe\python\Lib\site-packages\h2o\explanation\_explain.py:704, in shap_summary_plot(model, frame, columns, top_n_features, samples, colorize_factors, alpha, colormap, figsize, jitter, save_plot_path, background_frame)
702 with no_progress_block():
703 contributions = NumpyFrame(model.predict_contributions(frame, output_format="compact", background_frame=background_frame))
--> 704 frame = NumpyFrame(frame)
705 contribution_names = contributions.columns
707 feature_importance = sorted(
708 {k: np.abs(v).mean() for k, v in contributions.items() if "BiasTerm" != k}.items(),
709 key=lambda kv: kv[1])

File D:\ZhuanYe\python\Lib\site-packages\h2o\explanation\_explain.py:236, in NumpyFrame.__init__(self, h2o_frame)
234 _is_numeric = h2o_frame.isnumeric()
235 self._columns = h2o_frame.columns
--> 236 self._factors = {col: h2o_frame[col].asfactor().levels()[0] for col in
237 np.array(h2o_frame.columns)[_is_factor]}
239 df = h2o_frame.as_data_frame(False)
240 self._data = np.empty((h2o_frame.nrow, h2o_frame.ncol))

File D:\ZhuanYe\python\Lib\site-packages\h2o\frame.py:1438, in H2OFrame.levels(self)
1423 def levels(self):
1424 """
1425 Get the factor levels.
1426
(...)
1436 >>> h2oframe.levels()
1437 """
-> 1438 lol = H2OFrame._expr(expr=ExprNode("levels", self)).as_data_frame(False)
1439 lol.pop(0) # Remove column headers
1440 lol = list(zip(*lol))

File D:\ZhuanYe\python\Lib\site-packages\h2o\frame.py:1989, in H2OFrame.as_data_frame(self, use_pandas, header, use_multi_thread)
1986 return pandas.read_csv(StringIO(self.get_frame_data()), low_memory=False, skip_blank_lines=False)
1988 from h2o.utils.csv.readers import reader
-> 1989 frame = [row for row in reader(StringIO(self.get_frame_data()))]
1990 if not header:
1991 frame.pop(0)

File D:\ZhuanYe\python\Lib\site-packages\h2o\frame.py:2049, in H2OFrame.get_frame_data(self)
2033 def get_frame_data(self):
2034 """
2035 Get frame data as a string in csv format.
2036
(...)
2045 >>> iris.get_frame_data()
2046 """
2047 return h2o.api(
2048 "GET /3/DownloadDataset",
-> 2049 data={"frame_id": self.frame_id, "hex_string": False, "escape_quotes": True}
2050 )

File D:\ZhuanYe\python\Lib\site-packages\h2o\frame.py:414, in H2OFrame.frame_id(self)
402 @property
403 def frame_id(self):
404 """
405 Internal id of the frame (str).
406
(...)
412 >>> print(iris.frame_id)
413 """
--> 414 return self._frame()._ex._cache._id

File D:\ZhuanYe\python\Lib\site-packages\h2o\frame.py:584, in H2OFrame._frame(self, rows, rows_offset, cols, cols_offset, fill_cache)
583 def _frame(self, rows=10, rows_offset=0, cols=-1, cols_offset=0, fill_cache=False):
--> 584 self._ex._eager_frame()
585 if fill_cache:
586 self._ex._cache.fill(rows=rows, rows_offset=rows_offset, cols=cols, cols_offset=cols_offset)

File D:\ZhuanYe\python\Lib\site-packages\h2o\expr.py:90, in ExprNode._eager_frame(self)
88 if not self._cache.is_empty(): return
89 if self._cache._id is not None: return # Data already computed under ID, but not cached locally
---> 90 self._eval_driver('frame')

File D:\ZhuanYe\python\Lib\site-packages\h2o\expr.py:114, in ExprNode._eval_driver(self, top)
107 """
108 :param top: if this is a top expression (providing a final result),
109 then specifies the expected result type (accepted values = ['frame', 'scalar']),
110 or None if no object creation is expected.
111 :return: self expr
112 """
113 exec_str = self._get_ast_str(top)
--> 114 res = ExprNode.rapids(exec_str)
115 if 'scalar' in res:
116 if isinstance(res['scalar'], list):

File D:\ZhuanYe\python\Lib\site-packages\h2o\expr.py:258, in ExprNode.rapids(expr)
249 @staticmethod
250 def rapids(expr):
251 """
252 Execute a Rapids expression.
253
(...)
256 :returns: The JSON response (as a python dictionary) of the Rapids execution
257 """
--> 258 return h2o.api("POST /99/Rapids", data={"ast": expr, "session_id": h2o.connection().session_id})

File D:\ZhuanYe\python\Lib\site-packages\h2o\h2o.py:123, in api(endpoint, data, json, filename, save_to)
121 # type checks are performed in H2OConnection class
122 _check_connection()
--> 123 return h2oconn.request(endpoint, data=data, json=json, filename=filename, save_to=save_to)

File D:\ZhuanYe\python\Lib\site-packages\h2o\backend\connection.py:499, in H2OConnection.request(self, endpoint, data, json, filename, save_to)
497 save_to = save_to(resp)
498 self._log_end_transaction(start_time, resp)
--> 499 return self._process_response(resp, save_to)
501 except (requests.exceptions.ConnectionError, requests.exceptions.HTTPError) as e:
502 if self._local_server and not self._local_server.is_running():

File D:\ZhuanYe\python\Lib\site-packages\h2o\backend\connection.py:853, in H2OConnection._process_response(response, save_to)
851 if status_code in {400, 404, 412} and isinstance(data, H2OErrorV3):
852 data.show_stacktrace = False
--> 853 raise H2OResponseError(data)
855 # Server errors (notably 500 = "Server Error")
856 # Note that it is possible to receive valid H2OErrorV3 object in this case, however it merely means the server
857 # did not provide the correct status code.
858 raise H2OServerError("HTTP %d %s:\n%s" % (status_code, response.reason, data))

H2OResponseError: Server error java.lang.IllegalArgumentException:
Error: Incorrect number of arguments; 'cols_py' expects 2 but was passed 3
Request: POST /99/Rapids
data: {'ast': "(tmp= py_459_sid_a6a9 (levels (tmp= py_458_sid_a6a9 (as.factor (cols_py py_457_sid_a6a9 np.str_('AT-DILI'))))))", 'session_id': '_sid_a6a9'}

@panyan716 panyan716 added the bug label Dec 17, 2024
@tomasfryda tomasfryda self-assigned this Jan 7, 2025
@tomasfryda
Contributor

Thank you for reporting this issue. Unfortunately, I can't reproduce it without your code, but I think I know what's wrong: we use lazy evaluation, and this error looks like you modified the frame that you're passing to the explain method and that modification has not been evaluated yet. If I'm correct, simply calling print(test) (assuming your frame is called test) before running the explain method should fix it for you.

@panyan716 Could you confirm that?
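
A minimal sketch of the suggestion above, assuming the frame is called test. The traceback shows that reading frame_id goes through _frame() and _eager_frame(), so touching it should force any pending lazy expression to be evaluated in the same way print(test) does. The helper name force_evaluation is hypothetical and is not part of h2o.

```python
def force_evaluation(frame):
    """Touch frame.frame_id so any pending lazy expression gets evaluated.

    `force_evaluation` is a hypothetical helper, not an h2o function. It relies
    only on the `frame_id` property that appears in the traceback above.
    """
    return frame.frame_id


# Intended usage with a real H2OFrame and AutoML object (not run here):
#   force_evaluation(test)
#   aml.explain(test)
```

If the error persists after the frame has been evaluated, lazy evaluation is probably not the cause.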

@panyan716
Author

Thank you for your answer. However, this bug is still not resolved. It is worth noting that when we run “explain”, Learning Curve Plot, Variable Importance, Variable Importance Heatmap, and Model Correlation work fine, only SHAP Summary reports an error.

@panyan716
Author

We are using Jupyter Notebook to run H2O AutoML. Could you please help me find out why the bug occurs? The notebook and data files are attached:
DILI-Copy1.pdf
test0813.csv
Train0813.csv

@tomasfryda
Contributor

@panyan716 Thank you for the additional information; with it I was able to reproduce the bug. It's caused by h2o-3 not supporting numpy>=2.0.0.

A workaround for now is to install an older version of numpy.

$ pip install --force-reinstall 'numpy<2'
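
To check whether the workaround applies to a given environment, a small sketch like the following can report the installed numpy version. It uses only the standard library; the function names is_numpy_2_or_newer and installed_numpy_version are illustrative, not part of h2o or numpy.

```python
from importlib import metadata


def is_numpy_2_or_newer(version):
    """Return True if a version string such as '2.1.3' has major version >= 2."""
    major = int(version.split(".")[0])
    return major >= 2


def installed_numpy_version():
    """Return the installed numpy version string, or None if numpy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


version = installed_numpy_version()
if version is None:
    print("numpy is not installed")
elif is_numpy_2_or_newer(version):
    print(f"numpy {version} found; consider pinning numpy<2")
else:
    print(f"numpy {version} found; no downgrade needed")
```

After downgrading, restart the Jupyter kernel so the older numpy is the one actually imported.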

We already have a ticket for supporting numpy 2, but I don't know when we'll get to it. If you want to follow the issue more closely, it's #16295.
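
One plausible reading of the failing request in the error message, offered as an assumption rather than a confirmed diagnosis: the AST contains the literal text np.str_('AT-DILI'), which is how numpy 2 prints a string scalar, whereas numpy 1 printed it as 'AT-DILI'. The sketch below uses a hypothetical stand-in class NumpyLikeStr (so it runs without numpy) and a hypothetical build_ast helper. It shows how such a repr could leak into an expression string, and how converting to a built-in str first avoids it. It does not reproduce h2o's actual AST-building code.

```python
class NumpyLikeStr(str):
    """Hypothetical stand-in that mimics how numpy 2 prints a string scalar."""

    def __repr__(self):
        return f"np.str_({str.__repr__(self)})"


def build_ast(frame_key, column):
    """Hypothetical helper: embed a column name in a Rapids-like expression."""
    return f"(cols_py {frame_key} {column!r})"


col = NumpyLikeStr("AT-DILI")

# Embedding the numpy-style scalar leaks its type wrapper into the expression.
print(build_ast("py_457_sid_a6a9", col))

# Converting to a built-in str first yields a plain quoted column name.
print(build_ast("py_457_sid_a6a9", str(col)))
```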
