
validator error #7

Open
ABarcaru opened this issue Aug 16, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@ABarcaru

Hi, I am trying to use the Python package as follows:

nmi = NormalizedMI(normalize_method='joint', k=5, verbose=False, n_dims=1)
nmi.fit_transform(X)

and I am hitting the same error:

"BeartypeCallHintReturnViolation: Function normi._estimators.kraskov_estimator() return (np.float64(0.0), np.float64(2.084952429900486), np.float64(1.0187914941452727), np.float64...)) violates type hint tuple[typing.Annotated[typing.Union[float, numpy.floating], Is[lambda arr: bool(np.all(arr > 0))]], typing.Union[float, numpy.floating], typing.Union[float, numpy.floating], typing.Union[float, numpy.floating]], as tuple index 0 item <class "numpy.float64"> np.float64(0.0) violates validator Is[lambda arr: bool(np.all(arr > 0))]:
False == Is[lambda arr: bool(np.all(arr > 0))]."

My X is a 56x20 NumPy array of dtype float. It works in two other cases with the same size and the same instantiation, but fails in this one.
What could be the problem?
test_array.csv
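For context, a hypothetical minimal setup matching the description above (56x20 float array, independent random columns); the normi import and call are commented out since they require the package installed, and the seed and data are placeholders, not the attached test_array.csv:

```python
import numpy as np

# from normi import NormalizedMI  # requires the normi package

rng = np.random.default_rng(0)
X = rng.normal(size=(56, 20)).astype(float)  # same shape/dtype as reported

# nmi = NormalizedMI(normalize_method='joint', k=5, verbose=False, n_dims=1)
# nmi.fit_transform(X)  # may raise BeartypeCallHintReturnViolation when MI == 0
print(X.shape, X.dtype)
```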

@braniii
Member

braniii commented Sep 4, 2024

@ABarcaru thank you so much for taking the time to open this issue. It might be linked to #6, but looking at the provided test_array I am not sure it is related. I will try to address #6 in the next few weeks and hope that it will solve your problem as well. Sorry for that.

@PerryL-y

PerryL-y commented Nov 4, 2024

In the kraskov_estimator function, the mutual information mi is finally clamped as np.max([digamma_N + digamma_k - digamma_nx - digamma_ny, 0]), which means it may return exactly 0. So maybe it should be type-annotated as a non-negative float instead of a positive float?

Indeed, when I test the NormalizedMI class with independent random sequences, it correctly estimates zero mutual information between them, but throws an error due to the beartype type checking.
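To make the mismatch concrete, here is a small sketch (not the library's code) of the clamping described above, together with validator lambdas mirroring the strictly-positive check from the error message and the proposed non-negative alternative:

```python
import numpy as np

def clamp_mi(digamma_N, digamma_k, digamma_nx, digamma_ny):
    # Same clamping as described for kraskov_estimator: never below zero.
    return np.max([digamma_N + digamma_k - digamma_nx - digamma_ny, 0])

# Validators mirroring the beartype Is[...] annotations in question.
is_strictly_positive = lambda arr: bool(np.all(arr > 0))   # current check
is_non_negative      = lambda arr: bool(np.all(arr >= 0))  # proposed check

mi = clamp_mi(4.0, 1.0, 3.0, 2.5)  # raw value -0.5 is clamped to 0.0
print(mi)                        # 0.0
print(is_strictly_positive(mi))  # False -> BeartypeCallHintReturnViolation
print(is_non_negative(mi))       # True  -> would pass
```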

@jessLryan

jessLryan commented Nov 11, 2024

@braniii I have been getting a similar error when calling fit or fit_transform. I am not able to share the data I am working with, but have included the error details below. There are 2 features with the same dimensions, containing floats. Thank you!

BeartypeCallHintReturnViolation: Function normi._estimators.kraskov_estimator() return (0.0, -33.75037660839038, -21.184763217160228, -13.703348491372974) violates type hint tuple[typing.Annotated[typing.Union[float, numpy.floating], beartype.vale.Is[lambda arr: bool(np.all(arr > 0))]], typing.Union[float, numpy.floating], typing.Union[float, numpy.floating], typing.Union[float, numpy.floating]], as tuple index 0 item <class "numpy.float64"> 0.0 violates validator beartype.vale.Is[lambda arr: bool(np.all(arr > 0))]:
False == beartype.vale.Is[lambda arr: bool(np.all(arr > 0))].

@jessLryan

@braniii from an initial look (and quite a basic understanding of what is happening here), it seems that the MI value returned by the kraskov_estimator function is required to be a PositiveFloat, which in turn is required to satisfy IsStrictlyPositive, so an MI value of 0.0 results in an error. Should this requirement be non-negative rather than strictly positive?

@braniii
Member

braniii commented Nov 12, 2024

@ABarcaru @jessLryan sorry for the inconvenience. I guess the limitation is the assumption of locally constant density in the KSG estimator (which this method is based on). I will try to have a look into it.

@jessLryan

Presumably, if MI is zero, then NMI also ought to be zero?
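A sketch supporting that point, under the assumption that the normalizer is the joint entropy hxy (as with normalize_method='joint') and is strictly positive; a mutual information of exactly zero then normalizes to exactly zero:

```python
# Hypothetical normalization step; names mirror the estimator's outputs
# (mi, hxy) but this is an illustration, not the library's implementation.
def normalized_mi(mi: float, hxy: float) -> float:
    # Assumes hxy > 0, so zero MI maps to zero NMI exactly.
    return mi / hxy

print(normalized_mi(0.0, 2.085))  # 0.0
```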

@agheeraert

agheeraert commented Mar 26, 2025

@braniii and others, a quick and dirty fix that worked for me is to change the return statement of the kraskov_estimator function in src/normi/_estimators.py from

    return (
        np.max([digamma_N + digamma_k - digamma_nx - digamma_ny, 0]),  # mi
        digamma_N - digamma_k + (dx + dy) * mean_log_eps,  # hxy
        digamma_N - digamma_nx + dx * mean_log_eps,  # hx
        digamma_N - digamma_ny + dy * mean_log_eps,  # hy
    )

to

    return (
        np.max([digamma_N + digamma_k - digamma_nx - digamma_ny, 1e-10]),  # mi
        digamma_N - digamma_k + (dx + dy) * mean_log_eps,  # hxy
        digamma_N - digamma_nx + dx * mean_log_eps,  # hx
        digamma_N - digamma_ny + dy * mean_log_eps,  # hy
    )

I used to work a lot with the KSG estimator for MI estimation (not anymore), and dirty fixes like this were often needed on real-world data.
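To illustrate the workaround's effect in isolation (not the library's actual code): flooring the clamped MI at a tiny epsilon instead of zero satisfies a strictly-positive validator, at the cost of reporting a vanishingly small nonzero MI for independent data.

```python
import numpy as np

def clamp(raw_mi, floor):
    # Floor the raw KSG estimate; floor=0.0 matches the original code,
    # floor=1e-10 matches the workaround above.
    return float(np.max([raw_mi, floor]))

raw = -0.5  # independent sequences can yield a negative raw KSG estimate
print(clamp(raw, 0.0))    # 0.0   -> fails Is[lambda arr: bool(np.all(arr > 0))]
print(clamp(raw, 1e-10))  # 1e-10 -> passes the strict check
```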
