Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in _minimize_local_discrete #616

Open
veni-vidi-vici-dormivi opened this issue Feb 6, 2025 · 0 comments · May be fixed by #617 or #493
Open

Bug in _minimize_local_discrete #616

veni-vidi-vici-dormivi opened this issue Feb 6, 2025 · 0 comments · May be fixed by #617 or #493
Labels
bug Something isn't working

Comments

@veni-vidi-vici-dormivi
Copy link
Collaborator

veni-vidi-vici-dormivi commented Feb 6, 2025

I mentioned this already in #413 but then forgot about it. Consider the following case in _minimize_local_discrete:
The very first localized covariance matrix is singular and thus returns a negative loglikelyhood of inf.
Then current_min is still inf. res is also inf, a warning gets raised but we continue. res < current_min in this case becomes float("inf") < float("inf") and yields False. Thus we land at sel = i - 1 and sel = -1. We return sequence[-1]`, in this case the last localization radius, without ever checking its likelihood or if the resulting localized covariance matrix is not singular. In my opinion, this is a bug and we should fix it before releasing MESMER v1.0.

It can be fixed by actually skipping results that are inf (rather complicated) or raising an Error if the first element already results is inf if we think that once the localization radii return singular matrices the rest will also always return singular matrices. This is inuitively true when the radii are ordered (which we do not explicitly require) and higher radii lead to less localization and thus to "too little regularized" matrices. At the moment, we always stop when res == inf and return the element from the iteration before (NOTE: I think the comment in the code is misleading).

current_min = float("inf")
# ensure it's a list because we cannot get an item from an iterable
sequence = list(sequence)
element = None
for i, element in enumerate(sequence):
res = func(element, **kwargs)
if np.isneginf(res):
raise ValueError("`fun` returned `-inf`")
# skip element if inf is returned - not sure about this?
elif np.isinf(res):
warnings.warn("`fun` returned `inf`", OptimizeWarning)
if res < current_min:
current_min = res
else:
# need to return element from the previous iteration
sel = i - 1
if sel == 0:
warnings.warn("First element is local minimum.", OptimizeWarning)
return sequence[sel]
warnings.warn("No local minimum found, returning the last element", OptimizeWarning)
return element

Also check the following test cases:

@pytest.mark.parametrize(
    "values, expected",
    [((5, 4, 3, 2, 3, 0), 3), ((1, 0, 1, 2), 1), ((float("inf"), 1, 2, 3), 1),
     ((3, 2, 1, float("inf"), 0), 4), ((float("inf"), float("inf"), float("inf")), 2),
     ((3, float("inf"), 2, 1, float("inf"), 2), 3)],
)
def test_minimize_local_discrete(values, expected):

    data_dict = {key: value for key, value in enumerate(values)}

    def func(i):
        return data_dict[i]

    result = mesmer.core.utils._minimize_local_discrete(func, data_dict.keys())

    assert result == expected
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
1 participant