ENH: give a useful error message when .query is used on a dataframe with duplicate column names #60863

Open
1 of 3 tasks
Mo-Gul opened this issue Feb 6, 2025 · 0 comments
Labels
Enhancement Needs Triage Issue that has not been reviewed by a pandas team member


Mo-Gul commented Feb 6, 2025

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

The title says it all. Ideally, .query would raise the same error as the boolean-indexing alternative in the MWE below, rather than a TypeError that never mentions the duplicate column names.

Feature Description

see MWE below

Alternative Solutions

see MWE below

Additional Context

# %%
import pandas as pd

df = pd.DataFrame(
    {
        "A": range(1, 6),
        "B": range(10, 0, -2),
        "C": range(10, 5, -1),
    },
)

# %%
# rename the columns so that two of them share the name "A"
df.columns = ["A", "B", "A"]

# %%
# gives useful error message:
# ValueError: cannot reindex on an axis with duplicate labels
df[(df.A <= 4) & (df.B <= 8)]

# %%
# does not give useful error message:
# TypeError: dtype 'A    int64
# A    int64
# dtype: object' not understood
df.query("A <= 4 and B <= 8")
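Until .query reports this case itself, one possible user-side workaround is to check for duplicate column names before calling it. The sketch below is only an illustration: `query_checked` is a hypothetical helper name, not part of pandas, and the error text is made up rather than taken from pandas.

```python
import pandas as pd


def query_checked(df: pd.DataFrame, expr: str, **kwargs) -> pd.DataFrame:
    """Hypothetical wrapper (not a pandas API): fail early on duplicate column names."""
    # Index.duplicated() marks every repeat of a label after its first occurrence.
    dupes = df.columns[df.columns.duplicated()].unique().tolist()
    if dupes:
        # Illustrative message only; pandas itself does not emit this text.
        raise ValueError(
            f"cannot query a DataFrame with duplicate column names: {dupes}"
        )
    return df.query(expr, **kwargs)


df = pd.DataFrame(
    {
        "A": range(1, 6),
        "B": range(10, 0, -2),
        "C": range(10, 5, -1),
    },
)
df.columns = ["A", "B", "A"]

try:
    query_checked(df, "A <= 4 and B <= 8")
except ValueError as err:
    message = str(err)
    print(message)
```

With unique column names the helper simply forwards to `df.query`, so it can stand in for the direct call in existing code.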