DEPR: DataFrame.lookup #35224

Merged
merged 36 commits into from
Sep 17, 2020
Commits
5bbddce
DEPR - 18262 - deprecate lookup
erfannariman Jul 11, 2020
02b50ac
DEPR - 18262 - changes black
erfannariman Jul 11, 2020
ca8bf05
Add test for deprecation lookup
erfannariman Jul 11, 2020
dea45b9
Added deprecation to whatsnew
erfannariman Jul 11, 2020
5f116ad
Add test for deprecation lookup
erfannariman Jul 11, 2020
0c04e90
FIX - 18262 - add warnings to other tests
erfannariman Jul 11, 2020
758468d
Merge remote-tracking branch 'upstream/master' into 18262-depr-lookup
erfannariman Jul 11, 2020
7ac3b32
DOC - 18262 - added example to lookup values
erfannariman Jul 11, 2020
2ab80cf
DOC - 18262 - point to example in depr
erfannariman Jul 11, 2020
94a6c0f
FIX - 18262 - deprecation warning before summary
erfannariman Jul 11, 2020
a339131
FIX - 18262 - whitespaces after comma
erfannariman Jul 11, 2020
269e4cc
FIX - 18262 - removed double line break
erfannariman Jul 11, 2020
0c40d69
Fix variable in ipython
erfannariman Jul 11, 2020
1ca23bc
18262 - removed linebreak
erfannariman Jul 12, 2020
9681a3d
18262 - Fix merge conflict
erfannariman Jul 15, 2020
6342ad2
18262 - replaced depr message
erfannariman Jul 15, 2020
3dfe19d
18262 - line break too long line
erfannariman Jul 15, 2020
187d47b
18262 - set header size
erfannariman Jul 15, 2020
db63df7
[FIX] - 18262 - Merge conflict
erfannariman Jul 28, 2020
227fad5
[FIX] - 18262 - removed extra dash header
erfannariman Jul 28, 2020
ce775ce
[FIX] - 18262 - reference to section in docs
erfannariman Jul 28, 2020
2ee7d09
[FIX] - 18262 - grammar / typos in docstring
erfannariman Jul 28, 2020
4c78311
Merge branch 'master' into 18262-depr-lookup
erfannariman Jul 28, 2020
f447185
Merge branch 'master' into 18262-depr-lookup
erfannariman Sep 13, 2020
dc2d367
moved depr version to 1.2
erfannariman Sep 13, 2020
293bd7a
test with linking to user guide
erfannariman Sep 13, 2020
cbca163
Remove line break
erfannariman Sep 13, 2020
90fa6a9
Merge branch 'master' into 18262-depr-lookup
erfannariman Sep 13, 2020
3eefd8e
Merge branch 'master' into 18262-depr-lookup
erfannariman Sep 14, 2020
4c3c163
Revert whatsnew v1.1.0
erfannariman Sep 14, 2020
b5a34e3
Added depr message in whatsnew v1.2.0
erfannariman Sep 14, 2020
ba4fb8a
replace query with loc
erfannariman Sep 14, 2020
6b91db6
add melt and loc to depr msg
erfannariman Sep 14, 2020
ff7724f
add dot
erfannariman Sep 14, 2020
104e3cb
added colon hyperlink
erfannariman Sep 15, 2020
25e78dd
updates
erfannariman Sep 16, 2020
22 changes: 16 additions & 6 deletions doc/source/user_guide/indexing.rst
@@ -1480,17 +1480,27 @@ default value.
     s.get('a')  # equivalent to s['a']
     s.get('x', default=-1)

-The :meth:`~pandas.DataFrame.lookup` method
--------------------------------------------
+.. _indexing.lookup:
+
+Looking up values by index/column labels
+----------------------------------------

 Sometimes you want to extract a set of values given a sequence of row labels
-and column labels, and the ``lookup`` method allows for this and returns a
-NumPy array. For instance:
+and column labels. This can be achieved by ``DataFrame.melt`` combined with filtering the corresponding
+rows with ``DataFrame.loc``. For instance:

 .. ipython:: python

-    dflookup = pd.DataFrame(np.random.rand(20, 4), columns = ['A', 'B', 'C', 'D'])
-    dflookup.lookup(list(range(0, 10, 2)), ['B', 'C', 'A', 'B', 'D'])
+    df = pd.DataFrame({'col': ["A", "A", "B", "B"],
+                       'A': [80, 23, np.nan, 22],
+                       'B': [80, 55, 76, 67]})
+    df
+    melt = df.melt('col')
+    melt = melt.loc[melt['col'] == melt['variable'], 'value']
+    melt.reset_index(drop=True)
+
+Formerly this could be achieved with the dedicated ``DataFrame.lookup`` method
+which was deprecated in version 1.2.0.

 .. _indexing.class:

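As a side note to the documentation change above: the melt-based replacement returns matches grouped by column rather than in the original row order, which matters when the label column is not sorted. The sketch below compares it with a positional NumPy lookup. `lookup_pairs` is a hypothetical helper written for this illustration; it is not part of pandas or of this PR, and the frame is a variation of the user guide example with the labels shuffled.

```python
import numpy as np
import pandas as pd


def lookup_pairs(frame, row_labels, col_labels):
    """Hypothetical helper: return the value at each (row, col) label pair."""
    rows = frame.index.get_indexer(list(row_labels))
    cols = frame.columns.get_indexer(list(col_labels))
    # get_indexer marks labels it cannot find with -1.
    if (rows == -1).any() or (cols == -1).any():
        raise KeyError("One or more labels was not found")
    return frame.to_numpy()[rows, cols]


# Same columns as the user guide example, but with an unsorted label column.
df = pd.DataFrame(
    {"col": ["B", "A", "B", "A"], "A": [80, 23, np.nan, 22], "B": [80, 55, 76, 67]}
)

# melt + loc: stack the value columns, then keep rows whose label matches.
melted = df.melt("col")
via_melt = melted.loc[melted["col"] == melted["variable"], "value"]
via_melt = via_melt.reset_index(drop=True)

# Positional lookup: one value per row, in row order.
via_numpy = lookup_pairs(df, df.index, df["col"])

print(via_melt.tolist())
print(list(via_numpy))
```

With this input the melt result comes back as 23, 22, 80, 76 (matches in column `A` first, then column `B`), while the positional lookup yields 80, 23, 76, 22 in row order. If row order matters, sorting or realigning the melted result is an extra step to consider.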
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.2.0.rst
@@ -209,6 +209,7 @@ Deprecations
 - Deprecated parameter ``inplace`` in :meth:`MultiIndex.set_codes` and :meth:`MultiIndex.set_levels` (:issue:`35626`)
 - Deprecated parameter ``dtype`` in :meth:`~Index.copy` on all index classes. Use the :meth:`Index.astype` method instead for changing dtype (:issue:`35853`)
 - Date parser functions :func:`~pandas.io.date_converters.parse_date_time`, :func:`~pandas.io.date_converters.parse_date_fields`, :func:`~pandas.io.date_converters.parse_all_fields` and :func:`~pandas.io.date_converters.generic_parser` from ``pandas.io.date_converters`` are deprecated and will be removed in a future version; use :func:`to_datetime` instead (:issue:`35741`)
+- :meth:`DataFrame.lookup` is deprecated and will be removed in a future version; use :meth:`DataFrame.melt` and :meth:`DataFrame.loc` instead (:issue:`18682`)

 .. ---------------------------------------------------------------------------

15 changes: 14 additions & 1 deletion pandas/core/frame.py
@@ -3843,10 +3843,15 @@ def _series(self):
     def lookup(self, row_labels, col_labels) -> np.ndarray:
         """
         Label-based "fancy indexing" function for DataFrame.
-
         Given equal-length arrays of row and column labels, return an
         array of the values corresponding to each (row, col) pair.

+        .. deprecated:: 1.2.0
+            DataFrame.lookup is deprecated,
+            use DataFrame.melt and DataFrame.loc instead.
+            For an example see :ref:`indexing.lookup`
+            in the user guide.
+
         Parameters
         ----------
         row_labels : sequence
@@ -3859,6 +3864,14 @@ def lookup(self, row_labels, col_labels) -> np.ndarray:
         numpy.ndarray
             The found values.
         """
+        msg = (
+            "The 'lookup' method is deprecated and will be "
+            "removed in a future version. "
+            "You can use DataFrame.melt and DataFrame.loc "
+            "as a substitute."
+        )
+        warnings.warn(msg, FutureWarning, stacklevel=2)
+
         n = len(row_labels)
         if n != len(col_labels):
             raise ValueError("Row labels must have same size as column labels")
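Python joins adjacent string literals with no separator, so a message split across several lines needs an explicit space at the end of each fragment. The sketch below uses a hypothetical `old_api` function (not pandas code) to show a `FutureWarning` raised with `stacklevel=2` and one way to capture the emitted message for inspection.

```python
import warnings


def old_api():
    """Hypothetical deprecated function, standing in for a method like lookup."""
    msg = (
        "The 'old_api' function is deprecated and will be "
        "removed in a future version. "
        "Use 'new_api' as a substitute."
    )
    # stacklevel=2 attributes the warning to the caller's line, not this one.
    warnings.warn(msg, FutureWarning, stacklevel=2)
    return 42


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not filtered out
    result = old_api()

message = str(caught[0].message)
print(message)
```

Printing the captured message is a quick way to spot fragments that ran together, such as `"will be"` followed directly by `"removed"`.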
36 changes: 27 additions & 9 deletions pandas/tests/frame/indexing/test_indexing.py
@@ -1340,7 +1340,8 @@ def test_lookup_float(self, float_frame):
         df = float_frame
         rows = list(df.index) * len(df.columns)
         cols = list(df.columns) * len(df.index)
-        result = df.lookup(rows, cols)
+        with tm.assert_produces_warning(FutureWarning):
+            result = df.lookup(rows, cols)

         expected = np.array([df.loc[r, c] for r, c in zip(rows, cols)])
         tm.assert_numpy_array_equal(result, expected)
@@ -1349,7 +1350,8 @@ def test_lookup_mixed(self, float_string_frame):
         df = float_string_frame
         rows = list(df.index) * len(df.columns)
         cols = list(df.columns) * len(df.index)
-        result = df.lookup(rows, cols)
+        with tm.assert_produces_warning(FutureWarning):
+            result = df.lookup(rows, cols)

         expected = np.array(
             [df.loc[r, c] for r, c in zip(rows, cols)], dtype=np.object_
@@ -1365,7 +1367,8 @@ def test_lookup_bool(self):
                 "mask_c": [False, True, False, True],
             }
         )
-        df["mask"] = df.lookup(df.index, "mask_" + df["label"])
+        with tm.assert_produces_warning(FutureWarning):
+            df["mask"] = df.lookup(df.index, "mask_" + df["label"])

         exp_mask = np.array(
             [df.loc[r, c] for r, c in zip(df.index, "mask_" + df["label"])]
@@ -1376,13 +1379,16 @@ def test_lookup_bool(self):

     def test_lookup_raises(self, float_frame):
         with pytest.raises(KeyError, match="'One or more row labels was not found'"):
-            float_frame.lookup(["xyz"], ["A"])
+            with tm.assert_produces_warning(FutureWarning):
+                float_frame.lookup(["xyz"], ["A"])

         with pytest.raises(KeyError, match="'One or more column labels was not found'"):
-            float_frame.lookup([float_frame.index[0]], ["xyz"])
+            with tm.assert_produces_warning(FutureWarning):
+                float_frame.lookup([float_frame.index[0]], ["xyz"])

         with pytest.raises(ValueError, match="same size"):
-            float_frame.lookup(["a", "b", "c"], ["a"])
+            with tm.assert_produces_warning(FutureWarning):
+                float_frame.lookup(["a", "b", "c"], ["a"])

     def test_lookup_requires_unique_axes(self):
         # GH#33041 raise with a helpful error message
@@ -1393,14 +1399,17 @@ def test_lookup_requires_unique_axes(self):

         # homogeneous-dtype case
         with pytest.raises(ValueError, match="requires unique index and columns"):
-            df.lookup(rows, cols)
+            with tm.assert_produces_warning(FutureWarning):
+                df.lookup(rows, cols)
         with pytest.raises(ValueError, match="requires unique index and columns"):
-            df.T.lookup(cols, rows)
+            with tm.assert_produces_warning(FutureWarning):
+                df.T.lookup(cols, rows)

         # heterogeneous dtype
         df["B"] = 0
         with pytest.raises(ValueError, match="requires unique index and columns"):
-            df.lookup(rows, cols)
+            with tm.assert_produces_warning(FutureWarning):
+                df.lookup(rows, cols)

     def test_set_value(self, float_frame):
         for idx in float_frame.index:
@@ -2232,3 +2241,12 @@ def test_object_casting_indexing_wraps_datetimelike():
     assert blk.dtype == "m8[ns]"  # we got the right block
     val = blk.iget((0, 0))
     assert isinstance(val, pd.Timedelta)
+
+
+def test_lookup_deprecated():
+    # GH18262
+    df = pd.DataFrame(
+        {"col": ["A", "A", "B", "B"], "A": [80, 23, np.nan, 22], "B": [80, 55, 76, 67]}
+    )
+    with tm.assert_produces_warning(FutureWarning):
+        df.lookup(df.index, df["col"])
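The tests above nest a warning check inside an exception check. In `lookup` the deprecation warning is emitted before any validation error is raised, so the warning has already been recorded by the time the exception propagates outward. A minimal sketch of the same pattern, using pytest's public `pytest.warns` in place of the pandas-internal `tm.assert_produces_warning`; `deprecated_lookup` is a hypothetical stand-in, not pandas code.

```python
import warnings

import pytest


def deprecated_lookup(table, key):
    """Hypothetical stand-in: warn first, then fail on a missing key."""
    warnings.warn("deprecated_lookup is deprecated.", FutureWarning, stacklevel=2)
    if key not in table:
        raise KeyError("One or more row labels was not found")
    return table[key]


def test_warns_then_raises():
    # Outer context checks the exception, inner context checks the warning.
    with pytest.raises(KeyError, match="row labels was not found"):
        with pytest.warns(FutureWarning, match="is deprecated"):
            deprecated_lookup({"a": 1}, "xyz")


def test_warns_and_returns():
    # The successful path still warns and returns the stored value.
    with pytest.warns(FutureWarning):
        assert deprecated_lookup({"a": 1}, "a") == 1
```

If the warning were issued only after validation, the error cases would raise without ever warning, and the nesting would need to be reconsidered.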