ENH: pd.DataFrame.from_dict() should support loading columns of varying lengths #61282
Comments
Hi! I’d like to work on this.
    df = pd.DataFrame.from_dict(
        {"col1": [1, 2, 3], "col2": [4, 5]},
        autopad=True,
        fill_value=np.nan
    )
Let me know if this approach sounds good!
take
@ShauryaDusht I wonder if it makes sense to have parity between the two (orient='index' and orient='columns') and avoid introducing the new
@nikhilweee
@ShauryaDusht Are you suggesting we add a new option to the Either way I still think it makes sense to just update the behaviour of
@nikhilweee I was referring to your approach: adding a new option to the orient argument itself. That felt like a cleaner and more sensible approach than introducing a separate
@ShauryaDusht Sorry if I was unclear, but I am not suggesting that we add any arguments at all. My suggestion is to merely update the behavior of
@nikhilweee Got it, looks like the second So should I wait for the maintainers' review before starting (it is still in triage), or would it be okay to begin working on it now?
@ShauryaDusht Yes, that's exactly what I meant (I fixed the typo). I think it's a good idea to wait for what the maintainers have to say.
Feature Type
Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas
Problem Description
Creating a dataframe from a dictionary with columns of varying lengths is not supported.
As of pandas 2.2.3, doing so results in:
ValueError: All arrays must be of the same length
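The snippet from the original report did not survive on this page. A minimal sketch that triggers the error, assuming the illustrative dictionary used in the comment thread (the names `data`, `col1`, and `col2` are assumptions, not the reporter's code), might look like this:

```python
import pandas as pd

# Illustrative input (an assumption): two columns of different lengths.
data = {"col1": [1, 2, 3], "col2": [4, 5]}

message = None
try:
    # orient defaults to "columns", so each dict key becomes a column.
    pd.DataFrame.from_dict(data)
except ValueError as exc:
    # pandas refuses to build the frame because the columns are ragged.
    message = str(exc)
    print(f"ValueError: {message}")
```

The exact wording of the message may differ between pandas versions.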
Feature Description
Pandas should automatically pad columns as necessary to make sure they are the same length, especially because that's the behavior when the orient argument is set to 'index'. The following works perfectly fine.
Alternative Solutions
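For reference, the orient='index' case mentioned in the feature description can be sketched as follows. The dictionary and its keys are illustrative assumptions, not the code from the original report.

```python
import pandas as pd

# Illustrative input (an assumption): two entries of different lengths.
data = {"col1": [1, 2, 3], "col2": [4, 5]}

# With orient="index", each dict key becomes a row label and
# shorter rows are padded with NaN instead of raising an error.
df = pd.DataFrame.from_dict(data, orient="index")

print(df.shape)  # (2, 3)
print(df)
```

Here the missing third value of the shorter entry shows up as NaN, which is the padding behavior the request asks for in the default orientation as well.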
Since pandas already supports rows of varying lengths when the orient argument is set to 'index', to load a dictionary where not all columns are the same length, an alternative solution would be to set orient to 'index' and transpose the resulting dataframe.
Additional Context
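The transpose workaround described under Alternative Solutions could look roughly like the sketch below. The input dictionary and the variable name `padded` are assumptions made for illustration.

```python
import pandas as pd

# Illustrative input (an assumption): columns of different lengths.
data = {"col1": [1, 2, 3], "col2": [4, 5]}

# Load the dict row-wise (ragged rows are padded with NaN),
# then transpose so the dict keys become columns again.
padded = pd.DataFrame.from_dict(data, orient="index").T

print(list(padded.columns))  # ['col1', 'col2']
print(padded)
```

One caveat of this route is that transposing a frame with mixed integer and float columns upcasts the values to a common dtype, so an all-integer column such as `col1` is likely to come back as float.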
Since there is a discrepancy in the way pandas handles loading dictionaries based on the value of the orient argument, it would be great to have parity between the two.