Row dropped with pandas read_csv on linux #1120
Comments
Thanks - that's weird... I can even reproduce it under Windows.
Thanks for the report - this is certainly a bug in pyfakefs that has never been noticed before. It seems to be related to I/O buffering. Just for reference: it looks like the last line is not written in the fake fs if it crosses the end of the buffer. The default buffer size is 8192 bytes, and the size of your data is a bit over that. Line endings differ between Windows and Unix, so the file under Windows is larger by the number of line endings in the file, which may be why the same data sometimes works under Windows but not under Linux (and vice versa).
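To illustrate the line-ending arithmetic in the comment above, here is a small sketch. The 177-character row is a made-up value chosen so that the two totals land on either side of 8192; it is not the reporter's data. The code only does the size arithmetic and does not exercise pyfakefs.

```python
BUFFER_SIZE = 8192  # default buffer size quoted in the comment above


def file_size(n_rows: int, row: str, newline: str) -> int:
    """Bytes needed for n_rows copies of row, each terminated by newline."""
    return n_rows * len((row + newline).encode("utf-8"))


row = "x" * 177  # hypothetical row content, not the reporter's data
n_rows = 46      # row count from the report

unix_size = file_size(n_rows, row, "\n")       # 46 * 178 = 8188
windows_size = file_size(n_rows, row, "\r\n")  # 46 * 179 = 8234

print(unix_size, unix_size > BUFFER_SIZE)        # 8188 False
print(windows_size, windows_size > BUFFER_SIZE)  # 8234 True
print(windows_size - unix_size)                  # 46, one extra byte per line
```

With these assumed values the same rows stay under the buffer size with Unix line endings and go over it with Windows line endings, which is the kind of platform difference described above.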
@K-Meech - Should be fixed in main now, please check if it works for you.
Thanks @mrbean-bremen! Just checked, and everything is working on main now 😄
Ok - let me know if you need a patch release now, otherwise I will first check a couple of other issues.
As an aside: the bug was a simple mistake that has been there for about 5 years now without being noticed - and it shows again how important it is to think about edge cases while writing tests...
No rush for a release from our side, so feel free to check other issues first. Thanks again for the quick fix!
Describe the bug
When using pyfakefs with pandas, sometimes a single row is dropped on write / read. This only occurs on Linux systems (tested on an Ubuntu laptop), with no issue on Windows. Totally understand if this issue is out of scope, as there are known issues with pandas listed in the docs!
How To Reproduce
Run the following on a Linux system via pytest:
Once read back, the dataframe has lost one row (46 rows written, 45 read). Changing pretty much anything about this dataframe (e.g. the column names or the number of rows) makes the test pass.
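For readers who want to try a similar check, below is a minimal stdlib-only sketch of a write/read round trip. It is not the reporter's test: the function names, the row widths, and the use of `csv` instead of pandas are all assumptions made for illustration. It writes 46-row files whose sizes fall roughly on either side of 8192 bytes and counts the rows that come back. On a real filesystem every count should be 46; the same functions could be pointed at a path inside a fake filesystem to look for a dropped row.

```python
import csv
import os
import tempfile


def round_trip_row_count(path, n_rows, row_width):
    """Write n_rows CSV rows to path, read them back, return the number read."""
    rows = [["x" * row_width, str(i)] for i in range(n_rows)]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    with open(path, newline="") as f:
        return sum(1 for _ in csv.reader(f))


def check_sizes_around_buffer(directory, n_rows=46):
    """Round-trip files of increasing size; return (row_width, rows_read) pairs."""
    results = []
    # Hypothetical widths: chosen so the file sizes straddle 8192 bytes.
    for row_width in range(170, 181):
        path = os.path.join(directory, f"data_{row_width}.csv")
        results.append((row_width, round_trip_row_count(path, n_rows, row_width)))
    return results


with tempfile.TemporaryDirectory() as tmp:
    for width, count in check_sizes_around_buffer(tmp):
        # Any count other than 46 would mean a row was lost on the round trip.
        assert count == 46, f"row_width={width}: read {count} rows"
```

Sweeping a range of sizes rather than testing one fixed dataframe is a deliberate choice here, since the report notes that small changes to the data make the problem disappear.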
Your environment
I'm running on WSL, but a colleague had the same issue on their Ubuntu system: