Avoid deep copying of the attrs when making arrow #509

Merged
merged 1 commit into from
Nov 25, 2024

Conversation

berland
Collaborator

@berland berland commented Nov 25, 2024

Pandas 2.2 changed how it treats the experimental 'attrs': it now deep-copies the entire dict (covering all columns) each time dframe[colname] is called.

Rather than calling dframe[colname] for every column, we now prepare the values for all columns upfront.
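
For readers who want to see the idea in isolation, here is a minimal sketch. It is not the code from this PR: the helper name `column_arrays_upfront`, the example column names, and the layout of the `attrs` dict are all made up for illustration, and it assumes every column shares one numeric dtype. It converts the frame once with `DataFrame.to_numpy()` and slices columns out of the resulting array, so `dframe[colname]` is never called.

```python
import numpy as np
import pandas as pd


def column_arrays_upfront(dframe: pd.DataFrame) -> dict[str, np.ndarray]:
    """Return one NumPy array per column without indexing dframe[colname].

    Hypothetical helper for illustration; assumes a homogeneous numeric frame.
    """
    # Convert the whole frame once. No per-column Series is built here, so
    # the attrs dict should not be propagated (and copied) once per column.
    values = dframe.to_numpy()
    return {
        str(colname): values[:, idx]
        for idx, colname in enumerate(dframe.columns)
    }


# Hypothetical example data: per-column metadata stored in attrs.
dframe = pd.DataFrame({"FOPT": [0.0, 1.5, 3.0], "FOPR": [10.0, 9.0, 8.0]})
dframe.attrs["meta"] = {"FOPT": {"unit": "SM3"}, "FOPR": {"unit": "SM3/DAY"}}

arrays = column_arrays_upfront(dframe)
```

The per-column arrays can then be handed to whatever builds the Arrow table. With mixed dtypes, `to_numpy()` falls back to an object array, so a real implementation would likely need to handle that case separately.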

@berland berland linked an issue Nov 25, 2024 that may be closed by this pull request
@berland berland added the bug Something isn't working label Nov 25, 2024
@berland berland self-assigned this Nov 25, 2024
@berland
Collaborator Author

berland commented Nov 25, 2024

As a bonus, the code is now 25% faster on Pandas 2.1.4 as well.

@berland berland merged commit 5d1d58d into master Nov 25, 2024
10 checks passed
Development

Successfully merging this pull request may close these issues.

Solve catastrophic memory issue with Pandas 2.2
2 participants