
PERF: Improve efficiency of BlockValuesRefs #59598

Merged 2 commits into pandas-dev:main on Aug 28, 2024

Conversation

@Tolker-KU (Contributor) commented Aug 24, 2024

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Improve efficiency of BlockValuesRefs by allowing for more efficient C code-gen

@Tolker-KU requested a review from WillAyd as a code owner August 24, 2024 21:21
@Tolker-KU marked this pull request as draft August 24, 2024 21:34
@Tolker-KU marked this pull request as ready for review August 24, 2024 21:40
@rhshadrach (Member) commented:

@Tolker-KU - can you run and post ASV results?

@rhshadrach added the Performance, Copy / view semantics, and Enhancement labels Aug 25, 2024
@Tolker-KU (Contributor, Author) commented:

> @Tolker-KU - can you run and post ASV results?

Thanks for taking a look at this.

I've tried hard to get meaningful runs of the ASV benchmarks, but I think my computer is so old and slow that noise dominates the results: some benchmarks come out much faster and some much slower.

To show the gains from this change, I ran the micro-benchmark below:

>>> import numpy as np
>>> import pandas as pd
>>> N = 10_000
>>> arr = np.random.normal(size=(50, N))
>>> df = pd.DataFrame(arr)
>>> %timeit -n10000 list(df.iterrows())
2.43 ms ± 268 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)  <- before
2.1 ms ± 117 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)   <- after

@@ -890,12 +890,12 @@ cdef class BlockValuesRefs:

     def __cinit__(self, blk: Block | None = None) -> None:
         if blk is not None:
-            self.referenced_blocks = [weakref.ref(blk)]
+            self.referenced_blocks = [PyWeakref_NewRef(blk, None)]
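For context, a minimal sketch of the pattern the diff relies on, using placeholder names rather than the actual pandas module: cimporting the CPython C-API function lets Cython emit a direct C call instead of a weakref module attribute lookup followed by a generic Python-level call.

# A minimal sketch, assuming a standalone Cython extension; Owner is a
# hypothetical class, not the pandas BlockValuesRefs implementation.
from cpython.weakref cimport PyWeakref_NewRef

cdef class Owner:
    cdef public list referenced_blocks

    def __cinit__(self, obj=None) -> None:
        if obj is not None:
            # Direct C-API call: the generated C is essentially a single
            # PyWeakref_NewRef(...) call instead of a module attribute
            # lookup plus a generic Python call.
            self.referenced_blocks = [PyWeakref_NewRef(obj, None)]
        else:
            self.referenced_blocks = []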
Member (reviewer) commented:

Does this actually change anything? I am under the impression that Cython would be smart enough to handle this the same in both cases.

Instead of looking at the benchmarks (which can be difficult to run and flaky), you can also inspect the Cython annotations before/after the change. Are they actually different?
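For reference, the annotations mentioned here are Cython's annotated HTML output (cython -a), which shades lines according to how much Python C-API interaction the generated C contains. A minimal sketch of producing it, assuming the class lives in pandas/_libs/internals.pyx and the snippet is run from a pandas checkout:

# A minimal sketch, not part of the PR: generate Cython's annotated HTML for
# the module so the before/after code-gen can be compared line by line.
from Cython.Build import cythonize

# Writes internals.c and internals.html next to the .pyx; heavily shaded lines
# in the HTML indicate more Python/C-API interaction in the generated C.
cythonize("pandas/_libs/internals.pyx", annotate=True)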

@Tolker-KU (Contributor, Author) commented:

Thanks, glad you brought that up. I should have been more explicit in the description: the motivation for this PR is that the generated code changes quite dramatically. An example from one of the methods is below; note that before the change, the code generated for the method does not fit on one screen.

Before: [screenshot of the generated code before the change]

After: [screenshot of the generated code after the change]

Member (reviewer) commented:

Thanks for posting. That's... quite the difference.

@mroeschke added this to the 3.0 milestone Aug 28, 2024
@mroeschke merged commit 91541c1 into pandas-dev:main Aug 28, 2024
56 of 59 checks passed
@mroeschke (Member) commented:

Thanks @Tolker-KU

@Tolker-KU deleted the perf-internals branch August 28, 2024 17:36