Update Python example #444

Merged: 11 commits merged into egraphs-good:main from saulshanabrook:update-python-example on Oct 24, 2024

Conversation

@saulshanabrook (Member) commented Oct 17, 2024

In this PR we change the benchmarks to only run long-running examples instead of all of them. This is to reduce variance caused by nondeterminism in the memory allocator, which is especially prominent in short benchmarks (oxc-project/backlog#89).

It also updates the Python example to more accurately reflect the current workload: optimizing the program and extracting it back out as a final Python string, all with common subexpression elimination (CSE), so the example file is smaller than before. It should output a final string like this:

def __fn(X, y):
    assert X.dtype == np.dtype(np.float64)
    assert X.shape == (150, 4,)
    assert np.all(np.isfinite(X))
    assert y.dtype == np.dtype(np.int64)
    assert y.shape == (150,)
    assert set(np.unique(y)) == set((0, 1, 2,))
    _0 = y == np.array(0)
    _1 = np.sum(_0)
    _2 = y == np.array(1)
    _3 = np.sum(_2)
    _4 = y == np.array(2)
    _5 = np.sum(_4)
    _6 = np.array((_1, _3, _5,)).astype(np.dtype(np.float64))
    _7 = _6 / np.array(float(150))
    _8 = np.zeros((3, 4,), dtype=np.dtype(np.float64))
    _9 = np.sum(X[_0], axis=0)
    _10 = _9 / np.array(X[_0].shape[0])
    _8[0, :,] = _10
    _11 = np.sum(X[_2], axis=0)
    _12 = _11 / np.array(X[_2].shape[0])
    _8[1, :,] = _12
    _13 = np.sum(X[_4], axis=0)
    _14 = _13 / np.array(X[_4].shape[0])
    _8[2, :,] = _14
    _15 = _7 @ _8
    _16 = X - _15
    _17 = np.sqrt(np.array(float(1 / 147)))
    _18 = X[_0] - _8[0, :,]
    _19 = X[_2] - _8[1, :,]
    _20 = X[_4] - _8[2, :,]
    _21 = np.concatenate((_18, _19, _20,), axis=0)
    _22 = np.sum(_21, axis=0)
    _23 = _22 / np.array(_21.shape[0])
    _24 = np.expand_dims(_23, 0)
    _25 = _21 - _24
    _26 = np.square(_25)
    _27 = np.sum(_26, axis=0)
    _28 = _27 / np.array(_26.shape[0])
    _29 = np.sqrt(_28)
    _30 = _29 == np.array(0)
    _29[_30] = np.array(float(1))
    _31 = _21 / _29
    _32 = _17 * _31
    _33 = np.linalg.svd(_32, full_matrices=False)
    _34 = _33[1] > np.array(0.0001)
    _35 = _34.astype(np.dtype(np.int32))
    _36 = np.sum(_35)
    _37 = _33[2][:_36, :,] / _29
    _38 = _37.T / _33[1][:_36]
    _39 = np.array(150) * _7
    _40 = _39 * np.array(float(1 / 2))
    _41 = np.sqrt(_40)
    _42 = _8 - _15
    _43 = _41 * _42.T
    _44 = _43.T @ _38
    _45 = np.linalg.svd(_44, full_matrices=False)
    _46 = np.array(0.0001) * _45[1][0]
    _47 = _45[1] > _46
    _48 = _47.astype(np.dtype(np.int32))
    _49 = np.sum(_48)
    _50 = _38 @ _45[2].T[:, :_49,]
    _51 = _16 @ _50
    return _51[:, :2,]
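
As a rough illustration of the "only run long-running examples" change described above, here is a minimal sketch of a criterion-style harness restricted to an allowlist of long-running example files. The allowlist contents, the tests/*.egg paths, and the run_program helper are hypothetical stand-ins, not the repository's actual bench code.

// Hedged sketch: benchmark only an allowlist of long-running examples so that
// allocator noise does not dominate the measurements. Everything named here is
// illustrative, not the actual harness in this repository.
use criterion::{black_box, criterion_group, criterion_main, Criterion};
use std::fs;

// Hypothetical allowlist: examples that run long enough to swamp allocator noise.
const LONG_RUNNING: &[&str] = &["python_array_optimize", "stresstest_large_expr"];

// Hypothetical runner: the real one would parse and execute the .egg program.
fn run_program(source: &str) -> usize {
    source.len()
}

fn example_benchmarks(c: &mut Criterion) {
    for &name in LONG_RUNNING {
        let source = fs::read_to_string(format!("tests/{name}.egg"))
            .expect("example file should exist");
        // Each iteration re-runs the whole example program.
        c.bench_function(name, |b| b.iter(|| run_program(black_box(&source))));
    }
}

criterion_group!(benches, example_benchmarks);
criterion_main!(benches);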

@saulshanabrook requested a review from a team as a code owner on October 17, 2024 20:36
@saulshanabrook requested review from mwillsey and removed the request for a team on October 17, 2024 20:36
@saulshanabrook requested review from Alex-Fischman and removed the request for mwillsey on October 17, 2024 20:47
codspeed-hq (bot) commented Oct 17, 2024

CodSpeed Performance Report

Merging #444 will not alter performance

Comparing saulshanabrook:update-python-example (0082005) with main (b0db068)

Summary

✅ 6 untouched benchmarks

🆕 2 new benchmarks

Benchmarks breakdown

Benchmark                  | main | saulshanabrook:update-python-example | Change
🆕 python_array_optimize   | N/A  | 6.6 s                                | N/A
🆕 stresstest_large_expr   | N/A  | 2.9 s                                | N/A

@saulshanabrook changed the title from "Update Python example" to "Only benchmark long running examples" on Oct 17, 2024
@Alex-Fischman (Contributor)

This seems wrong to me. We want to be able to catch regressions in our whole codebase, not just in 6 slow tests, right?

@saulshanabrook (Member, Author)

I think the issue is that the shorter tests have too much variability to benchmark reliably. Alternatively, we could try a deterministic memory allocator for our benchmarks. The thread I link to above outlines the issue.

My sense is that if a change doesn't affect our larger benchmarks, it's probably fine to pull in, as long as those benchmarks are representative enough of the whole codebase.
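
For what it's worth, a minimal sketch of the "deterministic memory allocator" idea: pin the benchmark binary to a single global allocator so allocation behavior is more consistent across runs. This assumes the mimalloc crate purely for illustration (it is more consistent rather than strictly deterministic), and it is not something this PR does.

// Hedged sketch: route all allocations in the bench binary through one
// well-defined allocator to cut run-to-run variance. Assumes the `mimalloc`
// crate; illustrative only, not part of this PR.
use mimalloc::MiMalloc;

#[global_allocator]
static GLOBAL: MiMalloc = MiMalloc;

fn main() {
    // Any benchmarks driven from this binary now allocate through MiMalloc.
}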

@Alex-Fischman (Contributor)

If the large tests are representative then this is a good change; I just didn't think that they were.

@Alex-Fischman (Contributor)

Looking at the CodSpeed regressions on a different PR (#448), I'm seeing that a lot of the variance comes from the parser.

@yihozhang (Collaborator)

Aren't most of the regressions from short-running programs uninteresting because many are one-off costs like parsing, compilation, and bookkeeping? We care more about performance-critical components like query/apply/rebuild, which only become prominent in longer-running programs.

@yihozhang (Collaborator)

I think the old Python example is a good stress test for the type checker. Maybe we can still keep it, but under a different name?

@Alex-Fischman (Contributor) commented Oct 21, 2024

I think that we should increase the regression threshold back to 10%: look at this report. Even on a benchmark that takes >10 ms, the rebuild method still gets significantly slower. (Or we should fix rebuild to not be randomly slow? It's unclear how possible that is.)

@saulshanabrook (Member, Author)

Another option is I could "ignore" all benchmarks that aren't long running: https://docs.codspeed.io/features/ignoring-benchmarks/

That way they still get recorded and can be viewed manually but won't influence the reports.

The only downside is that if we add additional short examples, they will have to be manually added to the ignore list in the UI.

@Alex-Fischman (Contributor) commented Oct 22, 2024

That last option seems strictly better than the solution in this PR to me: opt-out vs. opt-in. Hopefully it would motivate us to create better benchmarks in the future.

@saulshanabrook (Member, Author)

That last option seems strictly better than the solution in this PR to me: opt-out vs. opt-in. Hopefully it would motivate us to create better benchmarks in the future.

OK, sounds good! I will update this PR to just add the Python benchmark (and keep the old one) and then ignore the smaller benchmarks for now :)

@saulshanabrook changed the title from "Only benchmark long running examples" to "Update Python example" on Oct 22, 2024
@Alex-Fischman (Contributor) left a review comment

I have quite a few nits, but no need for re-review.

Review threads (now outdated and resolved) on tests/python_array_optimize_big.egg and README.md (3 threads).
@saulshanabrook (Member, Author)

OK, I think this is ready to merge.

@saulshanabrook merged commit ae69e65 into egraphs-good:main on Oct 24, 2024
5 checks passed
@saulshanabrook deleted the update-python-example branch on October 24, 2024 16:22