Optimize building and casting of EnsoMultiValue #11924

Draft: wants to merge 17 commits into base: develop
Changes from 5 commits
@@ -0,0 +1,133 @@
package org.enso.interpreter.bench.benchmarks.semantic;

import java.util.concurrent.TimeUnit;
import java.util.function.Function;
import org.enso.compiler.benchmarks.Utils;
import org.graalvm.polyglot.Value;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.BenchmarkParams;
import org.openjdk.jmh.infra.Blackhole;

/**
 * These benchmarks compare the performance of {@link EnsoMultiValue}. They create a vector of
 * numbers in a given configuration and then perform a {@code sum} operation on it.
 */
@BenchmarkMode(Mode.AverageTime)
@Fork(1)
@Warmup(iterations = 3)
@Measurement(iterations = 5)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
public class MultiValueBenchmarks {
  private Value arrayOfNumbers;
  private Value sum;
  private Value self;
  private final long length = 100000;

  @Setup
  public void initializeBenchmark(BenchmarkParams params) throws Exception {
    var ctx = Utils.createDefaultContextBuilder().build();
    var code =
        """
        from Standard.Base import Vector, Float, Number, Integer

        type Complex
            private Number re:Float im:Float

        Complex.from (that:Number) = Complex.Number that 0

        sum arr =
            go acc i = if i >= arr.length then acc else
                v = arr.at i : Float
                sum = acc + v
                @Tail_Call go sum i+1
            go 0 0


        make_vector type n =
@JaroslavTulach (Member, Author), Dec 20, 2024

A dedicated benchmark for EnsoMultiValue instances, modeled on the ArrayProxy benchmark. Currently the more complicated benchmarks fail to compile: the compiler bails out. The initial results are:

sbt:enso> runtime-benchmarks/benchOnly MultiValueBenchmarks
Benchmark                                                 Mode  Cnt    Score    Error  Units
MultiValueBenchmarks.sumOverComplexAndFloat5              avgt    5  214.330 ±  4.179  ms/op
MultiValueBenchmarks.sumOverComplexFloatRecastedToFloat3  avgt    5  219.803 ± 11.872  ms/op
MultiValueBenchmarks.sumOverFloat1                        avgt    5    0.079 ±  0.006  ms/op
MultiValueBenchmarks.sumOverFloatAndComplex6              avgt    5  219.525 ±  6.393  ms/op
MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5  203.788 ±  9.843  ms/op
MultiValueBenchmarks.sumOverInteger0                      avgt    5    0.074 ±  0.001  ms/op

After 630ec62 the results are better:

MultiValueBenchmarks.sumOverComplexAndFloat5              avgt    5  30.109 ± 0.661  ms/op
MultiValueBenchmarks.sumOverComplexFloatRecastedToFloat3  avgt    5  26.988 ± 0.446  ms/op
MultiValueBenchmarks.sumOverFloat1                        avgt    5   0.078 ± 0.003  ms/op
MultiValueBenchmarks.sumOverFloatAndComplex6              avgt    5  27.821 ± 0.856  ms/op
MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5  27.961 ± 0.263  ms/op
MultiValueBenchmarks.sumOverInteger0                      avgt    5   0.078 ± 0.002  ms/op

and there are no bailouts. Time to really speed things up.

            Vector.new n i->
                r = 3 + 5*i
                case type of
                    0 -> r:Integer
                    1 -> r:Float
                    2 -> r:Complex
                    3 ->
                        c = r:Complex&Float
                        c:Float
                    4 ->
                        c = r:Float&Complex
                        c:Float
                    5 -> r:Complex&Float
                    6 -> r:Float&Complex
        """;
    var benchmarkName = SrcUtil.findName(params);
    var src = SrcUtil.source(benchmarkName, code);
    var module = ctx.eval(src);

    this.self = module.invokeMember("get_associated_type");
    Function<String, Value> getMethod = (name) -> module.invokeMember("get_method", self, name);

    String test_builder;
    int type = Integer.parseInt(benchmarkName.substring(benchmarkName.length() - 1));
    this.arrayOfNumbers = getMethod.apply("make_vector").execute(self, type, length);
    this.sum = getMethod.apply("sum");
  }

  @Benchmark
@JaroslavTulach (Member, Author), Dec 21, 2024

This is now the base benchmark. The results after 4dacf53 are:

# the base one
MultiValueBenchmarks.sumOverComplexBaseBenchmark0         avgt    5  0.139 ± 0.013  ms/op

# these two are supposed to be faster
MultiValueBenchmarks.sumOverInteger1                      avgt    5  0.065 ± 0.003  ms/op
MultiValueBenchmarks.sumOverFloat2                        avgt    5  0.073 ± 0.002  ms/op

# these should catch up with sumOverComplexBaseBenchmark0 one day
MultiValueBenchmarks.sumOverComplexAndFloat5              avgt    5  8.580 ± 0.326  ms/op
MultiValueBenchmarks.sumOverComplexFloatRecastedToFloat3  avgt    5  9.118 ± 0.483  ms/op
MultiValueBenchmarks.sumOverFloatAndComplex6              avgt    5  8.110 ± 0.160  ms/op
MultiValueBenchmarks.sumOverFloatComplexRecastedToFloat4  avgt    5  9.393 ± 0.648  ms/op

That is still about 60 times slower than it should be: roughly 8 to 9.4 ms/op versus 0.139 ms/op for the base benchmark.
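
The slowdown quoted above can be reproduced from the table with a few divisions. The following is only a minimal sketch: the class `SlowdownVersusBase` and its `slowdown` helper are hypothetical names introduced for illustration and are not part of this PR. The scores are copied from the results after 4dacf53.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of the PR): compares avgt scores with the base benchmark. */
final class SlowdownVersusBase {
  /** Returns how many times slower {@code score} is than {@code base}; both are in ms/op. */
  static double slowdown(double score, double base) {
    return score / base;
  }

  public static void main(String[] args) {
    // Score of sumOverComplexBaseBenchmark0, copied from the table above.
    double base = 0.139;

    // The four multi-value benchmarks that are expected to catch up with the base one.
    Map<String, Double> scores = new LinkedHashMap<>();
    scores.put("sumOverComplexAndFloat5", 8.580);
    scores.put("sumOverComplexFloatRecastedToFloat3", 9.118);
    scores.put("sumOverFloatAndComplex6", 8.110);
    scores.put("sumOverFloatComplexRecastedToFloat4", 9.393);

    // Print one ratio per benchmark.
    for (var entry : scores.entrySet()) {
      System.out.printf("%-40s %.1fx%n", entry.getKey(), slowdown(entry.getValue(), base));
    }
  }
}
```

All four ratios land between roughly 58 and 68, which is consistent with the estimate in the comment.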

  public void sumOverInteger0(Blackhole matter) {
    performBenchmark(matter);
  }

  @Benchmark
  public void sumOverFloat1(Blackhole matter) {
    performBenchmark(matter);
  }

  @Benchmark
  public void sumOverComplexCast2(Blackhole matter) {
    performBenchmark(matter);
  }

  @Benchmark
  public void sumOverComplexFloatRecastedToFloat3(Blackhole matter) {
    performBenchmark(matter);
  }

  @Benchmark
  public void sumOverFloatComplexRecastedToFloat4(Blackhole matter) {
    performBenchmark(matter);
  }

  @Benchmark
  public void sumOverComplexAndFloat5(Blackhole matter) {
    performBenchmark(matter);
  }

  @Benchmark
  public void sumOverFloatAndComplex6(Blackhole matter) {
    performBenchmark(matter);
  }

  private void performBenchmark(Blackhole matter) throws AssertionError {
    var resultValue = sum.execute(self, arrayOfNumbers);
    if (!resultValue.fitsInLong()) {
      throw new AssertionError("Shall be a long: " + resultValue);
    }
    long result = resultValue.asLong();
    long expectedResult = length * 3L + (5L * (length * (length - 1L) / 2L));
    boolean isResultCorrect = result == expectedResult;
    if (!isResultCorrect) {
      throw new AssertionError("Expecting " + expectedResult + " but was " + result);
    }
    matter.consume(result);
  }
}
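
The `expectedResult` expression in `performBenchmark` is the closed form of the sum of `3 + 5*i`, assuming `Vector.new n i->` passes indices `0` to `n - 1`: the constant term contributes `3 * n` and the indices contribute `5 * n * (n - 1) / 2`. The sketch below checks that closed form against a plain loop. The class `ExpectedSumCheck` and its method names are hypothetical and exist only for this illustration.

```java
/** Hypothetical check (not part of the PR) of the closed form used by performBenchmark. */
final class ExpectedSumCheck {
  /** Sums 3 + 5*i for i in [0, n) with an explicit loop. */
  static long loopSum(long n) {
    long acc = 0;
    for (long i = 0; i < n; i++) {
      acc += 3 + 5 * i;
    }
    return acc;
  }

  /** Closed form, written the same way as expectedResult in the benchmark. */
  static long closedForm(long n) {
    return n * 3L + (5L * (n * (n - 1L) / 2L));
  }

  public static void main(String[] args) {
    long n = 100000; // same length as the benchmark's vector
    long viaLoop = loopSum(n);
    long viaFormula = closedForm(n);
    if (viaLoop != viaFormula) {
      throw new AssertionError("Expecting " + viaFormula + " but was " + viaLoop);
    }
    System.out.println("sum for n=" + n + " is " + viaFormula);
  }
}
```

For `n = 100000` both computations stay well inside the `long` range, which is why the benchmark can compare the result with `fitsInLong()` and `asLong()`.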
@@ -34,12 +34,12 @@ abstract Object executeCheckOrConversion(
   @ExplodeLoop
   final boolean isAllTypes() {
     Node p = this;
-    CompilerAsserts.compilationConstant(p);
+    CompilerAsserts.partialEvaluationConstant(p);
     for (; ; ) {
       if (p instanceof TypeCheckValueNode vn) {
-        CompilerAsserts.compilationConstant(vn);
+        CompilerAsserts.partialEvaluationConstant(vn);
         var allTypes = vn.isAllTypes();
-        CompilerAsserts.compilationConstant(allTypes);
+        CompilerAsserts.partialEvaluationConstant(allTypes);
         return allTypes;
       }
       p = p.getParent();