Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[.NET] Improved parsing time #336

Merged
merged 9 commits into from
Dec 18, 2024
Merged

[.NET] Improved parsing time #336

merged 9 commits into from
Dec 18, 2024

Conversation

obligaron
Copy link
Contributor

🤔 What's changed?

I have optimised the gherkin parser for better performance. This was done by avoiding string culture sensitive comparisons, (string) allocations and other optimisations (see commits for details).

For the very_long.feature this adds up to 3x to 7x improved parsing speed (depending on the .NET runtime used).

Also added a BenchmarkDotNet project to measure the performance improvements (for now and in the future).

Setup:

BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4602/23H2/2023Update/SunValley3)
AMD Ryzen 9 5900X, 1 CPU, 24 logical and 12 physical cores
.NET SDK 9.0.101
  [Host]     : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-ZJBIYF : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-TTIYKJ : .NET Framework 4.8.1 (4.8.9282.0), X64 RyuJIT VectorSize=256

Results before:

Method Runtime FeatureFile Mean Error StdDev Gen0 Gen1 Gen2 Allocated
Parser .NET 8.0 tags.feature 527.66 μs 9.133 μs 8.543 μs 35.1563 35.1563 35.1563 555.57 KB
ParserReuse .NET 8.0 tags.feature 97.01 μs 0.873 μs 0.817 μs 6.3477 0.2441 - 104.91 KB
Parser .NET Framework 4.8.1 tags.feature 958.57 μs 4.861 μs 4.547 μs 106.4453 35.1563 35.1563 593.81 KB
ParserReuse .NET Framework 4.8.1 tags.feature 72.33 μs 0.771 μs 0.721 μs 19.6533 0.8545 - 121.48 KB
Parser .NET 8.0 very_long.feature 4,792.91 μs 42.469 μs 37.648 μs 281.2500 140.6250 31.2500 4720.05 KB
ParserReuse .NET 8.0 very_long.feature 4,256.20 μs 23.611 μs 22.086 μs 257.8125 164.0625 - 4257.06 KB
Parser .NET Framework 4.8.1 very_long.feature 4,011.65 μs 24.134 μs 21.394 μs 835.9375 289.0625 39.0625 5290.22 KB
ParserReuse .NET Framework 4.8.1 very_long.feature 2,838.23 μs 11.272 μs 9.992 μs 777.3438 277.3438 - 4782.37 KB

Results after:

Method Runtime FeatureFile Mean Error StdDev Gen0 Gen1 Gen2 Allocated
Parser .NET 8.0 tags.feature 410.61 μs 8.183 μs 10.349 μs 35.1563 35.1563 35.1563 490.98 KB
ParserReuse .NET 8.0 tags.feature 15.34 μs 0.182 μs 0.162 μs 2.5177 0.0916 - 41.18 KB
Parser .NET Framework 4.8.1 tags.feature 876.67 μs 15.740 μs 14.724 μs 71.2891 35.1563 35.1563 512.97 KB
ParserReuse .NET Framework 4.8.1 tags.feature 23.80 μs 0.108 μs 0.084 μs 7.3853 0.2747 - 45.45 KB
Parser .NET 8.0 very_long.feature 989.34 μs 7.098 μs 6.292 μs 140.6250 70.3125 35.1563 2012.58 KB
ParserReuse .NET 8.0 very_long.feature 565.29 μs 5.020 μs 4.695 μs 94.7266 47.8516 - 1558.77 KB
Parser .NET Framework 4.8.1 very_long.feature 1,901.60 μs 11.795 μs 10.456 μs 320.3125 142.5781 35.1563 2071.23 KB
ParserReuse .NET Framework 4.8.1 very_long.feature 905.75 μs 2.283 μs 1.906 μs 259.7656 100.5859 - 1596.55 KB

⚡️ What's your motivation?

Investigated SourceCodeGenerator for .NET and gherkin parser speed was one part where much of the time was spent. Also gherkin parser speed is relevant for user interactions (IDE editing of future files) and general compilation speed in .NET.

I hope this helps to make gherkin even better. 🙂

🏷️ What kind of change is this?

  • 🏦 Refactoring/debt/DX (improvement to code design, tooling, etc. without changing behaviour)
  • 💥 Breaking change (incompatible changes to the API)

♻️ Anything particular you want feedback on?

I changed some parts of the parsing to string ordinal comparison. string ordinal comparison was already used in a lot of places. So I don't think this is a hard breaking change. But I wanted to highlight it anyway (here and in the changelog).
Another thing is that I removed a Mono workaround that was probably no longer needed (regarding `string.StartsWith') that showed up during profiling.

📋 Checklist:

  • I agree to respect and uphold the Cucumber Community Code of Conduct
  • I've changed the behaviour of the code
    • I have added/updated tests to cover my changes.
  • Users should know about my change
    • I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.

Copy link
Member

@gasparnagy gasparnagy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in general.
Just one question: is the optimization you had both in the parser and the ast builder that flattens the error handling wrapper method to try-catch everywhere counts a lot in perf? I understand that crating and invoking a delegate is more costly, but copy-pasting the same error handling code everywhere is pretty error-prone. That's the only change that somehow hurts me more. Would the numbers be very different without that one?

@obligaron
Copy link
Contributor Author

No problem. 🙂
Here are the results without the delegate optimization (second commit reverted):

Method Runtime FeatureFile Mean Error StdDev Gen0 Gen1 Gen2 Allocated
Parser .NET 8.0 tags.feature 425.46 μs 8.187 μs 10.645 μs 35.1563 35.1563 35.1563 531.46 KB
ParserReuse .NET 8.0 tags.feature 23.71 μs 0.143 μs 0.127 μs 4.9744 0.1831 - 81.66 KB
Parser .NET Framework 4.8.1 tags.feature 839.09 μs 1.712 μs 1.602 μs 76.1719 40.0391 35.1563 553.49 KB
ParserReuse .NET Framework 4.8.1 tags.feature 30.64 μs 0.350 μs 0.292 μs 13.9771 0.4883 - 86.04 KB
Parser .NET 8.0 very_long.feature 1,245.31 μs 16.312 μs 13.621 μs 210.9375 175.7813 35.1563 3428.37 KB
ParserReuse .NET 8.0 very_long.feature 831.35 μs 16.616 μs 17.063 μs 181.6406 90.8203 - 2973.31 KB
Parser .NET Framework 4.8.1 very_long.feature 2,217.68 μs 43.897 μs 41.061 μs 550.7813 250.0000 35.1563 3493.88 KB
ParserReuse .NET Framework 4.8.1 very_long.feature 1,223.84 μs 8.885 μs 6.937 μs 490.2344 187.5000 - 3015.27 KB

Without it, very_long.feature with reuse decreases by 47% for .NET 8 and 35% for .NET Framework 4.8. Allocations almost double.

Regarding error handling:
Most methods are generated, the changes in the gherking-csharp.razor file adds only 22 lines.

@gasparnagy
Copy link
Member

@obligaron Thx for the measure. Yes, this is significant enough to keep.

@gasparnagy gasparnagy merged commit be363bc into cucumber:main Dec 18, 2024
2 checks passed
@obligaron obligaron deleted the perf branch December 18, 2024 19:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants