
Make internal & private classes sealed where possible. Make private methods static where possible #689

Open
wants to merge 1 commit into base: master

Conversation

Henr1k80
Contributor

sealed/static part of #687, except for DictionaryAdapter

By doing this, lookups in the virtual method table before calls are avoided.
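
For illustration only (not code from this PR), here is a minimal sketch of the pattern the change targets; the types below are invented for the example. When the receiver's static type is a sealed class, the JIT knows no further override can exist and can emit a direct call (or even inline it) instead of going through the virtual method table:

// Hypothetical example types, not part of the Castle codebase.
public class Formatter
{
    public virtual string Describe() => "base";
}

public sealed class JsonFormatter : Formatter   // sealed: no further overrides possible
{
    public override string Describe() => "json";
}

public static class Usage
{
    public static string Run(JsonFormatter formatter)
    {
        // The static type is sealed, so the JIT can devirtualize this call
        // instead of performing a virtual method table lookup.
        return formatter.Describe();
    }
}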

@stakx
Member

stakx commented Oct 16, 2024

I'll try to take a more detailed look at this eventually, but I am currently a little short on time, so please expect some delay.

Like I mentioned over in #687, sealed/static changes that affect the public API will have to wait until we're approaching the next major version bump.

Even with changes to private members that don't affect the public API surface, I am not sure they are worth spending much time on right now:

  • The execution-time speed-ups will be minuscule (and almost certainly undetectable); they are easily outweighed by all the heavy reflection work going on in this library.
  • Making things sealed and/or static may well have to be partially reverted again during upcoming refactorings (which are especially likely in DynamicProxy).
  • I'm not a huge fan of making things static just to avoid an instance method call (which really only adds an implicit this parameter; note that non-virtual instance methods already don't require virtual dispatch, see the sketch right after this list). This project's coding style guidelines say we should then also group static members together, so the commit diffs would end up much bigger.
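
As a hedged sketch of that last point (the type and members below are invented, not taken from the codebase): a private non-virtual instance method is already a direct call; the main practical difference from a static method is the implicit this argument, and going static means any instance state has to be passed in explicitly.

internal sealed class Counter    // hypothetical example, not from DynamicProxy
{
    private readonly int step = 1;

    // Non-virtual instance method: called directly (no vtable lookup), but it
    // receives an implicit `this` argument and can read instance state.
    private int Advance(int value) => value + step;

    // Static version: no `this` argument, but the state it needs must now be
    // passed in as a parameter.
    private static int Advance(int value, int step) => value + step;

    public int Demo(int value) => Advance(value) + Advance(value, step);
}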

But yeah I'll review as soon as I find the time.

@Henr1k80
Contributor Author

No stress @stakx :-)

It would be great to have a benchmark in BenchmarkDotNet to measure the impact and see whether it is worthwhile.
I just don't know which methods should be benchmarked. If you could point to a unit test that is a good candidate to benchmark, I could try that when I have time.
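
For reference, here is a minimal sketch of what such a benchmark could look like, assuming one wanted to measure both proxy creation and an intercepted call through ProxyGenerator; IService and NoopInterceptor are invented for the example, and which methods are actually worth measuring is still the open question:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using Castle.DynamicProxy;

public interface IService { int Work(int value); }   // invented example interface

// Pass-through interceptor; with no target behind the proxy, it must supply a return value.
public class NoopInterceptor : IInterceptor
{
    public void Intercept(IInvocation invocation) => invocation.ReturnValue = 0;
}

[MemoryDiagnoser]
public class ProxyBenchmarks
{
    private readonly ProxyGenerator generator = new ProxyGenerator();
    private readonly IInterceptor interceptor = new NoopInterceptor();
    private IService proxy;

    [GlobalSetup]
    public void Setup() =>
        proxy = generator.CreateInterfaceProxyWithoutTarget<IService>(interceptor);

    // Creates a proxy instance; note that DynamicProxy caches generated proxy
    // types, so only the very first creation pays the Reflection.Emit cost.
    [Benchmark]
    public IService CreateProxy() =>
        generator.CreateInterfaceProxyWithoutTarget<IService>(interceptor);

    // Measures the intercepted call path itself.
    [Benchmark]
    public int InvokeProxy() => proxy.Work(42);
}

public static class Program
{
    public static void Main() => BenchmarkRunner.Run<ProxyBenchmarks>();
}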

sealed AFAIK does not help on .NET Framework, as that CLR does not have the optimization and does the virtual method lookup anyway...

And on newer .NET, the CLR might have become so good at devirtualization & PGO that the difference is not measurable.
It might have made a difference in .NET Core 3.1 and .NET 5, maybe a little in 6, but that reaches end of support within a month...

@stakx
Member

stakx commented Oct 17, 2024

It would be great to have a benchmark in BenchmarkDotNet to measure the impact and see whether it is worthwhile.

I've done similar refactorings in the past for Moq, which is built on top of DynamicProxy, and I occasionally ran benchmarks using both BenchmarkDotNet and PerfView. When testing isolated methods that you've refactored/optimized against their unoptimized versions, BenchmarkDotNet might tell you they've become X% faster. However, as soon as you benchmark code that includes anything involving System.Reflection.Emit, you no longer see any actual improvement, because much, much more time is spent there than in your own user code... micro-optimizations like sealed/static really are proverbial drops in the ocean.

All that is to say, when doing these refactorings, we should do them with the goal of improving code correctness or legibility, not for any performance gains (because, I'm sorry to say, those almost certainly won't manifest in a practically meaningful quantity).

@stakx
Member

stakx commented Oct 17, 2024

the CLR might have become so good at devirtualization & PGO

From what I've read, the CLR is indeed very good at de-virtualization, even with interface dispatch.

There are places inside DynamicProxy where private fields are interface-typed but they hold instances of a known class type, say:

private IDictionary<...> foo = new Dictionary<...>();

... and it seems a bit silly to force the CLR to go through interface dispatch, or to have to optimize it away, when we could've just used the concrete type as the static type in the first place:

private Dictionary<...> foo = new();

In those cases I think it can make sense to refactor, but mostly because there's no real gain in abstracting/hiding Dictionary<...> behind an almost identical interface... it just makes the code more complicated to look at.

If such a refactoring brings perf gains along with it, that's a nice bonus, but by themselves those gains would probably be too minuscule to actually matter, so they shouldn't be the driving factor for the refactoring.

All of the above is simply my opinion based on past experiences with a different but related library... I'll happily change my mind in case I'm wrong about anything.

@Henr1k80
Contributor Author

Yeah, it could be nice to do just some of the "correctness" refactorings, but it is hard to verify that there are no regressions, due to black-magic runtime stuff...

At least on my hardware, where different BenchmarkDotNet runs (on other projects) produce different results for the same code...

To measure these "correctness" refactorings, you would need a non-throttling machine with minimal background tasks.
My laptop, with antivirus, Windows telemetry, etc., is not up to the task...

Slightly off topic:
It could probably be a business idea to provide machines, running at a fixed lowest clock, that can boot a minimal clean distro, then compile & run a program.
No virtualization, no noisy cloud neighbors, hopefully a constant Hz and minimal distractions.
But you would probably need to secure a deposit, in case users purposely break the machine or otherwise "modify" it 😅

Disable Turbo/Boost for a constant clock without throttling.
Clock speed alone is not relevant when measuring relative factors of "basic" performance improvements.

A good use for older computers?
Optimally, the generated heat would be used for something, at least room heating.
If the machines get wrecked, send them for recycling, hopefully keep the deposit, and donate it to a non-profit.

Instruction sets, cache sizes & RAM speeds + sizes might even be more representative of the computers out there?
At least compared to developer & cloud computers.

@Henr1k80
Contributor Author

BTW, by runtime black magic I mean that the generated IL should be more optimized, but it is hard to know what the resulting machine code looks like after PGO & other optimizations, and how these "correctness" optimizations affect that black magic.
