Add Yul Optimizer Step to Evaluate Yul Constant Functions #15358

base: develop
Conversation
Thanks a lot for the PR! The analysis and reasoning, as well as the execution, are very impressive!

The most important concern we have here is that the interpreter the new optimizer step relies on has so far been used solely for testing and fuzzing, and its correctness is not considered critical. Using it for optimizations as in this PR will make it critical, so merging this would first require reviewing the interpreter for the necessary robustness on the defined subset of Yul.

As you already note in the PR description, so far optimization like this is supposed to happen via inlining and expression simplification, but as you correctly lay out, this doesn't work in all cases. The checked exponentiation function could actually be refactored, s.t. it doesn't involve any `leave`, though.

Would you be interested in joining one of our design calls to discuss how to proceed here (see https://docs.soliditylang.org/en/v0.8.26/contributing.html#team-calls)? If that doesn't work for you, we can also continue the discussion asynchronously. We're currently in the process of preparing a release that will likely happen next week, but at the latest after that, we'll try to properly evaluate what would need to be done to bring the interpreter into a state in which it can be used in this manner (respectively whether that's feasible).
Hi @ekpyron. Thank you so much for taking the time and for your patience in reviewing this PR, especially considering it’s not a small one.

Regarding the interpreter, I would also like to help improve it, whether that means addressing change requests to the interpreter itself, or adding more tests to ensure its correctness. I understand that including the interpreter can enable the development of more powerful optimizations, but with great power comes great responsibility. I would love to see this PR in action, but it must be absolutely correct. Therefore, I want to give it as much support as I can.

Thanks for the team call invitation. Unfortunately, the timing is not suitable for me, so I would prefer to communicate asynchronously instead. I will try to keep you updated as much as possible.

Please let me know if there is anything else I should do for this PR, such as addressing change requests, adding documentation, or adding more tests. Again, thank you so much for your time. I’m really grateful that this PR is being considered!
Force-pushed from e56f0ea to 8eada94
Force-pushed from 8eada94 to dfa525f
Hey @quangloc99! Thanks for the PR!
Hi @matheusaaguiar! Thanks again for taking the time to review this PR! I will proceed to write a new interpreter. My original intention was that if there was already an interpreter implementation with existing tests, then using the existing one would be a nice and safe approach. But now I understand that the original interpreter was only meant to be a quick tool, so it makes sense to create a proper one. With this in mind, I want to lay out what I plan to do with the interpreter:
I will try to keep the new interpreter restricted, but with room for future extensions. It would be great if, in the future, more analysis could be done to include memory access as well. A rough sketch of what I have in mind is below.
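To make the intended restriction concrete, here is a minimal sketch of what I imagine being in scope versus out of scope initially (the function names are made up for illustration):

```yul
{
    // In scope: pure arithmetic, variables, and control flow.
    function inScope() -> r {
        let i := 0
        for { } lt(i, 4) { i := add(i, 1) } {
            r := add(r, i)
        }
    }

    // Out of scope at first: anything touching memory, storage,
    // or the environment.
    function outOfScope() -> r {
        mstore(0, 42)
        r := mload(0)
    }

    sstore(0, add(inScope(), outOfScope()))
}
```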
Oh, one more thing I just thought of. To test the new interpreter, should I add a new test suite like test/libyul/YulInterpreterTest.h?
Sounds like a good plan. The existing one is a good basis; you can definitely use it as a reference.
Yes, I think you will have to implement a specific test suite. You can use the existing `YulInterpreterTest` as a reference. Feel free to ping me if you have any questions or comments :)
Hey @quangloc99! You can check the gas benchmarks by downloading the "summarized" artifacts generated by our CI job (e.g. the PR artifact and the develop artifact) and running the comparison on them.
Gas cost benchmarks

ir-no-optimize

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | 0% | | |
| colony | 0% | | |
| elementfi | 0% | | |
| ens | 0% | | |
| euler | | | |
| gnosis | | | |
| gp2 | 0% | | |
| pool-together | 0% | | |
| uniswap | 0% | | |
| yield_liquidator | 0% | -0% | 0% |
| zeppelin | | | |

ir-optimize-evm+yul

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | 0% | | |
| colony | 0% | | |
| elementfi | +0% | | |
| ens | -0.04% ✅ | 0% | 0% |
| euler | +0.04% ❌ | | |
| gnosis | | | |
| gp2 | 0% | | |
| pool-together | 0% | | |
| uniswap | +0.01% ❌ | | |
| yield_liquidator | 0% | +0% | 0% |
| zeppelin | +0.02% ❌ | +0.06% ❌ | -0.06% ✅ |

ir-optimize-evm-only

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | 0% | | |
| colony | 0% | | |
| elementfi | 0% | | |
| ens | 0% | 0% | 0% |
| euler | | | |
| gnosis | | | |
| gp2 | 0% | | |
| pool-together | 0% | | |
| uniswap | 0% | | |
| yield_liquidator | 0% | 0% | 0% |
| zeppelin | 0% | | |

legacy-no-optimize

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | 0% | | |
| colony | 0% | | |
| elementfi | 0% | | |
| ens | 0% | | |
| euler | 0% | | |
| gnosis | 0% | | |
| gp2 | 0% | | |
| pool-together | 0% | | |
| uniswap | 0% | | |
| yield_liquidator | 0% | +0% | 0% |
| zeppelin | 0% | -0% | -0.23% ✅ |

legacy-optimize-evm+yul

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | 0% | | |
| colony | 0% | | |
| elementfi | 0% | | |
| ens | 0% | +0% | 0% |
| euler | 0% | | |
| gnosis | 0% | | |
| gp2 | 0% | | |
| pool-together | 0% | | |
| uniswap | 0% | | |
| yield_liquidator | 0% | +0% | 0% |
| zeppelin | 0% | +0% | -0.02% ✅ |

legacy-optimize-evm-only

| project | bytecode_size | deployment_gas | method_gas |
|---|---|---|---|
| brink | 0% | | |
| colony | 0% | | |
| elementfi | 0% | | |
| ens | 0% | 0% | 0% |
| euler | 0% | | |
| gnosis | 0% | | |
| gp2 | 0% | | |
| pool-together | 0% | | |
| uniswap | 0% | | |
| yield_liquidator | 0% | 0% | 0% |
| zeppelin | 0% | +0% | -0.07% ✅ |

`!V` = version mismatch
`!B` = no value in the "before" version
`!A` = no value in the "after" version
`!T` = one or both values were not numeric and could not be compared
`-0` = very small negative value rounded to zero
`+0` = very small positive value rounded to zero
I'm guessing that up till now, every project has already written its constants with inlined values instead of complex expressions. I'm sure the contract writers are aware that the current optimizer cannot handle all the cases needed to inline constants effectively. That's why I'm motivated to create this PR! I believe the compiler should handle the preprocessing better, to let contract writers write better, more readable code.
This is a very good suggestion! Let me try that.
Sorry! I meant after the Function Specializer.
Hello. As @matheusaaguiar said, I did some benchmarking here to see how helpful the new step is in practice.

The general concept of the new step does make sense, so it's actually surprising how little effect it seems to have. Even if real-life projects optimize their constants manually, this should work not just for constants - I'd expect it to find lots of optimization opportunities in the helpers generated internally by the compiler. We should also see some changes in gas expectations in our semantic test cases, since we definitely do not hand-optimize constants in those. Something seems off, and I think it's important to investigate and understand what's happening before we jump into making this production-ready. I can at least confirm that it does have a positive effect on code with constants, so it's not that it's broken. But it does seem less effective than it could be.

My initial guess was that running it before FunctionSpecializer prevents it from running on most calls with constant arguments, and it probably does, but I don't think this is enough, because I tried that and saw no improvement in benchmarks. I think it may be that it also needs to run before any inlining happens, since in the current design that irreversibly removes some optimization opportunities (we have no optimizer step that can reverse inlining). In general, I think that making full use of it requires some experimentation with the sequence to see where it can have some actual impact.

Also, it's worth considering whether making the step simpler and relying more on composability with other steps wouldn't yield better results. For example, reducing its scope from a whole function to a single statement. Statements with constant values are much more common, so this would let the step simplify more and not be affected by things being inlined. The step would not be able to eliminate calls, but that's the job that the FullInliner can do for you. It would also not simplify calculations spread over multiple statements, but if you run steps like ExpressionSplitter and LiteralRematerialiser before it, you'll get anything that's side-effect free combined into a single statement, so it won't make a difference. If you look at our existing steps, that's the approach we took with them, and the reason why they have lots of seemingly odd limitations.

Benchmarks

Here's some detail on what I checked. It wasn't very comprehensive, and @matheusaaguiar has already posted the results from external tests, which are much more meaningful, but I have the results so I may just as well post them here. I also checked the difference in gas expectations, but since there was no difference, there's nothing to post :)

Bytecode size
It shows that there is some effect, but it's rather small and negative. It's not very conclusive though, since your PR is not based on 0.8.27, so the difference could be from something that was merged in the meantime.

Compilation time
At least the PR does not seem to affect compilation time in a significant way, which was one of our concerns, since we're very focused on speeding up compilation via IR right now. It could still be only due to the step not having many optimization opportunities, so we should recheck later when we make it more effective.

Semantic test with gas usage

This is how I verified that the PR actually does have an effect. It is a semantic test that you can drop into the semantic test suite:

```
contract C {}
contract D {
// Do something that eats a lot of gas to get the numbers over the minimum of 100k
C c = new C();
uint256 public constant A0 = 8;
uint256 public constant A1 = A0 * 2;
uint256 public constant A2 = A1 * 2;
uint256 public constant A3 = A2 * 2;
uint256 public constant A4 = A3 * 2;
uint256 public constant A5 = A4 * 2;
uint256 public constant A6 = A5 * 2;
uint256 public constant A7 = A6 * 2;
uint256 public constant A8 = A7 * 2;
uint256 public constant A9 = A8 * 2;
uint256 public constant A10 = A9 * 2;
function f(uint256 x) public pure returns (uint256) {
return x / A10;
}
}
// ----
// constructor() ->
-// gas irOptimized: 119400
-// gas irOptimized code: 123600
+// gas irOptimized: 117495
+// gas irOptimized code: 99800
// gas legacy: 144563
// gas legacy code: 438800
// gas legacyOptimized: 121483
// f(uint256): 1 -> 0
```

Something where most of the cost is coming from constants rather than something unrelated might be a better test, though.
Also, some notes on the things that were discussed above:
I'd like to clarify this a bit. This what we currently think is the best approach, but the actual next step was supposed to be to take a closer look at the current interpreter and make sure this is really a viable approach. We don't want to get into a task that will be too big and distract too much from the main point, which is the new step. If it turns out to be like that, we may still consider a different approach. And anyway, like I said above, IMO the focus should be first on making sure that the step is actually effective and also agreeing that the overall approach is good. I.e. get the big decisions out of the way before committing to spending the effort needed to get the details right.
Just a note that if you do try to do that, it's best to do it in a separate refactor PR. We generally prefer smaller PRs. Anything that can be extracted from a bigger PR and still makes sense on its own should be a separate PR, to reduce the scope of reviews (which is the biggest bottleneck for us).

Also, regarding the test suite: I think it would be best to reuse the current suite and simply make it execute tests using both interpreters. This way we get test coverage for both at the same time and can be sure both give the same answer. This is the approach we took in semantic tests to cover both code generation pipelines, and it works very well there.
Hello @cameel! Thank you for taking the time to review and benchmark everything! I’m grateful for your advice on how this PR should proceed. Based on your feedback, I’d like to share my thoughts and some of the decisions I made when creating this PR.

My motivation came from how constants are handled in Solidity. I was surprised when many of my constants weren’t inlined and were still being computed within the compiled program. This led me to investigate further, eventually resulting in this PR.

My guess as to why the benchmark results were weak remains the same. I believe that since the optimizer wasn’t able to effectively calculate, inline, and transform the code, contract developers were forced to manually compute their constants. Because of this, I think the benchmark was biased: there wasn’t a strong use of complex constant expressions, so there wasn’t much for this optimizer step to act on.

Personally, I don’t see this PR as a typical optimizer PR but more as an improvement to the development experience. With this PR, I expect developers will be able to write code that better expresses their intentions without having to focus so much on how the code gets optimized. The effect of this PR is more like a syntax change than an optimization step. It gives developers a new tool to work with, as I believe it changes the underlying semantics of constants: instead of inlining the expression, it inlines the value. This change also leads to the question: if I want to introduce this new tool, where should I contribute? Initially, I thought I might contribute at the Solidity level, but doing it at the Yul level seemed more manageable.

I understand that the interpreter is very powerful and can do much more than what’s demonstrated in this PR. However, I didn’t see it being used. With this PR, I also wanted to push for interpreter usage, not just for constants or simple statements, but anywhere it’s applicable. This PR is just a proof of concept, showcasing one potential use for the interpreter. I believe there are many more ways to leverage it, but for now, I wanted to focus on this simple optimization.

I’ve been mindful of the review workload for the team, which is another reason I initially reused the interpreter from the tests. Not only did this reduce my effort, but I also thought it would make the review process easier. I apologize for the size of this PR. I tried to keep it as simple as I could.
I’d like to share my thoughts on the advice as well. I like the idea of moving this optimizer step to evaluate statements instead of constant functions. I understand that this would give the optimizer step a broader optimization impact. However, even though these two approaches to evaluation are similar, I think there are some differences that need to be addressed.
As I mentioned before, I want to advocate for the use of the interpreter. If the interpreter gets accepted and used in production, I believe there will be more ways to utilize it than what is demonstrated in this PR. Regarding the composability point, I will experiment as suggested. However, I believe keeping this optimizer step focused on constant functions is fine as it is. The ultimate composability lies in how the interpreter is used across different optimizer steps, rather than in this one step alone. I also believe that in the future even better optimization steps will be introduced and this one may be deprecated, which is fine as the compiler evolves.
With that aside, I want your advice on how to proceed with this PR. I have not made any additional changes other than restoring the original interpreter in the tests. The interpreter I added is still inside `libyul`. But for now, should I focus on experimentation and benchmarking, or should I proceed with the suggested implementation changes?

Also, for benchmarking, I want to run and see the gas usage report locally. I tried to run the external tests, but ran into an error about a missing RPC key.
We discussed it again today, and we think that reusing a subset of the current interpreter should be good enough after all, but it will still require some work to make sure it's reliable. If you can make it solid, we'd be happy to merge it, and then you could use it as a base for your new step. I created an issue for that, #15435, and added some details about the implementation, which hopefully answer your questions related to that part. If you have more, let's continue the discussion on that topic there.
Fair enough, I now see where you're coming from. I just think we can achieve both with not that much extra effort. In many ways it would even be simpler. We can have a step that can make the same simplifications in a more composable way.

But for now we decided we're ok with such a limited step targeting constant functions. Just a note that we'd rather mark it as experimental and keep it out of the default sequence at first. We're a bit wary of potential issues in the interpreter, and we treat optimizer bugs very seriously, so we really want to see people use it carefully first and report back. Then we'd consider including it in the sequence.
Doing it at the Yul level is definitely the better idea for a contribution. It makes the scope actually manageable. We have #3157 for the Solidity-level equivalent, but that's a much more complex task and is actually blocked on some design decisions. It's a very important point on our roadmap, but it'll take us some time to get there.
As to your more specific questions about my suggestion to make the step simpler:
In that scenario you would first evaluate statements inside the function, which would be done once. If the function is simple enough to be eliminated by your step, then mine would reduce it to something like a function whose body is just constant assignments to its return variables, which the FullInliner could then eliminate at each call site (see the sketch below).
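A minimal sketch of that reduction, with made-up names rather than code from the PR:

```yul
{
    // Before: a zero-argument helper computing a constant.
    function c() -> r {
        r := mul(add(2, 3), shl(2, 1)) // (2 + 3) * 4
    }
    sstore(0, c())
}
```

After statement-level evaluation, the body collapses to `r := 20`, and the FullInliner can then replace `c()` at the call site with the literal `20`, leaving just `sstore(0, 20)`.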
Looking at how simple the step is currently, perhaps it would be a bit more complex. But it would be more powerful and still not very complex overall, because it would rely on other steps to do the complex stuff. For example, it would not need to use the Dataflow Analyzer. You could run the Structural Simplifier first, which already uses the Dataflow Analyzer and would deal with any constant control flow. You would also not have to check if variables are pure - the Rematerialiser and Literal Rematerialiser would convert the code into a form where the final value is directly in the expression - you'd only check if it's a literal and skip the expression if not (see the sketch below).

Also keep in mind that it would not necessarily happen all at once; some steps or patterns get repeated multiple times, because each pass may reveal new opportunities. For example, you might run your step first to evaluate conditionals, then the Structural Simplifier to unroll the now-constant control flow.

BTW, I initially said it should work at a statement level, but it should rather be expression level. I was thinking this would be necessary to handle control flow, but then realized that the Structural Simplifier will do it.
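A rough illustration of that preprocessing, as I understand the steps involved (variable names invented):

```yul
{
    // Original: a compound expression.
    let x := mul(add(2, 3), 4)
}
```

After the ExpressionSplitter, every call sits in its own statement; once the evaluator folds `add(2, 3)` to `5`, the Literal Rematerialiser puts that literal directly at the use site, so each remaining expression can be folded in isolation:

```yul
{
    let _1 := add(2, 3) // folds to: let _1 := 5
    let x := mul(5, 4)  // after rematerialising _1; folds to: let x := 20
}
```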
I think you're still not fully seeing what I mean. Yes, the step on its own would only deal with expressions/statements. But it would be combined with other steps, and together they would be able to do what yours does and even more. Basically, your current step is closer in complexity to a fixed sequence of our current simple steps. Ours are more flexible, because we can combine them in many different ways, like LEGOs.
Well, this one is a pain to run locally. Here's how we run it in CI. But it should not normally require any RPC key, so something must have gone wrong. Maybe try to run it for a single project first, say, OpenZeppelin. Check if this works:

```bash
test/external_tests.py test \
    --solc-binary-type native \
    --solc-binary-path /path/to/solc \
    --run zeppelin
```

This will take a while to run and will produce the gas measurements when it's done.

But we usually don't run it locally, because that's a royal pain to do for all ~10 projects. We usually go to the artifacts of the CI runs instead.
Hi @cameel! I really appreciate your feedback, and thank you for welcoming this PR! Regarding the open issue with the side-effect-free interpreter, I think the spec is clear. I’ll create a separate PR for that. Once it's ready, I’ll integrate the new interpreter into this PR and then we can wrap it up.

On the discussion side, I understand the LEGO analogy, and I know the current step seems very limited for now. It’s still hard to mix it with other steps to really push it to its potential. I would love to contribute to this in the future, but I agree that we should treat this PR as a base step and work on making it more powerful later.

I’m also curious about how to make this an experimental feature. How would that work? Would we add a flag to enable a new optimization sequence? And how would we gather feedback on it? I apologize if I’m asking too many questions, but I’m genuinely interested. Once again, thank you for your feedback and for your patience with all my questions!
There isn't that much to it. We don't have any specific mechanism for that (aside from the ability to specify a custom optimizer step sequence, which is how you'd opt in to a step that is not in the default sequence).
We'll probably say something about it in release notes and also advertise it a bit using the usual channels, like the forum or social media. Just making it clear that it's experimental and people should not rely on it in production. |
Issue with Constants
Constants in Solidity are generally assumed to be hardcoded, with their values replaced wherever they are used. However, this isn't entirely accurate: according to the Solidity documentation, each occurrence of a constant is replaced by its defining expression, not necessarily by a precomputed value.

While the documentation correctly states that the expression is copied, it is often challenging to optimize constants in many situations. Consider the following examples.
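The sketches below are illustrative reconstructions, with made-up names and values, of the two situations discussed next:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Example 1: a constant computed with a checked power operation on
// other constants; the division below cannot be rewritten as a shift.
contract Pow {
    uint256 public constant BASE = 2;
    uint256 public constant EXP = 100;
    uint256 public constant C = BASE ** EXP;

    function test(uint256 x) public pure returns (uint256) {
        return x / C;
    }
}

// Example 2: public constants defined in terms of each other via
// checked multiplications and used in multiple places.
contract Flags {
    uint256 public constant A0 = 8;
    uint256 public constant A1 = A0 * 2;
    uint256 public constant A2 = A1 * 2;

    function f(uint256 x) public pure returns (uint256) {
        return x / A2;
    }
}
```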
In the first example, the constant `C` is the result of a power operation, which is a checked operation. Because the checked power function is implemented with the `leave` instruction, it is not inlined into the `test` function. As a result, the division cannot be optimized into a bit shift, which would save gas.

In the second example, all constants are marked as public and used multiple times. Since the multiplication is also checked, the expressions used to compute these constants are too complex to be inlined and optimized.
This subtle detail affects how code is written and optimized. Contract writers often need to compute constants beforehand and hardcode the value into the contract, often with a comment explaining the expression. This approach is suboptimal, as a small change to one constant requires multiple manual updates.
Simply put, the experience of using constants in Solidity does not align with typical user expectations. Due to the lack of preprocessing and further optimization capabilities, the code is either unoptimized or painful to write.
Solution: New Optimization Step
To address this issue, I introduced a new optimization step called `constantFunctionEvaluator`. This optimization component scans all functions without arguments and attempts to execute them. The execution is performed using the existing interpreter from the tests.

After execution, if the interpreter has encountered no built-in function calls that can introduce side effects (such as `mstore`, `mload`, `call`, `create`, `address`, etc.), the component concludes that the function is constant. It then replaces the function's body with assignments of the evaluated values to the corresponding return variables, as sketched below.
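A before/after sketch of the transformation, with invented names and values (not taken from the PR's test cases):

```yul
// Before: a zero-argument function. The sstore branch is never taken
// for the computed values, so the interpreter never encounters a
// side-effecting builtin while executing it.
function c() -> a, b {
    a := mul(8, 128)
    if gt(a, 10000) { sstore(0, a) }
    b := add(a, 1)
}

// After: the body is replaced with assignments of the evaluated
// values to the return variables.
function c() -> a, b {
    a := 1024
    b := 1025
}
```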
Implementation

The existing interpreter was moved from the tests directory to the source directory `libyul`. To restrict which built-ins can be called, the interpreter and expression evaluator were overridden to include additional checks. This interpreter is then used in the new optimization component.
Impact

A function does not need to be marked `pure` to be evaluated. Thanks to the interpreter, the actual code flow is evaluated rather than relying on static analysis. This means some functions can even call side-effecting built-ins, yet still be simplified by this component if those built-ins are never actually encountered, as with the dead `sstore` branch in the sketch above.

I hope this PR is helpful and can contribute to further optimizations. With the interpreter, optimizations such as evaluating expressions with only constant parameters can also be performed. I hope this PR serves as a proof of concept for that.
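For instance, expression-level evaluation of calls whose arguments are all constant (not part of this PR, just a possible follow-up) could fold code like the hypothetical example below, assuming `f` is side-effect free:

```yul
{
    function f(x, y) -> r {
        r := add(mul(x, x), y)
    }
    // All arguments are constants, so an interpreter-based step could
    // fold f(3, 5) to the literal 14 without inlining the function.
    sstore(0, f(3, 5))
}
```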