
Add Yul Optimizer Step to Evaluate Yul constant Functions #15358

Open · wants to merge 55 commits into base: develop

Conversation

quangloc99
Contributor

Issue with Constants

Constants in Solidity are generally assumed to be hardcoded, with their values replaced wherever they are used. However, this isn't entirely accurate. According to the Solidity documentation:

For a constant variable, the expression assigned to it is copied to all the places where it is accessed and also re-evaluated each time. This allows for local optimizations.

While the documentation correctly states that the expression is copied, the copied expression often cannot be optimized further. Consider the following examples:

  • Constant that Stores a Power:
pragma solidity ^0.8.24;
contract Test {
    uint256 private constant A = 9;
    uint256 private constant B = 4;
    uint256 private constant C = B ** A;

    function test(uint256 x) public pure returns (uint256) {
        return x / C;
    }
}

In this example, the constant C is the result of an exponentiation, which is a checked operation. Because the checked exponentiation helper is implemented using the leave statement, it is not inlined into the test function (a simplified sketch of such a helper follows these examples). As a result, the division cannot be optimized into a bit shift, which would save gas.

  • Multiple Constants Dependent on Each Other:
pragma solidity ^0.8.26;

contract Test {
    uint256 public constant A0 = 8;
    uint256 public constant A1 = A0 * 2;
    uint256 public constant A2 = A1 * 2;
    uint256 public constant A3 = A2 * 2;
    uint256 public constant A4 = A3 * 2;
    uint256 public constant A5 = A4 * 2;
    uint256 public constant A6 = A5 * 2;
    uint256 public constant A7 = A6 * 2;
    uint256 public constant A8 = A7 * 2;
    uint256 public constant A9 = A8 * 2;
    uint256 public constant A10 = A9 * 2;

    function test(uint256 x) public pure returns (uint256) {
        return x / A10;
    }
}

In this case, all constants are marked as public and used multiple times. Since the multiplication is also checked, the expressions used to compute these constants are too complex to be inlined and optimized.
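To make the first example concrete, here is a simplified, hand-written sketch of a checked exponentiation helper in Yul. It is not the exact code the compiler emits (the real helper panics with a dedicated error code and handles more special cases), but it shows the early returns via leave that currently prevent inlining:

function checked_exp(base, exponent) -> power {
    // Early exits via `leave` are what block inlining of this helper.
    if iszero(exponent) { power := 1 leave }
    if iszero(base) { power := 0 leave }
    power := 1
    for { } gt(exponent, 0) { exponent := sub(exponent, 1) } {
        // Overflow check: abort if power * base would exceed the uint256 range.
        if gt(power, div(not(0), base)) { revert(0, 0) }
        power := mul(power, base)
    }
}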

This subtle detail affects how code is written and optimized. Contract writers often need to compute constants beforehand and hardcode the resulting values into the contract, usually with a comment explaining the expression. This approach is suboptimal, as a small change to one constant requires multiple manual updates.

Simply put, the experience of using constants in Solidity does not align with typical user expectations. Due to the lack of preprocessing and further optimization capabilities, the code is either unoptimized or painful to write.

Solution: New Optimization Step

To address this issue, I introduced a new optimization step called constantFunctionEvaluator. This optimization component scans all functions that take no arguments and attempts to execute them. The execution is performed using the existing interpreter from the test suite.

If, during execution, the interpreter encounters no built-in calls that introduce side effects or depend on the environment (such as mstore, mload, call, create, address, etc.), the component concludes that the function is constant. It then replaces the function's body with assignments of the evaluated values to the corresponding return variables.
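For illustration, a hand-written before/after sketch (the names and the exact shape are made up; the Yul actually generated by the compiler is more involved):

// Before: a zero-argument function computing a constant.
function constant_C() -> ret {
    ret := checked_exp(4, 9)
}

// After constantFunctionEvaluator: the body is replaced with an assignment
// of the evaluated value to the return variable.
function constant_C() -> ret {
    ret := 262144
}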

Implementation

The existing interpreter was moved from the tests directory to the source directory libyul. To restrict which built-ins can be called, the interpreter and expression evaluator were overridden to include additional checks. This interpreter is then used in the new optimization component.

Impact

  • This optimization step resolves the mentioned issues with constants. It reduces each constant's expression to a single number, allowing the optimizer to further inline and simplify the expressions.
  • In the Yul world, a function does not need to be marked pure to be evaluated. Thanks to the interpreter, the actual code flow is evaluated rather than relying on static analysis. This means a function may even contain calls to side-effecting built-ins and still be simplified by this component, as long as those calls are never reached during execution (see the sketch after this list).
  • Although this optimization step targets constants, contract writers can also benefit from it in (pure) functions. As long as the function does not accept any arguments and its execution flow does not introduce any side effects, it can be optimized.
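As a hand-written illustration of the point about side effects (not actual compiler output): the sstore below sits in a branch that is never executed, so the interpreter never encounters it and the function can still be reduced to a constant:

// Before
function f() -> r {
    if iszero(1) { sstore(0, 1) }  // dead branch: the store is never reached
    r := add(20, 22)
}

// After constantFunctionEvaluator
function f() -> r {
    r := 42
}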

I hope this PR is helpful and can contribute to further optimizations. With the interpreter, optimizations such as evaluating expressions with only constant parameters can also be performed. I hope this PR serves as a proof of concept for that.

@ekpyron
Member

ekpyron commented Aug 28, 2024

Thanks a lot for the PR! The analysis and reasoning as well as the execution are very impressive!

The most important concern we have here is that the interpreter the new optimizer step relies on has so far been used solely for testing and fuzzing, and its correctness is not considered critical. Using it for optimizations as in this PR will make it critical, so merging this would first require reviewing the interpreter for the necessary robustness on the defined subset of Yul.

As you already note in the PR description, so far optimization like this is supposed to happen via inlining and expression simplification, but as you correctly lay out, this doesn't work in all cases. The checked exponentiation function could actually be refactored such that it doesn't involve any leave anymore; we could investigate that as a quicker path to some improvement here, but we'd need to check what effect on cost such a change would have in other real-world cases. However, as you also point out correctly, this method relies on the inlining heuristics actually choosing to inline in the cases at hand. The current inlining heuristics aren't particularly good, but improving on them is tricky, especially since we also need to watch the number of local variables in functions after inlining to avoid running out of stack space too quickly. It might be possible to do some analysis to make an educated prediction about whether inlining and simplification will in fact resolve into a constant, but that would already be similar to interpretation - and interpretation may perform just as well. So given all that, this new step is definitely worthwhile to consider!

Would you be interested in joining one of our design calls to discuss how to proceed here (see https://docs.soliditylang.org/en/v0.8.26/contributing.html#team-calls)? If that doesn't work for you, we can also continue the discussion asynchronously.

We're currently in the process of preparing a release that will likely happen next week, but at the latest after that, we'll try to properly evaluate what would need to be done to bring the interpreter into a state in which it can be used in this manner (and whether that's feasible at all).

@quangloc99
Contributor Author

Hi ekpyron. Thank you so much for taking the time and for your patience in reviewing this PR, especially considering it’s not a small one.

Regarding the interpreter, I would also like to help improve it, whether that means addressing change requests to the interpreter itself, or adding more tests to ensure its correctness. I understand that including the interpreter can enable the development of more powerful optimizations, but with great power comes great responsibility. I would love to see this PR in action, but it must be absolutely correct. Therefore, I want to give it as much support as I can.

Thanks for the team call invitation. Unfortunately, the timing is not suitable for me, so I would prefer to communicate asynchronously instead. I will try to keep you updated as much as possible.

Please let me know if there is anything else I should do for this PR, such as addressing change requests, adding documentation, or adding more tests.

Again, thank you so much for your time. I’m really grateful that this PR is being considered!

@matheusaaguiar
Collaborator

Hey @quangloc99! Thanks for the PR!
The Solidity team has discussed the next steps for this PR and, per @ekpyron's recommendation, it would be best to write a new interpreter for the task rather than reusing the one from the tests. It should handle only control flow and plain "pure" opcodes (e.g. arithmetic or other operations that work purely on values from the stack) and simply abort on anything else, like memory access, reverts, etc. It should also not handle memory/storage/calldata/returndata.

@quangloc99
Contributor Author

Hi @matheusaaguiar! Thanks again for taking the time to review this PR!

I will proceed to write a new interpreter. My original intention was that if there was already an interpreter implementation with existing tests, then using the existing one would be a nice and safe approach. But now I understand that the original interpreter was only meant to be a quick tool, so it makes sense to create a proper one.

With this in mind, I want to lay out what I plan to do with the interpreter:

  • First of all, I will still use the existing one as a reference. I think the current one already handles control flow very well and includes all the necessary calculations for operations.
  • Secondly, I’ll try to improve the interpreter a bit. One thing that comes to mind is the way exceptions are used. In this case, exceptions are just used to halt the interpreting process early. This isn't ideal since we might want to recover from some exceptions. I’m considering using something like std::expected. Maybe boost's outcome or std::variant could also work. Other than that, I’ll aim to clean up other parts.

I will try to keep the new interpreter restricted but with room for future extensions. It would be great if, in the future, more analysis could be done to include memory access as well.

@quangloc99
Contributor Author

Oh, one more thing I just thought of. To test the new interpreter, should I add a new test suite like test/libyul/YulInterpreterTest.h?

@matheusaaguiar
Collaborator

matheusaaguiar commented Sep 11, 2024

> Hi @matheusaaguiar! Thanks again for taking the time to review this PR!
>
> I will proceed to write a new interpreter. My original intention was that if there was already an interpreter implementation with existing tests, then using the existing one would be a nice and safe approach. But now I understand that the original interpreter was only meant to be a quick tool, so it makes sense to create a proper one.
>
> With this in mind, I want to lay out what I plan to do with the interpreter:
>
> * First of all, I will still use the existing one as a reference. I think the current one already handles control flow very well and includes all the necessary calculations for operations.
>
> * Secondly, I’ll try to improve the interpreter a bit. One thing that comes to mind is the way exceptions are used. In this case, exceptions are just used to halt the interpreting process early. This isn't ideal since we might want to recover from some exceptions. I’m considering using something like `std::expected`. Maybe `boost`'s `outcome` or `std::variant` could also work. Other than that, I’ll aim to clean up other parts.
>
> I will try to keep the new interpreter restricted but with room for future extensions. It would be great if, in the future, more analysis could be done to include memory access as well.

Sounds like a good plan. The existing one is a good basis; you can definitely use it as a reference.
I think you could use std::variant. There are some places where we already use it to return errors instead of letting them bubble up.

> Oh, one more thing I just thought of. To test the new interpreter, should I add a new test suite like test/libyul/YulInterpreterTest.h?

Yes, I think you will have to implement a specific test suite. You can use YulInterpreterTest.h as a reference. It might be possible to reuse the existing suite, but I guess it is better to have a separate one.

Feel free to ping me if you have any questions or comments :)

@matheusaaguiar
Collaborator

matheusaaguiar commented Sep 11, 2024

Hey @quangloc99
I guess I was too eager to answer and forgot to check the compilation runtime and gas costs. It seems like this PR is not affecting them significantly, either for good or bad.
@cameel already ran some tests and will probably post the results later, but first we should investigate why the proposed optimizer step is not having an impact here.
Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?

You can check the gas benchmarks by downloading the "summarized" artifacts generated by our CI job, e.g.: PR-artifact and develop-artifact, run ./scripts/externalTests/benchmark_diff.py --style humanized --output-format markdown table develop-artifact PR-artifact. That should give you a table comparing the gas costs using some real world projects.

@matheusaaguiar
Collaborator

matheusaaguiar commented Sep 11, 2024

Gas cost benchmarks

ir-no-optimize

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0%
euler
gnosis
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% -0% 0%
zeppelin

ir-optimize-evm+yul

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi +0%
ens -0.04% ✅ 0% 0%
euler +0.04% ❌
gnosis
gp2 0%
pool-together 0%
uniswap +0.01% ❌
yield_liquidator 0% +0% 0%
zeppelin +0.02% ❌ +0.06% ❌ -0.06% ✅

ir-optimize-evm-only

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0% 0% 0%
euler
gnosis
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% 0% 0%
zeppelin 0%

legacy-no-optimize

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0%
euler 0%
gnosis 0%
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% +0% 0%
zeppelin 0% -0% -0.23% ✅

legacy-optimize-evm+yul

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0% +0% 0%
euler 0%
gnosis 0%
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% +0% 0%
zeppelin 0% +0% -0.02% ✅

legacy-optimize-evm-only

project bytecode_size deployment_gas method_gas
brink 0%
colony 0%
elementfi 0%
ens 0% 0% 0%
euler 0%
gnosis 0%
gp2 0%
pool-together 0%
uniswap 0%
yield_liquidator 0% 0% 0%
zeppelin 0% +0% -0.07% ✅

!V = version mismatch
!B = no value in the "before" version
!A = no value in the "after" version
!T = one or both values were not numeric and could not be compared
-0 = very small negative value rounded to zero
+0 = very small positive value rounded to zero

@quangloc99
Contributor Author

> Hey @quangloc99 I guess I was too eager to answer and forgot to check the compilation runtime and gas costs. It seems like this PR is not affecting them significantly, either for good or bad. @cameel already ran some tests and will probably post the results later, but first we should investigate why the proposed optimizer step is not having an impact here. Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?
>
> You can check the gas benchmarks by downloading the "summarized" artifacts generated by our CI job, e.g.: PR-artifact and develop-artifact, run ./scripts/externalTests/benchmark_diff.py --style humanized --output-format markdown table develop-artifact PR-artifact. That should give you a table comparing the gas costs using some real world projects.

I'm guessing that, up till now, these projects have already hardcoded their constants as precomputed values instead of complex expressions. I'm fairly sure contract writers are aware that the current optimizer cannot handle all the cases needed to inline such constants effectively.

That's why I'm motivated to create this PR! I believe the compiler should handle the preprocessing better to let the contract writer write better, more readable code.

@quangloc99
Contributor Author

> Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?

This is a very good suggestion! Let me try that.

@matheusaaguiar
Collaborator

> > Maybe it needs to be inserted into the optimizer sequence before the Function Specializer (F)?
>
> This is a very good suggestion! Let me try that.

Sorry! I meant after the Function Specializer.

@cameel
Member

cameel commented Sep 11, 2024

Hello. As @matheusaaguiar said, I did some benchmarking here to see how helpful the new step is in practice. The general concept of the new step does make sense, so it's actually surprising how little effect it seems to have. Even if real-life projects optimize their constants manually, this should work not just for constants - I'd expect it to find lots of optimization opportunities in the helpers generated internally by the compiler. We should also see some changes in gas expectations in our semantic test cases since we definitely do not hand-optimize constants in those.

Something seems off and I think it's important to investigate and understand what's happening before we jump into making this production-ready. I can at least confirm that it does have a positive effect on code with constants so it's not that it's broken. But it does seem less effective than it could be. My initial guess was that running it before FunctionSpecializer prevents it from running on most calls with constant arguments, and it probably does, but I don't think this is enough, because I tried that and saw no improvement in benchmarks. I think it may be that it also needs to run before any inlining happens since in the current design that irreversibly removes some optimization opportunities (we have no optimizer step that can reverse inlining).

In general I think that making full use of it requires some experimentation with the sequence to see where it can have some actual impact. Also, it's worth considering whether making the step simpler and relying more on composability with other steps wouldn't yield better results. For example, reducing its scope from a whole function to a single statement. Statements with constant values are much more common, so this would let the step simplify more and not be affected by things being inlined. The step would not be able to eliminate calls, but that's a job that the FullInliner can do for you. It would also not simplify calculations spread over multiple statements, but if you run steps like ExpressionSplitter and LiteralRematerialiser before it, you'll get anything that's side-effect free combined into a single statement, so it won't make a difference. If you look at our existing steps, that's the approach we took with them, and it's the reason why they have lots of seemingly odd limitations.
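To sketch what I mean with hand-written Yul (not the actual output of these steps):

// A nested constant computation...
let x := mul(mul(8, 2), 2)

// ...after ExpressionSplitter becomes a chain of simple statements:
let t := mul(8, 2)
let x := mul(t, 2)

// An expression-level evaluator can fold the first statement to `let t := 16`,
// the Rematerialiser then puts the literal back into the second statement, and
// another pass folds that one as well:
let x := 32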

Benchmarks

Here's some detail on what I checked. It wasn't very comprehensive and @matheusaaguiar has already posted the results from external tests, which are much more meaningful, but I have the results so I may just as well post them here.

I also checked the difference in gas expectations but since there was no difference, there's nothing to post :)

Bytecode size

From test/benchmarks/local.sh:

File              | Pipeline | 0.8.27      | this PR     | this PR + (k after F)
verifier.sol      | legacy   | 4899 bytes  | 4963 bytes  | 4975 bytes
verifier.sol      | via-ir   | 4324 bytes  | 4388 bytes  | 4400 bytes
OptimizorClub.sol | legacy   |             |             |
OptimizorClub.sol | via-ir   | 22082 bytes | 22274 bytes | 22310 bytes
chains.sol        | legacy   | 5869 bytes  | 5901 bytes  | 5907 bytes
chains.sol        | via-ir   | 21401 bytes | 21433 bytes | 21439 bytes

It shows that there is some effect, but it's rather small and negative. It's not very conclusive though, since your PR is not based on 0.8.27, so the difference could be from something that was merged in the meantime.

Compilation time

From test/benchmarks/external.sh:

Project      | 0.8.27 | this PR
openzeppelin | 37 s   | 40 s
uniswap-v4   | 147 s  | 147 s
eigenlayer   | 674 s  | 669 s

At least the PR does not seem to affect compilation time in a significant way, which was one of our concerns, since we're very focused on speeding up compilation via IR right now. It could still be only due to the step not having many optimization opportunities so we should recheck later when we make it more effective.

Semantic test with gas usage

This is how I verified that the PR actually does have an effect. It's a semantic test that you can drop into a .sol file somewhere under test/libsolidity/semanticTests/. Or actually, a diff of such a test between current develop and your PR.

 contract C {}

 contract D {
     // Do something that eats a lot of gas to get the numbers over the minimum of 100k
     C c = new C();

     uint256 public constant A0 = 8;
     uint256 public constant A1 = A0 * 2;
     uint256 public constant A2 = A1 * 2;
     uint256 public constant A3 = A2 * 2;
     uint256 public constant A4 = A3 * 2;
     uint256 public constant A5 = A4 * 2;
     uint256 public constant A6 = A5 * 2;
     uint256 public constant A7 = A6 * 2;
     uint256 public constant A8 = A7 * 2;
     uint256 public constant A9 = A8 * 2;
     uint256 public constant A10 = A9 * 2;

     function f(uint256 x) public pure returns (uint256) {
         return x / A10;
     }
 }
 // ----
 // constructor() ->
-// gas irOptimized: 119400
-// gas irOptimized code: 123600
+// gas irOptimized: 117495
+// gas irOptimized code: 99800
 // gas legacy: 144563
 // gas legacy code: 438800
 // gas legacyOptimized: 121483
 // f(uint256): 1 -> 0

Something where most of the cost is coming from constants rather than something unrelated might be a better test though.

@cameel
Member

cameel commented Sep 11, 2024

Also, some notes on the things that were discussed above:

> The Solidity team has discussed the next steps for this PR and, per @ekpyron's recommendation, it would be best to write a new interpreter for the task rather than reusing the one from the tests. It should handle only control flow and plain "pure" opcodes (e.g. arithmetic or other operations that work purely on values from the stack) and simply abort on anything else, like memory access, reverts, etc. It should also not handle memory/storage/calldata/returndata.

I'd like to clarify this a bit. This is what we currently think is the best approach, but the actual next step was supposed to be taking a closer look at the current interpreter and making sure this is really a viable approach. We don't want to get into a task that will be too big and distract too much from the main point, which is the new step. If it turns out to be like that, we may still consider a different approach.

And anyway, like I said above, IMO the focus should be first on making sure that the step is actually effective and also agreeing that the overall approach is good. I.e. get the big decisions out of the way before committing to spending the effort needed to get the details right.

> • Secondly, I’ll try to improve the interpreter a bit. One thing that comes to mind is the way exceptions are used. In this case, exceptions are just used to halt the interpreting process early. This isn't ideal since we might want to recover from some exceptions. I’m considering using something like std::expected. Maybe boost's outcome or std::variant could also work. Other than that, I’ll aim to clean up other parts.

Just a note that if you do try to do that, it's best to do it in a separate refactor PR. We generally prefer smaller PRs. Anything that can be extracted from a bigger PR and still makes sense on its own, should be a separate PR to reduce the scope of reviews (which is the biggest bottleneck for us).

Also, std::expected might be nice, but I see it's C++23. We're still on C++17 (with a plan to migrate to C++20 in the near future). We do have a custom Result class you could use for this though. A variant is fine too.

> Yes, I think you will have to implement a specific test suite. You can use YulInterpreterTest.h as a reference. It might be possible to reuse the existing suite, but I guess it is better to have a separate one.

I think it would be best to reuse the current suite and simply make it execute tests using both interpreters. This way we get test coverage for both at the same time and can be sure both give the same answer. This is the approach we took in semantic tests to cover both code generation pipelines and it works very well there.

@quangloc99
Contributor Author

Hello @cameel! Thank you for taking the time to review and benchmark everything!

I’m grateful for your advice on how this PR should proceed. Based on your feedback, I’d like to share my thoughts and some of the decisions I made when creating this PR.

My motivation came from how constants are handled in Solidity. I was surprised when many of my constants weren’t inlined and were still being computed within the compiled program. This led me to investigate further, eventually resulting in this PR.

My guess as to why the benchmark results were weak remains the same. I believe that since the optimizer wasn’t able to effectively calculate, inline, and transform the code, contract developers were forced to manually compute their constants. Because of this, I think the benchmark was biased: there wasn’t a strong use of complex constant expressions, so there wasn’t much for this optimizer step to act on.

Personally, I don’t see this PR as a typical optimizer PR but more as an improvement to the development experience. With this PR, I expect developers will be able to write code that better expresses their intentions without having to focus so much on how the code gets optimized. The effect of this PR is more like a syntax change than an optimization step. It gives developers a new tool to work with, as I believe it changes the underlying semantics of constants: instead of inlining the expression, it inlines the value.

This change also leads to the question: if I want to introduce this new tool, where should I contribute? Initially, I thought I might contribute to libsolidity, as it provides the direct tools and interfaces developers use. However, contributing to libsolidity wasn’t straightforward (due to the complexity of Solidity expressions and the lack of an evaluator), and it also overlapped with the Yul optimizer. That’s why I decided to contribute to the Yul optimizer instead. In my opinion, this contribution fits well with the rest of the compiler components, with no conflicts, and this optimizer step does exactly what it says.

I understand that the interpreter is very powerful and can do much more than what’s demonstrated in this PR. However, I didn’t see it being used. With this PR, I also wanted to push for using the interpreter more widely, not just for constants or simple statements, but anywhere it’s applicable. This PR is just a proof of concept, showcasing one potential use for the interpreter. I believe there are many more ways to leverage it, but for now, I wanted to focus on this simple optimization.

I’ve been mindful of the review workload for the team, which is another reason I initially reused the interpreter from the tests. Not only did this reduce my effort, but I also thought it would make the review process easier. I apologize for the size of this PR. I tried to keep it as simple as I could.

@quangloc99
Contributor Author

quangloc99 commented Sep 12, 2024

I’d like to share my thoughts on the advice as well.

I like the idea of moving this optimizer step to evaluate statements instead of constant functions. I understand that this will give the optimizer step a broader optimization impact. However, even though these two approaches to evaluation are similar, I think there are some differences that need to be addressed:

  • Evaluating statements is not always the same as evaluating constant functions. Suppose I have three variables that call the same constant function. Evaluating the statement would lead to the constant function being evaluated multiple times, which would also impact the compiler's runtime. So, if we proceed with statement evaluation, I think we still need to keep constant function evaluation.
  • It seems to me that statement evaluation requires more effort. To achieve this, I would either need to inspect the entire statement to see if all the parameters are pure before evaluating, or I’d need to rely on the DataflowAnalyzer. Both approaches seem more complicated than what I proposed in this PR.
  • As I mentioned, constant function evaluation gives developers a tool they can use proactively, similar to other programming languages. Statement evaluation doesn’t offer developers as much flexibility.

As I mentioned before, I want to advocate for the use of the interpreter. If the interpreter gets accepted and used in production, I believe there will be more ways to utilize it than what is demonstrated in this PR.

Regarding the composability point, I will experiment as suggested. However, I believe keeping this optimizer step focused on constant functions is fine as it is. The ultimate composability is how the interpreter is used across different optimizer steps rather than this one alone. I also believe that in the future, even better optimization steps will be introduced and this one may be deprecated, which is fine as the compiler evolves.

@quangloc99
Contributor Author

With that aside, I'd like your advice on how to proceed with this PR. I have not made any additional changes other than restoring the original interpreter in test. The interpreter I added is still inside libyul, and I will make changes to it.

But for now, should I focus on experimentation and benchmarking, or should I proceed with the suggested implementation changes?

Also, for benchmarking, I want to run and see the gas usage report locally. I tried to run ./test/external_tests.py but it seems to require some environment variables, such as the RPC key. Can you help guide me through running it, please?

@cameel
Member

cameel commented Sep 16, 2024

> With that aside, I'd like your advice on how to proceed with this PR. I have not made any additional changes other than restoring the original interpreter in test. The interpreter I added is still inside libyul, and I will make changes to it.

We discussed it again today and we think that reusing a subset of the current interpreter should be good enough after all, but it will still require some work to make sure it's reliable. If you can make it solid, we'd be happy to merge it, and then you could use it as a base for your new step.

I created an issue for that, #15435, and added some details about the implementation, which hopefully answer your questions related to that part. If you have more, let's continue the discussion there.

> Personally, I don’t see this PR as a typical optimizer PR but more as an improvement to the development experience.

Fair enough, I now see where you're coming from. I just think we can achieve both with not that much extra effort. In many ways it would even be simpler. We can have a step that can make constants usable and also make the optimizer more powerful.

But for now we decided we're OK with such a limited step targeting constants as an initial step. We'd generally prefer it to become something more generic, usable in more optimization scenarios and composable with other steps - it would be a waste not to, having gone through the trouble of preparing the interpreter for production use - but we can think about that when we're done with this PR. Having something like this is already helpful. It's a welcome contribution.

Just a note that we'd rather mark it as experimental and keep it out of the default sequence at first. We're a bit wary of potential issues in the interpreter, and we treat optimizer bugs very seriously, so we really want to see people use it carefully first and report back. Then we'd consider including it in the sequence.

> Initially, I thought I might contribute to libsolidity, as it provides the direct tools and interfaces developers use.

Doing it at Yul level is definitely the better idea for a contribution. It makes the scope actually manageable. We have #3157 for the Solidity-level equivalent, but that's a much more complex task and is actually blocked on some design decisions. It's a very important point on our roadmap, but it'll take us some time to get there.

@cameel
Member

cameel commented Sep 16, 2024

As to your more specific questions about my suggestion to make the step simpler:

> Evaluating statements is not always the same as evaluating constant functions. Suppose I have three variables that call the same constant function. Evaluating the statement would lead to the constant function being evaluated multiple times, which would also impact the compiler's runtime. So, if we proceed with statement evaluation, I think we still need to keep constant function evaluation.

In that scenario you would first evaluate statements inside the function, which would be done once. If the function is simple enough to be eliminated by your step then mine would reduce it to something like return <number>. Then the function would be simple enough to be inlined at the point of use of each constant by the inliner.
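Roughly, with illustrative Yul (names made up):

// The constant function after expression-level evaluation inside its body:
function constant_A10() -> ret {
    ret := 8192
}

// A body this trivial is easily inlined, so a use site like
let v := div(x, constant_A10())
// ends up as
let v := div(x, 8192)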

> It seems to me that statement evaluation requires more effort. To achieve this, I would either need to inspect the entire statement to see if all the parameters are pure before evaluating, or I’d need to rely on the DataflowAnalyzer. Both approaches seem more complicated than what I proposed in this PR.

Looking at how simple the step is currently, perhaps it would be a bit more complex. But it would be more powerful and still not very complex overall, because it would rely on other steps to do the complex stuff. For example, it would not need to use the Dataflow Analyzer. You could run the Structural Simplifier first, which already uses the Dataflow Analyzer and would deal with any constant control flow. You would also not have to check whether variables are pure - the Rematerialiser and Literal Rematerialiser would convert the code into a form where the final value appears directly in the expression - you'd only check whether it's a literal and skip the expression if not.

Also keep in mind that it would not necessarily happen all at once, some steps or patterns get repeated multiple times, because each pass may reveal new opportunities. For example you might run your step first to evaluate conditionals, then Structural Simplifier to unroll the ifs and loops and then Expression Joiner to join what you can into a single expression and finally your step again to evaluate things that now potentially got consolidated into longer constant expressions.

BTW, I initially said it should work at a statement level, but it should rather be expression level. I was thinking about this being necessary to handle control flow, but then realized that Structural Simplifier will do it.

> As I mentioned, constant function evaluation gives developers a tool they can use proactively, similar to other programming languages. Statement evaluation doesn’t offer developers as much flexibility.

I think you're still not seeing fully what I mean. Yes, the step on its own would only deal with expressions/statements. But it would be combined with other steps, and together they would be able to do what yours does and even more. Basically, your current step is closer in complexity to a fixed sequence of our current simple steps. Ours are more flexible, because we can combine them in many different ways, like LEGOs.

> Also, for benchmarking, I want to run and see the gas usage report locally. I tried to run ./test/external_tests.py but it seems to require some environment variables, such as the RPC key. Can you help guide me through running it, please?

Well, this one is a pain to run locally. Here's how we run it in CI. But it should not normally require any RPC key so something must have gone wrong.

Maybe try to run it for a single project first, say, OpenZeppelin. Check if this works:

test/external_tests.py test \
    --solc-binary-type native \
    --solc-binary-path /path/to/solc \
    --run zeppelin

This will take a while to run and drop gas measurements in reports/externalTests/. Then you'll have to run scripts/externalTests/summarize_benchmarks.py and scripts/externalTests/benchmark_diff.py on them if you want a nice table like the one posted by @matheusaaguiar above.

But we usually don't run it locally because that's a royal pain to do for all ~10 projects. We go to the artifacts of c_ext_benchmarks job on CircleCI and just grab the file from there and run scripts/externalTests/benchmark_diff.py on that.

We also have scripts/externalTests/download_benchmarks.py that automates downloading them, but that one unfortunately does require an API key. It used to work without it but now CircleCI requires tokens.

@quangloc99
Contributor Author

Hi cameel!

I really appreciate your feedback and for welcoming this PR!

Regarding the open issue with the side-effect-free interpreter, I think the spec is clear. I’ll create a separate PR for that. Once it's ready, I’ll integrate the new interpreter into this PR and then we can wrap it up.

On the discussion side, I understand the LEGO analogy, and I know the current step seems very limited for now. It’s still hard to mix it with other steps to really push it to its potential. I would love to contribute to this in the future, but I agree that we should treat this PR as a base step and work on making it more powerful later.

I’m also curious about how to make this an experimental feature. How would that work? Would we add a flag to enable a new optimization sequence? And how would we gather feedback on it? I apologize if I’m asking too many questions, but I’m genuinely interested.

Once again, thank you for your feedback and for your patience with all my questions!

@cameel
Member

cameel commented Sep 18, 2024

> I’m also curious about how to make this an experimental feature.

There isn't that much to it. We don't have any specific mechanism for that (aside from the experimental pragma, but that's really only viable for language features, not compiler features). It's more conceptual really - such features are never enabled by default, usually undocumented, we do not provide a changelog for them, and they're not even guaranteed to actually work in a release. So the requirements for working on them are less stringent until we decide to make them official. In the case of your step, the most important aspect would be not including it in the default sequence.

> And how would we gather feedback on it?

We'll probably say something about it in release notes and also advertise it a bit using the usual channels, like the forum or social media. Just making it clear that it's experimental and people should not rely on it in production.
