-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use prism parser via translator #388
Conversation
The problem seems to be that
|
@exterm Can you use prism v0.20.0 itself? It has |
Oooh! |
b4cf971
to
1fc0e51
Compare
I'm running into lots of errors that look like I'm not requiring the right thing.
When I add a
|
Ah! It needs a |
Ahh sorry it's actually the opposite, you should only require |
1f8b4a4
to
a690e33
Compare
Done. I had to change a test because the error message was different. Ideally it would be the same as from |
Error messages are going to be different because they're coming straight from prism. They're meant to be more informative than CRuby, and as such they're likely to be different from the parser gem. The message themselves we're not going to make any guarantees about, as they are likely to change as we continue to improve error recovery. |
OK! I can live with that. |
ab47897
to
0499520
Compare
0499520
to
d36f90a
Compare
Green tests! Thank you a lot for your help (and all the work on Prism of course), Kevin. Will try this out tomorrow on a substantial Rails app. |
@gmcgibbon FYI it seems to "just work" according to the test suite. Real world testing before merge seems wise. |
Seems to break on numbered parameters in blocks:
The code in question is something like I currently believe that this is a bug in prism, respectively the translator |
I'll wait for the next prism release with the bugfix before I make this "ready for review". I tested with the fix on the largest Rails app I currently have access to (~1m lines of Ruby code) and it works just fine. Speedup seems to be about 30% for a full |
@exterm I'll release a patch release on Monday first thing. It'll be v0.20.1. Thank you very much for your work on this! I'm still very interested in replacing the parser with prism as a whole (that should be much better than using the translation layer) but I recognize that's quite a bit more involved. Is that something you're still interested in working on? I'd be happy to help support in any way I can. |
@kddnewton What's the advantage to packwerk from skipping the translation layer? |
It should be quite a bit faster (i.e., an order of magnitude), because it won't have to go through the whole translation. The other very tangential, loosely-defined benefit is that it uses the same AST as Ruby, so anyone contributing to packwerk will be gaining knowledge of the AST that all of the Ruby implementations use. This means new contributors won't have to learn a new AST, and in a world where prism is the parser everywhere (the world I'm trying to create) this means a lower barrier to entry for new contributors. |
The last thing is that we've worked hard to provide a really good Ruby API. Instead of dealing with a single class with a type field, all of our nodes have their own classes. We have utility functions on all of the classes that should make it quite a bit easier to use. Also, if you need anything, you have a line directly to the main contributor and I'll add whatever functions you may need. |
I whipped up a quick benchmark and it looks like omitting the translation could indeed provide a huge boost. |
@exterm v0.21.0 is out now with these bug fixes |
@kddnewton thank you, will plug it in when I find some time. Probably later today.
I was hoping to make a full parser replacement paid work, but I'm having trouble making a business case for it. Gusto now has a Rust reimplementation of packwerk that is a lot faster, and probably faster than we can make packwerk. I'm still interested in this for fun, but I'll have to find time for it. |
Testing this. I am atm stuck on prism 0.17 due to our version of rbi. Had to read through the discussion to see the version dependency. |
mmmh, interesting. That might make a release incorporating this a little harder. |
I just opened a PR to add support for Ruby 2.7, so this won’t be an issue
once I release that.
…On Mon, Feb 5, 2024 at 7:20 PM Philip Müller ***@***.***> wrote:
mmmh, interesting. That might make a release incorporating this a little
harder.
—
Reply to this email directly, view it on GitHub
<#388 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABG3P3QLU7F64UQFR2KNRJ3YSFZNZAVCNFSM6AAAAABCVUGFHKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMRYGU2TCOBXHE>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I was going to update prism and mark this PR ready for review, but in that case I'll wait for Ruby 2.7 support in prism. |
@exterm version 0.22.0 supports Ruby 2.7, so you should be able to use that. |
555c044
to
206c5ba
Compare
288f1fc
to
b4566b2
Compare
b4566b2
to
59acfdf
Compare
This is ready to go |
packwerk.gemspec
Outdated
@@ -50,6 +50,7 @@ Gem::Specification.new do |spec| | |||
# For Ruby parsing | |||
spec.add_dependency("ast") | |||
spec.add_dependency("parser") | |||
spec.add_dependency("prism", ">= 0.22.0") # 0.21.0 fixes numbered block params, 0.22.0 adds Ruby 2.7 compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to use parser still or can we drop it as a dependency now that we're on prism?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bundler doesn't have peer dependencies, but if it did parser would be one. When the translation layer gets loaded it tries to require parser. We don't have it as a direct dependency because we don't want to impose that on all of prism's consumers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah besides, in this PR packwerk still uses Parser::AST
and a few other things.
A "clean" migration to Prism without using the translation layer, and using prism's AST format instead, would let us get rid of parser
AFAICT. But it's significantly more work.
I've looked into it a bit, and among other things NodeHelpers
has to be completely rewritten. I'm thinking we may want to completely remove it though since prism's AST format is a lot more user friendly and may not need this kind of leaky wrapper around it.
(@kddnewton the only thing that's been a little annoying is that the types don't seem to represent shared interfaces - e.g. between symbol and string nodes, or between class and module nodes. What's a good place to discuss these things?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can open a discussion on github.com/ruby/prism.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bundler doesn't have peer dependencies
It does, but not directly. You can use gem
entries like requires to detect if a gem exists in an application's bundle. At least, this is what I was inferring to do.
in this PR
packwerk
still uses Parser::AST and a few other things.
Yes, I see now we're using a translation layer, and we're halfway between using prism and parser. Peer dependencies aren't appropriate here. I don't see any actual benchmarks, but I too believe this should be faster. I'll put together a quick benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before:
📦 Finished in 16.88 seconds
bin/packwerk check 0.30s user 0.04s system 1% cpu 33.053 total
After:
📦 Finished in 17.98 seconds
bin/packwerk check 0.29s user 0.04s system 0% cpu 34.413 total
Looks like we're at the same-ish difference to me now over multiple runs. How many times faster do we expect Prism to be using the translation layer? 5x seems like a stretch to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It heavily depends on the file unfortunately. The biggest issue is that the parser gem uses character offsets, whereas the prism gem uses byte offsets. If the file has any multibyte characters, we have to run everything through an offset filter that takes time to build and run. It's much faster if the file has no multibyte characters.
That being said, this is only going to impact the parsing speed. So if parsing isn't a huge bottleneck in packwerk itself, it's not going to be a massive win. Of course the bigger win would be translating the entire thing to using prism directly, but that's far more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@exterm do you have an application that benefits from this change? If you do, please provide benchmarks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kddnewton parsing is about 2/3 of packwerk runtime on large apps. So I'd expect a big speedup and I did get a big speedup on the app I was working on (~1 million Ruby LOC). I'll do another benchmark after updating to prism 0.23.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually @gmcgibbon and @exterm can you both try v0.24.0 (sorry). Aaron and I worked on a performance improvement for creating the AST which should yield some significant speed improvements.
@gmcgibbon can we get this merged? Shopify will save money in CI 😉 |
Up to date measurements from a reasonably sized Rails app, using spring. Sadly I can only currently access it through a Github Codespace, so it's a highly virtualized environment. I did not get a lot of variation on repeated runs though. Packwerk 3.1.0cached: ~2.6s Packwerk 3.1.0 with
|
For a single 3k line file, the difference is about 0.8s vs 1.1s - it's not dramatic, but it's faster. Moving off the translator layer would be better but is more work. It could be done as a follow-up to this PR. |
Right, I think I'm fixating on the wrong type of run. I believe there's an improvement on uncached runs on our app too. Uncached before:
Uncached after:
That's considerably faster, and CI would use uncached, so that's good enough for me. |
packwerk.gemspec
Outdated
@@ -50,6 +50,7 @@ Gem::Specification.new do |spec| | |||
# For Ruby parsing | |||
spec.add_dependency("ast") | |||
spec.add_dependency("parser") | |||
spec.add_dependency("prism", ">= 0.23.0") # 0.24.0 fixes a performance issue with the parser translator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to 0.24.0, or we can be more conservative and use 0.22.0 as the minimum with 2.7 support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to but somehow I only changed the comment 🤦🏼
f3baf86
to
ff6e849
Compare
Thank you @exterm, I'll run a release for this now. |
@gmcgibbon do you want to add a note about the expected speedup (~30% for uncached runs on large apps) to the release notes? Or maybe just say "this makes it faster" or something... Right now it's not really clear from the release notes why the change was made. I don't have a strong opinion, just making a suggestion. |
25% faster on the main Rails app I work in 👍 |
What are you trying to accomplish?
Proof of concept for discussion #387. Packwerk spends about 2/3 of the time it runs on a large code base in parsing Ruby. Prism is ~5x faster than the current parser (
whitequark/parser
). So this should be a significant speed boost for packwerk.It looks like this could be a simple, almost trivial change.
What approach did you choose and why?
We could use Prism directly, but that would require significant changes to Packwerk. Luckily @kddnewton has included a translator with prism that provides compatibility with
whitequark/parser
, so that it should be a drop-in replacement. We won't get the same performance improvements, but we get some, and almost for free.What should reviewers focus on?
Type of Change
Additional Release Notes
Include any notes here to include in the release description. For example, if you selected "breaking change" above, leave notes on how users can transition to this version.
If no additional notes are necessary, delete this section or leave it unchanged.
Checklist