Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename trait LanguageScoper to Query #153

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

bheylin
Copy link
Contributor

@bheylin bheylin commented Oct 27, 2024

The use of LanguageScoper is a type seems to be a misnomer.

Query is implemented on a CompiledQuery and a LanguageScoper is used to query a given set of text to produce a list of Scope's.

With this rename, a lang::Query produces Scope's and 'language scopingin a process that acceptsQuery's as input and outputs Scope`'s.

`LanguageScoper` is a misnomer. `Query` is implemented on a
`CompiledQuery` and a `LanguageScoper` is used to query
a given set of text to produce a list of `Scope`'s.
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 86.77%. Comparing base (b53bf21) to head (558c81f).

Files with missing lines Patch % Lines
src/main.rs 92.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #153   +/-   ##
=======================================
  Coverage   86.77%   86.77%           
=======================================
  Files          32       32           
  Lines        1883     1883           
=======================================
  Hits         1634     1634           
  Misses        249      249           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@alexpovel
Copy link
Owner

Hmm, not sure I follow. When you say

Query is implemented on a CompiledQuery

what is Query in this context?

a LanguageScoper is used to query a given set of text to produce a list of Scope's.

Yes! But isn't it just as valid to say: a LanguageScoper is used to scope a given set of text to produce a list of Scope's. It just seems like a synonym.

What we name it in the code I'm not attached to, but the user-facing side (docs, CLI) should make sense. Currently, the concepts are simple: there's scopers and actions. Regex can scope, and so can a literal string when we pass --literal-string. Programming languages are simply also scopers, in the same vein. "Query" would be a new concept and term, and arguably it's an abstraction leak, with tree-sitter shining through. Users shouldn't need to know anything about that.

Currently, scopers are just things which take input text, and will discard parts (or none) of it as "out of scope". That abstraction mechanism works across regex, literal strings, and syntax "queries" (and maybe more in the future; and it also works for internal stuff like DosFix).

The trait is

pub trait LanguageScoper: Scoper + Find + Send + Sync {

which I intend as "LanguageScoper are Scoper, but have some extra special sauce" (inheritance sorta).

What do you reckon?

@bheylin
Copy link
Contributor Author

bheylin commented Oct 27, 2024

My comments here are from the point-of-view of a newcomer to srgn as a tool and a codebase, and as a newcomer to
writing tree-sitter queries.

Hmm, not sure I follow. When you say

Query is implemented on a CompiledQuery

what is Query in this context?

I mean to say here that it makes sense to me that the trait that is implemented on CompiledQuery would be called Query instead of LanguageScoper given that's it's used to define a set of polymorphic query objects.

It's a simpler statement, almost obvious to a reader, that a rust::CompiledQuery is transformed into a Vec<dyn Query>.

a LanguageScoper is used to query a given set of text to produce a list of Scope's.

Yes! But isn't it just as valid to say: a LanguageScoper is used to scope a given set of text to produce a list of Scope's. It just seems like a synonym.

Yea it's a synonym. But I'll argue below against needlessly using a synonym.
And to be clear, I'm not going to die on this hill 😆
I just think that the code base would be clearer for me if query is used more consistently.

What we name it in the code I'm not attached to, but the user-facing side (docs, CLI) should make sense. Currently, the concepts are simple: there's scopers and actions. Regex can scope, and so can a literal string when we pass --literal-string. Programming languages are simply also scopers, in the same vein. "Query" would be a new concept and term, and arguably it's an abstraction leak, with tree-sitter shining through. Users shouldn't need to know anything about that.

I look at srgn as having a strong association with tree-sitter and it's concepts. So I don't see tree-sitter concepts that are forwarded through the interface of srgn as a problem, or something extra to think about. Prepared queries allow a user to avoid tree-sitter to some degree, but the user will soon arrive at a point where they want to write a more complex query. And then they will have to learn at least basic tree-sitter query syntax.

At this stage, trying to hide tree-sitter concepts by mapping them to new srgn concepts is a higher cognitive load on the user. That, I argue, is the abstraction leak.

That's all based on my reading that tree-sitter is fundamental to srgn. Maybe it isn't and you want to keep it abstracted?

The other point I want to highlight are existing concepts and common meanings in English.
Take regex for example, the input is a match pattern or pattern or expression , with pattern being an existing word with a common meaning in English. Saying: "a pattern produces matches" won't surprise many English speakers. You could also say: "a query produces matches" and most people will still get what you mean. Indeed everyone is well used to "querying search engines" and such.

But if you say: "language scopers produce scopes", you have some explaining to do. Even to a technically literate audience. Is the srgn scope the same as a tree-sitter scope? So I don't see what the change in terminology is adding over the plain use of pattern/query and matches.

The concept of action is clear, only the input to an action needs to be explained I think.

A nomenclature of queries, matches, actions and the concept that matches are filtered using queries gets you quite far I think without having to introduce a word-phrase like language-scoper.

On the subject of interface concepts being mapped clearly onto the codebase. I am a fan of doing that unless there's a good reason not to. In this case I think it's pretty straight forward to map the UI concepts onto the codebase, where the code will of course be more nuanced than the concepts the users expresses themselves with. But why use different names for the same concept?

Anyways this is all from my admittedly quite shallow interactions with srgn so far.
So take it all with a grain of salt 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants