Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial v2 refactor #131

Draft
wants to merge 4 commits into
base: ts_refactor
Choose a base branch
from

Conversation

xogeny
Copy link
Contributor

@xogeny xogeny commented Jan 22, 2018

Lots of changes here. First and foremost, this organizes the code into es6 modules.
I tried to hit the big functional bigs: tokenizer, parser, evaluator, functions and separate
them out. The parser was particularly involved so I actually further decomposed that
so that files were at least a reasonable size. I don't have particularly strong feelings
about the taxonomy withing the parser package, just trying to get things to a
manageable state.

There are several other big changes worth noting here in the tooling. First, I migrated
the code to TypeScript. I think TypeScript is a particularly good fit for AST related
work because of the tagged union support. But it also really helps with refactoring
so I think it will be pretty useful moving forward. It all compiles out to es6 and es5
and can be bundled with rollup. So it doesn't impact deployment at all, just
development.

I moved the tests to jest. I find Jest is a little easier to configure out of the box
and things like coverage are well supported.

I used yarn to create a yarn.lock file. We could also use npm v5, but since
jsonata currently supports Node v4, and npm v5 only ships with Node v7+.
Bottom line is that yarn (or npm5) give much better reproducibility on installs.

Even with all these changes, the test suite still passes 100% of tests with 100% coverage!

Lots of changes here.  First and foremost, this organizes the code into es6 modules.
I tried to hit the big functional bigs: tokenizer, parser, evaluator, functions and separate
them out.  The parser was particularly involved so I actually further decomposed that
so that files were at least a reasonable size.  I don't have particularly strong feelings
about the taxonomy withing the parser package, just trying to get things to a
manageable state.

There are several other big changes worth noting here in the tooling.  First, I migrated
the code to TypeScript.  I think TypeScript is a particularly good fit for AST related
work because of the tagged union support.  But it also really helps with refactoring
so I think it will be pretty useful moving forward.  It all compiles out to es6 and es5
and can be bundled with rollup.  So it doesn't impact *deployment* at all, just
development.

I moved the tests to `jest`.  I find Jest is a little easier to configure out of the box
and things like coverage are well supported.

I used `yarn` to create a `yarn.lock` file.  We could also use `npm` `v5`, but since
`jsonata` currently supports Node `v4`, and `npm` `v5` only ships with Node v7+.
Bottom line is that `yarn` (or `npm5`) give much better reproducibility on installs.

Even with all these changes, the test suite still passes 100% of tests with 100% coverage!
@coveralls
Copy link

coveralls commented Jan 22, 2018

Coverage Status

Coverage remained the same at 100.0% when pulling f5420d9 on xogeny:initial-v2-refactor into a753f42 on jsonata-js:ts_refactor.

@andrew-coleman
Copy link
Member

Looks like jest doesn't support node v4

@mattbaileyuk
Copy link
Member

They dropped support at the end of last year in version 22; we could lock to a backlevel (21) but not sure how long that would be for. Might need to decide on the future of node v4 support in JSONata as it drops completely from any further fixes (even security) in April. Might depend on node-red, though.

@xogeny
Copy link
Contributor Author

xogeny commented Jan 23, 2018

OK, it turns out that downgrading to jest v21 seems to work. I'll push out a new version of the branch that should hopefully address this.

This will allow it to work with Node 4, in theory.
@xogeny
Copy link
Contributor Author

xogeny commented Jan 23, 2018

...and it seems that downgrading wasn't sufficient.

One big priority was removing optional fields from AST nodes and tagging AST nodes
with `global` properties (like `keepArray`, see below).

I haven't changed the AST definitions at all.  The `type` and `value` fields should have
the same values they always had.  However, in the types I'm using tagged unions for all
the AST nodes.  Generally, the `type` field is used to narrow the type.  But in some cases,
I further refine type by `value` (which you can really think of as a subtype for the AST
nodes).  This basically allows me to narrow the type further in the code and do more
static checking during compilation to make sure that the proper fields are being read
and/or written.

One "breaking change" is with the parser recovery tests.  These aren't really breaking
in the sense that they would impact most users.  They have to do with including some
additional lexical information.  This information was, generally speaking, already there.
But there were cases where it wasn't included for some AST nodes.  So the change is
really just including it consistently.

Another noticeable change is with error handling.  Previously, the syntax errors were
annotated on the root node of the AST.  I've changed that so they don't impact the AST.
I've also changed the error semantics very slightly.  Now, if there are no errors, the
`errors` field in an expression will return an empty array (previously, it returned
`undefined`).

One completely internal change was the elimination of the `keepArray` attribute
on AST nodes.  This was added whenever the `[]` was found in an expression.  I've
changed the handling so that instead of having the parser walk through the AST and
find the head of the path that contains the `[]` operator and then set the `keepArray`
attribute on it, I'm simply adding a special "decorator" AST node that is optimized
away eventually.  This keeps the parser focused just on building the initial AST and
leaves the semantics of the `SingletonArrayDecorator` (the node that gets inserted
as a result of an `[]` in a path) to be handled completely during `ast_optimize`.  It is
important to note that this special AST node is always optimized away.

In `ast_optimize` I generate a new AST node as a result of optimization in each `case`
of the `switch` statement.  This is mainly to keep the typing clean (so we know precisely
what we are working with at any given time).
@DaniGuardiola
Copy link

What is the status of this? Need any help?

@mtiller
Copy link
Contributor

mtiller commented Apr 17, 2019

I actually got the code in reasonable shape. I believe the entire original test suite passes in my version along with probably a few additional tests I added. You can find it on the v2-refactor branch in my fork.

I do not plan to continue that work. Ultimately, the jsonata project operates under various (reasonable) constraints that just made it difficult for me to take the code where I wanted it to go. But hopefully somebody can incorporate some of the ideas I pursued there. A bit part of my refactoring was:

  • Typing all AST node types to allow type checking during interpretation
  • Using multiple files rather than just one
  • Using TypeScript
  • Changing the "working value" model for evaluation so that it used a "boxed" value rather than trying to attach various attributes internal attributes to the value itself. There may very well be performance degradation in what I did, FYI.

The biggest issues I found were several cases of what I would call "inconsistent semantics" where the evaluation of a node was context sensitive. Another issue I had was related to how to handle sync and async cases without having to rewrite massive amounts of code. I kind of managed this I think, not sure. I was shooting for some abstractions where the fundamental evaluation would always be the same, but the abstractions would take care of issues like sync vs. async. This is slightly related to another goal I had which was to be able to apply jsonata expressions to databases. The issues with sync vs. async also come into play there.

I also found the way that lambdas were being evaluated hard to follow and abstract.

If anybody has any questions, I'd be happy to answer them.

@mattbaileyuk mattbaileyuk marked this pull request as draft May 4, 2022 07:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants