Remove duplicate Symbols caused by a Label #33

mpsijm · 2019-11-04T16:05:53Z

The Problem: In NormGrammarReader.processSymbol, a cache is maintained for symbols that are already processed. However, this cache did not take into account that Symbols with and without a Label should be treated as equal (e.g. tail:Stm and Stm, see org.spoofax.jsglr2.integration/src/main/resources/grammars/layout-sensitive.sdf3). These labels are only used to reference the symbols from the layout constraints, but do not actually contribute to the meaning of a production.

Why is this a problem? For SLR parse table generation (see #27), all grammar symbols are assumed to be unique. However, because Symbols with or without a Label were not treated as equal, the "graph" used to calculate the First- and Follow-sets became disconnected. This in turn caused some symbols to have empty First-/Follow-sets. The productions belonging to these symbols would not get any reduce actions in the parse table, because if the Follow-set is empty, no lookahead is allowed.

Current solution: When processing symbols in the NormGrammarReader, any Label constructor is removed from the symbol term.

Discussion: I have tried looking around where these labels are used (I am assuming layout constraints, but I couldn't find how they are processed in NormGrammarReader).
Depending on their usage, it might be a good idea to apply this filter in Stratego already, e.g. in the strategy module-to-normal-form. In that case, this PR can be closed without merging.
@udesou any feedback is appreciated, as always 🙂

@udesou

**The Problem:** In `NormGrammarReader.processSymbol`, a cache is maintained for symbols that are already processed. However, this cache did not take into account that Symbols with and without a Label should be treated as equal (e.g. `tail:Stm` and `Stm`, see org.spoofax.jsglr2.integration/src/main/resources/grammars/layout-sensitive.sdf3). These labels are only used to reference the symbols from the layout constraints, but do not actually contribute to the meaning of a production. **Why is this a problem?** For SLR parse table generation (see metaborg#27), all grammar symbols are assumed to be unique. However, because Symbols with or without a Label were not treated as equal, the "graph" used to calculate the First- and Follow-sets became disconnected. This in turn caused some symbols to have empty First-/Follow-sets. The productions belonging to these symbols would not get any reduce actions in the parse table, because if the Follow-set is empty, no lookahead is allowed. **Current solution:** When processing symbols in the `NormGrammarReader`, any Label constructor is removed from the symbol term. **Discussion:** I have tried looking around where these labels are used (I am assuming layout constraints, but I couldn't find how they are processed in NormGrammarReader). Depending on their usage, it might be a good idea to apply this filter in Stratego already, e.g. in the strategy `module-to-normal-form`. @udesou any feedback is appreciated, as always 🙂

mpsijm added 2 commits November 4, 2019 16:59

Add TODO about moving removeLabelFromSymbolTerm to Stratego

ab39635

jasperdenkers merged commit 0c47f4b into metaborg:develop/jsglr2 Nov 12, 2019

mpsijm deleted the remove-duplicate-symbols-with-label branch November 12, 2019 08:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove duplicate Symbols caused by a Label #33

Remove duplicate Symbols caused by a Label #33

mpsijm commented Nov 4, 2019

Remove duplicate Symbols caused by a Label #33

Remove duplicate Symbols caused by a Label #33

Conversation

mpsijm commented Nov 4, 2019