Fix dstr unparsing
* This is an entirely new approach.
* Instead of trying to find the "correct" dstr segments, we simply try all of them and unparse
  the first one that round trips.
* So far this guarantees we always get good concrete syntax, but it can be time intensive, as
  the combinatoric space of possible dynamic string sequences is quadratic in the number of dstr children.
* For this reason, dstr nodes with more than a (currently fixed) number of children are first
  tried as a heredoc.
* Passes the entire corpus and fixes bugs.

[fix #249]
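
The "try all candidates and emit the first one that round trips" idea from the commit message can be sketched roughly as below. This is a minimal illustration, not unparser's implementation: `first_round_trip`, `TOY_PARSE`, and the toy AST shape are all hypothetical names invented for this example.

```ruby
# Hypothetical sketch: neither the method name nor the toy parser comes
# from unparser; they only illustrate "emit the first candidate source
# that parses back to the expected AST".
def first_round_trip(candidates, expected_ast, parse)
  # `lazy` avoids examining further candidates once one is accepted.
  candidates.lazy.find do |source|
    parse.call(source) == expected_ast
  rescue ArgumentError
    # A candidate the toy parser rejects simply does not round trip.
    false
  end
end

# Toy "parser": turns "1+2" into [:add, 1, 2] and raises ArgumentError
# for anything that is not an addition of two integers.
TOY_PARSE = lambda do |source|
  left, right = source.split('+', 2)
  raise ArgumentError, "not an addition: #{source}" if right.nil?

  [:add, Integer(left), Integer(right)]
end

CANDIDATES = ['1 2', '12', '1+2', '2+1'].freeze

puts first_round_trip(CANDIDATES, [:add, 1, 2], TOY_PARSE).inspect
```

The search returns `nil` when no candidate round trips, which leaves the caller free to fall back to another strategy or report a failure.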
mbj committed Sep 20, 2024
1 parent 2163503 commit 4e16935
Showing 58 changed files with 1,299 additions and 857 deletions.
7 changes: 7 additions & 0 deletions Changelog.md
@@ -1,3 +1,10 @@
# v0.7.0 2024-09-16

[#366](https://github.com/mbj/unparser/pull/366)

* Fix all known dstring issues.
* Interface changes.

# v0.6.15 2024-06-10

[#373](https://github.com/mbj/unparser/pull/373)
2 changes: 2 additions & 0 deletions Gemfile
@@ -2,4 +2,6 @@

source 'https://rubygems.org'

gem 'mutant', path: '../mutant'

gemspec
28 changes: 16 additions & 12 deletions Gemfile.lock
@@ -1,7 +1,17 @@
PATH
remote: ../mutant
specs:
mutant (0.12.4)
diff-lcs (~> 1.3)
parser (~> 3.3.0)
regexp_parser (~> 2.9.0)
sorbet-runtime (~> 0.5.0)
unparser (~> 0.6.14)

PATH
remote: .
specs:
unparser (0.6.15)
unparser (0.7.0)
diff-lcs (~> 1.3)
parser (>= 3.3.0)

@@ -12,14 +22,8 @@ GEM
diff-lcs (1.5.1)
json (2.7.2)
language_server-protocol (3.17.0.3)
mutant (0.12.3)
diff-lcs (~> 1.3)
parser (~> 3.3.0)
regexp_parser (~> 2.9.0)
sorbet-runtime (~> 0.5.0)
unparser (~> 0.6.14)
mutant-rspec (0.12.3)
mutant (= 0.12.3)
mutant-rspec (0.12.4)
mutant (= 0.12.4)
rspec-core (>= 3.8.0, < 4.0.0)
parallel (1.25.1)
parser (3.3.2.0)
@@ -62,15 +66,15 @@ GEM
rubocop-packaging (0.5.2)
rubocop (>= 1.33, < 2.0)
ruby-progressbar (1.13.0)
sorbet-runtime (0.5.11422)
sorbet-runtime (0.5.11572)
strscan (3.1.0)
unicode-display_width (2.5.0)

PLATFORMS
ruby
x86_64-linux

DEPENDENCIES
mutant (~> 0.12.2)
mutant!
mutant-rspec (~> 0.12.2)
rspec (~> 3.9)
rspec-core (~> 3.9)
96 changes: 92 additions & 4 deletions bin/corpus
@@ -1,10 +1,14 @@
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'etc'
require 'mutant'
require 'optparse'
require 'pathname'
require 'unparser'

Thread.abort_on_exception = true

module Unparser
module Corpus
ROOT = Pathname.new(__dir__).parent
@@ -17,16 +21,85 @@ module Unparser
#
# @return [Boolean]
def verify
puts("Verifying: #{name}")
checkout
command = %W[unparser #{repo_path}]
exclude.each do |name|
command.push('--ignore', repo_path.join(name).to_s)

paths = Pathname.glob(Pathname.new(repo_path).join('**/*.rb'))

driver = Mutant::Parallel.async(
config: Mutant::Parallel::Config.new(
block: method(:verify_path),
jobs: Etc.nprocessors,
on_process_start: ->(*) {},
process_name: 'unparser-corpus-test',
sink: Sink.new,
source: Mutant::Parallel::Source::Array.new(jobs: paths),
thread_name: 'unparser-corpus-test',
timeout: nil
),
world: Mutant::WORLD
)

loop do
status = driver.wait_timeout(1)

puts("Processed: #{status.payload.total}")

status.payload.errors.each do |report|
puts report
fail
end

break if status.done?
end
Kernel.system(*command)

true
end

private

class Sink
include Mutant::Parallel::Sink

attr_reader :errors, :total

def initialize
@errors = []
@total = 0
end

def stop?
!@errors.empty?
end

def status
self
end

def response(response)
if response.error
Mutant::WORLD.stderr.puts(response.log)
fail response.error
end

@total += 1

if response.result
@errors << response.result
end
end
end

def verify_path(path)
validation = Validation.from_path(path)

if original_syntax_error?(validation) || generated_encoding_error?(validation) || validation.success?
return
end

validation.report
end

def checkout
TMP.mkdir unless TMP.directory?

@@ -50,6 +123,21 @@ module Unparser
TMP.join(name)
end

private

# This happens if the original source contained a non UTF charset meta comment.
# These are not exposed to the AST in a way unparser could know about to generate a non UTF-8
# target and emit that meta comment itself.
# For the purpose of corpus testing these cases are ignored.
def generated_encoding_error?(validation)
exception = validation.generated_node.from_left { return false }
exception.instance_of?(Parser::SyntaxError) && exception.message.eql?('literal contains escape sequences incompatible with UTF-8')
end

def original_syntax_error?(validation)
validation.original_node.from_left { return false }.instance_of?(Parser::SyntaxError)
end

def system(arguments)
return if Kernel.system(*arguments)

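
The `Sink` added to `bin/corpus` above counts processed files, collects failure reports, and asks the driver to stop once a report arrives. A dependency-free sketch of that shape follows; `CollectingSink` and `drain` are hypothetical stand-ins invented for this example and do not reproduce the `Mutant::Parallel` interface.

```ruby
# Hypothetical stand-in for the Sink in bin/corpus. It mirrors the
# stop?/response shape but does not depend on Mutant.
class CollectingSink
  attr_reader :errors, :total

  def initialize
    @errors = []
    @total  = 0
  end

  # Ask the driver to stop as soon as one failure report has arrived.
  def stop?
    !@errors.empty?
  end

  # Record one worker result: nil means the file verified cleanly,
  # anything else is treated as a failure report.
  def response(result)
    @total += 1
    @errors << result unless result.nil?
    self
  end
end

# Sequential toy driver (an assumption for this sketch; the real script
# runs workers in parallel): feeds results until the sink asks to stop.
def drain(results, sink)
  results.each do |result|
    break if sink.stop?

    sink.response(result)
  end
  sink
end

sink = drain([nil, nil, 'a.rb: round trip failed', nil], CollectingSink.new)
puts "Processed: #{sink.total}, errors: #{sink.errors.length}"
```

Stopping on the first report keeps a failing corpus run short, at the cost of surfacing only one failure per run.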
9 changes: 5 additions & 4 deletions bin/parser-round-trip-test
@@ -40,7 +40,8 @@ class Test
:rubies
)

EXPECT_FAILURE = {}.freeze
STATIC_LOCAL_VARIABLES = %i[foo bar baz].freeze
EXPECT_FAILURE = {}.freeze

def legacy_attributes
default_builder_attributes.reject do |attribute_name, value|
@@ -77,9 +78,9 @@ class Test

# rubocop:disable Metrics/AbcSize
def validation
identification = name.to_s
identification = name.to_s

generated_source = Unparser.unparse_either(node)
generated_source = Unparser.unparse_either(node, static_local_variables: STATIC_LOCAL_VARIABLES)
.fmap { |string| string.dup.force_encoding(parser_source.encoding).freeze }

generated_node = generated_source.bind do |source|
@@ -99,7 +100,7 @@ class Test

def parser
Unparser.parser.tap do |parser|
%w[foo bar baz].each(&parser.static_env.method(:declare))
STATIC_LOCAL_VARIABLES.each(&parser.static_env.method(:declare))
end
end
