Releases: earwig/mwparserfromhell
version 0.5
- Added Wikicode.contains() to determine whether a Node or Wikicode object is contained within another Wikicode object.
- Added Wikicode.get_ancestors() and Wikicode.get_parent() to find all ancestors and the direct parent of a Node, respectively.
- Fixed a long-standing performance issue with deeply nested, invalid syntax (issue #42). The parser should be much faster on certain complex pages. The "max cycle" restriction has also been removed, so some situations where templates at the end of a page were being skipped are now resolved.
- Made Template.remove(keep_field=True) behave more reasonably when the parameter is already empty.
- Added the keep_template_params argument to Wikicode.strip_code(). If True, then template parameters will be preserved in the output.
- Wikicode objects can now be pickled properly (fixed infinite recursion error on incompletely-constructed StringMixIn subclasses).
- Fixed Wikicode.matches()'s behavior on iterables besides lists and tuples.
- Fixed len() sometimes raising ValueError on empty node lists.
- Fixed a rare parsing bug involving self-closing tags inside the attributes of unpaired tags.
- Fixed release script after changes to PyPI.
version 0.4.4
- Added support for Python 3.6.
- Fixed parsing bugs involving:
  - wikitables nested in templates;
  - wikitable error recovery when unable to recurse;
  - templates nested in template parameters before other parameters.
- Fixed parsing file-like objects.
- Made builds deterministic.
- Documented caveats.
version 0.4.3
- Added Windows binaries for Python 3.5.
- Fixed edge cases involving wikilinks inside of external links and vice versa.
- Fixed a C tokenizer crash when a keyboard interrupt happens while parsing.
version 0.4.2
- Fixed setup script not including header files in releases.
- Fixed Windows binary uploads.
version 0.4.1
- The process for building Windows binaries has been fixed, and these should be distributed along with new releases. Windows users can now take advantage of C speedups without having a compiler of their own.
- Added support for Python 3.5.
- < and > are now disallowed in wikilink titles and template names. This includes when denoting tags, but not comments.
- Fixed the behavior of preserve_spacing in Template.add() and keep_field in Template.remove() on parameters with hidden keys.
- Removed _ListProxy.detach(). SmartLists now use weak references and their children are garbage-collected properly.
- Fixed parser bugs involving:
  - templates with completely blank names;
  - templates with newlines and comments.
- Heavy refactoring and fixes to the C tokenizer, including:
  - corrected a design flaw in text handling, allowing for substantial speed improvements when parsing long strings of plain text;
  - implemented new Python 3.3 PEP 393 Unicode APIs.
- Fixed various bugs in SmartList, including one that was causing memory issues on 64-bit builds of Python 2 on Windows.
- Fixed some bugs in the release scripts.
version 0.4
- The parser now falls back on pure Python mode if C extensions cannot be built. This fixes an issue that prevented some Windows users from installing the parser.
- Added support for parsing wikicode tables (patches by David Winegar).
- Added a script to test for memory leaks in scripts/memtest.py.
- Added a script to do releases in scripts/release.sh.
- skip_style_tags can now be passed to mwparserfromhell.parse() (previously, only Parser().parse() allowed it).
- The recursive argument to Wikicode's filter methods now accepts a third option, RECURSE_OTHERS, which recurses over all children except instances of forcetype (for example, code.filter_templates(code.RECURSE_OTHERS) returns all un-nested templates).
- The parser now understands HTML tag attributes quoted with single quotes. When setting a tag attribute's value, quotes will be added if necessary. As part of this, Attribute's quoted attribute has been changed to quotes, and is now either a string or None.
- Calling Template.remove() with a Parameter object that is not part of the template now raises ValueError instead of doing nothing.
- Parameters with non-integer keys can no longer be created with showkey=False, nor have the value of this attribute be set to False later.
- _ListProxy.destroy() has been changed to _ListProxy.detach(), and now works in a more useful way.
- If something goes wrong while parsing, ParserError will now be raised. Previously, the parser would produce an unclear BadRoute exception or allow an incorrect node tree to be built.
- Fixed parser bugs involving:
  - nested tags;
  - comments in template names;
  - tags inside of <nowiki> tags.
- Added tests to ensure that parsed trees convert back to wikicode without unintentional modifications.
- Added support for a NOWEB environment variable, which disables a unit test that makes a web call.
- Test coverage has been improved, and some minor related bugs have been fixed.
- Updated and fixed some documentation.
version 0.3.3
- Added support for Python 2.6 and 3.4.
- Template.has() is now passed ignore_empty=False by default instead of True. This fixes a bug when adding parameters to templates with empty fields, and is a breaking change if you rely on the default behavior.
- The matches argument of Wikicode's filter methods now accepts a function (taking one argument, a Node, and returning a bool) in addition to a regex.
- Re-added flat argument to Wikicode.get_sections(), fixed the order in which it returns sections, and made it faster.
- Wikicode.matches() now accepts a tuple or list of strings/Wikicode objects instead of just a single string or Wikicode.
- Given the frequency of issues with the (admittedly insufficient) tag parser, there's a temporary skip_style_tags argument to parse() that ignores '' and ''' until these issues are corrected.
- Fixed a parser bug involving nested wikilinks and external links.
- C code cleanup and speed improvements.
version 0.3.2
- Added support for Python 3.2 (along with current support for 3.3 and 2.7).
- Renamed Template.remove()'s first argument from name to param, which now accepts Parameter objects in addition to parameter name strings.
version 0.3.1
- Fixed a parser bug involving URLs nested inside other markup.
- Fixed some typos.
version 0.3
- Added complete support for HTML Tags, including forms like <ref>foo</ref>, <ref name="bar"/>, and wiki-markup tags like bold ('''), italics (''), and lists (*, #, ; and :).
- Added support for ExternalLinks (http://example.com/ and [http://example.com/ Example]).
- Wikicode.filter() methods are now passed recursive=True by default instead of False. This is a breaking change if you rely on any filter() methods being non-recursive by default.
- Added a matches() method to Wikicode for page/template name comparisons.
- The obj param of Wikicode.insert_before(), insert_after(), replace(), and remove() now accepts Wikicode objects and strings representing parts of wikitext, instead of just nodes. These methods also make all possible substitutions instead of just one.
- Renamed Template.has_param() to Template.has() for consistency with Template's other methods; has_param() is now an alias.
- The C tokenizer extension now works on Python 3 in addition to Python 2.7.
- Various bugfixes, internal changes, and cleanup.