📜 Split out the '==' behavior in ParserElement, now implemented as the ParserElement.matches() method. Using '==' for string test purposes will be removed in a future release.
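A minimal sketch of the matches() method mentioned above, assuming an installed pyparsing; the `integer` expression and the sample strings are illustrative only:

```python
from pyparsing import Word, nums

# Hypothetical expression used only for this illustration
integer = Word(nums)

# matches() reports success or failure as a bool instead of raising
ok = integer.matches("123")
bad = integer.matches("abc")
print(ok, bad)
```

This is intended for quick inline checks of an expression, in place of the older '==' comparison against a string.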
✅ Expanded capabilities of runTests(). Will now accept embedded comments (default is Python style, leading '#' character, but customizable). Comments will be emitted along with the tests and test output. This is useful during test development: create a test string consisting only of test case description comments separated by blank lines, and then fill in the test cases. Will also highlight ParseFatalExceptions with "(FATAL)".
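As a rough illustration of the embedded comments described above (the `integer` expression and test strings are made up for this sketch, and the (success, results) return value is an assumption based on current pyparsing releases rather than something this entry states):

```python
from pyparsing import Word, nums

# Hypothetical expression under test
integer = Word(nums)

# Lines starting with '#' are treated as comments and echoed with the output
success, results = integer.runTests("""
    # a plain integer
    100

    # leading zeros are still digits
    007
""")
print(success)
```

Each non-comment line is run as a separate test case, so the comments serve as case descriptions in the printed report.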
➕ Added a 'pyparsing_common' class containing common/helpful little expressions such as integer, float, identifier, etc. I used this class as a sort of embedded namespace, to contain these helpers without further adding to pyparsing's namespace bloat.
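A short sketch of using the pyparsing_common namespace described above; only the `integer` and `identifier` members are used here, and the input strings are illustrative:

```python
from pyparsing import pyparsing_common

# integer converts its match to a Python int
number = pyparsing_common.integer.parseString("42")[0]

# identifier matches a typical programming-language name
name = pyparsing_common.identifier.parseString("abc_1")[0]

print(repr(number), repr(name))
```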
📜 Minor enhancement to traceParseAction decorator, to retain the parse action's name for the trace output.
➕ Added optional 'fatal' keyword arg to addCondition, to indicate that a condition failure should halt parsing immediately.
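One way to see the effect of the 'fatal' argument above is inside an alternation. This is a sketch under assumptions: the `small_int` helper, the limit of 100, and the message text are all hypothetical.

```python
from pyparsing import ParseException, ParseFatalException, Word, nums


def small_int(fatal):
    """Hypothetical helper: an integer that must be below 100."""
    expr = Word(nums).setParseAction(lambda toks: int(toks[0]))
    expr.addCondition(lambda toks: toks[0] < 100,
                      message="value must be < 100", fatal=fatal)
    return expr


# Non-fatal: the failed condition is an ordinary mismatch, so the
# second alternative gets a chance to match.
lenient = (small_int(fatal=False) | Word(nums)).parseString("500").asList()

# Fatal: parsing stops immediately, and the second alternative is not tried.
try:
    (small_int(fatal=True) | Word(nums)).parseString("500")
    outcome = "parsed"
except ParseFatalException:
    outcome = "fatal"
except ParseException:
    outcome = "ordinary"

print(lenient, outcome)
```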
The _trim_arity fix in 2.1.2 was tied too closely to Python 3.5.0. Now works for Python 2.x, 3.3, 3.4, 3.5.0, and 3.5.1 (and hopefully beyond).
Fixed bug in _trim_arity when pyparsing code is included in a PyInstaller bundle, reported by maluwa.
🛠 Fixed catastrophic regex backtracking in implementation of the quoted string expressions (dblQuotedString, sglQuotedString, and quotedString). Reported on the pyparsing wiki by webpentest, good catch! (Also tuned up some other expressions susceptible to the same backtracking problem, such as cStyleComment, cppStyleComment, etc.)
➕ Added support for assigning to ParseResults using slices.
🛠 Fixed bug in ParseResults.toDict(), in which dict values were always converted to dicts, even if they were just unkeyed lists of tokens. Reported on SO by Gerald Thibault, thanks Gerald!
🛠 Fixed bug in SkipTo when using failOn, reported by robyschek, thanks!
🛠 Fixed bug in Each introduced in 2.1.0. Reported by robyschek, who also submitted the patch and unit test, well done!
✂ Removed use of functools.partial in replaceWith, as this creates an ambiguous signature for the generated parse action, which fails in PyPy. Reported by Evan Hubinger, thanks Evan!
➕ Added default behavior to QuotedString to convert embedded '\t', '\n', etc. characters to their whitespace counterparts. Found during Q&A exchange on SO with Maxim.
Modified the internal _trim_arity method to distinguish between TypeErrors raised while trying to determine parse action arity and those raised within the parse action itself. This will clear up those confusing "<lambda>() takes exactly 1 argument (0 given)" error messages when there is an actual TypeError in the body of the parse action. Thanks to all who have raised this issue in the past, and most recently to Michael Cohen, who sent in a proposed patch, and got me to finally tackle this problem.
➕ Added compatibility for pickle protocols 2-4 when pickling ParseResults. In Python 2.x, protocol 0 was the default, and protocol 2 did not work. In Python 3.x, protocol 3 is the default, so explicitly naming protocol 0 or 1 was required to pickle ParseResults. With this release, all protocols 0-4 are supported. Thanks for reporting this on StackOverflow, Arne Wolframm, and for providing a nice simple test case!
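A quick round-trip check of the protocol support described above might look like this; the grammar and input string are illustrative assumptions:

```python
import pickle

from pyparsing import Word, alphas, nums

# Hypothetical key=value grammar with results names
expr = Word(alphas)("key") + "=" + Word(nums)("value")
result = expr.parseString("answer = 42")

# Pickle and unpickle the ParseResults under each of protocols 0-4
restored = [pickle.loads(pickle.dumps(result, protocol=p)) for p in range(5)]

for r in restored:
    print(r.asList(), r["key"], r["value"])
```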
➕ Added optional 'stopOn' argument to ZeroOrMore and OneOrMore, to simplify breaking on stop tokens that would match the repetition expression.
It is a common problem to fail to look ahead when matching repetitive tokens if the sentinel at the end also matches the repetition expression, as when parsing "BEGIN aaa bbb ccc END" with:
"BEGIN" + OneOrMore(Word(alphas)) + "END"
Since "END" matches the repetition expression "Word(alphas)", it will never get parsed as the terminating sentinel. Up until now, this had to be resolved by the user inserting their own negative lookahead:
"BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"
Using stopOn, they can more easily write:
"BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"
The stopOn argument can be a literal string or a pyparsing expression. Inspired by a question by Lamakaha on StackOverflow (and many previous questions with the same negative-lookahead resolution).
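The BEGIN/END case above can be run end to end as a small sketch (variable names are illustrative):

```python
from pyparsing import OneOrMore, ParseException, Word, alphas

text = "BEGIN aaa bbb ccc END"

# Without stopOn, the repetition also consumes "END", so the trailing
# "END" literal has nothing left to match.
greedy = "BEGIN" + OneOrMore(Word(alphas)) + "END"
try:
    greedy.parseString(text)
    greedy_ok = True
except ParseException:
    greedy_ok = False

# With stopOn, the repetition stops before the sentinel.
bounded = "BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"
tokens = bounded.parseString(text).asList()

print(greedy_ok, tokens)
```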
➕ Added expression names for many internal and builtin expressions, to reduce name and error message overhead during parsing.
♻️ Converted helper lambdas to functions to refactor and add docstring support.
🛠 Fixed ParseResults.asDict() to correctly convert nested ParseResults values to dicts.
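To illustrate the nested conversion described above, here is a sketch with a made-up grammar; the results names `name`, `first`, `last`, and `age` are assumptions for the example:

```python
from pyparsing import Group, Word, alphas, nums

# A named group nested inside the top-level results
person = (Group(Word(alphas)("first") + Word(alphas)("last"))("name")
          + Word(nums)("age"))

as_dict = person.parseString("John Smith 42").asDict()
print(as_dict)
```

The nested group comes back as a plain dict rather than a ParseResults object, so the whole structure can be handed to code that expects ordinary Python containers.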
🛠 Cleaned up some examples, fixed typo in fourFn.py identified by aristotle2600 on reddit.
✂ Removed keepOriginalText helper method, which was deprecated ages ago. Superseded by originalTextFor.
🗄 Same for the Upcase class, which was long ago deprecated and replaced with the upcaseTokens method.
Simplified string representation of Forward class, to avoid memory and performance errors while building ParseException messages. Thanks, Will McGugan, Andrea Censi, and Martijn Vermaat for the bug reports and test code.
Cleaned up additional issues from enhancing the error messages for Or and MatchFirst, handling Unicode values in expressions. Fixes Unicode encoding issues in Python 2, thanks to Evan Hubinger for the bug report.
🛠 Fixed implementation of dir() for ParseResults - was leaving out all the defined methods and just adding the custom results names.
🛠 Fixed bug in ignore() that was introduced in pyparsing 1.5.3, that would not accept a string literal as the ignore expression.
➕ Added new example parseTabularData.py to illustrate parsing of data formatted in columns, with detection of empty cells.
⚡️ Updated a number of examples to more current Python and pyparsing forms.
🛠 Fixed a bug in Each when multiple Optional elements are present. Thanks for reporting this, whereswalden on SO.
🛠 Fixed another bug in Each, when Optional elements have results names or parse actions, reported by Max Rothman - thank you, Max!
➕ Added optional parseAll argument to runTests, whether tests should require the entire input string to be parsed or not (similar to parseAll argument to parseString). Plus a little neaten-up of the output on Python 2 (no stray ()'s).
👻 Modified exception messages from MatchFirst and Or expressions. These were formerly misleading as they would only give the first or longest exception mismatch error message. Now the error message includes all the alternatives that were possible matches. Originally proposed by a pyparsing user, but I've lost the email thread - finally figured out a fairly clean way to do this.
🛠 Fixed a bug in Or, when a parse action on an alternative raises an exception, other potentially matching alternatives were not always tried. Reported by TheVeryOmni on the pyparsing wiki, thanks!
🛠 Fixed a bug in dump() introduced in 2.0.4, where list values were shown in duplicate.
🖨 (&$(@#&$(@!!!! Some "print" statements snuck into pyparsing v2.0.4, breaking Python 3 compatibility! Fixed. Reported by jenshn, thanks!
➕ Added ParserElement.addCondition, to simplify adding parse actions that act primarily as filters. If the given condition evaluates False, pyparsing will raise a ParseException. The condition should be a method with the same method signature as a parse action, but should return a boolean. Suggested by Victor Porton, nice idea Victor, thanks!
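A small sketch of addCondition used as a filter, as described above; the `year` expression, the cutoff of 2000, and the message text are hypothetical:

```python
from pyparsing import ParseException, Word, nums

# Convert to int first, then filter on the converted value
year = Word(nums).setParseAction(lambda toks: int(toks[0]))
year.addCondition(lambda toks: toks[0] >= 2000,
                  message="only years 2000 and later")

accepted = year.parseString("2016")[0]

try:
    year.parseString("1999")
    rejected = False
except ParseException:
    rejected = True

print(accepted, rejected)
```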
Slight mod to srange to accept unicode literals for the input string, such as "[а-яА-Я]" instead of "[\u0430-\u044f\u0410-\u042f]". Thanks to Alexandr Suchkov for the patch!
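The unicode-literal form above can be tried directly. In this sketch the sample word is an arbitrary choice:

```python
from pyparsing import Word, srange

# Expand the two ranges into a string of individual characters
cyrillic = srange("[а-яА-Я]")

# Use the expanded set as the character set for a Word
word = Word(cyrillic)
parsed = word.parseString("Привет")[0]

print(len(cyrillic), parsed)
```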
✨ Enhanced implementation of replaceWith.
🛠 Fixed enhanced ParseResults.dump() method when the results consist only of an unnamed array of sub-structure results. Reported by Robin Siebler, thanks for your patience and persistence, Robin!
🛠 Fixed bug in fourFn.py example code, where pi and e were defined using CaselessLiteral instead of CaselessKeyword. This was not a problem until adding a new function 'exp', and the leading 'e' of 'exp' was accidentally parsed as the mathematical constant 'e'. Nice catch, Tom Grydeland - thanks!
Adopt new-fangled Python features, like decorators and ternary expressions, per suggestions from Williamzjc - thanks William! (Oh yeah, I'm not supporting Python 2.3 with this code any more...) Plus, some additional code fixes/cleanup - thanks again!
➕ Added ParserElement.runTests, a little test bench for quickly running an expression against a list of sample input strings. Basically, I got tired of writing the same test code over and over, and finally added it as a test point method on ParserElement.
➕ Added withClass helper method, a simplified version of withAttribute for the common but annoying case when defining a filter on a div's class - made difficult because 'class' is a Python reserved word.
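A sketch of withClass filtering on a div's class, as described above; the HTML snippet and the `body` results name are illustrative assumptions:

```python
from pyparsing import SkipTo, makeHTMLTags, withClass

html = ('<div class="grid">1 4 0 1 0</div>'
        '<div class="graph">1,3 2,3</div>'
        '<div>plain</div>')

div, div_end = makeHTMLTags("div")

# Copy the opening-tag expression and accept it only when class="grid"
div_grid = div().setParseAction(withClass("grid"))
grid_expr = div_grid + SkipTo(div_end)("body")

bodies = [match.body for match in grid_expr.searchString(html)]
print(bodies)
```

The equivalent withAttribute call has to work around 'class' being a reserved word, which is the inconvenience withClass removes.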
🛠 Fixed escaping behavior in QuotedString. Formerly, only quotation marks (or characters designated as quotation marks in the QuotedString constructor) would be unescaped. Now all escaped characters will be unescaped, and the escaping backslashes will be removed.
🛠 Fixed regression in ParseResults.pop() - pop() was pretty much broken after I added improvements in 2.0.2. Reported by Iain Shelvington, thanks Iain!
🛠 Fixed bug in And class when initializing using a generator.
✨ Enhanced ParseResults.dump() method to list out nested ParseResults that are unnamed arrays of sub-structures.
🛠 Fixed UnboundLocalError under Python 3.4 in oneOf method, reported on Sourceforge by aldanor, thanks!
Fixed bug in ParseResults __init__ method, when returning non-ParseResults types from parse actions that implement __eq__. Raised during discussion on the pyparsing wiki with cyrfer.