pyparsing v2.1.0 Release Notes

Release Date: 2016-02-01
    • Modified the internal _trim_arity method to distinguish between TypeErrors raised while trying to determine parse action arity and those raised within the parse action itself. This clears up the confusing "() takes exactly 1 argument (0 given)" error messages that appeared when there was an actual TypeError in the body of the parse action. Thanks to all who have raised this issue in the past, and most recently to Michael Cohen, who sent in a proposed patch and got me to finally tackle this problem.
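
    As a rough illustration (the add_one parse action and its deliberate bug are invented for this note, not taken from the reported cases), a TypeError raised inside the body of a parse action should now surface with its own message instead of an arity complaint:

```python
from pyparsing import Word, nums

def add_one(tokens):
    # Deliberate bug (hypothetical example): int + str raises TypeError
    # inside the body of the parse action.
    return int(tokens[0]) + "1"

integer = Word(nums).setParseAction(add_one)

try:
    integer.parseString("41")
    message = None
except TypeError as exc:
    # The error originates in add_one itself, not in arity detection.
    message = str(exc)

print(message)
```

    The printed message should describe the failed addition, which points directly at the faulty line in the parse action.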

    • ➕ Added compatibility for pickle protocols 2-4 when pickling ParseResults. In Python 2.x, protocol 0 was the default, and protocol 2 did not work. In Python 3.x, protocol 3 is the default, so explicitly naming protocol 0 or 1 was required to pickle ParseResults. With this release, all protocols 0-4 are supported. Thanks for reporting this on StackOverflow, Arne Wolframm, and for providing a nice simple test case!
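
    A minimal sketch of a round trip through each supported protocol, assuming a Python version that provides pickle protocol 4. The grammar, result names, and sample input are illustrative only:

```python
import pickle
from pyparsing import Word, alphas, nums

# Hypothetical grammar: a word followed by an integer, both named.
item = Word(alphas)("name") + Word(nums)("count")
result = item.parseString("widgets 42")

round_trips = {}
for protocol in range(5):  # protocols 0 through 4
    restored = pickle.loads(pickle.dumps(result, protocol))
    # Keep both the token list and a named value for comparison.
    round_trips[protocol] = (restored.asList(), restored["name"])

print(round_trips)
```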

    • ➕ Added optional 'stopOn' argument to ZeroOrMore and OneOrMore, to simplify breaking on stop tokens that would match the repetition expression.

    It is a common problem to fail to look ahead when matching repetitive tokens if the sentinel at the end also matches the repetition expression, as when parsing "BEGIN aaa bbb ccc END" with:

    "BEGIN" + OneOrMore(Word(alphas)) + "END"
    

    Since "END" matches the repetition expression "Word(alphas)", it will never get parsed as the terminating sentinel. Up until now, this had to be resolved by the user inserting their own negative lookahead:

    "BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"
    

    Using stopOn, they can more easily write:

    "BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"
    

    The stopOn argument can be a literal string or a pyparsing expression. Inspired by a question by Lamakaha on StackOverflow (and many previous questions with the same negative-lookahead resolution).
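
    The three variants above can be compared side by side. This is only a sketch using the sample string from the discussion; the variable names are made up for the illustration:

```python
from pyparsing import Literal, OneOrMore, ParseException, Word, alphas

text = "BEGIN aaa bbb ccc END"

# Without any lookahead, the repetition also consumes "END",
# so the trailing "END" has nothing left to match.
greedy = "BEGIN" + OneOrMore(Word(alphas)) + "END"
try:
    greedy.parseString(text)
    greedy_parsed = True
except ParseException:
    greedy_parsed = False

# Hand-written negative lookahead, as required before this release.
manual = "BEGIN" + OneOrMore(~Literal("END") + Word(alphas)) + "END"
manual_tokens = manual.parseString(text).asList()

# The same grammar using the new stopOn argument.
with_stop = "BEGIN" + OneOrMore(Word(alphas), stopOn="END") + "END"
stop_tokens = with_stop.parseString(text).asList()

print(greedy_parsed, manual_tokens, stop_tokens)
```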

    • ➕ Added expression names for many internal and builtin expressions, to reduce name and error message overhead during parsing.

    • 🔨 Converted helper lambdas to functions to refactor and add docstring support.

    • 🛠 Fixed ParseResults.asDict() to correctly convert nested ParseResults values to dicts.
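
    A small sketch of the fixed behavior, assuming a grammar with one named Group nested inside the top-level results (the grammar and names here are illustrative):

```python
from pyparsing import Group, Word, alphas, nums

# Hypothetical grammar: a label followed by a grouped pair of numbers.
point = Group(Word(nums)("x") + Word(nums)("y"))
record = Word(alphas)("label") + point("position")

result = record.parseString("origin 0 0")
as_dict = result.asDict()

# The nested "position" value should be a plain dict, not a ParseResults.
print(as_dict)
```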

    • 🛠 Cleaned up some examples, fixed typo in fourFn.py identified by aristotle2600 on reddit.

    • ✂ Removed keepOriginalText helper method, which was deprecated ages ago. Superseded by originalTextFor.
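
    For code still relying on the removed helper, a sketch of the originalTextFor replacement might look like this (the date grammar and input are invented for the example):

```python
from pyparsing import Word, nums, originalTextFor

# Hypothetical grammar: three numbers separated by slashes.
date = Word(nums) + "/" + Word(nums) + "/" + Word(nums)
text = "2016 / 02 / 01"

# The plain expression returns the individual tokens.
plain_tokens = date.parseString(text).asList()

# Wrapping it in originalTextFor returns the matched source text,
# including the whitespace between tokens.
raw_tokens = originalTextFor(date).parseString(text).asList()

print(plain_tokens, raw_tokens)
```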

    • 🗄 Same for the Upcase class, which was long ago deprecated and replaced with the upcaseTokens method.