pyparsing v1.5.0 Release Notes

Release Date: 2008-06-01
  • This version of pyparsing includes work on two long-standing FAQs: support for forcing parsing of the complete input string (without having to explicitly append StringEnd() to the grammar), and a method to improve the mechanism of detecting where syntax errors occur in an input string with various optional and alternative paths. This release also includes a helper method to simplify definition of indentation-based grammars. With these changes (and the past few minor updates), I thought it was finally time to bump the minor rev number on pyparsing - so 1.5.0 is now available! Read on...

    • AT LAST!!! You can now call parseString and have it raise an exception if the expression does not parse the entire input string. This has been an FAQ for a LONG time.

    The parseString method now includes an optional parseAll argument (default=False). If parseAll is set to True, then the given parse expression must parse the entire input string. (This is equivalent to adding StringEnd() to the end of the expression.) The default value is False to retain backward compatibility.
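
    As a rough illustration of the difference, the sketch below parses the same string with and without parseAll. The greeting grammar and the input text are made up for this example; only parseString and its parseAll argument come from the notes above.

```python
from pyparsing import ParseException, Word, alphas

# Hypothetical grammar for illustration: two words separated by a comma.
greeting = Word(alphas) + "," + Word(alphas)
text = "Hello, World and more"

# Default (parseAll=False): parsing stops after "World"; the rest is ignored.
partial = greeting.parseString(text)
print(partial.asList())

# parseAll=True: the unparsed trailing text now raises a ParseException.
try:
    greeting.parseString(text, parseAll=True)
    consumed_everything = True
except ParseException as err:
    consumed_everything = False
    print("incomplete parse:", err)
```

    Leaving parseAll at its default keeps existing grammars working unchanged, which is why the stricter behavior is opt-in.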

    Inspired by MANY requests over the years, most recently by ecir-hana on the pyparsing wiki!

    • Added new operator '-' for composing grammar sequences. '-' behaves just like '+' in creating And expressions, but '-' is used to mark grammar structures that should stop parsing immediately and report a syntax error, rather than just backtracking to the last successful parse and trying another alternative. For instance, running the following code:

      from pyparsing import Keyword, Optional, Word, nums

      port_definition = Keyword("port") + '=' + Word(nums)
      entity_definition = Keyword("entity") + "{" + Optional(port_definition) + "}"

      entity_definition.parseString("entity { port 100 }")

    pyparsing fails to detect the missing '=' in the port definition. But, since this expression is optional, pyparsing then proceeds to try to match the closing '}' of the entity_definition. Not finding it, pyparsing reports that there was no '}' after the '{' character. Instead, we would like pyparsing to parse the 'port' keyword, and if not followed by an equals sign and an integer, to signal this as a syntax error.

    This can now be done simply by changing the port_definition to:

      port_definition = Keyword("port") - '=' + Word(nums)

    Now after successfully parsing 'port', pyparsing must also find an equals sign and an integer, or it will raise a fatal syntax exception.

    By judicious insertion of '-' operators, a pyparsing developer can have their grammar report much more informative syntax error messages.
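
    To see the two behaviors side by side, here is a small sketch that builds the entity grammar from the port example twice, once with '+' and once with '-'. The build_entity helper is a hypothetical convenience for this example. ParseSyntaxException is the fatal exception class pyparsing raises for a failure after '-'; it is not a subclass of ParseException, so it has to be caught separately.

```python
from pyparsing import (Keyword, Optional, ParseException,
                       ParseSyntaxException, Word, nums)

def build_entity(port_definition):
    # Hypothetical helper: same entity grammar, parameterized on the port rule.
    return Keyword("entity") + "{" + Optional(port_definition) + "}"

loose = build_entity(Keyword("port") + "=" + Word(nums))
strict = build_entity(Keyword("port") - "=" + Word(nums))

source = "entity { port 100 }"

# With '+', the Optional backtracks and the error is about the missing '}'.
try:
    loose.parseString(source)
    loose_error = None
except ParseException as err:
    loose_error = err
    print("with '+':", err)

# With '-', the failure after 'port' is fatal and is not backtracked over.
try:
    strict.parseString(source)
    strict_error = None
except ParseSyntaxException as err:
    strict_error = err
    print("with '-':", err)
```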

    Patches and suggestions proposed by several contributors on the pyparsing mailing list and wiki - special thanks to Eike Welk and Thomas/Poldy on the pyparsing wiki!

    • Added indentedBlock helper method, to encapsulate the parse actions and indentation stack management needed to keep track of indentation levels. Use indentedBlock to define indentation-based grouping grammars, like Python's.

    indentedBlock takes up to 3 parameters:
    - blockStatementExpr - expression defining syntax of statement that is repeated within the indented block
    - indentStack - list created by caller to manage indentation stack (multiple indentedBlock expressions within a single grammar should share a common indentStack)
    - indent - boolean indicating whether block must be indented beyond the current level; set to False for block of left-most statements (default=True)

    A valid block must contain at least one indented statement.
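
    A minimal sketch of how indentedBlock might be used, assuming a made-up grammar of a "name:" header followed by an indented list of words. The section grammar, the sample text, and the initial stack value of [1] (the column of the left-most statements) are assumptions of this example, not part of the notes above.

```python
from pyparsing import Group, Suppress, Word, alphas, indentedBlock

# Shared indentation stack; 1 is assumed to be the left-most column.
indent_stack = [1]

item = Word(alphas)
# Hypothetical mini-grammar: a "name:" header followed by an indented block.
section = Group(Word(alphas) + Suppress(":") + indentedBlock(item, indent_stack))

text = """\
fruits:
    apple
    pear
"""

result = section.parseString(text)
print(result.asList())
```

    Each statement inside the block comes back in its own group, nested under the group for the block as a whole.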

    • Fixed bug in nestedExpr in which ignored expressions needed to be set off with whitespace. Reported by Stefaan Himpe, nice catch!

    • Expanded multiplication of an expression by a tuple, to accept tuple values of None:
      . expr*(n,None) or expr*(n,) is equivalent to expr*n + ZeroOrMore(expr) (read as "at least n instances of expr")
      . expr*(None,n) is equivalent to expr*(0,n) (read as "0 to n instances of expr")
      . expr*(None,None) is equivalent to ZeroOrMore(expr)
      . expr*(1,None) is equivalent to OneOrMore(expr)

    Note that expr*(None,n) does not raise an exception if more than n exprs exist in the input stream; that is, expr*(None,n) does not enforce a maximum number of expr occurrences. If this behavior is desired, then write expr*(None,n) + ~expr
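
    The tuple forms and the ~expr idiom can be tried out with a short sketch like the following. The num expression and the sample inputs are chosen only for illustration.

```python
from pyparsing import ParseException, Word, nums

num = Word(nums)

# (2, None): at least two numbers, then as many more as are present.
at_least_two = num * (2, None)
many = at_least_two.parseString("1 2 3 4").asList()

# (None, 2): zero to two numbers; a third number is simply left unparsed.
up_to_two = num * (None, 2)
first_two = up_to_two.parseString("1 2 3").asList()

# Appending ~num rejects input in which a third number follows.
at_most_two = num * (None, 2) + ~num
try:
    at_most_two.parseString("1 2 3")
    rejected = False
except ParseException:
    rejected = True

print(many, first_two, rejected)
```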

    • Added None as a possible operator for operatorPrecedence. None signifies "no operator", as in multiplying m times x in "y=mx+b".

    • Fixed bug in Each, reported by Michael Ramirez, in which the order of terms in the Each affected the parsing of the results. Problem was due to premature grouping of the expressions in the overall Each during grammar construction, before the complete Each was defined. Thanks, Michael!

    • Also fixed bug in Each in which Optionals with default values were not getting the defaults added to the results of the overall Each expression.

    • Fixed a bug in Optional in which results names were not assigned if a default value was supplied.
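
    As a sketch of what this fix is meant to allow, the example below puts a results name on an Optional that also has a default, then reads that name back whether or not the optional part was present. The entry grammar and its field names are hypothetical.

```python
from pyparsing import Optional, Word, alphas, nums

# Hypothetical grammar: a name followed by an optional count, defaulting to "0".
name = Word(alphas).setResultsName("name")
count = Optional(Word(nums), default="0").setResultsName("count")
entry = name + count

with_count = entry.parseString("widget 12")
without_count = entry.parseString("widget")

# The "count" name is available in both cases; the second uses the default.
print(with_count["count"], without_count["count"])
```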

    • Cleaned up Py3K compatibility statements, including exception construction statements, and better equivalence between _ustr and basestring, and __nonzero__ and __bool__.