pyparsing v3.0.0.a1 Release Notes

Release Date: 2020-04-01
    • Removed Py2.x support and other deprecated features. Pyparsing now requires Python 3.5 or later. If you are using an earlier version of Python, you must use a Pyparsing 2.4.x version.

    Deprecated features removed:

    . ParseResults.asXML() - if used for debugging, switch to using ParseResults.dump(); if used for data transfer, use ParseResults.asDict() to convert to a nested Python dict, which can then be converted to XML or JSON or other transfer format

    . operatorPrecedence synonym for infixNotation - convert to calling infixNotation

    . commaSeparatedList - convert to using pyparsing_common.comma_separated_list

    . upcaseTokens and downcaseTokens - convert to using pyparsing_common.upcaseTokens and downcaseTokens

    . compat.collect_all_And_tokens will not be settable to False to revert to pre-2.3.1 results name behavior - review use of names for MatchFirst and Or expressions containing And expressions, as they will return the complete list of parsed tokens, not just the first one. Use __diag__.warn_multiple_tokens_in_named_alternation to help identify those expressions in your parsers that will have changed as a result.
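The migrations listed above might look roughly like the following sketch. The grammar, the sample inputs, and the variable names are illustrative assumptions, not taken from the release notes; only the pyparsing names mentioned in the list are the point.

```python
import json

import pyparsing as pp
from pyparsing import pyparsing_common as ppc

# operatorPrecedence(...) becomes infixNotation(...), same arguments.
arith = pp.infixNotation(
    ppc.integer,
    [
        (pp.oneOf("* /"), 2, pp.opAssoc.LEFT),
        (pp.oneOf("+ -"), 2, pp.opAssoc.LEFT),
    ],
)

# commaSeparatedList becomes pyparsing_common.comma_separated_list.
fields = ppc.comma_separated_list.parseString("a,b,c").asList()

# asXML() is gone: use dump() for debugging, asDict() for data transfer.
record = pp.Word(pp.alphas)("name") + ppc.integer("age")
parsed = record.parseString("alice 42")
debug_text = parsed.dump()             # human-readable listing for debugging
payload = json.dumps(parsed.asDict())  # nested dict, serialized here as JSON
```

Any serializer that accepts a plain dict can take the place of json.dumps here.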

    • Removed support for running python setup.py test. The setuptools maintainers consider the test command deprecated (see https://github.com/pypa/setuptools/issues/1684). To run the Pyparsing tests, use the command tox.

    • API CHANGE: The staticmethod ParseException.explain has been moved to ParseBaseException.explain_exception, and a new explain instance method added to ParseBaseException. This will make calls to explain much more natural:

      try:
          expr.parseString("...")
      except ParseException as pe:
          print(pe.explain())
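A slightly fuller sketch of both forms follows. The integer expression and the two helper functions are hypothetical names introduced only for illustration.

```python
import pyparsing as pp

integer = pp.Word(pp.nums)


def describe_failure(text):
    """Return an explanation string if `text` fails to parse, else None."""
    try:
        integer.parseString(text)
    except pp.ParseException as pe:
        # Instance method: explains the exception it is called on.
        return pe.explain()
    return None


def describe_failure_static(text):
    """Same idea, using the relocated static method."""
    try:
        integer.parseString(text)
    except pp.ParseException as pe:
        # Static form: takes the exception object as an argument.
        return pp.ParseBaseException.explain_exception(pe)
    return None
```

Both helpers return a multi-line string containing the input line, a marker under the failing column, and the exception message.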

    • POTENTIAL API CHANGE: ZeroOrMore expressions that have results names will now include empty lists for their name if no matches are found. Previously, no named result would be present. Code that tested for the presence of any expressions using "if name in results:" will now always return True. This code will need to change to "if name in results and results[name]:" or just "if results[name]:". Also, any parser unit tests that check the asDict() contents will now see additional entries for parsers having named ZeroOrMore expressions, whose values will be [].
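As a minimal sketch of the adjusted check, assuming a hypothetical grammar with a named ZeroOrMore called "args":

```python
import pyparsing as pp

# Hypothetical grammar: a word followed by zero or more integers named "args".
call = pp.Word(pp.alphas)("func") + pp.ZeroOrMore(pp.Word(pp.nums))("args")


def has_args(text):
    """Return True only if at least one "args" token was actually matched."""
    results = call.parseString(text)
    # Testing membership alone is not enough once the name is always present,
    # so also test the truthiness of the named value.
    return "args" in results and bool(results["args"])
```

Written this way, the check gives the same answer whether or not the results name is present for an empty match, which makes it a reasonable form to use while migrating.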

    • POTENTIAL API CHANGE: Fixed a bug in which calls to ParserElement.setDefaultWhitespaceChars did not change whitespace definitions on any pyparsing built-in expressions defined at import time (such as quotedString, or those defined in pyparsing_common). This would lead to confusion when built-in expressions would not use updated default whitespace characters. Now a call to ParserElement.setDefaultWhitespaceChars will also go and update all pyparsing built-ins to use the new default whitespace characters. (Note that this will only modify expressions defined within the pyparsing module.) Prompted by work on a StackOverflow question posted by jtiai.

    • Expanded __diag__ and __compat__ to actual classes instead of just namespaces, to add some helpful behavior:

      • .enable() and .disable() methods to give extra help when setting or clearing flags (detects invalid flag names, detects when trying to set a __compat__ flag that is no longer settable). Use these methods now to set or clear flags, instead of directly setting to True or False.

        import pyparsing as pp
        pp.__diag__.enable("warn_multiple_tokens_in_named_alternation")

      • __diag__.enable_all_warnings() is another helper that sets all "warn*" diagnostics to True.

        pp.__diag__.enable_all_warnings()

      • added new warning, "warn_on_match_first_with_lshift_operator" to warn when using '<<' with a '|' MatchFirst operator, which will create an unintended expression due to precedence of operations.

      Example: This statement will erroneously define the fwd expression as just expr_a, even though expr_a | expr_b was intended, since '<<' operator has precedence over '|':

      fwd << expr_a | expr_b
      

      To correct this, use the '<<=' operator (preferred) or parentheses to override operator precedence:

      fwd <<= expr_a | expr_b
               or
      fwd << (expr_a | expr_b)
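The precedence issue described above can be reproduced with a short runnable sketch. The names word, number, broken, fixed, and matches are illustrative only.

```python
import pyparsing as pp

word = pp.Word(pp.alphas)
number = pp.Word(pp.nums)

# Unintended: Python evaluates this as (broken << word) | number, so only
# `word` is attached to the Forward and the MatchFirst result is discarded.
broken = pp.Forward()
broken << word | number

# Intended: '<<=' binds more loosely than '|', so the whole alternation is attached.
fixed = pp.Forward()
fixed <<= word | number


def matches(expr, text):
    """Return True if `expr` parses all of `text`."""
    try:
        expr.parseString(text, parseAll=True)
        return True
    except pp.ParseException:
        return False
```

Here matches(broken, "123") is False while matches(fixed, "123") is True, which is the mistake the new warning is meant to flag.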
      
    • Cleaned up default tracebacks when getting a ParseException when calling parseString. Exception traces should now stop at the call in parseString, and not include the internal traceback frames. (If the full traceback is desired, then set ParserElement.verbose_traceback to True.)

    • Fixed FutureWarnings that sometimes are raised when '[' passed as a character to Word.

    • New namespace, assert methods and classes added to support writing unit tests.

      • assertParseResultsEquals
      • assertParseAndCheckList
      • assertParseAndCheckDict
      • assertRunTestResults
      • assertRaisesParseException
      • reset_pyparsing_context context manager, to restore pyparsing config settings
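A unit test using these helpers might be sketched as follows. The notes above do not name the namespace, so this assumes the helpers are reachable as pp.testing with a TestParseResultsAsserts mixin class, as in released 3.x versions. The grammar and the test class are hypothetical.

```python
import unittest

import pyparsing as pp

# Assumption: the testing helpers are exposed as pp.testing.
ppt = pp.testing

integer = pp.Word(pp.nums).setParseAction(lambda t: int(t[0]))
int_list = pp.delimitedList(integer)


class IntegerListTest(ppt.TestParseResultsAsserts, unittest.TestCase):
    def test_parses_integers(self):
        # Parses the string and compares the resulting token list.
        self.assertParseAndCheckList(int_list, "1, 2, 3", [1, 2, 3])

    def test_rejects_words(self):
        # Context manager that expects a ParseException to be raised.
        with self.assertRaisesParseException():
            int_list.parseString("abc")
```

The mixin only adds assert methods, so it can sit alongside unittest.TestCase without changing how the tests are discovered or run.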
    • Enhanced error messages and error locations when parsing fails on the Keyword or CaselessKeyword classes due to the presence of a preceding or trailing keyword character. Surfaced while working with metaperl on issue #201.

    • Enhanced the Regex class to be compatible with expressions compiled with the re-equivalent regex module. Individual expressions can be built with regex compiled expressions using:

      import pyparsing as pp
      import regex

      # would use regex for this expression
      integer_parser = pp.Regex(regex.compile(r'\d+'))

    Inspired by PR submitted by bjrnfrdnnd on GitHub, very nice!

    • Fixed handling of ParseSyntaxExceptions raised as part of Each expressions, when sub-expressions contain '-' backtrack suppression. As part of resolution to a question posted by John Greene on StackOverflow.

    • Potentially huge performance enhancement when parsing Word expressions built from pyparsing_unicode character sets. Word now internally converts ranges of consecutive characters to regex character ranges (converting "0123456789" to "0-9" for instance), resulting in as much as 50X improvement in performance! Work inspired by a question posted by Midnighter on StackOverflow.
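The idea of collapsing consecutive characters into regex ranges can be sketched in plain Python. This is an illustration of the technique only, not pyparsing's internal implementation, and collapse_to_re_ranges is a hypothetical name.

```python
import re
from itertools import groupby


def collapse_to_re_ranges(chars):
    """Return the body of a regex character class matching exactly `chars`."""
    codes = sorted({ord(c) for c in chars})
    parts = []
    # Within a run of consecutive code points, (code - index) is constant,
    # so grouping on that difference splits the sorted list into runs.
    for _, group in groupby(enumerate(codes), key=lambda pair: pair[1] - pair[0]):
        run = [code for _, code in group]
        first = re.escape(chr(run[0]))
        last = re.escape(chr(run[-1]))
        if len(run) == 1:
            parts.append(first)
        elif len(run) == 2:
            parts.append(first + last)
        else:
            parts.append(first + "-" + last)
    return "".join(parts)
```

For a large Unicode character set, the collapsed form keeps the generated character class short, which is where the speedup described above would come from.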

    • Improvements in select_parser.py, to include new SQL syntax from SQLite. PR submitted by Robert Coup, nice work!

    • Fixed bug in PrecededBy which caused infinite recursion, issue #127 submitted by EdwardJB.

    • Fixed bug in CloseMatch where end location was incorrectly computed; and updated partial_gene_match.py example.

    • Fixed bug in indentedBlock with a parser using two different types of nested indented blocks with different indent values, but sharing the same indent stack, submitted by renzbagaporo.

    • Fixed bug in Each when using Regex, when Regex expression would get parsed twice; issue #183 submitted by scauligi, thanks!

    • BigQueryViewParser.py added to examples directory, PR submitted by Michael Smedberg, nice work!

    • booleansearchparser.py added to examples directory, PR submitted by xecgr. Builds on searchparser.py, adding support for '*' wildcards and non-Western alphabets.

    • Fixed bug in delta_time.py example, when using a quantity of seconds/minutes/hours/days > 999.

    • Fixed bug in regex definitions for real and sci_real expressions in pyparsing_common. Issue #194, reported by Michael Wayne Goodman, thanks!

    • Fixed FutureWarning raised beginning in Python 3.7 for Regex expressions containing '[' within a regex set.

    • Minor reformatting of output from runTests to make embedded comments more visible.

    • And finally, many thanks to those who helped in the restructuring of the pyparsing code base as part of this release. Pyparsing now has more standard package structure, more standard unit tests, and more standard code formatting (using black). Special thanks to jdufresne, klahnakoski, mattcarmody, and ckeygusuz, to name just a few.