Stanza v0.2.0 Release Notes

Release Date: 2019-05-16
  • This release features major improvements in the memory efficiency and speed of the neural network pipeline in stanfordnlp, along with various bugfixes. The changes include:

    The downloadable pretrained neural network models are now substantially smaller (due to the use of smaller pretrained vocabularies) with comparable performance. Notably, the default English model is now ~9x smaller, German ~11x, French ~6x and Chinese ~4x. As a result, the memory efficiency of the neural pipelines for most languages is substantially improved.

    Substantial speedup of the neural lemmatizer via reduced neural sequence-to-sequence operations.

    The neural network pipeline can now take in a Python list of strings representing pre-tokenized text. (https://github.com/stanfordnlp/stanfordnlp/issues/58)

    A requirements checking framework has been added to the neural pipeline, ensuring that the proper processors are specified for a given pipeline configuration. The pipeline now raises an exception when a requirement is not satisfied. (https://github.com/stanfordnlp/stanfordnlp/issues/42)

    Bugfix for the alignment between tokens and words after the multi-word expansion processor has run. (https://github.com/stanfordnlp/stanfordnlp/issues/71)

    More options have been added for customizing the Stanford CoreNLP server at start time, including specifying properties for the default pipeline and setting all server options such as username/password. For more details on the different options, please check out the client documentation page.

    A CoreNLPClient instance can now be created with the CoreNLP default properties for a language:

    from stanfordnlp.server import CoreNLPClient
    client = CoreNLPClient(properties='chinese')
    
    • Alternatively, a properties file can now be used during the creation of a CoreNLPClient:

      client = CoreNLPClient(properties='/path/to/corenlp.props')

    • All specified CoreNLP annotators are now preloaded by default when a CoreNLPClient instance is created. (https://github.com/stanfordnlp/stanfordnlp/issues/56)
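
The requirements check mentioned above can be pictured with a small, self-contained sketch. This is not the stanfordnlp implementation: the `RequirementsError` class, the `PROCESSORS` table, the `check_requirements` function, and the dependency sets shown are all hypothetical, chosen only to illustrate the idea that each processor declares what it needs and the pipeline refuses to build when an earlier stage does not supply it.

```python
# Illustrative sketch only. All names and dependency sets here are
# hypothetical and do not reflect stanfordnlp's internal API.

class RequirementsError(Exception):
    """Raised when a processor's prerequisites are missing from the pipeline."""


# Hypothetical table: what each processor provides and what it requires.
PROCESSORS = {
    "tokenize": {"provides": {"tokenize"}, "requires": set()},
    "mwt": {"provides": {"mwt"}, "requires": {"tokenize"}},
    "pos": {"provides": {"pos"}, "requires": {"tokenize"}},
    "lemma": {"provides": {"lemma"}, "requires": {"tokenize"}},
    "depparse": {"provides": {"depparse"}, "requires": {"tokenize", "pos"}},
}


def check_requirements(processor_names):
    """Walk the processors in order and fail on the first unmet requirement.

    Returns the set of annotations available once every processor has run.
    """
    satisfied = set()
    for name in processor_names:
        spec = PROCESSORS[name]
        missing = spec["requires"] - satisfied
        if missing:
            raise RequirementsError(
                f"processor '{name}' requires {sorted(missing)}, "
                "which no earlier processor provides"
            )
        satisfied |= spec["provides"]
    return satisfied
```

Under these assumptions, `check_requirements(["tokenize", "pos", "depparse"])` succeeds, while `check_requirements(["tokenize", "depparse"])` raises `RequirementsError` because nothing supplies `pos`. Failing at configuration time surfaces the mistake before any text is processed.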