PyTorch-NLP v0.3.0 Release Notes

Release Date: 2018-05-06
  • 🚀 Release 0.3.0

    Major Features And Improvements

    • ⬆️ Upgraded to PyTorch 0.4.0
    • ➕ Added Byte-Pair Encoding (BPE) pre-trained subword embeddings in 275 languages
    • 🔨 Refactored download scripts to torchnlp.downloads
    • Enabled the spaCy encoder to run in multiple languages.
    • ➕ Added a boolean aligned option to FastText supporting MUSE (Multilingual Unsupervised and Supervised Embeddings)
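
To make the "subword" idea behind the BPE embeddings more concrete, here is a minimal, self-contained sketch of byte-pair encoding on a toy corpus. It is an illustration under simplifying assumptions (character-level start symbols, no end-of-word marker) and is not how the pre-trained embeddings were produced or how torchnlp loads them. The function names learn_bpe_merges, apply_merge and segment are hypothetical.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one joined symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts as a tuple of single characters, weighted by its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across the whole corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Merge the most frequent pair (the first one seen wins ties).
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split `word` into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)
print(segment("lowly", merges))
```

A word that never appeared in the toy corpus, such as "lowly", is still split into known units, which is the property that lets subword embeddings cover rare and unseen words.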

    🐛 Bug Fixes and Other Changes

    • Created non-existent cache dirs for torchnlp.word_to_vector.
    • ➕ Added a set operation to torchnlp.datasets.Dataset with support for slices, columns and rows
    • Updated biggest_batches_first in torchnlp.samplers to be more efficient at approximating memory than Pickle
    • Enabled torchnlp.utils.pad_tensor and torchnlp.utils.pad_batch to support N-dimensional tensors
    • ⚡️ Updated to sacremoses to fix the NLTK Moses dependency for torchnlp.text_encoders
    • Added __getitem__ for _PretrainedWordVectors. For example:

      from torchnlp.word_to_vector import FastText
      vectors = FastText()
      tokenized_sentence = ['this', 'is', 'a', 'sentence']
      vectors[tokenized_sentence]

    • Added __contains__ for _PretrainedWordVectors. For example:

      from torchnlp.word_to_vector import FastText
      vectors = FastText()

      'the' in vectors     # True
      'theqwe' in vectors  # False
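
The two additions above rely on Python's standard container protocol. The following sketch shows how a class can support both `vectors[tokens]` and `token in vectors` without downloading any embeddings. ToyWordVectors is a hypothetical stand-in, not torchnlp's _PretrainedWordVectors, and the zero-vector fallback for unknown tokens is an assumption made for this example only.

```python
class ToyWordVectors:
    """Hypothetical in-memory vector table used only to illustrate the protocol."""

    def __init__(self, table, dim):
        self.table = dict(table)  # maps token -> list of floats
        self.dim = dim

    def __contains__(self, token):
        # Called for `token in vectors`.
        return token in self.table

    def __getitem__(self, key):
        # Called for `vectors[key]`.
        if isinstance(key, str):
            # Assumption for this sketch: unknown tokens map to a zero vector.
            return self.table.get(key, [0.0] * self.dim)
        # A list of tokens returns one vector per token, in order.
        return [self[token] for token in key]


vectors = ToyWordVectors({'this': [0.1, 0.2], 'is': [0.3, 0.4]}, dim=2)
print('this' in vectors)
print('theqwe' in vectors)
print(vectors[['this', 'is', 'theqwe']])
```

Checking membership first is useful when a missing token should be handled explicitly instead of silently receiving a fallback vector.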