PyTorch-NLP v0.3.0 Release Notes
Release Date: 2018-05-06
🚀 Release 0.3.0
Major Features And Improvements
- ⬆️ Upgraded to PyTorch 0.4.0
- ➕ Added Byte-Pair Encoding (BPE) pre-trained subword embeddings in 275 languages
- 🔨 Refactored download scripts to torchnlp.downloads
- Enabled the spaCy encoder to run in multiple languages
- ➕ Added a boolean aligned option to FastText supporting MUSE (Multilingual Unsupervised and Supervised Embeddings)
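
BPE subword embeddings are keyed on subword units rather than whole words, so a word is first split into segments before any lookup. The following is a minimal sketch of that segmentation step in plain Python. It is not PyTorch-NLP code: the function name bpe_segment and the three-entry merge table are hypothetical and chosen only for illustration, whereas real pre-trained BPE models ship their own, much larger merge tables.

```python
def bpe_segment(word, merges):
    """Split `word` into subword units by greedily applying BPE merges.

    `merges` is an ordered list of symbol pairs; earlier pairs have
    higher priority. This is an illustrative sketch, not library code.
    """
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(word)  # start from single characters
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        # Pick the adjacent pair with the best (lowest) merge rank.
        best = min(pairs, key=lambda pair: ranks.get(pair, float("inf")))
        if best not in ranks:
            break  # no known merge applies any more
        merged = []
        i = 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Made-up merge table, for illustration only.
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]
print(bpe_segment("lower", toy_merges))  # ['low', 'er']
```

Each resulting segment would then be looked up in the embedding table, which is why rare or unseen words can still receive vectors.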
🐛 Bug Fixes and Other Changes
- Created non-existent cache directories for torchnlp.word_to_vector
- ➕ Added a set operation to torchnlp.datasets.Dataset with support for slices, columns and rows
- Updated biggest_batches_first in torchnlp.samplers to be more efficient at approximating memory than Pickle
- Enabled torchnlp.utils.pad_tensor and torchnlp.utils.pad_batch to support N-dimensional tensors
- ⚡️ Updated to sacremoses to fix the NLTK Moses dependency for torchnlp.text_encoders
- Added __getitem__ for _PretrainedWordVectors. For example:

      from torchnlp.word_to_vector import FastText

      vectors = FastText()
      tokenized_sentence = ['this', 'is', 'a', 'sentence']
      # Index with a list of tokens to look up all of their vectors at once
      vectors[tokenized_sentence]
- Added __contains__ for _PretrainedWordVectors. For example:

      >>> from torchnlp.word_to_vector import FastText
      >>> vectors = FastText()
      >>> 'the' in vectors
      True
      >>> 'theqwe' in vectors
      False
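
To show how the two lookup operations above fit together without downloading any pre-trained vectors, here is a small self-contained sketch. ToyWordVectors is a hypothetical stand-in, not the library's _PretrainedWordVectors class. The two-dimensional vectors are made up, and mapping unknown tokens to a zero vector is an assumption made for this example.

```python
class ToyWordVectors:
    """Hypothetical stand-in for a pre-trained vector table (not torchnlp code)."""

    def __init__(self, table, dim):
        self.table = dict(table)  # token -> list of floats
        self.dim = dim

    def __contains__(self, token):
        # Supports `token in vectors`.
        return token in self.table

    def __getitem__(self, tokens):
        # A single token returns one vector; a list of tokens returns
        # one vector per token, in order.
        if isinstance(tokens, str):
            # Assumption: unknown tokens fall back to a zero vector.
            return self.table.get(tokens, [0.0] * self.dim)
        return [self[token] for token in tokens]


# Made-up vectors, for illustration only.
vectors = ToyWordVectors({"the": [0.1, 0.2], "cat": [0.3, 0.4]}, dim=2)
print("the" in vectors)           # True
print("theqwe" in vectors)        # False
print(vectors[["the", "theqwe"]])  # [[0.1, 0.2], [0.0, 0.0]]
```

Checking membership first is useful when a caller wants to treat out-of-vocabulary tokens differently from tokens that merely have a small vector.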