pretrained_word_embeddings

This repository shows how to load pretrained word embeddings in PyTorch, e.g., ELMo, BERT, and XLNet.

Setup

Python 3.6.x
PyTorch 1.3.1
gpustat (only if a GPU is used): pip install gpustat
ELMo via allennlp: pip install allennlp
BERT/XLNet via transformers: pip install transformers

Use

python elmo_bert_xlnet_layer.py

Alignment from BERT/XLNet tokens to original words

Usually we want word-level embeddings from BERT/XLNet models, but the tokenizer may split a single word into several subword tokens. In that case, word embeddings are obtained through an alignment from BERT/XLNet tokens back to the original words.

For example, the sentence

"i dont care wether it provides free wifi or not"

can be tokenized as

['i', 'dont', 'care', 'wet', '##her', 'it', 'provides', 'free', 'wi', '##fi', 'or', 'not']

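Here 'wether' is split into 'wet' and '##her', and 'wifi' into 'wi' and '##fi', so the 10 original words become 12 tokens.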
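As a rough illustration of what such an alignment looks like, the sketch below groups the token indices of the example sentence into one span per original word. It assumes WordPiece-style tokens, where a leading '##' marks the continuation of the previous word, as in the token list above. The function name `group_wordpieces` is hypothetical and is not taken from this repository; XLNet's SentencePiece tokenizer marks word boundaries differently, so this exact rule would not apply to it unchanged.

```python
def group_wordpieces(tokens, continuation_prefix="##"):
    """Return one list of token indices per original word.

    Assumption: a token starting with `continuation_prefix` continues
    the word started by the preceding token (WordPiece convention).
    """
    spans = []
    for idx, token in enumerate(tokens):
        if token.startswith(continuation_prefix) and spans:
            spans[-1].append(idx)  # continuation piece: extend the current word
        else:
            spans.append([idx])    # a new word starts here
    return spans


tokens = ['i', 'dont', 'care', 'wet', '##her', 'it', 'provides',
          'free', 'wi', '##fi', 'or', 'not']
spans = group_wordpieces(tokens)

# Rebuild the words from their pieces to check the grouping.
words = [
    "".join(tokens[i][2:] if tokens[i].startswith("##") else tokens[i] for i in span)
    for span in spans
]
print(spans)
print(words)
```

Each inner list in `spans` holds the token positions that belong to one word, which is the information the 'first' and 'avg' alignments need.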

We provide three types of alignment:

'ori': the output embeddings of the BERT/XLNet tokens are used directly to represent the input sentence, and only the output embeddings of special tokens such as '[CLS]' and '[SEP]' are dropped.
'first': the embedding of the first token of each word is used as the word embedding.
'avg': the embeddings of all tokens of a word are averaged to form the word embedding.
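The three options can be sketched on toy data as follows. This is a framework-free illustration using plain Python lists, not the repository's PyTorch implementation. The function `align`, the toy vectors, and the `spans` argument (one list of token indices per word) are all assumptions made for this example, and the special tokens are assumed to have been removed already.

```python
def align(token_vectors, spans, alignment="first"):
    """Turn token-level vectors into output vectors under one alignment mode.

    token_vectors: one vector (list of floats) per token, special tokens excluded.
    spans: one list of token indices per original word (hypothetical input format).
    """
    if alignment == "ori":
        # Keep every token vector as is: output length = number of tokens.
        return [list(vec) for vec in token_vectors]
    if alignment == "first":
        # Use the first token of each word: output length = number of words.
        return [list(token_vectors[span[0]]) for span in spans]
    if alignment == "avg":
        # Average all token vectors of each word, dimension by dimension.
        averaged = []
        for span in spans:
            dim = len(token_vectors[span[0]])
            averaged.append(
                [sum(token_vectors[i][d] for i in span) / len(span) for d in range(dim)]
            )
        return averaged
    raise ValueError("alignment must be 'ori', 'first' or 'avg'")


# Toy example: tokens ['free', 'wi', '##fi', 'or'] for the words 'free wifi or'.
token_vectors = [[1.0, 0.0], [2.0, 4.0], [4.0, 8.0], [0.0, 1.0]]
spans = [[0], [1, 2], [3]]

ori_out = align(token_vectors, spans, "ori")      # 4 vectors, one per token
first_out = align(token_vectors, spans, "first")  # 3 vectors, 'wifi' -> [2.0, 4.0]
avg_out = align(token_vectors, spans, "avg")      # 3 vectors, 'wifi' -> [3.0, 6.0]
```

With 'ori' the output keeps one vector per token, while 'first' and 'avg' return one vector per original word, which matters when the embeddings feed a word-level task such as sequence labeling.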

sz128/pretrained_word_embeddings
