Please check the latest news (change log) and keep this package updated.
- Added the DOI link for the online published JPSP article: https://doi.org/10.1037/pspa0000396.
- Fixed bugs: Now only `BERT_download()` connects to the Internet, while all the other functions run offline.
- Improved installation guidance for Python packages.
- Added `BERT_info()`.
- Added `add.tokens` and `add.method` parameters for `BERT_vocab()` and `FMAT_run()`: an experimental functionality to add new tokens (e.g., out-of-vocabulary words, compound words, or even phrases) as [MASK] options. This novel practice still needs validation (one of my ongoing projects), so please use it at your own risk until my validation work is published.
- All functions except `BERT_download()` now import local model files only, without automatically downloading models. Users must first use `BERT_download()` to download models (see the offline-workflow sketch after this list).
- Deprecating `FMAT_load()`: better to use `FMAT_run()` directly.
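A minimal sketch of the offline workflow described above (the model name is an example; `BERT_info()` is assumed to accept the same vector of model names as `BERT_download()`):

```r
library(FMAT)

models = "bert-base-uncased"  # example model name

# The only step that connects to the Internet: download and cache the model.
BERT_download(models)

# All other functions now run offline on the locally cached model files.
BERT_info(models)
```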
- Added `BERT_vocab()` and `ICC_models()`.
- Improved `summary.fmat()`, `FMAT_query()`, and `FMAT_run()`. `FMAT_run()` is now significantly faster because it simultaneously estimates all [MASK] options for each unique query sentence, so running time depends only on the number of unique queries, not on the number of [MASK] options.
- If you use the `reticulate` package version ≥ 1.36.1, then `FMAT` should be updated to ≥ 2024.4; otherwise, out-of-vocabulary [MASK] words may not be identified and marked. `FMAT_run()` now directly uses the model vocabulary and token IDs to match [MASK] words. To check whether a [MASK] word is in a model's vocabulary, please use `BERT_vocab()` (see the sketch after this list).
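A minimal sketch of the recommended vocabulary check; the model name and [MASK] words are examples, and it is an assumption here that `BERT_vocab()` accepts a character vector of candidate [MASK] words as its second argument (see `?BERT_vocab`):

```r
library(FMAT)

# Check whether candidate [MASK] words are single tokens in the model's
# vocabulary; out-of-vocabulary words would be identified and marked.
BERT_vocab("bert-base-uncased", c("doctor", "nurse", "grandmother"))
```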
- The FMAT methodology paper has been accepted (March 14, 2024) for publication in the Journal of Personality and Social Psychology: Attitudes and Social Cognition (DOI: 10.1037/pspa0000396)!
- Added `BERT_download()` (downloading models to the local cache folder "%USERPROFILE%/.cache/huggingface") to differentiate it from `FMAT_load()` (loading saved models from the local cache). Note that `FMAT_load()` can also download models silently if they have not been downloaded yet.
- Added the `gpu` parameter (see Guidance for GPU Acceleration) in `FMAT_run()` to allow for specifying an NVIDIA GPU device on which the fill-mask pipeline will be allocated. GPU runs roughly 3x faster than CPU for the fill-mask pipeline. By default, `FMAT_run()` automatically detects and uses any available GPU if a CUDA-supported Python `torch` package is installed; otherwise, it uses CPU (see the sketch after this list).
- Added running speed information (queries/min) for `FMAT_run()`.
- Added device information for `BERT_download()`, `FMAT_load()`, and `FMAT_run()`.
- Deprecated `parallel` in `FMAT_run()`: `FMAT_run(model.names, data, gpu=TRUE)` is the fastest.
- A progress bar is displayed by default for `progress` in `FMAT_run()`.
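A sketch of the GPU-accelerated pipeline described above. The `FMAT_run(model.names, data, gpu=TRUE)` call is taken from this entry; the `FMAT_query()` call (query template, mask labels, and the `.()` helper) follows the package's documented query style but is an assumption here and should be checked against `?FMAT_query`:

```r
library(FMAT)

model.names = c("bert-base-uncased", "roberta-base")  # example models
BERT_download(model.names)  # one-time download to the local HuggingFace cache

# Hypothetical query: which [MASK] word better completes the sentence?
data = FMAT_query("[MASK] is a nurse.", MASK = .(Male = "He", Female = "She"))

# gpu = TRUE allocates the fill-mask pipeline on an NVIDIA GPU when a
# CUDA-supported Python torch package is installed; otherwise CPU is used.
results = FMAT_run(model.names, data, gpu = TRUE)
summary(results)
```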
- CRAN package publication.
- Fixed bugs and improved functions.
- Provided more examples.
- Now using "YYYY.M" as the package version number.
- Initial public release on GitHub.
- Designed basic functions.