OpenNMT-py v3.2.0

Released by @vince62s · 07 Jun 20:15 · c858395 · 120 commits to master since this release

Lots of new stuff in this release:

  • Skip init during model build (way faster building); see the sketch after this list
  • Enable quantization of LoRA layers
  • Enable 4-bit quantization from bitsandbytes (NF4 / FP4); see the sketch after this list
  • Enable "some" bnb.optim Optimizers for benchmarking purpose
  • Refactor model state_dict loading to enable pseudo lazy loading, moving weights to the GPU as they are loaded (see the sketch after this list)
  • Enable gradient checkpointing for FFN, MHA, and LoRA modules (see the sketch after this list)
  • Make FFN bias optional (same as QKV): the llama, mpt, redpajama, and openllama converters were changed accordingly.
    Convertv2_v3 sets add_qkvbias=True and add_ffnbias=True.
    load_checkpoint: if w1_bias is detected in the checkpoint, then add_ffnbias=True
  • Add Multi Query attention (see the sketch after this list)
  • Add Parallel Residual attention
  • Add Falcon 7B converter
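
On the skip-init point: PyTorch ships `torch.nn.utils.skip_init` (since 1.10), which constructs a module without running its weight initialization. Skipping init is what makes building large models much faster when a checkpoint is going to overwrite the weights anyway. A minimal sketch, not the exact OpenNMT-py code:

```python
import torch.nn as nn

# Allocate a large layer without running its (slow) weight initialization;
# the parameters hold uninitialized memory and are expected to be
# overwritten by a checkpoint load.
layer = nn.utils.skip_init(nn.Linear, 8192, 8192)
```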
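For the 4-bit quantization, bitsandbytes exposes `bnb.nn.Linear4bit` with `quant_type` set to `"nf4"` or `"fp4"`, the two formats named above. A hedged sketch of building such a layer directly (the sizes are illustrative, and OpenNMT-py wires this up through its own options rather than like this):

```python
import torch
import bitsandbytes as bnb

# A 4-bit linear layer; quant_type is "nf4" or "fp4". The weights are
# actually quantized when the module is moved to the GPU.
layer = bnb.nn.Linear4bit(
    4096, 4096,
    bias=False,
    compute_dtype=torch.float16,
    quant_type="nf4",
).to("cuda")
```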
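The state_dict refactor is about not holding two full copies of the model in CPU RAM. The idea, sketched below with hypothetical names (the actual OpenNMT-py loader differs), is to copy checkpoint tensors onto the GPU parameter by parameter and free each CPU tensor as soon as it has been consumed:

```python
import torch

def load_pseudo_lazy(model, ckpt_path, device="cuda"):
    # Load the checkpoint on CPU, then move tensors to the GPU one by one,
    # popping each entry so its CPU copy is released right after the move.
    state_dict = torch.load(ckpt_path, map_location="cpu")
    for name, param in model.named_parameters():
        if name in state_dict:
            with torch.no_grad():
                param.data = state_dict.pop(name).to(device)
    model.to(device)  # move anything not covered by the checkpoint
    return model
```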
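Gradient checkpointing trades compute for memory: activations inside the wrapped block are recomputed during backward instead of being stored. A generic sketch of wrapping an FFN with `torch.utils.checkpoint` (again, an assumption-laden stand-in, not the OpenNMT-py module):

```python
import torch
from torch.utils.checkpoint import checkpoint

class CheckpointedFFN(torch.nn.Module):
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.w1 = torch.nn.Linear(d_model, d_ff)
        self.w2 = torch.nn.Linear(d_ff, d_model)

    def _ffn(self, x):
        return self.w2(torch.relu(self.w1(x)))

    def forward(self, x):
        # Activations of _ffn are recomputed in backward instead of saved;
        # use_reentrant=False is the recommended mode on recent PyTorch.
        return checkpoint(self._ffn, x, use_reentrant=False)
```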
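Multi-query attention shares a single K/V head across all query heads, which shrinks the KV cache at inference time. A self-contained sketch in plain PyTorch, with illustrative shapes rather than the OpenNMT-py implementation:

```python
import torch
import torch.nn.functional as F

def multi_query_attention(x, wq, wk, wv, n_heads):
    # x: (batch, seq, d_model); wq: (d_model, d_model);
    # wk, wv: (d_model, head_dim) -- a single shared K/V head.
    b, t, d = x.shape
    head_dim = d // n_heads
    q = (x @ wq).view(b, t, n_heads, head_dim).transpose(1, 2)  # (b, h, t, hd)
    k = (x @ wk).view(b, t, 1, head_dim).transpose(1, 2)        # (b, 1, t, hd)
    v = (x @ wv).view(b, t, 1, head_dim).transpose(1, 2)        # (b, 1, t, hd)
    # Broadcasting expands the single K/V head across all query heads.
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5          # (b, h, t, t)
    attn = F.softmax(scores, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(b, t, d)
```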