From cc8b52a13bb4153172a429be9d4ed1f034777e3f Mon Sep 17 00:00:00 2001
From: Namrata Shivagunde <51484711+NamrataRShivagunde@users.noreply.github.com>
Date: Wed, 8 May 2024 10:19:49 -0400
Subject: [PATCH] Update index.html

---
 docs/2024/pept_relora_n_galore/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/2024/pept_relora_n_galore/index.html b/docs/2024/pept_relora_n_galore/index.html
index d846244..80cc396 100644
--- a/docs/2024/pept_relora_n_galore/index.html
+++ b/docs/2024/pept_relora_n_galore/index.html
@@ -154,7 +154,7 @@
-ReLoRA uses LoRA (Hu et al., 2022) decomposition technique where the pre-trained model weights are frozen and trainable rank decomposition matrices (WA, WB) are injected into each layer of the LLM. However in LoRA, the rank of the matrix is restricted by the rank r (given below), and the new trainable parameters (WA and WB) are merged back to the original matrices only after the end of the training. This suggests the potential to use PEPT techniques to pre-train LLMS.
+ReLoRA uses the LoRA (Hu et al., 2022) decomposition technique, in which the pre-trained model weights are frozen and trainable rank-decomposition matrices (WA, WB) are injected into each attention and MLP layer of the LLM. However, in LoRA, the rank of the update matrix is restricted to r (given below), and the new trainable parameters (WA and WB) are merged back into the original matrices only at the end of training.
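
For context, here is a minimal PyTorch sketch of the LoRA-style injection described in the added paragraph, assuming an `nn.Linear` base layer. The class name `LoRALinear` and the hyperparameters `r` and `alpha` are illustrative choices, not taken from the ReLoRA implementation.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer with trainable low-rank factors WA, WB (illustrative sketch)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)            # freeze the pre-trained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        d_out, d_in = base.weight.shape
        self.W_A = nn.Parameter(torch.randn(d_in, r) * 0.01)   # trainable, d_in x r
        self.W_B = nn.Parameter(torch.zeros(r, d_out))          # trainable, r x d_out; zero init so the update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus low-rank update; the rank of the update is at most r.
        return self.base(x) + (x @ self.W_A @ self.W_B) * self.scale

    @torch.no_grad()
    def merge(self):
        # Fold the low-rank update back into the frozen weight matrix
        # (in plain LoRA this is done once, after training ends).
        self.base.weight += (self.W_A @ self.W_B).T * self.scale

# Example: wrap a 768x768 projection with a rank-8 adapter.
layer = LoRALinear(nn.Linear(768, 768), r=8)
```

The key difference exploited by ReLoRA is that the merge step is performed repeatedly during pre-training rather than once at the end, so the cumulative update to the base weights is not limited to rank r.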