
Fix: rename remaining tk identifiers to mindpet #19

Merged: 1 commit, Oct 10, 2023
3 changes: 0 additions & 3 deletions .github/workflows/ci.yml
100644 → 100755
@@ -37,6 +37,3 @@ jobs:
- name: Test with unit test (UT) pytest
run: |
pytest test/unit_test
- name: Test with system test (ST) pytest
run: |
pytest test/developer_test
Empty file modified .gitignore
100644 → 100755
Empty file.
6 changes: 1 addition & 5 deletions .pre-commit-config.yaml
@@ -5,14 +5,10 @@ repos:
# list of supported hooks: https://pre-commit.com/hooks.html
- id: check-yaml
- id: debug-statements
- id: end-of-file-fixer
- id: mixed-line-ending
args: ["--fix=lf"]
- id: trailing-whitespace

- repo: https://github.com/pylint-dev/pylint
rev: v2.14.5
hooks:
- id: pylint
args: [ "-rn", "-sn", "--rcfile=pylintrc", "--fail-on=I" ]
exclude: tests(/\w*)*/functional/|tests/input|tests(/\w*)*data/|doc/
exclude: tests(/\w*)*/functional/|tests/input|tests(/\w*)*data/|doc/|test|pylintrc
Empty file modified LICENSE
100644 → 100755
Empty file.
16 changes: 8 additions & 8 deletions README.md
100644 → 100755
@@ -45,12 +45,12 @@ pip uninstall mindpet

| Fine-tuning algorithm | Paper | Usage guide |
|----------------| ----------------------------------------------------------- |-----------------------------------------------------------------|
| LoRA | LoRA: Low-Rank Adaptation of Large Language Models | [TK_DeltaAlgorithm_README](doc/TK_DeltaAlgorithm_README.md) Chapter 1 |
| PrefixTuning | Prefix-Tuning: Optimizing Continuous Prompts for Generation | [TK_DeltaAlgorithm_README](doc/TK_DeltaAlgorithm_README.md) Chapter 2 |
| Adapter | Parameter-Efficient Transfer Learning for NLP | [TK_DeltaAlgorithm_README](doc/TK_DeltaAlgorithm_README.md) Chapter 3 |
| LowRankAdapter | Compacter: Efficient Low-Rank Hypercomplex Adapter Layers | [TK_DeltaAlgorithm_README](doc/TK_DeltaAlgorithm_README.md) Chapter 4 |
| BitFit | BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | [TK_DeltaAlgorithm_README](doc/TK_DeltaAlgorithm_README.md) Chapter 5 |
| R_Drop | R-Drop: Regularized Dropout for Neural Networks | [TK_DeltaAlgorithm_README](doc/TK_DeltaAlgorithm_README.md) Chapter 6 |
| LoRA | LoRA: Low-Rank Adaptation of Large Language Models | [MindPet_DeltaAlgorithm_README](doc/MindPet_DeltaAlgorithm_README.md) Chapter 1 |
| PrefixTuning | Prefix-Tuning: Optimizing Continuous Prompts for Generation | [MindPet_DeltaAlgorithm_README](doc/MindPet_DeltaAlgorithm_README.md) Chapter 2 |
| Adapter | Parameter-Efficient Transfer Learning for NLP | [MindPet_DeltaAlgorithm_README](doc/MindPet_DeltaAlgorithm_README.md) Chapter 3 |
| LowRankAdapter | Compacter: Efficient Low-Rank Hypercomplex Adapter Layers | [MindPet_DeltaAlgorithm_README](doc/MindPet_DeltaAlgorithm_README.md) Chapter 4 |
| BitFit | BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | [MindPet_DeltaAlgorithm_README](doc/MindPet_DeltaAlgorithm_README.md) Chapter 5 |
| R_Drop | R-Drop: Regularized Dropout for Neural Networks | [MindPet_DeltaAlgorithm_README](doc/MindPet_DeltaAlgorithm_README.md) Chapter 6 |



@@ -60,12 +60,12 @@ pip uninstall mindpet

MindPet lets users freeze selected modules of a network, either by fine-tuning algorithm or by module name, and offers two ways to do so: a calling API and a configuration file.

See [TK_GraphOperation_README](doc/TK_GraphOperation_README.md), Chapter 1, for usage.
See [MindPet_GraphOperation_README](doc/MindPet_GraphOperation_README.md), Chapter 1, for usage.



### 4.2 API for saving trainable parameters

MindPet can save only the parameters that are updated during training to a ckpt file, reducing the physical storage required.

See [TK_GraphOperation_README](doc/TK_GraphOperation_README.md), Chapter 2, for usage.
See [MindPet_GraphOperation_README](doc/MindPet_GraphOperation_README.md), Chapter 2, for usage.
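The save-trainable-parameters feature described above boils down to filtering a model's parameters to those still marked trainable before serializing. A minimal framework-free sketch, shown only to illustrate the idea (the dict-of-tuples "model" and the `trainable_params` helper are hypothetical, not MindPet's implementation):

```python
# Sketch: keep only trainable parameters for checkpointing.
# Mirrors the idea behind TrainableParamsCheckPoint, not its actual code.

def trainable_params(params: dict) -> dict:
    """Return only the entries whose metadata marks them trainable."""
    return {name: value for name, (value, trainable) in params.items() if trainable}

# Toy "model": parameter name -> (weight, requires_grad flag)
params = {
    "backbone.dense.weight": ([0.1, 0.2], False),  # frozen base weight
    "mindpet_delta_lora_A":  ([0.01], True),       # trainable delta weight
    "mindpet_delta_lora_B":  ([0.02], True),
}

ckpt = trainable_params(params)
print(sorted(ckpt))  # only the LoRA delta weights survive
```

Saving only the delta weights is what makes parameter-efficient fine-tuning cheap to store: the frozen backbone is shared across tasks, so each task's checkpoint holds just the small trainable remainder.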
90 changes: 45 additions & 45 deletions doc/TK_DeltaAlgorithm_README.md → doc/MindPet_DeltaAlgorithm_README.md
100644 → 100755

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions doc/TK_GraphOperation_README.md → doc/MindPet_GraphOperation_README.md
100644 → 100755
@@ -38,7 +38,7 @@ freeze_modules(model, include, exclude)
**Example:**

```python
from tk.graph.freeze_utils import freeze_modules
from mindpet.graph.freeze_utils import freeze_modules

# Initialize the network structure
model = Network()
@@ -86,7 +86,7 @@ freeze_delta(model, mode, include, exclude)
**Example:**

```python
from tk.graph.freeze_utils import freeze_delta
from mindpet.graph.freeze_utils import freeze_delta

# Initialize the network structure
model = Network()
@@ -141,7 +141,7 @@ freeze_from_config(model, config_path)
**Example:**

```python
from tk.graph.freeze_utils import freeze_from_config
from mindpet.graph.freeze_utils import freeze_from_config

# Initialize the network structure
model = Network()
@@ -187,7 +187,7 @@ TrainableParamsCheckPoint(directory, prefix, config)
- **During model fine-tuning**, import the `TrainableParamsCheckPoint` class from the fine-tuning toolkit. Its usage matches MindSpore's `ModelCheckpoint`: instantiate the `callback` and add it to the training `callback list`, for example:

```python
from tk.graph import TrainableParamsCheckPoint
from mindpet.graph import TrainableParamsCheckPoint
from mindspore import CheckpointConfig

ckpt_config = CheckpointConfig()
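The `freeze_modules(model, include, exclude)` API shown in this file can be approximated with simple name matching: a parameter is frozen when its name matches some `include` pattern and no `exclude` pattern. A hedged sketch in pure Python (shell-style wildcards via `fnmatch` are an assumption for illustration; MindPet's actual matching rules may differ):

```python
from fnmatch import fnmatch

def freeze_by_name(param_names, include, exclude=()):
    """Return the parameter names that would be frozen: matched by
    some include pattern and by no exclude pattern."""
    frozen = []
    for name in param_names:
        if any(fnmatch(name, pat) for pat in include) and \
           not any(fnmatch(name, pat) for pat in exclude):
            frozen.append(name)
    return frozen

names = [
    "embedding.weight",
    "encoder.layer0.dense.weight",
    "mindpet_delta_lora_A",
]
# Freeze everything except the LoRA delta parameters.
print(freeze_by_name(names, include=["*"], exclude=["*lora*"]))
```

A real implementation would then set each matched parameter's `requires_grad` to `False`; the include/exclude split is what lets one line of config freeze a whole backbone while leaving the delta modules trainable.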
Empty file modified doc/image/architecture_of_adapter_module.png
100644 → 100755
Empty file modified doc/image/architecture_of_low_rank_adapter_module.png
100644 → 100755
Empty file modified doc/image/lora.PNG
100644 → 100755
Empty file modified doc/image/prefix.png
100644 → 100755
5 changes: 3 additions & 2 deletions mindpet/__init__.py
100644 → 100755
@@ -1,7 +1,8 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright © Huawei Technologies Co., Ltd. 2022-2023. All rights reserved.
"""Mindpet sdk APIs."""

import mindpet.tk_sdk as tk_sdk
from mindpet import mindpet_sdk

__all__ = ["tk_sdk"]
__all__ = ["mindpet_sdk"]
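Because this PR renames every remaining `tk_` identifier to `mindpet_`, checkpoints saved before the rename would carry stale parameter names such as `tk_delta_lora_A`. A hypothetical migration helper, sketched only to illustrate the compatibility concern (it is not part of MindPet):

```python
def migrate_ckpt_keys(state: dict) -> dict:
    """Rewrite legacy 'tk_delta_'-prefixed parameter names to the
    new 'mindpet_delta_' naming introduced by this rename."""
    migrated = {}
    for key, value in state.items():
        migrated[key.replace("tk_delta_", "mindpet_delta_")] = value
    return migrated

old_state = {"tk_delta_lora_A": [0.01], "backbone.weight": [0.5]}
print(sorted(migrate_ckpt_keys(old_state)))
```

Keys without the legacy prefix pass through untouched, so the helper is safe to run on already-migrated checkpoints.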
Empty file modified mindpet/delta/__init__.py
100644 → 100755
Empty file.
40 changes: 21 additions & 19 deletions mindpet/delta/adapter.py
100644 → 100755
@@ -53,17 +53,17 @@ def __init__(
self.non_linearity_name = non_linearity

adapter_dict = OrderedDict()
adapter_dict["tk_delta_adapter_down_sampler"] = _Linear(hidden_size,
adapter_dict["mindpet_delta_adapter_down_sampler"] = _Linear(hidden_size,
bottleneck_size,
compute_dtype=compute_dtype,
param_init_type=param_init_type)
adapter_dict["tk_delta_adapter_non_linear"] = get_activation(non_linearity)
adapter_dict["tk_delta_adapter_up_sampler"] = _Linear(bottleneck_size,
adapter_dict["mindpet_delta_adapter_non_linear"] = get_activation(non_linearity)
adapter_dict["mindpet_delta_adapter_up_sampler"] = _Linear(bottleneck_size,
hidden_size,
compute_dtype=compute_dtype,
param_init_type=param_init_type)

self.tk_delta_adapter_block = nn.SequentialCell(adapter_dict)
self.mindpet_delta_adapter_block = nn.SequentialCell(adapter_dict)
self.residual_add = P.Add()
self.cast = P.Cast()
self.shape = P.Shape()
@@ -79,7 +79,7 @@ def construct(self, input_tensor):
input_tensor = self.reshape(input_tensor, (-1, input_tensor_shape[-1]))

# calculate adapter_out
adapter_out = self.tk_delta_adapter_block(input_tensor)
adapter_out = self.mindpet_delta_adapter_block(input_tensor)

# residual connection, add input and adapter_out
output = self.residual_add(input_tensor, adapter_out)
@@ -99,27 +99,29 @@ def shard(self,
strategy_residual_add=None):
"""Shard Method"""
try:
self.tk_delta_adapter_block.tk_delta_adapter_down_sampler.shard(
self.mindpet_delta_adapter_block.mindpet_delta_adapter_down_sampler.shard(
strategy_matmul=strategy_matmul_down_sampler, strategy_bias=strategy_bias_down_sampler)

if self.non_linearity_name.lower() == "leakyrelu":
self.tk_delta_adapter_block.tk_delta_adapter_non_linear.select_op.shard(
self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear.select_op.shard(
(strategy_non_linearity[0], strategy_non_linearity[0]))
elif self.non_linearity_name.lower() == "logsigmoid":
self.tk_delta_adapter_block.tk_delta_adapter_non_linear.mul.shard((strategy_non_linearity[0], ()))
self.tk_delta_adapter_block.tk_delta_adapter_non_linear.exp.shard(strategy_non_linearity)
self.tk_delta_adapter_block.tk_delta_adapter_non_linear.add.shard((strategy_non_linearity[0], ()))
self.tk_delta_adapter_block.tk_delta_adapter_non_linear.rec.shard(strategy_non_linearity)
self.tk_delta_adapter_block.tk_delta_adapter_non_linear.log.shard(strategy_non_linearity)
self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear.mul.shard((
strategy_non_linearity[0], ()))
self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear.exp.shard(strategy_non_linearity)
self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear.add.shard((
strategy_non_linearity[0], ()))
self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear.rec.shard(strategy_non_linearity)
self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear.log.shard(strategy_non_linearity)
elif self.non_linearity_name.lower() == "logsoftmax":
raise ValueError("The 'LogSoftmax' function is not supported in semi auto parallel "
"or auto parallel mode.")
else:
getattr(self.tk_delta_adapter_block.tk_delta_adapter_non_linear,
getattr(self.mindpet_delta_adapter_block.mindpet_delta_adapter_non_linear,
self.non_linearity_name).shard(strategy_non_linearity)

self.tk_delta_adapter_block.tk_delta_adapter_up_sampler.shard(strategy_matmul=strategy_matmul_up_sampler,
strategy_bias=strategy_bias_up_sampler)
self.mindpet_delta_adapter_block.mindpet_delta_adapter_up_sampler.shard(
strategy_matmul=strategy_matmul_up_sampler, strategy_bias=strategy_bias_up_sampler)

self.residual_add.shard(strategy_residual_add)

@@ -142,7 +144,7 @@ class AdapterDense(nn.Dense):
When a str is used, the value refers to the initializer class; see Initializer for details.
When a Tensor is used, its data type matches the input Tensor.
Default: "normal".
bias_init (Union[Tensor, str, Initializer, numbers.Number]):
bias_init (Union[Tensor, str, Initializer, numbers.Number]):
Initialization method for the linear layer's bias parameter.
Its type can be Tensor, str, Initializer, or numbers.Number.
When a str is used, the value refers to the initializer class; see Initializer for details.
@@ -194,7 +196,7 @@ def __init__(self,
has_bias=has_bias,
activation=activation)

self.tk_delta_adapter = AdapterLayer(hidden_size=out_channels,
self.mindpet_delta_adapter = AdapterLayer(hidden_size=out_channels,
bottleneck_size=bottleneck_size,
non_linearity=non_linearity,
param_init_type=param_init_type,
@@ -226,7 +228,7 @@ def construct(self, input_tensor):
input_tensor = self.activation(input_tensor)

# calculate adapter_out
input_tensor = self.tk_delta_adapter(input_tensor)
input_tensor = self.mindpet_delta_adapter(input_tensor)

# recover the previous outshape and dtype
out_shape = x_shape[:-1] + (-1,)
@@ -267,7 +269,7 @@ def shard(self,
getattr(self.activation, self.act_name).shard(strategy_activation_org)

# set adapter strategy
self.tk_delta_adapter.shard(strategy_matmul_down_sampler=strategy_matmul_down_sampler,
self.mindpet_delta_adapter.shard(strategy_matmul_down_sampler=strategy_matmul_down_sampler,
strategy_bias_down_sampler=strategy_bias_down_sampler,
strategy_non_linearity=strategy_non_linearity,
strategy_matmul_up_sampler=strategy_matmul_up_sampler,
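The adapter block renamed throughout this file is a bottleneck: down-project the hidden state to `bottleneck_size`, apply a non-linearity, up-project back, then add the residual. A pure-Python sketch of that dataflow (list-based matrices stand in for the `_Linear` layers; this illustrates only the structure, not MindSpore semantics):

```python
def relu(x):
    return [max(0.0, v) for v in x]

def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]

def adapter_forward(x, w_down, w_up):
    """Bottleneck adapter: x + up(nonlinear(down(x)))."""
    hidden = relu(matvec(w_down, x))              # down-sample to bottleneck
    delta = matvec(w_up, hidden)                  # up-sample back to hidden size
    return [xi + di for xi, di in zip(x, delta)]  # residual add

# hidden_size=2, bottleneck_size=1
w_down = [[0.5, 0.5]]   # 1x2 down-sampler
w_up = [[1.0], [1.0]]   # 2x1 up-sampler
print(adapter_forward([1.0, 3.0], w_down, w_up))  # → [3.0, 5.0]
```

The residual connection means an adapter initialized near zero starts as an identity map, so inserting it does not disturb the pretrained network at the beginning of fine-tuning.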
Empty file modified mindpet/delta/delta_constants.py
100644 → 100755
Empty file.
12 changes: 6 additions & 6 deletions mindpet/delta/lora.py
100644 → 100755
@@ -59,11 +59,11 @@ def __init__(
self.lora_rank = lora_rank
self.lora_alpha = lora_alpha
self.lora_dropout = get_dropout(lora_dropout)
self.tk_delta_lora_a = Parameter(
self.mindpet_delta_lora_a = Parameter(
initializer(lora_a_init, [lora_rank, in_channels], param_init_type),
name='tk_delta_lora_A')
self.tk_delta_lora_b = Parameter(initializer(lora_b_init, [out_channels, lora_rank], param_init_type),
name='tk_delta_lora_B')
name='mindpet_delta_lora_A')
self.mindpet_delta_lora_b = Parameter(initializer(lora_b_init, [out_channels, lora_rank], param_init_type),
name='mindpet_delta_lora_B')
self.scaling = self.lora_alpha / self.lora_rank

# Calculation utils
@@ -80,8 +80,8 @@ def construct(self, input_tensor):
ori_dtype = F.dtype(input_tensor)
input_tensor = self.cast(input_tensor, self.dtype)
weight = self.cast(self.weight, self.dtype)
lora_a = self.cast(self.tk_delta_lora_a, self.dtype)
lora_b = self.cast(self.tk_delta_lora_b, self.dtype)
lora_a = self.cast(self.mindpet_delta_lora_a, self.dtype)
lora_b = self.cast(self.mindpet_delta_lora_b, self.dtype)
scaling = self.cast(self.scaling, self.dtype)

# Shape operations
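The LoRA layer whose parameters are renamed here computes `y = Wx + (alpha / r) * B(Ax)`: the frozen base weight `W` plus a scaled low-rank update through `A` (rank x in) and `B` (out x rank), with `scaling = lora_alpha / lora_rank` as in the diff above. A pure-Python sketch of that forward pass (list-based matrices for illustration only):

```python
def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]

def lora_forward(x, weight, lora_a, lora_b, lora_alpha, lora_rank):
    """y = W x + (alpha / r) * B (A x) — the LoRA low-rank update."""
    base = matvec(weight, x)                   # frozen pretrained path
    delta = matvec(lora_b, matvec(lora_a, x))  # low-rank trainable path
    scaling = lora_alpha / lora_rank
    return [b + scaling * d for b, d in zip(base, delta)]

# in_channels=2, out_channels=2, lora_rank=1
weight = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight (identity here)
lora_a = [[1.0, 1.0]]              # 1x2, maps input to rank-1 space
lora_b = [[0.5], [0.5]]            # 2x1, maps back to output space
print(lora_forward([2.0, 4.0], weight, lora_a, lora_b, lora_alpha=2, lora_rank=1))
# → [8.0, 10.0]
```

Only `lora_a` and `lora_b` are trainable, which is why the rename touches exactly those `Parameter` names: they are the ones that end up in delta-only checkpoints.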