Commit

added nb license
ishandhanani committed Jul 31, 2024
1 parent 3a69141 commit 14c2d46
Showing 2 changed files with 3 additions and 1 deletion.
Binary file added assets/brev-hf-law-qa-dataset.zip
Binary file not shown.
4 changes: 3 additions & 1 deletion llama31_law.ipynb
@@ -102,7 +102,9 @@
"source": [
"### Step 1: Prepare the dataset\n",
"\n",
"The dataset we used is a subset of the [Law-StackExchange dataset](https://huggingface.co/datasets/ymoslem/Law-StackExchange). We've already filtered and processed this dataset, and it can be used to train the model for several tasks: question title generation (summarization), law domain question answering, and question tag generation (multi-label classification). To run your own data cleaning and preprocessing, please refer to the [data generation notebook](https://github.com/NVIDIA/NeMo-Curator/tree/main/tutorials/peft-curation-with-sdg). That tutorial also lets you generate synthetic data to increase the size of the dataset."
"The dataset we used is a subset of the [Law-StackExchange dataset](https://huggingface.co/datasets/ymoslem/Law-StackExchange). We've already filtered and processed this dataset, and it can be used to train the model for several tasks: question title generation (summarization), law domain question answering, and question tag generation (multi-label classification). To run your own data cleaning and preprocessing, please refer to the [data generation notebook](https://github.com/NVIDIA/NeMo-Curator/tree/main/tutorials/peft-curation-with-sdg). That tutorial also lets you generate synthetic data to increase the size of the dataset.\n",
"\n",
"This dataset is licensed under the [CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/) license. You can use it for any purpose, including commercial use, provided that you give appropriate credit and distribute any derivative works under the same license. If you use the dataset in a publication, please cite the original authors and the [Law-StackExchange dataset](https://huggingface.co/datasets/ymoslem/Law-StackExchange) repository."
]
},
{