Skip to content

Commit

Permalink
Merge pull request github#35593 from github/repo-sync
Browse files Browse the repository at this point in the history
Repo sync
  • Loading branch information
docs-bot authored Dec 9, 2024
2 parents a22e97a + 4070c99 commit 49dcf57
Show file tree
Hide file tree
Showing 11 changed files with 331 additions and 62 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -20,9 +20,9 @@ shortTitle: Remove sensitive data

## About removing sensitive data from a repository

When altering your repository's history using tools like `git filter-repo` or the BFG Repo-Cleaner, it's crucial to understand the implications, especially regarding open pull requests and sensitive data.
When altering your repository's history using tools like `git filter-repo`, it's crucial to understand the implications, especially regarding open pull requests and sensitive data.

The `git filter-repo` tool and the BFG Repo-Cleaner rewrite your repository's history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. We recommend merging or closing all open pull requests before removing files from your repository.
The `git filter-repo` tool rewrites your repository's history, which changes the SHAs for existing commits that you alter and any dependent commits. Changed commit SHAs may affect open pull requests in your repository. We recommend merging or closing all open pull requests before removing files from your repository.

You can remove the file from the latest commit with `git rm`. For information on removing a file that was added with the latest commit, see "[AUTOTITLE](/repositories/working-with-files/managing-large-files/about-large-files-on-github#removing-files-from-a-repositorys-history)."

Expand All @@ -48,37 +48,7 @@ If the commit that introduced the sensitive data exists in any forks, it will co

Consider these limitations and challenges in your decision to rewrite your repository's history.

## Purging a file from your repository's history

You can purge a file from your repository's history using either the `git filter-repo` tool or the BFG Repo-Cleaner open source tool.

> [!NOTE] If sensitive data is located in a file that's identified as a binary file, you'll need to remove the file from the history, as you can't modify it to remove or replace the data.
### Using the BFG

The [BFG Repo-Cleaner](https://rtyley.github.io/bfg-repo-cleaner/) is a tool that's built and maintained by the open source community. It provides a faster, simpler alternative to `git filter-repo` for removing unwanted data.

For example, to remove your file with sensitive data and leave your latest commit untouched, run:

```shell
bfg --delete-files YOUR-FILE-WITH-SENSITIVE-DATA
```

To replace all text listed in `passwords.txt` wherever it can be found in your repository's history, run:

```shell
bfg --replace-text passwords.txt
```

After the sensitive data is removed, you must force push your changes to {% data variables.product.product_name %}. Force pushing rewrites the repository history, which removes sensitive data from the commit history. If you force push, it may overwrite commits that other people have based their work on.

```shell
git push --force
```

See the [BFG Repo-Cleaner](https://rtyley.github.io/bfg-repo-cleaner/)'s documentation for full usage and download instructions.

### Using git filter-repo
## Purging a file from your repository's history using git-filter-repo

> [!WARNING] If you run `git filter-repo` after stashing changes, you won't be able to retrieve your changes with other stash commands. Before running `git filter-repo`, we recommend unstashing any changes you've made. To unstash the last set of changes you've stashed, run `git stash show -p | git apply -R`. For more information, see [Git Tools - Stashing and Cleaning](https://git-scm.com/book/en/v2/Git-Tools-Stashing-and-Cleaning).
Expand Down Expand Up @@ -178,7 +148,7 @@ To illustrate how `git filter-repo` works, we'll show you how to remove your fil

## Fully removing the data from {% data variables.product.prodname_dotcom %}

After using either the BFG tool or `git filter-repo` to remove the sensitive data and pushing your changes to {% data variables.product.product_name %}, you must take a few more steps to fully remove the data from {% data variables.product.product_name %}.
After using `git filter-repo` to remove the sensitive data and pushing your changes to {% data variables.product.product_name %}, you must take a few more steps to fully remove the data from {% data variables.product.product_name %}.

{% ifversion ghec %}
1. If the repository was migrated using the {% data variables.product.prodname_importer_proper_name %}, there may be some non-standard Git references that follow the pattern `refs/github-services`, that neither the BFG tool or `git filter-repo` can remove. In this case, remove those references running the following commands in your local copy of the repository:
Expand All @@ -205,22 +175,6 @@ After using either the BFG tool or `git filter-repo` to remove the sensitive dat

1. Tell your collaborators to [rebase](https://git-scm.com/book/en/v2/Git-Branching-Rebasing), _not_ merge, any branches they created off of your old (tainted) repository history. One merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging.

1. If you used `git filter-repo`, you can skip this step.

If you used the BFG tool, after rewriting, you can clean up references in your local repository to the old history to be dereferenced and garbage collected with the following commands (using Git 1.8.5 or newer):

```shell
$ git reflog expire --expire=now --all
$ git gc --prune=now
> Counting objects: 2437, done.
> Delta compression using up to 4 threads.
> Compressing objects: 100% (1378/1378), done.
> Writing objects: 100% (2437/2437), done.
> Total 2437 (delta 1461), reused 1802 (delta 1048)
```

> [!NOTE] You can also achieve this by pushing your filtered history to a new or empty repository and then making a fresh clone from {% data variables.product.product_name %}.

{% ifversion ghes %}

## Identifying reachable commits
Expand All @@ -245,7 +199,7 @@ If references are found in any forks, the results will look similar, but will st
ghe-nwo NWO
```

The same procedure using the BFG tool or `git filter-repo` can be used to remove the sensitive data from the repository's forks. Alternatively, the forks can be deleted altogether, and if needed, the repository can be re-forked once the cleanup of the root repository is complete.
The same procedure using `git filter-repo` can be used to remove the sensitive data from the repository's forks. Alternatively, the forks can be deleted altogether, and if needed, the repository can be re-forked once the cleanup of the root repository is complete.
Once you have removed the commit's references, re-run the commands to double-check.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ To ensure that all code is properly reviewed prior to being merged into the defa

## Mitigate data leaks

If a user pushes sensitive data, ask them to remove it by using the `git filter-repo` tool or the BFG Repo-Cleaner open source tool. For more information, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)." Also, it is possible to revert almost anything in Git. For more information, see [{% data variables.product.prodname_blog %}](https://github.blog/2015-06-08-how-to-undo-almost-anything-with-git/).
If a user pushes sensitive data, ask them to remove it by using the `git filter-repo` tool. For more information, see "[AUTOTITLE](/authentication/keeping-your-account-and-data-secure/removing-sensitive-data-from-a-repository)." Also, it is possible to revert almost anything in Git. For more information, see [{% data variables.product.prodname_blog %}](https://github.blog/2015-06-08-how-to-undo-almost-anything-with-git/).

At the organization level, if you're unable to coordinate with the user who pushed the sensitive data to remove it, we recommend you contact {% data variables.contact.contact_support %} with the concerning commit SHA.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ Below is a typical workflow that explains how {% data variables.product.prodname

* **Remediation:** You then need to take appropriate actions to remediate the exposure. This might include:
* Rotating the affected credential to ensure it is no longer usable.
* Removing the secret from the repository's history (using tools like BFG Repo-Cleaner or {% data variables.product.prodname_dotcom %}'s built-in features).
* Removing the secret from the repository's history (using tools like `git-filter-repo` or {% data variables.product.prodname_dotcom %}'s built-in features).

* **Monitoring:** It's good practice to regularly audit and monitor your repositories to ensure no other secrets are exposed.

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -138,12 +138,8 @@ After your pattern is created, {% data variables.product.prodname_secret_scannin
Before defining a custom pattern, you must ensure that you enable secret scanning for your enterprise account. For more information, see "[Enabling {% data variables.product.prodname_GH_advanced_security %} for your enterprise]({% ifversion fpt or ghec %}/enterprise-server@latest/{% endif %}/admin/advanced-security/enabling-github-advanced-security-for-your-enterprise)."

> [!NOTE]
{% ifversion custom-pattern-dry-run-ga %}
> * At the enterprise level, only the creator of a custom pattern can edit the pattern, and use it in a dry run.
> * {% data reusables.secret-scanning.dry-runs-enterprise-permissions %}
{% else %}
> As there is no dry-run functionality, we recommend that you test your custom patterns in a repository before defining them for your entire enterprise. That way, you can avoid creating excess false-positive {% data variables.secret-scanning.alerts %}.
{% endif %}
{% data reusables.enterprise-accounts.access-enterprise %}
{% data reusables.enterprise-accounts.policies-tab %}{% ifversion security-feature-enablement-policies %}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -88,6 +88,10 @@ By default, {% data variables.product.prodname_copilot_chat_short %} uses the `G
* `o1-preview`: This model is focused on advanced reasoning and solving complex problems, in particular in math and science. It responds more slowly than the `gpt-4o` model. Each member of your enterprise can make 10 requests to this model per day.
* `o1-mini`: This is the faster version of the `o1-preview` model, balancing the use of complex reasoning with the need for faster responses. It is best suited for code generation and small context operations. Each member of your enterprise can make 50 requests to this model per day.

### {% data variables.product.prodname_copilot_short %} Metrics API access

Enable this policy to allow users to use the {% data variables.product.prodname_copilot_short %} Metrics API. See "[AUTOTITLE](/rest/copilot/copilot-metrics)."

## Configuring policies for {% data variables.product.prodname_copilot %}

{% data reusables.enterprise-accounts.access-enterprise %}
Expand Down
Loading

0 comments on commit 49dcf57

Please sign in to comment.