semgrep in org initial commit
ahpaleus committed Jan 12, 2024
1 parent 8883d31 commit 28cbe64
Showing 1 changed file with 111 additions and 23 deletions.
134 changes: 111 additions & 23 deletions content/docs/static-analysis/semgrep/30-org.md
@@ -2,29 +2,117 @@
title: "In your organization"
slug: in-your-organization
summary: "This section discusses the process of introducing Semgrep to your organization."
draft: true
weight: 30
---

# Introducing Semgrep to your organization

Semgrep is a powerful static analysis tool designed to identify bugs and specific
code patterns across multiple languages. This section outlines a flexible plan
for introducing Semgrep to your organization effectively:

1. **Assess Semgrep's usability for your organization**: Determine which languages and technologies within your
organization are supported by Semgrep.
2. **Conduct a pilot test**: Run Semgrep on a small project to evaluate its effectiveness and identify issues such
as false positives.
3. **Train your team**: Teach developers and other relevant team members how to use Semgrep, write custom rules,
and engage with the Semgrep community.
4. **Create an internal repository of custom rules**: Develop and maintain a repository with custom Semgrep rules
tailored to your organization's specific needs.
5. **Incorporate Semgrep into your CI/CD pipeline**: Gradually implement Semgrep into your CI/CD pipeline by starting
with a pilot test, followed by scheduled full scans
and, finally, diff-aware scanning on event triggers.
6. **Assign a dedicated researcher**: Allocate a team member to explore experimental features, monitor external
repositories, and assess the value of a paid subscription.

By following this plan, you can successfully integrate Semgrep into your organization, enhancing your code's security,
readability, and overall quality.
# How to introduce Semgrep to your organization

Semgrep is designed to be flexible to fit your organization’s specific needs. To get the best results, it’s important to
understand how to run Semgrep, which rules to use, and how to integrate it into the CI/CD pipeline. If you are unsure
how to get started, here is our seven-step plan to determine how to best integrate Semgrep into your SDLC, based on
what we’ve learned over the years.

## The 7-step Semgrep plan

1. Review the list of supported languages to understand whether [Semgrep can help you](https://semgrep.dev/docs/supported-languages/#language-maturity).

2. **Explore**: Try Semgrep on a small project to evaluate its effectiveness. For example, navigate into the root
directory of a project and run:

``` shell
semgrep --config auto
```

There are a few important notes to consider when running this command:

- The `--config auto` option submits metrics to Semgrep, which may not be desirable.
- Invoking Semgrep in this way will present an overview of identified issues, including the number and severity.
In general, you can use this CLI flag to gain a broad view of the technologies covered by Semgrep.
- Semgrep identifies programming languages by file extensions rather than analyzing their contents.
Some paths are excluded from scanning by the default `.semgrepignore` file. Additionally, Semgrep
excludes untracked files listed in a `.gitignore` file.
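
If sending metrics is a concern, one option is to skip `auto` and name a ruleset explicitly with metrics disabled. The sketch below assumes the `--metrics=off` flag and the `p/default` ruleset; `scan_without_metrics` is a hypothetical helper name, not part of Semgrep. Note that `--config auto` itself relies on metrics being enabled, so the two cannot be combined.

```shell
# Hypothetical wrapper: scan with an explicit ruleset and metrics disabled.
scan_without_metrics() {
  # "$@" forwards any extra arguments, such as the path to scan.
  semgrep --config p/default --metrics=off "$@"
}
```

Calling `scan_without_metrics .` from the project root would scan the current directory.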

3. **Dive deep**: Instead of using the auto option, use the [Semgrep Registry](https://semgrep.dev/explore) to select
rulesets based on key security patterns, your tech stack, and your needs.
- Try:

```shell
semgrep --config p/default
semgrep --config p/owasp-top-ten
semgrep --config p/cwe-top-25
```

or choose a ruleset based on your technology:

```shell
semgrep --config p/javascript
```

- Focus on rules with high confidence and medium- or high-impact metadata first. If there are too many results,
limit results to error severity only using the `--severity ERROR` flag.
- Resolve identified issues and include reproduction instructions in your bug reports.
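
As a rough sketch of the severity filter described above, several rulesets can be combined in one run while showing only error-severity findings. The function name `scan_errors_only` is hypothetical, and the ruleset choice is only an illustration.

```shell
# Hypothetical wrapper: combine two Registry rulesets and keep only
# findings with ERROR severity to cut down on noise.
scan_errors_only() {
  semgrep --config p/owasp-top-ten --config p/cwe-top-25 --severity ERROR "$@"
}
```

Once the error-severity findings are triaged, the same command without `--severity ERROR` would surface the remaining results.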

4. **Fine-tune**: Build your ideal chain of rulesets by reviewing the effectiveness of the rulesets you currently use.
- Check out non-security rulesets, too, such as best practices rules. This will enhance code readability and may
prevent the introduction of vulnerabilities in the future. Also, consider covering other aspects of your project:
- Shell scripts, configuration files, generic files, Dockerfiles
- Third-party dependencies (Semgrep Supply Chain, a paid feature, can help you detect whether you are using a
vulnerable package in an exploitable way)
- To suppress a false positive reported by Semgrep, add a comment on the first line of the pattern match or on the
line preceding it, e.g., `// nosemgrep: go.lang.security.audit.xss`. Also, explain why you decided to disable
a rule or provide a risk-acceptance reason.
- Create a customized `.semgrepignore` file to reduce noise by excluding specific files or folders from the Semgrep
scan. Semgrep ignores files listed in `.gitignore` by default. To maintain this, after creating a `.semgrepignore`
file, add `.gitignore` to your `.semgrepignore` with the pattern `:include .gitignore`.
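
A minimal sketch of such a file follows. The `:include .gitignore` line comes from the step above; the other patterns are hypothetical examples of noisy paths and should be replaced with ones that match your repository.

```shell
# Write a starter .semgrepignore in the current directory.
# Caution: this overwrites any existing .semgrepignore.
cat > .semgrepignore <<'EOF'
# Keep honoring entries from .gitignore
:include .gitignore

# Hypothetical noise sources; adjust to your repository
node_modules/
vendor/
tests/fixtures/
*.min.js
EOF
```

Committing this file alongside the code keeps local and CI scans consistent.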

5. Create an internal repository to aggregate custom Semgrep rules specific to your organization.
A README file should include a short tutorial on using Semgrep, applying custom rules from your repository,
and an inventory table of custom rules. Also, a contribution checklist will allow your team to maintain the quality
level of the rules (see the
[Trail of Bits Semgrep rule development checklist](https://github.com/trailofbits/semgrep-rules/blob/main/CONTRIBUTING.md#development-practices)).
Ensure that adding a new Semgrep rule to your internal Semgrep repository includes a peer review process
to reduce false positives/negatives.

6. **Evangelize**: Train developers and other relevant teams on effectively using Semgrep.
- Present pilot test results and advice on improving the organization's code quality and security.
Show potential Semgrep limitations (single-file analysis only).
- Include the official [Learn Semgrep](https://semgrep.dev/learn) resource and present the
[Semgrep Playground](https://semgrep.dev/playground/new) with “simple mode” for easy rule creation.
Provide an overview of how to write custom rules and emphasize that writing custom Semgrep rules is easy. Mention
that custom rules can be extended with the auto-fix feature using the `fix:` key. Encourage using metadata
(e.g., CWE, confidence, likelihood, impact) in custom rules to support the vulnerability management process.
To help a developer answer the question, “Should I create a Semgrep rule for this problem?” you can use these
follow-up questions:
- Can we detect a specific security vulnerability?
- Can we enforce best practices/conventions or maintain code consistency?
- Can we optimize the code by detecting code patterns that affect performance?
- Can we validate a specific business requirement or constraint?
- Can we identify deprecated/unused code?
- Can we spot any misconfiguration in a configuration file?
- Is this a recurring question as you review your code?
- How is code documentation handled, and what are the requirements for documentation?
- Create places for the team to discuss Semgrep, write custom rules, troubleshoot (e.g., a Slack channel),
and jot down ideas for Semgrep rules (e.g., on a Trello board). Also, consider writing custom rules for bugs found
during your organization’s security audits/bug bounty program. A good idea is to aggregate quick notes to help your
team use Semgrep (see the [Appendix in the original blog post](https://blog.trailofbits.com/2024/01/12/how-to-introduce-semgrep-to-your-organization/#:~:text=Appendix%3A%20Things%20I%20wish%20I%E2%80%99d%20known%20before%20I%20started%20using%20Semgrep)).
- Pay attention to the Semgrep Community Slack, where the Semgrep community helps with problems or writing custom
rules.
- Encourage the team to report limitations and bugs they encounter while using Semgrep to the Semgrep team by filing
GitHub issues (see this [example issue](https://github.com/returntocorp/semgrep/issues/4587) submitted by
Trail of Bits).
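
To make the custom-rule advice above concrete, here is a small sketch of a rule that uses both the `fix:` key and vulnerability-management metadata. The rule ID, file path, message, and metadata values are all hypothetical; the pattern is deliberately simple and would need tuning before real use.

```shell
# Create a hypothetical custom rule in a local rules/ directory.
mkdir -p rules
cat > rules/python-avoid-eval.yaml <<'EOF'
rules:
  - id: python-avoid-eval
    languages: [python]
    severity: ERROR
    message: Avoid eval() on dynamic input; ast.literal_eval() is safer for literals.
    pattern: eval($X)
    # Suggested replacement applied by the auto-fix feature
    fix: ast.literal_eval($X)
    metadata:
      cwe: "CWE-95: Improper Neutralization of Directives in Dynamically Evaluated Code ('Eval Injection')"
      confidence: MEDIUM
      likelihood: MEDIUM
      impact: HIGH
EOF
```

The rule could then be run with `semgrep --config rules/ path/to/code`; adding `--autofix` would apply the suggested replacement, so it is worth reviewing the resulting diff before committing.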

7. Implement Semgrep in the CI/CD pipeline by getting acquainted with the Semgrep documentation related to your CI
vendor. Incorporating Semgrep incrementally is important to avoid overwhelming developers with too many results. So,
try out a pilot test first on a repository. Then, implement the full Semgrep scan on a schedule on the main branch in
the CI/CD pipeline. Finally, include a diff-aware scanning approach when an event triggers (e.g., a pull/merge request).
A diff-aware approach scans only the changes that triggered the event, which keeps scans fast. This approach should use a
fine-tuned set of rules that yield high-confidence, true-positive results. Once the Semgrep implementation is
mature, configure the CI/CD pipeline to block pull requests that have unresolved Semgrep findings.
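
The two scan modes could be sketched as shell helpers for a CI script. This is an illustration under assumptions rather than a vendor-specific setup: `full_scan` and `diff_scan` are hypothetical names, `p/default` stands in for your fine-tuned rulesets, and the sketch relies on the `--error` and `--baseline-commit` options.

```shell
# Hypothetical helpers for a CI script.

# Full scan, e.g., on a schedule against the main branch.
# --error makes Semgrep exit non-zero when there are findings.
full_scan() {
  semgrep --config p/default --metrics=off --error "$@"
}

# Diff-aware scan: report only findings introduced since the base commit.
diff_scan() {
  base_commit="$1"
  semgrep --config p/default --metrics=off --error --baseline-commit "$base_commit"
}
```

In a pull request job, something like `diff_scan "$(git merge-base origin/main HEAD)"` would limit results to the changes under review, assuming `origin/main` is the target branch. The non-zero exit status from `--error` is what lets the pipeline block a merge once you are ready to enforce findings.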

## What’s next? Maximizing the value of Semgrep in your organization

As you introduce Semgrep to your organization, remember that it is updated frequently. To make the most of its
benefits, assign one person in your organization to be responsible for analyzing new features (e.g., Semgrep Pro, which
extends scanning with inter-file analysis instead of Semgrep’s single-file approach), informing
the team about external repositories of Semgrep rules, and determining the value of the paid subscription (e.g., access
to premium rules).
