chore(deps): update dependency semgrep to v1.85.0 #79

Merged on Aug 25, 2024 (2 commits)

@renovate renovate bot commented Aug 24, 2024

This PR contains the following updates:

Package    Change
semgrep    1.46.0 -> 1.85.0

Release Notes

returntocorp/semgrep (semgrep)

v1.85.0

Added
  • Semgrep now recognizes files ending with the extension .tfvars as Terraform files (saf-1481)
Changed
  • The use of --debug no longer generates profiling information.
    Use --time instead. (debug)
  • Updated link to the Supply Chain findings page on Semgrep AppSec Platform to filter to the specific repository and ref the findings are detected on. (secw-2395)
Fixed
  • Fixed an error with Julia list comprehensions where the pattern:

    [$A for $B in $C]
    

    would match

    [x for y in z]

    However we would only get one binding [$A/x]

    Behavior after fix: we get three bindings [$A/x,$B/y,$C/z] (saf-1480)

v1.84.1

No significant changes.

v1.84.0

Changed
  • We switched from magenta to yellow when highlighting matches
    with the medium or warning severity. We now use magenta for
    critical severity to be consistent with other tools such
    as npm. (color)
Fixed
  • Worked around a deadlock when interfile is run with j>1 and tracing is enabled. (saf-1157)
  • Fixed file count to report the accurate number of files scanned by generic & regex
    so that no double counting occurs. (saf-507)

v1.83.0

Added
  • Dockerfile: Allow Semgrep Ellipsis (...) in patterns for HEALTHCHECK commands. (saf-1441)
Fixed
  • The use of --debug should now generate far fewer log entries.
    Moreover, when the number of ignored files, rules, or
    other entities exceeds a large threshold, we replace them
    with a placeholder in the output to keep the output of semgrep
    small. (debuglogs)
  • Fixed a bug introduced in 1.81.0 which caused files ignored for the Code
    product but not the Secrets product to fail to be scanned for secrets.
    Files that were not ignored for either product were not affected. (saf-1459)

v1.82.0

Added
  • Added testsuite/ as a filepath to the default value for .semgrepignore. (gh-1876)
Changed
  • Update the library definitions for Java for the latest version of the JDK. (java-library-definitions)
Fixed
  • Fixed metavariable comparison in step mode.

    Previously, the rule:

        steps:
            - languages: [python]
              patterns:
                - pattern: x = f($VAR);
            - languages: [generic]
              patterns:
                - pattern-either:
                    - patterns:
                        - pattern: HI $VAR

    would not match, as one is an identifier and the other is an expression that has a
    string literal. The fix was changing the equality used. (saf-1061)

v1.81.0

Changed
  • The --debug option will now display logging information from the semgrep-core
    binary directly, without waiting for the semgrep-core program to finish. (incremental_debug)
Fixed
  • C++: Scanning a project with header files (.h) no longer causes
    spurious warnings that the file is being skipped or not analyzed. (code-6899)

  • Semgrep will now be more strict (as it should be) when unifying identifiers.

    Patterns like the one below may no longer work, particularly in Semgrep Pro:

    patterns:
      - pattern-inside: |
          class A:
            ...
            def $F(...):
              ...
            ...
          ...
      - pattern-inside: |
          class B:
            ...
            def $F(...):
              ...
            ...
          ...
    

    Even if two classes A and B may both have a method named foo, these methods
    are not the same, and their ids are not unifiable via $F. The right way of doing
    this in Semgrep is the following:

    patterns:
      - pattern-inside: |
          class A:
            ...
            def $F1(...):
              ...
            ...
          ...
      - pattern-inside: |
          class B:
            ...
            def $F2(...):
              ...
            ...
          ...
      - metavariable-comparison:
          comparison: str($F1) == str($F2)
    

    We use a different metavariable to match each method, then we check whether they
    have the same name (i.e., same string). (code-7336)

  • In the app, you can configure Secrets ignores separately from Code/SSC ignores. However, the
    files that were ignored by Code/SSC and not Secrets were still being scanned during the
    preprocessing stage for interfile analysis. This caused significantly longer scan times than
    expected for some users, since those ignored files can include library code. This fix
    makes Code/SSC ignores apply as expected. (saf-1087)

  • Fixed a typo that prevented users from using the "--junit-xml-output" flag and added a test that invokes the flag. (saf-1437)

v1.80.0

Added
  • OSemgrep now can take --exclude-minified-files to skip minified files. Additionally --no-exclude-minified-files will disable this option. It is off by default. (cdx-460)

  • Users are now required to log in before using semgrep scan --pro.

    Previously, semgrep would tell users to log in, but the scan would still continue.

    With this change, semgrep tells users to log in and stops the scan. (saf-1137)

Fixed
  • The language server no longer scans large or minified files (cdx-460)

  • Pro: Improved module resolution for Python. Imports like from a.b import c where
    c is a module will now be resolved by Semgrep. And, if a module cannot be found
    in the search path, Semgrep will try to heuristically resolve the module by matching
    the module specifier against the files that are being scanned. (code-7069)

  • A scan could occasionally freeze when using tracing with multiple processes.

    This change disables tracing when scanning each target file unless the scan runs in a single process. (saf-1143)

  • Improved error handling for rules with invalid patterns. Now, scans will still complete and findings from other rules will be reported. (saf-789)

  • The "package-lock.json" parser incorrectly assumed that all paths in the "packages" component of "package-lock.json" started with "node_modules/".

    In reality, a dependency can be installed anywhere, so the parser was made more flexible to recognize alternative locations ("node_modules", "lib", etc). (sc-1576)

v1.79.0

Added
  • Preliminary support for the Move on Aptos language
    (see https://aptos.dev/move/move-on-aptos for more info on this language).
    Thanks a lot to Zhiping Liao (ArArgon) and Andrea Cappa for their contributions! (move_on_aptos)
  • The language server now reports the number of autofixes and ignores triggered through IDE integrations when metrics are enabled (pdx-autofix-ignore)
  • Added support for comparing Golang Pseudo-versions. After replacing calls to the
    packaging module with some custom logic, Pseudo-versions can now be compared against
    strict core versions and other pseudo versions accurately. (sc-1601)
  • We now perform a git gc as a side-effect of historical scans. (scrt-630)
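
The pseudo-version entry above (sc-1601) does not show the custom comparison logic. As a rough illustration only, the sketch below orders Go module versions by their core version, treats a pseudo-version as older than the release with the same core, and orders pseudo-versions with the same core by their embedded timestamp. The helper name `version_key` and the regular expression are hypothetical and are not taken from Semgrep's source; other pre-release tags (such as `-rc.1`) are deliberately out of scope.

```python
import re

# Hypothetical helper, not Semgrep's implementation.
# Go pseudo-versions end in "<yyyymmddhhmmss>-<12 hex digits>", for example:
#   v0.0.0-20191109021931-daa7c04131f5
#   v1.2.4-0.20191109021931-daa7c04131f5
_PSEUDO_SUFFIX = re.compile(r"(?:^|\.)(\d{14})-[0-9a-f]{12}$")


def version_key(version: str) -> tuple:
    """Sort key for release versions (v1.2.3) and pseudo-versions only."""
    text = version[1:] if version.startswith("v") else version
    core, _, pre = text.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    if not pre:
        # A release sorts after any pre-release that shares its core version.
        return (major, minor, patch, 1, "")
    match = _PSEUDO_SUFFIX.search(pre)
    if match is None:
        raise ValueError(f"not a release or pseudo-version: {version}")
    # Timestamps are fixed-width digit strings, so string order is time order.
    return (major, minor, patch, 0, match.group(1))


versions = [
    "v1.2.4",
    "v1.2.4-0.20191109021931-daa7c04131f5",
    "v1.2.3",
    "v0.0.0-20200101000000-0123456789ab",
    "v0.0.0-20191109021931-daa7c04131f5",
]
print(sorted(versions, key=version_key))
```

Using a tuple as the sort key keeps the comparison declarative: Python compares the fields left to right, so the release flag only matters when the core versions are equal.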
Fixed
  • tainting: Fixed bug in --pro-intrafile that caused Semgrep to confuse a parameter
    with a top-level function with no arguments that happened to have the same name:

    def foo
      taint
    end
    
    def bar(foo)
      sink(foo) # no more FP here
    end

    (code-6923)
    
  • Fixed fatal errors on files containing nosemgrep annotation without
    any rule ID after. (nosemgrep_exn)

  • Matching explanations: Focus nodes now appear after filter nodes, which is
    the correct order of execution of pattern nodes. Filter nodes are now
    unreversed. (saf-1127)

  • Autofix: Previews in the textual CLI output will now join differing lines
    with a space, rather than joining with no whitespace whatsoever. (saf-1135)

  • Secrets: resolved some rare instances where historical scans would skip blobs
    depending on the structure of the local copy of the repository (i.e., blobs
    were only skipped if the specific copy of the git store had a certain
    structure). (scrt-630)

v1.78.0

Added
  • Matching of fully qualified type names in the metavariable-type operator has
    been improved. For example:

    from a.b import C
    
    x = C()
    

    The type of x will match both a.b.C and C.

      - pattern: $X = $Y()
      - metavariable-type:
          metavariable: $X
          types:
            - a.b.C  # or C
    (code-7269)
    
Fixed
  • Symbolic propagation now works on decorator functions, for example:

    x = foo
    @x() # this is now matched by pattern `@foo()`
    def test():
      pass

    (code-6634)
    
  • Fixed an issue where Python functions with annotations ending in endpoint,
    route, get, patch, post, put, delete, before_request or
    after_request (i.e., ones we associate with Flask) were incorrectly analyzed
    with the Code product in addition to the Secrets product when present in a file
    being ignored for Code analysis but included for Secrets. (scrt-609)

v1.77.0

Added
  • Semgrep will now report the id of the organization associated with logged in users when reporting metrics in the language server (cdx-508)

  • Pro: taint-mode: Improved index-sensitive taint tracking for tuple/list (un)packing.

    Example 1:

     def foo():
         return ("ok", taint)
    
     def test():
          x, y = foo()
          sink(x)  # nothing, no FP
          sink(y)  # finding
    

    Example 2:

     def foo(t):
          (x, y) = t
          sink(x)  # nothing, no FP
          sink(y)  # finding
    
     def test():
          foo(("ok", taint)) (code-6935)
    
  • Adds traces to help debug the performance of tainting. To send the traces added in the PR, pass
    --trace and also set the environment variable SEMGREP_TRACE_LEVEL=trace. To send them to a
    local endpoint instead of our default endpoint, use --trace-endpoint. (saf-1100)

Fixed
  • Fixed a bug in the generation of the control-flow graph for try statements that
    could e.g. cause taint to report false positives:

    def test():
        data = taint
        try:
            # Semgrep assumes that clean could raise an exception, but
            # even if it does, the tainted data will never reach the sink!
            data = clean(data)
        except Exception:
            raise Exception()

        # data must be clean here
        sink(data) # no more FP

    (flow-78)
  • The language server (and semgrep --experimental) should no longer report errors from
    the metrics.semgrep.dev server such as "cannot read property 'map' of undefined". (metrics_error)
  • Fixed a bug in the Gemfile.lock parser that caused Semgrep to miss direct
    dependencies whose package name does not end in a version constraint. (sc-1568)

v1.76.0

Added
  • Added type inference support for basic operators in the Pro engine, including
    +, -, *, /, >, >=, <=, <, ==, !=, and not. For numeric
    computation operators such as + and -, if the left-hand side and right-hand
    side types are equal, the return type is assumed to be the same. Additionally,
    comparison operators like > and ==, as well as the negation operator not,
    are assumed to return a boolean type. (code-6940)

  • Added guidance for resolving token issues for install-semgrep-pro in non-interactive environments. (gh-1668)

  • Adds support for a new flag, --subdir <path>, for semgrep ci, which allows users to pass a
    subdirectory to scan instead of the entire directory. The path should be a relative path, and
    the directory where semgrep ci is run should be the root of the repository being scanned.
    Unless SEMGREP_REPO_DISPLAY_NAME is explicitly set, passing the subdirectory
    will cause the results to go to a project specific to that subdirectory.

    The intended use case for semgrep ci --subdir path/to/dir is to help users with very large
    repos scan the repo in parts. (saf-1056)
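
The type-inference rules from the first entry in this list (code-6940) can be sketched as a small lookup. This is a toy model under stated assumptions, not the Pro engine's code: types are plain strings, and the names `infer_binary` and `infer_not` are hypothetical.

```python
from typing import Optional

ARITHMETIC = {"+", "-", "*", "/"}
COMPARISON = {">", ">=", "<=", "<", "==", "!="}


def infer_binary(op: str, lhs: Optional[str], rhs: Optional[str]) -> Optional[str]:
    """Return the inferred result type name, or None when nothing can be inferred."""
    if op in COMPARISON:
        # Comparisons are assumed to return a boolean regardless of operand types.
        return "bool"
    if op in ARITHMETIC and lhs is not None and lhs == rhs:
        # Equal operand types: the result is assumed to have that same type.
        return lhs
    return None


def infer_not(_operand: Optional[str]) -> str:
    """Negation is assumed to return a boolean."""
    return "bool"


print(infer_binary("+", "int", "int"))
print(infer_binary("+", "int", "float"))
print(infer_binary("<", "int", "str"))
```

Returning None for mixed operand types mirrors the wording of the entry, which only states a result type when both sides agree.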

Fixed
  • The min-version/max-version rule filtering is now done in pysemgrep too,
    avoiding previous crash when using new fields (or new enums) in a rule.

  • Language Server will now send error messages properly, and error handling is greatly improved (cdx-502)

  • Pro: Calling a safe method on a tainted object should no longer propagate taint.

    Example:

    class A {
        String foo(String str) {
            return "ok";
        }
    }
    
    class Test {
        public static void test() {
            A a;
            String s;
            a = taint();
            // Although `a` is tainted, `a.foo()` is entirely safe.
            s = a.foo("bar");
            sink(s); // No more FP here
        }
    }

    (code-6935)
    
  • Fixed errors in matching identifiers from wildcard imports. For example, this
    update addresses the issue where the following top-level assignment:

    from pony.orm import *
    db = Database()

    was not matched by the following pattern:

    $DB = pony.orm.Database(...)

    (code-7045)

  • [Pro Interfile JS/TS] Improve taint propagation through callbacks passed to $X.map functions and similar. Previously, such callbacks needed to have a return value for taint to be properly tracked. After this fix, they do not. (js-taint)

  • Rust: Constructors will now properly match to only other constructors with
    the same names, in patterns. (saf-1099)

v1.75.0

Added
  • Pro: Semgrep can now track taint through tuple/list (un)packing intra-procedurally
    (i.e., within a single function). For example:

    t = ["ok", "taint"]
    x, y = t
    sink(x) # OK, no finding
    sink(y) # tainted, finding
    (code-6935)
  • Optional type matching is supported in the Pro engine for Python. For example,
    in Python, Optional[str], str | None, and Union[str, None] represent the
    same type but in different type expressions. The optional type match support
    enables matching between these expressions, allowing any optional type
    expression to match any other optional type expression when used with
    metavariable-type filtering. It's important to note that syntactic pattern
    matching still distinguishes between these types. (code-6939)

  • Add support for pnpm v9 (pnpm)

  • Added a new rule option, decorators_order_matters, which makes the matching of decorators and other non-keyword attributes stricter. By default, attributes are matched in any order. When this rule option is set to true, non-keyword attributes (e.g. decorators in Python) are matched in order, while keyword attributes (e.g. static, inline, etc.) are not affected.

    An example use is a rule that detects any decorator placed outside of the route() decorator in Flask, since any decorator outside of the route() decorator has no effect.
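
To see why decorator order matters in the Flask case, the sketch below mimics a registering decorator without importing Flask. `ROUTES`, `route`, and `login_required` are hypothetical stand-ins written for this illustration. The assumption is only that the registering decorator stores the function it receives and returns it unchanged.

```python
import functools

ROUTES = {}  # hypothetical stand-in for a web framework's URL map


def route(path):
    """Register the function it receives under `path` and return it unchanged."""
    def register(func):
        ROUTES[path] = func
        return func
    return register


def login_required(func):
    """Toy guard that always refuses, so its effect is easy to observe."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return "401 login required"
    return wrapper


@route("/safe")
@login_required      # applied first, so the guarded function is what gets registered
def safe_view():
    return "secret"


@login_required      # applied last, after the unguarded function was registered
@route("/unsafe")
def unsafe_view():
    return "secret"


print(ROUTES["/safe"]())    # served through the guard
print(ROUTES["/unsafe"]())  # the guard is bypassed
```

Decorators are applied bottom-up, so anything listed above the registering decorator wraps a function that the URL map never sees. An order-sensitive rule can flag exactly that layout.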

v1.74.0

Fixed
  • One part of interfile tainting was missing a constant propagation phase, which causes semgrep to miss some true positives in some cases during interfile analysis.

    This fix adds the missing constant propagation. (saf-1032)

  • Semgrep now matches YAML tags (e.g. !number in !number 42) correctly rather
    than ignoring them. (saf-1046)

  • Upgraded Semgrep's Dockerfile parser. This brings in various fixes from
    tree-sitter-dockerfile, including minimal support for heredoc templates,
    support for variables in keys of LABEL instructions, support for multiple
    parameters for ADD and COPY instructions, and tolerance for blanks after the
    backslash of a line continuation.
    As a result of supporting variables in LABEL keys, the multiple key/value
    pairs found in LABEL instructions are now treated as if they each had their own
    LABEL instruction. It allows a pattern LABEL a=b to match LABEL a=b c=d
    without the need for an ellipsis (LABEL a=b ...). Another consequence is
    that the pattern LABEL a=b c=d can no longer match LABEL c=d a=b but it
    will match a LABEL a=b instruction immediately followed by a separate
    LABEL c=d. (upgrade-dockerfile-parser)

v1.73.0

Added
  • Added new AWS validator syntax for Secrets (scrt-278)
Fixed
  • Fixed the "couldn't find metavar $MT in the match results" error, which may occur
    when we capture an FQN with the metavariable and use a metavariable-type filter on
    it. (code-7042)
  • Fixes the crash (during scan) caused by improper handling of unicode characters present in the source code. (gh-8421)
  • [Pro Engine Only] Tainted values are now tracked through instantiation of React functional components via JSX. (jsx-taint)

v1.72.0

Fixed
  • Dockerfile support: Avoid a silent parsing error that was possibly accompanied
    with a segfault when parsing Dockerfiles that lack a trailing newline
    character. (gh-10084)

  • Fixed bug that was preventing the use of metavariable-pattern with
    the aliengrep engine of the generic mode. (gh-10222)

  • Added support for function declarations on object literals in the dataflow analysis.

    For example, taint rules previously would not have matched the
    following JavaScript code, but now they do.

    let tainted = source()
    let o = {
        someFuncDecl(x) {
            sink(tainted)
        }
    }
    (saf-1001)
    
  • Osemgrep only:

    Previously, rules that have metavariable-type did not show up in the SARIF output. This change fixes that.

    Dataflow traces were also always shown in SARIF, even when --dataflow-traces was not passed. This change fixes that too. (saf-1020)

  • Fixed bug in rule parsing preventing patternless SCA rules from being validated. (saf-1030)

v1.71.0

Added
  • Pro: const-prop: Previously inter-procedural const-prop could only infer whether
    a function returned an arbitrary string constant. Now it will be able to infer
    whether a function returns a concrete constant value, e.g.:

    def bar():
      return "bar"
    
    def test():
      x = bar()
      foo(x) # now also matches pattern `foo("bar")`, previously only `foo("...")`
    (flow-61)
  • Python: const-prop: Semgrep will now recognize "..." * N expression as arbitrary
    constant string literals (thus matching the pattern "..."). (flow-75)

Changed
  • The --beta-testing-secrets-enabled option, deprecated for several months, is now removed. Use --secrets as its replacement. (gh-9987)
Fixed
  • When using semgrep --test --json, the config_missing_fixtests field in the
    JSON output now reports not only rule files containing a fix: without a
    corresponding .fixed test file, but also rule files using a fix-regex:
    without a corresponding .fixed test file. The fix: or fix-regex: can be in
    any rule in the file (not just the first rule). (fixtest)

  • Fixes matching for go struct field tags metadata.

    For example given the program:

    type Rectangle struct {
        Top    int `json:"top"`
        Left   int `json:"left"`
        Width  int `json:"width"`
        Height int `json:"height"`
    }
    

    The pattern,

    type Rectangle struct {
        ...
        $NAME $TYPE $TAGS
        ...
    }
    

    will now match each field and the $TAGS metavariable will be
    bound when used in subsequent patterns. (saf-949)

  • Matching: Patterns of statements ending in ellipsis metavariables, such as

    x = 1
    $...STMTS

    will now properly extend the match range to accommodate whatever is captured by
    the ellipsis metavariable ($...STMTS). (saf-961)

  • The SARIF output format should have the tag "security" when the "cwe"
    section is present in the rule. Moreover, duplicate tags should be
    de-duped.

    Osemgrep wasn't doing this before, but with this fix, now it does. (saf-991)

  • Fixed bug in mix.lock parser where it was possible to fail on a python None error. Added handler for arbitrary exceptions during lockfile parsing. (sc-1466)

  • Moved --historical-secrets to the "Pro Engine" option group, instead of
    "Output formats", where it was previously (in error). (scrt-570)

v1.70.0

Added
  • Added guidance for resolving API token issues in CI environments. (gh-10133)

  • The osemgrep show command supports 2 new options: dump-ast and dump-pattern.
    See osemgrep show --help for more information. (osemgrep_show)

  • Added additional output flags which allow you to write output to multiple files in multiple formats.

    For example, the command semgrep ci --text --json-output=result.json --sarif-output=result.sarif.json
    displays text output on stdout, writes the output that would be generated by passing the --json flag
    to result.json, and writes the output that would be generated by passing the --sarif flag to result.sarif.json. (saf-341)

  • Added an experimental feature for users to use osemgrep to format
    SARIF output.

    When both the flags --sarif and --use-osemgrep-sarif are specified,
    semgrep will use the ocaml implementation to format SARIF.

    This flag is experimental and can be removed any time. Users must not
    rely on it being available. (saf-978)

Changed
  • The main regex engine is now PCRE2 (was PCRE). While the syntax is mostly
    compatible, there are some minor instances where updates to rules may be
    needed, since PCRE2 is slightly more strict in some cases. For example, while
    we previously accepted [\w-.], such a pattern would now need to be written
    [\w.-] or [\w\-.] since PCRE2 rejects the first as having an invalid range. (scrt-467)
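
The two suggested rewrites describe the same character class, which the sketch below checks. Note the assumption: it uses Python's standard `re` module, which is not PCRE2, purely as a convenient way to exercise the patterns; the helper `rejected_by_stdlib` is hypothetical and only reports how the local Python regex parser treats a pattern, not how Semgrep does.

```python
import re

# Python's `re` module is not PCRE2; it is used here only to show that the two
# suggested rewrites accept the same strings.
HYPHEN_LAST = re.compile(r"[\w.-]+")      # hyphen placed last, so it is a literal
HYPHEN_ESCAPED = re.compile(r"[\w\-.]+")  # hyphen escaped, so it is a literal

sample = "my-file.name_01"
print(bool(HYPHEN_LAST.fullmatch(sample)), bool(HYPHEN_ESCAPED.fullmatch(sample)))


def rejected_by_stdlib(pattern: str) -> bool:
    """Report whether the local Python regex parser refuses to compile `pattern`."""
    try:
        re.compile(pattern)
    except re.error:
        return True
    return False


# The ambiguous form puts "-" between a class escape and another character.
print(rejected_by_stdlib(r"[\w-.]"))
```

When updating rules, placing the hyphen last in the class is usually the least intrusive change, since it avoids adding another backslash to patterns that are often already escaped inside YAML.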
Fixed
  • Semgrep LS now waits longer for users to log in (gh-10109)

  • When semgrep ci finishes scanning and uploads findings, it tells the
    app to mark the scan as completed.

    For large sets of findings, this may take a while, and marking the scan as
    completed may time out. When a scan is not marked as completed, the app
    may show that the repo is still processing, which confuses the user.

    This change increases the timeout (previously 20 minutes) to 30
    minutes. (saf-980)

  • Fix semgrep ci --oss-only when secrets product is enabled. (scrt-223)

v1.69.0

Added
  • Tracing: remove support for SEMGREP_OTEL_ENDPOINT and replace with
    --trace-endpoint <url>.
    This change is for an internal feature for debugging performance. (saf-885)
Changed
  • Passing --debug to Semgrep should now print fewer logs. We do
    not want --debug's output to be enormous, as it tends not to be useful and can
    cause problems. Note that --debug is mainly intended for Semgrep developers;
    please ask for help if needed. (gh-10044)
Fixed
  • In generic mode (default, spacegrep engine), matching a pattern that
    ends with an ellipsis now favors the longest match rather than the shortest
    match when multiple matches are possible. For example, for a given target
    program a a b, the pattern a ... b will match a b as before but
    the pattern a ... will now match the longer a a b rather than a b. (gh-10039)
  • Fixed the inter-file diff scan issue where the removal of pre-existing findings
    didn't work properly when adding a new file or renaming an existing file. (saf-897)

v1.68.0

Added
  • Scan un-changed lockfiles in diff-aware scans (gh-9899)
  • Languages: Added the QL language (used by CodeQL) to Semgrep (saf-947)
  • SwiftPM parser will now report package url and reference. (sc-1218)
  • Add support for Elixir (Mix) SCA parsing for pro engine users. (sc-1303)
Fixed
  • Output for sarif format includes dataflow traces. (gh-10004)
  • The environment variable LOG_LEVEL (as well as PYTEST_LOG_LEVEL) is
    no longer consulted by Semgrep to determine the log level. Only
    SEMGREP_LOG_LEVEL is consulted. PYTEST_SEMGREP_LOG_LEVEL is also
    consulted in the current implementation but should not be used outside of
    Semgrep's Pytest tests. This is to avoid accidentally affecting Semgrep
    when inheriting the LOG_LEVEL destined to another application. (gh-10044)
  • Fixed swiftpm parser to no longer limit the amount of found packages in manifest file. (sc-1364)
  • Fixed incorrect ecosystem being used for Elixir. Hex should be used instead of Mix. (sc-elixir)
  • Fixed the match_based_ids of lockfile-only findings to differentiate between findings in cases where one rule produces multiple findings in one lockfile (sca-mid)
  • Secrets historical scans: fixed a bug where historical scans could run on differential scans. (scrt-545)

v1.67.0

Added
  • --historical-secrets flag for running Semgrep Secrets regex rules on git
    history (requires Semgrep Secrets). This flag is not yet implemented for
    --experimental. (scrt-531)
Changed
  • Files with the .phtml extension are now treated as PHP files. (gh-10009)

  • [IMPORTANT] Logged in users running semgrep ci will now run the pro engine by default! All semgrep ci scans will run with our proprietary languages (Apex and Elixir), as well as cross-function taint within a single file, and other single file pro optimizations we have developed. This is equivalent to semgrep ci --pro-intrafile. Users will likely see improved results if they are running semgrep ci and did not already have additional configuration to enable pro analysis.

    The current default engine does not include cross-file analysis. To scan with cross-file analysis, turn on the app toggle or pass in the flag --pro. We recommend this unless you have very large repos (talk to our support to get help enabling cross-file analysis on monorepos!)

    To revert back to our OSS analysis, pass the flag --oss-only (or use --pro-languages to continue to receive our proprietary languages).

    Reminder: because we release first to our canary image, this change will only immediately affect you if you are using semgrep/semgrep:canary. If you are using semgrep/semgrep:latest, it will affect you when we bump canary to latest. (saf-845)

Fixed
  • Fixed a parsing error in Kotlin when there's a newline between the class name and the primary constructor.

    Previously, this failed to parse:

    class C
    constructor(arg:Int){}
    

    because of the newline between the class name and the constructor.

    Now it's fixed. (saf-899)

v1.66.2

Added
  • osemgrep now respects HTTP_PROXY and HTTPS_PROXY when making network requests (cdx-253)
Changed
  • [IMPORTANT] The public rollout of inter-file differential scanning has been
    temporarily reverted for further polishing of the feature. We will reintroduce
    it in a later version. (saf-268)
Fixed
  • Autofix on variable definitions should now handle the semicolon
    in Java, C++, and C#. (saf-928)

v1.66.1

Fixed
  • Autofix on variable definitions should now handle the semicolon
    in Rust, Cairo, Solidity, Dart. (autofix_vardef)
  • [IMPORTANT] We restored bash, jq, and curl in our semgrep docker image as some
    users were relying on them. We might remove them in the future, but in the
    meantime we restored the packages, and if we remove them we will announce
    it more loudly. We also created a new page giving more information
    about our policy for our docker images:
    https://semgrep.dev/docs/semgrep-ci/packages-in-semgrep-docker/ (docker_bash)
  • Fixed autofix application on lines containing multi-byte characters. (multibyte)

v1.66.0

Added
  • Added information about interfile pre-processing to --max-memory help. (gh-9932)
  • We've implemented basic support for the yield keyword in Python. The Pro
    engine now detects taint findings from taint sources returned by the yield
    keyword. (saf-281)
Changed
  • osemgrep --remote will no longer clone into a tmp folder, but instead the CWD (cdx-remote)

  • [IMPORTANT] Inter-file differential scanning is now enabled for all Pro users.

    While it may
    take longer than intra-file differential scanning, which is the current default
    for pro users, it offers deeper analysis of dataflow paths compared to
    intra-file differential scanning. Additionally, it is significantly faster
    than non-differential inter-file scanning, with scan times reduced to
    approximately 1/10 of the non-differential inter-file scan. Users who
    enable the pro engine and engage in differential PR scans on GitHub or
    GitLab may experience the impact of this update. If needed, users can
    revert to the previous intra-file differential scan behavior by configuring
    the --no-interfile-diff-scan command-line option. (saf-268)

Fixed
  • The official semgrep docker image no longer contains the
    bash, jq, and curl utilities, to reduce its attack surface. (saf-861)

v1.65.0

Changed
  • Removed the extract-mode rules experimental feature. (extract_mode)

v1.64.0

Changed
  • Removed the AST caching experimental feature (--experimental --ast-caching
    in osemgrep and -parsing_cache_dir in semgrep-core). (ast_caching)
  • Removed the Registry caching experimental feature (--experimental --registry-caching)
    in osemgrep. (registry_caching)
Fixed
  • Clean any credentials from project URL before using it, to prevent leakage. (saf-876)
  • ci: Updated logic for informational message printed when no rules are sent to
    correctly display when secrets is enabled (in addition to
    when code is). (scrt-455)
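
For the credential-cleaning entry above (saf-876), the release notes do not say how the URL is sanitized. One common way to do it, shown here as a sketch with a hypothetical helper name `strip_credentials`, is to rebuild the URL from its parsed parts without the userinfo component.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url: str) -> str:
    """Return `url` without any user:password@ prefix in its network location.

    Hypothetical helper for illustration; not Semgrep's implementation.
    IPv6 literal hosts would need their brackets restored and are not handled.
    """
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(strip_credentials("https://user:token@github.com/org/repo.git"))
print(strip_credentials("https://github.com/org/repo.git"))
```

Rebuilding from parsed components avoids string slicing around the `@` sign, which can go wrong when a password itself contains special characters.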

v1.63.0

Added
  • Dataflow: Added support for nested record patterns such as { body: { param } }
    in the LHS of an assignment. Now given { body: { param } } = tainted Semgrep
    will correctly mark param as tainted. (flow-68)
  • Matching: metavariable-regex can now match on metavariables of interpolated
    strings which use variables that have known values. (saf-865)
  • Add support for parsing Swift Package Manager manifest and lockfiles (sc-1217)
Fixed
  • Fixed taint signatures not capturing changes to parameters' fields. (flow-70)
  • Scan summary links printed after semgrep ci scans now reflect a custom SEMGREP_APP_URL, if one is set. (saf-353)

v1.62.0

Added
  • Pro: Adds support for Python constructors to taint analysis.

    If interfile naming resolves that a Python constructor is called, taint
    will now track these objects with fewer heuristics. Without interfile
    analysis, these changes have no effect on the behavior of tainting.
    The overall result is that, in the following program, the OSS analysis
    would match both calls to sink, while the interfile analysis would only
    match the second call to sink.

    class A:
        untainted = "not"
        tainted = "not"
        def __init__(self, x):
            self.tainted = x
    
    a = A("tainted")
    

v1.61.1

Added
  • Added performance metrics using OpenTelemetry for better visualization.
    Users wishing to understand the performance of their Semgrep scans or
    to help optimize Semgrep can configure the backend collector created in
    libs/tracing/unix/Tracing.ml.

    This is experimental and both the implementation and flags are likely to
    change. (ea-320)

  • Created a new environment variable SEMGREP_REPO_DISPLAY_NAME for use in semgrep CI.
    Currently, this does nothing. The goal is to provide a way to override the display
    name of a repo in the Semgrep App. (gh-8953)

  • The OCaml/C executable (semgrep-core or osemgrep) is now passed through
    the strip utility, which reduces its size by 10-25% depending on the
    platform. Contribution by Filipe Pina (@​fopina). (gh-9471)

Changed
  • "Missing plugin" errors (i.e., rules that cannot be run without --pro) will now
    be grouped and reported as a single warning. (ea-842)

v1.61.0

Compare Source

v1.60.1

Compare Source

Added
  • Rule syntax: Metavariables by the name of $_ are now anonymous, meaning that
    they do not unify within a single pattern or across patterns, and essentially
    just unconditionally specify some expression.

    For instance, the pattern foo($_, $_) may match the code foo(1, 2).

    This will change the behavior of existing rules that use the metavariable
    $_, if they rely on unification still happening. This can be fixed by simply
    giving the metavariable a real name like $A. (ea-837)

  • Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)
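The difference between a named metavariable and the anonymous `$_` described above can be sketched with a toy argument matcher. This is a simplified model written for illustration, not Semgrep's matching engine; `match_args` is a hypothetical helper and arguments are represented as plain strings.

```python
def match_args(pattern_args, code_args):
    """Match pattern arguments against concrete arguments.

    Named metavariables (e.g. "$A") must bind to the same value at every
    occurrence. "$_" is anonymous: it is never recorded, so each
    occurrence matches independently. Returns the bindings dict on
    success, or None when the match fails.
    """
    if len(pattern_args) != len(code_args):
        return None
    bindings = {}
    for pat, arg in zip(pattern_args, code_args):
        if pat == "$_":
            continue  # anonymous: matches anything, binds nothing
        if pat.startswith("$"):
            # setdefault returns the earlier binding if one exists
            if bindings.setdefault(pat, arg) != arg:
                return None  # unification failure
        elif pat != arg:
            return None  # literal mismatch
    return bindings


anonymous = match_args(["$_", "$_"], ["1", "2"])   # matches, no bindings
unified = match_args(["$A", "$A"], ["1", "2"])     # fails to unify
```

In this model `foo($_, $_)` matches `foo(1, 2)`, while `foo($A, $A)` does not, which is why renaming `$_` to a real name restores unification.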

Changed
  • Dataflow: Simplified the IL translation for Python with statements to let
    symbolic propagation assume that with foo() as x: ... entails x = foo(),
    so that e.g. Session().execute("...") matches:

    with Session() as s:
        s.execute("SELECT * from T")

    (CODE-6633)
Fixed
  • Output: Semgrep CLI no longer sometimes interpolates metavariables twice when
    the message substituted for a metavariable itself contains a valid
    metavariable to be interpolated. (ea-838)
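The double-interpolation problem described in the Output fix can be reproduced in a few lines. The sketch below is an assumption-laden illustration, not the Semgrep CLI code: `interpolate_once`, `interpolate_sequentially`, and the `METAVAR` regex are names invented for this example.

```python
import re

# Hypothetical metavariable syntax for this sketch: "$" plus an
# upper-case identifier.
METAVAR = re.compile(r"\$[A-Z_][A-Z0-9_]*")


def interpolate_once(message, bindings):
    """Substitute metavariables in a single pass.

    re.sub does not rescan the text it inserts, so a substituted value
    that looks like a metavariable is left alone.
    """
    return METAVAR.sub(lambda m: bindings.get(m.group(0), m.group(0)), message)


def interpolate_sequentially(message, bindings):
    """Buggy variant: each replacement also rewrites earlier output."""
    for name, value in bindings.items():
        message = message.replace(name, value)
    return message


# The matched code for $X happens to contain the text "$Y".
bindings = {"$X": "$Y", "$Y": "secret"}
single_pass = interpolate_once("found $X", bindings)
double_pass = interpolate_sequentially("found $X", bindings)
```

The single-pass version yields `found $Y`, whereas the sequential version interpolates twice and yields `found secret`.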

v1.60.0

Compare Source

1.60.0 - 2024-02-08

Added
  • Rule syntax: Metavariables by the name of $_ are now anonymous, meaning that
    they do not unify within a single pattern or across patterns, and essentially
    just unconditionally specify some expression.

    For instance, the pattern foo($_, $_) may match the code foo(1, 2).

    This will change the behavior of existing rules that use the metavariable
    $_, if they rely on unification still happening. This can be fixed by simply
    giving the metavariable a real name like $A. (ea-837)

  • Added infrastructure for semgrep supply chain in semgrep-core. Not fully functional yet. (ssc-port)

Fixed
  • Output: Semgrep CLI no longer sometimes interpolates metavariables twice when
    the message substituted for a metavariable itself contains a valid
    metavariable to be interpolated. (ea-838)

v1.59.1

Compare Source

Added
  • taint-mode: Pro: Semgrep can now track taint via static class fields and global
    variables, such as in the following example:

    static char* x;
    
    void foo() {
        x = "tainted";
    }
    
    void bar() {
        sink(x);
    }
    
    void main() {
        foo();
        bar();
    }
    (pa-3378)
Fixed
  • Pro: Make inter-file analysis more tolerant to small bugs, resorting to graceful
    degradation and continuing with the scan, rather than crashing. (pa-3387)

v1.59.0

Compare Source

Added
  • Swift: Now supports typed metavariables, such as ($X : ty). (pa-3370)
Changed
  • Add Elixir to Pro languages list in help information. (gh-9609)

  • Removed sg alias to avoid naming conflicts
    with the shadow-utils sg command for Linux systems. (gh-9642)

  • Prevent unnecessary computation when running scans without verbose logging enabled (gh-9661)

  • Deprecated the option taint_match_on, introduced in 1.51.0; it is being renamed
    to taint_focus_on. Note that taint_match_on was experimental, and
    taint_focus_on is experimental too. The option taint_match_on will continue
    to work, but it will be completely removed at some point after 1.63.0. (pa-3272)

  • Added information on product-related flags to help output, especially for Semgrep Secrets. (pa-3383)

  • taint-mode: Improve inference of best matches for exact-sources, exact-sanitizers,
    and sinks. Now we also avoid FPs in cases such as:

    dangerouslySetInnerHTML = {
      // ok:
      {__html: props ? DOMPurify.sanitize(props.text) : ''} // no more FPs!
    }
    

    where props is tainted and the sink specification is:

    patterns:
      - pattern: |
         dangerouslySetInnerHTML={{__html: $X}}
      - focus-metavariable: $X
    

    Previously Semgrep wrongly considered the individual subexpressions of the
    conditional as sinks, including the props in props ? ..., thus producing a
    false positive. Now it will only consider the conditional expression as a whole
    as the sink. (rules-6457)

  • Removed an internal legacy syntax for secrets rules (mode: semgrep_internal_postprocessor). (scrt-320)

Fixed
  • Autofix: Fixes that span multiple lines will now try to align
    inserted fixed lines with each other. (gh-3070)

  • Matching: Try blocks with catch clauses can now match try blocks that have
    extraneous catch clauses, as long as it matches a subset. For instance,
    the pattern

    try:
        ...
    catch A:
        ...

    can now match

    try:
        ...
    catch A:
        ...
    catch B:
        ...

    (gh-3362)

  • Previously, some people got the error:

    Encountered error when running rules: Other syntax error at line NO FILE INFO YET:-1:
    Invalid_argument: String.sub / Bytes.sub
    

    Semgrep should now report this error properly with a file name and line number and
    handle it gracefully. (gh-9628)

  • Fixed Dockerfile parsing bug where multiline comments were parsed incorrectly. (gh-9628-2)

  • The language server will now properly respect findings that have been ignored via the app (lsp-fingerprints)

  • taint-mode: Pro: Semgrep will now propagate taint via instance variables when
    calling methods within the same class, making this example work:

    class Test {
    
      private String str;
    
      public setStr() {
        this.str = "tainted";
      }
    
      public useStr() {
        //ruleid: test
        sink(this.str);
      }
    
      public test() {
        setStr();
        useStr();
      }
    
    }
    (pa-3372)
  • taint-mode: Pro: Taint traces will now reflect when taint is propagated via
    class fields, such as in this example:

    class Test {
    
      private String str;
    
      public setStr() {
        this.str = "tainted";
      }
    
      public useStr() {
        //ruleid: test
        sink(this.str);
      }
    
      public test() {
        setStr();
        useStr();
      }
    
    }

    Previously Semgrep would report that taint originated at this.str = "tainted",
    but it would not tell you how the control flow got there. Now the taint trace
    will indicate that we get there by calling setStr() inside test(). (pa-3373)

  • Addressed an issue related to matching top-level identifiers with meta-variable
    qualified patterns in C++, such as matching ::foo with ::$A::$B. This problem
    was specific to Pro Engine-enabled scans. (pa-3375)

v1.58.0

Compare Source

Added
  • Added a severity icon (e.g. "❯❯❱") and corresponding color to our CLI text output
    for findings of known severity. (grow-97)

  • Naming has better support for if statements. In particular, for
    languages with block scope, shadowed variables inside if-else blocks
    that are tainted won't "leak" outside of those blocks.

    This helps with features related to naming, such as tainting.

    For example, previously in Go, the x in sink(x) would be reported
    as tainted, even though the x that is tainted is the
    one inside the scope of the if block.

    func f() {
      x := "safe";
      if (c) {
        x := "tainted";
      }
      // x should not be tainted
      sink(x);
    }

    This is now fixed. (pa-3185)

  • OSemgrep can now scan remote git repositories. Pass --experimental --pro --remote http[s]://<website>/.../<repo>.git to use this feature (pa-remote)
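The block-scope naming fix described above (pa-3185) can be pictured with a small scope-chain model. This is a toy sketch under the assumption that each block gets its own scope with a link to its parent; the `Scope` class is hypothetical and is not taken from Semgrep.

```python
class Scope:
    """A lexical scope mapping names to a tainted flag, with a parent link."""

    def __init__(self, parent=None):
        self.parent = parent
        self.vars = {}

    def declare(self, name, tainted):
        # Models Go's `x := ...`: a new variable in the current scope only.
        self.vars[name] = tainted

    def is_tainted(self, name):
        # Walk outwards until the innermost declaration of `name` is found.
        scope = self
        while scope is not None:
            if name in scope.vars:
                return scope.vars[name]
            scope = scope.parent
        raise NameError(name)


func_scope = Scope()
func_scope.declare("x", False)          # x := "safe"
if_scope = Scope(parent=func_scope)
if_scope.declare("x", True)             # x := "tainted" inside the if block

inside_if = if_scope.is_tainted("x")    # the shadowing x
at_sink = func_scope.is_tainted("x")    # the x seen by sink(x)
```

Because the inner `x` lives only in `if_scope`, the lookup performed for `sink(x)` in the function scope resolves to the outer, untainted variable.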

Changed
  • Rules stored under a "hidden" directory (e.g., dir/.hidden/myrule.yml)
    are now processed when using --config .
    We used to skip dot files under dir inconsistently: rules/.semgrep.yml and
    src/.semgrep/bad_pattern.yml were kept, while path/.github/foo.yml and
    ./.pre-commit-config.yaml were not, and so on. This was mainly because
    we used to fetch rules from ~/.semgrep/ implicitly when --config
    was not given, but this feature was removed, so now we can keep it simple. (hidden_rules)
  • Removed support for writing rules using jsonnet. This feature
    will be restored once we finish the port to OCaml of the semgrep CLI. (jsonnet)
  • The primitive object construct expression will no longer match the new
    expression pattern. For example, the pattern new $TYPE will now only match
    new int, not int(). (pa-3336)
  • The placement new expression will no longer match the new expression without
    placement. For instance, the pattern new ($STORAGE) $TYPE will now only match
    new (storage) int and not new int. (pa-3338)
Fixed
  • Java: You can now use metavariable ellipses properly in
    function arguments, as statements, and as expressions.

    For instance, you may write the pattern

    public $F($...ARGS) { ... }
    (gh-9260)
    
  • Nosemgrep: Fixed a bug where Semgrep would err upon reading a nosemgrep
    comment with multiple rule IDs. (gh-9463)

  • Fixed bugs in gitignore/semgrepignore globbing implementation affecting --experimental. (gh-9544)

  • Fixed rule IDs, descriptions, findings, and autofix text not wrapping as expected.
    Use a newline instead of a horizontal separator for findings that share a file
    but come from different rules, per the design spec. (grow-97)

  • Keep track of the origin of return; statements in the dataflow IL so that
    recently added (Pro-only) at-exit: true sinks work properly on them. (pa-3337)

  • C++: Improve translation of delete expressions to the dataflow IL so that
    recently added (Pro-only) at-exit: true sinks work on them. Previously
    delete expression at "exit" positions were not being properly recognized
    as such. (pa-3339)

  • cli: fix python runtime error with 0 width wrapped printing (pa-3366)

  • Fixed a bug where Gemfile.lock files with multiple GEM sections
    would not be parsed correctly. (sc-1230)

v1.57.0

Compare Source

1.57.0 - 2024-01-18

Added
  • Added a severity icon (e.g. "❯❯❱") and corresponding color to our CLI text output
    for findings of known severity. (grow-97)

  • Naming has better support for if statements. In particular, for
    languages with block scope, shadowed variables inside if-else blocks
    that are tainted won't "leak" outside of those blocks.

    This helps with features related to naming, such as tainting.

    For example, previously in Go, the x in sink(x) would be reported
    as tainted, even though the x that is tainted is the
    one inside the scope of the if block.

    func f() {
      x := "safe";
      if (c) {
        x := "tainted";
      }
      // x should not be tainted
      sink(x);
    }

    This is now fixed. (pa-3185)

  • OSemgrep can now scan remote git repositories. Pass --experimental --pro --remote http[s]://<website>/.../<repo>.git to use this feature (pa-remote)

Changed
  • Rules stored under a "hidden" directory (e.g., dir/.hidden/myrule.yml)
    are now processed when using --config .
    We used to skip dot files under dir inconsistently: rules/.semgrep.yml and
    src/.semgrep/bad_pattern.yml were kept, while path/.github/foo.yml and
    ./.pre-commit-config.yaml were not, and so on. This was mainly because
    we used to fetch rules from ~/.semgrep/ implicitly when --config
    was not given, but this feature was removed, so now we can keep it simple. (hidden_rules)
  • The primitive object construct expression will no longer match the new
    expression pattern. For example, the pattern new $TYPE will now only match
    new int, not int(). (pa-3336)
  • The placement new expression will no longer match the new expression without
    placement. For instance, the pattern new ($STORAGE) $TYPE will now only match
    new (storage) int and not new int. (pa-3338)
Fixed
  • Java: You can now use metavariable ellipses properly in
    function arguments, as statements, and as expressions.

    For instance, you may write the pattern

    public $F($...ARGS) { ... }
    (gh-9260)
    
  • Fixed bugs in gitignore/semgrepignore globbing implementation affecting --experimental. (gh-9544)

  • Fixed rule IDs, descriptions, findings, and autofix text not wrapping as expected.
    Use a newline instead of a horizontal separator for findings that share a file
    but come from different rules, per the design spec. (grow-97)

  • Keep track of the origin of return; statements in the dataflow IL so that
    recently added (Pro-only) at-exit: true sinks work properly on them. (pa-3337)

  • C++: Improve translation of delete expressions to the dataflow IL so that
    recently added (Pro-only) at-exit: true sinks work on them. Previously
    delete expression at "exit" positions were not being properly recognized
    as such. (pa-3339)

  • Fixed a bug where Gemfile.lock files with multiple GEM sections
    would not be parsed correctly. (sc-1230)

v1.56.0

Compare Source

Added
  • Added a new field that breaks down the number of findings per product
    in the metrics that are sent out by the CLI. This will help Semgrep
    understand users better. (pa-3312)

v1.55.2

Compare Source

Fixed
  • taint-mode: Semgrep was missing some sources occurring inside type expressions,
    for example:

    char *p = new char[source(x)];
    sink(x);

    Now, if x is tainted by side-effect, Semgrep will check x inside the type
    expression char[...] and record it as tainting, and generate a finding for
    sink(x). (pa-3313)

  • taint-mode: C/C++: Sanitization by side-effect was not working correctly for
    ptr->fld l-values. In particular, if ptr is tainted, and then ptr->fld is
    sanitized, Semgrep will now correctly consider ptr->fld as clean. (pa-3328)

v1.55.1

Compare Source

Fixed
  • Honor temporary folder specified via the TMPDIR environment variable (or
    equivalent on Windows) in some instances where it used to be hardcoded as
    /tmp. (gh-9534)
  • Fix pipfile manifest parser error (sc-1084)
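The TMPDIR fix above amounts to consulting the environment before falling back to a fixed path. A minimal sketch of that lookup follows; `temp_dir` and its `default` parameter are hypothetical names for this example and do not come from the Semgrep code base.

```python
import os


def temp_dir(environ=os.environ, default="/tmp"):
    """Return the directory for temporary files, preferring TMPDIR when set.

    An empty TMPDIR is treated the same as an unset one.
    """
    return environ.get("TMPDIR") or default


custom = temp_dir({"TMPDIR": "/var/tmp"})
fallback = temp_dir({})
```

In Python code, the standard library's `tempfile.gettempdir()` already performs a similar lookup (it checks `TMPDIR`, `TEMP`, and `TMP` in that order), so it is usually preferable to a hardcoded `/tmp`.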

v1.55.0

Compare Source

Changed
  • The rule option commutative_compop has been renamed to symmetric_eq. The old
    name is deprecated and will be removed after the 1.60.0 release. (gh-9496)

v1.54.3

Compare Source

Added
  • Pro only: taint-mode: Added experimental at-exit: true option for sinks, that
    makes a sink spec only apply on the "exit" instructions/statements of a function.
    That is, the instructions after which the control-flow exits the function. This is
    useful for writing rules to find "leaks", such as checking that file descriptors
    are being closed within the same function where they were opened.

    For example, given this taint rule:

    pattern-sources:
      - by-side-effect: true
        patterns:
          - pattern: $FILE = open(...)
          - focus-metavariable: $FILE
    pattern-sanitizers:
      - by-side-effect: true
        patterns:
          - pattern: $FILE.close(...)
          - focus-metavariable: $FILE
    pattern-sinks:
      - at-exit: true
        pattern: |
          def $FUN(...):
            ...

    Semgrep will report a finding in the code below since at print(content), after
    which the control flow reaches the exit of the function, the file has not yet
    been closed:

    def test():
        file = open("test.txt")
        content = file.read()
        print(content) # FINDING
    (pa-3266)
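For contrast with the flagged function above, here is a variant that closes the file before control flow leaves the function. Given the `$FILE.close(...)` sanitizer in the rule, one would expect no finding here, though that expectation has not been verified against Semgrep; `read_and_close` is a hypothetical name used only for this sketch.

```python
def read_and_close(path):
    """Read a file and close it before the function exits."""
    file = open(path)
    try:
        content = file.read()
    finally:
        # Corresponds to the `$FILE.close(...)` sanitizer pattern, so the
        # handle is no longer "tainted" when the function returns.
        file.close()
    return content
```

In everyday Python a `with open(path) as file:` block gives the same guarantee more idiomatically; whether the rule as written recognizes that form is not stated in the release note.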

v1.54.2

Compare Source

Added
  • metrics: added more granular information about pro engine configurations to
    help differentiate scans using different engine capabilities. For instance,
    maintainers are now able to distinguish intraprocedural scans without secrets
    validation from intraprocedural scans with secrets validation. This allows us
    to have a better understanding of usage and more accurately identify
    product-specific issues (e.g., to see if something only affects secrets scans). (ea-297)
Fixed
  • Revise error message when running semgrep ci without being logged in to clarify that --config is used with semgrep scan. (gh-9485)

v1.54.1

Compare Source

No significant changes.

v1.54.0

Compare Source

Added
  • Pro only: taint-mode: In a function/method call, it is now possible to arbitrarily
    propagate taint between arguments and the callee. For example in C, one can
    propagate taint from the second argument of `strca

Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot force-pushed the renovate/semgrep-1.x-lockfile branch from 85ea499 to 52dfa1c Compare August 24, 2024 22:22
@renovate renovate bot force-pushed the renovate/semgrep-1.x-lockfile branch from 52dfa1c to 59495ce Compare August 24, 2024 22:28
@Zebradil Zebradil merged commit 8e327c6 into master Aug 25, 2024
6 checks passed
@Zebradil Zebradil deleted the renovate/semgrep-1.x-lockfile branch August 25, 2024 12:11