Continuous Integration #
CI/CD integration #
In this chapter, we will walk you through the process of integrating Semgrep into your GitHub repository as part of your continuous integration (CI) and continuous deployment (CD) pipeline.
Recommended Semgrep GitHub integration approach #
We recommend integrating Semgrep with GitHub Actions using the following approach:
- Schedule a full Semgrep scan on the main branch with a broad set of Semgrep rules (e.g., `p/default`).
- Implement a diff-aware scanning approach for pull requests, using a fine-tuned set of rules that yield high confidence and true positive results.
- Once your Semgrep implementation is mature, configure Semgrep to block the PR pipeline if there are unresolved Semgrep findings.
Understanding Semgrep CI configuration options #
Familiarize yourself with the available environment variables and their default values by reviewing the Configuration reference. The following are key points to note:
- Semgrep checks for new versions by default, as controlled by the `SEMGREP_ENABLE_VERSION_CHECK` variable.
- By default, Semgrep sets a five-minute timeout for each individual Git command that Semgrep runs (`SEMGREP_GIT_COMMAND_TIMEOUT`).
- Semgrep attempts to scan each file with a 30-second timeout (`SEMGREP_TIMEOUT`) and retries up to three times (`--timeout-threshold`).
- The `SEMGREP_RULES` environment variable defines the rules used by Semgrep. You can specify multiple rule sources by separating them with a space.
- By default, the CI process fails if findings are detected but passes if internal errors occur. For more information, see Passing or failing the CI job.
- See the example job that uploads findings to GitHub Advanced Security Dashboard.
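The options above can be set directly on the step that runs `semgrep ci`. The following is a minimal sketch, not a recommended configuration. The job name `semgrep-configured` and the output file name `semgrep.sarif` are hypothetical, the values are illustrative (assuming timeouts are given in seconds and that `0` disables the version check), and the upload step assumes the `github/codeql-action/upload-sarif` action and that code scanning is available for the repository.

```yaml
jobs:
  semgrep-configured:            # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write     # needed to upload SARIF results to code scanning
    container:
      image: returntocorp/semgrep
    steps:
      - uses: actions/checkout@v4
      # Write findings in SARIF format; the output file name is an assumption.
      - run: semgrep ci --sarif > semgrep.sarif
        env:
          # Multiple rule sources, separated by a space.
          SEMGREP_RULES: p/default p/javascript
          # Assumed to disable the new-version check when set to 0.
          SEMGREP_ENABLE_VERSION_CHECK: "0"
          # Timeout for each Git command, assumed to be in seconds (five minutes here).
          SEMGREP_GIT_COMMAND_TIMEOUT: "300"
          # Per-file scan timeout, assumed to be in seconds.
          SEMGREP_TIMEOUT: "30"
      # Upload the SARIF file even if the previous step failed because of findings.
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: semgrep.sarif
```

Because the scan step fails when findings are detected, the `if: always()` condition is what lets the upload step still run in that case.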
Adding custom Semgrep rules to CI/CD #
When you want to use your own custom rules in addition to the standard rulesets (such as `p/default` or `p/javascript`) passed to `SEMGREP_RULES`, follow the steps below:
- If your custom Semgrep rules directory is in the same repository as the scanned code, pass the directory path in the `SEMGREP_RULES` variable (e.g., `SEMGREP_RULES: p/default custom-semgrep-rules-dir/`).
- If your custom Semgrep rules are in another private repository, do the following:
  a. Generate an access token for the repository with Semgrep rules. Select the fewest scopes necessary (e.g., a fine-grained token with read-only access to that repository's contents).
  b. Add the generated access token as a secret to the repository where the workflow is run.
  c. Add an `actions/checkout` step to the job, after the main source code checkout, with:
     - The `repository` name
     - The personal access `token` (PAT) used to fetch the repository
     - The relative `path` in which to place the repository
  d. Pass the path to the directory with custom Semgrep rules in the `SEMGREP_RULES` environment variable.
  If your repository with custom rules is publicly available, skip the steps that create the PAT and do not pass the `token` in the checkout step.
For example:
```yaml
# Set up an environment variable containing the name of the private repository with custom Semgrep rules
env:
  SEMGREP_PRIVATE_RULES_REPO: semgrep-private-rules
steps:
  # Main checkout of the repository source code
  - name: Checkout main repository
    uses: actions/checkout@v4
  # Checkout of the repository with custom Semgrep rules
  - name: Checkout private custom Semgrep rules
    uses: actions/checkout@v4
    with:
      repository: ${{ github.repository_owner }}/${{ env.SEMGREP_PRIVATE_RULES_REPO }} # organization-name/semgrep-private-rules
      token: ${{ secrets.SEMGREP_RULES_TOKEN }} # Configured PAT
      path: ${{ env.SEMGREP_PRIVATE_RULES_REPO }} # Relative path to place the repository
  # ...
  - run: semgrep ci
    env:
      # Pass the directory with the checked-out Semgrep rules repository
      SEMGREP_RULES: ${{ env.SEMGREP_PRIVATE_RULES_REPO }}
```
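For the same-repository case, a rule file placed in the custom rules directory might look like the sketch below. The file name `custom-semgrep-rules-dir/no-eval.yml`, the rule ID, and the message are hypothetical and serve only to illustrate the shape of a rule file; they are not part of the setup described here.

```yaml
# custom-semgrep-rules-dir/no-eval.yml (hypothetical file name)
rules:
  - id: example-no-eval          # hypothetical rule ID
    languages: [javascript]
    severity: WARNING
    message: "Avoid eval(); it executes arbitrary strings as code."
    pattern: eval(...)
```

With `SEMGREP_RULES: p/default custom-semgrep-rules-dir/`, Semgrep would load this rule alongside the registry ruleset.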
GitHub integration steps #
Follow these steps to integrate Semgrep with your GitHub repository:
- Create a `semgrep.yml` file in the `.github/workflows` directory of the repository you want to scan.
- Copy the code snippet below into the `semgrep.yml` file. This workflow is based on two jobs:
  - The first job:
    - Runs on a schedule (once per month).
    - Runs when a pull request is merged.
    - Runs when there is a direct push to the main/master branch.
    - Uses the broad `p/default` Semgrep ruleset.
  - The second job:
    - Runs specifically for pull requests.
    - Uses multiple security-related rulesets.
```yaml
# Define the name of this GitHub Actions workflow.
name: Semgrep
on:
  # Run the workflow on pull_request events for diff-aware scanning.
  pull_request: {}
  # Run the workflow on push events to mainline branches to report all findings.
  push:
    branches: ["master", "main"]
  # Schedule the workflow to run periodically using cron syntax.
  schedule:
    - cron: '0 0 1 * *' # Schedule Semgrep to run once per month (at 00:00 on day-of-month 1).
# Define the jobs that run as part of this workflow.
jobs:
  # Define the first job for scheduled scanning and mainline branch scanning.
  semgrep-schedule:
    # Define the conditions for running this job. Run on schedule, push to master/main, or merged PR.
    # Skip any PR created by Dependabot to avoid permission issues.
    if: ((github.event_name == 'schedule' || github.event_name == 'push' || github.event.pull_request.merged == true)
      && github.actor != 'dependabot[bot]')
    # Name this GitHub Actions job.
    name: Semgrep default scan
    # Define the environment in which the job runs.
    runs-on: ubuntu-latest
    container:
      # Use a Docker image with Semgrep pre-installed.
      image: returntocorp/semgrep
    # Set up an env variable - the name of the (private) repository with custom Semgrep rules
    # env:
    #   SEMGREP_PRIVATE_RULES_REPO: semgrep-private-rules
    steps:
      # Use the GitHub Actions Checkout step to fetch the project source code.
      - name: Checkout main repository
        uses: actions/checkout@v4
      # In case you have a (private) repository with custom Semgrep rules:
      # - name: Checkout custom Semgrep rules
      #   uses: actions/checkout@v4
      #   with:
      #     repository: ${{ github.repository_owner }}/${{ env.SEMGREP_PRIVATE_RULES_REPO }}
      #     token: ${{ secrets.SEMGREP_RULES_TOKEN }} # If the repository is private
      #     path: ${{ env.SEMGREP_PRIVATE_RULES_REPO }}
      # Execute the "semgrep ci" command within the Semgrep Docker container.
      - run: semgrep ci
        env:
          # Set the SEMGREP_RULES environment variable to specify which rules Semgrep should use.
          # Use common security-related rulesets for this job (starting with `p/`)
          # or use a directory with your custom rules from the current repository (such as `semgrep-rules/`).
          SEMGREP_RULES: >
            p/default
          # If you have a directory in the current repo with your custom rules:
          #   semgrep-rules/
          # Pass the directory with the checked-out Semgrep rules repository
          #   ${{ env.SEMGREP_PRIVATE_RULES_REPO }}
  # Define the second job for scanning pull requests.
  semgrep-pr:
    # Define the conditions for running this job. Run only within Pull Requests, excluding Dependabot PRs.
    if: (github.event_name == 'pull_request' && github.actor != 'dependabot[bot]')
    # Name this GitHub Actions job.
    name: Semgrep PR scan
    # Define the environment in which the job runs.
    runs-on: ubuntu-latest
    container:
      # Use a Docker image with Semgrep pre-installed.
      image: returntocorp/semgrep
    steps:
      # Fetch project source with GitHub Actions Checkout.
      - uses: actions/checkout@v4
      # Execute the "semgrep ci" command within the Semgrep Docker container.
      - run: semgrep ci
        env:
          # Set the SEMGREP_RULES environment variable to specify which rules Semgrep should use.
          # Use common security-related rulesets for this job (starting with `p/`)
          # or use a directory with your custom rules from the current repository (such as `semgrep-rules/`).
          SEMGREP_RULES: >
            p/cwe-top-25
            p/owasp-top-ten
            p/r2c-security-audit
            p/javascript
            p/trailofbits
          # If you have a directory in the current repo with your custom rules:
          #   semgrep-rules/
```
This configuration ensures that your codebase is scanned regularly for potential issues and that new code introduced through pull requests is thoroughly checked for security vulnerabilities.