scorecard/docs/checks.md
Chris McGehee 76105194da
📖 Adding missing documentation for Token-Permissions (#1656)
* Adding missing documentation for Token-Permissions

* Make documentation for `actions` more accurate

Co-authored-by: Naveen <172697+naveensrinivasan@users.noreply.github.com>
Co-authored-by: laurentsimon <64505099+laurentsimon@users.noreply.github.com>
2022-02-25 22:47:11 +00:00

Check Documentation

This page describes each Scorecard check in detail, including scoring criteria, remediation steps to improve the score, and an explanation of the risks associated with a low score. The checks are continually changing and we welcome community feedback. If you have ideas for additions or new detection techniques, please contribute!

Binary-Artifacts

Risk: High (non-reviewable code)

This check determines whether the project has generated executable (binary) artifacts in the source repository.

Including generated executables in the source repository increases user risk. Many programming language systems can generate executables from source code (e.g., C/C++ generated machine code, Java .class files, Python .pyc files, and minified JavaScript). Users will often directly use executables if they are included in the source repository, leading to many dangerous behaviors.

Problems with generated executable (binary) artifacts:

  • Binary artifacts cannot be reviewed, allowing possible obsolete or maliciously subverted executables. Reviews generally review source code, not executables, since it's difficult to audit executables to ensure that they correspond to the source code. Over time the included executables might not correspond to the source code.
  • Generated executables allow the executable generation process to atrophy, which can lead to an inability to create working executables. These problems can be countered with verified reproducible builds, but it's easier to implement verified reproducible builds when executables are not included in the source repository (since the executable generation process is less likely to have atrophied).

Allowed by Scorecards:

  • Files in the source repository that are simultaneously reviewable source code and executables, since these are reviewable. (Some interpretive systems, such as many operating system shells, don't have a mechanism for storing generated executables that are different from the source file.)
  • Source code in the source repository generated by other tools (e.g., by bison, yacc, flex, and lex). There are potential downsides to generated source code, but generated source code tends to be much easier to review and thus presents a lower risk. Generated source code is also often difficult for external tools to detect.
  • Generated documentation in source repositories. Generated documentation is intended for use by humans (not computers) who can evaluate the context. Thus, generated documentation doesn't pose the same level of risk.

Remediation steps

  • Remove the generated executable artifacts from the repository.
  • Build from source.

Branch-Protection

Risk: High (vulnerable to intentional malicious code injection)

This check determines whether a project's default and release branches are protected with GitHub's branch protection settings. Branch protection allows maintainers to define rules that enforce certain workflows for branches, such as requiring review or passing certain status checks before acceptance into a main branch, or preventing rewriting of public history.

Note: The following settings queried by the Branch-Protection check require an admin token: DismissStaleReviews, EnforceAdmin, and StrictStatusCheck. If the provided token does not have admin access, the check will query the branch settings accessible to non-admins and provide results based only on these settings. Even so, we recommend using a non-admin token, which provides a thorough enough result to meet most user needs.

Different types of branch protection protect against different risks:

  • Require code review: requires at least one reviewer, which greatly reduces the risk that a compromised contributor can inject malicious code. Review also increases the likelihood that an unintentional vulnerability in a contribution will be detected and fixed before the change is accepted.

  • Prevent force push: prevents use of the --force option when pushing to public branches, which overwrites code irrevocably. This protection prevents the rewriting of public history without external notice.

  • Require status checks: ensures that all required CI tests are met before a change is accepted.

Although requiring code review can greatly reduce the chance that unintentional or malicious code enters the "main" branch, it is not feasible for all projects, such as those that don't have many active participants. For more discussion, see Code Reviews.

Additionally, in some cases these rules will need to be suspended. For example, if a past commit includes illegal content such as child pornography, it may be necessary to use a force push to rewrite the history rather than simply hide the commit.

This test has tiered scoring. Each tier must be fully satisfied to achieve points at the next tier. For example, if you fulfill the Tier 3 checks but do not fulfill all the Tier 2 checks, you will not receive any points for Tier 3.

Note: If Scorecard is run without an administrative access token, the requirements that specify “For administrators” are ignored.

Tier 1 Requirements (3/10 points):

  • Prevent force push
  • Prevent branch deletion
  • For administrators: Include administrator for review

Tier 2 Requirements (6/10 points):

  • Required reviewers >=1
  • For administrators: Strict status checks (require branches to be up-to-date before merging)

Tier 3 Requirements (8/10 points):

  • Status checks defined

Tier 4 Requirements (9/10 points):

  • Required reviewers >= 2

Tier 5 Requirements (10/10 points):

  • For administrators: Dismiss stale reviews
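
The tier rules above can be sketched as a small function. This is only an illustration of the "each tier must be fully satisfied" rule as described on this page, not Scorecard's implementation: the `BranchSettings` type, its field names, and `branch_protection_score` are hypothetical, and any partial credit inside a tier is ignored.

```python
from dataclasses import dataclass


@dataclass
class BranchSettings:
    """Hypothetical container for the settings named in the tiers above."""
    allow_force_push: bool
    allow_deletion: bool
    enforce_admins: bool
    required_reviewers: int
    strict_status_checks: bool
    status_checks_defined: bool
    dismiss_stale_reviews: bool


def branch_protection_score(s: BranchSettings, admin_token: bool) -> int:
    """Return the points of the highest tier reached without skipping a tier."""

    def admin_only(ok: bool) -> bool:
        # "For administrators" requirements are ignored without an admin token.
        return ok if admin_token else True

    tiers = [
        (3, not s.allow_force_push and not s.allow_deletion
            and admin_only(s.enforce_admins)),
        (6, s.required_reviewers >= 1 and admin_only(s.strict_status_checks)),
        (8, s.status_checks_defined),
        (9, s.required_reviewers >= 2),
        (10, admin_only(s.dismiss_stale_reviews)),
    ]
    score = 0
    for points, satisfied in tiers:
        if not satisfied:
            break  # a failed tier blocks every tier above it
        score = points
    return score
```

With this reading, a branch that has two required reviewers and defined status checks but no strict status checks stops at Tier 1 when queried with an admin token, because the Tier 2 administrator requirement fails.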

Remediation steps

  • Enable branch protection settings in your source hosting provider to avoid force pushes or deletion of your important branches.
  • For GitHub, check out the steps here.

CI-Tests

Risk: Low (possible unknown vulnerabilities)

This check tries to determine if the project runs tests before pull requests are merged. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

Running tests helps developers catch mistakes early on, which can reduce the number of vulnerabilities that find their way into a project.

The check works by looking for a set of CI-system names in GitHub CheckRuns and Statuses among the recent commits (~30). A CI-system is considered well-known if its name contains any of the following: appveyor, buildkite, circleci, e2e, github-actions, jenkins, mergeable, test, travis-ci.
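
As a rough sketch of the name matching described above, the following function tests a CheckRun or Status name against the listed fragments. The function name and the case-insensitive comparison are assumptions made for illustration.

```python
# Name fragments listed in the paragraph above.
CI_NAME_FRAGMENTS = (
    "appveyor", "buildkite", "circleci", "e2e", "github-actions",
    "jenkins", "mergeable", "test", "travis-ci",
)


def is_known_ci(name: str) -> bool:
    """Return True if a check or status name contains a known CI fragment."""
    lowered = name.lower()  # assumption: matching ignores case
    return any(fragment in lowered for fragment in CI_NAME_FRAGMENTS)
```

Because "test" is one of the fragments, a status named `unit-tests` would match, while one named `lint` would not.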

Note: A project that fulfills this criterion with other tools may still receive a low score on this test. There are many ways to implement CI testing, and it is challenging for an automated tool like Scorecard to detect them all. A low score is therefore not a definitive indication that the project is at risk.

If a project's system was not detected and you think it should be, please open an issue in the scorecard project.

Remediation steps

  • Check in scripts that run all the tests in your repository.
  • Integrate those scripts with a CI/CD platform that runs them on every pull request (e.g., GitHub Actions or Prow if the project is hosted on GitHub).

CII-Best-Practices

Risk: Low (possibly not following security best practices)

This check determines whether the project has earned a CII Best Practices Badge, which indicates that the project uses a set of security-focused best development practices for open source software. The check uses the URL for the Git repo and the CII API.

The CII Best Practices badge has 3 tiers: passing, silver, and gold. We give full credit to projects that meet the passing criteria, which is a significant achievement for many projects. Lower scores represent a project that is at least working to achieve a badge, with increasingly more points awarded as more criteria are met.

To earn the passing badge, the project MUST:

  • publish the process for reporting vulnerabilities on the project site
  • provide a working build system that can automatically rebuild the software from source code (where applicable)
  • have a general policy that tests will be added to an automated test suite when major new functionality is added
  • meet various cryptography criteria where applicable
  • have at least one primary developer who knows how to design secure software
  • have at least one primary developer who knows of common kinds of errors that lead to vulnerabilities in this kind of software (and at least one method to counter or mitigate each of them)
  • apply at least one static code analysis tool (beyond compiler warnings and "safe" language modes) to any proposed major production release.

Some of these criteria overlap with other Scorecards checks.

Remediation steps

Code-Review

Risk: High (unintentional vulnerabilities or possible injection of malicious code)

This check determines whether the project requires code review before pull requests (merge requests) are merged.

Reviews detect various unintentional problems, including vulnerabilities that can be fixed immediately before they are merged, which improves the quality of the code. Reviews may also detect or deter an attacker trying to insert malicious code (either as a malicious contributor or as an attacker who has subverted a contributor's account), because a reviewer might detect the subversion.

The check first tries to detect whether Branch-Protection is enabled on the default branch with at least one required reviewer. If this fails, the check determines whether the most recent (~30) commits have a GitHub-approved review or if the merger is different from the committer (implicit review). It also performs a similar check for reviews using Prow (labels "lgtm" or "approved") and Gerrit ("Reviewed-on" and "Reviewed-by").
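
The per-commit fallback described above might look roughly like the sketch below. The dictionary keys (`github_approved`, `merger`, `committer`, `labels`, `message`) are hypothetical names chosen for illustration, and requiring both Gerrit trailers is one reading of the text.

```python
def commit_is_reviewed(commit: dict) -> bool:
    """Apply the review signals from the paragraph above to one commit."""
    # Explicit approval recorded by GitHub.
    if commit.get("github_approved"):
        return True
    # Implicit review: someone other than the committer merged the change.
    merger, committer = commit.get("merger"), commit.get("committer")
    if merger and committer and merger != committer:
        return True
    # Prow marks reviewed pull requests with either label.
    if {"lgtm", "approved"} & set(commit.get("labels", [])):
        return True
    # Gerrit adds these trailers to the commit message (assumption: both needed).
    message = commit.get("message", "")
    return "Reviewed-on:" in message and "Reviewed-by:" in message
```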

Note: Requiring reviews for all changes is infeasible for some projects, such as those with only one active participant. Even a project with multiple active contributors may not have enough active participation to be able to require review of all proposed changes. Projects with a small number of active participants instead sometimes aim for a review of a percentage of proposals (e.g., "at least half of all proposed changes are reviewed").

Requiring review does not eliminate all risks. The other reviewers might fail to notice unintentional vulnerabilities or malicious code, be colluding with a malicious developer, or even be the same person (using a "sock puppet" account).

Remediation steps

  • If the project has only one contributor, or does not have enough reviewers to practically require that all contributions be reviewed, try to recruit more maintainers to the project who will be willing to review others' work. Ideally at least some of these people will be from different organizations (see Contributors). If the project has very limited utility, consider expanding its intended utility so more people will be interested in improving it, and make that larger scope clear to potential contributors.
  • Follow security best practices by performing strict code reviews for every new pull request / merge request.
  • Make "code reviews" mandatory in your repository configuration. (Instructions for GitHub.)
  • Enforce the rule for administrators / code owners as well. (Instructions for GitHub.)

Contributors

Risk: Low (lower number of trusted code reviewers)

This check tries to determine if the project has recent contributors from multiple organizations (e.g., companies). It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

The check looks at the Company field on the GitHub user profile for authors of recent commits. To receive the highest score, the project must have had contributors from at least 3 different companies in the last 30 commits; each of those contributors must have had at least 5 commits in the last 30 commits.
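
A minimal sketch of the highest-score rule above, assuming commits arrive newest first as `(author, company)` pairs. The function names and the normalization of company strings are assumptions, not Scorecard's actual logic.

```python
from collections import Counter


def qualifying_companies(commits, min_commits=5, window=30):
    """Return companies of authors with enough commits in the recent window.

    commits: list of (author, company) tuples, newest first.
    """
    recent = commits[:window]
    per_author = Counter(author for author, _ in recent)
    company_of = {author: company for author, company in recent}
    return {
        company_of[author].strip().lower()  # assumption: normalize names
        for author, count in per_author.items()
        if count >= min_commits and company_of[author]
    }


def meets_highest_score(commits) -> bool:
    """True when at least 3 different companies qualify."""
    return len(qualifying_companies(commits)) >= 3
```

Authors with an empty Company field are skipped, so they never count toward the three required organizations.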

Note: Some projects cannot meet this requirement, such as small projects with only one active participant, or projects with a narrow scope that cannot attract the interest of multiple organizations. See Code Reviews for more information about evaluating projects with a small number of participants.

Remediation steps

  • Ask contributors to join their respective organizations, if they have not already. Otherwise, there is no remediation for this check; it simply provides insight into which organizations have contributed so that you can make a trust-based decision based on that information.

Dangerous-Workflow

Risk: Critical (vulnerable to repository compromise)

This check determines whether the project's GitHub Actions workflows have dangerous code patterns. Some examples of these patterns are untrusted code checkouts, logging github context and secrets, or use of potentially untrusted inputs in scripts. The following patterns are checked:

Untrusted Code Checkout: This is the misuse of potentially dangerous triggers. This checks if a pull_request_target workflow trigger was used in conjunction with an explicit pull request checkout. Workflows triggered with pull_request_target have write permission to the target repository and access to target repository secrets. With the PR checkout, PR authors may compromise the repository, for example, by using build scripts controlled by the author of the PR or by reading tokens in memory. This check does not detect whether untrusted code checkouts are used safely, for example, only on pull requests that have been assigned a label.

Script Injection with Untrusted Context Variables: This pattern detects whether a workflow's inline script may execute untrusted input from attackers. This occurs when an attacker adds malicious commands and scripts to a context. When a workflow runs, these strings may be interpreted as code that is executed on the runner. Attackers can add their own content to certain github context variables that are considered untrusted, for example, github.event.issue.title. These values should not flow directly into executable code.

The highest score is awarded when all workflows avoid the dangerous code patterns.
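
Both patterns can be illustrated on a workflow that has already been parsed into a Python dictionary. This is a simplified sketch: the function names are hypothetical, and the list of untrusted context values is a small illustrative subset rather than the list Scorecard uses.

```python
import re

# Illustrative subset of attacker-controllable context values (assumption).
UNTRUSTED_CONTEXT = re.compile(
    r"\$\{\{\s*github\.(?:event\.(?:issue|pull_request|comment|review)"
    r"\.(?:title|body)|head_ref)\s*\}\}"
)


def _steps(workflow: dict):
    """Yield every step of every job in a parsed workflow."""
    for job in workflow.get("jobs", {}).values():
        yield from job.get("steps", [])


def has_script_injection(workflow: dict) -> bool:
    """True if an inline script interpolates an untrusted context value."""
    return any(UNTRUSTED_CONTEXT.search(step.get("run", ""))
               for step in _steps(workflow))


def has_untrusted_checkout(workflow: dict) -> bool:
    """True if a pull_request_target workflow checks out the PR's code."""
    if "pull_request_target" not in workflow.get("on", {}):
        return False
    for step in _steps(workflow):
        ref = str(step.get("with", {}).get("ref", ""))
        if (step.get("uses", "").startswith("actions/checkout@")
                and "github.event.pull_request" in ref):
            return True
    return False
```

Passing the untrusted value through an environment variable, instead of interpolating it into the `run` script, keeps it out of the code that the runner executes, which is why the first function inspects only `run`.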

Remediation steps

  • Avoid the dangerous workflow patterns. See this post for information on avoiding untrusted code checkouts. See this document for information on avoiding and mitigating the risk of script injections.

Dependency-Update-Tool

Risk: High (possibly vulnerable to attacks on known flaws)

This check tries to determine if the project uses a dependency update tool, specifically dependabot or renovatebot. Out-of-date dependencies make a project vulnerable to known flaws and prone to attacks. These tools automate the process of updating dependencies by scanning for outdated or insecure requirements, and opening a pull request to update them if found.

This check can determine only whether the dependency update tool is enabled; it does not ensure that the tool is run or that the tool's pull requests are merged.

Note: A project that fulfills this criterion with other tools may still receive a low score on this test. There are many ways to implement dependency updates, and it is challenging for an automated tool like Scorecard to detect them all. A low score is therefore not a definitive indication that the project is at risk.

Remediation steps

  • Sign up for automatic dependency updates with dependabot or renovatebot and place the config file in the locations that are recommended by these tools. Due to https://github.com/dependabot/dependabot-core/issues/2804, Dependabot can be enabled for forks where security updates have ever been turned on, so projects maintaining stable forks should evaluate whether this behavior is satisfactory before turning it on.
  • Unlike dependabot, renovatebot has support to migrate Dockerfiles' dependencies from version pinning to hash pinning via the pinDigests setting without additional manual effort.

Fuzzing

Risk: Medium (possible vulnerabilities in code)

This check tries to determine if the project uses fuzzing by checking if the repository name is included in the OSS-Fuzz project list.

Fuzzing, or fuzz testing, is the practice of feeding unexpected or random data into a program to expose bugs. Regular fuzzing is important to detect vulnerabilities that may be exploited by others, especially since attackers can also use fuzzing to find the same flaws.

Note: A project that fulfills this criterion with other tools may still receive a low score on this test. There are many ways to implement fuzzing, and it is challenging for an automated tool like Scorecard to detect them all. A low score is therefore not a definitive indication that the project is at risk.

Remediation steps

  • Integrate the project with OSS-Fuzz by following the instructions here.

License

Risk: Low (possible impediment to security review)

This check tries to determine if the project has published a license. It works by checking standard locations for a file named according to common conventions for licenses.

A license can give users information about how the source code may or may not be used. The lack of a license will impede any kind of security review or audit and creates a legal risk for potential users.

This check will detect files in the top-level directory with any combination of the following names and extensions: LICENSE, LICENCE, COPYING, COPYRIGHT and .html, .txt, .md. It will also detect these files in a directory named LICENSES. (Files in a LICENSES directory are typically named as their SPDX license identifier followed by an appropriate file extension, as described in the REUSE Specification.)
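
The file matching described above could be approximated as follows. Several details are assumptions made for this sketch: names are compared case-insensitively, a file with no extension is accepted, and any file with a listed extension directly inside LICENSES counts.

```python
from pathlib import PurePosixPath

LICENSE_NAMES = {"license", "licence", "copying", "copyright"}
LICENSE_EXTENSIONS = {"", ".html", ".txt", ".md"}  # "" = no extension (assumption)


def looks_like_license(path: str) -> bool:
    """Return True if a repository-relative path matches the rules above."""
    p = PurePosixPath(path)
    has_known_extension = p.suffix.lower() in LICENSE_EXTENSIONS
    if len(p.parts) == 1:
        # Top-level file such as LICENSE, COPYING.md, or LICENCE.txt.
        return p.stem.lower() in LICENSE_NAMES and has_known_extension
    if len(p.parts) == 2 and p.parts[0] == "LICENSES":
        # Files here are usually named after an SPDX identifier.
        return has_known_extension
    return False
```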

Remediation steps

  • Determine which license to apply to your project.
  • Create the license in a .txt, .html, or .md file named LICENSE or COPYING, and place it in the top-level directory.
  • Alternately, create a LICENSES directory and add license files with a name that matches your SPDX license identifier.

Maintained

Risk: High (possibly unpatched vulnerabilities)

This check determines whether the project is actively maintained. If the project is archived, it receives the lowest score. If there is at least one commit per week during the previous 90 days, the project receives the highest score. If there is activity on issues from users who are collaborators, members, or owners of the project, the project receives a partial score.
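
One way to read the rules above is sketched below. It treats "at least one commit per week" as an average over the 90-day window, which is an assumption, and it returns `None` for the case the text does not cover (no qualifying commits and no maintainer issue activity). The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def maintained_level(archived, commit_dates, maintainer_issue_activity, now):
    """Map repository activity to the score levels named above."""
    if archived:
        return "lowest"
    cutoff = now - timedelta(days=90)
    recent_commits = sum(1 for d in commit_dates if d >= cutoff)
    full_weeks = 90 // 7  # 12 full weeks in the window
    if recent_commits >= full_weeks:  # assumption: average of one per week
        return "highest"
    if maintainer_issue_activity:
        return "partial"
    return None  # not specified by the description above
```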

A project which is not active might not be patched, have its dependencies patched, or be actively tested and used. However, a lack of active maintenance is not necessarily always a problem. Some software, especially smaller utility functions, does not normally need to be maintained. For example, a library that determines if an integer is even would not normally need maintenance unless an underlying implementation language definition changed. A lack of active maintenance should signal that potential users should investigate further to judge the situation.

Remediation steps

  • There is no remediation work needed from projects with a low score; this check simply provides insight into the project activity and maintenance commitment. External users should determine whether the software is the type that would not normally need active maintenance.

Packaging

Risk: Medium (users possibly missing security updates)

This check tries to determine if the project is published as a package. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

Packages give users of a project an easy way to download, install, update, and uninstall the software by a package manager. In particular, they make it easy for users to receive security patches as updates.

The check currently looks for GitHub packaging workflows and language-specific GitHub Actions that upload the package to a corresponding hub, e.g., npm. We plan to add better support to query package manager hubs directly in the future, e.g., for npm and PyPI.

You can create a package in several ways:

  • Many programming language ecosystems have a generally-used packaging format supported by a language-level package manager tool and public package repository.
  • Many operating system platforms also have at least one package format, tool, and public repository (in some cases the source repository generates system-independent source packages, which are then used by others to generate system executable packages).
  • Using container images.

Note: A project that fulfills this criterion with other tools may still receive a low score on this test. There are many ways to package software, and it is challenging for an automated tool like Scorecards to detect them all. A low score is therefore not a definitive indication that the project is at risk. If Scorecards fails to detect the way you publish a package and you think we should support your use case, please let us know by opening an issue.

Remediation steps

Pinned-Dependencies

Risk: Medium (possible compromised dependencies)

This check tries to determine if the project pins its dependencies. A "pinned dependency" is a dependency that is explicitly set to a specific hash instead of allowing a mutable version or range of versions. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

The check works by looking for unpinned dependencies in Dockerfiles, shell scripts and GitHub workflows.
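
As a hedged illustration of what "pinned by hash" means for two of these file types, the sketch below tests a GitHub workflow `uses:` reference and a container image reference. The function names are hypothetical, and treating local actions as pinned is an assumption.

```python
import re

FULL_COMMIT_SHA = re.compile(r"^[0-9a-f]{40}$")
IMAGE_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def action_is_pinned(uses: str) -> bool:
    """True if a workflow `uses:` value references a full commit SHA."""
    if uses.startswith("./"):
        return True  # assumption: actions from the same repository count as pinned
    _, _, ref = uses.partition("@")
    return bool(FULL_COMMIT_SHA.match(ref))


def image_is_pinned(image: str) -> bool:
    """True if a container image reference ends in a sha256 digest."""
    return bool(IMAGE_DIGEST.search(image))
```

A tag such as `v2` or `1.17` is mutable and can later point at different content, whereas a commit SHA or image digest identifies exactly one version.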

Pinned dependencies reduce several security risks:

  • They ensure that checking and deployment are all done with the same software, reducing deployment risks, simplifying debugging, and enabling reproducibility.
  • They can help mitigate compromised dependencies from undermining the security of the project (in the case where you've evaluated the pinned dependency, you are confident it's not compromised, and a later version is released that is compromised).
  • They are one way to counter dependency confusion (aka substitution) attacks, in which an application uses multiple feeds to acquire software packages (a "hybrid configuration"), and attackers fool the user into using a malicious package via a feed that was not expected for that package.

However, pinning dependencies can inhibit software updates, either because of a security vulnerability or because the pinned version is compromised. Mitigate this risk by:

For projects hosted on GitHub, you can learn more about dependencies using the GitHub dependency graph.

Remediation steps

  • First determine if your project is producing a library or application. If it is a library, you generally don't want to pin dependencies of library users, and should not follow any remediation steps.
  • If your project is producing an application, declare all your dependencies with specific versions in your package format file (e.g. package.json for npm, requirements.txt for python). For C/C++, check in the code from a trusted source and add a README on the specific version used (and the archive SHA hashes).
  • If the package manager supports lock files (e.g. package-lock.json for npm), make sure to check these in to the source repository as well. These files maintain signatures for the entire dependency tree and protect against future exploitation in case the package is compromised.
  • For Dockerfiles, pin dependencies by hash. See Dockerfile for example.
  • For GitHub workflows, pin dependencies by hash. See main.yaml for example. To help pin the actions used by your workflows, you may use StepSecurity's online tool by ticking the "Pin actions to a full length commit SHA" option. You may also tick the "Restrict permissions for GITHUB_TOKEN" option to fix issues found by the Token-Permissions check.
  • To help update your dependencies after pinning them, use tools such as GitHub's dependabot or renovatebot.

SAST

Risk: Medium (possible unknown bugs)

This check tries to determine if the project uses Static Application Security Testing (SAST), also known as static code analysis. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

SAST is testing run on source code before the application is run. Using SAST tools can prevent known classes of bugs from being inadvertently introduced in the codebase.

The check currently looks for known GitHub apps such as CodeQL (github-code-scanning), LGTM and SonarCloud in the recent (~30) merged PRs, or the use of "github/codeql-action" in a GitHub workflow.

Note: A project that fulfills this criterion with other tools may still receive a low score on this test. There are many ways to implement SAST, and it is challenging for an automated tool like Scorecard to detect them all. A low score is therefore not a definitive indication that the project is at risk.

Remediation steps

  • Run CodeQL checks in your CI/CD by following the instructions here.

Security-Policy

Risk: Medium (possible insecure reporting of vulnerabilities)

This check tries to determine if the project has published a security policy. It works by looking for a file named SECURITY.md (case-insensitive) in a few well-known directories.

A security policy (typically a SECURITY.md file) can give users information about what constitutes a vulnerability and how to report one securely so that information about a bug is not publicly visible.

Remediation steps

  • Place a security policy file SECURITY.md in the root directory of your repository. This makes it easily discoverable by a vulnerability reporter.
  • The file should contain information on what constitutes a vulnerability and a way to report it securely (e.g. issue tracker with private issue support, encrypted email with a published public key). Follow the coordinated vulnerability disclosure guidelines to respond to vulnerability disclosures.
  • For GitHub, see more information here.

Signed-Releases

Risk: High (possibility of installing malicious releases)

This check tries to determine if the project cryptographically signs release artifacts. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

Signed releases attest to the provenance of the artifact.

This check looks for the following filenames in the project's last five releases: *.minisig, *.asc (pgp), *.sig, *.sign.
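
The filename matching above can be sketched with glob patterns. The function names and the representation of a release as a list of asset names are assumptions made for illustration; like the check itself, this sketch does not verify any signature.

```python
from fnmatch import fnmatchcase

SIGNATURE_PATTERNS = ("*.minisig", "*.asc", "*.sig", "*.sign")


def release_is_signed(asset_names) -> bool:
    """True if any release asset name matches a signature pattern."""
    return any(fnmatchcase(name, pattern)
               for name in asset_names
               for pattern in SIGNATURE_PATTERNS)


def signed_release_count(releases, lookback=5) -> int:
    """Count signed releases among the most recent `lookback` releases.

    releases: list of asset-name lists, newest first.
    """
    return sum(release_is_signed(assets) for assets in releases[:lookback])
```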

Note: The check does not verify the signatures.

Remediation steps

  • Publish the release.
  • Generate a signing key.
  • Download the release as an archive locally.
  • Sign the release archive with this key (should output a signature file).
  • Attach the signature file next to the release archive.
  • If the source is hosted on GitHub, check out the steps here.

Token-Permissions

Risk: High (vulnerable to malicious code additions)

This check determines whether the project's automated workflow tokens are set to read-only by default. It is currently limited to repositories hosted on GitHub, and does not support other source hosting repositories (i.e., Forges).

Setting token permissions to read-only follows the principle of least privilege. This is important because attackers may use a compromised token with write access to push malicious code into the project.

The highest score is awarded when the permissions definitions in each workflow's yaml file are set as read-only at the top level and the required write permissions are declared at the job level. One point is deducted from the score if all jobs have their permissions defined but the top-level permissions are not defined. This configuration is secure, but there is a chance that when a new job is added to the workflow, its job permissions could be left undefined because of human error.

The check cannot detect if the "read-only" GitHub permission setting is enabled, as there is no API available.

Additionally, points are reduced if certain write permissions are defined for a job.

Write permissions causing a small reduction

  • statuses - May allow an attacker to change the result of pre-submit checks and get a PR merged.
  • checks - May allow an attacker to remove pre-submit checks and introduce a bug.
  • security-events - May allow an attacker to read vulnerability reports before a patch is available. However, points are not reduced if the job utilizes a recognized action for uploading SARIF results.
  • deployments - May allow an attacker to charge the repo owner by triggering VM runs; there is also a small chance that an attacker can trigger a remote service with code they own if the server accepts code/location variables unsanitized.

Write permissions causing a large reduction

  • contents - Allows an attacker to commit unreviewed code. However, points are not reduced if the job utilizes a recognized packaging action or command.
  • packages - Allows an attacker to publish packages. However, points are not reduced if the job utilizes a recognized packaging action or command.
  • actions - May allow an attacker to steal GitHub secrets by approving the run of an action that needs approval.
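
The two groups of write permissions above can be expressed as a simple classification over a parsed `permissions` mapping. This sketch uses hypothetical function names, does not compute a score, and ignores the exceptions for recognized SARIF upload and packaging actions.

```python
SMALL_REDUCTION = {"statuses", "checks", "security-events", "deployments"}
LARGE_REDUCTION = {"contents", "packages", "actions"}


def classify_write_permissions(permissions: dict) -> dict:
    """Group a job's write permissions by the size of the score reduction."""
    writes = {name for name, level in permissions.items() if level == "write"}
    return {
        "small": sorted(writes & SMALL_REDUCTION),
        "large": sorted(writes & LARGE_REDUCTION),
    }


def top_level_is_read_only(workflow: dict) -> bool:
    """True if the workflow declares read-only permissions at the top level."""
    permissions = workflow.get("permissions")
    if permissions is None:
        return False  # undefined top-level permissions
    if isinstance(permissions, str):
        return permissions == "read-all"
    return all(level in ("read", "none") for level in permissions.values())
```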

Remediation steps

  • Set permissions as read-all or contents: read as described in GitHub's documentation.
  • To help determine the permissions needed for your workflows, you may use StepSecurity's online tool by ticking the "Restrict permissions for GITHUB_TOKEN" option. You may also tick the "Pin actions to a full length commit SHA" option to fix issues found by the Pinned-Dependencies check.

Vulnerabilities

Risk: High (known vulnerabilities)

This check determines whether the project has open, unfixed vulnerabilities using the OSV (Open Source Vulnerabilities) service. An open vulnerability is readily exploited by attackers and should be fixed as soon as possible.

Remediation steps

  • Fix the vulnerabilities. The details of each vulnerability can be found on https://osv.dev.