Data

To analyze packages, Hipcheck needs to gather data about those packages. That data can come from a variety of sources, including:

  • Git commit histories
  • The GitHub API or, in the future, similar source repository platform APIs
  • Package host APIs like the NPM API

Each of these sources stores information about the history of a project, which may be relevant for understanding the practices behind the code's development or for detecting possible active supply chain attacks.

Hipcheck tries to cleanly distinguish between data, analyses, and configuration. Data is the raw information pulled from external sources. It is solely factual, recording prior events. Analyses are computations performed on data, which produce measures and may also produce concerns. Finally, configuration is an expression of the user's policy, which turns the measures produced by analyses into a score and a recommendation. This is perhaps easier to see in a diagram.

Concepts diagram

With this structure, Hipcheck tries to cleanly separate three things: the parts that are factual, the parts that measure those facts, and the parts that apply subjective policies to those measurements.
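
The separation between data, analyses, and configuration can be illustrated with a minimal sketch. This is not Hipcheck's actual implementation or API. Every name here (`Commit`, `review_analysis`, `Config`, `recommend`, and the `max_unreviewed_fraction` threshold) is hypothetical and chosen only to show how the three layers might stay independent of one another.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """Data: a factual record of a past event (hypothetical shape)."""
    author: str
    reviewed: bool


@dataclass
class AnalysisResult:
    """Output of an analysis: a measure, plus any concerns it raised."""
    measure: float
    concerns: list[str] = field(default_factory=list)


def review_analysis(commits: list[Commit]) -> AnalysisResult:
    """Analysis: measure the fraction of commits that were not reviewed."""
    unreviewed = [c for c in commits if not c.reviewed]
    measure = len(unreviewed) / len(commits) if commits else 0.0
    concerns = [f"unreviewed commit by {c.author}" for c in unreviewed]
    return AnalysisResult(measure, concerns)


@dataclass(frozen=True)
class Config:
    """Configuration: the user's policy (hypothetical threshold)."""
    max_unreviewed_fraction: float


def recommend(result: AnalysisResult, config: Config) -> tuple[int, str]:
    """Apply the user's policy to a measure to get a score and a recommendation."""
    failed = result.measure > config.max_unreviewed_fraction
    score = 1 if failed else 0
    return score, ("investigate" if failed else "pass")


# The same data and the same measure, judged under two different policies.
commits = [
    Commit("alice", True),
    Commit("bob", False),
    Commit("carol", True),
    Commit("dave", True),
]
result = review_analysis(commits)
strict = recommend(result, Config(max_unreviewed_fraction=0.1))
lenient = recommend(result, Config(max_unreviewed_fraction=0.5))
```

In this sketch the data and the measure are identical in both runs; only the policy differs, so the strict configuration recommends investigation while the lenient one passes.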