lice-comb.matching

The core matching functionality within lice-comb. Matching is provided for three categories of input, and uses a different process for each:

  1. License names
  2. License URIs
  3. License texts

Each matching fn has two variants:

  1. A ‘simple’ version that returns a set of SPDX expressions (Strings)
  2. An ‘info’ version that returns an ‘expressions-info map’ containing metadata on how the determination was made (including confidence information)

An expressions-info map has this structure:

  • key: an SPDX expression (String)
  • value: a sequence of ‘expression-info’ maps

An expression-info map has this structure:

  • :id (String, optional): The portion of the SPDX expression that this info map refers to (usually, though not always, a single SPDX identifier).
  • :type (either :declared or :concluded, mandatory): Whether this identifier was unambiguously declared within the input or was instead concluded by lice-comb (see the SPDX FAQ for more detail on the definition of these two terms).
  • :confidence (one of: :high, :medium, :low, only provided when :type = :concluded): Indicates the approximate confidence lice-comb has in its conclusions for this particular SPDX identifier.
  • :confidence-explanations (a set of keywords, optional): Describes why the associated :confidence was not :high.
  • :strategy (a keyword, mandatory): The strategy lice-comb used to determine this particular SPDX identifier. See lice-comb.utils/strategy->string for an up-to-date list of all possible values.
  • :source (a sequence of Strings): The list of sources used to arrive at this portion of the SPDX expression, starting from the most general (the input) through to the most specific (the smallest subset of the input that was used to make this determination).
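
To make the two structures above concrete, here is a hand-written example of an expressions-info map and a small helper that walks it. This is a sketch only: the specific values (in particular the :strategy and :confidence-explanations keywords and the :source strings) are illustrative assumptions rather than output captured from lice-comb, and low-confidence-ids is a hypothetical helper that is not part of the library.

```clojure
;; Illustrative data only. The keys follow the structure described above, but
;; the values are assumptions, not real lice-comb output.
(def example-expressions-info
  {"Apache-2.0"
   [{:id                      "Apache-2.0"
     :type                    :concluded
     :confidence              :medium
     :confidence-explanations #{:missing-version}   ; assumed keyword
     :strategy                :regex-matching       ; assumed keyword
     :source                  ["Apache Software License" "Apache"]}]})

(defn low-confidence-ids
  "Hypothetical helper: returns the set of :id values whose expression-info
  maps are :concluded with a :confidence other than :high."
  [expressions-info]
  (set (for [[_expression infos] expressions-info
             info                infos
             :when (and (= :concluded (:type info))
                        (not= :high (:confidence info)))]
         (:id info))))

(low-confidence-ids example-expressions-info)
```

A helper along these lines could be used to flag results that deserve manual review before they are relied upon.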

id->name

(id->name id)

Returns a human readable name of the given license or exception identifier; either the official SPDX license or exception name or, if the id is a lice-comb specific LicenseRef, a lice-comb specific name. Returns id verbatim if unable to determine a name. Returns nil if id is blank.

init!

(init!)

Initialises this namespace upon first call (and does nothing on subsequent calls), returning nil. Consumers of this namespace are not required to call this fn, as initialisation will occur implicitly anyway; it is provided to allow explicit control of the cost of initialisation to callers who need it.

Note: this fn may have a substantial performance cost.
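
One possible way to control when that cost is paid is to trigger initialisation on a background thread at application startup and wait for it before the first match. The sketch below assumes lice-comb is on the classpath; the lcm alias and the init-done var are choices made for this example, not something the library prescribes.

```clojure
(require '[lice-comb.matching :as lcm])

;; Kick off initialisation in the background while the rest of the
;; application starts up.
(def init-done (future (lcm/init!)))

;; Later, before the first matching call, block until initialisation has
;; finished. init! returns nil, so the deref is only used for its blocking
;; effect.
@init-done
```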

lice-comb-license-ref?

(lice-comb-license-ref? id)

Is the given id one of lice-comb’s custom LicenseRefs?

name->expressions

(name->expressions name)

Returns a set of SPDX expressions (Strings) for name. See name->expressions-info for details.

name->expressions-info

(name->expressions-info name)

Returns an expressions-info map for name (a String), or nil if no expressions were found. This involves:

  1. Determining whether name is a valid SPDX license expression, and if so normalising it (see clj-spdx’s spdx.expressions/normalise fn)
  2. Checking if name is actually a URI, and if so performing URI matching on it via uri->expressions-info
  3. Attempting to parse name to construct one or more SPDX license expressions
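
A minimal usage sketch of the ‘simple’ and ‘info’ variants for names follows. It assumes lice-comb is on the classpath; the lcm alias and the input String are choices made for this example, and no output is shown because it depends on the library version and the SPDX license list in use.

```clojure
(require '[lice-comb.matching :as lcm])

;; 'Simple' variant: returns a set of SPDX expressions (Strings).
(lcm/name->expressions "Apache License, Version 2.0")

;; 'Info' variant: returns an expressions-info map, or nil if nothing was
;; found, so guard with when-let before walking it.
(when-let [info (lcm/name->expressions-info "Apache License, Version 2.0")]
  (doseq [[expression infos] info]
    ;; Print each expression with the strategies that produced it.
    (println expression "<-" (mapv :strategy infos))))
```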

proprietary-commercial?

(proprietary-commercial? id)

Is the given id lice-comb’s custom ‘proprietary / commercial’ LicenseRef?

public-domain?

(public-domain? id)

Is the given id lice-comb’s custom ‘public domain’ LicenseRef?

text->expressions

(text->expressions text)

Returns a set of SPDX expressions (Strings) for text. See text->expressions-info for details.

text->expressions-info

(text->expressions-info text)

Returns an expressions-info map for text (a String, Reader, or anything that’s accepted by clojure.java.io/reader). Returns nil if no expressions were found in it.

Notes:

  • this function implements the SPDX matching guidelines (via clj-spdx). See the SPDX specification for details.
  • the caller is expected to open & close a Reader or InputStream passed to this function (e.g. using clojure.core/with-open)
  • you cannot pass a String representation of a filename to this function - you should pass filenames through clojure.java.io/file (or similar) first
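
The notes above can be sketched as follows. This assumes lice-comb is on the classpath; "LICENSE" is a hypothetical path and the lcm alias is a choice made for this example.

```clojure
(require '[clojure.java.io :as io]
         '[lice-comb.matching :as lcm])

;; Wrap the filename with io/file so that it is read as a file, and let
;; with-open close the Reader once matching has finished.
(with-open [r (io/reader (io/file "LICENSE"))]
  (lcm/text->expressions-info r))

;; Per the notes above, a String is not treated as a filename, so pass
;; license text directly when it is already in memory.
(lcm/text->expressions-info "Permission is hereby granted, free of charge")
```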

unidentified->name

(unidentified->name id)

Returns a human readable name for the given lice-comb custom ‘unidentified’ LicenseRef. Returns nil if id is not a lice-comb custom ‘unidentified’ LicenseRef.

unidentified?

(unidentified? id)

Is the given id a lice-comb custom ‘unidentified’ LicenseRef?

uri->expressions

(uri->expressions uri)

Returns a set of SPDX expressions (Strings) for uri. See uri->expressions-info for details.

uri->expressions-info

(uri->expressions-info uri)

Returns an expressions-info map for uri (a String, URL, or URI), or nil if no expressions were found. It does this via two steps:

  1. Seeing if uri is in the SPDX license and/or exception lists
  2. Attempting to retrieve the plain text content of uri and, if that succeeds, running that text through text->expressions-info

Notes on step 1:

  1. This does not perform exact matching; rather it simplifies URIs in various ways to avoid irrelevant differences, such as performing a case-insensitive comparison, ignoring protocol differences (http vs https), and ignoring extensions representing MIME types (.txt vs .html, etc.).
  2. URIs in the SPDX license and exception lists are not unique - the same URI may represent multiple licenses and/or exceptions.
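
To illustrate the kind of simplification described in the notes on step 1, here is a rough sketch. It is not lice-comb's actual implementation: simplify-uri and hypothetical-uri-index are names invented for this example, the extension list is an assumption, and the index entries are placeholders rather than real SPDX data.

```clojure
(require '[clojure.string :as str])

(defn simplify-uri
  "Hypothetical helper: lower-cases uri, drops the http/https protocol, and
  strips a trailing extension from a small, assumed list."
  [uri]
  (-> (str uri)
      str/trim
      str/lower-case
      (str/replace #"^https?://" "")
      (str/replace #"\.(txt|html?|md)$" "")))

;; These two URIs differ in case, protocol, and extension, yet simplify to
;; the same String, so this comparison is true.
(= (simplify-uri "http://www.apache.org/licenses/LICENSE-2.0.txt")
   (simplify-uri "HTTPS://www.apache.org/licenses/LICENSE-2.0.html"))

;; Because URIs are not unique, a lookup keyed on the simplified URI would
;; map to a set of identifiers. The entries below are placeholders.
(def hypothetical-uri-index
  {"example.com/licenses/foo" #{"LicenseRef-example-foo"
                                "LicenseRef-example-foo-exception"}})

(get hypothetical-uri-index (simplify-uri "https://example.com/licenses/foo.txt"))
```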