Commit 9cfc49a

Merge pull request #60 from pmonks/dev
Release 2.0.332
2 parents aef54f2 + f6b7908 commit 9cfc49a

8 files changed, +99 -58 lines changed

deps.edn

+1 -1
@@ -28,7 +28,7 @@
    miikka/clj-base62 {:mvn/version "0.1.1"}
    com.github.pmonks/clj-spdx {:mvn/version "1.0.176"}
    com.github.pmonks/rencg {:mvn/version "1.0.51"}
-   com.github.pmonks/embroidery {:mvn/version "1.0.41"}}
+   com.github.pmonks/embroidery {:mvn/version "1.0.44"}}
  :aliases
    {:build {:deps {com.github.pmonks/pbr {:mvn/version "RELEASE"}}
             :ns-default pbr.build}}}

doc/overview.md

+44 -22
@@ -24,6 +24,8 @@ Each expression-info map in the sequence of values has this structure:
   Whether this identifier was unambiguously declared within the input or was instead concluded by lice-comb (see [the SPDX FAQ](https://wiki.spdx.org/view/SPDX_FAQ) for more detail on the definition of these two terms).
 * `:confidence` (one of: `:high`, `:medium`, `:low`, only provided when `:type` = `:concluded`):
   Indicates the approximate confidence lice-comb has in its conclusions for this particular SPDX identifier.
+* `:confidence-explanations` (a set of keywords, optional):
+  Describes why the associated `:confidence` was not `:high`.
 * `:strategy` (a keyword, mandatory):
   The strategy lice-comb used to determine this particular SPDX identifier. See [[lice-comb.utils/strategy->string]] for an up-to-date list of all possible values.
 * `:source` (a sequence of `String`s):
@@ -41,34 +43,54 @@ For example, this code:
 results in this expressions-info map (pretty printed for clarity):
 
 ```clojure
-{"GPL-2.0-or-later"
-  ({:id "GPL-2.0-or-later",
-    :type :concluded,
-    :confidence :medium,
-    :strategy :regex-matching,
-    :source ("https://repo1.maven.org/maven2/javax/mail/javax.mail-api/1.6.2/javax.mail-api-1.6.2.pom"
-             "https://repo1.maven.org/maven2/com/sun/mail/all/1.6.2/all-1.6.2.pom"
-             "<licenses><license><name>"
-             "CDDL/GPLv2+CE"
-             "GPLv2+")}),
- "CDDL-1.1"
-  ({:id "CDDL-1.1",
-    :type :concluded,
-    :confidence :low,
-    :strategy :regex-matching,
-    :source ("https://repo1.maven.org/maven2/javax/mail/javax.mail-api/1.6.2/javax.mail-api-1.6.2.pom"
-             "https://repo1.maven.org/maven2/com/sun/mail/all/1.6.2/all-1.6.2.pom"
-             "<licenses><license><name>"
-             "CDDL/GPLv2+CE"
-             "CDDL")})}
+{"CDDL-1.1 OR GPL-2.0-only WITH Classpath-exception-2.0"
+  ({:type :concluded
+    :confidence :low
+    :strategy :maven-pom-multi-license-rule
+    :source ("javax.mail/javax.mail-api@1.6.2"
+             "https://repo1.maven.org/maven2/javax/mail/javax.mail-api/1.6.2/javax.mail-api-1.6.2.pom"
+             "https://repo1.maven.org/maven2/com/sun/mail/all/1.6.2/all-1.6.2.pom")}
+   {:id "GPL-2.0-only"
+    :type :concluded
+    :confidence :high
+    :strategy :manual-verification
+    :source ("javax.mail/javax.mail-api@1.6.2"
+             "https://repo1.maven.org/maven2/javax/mail/javax.mail-api/1.6.2/javax.mail-api-1.6.2.pom"
+             "https://repo1.maven.org/maven2/com/sun/mail/all/1.6.2/all-1.6.2.pom"
+             "<licenses><license><name>"
+             "CDDL/GPLv2+CE"
+             "GPLv2+CE"
+             "GPLv2")}
+   {:id "Classpath-exception-2.0"
+    :type :concluded
+    :confidence :high
+    :strategy :manual-verification
+    :source ("javax.mail/javax.mail-api@1.6.2"
+             "https://repo1.maven.org/maven2/javax/mail/javax.mail-api/1.6.2/javax.mail-api-1.6.2.pom"
+             "https://repo1.maven.org/maven2/com/sun/mail/all/1.6.2/all-1.6.2.pom"
+             "<licenses><license><name>"
+             "CDDL/GPLv2+CE"
+             "GPLv2+CE"
+             "CE")}
+   {:id "CDDL-1.1"
+    :type :concluded
+    :confidence :low
+    :confidence-explanations #{:missing-version}
+    :strategy :regex-matching
+    :source ("javax.mail/javax.mail-api@1.6.2"
+             "https://repo1.maven.org/maven2/javax/mail/javax.mail-api/1.6.2/javax.mail-api-1.6.2.pom"
+             "https://repo1.maven.org/maven2/com/sun/mail/all/1.6.2/all-1.6.2.pom"
+             "<licenses><license><name>"
+             "CDDL/GPLv2+CE"
+             "CDDL")})}
 ```
 
-A key insight that the expressions-info map tells us in this case is that the `javax.mail/javax.mail-api@1.6.2` artifact doesn't declare which version of the CDDL it uses, and lice-comb has _inferred_ the latest (`CDDL-1.1`), and in doing so reduced its confidence to "low". This important insight is not apparent when the `simple` variant of the function is used instead:
+A key insight that the expressions-info map tells us in this case is that the `javax.mail/javax.mail-api@1.6.2` artifact doesn't declare which version of the CDDL it uses, and lice-comb has _inferred_ the latest (`CDDL-1.1`), and in doing so reduced its confidence to "low" (while also provided a helpful confidence explanation). This important insight is not apparent when the `simple` variant of the function is used instead:
 
 ```clojure
 (lcmvn/gav->expressions "javax.mail" "javax.mail-api" "1.6.2")
 
-#{"CDDL-1.1" "GPL-2.0-or-later"}
+#{"CDDL-1.1 OR GPL-2.0-only WITH Classpath-exception-2.0"}
 ```
 
 [Back to GitHub](https://github.com/pmonks/lice-comb)
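
The `:confidence-explanations` key documented in this hunk lets a caller report why a conclusion fell short of `:high`. The following is a minimal sketch, assuming an expressions-info map shaped like the overview example (abbreviated here, with `:source` omitted). The helper name `low-confidence-reasons` is hypothetical and not part of lice-comb.

```clojure
;; Abbreviated copy of the expressions-info map from the overview example.
(def expressions-info
  {"CDDL-1.1 OR GPL-2.0-only WITH Classpath-exception-2.0"
   '({:type :concluded :confidence :low :strategy :maven-pom-multi-license-rule}
     {:id "GPL-2.0-only" :type :concluded :confidence :high :strategy :manual-verification}
     {:id "Classpath-exception-2.0" :type :concluded :confidence :high :strategy :manual-verification}
     {:id "CDDL-1.1" :type :concluded :confidence :low
      :confidence-explanations #{:missing-version} :strategy :regex-matching})})

(defn low-confidence-reasons
  "Hypothetical helper: returns a map of SPDX identifier -> set of confidence
  explanations, for every entry that has both an :id and explanations."
  [exprs-info]
  (into {}
        (for [[_expression infos] exprs-info
              {:keys [id confidence-explanations]} infos
              :when (and id (seq confidence-explanations))]
          [id confidence-explanations])))

(low-confidence-reasons expressions-info)
;; => {"CDDL-1.1" #{:missing-version}}
```

Because the key is optional, the sketch skips entries that lack it rather than treating its absence as an error.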

resources/lice_comb/names.edn

+2 -3
@@ -1,8 +1,7 @@
 ; Map of name values seen in the wild that are too ambiguous / cursed to support any reasonable form of automated parsing
 {
-  ; Seen in https://repo.maven.apache.org/maven2/com/sun/mail/all/1.4.7/all-1.4.7.pom
+  ; Seen in https://repo.maven.apache.org/maven2/com/sun/mail/all/1.4.7/all-1.4.7.pom and other javax.mail/javax.mail-api artifacts
   "GPLv2+CE" {"GPL-2.0-only WITH Classpath-exception-2.0"
-              ({:type :concluded :confidence :high :strategy :manual-verification :source ("GPLv2+CE")}
-               {:id "GPL-2.0-only" :type :concluded :confidence :high :strategy :manual-verification :source ("GPLv2+CE" "GPLv2")}
+              ({:id "GPL-2.0-only" :type :concluded :confidence :high :strategy :manual-verification :source ("GPLv2+CE" "GPLv2")}
                {:id "Classpath-exception-2.0" :type :concluded :confidence :high :strategy :manual-verification :source ("GPLv2+CE" "CE")})}
 }
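
The shape of this lookup table (license name → expressions-info map) can be explored with the standard EDN reader. This is a sketch only: the entry is inlined as a string, the names `names-edn`, `cursed-names` and `lookup-name` are hypothetical, and how lice-comb itself loads the file is not shown in this commit.

```clojure
(require '[clojure.edn :as edn])

;; The post-commit entry from names.edn, inlined as a string for illustration.
(def names-edn
  "{\"GPLv2+CE\" {\"GPL-2.0-only WITH Classpath-exception-2.0\"
     ({:id \"GPL-2.0-only\" :type :concluded :confidence :high :strategy :manual-verification :source (\"GPLv2+CE\" \"GPLv2\")}
      {:id \"Classpath-exception-2.0\" :type :concluded :confidence :high :strategy :manual-verification :source (\"GPLv2+CE\" \"CE\")})}}")

;; EDN lists are read as plain data, so no quoting is needed.
(def cursed-names (edn/read-string names-edn))

(defn lookup-name
  "Hypothetical helper: returns the manually verified expressions-info map for
  an ambiguous license name, or nil if the name is not in the table."
  [license-name]
  (get cursed-names license-name))

(keys (lookup-name "GPLv2+CE"))
;; => ("GPL-2.0-only WITH Classpath-exception-2.0")

(map :id (first (vals (lookup-name "GPLv2+CE"))))
;; => ("GPL-2.0-only" "Classpath-exception-2.0")
```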

src/lice_comb/deps.clj

+13 -11
@@ -21,10 +21,10 @@
   license information."
   (:require [clojure.string :as s]
             [clojure.tools.logging :as log]
-            [embroidery.api :as e]
             [lice-comb.maven :as lcmvn]
             [lice-comb.files :as lcf]
-            [lice-comb.impl.expressions-info :as lciei]))
+            [lice-comb.impl.expressions-info :as lciei]
+            [lice-comb.impl.utils :as lciu]))
 
 (defn- normalise-dep
   "Normalises a dep, by removing any classifier suffixes from the artifact-id
@@ -60,15 +60,17 @@
           nil))]
       gav-expressions
       ; If we didn't find any licenses in the dep's POM, check the dep's JAR(s)
-      (into {} (filter identity (e/pmap* #(try
-                                             (lcf/zip->expressions-info %)
-                                             (catch javax.xml.stream.XMLStreamException xse
-                                               (log/warn (str "Failed to parse pom inside " % " - ignoring") xse)
-                                               nil)
-                                             (catch java.util.zip.ZipException ze
-                                               (log/warn (str "Failed to unzip " % " - ignoring") ze)
-                                               nil))
-                                          (:paths info))))))))
+      (into {} (filter identity
+                       (lciu/file-handle-bounded-pmap
+                         #(try
+                            (lcf/zip->expressions-info %)
+                            (catch javax.xml.stream.XMLStreamException xse
+                              (log/warn (str "Failed to parse pom inside " % " - ignoring") xse)
+                              nil)
+                            (catch java.util.zip.ZipException ze
+                              (log/warn (str "Failed to unzip " % " - ignoring") ze)
+                              nil))
+                         (:paths info))))))))
 
 (defmulti dep->expressions-info
   "Returns an expressions-info map for `dep` (a `MapEntry` or two-element vector

src/lice_comb/files.clj

+19 -15
@@ -22,7 +22,6 @@
   (:require [clojure.string :as s]
             [clojure.java.io :as io]
             [clojure.tools.logging :as log]
-            [embroidery.api :as e]
             [lice-comb.matching :as lcm]
             [lice-comb.maven :as lcmvn]
             [lice-comb.impl.expressions-info :as lciei]
@@ -52,7 +51,7 @@
   directories (as defined by `java.io.File.isHidden()`) are included in the
   search or not."
   ([dir] (probable-license-files dir nil))
-  ([dir {:keys [include-hidden-dirs?] :or {include-hidden-dirs? false} :as opts}]
+  ([dir {:keys [include-hidden-dirs?] :or {include-hidden-dirs? false}}]
    (when (lciu/readable-dir? dir)
      (some-> (lciu/filter-file-only-seq (io/file dir)
                                         (fn [^java.io.File d] (and (not= (.getCanonicalFile d) (.getCanonicalFile (io/file (lcmvn/local-maven-repo)))) ; Make sure to exclude the Maven local repo, just in case it happens to be nested within dir
@@ -135,7 +134,7 @@
   directories (as defined by `java.io.File.isHidden()`) are included in the
   search or not."
   ([dir] (zip-compressed-files dir nil))
-  ([dir {:keys [include-hidden-dirs?] :or {include-hidden-dirs? false} :as opts}]
+  ([dir {:keys [include-hidden-dirs?] :or {include-hidden-dirs? false}}]
    (when (lciu/readable-dir? dir)
      (some-> (lciu/filter-file-only-seq (io/file dir)
                                         (fn [^java.io.File d] (or include-hidden-dirs? (not (.isHidden d))))
@@ -145,6 +144,7 @@
                                              (s/ends-with? lname ".jar")))))
              set))))
 
+#_{:clj-kondo/ignore [:unused-binding]}
 (defn dir->expressions-info
   "Returns an expressions-info map for `dir` (a `String` or `File`, which must
   refer to a readable directory), or `nil` if or no expressions were found.
@@ -160,19 +160,23 @@
   ([dir] (dir->expressions-info dir nil))
   ([dir {:keys [include-hidden-dirs? include-zips?] :or {include-hidden-dirs? false include-zips? false} :as opts}]
    (when (lciu/readable-dir? dir)
-     (let [file-expressions (into {} (filter identity (e/pmap* #(try
-                                                                   (file->expressions-info %)
-                                                                   (catch Exception e
-                                                                     (log/warn (str "Unexpected exception while processing " % " - ignoring") e)
-                                                                     nil))
-                                                                (probable-license-files dir opts))))]
+     (let [file-expressions (into {} (filter identity
+                                             (lciu/file-handle-bounded-pmap
+                                               #(try
+                                                  (file->expressions-info %)
+                                                  (catch Exception e
+                                                    (log/warn (str "Unexpected exception while processing " % " - ignoring") e)
+                                                    nil))
+                                               (probable-license-files dir opts))))]
        (if include-zips?
-         (let [zip-expressions (into {} (filter identity (e/pmap* #(try
-                                                                      (zip->expressions-info %)
-                                                                      (catch Exception e
-                                                                        (log/warn (str "Unexpected exception while processing " % " - ignoring") e)
-                                                                        nil))
-                                                                   (zip-compressed-files dir opts))))]
+         (let [zip-expressions (into {} (filter identity
+                                                (lciu/file-handle-bounded-pmap
+                                                  #(try
+                                                     (zip->expressions-info %)
+                                                     (catch Exception e
+                                                       (log/warn (str "Unexpected exception while processing " % " - ignoring") e)
+                                                       nil))
+                                                  (zip-compressed-files dir opts))))]
            (lciei/prepend-source (lciu/filepath dir) (merge file-expressions zip-expressions)))
          (lciei/prepend-source (lciu/filepath dir) file-expressions))))))
 
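
The call sites changed in this file share one pattern: each worker returns a map, or `nil` after logging a failure; the `nil`s are filtered out; and the surviving maps are merged with `into {}`. Here is a minimal sequential sketch of that pattern. The stand-in `parse-or-nil` and the wrapper `merge-results` are hypothetical, and plain `map` replaces the parallel map so the example needs no dependencies.

```clojure
(defn parse-or-nil
  "Hypothetical stand-in for file->expressions-info: returns a one-entry map,
  or nil (after reporting the problem on stderr) when processing fails."
  [s]
  (try
    {s (Long/parseLong s)}
    (catch NumberFormatException _
      (binding [*out* *err*]
        (println (str "Unexpected exception while processing " s " - ignoring")))
      nil)))

(defn merge-results
  "Applies parse-or-nil to every input, drops the failures, and merges the
  remaining single-entry maps into one map."
  [inputs]
  (into {} (filter identity (map parse-or-nil inputs))))

(merge-results ["1" "oops" "3"])
;; => {"1" 1, "3" 3}
```

One failing input therefore costs only its own entry; the rest of the directory scan still produces results.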

src/lice_comb/impl/utils.clj

+13 -1
@@ -22,7 +22,8 @@
   lice-comb and may change without notice."
   (:require [clojure.string :as s]
             [clojure.java.io :as io]
-            [clj-base62.core :as base62]))
+            [clj-base62.core :as base62]
+            [embroidery.api :as e]))
 
 (defn mapfonk
   "Returns a new map where f has been applied to all of the keys of m."
@@ -326,3 +327,14 @@
      (if-not (s/blank? val)
        val
        default))))
+
+; Note: we could use OSHI to determine the actual number of possible open file
+; handles on the runtime environment, but it seems like overkill to bring in
+; such a large dependency for this one feature, especially when lice-comb
+; typically won't get close to opening this many files.
+(defn file-handle-bounded-pmap
+  "bounded-pmap* hardcoded to no more than 8192 virtual threads. This size is
+  determined conservatively from macOS, since it's the least common denominator
+  of the major OSes in terms of number of possible open file handles."
+  [f coll]
+  (e/bounded-pmap* 8192 f coll))
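
The idea behind a bounded parallel map is to cap how many invocations of `f` are in flight at once, so that a function that opens a file per item cannot exhaust file handles. The sketch below illustrates that idea only. It is not embroidery's implementation, it uses ordinary `future`s on platform threads rather than virtual threads, and the name `bounded-pmap-sketch` is hypothetical.

```clojure
(import '(java.util.concurrent Semaphore))

(defn bounded-pmap-sketch
  "Hypothetical helper: applies f to every element of coll in parallel, with at
  most n invocations of f running at any one time. Returns a vector of results
  in input order."
  [n f coll]
  (let [sem  (Semaphore. (int n))
        futs (mapv (fn [x]
                     ;; Block the submitting thread until a permit is free,
                     ;; so no more than n futures are ever in flight.
                     (.acquire sem)
                     (future
                       (try
                         (f x)
                         (finally
                           ;; Always hand the permit back, even if f throws.
                           (.release sem)))))
                   coll)]
    ;; Wait for every result; an exception thrown by f is rethrown on deref.
    (mapv deref futs)))

(bounded-pmap-sketch 2 inc [1 2 3 4])
;; => [2 3 4 5]
```

In a standalone script, call `(shutdown-agents)` when done; otherwise the thread pool behind `future` can keep the JVM alive for about a minute after the last task finishes.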

src/lice_comb/lein.clj

+5 -5
@@ -19,9 +19,9 @@
 (ns lice-comb.lein
   "Functionality related to combing Leiningen dependency sequences for license
   information."
-  (:require [embroidery.api :as e]
-            [lice-comb.deps :as lcd]
-            [lice-comb.impl.expressions-info :as lciei]))
+  (:require [lice-comb.deps :as lcd]
+            [lice-comb.impl.expressions-info :as lciei]
+            [lice-comb.impl.utils :as lciu]))
 
 (defn- lein-dep->toolsdeps-dep
   "Converts a leiningen style dependency vector into a (partial) tools.deps style
@@ -54,14 +54,14 @@
   expressions were found)."
   [deps]
   (when deps
-    (into {} (e/pmap* #(vec [% (dep->expressions-info %)]) deps))))
+    (into {} (lciu/file-handle-bounded-pmap #(vec [% (dep->expressions-info %)]) deps))))
 
 (defn deps->expressions
   "Returns a map of sets of SPDX expressions (`String`s) for each Leiningen
   style dep in `deps`. See [[deps->expressions-info]] for details."
   [deps]
   (when deps
-    (into {} (e/pmap* #(vec [% (dep->expressions %)]) deps))))
+    (into {} (lciu/file-handle-bounded-pmap #(vec [% (dep->expressions %)]) deps))))
 
 (defn init!
   "Initialises this namespace upon first call (and does nothing on subsequent

src/lice_comb/matching.clj

+2
@@ -49,6 +49,8 @@
     `:type` = `:concluded`):
     Indicates the approximate confidence lice-comb has in its conclusions for
     this particular SPDX identifier.
+  * `:confidence-explanations` (a set of keywords, optional):
+    Describes why the associated `:confidence` was not `:high`.
   * `:strategy` (a keyword, mandatory):
     The strategy lice-comb used to determine this particular SPDX identifier.
     See [[lice-comb.utils/strategy->string]] for an up-to-date list of all
