Sam Gentle.com

Illegal npm modules

Recently, I was a little surprised to learn that the FSF claims that depending on a library makes a derivative work and thus spreads the GPL. To me, that seems obviously ridiculous (how is it possible to create a derivative work of something you haven't modified? Or distributed? Or sometimes even looked at?). Indeed, you can have some fun watching the FSF tie itself into knots trying to avoid the inevitable conclusion that if you make a derivative work by calling code, then basically all non-GPL software is in violation of the GPL.

Anyway, this got me thinking about how in node-land, the common pattern is many, many small modules. Indeed, if I install the 60 most popular npm modules I get about 3000(!) modules in the dependency tree, including dupes. With so many modules it would be very easy to accidentally include a GPL library somewhere. If you did that, your npm module is (according to the FSF) in violation of the GPL and therefore (according to the FSF) in violation of international copyright law and therefore (according to the FSF) illegal.

I thought it would be fun to find out how many people are breaking Stallman's intergalactic copyright law, so I quickly grabbed the 60 most-starred modules and installed them. The npm registry is actually a couchapp, so this was pretty easy to do.


$ wget 'https://skimdb.npmjs.com/registry/_design/app/_view/browseStarPackage?group_level=1' -O starred.json
$ coffee -e "console.log require('./starred').rows.sort((a, b) -> a.value - b.value).slice(-60).map((x) -> x.key[0]).join('\n')" | xargs npm install

Then I waited for a very long time.
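
For anyone who doesn't have CoffeeScript handy, the selection step in that one-liner can be sketched in Python. This is only an illustration: the function name `top_starred` and the sample data are made up, and the row shape (`key` holding a one-element array with the package name, `value` holding the star count) is what the one-liner above implies rather than something I've checked against the registry.

```python
def top_starred(view, n=60):
    """Return the names of the n most-starred packages in a view result."""
    # Assumed row shape: {"key": ["package-name"], "value": star_count}
    rows = sorted(view["rows"], key=lambda row: row["value"])
    # Keep the last n rows of the ascending sort, i.e. the most starred.
    picked = rows[-n:] if n > 0 else []
    return [row["key"][0] for row in picked]


# Made-up sample in the assumed shape, not real registry data.
sample = {"rows": [
    {"key": ["alpha"], "value": 3},
    {"key": ["beta"], "value": 10},
    {"key": ["gamma"], "value": 7},
]}
print("\n".join(top_starred(sample, n=2)))
```

With the real download you would pass in the result of `json.load` on `starred.json` and pipe the printed names to `xargs npm install`, as above.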

For the next step I used the awesome licensecheck module (don't worry, it's not GPL - you can visit that page without creating a derivative work). Many npm modules don't include license information in their metadata, because programmers are lazy, so licensecheck employs various sophisticated techniques to figure out and normalise the licenses into a consistent output. I got back this:


$ licensecheck --tsv | awk '{print $3}' | sort -k1 | uniq -ci | sort -n | tail -12
   2 BSD*
   2 WTFPL
   2 WTFPL2
   3 AGPLV3
   4 BSD-like
   5 Do
  11 MISSING
  15 unmatched:
  41 Apache
  51 ISC
 160 BSD
 874 MIT
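
One thing to note about that pipeline: awk splits on any whitespace by default, so a multi-word license label gets cut down to its first word, which is probably where entries like `Do` and `unmatched:` come from. A rough Python sketch of the same tally that splits on tabs only is below. It assumes the license sits in the third tab-separated column of the `--tsv` output (which the `$3` above suggests but I haven't verified), and `tally_licenses` and the sample rows are invented for the example.

```python
from collections import Counter


def tally_licenses(tsv_text, column=2):
    """Count licenses case-insensitively (like `uniq -ci`), splitting on tabs only."""
    counts = Counter()
    labels = {}  # lower-cased license -> first spelling seen
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) <= column:
            continue  # skip blank or short lines
        name = fields[column].strip()
        key = name.lower()
        labels.setdefault(key, name)
        counts[key] += 1
    # Ascending by count, like `sort -n`.
    return sorted((n, labels[key]) for key, n in counts.items())


# Made-up rows in the assumed name / version / license layout.
sample = "a\t1.0.0\tMIT\nb\t2.0.0\tmit\nc\t0.1.0\tDo What You Like\n"
for count, name in tally_licenses(sample):
    print(f"{count:4d} {name}")
```

The numbers in the listing above come from the original awk pipeline, so they would not necessarily match what this version prints for the same install.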

Luckily, it seemed like there were no hidden time bombs deep in the dependencies of the most popular projects. The three AGPLV3 entries that turned up are all part of the fairly popular pm2 project. But if it's popular that probably means there are other things depending on it...


$ wget 'https://skimdb.npmjs.com/registry/_design/app/_view/dependedUpon?startkey=[%22pm2%22]&endkey=[%22pm2%22,{}]&reduce=false&include_docs=true' -O depends_on_pm.json
$ coffee -e 'console.log require("./depends_on_pm").rows.length'
26

26? Let's hope all of them are GPL! Especially because, if they aren't, any recursive dependents would also be in violation, probably without even realising it.
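
A quick aside on the query itself. The `startkey=["pm2"]` and `endkey=["pm2",{}]` pair is the usual CouchDB way to ask for every row whose key is an array starting with `"pm2"`: objects sort after strings, numbers and arrays in CouchDB's collation, so `{}` acts as an upper bound for whatever follows the package name. Here is a small sketch that builds the same URL for any package. The helper name `depended_upon_url` is mine, and it assumes the view's keys really are arrays that begin with the depended-upon package, as the query above implies.

```python
import json
from urllib.parse import urlencode

# View URL taken from the wget command above.
REGISTRY_VIEW = "https://skimdb.npmjs.com/registry/_design/app/_view/dependedUpon"


def depended_upon_url(package):
    """Build a view query for all rows whose key starts with `package`."""
    compact = {"separators": (",", ":")}  # no spaces inside the JSON keys
    params = {
        "startkey": json.dumps([package], **compact),
        # {} collates after any string, so it closes the key range.
        "endkey": json.dumps([package, {}], **compact),
        "reduce": "false",
        "include_docs": "true",
    }
    return REGISTRY_VIEW + "?" + urlencode(params)


url = depended_upon_url("pm2")
print(url)
```

`urlencode` percent-encodes the brackets and braces as well as the quotes, so the printed URL looks noisier than the hand-written one but decodes to the same parameters.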

$ coffee -e 'console.log require("./depends_on_pm").rows.map((x) -> "#{x.id}: #{x.doc.license || x.doc.versions[x.doc["dist-tags"].latest].licenses?[0].type}").join("\n")'
anthtrigger: MIT
bosco: MIT
bute: MIT
debian-server: MIT
diy-build: BSD
ecrit: MIT
ezseed: GPL
foxjs: MIT
g-dns: ISC
gatewayd-4: undefined
gitbook2edx-external-grader: BSD
hls-endless: MIT
hubba: undefined
itsy: MIT
lark-bootstrap: MIT
nodemvc: MIT
npm-collection-explicit-installs: MIT
nshare-demon: MIT
pm2-auto-pull: ISC
pm2-plotly: MIT
pod: undefined
radic: MIT
tesla: MIT
wordnok: MIT
yog-pm: BSD
zorium-paper-demo: undefined

...Oh dear.
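
For reference, the license lookup in that last one-liner can be sketched in Python like this: take the top-level `license` field if there is one, otherwise fall back to the first entry of `licenses` on the version that the `latest` dist-tag points to. The function name `package_license` and the sample documents are invented for illustration, and the document shape is only what the one-liner assumes. Where the CoffeeScript prints `undefined`, this returns `None`.

```python
def package_license(doc):
    """Top-level license if set, else licenses[0].type of the latest version."""
    if doc.get("license"):
        return doc["license"]
    latest = doc.get("dist-tags", {}).get("latest")
    version = doc.get("versions", {}).get(latest, {})
    licenses = version.get("licenses") or []
    if licenses and isinstance(licenses[0], dict):
        return licenses[0].get("type")
    return None  # nothing declared anywhere we looked


# Made-up documents in the assumed registry shape.
declared = {"license": "MIT"}
legacy = {
    "dist-tags": {"latest": "1.2.0"},
    "versions": {"1.2.0": {"licenses": [{"type": "BSD"}]}},
}
missing = {"dist-tags": {"latest": "0.0.1"}, "versions": {"0.0.1": {}}}

for doc in (declared, legacy, missing):
    print(package_license(doc))
```

Note that a `None` here only means the metadata doesn't declare a license, not that the package has none; that is the gap licensecheck tries to fill by other means.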