When I was young, my parents taught me not to accept candy from strangers,
unless they were present and approved of it, because there was a small risk
of very bad things happening.
It was of course a simplistic rule, but it had to be easy enough to follow
for somebody who wasn't proficient (yet) in the subtleties of social
interactions.

One of the reasons why it worked well was that following it wasn't a big
burden: candy was plentiful at home and actual offers were rare. I only
remember missing one piece of candy because of it, and while it may have
been a great one, the ones I could have at home were also good.

Unlike candy, offers of gratis software from random strangers are quite
common: from suspicious-looking websites to legit and professional-looking
ones, to platforms that are explicitly designed to allow developers to
publish their own software with little or no checks.

Just as with candy, there is also a source of trusted software: the Linux
distributions, especially those led by a community. I mention mostly Debian
because it's the one I know best, but the same principles apply to Fedora
and, to some measure, to most of the other distributions.
Like good parents, distributions can be wrong, and they do leave room for
older children (and proficient users) to make their own choices, but still
provide a safe default.

Among the unsafe sources there are many different cases, and while they do
share some of the risks, they have different targets with different issues;
for brevity the scope of this article is limited to the ones that mostly
concern software developers: language-specific package managers and
software distribution platforms like PyPI, npm and RubyGems.

These platforms are extremely convenient both for the writers of libraries,
who can publish their work with little hassle, and for the people
who use such libraries, because they provide an easy way to install and use
a huge amount of code. They are of course also an excellent place for
distributions to find new libraries to package and distribute, and this,
I agree, is a good thing.

What I do believe, however, is that getting code from such sources and using
it *without carefully checking it* is even more risky than accepting candy
from a random stranger on the street in an unfamiliar neighbourhood.

The risks aren't trivial: while you probably won't be taken hostage for
ransom, your data could be, or your devices and those of the people who run
your programs could be used in some criminal act, causing at least some
monetary damage both to yourself and to society at large.

If you're writing code that has to be maintained over time there are also
other risks even when no malice is involved, because each package on these
platforms has a different policy with regard to updates, their backwards
compatibility and what can be expected in case an old version is found to
have security issues.

The very fact that anybody can publish anything on such platforms is
both their biggest strength and their main source of vulnerability:
while most of the people who publish their libraries do so with good
intentions, attacks have been described and publicly tested, such as the
fun `typo-squatting`_ one (`archived URL`_) that published harmless
stand-ins for malicious code under common typos of the names of famous
libraries.

.. _`typo-squatting`: http://incolumitas.com/2016/06/08/typosquatting-package-managers/
.. _`archived URL`: http://web.archive.org/web/20160801161807/http://incolumitas.com/2016/06/08/typosquatting-package-managers/
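
To make the idea concrete, here is a minimal sketch of how a requested
package name could be checked for being suspiciously close to a well-known
one. It is only an illustration under stated assumptions: the short list of
well-known names, the ``possible_typosquats`` helper and the ``0.8``
similarity cutoff are all my own choices for the example, not something
that any of the platforms provides.

```python
import difflib

# Hypothetical, very short list of well-known package names; a real check
# would need a much larger, curated list.
KNOWN_PACKAGES = ("requests", "numpy", "django", "setuptools", "urllib3")


def possible_typosquats(name, known=KNOWN_PACKAGES, cutoff=0.8):
    """Return the known names that `name` resembles without matching exactly.

    An empty list means either an exact match with a known name or no
    close resemblance to any of them.
    """
    normalized = name.lower()
    if normalized in known:
        return []
    # The cutoff is an arbitrary choice for this sketch, not a tuned value.
    return difflib.get_close_matches(normalized, known, n=3, cutoff=cutoff)


if __name__ == "__main__":
    for candidate in ("requests", "reqeusts", "djago", "leftpad"):
        print(candidate, possible_typosquats(candidate))
```

A check like this only catches names that look like the ones in the list;
it says nothing about whether the code inside a package can be trusted.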

Contrast this with Debian, where everybody can contribute, but before
they are allowed full unsupervised access to the archive they have to
establish a relationship with the rest of the community, which includes
meeting other developers in real life, at the very least to get their
gpg keys signed.

This doesn't prevent malicious people from introducing software, but it
significantly raises the effort required to do so, and once caught,
people can usually be prevented from repeating it much more effectively
than with a simple ban on an online-only account.

It is true that not every Debian maintainer actually does a full code
review of everything that they allow in the archive, and in some cases
it would be unreasonable to expect it, but in most cases they are
familiar enough with the code to do at least bug triage, and
most importantly they are in an excellent position to establish a
relationship of mutual trust with the upstream authors.

Additionally, package maintainers don't work in isolation: a growing
number of packages are maintained by a team of people, and most
importantly there are aspects that potentially involve the whole
community, from the fact that new packages that enter the distribution
are publicly announced on a mailing list to the various distribution-wide
QA efforts.

Going back to the language-specific distribution platforms, sometimes
even the people who manage the platform themselves can't be fully
trusted to do the right thing: I believe everybody in the field
remembers the `npm fiasco`_ where a lawyer's letter requesting the removal
of a package started a series of events that resulted in potentially
breaking a huge number of automated build systems.

.. _`npm fiasco`: https://lwn.net/Articles/681410/

Some of the problems there were due to technical policies that made the
whole ecosystem especially vulnerable, but one big
issue was the fact that the managers of the npm platform are a private
entity with no oversight from the user community.

Not all distributions are equal in this respect, but contrast this with
Debian, where the distribution is managed by a community that is based on a
`social contract`_ and is governed via democratic procedures established
in its constitution_.

.. _`social contract`: https://www.debian.org/social_contract
.. _constitution: https://www.debian.org/devel/constitution

Additionally, the long history of the distribution model means that many
issues have already been met, the mistakes have already been made, and
there are established technical procedures to deal with them in a better
way.

So, should we avoid language-specific distribution platforms altogether?
No! As developers we aren't children, we are adults who have the skills
to distinguish between safe and unsafe libraries just as well as the
average distribution maintainer can.
What I believe we should do is stop treating them as a safe source that
can be used blindly and reserve that status for actually trustworthy sources
like Debian, falling back to the language-specific platforms only when
strictly needed, and in that case:

* actually check carefully what we are using, both by reading the code
  and by analysing the development and community practices of the
  authors;
* if possible, share that work by becoming maintainers of that library
  ourselves in our favourite distribution, to prevent duplication of
  effort and to give back to the community whose work we benefit
  from.