Work in progress: Fixing typosquatting+namesquatting threats in Python Package Index (PyPI).
Last updated: 20210318
20210301: A new kid on the block, RemindSupplyChainRisks, squats 3591 packages: [github.com/pypa/pypi-support]
20210209: A popular article surfaces about supply chain attacks: [Alex Birsan, medium.com: Dependency Confusion...]
20200519: An article about how --extra-index-url is often misunderstood: [Matt Kubilus, medium.com: Pip --extra-index-url, Considered Dangerous]
20191201: In-the-wild malware found on PyPI stealing PGP and SSH keys through the namesquatted python3-dateutil library [ZDNet]
20190717: The company ReversingLabs publishes a blog post, "SupPy Chain Malware", reporting 3 packages on PyPI that install backdoors, one of which has 82 downloads per month on average.
20181209: A GitHub repository by developer "roscoe" discloses 18 typosquatted packages, some uploading user credentials, others installing malware.
20181013: Independent security researcher Bertus announces a code analysis revealing 12 malicious packages on PyPI.
20180508: A post on Reddit explains how the package ssh-decorator sends a user's private SSH key to a server on the internet. The GitHub and PyPI projects have since been taken down. Evidence is inconclusive as to whether the original author of the package had their PyPI credentials hijacked or was complicit.
We have closed the Pytosquatting initiative for now, because the Python Security Response Team (PSRT) has announced that they will take action (see the timeline below).
In June 2016, "Typosquatting programming language package managers" stated that urllib2 had ~4,000 downloads in 2 weeks. But in June 2017, we found the same package name vacant, so we (being the good guys) squatted it for several months up until this disclosure. We take these findings seriously.
20170519: Steve Stagg writes about how he registered stdlib names, sent emails and that »I raised an issue on the official pypi github issue tracker in January. This also got no reply.«
20170628: PyPI Warehouse issue #2151 is opened. Title is "Block package names that conflict with core libraries", but no names were blocked.
20170913: We squatted all available names of stdlib packages (128) - scroll down to see statistics from pingbacks.
20170914: A number of in-the-wild malicious packages on PyPI are disclosed by the Slovak National Security Authority.
20170917: PyPI's main developer Donald Stufft creates PR#2396 for database-backed blacklisting of package names. It's unclear how they want to apply the blacklist, but it would mean a more efficient process for administrators. Most of the stdlib names that we squatted are blacklisted.
20170922: The Python Security Response Team (PSRT) takes action by announcing a detailed plan to mitigate future attacks. The plan is part of an overall boost of PyPI, which is receiving a $170k grant from the Mozilla Foundation.
Here are a couple of proposals that we originally posted, which have since been nicely expanded in PSRT's security announcement.
We had a pingback in the setup.py of the packages involved in Strategy #1, meaning that for a limited period we gathered statistics on the extent of the issue. The callback didn't collect any data from user systems, just an IP address, so that we could count each unique system that attempted to install a non-existing package that could have been exploited.
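The counting step described above can be sketched in a few lines. This is only an illustration and not the code that ran on our server: the function name count_unique_systems, the record format (package name, source IP) and the sample addresses are assumptions made for the example.

```python
from collections import defaultdict


def count_unique_systems(pingbacks):
    """Return {package_name: number_of_distinct_source_ips}.

    `pingbacks` is an iterable of (package_name, ip) pairs. The record
    format is an assumption for this sketch, not the real log format.
    """
    seen = defaultdict(set)
    for package, ip in pingbacks:
        # A set per package means repeated installs from one IP count once.
        seen[package].add(ip)
    return {package: len(ips) for package, ips in seen.items()}


# Hypothetical records, using IP addresses from the documentation ranges.
records = [
    ("urllib2", "192.0.2.10"),
    ("urllib2", "192.0.2.10"),   # same system installing again
    ("urllib2", "198.51.100.7"),
    ("timeit", "203.0.113.5"),
]
unique_counts = count_unique_systems(records)
print(unique_counts)
```

Counting distinct IPs rather than raw requests keeps a single machine that retries an install from inflating the numbers, although several systems behind one NAT address would still be counted as one.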
We are calling for an analysis of the current PyPI resources to find in-the-wild exploits of typosquatting, as the Slovak National Security Authority has done. We hope there are none, but the problem has been around for a long time, and our primer didn't get reactions from the PyPI admins.
Once done, we hope to achieve a better pip installer that:
It could look like this...
    pip install pipsec               # Install security-hardening plugin for pip
    pip install virtualenv-wrapper   # See that it fails
    pip install virtualenvwrapper    # This is correct
The closure of pip#4527 seems to hint that attempts to add security on the client side aren't popular. The arguments are weak, though, so there's no real reason not to do something like the above.
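To make the idea of a client-side check more concrete, here is a minimal sketch of how a requested name could be screened before installation. It is not an existing pip plugin or pip API: the function check_name, the helper normalize, the two small name lists and the 0.85 similarity cutoff are all assumptions chosen for illustration.

```python
import difflib
import re

# Illustrative lists only; a real tool would need maintained data.
STDLIB_NAMES = {"urllib2", "timeit", "subprocess", "collections"}
KNOWN_PACKAGES = ["virtualenvwrapper", "setuptools", "docutils", "cssselect"]


def normalize(name):
    """Lowercase and drop '-', '_' and '.' so separator tricks compare equal.

    This is stricter than PyPI's own name normalization and is an
    assumption of this sketch.
    """
    return re.sub(r"[-_.]+", "", name).lower()


def check_name(requested):
    """Return (ok, reason) for a package name the user wants to install."""
    if requested in KNOWN_PACKAGES:
        return True, "known package"
    if requested.lower() in STDLIB_NAMES:
        return False, "name shadows a standard library module"
    by_normalized = {normalize(p): p for p in KNOWN_PACKAGES}
    # difflib scores similarity between 0 and 1; 0.85 is an arbitrary cutoff.
    close = difflib.get_close_matches(
        normalize(requested), list(by_normalized), n=1, cutoff=0.85
    )
    if close:
        return False, f"looks like a typo of {by_normalized[close[0]]!r}"
    return True, "no close match found"


for name in ["virtualenv-wrapper", "virtualenvwrapper", "urllib2", "setuptols"]:
    print(name, check_name(name))
```

The stdlib list is hard-coded here because names such as urllib2 only exist in Python 2; on Python 3.10 and later, sys.stdlib_module_names could supply the current interpreter's module names instead. A similarity cutoff will produce false positives for legitimately similar names, so a real tool would more likely warn than refuse.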
Ars Technica: Devs unknowingly use “malicious” modules snuck into official Python repository
Golem.de: Bösartige Python-Pakete entdeckt (DE)
Hacker News: Malicious software libraries found in PyPI posing as well known libraries
Send comments or complaints to Benjamin Bach and Hanno Böck.
Check out the code for this website on https://github.com/benjaoming/pytosquatting.
Blocked stdlib installations from 20170913 to 20170916: 20188
On 20170916, PyPI removed our Top 20 of squatted packages, so our statistics won't match up anymore. They didn't remove the other 108 squatted packages.
Rank | Package | Average per day |
---|---|---|
1 | timeit | 10.1 |
2 | pkgutil | 2.9 |
3 | ntpath | 2.2 |
4 | urllib2 | 1.4 |
5 | subprocess | 0.9 |
6 | argparser | 0.8 |
7 | this | 0.8 |
8 | collections | 0.8 |
9 | setuptols | 0.7 |
10 | smtplib | 0.6 |
11 | shutil | 0.6 |
12 | venv | 0.6 |
13 | curses | 0.6 |
14 | idlelib | 0.6 |
15 | glob | 0.6 |
16 | docutil | 0.5 |
17 | base64 | 0.4 |
18 | concurrent | 0.4 |
19 | threading | 0.4 |
20 | csselect | 0.4 |