Asked  6 Months ago    Answers:  5   Viewed   26 times

Occasionally I run into comments or responses that state emphatically that running pip under sudo is "wrong" or "bad", but there are cases (including the way I have a bunch of tools set up) where it is either much simpler, or even necessary to run it that way.

What are the risks associated with running pip under sudo?


Note that this is not the same question as this one, which, despite the title, provides no information about risks. This also isn't a question about how to avoid using sudo, but about specifically why one would want to.

 Answers

18

When you run pip with sudo, you run setup.py with sudo. In other words, you run arbitrary Python code from the Internet as root. If someone puts up a malicious project on PyPI and you install it, you give an attacker root access to your machine. Prior to some recent fixes to pip and PyPI, an attacker could also run a man-in-the-middle attack to inject their code when you download a trustworthy project.
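To make the point concrete, here is a minimal sketch of a guard that an install wrapper could use to notice it is about to run with root privileges. It assumes a POSIX system, and the helper names `running_as_root` and `guard_against_root` are hypothetical, not part of pip.

```python
import os


def running_as_root() -> bool:
    """Return True when the effective user is root (POSIX only)."""
    # os.geteuid does not exist on Windows, so treat that case as "not root".
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0


def guard_against_root(is_root: bool) -> str:
    """Hypothetical helper: decide what an install wrapper should do."""
    if is_root:
        # Any setup.py executed by the install would inherit root privileges.
        return "refuse: setup.py would execute with root privileges"
    return "proceed: install runs with the current user's privileges"


if __name__ == "__main__":
    print(guard_against_root(running_as_root()))
```

The check itself is trivial; the point is only that whatever code the install runs gets the same privileges as the process that launched it.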

Tuesday, June 1, 2021
 
kwichz
answered 6 Months ago
22

The main risks are:

  • Session hijacking
  • Session fixation

Consider using the OWASP guidelines to defend against them.

Also have a look at:

PHP Security Guide

Sunday, July 11, 2021
 
ranhan
answered 5 Months ago
64

I asked this question on the FreeNode #pip channel. The following is my interpretation of the replies I got there. Thanks go to agronholm and dstufft from #pip for answering my question.

Packages can be maintained on PyPI in three different ways:

  1. Directly on PyPI. If a package is hosted on PyPI, no additional switch is required to install it. The connection to PyPI is secured by HTTPS, so the downloads are considered trusted.

  2. On an external site, with PyPI storing a secure checksum of the relevant files. In this case pip requires the --allow-external switch to proceed. While the download might potentially come from an unsecured server, downloaded files are checked against the secure checksum stored on PyPI. Because of that, this case is also considered secure.

  3. On an external site, without PyPI storing any checksum. In this case there is no way to ensure that the download is safe. --allow-external is not enough to enable installation in this case; pip requires --allow-unverified.

Therefore, --allow-external alone is considered a safe switch, and only using --allow-unverified is a potential security issue. This is also why pip has an --allow-all-external option, but no --allow-all-unverified.
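The checksum idea in case 2 can be sketched as follows. This is only an illustration of checking a download against a trusted digest, not pip's actual code: SHA-256 is an assumption (the replies did not say which algorithm PyPI stored), and `sha256_of` and `verify_download` are hypothetical names.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_download(data: bytes, expected_sha256: str) -> bool:
    """Accept the file only if its digest matches the trusted one.

    The expected digest must come from a trusted channel (here: the
    index reached over HTTPS), not from the server hosting the file.
    """
    actual = sha256_of(data)
    # compare_digest performs the comparison in constant time.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())


if __name__ == "__main__":
    payload = b"contents of some downloaded archive"
    trusted = sha256_of(payload)  # stands in for the stored checksum
    print(verify_download(payload, trusted))
    print(verify_download(payload + b"tampered", trusted))
```

With no trusted digest to compare against (case 3), there is nothing for a check like this to verify, which is why that case needs the separate switch.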

As a side note, --allow-external was introduced not as a security feature, but due to the potential speed, uptime and convenience issues while dealing with third party websites.

Tuesday, August 3, 2021
 
Trav L
answered 4 Months ago
69

I would advise against ever calling any pip somecommand (or pip3) script directly. Instead it's much safer to call pip's executable module for a specific Python interpreter explicitly, something of the form path/to/pythonX.Y -m pip somecommand.

There are many advantages to this, for example:

  • It makes explicit which Python interpreter the projects will be pip-installed for (Python 2 or 3, inside the virtual environment or not, etc.)
  • For a virtual environment, one can pip-install (or do other things) without activating it: path/to/venv/bin/python -m pip install SomeProject
  • Under Windows this is the only way to safely upgrade pip itself: path\to\venv\Scripts\python.exe -m pip install --upgrade pip

But yes, if everything is set up perfectly, then python3 -m pip install SomeProject and pip3 install SomeProject should do exactly the same thing. There are far too many cases, though, where something is wrong with the setup, things don't work as expected, and users get confused (as shown by the many questions about this topic on this platform).
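The same advice applies when pip is driven from a script. A minimal sketch, where `pip_command` is a hypothetical helper that builds the argument list for a specific interpreter (by default the one running the script):

```python
import sys
from typing import List


def pip_command(*args: str, python: str = sys.executable) -> List[str]:
    """Build a 'python -m pip ...' command line for one specific interpreter."""
    # Going through the interpreter path avoids depending on whichever
    # 'pip' or 'pip3' script happens to come first on PATH.
    return [python, "-m", "pip", *args]


if __name__ == "__main__":
    # Print the command rather than running it; it could be passed to
    # subprocess.run(cmd, check=True) to actually invoke pip.
    print(pip_command("install", "SomeProject"))
```

Passing a virtual environment's interpreter as `python` targets that environment without activating it, which is the second advantage listed above.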

References

  • Brett Cannon's article "Why you should use python -m pip"
  • pip's documentation section on "Upgrading pip"
  • venv's documentation section on "Creating virtual environments": "You don’t specifically need to activate an environment [...]"

Thursday, August 19, 2021
 
Minar Mahmud
answered 4 Months ago
96

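You can set up two-way communication over anonymous pipes using the CreatePipe() Win32 API call, so piping input/output is not the only way. Each anonymous pipe is one-way, so two-way communication takes one pipe per direction. You simply get new file handles you can give to the other process.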
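For illustration, here is a minimal Python sketch of an anonymous pipe using `os.pipe()`, which returns the read end and the write end of an OS-level anonymous pipe. It is not the Win32 code the answer refers to, and it stays inside a single process so that it runs as written. In real use, one end would be handed to the other process. `round_trip` is a hypothetical helper name.

```python
import os


def round_trip(message: bytes) -> bytes:
    """Write a small message into an anonymous pipe and read it back."""
    # Keep the message well below the pipe buffer size: writer and reader
    # share one thread here, so an oversized write would block forever.
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, message)
        os.close(write_fd)  # closing the write end signals end-of-data
        write_fd = None
        chunks = []
        while True:
            chunk = os.read(read_fd, 4096)  # synchronous, blocking read
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        if write_fd is not None:
            os.close(write_fd)
        os.close(read_fd)


if __name__ == "__main__":
    print(round_trip(b"hello through the pipe"))
```

Every read and write in this sketch blocks, which is the synchronous behaviour the following paragraphs describe as the main cost of anonymous pipes.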

Anonymous pipes are based on shared memory, but do not support asynchronous operations (through ReadFileEx, WriteFileEx). Performance issues (disadvantages) are thus 1) synchronous operation, 2) memory copying, 3) interprocess synchronization. Generally, "raw" shared memory (memory mapped file without an actual backing file) and named pipes will be faster, and sockets and Window messages will be slower (than anonymous pipes).

You cannot use I/O Completion Ports (IOCP) with anonymous pipes. Instead you'll have to "poll" the pipe, incurring extra context switches. In addition to serializing the data, the serialized data has to be copied around in memory, as you cannot write directly to the shared memory. One process also has to wait for another, implying that the other process has to signal an interprocess synchronization primitive. Performance depends heavily on the size of the messages (ratio of read/write calls to amount of data sent), because for each read/write the process has to make context switches and possibly spin/sleep.

All methods except "raw" shared memory require memory copying and some sort of interprocess signaling (synchronization), so the main disadvantage of anonymous pipes is synchronous operation. You'll hit the ceiling when transmitting a large number of messages, when the CPU spends most of its time doing context switches.

Performance-wise, named pipes are better because you can have worker thread(s) processing async notifications using IOCP, and can even receive multiple messages with one call, thus reducing the API overhead. If you are making your own components, giving the pipe a name is well worth the extra trouble (and you can even connect across networks). Later Windows versions implement special routing for local sockets, which also support IOCP.

The fastest method would be to use shared memory directly, but then you will have to take care of interprocess synchronization yourself. It's possible to implement lock-free pipes yourself, but in case you're not constantly transmitting data, you'll still have to use synchronization primitives to notify/wake the listening process up.
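As a rough illustration of the shared-memory approach, here is a sketch using Python's `multiprocessing.shared_memory` module (Python 3.8+) in place of the Win32 memory-mapped file calls. Both the "writer" and the "reader" live in one process so the example runs as written, and it deliberately leaves out synchronization. A real setup would add a lock, event, or similar primitive, as the paragraph above says. `share_bytes` is a hypothetical helper name.

```python
from multiprocessing import shared_memory


def share_bytes(payload: bytes) -> bytes:
    """Copy payload into a named shared block and read it via a second handle."""
    # Writer side: create a named block (size must be at least 1 byte).
    writer = shared_memory.SharedMemory(create=True, size=max(1, len(payload)))
    try:
        writer.buf[:len(payload)] = payload
        # Reader side: another process would attach using the same name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            # The block may be rounded up to a page size, so slice explicitly.
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # release the block once nobody needs it


if __name__ == "__main__":
    print(share_bytes(b"ping"))
```

Nothing here tells the reader when the data is ready; that signalling is exactly the part you have to build yourself with this approach.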

Also note that using files does not imply that you would be limited by disk speed, as you can use memory mapped files and even with normal files, the cache manager handles the reading/writing. The other methods (RPC, clipboard, etc) are based on these "standard" methods, which means they'll just add some extra layer of protocol and are meant to be easier/more helpful, or more suitable for some programming environment (but not to be faster).

Thursday, September 16, 2021
 
ranhan
answered 3 Months ago