> You will not be forced to work inside something known as a “virtual environment.”
Oof, this terrible advice cancels out an otherwise reasonable post. Beginners who don't know what they're doing are the last people who should be `pip install -r requirements.txt`-ing into the system Python the way this article is recommending. That's not only going to make working on multiple projects nearly impossible (especially for the kind of beginning students who get recommended Anaconda, which are almost always the Data Science-y crowd using NumPy, Pandas, Scikit, etc, which are notoriously finicky with version conflicts), but it stands a good chance of breaking other Python-based system utilities in completely opaque ways. This sort of advice can fubar a naive user's entire workflow.
I know virtualenvs suck to explain to people, but in my opinion it needs to be done before you ever tell them about `pip install`.
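For reference, the minimal workflow using only the tools that ship with Python looks something like this (a sketch; the `sys.prefix` check is just to show where things land):

```shell
# Create an isolated environment inside the project folder
python3 -m venv .venv

# Activate it (bash/zsh); the prompt gains a "(.venv)" prefix
. .venv/bin/activate

# python and pip now resolve inside .venv, not the system install,
# so a later `pip install -r requirements.txt` can't touch the OS
python -c "import sys; print(sys.prefix)"

# Leave the environment when done
deactivate
```

The whole point is that everything pip writes stays under `.venv/`, which you can delete and recreate at any time.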
Yes. I recently had to debug my gf’s entire work computer setup getting fubar’d from installing a pip package at a user level that conflicted with the name of some package used internally by the 10k+/seat/year software the company devices have installed. Proximal cause? The installer “helpfully” failed over to installing at the user level when she didn’t have permissions to write to the place she told it to. Root cause of course is that million dollar fancy fancy enterprise software somehow being unable to give itself an isolated python package namespace, or provide any relevant errors when conflicts occurred.
The python packaging system has to be used in case studies of worst design decisions of all time.
I actually like Python's package management (with just pip and venv). It's different to all other modern solutions, taking a per-shell rather than per-project approach, but that doesn't mean worse.
The advantage is that it's less "magic" than, say, npm. No need for rules saying how to find node_modules, you just have a PYTHONPATH env variable.
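A quick illustration of that: module lookup is driven entirely by the path list, so pointing PYTHONPATH at a directory is all it takes (`mylib` here is a stand-in file created for the demo, not a real package):

```shell
# Create a throwaway module to import
mkdir -p vendored
printf 'GREETING = "hi"\n' > vendored/mylib.py

# No resolution algorithm walking parent directories: the import
# system searches exactly the directories PYTHONPATH lists
PYTHONPATH=vendored python3 -c "import mylib; print(mylib.GREETING)"
# prints "hi"
```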
The Rust and Java approach is to do everything through a build tool's CLI, and I can't complain about that. It's probably the best compromise: Less magic than npm, and more user-friendly than Python.
I was talking more about the package lookup when running code, via e.g. `node script.js`. I think it looks in all parent directories of the CWD, or maybe of the script? It's not too complicated, but it is "more magic" IMO.
Actually building Python packages is pretty complex, but that's the case for JS too. Java avoids this by distributing compiled libraries.
It looks up from script.js's directory, or the nearest parent with a node_modules folder, not the CWD. A lot like how `import somemodule` in script.py tries to import a somemodule.py file from the directory script.py is in.
Traversing to the parent is especially nice for scripts. In python having scripts outside the module directory is quite painful.
Gyp is a lot easier than Python setup.py. The "easy" Java packages are comparable to pure Python/pure JS packages. With e.g. C bindings JNI/Java packaging is a horrid pain.
NPM’s original sin was making package installation and general management so painless that folks installed micro packages for everything. Can’t really fault the software for that.
The issue was, as you say, the introduction of ESM. It used to be that you required modules one way and one way only (yes, there was AMD for advanced use cases, but it was an add-on), then people felt the need to “standardize” that, and now we have this mess of ESM and CJS.
Review and pin your direct dependencies. With transitive dependencies it doesn't differ from trusting large dependencies in general.
The alternative to micropackages has significant downsides: pulling in extra surface from large dependencies, or rolling your own buggy implementations while waiting for some committee to spend years bikeshedding on a standard one.
Making the right thing easy rather than the wrong thing hard is a lot better approach.
If you don’t vendor your dependencies. Which is a poor practice that is commonly associated with NPM, but is by no means a requirement of the technology.
Especially if…, I should say. Even vendored dependencies are a risk as NPM commits the additional sin of allowing the act of pulling a package onto the local machine for inspection to execute arbitrary code in the form of “postinstall” hooks.
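For what it's worth, npm can be told to skip those hooks when you only want to pull a package down for inspection. A per-project or per-user `.npmrc` containing:

```ini
ignore-scripts=true
```

disables preinstall/postinstall (and other lifecycle) scripts, at the cost of breaking the handful of packages that genuinely need a build step; the same effect is available one-off via `npm install --ignore-scripts`.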
The Python community needs to solve this ASAP. This almost weekly pain point has turned a language I used to love in college into one of my most despised languages. The fact that the ML community uses this broken platform is infuriating.
Make a Python version 4 that focuses only on fixing the packaging. 100% per-project hermeticity. No global packages whatsoever. Solve just this issue and bring the entire ecosystem on board. Kill all the various virtualenvs, the anacondas, global packages. All of it.
Learn from Rust/Cargo. That project does it mostly right (sans lack of namespaces and reproducible builds).
It took 10 years to force a breaking migration from Python 2 to Python 3 in order to handle Unicode...
I have resigned myself to repeatedly smashing my keyboard on the desk until it rains keys in blind-rage-induced frustration to let off steam, and then calmly creating an issue in the appropriate repo, instead of allowing even a modicum of hope that this will ever be fixed systemically.
Well it constantly crashes and requires every shop to develop their own half-broken way of interfacing between it and other industry-standard software, but at least ~half the pixels you’ve seen in any modern movie or tv show have been touched by it at one point or another. So that’s interesting.
I teach using python. You should try explaining virtual environments to students, most of them with no programming experience, or any real notion of the state of a computer system. I do, because we recommend students use them. Every class consists of endless debugging of student systems.
After that you might understand the OP's viewpoint.
Not sure how helpful this advice is for students. I'm in my late 30s, have programming experience, but had never touched Python until about 6 months ago.
What made virtual environments work for me was switching to a different editor. I currently use Visual Studio Code, which works well with virtual environments.
When you first create a new Python file in a given "project folder" it prompts you to create a new venv, and when you switch project folders it remembers and restores the venv for each project.
One of my work colleagues pointed me to VSCode - it streamlined a lot of python things for me.
If your students are disciplined about creating a new folder for each project, VS Code could help them manage virtual environments.
The only issue I have is that my work office has a corporate proxy, and pip needs a certificate to connect (and if I work remotely I have to turn this setting off), so I wrote a shell script to toggle between the two proxy settings. Not sure if a university will have the same proxy issues, but if so this would certainly be a pain point for many students.
This is a suggestion that deserves serious consideration. I started using VS Code after I had learned all the basics so I don't know what it feels like to be a new learner who is getting started by using VS Code. But it sounds like it worked for you. (And on campus, the proxy should not be an issue.)
It does answer the question of which text editor to use when the time comes to teach them about editing text files. For a course, it helps that it is free and available on both macOS and Windows.
I have a vague recollection that at some point I was testing without an install of Python and after I selected the Python extension for VS Code, it offered to install one for me. (I don't recall if this was on macOS or Windows and memory could be playing a trick on me.) But in any case, the people behind VS Code do seem to be trying very hard to make it easy for someone who is getting started.
I am a little uneasy about the nudges to use Copilot, but on balance, it might offer a better path for students who are getting started.
After that I wonder why people think Python is a good choice to teach as a first experience in programming. Virtual environments is actually my biggest gripe with Python.
I think it is interesting to consider why Python is so popular despite such fiascos. The answers can be very informative but can also be giant red flags.
C/C++? Full of footguns. Java/C#? Too complicated for beginners. Pascal? Outdated. Etc.
I personally prefer C to teach first-year CS students but just as the lesser evil. A good first programming language is sorely lacking.
(Note that I'm talking about the imperative programming paradigm. The debate on whether one should start with functional programming is outside the scope of this comment).
You can teach imperative programming with OCaml or Racket, too. :)
C is indeed pretty evil, though slightly less so with modern sanitisers, since you can get a better error message than just a segfault (or silently doing the wrong thing).
Pascal isn't really more outdated than C. Especially if you use Delphi?
Python is actually fine: you can get pretty far with just the standard library and the libraries you can install with your Linux distribution's package manager (e.g. via pacman). I do agree that package management with Python is pretty bad out-of-the-box.
It's weird you throw out Java, because it is the teaching language. It's the stick everything else is measured against. CS101 is a sea of Eclipse. Having a programming environment where all the major structures are discoverable through menus is actually pretty useful.
Java is popular as a first language, but I think it's just due to its (bygone?) general popularity. I don't think a language that forces you into a one-class-per-file mindset from day one has any business being used for teaching; students should start from the basics ("commands"/function calls, if, while, for, ...) and then discover what more complex concepts do within that framework.
Java's tooling is nice, but frankly Turbo Pascal (or something at its level) is enough for beginners.
Not for complete beginners, but for beginners lightly familiar with programming it is overall not difficult to teach the basics; often easier than with stubborn seniors who want to do it their way only :)
// C# top-level program: fetch the thread's HTML and slice out the first comment body
var start = "commtext c00\">";
var end = "</div>";
using var http = new HttpClient();
var page = await http.GetStringAsync("https://news.ycombinator.com/item?id=41425416");
var commStart = page.IndexOf(start) + start.Length;
var commEnd = page.IndexOf(end, commStart);
Console.WriteLine(page[commStart..commEnd]);
It’s not just the packaging situation that makes Python a bad first language. Whenever I write in Python, I find it completely impossible to stop myself from constantly making basic, beginner-level programming mistakes. Every time I miss having a compiler and strict typing that will yell at me when I’ve done something stupid.
Modern linters and type-checkers for Python come pretty close to something usable for these situations. But it's certainly something they tacked on to the language afterwards.
I like the recent addition of (proper) pattern matching to Python.
I switched to Rust full time about 6 months ago and recently had to write a somewhat long Python script for something else... it felt awfully quiet without the compiler yelling at me, but that also meant I was making the same silly mistakes over and over and over and over
You’re swapping something that can be feasibly debugged, or at least easily burnt down and rebuilt, for something that’s impossible to untangle once they’ve done two projects with different or conflicting deps into the base environment.
The projects I work with have moved to it because it handles multiple languages (not just Python). It's made things easy to keep "the correct version of a language" that each project needs.
I think Docker would be a much better alternative. You can just hand students a pre-built Docker image with everything installed, and they are free to mess up the system without any consequences.
But then you're explaining virtualization and networking to your students, how to manage interactions in and out ("I wrote a program in Txtedit, how can I run it?"), when it's started and when it's not etc.
I could see giving them a remote access to a prebuilt env (which can be containerized) to be simpler to explain.
I'm finding it very difficult to see what exactly is the problem with teaching a minimum of computer and operating system fundamentals to people that have to learn programming.
Not every person wants to become a software engineer, anymore than every person wants to learn to become a plumber or car mechanic. The vast majority, and by vast majority I mean more than 95% of students who are taught to code are training to be scientists, mathematicians, engineers, economists, accountants, or some such. They need to run simulations to understand their field. Any pain in the way of that is a failure of those who create software for a living.
If you're going to be spending a substantial part of your waking life working on a tool, spending a few hours to get the basics right is not an unreasonable ask.
And as it turns out, all non-trivial programming languages that have some sort of packaging system or module system have some version of the problems involved.
You don't see people complaining that musical instruments are unreasonably complex when that complexity usually gets solved after a couple months of training, or when someone who wants to do basic woodworking has to have at least a passing knowledge of the different types of hardwoods, MDFs, and essential joinery techniques.
Hate to say it, but it definitely sounds like a skills issue.
I'd argue it's a lot more than "minimum". You need them to be comfortable with the host system, and also manage the running container, which will have a different OS most of the time, and the virtualization mechanism (docker here).
Simple things like having a student access another student's web server becomes overly complicated, as you're dealing with 4 systems talking to/through each other.
But the fundamental fact is that "accessing someone else's web server" is not that trivial of an affair and has never been when you get down and dirty into the details of networking.
Then why teach Python? All of these issues are because Python is a ridiculously poorly designed language which is still having new features added to it. It's generally a mess.
The issues are with one particular facet of poor design and management. The rest of it is manageable, if distasteful, and worth the effort for the ecosystem (in general).
There really is more than one particular facet of poor design in Python. I honestly remain confused why people use and love Python so much. It very much says something.
Oh, there are many. There are just comparatively few that you can point at as being quite as directly responsible for dev misery as the packaging system.
IMO what needs to happen is Python needs to get rid of system installs and only work via venv. It should check the current folder for some .venv or something and auto-configure itself for the current directory (or complain that you need to configure it, with a message like "python not configured for current folder. Run 'source bin/activate' to configure."). Honestly, that could be so much better.
The entire problem of Python and venv IMO comes from it not being the default.
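That default doesn't exist today, but you can approximate the behaviour described above with a small shell wrapper (a sketch; `py` is a made-up name, not a real tool):

```shell
# Hypothetical wrapper: prefer ./.venv automatically, complain otherwise
py() {
    if [ -x ".venv/bin/python" ]; then
        ".venv/bin/python" "$@"
    else
        echo "python not configured for current folder." >&2
        echo "Run 'python3 -m venv .venv' to configure." >&2
        return 1
    fi
}
```

With that in your shell profile, `py script.py` always uses the project's own environment or tells you to create one, which is roughly the behaviour being asked for.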
That depends very much on what you're doing. For production I'd like to use the OS package manager to install my Python dependencies, and move the responsibility of patching to the OS.
For workstations, absolutely, go with virtual environments, it's the only way to go. One concern I've seen from some in the machine learning space is that rebuilding a virtual environment, for example if macOS upgrades the Python version, takes hours. That can be solved by using pyenv, then you can have multiple versions of Python and be free of Anaconda. I primarily use pyenv to be sure that I have the same Python version that ships with the server OS on my laptop.
I have been out of the loop with Python for a couple years at this point, but how could this get so bad?
If you are looking into a system you are unfamiliar with, where do you look first?... pip? pipx? homebrew?... or is it in anaconda? pyenv?... must be in the os package manager... apt? pacman?
Honestly, Maven and NPM look great compared to this mess.
NPM to me is the great example of a package manager that is worse than anything the Python community has come up with, but that's subjective I think.
Python isn't as bad as people make it out to be. There are some issues that you will run into if your project/code base becomes really large, but that's not a problem for most people. The vast majority can just use `python -m venv .venv` to set up a virtual environment and install into that using pip.
Next step up is you need specific versions of Python, so you switch to pyenv, which functions mostly like the built-in venv.
Then you have the special cases where you need to lock dependencies much harder than pip can, so you use poetry; or pip is too slow, so you use pipx. Those are edge cases, to be honest. That's not to say they aren't solving very real problems, but mostly you don't need them.
It would be great if there was one tool that could do it all: lock the Python version, do quick dependency resolution, and lock dependencies down.
So far Python has opted to split the problem: Tools for locking the Python version, tools for creating separate environments and tools for managing packages. There's a lot of crossover in the first two, but you can pretty much mix and match virtual environment tools and package managers anyway you like. I think that's pretty unique.
It's definitely unique, but the "out of the box" experience is suffering for it in my opinion.
I actually recently had to fix something small in a Python project, pip refused to work because of homebrew, homebrew didn't have the dependencies and directed me to pipx, pipx finally worked - It was a strange experience.
And for the record NPM mostly has a bad reputation because of its past... nowadays it's perfectly usable and can lock dependencies and node version "out of the box".
> It's definitely unique, but the "out of the box" experience is suffering for it
The tooling that ships with Python could be much better, I'd agree with that. You can go pretty far with venv and pip, but just the python3 -mvenv .venv isn't exactly a great example of user friendliness at work.
What's your problem with NPM? It just works, it's easy to package for, it handles multiple version dependencies elegantly, with pnpm it doesn't trash your disk and is really fast.
(P)NPM and CJS is the best package/dependency management there currently is.
This is insane. Apparently (at least on Debian) this can be circumvented by putting this in ~/.config/pip/pip.conf:
[global]
break-system-packages = true
I would have preferred single-version-externally-managed to keep fond memories of setuptools alive.
It becomes increasingly impossible to track down home directory pollution and config files in Python. Next step will be a Python registry on Linux. How about:
Hmm that still recommends that distros allow admins to install to /usr/local, albeit in such a way that it at least can't break the OS.
IMO the idea that a 'Linux admin' is better informed than a 'Linux user' is increasingly anachronistic. In most cases the admin is just the user running sudo. I'd suggest that such functionality should be enabled by installing some kind of OS package rather than being the default.
> I know virtualenvs suck to explain to people, but in my opinion it needs to be done
This is because Python doesn't have versioned imports, which means you can't have multiple versions of the same package in the same environment, but I like to dream about a world where this isn't the case. If instead of import foo we had import foo@x.y.z#optional-checksum, the Python world would be massively improved. It seems like it would be such a simple change too.
Virtualenvs solve the lack of separate workspace-local package installations, not a lack of versioned imports. Versioned imports are not a good solution to separate installations of packages: code is harder to upgrade, cluttered, and encouraged to depend on specifics rather than contracts. There's a reason every major language localizes its version pinning into a per-project dependencies file (which can sit anywhere on the spectrum between "contracts-only semver ranges" and "checksummed/vendored lockfile"). And that’s before considering the troublesome behaviors that emerge when you permit importing multiple versions of the same library in the same program, if you want that too.
> And that’s before considering the troublesome behaviors that emerge when you permit importing multiple versions of the same library in the same program, if you want that too.
Rust has versioned imports. So you can import different versions of the same module (at least in your transitive dependencies).
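Concretely, Cargo lets one crate graph carry two versions of the same crate side by side, and you can even do it directly in your own manifest by renaming one of them (a sketch, using `rand` purely as an example):

```toml
[dependencies]
rand = "0.8"
# Same crate at an older version, imported under a different name
rand_old = { package = "rand", version = "0.7" }
```

Rust then treats types from the two versions as entirely distinct, which is exactly where the "two copies of the same global state" trouble discussed below can creep in.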
Yep. And while that’s not something I’d call a mistake per se, it can be troublesome. What if a crate you depend on at different versions provides access to global state? What if a transitive dependency embeds definitions from that crate into public contracts consumed by code using a different version?
Rust offers mitigations or means of detection for those cases, but they still require thought and troubleshooting when they occur. Given Python’s lack of static typing and more-likely-to-be-nonexpert user base and usage patterns, I suspect that troublesomeness would not be a value-add for the Python platform.
Well, for what it's worth, Rust's linter clippy likes to warn you about using different versions of the same crate. I think the warning is mostly formulated in terms of helping you reduce compile times, but your concerns about mutability might also be justified.
I have never broken my Python installation on Windows (official installer) despite recklessly installing anything. If I had broken it, I would have just uninstalled and reinstalled.
While not familiar with macOS (topic of this article), I think the Mac installer works the same.
On Debian of course you can cause great damage, because the system Python is used for critical system functions. This is silly; I think the system Python should be an isolated install in /usr/sbin. Better yet, move back to Perl for the system.
This is a good point and something I hadn't thought about. I'm also a happy Windows user, and have never had a problem with Python installations. But like you say, Python isn't baked into the system.
I do use one piece of commercial software that includes Python. You can see the installer saying "installing Python." I suppose it should fall on the OS and software vendors to not abuse the infrastructure... "to whom much is given, much is expected," but maybe too much to expect.
To be clear, there's absolutely nothing broken about the system Python in the article. There is just a shell alias causing the anaconda version of Python to be launched instead when you type `python3` at the command prompt...
Building Python is so quick and easy I’ve built an interpreter per project in some circumstances. Sometimes more if I want to make sure it’s compatible with multiple Python versions.
It doesn't, but it's typically the simplest (for me). It also means I get exactly the version I want compiled with the best optimizations for that machine.
Typically building from source is done to include optional components such as with the ./configure --with-pydebug option. With many projects make doc is an option because so many prefer to skip that part of the build in favor of online documentation.
Once Xcode's python is installed, `pip3 install ...` will install libraries to that python installation (assuming you haven't aliased pip3 to another installation, as it appears the anaconda installer did)
The install by xcode of a python3 binary in `/usr/bin` does not come with its own copy of pip3.
So either your command will fail to install or it will use a version of pip that is on PATH but was installed by some other version of python. So in either case you can't install a library into what you are mistakenly calling a system python.
So the original assertion was not true and you have not been able to make up an ex post justification for it.
I was having trouble understanding the scenario you seem to have in mind because it never occurred to me that someone would try to run Python without doing an install from python.org as I explicitly recommend.
If someone does install a version of Python from python.org, it puts the bin folder for that version first on PATH via this line in .zprofile:
It also puts a symlink for python3 and pip3 into the `/usr/local/bin` folder that comes ahead of `/usr/bin`.
So if the user runs
`pip3 install ...`
it will not find the version in `/usr/bin` because there will be two other directories ahead of it on the path that have an instance of `pip3`.
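An easy way to see this resolution order on any machine (substituting `pip3` for `python3` shows the same thing for pip):

```shell
# List every matching executable on PATH, in search order;
# the first hit is the one a bare command name will run
which -a python3

# Confirm which one the shell will actually execute
command -v python3
```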
As an aside, if they do the improbable and run something like
`/usr/bin/pip3 install ...`
or if they do not have any Python from python.org installed and run
`pip3 install`
what they will end up with is a user-install that puts libraries under their `~/Library` directory because (at least on an Apple Silicon mac) pip can't write to `/usr/bin`. This fallback to a user-install is confusing to people who encounter it, but it is very different from "installing to the system python."
To summarize, it is exceedingly unlikely that a student who is not comfortable running commands from the terminal is going to make the mistake you seem to be worried about and end up with libraries in the user-install location:
1. They do not install an official Python even though that is exactly what I recommend.
2. They do install Xcode or the Xcode Command Line Tools, then try to use `pip3 install ...`