Context: I was working on getting a computer to autonomously play the first-person shooter Valorant. The code consisted of a Qt UI, video capture and image manipulation using OpenCV, an object-detection neural network using PyTorch, and a 2.4GHz nrf24 radio connected via USB to hijack a Logitech dongle. The AI kind of works - you can see the successes and bugs in the video if you're curious, or go to the actual page for the Valorant AI video for some more info.
All of the code was written in Python, but it was a very different application from what I have experience in. The biggest difference was that this dipped into "scientific computing", which is a quite different world from the world of web apps, cloud services, and Docker containers that I knew.
This was a big takeaway for me. I'd never really used Python notebooks before, but the lure of free GPU time in Google Colab was enough to get me to see the light.
Python notebooks like Jupyter Lab and Google Colab are just really awesome for exploratory programming. In my case, I was doing a ton of fiddling with image processing stuff, and the idea that I could print a numpy matrix (e.g. a manipulated image) into a notebook as part of running the code was incredibly helpful. Many of the images shown in the video were saved straight from my notebooks (most of the minimap and team bar images).
Tuning the circle detection, manipulating the black-and-white minimap, scanning the team bar for player portraits... all of it was made much much easier when I could see what I was doing step by step. And in the end, I just had to copy and paste code out of the notebook into my actual Python program to "productionize" it.
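The workflow looked roughly like this - a minimal sketch, with a synthetic image standing in for a captured minimap frame (the real code pulled frames from OpenCV):

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in for a captured minimap frame; in the real program this array
# would come out of OpenCV's video capture.
frame = (np.random.rand(64, 64) * 255).astype(np.uint8)

# One processing step: threshold to black-and-white...
bw = (frame > 128).astype(np.uint8) * 255

# ...and inspect the result right there. In a notebook cell, imshow
# renders the matrix inline, so each tweak gets instant visual feedback.
plt.imshow(bw, cmap="gray")
```

Each cell tweaks one step and shows the intermediate image, which is exactly the loop that made tuning thresholds and filters bearable.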
On the flip side, I can see how overdependence on notebooks can result in notebook hell, as much of the ML world seems to run on shared Google Colab notebooks that bit rot or are tough to get running locally. (No, a Colab notebook is not a substitute for documentation.)
I know Python is a pretty slow language - I once had to optimize some math routine using Cython - but the scientific computing community has really stretched Python to its limits. Many parts of my program deal with computer vision, which means image processing, which means matrix operations. The idea that I can write what looks like idiomatic-ish Python code but actually calls out to some pile of matrix-processing native code was super powerful, and I was impressed that scipy and OpenCV use numpy matrices so I never had to hop between "opencv world" and "numpy world". This is what powered things like checking the minimap, checking the team bar, detecting which parts of the map are visible to my team, and also some stuff having to do with pathfinding.
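As a rough illustration of what that "idiomatic-ish" code looks like - the frame and the red-pixel check here are invented for the example, not my actual minimap logic:

```python
import numpy as np

# A fake 100x100 RGB frame; in the real program this array comes
# straight out of OpenCV, whose Python bindings return numpy arrays.
frame = np.zeros((100, 100, 3), dtype=np.uint8)
frame[40:60, 40:60] = [255, 0, 0]  # paint a 20x20 red square

# Each line below reads like ordinary Python but runs as a single
# vectorized operation in native code:
red = frame[:, :, 0].astype(np.int32)
green = frame[:, :, 1].astype(np.int32)
is_reddish = (red - green) > 100   # boolean mask, computed in C
count = int(is_reddish.sum())      # how many "red" pixels

# Because everything is a plain numpy array, the same `frame` can be
# handed to cv2 functions or scipy.ndimage filters with no conversion.
```

That last point is the "no hopping between worlds" part: one array type flows through capture, filtering, and analysis.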
The other interesting performance bit is that my program is multithreaded. One thread ingests video frames, one thread does the "old-school" computer vision, one thread does the PyTorch computer vision, etc. Usually, people bring up the Python GIL to explain why you can't or shouldn't multithread in Python, but the fine print is that the GIL only locks the Python interpreter. If you're doing something outside the interpreter in native code or, more importantly here, in GPU land, that's an opportunity for another Python thread to run. Basically, the more native code you call out to from Python, the faster your code runs, but also the more parallelizable your code becomes.
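A stripped-down sketch of that pipeline shape - the stage names and the placeholder "work" are hypothetical, standing in for the OpenCV and PyTorch stages that actually release the GIL:

```python
import queue
import threading

frames = queue.Queue(maxsize=4)  # bounded so capture can't race ahead
results = []
lock = threading.Lock()

def capture():
    """Producer thread: stand-in for video frame ingestion."""
    for frame_id in range(10):
        frames.put(frame_id)
    frames.put(None)  # sentinel: no more frames

def vision_worker():
    """Consumer thread: placeholder for the CV / inference stage.
    Real cv2 or GPU calls here would drop the GIL, so these threads
    genuinely overlap instead of taking turns in the interpreter."""
    while True:
        frame = frames.get()
        if frame is None:
            break
        processed = frame * 2  # pretend this is heavy native work
        with lock:
            results.append(processed)

producer = threading.Thread(target=capture)
worker = threading.Thread(target=vision_worker)
producer.start()
worker.start()
producer.join()
worker.join()
```

The queue-plus-sentinel pattern is what keeps the stages decoupled: each thread only knows about its input queue, not about the others.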
There was one area where performance really took a hit: A* search over the map, which was implemented in pure Python. Getting pathfinding to be fast could open up some interesting possibilities that require running it very frequently, and I might try optimizing it with Cython in the future.
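For reference, a pure-Python grid A* looks something like this (a generic sketch with a Manhattan-distance heuristic, not my exact map representation) - note that every node expansion is interpreter work, which is exactly why this part was slow:

```python
import heapq

def astar(grid, start, goal):
    """A* over a 2D grid of 0 (walkable) / 1 (wall) cells.
    Returns the list of cells from start to goal, or None."""
    def h(p):  # Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), start)]
    came_from = {start: None}
    g_score = {start: 0}
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:  # walk parents back to start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] == 0:
                ng = g_score[node] + 1
                if ng < g_score.get(nxt, float("inf")):
                    g_score[nxt] = ng
                    came_from[nxt] = node
                    heapq.heappush(open_heap, (ng + h(nxt), nxt))
    return None
```

Every dict lookup, tuple allocation, and comparison above happens in the interpreter, which is the kind of inner loop Cython is good at flattening.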
One problem mentioned in the page about the Valorant AI is that I wasn't able to talk to the nrf24 radio in Windows. But to get to that point, I had to wade through a bunch of code that assumed it was running on Linux (e.g. checking for and complaining about sudo).
It turns out there are something like 3 different ways of checking what OS you're on in the standard library: os.name, sys.platform, platform.system(). Which should I use? Are there some cases when one is preferable over the other?
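For the record, here's what the three report - plus the rule of thumb I've seen (not gospel): `sys.platform` for "is this Windows/macOS/Linux?" branches, `os.name` for posix-vs-nt logic, and `platform` for human-readable diagnostics:

```python
import os
import platform
import sys

# Three levels of granularity for the same question:
print(os.name)            # coarse: 'posix' or 'nt'
print(sys.platform)       # 'linux', 'win32', 'darwin', ...
print(platform.system())  # runtime lookup: 'Linux', 'Windows', 'Darwin', ...

# Typical usage for the sudo-checking case from above:
on_windows = sys.platform == "win32"
if not on_windows and os.name == "posix":
    pass  # only here does checking for root/sudo even make sense
```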
Given that Windows is experiencing something of an open-source renaissance and that Mac OS is not and will never be the same as Linux, I do think it's worth accounting for some platform differences in the code you write. Someday some poor soul running Windows will try to use the code that you developed on Mac.
In the end I was never able to talk to that USB radio, but I don't think it was PyUSB's fault.
At some point in the project, I got a new computer and ported my code to run on Windows (mostly). I decided to use miniconda instead of PyPI for the hell of it, and because it seems to be really popular in the scientific and ML community. When in Rome, right? For example, PyTorch offers conda install commands in addition to pip commands.
It ended up being only a partial success. The first time I installed PyTorch and Jupyter Lab, things more or less went as intended. About a month later, when I had to install PyTorch on yet another computer, I got endless dependency issues (the side remark in the video about "only taking an entire day to install the dependencies" is somewhat true). The PyTorch installation commands somehow became out of date, and I only got PyTorch installed after trying a bunch of different package versions and Anaconda channels.
I kind of get why Anaconda has different channels, but I can't imagine a non-CS grad student figuring out when to use, say, "pytorch" or "nvidia" or "conda-forge" or no channel specifier. I personally ended up with packages from 4 or 5 different channels but did finally make the dependency solver happy ...
... until I tried to install PySide2 (Python/Qt bindings). PySide2 has a lot of GUI dependencies. So does torchvision, the PyTorch computer vision package. Somewhere in the dependency trees are incompatible dependency specs, and Conda's solution was to install a very old version of torchvision that somehow satisfied the dependency solver, but was *not* compatible with my program.
At the end of the day, I just installed PySide2 from PyPI using pip, dependency solver be damned.
Edit: There was a similar issue on Linux, where I installed both opencv and pyside2 via PyPI. If your program ends up loading multiple conflicting versions of Qt, it will crash. So, dependencies of dependencies do matter! The solution in that case was to use opencv-python-headless, which doesn't depend on Qt.
Most of the bugs I ran into during development would have been avoided with a good static type system. I can't imagine that the time cost of adding types would've been as much or more than the time cost of dealing with Yet Another Type Error.
They're usually trivial to fix, but there are just so many of them hiding in plain sight every time one piece of code interfaces with another. Yes, I know that mypy, pyre, etc. exist, and at this point I'm eagerly waiting for all but one of them to die out and for broad support from packages.
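A toy example of the kind of interface bug I mean - the function and names here are invented for illustration, not from my actual code:

```python
from typing import List, Tuple

Point = Tuple[int, int]

def minimap_to_world(points: List[Point], scale: float) -> List[Tuple[float, float]]:
    """Convert minimap pixel coordinates to world coordinates."""
    return [(x * scale, y * scale) for x, y in points]

# With annotations, a checker like mypy flags this call before runtime:
#     minimap_to_world((3, 4), 2.0)   # error: expected a list of points
# Without them, the bug only surfaces when the code path actually runs,
# as Yet Another Type Error at the seam between two pieces of code.
coords = minimap_to_world([(3, 4)], 2.0)
```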
One semi-related issue seems to be deviation from Python norms. In regular Python, you can check if a collection (list, set, etc.) has no items by checking if it's falsy. That doesn't work for PyTorch tensors or numpy arrays: coercing one with more than one element to bool raises an error, because it's ambiguous whether you mean "is it empty", "is any element true", or "are all elements true".
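A quick demonstration with numpy (PyTorch tensors behave analogously, raising their own error):

```python
import numpy as np

empty_list = []
if not empty_list:   # idiomatic Python: an empty collection is falsy
    pass

arr = np.array([1, 2, 3])
try:
    if arr:          # numpy refuses to guess what you mean
        pass
except ValueError as e:
    print(e)         # "The truth value of an array with more than one element is ambiguous..."

# The unambiguous spellings:
is_empty = arr.size == 0
any_set = bool(arr.any())
all_set = bool(arr.all())
```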
At this point, I don't know what language would be better if I want to quickly get up and running with neural networks, image processing, matrix processing, etc. So, yes, my next project in this field will probably be in Python.