The point of the talk is that it is non-trivial to detect those dependencies.
It looks like most of the time was spent discussing Python. I suspect that is because it is possible to create software without an explicit build stage, so you would not receive warnings about a dependency until the code that uses it is called. If the software treats it as an optional dependency, you may never receive any warnings at all. This sort of situation is by no means unique to interpreted languages. You can write a program in C, then load a library at run time. (I've never tried this sort of thing, so I don't know how the compiler handles unknown identifiers/symbols.) Heck, even the Linux kernel is expected to run "hidden packages" (i.e. the kernel has no means of tracking the origin of the software you ask it to run).
Yes, you can write software to detect when an inspected application loads external binaries. No, it is not trivial (especially if the software developer was trying to hide a dependency).
And just a quibble: even bootstrapping requires the use of a binary (unless you go to unbelievably extraordinary measures).
pjmlp 11 hours ago [-]
Yeah, and Gentoo exists.
Except mankind uses other platforms as well, and even having the source code available isn't enough if no one is looking into it for vulnerabilities.
yjftsjthsd-h 11 hours ago [-]
> In almost all ecosystems, it is difficult to keep track of binary dependencies. When you depend on a package’s source code, this is normally recorded in your manifest file — pyproject.toml, package.json and so on. However, when you depend on a package’s precompiled binaries, this information is usually not recorded anywhere. This means that the binary dependency relationship between your project and whatever you’re depending on is hidden — so we can say that you have a phantom binary dependency.
I know it comes up every time... but nix does kinda exist to solve this problem. At least in pure mode.
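The "phantom" part of the quote is easy to demonstrate: the native libraries shipped inside wheels sit in site-packages but appear in no manifest. A rough sketch that surfaces them (the suffix list is an assumption covering the common platforms):

```python
# Sketch: list native binaries bundled inside an installed Python
# environment. These files ship inside wheels but are never named
# in pyproject.toml, so no manifest records them.
import pathlib
import sysconfig

NATIVE_SUFFIXES = (".so", ".dylib", ".pyd", ".dll")

def phantom_binaries(site_dir=None):
    """Return native libraries found under site-packages."""
    site = pathlib.Path(site_dir or sysconfig.get_paths()["purelib"])
    return sorted(p.relative_to(site)
                  for p in site.rglob("*")
                  if p.name.endswith(NATIVE_SUFFIXES))
```

Running this against a typical scientific-Python environment turns up dozens of shared objects (BLAS, libffi, compiled extension modules) that no lockfile or manifest mentions by name.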
pjmlp 11 hours ago [-]
Now we just have to improve its ergonomics, while supporting all existing operating systems in production.
okanat 9 hours ago [-]
I think the Conda ecosystem is the closest and has even better ergonomics than Nix. Especially with Pixi, it is a joy to use.
rekado 8 hours ago [-]
Conda does not solve the problems of deployment, and it offers no reproducibility guarantees. That's not surprising considering how Conda binaries are built.
okanat 7 hours ago [-]
That's why I emphasized Pixi. With Pixi you get a per-platform lockfile that guarantees installation of the exact versions.
If what you want is to deploy a server or development environment, you already get that with Pixi. If you want a Windows installer with bundled DLLs, you don't get that; however, that was never the goal.
pjmlp 9 hours ago [-]
If one is using Python.
All these suggestions always fall flat, because they are special cases for particular programming languages or operating systems.
okanat 7 hours ago [-]
Actually no. I use it to manage more and more non-Python dependencies like Protobuf compiler and LLVM tooling.
I am an embedded developer and we don't use Python for the main project. It is just scripting. It doesn't get rid of everything but it does make developer environment setup so easy.
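For illustration, mixing non-Python tooling into a Pixi manifest looks roughly like this; the package names and version pins below are hypothetical, and the exact names depend on what conda-forge publishes:

```toml
# Hypothetical pixi.toml sketch: native tooling pinned next to
# Python, with per-platform resolution recorded in pixi.lock.
[project]
name = "firmware-tools"
channels = ["conda-forge"]
platforms = ["linux-64", "win-64"]

[dependencies]
python = "3.11.*"
protobuf = "*"      # assumed conda-forge package name
llvmdev = "*"       # assumed conda-forge package name
```

Because the lockfile pins exact builds per platform, the "which binary did this come from" question has a recorded answer, unlike an ad-hoc system install.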
woodruffw 11 hours ago [-]
Seth Larson gave a talk on this (with a focus on Python as well) at PyCon US last year[1] as well.
It's a non-trivial issue, in terms of balancing conflicting interests: Python (like most interpreted languages) has a story for integrating native libraries, but that story is not particularly user friendly (in terms of users, Python developers, etc. not having the domain expertise to debug failing native builds). So these ecosystems tend to develop bespoke mechanisms for stashing native binaries inside package distributions, turning a build reliability problem into an introspection problem.
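The introspection problem is at least tractable for wheels, since a wheel is just a zip archive: a few lines can list the native binaries stashed inside one. A sketch (the suffix list is an assumption):

```python
# Sketch: a wheel is a zip archive, so the native binaries bundled
# inside it can be found by listing its members.
import zipfile

NATIVE_SUFFIXES = (".so", ".dylib", ".pyd", ".dll")

def bundled_binaries(wheel_path):
    """Return the native-library members inside a wheel file."""
    with zipfile.ZipFile(wheel_path) as whl:
        return [name for name in whl.namelist()
                if name.endswith(NATIVE_SUFFIXES)]
```

What this can't tell you is where those binaries came from or what they themselves link against, which is where the hard part of the introspection problem lives.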
This is one of the reasons I like having a nix flake in all of my projects that defines a dev environment, and integration with direnv to activate it. The flake lockfile, combined with the language-specific lockfile, gives a mostly complete picture of everything needed to build/deploy/develop the package.
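The setup described above can be sketched in a few lines; the nixpkgs pin and package choices here are placeholders, not a prescription:

```nix
{
  description = "Per-project dev shell (sketch)";

  # Pinning an input is what makes flake.lock record exact revisions.
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";

  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      devShells.x86_64-linux.default = pkgs.mkShell {
        # Everything the project needs at build time, listed explicitly.
        packages = [ pkgs.python312 pkgs.protobuf ];
      };
    };
}
```

With a one-line `.envrc` containing `use flake`, direnv drops you into this shell on `cd`, and flake.lock plus the language lockfile together pin both the source-level and binary-level dependencies.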
pabs3 16 hours ago [-]
Personally I like using Debian packages to keep track of source and binary dependencies.
https://bootstrappable.org/ https://lwn.net/Articles/983340/ https://github.com/fosslinux/live-bootstrap https://stagex.tools/
[1]: https://www.youtube.com/watch?v=x9K3xPmi_tg