Other than people who care about relatively obscure concerns like distro packaging, nobody is impeded in their work in any practical way by crates having a lot of transitive dependencies.
Author here. If I compile a package which has 1000 transitive dependencies written by different authors, there's ~1000 people who can execute arbitrary code on my computer, with my full user permissions. I wouldn't even know if they did.
That sounds like a massive security problem to me. All it would take is one popular crate to get hacked / bribed / taken over and we're all done for. Giving thousands of strangers the ability to run arbitrary code on my computer is a profoundly stupid risk.
Especially given its unnecessary. 99% of crates don't need the ability to execute arbitrary syscalls. Why allow that by default?
Rust can't prevent crates from doing anything. It's not a sandbox language, and can't be made into one without losing its systems programming power and compatibility with C/C++ way of working.
There are countless obscure holes in rustc, LLVM, and linkers, because they were never meant to be a security barrier against the code they compile. This doesn't affect normal programs, because the exploits are impossible to write by accident, but they are possible to write on purpose.
---
Secondly, it's not 1000 crates from 1000 people. Rust projects tend to split themselves into dozens of micro packages. It's almost like splitting code across multiple .c files, except they're visible in Cargo. Many packages are from a few prolific authors and rust-lang members.
The risk is there, but it's not as outsized as it seems.
Maintainers of your distro do not review code they pull in for security, and the libraries you link to have their own transitive dependencies from hundreds of people, but you usually just don't see them: https://wiki.alopex.li/LetsBeRealAboutDependencies
Rust has cargo-vet and cargo-crev for vetting of dependencies. It's actually much easier to review code of small single-purpose packages.
There’s two different attack surfaces - compile time and runtime.
For compile time, there’s a big difference between needing the attacker to exploit the compiler vs literally just use the standard API (both in terms of difficulty of implementation and ease of spotting what should look like fairly weird code). And there’s a big difference between runtime rust vs compile time rust - there’s no reason that cargo can’t sandbox build.rs execution (not what josephg brought up but honestly my bigger concern).
There is a legitimate risk of runtime supply chain attacks and I don’t see why you wouldn’t want to have facilities within Rust to help you force contractually what code is and isn’t able to do when you invoke it as a way to enforce a top-level audit. Even though rust today doesn’t support it doesn’t make it a bad idea or one that can’t be elegantly integrated into today’s rust.
I agree there's a value in forcing exploits to be weirder and more complex, since that helps spotting them in code reviews.
But beyond that, if you don't review the code, then the rest matters very little. Sandboxed build.rs can still inject code that will escape as soon as you test your code (I don't believe people are diligent enough to always strictly isolate these environments despite the inconvenience). It can attack the linker, and people don't even file CVEs for linkers, because they're expected to get only trusted inputs.
Static access permissions per dependency are generally insufficient, because an untrusted dependency is very likely to find some gadget to use by combining trusted deps, e.g. use trusted serde to deserialize some other trusted type that will do I/O, and such indirection is very hard to stop without having fully capability-based sandbox. But in Rust there's no VM to mediate access between modules or the OS, and isolation purely at the source code level is evidently impossible to get right given the complexity of the type system, and LLVM's love for undefined behavior. The soundness holes are documented all over rustc and LLVM bug trackers, including some WONTFIXes. LLVM cares about performance and compatibility first, including concerns of non-Rust languages. "Just don't write weirdly broken code that insists on hitting a paradox in the optimizer" is a valid answer for LLVM where it was never designed to be a security barrier against code that is both untrusted and expected to have maximum performance and direct low-level hardware access at the same time.
And that's just for sandbox escapes. Malware in deps can do damage in the program without crossing any barriers. Anything auth-adjacent can let an attacker in. Parsers and serializers can manipulate data. Any data structure or string library could inject malicious data that will cross the boundaries and e.g. alter file paths or cause XSS.
They can be fixed, but as always, there’s a lot of work to do. The bug that the above package relies on has never been seen in the wild, only from handcrafted code to invoke it, and so is less of a priority than other things.
And some fixes are harder than others. If a fix is going to be a lot of work, but is very obscure, it’s likely to exist for a long time.
Yes, true. But as others have said, there’s probably still some value in making authors of malicious code jump through hoops, even if it will take some time to fix all these bugs.
Are there any attempts to address this at the package management level (not a cargo-specific question)? My first thought is that the package could declare in its config file the "scope" of access that it needs, but even then I'm sure this could be abused or has limitations.
Seems like awareness about this threat vector is becoming more widespread, but I don't hear much discuss trickling through the grapevine re: solutions.
Package scope is typically too coarse - a package might export multiple different pieces of related functionality and you’d want to be able to use the “safe” parts you audited (eg no fs access) and never call the “dangerous” ones.
The harder bit is annotating things - while you can protect against std::fs, it’s likely harder to guarantee that malicious code doesn’t just call syscalls directly via assembly. There’s too many escapes possible which is why I suspect no one has particularly championed this idea.
> it’s likely harder to guarantee that malicious code doesn’t just call syscalls directly via assembly.
Hence the requirement to also limit / ban `unsafe` in untrusted code. I mean, if you can poke raw memory, the game is up. But most utility crates don't need unsafe code.
> Package scope is typically too coarse - a package might export multiple different pieces of related functionality and you’d want to be able to use the “safe” parts you audited
Yeah; I'm imagining a combination of "I give these permissions to this package" in Cargo.toml. And then at runtime, the compiler only checks the call tree of any functions I actually call. Its fine if a crate has utility methods that access std::fs, so long as they're never actually called by my program.
> Hence the requirement to also limit / ban `unsafe` in untrusted code
I think you’d be surprised by how much code has a transitive unsafe somewhere in the call chain. For example, RefCell and Mutex would need unsafe and I think you’d agree those are “safe constructs” that you would want available to “utility” code that should haven’t filesystem access. So now you have to go and reenable constructs that use unsafe that should be allowed anyway. It’s a massively difficult undertaking.
Having easier runtime mechanisms for dropping filesystem permissions would definitely be better. Something like you are required to do filesystem access through an ownership token that determines what you can access and you can specify the “none” token for most code and even do a dynamic downgrade. There’s some such facilities on Linux but they’re quite primitive - it’s process wide and once dropped you can never regain that permission. That’s why the model is to isolate the different parts into separate processes since that’s how OSes scope permissions but it’s super hard and a lot of boilerplate to do something that feels like it should be easy.
> I think you’d be surprised by how much code has a transitive unsafe somewhere in the call chain. For example, RefCell and Mutex would need unsafe and I think you’d agree those are “safe constructs” that you would want available to “utility” code that should haven’t filesystem access. So now you have to go and reenable constructs that use unsafe that should be allowed anyway. It’s a massively difficult undertaking.
RefCell and Mutex have safe wrappers. If you stick to the safe APIs of those types, it should be impossible to read / write to arbitrary memory.
I think we just don't want untrusted code itself using unsafe. We could easily allow a way to whitelist trusted crates, even when they appear deep in the call tree. This would also be useful for things like tokio, and maybe pin_project and others.
Because for a lot of companies, especially ones in industries that Rust is supposedly hoping to displace C and C++ in, dependencies are a much larger concern than memory safety. They slow down velocity way more than running massive amounts of static and dynamic analysis tools to detect memory issues does in C. Every dependency is going to need explicit approval. And frankly, most crates would never receive that approval given the typical quality of a lot of the small utility crates and other transitive dependencies. Not to mention, the amount of transitive dependencies and their size in a lot of popular crates makes them functionally unauditable.
This more than any other issue is I think what prevents Rust adoption outside of more liberal w.r.t dependencies companies in big tech and web parts of the economy.
This is actually one positive in my view behind the rather unwieldy process of using dependencies and building C/C++ projects. There's a much bigger culture of care and minimalism w.r.t. choosing to take on a dependency in open source projects.
Fwiw, the capabilities feature described in the post would go a very long way towards alleviating this issue.
This appears to be implying that rolling your own libraries from scratch is not a roadblock in C and C++, but somehow would be in Rust. That's a double standard.
Rust makes it easy to use third-party dependencies, and if you don't want to use third-party dependencies, then you're no worse off than in C.
Or, you know, leverage Go/.NET/JVM standard libraries for 99.999% of software and get shit done because there's more to memory safe solutions than just Rust.
Not to mention C/C++ dependency situation is a low bar to clear.
Indeed, you wouldn't really be participating in the "rust ecosystem" at that point. I'm not disputing that it'd be a lot more difficult. The experience would be similar to using C++.
No one is forcing you to use a dependency. Write the code yourself just like you would in another language. Or vendor the dependency and re-write/delete/whatever the code you don't like.
Sorry but down here in Earth, not having unlimited resources and time does force us to use dependencies if you want to get things done.
The line has to be drawn somewhere. And that line is much more reasonable when you can trust large trillion dollar backed standard libraries from the likes Go or .NET, in contrast to a fragmented ecosystem from other languages.
What good is vendoring 4 million lines of code if I have to review them anyway at least once? I'd rather have a strong MSFT/GOOGL standard library which I can rely upon and not have to audit, thank you very much.
I disagree i think avoiding dependencies is partly how we have these codebases like chromium's where you can't easily separate the functionally you want and deal with them as a library. That to me isn't minimalism.
There's probably a similar amount of code in the execution path, but the Rust ecosystem reliance on dependencies means that you're pulling in vast amounts of code that doesn't make it to your final application.
A C++ library author is much more likely to just implement a small feature themselves rather than look for another 3rd party library for it. Adding dependencies to your library is a more involved and manual process, so most authors would do it very selectively.
Saying that - a C++ library might depend on Boost and its 14 million LOC. Obviously it's not all being included in the final binary.
> A C++ library author is much more likely to just implement a small feature themselves rather than look for another 3rd party library for it.
This is conflating Javascript and Rust. Unlike Javascript, Rust does not have a culture of "microdependencies". Crates that get pulled in tend to be providing quite a bit more than "just a small feature", and reimplementing them from scratch every time would be needlessly redundant and result in worse code overall.
Rust may not have "left pad" type micro-dependencies, but it definitely has a dependency culture. Looking at `cargo tree` for the medium size project I'm working on, the deepest dependency branch goes to 12 layers deep. There's obviously a lot of duplicates - most dependency trees eventually end with the same few common libraries - but it does require work to audit those and understand their risk profile and code quality.
>Rust does not have a culture of "microdependencies"
It absolutely does by the C/C++ standards. Last time I checked the zed editor had 1000+ dependencies. That amount of crates usually results in at least 300-400 separately maintained projects by running 'cargo supply-chain'. This is an absurd number.
Other than people who care about relatively obscure concerns like distro packaging, nobody is impeded in their work in any practical way by crates having a lot of transitive dependencies.