
> Safe rust has 0 UB

Safe Rust aims for 0 UB, but I don't think you can make the claim that it absolutely has no UB.

This program SEGFAULTs on my system (macOS), because it's reading an invalid memory address due to a stack overflow:

  const N: usize = 1024*1024*1024;

  fn main() {
      let var: [u8; N] = [0; N];
      println!("var: {:?}", var);
  }


Safe Rust has no undefined behavior. Undefined behavior does not mean no crashing, it means that the semantics of the program are undefined.

Rust's semantics are to abort on a stack overflow. Languages like C and C++ have no such semantics; they may abort, or they may continue running and produce gibberish.


The fact that this program results in reading/writing an unmapped memory address means it’s doing an out-of-bounds access. It segfaults on macOS because the runtime/OS has allocated the stack such that the overflow results in a bad memory access, but that is a behavior of the runtime/OS/hardware, not the language.

I guarantee I could exploit this on a system that does not have virtual memory, or a runtime that does not have unmapped addresses at the end of the stack, to, say, manipulate the contents of another thread’s stack. Therefore, this behavior is undefined.


The language runtime can require that the OS & hardware always result in an exception on stack overflow (or, alternatively, compile in explicit checks for it). Running the program in an environment without that is, technically, just as wrong as running it on a system where integer addition does multiplication.

Now perhaps this means that there are real rust deployments that are "wrong", but that shouldn't include regular sane standard systems, and embedded users should know the tradeoffs.

https://godbolt.org/z/Y75KTT87M:

    .LBB3_1:
            sub     rsp, 4096
            mov     qword ptr [rsp], 0
            cmp     rsp, r11
            jne     .LBB3_1

That's a loop at the start of your 'main' that probes the stack specifically to ensure a segfault definitely happens if your array didn't fit on the stack.


> It segfaults on macOS because the runtime/OS has allocated the stack such that the overflow results in a bad memory access, but that is a behavior of the runtime/OS/hardware, not the language.

Stack overflows are checked in C on macOS not because of guard pages but because the compiler emits stack checks (with cookies). Probably the same is true here.

> I guarantee I could exploit this on a system that does not have virtual memory, or a runtime that does not have unmapped addresses at the end of the stack, to, say, manipulate the contents of another thread’s stack. Therefore, this behavior is undefined.

That's implementation-defined, not undefined.


> Stack overflows are checked in C on macOS not because of guard pages but because the compiler emits stack checks (with cookies).

Compiler-emitted stack checking is optional and not the default, and definitely not what is causing the crash here.

> That's implementation-defined, not undefined.

How could an implementation reasonably define the behavior for a stack overflow that silently corrupts another variable?


> Compiler-emitted stack checking is optional and not the default, and definitely not what is causing the crash here.

It is the default on macOS for clang.

> How could an implementation reasonably define the behavior for a stack overflow that silently corrupts another variable?

Mandate stack checking.


Software stack checking does not guarantee protection from stack overflows wreaking havoc. E.g., your thread could blow its stack, then get preempted before the stack checker can run.

Mandating guard pages/MPU protection would rule out targeting embedded platforms which lack sufficient hardware support.


> Software stack checking does not guarantee protection from stack overflows wreaking havoc.

Yes it does, unless you're violating the memory model. Or are you thinking of Unix signals? Those do seem a bit harder to implement perfectly.

> Mandating guard pages/MPU protection would rule out targeting embedded platforms which lack sufficient hardware support.

Such systems are not secure if they don't have IOMMUs. But you can always emulate everything in software, and you must do so here.


> Yes it does, unless you're violating the memory model.

Overflowing the stack violates the memory model.

> Such systems are not secure if they don't have IOMMUs.

Secure in what sense? I was under the impression that Rust could run on embedded devices like the ARM Cortex-M3, but maybe I'm wrong.


What does preemption change here? Before the stack checker has finished, nothing else should hold a reference to any of the yet-unchecked stack. That's plenty trivial to ensure. (unless you mean preemption somehow breaking the stack checker itself, in which case, well, that's a broken stack checker and/or preemption, and should be fixed)

If you can't have hardware support, it's trivial for the compiler to do it in software - just an "if (stack_curr - stack_end < desired_size) abort();". I can't imagine a platform where you cannot reasonably get a lower bound for the range of stack available. Worst-case, you ditch the architectural stack pointer and manage your own stack on the heap, if that's what you need to ensure correct Rust behavior on your funky platform (or accept the non-compliant compromise of no stack checking).
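
A minimal sketch of that software check in Rust, modelling a downward-growing stack with plain integers. The names `StackBounds`, `StackOverflow`, and `reserve_frame`, and all addresses, are hypothetical; real generated code would do the comparison in the function prologue and abort rather than return an error.

```rust
/// Hypothetical model of a downward-growing stack: `curr` is the current
/// stack pointer, `end` is the lowest address the stack is allowed to use.
#[derive(Debug, Clone, Copy, PartialEq)]
struct StackBounds {
    curr: usize,
    end: usize,
}

/// Marker error standing in for the abort a real check would trigger.
#[derive(Debug, PartialEq)]
struct StackOverflow;

impl StackBounds {
    /// Mirrors `if (stack_curr - stack_end < desired_size) abort();`.
    /// The check happens before the stack pointer moves, so nothing can
    /// observe memory below `end`.
    fn reserve_frame(&mut self, desired_size: usize) -> Result<usize, StackOverflow> {
        let available = self.curr - self.end;
        if available < desired_size {
            return Err(StackOverflow);
        }
        self.curr -= desired_size;
        Ok(self.curr)
    }
}

fn main() {
    // 16 KiB of stack between 0x4000 and 0x8000 (made-up addresses).
    let mut stack = StackBounds { curr: 0x8000, end: 0x4000 };
    println!("{:?}", stack.reserve_frame(4096));
    println!("{:?}", stack.reserve_frame(1 << 30));
}
```

The second call fails without moving the stack pointer, which is the property being argued for: the oversized frame is rejected before any of it is written.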


> What does preemption change here? Before the stack checker has finished, nothing else should hold a reference to any of the yet-unchecked stack.

If your thread overflows the stack, it could start writing into memory for which it does not hold a reference. If the thread is preempted before the stack checker can run (see below*) and detect the overflow, and another thread runs which accesses the now-corrupted memory, then you're hosed.

> just an "if (stack_curr - stack_end < desired_size) abort();"

That's not how the compiler-emitted stack checking works AFAIK (*I believe it uses canaries on the stack which are checked at certain points in code). But, I could see this solving the problem. Basically, for every instruction that manipulates the stack pointer (function calls, allocas, and on some architectures interrupts, which use the current stack), the resulting address would need to be checked. That would be costly and require OS awareness, but I think it would be safe. Is this an option that the compiler provides? It would save me a lot of time debugging.


Canaries are a separate unrelated thing solving a different problem - buffer overruns, i.e. writing out-of-bounds. (canaries are a best-effort thing and don't guarantee catching all such problems, and they're also useless for safe Rust where unchecked OOB indexing is not a thing; whereas stack overflow checking can be done precisely)

In my sibling comment showing the assembly that your Rust program generates, it is writing a "0" every 4096 bytes of the stack range that is intended to be later used as the buffer (this "0" is independent from the "0" in your "[0; N]"; it's just an arbitrary value to ensure that the page is writable). It does this, once, at the very start of the function, before everything else (i.e. before the variable "var" even exists, much less is accessible by anything or even initialized). This is effectively exactly the same as my "if (stack_curr - stack_end < desired_size) abort();", just implemented via guaranteed page faults. You can enable this on clang & gcc with -fstack-clash-protection where supported.
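
As a rough illustration of that probe loop, here is a simulation in Rust that works on integer addresses only. The function `probe_frame`, the `is_mapped` closure (a stand-in for the MMU), and the addresses are all hypothetical, and rounding the frame up to whole pages is a simplification of this sketch rather than a claim about what LLVM emits.

```rust
const PAGE: usize = 4096;

/// Walks down from `rsp` one page at a time and "touches" each page, in the
/// spirit of the `sub rsp, 4096 / mov [rsp], 0 / cmp rsp, r11 / jne` loop.
/// Returns Ok(new stack pointer) if every page was writable, or
/// Err(address) for the first unmapped page, which is where the real
/// program would take its segfault.
fn probe_frame(
    rsp: usize,
    frame_size: usize,
    is_mapped: impl Fn(usize) -> bool,
) -> Result<usize, usize> {
    // Simplification: round the frame up to a whole number of pages.
    let pages = (frame_size + PAGE - 1) / PAGE;
    let target = rsp - pages * PAGE;
    let mut sp = rsp;
    while sp != target {
        sp -= PAGE;
        if !is_mapped(sp) {
            return Err(sp);
        }
    }
    Ok(sp)
}

fn main() {
    // Made-up layout: stack top at 1 MiB, everything below 512 KiB unmapped.
    let mapped = |addr: usize| addr >= 0x8_0000;
    println!("{:x?}", probe_frame(0x10_0000, 64 * 1024, mapped));
    println!("{:x?}", probe_frame(0x10_0000, 0x10_0000, mapped));
}
```

The oversized frame fails on the first unmapped page below the stack, before the function body has run, so no later access can land beyond the guard range.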

Indeed, stack checking can have overhead (so do other requirements Rust makes!), but in general it's not that large. If you don't have stack-allocated VLAs, it's a constant amount of machine code at the start of every function, checking that all possible stack usage the function may do is accessible. And on systems with guard pages (i.e. all of non-embedded) the overhead is trivially none for functions with frame size below 4096 bytes (or however big the guard range is; and for larger frame sizes the overhead of this check will be minuscule compared to whatever actually uses the massive amount of stack).


Report it.


I don't know if it's technically UB or well defined. The crash is a SEGFAULT and not a panic/abort, but it's probably a SEGFAULT due to guard pages. Still, it's possible to evade guard pages so if you access var[X] such that X points to the heap, it's possible you're reading aliased memory which would be UB in safe Rust.

EDIT: Going to take it back. I'm unable to create a situation where I create a large stack array that doesn't result in an immediate stack overflow. I even tried nightly MaybeUninit::uninit_array but that crashed explicitly with a "fatal runtime error: stack overflow" so it seems like the standard library has improved reporting instead of the old SEGFAULT. So no UB.


Panics are not quite the same as an abort in Rust. Most notably a panic can be caught and execution can resume so as to gracefully terminate the application, but an abort is an immediate termination, a go to jail do not pass go kind of situation.

An out of bounds access in Rust will result in a panic but a stack overflow is an abort.
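
A small sketch of the "panic can be caught" half of that distinction, using `std::panic::catch_unwind` from the standard library. The helper `get` is a hypothetical name, and this assumes the default unwinding panic strategy.

```rust
use std::panic;

/// Indexes a slice with a runtime index; out-of-bounds indexing panics.
fn get(data: &[u8], index: usize) -> u8 {
    data[index]
}

fn main() {
    let data = vec![1u8, 2, 3];
    // The out-of-bounds index panics; catch_unwind turns the unwind into an Err.
    let result = panic::catch_unwind(|| get(&data, 10));
    assert!(result.is_err());
    // Execution continues, so the program can shut down gracefully.
    println!("recovered from the panic, still running");
}
```

A stack overflow offers no equivalent hook: the process is terminated and `catch_unwind` never gets to return. The same is true of this example if the crate is built with `panic = "abort"`.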


A segfault would imply it's not an abort either, although it seems like it has been converted to a proper abort in newer versions of Rust.


Panics can be aborts if you set panic = "abort" in Cargo.toml.


How does Rust implement this on targets where LLVM does not implement stack clash protection?


On Unix targets it installs a signal handler for SIGSEGV and checks if the faulting address falls within the range of the stack guards. See https://github.com/rust-lang/rust/blob/411f34b/library/std/s...
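
The range check itself can be sketched in a few lines of Rust. This is not the standard library's code: `Fault`, `classify_fault`, and the addresses are hypothetical, and the sketch assumes the guard range was recorded somewhere the handler can read it.

```rust
use std::ops::Range;

/// Hypothetical classification of a SIGSEGV by faulting address.
#[derive(Debug, PartialEq)]
enum Fault {
    /// The address is inside the thread's guard range: report a stack overflow.
    StackOverflow,
    /// Any other address: fall back to the default segfault behavior.
    Other,
}

/// Decides what a fault means, given the guard range recorded for the thread.
fn classify_fault(fault_addr: usize, guard: &Range<usize>) -> Fault {
    if guard.contains(&fault_addr) {
        Fault::StackOverflow
    } else {
        Fault::Other
    }
}

fn main() {
    // Made-up guard page: one 4096-byte page just below the stack.
    let guard = 0x7000_0000..0x7000_1000;
    println!("{:?}", classify_fault(0x7000_0800, &guard));
    println!("{:?}", classify_fault(0x1234, &guard));
}
```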

The stack guards would normally be set up by the system runtime (e.g. kernel in the case of the main thread stack, libc for thread stacks), not Rust's runtime. Likewise, stack probes that ensure stack operations don't skip guard pages are usually (always?) emitted by the compiler backend (e.g. GCC, LLVM), not Rust's instrumentation, per se.

In this sense Rust isn't doing anything different than any other typical C or C++ binary, except that automagically hijacking SIGSEGV (or any other signal) from non-application code as Rust does is normally frowned upon, especially when it's merely for aesthetics--i.e. printing a pretty message in-process before dying. Also, attempting to introspect current thread metadata from a signal handler gives me pause. I'm not familiar enough with Rust to track down the underlying implementation code. I presume it's using some POSIX threads interfaces, but POSIX threads interfaces aren't async-signal safe, and though SIGSEGV would normally be sent synchronously (sometimes permitting greater assumptions about the state of the thread), that doesn't mean the Rust runtime isn't technically relying on undefined behavior.

EDIT: To get the guard page range it's using pthread_self, pthread_getattr_np, pthread_attr_getstack, and friends, of which only pthread_self is async-signal safe. See https://github.com/rust-lang/rust/blob/411f34b/library/std/s... I have no concrete evidence to believe the reliance isn't safe in practice on the targeted platforms (OTOH, I could imagine the opposite), but it's a little ironic that it's depending on undefined behavior.


The runtime thing is the easy part. I was wondering about the stack probes, which require LLVM support. There's a comment in the sources that suggests it's still x86-only, but that may be outdated:

  //! Finally it's worth noting that at the time of this writing LLVM only has
  //! support for stack probes on x86 and x86_64. There's no support for stack
  //! probes on any other architecture like ARM or PowerPC64. LLVM I'm sure would
  //! be more than welcome to accept such a change!

https://github.com/rust-lang/compiler-builtins/blob/master/s...


I don't see where those methods are getting called from a Unix signal handler but the code is complex enough that it's easy to miss, especially perusing through github instead of vscode.

AFAICT those methods are called from `guard::current`. In turn, `guard::current` is used to initialize TLS data when a thread is spawned before a signal is generated (& right after the signal handler is installed): https://github.com/rust-lang/rust/blob/26907374b9478d84d766a...

It doesn't look like any UB is being relied upon, but I could very easily be misreading. If I missed it, please give me some more pointers, because this should be a GitHub issue if that's the case: calling non-async-signal-safe functions from a signal handler can typically result in a deadlock, which is no bueno.


x86_64 macOS has tier 1 rust platform support, which I believe means that it's guaranteed that you get a crash on stack overflow and you can't evade stack protection in safe rust.

It's not possible on all platforms, hence the tiers.

Apparently ARM64 macOS has tier 2 rust platform support, which might mean that this is not true there, but maybe safe rust has some different unrelated soundness issue on this platform.

I only have very surface knowledge about the tier stuff, so maybe someone can correct me.



