> It segfaults on macOS because the runtime/OS has allocated the stack such that the overflow results in a bad memory access, but that is a behavior of the runtime/OS/hardware, not the language.
Stack overflows are checked in C on macOS not because of guard pages but because the compiler emits stack checks (with cookies). Probably the same is true here.
> I guarantee I could exploit this on a system that does not have virtual memory, or a runtime that does not have unmapped addresses at the end of the stack, to, say, manipulate the contents of another thread’s stack. Therefore, this behavior is undefined.
Software stack checking does not guarantee protection from stack overflows wreaking havoc. E.g., your thread could blow its stack, then get preempted before the stack checker can run.
Mandating guard pages/MPU protection would rule out targeting embedded platforms which lack sufficient hardware support.
What does preemption change here? Before the stack checker has finished, nothing else should hold a reference to any of the yet-unchecked stack. That's plenty trivial to ensure. (unless you mean preemption somehow breaking the stack checker itself, in which case, well, that's a broken stack checker and/or preemption, and should be fixed)
If you can't have hardware support, it's trivial for the compiler to do it in software - just an "if (stack_curr - stack_end < desired_size) abort();". I can't imagine a platform where you cannot reasonably get a lower bound for the range of stack available. Worst-case, you ditch the architectural stack pointer and manage your own stack on the heap, if that's what you need to ensure correct Rust behavior on your funky platform (or accept the non-compliant compromise of no stack checking).
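To make the one-line check concrete, here is a minimal sketch in Rust, assuming a downward-growing stack. The names `check_stack` and `StackOverflow` are made up for illustration, and the function returns an error instead of calling abort() so the behavior is easy to test; this is not what any particular compiler emits.

```rust
/// Hypothetical error type: the requested frame does not fit in the remaining stack.
#[derive(Debug, PartialEq, Eq)]
pub struct StackOverflow {
    pub requested: usize,
    pub available: usize,
}

/// Software stack check for a stack that grows toward lower addresses.
/// `stack_curr` is the current stack pointer, `stack_end` is the lowest
/// usable stack address, `desired_size` is the frame size about to be used.
/// On success, returns the new stack pointer after reserving the frame.
pub fn check_stack(
    stack_curr: usize,
    stack_end: usize,
    desired_size: usize,
) -> Result<usize, StackOverflow> {
    // saturating_sub: a stack pointer already below the limit means zero bytes left.
    let available = stack_curr.saturating_sub(stack_end);
    if available < desired_size {
        // A real prologue would abort (or trap) here instead of returning.
        Err(StackOverflow { requested: desired_size, available })
    } else {
        Ok(stack_curr - desired_size)
    }
}
```

The point is that the check runs before the frame is touched, so nothing can observe the unchecked region in between.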
> What does preemption change here? Before the stack checker has finished, nothing else should hold a reference to any of the yet-unchecked stack.
If your thread overflows the stack, it could start writing into memory for which it does not hold a reference. If the thread is preempted before the stack checker can run (see below*) and detect the overflow, and another thread runs which accesses the now-corrupted memory, then you're hosed.
> just an "if (stack_curr - stack_end < desired_size) abort();"
That's not how the compiler-emitted stack checking works AFAIK (*I believe it uses canaries on the stack which are checked at certain points in code). But I could see this solving the problem. Basically, for every instruction that manipulates the stack pointer (function calls, allocas, and on some architectures interrupts, which use the current stack), the resulting address would need to be checked. That would be costly and require OS awareness, but I think it would be safe. Is this an option that the compiler provides? It would save me a lot of time debugging.
Canaries are a separate unrelated thing solving a different problem - buffer overruns, i.e. writing out-of-bounds. (canaries are a best-effort thing and don't guarantee catching all such problems, and they're also useless for safe Rust where unchecked OOB indexing is not a thing; whereas stack overflow checking can be done precisely)
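A toy simulation of why canaries are best-effort, as a rough sketch only: the 16-byte layout, the `Frame` type and the `CANARY` value are all invented for illustration and do not mirror any real ABI. A contiguous overrun tramples the canary and gets noticed; a single out-of-bounds write that skips past it does not.

```rust
/// Arbitrary marker value for the simulated canary (assumption, not a real ABI value).
const CANARY: u8 = 0xA5;
/// Index of the canary byte inside the simulated frame.
const CANARY_AT: usize = 8;

/// Simulated frame: bytes 0..8 are a local buffer, byte 8 is the canary,
/// bytes 9..16 stand in for saved state such as a return address.
pub struct Frame {
    mem: [u8; 16],
}

impl Frame {
    pub fn new() -> Self {
        let mut mem = [0u8; 16];
        mem[CANARY_AT] = CANARY;
        Frame { mem }
    }

    /// Write relative to the buffer start with no check against the buffer
    /// length, mimicking unchecked indexing in C (still bounded by the toy frame).
    pub fn write(&mut self, index: usize, value: u8) {
        self.mem[index] = value;
    }

    /// The check a canary-based scheme performs before the function returns.
    pub fn canary_intact(&self) -> bool {
        self.mem[CANARY_AT] == CANARY
    }

    /// True if anything past the canary was modified.
    pub fn saved_state_corrupted(&self) -> bool {
        self.mem[CANARY_AT + 1..].iter().any(|&b| b != 0)
    }
}
```

Writing indices 0 through 11 in order flips the canary, so that overrun is detected. Writing only index 10 corrupts the saved state while the canary still looks fine.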
In my sibling comment showing the assembly that your Rust program generates, it is writing a "0" every 4096 bytes of the stack range that is intended to be later used as the buffer (this "0" is independent from the "0" in your "[0; N]"; it's just an arbitrary value to ensure that the page is writable). It does this, once, at the very start of the function, before everything else (i.e. before the variable "var" even exists, much less is accessible by anything or even initialized). This is effectively exactly the same as my "if (stack_curr - stack_end < desired_size) abort();", just implemented via guaranteed page faults. You can enable this on clang & gcc with -fstack-clash-protection where supported.
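For anyone who wants to see the probing idea without reading assembly, here is a small simulation in Rust. It is only a model of the technique under simplifying assumptions (one probe per whole page, downward-growing stack); `probe_offsets` and `first_guard_hit` are hypothetical helper names, and the real prologue emitted by the compiler may differ in details.

```rust
use std::ops::Range;

/// Offsets below the entry stack pointer that a probing prologue would touch:
/// one write per full page of the frame. Frames smaller than a page need none.
pub fn probe_offsets(frame_size: usize, page_size: usize) -> Vec<usize> {
    assert!(page_size > 0, "page size must be non-zero");
    (1..=frame_size / page_size).map(|i| i * page_size).collect()
}

/// Walk the probes from the stack pointer downward and return the first
/// address that falls inside the guard range, i.e. where the page fault
/// would fire. `None` means the whole frame fits above the guard.
pub fn first_guard_hit(
    sp: usize,
    guard: Range<usize>,
    frame_size: usize,
    page_size: usize,
) -> Option<usize> {
    probe_offsets(frame_size, page_size)
        .into_iter()
        .filter_map(|off| sp.checked_sub(off)) // ignore probes that would wrap below 0
        .find(|addr| guard.contains(addr))
}
```

With the stack pointer at 0x10000 and a guard page at 0xC000..0xD000, a 0x8000-byte frame would end at 0x8000, below the guard, if it were allocated in one jump. Probing every 0x1000 bytes hits 0xC000 first, so the fault happens before the buffer is ever used.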
Indeed, stack checking can have overhead (so do other requirements Rust makes!), but in general it's not that large. If you don't have stack-allocated VLAs, it's a constant amount of machine code at the start of every function, checking that all possible stack usage the function may do is accessible. And on systems with guard pages (i.e. all of non-embedded) the overhead is trivially none for functions with frame size below 4096 bytes (or however big the guard range is); for larger frame sizes, the overhead of this check will be minuscule compared to whatever actually uses the massive amount of stack.
> I guarantee I could exploit this on a system that does not have virtual memory, or a runtime that does not have unmapped addresses at the end of the stack, to, say, manipulate the contents of another thread’s stack. Therefore, this behavior is undefined.
That's implementation-defined, not undefined.