Show HN: Using Rust to write shell-script like tasks (github.com/rust-shell-script)
231 points by rustshellscript on Aug 23, 2020 | 80 comments


I love this. I tried to do something similar in Go, because it was in use on my team at Airbnb, and we were looking to port a 2000 line make-and-bash tool to... something not make-and-bash. But as you know, Go doesn’t have macros - so I spent all this effort trying to build a @decorator comment macro system in my personal time (abandoned). Rust seems like a perfect fit for this! We did have a teammate pitching rust, but no one wanted to learn it.

Anyways, congrats on the release. It looks fabulous. Safely splicing shell command snippets together is surprisingly annoying, so it’s really cool to see a hygienic yet user-friendly approach.


Interesting, it seems that it allows the exact syntax of shell commands, without using strings.

    // valid rust code and shell code, no strings
    run_cmd!(du -ah . | sort -hr | head -n 10)?;
How does rust parse the statement within run_cmd()? Can rust parse other languages like this?

    run_html!(<div>COOL</div>)



Yes! If you are interested in doing front-end in Rust using Wasm, check out Yew: https://github.com/yewstack/yew The data flow is inspired by React, so you'll feel right at home.


Yeah, there's a system for macro definition where you define the language the macro accepts. It's very powerful and can probably do most of what you're imagining in this comment.
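As an illustrative sketch (not how this library is actually implemented), a declarative `macro_rules!` macro can match arbitrary token trees, which is what lets unquoted shell-like syntax parse, as long as it lexes as valid Rust tokens:

```rust
// A toy macro that accepts bare, unquoted tokens (no strings) and
// collects them as strings. Anything that lexes as Rust tokens is fair
// game, which is why `du -ah . | sort -hr` can appear inside a macro.
macro_rules! cmd_tokens {
    ($($t:tt)*) => {
        vec![$(stringify!($t).to_string()),*]
    };
}

fn main() {
    let toks = cmd_tokens!(du -ah . | sort -hr);
    // The macro saw each piece as a separate Rust token.
    assert_eq!(toks.first().map(String::as_str), Some("du"));
    println!("{:?}", toks);
}
```

The `run_html!` idea works for the same reason: `<`, `div`, and `>` are all valid Rust tokens, and a procedural macro is then free to reinterpret the token stream however it likes.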


I am not a huge fan of copying the shell language wholesale and wrapping inside a macro. Since macros can execute arbitrary code, this makes me feel uneasy that the strings are just executed within a shell context, with all the appropriate, bug-prone, expansion done by the shell.

Seeing "ls /nofile || true;" makes me worry that "||" is actually passed to the shell wholesale. There's also no transparency about how the binary names are resolved.

I much prefer an approach more integrated with the language, like Plumbum: https://plumbum.readthedocs.io/en/latest/local_commands.html...

This no longer looks like the POSIX shell, but instead clearly integrates the good parts directly into the language, even if some complexity bubbles through. I don't have to worry that "grep["world"] < sys.stdin" is piped into an actual shell, because it gets converted into an AST on the way to execution.


> Since macros can execute arbitrary code, this makes me feel uneasy that the strings are just executed within a shell context, with all the appropriate, bug-prone, expansion done by the shell.

It's a shame you didn't bother to look at the source code before criticizing. Someone put a lot of work into this library and it's actually pretty cool.

The package parses the code in the macros [1] and then calls `std::process::Command` [2] which, I believe, does not execute a subshell by default.

[1] https://github.com/rust-shell-script/rust_cmd_lib/blob/maste...

[2] https://github.com/rust-shell-script/rust_cmd_lib/blob/maste...
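For reference, here is a minimal illustration of that difference (a sketch assuming a Unix-like system with `echo` on PATH): `std::process::Command` execs the program directly, so shell expansion never happens unless you explicitly ask for `sh -c`.

```rust
use std::process::Command;

fn no_shell_expansion() -> String {
    // Command spawns the binary directly (execvp semantics), with no
    // shell in between: "*" is passed to echo literally, not glob-expanded.
    let out = Command::new("echo")
        .arg("*")
        .output()
        .expect("failed to run echo");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

fn main() {
    // A shell would expand "*" into file names; Command does not.
    assert_eq!(no_shell_expansion(), "*");
    println!("echo received: {}", no_shell_expansion());
}
```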


I don't think the parent said this is not parsed well, or at least I didn't read it that way. I share the feeling: you see that code and, unless you know the implementation, it's not clear what shell brokenness is carried over and what isn't, nor which shell and version is being emulated. It's much easier to set expectations with a new syntax, which is also easier to document than "what to expect of this macro".


Thank you, that's indeed what I meant.

The other, related issue is that reimplementing pieces of the shell DSL duplicates what can be done in the parent language.

Taking conditional return value as an example: "ls /nofile || true;"

In this case I don't really want to be given the option to use bash syntax for this. That would encourage the use of shell idioms for "tricks" like control flow, which are another annoying part of the shell (I can never remember them). I would much prefer it if there were a nice way to do that kind of thing idiomatically in the parent language, and no other choice. E.g. to ignore the return value I would find it much nicer to be forced to do something like this instead:

let _ = ls("nofile");


Since run_cmd! and run_fun! return a Result type, you can always do `let _ = run_cmd!(ls nofile);` to ignore a single command's error.

The “xxx || true” form is for ignoring an error within a group of commands, which is also very common in sh's “set -e” mode. Without it, the group of commands needs to be divided into at least 3 parts to still capture all possible command errors. I probably need to document this part in more detail.
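To make the "three parts" point concrete, here's a hedged sketch using plain `std::process` rather than this library's macros; `run` is a hypothetical helper that, like `set -e`, turns a non-zero exit status into an error, so masking one command's failure inline splits the group around it:

```rust
use std::process::Command;

// Hypothetical helper: run a command, mapping a non-zero exit status to
// Err, similar to what `set -e` gives you in sh.
fn run(prog: &str, args: &[&str]) -> Result<(), String> {
    let status = Command::new(prog)
        .args(args)
        .status()
        .map_err(|e| e.to_string())?;
    if status.success() {
        Ok(())
    } else {
        Err(format!("{} exited with {}", prog, status))
    }
}

fn main() -> Result<(), String> {
    // Equivalent of `true; ls /nofile || true; echo done` under set -e:
    run("true", &[])?;               // part 1: errors still propagate
    let _ = run("ls", &["/nofile"]); // part 2: this one failure is ignored
    run("echo", &["done"])?;         // part 3: errors propagate again
    Ok(())
}
```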


This ties in to what I wrote before: I prefer a philosophy where groups of commands are not written in the shell DSL, but are instead native statements (as much as possible), and the user is forced to use native control flow.

Documentation is not going to make me warm up to the idea, because I don't like having the choice to use the DSL so much.

With that in mind, perhaps I'm not the most valid person to provide criticism of this project ;)


These sound like easy documentation fixes - maybe open an issue?


Author here, thanks for your feedback. This library calls `std::process::Command` underneath without any shell dependency, which I believe is the same as plumbum's run* APIs; I should make that clearer in the docs.

Plumbum has its own “cp”, “cat”, etc., which is similar to shelljs, and it looks like a lot of people like this idea; it could also be supported by this library in the future.


Oh this plumbum is nice.


I agree, this is a misguided idea. The whole point of not using Bash is that you don't have to use its terrible design and syntax, surely?


This only seems to use the good parts of the shell, easy piping and redirection, while dropping the language for logic.


It also appears to use some bad parts, e.g. command line switches and unquoted arguments.


Unquoted arguments are not an issue here; see some examples: https://github.com/rust-shell-script/rust_cmd_lib/issues/10


My god, this is super useful when you have a mix of shell commands and processing of the text output from them. Bash isn't particularly easy to work with when parsing non-trivial strings in a readable way (I'm looking at you, awk).


This is why Perl and regular expressions have been so popular on Unix and Linux systems for the past 30 years or so.

Python and Ruby are also very handy for these kinds of tasks.

People joke about regular expressions being another problem to solve, but they really are an elegant solution to handle a LARGE portion of text processing needs.


I apologize, but I must disagree. Awk is literally amazing once you get used to writing actual scripts instead of trying for the ever-elusive and often-untenable one-liners.

No one uses `python -c` syntax for constructing one-liners, and I think that's helped Python's adoption. I have no idea why one-liners are seen as desirable when they're often hard to read and debug.


When I first was getting into coding while doing test engineering, this was one of my greatest complaints. The software engineers would hand me bash scripts filled with very clever, but unexplained and esoteric one-liners.

Whenever something didn't work (and it didn't, because perfectly interfacing with embedded hardware is tough), I had two options: spend literal hours on Google and Stack Overflow, or go stand outside their cubicle and hope they had time for me.

I'll take a verbose function with clearly-followable logic over an amazing one-liner with a maze of options and hacks any day.


Where are nontrivial strings easy to parse?


Ruby? Of all the scripting languages out there, I find it the most suitable for writing small scripts that transform complex text into something more usable.


I'm going to shamelessly plug my own library here:

https://github.com/oconnor663/duct.rs

I wanted to solve the same problem, originally in Python (https://github.com/oconnor663/duct.py). It's surprisingly annoying to do pipelines and redirections, compared to how easy they are to do in the shell. Lots of libraries try to address this, but most of them seem to do it by emulating shell syntax within the host language, using operator overloading or other magic like that. I think that's a limiting choice. (For example, can you use `cd` to change the working dir for the left half of a pipeline but not the right half? In Bash you would use a "subshell" for this.) Instead, I think it's sufficient to build an API out of regular objects with regular methods. The result doesn't look like shell code, but it's easier to reason about, and more consistent across different languages.
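The cd-for-half-a-pipeline case is indeed expressible with regular objects. A rough std-only sketch (assuming a Unix-like system with `ls` and `sort` on PATH; `piped_ls_sorted` is a hypothetical helper, not duct's actual API) where only the left side of the pipe gets its own working directory:

```rust
use std::process::{Command, Stdio};

// Pipe `ls <dir>` into `sort`, giving only the producer its own working
// directory; no subshell is involved.
fn piped_ls_sorted(dir: &str) -> String {
    let mut left = Command::new("ls")
        .current_dir(dir)        // cwd applies to this child only
        .stdout(Stdio::piped())
        .spawn()
        .expect("spawn ls");

    let out = Command::new("sort") // consumer keeps the parent's cwd
        .stdin(left.stdout.take().expect("piped stdout"))
        .output()
        .expect("spawn sort");
    left.wait().expect("wait for ls");
    String::from_utf8_lossy(&out.stdout).to_string()
}

fn main() {
    println!("{}", piped_ls_sorted("/"));
}
```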


It can be supported with internal APIs, even without macros:

    Cmds::from_cmd(Cmd(...).current_dir(...))
        .pipe(Cmd(...).current_dir(...))
        .run_cmd(...)

As you can see, it is very verbose, and that's why I chose to hide the lower APIs at this moment.


I like your approach more than duct.rs :)


‘cd’ is a shell builtin, so you couldn’t use ‘cd’ in any of these solutions unless they spawn a shell instance... and that worries me, because then you really might as well just have a separate .sh file and launch that instead (at least that is more auditable with tools like ShellCheck than any inlined code would be).


As I read the parent comment, the broad context is turning "shell-like behavior" into Rust code, and the comment focuses on elements in the source while assuming it's understood that they're really talking about the resulting program. You can't use the shell's cd, but you can call chdir and set the working directory, and hopefully you can do that for only part of your pipeline.

If they were in fact describing implementation, then I mostly agree - it's likely better to write shell directly than generate it, at least short of treating it seriously as a compilation target.


The problem is you can’t have two different threads operating in different working directories. One “cd” would overwrite another. You could have different processes but then you’re now recreating a shell, in which case you might as well just write it in Bash (for example).


On the one hand, on Linux with the clone system call, you actually can have a "thread" that shares memory, file descriptor table, etc, but not working directory (or chroot):

        If CLONE_FS is not set, the child process works on a copy of the
        filesystem information of the calling process at the time of the
        clone() call. Calls to chroot(2), chdir(2), or umask(2) performed
        later by one of the processes do not affect the other process.
On the other hand: what you say is true of POSIX threads, code built atop clone is unlikely to be portable, etc, etc.

More generally, it's very much the case that the process-global nature of the working directory makes some things tricky. On the other hand, there are ways around that.

If the only pieces that need to reference the working directory are processes you're spawning, the answer is simple - carry a description of the intended working directory through your computation, and actually chdir between the fork and the exec.
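std's `Command` already does this on Unix: `current_dir` is applied in the child after the fork and before the exec, so each spawned process can get its own working directory without ever touching the parent's. A small sketch (assuming a Unix-like system with a `pwd` binary on PATH; `child_cwd` is a hypothetical helper):

```rust
use std::process::Command;

// Run `pwd` with a per-child working directory; the parent's cwd is
// never changed.
fn child_cwd(dir: &str) -> String {
    let out = Command::new("pwd")
        .current_dir(dir) // applied in the child, post-fork / pre-exec
        .output()
        .expect("run pwd");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

fn main() {
    let before = std::env::current_dir().unwrap();
    assert_eq!(child_cwd("/"), "/");
    // The parent is unaffected; two children could use different
    // directories concurrently without racing on a shared cwd.
    assert_eq!(std::env::current_dir().unwrap(), before);
    println!("child saw {}", child_cwd("/"));
}
```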

If the only pieces that need to reference the working directory are things you will be writing as a part of the current project, you can use "at variants" (openat, fstatat, unlinkat, etc).

You can stitch these two together as needed.

What's awkward is if you need to use library code that references the current working directory and does not use "at variants". In that case, you could play some awkward game with a mutex, setting the cwd only for the duration of individual operations and restoring it afterward (although that does require knowing when these operations may be performed).

Or you could fork your process along the lines you need to draw. Splitting your view of memory raises questions of IPC. In particular, if you want to pass language-specified data structures, that gets complicated - particularly if they may contain references. If you can get away with treating everything as passing streams of bytes over actual pipes, it's pretty straightforward.

To my mind, this is probably most of what's meant by "shell-script like", and while you're recreating part of a shell, it's actually not very much of a shell, it should be well contained in the library, and you have room for a much better story around things like error handling.

I don't have a sense of whether any of the particular libraries discussed here are actually addressing exactly this issue or whether they actually do a good job of any of it.


  > A lot developers just choose shell(sh, bash, ...) scripts for such tasks,
  > by using < to redirect input, > to redirect output and '|' to pipe outputs.
  > In my experience, this is the only good parts of shell script.
If you try to use shell as a general purpose programming language, of course it sucks.

If you treat shell as a DSL for files and streams, nothing can beat it. Shell is amazing.

I'm sceptical a bunch of Rust macros can beat shell. I think you'd be better off writing a few smaller programs that use STDIO and stringing them together with shell.


Actually, it sucks big time at file and stream processing (which I have done a bunch of in bash in recent years).

It's only fine if you want to do trivially simple file and stream processing with a prototype level of robustness.

It's too prone to all kinds of unexpected bugs with respect to:

- unusual file names

- unusual stream output

- error handling both for expected and unexpected errors (set -euo pipefail can help a bit)

- accidental cross talking/pollution through env variables (local+declare+arrays do slightly improve this)

- accidental output/stream pollution through debug messages, warnings, or unexpected output formatting

- hard to process, brittle plain text data

Also:

- for doing the stream processing, traditional tools like cut, tr, sed, grep, etc. often have bad to terrible UX, and also often have quirks which can cause bugs in edge cases (they were OK for the '90s, but we no longer have 80-char line length limits, and we've learned a lot about CLI UX since then)

I use small bash scripts all the time, but the more I do, the more I realize that it's technologically left behind and no longer appropriate for current times.

I frequently consider replacing bash as my system shell with e.g. Python plus some library or something similar; the shell is still my main way to interact with my PC. I don't even have a GUI file manager!

Though I need something reasonably responsive, so not a compiled language.

I'm also not a fan of implicitly cached binaries lying around somewhere, maybe leaking disk space.

Though my prompt is actually computed by a small Rust program.

Edit: writing from a phone; swipe-style keyboards make it feasible, but man, I wish they wouldn't mix up words that often.

Edit: the reason I'm still using sh/bash/fish/zsh and similar is because whenever I consider switching, I get overwhelmed by what I'd like to have, realize that it's too much work for me right now, and therefore postpone it to later.


> If you try to use shell as a general purpose programming language, of course it sucks.

> If you treat shell as a DSL for files and streams, nothing can beat it. Shell is amazing.

The problem is that any non-trivial shell script is a mix of the two, so you find yourself torn apart by the inconvenience of "file and streams" in most languages (though really it's mostly subprocesses), and the inconvenience of literally everything else in shells.


This is the one space where I actually like Perl: for shell scripts that grew up a bit, but still amount to mostly manipulating text files and streams.


I often find myself composing smaller programs which I then call/pipe/chain with bash, this looks pretty messy because I’m a sysadmin, not a programmer.

But I do think it works better than trying to do everything in one place.


> If you treat shell as a DSL for files and streams, nothing can beat it.

My cursory glance at this lib (and what I picked up from other comments) suggests that it's based on exactly this thought:

Take the tiny subset of shell syntax that makes it awesome and reimplement it as an internal DSL in a host language that has sane control flow etc.


> If you treat shell as a DSL for files and streams, nothing can beat it. Shell is amazing.

On the other hand, wouldn’t you just define anything that beats it also a shell? In my opinion, Fish beats Bash and Zsh in this area, and I would definitely call it a shell even though it’s not a POSIX-compatible shell. A more extreme example would be PowerShell (I’m not a fan but some people love it).

Where would you draw the line between a “shell” and an “interpreted scripting language that beats POSIX shells on dealing with files and streams”?


Does fish handle <(outputting command) yet?

PowerShell isn't even concurrent; its usefulness lies along a different dimension from job coordination and control.


Do you mean that you want `cmd2 <(cmd1)` as a synonym for `cmd1 | cmd2`? In that case, I’m happy it doesn’t; it drastically lowers readability compared to left-to-right command pipelines, and I don’t see what value it adds.


Not quite. "<(cmd1)" creates a pipe, connects cmd1's output to it, and replaces the expression with the pipe's name. With one argument you can often use "cmd1 | cmd2 /dev/stdin", but with two you need something better.

Think "diff <(something) <(other)" without using temporary files.


Yes, I also use this for automatically capturing command error messages with timestamps into logs: https://github.com/rust-shell-script/bash_cmd_lib/blob/0b0a6...

However, this can be fragile in bash scripts since any subshell command can fail. I am still wondering how to support it in this rust_cmd_lib library.


> Powershell isn't even concurrent

It's got "Start-Job" and now gets "ForEach-Object -Parallel". What else do you think it's lacking?


> If you treat shell as a DSL for files and streams, nothing can beat it.

That sounds exactly like the approach of this crate - easily let you use "the files and streams DSL" directly from a general purpose programming language (Rust). You get full access to shell, but only need to use it where it's useful.

I've done the same thing many times with Python, Node, and Ruby, just with template strings, which aren't as pretty as this Rust macro (even if the latter can be a bit mysterious).


This reminds me of the python version of this called xonsh https://xon.sh/

I really like the idea, but it was missing some simple features that bash had. I can't recall them right now but after an hour of trying to convert a simple bash script, I gave up. That was a year ago. Maybe things changed.

I'll give this and xonsh a go again because I just really dislike bash. Thanks for the project!


I’ve been playing with xonsh lately and really liking it so far! From what I can tell, it’s pretty close to feature parity with the fish shell, which has a lot of nice things like command auto-completion, but you don’t have to learn yet another shell syntax; it’s just Python. Wrote up a quick trip report at https://blog.jamesolds.me/post/xonsh-aws-example/


You might like my alternative design choice: https://github.com/NightMachinary/brish

Xonsh is a superset of Python, which introduces a lot of complexity for little gain. Brish chooses to use Python metaprogramming abilities to solve the problem within the language itself, and so is a much simpler solution.


Thanks. Since I just finished the core functionality of this project, I am not surprised if it is still missing some critical features for converting bash scripts. Please file bugs if you find anything :)


Nothing can fix the fact that pipes carry dumb byte streams. PowerShell addressed this, but sadly remains unpopular with the Unix crowd.


Actually there are several shells out there that fix that problem and still support existing UNIX tools too (which Powershell doesn't play nice with).

My own shell, https://github.com/lmorg/murex does this by passing type information along with the byte stream. So _murex_ aware tools can have structured data passed and POSIX tools can fall back to byte streams. Best of both worlds.

The problem, however, is that as long as Bourne Shell and Bash are installed everywhere, people will write scripts for it. This is less about the popularity of UNIX tools and more about the ubiquity of them (though the two points aren't mutually exclusive).


>The problem, however, is that as long as Bourne Shell and Bash are installed everywhere, people will write scripts for it.

This is also an issue with interpreted languages. Often I write bash and very restricted Python 2/3-compatible code, because I can be fairly sure the target audience has both of these.

You need to have everyone install (and maybe even use) your shell/language for them to be able to use it. Or have them recreate your environment (docker or cxfreeze). With Rust it's easy to distribute a small self contained binary.


You might be interested in nushell.


I wrote a tool to do the opposite thing: allow writing "shell scripts" using Rust. Still early days but https://github.com/staktrace/khaki is where it lives.


This is great!

What about "set -euxo pipefail"? I see you were using eprintln before the commands, could we have a "set" macro that would do that for us?


You can consider it enabled by default: any failed command returns an error unless you mask it with “xx || true”.


There is also https://github.com/igor-petruk/scriptisto which would allow you to wrap any compiled language.


This is so awesome! Any tips on parallelizing/is it async compatible? I personally found granular error handling combined with parallelization to be impossible to get just right in pure bash.


Does anyone know of something similar for Go?


What's the difference between this approach and shell scripts? Thanks.


I really don't see the point of these. Probably just need to explore available tools a bit more.


If this is a fun proof of concept, it's nice.

If somebody uses this in an actual system, it's terrifying.

edit oh, Rust is now a thing where even the bad ideas need to be praised without caveats. Gotcha


I don't think you deserve to be downvoted for this. You bring up a valid concern. Although, I don't agree that this project should be outright dismissed either.

So many times, I've run into the issue where I've wanted to chain a set of commands with a concise syntax (specifically in Python) without having to shell out to bash.

What I really like about this library is that it gives you the concise composability of bash, without having to deal with its pitfalls (eg. variable escaping, lack of Windows support, clunky interface for anything that's not a command invocation...).

Using a DSL will always come with certain tradeoffs, and it won't be the best solution for every use case, but I think this library fills a certain need very well.


It’s pretty crazy, isn’t it. Things like this are fun as pet projects, but the stuff of nightmares in a real codebase that needs supporting for years by a department with the usual churn of staff.

I’ve managed enough teams and enough codebases in my time to know that sometimes the smartest code is the least clever. If someone is finding the need to write a shell script in Rust, then I’d suggest they need to re-evaluate the problem they’re trying to solve.


I mean, no one is suggesting that it be used in a critical system yet, so your suggestion is kinda unnecessary


Please people, don’t do stuff like this for anything other than personal projects. You might think it’s safer than writing Bash but it isn’t.

It results in unsafe Rust code, since you’re now forking external code that might be missed by people who are strictly vetting for code inside “unsafe” blocks. Ironically, anyone who writes shell scripts will know that there are problems with shell scripting, but thankfully dot-sh files stand out and bring attention to themselves as files that need to be audited. This wouldn’t. If you need to embed other languages, or even just an approximation of them, then please at least keep those language files separate rather than inlining them.

Then you have the issue that people who are already aware of the pitfalls of shell scripts would know to read through any such scripts, but this introduces a newer and unfamiliar scripting language to audit (e.g. how do we know that what’s been declared runs bug-free?). At least Bash et al have had many years of eyeballs on them.


> unsafe Rust ... since you’re now forking external code

Are you saying that Rust becomes unsafe because it used a C program as a subroutine? E.g. "tar xvf -" or whatever? What is the fix: rewrite tar, awk, scp and whatever else as Rust functions? That's a lot of work.

I'm surprised that you're simultaneously overlooking what ought to be a more gaping problem: that every system call made by a Rust program is a trip through a kernel written in C.


Could you please be more specific about how it's a "gaping problem" that the underlying kernel is written in C? I think even if you wrote a pure Rust kernel from scratch, it would take considerable time to achieve the same quality/performance ratio as we are currently witnessing with C-based kernels (*BSD & Linux). It's so easy to throw these "radical claims" around. Yes? =)


I think the idea is that if calling external C binaries is a problem, then a kernel written in C is an even larger problem. It was meant as reductio ad absurdum.


The point is that if forking a process to invoke an external C program to run in another address space is "unsafe", directly calling into the OS (like making that fork call) should be considered "mega unsafe".


Actually, I think Python did that? There is tarfile in the standard library, and in my experience it worked quite well. So perhaps that is actually the answer. I do not know how tarfile is implemented though, so perhaps it is itself using any available tar implementation?


There’s a few problems with forking out:

1. Do those programs exist and what happened if they don’t? That behaviour is already understood in Bash, less so in random 3rd party Rust libraries.

2. Is ‘tar’ calling ./tar, /bin/tar or some other instance of tar? And how do you find out? (eg easy to check $PATH in Bash but does this library honour that? Easy to ‘which tar’ but is that going to be the same tar that this library forks?)

3. Are the people using this software even aware what external programs are being executed? How do they validate this? A .sh file clearly signals that there are external dependencies that need to be audited. A .rs file does not. This problem becomes magnified if you then start shipping compiled binaries rather than source.

I get that people who like Rust are unlikely to be people who like writing shell scripts, but the better way to think of this is like an MVC-like design, where you have separate concerns that should be clearly separated in source.


None of these are, as far as I can tell, "unsafe" in the rust sense. They won't result in memory safety issues in the current process.

Your concern is a generic concern about shelling out which maybe makes sense in some cases, but is untenable in general.

I also don't see how 1 & 2 are real problems. From looking at the readme, I understand what happens if tar doesn't exist: the macro raises an error. This is similar to what calling subprocess.check_call would do in Python. It's quite safe and well understood.

And 2 feels truly made up. Not only could you invoke "which tar" within the macro to find the answer, I'd bet you a good bit of money that the answer is whatever is in your PATH. Anything else would be weirdly complicated. This is exactly the same as every other language that has a way to shell out to a subprocess.
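This matches Command's documented behavior on Unix: a bare program name is resolved against PATH (execvp semantics), and when the child's environment overrides PATH, the override is what gets searched. A quick sketch to check it, using a hypothetical `hello` script dropped into a throwaway directory (Unix-only):

```rust
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::process::Command;

// Build a throwaway directory containing an executable `hello` script,
// then resolve the bare name `hello` purely through PATH.
fn run_via_path() -> String {
    let dir = std::env::temp_dir().join("path-resolution-demo");
    fs::create_dir_all(&dir).expect("mkdir");
    let script = dir.join("hello");
    fs::write(&script, "#!/bin/sh\necho from-custom-path\n").expect("write script");
    fs::set_permissions(&script, fs::Permissions::from_mode(0o755)).expect("chmod");

    let out = Command::new("hello") // bare name: resolved via PATH lookup
        .env("PATH", &dir)          // the child searches only this directory
        .output()
        .expect("spawn hello");
    String::from_utf8_lossy(&out.stdout).trim().to_string()
}

fn main() {
    assert_eq!(run_via_path(), "from-custom-path");
    println!("resolved and ran: {}", run_via_path());
}
```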

And even if you have MVC like design where things are clearly separate, something will need to call the shell at some point. So issues 1, 2 and 3 never go away, even if you stick the script in a .sh file, you still need to invoke that file. And now how do you deploy that?


It’s inside a .rs file, so I don’t see how it’s not unsafe in the Rust sense. Maybe drawing parallels with the “unsafe” block wasn’t fair, but safety isn’t just about memory safety. Any seasoned developer will tell you that writing safe software is a multi-paradigm problem.

The MVC point is that having shell scripts separated out as their own .sh files means they draw attention to themselves when auditing code. Inlining shell scripts does not.

Writing your own shell script parser also introduces other surprises to new developers to your code base (eg what POSIX tokens are supported?).

I’ve seen all too often people trying to get clever because they don’t like a particular ugly but well-understood standard, and it usually results in more problems than it solves. Which is fine if it’s a personal pet project, but such solutions don’t belong in production code.


> It’s inside a .rs file so I don’t see how it’s not unsafe in the Rust sense. Maybe when I drew parallels with the “unsafe” block wasn’t fair but safety isn’t just about memory safety. Any seasoned developer will tell you that writing safe software is a multi-paradigm problem.

If I call a function that can result in an error condition, and I correctly handle the error condition, my code is not "unsafe".

So while yes, calling "rm -rf /" is dangerous, it is no more dangerous when done in rust than anywhere else, since you're just calling a subprocess, and the subprocess API is a safe API. There's nothing "unsafe" (in the rust sense, meaning type- or memory-unsafe) about doing so.

>The MVC point is that by having shell scripts separated out as their own .sh file means they draw attention to themselves when auditing code. Inlining shell scripts do not.

Yes, but if you have to shell out at some point, the difference between calling to myscript.sh that contains "foo --flag x" and directly shelling out to "foo --flag x" is practically nonexistent. And yes, there are cases when you need to shell out to another program, because otherwise you reduce yourself to needing to do everything in bash, or have bash be the entrypoint in some weird inversion-of-control scheme, and I'd much prefer to construct a single command invocation than to parse a complex set of flags in bash, for example.

Is this better than just using rust's builtin `std::process::Child`? Maybe not, but all of your concerns apply equally to using that.


> So while yes, calling "rm -rf /" is dangerous, it is no more dangerous when done in rust than anywhere else, since you're just calling a subprocess, and the subprocess API is a safe API. There's nothing "unsafe" (in the rust sense, meaning type- or memory-unsafe) about doing so.

The point is that code doesn't belong in Rust to begin with!

> Yes, but if you have to shell out at some point, the difference between calling to myscript.sh that contains "foo --flag x" and directly shelling out to "foo --flag x" is practically nonexistant.

No it isn't. Code auditing and vetting has been a thing for years. Say you have a CI pipeline that hooked into Shellcheck to validate your .sh files for errors, that same pipeline wouldn't vet any pseudo shell code inlined in Rust.

> Is this better than just using rust's builtin `std::process::Child`? Maybe not, but all of your concerns apply equally to using that.

Not all, only the concerns you've cherrypicked.


> The point is that code doesn't belong in Rust to begin with!

I'll reiterate: it is often safer to embed short snippets of bash into other languages than to invert control and call out to other languages from bash. By calling out to bash, you do the majority of your work in better languages.

> No it isn't. Code auditing and vetting has been a thing for years. Say you have a CI pipeline that hooked into Shellcheck to validate your .sh files for errors, that same pipeline wouldn't vet any pseudo shell code inlined in Rust.

You're making a rather particular set of assumptions there.

> Not all, only the concerns you've cherry-picked.

The three you originally mentioned...


> I'll reiterate: it is often safer to embed short snippets of bash into other languages than to invert control and call out to other languages from bash. By calling out to bash, you do the majority of your work in better languages.

"It depends" is a better way of putting it. However the advantages of embedding Bash doesn't, in my opinion, make up for the problems it creates by obfuscating those calls. Putting Bash inside separate .sh files clearly draws attention to those calls.

It's the same reason Rust has the unsafe block - to draw developer attention to unsafe code. So what I'm talking about here is more idiomatic to Rust.

Not to mention that a custom parser creates potential surprises which could trip up new developers on that code base. Less clever code creates fewer surprises, even if that sometimes means uglier code.

In short, if inlining a shell script in Rust seems like a good idea, I'd suggest revisiting the original problem and its possible solutions. There's bound to be a more predictable and maintainable approach out there, even if it is a little less interesting / fun / trendy.

> You're making a rather particular set of assumptions there.

Inlining code is often regarded as an anti-pattern. Separate out your concerns; separate out your languages. It helps your IDE (e.g. syntax highlighting, code completion), your code-validation tools (e.g. ShellCheck), and the humans trying to understand the code (path of least surprise).

> The three you originally mentioned...

They weren't the original points I mentioned, nor the only points I've discussed since. They were only a breakdown of one of the points I'd raised.


I understand how that can go "bad". For instance, I worked with a really ugly connection manager written by Qualcomm. It was C++ code (object oriented with classes deriving from abstract bases and implementing virtual functions and all that).

At the bottom of the class hierarchy were methods that did their work with a hodgepodge of system calls and invocations of external utilities like "ip" and "iptables" and whatever else.

The thing would react to netlink events from the kernel, paste commands together and pass them to system(): not even using fork and exec to do it cleanly.

Just, eww.


The subprocess API is without system calls? Otherwise it calls into heretic C code.

Of course Rust itself is not "safe" either:

https://rustsec.org/advisories/CVE-2018-1000810.html


> Do those programs exist

If those programs don't exist, they are simply missing dependencies of the program.

In the shell, we might use the `type` command. Something similar could be integrated into this scripting system to detect whether a given string corresponds to a command that can be found in the PATH.
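A minimal sketch of that idea in Rust, mirroring execvp's PATH walk (Unix-only, since it inspects the execute permission bits; `find_in_path` is a name I made up, and all error cases collapse to "not found"):

```rust
use std::env;
use std::os::unix::fs::PermissionsExt;
use std::path::PathBuf;

// Roughly what `type -p name` (or execvp's lookup) does: walk each
// PATH entry and return the first one holding an executable `name`.
fn find_in_path(name: &str) -> Option<PathBuf> {
    env::var_os("PATH").and_then(|paths| {
        env::split_paths(&paths)
            .map(|dir| dir.join(name))
            .find(|candidate| {
                candidate
                    .metadata()
                    .map(|m| m.is_file() && m.permissions().mode() & 0o111 != 0)
                    .unwrap_or(false)
            })
    })
}

fn main() {
    match find_in_path("tar") {
        Some(p) => println!("tar resolves to {}", p.display()),
        None => eprintln!("tar not found in PATH"),
    }
}
```

This also answers the transparency question upthread: you can report exactly which file a bare command name will resolve to before running anything.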

> Is ‘tar’ calling ./tar, /bin/tar or some other instance of tar? And how do you find out?

PATH is actually used by low-level routines in POSIX, like execvp. If execvp is used as the basis for dispatching commands, then PATH is searched.

> A .rs file does not.

That's a fair point. Over the years, I have seen a fair share of C programs break because they were actually using system() or fork()/exec() to run programs that were missing or had some other problem.

I've also seen (and written myself) complex shell scripts that check for their dependencies up-front and complain if some are missing, which is a good idea, especially if not all execution paths use every dependency, or if an unexpected termination could occur after a lengthy process that the user will have to recover from and repeat.

It can also be loudly documented as part of the system requirements of the program. "This program relies on the utilities tar, awk and expect which are expected to be in the PATH. It was tested with GNU tar 1.29, GNU Awk 4.1.4 and Expect 5.45.4."
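That up-front check translates directly to the Rust side too. A sketch, reusing the tar/awk/expect example above (assuming a missing binary surfaces as an `ErrorKind::NotFound` spawn error, which is the documented `std::process` behavior on Unix; `missing_deps` is a name I made up):

```rust
use std::io::ErrorKind;
use std::process::{Command, Stdio};

// Fail fast: before doing any real work, verify every external tool
// we depend on can actually be spawned. Each tool is run with no
// arguments and all streams nulled; we only care whether the spawn
// succeeds at all, not about the tool's exit status.
fn missing_deps(tools: &[&str]) -> Vec<String> {
    tools
        .iter()
        .copied()
        .filter(|&tool| {
            matches!(
                Command::new(tool)
                    .stdin(Stdio::null())
                    .stdout(Stdio::null())
                    .stderr(Stdio::null())
                    .status(),
                Err(e) if e.kind() == ErrorKind::NotFound
            )
        })
        .map(|t| t.to_string())
        .collect()
}

fn main() {
    let missing = missing_deps(&["tar", "awk", "expect"]);
    if missing.is_empty() {
        println!("all dependencies present");
    } else {
        println!("missing required tools: {}", missing.join(", "));
    }
}
```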

If we are packaging this program for a distro, we can express those dependencies in the packaging metadata so they are pulled in automatically. The packager has to be conscientious and understand the program's requirements.



