so much effort to match the performance of lower level languages that it would h...

Zababa · on Nov 17, 2021

I'm not sure. Most people aren't writing ASM these days because compilers are good enough for most cases. Compilers are great.

zanellia · on Nov 17, 2021

I think most HPC people would disagree with this statement. State-of-the-art HPC code is still written in ASM (see, e.g., https://github.com/xianyi/OpenBLAS). That's what Intel is doing too.

marmaduke · on Nov 17, 2021

ASM makes sense when the time spent in a specific routine exceeds the time it takes to write the ASM. That makes a lot of sense for BLAS, less so for other HPC projects that are still speculative or less fundamental. CVODES, for instance, doesn’t need to be written in ASM, and I think Julia makes a strong case that it could have been written in Julia.

Zababa · on Nov 17, 2021

I don't think they would. I think they realize that state-of-the-art HPC code is a small fraction of all the code written. I doubt that these people write ASM instead of Python or JS or C or whatever when doing simple scripts.

guenthert · on Nov 17, 2021

That ASM code is, however, not necessarily written by hand. You'd think that for high-performance code with limited scope, a superoptimizer would be used.

zanellia · on Nov 17, 2021

Not sure what a "superoptimizer" would look like in this context. For reference, I know for sure that BLASFEO (https://github.com/giaf/blasfeo), which beats Intel MKL, was coded entirely by hand.

giaf · on Nov 18, 2021

There is more and more effort in the automatic development of high-performance linear algebra kernels. But based on my experience, it would certainly be a big challenge to have a tool able to exploit the subtle differences in the assembly languages of different architectures, if the aim is to match or even exceed an expert-crafted assembly kernel.

Anyway, that's surely a very promising active research direction.

Tozen · on Nov 17, 2021

Good point. And you don't have to go that low. Maybe go use Object Pascal, Nim, or Vlang. I know... the libraries. But a lot of them are bindings of C libraries. So, you can create bindings in other languages too or use Python from those languages. There are various options.

zanellia · on Nov 17, 2021

I would disagree on "easier" :) Ever spent half a day debugging a segfault?