No. Magic Bitboards are what you use if you don't have PEXT available. EDIT: I expect the WASM version to use magic-bitboards (which is just a "magical" multiply instruction + a lookup).
Every single sliding piece (i.e. Rooks, Bishops, and Queens) uses PEXT (or magic bitboards if the PEXT code is unavailable) to determine where it can go. It's a fundamental calculation for determining very, very quickly which moves are legal.
Both PEXT and Magic Bitboards are incredible techniques for solving the sliding-piece question. However, PEXT is much faster, but it relies on an obscure x86-only instruction (part of BMI2) that is unfortunately slow on AMD Zen and Zen2. So you still need to know your hardware details: Intel machines should use the PEXT version.
---------
Still, even AMD users want the BMI2 version, since it enables a bunch of other bit operations that are used all the time. (And besides, PEXT is actually fast on AMD Zen3 now, so PEXT is looking good for the future.)
-----
In either case, I have my doubts that the NNUE neural net runs anywhere near as well on WASM as on the hand-optimized SIMD / AVX kernels that the Stockfish team wrote.
Yeah, that’s all I meant, PEXT does the same job as magic bitboards where it’s available. But my point was that either way it’s run once (I seem to remember it’s even a constexpr in recent Stockfishes) so I’d be surprised if it were a major performance hit.
Both Magic Bitboards and PEXT are run every single time Stockfish thinks of a Queen, Bishop, or Rook. Literally every, single, time.
Stockfish evaluates something like 10 million positions per second. That's a lot of times the "where can the Queen move" analysis is run. Speeding that routine up by something like 50% (PEXT vs Magic Bitboards) really does make a difference in the grand scheme of things.
----------
Unless you're playing some fantasy-version of Chess (or maybe some extreme endgame where all pawns, queens, bishops, and rooks are dead... pawns because they might promote into a queen/bishop/rook), you'll be benefiting from selecting that PEXT version if you're on an Intel or Zen3 processor.
Is the performance difference of PEXT vs. the classic 64-bit magic bitboards actually close to 50%? The very slight latency increase of and/multiply/shift instead of pext should be somewhat offset by somewhat smaller tables (since magic hashing can have helpful collisions), right?
I should probably know this, since I "discovered" PEXT bitboards, but I got out of computer chess before I got BMI-capable hardware, and so I never actually implemented them :)
Variable-shift perfect hashing (fancy magic) is not especially slower than using pext/pdep, and in any case, there is little value in making move generation more performant. A decent move generator using perfect hashing can run at 40 Mnode/s, but Stockfish runs at about 1 Mnode/s with a single thread (because of the other work it does). A quick application of Amdahl's law shows that the performance gain from speeding up move generation by any amount is negligible. In fact, with a transposition table and staged move generation, for many nodes only a fraction of possible moves are generated.
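To put rough numbers on the Amdahl's law argument, here is a minimal sketch. The function name `amdahl_speedup` is made up for illustration, and the 2.5% figure is an assumption: it treats a standalone 40 Mnode/s generator inside a 1 Mnode/s engine as taking about 1/40 of the per-node time, which is likely an overestimate given staged move generation.

```cpp
#include <cassert>

// Amdahl's law: overall speedup when a fraction `p` of the run time
// is made `s` times faster and everything else is left unchanged.
// (Hypothetical helper, not part of any engine.)
double amdahl_speedup(double p, double s) {
    assert(p >= 0.0 && p <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - p) + p / s);
}
```

Under that assumption, `amdahl_speedup(0.025, 1.5)` (move generation 50% faster) comes out at roughly 1.008, i.e. under 1% overall, and even an infinitely fast generator is capped at 1 / 0.975, about 1.026.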
Gotcha sorry, been a long time since I did this in chess and the last engine I wrote was for Hnefatafl where you’re using slightly bigger boards and I don’t remember there being a PEXT equivalent for bigger SSE stuff. Somehow convinced myself the lookup table generation was the only hard bit.
Yeah, PEXT only works on 64-bit numbers (and therefore, the 8x8 chess board / 64-bit "bitboard", with 1-bit per position).
EDIT: Both PEXT and Magic Bitboards have a "Setup" phase that is run once. But there's also a separate phase that is run on every single position in the actual chess-board part.
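A rough sketch of the two phases for a rook, to make the split concrete. This is not Stockfish's code: all names (`pext_soft`, `rook_mask`, `RookTable`, and so on) are invented for illustration, squares are assumed to be numbered 0 = a1 to 63 = h8, and `pext_soft` is a slow portable stand-in for the hardware instruction (on BMI2 hardware you would presumably use the `_pext_u64` intrinsic instead). A magic-bitboard version would keep the same table idea but compute the index as `((occ & mask) * magic) >> shift`.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bitboard = std::uint64_t;

// Portable stand-in for PEXT: packs the bits of `src` selected by `mask`
// into the low bits of the result, keeping their relative order.
Bitboard pext_soft(Bitboard src, Bitboard mask) {
    Bitboard out = 0;
    for (Bitboard bit = 1; mask != 0; bit <<= 1) {
        Bitboard lowest = mask & (~mask + 1);  // isolate lowest set mask bit
        if (src & lowest) out |= bit;
        mask &= mask - 1;                      // clear that mask bit
    }
    return out;
}

static const int kDr[4] = {1, -1, 0, 0};
static const int kDf[4] = {0, 0, 1, -1};
static bool on_board(int r, int f) { return r >= 0 && r < 8 && f >= 0 && f < 8; }

// Slow reference: walk each ray, stopping at (and including) the first blocker.
Bitboard rook_attacks_slow(int sq, Bitboard occ) {
    Bitboard attacks = 0;
    for (int d = 0; d < 4; ++d) {
        int r = sq / 8 + kDr[d], f = sq % 8 + kDf[d];
        while (on_board(r, f)) {
            Bitboard b = Bitboard{1} << (r * 8 + f);
            attacks |= b;
            if (occ & b) break;
            r += kDr[d]; f += kDf[d];
        }
    }
    return attacks;
}

// Squares whose occupancy can change the result: each ray minus its last
// square, since a blocker on the final square of a ray changes nothing.
Bitboard rook_mask(int sq) {
    Bitboard mask = 0;
    for (int d = 0; d < 4; ++d) {
        int r = sq / 8 + kDr[d], f = sq % 8 + kDf[d];
        while (on_board(r + kDr[d], f + kDf[d])) {
            mask |= Bitboard{1} << (r * 8 + f);
            r += kDr[d]; f += kDf[d];
        }
    }
    return mask;
}

struct RookTable {
    Bitboard mask;
    std::vector<Bitboard> attacks;  // indexed by pext(occupancy, mask)
};

// "Setup" phase, run once per square: precompute every blocker pattern.
RookTable build_rook_table(int sq) {
    assert(sq >= 0 && sq < 64);
    RookTable t;
    t.mask = rook_mask(sq);
    // pext of all-ones is 2^n - 1 for an n-bit mask, so +1 gives the table size.
    t.attacks.resize(pext_soft(~Bitboard{0}, t.mask) + 1);
    Bitboard subset = 0;
    do {  // enumerate all subsets of the mask (Carry-Rippler trick)
        t.attacks[pext_soft(subset, t.mask)] = rook_attacks_slow(sq, subset);
        subset = (subset - t.mask) & t.mask;
    } while (subset != 0);
    return t;
}

// Per-position phase, run every time a rook (or queen) is considered:
// one bit-extract plus one table load.
Bitboard rook_attacks(const RookTable& t, Bitboard occ) {
    return t.attacks[pext_soft(occ, t.mask)];
}
```

The setup cost (the ray walks) is paid once; `rook_attacks` is the part that runs on every position, which is where the PEXT vs multiply-and-shift difference would show up.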
In either case, I have my doubts that the NNUE neural net runs anywhere near as well on WASM as on the hand-optimized SIMD / AVX kernels that the Stockfish team wrote.
Even if there's some auto-vectorization that WASM can do, it's really hard to beat handcrafted assembly / intrinsics (https://github.com/official-stockfish/Stockfish/blob/master/...)