Anything that would benefit from AVX512 would benefit even more from being thrown at a GPU or dedicated hardware of some sort; registers that wide are just stupid on a CPU. I would much prefer that AMD spend their transistor budget on things that are actually useful.
It's odd that they didn't add it, IMHO. Didn't they change the FPU to a 256-bit width? Why not then implement AVX512 by fusing 2x 256-bit units? Seems like low-hanging fruit. Or am I missing something?
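To make the "fuse two 256-bit units" idea concrete, here is a rough sketch in Python. It is only a conceptual model, not real hardware behaviour or real intrinsics, and it says nothing about what AMD actually does. The names `add_256` and `add_512_split` are made up for illustration.

```python
# Conceptual model only: a 512-bit vector is 16 float32 lanes,
# and each hypothetical 256-bit unit handles 8 lanes at a time.
LANES_512 = 16
LANES_256 = 8

def add_256(a, b):
    """Stand-in for one 256-bit FP unit: element-wise add of 8 lanes."""
    assert len(a) == len(b) == LANES_256
    return [x + y for x, y in zip(a, b)]

def add_512_split(a, b):
    """Hypothetical 512-bit add issued as two independent 256-bit adds."""
    assert len(a) == len(b) == LANES_512
    low = add_256(a[:LANES_256], b[:LANES_256])
    high = add_256(a[LANES_256:], b[LANES_256:])
    return low + high  # concatenate the two halves back into 16 lanes

a = [float(i) for i in range(LANES_512)]
b = [10.0] * LANES_512
result = add_512_split(a, b)
```

Lane-wise operations like this add split cleanly into halves. The harder parts of AVX512 are the ones this sketch leaves out: shuffles that move data across the 256-bit boundary, the mask registers, and the doubled register count, so it may not be quite as low-hanging as it looks.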
AVX512 comes with Zen3 if the leaks are correct.
IC: With the FP units now capable of doing 256-bit on their own, is there a frequency drop when 256-bit code is run, similar to when Intel runs AVX2?
MP: No, we don’t anticipate any frequency decrease. We leveraged 7nm. One of the things that 7nm enables us is scale in terms of cores and FP execution. It is a true doubling because we didn’t only double the pipeline width, but we also doubled the load-store and the data pipe into it.
IC: Now the Zen 2 core has two 256-bit FP pipes, can users perform AVX512-esque calculations?
MP: At the full launch we’ll share with you exact configurations and what customers want to deploy around that.
Even worse, many compilers (cough cough, gcc) will default to using a vinsertf128 because, for pre-Haswell/Bulldozer, it may be faster in many cases. Someday the default will change, but for now this seems to be the case.
* Yes, there are 2 instructions, one floating point and one integer, but they do the same thing. Ironically, the FP one is in AVX and the int one is in AVX2. Someone in the know, please explain this to me.
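A small Python model of the "they do the same thing" point, for anyone who wants to see it at the bit level. This is a sketch, not the real instructions: `insert128` is a made-up helper that treats a 256-bit register as 32 raw bytes and replaces the low or high 128-bit half, which is my understanding of what both vinsertf128 and vinserti128 do.

```python
import struct

def insert128(reg256: bytes, lane128: bytes, imm: int) -> bytes:
    """Replace the low (imm=0) or high (imm=1) 128-bit half of a 256-bit value."""
    assert len(reg256) == 32 and len(lane128) == 16
    offset = 16 * (imm & 1)
    return reg256[:offset] + lane128 + reg256[offset + 16:]

# A 256-bit register holding 8 float32 values, and a 128-bit lane to insert.
reg = struct.pack("<8f", 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0)
lane = struct.pack("<4f", 9.0, 10.0, 11.0, 12.0)

# "Floating point" view: insert into the high half, read back as floats.
as_float = struct.unpack("<8f", insert128(reg, lane, 1))

# "Integer" view: reinterpret the same bits as int32, then do the same insert.
reg_i = struct.pack("<8i", *struct.unpack("<8i", reg))
lane_i = struct.pack("<4i", *struct.unpack("<4i", lane))
same_bits = insert128(reg_i, lane_i, 1) == insert128(reg, lane, 1)
```

The insert never looks at what the bits mean, so the float and integer versions give identical results. As far as I understand it, the split exists because the original AVX only widened floating-point operations to 256 bits and the 256-bit integer forms arrived later with AVX2. On some cores, mixing FP-typed and integer-typed instructions on the same data can also add a small bypass delay, which is one reason to keep both encodings.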