Nvidia gets mean on upstarts

Lakados

[H]F Junkie
https://www.extremetech.com/computi...ference-performance-on-h100-with-new-software

So a number of companies were claiming they were getting close to Nvidia in terms of LLM performance, so Nvidia basically told them all to hold their beer, went and did a driver/software update, and roughly doubled TensorRT-LLM performance on the H100 series.

Even better, it should be a drop-in upgrade with no retraining needed.
 
Competition is good for the end user.

To think they were just resting on their laurels, sitting on poorly optimized software/hardware and waiting for optimization to become necessary.

Without competition, they probably hold back what they are truly capable of just so they can release a new faster generation with little cost or effort.
 
The way I see it, everybody who stuck with their A100s because the new H100s were only 2-3 times faster is now going “Fuuuuuuuuuck, our competitors who have the H100s are now 4-8x faster than we are; we need to upgrade and we need to do it yesterday.”

So any H100 customers or potential customers eyeballing competitors who were on the fence just had the fence kicked out from under them.
 
I mean, maybe they were artificially slowing things down, but Llama 2 was released July 18, 2023, and GPT-J in June 2021.

Hopper launched in September 2022, and TensorRT-LLM went into early access in early May 2023. It was made in collaboration with Facebook and the other major teams that create the big LLMs of the world, and it sounds complicated enough (in-flight batching, paged attention, quantization, and who knows what else going on).
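
For anyone wondering what "in-flight batching" even is: as I understand it, finished sequences leave the batch immediately and queued requests take their slots, instead of the whole batch waiting on the slowest sequence. A toy sketch of the idea, with made-up names, not TensorRT-LLM's actual API:

```python
# Toy sketch of in-flight (continuous) batching: finished sequences leave the
# batch right away and queued requests take their slot, instead of waiting
# for the whole batch to drain. All names here are made up for illustration.
from collections import deque

class Request:
    def __init__(self, rid, tokens_to_generate):
        self.rid = rid
        self.remaining = tokens_to_generate

def serve(queue, max_batch=4):
    waiting = deque(queue)
    active = []
    step = 0
    while waiting or active:
        # Refill free slots from the queue every step ("in-flight").
        while waiting and len(active) < max_batch:
            active.append(waiting.popleft())
        # One decode step for every active request (stand-in for a GPU kernel).
        for req in active:
            req.remaining -= 1
        # Retire finished requests immediately, freeing their slots.
        for req in [r for r in active if r.remaining == 0]:
            active.remove(req)
            print(f"step {step}: request {req.rid} done")
        step += 1

serve([Request(i, n) for i, n in enumerate([3, 8, 2, 5, 4, 6])])
```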

It is an open-source library; if it looked like something that should have been there from the get-go, maybe that claim could be sustained.

Should be available for Ampere and up in the future as well.

Could just be a case of very fast-moving tech with giant amounts of money going into it right now.
 
I don't understand why we'd be upset with NVidia for having no competition.
How do you get that from my comment? You kinda got the opposite: if they had no competition, I doubt you'd get this driver anytime soon. My point is that, because they have no competition, it seems they are holding back performance, which does sound like classic Nvidia.
 
The way I see it, everybody who stuck with their A100s because the new H100s were only 2-3 times faster is now going “Fuuuuuuuuuck, our competitors who have the H100s are now 4-8x faster than we are; we need to upgrade and we need to do it yesterday.”

So any H100 customers or potential customers eyeballing competitors who were on the fence just had the fence kicked out from under them.
I think it will boost everyone in the short term (at least that's what the Bing LLM is telling me):
Those innovations have been integrated into the open-source NVIDIA TensorRT-LLM software, available for Ampere, Lovelace, and Hopper GPUs and set for release in the coming weeks.
https://wccftech.com/nvidia-tensorrt-llm-boosts-large-language-models-up-to-8x-gain-on-hopper-gpus/
As for support, TensorRT-LLM will be supported by all NVIDIA Data Center & AI GPUs that are in production today such as A100, H100, L4, L40, L40S, HGX, Grace Hopper, and so on.
 
How do you get that from my comment? You kinda got the opposite: if they had no competition, I doubt you'd get this driver anytime soon. My point is that, because they have no competition, it seems they are holding back performance, which does sound like classic Nvidia.
Thanks for the clarification, and apologies if I did/do a bit of "hand waving" dismissal - that's not fair to your point.
 
So they held back performance because they had no competition, is how I see it, lmao. Classic Nvidia.
Anyone who remembers the days of nVidia's Detonator drivers knows this all too well, although back then it was always with competition. As soon as a competitor brought out a new card/architecture/etc., or more specifically right before it came out, nVidia would release new drivers with major performance increases. It would happen every... single... time.
 
Anyone who remembers the days of nVidia's Detonator drivers knows this all too well, although back then it was always with competition. As soon as a competitor brought out a new card/architecture/etc., or more specifically right before it came out, nVidia would release new drivers with major performance increases. It would happen every... single... time.
I mean, if the competition is so weak, wouldn't it be rude to dunk on them that hard out of the gate? Like going to a playoff game and watching one of the teams get dominated, utterly and completely destroyed. Nobody wants that.

But seriously, if they had that sort of power out of the gate, it would have been a landslide. Embarrassingly so.

But in this case I doubt Nvidia intentionally left anything on the table. This update has been months in the making and is based on large amounts of feedback from the customer base.
 
How do you get that from my comment? You kinda got the opposite: if they had no competition, I doubt you'd get this driver anytime soon. My point is that, because they have no competition, it seems they are holding back performance, which does sound like classic Nvidia.
Facebook and co. could maybe have made it (or something quite similar) without Nvidia's participation. From my understanding it is pure Python software that feeds the GPUs inference work more efficiently, like making more use of the 8-bit capability and multi-node comms; I am not sure Nvidia had to be involved.

Not sure it is a driver, or that any driver change is needed; just Python code.
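
On the 8-bit point, the basic trick (in very simplified form) is storing weights as INT8 plus a scale factor, so you move half the bytes of FP16. A toy sketch assuming plain symmetric per-tensor quantization; real implementations are fancier (per-channel scales, FP8 on Hopper):

```python
# Rough sketch of what "using the 8-bit capability" means: symmetric INT8
# weight quantization. Toy illustration only, not Nvidia's actual code.
import numpy as np

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```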
 
Facebook and co. could maybe have made it (or something quite similar) without Nvidia's participation. From my understanding it is pure Python software that feeds the GPUs inference work more efficiently, like making more use of the 8-bit capability and multi-node comms; I am not sure Nvidia had to be involved.

Not sure it is a driver, or that any driver change is needed; just Python code.
Well, it's an overhaul to the scheduler: it lets the system interrupt the processing order to deal with deadlocks preemptively and keep threads running closer to 100%.
So not quite just Python code, but yeah.
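
If I'm understanding that right, the gist is that the scheduler can preempt a running request (e.g., when cache space runs out) rather than stalling everything. A made-up toy sketch of that eviction idea, not the actual TensorRT-LLM scheduler, with hypothetical names throughout:

```python
# Toy sketch of the preemption idea: if the cache budget runs out, evict the
# newest request back to the queue instead of deadlocking, so the remaining
# work keeps the GPU saturated. Made-up names, not the actual scheduler.
from collections import deque

def schedule_step(active, waiting, cache_budget):
    used = sum(r["cache_blocks"] for r in active)
    while used > cache_budget and len(active) > 1:
        victim = active.pop()                 # preempt the newest request
        waiting.appendleft(victim)            # it will resume later
        used -= victim["cache_blocks"]
        print(f"preempted request {victim['rid']}")
    return active, waiting

active = [{"rid": 0, "cache_blocks": 6}, {"rid": 1, "cache_blocks": 5}]
waiting = deque()
active, waiting = schedule_step(active, waiting, cache_budget=8)
print("still running:", [r["rid"] for r in active])
```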
 