Microsoft building its own 5nm AI chip for datacenters

Lakados

There goes Nvidia's stock. Shouldn't surprise anyone considering how Google and Tesla have in the past dumped Nvidia to make their own silicon.
 
There goes Nvidia's stock. Shouldn't surprise anyone considering how Google and Tesla have in the past dumped Nvidia to make their own silicon.
This is "old" news from two weeks ago (April 18), and the stock has gone up since. They have tried / been working at this for a long time:
https://www.cnbc.com/2017/07/24/microsoft-creating-ai-chip-for-hololens.html
https://www.cnbc.com/2018/06/11/microsoft-hiring-engineers-for-cloud-ai-chip-design.html

The entry of in-house-solution competition from the giants in that field was always taken into account in Nvidia's stock price.
 
https://www.extremetech.com/computing/microsoft-building-codename-athena-ai-chip-on-tsmcs-5nm-node
Microsoft is looking to cut costs by building special-purpose TSMC 5N AI accelerator chips for their data centers, which should cut their costs by two-thirds by reducing their reliance on Nvidia for various acceleration tasks.

Might be enough to get Nvidia to reconsider its pricing as a whole.
By 1/3, I think, if all goes well ("Athena, if competitive, could reduce the cost per chip by a third when compared with Nvidia's offerings.")
 
Bing powered by Nvidia hardware:

> How do I solve world hunger?

First, I would do this...
etc.
etc.
etc.

Bing powered by Microsoft hardware:

> How do I solve world hunger?

 
It is often underestimated how difficult it is to build a chip, especially the whole system including software and hardware.

AMD tried for many years to do AI with its existing GPUs. Facebook spent several years trying to build an AI chip. Apple tried for many years to build a 5G chip. All of these are examples of failures.

Google's TPU is successful, but only internally with TensorFlow. Tesla never had the intention to sell or rent its chip. Making the hardware work with general, industry-wide software is extremely hard.

This is not Microsoft's first try. They will most likely fail, or at least will not be able to get away from the CUDA ecosystem. But someone else will try again. Until the software world is ready, with something like Triton providing an extra layer or API to replace CUDA and maturing to an industrial level, it is unlikely to succeed.
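To make that concrete, here is a minimal sketch of what such a layer looks like in practice, using OpenAI's Triton (the project linked further down in the thread): the kernel is written in Python and Triton's compiler handles the GPU code generation, so none of it is CUDA C++. This is the standard vector-add style example, purely illustrative, not anything tied to Athena.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE-wide slice of the vectors.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements          # guard the ragged last block
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # x and y are expected to be same-shape GPU tensors.
    out = torch.empty_like(x)
    n = out.numel()
    grid = lambda meta: (triton.cdiv(n, meta["BLOCK_SIZE"]),)
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out
```

The bet behind layers like this is that a different hardware backend could, in principle, be slotted in underneath without touching the kernel code; whether that ever matures to an industrial level is exactly the open question.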
 
AMD tried for many years to do AI with its existing GPUs. Facebook spent several years trying to build an AI chip. Apple tried for many years to build a 5G chip. All of these are examples of failures.
They didn't fail so much as nobody had a good use for them. Now everyone is going nuts for AI and still doesn't have a use for it, other than boosting stock value. Google and Microsoft have uses for these, but beyond that there aren't many good examples.
 
They didn't fail so much as nobody had a good use for them.
You think no one had a use for recognizing speech, identifying what is in a picture, suggesting the next content most likely to keep someone on their platform, proposing a way to travel from point A to a destination on a map that takes traffic patterns into account, using AI in search results, word completion, or detecting spam email?

Google has had its own custom AI chips for a long time for a reason; they were even using their AI chips to design AI chips years ago.

What do you mean no one had a good use for powerful specialized machine learning hardware, or a reason to use it?

I feel like Facebook uses it a giant amount (it is not humans manually going through the data they collect; that's all machine learning).
 
You think no one had a use for recognizing speech, identifying what is in a picture, suggesting the next content most likely to keep someone on their platform, proposing a way to travel from point A to a destination on a map that takes traffic patterns into account, using AI in search results, word completion, or detecting spam email?
Yea like Google and Microsoft.
What do you mean no one had a good use for powerful specialized machine learning hardware, or a reason to use it?

I feel like Facebook uses it a giant amount (it is not humans manually going through the data they collect; that's all machine learning).
Other than searching through data, who else has a use for this? Like I already said, companies like Google and Microsoft, and yea, Meta too. You can probably include Apple as well, since they're known to comb through people's private data. Amazon too, because they know you better than you do when it comes to shopping.
 
Other than searching through data, who else has a use for this?
Almost every situation where you have a dataset coupled with an expert evaluation of it can use this.

Medical scans, for example: take a large dataset of scans of people over time together with what their medical conditions actually turned out to be, and the system can find patterns.
Many people went to the same dentist for a large part of their life, had multiple x-rays taken over the years, and have an actual list of the procedures they ended up needing; that type of database will exist in many places we would not think of ourselves but that will be obvious to everyone in their own field. Drug makers already use this to predict what an x-y-z molecule will do and come up with new compounds.

Gas, mining: everything that captures large enough, high-quality enough data with good expertise linked to it is training machines to recognize patterns.

In law it is quite obvious: finding a list of past precedent cases that have any relevance to your next one, registered patents that could be an issue for what you want to do, every clerk job of that style.

Now with generative AI and newer capabilities, editing (movies, pictures, text, Excel sheets, code, video game assets, etc.) uses it, as does almost anything that has a lot of text examples, and all translation (speech or text). Small game studios have already turned to AI-generated rigging and assets.

Google, for example, could generate really good subtitles for all their content and use them for much better search (and train their models on everything said and shown in videos); the next step is the AI looking at the videos and getting better at suggestions from having watched tons of content, and the step after that, it makes videos itself that people are likely to watch.

Asking who has a use for artificial intelligence is a bit strange: video games, search, maps, first-line phone or other customer service, it is a bit everywhere.

Think about the sentence you wrote: how many industries and fields do not have data in 2023?
 
Almost every situation where you have a dataset coupled with an expert evaluation of it can use this.

Medical scans, for example: take a large dataset of scans of people over time together with what their medical conditions actually turned out to be, and the system can find patterns.
Many people went to the same dentist for a large part of their life, had multiple x-rays taken over the years, and have an actual list of the procedures they ended up needing; that type of database will exist in many places we would not think of ourselves but that will be obvious to everyone in their own field. Drug makers already use this to predict what an x-y-z molecule will do and come up with new compounds.

Gas, mining: everything that captures large enough, high-quality enough data with good expertise linked to it is training machines to recognize patterns.

In law it is quite obvious: finding a list of past precedent cases that have any relevance to your next one, registered patents that could be an issue for what you want to do, every clerk job of that style.

Now with generative AI and newer capabilities, editing (movies, pictures, text, Excel sheets, code, video game assets, etc.) uses it, as does almost anything that has a lot of text examples, and all translation (speech or text). Small game studios have already turned to AI-generated rigging and assets.

Google, for example, could generate really good subtitles for all their content and use them for much better search (and train their models on everything said and shown in videos); the next step is the AI looking at the videos and getting better at suggestions from having watched tons of content, and the step after that, it makes videos itself that people are likely to watch.

Asking who has a use for artificial intelligence is a bit strange: video games, search, maps, first-line phone or other customer service, it is a bit everywhere.

Think about the sentence you wrote: how many industries and fields do not have data in 2023?
There are also translation services for some of the relatively obscure stuff, ASL to CSL for instance.
There is also a strong case for AI in networking security. But AI doesn't necessarily have to mean "AI"; there are lots of system loads that could be vastly improved via ML or CL acceleration, or simply by having access to compute units that can do the job faster than a traditional CPU can. Plenty of day-to-day stuff too: the Chrome RTX upscaling for streaming video is one example, but there are lots of others, like basic photo or video manipulation where having the right onboard hardware lets users do the things they were already doing more smoothly while using less power.
Hardware-assisted audio cleanup for meetings: stripping out background noise in real time, identifying and removing reverb from group conferences. Better cleanup of video degraded by poor lighting, so the system can identify the people in the stream and adjust the video characteristics around each of them individually to try to make everybody clear, rather than just dimming or brightening the video as a whole. Then there are the options for onboard language assistance for document writing; Grammarly is awesome and all, but something built into the OS that doesn't require an active internet connection and account would be better, and it could then potentially work across everything and not just specific programs. There is also a strong case for using GPT models as a form of encyclopedia: Google has gotten very advertisement-heavy, and Wikipedia is awesome as a quick reference, but if you need to chase through it for the sources, that gets more difficult.
People are working with large datasets more than ever; we've just gotten so used to it that most people don't realize just how much they have going on.
 
Well, ROCm is a joke and CUDA dominates just about uncontested. I hope to god Microsoft doesn't try to make their own new standard like everybody else before them and just uses Intel oneAPI; nobody needs yet another mostly proprietary, yet "totally open," barely supported CUDA-alternative wannabe that is dead on arrival except for one specific thing that CUDA already does better.
 
Well, ROCm is a joke and CUDA dominates just about uncontested. I hope to god Microsoft doesn't try to make their own new standard like everybody else before them and just uses Intel oneAPI; nobody needs yet another mostly proprietary, yet "totally open," barely supported CUDA-alternative wannabe that is dead on arrival except for one specific thing that CUDA already does better.
Maybe something like Xbox, Mantle, DirectX 12, etc., for AI!?
 
Maybe I am not following the conversation particularly well, but we are talking about chips to run OpenAI ChatGPT models.

Not something to be sold to anyone else, but more like what Amazon, Google, and Facebook do to run their own stuff: in-house silicon.

And is it something where the API matters? It is so big and costs so much to a giant entity that I imagine they go "full custom" for the task instead of using general-purpose API compute, a bit like ASIC crypto mining:
For example:
https://www.zdnet.com/article/openai-proposes-triton-language-as-an-alternative-to-nvidias-cuda/
 
Maybe I am not following the conversation particularly well, but we are talking about chips to run OpenAI ChatGPT models.

Not something to be sold to anyone else, but more like what Amazon, Google, and Facebook do to run their own stuff: in-house silicon.

And is it something where the API matters? It is so big and costs so much to a giant entity that I imagine they go "full custom" for the task instead of using general-purpose API compute, a bit like ASIC crypto mining:
For example:
https://www.zdnet.com/article/openai-proposes-triton-language-as-an-alternative-to-nvidias-cuda/
OpenAI's ChatGPT models are built on CUDA, so anything they do would need to be a reverse engineering of that work, which would require them to build a framework at least as flexible as the CUDA libraries it uses.
Microsoft has huge data centers full of Nvidia cards and it is expensive, but they also rent those cards out as part of their Azure stack, so they can't replace them unless they offer something equivalent; the same goes for their hosted AI options. Intel's oneAPI at least has CUDA compatibility, as they have developed a conversion library.
Google, Facebook, and Amazon all have their own internal AI stuff, but they are finding it to be severely lacking compared to the GPT 3.5 and 4 models.
They could go full custom like Amazon, Google, and Facebook and use it only for their internal stuff, not offering any of it for public use, but then that just means they are working to develop their own hardware and software, to duplicate and replicate what Nvidia and CUDA already do, to stop having to buy Nvidia hardware for their own internal stuff? I have a hard time seeing a good ROI on such a limited project like that.
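To illustrate where that dependency actually sits, here is a minimal PyTorch sketch (my own illustration, not from the article): the model-level code is mostly device-agnostic, and the hard part of "replacing CUDA" is supplying equally good kernels and libraries underneath this line.

```python
import torch

# High-level model code looks the same regardless of vendor; the backend work
# is in making torch dispatch to fast kernels for the chosen device.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Sequential(
    torch.nn.Linear(4096, 4096),
    torch.nn.GELU(),
    torch.nn.Linear(4096, 4096),
).to(device)

x = torch.randn(8, 4096, device=device)
with torch.no_grad():
    y = model(x)  # on an Nvidia GPU this lands in cuBLAS/cuDNN kernels via CUDA
print(y.shape, y.device)
```

(Notably, ROCm builds of PyTorch expose themselves through the same torch.cuda API, which says a lot about how much of the replacement effort lives below the framework rather than in the model code.)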
 
but then that just means they are working to develop their own hardware and software, to duplicate and replicate what Nvidia and CUDA already do, to stop having to buy Nvidia hardware for their own internal stuff? I have a hard time seeing a good ROI on such a limited project like that.
That is what seems to transpire from about every article talking about it, and it would not stop the Nvidia affair completely (they still have future supercomputers that involve Nvidia); it is that the Nvidia and Microsoft need-cycles and workload optimizations do not align completely, opening the window to use some in-house silicon for some stuff.
In-house silicon, à la Google's tensor processing units (TPUs) or Amazon's Graviton 3, could help Microsoft slash costs. Athena will reportedly cost Microsoft about $100M. Currently, Nvidia owns the lion's share of the GPU market, and Microsoft already extensively uses Nvidia hardware in its Azure server centers. Dylan Patel of research firm SemiAnalysis told The Information, "Athena, if competitive, could reduce the cost per chip by a third when compared with Nvidia's offerings."
Patel estimated that each ChatGPT query costs Microsoft about $0.36. Purpose-built silicon could cut costs for Microsoft by some $84 million.


Maybe you custom-build to run what is well understood and has a massive, very similar day-to-day compute load, to bring the cost per use down (say GPT 4.5), and still use the tip-of-the-spear hardware for the new stuff or the less general workloads.
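Purely as a back-of-the-envelope sketch of how the quoted figures could hang together (the assumption that the whole per-query cost drops by the same third is mine, and optimistic, since chips are only one part of that cost):

```python
# Back-of-the-envelope sketch using the figures quoted above.
# ASSUMPTION: treat "a third off the cost per chip" as a third off the
# per-query cost, which is generous (chips are only part of the per-query cost).
cost_per_query = 0.36            # Patel's estimate, USD
reduction = 1 / 3                # "reduce the cost per chip by a third"
saving_per_query = cost_per_query * reduction   # ~$0.12
reported_total_saving = 84e6                     # "$84 million"

implied_queries = reported_total_saving / saving_per_query
print(f"saving per query: ${saving_per_query:.2f}")
print(f"implied query volume: {implied_queries:,.0f}")  # ~700 million queries
```

That is only meant to show the scale implied by the quoted numbers, not how SemiAnalysis actually arrived at the $84 million estimate.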
 
Microsoft Athena is allegedly AMD, according to Bloomberg

https://www.cnbc.com/2023/05/04/amd...crosoft-is-collaborating-on-ai-chip-push.html

Microsoft to date has not released a special-purpose AI chip. The one in development in partnership with AMD carries the code name Athena, Bloomberg reported.

Microsoft is helping AMD to fund the initiative, Bloomberg reported, citing anonymous sources.


https://www.bloomberg.com/news/arti...helping-finance-amd-s-expansion-into-ai-chips


It makes sense. Much less risk than starting from scratch in-house.
 
Making the hardware work with general, industry-wide software is extremely hard.

This. They are facing an uphill battle. And CUDA is probably walled up in patents.

I can't name a single third-party piece of software that would use Apple's AI accelerator instead of the GPU.
 
Microsoft Athena is allegedly AMD, according to Bloomberg

https://www.cnbc.com/2023/05/04/amd...crosoft-is-collaborating-on-ai-chip-push.html

Microsoft to date has not released a special-purpose AI chip. The one in development in partnership with AMD carries the code name Athena, Bloomberg reported.

Microsoft is helping AMD to fund the initiative, Bloomberg reported, citing anonymous sources.


https://www.bloomberg.com/news/arti...helping-finance-amd-s-expansion-into-ai-chips



Lol @ Microsoft having to wipe AMD's ass for them 🤣
 
https://www.extremetech.com/computing/microsoft-building-codename-athena-ai-chip-on-tsmcs-5nm-node
Microsoft is looking to cut costs by building special-purpose TSMC 5N AI accelerator chips for their data centers, which should cut their costs by two-thirds by reducing their reliance on Nvidia for various acceleration tasks.

Might be enough to get Nvidia to reconsider its pricing as a whole.
Oh, they will double down. Nvidia reduce pricing? Nah... spending more on CUDA, I imagine, will be the reaction.
 
Oh, they will double down. Nvidia reduce pricing? Nah... spending more on CUDA, I imagine, will be the reaction.
Well, I know their recent price increases to GRID have made it so accounting is killing that project when the license is up. So I'm looking forward to making that go away.
 
I can see Microsoft developing a chiplet that could be incorporated right on an EPYC CPU, making communication speeds a hell of a lot faster than on a separate card. AMD's stacking and chiplet-based CPU designs, the 2D and 3D aspects of it, could be put on one package specialized for Microsoft's needs at a much lower cost than developing a whole card and using PCIe or some other method for communication. AMD can add cache, RAM, etc., 3D-stacked right on the CPU module. A totally customized solution that Nvidia could not touch.
 