Prakat Posted January 26

A little-known AI lab out of China has ignited panic throughout Silicon Valley after releasing AI models that can outperform America’s best despite being built more cheaply and with less-powerful chips.

DeepSeek, as the lab is called, unveiled a free, open-source large language model in late December that it says took only two months and less than $6 million to build, using reduced-capability chips from Nvidia called H800s. The new developments have raised alarms over whether America’s global lead in artificial intelligence is shrinking and called into question big tech’s massive spend on building AI models and data centers.

In a set of third-party benchmark tests, DeepSeek’s model outperformed Meta’s Llama 3.1, OpenAI’s GPT-4o and Anthropic’s Claude Sonnet 3.5 in accuracy on tasks ranging from complex problem-solving to math and coding. DeepSeek on Monday released R1, a reasoning model that also outperformed OpenAI’s latest o1 in many of those third-party tests.

“To see the DeepSeek new model, it’s super impressive in terms of both how they have really effectively done an open-source model that does this inference-time compute, and is super-compute efficient,” Microsoft CEO Satya Nadella said at the World Economic Forum in Davos, Switzerland, on Wednesday. “We should take the developments out of China very, very seriously.”

DeepSeek also had to navigate the strict semiconductor restrictions that the U.S. government has imposed on China, cutting the country off from access to the most powerful chips, like Nvidia’s H100s. The latest advancements suggest DeepSeek either found a way to work around the rules, or that the export controls were not the chokehold Washington intended.

“They can take a really good, big model and use a process called distillation,” said Benchmark General Partner Chetan Puttagunta. “Basically you use a very large model to help your small model get smart at the thing you want it to get smart at. That’s actually very cost-efficient.”

Little is known about the lab and its founder, Liang Wenfeng. DeepSeek was born of a Chinese hedge fund called High-Flyer Quant that manages about $8 billion in assets, according to media reports.

But DeepSeek isn’t the only Chinese company making inroads. Leading AI researcher Kai-Fu Lee has said his startup 01.ai trained its model using only $3 million. TikTok parent company ByteDance on Wednesday released an update to its model that claims to outperform OpenAI’s o1 in a key benchmark test.

“Necessity is the mother of invention,” said Perplexity CEO Aravind Srinivas. “Because they had to figure out work-arounds, they actually ended up building something a lot more efficient.”

Watch this video to learn more.

Source: https://www.cnbc.com/2025/01/24/how-chinas-new-ai-model-deepseek-is-threatening-us-dominance.html
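For anyone wondering what the "distillation" mentioned in the article looks like in practice, here is a minimal sketch of one common variant (soft-label distillation), where a small "student" model is scored on how closely its output distribution matches a large "teacher" model's. This is only an illustration under stated assumptions: the logits are made-up numbers, the function names are hypothetical, and nothing here is claimed to be DeepSeek's actual training recipe.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    A lower value means the student imitates the teacher more closely.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits over three candidate answers (illustrative values only).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different answer

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

In real training this loss would be minimized by gradient descent over many examples, so the student's predictions drift toward the teacher's. The student that already agrees with the teacher gets the smaller loss of the two.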
randomGuy Posted January 26

Using chat.deepseek.com. Finding it significantly better than ChatGPT.
bharathh Posted January 26

39 minutes ago, randomGuy said: Using chat.deepseek.com. Finding it significantly better than ChatGPT.

How are you making the comparison? Also, has anyone tried the reasoning model? That is supposed to be the significant advancement. I've tried using it for root cause analysis of coding errors and so far it has been pretty poor compared to Llama3 and ChatGPT. Better than Gemini and MS's version of ChatGPT, though.
randomGuy Posted January 26

44 minutes ago, bharathh said: How are you making the comparison? Also, has anyone tried the reasoning model? That is supposed to be the significant advancement. I've tried using it for root cause analysis of coding errors and so far it has been pretty poor compared to Llama3 and ChatGPT. Better than Gemini and MS's version of ChatGPT, though.

This might help.
ravishingravi Posted January 26

What's ironic is that the so-called free society of creativity and excellence didn't release a paper or make its models open source. These guys have put a detailed paper out there, including their failures. Amazing stuff.

https://arxiv.org/abs/2405.04434
bharathh Posted January 26

1 hour ago, ravishingravi said: What's ironic is that the so-called free society of creativity and excellence didn't release a paper or make its models open source. These guys have put a detailed paper out there, including their failures. Amazing stuff. https://arxiv.org/abs/2405.04434

Meta made Llama open source. There is no compulsion to make your IP open source. OpenAI was the one to bust open the GenAI space and had a significant first-mover advantage. Nothing wrong in trying to make a buck off your IP. There would be little reason to innovate if everyone made things open source.
randomGuy Posted January 26

6 hours ago, ravishingravi said: What's ironic is that the so-called free society of creativity and excellence didn't release a paper or make its models open source. These guys have put a detailed paper out there, including their failures. Amazing stuff. https://arxiv.org/abs/2405.04434

I think one of China's purposes (not the main purpose) is also to attack the US stock market: Tesla stock through China's EV and battery companies, Apple stock through its smartphone companies, Microsoft stock through its AI model (DeepSeek), etc. This may not be the driving purpose, but it seems to be happening, and the US stock market appears to be in a bubble phase already.
ravishingravi Posted January 27

4 hours ago, randomGuy said: I think one of China's purposes (not the main purpose) is also to attack the US stock market: Tesla stock through China's EV and battery companies, Apple stock through its smartphone companies, Microsoft stock through its AI model (DeepSeek), etc. This may not be the driving purpose, but it seems to be happening, and the US stock market appears to be in a bubble phase already.

It will be NVIDIA mainly. The rest actually stand to benefit. India stands to benefit. This is the thing about technology. Sometimes it pays not to be early.
randomGuy Posted January 27

7 hours ago, ravishingravi said: It will be NVIDIA mainly. The rest actually stand to benefit. India stands to benefit. This is the thing about technology. Sometimes it pays not to be early.

Our prediction is coming true. Nvidia futures are down 7%, Microsoft 6% and Nasdaq 4% due to DeepSeek. This is just the beginning.
Teengunalagaan Posted January 27

9 hours ago, ravishingravi said: It will be NVIDIA mainly. The rest actually stand to benefit. India stands to benefit. This is the thing about technology. Sometimes it pays not to be early.

Could be eerily similar to what happened to Cisco. Data center stocks tanked today.

Embedded tweet (THE SHORT BEAR, @TheShortBear, January 25, 2025): AI bubble popped? If the DeepSeek results are truly real, NVIDIA could find itself in a situation eerily similar to what Cisco faced after the dot-com bubble burst in the early 2000s. During the bubble, Cisco was a dominant supplier of networking equipment, riding a wave of…
bharathh Posted January 27

46 minutes ago, Teengunalagaan said: Could be eerily similar to what happened to Cisco. Data center stocks tanked today.

Not going to happen. How has the AI bubble popped? This is just the beginning! Things will keep evolving and getting faster and better. The demand for GPUs will not go away. So many knee-jerk reactions!

DeepSeek has shown that, with well-optimized reinforcement learning, less infrastructure is needed to fine-tune models. The need doesn't disappear. This is good news, as research effort can now go into improving the memory pool, context window sizes, etc. Others will catch up pretty quickly on this as well.

Good for us consumers too, as price wars will bring market prices down. With these things, the non-determinism and hallucinations can be worked on, which makes AI a more feasible option to use in applications, even critical ones.
mishra Posted January 27

It's probably similar to the two sixth-gen fighter aircraft, when they don't have a decent engine. Short sellers will use the news for some time and make money, but it will fail in the rest of the world, just like Baidu, Huawei and so on.
bharathh Posted January 27

To be honest, I feel the next big thing in tech is going to be computing architecture. Whoever is able to make quantum computers portable, or to build the next generation of architecture optimized for AI, is going to be the big winner. Software will reach a plateau soon without some breakthroughs in architecture. Something to advise the next generation in case someone asks you what they should pursue as a degree.
ravishingravi Posted January 27
ravishingravi Posted January 27

1 hour ago, bharathh said: To be honest, I feel the next big thing in tech is going to be computing architecture. Whoever is able to make quantum computers portable, or to build the next generation of architecture optimized for AI, is going to be the big winner. Software will reach a plateau soon without some breakthroughs in architecture. Something to advise the next generation in case someone asks you what they should pursue as a degree.

Yup, hardware is the next frontier.
bharathh Posted January 28

9 hours ago, ravishingravi said:

We have invested a lot of money in our Digital India platforms, which are the need of the hour. Sure, AI is important as well, but the govt is not going to be running these platforms. Let private players in India step up. The Chinese had nothing to show for the past 2 years. I am sure there are people in India working on these things too, just not production-ready yet. Doesn't mean we have nothing.
Lone Wolf Posted January 28

13 hours ago, ravishingravi said:

Is the code open source?? Vishwaguru may at least benefit from it.
Lone Wolf Posted January 28

15 hours ago, mishra said: It's probably similar to the two sixth-gen fighter aircraft, when they don't have a decent engine. Short sellers will use the news for some time and make money, but it will fail in the rest of the world, just like Baidu, Huawei and so on.

Wrong. Even the Chinese 5th-gen fighter, the J-20, first flew in friggin' 2011. They have had a long time to perfect it since then. Currently it is perfectly suited to take on any Western 5th-gen aircraft.

Always remember that China followed the same industrial model as Japan in the 1960s: first begin by building inferior copies of Western products, then slowly and steadily build up your own indigenous capabilities and, over time, surpass them.
ravishingravi Posted January 28

5 hours ago, bharathh said: We have invested a lot of money in our Digital India platforms, which are the need of the hour. Sure, AI is important as well, but the govt is not going to be running these platforms. Let private players in India step up. The Chinese had nothing to show for the past 2 years. I am sure there are people in India working on these things too, just not production-ready yet. Doesn't mean we have nothing.

What does this mean now? Are they all feeding off each other? Commoditization is done.
ravishingravi Posted January 28

58 minutes ago, Lone Wolf said: Is the code open source?? Vishwaguru may at least benefit from it.

It's open. Yeah, startups have jumped on it already. Dirt cheap too.