DeepSeek has unveiled the most powerful open-source AI model
The Chinese startup DeepSeek has announced preliminary versions of its new flagship AI model, V4, which it calls the most powerful open-source model available.
The new model comes in two versions: V4 Flash and V4 Pro. The base Flash version has 284 billion parameters, 13 billion of which are active, while the advanced Pro version has 1.6 trillion parameters, 49 billion of which are active.
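To put the reported figures in perspective, the short sketch below computes what share of each model's parameters is active. It uses only the numbers quoted above. The reading of "active" as the parameters actually used at a time is an assumption, since the announcement as reported here does not define the term, and the names `MODELS` and `active_share` are purely illustrative.

```python
# Parameter counts as reported in the article, in billions: (total, active).
# Assumption: "active" means the subset of parameters used at a time;
# the article itself does not define the term.
MODELS = {
    "V4 Flash": (284, 13),
    "V4 Pro": (1600, 49),
}


def active_share(total_b: float, active_b: float) -> float:
    """Return the fraction of the total parameters that is active."""
    return active_b / total_b


for name, (total_b, active_b) in MODELS.items():
    share = active_share(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active")
```

By this rough measure, both versions use only a few percent of their total parameters at a time, which is consistent with the company's claim of a lower cost of use.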
DeepSeek claims that the Pro version approaches the performance of leading global proprietary models, but at a significantly lower cost of use.
The model is built on a hybrid attention architecture, which improves performance in long dialogues and allows it to process large amounts of information. Specifically, it supports a context window of up to 1 million tokens, enabling it to work with large documents or codebases.
The company notes that the computing resources needed to run V4 Pro at full capacity are currently limited, but it expects costs to fall once new clusters built on Huawei chips come online in the second half of the year.
Following this news, shares of Chinese microchip manufacturers rose, while competitors’ stocks fell.
DeepSeek is also in talks to secure investment from major tech companies as part of its first funding round.