DeepSeek V4 Multimodal Model Coming Next Week with Native Image, Video, and Text AI Generation
DeepSeek is set to release its new V4 multimodal large language model next week, featuring native support for image, video, and text generation. This is the company's first major update since the R1 reasoning model launched in January 2025, and it is aimed at meeting domestic demand for low-cost, open-source models. The company plans to publish a brief technical note alongside the release, with a detailed engineering report following approximately one month later.
DeepSeek is preparing to unveil its next-generation V4 multimodal large language model next week, according to a Financial Times report published on February 28, 2026.
The V4 model will natively support image, video, and text generation, a significant step forward for the Chinese AI startup. The release is DeepSeek's first major version update since it launched the R1 reasoning model in January 2025.
Technical Details and Release Timeline
DeepSeek intends to publish a concise technical note when it launches the V4 model next week, followed by a comprehensive engineering report approximately one month later.
Hardware Adaptation and Domestic Chip Support
On the hardware side, DeepSeek has prioritized support for domestic computing platforms. The company has collaborated with Huawei and Cambricon to optimize the V4 model for their hardware, a move expected to significantly boost demand for Chinese-made semiconductors and accelerate the shift of AI model inference onto domestic chips.
This partnership with Chinese hardware manufacturers marks a departure from previous practice, under which international chipmakers such as NVIDIA and AMD typically received early access to new AI models.
The V4 release is expected to further solidify DeepSeek's position in China's market for low-cost, open-source AI models.