DeepSeek-V4 Preview Version Officially Launched, Offering a New Experience with 1M+ Long Context Memory
On April 24th, DeepSeek announced the official launch and open-source release of the preview version of its new series of models, DeepSeek-V4. DeepSeek-V4 boasts an ultra-long context of one million characters and leads domestically and in the open-source field in terms of Agent capabilities, world knowledge, and reasoning performance. The models are available in two versions based on size: deepseek-v4-flash and deepseek-v4-pro.

Starting today, you can log in to the official website or official App to interact with the latest DeepSeek-V4 and explore the new experience of 1M+ long context memory. The API service has been updated simultaneously. You can call it by modifying the model_name to deepseek-v4-pro or deepseek-v4-flash.
Compared to the previous generation models, DeepSeek-V4-Pro’s Agent capabilities have been significantly enhanced. In the Agentic Coding evaluation, V4-Pro has reached the best level among current open-source models and also performs well in other Agent-related evaluations. Currently, DeepSeek-V4 has become the Agentic Coding model used by the company's internal employees. Evaluation feedback shows that the user experience is better than Sonnet 4.5, and the delivery quality is close to Opus 4.6’s non-thinking mode, but there is still a certain gap compared to Opus 4.6’s thinking mode.
It is reported that DeepSeek-V4 pioneered a new attention mechanism that compresses in the token dimension, combined with DSA sparse attention (DeepSeek Sparse Attention), achieving globally leading long context capabilities and significantly reducing the demand for computing and memory compared to traditional methods. From now on, 1M (one million) context will be the standard configuration for all official DeepSeek services.
V4-Pro and V4-Flash have a maximum context length of 1M, both of which simultaneously support non-thinking mode and thinking mode. The thinking mode supports setting the reasoning_effort parameter to adjust the thinking intensity (high/max). For complex Agent scenarios, it is recommended to use thinking mode and set the intensity to max.
Currently, the DeepSeek API has been simultaneously launched with V4-Pro and V4-Flash, supporting OpenAI ChatCompletions interface and Anthropic interface. When accessing the new models, the base_url remains unchanged, but the model parameter needs to be changed to deepseek-v4-pro or deepseek-v4-flash.