AIDive

DeepSeek

Chinese ChatGPT competitor for chat, coding, and analysis

Description

DeepSeek V3 is a large language model developed by the Chinese company DeepSeek. It’s used for text generation and analysis, translation, and writing code, with an interface similar to ChatGPT or Claude.

What DeepSeek can do
  • Answer questions and handle general chat requests
  • Write and debug code
  • Work with large volumes of text for research and analysis
  • Browse the web and review a webpage when you provide a link
  • Use DeepThink mode for more deliberate analysis and fact-checking

Apps and availability
  • Android app
  • iOS app

Technical notes and benchmarks

DeepSeek reports a 671B-parameter model trained on 14.8T tokens over about two months on an Nvidia H800 cluster, at an estimated cost of about $5.5M. On Codeforces, DeepSeek V3 scored higher than Llama 3.1 and GPT-4o.
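The cost and duration figures above can be sanity-checked with simple arithmetic. The sketch below is illustrative only. The GPU-hour total, the rental rate, and the cluster size are not stated on this page; they are assumptions taken from DeepSeek's own V3 technical report as commonly cited, and the function names are hypothetical.

```python
# Assumed inputs (from DeepSeek's V3 technical report as commonly cited,
# NOT from this page): total H800 GPU hours, an assumed rental price,
# and the number of GPUs in the training cluster.
GPU_HOURS_TOTAL = 2_788_000
RENTAL_USD_PER_GPU_HOUR = 2.0
CLUSTER_GPUS = 2048


def training_cost_usd(gpu_hours: float, usd_per_gpu_hour: float) -> float:
    """Rental-style cost estimate: GPU hours times hourly price."""
    return gpu_hours * usd_per_gpu_hour


def wall_clock_days(gpu_hours: float, gpus: int) -> float:
    """Days of wall-clock time if every GPU in the cluster runs in parallel."""
    return gpu_hours / gpus / 24


cost = training_cost_usd(GPU_HOURS_TOTAL, RENTAL_USD_PER_GPU_HOUR)
days = wall_clock_days(GPU_HOURS_TOTAL, CLUSTER_GPUS)

print(f"Estimated cost: ${cost / 1e6:.2f}M")
print(f"Wall-clock time: about {days:.0f} days")
```

Under these assumptions the estimate lands in line with the roughly $5.5M and two-month figures quoted above. It covers GPU rental for the final training run only, not research, staff, or hardware purchase costs.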

Benchmark disputes

Independent testing shows that results vary with the evaluation method. For example, Alejandro Cuadron reported different scores for OpenAI o1 depending on the framework used, and found DeepSeek V3 roughly comparable to o1 on programming tasks, while Anthropic's Claude 3.5 Sonnet scored higher in that setup.
