Ai/Deepseek: Difference between revisions

From Idiosymbolia
Created page with "=Deepseek== {{Infobox artificial intelligence | name = DeepSeek AI | logo_filename = File:Deepseek-ai-icon-logo.png|270px|alt=DeepSeek AI logo | developer = DeepSeek (Chinese: 深度求索) | type = Large multimodal model | release_date = Initial release June 2023<br>DeepSeek-R1 (January 2024 | license = Open-source (Apache License 2.0) | website_url = https://www.deepseek.com/ | website_display = Deepseek.com }} '''DeepSeek AI''' is..."
 
No edit summary
 
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
=Deepseek==
=Deepseek=


{{Infobox artificial intelligence
{{Infobox artificial intelligence
| name = DeepSeek AI
| name = DeepSeek AI
| logo_filename = File:Deepseek-ai-icon-logo.png|270px|alt=DeepSeek AI logo
| logo_filename = File:Deepseek-ai-icon-logo.png|270px|alt=DeepSeek AI logo
| developer = [[DeepSeek]] (Chinese: 深度求索)
| developer = [[DeepSeek]] (CN: 深度求索)
| type = [[Large language model|Large multimodal model]]
| type = [[Large language model|Large multimodal model]]
| release_date = Initial release June 2023<br>DeepSeek-R1 (January 2024
| release_date = June 2023/R1 Jan. 2024
| license = Open-source ([[Apache License 2.0]])
| license = Open-source ([[Apache License 2.0|Apache 2.0]])
| website_url = https://www.deepseek.com/  
| website_url = https://www.deepseek.com/  
| website_display = Deepseek.com
| website_display = Deepseek.com
Line 13: Line 13:




'''DeepSeek AI''' is an open-source [[artificial intelligence]] project developed by Chinese AI research company [[DeepSeek]] (深度求索). Launched in June 2023, it specializes in developing advanced [[large language model]]s (LLMs) with particular focus on [[machine learning|machine learning research]], [[computer programming|code generation]], and [[multimodal learning]]. The project gained significant attention with its January 2024 release of DeepSeek-R1, a 128K-context multimodal model competitive with leading global AI systems.
 
 
 
'''DeepSeek AI''' is an open-source artificial intelligence project developed by Chinese AI research company DeepSeek (深度求索). Launched in June 2023, it specializes in developing advanced large language models (LLMs) with particular focus on machine learning research, code generation, and multimodal learning. The project gained significant attention with its January 2024 release of DeepSeek-R1, a 128K-context multimodal model competitive with leading global AI systems.


== History ==
== History ==
=== Founding and Early Models (2023) ===
=== Founding and Early Models (2023) ===
* '''June 2023''': DeepSeek AI launched with release of [[DeepSeek-Coder]], an open-source code generation model supporting 80+ programming languages
* '''June 2023''': DeepSeek AI launched with release of DeepSeek-Coder, an open-source code generation model supporting 80+ programming languages
* '''August 2023''': Introduction of [[DeepSeek-VL]], a vision-language multimodal model for image-text understanding
* '''August 2023''': Introduction of DeepSeek-VL, a vision-language multimodal model for image-text understanding
* '''October 2023''': Release of [[DeepSeek Math]], specialized for mathematical reasoning and problem-solving
* '''October 2023''': Release of DeepSeek Math, specialized for mathematical reasoning and problem-solving


=== DeepSeek-R1 Era (2024) ===
=== DeepSeek-R1 Era (2024) ===
Line 26: Line 29:
** Multilingual capabilities (English/Chinese focus)
** Multilingual capabilities (English/Chinese focus)
** Strong performance in coding, mathematics, and reasoning benchmarks
** Strong performance in coding, mathematics, and reasoning benchmarks
* '''March 2024''': Integration with [[Hugging Face]] ecosystem and API service launch
* '''March 2024''': Integration with Hugging Face ecosystem and API service launch
* '''May 2024''': Partnership with [[Linus Torvalds|Linux Foundation]] for open-source AI infrastructure
* '''May 2024''': Partnership with Linux Foundation for open-source AI infrastructure


== Technical Architecture ==
== Technical Architecture ==
Line 57: Line 60:
== Performance ==
== Performance ==
DeepSeek models consistently rank highly in major AI benchmarks:
DeepSeek models consistently rank highly in major AI benchmarks:
* '''[[Hugging Face Open LLM Leaderboard]]''': Top 3 open-source models (as of 2024)
* '''Hugging Face Open LLM Leaderboard''': Top 3 open-source models (as of 2024)
* '''[[HumanEval]]''': 75.6% pass@1 score (DeepSeek-Coder-33B)
* '''HumanEval''': 75.6% pass@1 score (DeepSeek-Coder-33B)
* '''[[MMLU]]''': 82.3% accuracy (DeepSeek-R1)
* '''MMLU''': 82.3% accuracy (DeepSeek-R1)
* '''[[GSM8K]]''': 86.5% accuracy in mathematical reasoning
* '''GSM8K''': 86.5% accuracy in mathematical reasoning


== Ethical Framework ==
== Ethical Framework ==
Line 71: Line 74:
== Community and Open Source ==
== Community and Open Source ==
* '''GitHub Presence''': 15k+ stars across repositories
* '''GitHub Presence''': 15k+ stars across repositories
* '''Model Hosting''': Available on [[Hugging Face]] and [[ModelScope]]
* '''Model Hosting''': Available on Hugging Face and ModelScope
* '''Research Partnerships''': Collaborations with [[Tsinghua University]] and [[Chinese Academy of Sciences]]
* '''Research Partnerships''': Collaborations with Tsinghua University and Chinese Academy of Sciences
* '''License''': Open-source under [[Apache License 2.0]] for research/commercial use
* '''License''': Open-source under Apache License 2.0 for research/commercial use


== Reception ==
== Reception ==
* Praised for "setting new standards in open-source AI" ([[MIT Technology Review]], 2024)
* Praised for "setting new standards in open-source AI" (MIT Technology Review, 2024) [https://www.technologyreview.com/2025/01/24/1110526/china-deepseek-top-ai-despite-sanctions/ How a top Chinese AI model overcame US sanctions]
* Recognized as "China's most promising AI research initiative" ([[South China Morning Post]], 2024)
* Recognized as "China's most promising AI research initiative" (South China Morning Post, 2024)
* Critiqued for limited multilingual support beyond Chinese/English
* Critiqued for limited multilingual support beyond Chinese/English


== See Also ==
== References ==
* [[Large language model]]
<ref>DeepSeek, 'Open-Source Code Language Models' eprint=2312.12996 2023-12-20</ref>
* [[Open-source artificial intelligence]]
<ref>DeepSeek-R1: The First Open-Source 128K Context Multimodal Model [https://www.deepseek.com/blog/r1 DeepSeek AI] 2024-01-10</ref>
* [[Hugging Face]]
<ref>'Comparative Analysis of Large Language Models', 'Journal of AI Research' volume=47 , pages: 112–145 doi=10.1016 j.jair.2024.03.005 / year=2024</ref>
* [[Chinese technology]]
* [[Artificial intelligence in China]]


== References ==
<references/>
{{Reflist|refs=
<ref name="arxiv2023">{{cite arXiv |last=DeepSeek |title=DeepSeek-Coder: Open-Source Code Language Models |eprint=2312.12996 |date=2023-12-20}}</ref>
<ref name="r1release">{{cite web |title=DeepSeek-R1: The First Open-Source 128K Context Multimodal Model |url=https://www.deepseek.com/blog/r1 |publisher=DeepSeek AI |date=2024-01-10}}</ref>
<ref name="benchmarks">{{cite journal |title=Comparative Analysis of Large Language Models |journal=Journal of AI Research |volume=47 |pages=112–145 |doi=10.1016/j.jair.2024.03.005 |year=2024}}</ref>
}}


== External Links ==
== External Links ==
Line 100: Line 96:


[[Category:Artificial intelligence]]
[[Category:Artificial intelligence]]
[[Category:Machine learning]]
 
[[Category:Chinese technology]]
 
[[Category:Open-source artificial intelligence]]
{{Ai/tags|Ai/Deepseek|tg-ai-deepseek}}
[[Category:2023 software]]

Latest revision as of 02:12, 24 September 2025

DeepSeek

DeepSeek AI
Developer: DeepSeek (Chinese: 深度求索)
Release Date: June 2023 (initial release); January 2024 (DeepSeek-R1)
License: Open-source (Apache 2.0)
Website: Deepseek.com



DeepSeek AI is an open-source artificial intelligence project developed by the Chinese AI research company DeepSeek (深度求索). Launched in June 2023, it develops large language models (LLMs) with a particular focus on machine learning research, code generation, and multimodal learning. The project gained significant attention with its January 2024 release of DeepSeek-R1, a multimodal model with a 128K-token context window that is competitive with leading global AI systems.

History


Founding and Early Models (2023)

  • June 2023: DeepSeek AI launched with the release of DeepSeek-Coder, an open-source code generation model supporting more than 80 programming languages
  • August 2023: Introduction of DeepSeek-VL, a vision-language multimodal model for image-text understanding
  • October 2023: Release of DeepSeek Math, a model specialized for mathematical reasoning and problem-solving

DeepSeek-R1 Era (2024)

  • January 2024: Launch of DeepSeek-R1, featuring:
    • 128K token context window
    • Multilingual capabilities (English/Chinese focus)
    • Strong performance in coding, mathematics, and reasoning benchmarks
  • March 2024: Integration with the Hugging Face ecosystem and launch of the API service
  • May 2024: Partnership with the Linux Foundation on open-source AI infrastructure

Technical Architecture


Model Specifications

DeepSeek Model Comparison
Model          | Parameters          | Context Window | Specialization    | Release Date
DeepSeek-Coder | 1.3B-33B            | 16K            | Code generation   | June 2023
DeepSeek-VL    | 7B                  | 32K            | Vision-language   | August 2023
DeepSeek-R1    | Unknown (est. 30B+) | 128K           | General reasoning | January 2024

Key Technologies

  • Hybrid Attention Mechanism: Combines sliding window attention with global token retention
  • Dynamic Token Scaling: Adaptive context management for long documents
  • Multimodal Fusion: Cross-modal alignment architecture in DeepSeek-VL
  • Code-Centric Pretraining: Specialized datasets with 2:1 code-to-text ratio
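
The article gives no implementation details for the hybrid attention mechanism, so the following is only an illustrative sketch of the general idea of combining sliding window attention with always-visible global tokens. The function name `hybrid_attention_mask`, the parameters `window` and `global_positions`, and the choice of a causal mask are all assumptions made for this example, not DeepSeek's actual design.

```python
def hybrid_attention_mask(seq_len, window, global_positions):
    """Build a seq_len x seq_len boolean mask (hypothetical helper).

    mask[q][k] is True when query position q may attend to key position k.
    A key is visible if it is not in the future (causal) and is either
    inside the sliding window ending at q or one of the global positions.
    """
    global_set = set(global_positions)
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            causal = k <= q                # never attend to future tokens
            in_window = q - k < window     # the last `window` tokens, including q
            is_global = k in global_set    # retained global tokens stay visible
            row.append(causal and (in_window or is_global))
        mask.append(row)
    return mask


if __name__ == "__main__":
    # Window of 3 tokens, with position 0 kept as a global token.
    for row in hybrid_attention_mask(seq_len=8, window=3, global_positions=[0]):
        print("".join("x" if allowed else "." for allowed in row))
```

With a fixed window and a fixed number of global tokens, each query attends to at most `window + len(global_positions)` keys, so the number of allowed pairs grows linearly with sequence length rather than quadratically. That is the usual motivation for this kind of mask in long-context models.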

Capabilities

  • Natural Language Processing: Advanced text generation, summarization, translation
  • Programming Assistance: Code completion, debugging, documentation generation
  • Mathematical Reasoning: Solving complex equations, theorem proving
  • Multimodal Understanding: Image-to-text analysis, visual question answering
  • Knowledge Retrieval: Access to current information through web integration

Performance


DeepSeek models are reported to rank highly on major AI benchmarks:

  • Hugging Face Open LLM Leaderboard: Top 3 open-source models (as of 2024)
  • HumanEval: 75.6% pass@1 score (DeepSeek-Coder-33B)
  • MMLU: 82.3% accuracy (DeepSeek-R1)
  • GSM8K: 86.5% accuracy in mathematical reasoning
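
For readers unfamiliar with the pass@1 metric quoted for HumanEval, the sketch below shows the unbiased pass@k estimator introduced with the HumanEval benchmark: for a problem with n sampled solutions of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The function names and the sample data are hypothetical, and the article does not state how its 75.6% figure was computed.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: number of generated samples, c: number that pass the tests,
    k: number of attempts allowed.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that a random size-k subset contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k):
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical data: 4 problems, 10 samples each.
    sample_results = [(10, 10), (10, 5), (10, 0), (10, 8)]
    print(f"pass@1 = {benchmark_pass_at_k(sample_results, 1):.3f}")
    print(f"pass@5 = {benchmark_pass_at_k(sample_results, 5):.3f}")
```

For k = 1 the estimator reduces to c / n, the fraction of samples that pass, so a pass@1 score can be read as the average per-problem success rate of a single attempt.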

Ethical Framework


DeepSeek AI operates under strict ethical guidelines:

  • Transparency: Model cards with detailed training data disclosures
  • Safety Protocols: Multi-layer content filtering system
  • Open Governance: Community input on model deployment policies
  • Bias Mitigation: Dedicated adversarial training regimen

Community and Open Source

  • GitHub Presence: 15k+ stars across repositories
  • Model Hosting: Available on Hugging Face and ModelScope
  • Research Partnerships: Collaborations with Tsinghua University and the Chinese Academy of Sciences
  • License: Open-source under Apache License 2.0 for research/commercial use

Reception

  • Praised for "setting new standards in open-source AI" (MIT Technology Review, 2024; article: "How a top Chinese AI model overcame US sanctions")
  • Recognized as "China's most promising AI research initiative" (South China Morning Post, 2024)
  • Critiqued for limited multilingual support beyond Chinese/English

References


[1] [2] [3]

  1. DeepSeek, "Open-Source Code Language Models", arXiv:2312.12996, 2023-12-20.
  2. "DeepSeek-R1: The First Open-Source 128K Context Multimodal Model", DeepSeek AI, 2024-01-10.
  3. "Comparative Analysis of Large Language Models", Journal of AI Research, vol. 47, pp. 112–145, doi:10.1016/j.jair.2024.03.005, 2024.


This content is generated fully or partially by AI.