
In today’s rapidly developing AI technology landscape, SiliconFlow and Huawei Cloud have joined forces to launch DeepSeek R1&V3 inference services based on Ascend Cloud, providing developers with efficient and stable large model deployment solutions.
Website Introduction
SiliconFlow’s SiliconCloud platform focuses on providing standardized, high-performance generative AI infrastructure, supporting the most advanced large language models and multimodal model inference globally.
Key Features
- Launched DeepSeek R1&V3 model inference services based on Huawei Cloud’s Ascend Cloud.
- Self-developed inference acceleration engine enables model deployment performance comparable to global high-end GPUs.
- Provides stable, production-grade inference services to meet large-scale production environment needs.
- Zero deployment threshold; developers can directly call SiliconCloud API, focusing on application development.
- Offers highly competitive pricing strategies consistent with DeepSeek’s official promotional period prices.
Related Projects
SiliconFlow has also launched six accelerated versions of DeepSeek-R1 distilled models, including DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-32B, etc., with some models available for free.
Advantages
The service achieves a single-card decode throughput exceeding 1920 Tokens/s under a single-user 20 TPS level, comparable to H100 deployment performance, with model accuracy consistent with DeepSeek’s official standards.
Pricing
Offers both free and paid versions; DeepSeek-V3‘s promotional period price is ¥1/M tokens (input) & ¥2/M tokens (output), and DeepSeek-R1’s price is ¥4/M tokens (input) & ¥16/M tokens (output).
Summary
Founded in August 2023 and based in China, SiliconFlow is dedicated to providing efficient and stable AI inference services. Through these innovative features, users can achieve a more efficient and convenient large model deployment experience.
Relevant Navigation


魔搭社区

Kaiber

Gemma

Replicate

Mistral 7B

TensorFlow
