AirBox Successfully Ports DeepSeek-R1 Models
The Radxa Fogwise® AirBox has successfully ported the DeepSeek-R1-Distill-Qwen-7B/1.5B models.
Performance Details:
DeepSeek-R1-Distill-Qwen-7B reaches 11 tokens/s
DeepSeek-R1-Distill-Qwen-1.5B reaches 30 tokens/s
The Radxa development team has ported the DeepSeek-R1-Distill-Qwen-7B and 1.5B distilled models to the Fogwise® AirBox. Using the TPU-MLIR toolchain for INT4 quantization and model compilation, we have enabled both DeepSeek-R1 distilled models to run on the AirBox, which provides 32 TOPS of compute.
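To give a sense of why INT4 quantization matters on an edge device, the sketch below estimates the storage needed for the model weights alone at FP16 and at INT4. It is a back-of-the-envelope illustration, not a description of what TPU-MLIR produces: the parameter counts are the nominal sizes taken from the model names (an assumption; the real counts differ somewhat), the helper `weight_footprint_gib` is hypothetical, and quantization metadata, the KV cache, and activations are ignored.

```python
# Back-of-the-envelope weight storage estimate (illustrative only).
# Assumptions: parameter counts are the nominal sizes in the model names;
# quantization metadata (scales, zero points), KV cache, and activations
# are ignored, so real memory use will be higher.

NOMINAL_PARAMS = {
    "DeepSeek-R1-Distill-Qwen-1.5B": 1.5e9,
    "DeepSeek-R1-Distill-Qwen-7B": 7e9,
}


def weight_footprint_gib(num_params: float, bits_per_weight: int) -> float:
    """Return the storage needed for the weights alone, in GiB."""
    if num_params <= 0 or bits_per_weight <= 0:
        raise ValueError("num_params and bits_per_weight must be positive")
    return num_params * bits_per_weight / 8 / 2**30


if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS.items():
        fp16 = weight_footprint_gib(params, 16)
        int4 = weight_footprint_gib(params, 4)
        print(f"{name}: FP16 ~{fp16:.2f} GiB, INT4 ~{int4:.2f} GiB")
```

Under these assumptions, INT4 weights occupy a quarter of the FP16 storage, which is the main reason a 7B-class model becomes practical on a compact device.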
Performance Results
DeepSeek-R1-Distill-Qwen-7B reaches 11 tokens/s, making the AirBox a true edge computing monster. Click to watch the video.
| Model | Quantization | Sequence Length | First Token Latency (s) | Tokens Per Second (tokens/s) |
|---|---|---|---|---|
| deepseek-r1-distill-qwen-1.5b | INT4 | 8192 | 5.159 | 30.448 |
| deepseek-r1-distill-qwen-7b | INT4 | 2048 | 2.843 | 11.008 |
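To put the table's figures into perspective, the time to produce a complete reply can be approximated as the first-token latency plus the remaining tokens divided by the decode throughput. The following is a minimal sketch under that assumption. The function name `estimate_response_seconds` and the 256-token reply length are illustrative and not part of the Radxa tooling, and real timings will vary with prompt length and generation settings.

```python
# Rough end-to-end latency estimate from the benchmark table.
# Assumption: decode speed is constant, and the first token is covered by
# the first-token latency, so the remaining (n - 1) tokens stream at the
# measured tokens/s rate.

BENCHMARKS = {
    "deepseek-r1-distill-qwen-1.5b": {"first_token_s": 5.159, "tokens_per_s": 30.448},
    "deepseek-r1-distill-qwen-7b": {"first_token_s": 2.843, "tokens_per_s": 11.008},
}


def estimate_response_seconds(model: str, output_tokens: int) -> float:
    """Estimate the seconds needed to generate `output_tokens` tokens."""
    if output_tokens < 1:
        raise ValueError("output_tokens must be at least 1")
    stats = BENCHMARKS[model]
    return stats["first_token_s"] + (output_tokens - 1) / stats["tokens_per_s"]


if __name__ == "__main__":
    for name in BENCHMARKS:
        seconds = estimate_response_seconds(name, 256)
        print(f"{name}: ~{seconds:.1f} s for a 256-token reply")
```

The estimate shows how first-token latency dominates short replies while throughput dominates long ones, which is worth keeping in mind when choosing between the two models.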
Model Deployment and Usage
The porting method and detailed documentation for the DeepSeek-R1-Distill-Qwen-7B/1.5B models have been released on the Radxa official website. The models and code are fully open source, and everyone is welcome to try and deploy them.
Fogwise® AirBox Overview
The Radxa Fogwise® AirBox is an embedded AI microserver with up to 32 TOPS of compute. It supports several precisions (INT8, FP16/BF16, FP32) and local deployment of mainstream models, including LLMs, text-to-image generation models, and various CV models. It combines high performance, low power consumption, and strong environmental adaptability. With a variety of deep learning algorithms, it supports applications such as facial recognition, video structuring, behavior analysis, and status monitoring, supporting digital transformation in smart cities, smart transportation, smart energy, smart finance, smart telecom, and smart industries.
Additionally, the Radxa Fogwise® AirBox is fully compatible with edge-deployable large models such as ChatGLM3, Llama3.1, Qwen2.5, Stable Diffusion3, FLUX.1, MiniCPM-V2.6, CLIP, Whisper, and more. For details, please refer to the Radxa official documentation, and feel free to try it out.