doc-exports/docs/modelarts/umn/modelarts_01_0015.html
Lai, Weijian 6aa966a79a ModelArts UMN 24.3.0 version
Reviewed-by: Pruthi, Vineet <vineet.pruthi@t-systems.com>
Co-authored-by: Lai, Weijian <laiweijian4@huawei.com>
Co-committed-by: Lai, Weijian <laiweijian4@huawei.com>
2024-11-02 09:04:52 +00:00

Model Deployment

ModelArts manages models and services in one place, so that mainstream framework images and models from multiple vendors can be handled in a unified manner.

Generally, AI model deployment and large-scale implementation are complex.

Figure 1 Process of deploying a model
  • The real-time inference service features high concurrency, low latency, and elastic scaling, and supports multi-model gray release and A/B testing.
  • ModelArts is optimized for the high-performance AI inference chip Ascend 310. It can process petabytes of inference data within a single day, publish over 1 million inference APIs on the cloud, and keep inference network latency within milliseconds.
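
The gray release and A/B testing mentioned above both rely on sending a controlled share of requests to each model version. The following is a minimal sketch of that general idea, not the ModelArts implementation: the function name pick_version, the version labels, and the weights are hypothetical and chosen only for illustration. It hashes a request ID so that the same ID always reaches the same version, while the overall traffic share follows the configured weights.

```python
import hashlib


def pick_version(request_id: str, weights: dict[str, int]) -> str:
    """Map a request ID to a model version in proportion to integer weights.

    Hypothetical helper for illustration; ModelArts handles traffic
    splitting internally and does not expose this function.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Hash the ID so that the same request key always lands in the same bucket.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total

    # Walk the cumulative weight ranges until the bucket falls inside one.
    cumulative = 0
    for version, weight in weights.items():
        cumulative += weight
        if bucket < cumulative:
            return version
    raise AssertionError("unreachable: bucket is always below the total weight")


if __name__ == "__main__":
    # Assumed example: send about 10% of traffic to a new version.
    split = {"model-v1": 90, "model-v2": 10}
    for rid in ("req-001", "req-002", "req-003"):
        print(rid, "->", pick_version(rid, split))
```

Hashing the request ID instead of drawing a random number keeps routing repeatable, which makes it easier to compare versions during an A/B test. Raising the weight of the new version step by step moves traffic over gradually.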