
Updated: 2024-01-04 GMT+08:00

Deploying a Service

Function

Deploys a model as a service.

Calling Method

For details, see the instructions on calling APIs.

URI

POST /v1/{project_id}/services

Table 1 Path parameters
Parameter	Mandatory	Type	Description
project_id	Yes	String	User project ID. See the platform documentation for how to obtain it.

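Request Parameters

Table 2 Request header parameters
Parameter	Mandatory	Type	Description
X-Auth-Token	Yes	String	User token, obtained by calling the IAM API for obtaining a user token (the value of X-Subject-Token in the response header).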
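The URI, path parameter, and header described above can be combined as in the following minimal sketch, which only builds the request object and does not send it. The endpoint host, project ID, and token values are placeholders, and the Content-Type header is an assumption that this page does not state.

```python
import json
import urllib.request


def build_deploy_request(endpoint, project_id, token, body):
    """Build (but do not send) the POST request for deploying a service."""
    url = f"https://{endpoint}/v1/{project_id}/services"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "X-Auth-Token": token,
            # Assumption: a JSON body is announced with this header.
            "Content-Type": "application/json",
        },
    )


# Placeholder values for illustration only.
req = build_deploy_request(
    "modelarts.example.com", "my-project-id", "my-token",
    {"infer_type": "real-time"},
)
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without network access.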

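Table 3 Request body parameters
Parameter	Mandatory	Type	Description
workspace_id	No	String	ID of the workspace to which the service belongs. The default value is 0, which indicates the default workspace.
schedule	No	Array of Schedule objects	Service scheduling configuration. It can be set only for real-time services. By default, no schedule is used and the service runs continuously.
cluster_id	No	String	ID of a dedicated resource pool. Empty by default, meaning that no dedicated resource pool is used. When deploying a service in a dedicated resource pool, make sure the cluster is in a normal state. Once this parameter is set, the network configuration of the cluster is used and vpc_id does not take effect. If cluster_id is also set in the real-time config entries below, the value in config takes precedence. When a dedicated resource pool is used, either cluster_id or pool_name must be specified.
pool_name	No	String	ID of a new-version dedicated resource pool. Empty by default, meaning that no dedicated resource pool is used. It corresponds to the resource pool ID of a new-version resource pool. When deploying a service in a dedicated resource pool, make sure the cluster is in a normal state. If pool_name is also set in the real-time config entries below, the value in config takes precedence. When a dedicated resource pool is used, either cluster_id or pool_name must be specified.
infer_type	Yes	String	Inference type. The value can be real-time, batch, or edge. real-time: a real-time service. The model is deployed as a web service with an online test UI and monitoring, and the service keeps running. batch: a batch service, which runs inference on batch data and stops automatically once the data has been processed. edge: an edge service. The model is deployed as a web service on edge nodes through Huawei Cloud Intelligent EdgeFabric (IEF); the nodes must be created in IEF beforehand.
vpc_id	No	String	ID of the VPC in which the real-time service instances are deployed. Empty by default, in which case ModelArts allocates a dedicated VPC to each user and users are isolated from each other. To access other service components in a VPC under your account from the service instances, set this parameter to the ID of that VPC. Once configured, the VPC cannot be changed. If vpc_id and cluster_id are both set, only the dedicated resource pool parameter takes effect.
service_name	Yes	String	Service name, 1 to 64 characters. Letters, Chinese characters, digits, hyphens (-), and underscores (_) are allowed.
description	No	String	Service description. Empty by default. It can contain up to 100 characters and must not contain the characters !<> &"'.
security_group_id	No	String	Security group. Empty by default; mandatory when vpc_id is set. A security group acts as a virtual firewall and provides network access control policies for service instances. It must contain at least one inbound rule that allows requests with protocol TCP, source address 0.0.0.0/0, and port 8080.
subnet_network_id	No	String	Network ID of a subnet. Empty by default; mandatory when vpc_id is set. Use the "Network ID" shown on the subnet details page of the VPC console. A subnet provides dedicated network resources that are isolated from other networks.
config	Yes	Array of ServiceConfig objects	Model running configuration. When infer_type is batch or edge, only one model can be configured. When infer_type is real-time, multiple models can be configured and assigned weights as needed, but the models must have different version numbers.
additional_properties	No	Map<String,ServiceAdditionalProperties>	Service-level additional properties, which make the service easier to manage.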
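Several of the constraints in the table above can be checked on the client before the request is sent. The following is a rough sketch of such a check, not the validation that the service itself performs. The function name is hypothetical, the Unicode range used for "Chinese characters" is an assumption, and the forbidden description characters are taken from the table as printed.

```python
import re

# Letters, digits, hyphens, underscores, and CJK ideographs.
# The U+4E00-U+9FFF range for "Chinese characters" is an assumption.
NAME_RE = re.compile(r"[A-Za-z0-9_\-\u4e00-\u9fff]{1,64}")
# Forbidden characters as they appear in the table above.
FORBIDDEN_IN_DESCRIPTION = set("!<>&\"'")
INFER_TYPES = {"real-time", "batch", "edge"}


def check_service_body(body):
    """Return a list of human-readable problems found in a request body."""
    errors = []
    if not NAME_RE.fullmatch(body.get("service_name", "")):
        errors.append("service_name must be 1-64 allowed characters")
    if body.get("infer_type") not in INFER_TYPES:
        errors.append("infer_type must be real-time, batch, or edge")
    description = body.get("description", "")
    if len(description) > 100:
        errors.append("description exceeds 100 characters")
    if FORBIDDEN_IN_DESCRIPTION & set(description):
        errors.append("description contains a forbidden character")
    if body.get("vpc_id"):
        # Both network parameters become mandatory once vpc_id is set.
        for key in ("security_group_id", "subnet_network_id"):
            if not body.get(key):
                errors.append(f"{key} is required when vpc_id is set")
    if not body.get("config"):
        errors.append("config is required")
    return errors
```

An empty list means that none of the checked rules was violated; it does not guarantee that the service will accept the request.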

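Table 4 Schedule
Parameter	Mandatory	Type	Description
duration	Yes	Integer	Value in the given time unit. For example, to stop the service after 2 hours, set time_unit to hours and duration to 2.
time_unit	Yes	String	Scheduling time unit. The value can be days, hours, or minutes.
type	Yes	String	Scheduling type. Currently, only stop is supported.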
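As an illustration of how the three Schedule fields fit together, the sketch below builds one schedule entry and converts it into a delay. The helper names are hypothetical, and rejecting non-positive durations is an assumption; the table only states that duration is an integer.

```python
from datetime import timedelta

TIME_UNITS = ("days", "hours", "minutes")


def make_stop_schedule(duration, time_unit):
    """Build one entry for the schedule array (only type "stop" exists)."""
    if time_unit not in TIME_UNITS:
        raise ValueError(f"time_unit must be one of {TIME_UNITS}")
    # Assumption: a useful duration is a positive integer.
    if isinstance(duration, bool) or not isinstance(duration, int) or duration <= 0:
        raise ValueError("duration must be a positive integer")
    return {"type": "stop", "time_unit": time_unit, "duration": duration}


def schedule_delay(entry):
    """Translate a schedule entry into the time until the service stops."""
    return timedelta(**{entry["time_unit"]: entry["duration"]})


entry = make_stop_schedule(2, "hours")
print(entry, schedule_delay(entry))
```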

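Table 5 ServiceConfig
Parameter	Mandatory	Type	Description
custom_spec	No	CustomSpec object	Custom resource specification.
envs	No	Map<String,String>	Common parameter. Environment variables (key-value pairs) needed to run the model. Optional; empty by default.
specification	Yes	String	Common parameter. Resource flavor. The list of flavors can be obtained by querying the supported service deployment specifications. The current version supports modelarts.vm.cpu.2u, modelarts.vm.gpu.p4 (application required), modelarts.vm.ai1.a310 (application required), and custom (only for deployment in a dedicated resource pool). To use a flavor that requires application, submit a service ticket so that ModelArts O&M engineers can grant the permission. If the value is custom, custom_spec must also be specified.
weight	No	Integer	Mandatory for real-time services. Traffic weight, as a percentage, allocated to this model. It is required only when infer_type is real-time, and the weights of all models must sum to 100. When several model versions are configured in one real-time service with different weights, ModelArts forwards prediction requests to the corresponding model version instances in proportion to these weights as the prediction API of the service is called continuously.
deploy_timeout_in_seconds	No	Integer	Timeout for deploying a single model instance.
model_id	Yes	String	Common parameter. Model ID, which can be obtained by calling the API for querying the AI application list.
src_path	No	String	Mandatory for batch services. OBS path of the input data of the batch job.
req_uri	No	String	Mandatory for batch services. Inference API called in the batch job, that is, the REST API exposed by the model image. Select one API path from the config.json file of the model for this inference. If a built-in inference image provided by ModelArts is used, the API is /.
mapping_type	No	String	Mandatory for batch services. Mapping type of the input data. The value can be file or csv. file: each inference request corresponds to one file in the input data directory. In this mode, the req_uri of the model can have only one input parameter, and its type must be file. csv: each inference request corresponds to one row of a CSV file. In this mode, files in the input data directory must have the .csv suffix, and mapping_rule must be configured to map each parameter of the inference request body to a CSV index.
cluster_id	No	String	Optional for real-time services. ID of a dedicated resource pool. Empty by default, meaning that no dedicated resource pool is used. When deploying a service in a dedicated resource pool, make sure the cluster is in a normal state. Once this parameter is set, the network configuration of the cluster is used and vpc_id does not take effect.
pool_name	No	String	ID of a new-version dedicated resource pool. Empty by default, meaning that no dedicated resource pool is used. It corresponds to the resource pool ID of a new-version resource pool. When deploying a service in a dedicated resource pool, make sure the cluster is in a normal state. If pool_name is also set at the service level, the value in the real-time config takes precedence.
nodes	No	Array of strings	Mandatory for edge services. Array of edge node IDs. A node ID is the ID of an edge node in IEF and is available after the edge node is created in IEF.
mapping_rule	No	Object	Optional for batch services. Mapping between input parameters and CSV data; required only when mapping_type is csv. The rule is defined in a way similar to the input parameters in the model configuration file config.json: under each parameter of a basic type (string, number, integer, or boolean), set index to the position in the CSV row whose value is sent as that parameter in the inference request. CSV data must be separated by commas (,). index starts from 0; as a special case, a parameter whose index is -1 is ignored. For details, see the example of creating a batch service.
src_type	No	String	Optional for batch services. Data source type. It can be set to manifestfile. Empty by default, meaning that only files in the src_path directory are read. When the value is manifestfile, src_path must be the path of a specific manifest file, in which multiple data paths can be specified (see the inference manifest specification).
dest_path	No	String	Mandatory for batch services. OBS path of the output of the batch job.
instance_count	Yes	Integer	Common parameter. Number of instances deployed for the model. The current maximum is 5; to use more instances, submit a service ticket.
additional_properties	No	Map<String,ModelAdditionalProperties>	Additional model deployment properties, which make service instances easier to manage.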
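Which ServiceConfig fields are needed depends on infer_type. The sketch below encodes the per-type rules from the table as a client-side check. It is an illustrative helper with a hypothetical name, not part of the API, and it cannot verify the rule that models in a real-time service must have different version numbers, because versions are not part of the request.

```python
REQUIRED_COMMON = ("model_id", "specification", "instance_count")
REQUIRED_BY_TYPE = {
    "real-time": ("weight",),
    "batch": ("src_path", "dest_path", "req_uri", "mapping_type"),
    "edge": ("nodes",),
}


def check_config(infer_type, config):
    """Return a list of problems found in the config array."""
    if infer_type not in REQUIRED_BY_TYPE:
        return [f"unknown infer_type: {infer_type}"]
    errors = []
    if infer_type in ("batch", "edge") and len(config) != 1:
        errors.append(f"{infer_type} services accept exactly one model")
    for i, item in enumerate(config):
        for key in REQUIRED_COMMON + REQUIRED_BY_TYPE[infer_type]:
            if key not in item:
                errors.append(f"config[{i}]: missing {key}")
        if item.get("specification") == "custom" and "custom_spec" not in item:
            errors.append(f"config[{i}]: custom requires custom_spec")
        if item.get("mapping_type") == "csv" and "mapping_rule" not in item:
            errors.append(f"config[{i}]: csv requires mapping_rule")
    if infer_type == "real-time":
        # int() also accepts weights written as numeric strings.
        total = sum(int(item.get("weight", 0)) for item in config)
        if total != 100:
            errors.append(f"weights sum to {total}, expected 100")
    return errors
```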

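Table 6 CustomSpec
Parameter	Mandatory	Type	Description
gpu_p4	No	Float	Number of GPUs. Optional; not used by default. Decimals are supported, and the value cannot be less than 0. Up to 2 decimal places are kept; the third decimal place is rounded.
memory	Yes	Integer	Memory in MB. Only integers are supported.
cpu	Yes	Float	Number of CPU cores. Decimals are supported, and the value cannot be less than 0.01. Up to 2 decimal places are kept; the third decimal place is rounded.
ascend_a310	No	Integer	Number of Ascend chips. Optional; not used by default. It cannot be configured together with gpu_p4.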
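The rounding and range rules for CustomSpec can be mirrored on the client as sketched below. Two points are assumptions rather than documented behavior: the range checks are applied to the raw input before rounding, and a value of 0 counts as "not used", so gpu_p4 and ascend_a310 only conflict when both are greater than 0 (the request examples on this page send both as 0).

```python
from decimal import Decimal, ROUND_HALF_UP


def round2(value):
    """Round half up to two decimal places, as the table describes."""
    quantized = Decimal(str(value)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return float(quantized)


def normalize_custom_spec(cpu, memory, gpu_p4=0, ascend_a310=0):
    """Validate the inputs and return a custom_spec dictionary."""
    if cpu < 0.01:
        raise ValueError("cpu cannot be less than 0.01")
    if gpu_p4 < 0:
        raise ValueError("gpu_p4 cannot be less than 0")
    if isinstance(memory, bool) or not isinstance(memory, int):
        raise ValueError("memory must be an integer number of MB")
    # Assumption: 0 means "not used", so only positive values conflict.
    if gpu_p4 > 0 and ascend_a310 > 0:
        raise ValueError("gpu_p4 and ascend_a310 cannot be used together")
    return {
        "cpu": round2(cpu),
        "memory": memory,
        "gpu_p4": round2(gpu_p4),
        "ascend_a310": ascend_a310,
    }


print(normalize_custom_spec(1.5, 7500))
```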

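Table 7 ModelAdditionalProperties
Parameter	Mandatory	Type	Description
log_volume	No	Array of log_volume objects	Host log directory mounting. It is supported only when the service is deployed in a dedicated resource pool. If a public resource pool is used, this parameter cannot be configured; otherwise, an error is reported.
max_surge	No	Float	The value must be greater than 0; the default is 1. A value less than 1 is the percentage of instances added during a rolling upgrade. A value greater than 1 is the maximum number of instances added during a rolling upgrade.
max_unavailable	No	Float	The value must be greater than 0; if not set, the default is 0. A value less than 1 is the percentage of instances that may be removed during a rolling upgrade. A value greater than 1 is the number of instances that may be removed during a rolling upgrade.
termination_grace_period_seconds	No	Integer	Graceful termination period of the container.
persistent_volumes	No	Array of persistent_volumes objects	Persistent storage mounting configuration.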
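The two readings of max_surge (a fraction below 1, an instance count above 1) can be made concrete with a small sketch. The function name is hypothetical. The table does not say how a fractional result is rounded or how a value of exactly 1 is read, so rounding up and treating 1 as one instance are assumptions here.

```python
import math


def surge_instances(max_surge, instance_count):
    """Estimate how many extra instances a rolling upgrade may add."""
    if max_surge <= 0:
        raise ValueError("max_surge must be greater than 0")
    if max_surge < 1:
        # Fraction of the current instance count; rounding up is an assumption.
        return math.ceil(max_surge * instance_count)
    # Absolute number of instances; exactly 1 is read as one instance.
    return int(max_surge)


print(surge_instances(0.5, 4), surge_instances(2, 4))
```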

Table 8 log_volume
Parameter	Mandatory	Type	Description
host_path	Yes	String	Log path on the host to be mapped.
mount_path	Yes	String	Log path in the container.

Table 9 persistent_volumes
Parameter	Mandatory	Type	Description
name	No	String	Name of the storage volume.
mount_path	Yes	String	Path in the container where the storage volume is mounted.

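Table 10 ServiceAdditionalProperties
Parameter	Mandatory	Type	Description
smn_notification	Yes	Map<String,SmnNotification>	SMN notification structure, used to notify users of service status changes.
log_report_channels	No	Array of LogReportPipeline objects	Log channel group. If it is not configured or the array is empty at deployment, LTS log interconnection is not enabled. It cannot be modified after being enabled.
websocket_upgrade	No	Boolean	Whether to upgrade the service interface to WebSocket. When a service is deployed, the default value is false; when the service configuration is updated, the default is the previously set value. false: do not upgrade to WebSocket. true: upgrade to WebSocket. It cannot be modified after being enabled. When WebSocket is enabled, "service traffic limit" cannot be set at the same time.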

Table 11 SmnNotification
Parameter	Mandatory	Type	Description
topic_urn	Yes	String	URN of the SMN topic.
events	Yes	Array of integers	Event IDs. The currently available event IDs are: 1: failed; 3: running; 7: concerning; 11: pending.
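Because events takes numeric IDs, a name-to-ID lookup makes the notification structure easier to read. The sketch below builds one SmnNotification object from event names. The helper name and the topic URN are placeholders, and how this object is keyed inside the smn_notification map is not specified in this section, so it is left out.

```python
# Event IDs copied from the table above.
SMN_EVENT_IDS = {"failed": 1, "running": 3, "concerning": 7, "pending": 11}


def make_smn_notification(topic_urn, event_names):
    """Build an SmnNotification object from readable event names."""
    unknown = [name for name in event_names if name not in SMN_EVENT_IDS]
    if unknown:
        raise ValueError(f"unknown events: {unknown}")
    # Deduplicate and sort so the output is stable.
    events = sorted({SMN_EVENT_IDS[name] for name in event_names})
    return {"topic_urn": topic_urn, "events": events}


# The URN below is a placeholder, not a real topic.
print(make_smn_notification("urn:smn:example-topic", ["running", "failed"]))
```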

Table 12 LogReportPipeline
Parameter	Mandatory	Type	Description
type	Yes	String	Log channel type. Currently, only LTS is supported.
configuration	No	LtsConfiguration object	LTS log configuration.

Table 13 LtsConfiguration
Parameter	Mandatory	Type	Description
log_group_id	Yes	String	LTS log group ID. Length: 64.
log_stream_id	Yes	String	LTS log stream ID. Length: 64.

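Response Parameters

Status code: 200

Table 14 Response body parameters
Parameter	Type	Description
service_id	String	Service ID.
resource_ids	Array of strings	Array of resource IDs, that is, the IDs of the resources generated for the models of the service.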

Example Requests

Example request: creating a real-time service

POST https://{endpoint}/v1/{project_id}/services
{
  "infer_type" : "real-time",
  "service_name" : "mnist",
  "description" : "mnist service",
  "config" : [ {
    "specification" : "modelarts.vm.cpu.2u",
    "weight" : 100,
    "model_id" : "0e07b41b-173e-42db-8c16-8e1b44cc0d44",
    "instance_count" : 1
  } ]
}

Example request: creating a real-time service with traffic split across multiple versions

POST https://{endpoint}/v1/{project_id}/services
{
  "service_name" : "mnist",
  "description" : "mnist service",
  "infer_type" : "real-time",
  "config" : [ {
    "model_id" : "xxxmodel-idxxx",
    "weight" : 70,
    "specification" : "modelarts.vm.cpu.2u",
    "instance_count" : 1,
    "envs" : {
      "model_name" : "mxnet-model-1",
      "load_epoch" : "0"
    }
  }, {
    "model_id" : "xxxxxx",
    "weight" : 30,
    "specification" : "modelarts.vm.cpu.2u",
    "instance_count" : 1
  } ]
}
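To give a feel for what a 70/30 split means in practice, the sketch below simulates weighted random routing over many requests. This only illustrates proportional forwarding. The actual routing algorithm used by ModelArts is not described on this page, and the model IDs are placeholders.

```python
import random
from collections import Counter


def simulate_routing(config, n_requests, seed=0):
    """Pick a model for each request with probability proportional to weight."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    model_ids = [item["model_id"] for item in config]
    weights = [int(item["weight"]) for item in config]
    picks = rng.choices(model_ids, weights=weights, k=n_requests)
    return Counter(picks)


# Placeholder model IDs for illustration only.
demo_config = [
    {"model_id": "model-a", "weight": 70},
    {"model_id": "model-b", "weight": 30},
]
counts = simulate_routing(demo_config, 10000)
print({model: count / 10000 for model, count in counts.items()})
```

With enough requests, the observed shares approach the configured weights; for a small number of requests the split can deviate noticeably.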

Example request: creating a real-time service with a custom flavor in a dedicated resource pool

POST https://{endpoint}/v1/{project_id}/services
{
  "service_name" : "realtime-demo",
  "description" : "",
  "infer_type" : "real-time",
  "cluster_id" : "8abf68a969c3cb3a0169c4acb24b0000",
  "config" : [ {
    "model_id" : "eb6a4a8c-5713-4a27-b8ed-c7e694499af5",
    "weight" : 100,
    "cluster_id" : "8abf68a969c3cb3a0169c4acb24b0000",
    "specification" : "custom",
    "custom_spec" : {
      "cpu" : 1.5,
      "memory" : 7500,
      "gpu_p4" : 0,
      "ascend_a310" : 0
    },
    "instance_count" : 1
  } ]
}

Example request: creating a real-time service that stops automatically

POST https://{endpoint}/v1/{project_id}/services
{
  "service_name" : "service-demo",
  "description" : "demo",
  "infer_type" : "real-time",
  "config" : [ {
    "model_id" : "xxxmodel-idxxx",
    "weight" : 100,
    "specification" : "modelarts.vm.cpu.2u",
    "instance_count" : 1
  } ],
  "schedule" : [ {
    "type" : "stop",
    "time_unit" : "hours",
    "duration" : 1
  } ]
}

Example request: creating a batch service with the input data mapping type "file"

POST https://{endpoint}/v1/{project_id}/services
{
  "service_name" : "batchservicetest",
  "description" : "",
  "infer_type" : "batch",
  "cluster_id" : "8abf68a969c3cb3a0169c4acb24b****",
  "config" : [ {
    "model_id" : "598b913a-af3e-41ba-a1b5-bf065320f1e2",
    "specification" : "modelarts.vm.cpu.2u",
    "instance_count" : 1,
    "src_path" : "https://infers-data.obs.xxxxx.com/xgboosterdata/",
    "dest_path" : "https://infers-data.obs.xxxxx.com/output/",
    "req_uri" : "/",
    "mapping_type" : "file"
  } ]
}

Example request: creating a batch service with the input data mapping type "csv"

POST https://{endpoint}/v1/{project_id}/services
{
  "service_name" : "batchservicetest",
  "description" : "",
  "infer_type" : "batch",
  "config" : [ {
    "model_id" : "598b913a-af3e-41ba-a1b5-bf065320f1e2",
    "specification" : "modelarts.vm.cpu.2u",
    "instance_count" : 1,
    "src_path" : "https://infers-data.obs.xxxxx.com/xgboosterdata/",
    "dest_path" : "https://infers-data.obs.xxxxx.com/output/",
    "req_uri" : "/",
    "mapping_type" : "csv",
    "mapping_rule" : {
      "type" : "object",
      "properties" : {
        "data" : {
          "type" : "object",
          "properties" : {
            "req_data" : {
              "type" : "array",
              "items" : [ {
                "type" : "object",
                "properties" : {
                  "input5" : {
                    "type" : "number",
                    "index" : 0
                  },
                  "input4" : {
                    "type" : "number",
                    "index" : 1
                  },
                  "input3" : {
                    "type" : "number",
                    "index" : 2
                  },
                  "input2" : {
                    "type" : "number",
                    "index" : 3
                  },
                  "input1" : {
                    "type" : "number",
                    "index" : 4
                  }
                }
              } ]
            }
          }
        }
      }
    }
  } ]
}
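The way a mapping_rule turns one CSV row into an inference request body can be sketched as a recursive walk over the rule. This is an illustration of the described index semantics (0-based, -1 means ignore), not the code the batch service runs. The type conversions, in particular how booleans are written in the CSV, are assumptions, and the small rule below is a shortened variant of the one in the request example.

```python
import csv

# How each basic type is parsed from CSV text (an assumption).
CASTS = {
    "string": str,
    "number": float,
    "integer": int,
    "boolean": lambda text: text.strip().lower() == "true",
}


def build_request(rule, row):
    """Fill a mapping_rule with the values of one CSV row."""
    kind = rule["type"]
    if kind == "object":
        return {
            name: build_request(sub, row)
            for name, sub in rule.get("properties", {}).items()
            if sub.get("index") != -1  # index -1: parameter is ignored
        }
    if kind == "array":
        return [build_request(item, row) for item in rule.get("items", [])]
    return CASTS[kind](row[rule["index"]])


demo_rule = {
    "type": "object",
    "properties": {"data": {"type": "object", "properties": {"req_data": {
        "type": "array",
        "items": [{"type": "object", "properties": {
            "input5": {"type": "number", "index": 0},
            "input1": {"type": "number", "index": 4},
            "unused": {"type": "string", "index": -1},
        }}],
    }}}},
}
row = next(csv.reader(["1,2,3,4,5"]))
print(build_request(demo_rule, row))
```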

Example request: creating an edge service

POST https://{endpoint}/v1/{project_id}/services
{
  "service_name" : "service-edge-demo",
  "description" : "",
  "infer_type" : "edge",
  "config" : [ {
    "model_id" : "eb6a4a8c-5713-4a27-b8ed-c7e694499af5",
    "specification" : "custom",
    "instance_count" : 1,
    "custom_spec" : {
      "cpu" : 1.5,
      "memory" : 7500,
      "gpu_p4" : 0,
      "ascend_a310" : 0
    },
    "envs" : { },
    "nodes" : [ "2r8c4fb9-t497-40u3-89yf-skui77db0472" ]
  } ]
}

Example Responses

Status code: 200

The service is deployed.

{
  "service_id" : "10eb0091-887f-4839-9929-cbc884f1e20e",
  "resource_ids" : [ "inf-f878991839647358@1598319442708" ]
}
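A client might read the two response fields into a small typed structure, as in the sketch below. The class and function names are hypothetical; the JSON text is the sample response shown above.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class DeployResult:
    service_id: str
    resource_ids: List[str]


def parse_deploy_response(text):
    """Parse the body of a 200 response into a DeployResult."""
    payload = json.loads(text)
    return DeployResult(
        service_id=payload["service_id"],
        resource_ids=list(payload.get("resource_ids", [])),
    )


sample = """{
  "service_id" : "10eb0091-887f-4839-9929-cbc884f1e20e",
  "resource_ids" : [ "inf-f878991839647358@1598319442708" ]
}"""
result = parse_deploy_response(sample)
print(result.service_id, result.resource_ids)
```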

Status Codes

Status Code	Description
200	The service is deployed.

Error Codes

See the error code reference.
