# SenseVoiceSmall-onnx

ONNX quantized export of the SenseVoice multilingual speech-understanding model (Small).

- **Organization:** iic (通义实验室 / Tongyi Lab, Institute for Intelligent Computing)
- **License:** Apache License 2.0
- **Framework:** ONNX
- **Task:** auto-speech-recognition (speech recognition)
- **Tags:** FunASR, CT-Transformer, Alibaba, ICASSP 2020
- **Training data:** 33M-sample online data
- **Test data:** Wikipedia test data; 10,000 industrial Mandarin sentences
- **Downloads:** 777,922 · **Stars:** 47

# Model Introduction

## Highlights
This model is the ONNX quantized export of the [SenseVoice multilingual speech-understanding model (Small)](https://www.modelscope.cn/models/iic/SenseVoiceSmall) and can be used directly for production deployment. A one-click deployment tutorial is available ([click here](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/readme_cn.md)).

## **[ModelScope-FunASR](https://github.com/alibaba-damo-academy/FunASR)**
**[FunASR](https://github.com/alibaba-damo-academy/FunASR)** provides an offline file-transcription service that can be conveniently deployed on a local machine or a cloud server; its core is the already open-sourced FunASR runtime SDK. It integrates capabilities open-sourced by the DAMO Academy Speech Lab on the ModelScope community, including voice activity detection (VAD), Paraformer-large speech recognition (ASR), and punctuation restoration (PUNC). With this complete speech-recognition pipeline it can transcribe tens of hours of audio or video into punctuated text, and it supports hundreds of concurrent transcription requests.

[**What's New**](https://github.com/alibaba-damo-academy/FunASR#whats-new)
| [**Installation**](https://github.com/alibaba-damo-academy/FunASR#installation)
| [**Documentation**](https://alibaba-damo-academy.github.io/FunASR/en/index.html)
| [**Service Deployment**](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/readme_cn.md)
| [**Model Zoo**](https://github.com/alibaba-damo-academy/FunASR/blob/main/docs/model_zoo/modelscope_models.md)
| [**Contact Us**](https://github.com/alibaba-damo-academy/FunASR#contact)

## Quick Start
### Install Docker
If Docker is already installed, skip this step. Otherwise, install Docker on your server with:
```shell
curl -O https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/shell/install_docker.sh
sudo bash install_docker.sh
```
If the installation fails, see [Docker Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/docker.html).

### Start the image
Pull and start the FunASR runtime Docker image ([get the latest image version](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_advanced_guide_offline_zh.md)):

```shell
sudo docker pull \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
mkdir -p ./funasr-runtime-resources/models
sudo docker run -p 10095:10095 -it --privileged=true \
  -v $PWD/funasr-runtime-resources/models:/workspace/models \
  registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
```

### Start the server

Once the container is running, start the funasr-wss-server service:
```shell
cd FunASR/runtime
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/SenseVoiceSmall-onnx \
  --punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --hotword /workspace/models/hotwords.txt > log.out 2>&1 &
```

### Client testing and usage

After running the installation commands above, the client test tools are downloaded into the `samples` directory under ./funasr-runtime-resources (the default installation directory) ([download here](https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/sample/funasr_samples.tar.gz)).
Taking the Python client as an example: it supports multiple audio input formats (.wav, .pcm, .mp3, etc.), video input (.mp4, etc.), and multi-file wav.scp lists. For clients in other languages, see the documentation ([click here](https://alibaba-damo-academy.github.io/FunASR/en/runtime/docs/SDK_tutorial_zh.html#id5)).

```shell
python3 wss_client_asr.py --host "127.0.0.1" --port 10095 --mode offline --audio_in "../audio/asr_example.wav"
```

For more detailed usage, see the documentation ([click here](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_tutorial_zh.md)).

## Related Papers and Citation

```BibTeX
@inproceedings{chen2020controllable,
  title={Controllable Time-Delay Transformer for Real-Time Punctuation Prediction and Disfluency Detection},
  author={Chen, Qian and Chen, Mengzhe and Li, Bo and Wang, Wen},
  booktitle={ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={8069--8073},
  year={2020},
  organization={IEEE}
}
```
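The Python client accepts a multi-file `wav.scp` list as input. A `wav.scp` file conventionally maps one utterance ID to one audio path per line (Kaldi-style); as an illustration of how such a list is structured, here is a minimal, hypothetical parser for that two-column format:

```python
# Minimal parser for a Kaldi-style wav.scp file list, where each line
# maps an utterance ID to an audio file path, e.g.:
#   utt1 /data/audio/asr_example_1.wav
#   utt2 /data/audio/asr_example_2.wav
# This helper is illustrative and not part of the FunASR samples.

def parse_wav_scp(text: str) -> dict[str, str]:
    """Return {utterance_id: audio_path}, skipping blank and comment lines."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split only on the first whitespace so paths may contain spaces.
        utt_id, path = line.split(maxsplit=1)
        entries[utt_id] = path
    return entries

scp = """\
utt1 /data/audio/asr_example_1.wav
utt2 /data/audio/asr_example_2.wav
"""
print(parse_wav_scp(scp))
```

Feeding the resulting paths to the client one by one (or passing the `wav.scp` file itself, as the samples support) lets you batch-transcribe many files against one running server.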
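The client sends raw audio to the websocket server, and 16 kHz 16-bit mono PCM is the usual working format for this pipeline. To make the data volumes concrete, here is a small sketch that slices a PCM buffer into fixed-duration chunks; the 60 ms chunk size is an illustrative choice, not a FunASR requirement:

```python
# Sketch: slice raw 16 kHz, 16-bit mono PCM into fixed-duration chunks,
# the shape of payload a streaming client might send per websocket message.
# The 60 ms duration below is an assumption for illustration only.

SAMPLE_RATE = 16000   # samples per second
BYTES_PER_SAMPLE = 2  # 16-bit PCM

def chunk_pcm(pcm: bytes, chunk_ms: int = 60) -> list[bytes]:
    """Split a PCM byte buffer into chunk_ms-sized pieces (last may be short)."""
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    return [pcm[i:i + chunk_bytes] for i in range(0, len(pcm), chunk_bytes)]

one_second = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE)  # 1 s of silence = 32000 bytes
chunks = chunk_pcm(one_second)
print(len(chunks), len(chunks[0]))  # 17 chunks of up to 1920 bytes each
```

One second of audio is 32,000 bytes at this rate, so even hour-long recordings stay modest in size, which is why the service can handle tens of hours of audio and hundreds of concurrent transcription requests.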
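The server command passes a hotword file via `--hotword /workspace/models/hotwords.txt`. Assuming the common one-entry-per-line "word weight" layout (check the FunASR runtime docs for the authoritative format), a hypothetical validator for such a file could look like this:

```python
# Hypothetical validator for a hotword file like the one passed to
# run_server.sh via --hotword. Assumed layout, one entry per line:
#   <word> <integer weight>
# e.g. "阿里巴巴 20". The exact format is an assumption here; consult
# the FunASR runtime documentation before relying on it.

def parse_hotwords(text: str) -> dict[str, int]:
    """Return {hotword: weight}, raising on malformed lines."""
    hotwords = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line:
            continue
        # The weight is the last space-separated field; the word may
        # itself contain spaces, so split from the right.
        word, _, weight = line.rpartition(" ")
        if not word or not weight.lstrip("-").isdigit():
            raise ValueError(f"line {lineno}: expected '<word> <weight>'")
        hotwords[word] = int(weight)
    return hotwords

print(parse_hotwords("阿里巴巴 20\n达摩院 30\n"))
```

Validating the file before mounting it into the container gives a clearer error than discovering a malformed line in the server log.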