Qwen3-ASR-1.7B Hands-On Tutorial: Integrating WeCom and DingTalk for Automatic Meeting Audio Archiving

Published: 2026/7/5 13:52:38
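
1. Tutorial Overview

In a modern office, transcribing and archiving meeting recordings is time-consuming work, and manual transcription is slow and error-prone. This tutorial shows how to use the Qwen3-ASR-1.7B speech recognition model to transcribe and archive WeCom (企业微信) and DingTalk (钉钉) meeting recordings automatically.

By the end of this tutorial you will know how to:

- Deploy a Qwen3-ASR-1.7B speech recognition service
- Configure automatic retrieval of recording files from WeCom and DingTalk
- Build a complete pipeline that transcribes meeting recordings automatically
- Archive the recognition results to a designated storage location

The setup suits teams that meet often and need written records. It cuts the manual transcription workload and makes it less likely that important meeting content is lost.

2. Environment Setup and Quick Deployment

2.1 System Requirements

Make sure your server meets the following requirements:

- Operating system: Ubuntu 20.04 or later
- GPU: NVIDIA card with at least 24 GB of VRAM (RTX 4090 or A100 recommended)
- Memory: 32 GB or more
- Storage: at least 50 GB of free space

2.2 One-Step Deployment Script

Run the following script as root to deploy the Qwen3-ASR-1.7B service:

```bash
#!/bin/bash
set -e

# Create the project directory
mkdir -p /opt/qwen-asr
cd /opt/qwen-asr

# Install system dependencies (git-lfs is needed to fetch the model weights)
apt update
apt install -y python3.9 python3.9-venv ffmpeg git git-lfs

# Create the virtual environment
python3.9 -m venv venv
source venv/bin/activate

# Install PyTorch and the Python dependencies
pip install torch torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
# requests and schedule are used by the pipeline scripts in later sections
pip install transformers==4.35.0 fastapi uvicorn python-multipart requests schedule

# Download the model (request access to the model beforehand)
git lfs install
git clone https://huggingface.co/Qwen/Qwen3-ASR-1.7B model

echo "Deployment finished. Next, configure the service..."
```

2.3 Starting the Recognition Service

Create the startup script start_service.py:

```python
import io

import torch
import torchaudio
import uvicorn
from fastapi import FastAPI, File, UploadFile
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

app = FastAPI(title="Qwen3-ASR-1.7B service")

# Load the model and processor.
# Loading through AutoModelForSpeechSeq2Seq follows this tutorial's setup;
# confirm the loading class and transformers version against the model card.
model_path = "/opt/qwen-asr/model"
device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if device == "cuda" else torch.float32

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_path,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
)
model.to(device)
processor = AutoProcessor.from_pretrained(model_path)

# Assumed input rate, matching the 16 kHz conversion used in section 6.1
TARGET_SAMPLE_RATE = 16000


@app.post("/transcribe")
async def transcribe_audio(file: UploadFile = File(...)):
    # Read the uploaded audio file
    audio_data = await file.read()
    audio_input, sample_rate = torchaudio.load(io.BytesIO(audio_data))

    # Mix down to mono and resample so the processor receives 1-D audio
    audio_input = audio_input.mean(dim=0)
    if sample_rate != TARGET_SAMPLE_RATE:
        audio_input = torchaudio.functional.resample(
            audio_input, sample_rate, TARGET_SAMPLE_RATE
        )

    # Preprocess the audio
    inputs = processor(
        audio_input.numpy(),
        sampling_rate=TARGET_SAMPLE_RATE,
        return_tensors="pt",
        padding=True,
    )
    # Move every tensor to the model's device; cast float tensors to its dtype
    inputs = {
        name: t.to(device, dtype=torch_dtype) if torch.is_floating_point(t) else t.to(device)
        for name, t in inputs.items()
    }

    # Transcribe
    with torch.no_grad():
        generated_ids = model.generate(**inputs, max_length=448)

    transcription = processor.batch_decode(
        generated_ids, skip_special_tokens=True
    )[0]
    return {"text": transcription}


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

Start the service:

```bash
python start_service.py
```

3. WeCom Integration

3.1 Obtaining WeCom API Credentials

First configure API access in the WeCom admin console:

- Log in to the WeCom admin console
- Go to "App Management" → "Self-built apps" (「应用管理」→「自建应用」)
- Create a new app and note the following values:
  - CorpID (enterprise ID)
  - Secret (app secret)
  - AgentId (app ID)

3.2 Pulling Meeting Recordings Automatically

Create the WeCom recording script wechat_processor.py:

```python
from datetime import datetime

import requests

# The meeting endpoint paths below are reproduced from this tutorial as written;
# verify them against the current WeCom developer documentation.
API_BASE = "https://qyapi.weixin.qq.com/cgi-bin"


class WeChatMeetingProcessor:
    def __init__(self, corp_id, corp_secret, agent_id):
        self.corp_id = corp_id
        self.corp_secret = corp_secret
        self.agent_id = agent_id
        self.access_token = self.get_access_token()

    def get_access_token(self):
        response = requests.get(
            f"{API_BASE}/gettoken",
            params={"corpid": self.corp_id, "corpsecret": self.corp_secret},
            timeout=10,
        )
        return response.json().get("access_token")

    def get_meeting_records(self, start_time, end_time):
        """Fetch meeting records within the given time range."""
        url = f"{API_BASE}/meeting/get_record?access_token={self.access_token}"
        data = {"start_time": start_time, "end_time": end_time, "limit": 100}
        response = requests.post(url, json=data, timeout=30)
        return response.json().get("record", [])

    def download_record(self, record_id, save_path):
        """Download a meeting recording to save_path."""
        url = f"{API_BASE}/meeting/record_download?access_token={self.access_token}"
        response = requests.post(url, json={"record_id": record_id}, timeout=300)
        if response.status_code == 200:
            with open(save_path, "wb") as f:
                f.write(response.content)
            return True
        return False


# Guarded so that importing this module from the main script does not run it
if __name__ == "__main__":
    wechat_processor = WeChatMeetingProcessor(
        corp_id="your CorpID",
        corp_secret="your app Secret",
        agent_id="your AgentId",
    )
    # Fetch all of today's meeting records
    today = datetime.now().strftime("%Y-%m-%d")
    records = wechat_processor.get_meeting_records(
        f"{today} 00:00:00", f"{today} 23:59:59"
    )
```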
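Access tokens returned by `gettoken` expire (the response carries an `expires_in` field, in seconds), while `WeChatMeetingProcessor` fetches a token only once, in `__init__`. One way to keep a long-running process supplied with a valid token is a small cache that refetches shortly before expiry. The sketch below is illustrative only: `TokenCache`, its `fetch` callback, and the 60-second safety margin are hypothetical names and values, not part of the WeCom API.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches an access token and refetches it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, int]],
        margin_seconds: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ):
        # fetch() must return (token, expires_in_seconds)
        self._fetch = fetch
        self._margin = margin_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a cached token, fetching a new one if it is missing or stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            # Treat the token as expired slightly early to avoid edge cases
            self._expires_at = now + max(expires_in - self._margin, 0)
        return self._token
```

In practice `fetch` could wrap the `gettoken` request and return the `access_token` and `expires_in` values from its JSON response; each API call would then read `cache.get()` instead of a stored attribute.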
4. DingTalk Integration

4.1 Configuring the DingTalk Open Platform

- Log in to the DingTalk Open Platform (https://open.dingtalk.com)
- Create an internal enterprise app
- Note the following credentials:
  - AppKey
  - AppSecret
  - AgentId

4.2 Processing DingTalk Meeting Recordings

Create the DingTalk script dingtalk_processor.py:

```python
import requests

# The meeting endpoint paths below are reproduced from this tutorial as written;
# verify them against the current DingTalk Open Platform documentation.
API_BASE = "https://oapi.dingtalk.com"


class DingTalkMeetingProcessor:
    def __init__(self, app_key, app_secret, agent_id):
        self.app_key = app_key
        self.app_secret = app_secret
        self.agent_id = agent_id
        self.access_token = self.get_access_token()

    def get_access_token(self):
        response = requests.get(
            f"{API_BASE}/gettoken",
            params={"appkey": self.app_key, "appsecret": self.app_secret},
            timeout=10,
        )
        return response.json().get("access_token")

    def get_meeting_list(self, start_time, end_time):
        """Fetch the meeting list; times are Unix timestamps in milliseconds."""
        data = {
            "start_time": start_time,
            "end_time": end_time,
            "cursor": 0,
            "size": 100,
        }
        response = requests.post(
            f"{API_BASE}/topapi/meeting/list",
            json=data,
            params={"access_token": self.access_token},
            timeout=30,
        )
        return response.json().get("result", {}).get("items", [])

    def download_meeting_record(self, meeting_id, save_path):
        """Download a meeting recording to save_path."""
        response = requests.post(
            f"{API_BASE}/topapi/meeting/record/get",
            json={"meeting_id": meeting_id},
            params={"access_token": self.access_token},
            timeout=30,
        )
        record_url = response.json().get("result", {}).get("record_url")
        if record_url:
            audio_response = requests.get(record_url, timeout=300)
            if audio_response.status_code == 200:
                with open(save_path, "wb") as f:
                    f.write(audio_response.content)
                return True
        return False


# Guarded so that importing this module from the main script does not run it
if __name__ == "__main__":
    dingtalk_processor = DingTalkMeetingProcessor(
        app_key="your AppKey",
        app_secret="your AppSecret",
        agent_id="your AgentId",
    )
```
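The processor methods read fields straight from `response.json()`, so a rejected request silently yields `None` or an empty list. WeCom and the legacy DingTalk `oapi` endpoints report failures through `errcode` and `errmsg` fields in the JSON body, with `errcode` 0 meaning success. A minimal sketch of a checker follows. `ApiError` and `unwrap` are hypothetical helper names, and treating a missing `errcode` as success is an assumption made for this example.

```python
from typing import Any, Dict


class ApiError(RuntimeError):
    """Raised when an API payload reports a non-zero errcode."""

    def __init__(self, errcode: int, errmsg: str):
        super().__init__(f"API error {errcode}: {errmsg}")
        self.errcode = errcode
        self.errmsg = errmsg


def unwrap(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return the payload unchanged if errcode is 0 or absent, else raise."""
    errcode = payload.get("errcode", 0)  # assumption: absent means success
    if errcode != 0:
        raise ApiError(errcode, payload.get("errmsg", ""))
    return payload
```

Wrapping each call as `unwrap(response.json())` turns an expired token or a missing permission into an exception with the platform's own error message instead of an empty result.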
5. The Complete Automation Pipeline

5.1 Main Scheduling Script

Create the main script meeting_auto_transcribe.py:

```python
import os
import time
from datetime import datetime, timedelta

import requests
import schedule

from dingtalk_processor import DingTalkMeetingProcessor
from wechat_processor import WeChatMeetingProcessor


class MeetingAutoTranscriber:
    def __init__(self):
        # Initialise the platform processors
        self.wechat_processor = WeChatMeetingProcessor(
            corp_id="WeCom CorpID",
            corp_secret="WeCom app Secret",
            agent_id="WeCom AgentId",
        )
        self.dingtalk_processor = DingTalkMeetingProcessor(
            app_key="DingTalk AppKey",
            app_secret="DingTalk AppSecret",
            agent_id="DingTalk AgentId",
        )
        # Create the storage directories
        os.makedirs("audio_files", exist_ok=True)
        os.makedirs("transcriptions", exist_ok=True)

    def transcribe_audio(self, audio_path):
        """Call the ASR service to transcribe one audio file."""
        url = "http://localhost:8000/transcribe"
        with open(audio_path, "rb") as f:
            response = requests.post(url, files={"file": f}, timeout=600)
        if response.status_code == 200:
            return response.json().get("text", "")
        return None

    def process_wechat_meetings(self):
        """Process WeCom meetings."""
        print("Processing WeCom meetings...")
        # Look at meetings from the last 2 hours
        end_time = datetime.now()
        start_time = end_time - timedelta(hours=2)
        records = self.wechat_processor.get_meeting_records(
            start_time.strftime("%Y-%m-%d %H:%M:%S"),
            end_time.strftime("%Y-%m-%d %H:%M:%S"),
        )
        for record in records:
            record_id = record["record_id"]
            meeting_topic = record["meeting_topic"]
            result_path = f"transcriptions/wechat_{record_id}.txt"
            # The 2-hour window overlaps across 30-minute runs: skip finished items
            if os.path.exists(result_path):
                continue
            # Download the recording
            audio_path = f"audio_files/wechat_{record_id}.mp3"
            if self.wechat_processor.download_record(record_id, audio_path):
                # Transcribe
                transcription = self.transcribe_audio(audio_path)
                if transcription:
                    # Save the result
                    with open(result_path, "w", encoding="utf-8") as f:
                        f.write(f"Meeting topic: {meeting_topic}\n")
                        f.write(f"Meeting time: {record.get('meeting_time', '')}\n")
                        f.write(f"Transcript:\n{transcription}\n")
                    print(f"Transcription finished: {meeting_topic}")

    def process_dingtalk_meetings(self):
        """Process DingTalk meetings."""
        print("Processing DingTalk meetings...")
        end_time = int(time.time() * 1000)
        start_time = end_time - 2 * 60 * 60 * 1000  # 2 hours ago, in milliseconds
        meetings = self.dingtalk_processor.get_meeting_list(start_time, end_time)
        for meeting in meetings:
            meeting_id = meeting["meeting_id"]
            meeting_title = meeting["title"]
            result_path = f"transcriptions/dingtalk_{meeting_id}.txt"
            if os.path.exists(result_path):
                continue
            audio_path = f"audio_files/dingtalk_{meeting_id}.mp3"
            if self.dingtalk_processor.download_meeting_record(meeting_id, audio_path):
                transcription = self.transcribe_audio(audio_path)
                if transcription:
                    with open(result_path, "w", encoding="utf-8") as f:
                        f.write(f"Meeting topic: {meeting_title}\n")
                        f.write(f"Transcript:\n{transcription}\n")
                    print(f"Transcription finished: {meeting_title}")

    def run(self):
        """Run one complete processing pass."""
        # Access tokens expire, so refresh them before each pass
        self.wechat_processor.access_token = self.wechat_processor.get_access_token()
        self.dingtalk_processor.access_token = self.dingtalk_processor.get_access_token()
        self.process_wechat_meetings()
        self.process_dingtalk_meetings()

    def start_scheduler(self):
        """Start the recurring job."""
        # Run every 30 minutes
        schedule.every(30).minutes.do(self.run)
        print("Meeting auto-transcription service started; runs every 30 minutes...")
        while True:
            schedule.run_pending()
            time.sleep(1)


if __name__ == "__main__":
    transcriber = MeetingAutoTranscriber()
    transcriber.start_scheduler()
```

5.2 System Service Configuration

Create the service file /etc/systemd/system/meeting-transcriber.service. ExecStart points at the virtual environment's interpreter so that the packages installed in section 2.2 are available to the script:

```ini
[Unit]
Description=Meeting Auto Transcription Service
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/opt/qwen-asr
ExecStart=/opt/qwen-asr/venv/bin/python /opt/qwen-asr/meeting_auto_transcribe.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Enable and start the service:

```bash
sudo systemctl daemon-reload
sudo systemctl enable meeting-transcriber
sudo systemctl start meeting-transcriber
```
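The main script writes every transcript into one flat `transcriptions/` directory, with the platform encoded only as a file-name prefix. If transcripts should be grouped by platform and date instead, the target path can be derived from the meeting metadata. A minimal sketch, assuming a `<platform>/<YYYY>/<MM>/<DD>/<id>.txt` layout. The function name `archive_path` and the layout itself are illustrative choices that the scripts in this tutorial do not define.

```python
from datetime import datetime
from pathlib import Path


def archive_path(root: str, platform: str, meeting_id: str, when: datetime) -> Path:
    """Build <root>/<platform>/<YYYY>/<MM>/<DD>/<meeting_id>.txt."""
    # Replace anything that is not alphanumeric, '-' or '_' so the ID is a safe file name
    safe_id = "".join(c if c.isalnum() or c in "-_" else "_" for c in str(meeting_id))
    return (
        Path(root)
        / platform
        / f"{when:%Y}"
        / f"{when:%m}"
        / f"{when:%d}"
        / f"{safe_id}.txt"
    )
```

Before writing a transcript, the caller would create the directory with `path.parent.mkdir(parents=True, exist_ok=True)`.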
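6. Common Problems and Solutions

6.1 Audio Format Problems

If an audio format is not supported, convert it with FFmpeg:

```python
import subprocess


def convert_audio_format(input_path, output_path):
    """Convert audio to 16 kHz mono 16-bit PCM; the container follows output_path's extension."""
    cmd = [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", input_path,
        "-acodec", "pcm_s16le",
        "-ac", "1",
        "-ar", "16000",
        output_path,
    ]
    subprocess.run(cmd, check=True)


# Usage example
convert_audio_format("input.m4a", "output.wav")
```

6.2 Handling Network Timeouts

Add a retry mechanism to network requests:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def create_session_with_retry():
    """Create a session that retries failed requests."""
    session = requests.Session()
    retry_strategy = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[429, 500, 502, 503, 504],
        # POST is not retried by default; the pipeline scripts use POST
        allowed_methods=["GET", "POST"],
    )
    adapter = HTTPAdapter(max_retries=retry_strategy)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

6.3 Memory Optimization

For a long-running service, add a memory clean-up step:

```python
import gc

import torch


def cleanup_memory():
    """Release cached GPU memory and run garbage collection."""
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    gc.collect()


# Call after a batch has been processed
cleanup_memory()
```

7. Summary

With this tutorial you have built a complete system for archiving meeting audio automatically. The system provides:

- Automatic monitoring: picks up meeting recordings from WeCom and DingTalk
- Transcription: uses Qwen3-ASR-1.7B for high-accuracy speech recognition
- Organized archiving: stores each transcript in a file named by platform and recording ID
- Continuous operation: runs around the clock as a system service

Results in practice:

- Transcription accuracy can exceed 90%, especially for Chinese-language meetings
- Dozens of meeting recordings can be processed per hour
- The time spent on manual transcription drops substantially
- Important meeting content is less likely to be lost

Suggested next steps:

- Add email or message notifications so that the relevant people are told when a transcript is ready
- Integrate with the company knowledge base to support search over transcripts
- Add speaker diarization to separate what each participant said
- Refine the storage policy and delete old audio files periodically

The same approach is not limited to company meetings. It also applies to online education, customer service, and other settings that need speech-to-text.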
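The periodic clean-up of old audio files suggested in the next steps could start from a small helper that deletes files older than a retention period. A minimal sketch, assuming recordings live in a single directory such as `audio_files/`. The name `purge_old_files` and the 30-day default are hypothetical choices made for this example.

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_files(
    directory: str, max_age_days: float = 30, now: Optional[float] = None
) -> List[str]:
    """Delete regular files in directory whose mtime is older than max_age_days.

    Returns the sorted names of the files that were removed.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400
    removed = []
    for path in Path(directory).iterdir():
        # Only top-level regular files are considered; subdirectories are left alone
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

A call such as `purge_old_files("audio_files", max_age_days=30)` could run at the end of each scheduled pass or as a separate daily job.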