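feat: add support for multiple TTS services and improve audio processing

- Add support for the Fish Audio, Doubao TTS, Gemini TTS, and Minimax TTS services
- Implement audio volume and speech-rate adjustment
- Add a configuration file and a test script for each TTS service
- Update the README with the new features and sample audio
- Refactor the TTS adapter code for better extensibility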
This commit is contained in:
hex2077
2025-08-10 21:40:10 +08:00
parent b277b2068a
commit 78d4c81173
27 changed files with 8323 additions and 625 deletions

3
.gitignore vendored

@@ -1,4 +1,5 @@
# Ignore Python cache directories
__pycache__/
output/
excalidraw.log
excalidraw.log
config/tts_providers-local.json

48
CLAUDE.md Normal file

@@ -0,0 +1,48 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Common Commands
* **Generate a podcast**:
    ```bash
    python podcast_generator.py [optional arguments]
    ```
    Optional arguments:
    * `--api-key <YOUR_OPENAI_API_KEY>`: OpenAI API key.
    * `--base-url <YOUR_OPENAI_BASE_URL>`: OpenAI API proxy URL.
    * `--model <OPENAI_MODEL_NAME>`: OpenAI model to use; defaults to `gpt-3.5-turbo`.
    * `--threads <NUMBER_OF_THREADS>`: Number of parallel threads for audio generation; defaults to `1`.
    **Example**:
    ```bash
    python podcast_generator.py --api-key sk-xxxxxx --model gpt-4o --threads 4
    ```
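## High-Level Code Architecture
This project is a simple podcast generator. Its core function is to use AI to write a podcast script and convert that script to audio.
* **`podcast_generator.py`**: The main script. It coordinates the whole podcast generation pipeline:
    * Reads the configuration files (`config/*.json`).
    * Reads the input file (`input.txt`) and the AI prompt files (`prompt/*.txt`).
    * Calls the OpenAI API to generate the podcast outline and the detailed script.
    * Calls the configured TTS service to generate audio.
    * Merges the generated audio files with FFmpeg.
    * Accepts command-line arguments for the OpenAI API settings and the thread count.
* **`config/`**: JSON files that configure the TTS services and podcast roles, for example `edge-tts.json`. These files define `podUsers` (podcast roles), `voices` (available voices), and `apiUrl` (the TTS service endpoint).
* **`prompt/`**: Prompt files that guide the AI's content generation.
    * `prompt-overview.txt`: Generates the overall podcast outline.
    * `prompt-podscript.txt`: Generates the detailed dialogue script; contains placeholders (`{{numSpeakers}}`, `{{turnPattern}}`).
* **`input.txt`**: The user's podcast topic or key points. It can also embed a `custom` code block that supplies extra instructions to the AI.
* **`openai_cli.py`**: The module that talks to the OpenAI API.
* **`output/`**: Directory where the generated podcast audio files (`.wav`) are saved.
* **TTS service integration**: The project is designed to be flexible and supports multiple TTS services, configured through `apiUrl` in `config/*.json`. It currently supports locally deployed `index-tts` and `edge-tts`, and in principle can integrate web TTS services (such as OpenAI TTS or Azure TTS).
* **Audio merging**: FFmpeg concatenates each role's voice segments into one complete podcast audio file. FFmpeg must be installed and available on the system `PATH`.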
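The audio-merging step described above could be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes FFmpeg's concat demuxer is used, and the helper names `concat_list_text` and `concat_command` as well as the file names are hypothetical.

```python
def concat_list_text(segments):
    """Return the contents of an FFmpeg concat list file for the given paths.

    Assumption: paths contain no single quotes (those would need escaping).
    """
    return "".join(f"file '{path}'\n" for path in segments)


def concat_command(list_path, output_path):
    """Return the FFmpeg command that merges the files named in list_path.

    Assumption: all segments share one codec and sample format, so the
    audio stream can be copied (-c copy) instead of re-encoded.
    """
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", list_path,
        "-c", "copy",
        output_path,
    ]
```

After writing `concat_list_text(...)` to the list file, the command can be run with `subprocess.run(cmd, check=True)`, which raises if FFmpeg is missing from `PATH` or exits with an error.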


@@ -14,7 +14,7 @@
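* **🤖 AI-Driven Scripting**: Automatically generate high-quality, in-depth podcast dialogue scripts with the powerful OpenAI model.
* **👥 Multi-Role Support**: Freely define multiple podcast roles (e.g., host, guest) and assign a unique TTS voice to each role.
* **🔌 Flexible TTS Integration**: Seamlessly connect with your self-built or third-party TTS services through simple API URL configuration.
* **🔊 Smart Audio Merging**: Automatically and precisely stitch together the voice segments of each role into a complete, smooth podcast audio file (`.wav` format).
* **🔊 Smart Audio Merging**: Automatically and precisely stitch together the voice segments of each role, with support for **volume and speed adjustment**, into a complete, smooth podcast audio file (`.wav` format).
* **⌨️ Convenient Command-Line Interface**: Clear command-line parameters give you full control over every step of podcast generation.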
---
@@ -144,8 +144,9 @@ python podcast_generator.py --api-key sk-xxxxxx --model gpt-4o --threads 4
}
```
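* `tts_max_retries` (optional): Maximum number of retries when a TTS API call fails (default `3`).
* `podUsers`: Defines the **roles** in the podcast. Each role's `code` must correspond to a valid voice in the `voices` list.
* `voices`: Defines all available TTS **voices**
* `voices`: Defines all available TTS **voices**. Each voice may include the parameters `volume_adjustment` (volume adjustment in dB, e.g. `6.0` raises the volume by 6 dB, `-3.0` lowers it by 3 dB) and `speed_adjustment` (speed adjustment in percent, e.g. `10.0` speeds speech up by 10%, `-10.0` slows it down by 10%).
* `apiUrl`: Your TTS service API endpoint. `{{text}}` is replaced with the dialogue text, and `{{voiceCode}}` is replaced with the role's voice code.
* `turnPattern`: Defines the **turn-taking pattern** of the dialogue, e.g. `random` or `sequential`.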
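The `apiUrl` placeholder substitution described in the list above might look like this. It is a sketch under assumptions: the function name `fill_api_url` and the example URL are hypothetical, and percent-encoding is assumed because both values end up in a URL query string.

```python
from urllib.parse import quote


def fill_api_url(template, text, voice_code):
    """Substitute the {{text}} and {{voiceCode}} placeholders in an apiUrl.

    Both values are percent-encoded first so that spaces, '&', and
    non-ASCII characters cannot break the query string.
    """
    return (
        template
        .replace("{{text}}", quote(text, safe=""))
        .replace("{{voiceCode}}", quote(voice_code, safe=""))
    )


# Hypothetical local endpoint, for illustration only.
url = fill_api_url(
    "http://localhost:8000/tts?t={{text}}&v={{voiceCode}}",
    "hi there",
    "zh-CN-A",
)
```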
@@ -165,17 +166,16 @@ python podcast_generator.py --api-key sk-xxxxxx --model gpt-4o --threads 4
* **edge-tts**: [https://github.com/zuoban/tts](https://github.com/zuoban/tts)
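* This is a general-purpose TTS library that you can integrate through a custom adapter.
### 🌐 Web TTS Interface Support (Unfinished)
### 🌐 Web TTS Interface Support
This project can also be configured to integrate various web TTS services; just make sure your `apiUrl` configuration meets the provider's requirements. Commonly supported services include:
* **OpenAI TTS**
* **Azure TTS**
* **Google Cloud Text-to-Speech (Vertex AI)**
* **Minimax TTS**
* **Gemini TTS** (may require integration through a custom API adapter)
* **Fish Audio TTS**
* **Doubao TTS (豆包 TTS)**
* **Gemini TTS**
* **OpenAI TTS** (planned)
* **Azure TTS** (planned)
* **Google Cloud Text-to-Speech (Vertex AI)** (planned)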
---
## 🎉 Output
@@ -189,11 +189,27 @@ python podcast_generator.py --api-key sk-xxxxxx --model gpt-4o --threads 4
* **Edge TTS sample**:
[edgeTTS](https://github.com/user-attachments/assets/3891cf4c-f47f-4c9b-aef6-30ffb3fcefc4)
[edgeTTS](example/edgeTTS.wav)
* **Index TTS sample**:
[indexTTS](https://github.com/user-attachments/assets/a1d2ebee-3e9a-43cb-bc94-67e3c9b3c45a)
[indexTTS](example/indexTTS.wav)
* **Doubao TTS sample**:
[doubaoTTS](example/doubaoTTS.wav)
* **Minimax sample**:
[minimax](example/minimax.wav)
* **Fish Audio sample**:
[fish](example/fish.wav)
* **Gemini TTS sample**:
[geminiTTS](example/geminiTTS.wav)
These audio files demonstrate the tool's results in practice.
@@ -205,15 +221,29 @@ python podcast_generator.py --api-key sk-xxxxxx --model gpt-4o --threads 4
```
.
├── config/ # ⚙️ Configuration files directory
│ ├── doubao-tts.json
│ ├── edge-tts.json
│ └── index-tts.json
│ ├── fish-audio.json
│ ├── gemini-tts.json
│ ├── index-tts.json
│ ├── minimax.json
│ └── tts_providers.json
├── prompt/ # 🧠 AI prompt files directory
│ ├── prompt-overview.txt
│ └── prompt-podscript.txt
├── example/ # 🎧 Sample audio directory
│ ├── doubaoTTS.wav
│ ├── edgeTTS.wav
│ ├── fish.wav
│ ├── geminiTTS.wav
│ ├── indexTTS.wav
│ └── minimax.wav
├── output/ # 🎉 Output audio directory
├── input.txt # 🎙️ Podcast topic input file
├── openai_cli.py # OpenAI command-line tool
├── podcast_generator.py # 🚀 Main script
└── README.md # 📄 Project documentation
├── README.md # 📄 Project documentation
├── README_EN.md # 📄 English documentation
└── tts_adapters.py # TTS adapters
```


@@ -14,7 +14,7 @@ This is a powerful script tool that leverages the intelligence of **OpenAI API**
* **🤖 AI-Driven Scripting**: Automatically generate high-quality, in-depth podcast dialogue scripts with the powerful OpenAI model.
* **👥 Multi-Role Support**: Freely define multiple podcast roles (e.g., host, guest) and assign a unique TTS voice to each role.
* **🔌 Flexible TTS Integration**: Seamlessly connect with your self-built or third-party TTS services through simple API URL configuration.
* **🔊 Smart Audio Merging**: Automatically and precisely stitch together voice segments from various roles to synthesize a complete, smooth podcast audio file (`.wav` format).
* **🔊 Smart Audio Merging**: Automatically and precisely stitch together voice segments from various roles, and support volume and speed adjustment, to synthesize a complete, smooth podcast audio file (`.wav` format).
* **⌨️ Convenient Command-Line Interface**: Provides clear command-line parameters, giving you full control over every aspect of the podcast generation process.
---
@@ -145,6 +145,8 @@ The configuration file is the "brain" of the entire project, telling the script
```
* `podUsers`: Defines the **roles** in the podcast. The `code` for each role must correspond to a valid voice in the `voices` list.
* `tts_max_retries` (optional): The maximum number of retries when a TTS API call fails (default is `3`).
* `voices`: Defines all available TTS **voices**, which can include `volume_adjustment` (volume adjustment in dB, e.g., `6.0` to increase by 6dB, `-3.0` to decrease by 3dB) and `speed_adjustment` (speed adjustment in percentage, e.g., `10.0` to increase speed by 10%, `-10.0` to decrease speed by 10%) parameters.
* `voices`: Defines all available TTS **voices**.
* `apiUrl`: Your TTS service API endpoint. `{{text}}` will be replaced with the dialogue text, and `{{voiceCode}}` will be replaced with the character's voice code.
* `turnPattern`: Defines the **turn-taking pattern** for character dialogue, such as `random` or `sequential`.
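To make the units of `volume_adjustment` and `speed_adjustment` concrete, the conversions can be sketched as below. The commit's actual implementation is not shown in this diff, so this is an assumption-laden illustration: the function names are hypothetical, and it assumes FFmpeg's `volume` and `atempo` audio filters are the mechanism.

```python
def db_to_gain(db):
    """Convert a dB adjustment to a linear amplitude factor (20*log10 rule)."""
    return 10 ** (db / 20)


def percent_to_tempo(percent):
    """Convert a speed adjustment in percent to a tempo multiplier."""
    return 1 + percent / 100


def ffmpeg_audio_filter(volume_adjustment=0.0, speed_adjustment=0.0):
    """Build an FFmpeg audio filter string for the two per-voice settings.

    Assumption: the result is passed to FFmpeg via -filter:a. A value of 0
    means "no change", so that filter is left out.
    """
    filters = []
    if volume_adjustment:
        filters.append(f"volume={volume_adjustment}dB")
    if speed_adjustment:
        filters.append(f"atempo={percent_to_tempo(speed_adjustment):.3f}")
    return ",".join(filters)
```

With the README's example values, `6.0` dB is roughly a doubling of amplitude (`db_to_gain(6.0)` is about 1.995), and `10.0` percent corresponds to a tempo factor of 1.1.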
@@ -165,16 +167,17 @@ You can deploy the following open-source projects as local TTS services and inte
* **edge-tts**: [https://github.com/zuoban/tts](https://github.com/zuoban/tts)
* This is a general TTS library that you can integrate by customizing an adapter.
### 🌐 Web TTS Interface Support (Pending)
### 🌐 Web TTS Interface Support
This project can also be easily configured to integrate various web TTS services. Just ensure your `apiUrl` configuration meets the service provider's requirements. Commonly supported services include:
* **OpenAI TTS**
* **Azure TTS**
* **Google Cloud Text-to-Speech (Vertex AI)**
* **Minimax TTS**
* **Gemini TTS** (may require integration via custom API adapter)
* **Fish Audio TTS**
* **Doubao TTS**
* **Gemini TTS**
* **OpenAI TTS** (Planned)
* **Azure TTS** (Planned)
* **Google Cloud Text-to-Speech (Vertex AI)** (Planned)
---
@@ -188,12 +191,27 @@ You can find sample podcast audio generated using different TTS services in the
* **Edge TTS Sample**:
[edgeTTS](https://github.com/user-attachments/assets/3891cf4c-f47f-4c9b-aef6-30ffb3fcefc4)
[edgeTTS](example/edgeTTS.wav)
* **Index TTS Sample**:
[indexTTS](https://github.com/user-attachments/assets/a1d2ebee-3e9a-43cb-bc94-67e3c9b3c45a)
[indexTTS](example/indexTTS.wav)
* **Doubao TTS Sample**:
[doubaoTTS](example/doubaoTTS.wav)
* **Minimax Sample**:
[minimax](example/minimax.wav)
* **Fish Audio Sample**:
[fish](example/fish.wav)
* **Gemini TTS Sample**:
[geminiTTS](example/geminiTTS.wav)
These audio files demonstrate the actual effect of this tool in practical applications.
@@ -204,13 +222,27 @@ These audio files demonstrate the actual effect of this tool in practical applic
```
.
├── config/ # ⚙️ Configuration Files Directory
│ ├── doubao-tts.json
│ ├── edge-tts.json
│ └── index-tts.json
│ ├── fish-audio.json
│ ├── gemini-tts.json
│ ├── index-tts.json
│ ├── minimax.json
│ └── tts_providers.json
├── prompt/ # 🧠 AI Prompt Files Directory
│ ├── prompt-overview.txt
│ └── prompt-podscript.txt
├── example/ # 🎧 Sample Audio Directory
│ ├── doubaoTTS.wav
│ ├── edgeTTS.wav
│ ├── fish.wav
│ ├── geminiTTS.wav
│ ├── indexTTS.wav
│ └── minimax.wav
├── output/ # 🎉 Output Audio Directory
├── input.txt # 🎙️ Podcast Topic Input File
├── openai_cli.py # OpenAI Command Line Tool
├── podcast_generator.py # 🚀 Main Running Script
└── README.md # 📄 Project Documentation
├── README.md # 📄 Project Documentation
├── README_EN.md # 📄 English Documentation
└── tts_adapters.py # TTS Adapter File


@@ -0,0 +1,107 @@
import base64
import copy
import json
import time

import requests


def check_doubao_tts_voices():
    config_file_path = "config/doubao-tts.json"
    tts_providers_path = "config/tts_providers.json"
    test_text = "你好"  # Test text

    try:
        with open(config_file_path, 'r', encoding='utf-8') as f:
            config_data = json.load(f)
    except FileNotFoundError:
        print(f"Error: config file not found, check the path: {config_file_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse JSON file: {config_file_path}")
        return

    url = config_data.get("apiUrl", "")
    headers = config_data.get("headers", {})
    request_payload = config_data.get("request_payload", {})
    voices = config_data.get('voices', [])

    try:
        with open(tts_providers_path, 'r', encoding='utf-8') as f:
            tts_providers_data = json.load(f)
        doubao_config = tts_providers_data.get('doubao', {})
        doubao_app_id = doubao_config.get('X-Api-App-Id')
        doubao_access_key = doubao_config.get('X-Api-Access-Key')
        if doubao_app_id and doubao_access_key:
            headers['X-Api-App-Id'] = doubao_app_id
            headers['X-Api-Access-Key'] = doubao_access_key
        else:
            print(f"Warning: Doubao X-Api-App-Id or X-Api-Access-Key not found in {tts_providers_path}.")
    except FileNotFoundError:
        print(f"Error: TTS providers config file not found, check the path: {tts_providers_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse TTS providers JSON file: {tts_providers_path}")
        return

    print(f"Verifying {len(voices)} Doubao TTS voices...")
    for voice in voices:
        voice_code = voice.get('code')
        voice_name = voice.get('alias', voice.get('name', 'unknown'))  # Prefer alias, fall back to name
        if voice_code:
            print(f"Testing voice: {voice_name} (Code: {voice_code})")
            session = requests.Session()
            try:
                # Deep copy: a shallow copy would share the nested req_params dict
                # with the template and mutate it on every iteration.
                payload = copy.deepcopy(request_payload)
                payload['req_params']['text'] = test_text
                payload['req_params']['speaker'] = voice_code
                response = session.post(url, headers=headers, json=payload, stream=True, timeout=30)
                logid = response.headers.get('X-Tt-Logid')
                if logid:
                    print(f" X-Tt-Logid: {logid}")
                audio_data = bytearray()
                if response.status_code == 200:
                    # The response is a stream of JSON lines carrying base64 audio chunks.
                    for chunk in response.iter_lines(decode_unicode=True):
                        if not chunk:
                            continue
                        data = json.loads(chunk)
                        if data.get("code", 0) == 0 and "data" in data and data["data"]:
                            chunk_audio = base64.b64decode(data["data"])
                            audio_data.extend(chunk_audio)
                            continue
                        if data.get("code", 0) == 0 and "sentence" in data and data["sentence"]:
                            continue
                        if data.get("code", 0) == 20000000:
                            break
                        if data.get("code", 0) > 0:
                            print(f"{voice_name} (Code: {voice_code}): API returned an error: {data}")
                            audio_data = bytearray()
                            break
                if audio_data:
                    print(f"{voice_name} (Code: {voice_code}): available")
                    with open(f"test_{voice_code}.mp3", "wb") as f:
                        f.write(audio_data)
                elif not audio_data and response.status_code == 200:
                    print(f"{voice_name} (Code: {voice_code}): request succeeded but no audio data was received.")
                else:
                    print(f"{voice_name} (Code: {voice_code}): unavailable, HTTP status: {response.status_code}, response: {response.text}")
            except requests.exceptions.RequestException as e:
                print(f"{voice_name} (Code: {voice_code}): request failed, error: {e}")
            finally:
                session.close()
            time.sleep(0.5)
        else:
            print(f"Skipping a voice entry without a 'code' field: {voice}")
    print("Doubao TTS voice verification complete.")


if __name__ == "__main__":
    check_doubao_tts_voices()
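
The stream-handling loop in the script above can be isolated into a pure function, which makes it testable without network access. This is a sketch, not part of the commit: `collect_audio` is a hypothetical name, and the status codes (`0` for a normal chunk, `20000000` for end of stream, any other positive code for an error) are taken from the checks in the script rather than from the service's documentation.

```python
import base64
import json


def collect_audio(lines):
    """Accumulate base64 audio chunks from an iterable of JSON lines.

    Returns (audio_bytes, error): error is None on success, otherwise the
    decoded JSON object that carried the error code.
    """
    audio = bytearray()
    for line in lines:
        if not line:
            continue  # skip blank keep-alive lines
        data = json.loads(line)
        code = data.get("code", 0)
        if code == 0:
            if data.get("data"):
                audio.extend(base64.b64decode(data["data"]))
            continue
        if code == 20000000:
            break  # end-of-stream marker
        if code > 0:
            return b"", data  # discard partial audio on error
    return bytes(audio), None
```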


@@ -0,0 +1,80 @@
import json
import time

import msgpack
import requests


def check_fishaudio_voices():
    config_file_path = "config/fish-audio.json"
    tts_providers_path = "config/tts_providers.json"
    test_text = "你好"  # Test text

    try:
        with open(config_file_path, 'r', encoding='utf-8') as f:
            config_data = json.load(f)
    except FileNotFoundError:
        print(f"Error: config file not found, check the path: {config_file_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse JSON file: {config_file_path}")
        return

    voices = config_data.get('voices', [])
    request_payload = config_data.get('request_payload', {})
    headers = config_data.get('headers', {})
    url = config_data.get('apiUrl', '')

    try:
        with open(tts_providers_path, 'r', encoding='utf-8') as f:
            tts_providers_data = json.load(f)
        fish_api_key = tts_providers_data.get('fish', {}).get('api_key')
        if fish_api_key:
            headers['Authorization'] = f"Bearer {fish_api_key}"
        else:
            print(f"Warning: Fish Audio API key not found in {tts_providers_path}.")
    except FileNotFoundError:
        print(f"Error: TTS providers config file not found, check the path: {tts_providers_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse TTS providers JSON file: {tts_providers_path}")
        return

    if not voices:
        print("No voices found in the config file.")
        return

    print(f"Verifying {len(voices)} Fish Audio voices...")
    for voice in voices:
        voice_code = voice.get('code')
        voice_name = voice.get('alias', voice.get('name', 'unknown'))  # Prefer alias, fall back to name
        if voice_code:
            print(f"Testing voice: {voice_name} (Code: {voice_code})")
            try:
                # Prepare the request data (only top-level keys change, so a shallow copy is enough)
                payload = request_payload.copy()
                payload['text'] = test_text
                payload['reference_id'] = voice_code
                # Encode the request data as MessagePack
                encoded_payload = msgpack.packb(payload)
                # Send the request
                response = requests.post(url, data=encoded_payload, headers=headers, timeout=30)
                if response.status_code == 200:
                    print(f"{voice_name} (Code: {voice_code}): available")
                    with open(f"test_{voice_code}.mp3", "wb") as f:
                        f.write(response.content)
                else:
                    print(f"{voice_name} (Code: {voice_code}): unavailable, status code: {response.status_code}")
            except requests.exceptions.RequestException as e:
                print(f"{voice_name} (Code: {voice_code}): request failed, error: {e}")
            time.sleep(0.5)  # Short delay to avoid sending requests too quickly
        else:
            print(f"Skipping a voice entry without a 'code' field: {voice}")
    print("Fish Audio voice verification complete.")


if __name__ == "__main__":
    check_fishaudio_voices()


@@ -0,0 +1,87 @@
import base64
import json
import time
import wave

import requests


def check_gemini_voices():
    config_file_path = "config/gemini-tts.json"
    tts_providers_path = "config/tts_providers.json"
    test_text = "你好"  # Test text

    try:
        with open(config_file_path, 'r', encoding='utf-8') as f:
            config_data = json.load(f)
    except FileNotFoundError:
        print(f"Error: config file not found, check the path: {config_file_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse JSON file: {config_file_path}")
        return

    voices = config_data.get('voices', [])
    request_payload = config_data.get('request_payload', {})
    headers = config_data.get('headers', {})
    url = config_data.get('apiUrl', '')

    try:
        with open(tts_providers_path, 'r', encoding='utf-8') as f:
            tts_providers_data = json.load(f)
        gemini_api_key = tts_providers_data.get('gemini', {}).get('api_key')
        if gemini_api_key:
            headers['x-goog-api-key'] = gemini_api_key
        else:
            print(f"Warning: Gemini API key not found in {tts_providers_path}.")
    except FileNotFoundError:
        print(f"Error: TTS providers config file not found, check the path: {tts_providers_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse TTS providers JSON file: {tts_providers_path}")
        return

    if not voices:
        print("No voices found in the config file.")
        return

    print(f"Verifying {len(voices)} Gemini voices...")
    for voice in voices:
        voice_code = voice.get('code')
        voice_name = voice.get('alias', voice.get('name', 'unknown'))  # Prefer alias, fall back to name
        if voice_code:
            print(f"Testing voice: {voice_name} (Code: {voice_code})")
            try:
                url = url.replace('{{model}}', request_payload['model'])
                request_payload['contents'][0]['parts'][0]['text'] = test_text
                request_payload['generationConfig']['speechConfig']['voiceConfig']['prebuiltVoiceConfig']['voiceName'] = voice_code
                response = requests.post(url, headers=headers, json=request_payload, timeout=60)
                if response.status_code == 200:
                    response_data = response.json()
                    audio_data_base64 = response_data['candidates'][0]['content']['parts'][0]['inlineData']['data']
                    audio_data_pcm = base64.b64decode(audio_data_base64)
                    print(f"{voice_name} (Code: {voice_code}): available")
                    # The wave module writes a WAV container, so use a .wav extension
                    # (the original used .mp3 for this WAV data).
                    with wave.open(f"test_{voice_code}.wav", "wb") as f:
                        f.setnchannels(1)      # mono
                        f.setsampwidth(2)      # 16-bit samples
                        f.setframerate(24000)  # 24 kHz
                        f.writeframes(audio_data_pcm)
                else:
                    print(f"{voice_name} (Code: {voice_code}): unavailable, status code: {response.status_code}, response: {response.text}")
            except requests.exceptions.RequestException as e:
                print(f"{voice_name} (Code: {voice_code}): request failed, error: {e}")
            except Exception as e:
                print(f"{voice_name} (Code: {voice_code}): failed to process the response, error: {e}")
            time.sleep(0.5)  # Short delay to avoid sending requests too quickly
        else:
            print(f"Skipping a voice entry without a 'code' field: {voice}")
    print("Gemini voice verification complete.")


if __name__ == "__main__":
    check_gemini_voices()
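
The script above writes the decoded PCM through the standard `wave` module, so the saved file is a WAV container whatever its file name says. The sketch below shows the same wrapping step in memory, using the parameters from the script (mono, 16-bit samples, 24 kHz). The function name `pcm_to_wav_bytes` is hypothetical and not part of the commit.

```python
import io
import wave


def pcm_to_wav_bytes(pcm, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample; 2 means 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# 240 silent 16-bit mono frames, i.e. 10 ms at 24 kHz.
wav_bytes = pcm_to_wav_bytes(b"\x00\x00" * 240)
```

Building the file in memory keeps the conversion separate from disk I/O, so the result can be written to `output/` or handed straight to a later processing step.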


@@ -0,0 +1,100 @@
import copy
import json
import time

import requests


def check_minimax_voices():
    config_file_path = "config/minimax.json"
    tts_providers_path = "config/tts_providers.json"
    test_text = "你好"  # Test text

    try:
        with open(config_file_path, 'r', encoding='utf-8') as f:
            config_data = json.load(f)
    except FileNotFoundError:
        print(f"Error: config file not found, check the path: {config_file_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse JSON file: {config_file_path}")
        return

    voices = config_data.get('voices', [])
    request_payload = config_data.get('request_payload', {})
    headers = config_data.get('headers', {})
    url = config_data.get('apiUrl', '')

    try:
        with open(tts_providers_path, 'r', encoding='utf-8') as f:
            tts_providers_data = json.load(f)
        minimax_config = tts_providers_data.get('minimax', {})
        minimax_api_key = minimax_config.get('api_key')
        minimax_group_id = minimax_config.get('group_id')
        if minimax_api_key and minimax_group_id:
            headers['Authorization'] = f"Bearer {minimax_api_key}"
            url = url.replace('{{group_id}}', minimax_group_id)
        else:
            print(f"Warning: Minimax group_id or api_key not found in {tts_providers_path}.")
    except FileNotFoundError:
        print(f"Error: TTS providers config file not found, check the path: {tts_providers_path}")
        return
    except json.JSONDecodeError:
        print(f"Error: could not parse TTS providers JSON file: {tts_providers_path}")
        return

    if not voices:
        print("No voices found in the config file.")
        return

    print(f"Verifying {len(voices)} Minimax voices...")
    for voice in voices:
        voice_code = voice.get('code')
        voice_name = voice.get('alias', voice.get('name', 'unknown'))  # Prefer alias, fall back to name
        if voice_code:
            print(f"Testing voice: {voice_name} (Code: {voice_code})")
            try:
                # Prepare the request data. Deep copy: a shallow copy would share
                # the nested voice_setting dict with the template.
                payload = copy.deepcopy(request_payload)
                payload['text'] = test_text
                payload['voice_setting']['voice_id'] = voice_code
                # Send the request
                response = requests.post(url, json=payload, headers=headers, timeout=30)
                if response.status_code == 200:
                    # Check the status in the response body
                    try:
                        response_data = response.json()
                        status = response_data.get('data', {}).get('status', 0)
                        if status == 2:
                            print(f"{voice_name} (Code: {voice_code}): available")
                            # Decode and save the audio data
                            audio_hex = response_data.get('data', {}).get('audio')
                            if audio_hex:
                                try:
                                    audio_bytes = bytes.fromhex(audio_hex)
                                    with open(f"test_{voice_code}.mp3", "wb") as audio_file:
                                        audio_file.write(audio_bytes)
                                except (ValueError, OSError) as e:  # ValueError for invalid hex, OSError for file errors
                                    print(f" ❌ Error while saving the audio file: {e}")
                            else:
                                print(" ⚠️ No audio data found in the response.")
                        else:
                            print(f"{voice_name} (Code: {voice_code}): unavailable, status: {status}")
                    except json.JSONDecodeError:
                        print(f"{voice_name} (Code: {voice_code}): could not parse the response JSON")
                else:
                    print(f"{voice_name} (Code: {voice_code}): unavailable, status code: {response.status_code}")
            except requests.exceptions.RequestException as e:
                print(f"{voice_name} (Code: {voice_code}): request failed, error: {e}")
            time.sleep(0.5)  # Short delay to avoid sending requests too quickly
        else:
            print(f"Skipping a voice entry without a 'code' field: {voice}")
    print("Minimax voice verification complete.")


if __name__ == "__main__":
    check_minimax_voices()
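
The response handling in the script above reduces to two checks and a hex decode. The sketch below isolates that logic so it can be exercised without calling the API. `extract_minimax_audio` is a hypothetical helper, and the response shape (`data.status == 2`, hex string in `data.audio`) is inferred from the script, not from the service's documentation.

```python
def extract_minimax_audio(response_data):
    """Return decoded audio bytes from a Minimax-style response, or None.

    Returns None when the status is not 2 or the audio field is missing or
    empty. Raises ValueError if the audio field is not valid hex.
    """
    data = response_data.get("data") or {}
    if data.get("status") != 2:
        return None
    audio_hex = data.get("audio")
    if not audio_hex:
        return None
    return bytes.fromhex(audio_hex)


# "494433" is the hex encoding of the ASCII bytes "ID3".
sample = extract_minimax_audio({"data": {"status": 2, "audio": "494433"}})
```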

3024
config/doubao-tts.json Normal file

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

514
config/fish-audio.json Normal file

@@ -0,0 +1,514 @@
{
"voices": [
{
"name": "DingZhen",
"alias": "丁真",
"code": "54a5170264694bfc8e9ad98df7bd89c3",
"locale": "zh-CN",
"gender": "Male",
"usedname": "丁真",
"audio": "https://platform.r2.fish.audio/task/5f55a97cbbf84d059f8249c489c90fce.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "SaiMaNiang",
"alias": "赛马娘",
"code": "0eb38bc974e1459facca38b359e13511",
"locale": "zh-CN",
"gender": "Female",
"usedname": "赛马娘",
"audio": "https://platform.r2.fish.audio/task/0db5556c7b194f629b4489bc2f290b4f.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "HeiShou",
"alias": "黑手",
"code": "f7561ff309bd4040a59f1e600f4f4338",
"locale": "zh-CN",
"gender": "Male",
"usedname": "黑手",
"audio": "https://platform.r2.fish.audio/task/40da92cba22e4fcca734a09bf9042410.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "CaiXuKun",
"alias": "蔡徐坤",
"code": "e4642e5edccd4d9ab61a69e82d4f8a14",
"locale": "zh-CN",
"gender": "Male",
"usedname": "蔡徐坤",
"audio": "https://platform.r2.fish.audio/task/a85f0ef9b94d4f8da515d9597985286c.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "YangShiPeiYin",
"alias": "央视配音",
"code": "59cb5986671546eaa6ca8ae6f29f6d22",
"locale": "zh-CN",
"gender": "Male",
"usedname": "央视配音",
"audio": "https://platform.r2.fish.audio/task/121e958e29924b01807894e1aeb767c7.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "SunXiaoChuan258",
"alias": "孙笑川258",
"code": "e80ea225770f42f79d50aa98be3cedfc",
"locale": "zh-CN",
"gender": "Male",
"usedname": "孙笑川258",
"audio": "https://platform.r2.fish.audio/task/6c0db70dae9344d881b1d58cecc39fbb.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "LeiJun",
"alias": "雷军",
"code": "aebaa2305aa2452fbdc8f41eec852a79",
"locale": "zh-CN",
"gender": "Male",
"usedname": "雷军",
"audio": "https://platform.r2.fish.audio/task/86476c07c9164b40bb0a3bd47add9e63.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "DingZhenRuiKeWuDaiBan",
"alias": "丁真(锐刻五代版)",
"code": "332941d1360c48949f1b4e0cabf912cd",
"locale": "zh-CN",
"gender": "Male",
"usedname": "丁真(锐刻五代版)",
"audio": "https://platform.r2.fish.audio/task/eed599aa641f4360a46f84208d183835.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "XuanChuanPianDaQiYouYangHunHou",
"alias": "【宣传片】(大气悠扬浑厚)",
"code": "dd43b30d04d9446a94ebe41f301229b5",
"locale": "zh-CN",
"gender": "Male",
"usedname": "【宣传片】(大气悠扬浑厚)",
"audio": "https://platform.r2.fish.audio/task/74b32b61c30e4db99428c66bcfccbbda.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "WangKunShengYinMoXing1017t1",
"alias": "王琨声音模型10.17t1",
"code": "4f201abba2574feeae11e5ebf737859e",
"locale": "zh-CN",
"gender": "Female",
"usedname": "王琨声音模型10.17t1",
"audio": "https://platform.r2.fish.audio/task/b84575f0f0cf47a7aa314656c1677a49.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "DongYuHui",
"alias": "董宇辉",
"code": "c7cbda1c101c4ce8906c046f01eca1a2",
"locale": "zh-CN",
"gender": "Male",
"usedname": "董宇辉",
"audio": "https://platform.r2.fish.audio/task/b1b292a2f99044f0862c3eb41595cfe7.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "JiangJieShi",
"alias": "蒋介石",
"code": "918a8277663d476b95e2c4867da0f6a6",
"locale": "zh-CN",
"gender": "Male",
"usedname": "蒋介石",
"audio": "https://platform.r2.fish.audio/task/1310984d9c7f4a5cbf72d873fd5893ff.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "GuoDeGang",
"alias": "郭德纲",
"code": "7c66db6e457c4d53b1fe428a8c547953",
"locale": "zh-CN",
"gender": "Male",
"usedname": "郭德纲",
"audio": "https://platform.r2.fish.audio/task/845a1e1389a44f9e804cf64ea68ac7e6.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "XiaoMingJianMo",
"alias": "小明剑魔",
"code": "a9372068ed0740b48326cf9a74d7496a",
"locale": "zh-CN",
"gender": "Male",
"usedname": "小明剑魔",
"audio": "https://platform.r2.fish.audio/task/64048176afa3415a986dee1cfa98c309.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "BiYeJiWenQingNvXueSheng",
"alias": "毕业季温情女学生",
"code": "a1417155aa234890aab4a18686d12849",
"locale": "zh-CN",
"gender": "Female",
"usedname": "毕业季温情女学生",
"audio": "https://platform.r2.fish.audio/task/7f77870798e34022b049a783411fec7c.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ZhengXiangZhou",
"alias": "郑翔洲",
"code": "ca8fb681ce2040958c15ede5eef86177",
"locale": "zh-CN",
"gender": "Male",
"usedname": "郑翔洲",
"audio": "https://platform.r2.fish.audio/task/dfb8a41d5bcc419881fed861a121e648.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "LuBenWei",
"alias": "卢本伟",
"code": "24d524b57c5948f598e9b74c4dacc7ab",
"locale": "zh-CN",
"gender": "Male",
"usedname": "卢本伟",
"audio": "https://platform.r2.fish.audio/task/7fc51b3f4ef74f4480425ec0ebb03ffe.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "GangBeng",
"alias": "钢镚",
"code": "bbad29edef624b6e964418f12bd8b5b2",
"locale": "zh-CN",
"gender": "Male",
"usedname": "钢镚",
"audio": "https://platform.r2.fish.audio/task/4a585dfc7d134187b6c8ad90db3bfe87.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "AiLiSiZhongPei",
"alias": "爱丽丝(中配)",
"code": "e488ebeadd83496b97a3cd472dcd04ab",
"locale": "zh-CN",
"gender": "Female",
"usedname": "爱丽丝(中配)",
"audio": "https://platform.r2.fish.audio/task/92cca533ea944f0ba29b6d4d8d156b75.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ElonMuskNoiseReduction",
"alias": "Elon Musk(Noise reduction)",
"code": "03397b4c4be74759b72533b663fbd001",
"locale": "en",
"gender": "Male",
"usedname": "Elon Musk(Noise reduction)",
"audio": "https://platform.r2.fish.audio/task/e73dd4ca6de6405293a718d28ee92afd.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "SpongeBobSquarePants",
"alias": "SpongeBob SquarePants",
"code": "54e3a85ac9594ffa83264b8a494b901b",
"locale": "en",
"gender": "Male",
"usedname": "SpongeBob SquarePants",
"audio": "https://platform.r2.fish.audio/task/921609c646af4ea48e1c5e18cee1e1f3.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "DonaldJTrumpNoiseReduction",
"alias": "Donald J. Trump(Noise reduction)",
"code": "5196af35f6ff4a0dbf541793fc9f2157",
"locale": "en",
"gender": "Male",
"usedname": "Donald J. Trump(Noise reduction)",
"audio": "https://platform.r2.fish.audio/task/2ca761a2708c4aa0b841fd8a4e9333ad.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "POTUS47Trump",
"alias": "POTUS 47 - Trump",
"code": "e58b0d7efca34eb38d5c4985e378abcb",
"locale": "en",
"gender": "Male",
"usedname": "POTUS 47 - Trump",
"audio": "https://platform.r2.fish.audio/task/cb44e574980c463fbdafaf260db311b6.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "TaylorSwift",
"alias": "Taylor Swift",
"code": "cfc33da8775c47afacccf4eebabe44dc",
"locale": "en",
"gender": "Female",
"usedname": "Taylor Swift",
"audio": "https://platform.r2.fish.audio/task/c2ca4457eb964a3f811fbfdd63e3c851.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ALLE",
"alias": "ALLE",
"code": "59e9dc1cb20c452584788a2690c80970",
"locale": "en",
"gender": "Female",
"usedname": "ALLE",
"audio": "https://platform.r2.fish.audio/task/5507666747e4405091c9875974fa1579.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "RaidenShogun",
"alias": "Raiden Shogun",
"code": "5ac6fb7171ba419190700620738209d8",
"locale": "en",
"gender": "Female",
"usedname": "Raiden Shogun",
"audio": "https://platform.r2.fish.audio/task/f1e720f63bbc4af1ba637a5205846e1b.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Mrbeast",
"alias": "mrbeast",
"code": "cc1d2d26fddf487496c74a7f40c7c871",
"locale": "en",
"gender": "Male",
"usedname": "mrbeast",
"audio": "https://platform.r2.fish.audio/task/a5821aa9dc6446b581f5af14be8a25bd.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Horror",
"alias": "horror",
"code": "ef9c79b62ef34530bf452c0e50e3c260",
"locale": "en",
"gender": "Male",
"usedname": "horror",
"audio": "https://platform.r2.fish.audio/task/20bc0df607cd466180e27001089378a8.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "EGirlSoft",
"alias": "E-Girl (soft)",
"code": "9fad12dc142b429d9396190b0197adb8",
"locale": "en",
"gender": "Female",
"usedname": "E-Girl (soft)",
"audio": "https://platform.r2.fish.audio/task/98fd8981859b43cabb5de9d93b9d3ed7.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "VoiceDL",
"alias": "Voice DL",
"code": "1936333080804be19655c6749b2ae7b2",
"locale": "en",
"gender": "Male",
"usedname": "Voice DL",
"audio": "https://platform.r2.fish.audio/task/1134b26cffe14a13b10e58e0fd8cefd9.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "TrapAHolics",
"alias": "Trap-A-Holics",
"code": "0b2e96151d67433d93891f15efc25dbd",
"locale": "en",
"gender": "Male",
"usedname": "Trap-A-Holics",
"audio": "https://platform.r2.fish.audio/task/a1f2fb6632c9423e88c517d4dc57ddea.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Venti",
"alias": "Venti",
"code": "e34c486929524d41b88646b4ac2f382f",
"locale": "en",
"gender": "Male",
"usedname": "Venti",
"audio": "https://platform.r2.fish.audio/task/bfa9edf3128d42f58ad3b84bed31a3bf.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Ton",
"alias": "ton",
"code": "b97618c195814c9fb7558ea34093cd28",
"locale": "en",
"gender": "Male",
"usedname": "ton",
"audio": "https://platform.r2.fish.audio/task/425a00e456a84f3a8f15d22cc4c09236.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "CopyrightedVoiceDontUseThis",
"alias": "Copyrighted Voice (Don't use this)",
"code": "28b049a7574f46bc9d7122761363bda0",
"locale": "en",
"gender": "Male",
"usedname": "Copyrighted Voice (Don't use this)",
"audio": "https://platform.r2.fish.audio/task/4c4a0d9887c24d87bb37f21823b40054.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Valentino",
"alias": "Valentino",
"code": "a1fe2e1b6f324e27929d5088f2d09be3",
"locale": "en",
"gender": "Male",
"usedname": "Valentino",
"audio": "https://platform.r2.fish.audio/task/fb28a883f3274eb08526e56925db389b.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "BOOKRECORDREGULAR",
"alias": "BOOK RECORD REGULAR",
"code": "f8dfe9c83081432386f143e2fe9767ef",
"locale": "en",
"gender": "Male",
"usedname": "BOOK RECORD REGULAR",
"audio": "https://platform.r2.fish.audio/task/ee671024b37b4b68b84271688c68afad.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Paula",
"alias": "Paula",
"code": "c2623f0c075b4492ac367989aee1576f",
"locale": "en",
"gender": "Female",
"usedname": "Paula",
"audio": "https://platform.r2.fish.audio/task/ed58f201ceca48e3a0c5624b1e115bbc.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "PhatPhapBro",
"alias": "Phat phap(Bro)",
"code": "e0e2468ce2d746c1b20a4414435f6f48",
"locale": "en",
"gender": "Male",
"usedname": "Phat phap(Bro)",
"audio": "https://platform.r2.fish.audio/task/fe080e3ff192463f9d66002b134aceed.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "KouBiBu",
"alias": "扣比不",
"code": "d1f2aabbde274195adaf0710086269d7",
"locale": "en",
"gender": "Female",
"usedname": "扣比不",
"audio": "https://platform.r2.fish.audio/task/ad7e8adeefad4240bd45a3be57f73cfc.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "SleeplessHistorian",
"alias": "Sleepless historian",
"code": "beb44e5fac1e4b33a15dfcdcc2a9421d",
"locale": "en",
"gender": "Male",
"usedname": "Sleepless historian",
"audio": "https://platform.r2.fish.audio/task/b12fa8ab18bf4e8ba7d9dffec4f89e38.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "GODVOICE",
"alias": "GOD VOICE",
"code": "3d9fca75027f4a86a677dd7044996a87",
"locale": "en",
"gender": "Male",
"usedname": "GOD VOICE",
"audio": "https://platform.r2.fish.audio/task/3de9913ca55f47bca347efb54cd2b726.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "BlackStruggles",
"alias": "Black_struggles",
"code": "b99f2c4a0012471cb32ab61152e7e48d",
"locale": "en",
"gender": "Male",
"usedname": "Black_struggles",
"audio": "https://platform.r2.fish.audio/task/93c0304e2a54431d828f1bf06f3ba0ed.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "RONALDO1",
"alias": "RONALDO1",
"code": "86304d8fa1734bd89291acf4060d8a5e",
"locale": "en",
"gender": "Male",
"usedname": "RONALDO1",
"audio": "https://platform.r2.fish.audio/task/01315d0685a240c09aaf6efbe682ab86.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Alhaitham",
"alias": "Alhaitham",
"code": "92a2600282e547f098b4a8de1bc9a44a",
"locale": "en",
"gender": "Male",
"usedname": "Alhaitham",
"audio": "https://platform.r2.fish.audio/task/b59065ebd0cb4525ace68bc9a218513e.mp3",
"volume_adjustment": 0,
"speed_adjustment": 0
}
],
"podUsers": [
{"role": "节目主理人", "code": "aebaa2305aa2452fbdc8f41eec852a79"},
{"role": "科技爱好者", "code": "5c353fdb312f4888836a9a5680099ef0"}
],
"turnPattern": "random",
"tts_max_retries": 3,
"apiUrl": "https://api.fish.audio/v1/tts",
"headers": {
"Authorization": "Bearer {{api_key}}",
"Content-Type": "application/msgpack",
"model": "s1"
},
"request_payload": {
"text": "{{test_text}}",
"chunk_length": 200,
"format": "mp3",
"mp3_bitrate": 128,
"references": [],
"reference_id": "{{voice_code}}",
"normalize": true,
"latency": "normal",
"prosody":{
"speed": 1.0,
"volume": 0
}
}
}
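
Note: the `headers` and `request_payload` blocks above read as templates with `{{api_key}}`, `{{test_text}}` and `{{voice_code}}` placeholders. The adapter code that fills them is not part of this excerpt, so the following is only a minimal sketch of one way such a template could be rendered. `render_template` and the sample values are hypothetical names used for illustration.

```python
import re
from typing import Any, Mapping

# Matches placeholders of the form {{name}} as they appear in the config.
_PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_template(node: Any, values: Mapping[str, str]) -> Any:
    """Return a copy of `node` with every {{name}} inside its strings replaced.

    Hypothetical helper: the project's real adapter may work differently.
    """
    if isinstance(node, str):
        # Unknown placeholders are left untouched so they are easy to spot.
        return _PLACEHOLDER.sub(
            lambda m: str(values.get(m.group(1), m.group(0))), node
        )
    if isinstance(node, dict):
        return {key: render_template(value, values) for key, value in node.items()}
    if isinstance(node, list):
        return [render_template(item, values) for item in node]
    return node  # numbers, booleans and None pass through unchanged


# Abbreviated copy of the template shown in the config above.
config = {
    "headers": {"Authorization": "Bearer {{api_key}}", "model": "s1"},
    "request_payload": {
        "text": "{{test_text}}",
        "reference_id": "{{voice_code}}",
        "prosody": {"speed": 1.0, "volume": 0},
    },
}

# Illustrative values only; the API key is a dummy.
values = {
    "api_key": "sk-demo",
    "test_text": "Hello",
    "voice_code": "a1fe2e1b6f324e27929d5088f2d09be3",
}

headers = render_template(config["headers"], values)
payload = render_template(config["request_payload"], values)
```

Because the `Content-Type` above is `application/msgpack`, the rendered payload would presumably be serialized with a MessagePack encoder rather than JSON before sending. That step is left out of the sketch.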

363
config/gemini-tts.json Normal file

@@ -0,0 +1,363 @@
{
"voices": [
{
"name": "Zephyr",
"alias": "明亮",
"code": "Zephyr",
"locale": "zh-CN",
"usedname": "明亮",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Zephyr.wav"
},
{
"name": "Puck",
"alias": "欢快",
"code": "Puck",
"locale": "zh-CN",
"usedname": "欢快",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Puck.wav"
},
{
"name": "Charon",
"alias": "信息丰富",
"code": "Charon",
"locale": "zh-CN",
"usedname": "信息丰富",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Charon.wav"
},
{
"name": "Kore",
"alias": "坚定",
"code": "Kore",
"locale": "zh-CN",
"usedname": "坚定",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Kore.wav"
},
{
"name": "Fenrir",
"alias": "Excitable",
"code": "Fenrir",
"locale": "zh-CN",
"usedname": "Excitable",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Fenrir.wav"
},
{
"name": "Leda",
"alias": "青春",
"code": "Leda",
"locale": "zh-CN",
"usedname": "青春",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Leda.wav"
},
{
"name": "Orus",
"alias": "公司",
"code": "Orus",
"locale": "zh-CN",
"usedname": "公司",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Orus.wav"
},
{
"name": "Aoede",
"alias": "Breezy",
"code": "Aoede",
"locale": "zh-CN",
"usedname": "Breezy",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Aoede.wav"
},
{
"name": "Callirrhoe",
"alias": "随和",
"code": "Callirrhoe",
"locale": "zh-CN",
"usedname": "随和",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Callirrhoe.wav"
},
{
"name": "Autonoe",
"alias": "明亮",
"code": "Autonoe",
"locale": "zh-CN",
"usedname": "明亮",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Autonoe.wav"
},
{
"name": "Enceladus",
"alias": "气声",
"code": "Enceladus",
"locale": "zh-CN",
"usedname": "气声",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Enceladus.wav"
},
{
"name": "Iapetus",
"alias": "清晰",
"code": "Iapetus",
"locale": "zh-CN",
"usedname": "清晰",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Iapetus.wav"
},
{
"name": "Umbriel",
"alias": "随和",
"code": "Umbriel",
"locale": "zh-CN",
"usedname": "随和",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Umbriel.wav"
},
{
"name": "Algieba",
"alias": "平滑",
"code": "Algieba",
"locale": "zh-CN",
"usedname": "平滑",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Algieba.wav"
},
{
"name": "Despina",
"alias": "平滑",
"code": "Despina",
"locale": "zh-CN",
"usedname": "平滑",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Despina.wav"
},
{
"name": "Erinome",
"alias": "清除",
"code": "Erinome",
"locale": "zh-CN",
"usedname": "清除",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Erinome.wav"
},
{
"name": "Algenib",
"alias": "Gravelly",
"code": "Algenib",
"locale": "zh-CN",
"usedname": "Gravelly",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Algenib.wav"
},
{
"name": "Rasalgethi",
"alias": "信息丰富",
"code": "Rasalgethi",
"locale": "zh-CN",
"usedname": "信息丰富",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Rasalgethi.wav"
},
{
"name": "Laomedeia",
"alias": "欢快",
"code": "Laomedeia",
"locale": "zh-CN",
"usedname": "欢快",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Laomedeia.wav"
},
{
"name": "Achernar",
"alias": "软",
"code": "Achernar",
"locale": "zh-CN",
"usedname": "软",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Achernar.wav"
},
{
"name": "Alnilam",
"alias": "Firm",
"code": "Alnilam",
"locale": "zh-CN",
"usedname": "Firm",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Alnilam.wav"
},
{
"name": "Schedar",
"alias": "均匀",
"code": "Schedar",
"locale": "zh-CN",
"usedname": "均匀",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Schedar.wav"
},
{
"name": "Gacrux",
"alias": "成人",
"code": "Gacrux",
"locale": "zh-CN",
"usedname": "成人",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Gacrux.wav"
},
{
"name": "Pulcherrima",
"alias": "转发",
"code": "Pulcherrima",
"locale": "zh-CN",
"usedname": "转发",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Pulcherrima.wav"
},
{
"name": "Achird",
"alias": "友好",
"code": "Achird",
"locale": "zh-CN",
"usedname": "友好",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Achird.wav"
},
{
"name": "Zubenelgenubi",
"alias": "随意",
"code": "Zubenelgenubi",
"locale": "zh-CN",
"usedname": "随意",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Zubenelgenubi.wav"
},
{
"name": "Vindemiatrix",
"alias": "温柔",
"code": "Vindemiatrix",
"locale": "zh-CN",
"usedname": "温柔",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Vindemiatrix.wav"
},
{
"name": "Sadachbia",
"alias": "活泼",
"code": "Sadachbia",
"locale": "zh-CN",
"usedname": "活泼",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Sadachbia.wav"
},
{
"name": "Sadaltager",
"alias": "知识渊博",
"code": "Sadaltager",
"locale": "zh-CN",
"usedname": "知识渊博",
"gender": "Male",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Sadaltager.wav"
},
{
"name": "Sulafat",
"alias": "偏高",
"code": "Sulafat",
"locale": "zh-CN",
"usedname": "偏高",
"gender": "Female",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://www.gstatic.com/aistudio/voices/samples/Sulafat.wav"
}
],
"podUsers": [
{"role": "节目主理人", "code": "Sadaltager"},
{"role": "科技爱好者", "code": "Vindemiatrix"}
],
"turnPattern": "random",
"tts_max_retries": 3,
"apiUrl": "https://generativelanguage.googleapis.com/v1beta/models/{{model}}:generateContent",
"headers": {
"x-goog-api-key": "{{api_key}}",
"Content-Type": "application/json"
},
"request_payload": {
"contents": [{
"parts":[{
"text": "{{text}}"
}]
}],
"generationConfig": {
"responseModalities": ["AUDIO"],
"speechConfig": {
"voiceConfig": {
"prebuiltVoiceConfig": {
"voiceName": "{{voice_code}}"
}
}
}
},
"model": "gemini-2.5-flash-preview-tts"
}
}
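
Note: in the config above, `apiUrl` carries a `{{model}}` placeholder while the model name itself sits inside `request_payload`. One plausible reading is that the adapter moves `model` from the payload into the URL before sending. The sketch below works under that assumption; `build_gemini_request` is a hypothetical helper, not code from this commit, and it only builds the request without calling the network.

```python
import copy
import json
from typing import Dict, Tuple

# Abbreviated copy of the Gemini TTS config shown above.
config = {
    "apiUrl": "https://generativelanguage.googleapis.com/v1beta/models/{{model}}:generateContent",
    "headers": {"x-goog-api-key": "{{api_key}}", "Content-Type": "application/json"},
    "request_payload": {
        "contents": [{"parts": [{"text": "{{text}}"}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                "voiceConfig": {"prebuiltVoiceConfig": {"voiceName": "{{voice_code}}"}}
            },
        },
        "model": "gemini-2.5-flash-preview-tts",
    },
}


def build_gemini_request(
    cfg: dict, api_key: str, text: str, voice_code: str
) -> Tuple[str, Dict[str, str], str]:
    """Return (url, headers, json_body) for one synthesis call (hypothetical helper)."""
    payload = copy.deepcopy(cfg["request_payload"])  # keep the template untouched
    # Assumption: "model" only fills the URL and is not sent in the body.
    model = payload.pop("model")
    payload["contents"][0]["parts"][0]["text"] = text
    voice = payload["generationConfig"]["speechConfig"]["voiceConfig"]["prebuiltVoiceConfig"]
    voice["voiceName"] = voice_code
    url = cfg["apiUrl"].replace("{{model}}", model)
    headers = {k: v.replace("{{api_key}}", api_key) for k, v in cfg["headers"].items()}
    return url, headers, json.dumps(payload, ensure_ascii=False)


# Dummy key and text; "Sadaltager" is the first podUsers voice in the config.
url, headers, body = build_gemini_request(config, "demo-key", "你好", "Sadaltager")
```

Deep-copying the template keeps the loaded config reusable across segments, which matters if several threads synthesize audio in parallel.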

View File

@@ -6,7 +6,9 @@
"code": "zh-CN-XiaolinIndex",
"locale": "zh-CN",
"gender": "Female",
"usedname": "林夕"
"usedname": "林夕",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Yunzhe",
@@ -14,7 +16,9 @@
"code": "zh-CN-YunzheIndex",
"locale": "zh-CN",
"gender": "Male",
"usedname": "苏哲"
"usedname": "苏哲",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "HeXi",
@@ -22,7 +26,9 @@
"code": "zh-CN-HeXiIndex",
"locale": "zh-CN",
"gender": "Male",
"usedname": "何夕"
"usedname": "何夕",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Datong",
@@ -30,7 +36,19 @@
"code": "zh-CN-DatongIndex",
"locale": "zh-CN",
"gender": "Male",
"usedname": "大同"
"usedname": "大同",
"volume_adjustment": -1,
"speed_adjustment": 0
},
{
"name": "KaiQi",
"alias": "凯琪",
"code": "zh-CN-KaiQiIndex",
"locale": "zh-CN",
"gender": "Female",
"usedname": "凯琪",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Daibei",
@@ -38,14 +56,16 @@
"code": "zh-CN-DaibeiIndex",
"locale": "zh-CN",
"gender": "Male",
"usedname": "大比"
"usedname": "大比",
"volume_adjustment": 0,
"speed_adjustment": 0
}
],
"apiUrl": "http://192.168.1.232:7899/synthesize?text={{text}}&server_audio_prompt_path={{voiceCode}}",
"podUsers": [
{"role": "节目主人", "code": "zh-CN-YunzheIndex"},
{"role": "科技爱好者", "code": "zh-CN-XiaolinIndex"},
{"role": "独立音乐人", "code": "zh-CN-DatongIndex"}
{"role": "节目主人", "code": "zh-CN-DatongIndex"},
{"role": "科技爱好者", "code": "zh-CN-KaiQiIndex"}
],
"turnPattern": "random"
"turnPattern": "random",
"tts_max_retries": 3
}
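
Note: after this change each `podUsers` entry points at a voice by `code`, and each voice carries `volume_adjustment` and `speed_adjustment`. The diff does not show how the generator consumes these fields or what units they use, so the sketch below only illustrates resolving each role to its voice settings and catching a `code` that no longer exists in `voices`. `resolve_cast` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Abbreviated copy of the updated config: only the two voices used by podUsers.
config = {
    "voices": [
        {"name": "Datong", "code": "zh-CN-DatongIndex",
         "volume_adjustment": -1, "speed_adjustment": 0},
        {"name": "KaiQi", "code": "zh-CN-KaiQiIndex",
         "volume_adjustment": 0, "speed_adjustment": 0},
    ],
    "podUsers": [
        {"role": "节目主人", "code": "zh-CN-DatongIndex"},
        {"role": "科技爱好者", "code": "zh-CN-KaiQiIndex"},
    ],
}


def resolve_cast(cfg: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Attach per-voice adjustments to every podcast role (hypothetical helper)."""
    voices_by_code = {voice["code"]: voice for voice in cfg["voices"]}
    cast = []
    for user in cfg["podUsers"]:
        voice = voices_by_code.get(user["code"])
        if voice is None:
            raise ValueError(
                f"role {user['role']!r} references unknown voice code {user['code']!r}"
            )
        cast.append({
            "role": user["role"],
            "code": user["code"],
            # Assumption: a missing field means "no adjustment".
            "volume_adjustment": voice.get("volume_adjustment", 0),
            "speed_adjustment": voice.get("speed_adjustment", 0),
        })
    return cast


cast = resolve_cast(config)
```

Running a check like this at startup would surface a stale `podUsers` code before any TTS request is made.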

925
config/minimax.json Normal file

@@ -0,0 +1,925 @@
{
"voices": [
{
"name": "ChenWenGaoGuan",
"alias": "沉稳高管",
"code": "Chinese (Mandarin)_Reliable_Executive",
"locale": "zh-CN",
"gender": "Female",
"usedname": "沉稳高管",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939828702964727-/hailuo-audio-25eb48c50a3b58c5b35a768da95d19c8.mp3"
},
{
"name": "XinWenNvSheng",
"alias": "新闻女声",
"code": "Chinese (Mandarin)_News_Anchor",
"locale": "zh-CN",
"gender": "Male",
"usedname": "新闻女声",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939837404521410-/hailuo-audio-8c441f895ef46207b31ee1818b3c0bdf.mp3"
},
{
"name": "ShuLangNanSheng",
"alias": "舒朗男声",
"code": "hunyin_6",
"locale": "zh-CN",
"gender": "Female",
"usedname": "舒朗男声",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://filecdn.minimax.chat/public/2984b933-6437-4cc0-82a2-e992893e1539.mp3"
},
{
"name": "AoJiaoYuJie",
"alias": "傲娇御姐",
"code": "Chinese (Mandarin)_Mature_Woman",
"locale": "zh-CN",
"gender": "Male",
"usedname": "傲娇御姐",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939835298225708-/hailuo-audio-d8196e5799c65552662fc56f38af13db.mp3"
},
{
"name": "BuJiQingNian",
"alias": "不羁青年",
"code": "Chinese (Mandarin)_Unrestrained_Young_Man",
"locale": "zh-CN",
"gender": "Female",
"usedname": "不羁青年",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939837678401020-/hailuo-audio-b501bd8ec7dd5c6f8272f0446274399e.mp3"
},
{
"name": "XiaoZhangXiaoJie",
"alias": "嚣张小姐",
"code": "Arrogant_Miss",
"locale": "zh-CN",
"gender": "Male",
"usedname": "嚣张小姐",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096240932614767-/hailuo-audio-cd53c63ba41970f0307ed6c6b9ef2f41.mp3"
},
{
"name": "JiXieZhanJia",
"alias": "机械战甲",
"code": "Robot_Armor",
"locale": "zh-CN",
"gender": "Female",
"usedname": "机械战甲",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096241022651937-/hailuo-audio-1f0487b0484424d83bc34dd68635c756.mp3"
},
{
"name": "ReXinDaShen",
"alias": "热心大婶",
"code": "Chinese (Mandarin)_Kind-hearted_Antie",
"locale": "zh-CN",
"gender": "Male",
"usedname": "热心大婶",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939836702995041-/hailuo-audio-747a446c2d632142db87c11a6d2481ae.mp3"
},
{
"name": "GaoXiaoDaYe",
"alias": "搞笑大爷",
"code": "Chinese (Mandarin)_Humorous_Elder",
"locale": "zh-CN",
"gender": "Female",
"usedname": "搞笑大爷",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939835767092010-/hailuo-audio-3b00ef13ecf64d5d5e27b3af39250e1b.mp3"
},
{
"name": "WenRunNanSheng",
"alias": "温润男声",
"code": "Chinese (Mandarin)_Gentleman",
"locale": "zh-CN",
"gender": "Female",
"usedname": "温润男声",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939807409530648-/hailuo-audio-7f6792db55d582f5426f2df7ce642bd2.mp3"
},
{
"name": "WenNuanGuiMi",
"alias": "温暖闺蜜",
"code": "Chinese (Mandarin)_Warm_Bestie",
"locale": "zh-CN",
"gender": "Male",
"usedname": "温暖闺蜜",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939827676982824-/hailuo-audio-8e92f8bc7cdeecdfdc24163fdbf10585.mp3"
},
{
"name": "BoBaoNanSheng",
"alias": "播报男声",
"code": "Chinese (Mandarin)_Male_Announcer",
"locale": "zh-CN",
"gender": "Female",
"usedname": "播报男声",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939813636243159-/hailuo-audio-199574ab886502bfffeaaaeae9d690cd.mp3"
},
{
"name": "DianTaiNanZhuBo",
"alias": "电台男主播",
"code": "Chinese (Mandarin)_Radio_Host",
"locale": "zh-CN",
"gender": "Female",
"usedname": "电台男主播",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939836560676953-/hailuo-audio-bd8f326fb44dcec0878ff2dc027fa94f.mp3"
},
{
"name": "CalmWoman",
"alias": "Calm Woman",
"code": "Arabic_CalmWoman",
"locale": "en",
"gender": "Male",
"usedname": "Calm Woman",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096195678726631-/hailuo-audio-82aac82e527a394b209b4a2b604b50b9.mp3"
},
{
"name": "GangPuKongJie",
"alias": "港普空姐",
"code": "Chinese (Mandarin)_HK_Flight_Attendant",
"locale": "zh-CN",
"gender": "Male",
"usedname": "港普空姐",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-08-07-14/moss-audio/user_audio//1754548066976928527-298988366913689.mp3"
},
{
"name": "GracefulLady",
"alias": "Graceful Lady",
"code": "English_Graceful_Lady",
"locale": "en",
"gender": "Male",
"usedname": "Graceful Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.video/moss/prod/2025-08-06-18/moss-audio/user_audio//1754476231322799699-298694229168254.mp3"
},
{
"name": "PersuasiveFemaleSpeaker",
"alias": "Persuasive Female Speaker",
"code": "French_Female_Speech_New",
"locale": "en",
"gender": "Male",
"usedname": "Persuasive Female Speaker",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939777606312940-/hailuo-audio-e2c552ff9a04d11c428c7e88847c26d6.mp3"
},
{
"name": "SweetGirl",
"alias": "Sweet Girl",
"code": "Indonesian_SweetGirl",
"locale": "en",
"gender": "Male",
"usedname": "Sweet Girl",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096199945586161-/hailuo-audio-7e3396cdfe9fde1f9bc07ae1a2ba51c7.mp3"
},
{
"name": "ArrogantPrincess",
"alias": "Arrogant Princess",
"code": "Italian_ArrogantPrincess",
"locale": "en",
"gender": "Male",
"usedname": "Arrogant Princess",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939803705324406-/hailuo-audio-1c2bc2ee2a2c5a4650c254cd1f6efcf4.mp3"
},
{
"name": "IntellectualSenior",
"alias": "Intellectual Senior",
"code": "Japanese_IntellectualSenior",
"locale": "en",
"gender": "Female",
"usedname": "Intellectual Senior",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-24-16/moss-audio/voice_sample_audio/sample/1737708432248717660-/hailuo-audio-18c17e54ed62c0491b9466ffddba8ed5.mp3"
},
{
"name": "KindLady",
"alias": "Kind Lady",
"code": "Japanese_KindLady",
"locale": "en",
"gender": "Male",
"usedname": "Kind Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-24-16/moss-audio/voice_sample_audio/sample/1737708433989780649-/hailuo-audio-f03c01520180359e4eee184bf0fc19ab.mp3"
},
{
"name": "SweetGirl",
"alias": "Sweet Girl",
"code": "Korean_SweetGirl",
"locale": "en",
"gender": "Male",
"usedname": "Sweet Girl",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096215341266877-/hailuo-audio-7ea8d579722692c6294ed9cfa2f24073.mp3"
},
{
"name": "ChildhoodFriendGirl",
"alias": "Childhood Friend Girl",
"code": "Korean_ChildhoodFriendGirl",
"locale": "en",
"gender": "Male",
"usedname": "Childhood Friend Girl",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096206080309714-/hailuo-audio-18971415b344cc08a9c92207829545b2.mp3"
},
{
"name": "EnthusiasticTeen",
"alias": "Enthusiastic Teen",
"code": "Korean_EnthusiasticTeen",
"locale": "en",
"gender": "Female",
"usedname": "Enthusiastic Teen",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096209076732667-/hailuo-audio-6a62676f7f404cf91a0b83fd2dcf0360.mp3"
},
{
"name": "BraveAdventurer",
"alias": "Brave Adventurer",
"code": "Korean_BraveAdventurer",
"locale": "en",
"gender": "Male",
"usedname": "Brave Adventurer",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096202715219216-/hailuo-audio-01927194c4e8a5257d89d2ef2e73c8d5.mp3"
},
{
"name": "QuirkyGirl",
"alias": "Quirky Girl",
"code": "Korean_QuirkyGirl",
"locale": "en",
"gender": "Male",
"usedname": "Quirky Girl",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096213274958092-/hailuo-audio-10c5eb8e17935195f3c5fd7637473823.mp3"
},
{
"name": "ColdGirl",
"alias": "Cold Girl",
"code": "Korean_ColdGirl",
"locale": "en",
"gender": "Male",
"usedname": "Cold Girl",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096206844716202-/hailuo-audio-08034a6a8405717729ab72605506487f.mp3"
},
{
"name": "PossessiveMan",
"alias": "Possessive Man",
"code": "Korean_PossessiveMan",
"locale": "en",
"gender": "Female",
"usedname": "Possessive Man",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096212918731832-/hailuo-audio-3648709e795372c2be11153c7955d83b.mp3"
},
{
"name": "Deep-tonedMan",
"alias": "Deep-toned Man",
"code": "Portuguese_Deep-tonedMan",
"locale": "en",
"gender": "Female",
"usedname": "Deep-toned Man",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939714857598214-/hailuo-audio-267471ebafe9f3baa93d5e9fe78f93db.mp3"
},
{
"name": "ConfidentWoman",
"alias": "Confident Woman",
"code": "Portuguese_ConfidentWoman",
"locale": "en",
"gender": "Male",
"usedname": "Confident Woman",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096221082217897-/hailuo-audio-eee8639c2e4b1dbfd5bb4d95df4f6b4f.mp3"
},
{
"name": "Grinch",
"alias": "Grinch",
"code": "Portuguese_Grinch",
"locale": "en",
"gender": "Female",
"usedname": "Grinch",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096226056145486-/hailuo-audio-b3e330056b9324a7777815abc3512776.mp3"
},
{
"name": "SereneWoman",
"alias": "Serene Woman",
"code": "Portuguese_SereneWoman",
"locale": "en",
"gender": "Male",
"usedname": "Serene Woman",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096233988799669-/hailuo-audio-26ac9f353013d49d0a27bb760b32db46.mp3"
},
{
"name": "Dramatist",
"alias": "Dramatist",
"code": "Portuguese_Dramatist",
"locale": "en",
"gender": "Female",
"usedname": "Dramatist",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096222364532260-/hailuo-audio-a522bdef0410221371811fdfbde034fd.mp3"
},
{
"name": "CharmingLady",
"alias": "Charming Lady",
"code": "Portuguese_CharmingLady",
"locale": "en",
"gender": "Male",
"usedname": "Charming Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096219366540273-/hailuo-audio-de56612994524a416188f4c5a7360ef2.mp3"
},
{
"name": "GrimReaper",
"alias": "Grim Reaper",
"code": "Portuguese_GrimReaper",
"locale": "en",
"gender": "Female",
"usedname": "Grim Reaper",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096225707843994-/hailuo-audio-35d0909a7c103b0251e6d10b81a1888f.mp3"
},
{
"name": "RomanticHusband",
"alias": "Romantic Husband",
"code": "Portuguese_RomanticHusband",
"locale": "en",
"gender": "Female",
"usedname": "Romantic Husband",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096231837088813-/hailuo-audio-71e49e955963c8e6aadc4688bef86730.mp3"
},
{
"name": "ThoughtfulLady",
"alias": "Thoughtful Lady",
"code": "Portuguese_ThoughtfulLady",
"locale": "en",
"gender": "Male",
"usedname": "Thoughtful Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096236330890911-/hailuo-audio-32cc62f4d5a0a44d01c58a292f7756fe.mp3"
},
{
"name": "DeterminedManager",
"alias": "Determined Manager",
"code": "Portuguese_DeterminedManager",
"locale": "en",
"gender": "Male",
"usedname": "Determined Manager",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096222058020509-/hailuo-audio-2bb855bdce45f51fdfc65a6dd9b8e194.mp3"
},
{
"name": "Bad-temperedBoy",
"alias": "Bad-tempered Boy",
"code": "Russian_Bad-temperedBoy",
"locale": "en",
"gender": "Female",
"usedname": "Bad-tempered Boy",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://filecdn.minimax.chat/public/e43500b7-5b48-4309-aade-fdb605fff45f.mp3"
},
{
"name": "BossyLeader",
"alias": "Bossy Leader",
"code": "Spanish_BossyLeader",
"locale": "en",
"gender": "Female",
"usedname": "Bossy Leader",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939814882985359-/hailuo-audio-11769b72e55cd65edf6120128d3a5992.mp3"
},
{
"name": "Deep-tonedMan",
"alias": "Deep-toned Man",
"code": "Spanish_Deep-tonedMan",
"locale": "en",
"gender": "Female",
"usedname": "Deep-toned Man",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939816536028381-/hailuo-audio-a0782d505cffd1131a8a2bde83e9a25e.mp3"
},
{
"name": "Steadymentor",
"alias": "Steady Mentor",
"code": "Spanish_Steadymentor",
"locale": "en",
"gender": "Female",
"usedname": "Steady Mentor",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939823745029704-/hailuo-audio-0a02394475f241a68c76b9c1c2d4df8f.mp3"
},
{
"name": "EnergeticBoy",
"alias": "Energetic Boy",
"code": "Spanish_EnergeticBoy",
"locale": "en",
"gender": "Female",
"usedname": "Energetic Boy",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939816978150659-/hailuo-audio-535a6d694eb3c1d429322d7e327ef476.mp3"
},
{
"name": "PowerfulSoldier",
"alias": "Powerful Soldier",
"code": "Spanish_PowerfulSoldier",
"locale": "en",
"gender": "Female",
"usedname": "Powerful Soldier",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-15-19/moss-audio/voice_sample_audio/sample/1736939820646722142-/hailuo-audio-414c38f14ef119ad56b20ebd23acc8fa.mp3"
},
{
"name": "CalmWoman",
"alias": "Calm Woman",
"code": "Turkish_CalmWoman",
"locale": "en",
"gender": "Male",
"usedname": "Calm Woman",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.com/prod/2025-01-17-14/moss-audio/voice_sample_audio/sample/1737096240585827701-/hailuo-audio-873447b700053726fb006028fbccfe52.mp3"
},
{
"name": "ConfidentWoman",
"alias": "Confident Woman",
"code": "Thai_female_1_sample1",
"locale": "en",
"gender": "Male",
"usedname": "Confident Woman",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://filecdn.minimax.chat/public/22001509-f68e-4df0-99a1-d9867ddd42b9.mp3"
},
{
"name": "EnergeticYouth",
"alias": "Energetic Youth",
"code": "Romanian_male_2_sample1",
"locale": "en",
"gender": "Female",
"usedname": "Energetic Youth",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://filecdn.minimax.chat/public/c1b5e6f0-fc06-4ef3-ab39-a11d1aca7d4a.mp3"
},
{
"name": "ElegantLady",
"alias": "Elegant Lady",
"code": "czech_female_2_v2",
"locale": "en",
"gender": "Male",
"usedname": "Elegant Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://filecdn.minimax.chat/public/e065d9cb-0217-4ed7-a530-ae1df47d6881.MP3"
},
{
"name": "GracefulLady",
"alias": "Graceful Lady",
"code": "Bulgarian_female_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Graceful Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.video/moss/prod/2025-08-06-17/moss-audio/user_audio//1754472491985636165-298678969561209.mp3"
},
{
"name": "SteadyMan",
"alias": "Steady Man",
"code": "Persian_male_1_v1",
"locale": "en",
"gender": "Female",
"usedname": "Steady Man",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.video/moss/prod/2025-08-06-17/moss-audio/user_audio//1754473838522291170-298684517163108.mp3"
},
{
"name": "CheerfulYoungLady",
"alias": "Cheerful Young Lady",
"code": "Croatian_female_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Cheerful Young Lady",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.video/moss/prod/2025-08-06-18/moss-audio/user_audio//1754475502307412455-298691346075740.mp3"
},
{
"name": "MaleHost",
"alias": "Male Host",
"code": "Slovenian_male_1_v1",
"locale": "en",
"gender": "Female",
"usedname": "Male Host",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.video/moss/prod/2025-08-06-19/moss-audio/user_audio//1754480144906382494-298710401368130.mp3"
},
{
"name": "MatureWoman",
"alias": "Mature Woman",
"code": "Tamil_female_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Mature Woman",
"volume_adjustment": 0,
"speed_adjustment": 0,
"audio": "https://cdn.hailuoai.video/moss/prod/2025-08-06-19/moss-audio/user_audio//1754481054444253754-298714012258408.mp3"
},
{
"name": "QingSeQingNianYinSe",
"alias": "青涩青年音色",
"code": "male-qn-qingse",
"locale": "zh-CN",
"gender": "Female",
"usedname": "青涩青年音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "JingYingQingNianYinSe",
"alias": "精英青年音色",
"code": "male-qn-jingying",
"locale": "zh-CN",
"gender": "Female",
"usedname": "精英青年音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "BaDaoQingNianYinSe",
"alias": "霸道青年音色",
"code": "male-qn-badao",
"locale": "zh-CN",
"gender": "Female",
"usedname": "霸道青年音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "QingNianDaXueShengYinSe",
"alias": "青年大学生音色",
"code": "male-qn-daxuesheng",
"locale": "zh-CN",
"gender": "Female",
"usedname": "青年大学生音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ShaoNvYinSe",
"alias": "少女音色",
"code": "female-shaonv",
"locale": "zh-CN",
"gender": "Male",
"usedname": "少女音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "YuJieYinSe",
"alias": "御姐音色",
"code": "female-yujie",
"locale": "zh-CN",
"gender": "Male",
"usedname": "御姐音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ChengShuNvXingYinSe",
"alias": "成熟女性音色",
"code": "female-chengshu",
"locale": "zh-CN",
"gender": "Male",
"usedname": "成熟女性音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "TianMeiNvXingYinSe",
"alias": "甜美女性音色",
"code": "female-tianmei",
"locale": "zh-CN",
"gender": "Male",
"usedname": "甜美女性音色",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "NanXingZhuChiRen",
"alias": "男性主持人",
"code": "presenter_male",
"locale": "zh-CN",
"gender": "Female",
"usedname": "男性主持人",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "NvXingZhuChiRen",
"alias": "女性主持人",
"code": "presenter_female",
"locale": "zh-CN",
"gender": "Male",
"usedname": "女性主持人",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "NanXingYouShengShu1",
"alias": "男性有声书1",
"code": "audiobook_male_1",
"locale": "zh-CN",
"gender": "Female",
"usedname": "男性有声书1",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "NanXingYouShengShu2",
"alias": "男性有声书2",
"code": "audiobook_male_2",
"locale": "zh-CN",
"gender": "Female",
"usedname": "男性有声书2",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "YuJieYinSe-beta",
"alias": "御姐音色-beta",
"code": "female-yujie-jingpin",
"locale": "zh-CN",
"gender": "Male",
"usedname": "御姐音色-beta",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ChunZhenXueDi",
"alias": "纯真学弟",
"code": "chunzhen_xuedi",
"locale": "zh-CN",
"gender": "Male",
"usedname": "纯真学弟",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Rudolph",
"alias": "Rudolph",
"code": "Rudolph",
"locale": "en",
"gender": "Male",
"usedname": "Rudolph",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "FluentFemaleBroadcaster",
"alias": "Fluent Female Broadcaster",
"code": "French_Female Journalist",
"locale": "en",
"gender": "Female",
"usedname": "Fluent Female Broadcaster",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "Optimisticyouth",
"alias": "Optimistic youth",
"code": "Portuguese_Optimisticyouth",
"locale": "en",
"gender": "Female",
"usedname": "Optimistic youth",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "EnergeticWoman",
"alias": "Energetic Woman",
"code": "Thai_female_2_sample2",
"locale": "en",
"gender": "Female",
"usedname": "Energetic Woman",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "EnergeticYouth",
"alias": "Energetic Youth",
"code": "Romanian_male_2_sample1",
"locale": "en",
"gender": "Male",
"usedname": "Energetic Youth",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "GentleLady",
"alias": "Gentle Lady",
"code": "Greek_female_1_sample1",
"locale": "en",
"gender": "Female",
"usedname": "Gentle Lady",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ElegantLady",
"alias": "Elegant Lady",
"code": "czech_female_2_v2",
"locale": "en",
"gender": "Female",
"usedname": "Elegant Lady",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "AssetiveWoman",
"alias": "Assetive Woman",
"code": "finnish_female_4_v1",
"locale": "en",
"gender": "Female",
"usedname": "Assetive Woman",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "GracefulLady",
"alias": "Graceful Lady",
"code": "Bulgarian_female_1_v1",
"locale": "en",
"gender": "Female",
"usedname": "Graceful Lady",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ReliableMan",
"alias": "Reliable Man",
"code": "Hebrew_male_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Reliable Man",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "SteadyMan",
"alias": "Steady Man",
"code": "Persian_male_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Steady Man",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ConfidentMan",
"alias": "Confident Man",
"code": "Swedish_male_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Confident Man",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "CheerfulYoungLady",
"alias": "Cheerful Young Lady",
"code": "Croatian_female_1_v1",
"locale": "en",
"gender": "Female",
"usedname": "Cheerful Young Lady",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "ReliableMan",
"alias": "Reliable Man",
"code": "Norwegian_male_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Reliable Man",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "MaleHost",
"alias": "Male Host",
"code": "Slovenian_male_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Male Host",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "MatureWoman",
"alias": "Mature Woman",
"code": "Tamil_female_1_v1",
"locale": "en",
"gender": "Female",
"usedname": "Mature Woman",
"volume_adjustment": 0,
"speed_adjustment": 0
},
{
"name": "SteadyYouth",
"alias": "Steady Youth",
"code": "Afrikaans_male_1_v1",
"locale": "en",
"gender": "Male",
"usedname": "Steady Youth",
"volume_adjustment": 0,
"speed_adjustment": 0
}
],
"podUsers": [
{"role": "节目主理人", "code": "Chinese (Mandarin)_Reliable_Executive"},
{"role": "科技爱好者", "code": "Chinese (Mandarin)_Mature_Woman"}
],
"turnPattern": "random",
"tts_max_retries": 3,
"apiUrl": "https://api.minimaxi.com/v1/t2a_v2?GroupId={{group_id}}",
"request_payload": {
"model": "speech-2.5-turbo-preview",
"text": "{{text}}",
"voice_setting": {
"speed": 1.2,
"vol": 1,
"pitch": 0,
"voice_id": "{{voice_code}}"
},
"audio_setting":{
"sample_rate":32000,
"bitrate":128000,
"format":"mp3"
},
"output_format":"hex",
"language_boost": "auto"
},
"headers": {
"Authorization": "Bearer {{api_key}}",
"Content-Type": "application/json"
}
}
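
The `{{text}}`, `{{voice_code}}`, `{{api_key}}` and `{{group_id}}` markers in the config above are template placeholders that get filled in per request. The following is a minimal sketch of how such a template might be resolved. `fill_placeholders` is a hypothetical helper written for illustration; the real substitution logic lives in `tts_adapters.py`, which is not shown here and may work differently. The sample values (`12345`, the dialog text) are made up.

```python
def fill_placeholders(template, values):
    """Recursively replace {{key}} markers in strings nested inside dicts/lists.

    Hypothetical helper for illustration; not taken from tts_adapters.py.
    """
    if isinstance(template, str):
        for key, value in values.items():
            template = template.replace("{{" + key + "}}", str(value))
        return template
    if isinstance(template, dict):
        return {k: fill_placeholders(v, values) for k, v in template.items()}
    if isinstance(template, list):
        return [fill_placeholders(v, values) for v in template]
    return template  # numbers, booleans and None pass through unchanged


# A trimmed copy of the request_payload template from the config above.
payload_template = {
    "model": "speech-2.5-turbo-preview",
    "text": "{{text}}",
    "voice_setting": {"speed": 1.2, "vol": 1, "pitch": 0, "voice_id": "{{voice_code}}"},
}

payload = fill_placeholders(
    payload_template, {"text": "你好", "voice_code": "presenter_male"}
)
url = fill_placeholders(
    "https://api.minimaxi.com/v1/t2a_v2?GroupId={{group_id}}", {"group_id": "12345"}
)
```

Because the helper builds new containers instead of mutating its input, the same template can be reused safely across threads and retries.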

22
config/tts_providers.json Normal file

@@ -0,0 +1,22 @@
{
"index": {
"api_key": null
},
"edge": {
"api_key": null
},
"doubao": {
"X-Api-App-Id": "null",
"X-Api-Access-Key": "null"
},
"fish": {
"api_key": "null"
},
"minimax": {
"group_id": "null",
"api_key": "null"
},
"gemini": {
"api_key": "null"
}
}
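
`podcast_generator.py` picks the credential section for a provider by taking the part of the config file name before the first hyphen (`tts_provider.split('-')[0]`). The sketch below reproduces that lookup against a trimmed copy of the file above. `extra_params_for` and `is_unset` are hypothetical names used only here. Note that the template mixes JSON `null` with the string `"null"`; the string is truthy in Python, so a plain `if not api_key` check would not flag it. Whether the adapters guard against this is not visible in this diff.

```python
import json

# Trimmed copy of config/tts_providers.json for illustration.
PROVIDERS_JSON = """
{
  "edge": {"api_key": null},
  "doubao": {"X-Api-App-Id": "null", "X-Api-Access-Key": "null"},
  "minimax": {"group_id": "null", "api_key": "null"}
}
"""


def extra_params_for(tts_provider, providers):
    """Return the credential section for a config name such as 'doubao-tts'."""
    section = tts_provider.split("-")[0]  # 'doubao-tts' -> 'doubao'
    return providers.get(section, {})


def is_unset(value):
    """Hypothetical check: treat JSON null and the literal string "null" as missing."""
    return value is None or value == "null"


providers = json.loads(PROVIDERS_JSON)
doubao = extra_params_for("doubao-tts", providers)
missing = [key for key, value in doubao.items() if is_unset(value)]
```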

BIN
example/doubaoTTS.wav Normal file

Binary file not shown.

BIN
example/edgeTTS.wav Normal file

Binary file not shown.

BIN
example/fish.wav Normal file

Binary file not shown.

BIN
example/geminiTTS.wav Normal file

Binary file not shown.

BIN
example/indexTTS.wav Normal file

Binary file not shown.

BIN
example/minimax.wav Normal file

Binary file not shown.

72
ext/doubao-voice-list.py Normal file

@@ -0,0 +1,72 @@
import json

import requests

# --- Configuration ---
# Replace this URL with the one you actually want to fetch data from.
URL = "https://lf3-config.bytetcc.com/obj/tcc-config-web/tcc-v2-data-lab.speech.tts_middle_layer-default"
OUTPUT_FILENAME = "data_from_url_volc_bigtts.json"
# Request timeout in seconds, so a network problem cannot block the script forever.
TIMEOUT = 10

print(f"Fetching data from URL: {URL}")

# --- Main logic ---
try:
    # 1. Send a GET request; the timeout keeps the script from hanging.
    response = requests.get(URL, timeout=TIMEOUT)

    # 2. raise_for_status() raises an exception for 4xx/5xx responses.
    response.raise_for_status()
    print(f"✅ HTTP request succeeded, status code: {response.status_code}")

    # 3. Parse the outer JSON; response.json() returns a Python dict.
    outer_data = response.json()

    # 4. Extract the inner JSON string (raises KeyError if a key is missing).
    volc_bigtts_string = outer_data['data']['volc_bigtts']

    # 5. Parse the inner JSON string into the final JSON array (a Python list).
    final_json_array = json.loads(volc_bigtts_string)
    print("✅ Parsed the nested JSON data.")
    print("Parsed array:", final_json_array)

    # 6. Write the final JSON array to a local file.
    with open(OUTPUT_FILENAME, 'w', encoding='utf-8') as f:
        json.dump(final_json_array, f, indent=4, ensure_ascii=False)
    print(f"\n🎉 Wrote data to file: {OUTPUT_FILENAME}")

# --- Error handling ---
# Catching each error type separately gives clearer error messages.
except requests.exceptions.HTTPError as errh:
    # HTTP errors such as 404 Not Found or 500 Internal Server Error
    print(f"❌ HTTP error: {errh}")
except requests.exceptions.ConnectionError as errc:
    # Connection errors such as DNS failure or connection refused
    print(f"❌ Connection error: {errc}")
except requests.exceptions.Timeout as errt:
    # The request timed out
    print(f"❌ Request timed out: {errt}")
except json.JSONDecodeError:
    # Raised by response.json() or json.loads(). This is checked before
    # RequestException because recent requests versions raise a JSON error
    # that subclasses both.
    print("❌ JSON parsing failed. The response or the inner string is not valid JSON.")
    # For debugging, print the raw response:
    # print("Raw response:", response.text)
except requests.exceptions.RequestException as err:
    # Any other exception raised by the requests library
    print(f"❌ Unknown request error: {err}")
except KeyError:
    # The expected keys are missing
    print("❌ Unexpected JSON structure: key 'data' or 'volc_bigtts' not found.")
except Exception as e:
    # Anything else that was not anticipated
    print(f"❌ Unknown error: {e}")
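
The script above parses JSON twice because the `volc_bigtts` value is itself a JSON-encoded string rather than a nested array. The offline sketch below shows that two-pass decoding without any network access. The sample payload is made up and only mimics the shape the script assumes; the field name `voice` and its values are hypothetical.

```python
import json

# Made-up sample mimicking the assumed shape: the 'volc_bigtts' value is a
# JSON string, not a nested list.
sample_response_text = json.dumps({
    "data": {
        "volc_bigtts": json.dumps(
            [{"voice": "example_voice_a"}, {"voice": "example_voice_b"}]
        )
    }
})

outer = json.loads(sample_response_text)      # first pass: the outer document
inner_string = outer["data"]["volc_bigtts"]   # still a str at this point
voices = json.loads(inner_string)             # second pass: the actual list
```

Skipping the second `json.loads` would leave `voices` as one long string, which is why the script treats the inner decode as a separate failure point.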

View File

@@ -13,10 +13,12 @@ from datetime import datetime
from openai_cli import OpenAICli # Moved to top for proper import
import urllib.parse # For URL encoding
import re # For regular expression operations
from tts_adapters import TTSAdapter, IndexTTSAdapter, EdgeTTSAdapter, FishAudioAdapter, MinimaxAdapter, DoubaoTTSAdapter, GeminiTTSAdapter # Import TTS adapters
# Global configuration
output_dir = "output"
file_list_path = os.path.join(output_dir, "file_list.txt")
tts_providers_config_path = 'config/tts_providers.json'
def read_file_content(filepath):
"""Reads content from a given file path."""
@@ -24,35 +26,52 @@ def read_file_content(filepath):
with open(filepath, 'r', encoding='utf-8') as f:
return f.read()
except FileNotFoundError:
print(f"Error: File not found at {filepath}")
sys.exit(1)
raise FileNotFoundError(f"Error: File not found at {filepath}")
def select_json_config(config_dir='config'):
def select_json_config(config_dir='config', return_file_path=False):
"""
Reads JSON files from the specified directory and allows the user to select one.
Returns the content of the selected JSON file.
If return_file_path is True, returns a tuple of (file_path, content).
"""
json_files = glob.glob(os.path.join(config_dir, '*.json'))
if not json_files:
print(f"Error: No JSON files found in {config_dir}")
sys.exit(1)
raise FileNotFoundError(f"Error: No JSON files found in {config_dir}")
valid_json_files = []
print(f"Found JSON configuration files in '{config_dir}':")
for i, file_path in enumerate(json_files):
print(f"{i + 1}. {os.path.basename(file_path)}")
file_name = os.path.basename(file_path)
if file_name != "tts_providers.json":
valid_json_files.append(file_path)
print(f"{len(valid_json_files)}. {file_name}")
if not valid_json_files:
raise FileNotFoundError(f"Error: No valid JSON files (excluding tts_providers.json) found in {config_dir}")
while True:
try:
choice = int(input("Enter the number of the configuration file to use: "))
if 1 <= choice <= len(json_files):
selected_file = json_files[choice - 1]
choice_str = input("Enter the number of the configuration file to use: ")
if not choice_str: # Allow empty input to raise an error
raise ValueError("No input provided. Please enter a number.")
choice = int(choice_str)
if 1 <= choice <= len(valid_json_files):
selected_file = valid_json_files[choice - 1]
print(f"Selected: {os.path.basename(selected_file)}")
with open(selected_file, 'r', encoding='utf-8') as f:
return json.load(f)
content = json.load(f)
if return_file_path:
return selected_file, content
else:
return content
else:
print("Invalid choice. Please enter a number within the range.")
except ValueError:
print("Invalid input. Please enter a number.")
raise ValueError("Invalid choice. Please enter a number within the range.")
except FileNotFoundError as e:
raise FileNotFoundError(f"Error loading selected JSON file: {e}")
except json.JSONDecodeError as e:
raise ValueError(f"Error decoding JSON from selected file: {e}")
except ValueError as e:
print(f"Invalid input: {e}. Please enter a number.")
def generate_speaker_id_text(pod_users, voices_list):
"""
@@ -78,7 +97,7 @@ def generate_speaker_id_text(pod_users, voices_list):
speaker_info.append(f"speaker_id={speaker_id}的名叫{found_name}")
else:
raise ValueError(f"No name or alias found for voice code '{pod_user_code}' (speaker_id={speaker_id}). Please check the voices configuration in the selected config file (e.g. config/edge-tts.json).")
return "".join(speaker_info) + ""
def merge_audio_files():
@@ -88,9 +107,7 @@ def merge_audio_files():
try:
subprocess.run(["ffmpeg", "-version"], check=True, capture_output=True)
except FileNotFoundError:
print("Error: FFmpeg is not installed or not in your PATH. Please install FFmpeg to merge audio files.")
print("You can download FFmpeg from: https://ffmpeg.org/download.html")
sys.exit(1)
raise RuntimeError("FFmpeg is not installed or not in your PATH. Please install FFmpeg to merge audio files. You can download FFmpeg from: https://ffmpeg.org/download.html")
print(f"\nMerging audio files into {output_audio_filename}...")
try:
@@ -110,91 +127,94 @@ def merge_audio_files():
print("FFmpeg stdout:\n", process.stdout)
print("FFmpeg stderr:\n", process.stderr)
except subprocess.CalledProcessError as e:
print(f"Error merging audio files with FFmpeg: {e}")
print(f"FFmpeg stdout:\n", e.stdout)
print(f"FFmpeg stderr:\n", e.stderr)
sys.exit(1)
raise RuntimeError(f"Error merging audio files with FFmpeg: {e.stderr}")
finally:
# Clean up temporary audio files and the file list
# Clean up temporary audio files and the file list
for item in os.listdir(output_dir):
if item.startswith("temp_audio"):
try:
os.remove(os.path.join(output_dir, item))
except OSError as e:
print(f"Error removing temporary audio file {item}: {e}")
print(f"Error removing temporary audio file {item}: {e}") # This should not stop the process
try:
os.remove(file_list_path)
except OSError as e:
print(f"Error removing file list {file_list_path}: {e}")
print(f"Error removing file list {file_list_path}: {e}") # This should not stop the process
print("Cleaned up temporary files.")
def main():
# Parse command-line arguments
def _parse_arguments():
"""Parses command-line arguments."""
parser = argparse.ArgumentParser(description="Generate podcast script and audio using OpenAI and local TTS.")
parser.add_argument("--api-key", help="OpenAI API key.")
parser.add_argument("--base-url", default="https://api.openai.com/v1", help="OpenAI API base URL (default: https://api.openai.com/v1).")
parser.add_argument("--model", default="gpt-3.5-turbo", help="OpenAI model to use (default: gpt-3.5-turbo).")
parser.add_argument("--threads", type=int, default=1, help="Number of threads to use for audio generation (default: 1).")
args = parser.parse_args()
return parser.parse_args()
def _load_configuration():
"""Selects and loads JSON configuration, and infers tts_provider from the selected file name."""
print("Podcast Generation Script")
selected_file_path, config_data = select_json_config(return_file_path=True)
# Extract tts_provider from the file name
# (assumes the file name has the form 'provider-name.json')
file_name = os.path.basename(selected_file_path)
tts_provider = os.path.splitext(file_name)[0] # Strip the .json extension
config_data["tts_provider"] = tts_provider # Add tts_provider to the config data
print("\nLoaded Configuration: " + tts_provider)
return config_data
# Step 1: Select JSON configuration
config_data = select_json_config()
print("\nLoaded Configuration:")
# print(json.dumps(config_data, indent=4))
# Determine final API key, base URL, and model based on priority
# Command-line args > config file > environment variables
def _prepare_openai_settings(args, config_data):
"""Determines final OpenAI API key, base URL, and model based on priority."""
api_key = args.api_key or config_data.get("api_key") or os.getenv("OPENAI_API_KEY")
base_url = args.base_url or config_data.get("base_url") or os.getenv("OPENAI_BASE_URL")
model = args.model or config_data.get("model") # Allow model to be None if not provided anywhere
# Fallback for model if not specified
if not model:
model = "gpt-3.5-turbo"
print(f"Using default model: {model} as it was not specified via command-line, config, or environment variables.")
if not api_key:
print("Error: OpenAI API key is not set. Please provide it via --api-key, in your config file, or as an environment variable (OPENAI_API_KEY).")
sys.exit(1)
raise ValueError("Error: OpenAI API key is not set. Please provide it via --api-key, in your config file, or as an environment variable (OPENAI_API_KEY).")
return api_key, base_url, model
# Step 2: Read prompt files
def _read_prompt_files():
"""Reads content from input, overview, and podcast script prompt files."""
input_prompt = read_file_content('input.txt')
overview_prompt = read_file_content('prompt/prompt-overview.txt')
original_podscript_prompt = read_file_content('prompt/prompt-podscript.txt')
return input_prompt, overview_prompt, original_podscript_prompt
# Extract the custom content from input_prompt
def _extract_custom_content(input_prompt_content):
"""Extracts custom content from the input prompt."""
custom_content = ""
custom_begin_tag = '```custom-begin'
custom_end_tag = '```custom-end'
start_index = input_prompt.find(custom_begin_tag)
start_index = input_prompt_content.find(custom_begin_tag)
if start_index != -1:
end_index = input_prompt.find(custom_end_tag, start_index + len(custom_begin_tag))
end_index = input_prompt_content.find(custom_end_tag, start_index + len(custom_begin_tag))
if end_index != -1:
custom_content = input_prompt[start_index + len(custom_begin_tag):end_index].strip()
# Remove everything in input_prompt up to and including ```custom-end
input_prompt = input_prompt[end_index + len(custom_end_tag):].strip()
custom_content = input_prompt_content[start_index + len(custom_begin_tag):end_index].strip()
input_prompt_content = input_prompt_content[end_index + len(custom_end_tag):].strip()
return custom_content, input_prompt_content
def _prepare_podcast_prompts(config_data, original_podscript_prompt, custom_content):
"""Prepares the podcast script prompts with speaker info and placeholders."""
pod_users = config_data.get("podUsers", [])
voices = config_data.get("voices", [])
turn_pattern = config_data.get("turnPattern", "random")
# Replace the placeholders in original_podscript_prompt
original_podscript_prompt = original_podscript_prompt.replace("{{numSpeakers}}", str(len(pod_users)))
original_podscript_prompt = original_podscript_prompt.replace("{{turnPattern}}", turn_pattern)
speaker_id_info = generate_speaker_id_text(pod_users, voices)
# Combine the speaker info, the prompt, and the custom content into podscript_prompt
podscript_prompt = speaker_id_info + "\n\n" + original_podscript_prompt + "\n\n" + custom_content
podscript_prompt = speaker_id_info + "\n\n" + original_podscript_prompt + "\n\n" + custom_content
return podscript_prompt, pod_users, voices, turn_pattern # Return voices for potential future use or consistency
print(f"\nInput Prompt (input.txt):\n{input_prompt[:100]}...") # Display first 100 chars
print(f"\nOverview Prompt (prompt-overview.txt):\n{overview_prompt[:100]}...")
print(f"\nPodscript Prompt (prompt-podscript.txt):\n{podscript_prompt[:1000]}...")
# Step 4 & 5: Call openai_cli to generate overview content
def _generate_overview_content(api_key, base_url, model, overview_prompt, input_prompt):
"""Generates overview content using OpenAI CLI."""
print("\nGenerating overview with OpenAI CLI...")
try:
openai_client_overview = OpenAICli(api_key=api_key, base_url=base_url, model=model, system_message=overview_prompt)
@@ -202,129 +222,121 @@ def main():
overview_content = "".join([chunk.choices[0].delta.content for chunk in overview_response_generator if chunk.choices and chunk.choices[0].delta.content])
print("Generated Overview:")
print(overview_content[:100])
return overview_content
except Exception as e:
print(f"Error generating overview: {e}")
sys.exit(1)
raise RuntimeError(f"Error generating overview: {e}")
# Step 6: Call openai_cli to generate podcast script JSON
def _generate_podcast_script(api_key, base_url, model, podscript_prompt, overview_content):
"""Generates and parses podcast script JSON using OpenAI CLI."""
print("\nGenerating podcast script with OpenAI CLI...")
# Initialize podscript_json_str outside try block to ensure it's always defined
podscript_json_str = ""
try:
openai_client_podscript = OpenAICli(api_key=api_key, base_url=base_url, model=model, system_message=podscript_prompt)
podscript_response_generator = openai_client_podscript.chat_completion(messages=[{"role": "user", "content": overview_content}])
podscript_json_str = "".join([chunk.choices[0].delta.content for chunk in podscript_response_generator if chunk.choices and chunk.choices[0].delta.content])
# try:
# output_script_filename = os.path.join(output_dir, f"podcast_script_{int(time.time())}.json")
# with open(output_script_filename, 'w', encoding='utf-8') as f:
# json.dump(podscript_json_str, f, ensure_ascii=False, indent=4)
# print(f"Podcast script saved to {output_script_filename}")
# except Exception as e:
# print(f"Error saving podcast script to file: {e}")
# sys.exit(1)
# Generate the response string first
podscript_json_str = "".join([chunk.choices[0].delta.content for chunk in openai_client_podscript.chat_completion(messages=[{"role": "user", "content": overview_content}]) if chunk.choices and chunk.choices[0].delta.content])
# Attempt to parse the JSON string. OpenAI sometimes returns extra text.
podcast_script = None
decoder = json.JSONDecoder()
idx = 0
valid_json_str = ""
while idx < len(podscript_json_str):
try:
obj, end = decoder.raw_decode(podscript_json_str[idx:])
# Check if this object is the expected podcast_script
if isinstance(obj, dict) and "podcast_transcripts" in obj:
podcast_script = obj
valid_json_str = podscript_json_str[idx : idx + end] # Capture the exact valid JSON string
break # Found the desired JSON, stop searching
idx += end # Move to the end of the current JSON object
valid_json_str = podscript_json_str[idx : idx + end]
break
idx += end
except json.JSONDecodeError:
# If decoding fails, advance index by one and continue
idx += 1
# Optionally, skip to the next potential JSON start if it's far away
next_brace = podscript_json_str.find('{', idx)
if next_brace != -1:
idx = next_brace
else:
break # No more braces, no more JSON to find
break
if podcast_script is None:
print(f"Error: Could not find a valid podcast script JSON object with 'podcast_transcripts' key in response.")
print(f"Raw response: {podscript_json_str}")
sys.exit(1)
raise ValueError(f"Error: Could not find a valid podcast script JSON object with 'podcast_transcripts' key in response. Raw response: {podscript_json_str}")
print("\nGenerated Podcast Script Length:"+ str(len(podcast_script.get("podcast_transcripts") or [])))
print(valid_json_str[:100] + "...") # Print beginning of the *actual* parsed JSON
print(valid_json_str[:100] + "...")
if not podcast_script.get("podcast_transcripts"):
print("Warning: 'podcast_transcripts' array is empty or not found in the generated script. Nothing to convert to audio.")
sys.exit(0) # Exit gracefully if no transcripts to process
raise ValueError("Error: 'podcast_transcripts' array is empty or not found in the generated script. Nothing to convert to audio.")
return podcast_script
except json.JSONDecodeError as e:
raise ValueError(f"Error decoding JSON from podcast script response: {e}. Raw response: {podscript_json_str}")
except Exception as e:
print(f"Error generating podcast script: {e}")
sys.exit(1)
raise RuntimeError(f"Error generating podcast script: {e}")
# Step 7: Parse podcast script and generate audio
os.makedirs(output_dir, exist_ok=True) # Create output directory if it doesn't exist
def generate_audio_for_item(item, config_data, tts_adapter: TTSAdapter, max_retries: int = 3):
"""Generate audio for a single podcast transcript item using the provided TTS adapter."""
speaker_id = item.get("speaker_id")
dialog = item.get("dialog")
voice_code = None
volume_adjustment = 0.0 # Default: 0.0
speed_adjustment = 0.0 # Default: 0.0
if config_data and "podUsers" in config_data and 0 <= speaker_id < len(config_data["podUsers"]):
pod_user_entry = config_data["podUsers"][speaker_id]
voice_code = pod_user_entry.get("code")
# Look up volume_adjustment and speed_adjustment for this voice in the voices list
voice_map = {voice.get("code"): voice for voice in config_data.get("voices", []) if voice.get("code")}
volume_adjustment = voice_map.get(voice_code, {}).get("volume_adjustment", 0.0)
speed_adjustment = voice_map.get(voice_code, {}).get("speed_adjustment", 0.0)
if not voice_code:
raise ValueError(f"No voice code found for speaker_id {speaker_id}. Cannot generate audio for this dialog.")
# print(f"dialog-before: {dialog}")
dialog = re.sub(r'[^\w\s\-,.。?!\u4e00-\u9fa5]', '', dialog)
print(f"dialog: {dialog}")
def generate_audio_for_item(item, index):
"""Generate audio for a single podcast transcript item."""
speaker_id = item.get("speaker_id")
dialog = item.get("dialog")
# Get the voice code based on speaker_id (index into config_data["person"])
# Assuming speaker_id corresponds to the index in the 'person' array
voice_code = None
if config_data and "podUsers" in config_data and 0 <= speaker_id < len(config_data["podUsers"]):
pod_user_entry = config_data["podUsers"][speaker_id]
voice_code = pod_user_entry.get("code")
if not voice_code:
print(f"Warning: No voice code found for speaker_id {speaker_id}. Skipping this dialog.")
return None
# Replace placeholders in apiUrl
# URL encode the dialog before replacing {{text}}
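# Remove special characters, keeping word characters, whitespace, hyphens, and basic punctuation (comma, period, question mark, exclamation mark)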
dialog = re.sub(r'[^\w\s\-,.。?!\u4e00-\u9fa5]', '', dialog)
print(f"dialog: {dialog}")
encoded_dialog = urllib.parse.quote(dialog)
api_url = config_data.get("apiUrl", "").replace("{{text}}", encoded_dialog).replace("{{voiceCode}}", voice_code)
if not api_url:
print(f"Warning: apiUrl not found in config. Skipping dialog for speaker_id {speaker_id}.")
return None
for attempt in range(max_retries):
try:
print(f"Calling TTS API for speaker {speaker_id} with voice {voice_code}...")
response = requests.get(api_url, stream=True)
response.raise_for_status() # Raise an exception for bad status codes
# Save the audio chunk to a temporary file
temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.mp3")
with open(temp_audio_file, 'wb') as f:
for chunk in response.iter_content(chunk_size=8192):
f.write(chunk)
print(f"Generated {os.path.basename(temp_audio_file)}")
print(f"Calling TTS API for speaker {speaker_id} ({voice_code}) (Attempt {attempt + 1}/{max_retries})...")
temp_audio_file = tts_adapter.generate_audio(
text=dialog,
voice_code=voice_code,
output_dir=output_dir,
volume_adjustment=volume_adjustment, # Pass the volume adjustment
speed_adjustment=speed_adjustment # Pass the speed adjustment
)
return temp_audio_file
except RuntimeError as e: # Catch specific RuntimeError from TTS adapters
print(f"Error generating audio for speaker {speaker_id} ({voice_code}) on attempt {attempt + 1}: {e}")
if attempt < max_retries - 1:
wait_time = 2 ** attempt
print(f"Retrying in {wait_time} seconds...")
time.sleep(wait_time)
else:
raise RuntimeError(f"Max retries ({max_retries}) reached for speaker {speaker_id} ({voice_code}). Audio generation failed.")
except Exception as e: # Catch other unexpected errors
raise RuntimeError(f"An unexpected error occurred for speaker {speaker_id} ({voice_code}) on attempt {attempt + 1}: {e}")
except requests.exceptions.RequestException as e:
print(f"Error calling TTS API for speaker {speaker_id} ({voice_code}): {e}")
return None
def _generate_all_audio_files(podcast_script, config_data, tts_adapter: TTSAdapter, threads):
"""Orchestrates the generation of individual audio files."""
os.makedirs(output_dir, exist_ok=True)
print("\nGenerating audio files...")
# test script
# podcast_script = json.loads("{\"podcast_transcripts\":[{\"speaker_id\":0,\"dialog\":\"欢迎收听来生小酒馆客官不进来喝点吗今天咱们来唠唠AI。 小希,你有什么新鲜事来分享吗?\"},{\"speaker_id\":1,\"dialog\":\"当然了, AI 编程工具 Cursor 给开发者送上了一份大礼,付费用户现在可以限时免费体验 GPT 5 的强大编码能力\"}]}")
transcripts = podcast_script.get("podcast_transcripts", [])
# Use ThreadPoolExecutor for multi-threading audio generation
max_retries = config_data.get("tts_max_retries", 3) # Maximum retry count from the config (default: 3)
from concurrent.futures import ThreadPoolExecutor, as_completed
# Create a dictionary to hold results with their indices
audio_files_dict = {}
with ThreadPoolExecutor(max_workers=args.threads) as executor:
# Submit all tasks with their indices
with ThreadPoolExecutor(max_workers=threads) as executor:
future_to_index = {
executor.submit(generate_audio_for_item, item, i): i
executor.submit(generate_audio_for_item, item, config_data, tts_adapter, max_retries): i
for i, item in enumerate(transcripts)
}
# Collect results and place them in the correct order
for future in as_completed(future_to_index):
index = future_to_index[future]
try:
@@ -332,42 +344,121 @@ def main():
if result:
audio_files_dict[index] = result
except Exception as e:
print(f"Error generating audio for item {index}: {e}")
# Re-raise the exception to propagate it to the main thread
raise RuntimeError(f"Error generating audio for item {index}: {e}")
# Convert dictionary to list in the correct order
audio_files = [audio_files_dict[i] for i in sorted(audio_files_dict.keys())]
print(f"\nFinished generating individual audio files. Total files: {len(audio_files)}")
"""
Merges a list of audio files into a single output file using FFmpeg.
Args:
audio_files (list): A list of paths to the audio files to merge.
output_dir (str): The directory where the merged audio file will be saved.
"""
return audio_files
def _create_ffmpeg_file_list(audio_files):
"""Creates the file list for FFmpeg concatenation."""
if not audio_files:
print("No audio files were generated to merge.")
return
raise ValueError("No audio files were generated to merge.")
# Create a file list for ffmpeg
print(f"Creating file list for ffmpeg at: {file_list_path}")
with open(file_list_path, 'w', encoding='utf-8') as f:
for audio_file in audio_files:
# FFmpeg concat demuxer requires paths to be relative to the file_list.txt
# or absolute. Using basename if file_list.txt is in output_dir.
f.write(f"file '{os.path.basename(audio_file)}'\n")
print("Content of file_list.txt:")
with open(file_list_path, 'r', encoding='utf-8') as f:
print(f.read())
from typing import cast # Add import for cast
def _initialize_tts_adapter(config_data: dict, output_dir: str) -> TTSAdapter:
"""
Initializes and returns the TTS adapter that matches the configuration data.
"""
tts_provider = config_data.get("tts_provider")
if not tts_provider:
raise ValueError("TTS provider is not specified in the configuration.")
tts_providers_config = {}
try:
tts_providers_config_content = read_file_content(tts_providers_config_path)
tts_providers_config = json.loads(tts_providers_config_content)
except Exception as e:
print(f"Warning: Could not load tts_providers.json: {e}")
# Get the extra parameters for the current tts_provider
current_tts_extra_params = tts_providers_config.get(tts_provider.split('-')[0], {}) # e.g. 'doubao-tts' -> 'doubao'
if tts_provider == "index-tts":
api_url = config_data.get("apiUrl")
if not api_url:
raise ValueError("IndexTTS apiUrl is not configured.")
return IndexTTSAdapter(api_url_template=cast(str, api_url))
elif tts_provider == "edge-tts":
api_url = config_data.get("apiUrl")
if not api_url:
raise ValueError("EdgeTTS apiUrl is not configured.")
return EdgeTTSAdapter(api_url_template=cast(str, api_url))
elif tts_provider == "fish-audio":
api_url = config_data.get("apiUrl")
headers = config_data.get("headers")
request_payload = config_data.get("request_payload")
if not all([api_url, headers, request_payload]):
raise ValueError("FishAudio requires apiUrl, headers, and request_payload configuration.")
return FishAudioAdapter(api_url=cast(str, api_url), headers=cast(dict, headers), request_payload_template=cast(dict, request_payload), tts_extra_params=cast(dict, current_tts_extra_params))
elif tts_provider == "minimax":
api_url = config_data.get("apiUrl")
headers = config_data.get("headers")
request_payload = config_data.get("request_payload")
if not all([api_url, headers, request_payload]):
raise ValueError("Minimax requires apiUrl, headers, and request_payload configuration.")
return MinimaxAdapter(api_url=cast(str, api_url), headers=cast(dict, headers), request_payload_template=cast(dict, request_payload), tts_extra_params=cast(dict, current_tts_extra_params))
elif tts_provider == "doubao-tts":
api_url = config_data.get("apiUrl")
headers = config_data.get("headers")
request_payload = config_data.get("request_payload")
if not all([api_url, headers, request_payload]):
raise ValueError("DoubaoTTS requires apiUrl, headers, and request_payload configuration.")
return DoubaoTTSAdapter(api_url=cast(str, api_url), headers=cast(dict, headers), request_payload_template=cast(dict, request_payload), tts_extra_params=cast(dict, current_tts_extra_params))
elif tts_provider == "gemini-tts":
api_url = config_data.get("apiUrl")
headers = config_data.get("headers")
request_payload = config_data.get("request_payload")
if not all([api_url, headers, request_payload]):
raise ValueError("GeminiTTS requires apiUrl, headers, and request_payload configuration.")
return GeminiTTSAdapter(api_url=cast(str, api_url), headers=cast(dict, headers), request_payload_template=cast(dict, request_payload), tts_extra_params=cast(dict, current_tts_extra_params))
else:
raise ValueError(f"Unsupported TTS provider: {tts_provider}")
def main():
args = _parse_arguments()
config_data = _load_configuration()
api_key, base_url, model = _prepare_openai_settings(args, config_data)
input_prompt_content, overview_prompt, original_podscript_prompt = _read_prompt_files()
custom_content, input_prompt = _extract_custom_content(input_prompt_content)
podscript_prompt, pod_users, voices, turn_pattern = _prepare_podcast_prompts(config_data, original_podscript_prompt, custom_content)
print(f"\nInput Prompt (input.txt):\n{input_prompt[:100]}...")
print(f"\nOverview Prompt (prompt-overview.txt):\n{overview_prompt[:100]}...")
print(f"\nPodscript Prompt (prompt-podscript.txt):\n{podscript_prompt[:1000]}...")
overview_content = _generate_overview_content(api_key, base_url, model, overview_prompt, input_prompt)
podcast_script = _generate_podcast_script(api_key, base_url, model, podscript_prompt, overview_content)
tts_adapter = _initialize_tts_adapter(config_data, output_dir) # Initialize the TTS adapter
audio_files = _generate_all_audio_files(podcast_script, config_data, tts_adapter, args.threads)
_create_ffmpeg_file_list(audio_files)
if __name__ == "__main__":
start_time = time.time() # Record the start time
main()
merge_audio_files()
end_time = time.time() # Record the end time
execution_time = end_time - start_time # Calculate total execution time
print(f"\nTotal execution time: {execution_time:.2f} seconds")
start_time = time.time()
try:
main()
merge_audio_files()
except Exception as e:
print(f"\nError: An unexpected error occurred during podcast generation: {e}", file=sys.stderr)
sys.exit(1)
finally:
end_time = time.time()
execution_time = end_time - start_time
print(f"\nTotal execution time: {execution_time:.2f} seconds")


@@ -57,10 +57,15 @@ You are a master podcast scriptwriter, adept at transforming diverse input conte
* **Target Duration:** Create a transcript that would result in approximately 5-6 minutes of audio (around 800-1000 words total).
* **Balanced Speaking Turns:** Aim for a natural conversational flow among speakers rather than extended monologues by one person. Prioritize the most important information from the source content.
7. **Personalized & Output:**
7. **Copy & Replacement:**
If a hyphen connects English letters and numbers or letters on both sides, replace it with a space.
If a hyphen has numbers on both sides, replace it with '减'.
If a hyphen has a percent sign or '%' on its left and a number on its right, replace it with '到'.
Replace four-digit Arabic numerals with their Chinese character equivalents, one-to-one.
8. **Personalized & Output:**
    * **Output Format:** No explanatory text. Make sure the output language matches the input language.
    * **Begin Format:** After the opening remarks, introduce each guest who will participate in the discussion.
    * **End Format:** Before concluding, review and summarize the preceding discussion; the summary should be concise, powerful, and thought-provoking.
</guidelines>
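The "Copy & Replacement" rules above are instructions for the language model, but the same substitutions can be expressed deterministically. The following is a rough sketch under stated assumptions: `normalize_for_tts` is a hypothetical function that is not part of this commit, the percent rule is assumed to cover both the ASCII `%` and the full-width `％`, and "four-digit Arabic numerals" is read as exactly four ASCII digits that are not part of a longer number.

```python
import re

CHINESE_DIGITS = "零一二三四五六七八九"


def normalize_for_tts(text: str) -> str:
    """Hypothetical sketch of the hyphen and numeral replacement rules."""
    # Percent sign on the left, number on the right: read the hyphen as "到" (to).
    text = re.sub(r"(?<=[%％])-(?=[0-9])", "到", text)
    # Numbers on both sides: read the hyphen as "减" (minus).
    text = re.sub(r"(?<=[0-9])-(?=[0-9])", "减", text)
    # Letters on both sides, or a letter on one side and a digit on the other: use a space.
    text = re.sub(r"(?<=[A-Za-z])-(?=[A-Za-z0-9])|(?<=[0-9])-(?=[A-Za-z])", " ", text)
    # Exactly four digits: map each digit to its Chinese character, one-to-one.
    return re.sub(
        r"(?<![0-9])[0-9]{4}(?![0-9])",
        lambda match: "".join(CHINESE_DIGITS[int(digit)] for digit in match.group()),
        text,
    )


samples = ["GPT-4", "5-3", "10%-20%", "2025"]
normalized = [normalize_for_tts(sample) for sample in samples]
```

The hyphen rules run before the numeral rule so that a range such as `2024-2025` still sees digits on both sides of the hyphen.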

379
tts_adapters.py Normal file

@@ -0,0 +1,379 @@
import os
import json
import base64
import requests
import uuid
import urllib.parse
import re
import time
from abc import ABC, abstractmethod
from typing import Optional
class TTSAdapter(ABC):
"""
    Abstract base class that defines the TTS adapter interface.
"""
@abstractmethod
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
"""
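        Generate an audio file from the given text and voice code.
        Args:
            text (str): Text to convert to speech.
            voice_code (str): Voice code used to synthesize the speech.
            output_dir (str): Directory in which the generated audio file is saved.
            volume_adjustment (float): Volume adjustment; positive values increase the volume, negative values decrease it.
            speed_adjustment (float): Speed adjustment in percent; positive values increase the speed, negative values decrease it.
        Returns:
            str: Path to the generated audio file.
        Raises:
            Exception: If audio generation fails.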
"""
pass
def _apply_audio_effects(self, audio_file_path: str, volume_adjustment: float, speed_adjustment: float) -> str:
"""
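        Apply volume and speed adjustments to an audio file.
        Args:
            audio_file_path (str): Path to the original audio file.
            volume_adjustment (float): Volume adjustment in dB. For example, 6.0 increases the volume by 6 dB and -3.0 decreases it by 3 dB.
            speed_adjustment (float): Speed adjustment in percent; positive values increase the speed, negative values decrease it. For example, 10 means +10% and -10 means -10%.
        Returns:
            str: Path to the adjusted audio file.
        Raises:
            ImportError: If the 'pydub' module is not installed.
            RuntimeError: If applying the audio effects fails.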
"""
if volume_adjustment == 0.0 and speed_adjustment == 0.0:
return audio_file_path
try:
from pydub import AudioSegment
except ImportError:
raise ImportError("The 'pydub' module is required for audio adjustments. Please install it using 'pip install pydub'.")
current_audio_file = audio_file_path
base, ext = os.path.splitext(audio_file_path)
try:
audio = AudioSegment.from_file(current_audio_file)
            # Apply the volume adjustment
if volume_adjustment != 0.0:
adjusted_audio = audio + volume_adjustment
new_file_path = f"{base}_vol_adjusted{ext}"
adjusted_audio.export(new_file_path, format=ext[1:])
os.remove(current_audio_file)
current_audio_file = new_file_path
audio = adjusted_audio
print(f"Applied volume adjustment of {volume_adjustment} dB to {os.path.basename(current_audio_file)}")
            # Apply the speed adjustment
if speed_adjustment != 0.0:
speed_multiplier = 1 + speed_adjustment / 100.0
adjusted_audio = audio.speedup(playback_speed=speed_multiplier, chunk_size=150, crossfade=25)
new_file_path = f"{base}_speed_adjusted{ext}"
adjusted_audio.export(new_file_path, format=ext[1:])
                if current_audio_file != audio_file_path and os.path.exists(current_audio_file):  # Delete current_audio_file only if it is an intermediate file
                    os.remove(current_audio_file)
                else:  # No volume adjustment was applied, so current_audio_file is still the original file
                    os.remove(audio_file_path)
current_audio_file = new_file_path
print(f"Applied speed adjustment of {speed_adjustment}% to {os.path.basename(current_audio_file)}")
return current_audio_file
except Exception as e:
            # On error, clean up any intermediate file
if current_audio_file != audio_file_path and os.path.exists(current_audio_file):
os.remove(current_audio_file)
raise RuntimeError(f"Error applying audio effects to {os.path.basename(audio_file_path)}: {e}")
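The two adjustment parameters use different units: `volume_adjustment` is a gain in dB (pydub applies a number added to an `AudioSegment` as a dB gain), while `speed_adjustment` is a percentage that the method turns into a playback multiplier. The sketch below spells out that arithmetic without touching any audio. The function names are hypothetical and are not part of `tts_adapters.py`.

```python
def gain_db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain in dB to the corresponding linear amplitude ratio."""
    return 10 ** (gain_db / 20.0)


def speed_percent_to_multiplier(speed_percent: float) -> float:
    """Mirror `1 + speed_adjustment / 100.0`, rejecting non-positive multipliers."""
    multiplier = 1 + speed_percent / 100.0
    if multiplier <= 0:
        raise ValueError(f"speed adjustment of {speed_percent}% gives a non-positive multiplier")
    return multiplier


def adjusted_duration_seconds(duration_seconds: float, speed_percent: float) -> float:
    """Estimate the clip length after the speed change."""
    return duration_seconds / speed_percent_to_multiplier(speed_percent)


ratio_for_6_db = gain_db_to_amplitude_ratio(6.0)
multiplier_for_10_percent = speed_percent_to_multiplier(10)
one_minute_at_plus_20 = adjusted_duration_seconds(60.0, 20)
```

A gain of 6 dB roughly doubles the amplitude, and a speed adjustment of +20% shortens a one-minute clip to about 50 seconds.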
class IndexTTSAdapter(TTSAdapter):
"""
    TTS adapter implementation for IndexTTS.
"""
def __init__(self, api_url_template: str):
self.api_url_template = api_url_template
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
encoded_text = urllib.parse.quote(text)
api_url = self.api_url_template.replace("{{text}}", encoded_text).replace("{{voiceCode}}", voice_code)
if not api_url:
raise ValueError("API URL is not configured for IndexTTS. Cannot generate audio.")
try:
print(f"Calling IndexTTS API with voice {voice_code}...")
response = requests.get(api_url, stream=True, timeout=30)
response.raise_for_status()
temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.wav")
with open(temp_audio_file, 'wb') as f:
for chunk in response.iter_content(chunk_size=8192):
f.write(chunk)
print(f"Generated {os.path.basename(temp_audio_file)}")
            # Apply volume and speed adjustments
final_audio_file = self._apply_audio_effects(temp_audio_file, volume_adjustment, speed_adjustment)
return final_audio_file
except requests.exceptions.RequestException as e:
raise RuntimeError(f"Error calling IndexTTS API with voice {voice_code}: {e}")
except Exception as e: # Catch other potential errors like JSON parsing or data decoding
raise RuntimeError(f"Error processing IndexTTS API response for voice {voice_code}: {e}")
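Both `IndexTTSAdapter` and `EdgeTTSAdapter` build the request URL by substituting `{{text}}` and `{{voiceCode}}` into a template from the configuration. The sketch below isolates that step. `build_tts_url` is a hypothetical helper and the template URL is invented for illustration; the sketch validates the template before substituting, which is an assumption about the intended order of the check.

```python
import urllib.parse


def build_tts_url(api_url_template: str, text: str, voice_code: str) -> str:
    """Hypothetical helper: fill the {{text}} and {{voiceCode}} placeholders."""
    if not api_url_template:
        raise ValueError("API URL template is not configured.")
    # Percent-encode the text so spaces and non-ASCII characters are URL-safe.
    encoded_text = urllib.parse.quote(text)
    return api_url_template.replace("{{text}}", encoded_text).replace("{{voiceCode}}", voice_code)


# The endpoint below is made up for this example.
template = "http://localhost:7899/synthesize?text={{text}}&speaker={{voiceCode}}"
english_url = build_tts_url(template, "hello world", "zh-CN-XiaoxiaoNeural")
chinese_url = build_tts_url(template, "你好", "zh-CN-XiaoxiaoNeural")
```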
class EdgeTTSAdapter(TTSAdapter):
"""
    TTS adapter implementation for EdgeTTS.
"""
def __init__(self, api_url_template: str):
self.api_url_template = api_url_template
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
encoded_text = urllib.parse.quote(text)
api_url = self.api_url_template.replace("{{text}}", encoded_text).replace("{{voiceCode}}", voice_code)
if not api_url:
raise ValueError("API URL is not configured for EdgeTTS. Cannot generate audio.")
try:
print(f"Calling EdgeTTS API with voice {voice_code}...")
response = requests.get(api_url, stream=True, timeout=30)
response.raise_for_status()
temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.mp3")
with open(temp_audio_file, 'wb') as f:
for chunk in response.iter_content(chunk_size=8192):
f.write(chunk)
print(f"Generated {os.path.basename(temp_audio_file)}")
            # Apply volume and speed adjustments
final_audio_file = self._apply_audio_effects(temp_audio_file, volume_adjustment, speed_adjustment)
return final_audio_file
except requests.exceptions.RequestException as e:
raise RuntimeError(f"Error calling EdgeTTS API with voice {voice_code}: {e}")
except Exception as e: # Catch other potential errors like JSON parsing or data decoding
raise RuntimeError(f"Error processing EdgeTTS API response for voice {voice_code}: {e}")
# msgpack is imported lazily inside FishAudioAdapter.generate_audio
class FishAudioAdapter(TTSAdapter):
    """
    TTS adapter implementation for FishAudio.
"""
def __init__(self, api_url: str, headers: dict, request_payload_template: dict, tts_extra_params: Optional[dict] = None):
self.api_url = api_url
self.headers = headers
self.request_payload_template = request_payload_template
self.tts_extra_params = tts_extra_params if tts_extra_params is not None else {}
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
        try:
            import msgpack  # Imported lazily so msgpack is only required when FishAudio is used
        except ImportError:
            raise ImportError("The 'msgpack' module is required for FishAudioAdapter. Please install it using 'pip install msgpack'.")
        # Build the request payload
        payload = self.request_payload_template.copy()
        payload["text"] = text
        payload["reference_id"] = voice_code
        self.headers["Authorization"] = self.headers["Authorization"].replace("{{api_key}}", self.tts_extra_params["api_key"])
        # Serialize the request payload with msgpack
        packed_payload = msgpack.packb(payload, use_bin_type=True)
        try:
            print(f"Calling FishAudio API with voice {voice_code}...")
            response = requests.post(self.api_url, data=packed_payload, headers=self.headers, timeout=60)  # Increased timeout for FishAudio
            response.raise_for_status()  # Fail on HTTP errors instead of saving the error body as audio
            temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.mp3")
            with open(temp_audio_file, "wb") as f:
                f.write(response.content)
            print(f"Generated {os.path.basename(temp_audio_file)}")
            # Apply volume and speed adjustments
            final_audio_file = self._apply_audio_effects(temp_audio_file, volume_adjustment, speed_adjustment)
return final_audio_file
except requests.exceptions.RequestException as e:
raise RuntimeError(f"Error calling FishAudio API with voice {voice_code}: {e}")
except Exception as e: # Catch other potential errors like JSON parsing or data decoding
raise RuntimeError(f"Error processing FishAudio API response for voice {voice_code}: {e}")
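The adapters fill placeholders such as `{{api_key}}` by overwriting `self.headers` in place, so the template is changed after the first call. One alternative, sketched here under assumptions, is to build a fresh header dictionary per request. `fill_placeholders` is a hypothetical helper, and the `Bearer {{api_key}}` and `Content-Type` values are illustrative rather than taken from the configuration files in this commit.

```python
from typing import Dict


def fill_placeholders(template: Dict[str, str], params: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: return a copy of `template` with {{name}} placeholders filled in."""
    filled = {}
    for header_name, value in template.items():
        for key, replacement in params.items():
            value = value.replace("{{" + key + "}}", replacement)
        filled[header_name] = value
    return filled


# Illustrative values only.
header_template = {"Authorization": "Bearer {{api_key}}", "Content-Type": "application/msgpack"}
request_headers = fill_placeholders(header_template, {"api_key": "demo-key"})
```

Because the template is never modified, the same adapter instance can be shared across worker threads without one call observing another call's substitutions.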
class MinimaxAdapter(TTSAdapter):
"""
    TTS adapter implementation for Minimax.
"""
def __init__(self, api_url: str, headers: dict, request_payload_template: dict, tts_extra_params: Optional[dict] = None):
self.api_url = api_url
self.headers = headers
self.request_payload_template = request_payload_template
self.tts_extra_params = tts_extra_params if tts_extra_params is not None else {}
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
        # Build the request payload; deep-copy the template (plain JSON data) because a nested field is modified below
        payload = json.loads(json.dumps(self.request_payload_template))
        payload["text"] = text
        payload["voice_setting"]["voice_id"] = voice_code
        self.headers["Authorization"] = self.headers["Authorization"].replace("{{api_key}}", self.tts_extra_params["api_key"])
        self.api_url = self.api_url.replace("{{group_id}}", self.tts_extra_params["group_id"])
        # Minimax returns hex-encoded audio when output_format is "hex"; it must be decoded before saving
        is_hex_output = payload.get("output_format") == "hex"
        try:
            print(f"Calling Minimax API with voice {voice_code}...")
            response = requests.post(self.api_url, json=payload, headers=self.headers, timeout=60)  # Increased timeout for Minimax
            response.raise_for_status()  # Fail on HTTP errors before parsing the body
            temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.mp3")
            response_data = response.json()
            # Parse the response and save the audio data
            if is_hex_output:
                audio_hex = response_data.get('data', {}).get('audio')
                if not audio_hex:
                    raise RuntimeError("Minimax API returned no audio data when output_format is hex.")
                audio_bytes = bytes.fromhex(audio_hex)
                with open(temp_audio_file, "wb") as f:
                    f.write(audio_bytes)
            else:
                audio_url = response_data.get('data', {}).get('audio')
                if not audio_url:
                    raise RuntimeError("Minimax API returned success but no audio URL found when output_format is not hex.")
                # Download the audio file
                audio_response = requests.get(audio_url, stream=True, timeout=30)
                audio_response.raise_for_status()
                with open(temp_audio_file, 'wb') as f:
                    for chunk in audio_response.iter_content(chunk_size=8192):
                        f.write(chunk)
            print(f"Generated {os.path.basename(temp_audio_file)}")
            # Apply volume and speed adjustments
            final_audio_file = self._apply_audio_effects(temp_audio_file, volume_adjustment, speed_adjustment)
return final_audio_file
except requests.exceptions.RequestException as e:
raise RuntimeError(f"Error calling Minimax API with voice {voice_code}: {e}")
except Exception as e: # Catch other potential errors like JSON parsing or data decoding
raise RuntimeError(f"Error processing Minimax API response for voice {voice_code}: {e}")
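When `output_format` is `hex`, the adapter reads a hex string from `data.audio` in the JSON response and decodes it to bytes. A minimal sketch of that decoding step, separated from the HTTP call, is shown below. `extract_hex_audio` is a hypothetical function and the sample response is fabricated for the example; only the `data` / `audio` keys come from the adapter code above.

```python
from typing import Any, Dict


def extract_hex_audio(response_data: Dict[str, Any]) -> bytes:
    """Hypothetical helper: decode the hex-encoded audio found at data.audio."""
    audio_hex = (response_data.get("data") or {}).get("audio")
    if not audio_hex:
        raise ValueError("response contains no hex-encoded audio")
    return bytes.fromhex(audio_hex)


# Fabricated response: "494433" is the hex encoding of the ASCII bytes "ID3".
sample_response = {"data": {"audio": "494433"}}
audio_bytes = extract_hex_audio(sample_response)
```

Checking for a missing value before calling `bytes.fromhex` yields a clear error message instead of a `TypeError` on `None`.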
class DoubaoTTSAdapter(TTSAdapter):
"""
    TTS adapter implementation for Doubao TTS.
"""
def __init__(self, api_url: str, headers: dict, request_payload_template: dict, tts_extra_params: Optional[dict] = None):
self.api_url = api_url
self.headers = headers
self.request_payload_template = request_payload_template
self.tts_extra_params = tts_extra_params if tts_extra_params is not None else {}
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
session = requests.Session()
try:
            payload = json.loads(json.dumps(self.request_payload_template))  # Deep copy: nested req_params fields are modified below
payload['req_params']['text'] = text
payload['req_params']['speaker'] = voice_code
self.headers["X-Api-App-Id"] = self.headers["X-Api-App-Id"].replace("{{X-Api-App-Id}}", self.tts_extra_params["X-Api-App-Id"])
self.headers["X-Api-Access-Key"] = self.headers["X-Api-Access-Key"].replace("{{X-Api-Access-Key}}", self.tts_extra_params["X-Api-Access-Key"])
print(f"Calling Doubao TTS API with voice {voice_code}...")
response = session.post(self.api_url, headers=self.headers, json=payload, stream=True, timeout=30)
response.raise_for_status()
audio_data = bytearray()
for chunk in response.iter_lines(decode_unicode=True):
if not chunk:
continue
data = json.loads(chunk)
                if data.get("code", 0) == 0 and "data" in data and data["data"]:
                    chunk_audio = base64.b64decode(data["data"])
                    audio_data.extend(chunk_audio)
                    continue
if data.get("code", 0) == 0 and "sentence" in data and data["sentence"]:
continue
if data.get("code", 0) == 20000000:
break
if data.get("code", 0) > 0:
raise RuntimeError(f"Doubao TTS API returned error: {data}")
if not audio_data:
raise RuntimeError("Doubao TTS API returned success but no audio data received.")
temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.mp3")
with open(temp_audio_file, "wb") as f:
f.write(audio_data)
print(f"Generated {os.path.basename(temp_audio_file)}")
            # Apply volume and speed adjustments
final_audio_file = self._apply_audio_effects(temp_audio_file, volume_adjustment, speed_adjustment)
return final_audio_file
except requests.exceptions.RequestException as e:
raise RuntimeError(f"Error calling Doubao TTS API with voice {voice_code}: {e}")
except Exception as e:
raise RuntimeError(f"Error processing Doubao TTS API response for voice {voice_code}: {e}")
finally:
session.close()
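The Doubao adapter consumes a streamed response in which every non-empty line is a JSON object: audio chunks arrive base64-encoded in `data`, code `20000000` marks the end of the stream, and any other positive code is treated as an error. The sketch below reproduces that loop over a plain list of strings so it can be exercised without a network call. `collect_stream_audio` is a hypothetical function, and the sample lines (including the error code `45000001`) are fabricated.

```python
import base64
import json
from typing import Iterable

END_OF_STREAM_CODE = 20000000  # End-of-stream code, as used in the adapter above


def collect_stream_audio(lines: Iterable[str]) -> bytes:
    """Hypothetical helper: concatenate base64 audio chunks from JSON lines."""
    audio = bytearray()
    for line in lines:
        if not line:
            continue  # Skip keep-alive blank lines
        event = json.loads(line)
        code = event.get("code", 0)
        if code == END_OF_STREAM_CODE:
            break
        if code > 0:
            raise RuntimeError(f"TTS stream returned error: {event}")
        if event.get("data"):
            audio.extend(base64.b64decode(event["data"]))
    return bytes(audio)


def _chunk(raw: bytes) -> str:
    return json.dumps({"code": 0, "data": base64.b64encode(raw).decode("ascii")})


sample_lines = [
    _chunk(b"ab"),
    "",
    json.dumps({"code": 0, "sentence": {"text": "demo"}}),
    _chunk(b"cd"),
    json.dumps({"code": END_OF_STREAM_CODE}),
    _chunk(b"ignored"),
]
stream_audio = collect_stream_audio(sample_lines)
```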
class GeminiTTSAdapter(TTSAdapter):
"""
    TTS adapter implementation for Gemini TTS.
"""
def __init__(self, api_url: str, headers: dict, request_payload_template: dict, tts_extra_params: Optional[dict] = None):
self.api_url = api_url
self.headers = headers
self.request_payload_template = request_payload_template
self.tts_extra_params = tts_extra_params if tts_extra_params is not None else {}
def generate_audio(self, text: str, voice_code: str, output_dir: str, volume_adjustment: float = 0.0, speed_adjustment: float = 0.0) -> str:
try:
            # Build the request payload; deep-copy the template (plain JSON data) because nested fields are modified below
            payload = json.loads(json.dumps(self.request_payload_template))
            model_name = payload['model']
            api_url = self.api_url.replace('{{model}}', model_name) if '{{model}}' in self.api_url else self.api_url
            # Fill in the text and the voice name
            payload['contents'][0]['parts'][0]['text'] = text
            payload['generationConfig']['speechConfig']['voiceConfig']['prebuiltVoiceConfig']['voiceName'] = voice_code
            # Set the API key in the headers
            gemini_api_key = self.tts_extra_params.get('api_key')
            self.headers['x-goog-api-key'] = gemini_api_key
            print(f"Calling Gemini TTS API with voice {voice_code}...")
            response = requests.post(api_url, headers=self.headers, json=payload, timeout=60)
            response.raise_for_status()
            response_data = response.json()
            audio_data_base64 = response_data['candidates'][0]['content']['parts'][0]['inlineData']['data']
            audio_data_pcm = base64.b64decode(audio_data_base64)
            # Gemini returns raw PCM data, which must be saved in a WAV container
            temp_audio_file = os.path.join(output_dir, f"temp_audio_{uuid.uuid4()}.wav")
            import wave  # Imported locally; only this adapter needs it
            with wave.open(temp_audio_file, "wb") as f:
                f.setnchannels(1)
                f.setsampwidth(2)  # Assumes 16-bit PCM
                f.setframerate(24000)  # Assumes a 24 kHz sample rate
                f.writeframes(audio_data_pcm)
            print(f"Generated {os.path.basename(temp_audio_file)}")
            # Apply volume and speed adjustments
            final_audio_file = self._apply_audio_effects(temp_audio_file, volume_adjustment, speed_adjustment)
return final_audio_file
except requests.exceptions.RequestException as e:
raise RuntimeError(f"Error calling Gemini TTS API with voice {voice_code}: {e}")
except Exception as e:
raise RuntimeError(f"Error processing Gemini TTS API response for voice {voice_code}: {e}")
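The Gemini adapter receives headerless PCM samples and wraps them in a WAV container with the standard-library `wave` module. The sketch below performs the same wrapping in memory and reads the result back to confirm the header fields. `pcm_to_wav_bytes` is a hypothetical function; the defaults (mono, 16-bit, 24 kHz) repeat the assumptions made in the adapter, and the sample data is one second of silence.

```python
import io
import wave


def pcm_to_wav_bytes(pcm_data: bytes, sample_rate: int = 24000, channels: int = 1, sample_width: int = 2) -> bytes:
    """Hypothetical helper: wrap raw PCM samples in a WAV container."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # Bytes per sample; 2 means 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_data)
    return buffer.getvalue()


# One second of 16-bit mono silence at 24 kHz.
silence = b"\x00\x00" * 24000
wav_bytes = pcm_to_wav_bytes(silence)

with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
    wav_info = (reader.getnchannels(), reader.getsampwidth(), reader.getframerate(), reader.getnframes())
```

If the service ever returns a different sample rate or bit depth, the file would still be written but would play at the wrong speed or as noise, so these parameters are worth confirming against the response metadata.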