tensor format input audio translation error #473

Open
DengHao97 opened this issue Jun 19, 2024 · 0 comments

I am trying to build an API around the model, but the translation returned for the audio input is wrong. I suspect I made a mistake when processing the audio file. What went wrong? If a complete API example already exists, could you share it? Here is my code:

```python
import torchaudio
import torch
from seamless_communication.inference import Translator
from fastapi import FastAPI, File, UploadFile, Form

model_name = "seamlessM4T_v2_large"
vocoder_name = "vocoder_v2" if model_name == "seamlessM4T_v2_large" else "vocoder_36langs"

translator = Translator(
    model_name,
    vocoder_name,
    device=torch.device("cuda:1"),
    dtype=torch.float16,
)

app = FastAPI()


@app.post("/translate")
async def translate(
    file: UploadFile = File(...),
    to_lang: str = Form(...),
):
    # torchaudio.load returns a (channels, frames) tensor and the file's sample rate.
    audio_input, sample_rate = torchaudio.load(file.file)

    if not isinstance(audio_input, torch.Tensor):
        audio_input = torch.from_numpy(audio_input)
    # Reorder to (frames, channels); sample_rate is not used after this point.
    audio_input = audio_input.permute(1, 0)

    text_output, _ = translator.predict(
        input=audio_input,
        task_str="s2tt",
        tgt_lang=to_lang,
    )
    print(f"Translated text: {text_output[0]}")
    return {
        "code": 200,
        "text": str(text_output[0]),
    }


if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=7860)
```
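
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the endpoint reads `sample_rate` from the upload but never uses it, and to my knowledge the SeamlessM4T models are trained on 16 kHz mono audio. The sketch below only inspects the upload before it is decoded. It assumes the file is a PCM WAV (the standard-library `wave` module does not read other formats), and `inspect_wav`, `AudioReport`, and `EXPECTED_SAMPLE_RATE` are hypothetical names introduced for illustration, not part of `seamless_communication`.

```python
import io
import wave
from dataclasses import dataclass
from typing import BinaryIO

# Assumption: the model expects 16 kHz mono input; adjust if that is wrong.
EXPECTED_SAMPLE_RATE = 16_000


@dataclass
class AudioReport:
    sample_rate: int
    channels: int
    needs_resample: bool
    needs_downmix: bool


def inspect_wav(stream: BinaryIO) -> AudioReport:
    """Read the WAV header and report whether preprocessing looks necessary."""
    start = stream.tell()
    with wave.open(stream, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    # Rewind so a later torchaudio.load(stream) still sees the whole file.
    stream.seek(start)
    return AudioReport(
        sample_rate=rate,
        channels=channels,
        needs_resample=rate != EXPECTED_SAMPLE_RATE,
        needs_downmix=channels > 1,
    )


if __name__ == "__main__":
    # Build a short 44.1 kHz stereo WAV in memory to exercise the helper.
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(2)
        out.setsampwidth(2)
        out.setframerate(44_100)
        out.writeframes(b"\x00\x00" * 2 * 100)
    buffer.seek(0)
    print(inspect_wav(buffer))
```

In the endpoint, this could be called as `inspect_wav(file.file)` just before `torchaudio.load(file.file)`. If the report shows a rate other than 16000 or more than one channel, resampling with `torchaudio.functional.resample` and averaging the channels before calling `predict` is one thing to try.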