
Voice Cloning

Create custom voice models from audio samples

Voice Cloning API

Create a custom voice model from an uploaded audio sample. Once the model is ready, it can be used as a voice in text-to-speech requests.

Endpoint

POST https://api.speechgen.com/v1/voice-clone
Content-Type: multipart/form-data

Request

Headers

Header         Value                Required
Authorization  Bearer YOUR_API_KEY  Yes
Content-Type   multipart/form-data  Yes

Form Data Parameters

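The example requests on this page send the following form fields (types inferred from those examples):

Prop             Type
audio            file
name             string
description      string
enhance_quality  boolean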

Supported Audio Formats

  • MP3 (recommended)
  • WAV
  • M4A
  • FLAC
  • OGG

Audio Requirements:

  • Duration: 10 seconds to 5 minutes
  • Single speaker only
  • Clear audio with minimal background noise
  • File size: maximum 10MB
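A client-side pre-check can catch files that these limits would reject before any upload happens. The following is a minimal sketch and not part of the Speechgen API: the function name `check_audio_file` and both constants are hypothetical, and it assumes "10MB" means 10 × 1024 × 1024 bytes. It checks only the extension and the file size; verifying duration or speaker count would require decoding the audio.

```python
import os

# Limits copied from the requirements above; the names are illustrative.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB


def check_audio_file(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the file passes."""
    problems = []

    # Compare the extension case-insensitively against the supported formats.
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")

    # Reject files larger than the documented maximum.
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size} bytes; the limit is {MAX_FILE_SIZE_BYTES}")

    return problems
```

Calling `check_audio_file("voice_sample.mp3")` before the upload lets the caller report all problems at once instead of waiting for the server to reject the request.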

Example Request

curl -X POST https://api.speechgen.com/v1/voice-clone \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "audio=@voice_sample.mp3" \
  -F "name=My Custom Voice" \
  -F "description=Professional narrator voice" \
  -F "enhance_quality=true"

Response

Success (201)

{
  "model_id": "vm_abc123xyz",
  "name": "My Custom Voice",
  "description": "Professional narrator voice",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z",
  "estimated_completion": "2024-01-15T10:31:00Z",
  "samples": [
    {
      "title": "Default Sample",
      "text": "This is a demonstration of the voice clone...",
      "audio_url": "https://cdn.speechgen.com/samples/abc123.mp3"
    }
  ]
}
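The two timestamps in this response can be used to decide how long to wait before the first status check. The sketch below is illustrative only: `parse_timestamp` and `estimated_wait_seconds` are hypothetical helper names, and it assumes both fields are always present and formatted like the example above.

```python
from datetime import datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2024-01-15T10:30:00Z"."""
    # Older Python versions of fromisoformat() reject a trailing "Z",
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def estimated_wait_seconds(creation_response: dict) -> float:
    """Seconds between created_at and estimated_completion."""
    created = parse_timestamp(creation_response["created_at"])
    done = parse_timestamp(creation_response["estimated_completion"])
    return (done - created).total_seconds()


# Values taken from the example response above.
example = {
    "model_id": "vm_abc123xyz",
    "status": "processing",
    "created_at": "2024-01-15T10:30:00Z",
    "estimated_completion": "2024-01-15T10:31:00Z",
}
print(estimated_wait_seconds(example))  # 60.0
```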

Status Values

Status      Description
processing  Voice model is being created
ready       Voice model is ready to use
failed      Voice creation failed

Check Model Status

GET https://api.speechgen.com/v1/models/{model_id}

Response:

{
  "model_id": "vm_abc123xyz",
  "name": "My Custom Voice",
  "status": "ready",
  "voice_id": "custom-abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "usage_count": 0
}
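Because model creation is asynchronous, a client typically calls this endpoint repeatedly until the status leaves `processing`. The following is one possible polling loop with a deadline, written as a sketch: `poll_until_done` is a hypothetical helper, `fetch_model` stands for any function that performs the GET request above and returns the parsed JSON, and the default timeout and interval are arbitrary choices rather than documented values.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_model: Callable[[], dict],
    timeout_seconds: float = 300.0,
    interval_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_model() until the status is "ready"; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        model = fetch_model()
        if model["status"] == "ready":
            return model
        if model["status"] == "failed":
            raise RuntimeError(f"voice model {model.get('model_id')} failed")
        # Still processing: give up once the deadline has passed.
        if time.monotonic() >= deadline:
            raise TimeoutError("voice model was still processing at the deadline")
        sleep(interval_seconds)
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any HTTP library and makes it straightforward to exercise with canned responses.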

Using Your Cloned Voice

Once the model status is "ready", pass the voice_id returned by the status endpoint as the voice parameter in text-to-speech requests:

curl -X POST https://api.speechgen.com/v1/text-to-speech \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is my cloned voice speaking!",
    "voice": "custom-abc123",
    "format": "mp3"
  }'
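The same request can be assembled in Python with only the standard library. This is a sketch under stated assumptions: `build_tts_request` and `TTS_URL` are hypothetical names, the body mirrors the three fields in the curl example, and the request is built but not sent because the response format of the text-to-speech endpoint is not described on this page.

```python
import json
import urllib.request

TTS_URL = "https://api.speechgen.com/v1/text-to-speech"


def build_tts_request(
    api_key: str,
    text: str,
    voice_id: str,
    audio_format: str = "mp3",
) -> urllib.request.Request:
    """Build (but do not send) the text-to-speech request shown above."""
    # Same JSON fields as the curl example: text, voice, format.
    body = json.dumps({"text": text, "voice": voice_id, "format": audio_format})
    return urllib.request.Request(
        TTS_URL,
        data=body.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen()` to perform the call; how to read the resulting audio depends on the endpoint's response format.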

Code Examples

JavaScript

async function cloneVoice(audioFile, name, description) {
  const formData = new FormData();
  formData.append("audio", audioFile);
  formData.append("name", name);
  formData.append("description", description);
  formData.append("enhance_quality", "true");

  const response = await fetch("https://api.speechgen.com/v1/voice-clone", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.SPEECHGEN_API_KEY}`,
    },
    body: formData,
  });

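  if (!response.ok) {
    // The error body format is not documented here, so fall back to the HTTP status.
    const error = await response.json().catch(() => null);
    throw new Error(
      error?.error?.message ?? `Request failed with status ${response.status}`
    );
  }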

  return response.json();
}

// Poll for completion
async function waitForModel(modelId) {
  while (true) {
    const response = await fetch(
      `https://api.speechgen.com/v1/models/${modelId}`,
      {
        headers: {
          Authorization: `Bearer ${process.env.SPEECHGEN_API_KEY}`,
        },
      }
    );

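    // Stop polling on HTTP errors instead of looping on an error body.
    if (!response.ok) {
      throw new Error(`Status check failed with status ${response.status}`);
    }

    const model = await response.json();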

    if (model.status === "ready") return model;
    if (model.status === "failed") throw new Error("Voice cloning failed");

    await new Promise((r) => setTimeout(r, 5000)); // Wait 5 seconds
  }
}

Python

import os  # needed for os.environ below
import time

import requests

def clone_voice(audio_path: str, name: str, description: str = "") -> dict:
    with open(audio_path, 'rb') as audio_file:
        response = requests.post(
            'https://api.speechgen.com/v1/voice-clone',
            headers={
                'Authorization': f'Bearer {os.environ["SPEECHGEN_API_KEY"]}',
            },
            files={
                'audio': audio_file,
            },
            data={
                'name': name,
                'description': description,
                'enhance_quality': 'true',
            }
        )

    response.raise_for_status()
    return response.json()

def wait_for_model(model_id: str) -> dict:
    while True:
        response = requests.get(
            f'https://api.speechgen.com/v1/models/{model_id}',
            headers={
                'Authorization': f'Bearer {os.environ["SPEECHGEN_API_KEY"]}',
            }
        )

        # Stop polling on HTTP errors instead of looping on an error body.
        response.raise_for_status()
        model = response.json()

        if model['status'] == 'ready':
            return model
        if model['status'] == 'failed':
            raise Exception('Voice cloning failed')

        time.sleep(5)

Tips for Best Results

Recording Tips:

  1. Record in a quiet environment with minimal echo
  2. Speak naturally at a consistent pace
  3. Keep a consistent distance from the microphone
  4. Avoid background music or noise
  5. Record at least 30 seconds for best quality

Delete a Voice Model

DELETE https://api.speechgen.com/v1/models/{model_id}

Response: 204 No Content
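A Python sketch of the delete call, using only the standard library. The helper names `build_delete_request` and `delete_model` are hypothetical, and sending the same Authorization header as the other endpoints is an assumption, since this section does not list the required headers.

```python
import urllib.request


def build_delete_request(api_key: str, model_id: str) -> urllib.request.Request:
    """Build the DELETE request for a voice model without sending it."""
    url = f"https://api.speechgen.com/v1/models/{model_id}"
    return urllib.request.Request(
        url,
        # Assumption: DELETE uses the same Bearer token as the other endpoints.
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )


def delete_model(api_key: str, model_id: str) -> bool:
    """Send the request; return True when the API answers 204 No Content."""
    # urlopen() raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(build_delete_request(api_key, model_id)) as response:
        return response.status == 204
```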