Voice Cloning API

Create a custom voice model from a short audio sample of a single speaker, then use that model in text-to-speech requests.

Endpoint

POST https://api.speechgen.com/v1/voice-clone
Content-Type: multipart/form-data

Request

Headers

Header	Value	Required
`Authorization`	`Bearer YOUR_API_KEY`	Yes
`Content-Type`	`multipart/form-data`	Yes

Supported Audio Formats

MP3 (recommended)
WAV
M4A
FLAC
OGG

Audio Requirements:

- Duration: 10 seconds to 5 minutes
- Single speaker only
- Clear audio with minimal background noise
- File size: maximum 10 MB
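
Checking a file locally before uploading can save a failed request. The following is a minimal sketch, not part of the API: the helper name `check_audio_file` is hypothetical, the 10 MB limit is assumed to mean 10 * 1024 * 1024 bytes, and the format check looks only at the file extension. Duration and speaker count are not checked here because that would require decoding the audio.

```python
from pathlib import Path

# Formats and size limit taken from the requirements listed above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024  # assumption: "10MB" means 10 MiB


def check_audio_file(audio_path: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    path = Path(audio_path)

    if not path.is_file():
        return [f"{audio_path} does not exist or is not a file"]

    problems = []

    # Extension check only; the actual encoding is not inspected.
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {path.suffix or '(no extension)'}")

    size = path.stat().st_size
    if size > MAX_FILE_SIZE_BYTES:
        problems.append(f"file is {size} bytes, above the 10 MB limit")

    return problems
```

Calling `check_audio_file("voice_sample.mp3")` before the upload and stopping when the returned list is non-empty keeps obviously invalid files from reaching the endpoint.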

Example Request

curl -X POST https://api.speechgen.com/v1/voice-clone \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "audio=@voice_sample.mp3" \
  -F "name=My Custom Voice" \
  -F "description=Professional narrator voice" \
  -F "enhance_quality=true"

Response

Success (201)

{
  "model_id": "vm_abc123xyz",
  "name": "My Custom Voice",
  "description": "Professional narrator voice",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z",
  "estimated_completion": "2024-01-15T10:31:00Z",
  "samples": [
    {
      "title": "Default Sample",
      "text": "This is a demonstration of the voice clone...",
      "audio_url": "https://cdn.speechgen.com/samples/abc123.mp3"
    }
  ]
}

Status Values

Status	Description
`processing`	Voice model is being created
`ready`	Voice model is ready to use
`failed`	Voice creation failed

Check Model Status

GET https://api.speechgen.com/v1/models/{model_id}

Response:

{
  "model_id": "vm_abc123xyz",
  "name": "My Custom Voice",
  "status": "ready",
  "voice_id": "custom-abc123",
  "created_at": "2024-01-15T10:30:00Z",
  "usage_count": 0
}

Using Your Cloned Voice

Once the model status is `ready`, pass the `voice_id` from the status response as the `voice` field in text-to-speech requests:

curl -X POST https://api.speechgen.com/v1/text-to-speech \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is my cloned voice speaking!",
    "voice": "custom-abc123",
    "format": "mp3"
  }'
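
The same request can be sketched in Python. This example uses only the standard library so it runs without extra dependencies. The function names `build_tts_request` and `synthesize` are hypothetical, and the sketch assumes the endpoint returns the audio bytes directly in the response body, which this page does not confirm.

```python
import json
import urllib.request

TTS_URL = "https://api.speechgen.com/v1/text-to-speech"


def build_tts_request(
    text: str, voice_id: str, api_key: str, audio_format: str = "mp3"
) -> urllib.request.Request:
    """Build the POST request with the same fields as the curl example."""
    payload = {"text": text, "voice": voice_id, "format": audio_format}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize(text: str, voice_id: str, api_key: str, output_path: str) -> None:
    """Send the request and save the response body to output_path."""
    request = build_tts_request(text, voice_id, api_key)
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()  # assumption: the body is raw audio data
    with open(output_path, "wb") as output_file:
        output_file.write(audio)
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without contacting the API.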

Code Examples

JavaScript

async function cloneVoice(audioFile, name, description) {
  const formData = new FormData();
  formData.append("audio", audioFile);
  formData.append("name", name);
  formData.append("description", description);
  formData.append("enhance_quality", "true");

  const response = await fetch("https://api.speechgen.com/v1/voice-clone", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.SPEECHGEN_API_KEY}`,
    },
    body: formData,
  });

  if (!response.ok) {
    const error = await response.json();
    throw new Error(error.error.message);
  }

  return response.json();
}

// Poll for completion
async function waitForModel(modelId) {
  while (true) {
    const response = await fetch(
      `https://api.speechgen.com/v1/models/${modelId}`,
      {
        headers: {
          Authorization: `Bearer ${process.env.SPEECHGEN_API_KEY}`,
        },
      }
    );

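    // Surface HTTP errors instead of polling forever on a bad response
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }

    const model = await response.json();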

    if (model.status === "ready") return model;
    if (model.status === "failed") throw new Error("Voice cloning failed");

    await new Promise((r) => setTimeout(r, 5000)); // Wait 5 seconds
  }
}

Python

import os
import time

import requests

def clone_voice(audio_path: str, name: str, description: str = "") -> dict:
    with open(audio_path, 'rb') as audio_file:
        response = requests.post(
            'https://api.speechgen.com/v1/voice-clone',
            headers={
                'Authorization': f'Bearer {os.environ["SPEECHGEN_API_KEY"]}',
            },
            files={
                'audio': audio_file,
            },
            data={
                'name': name,
                'description': description,
                'enhance_quality': 'true',
            }
        )

    response.raise_for_status()
    return response.json()

def wait_for_model(model_id: str) -> dict:
    while True:
        response = requests.get(
            f'https://api.speechgen.com/v1/models/{model_id}',
            headers={
                'Authorization': f'Bearer {os.environ["SPEECHGEN_API_KEY"]}',
            }
        )

        # Raise on HTTP errors rather than reading 'status' from an error body
        response.raise_for_status()
        model = response.json()

        if model['status'] == 'ready':
            return model
        if model['status'] == 'failed':
            raise Exception('Voice cloning failed')

        time.sleep(5)
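
A polling loop with no upper bound can hang if a model never leaves `processing`. The sketch below shows one way to add a deadline. The function name `wait_for_model_with_timeout`, its parameters, and the default 300-second timeout are illustrative assumptions, not values defined by the API. The status lookup is passed in as a callable so the loop can be exercised without network access.

```python
import time
from typing import Callable


def wait_for_model_with_timeout(
    fetch_model: Callable[[], dict],
    timeout_seconds: float = 300.0,  # assumption: not an API-defined limit
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_model() until the status is ready or failed, or time runs out."""
    deadline = time.monotonic() + timeout_seconds

    while True:
        model = fetch_model()
        status = model.get("status")

        if status == "ready":
            return model
        if status == "failed":
            raise RuntimeError("Voice cloning failed")

        # Stop if the next poll would happen after the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(
                f"Model still '{status}' after {timeout_seconds} seconds"
            )
        sleep(poll_interval)
```

Here `fetch_model` would wrap the GET request to `/v1/models/{model_id}` and return the parsed JSON body.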

Recording Tips:

1. Record in a quiet environment with minimal echo
2. Speak naturally at a consistent pace
3. Keep a consistent distance from the microphone
4. Avoid background music or noise
5. Record at least 30 seconds for best quality

Delete a Voice Model

DELETE https://api.speechgen.com/v1/models/{model_id}

Response: 204 No Content
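
A Python sketch of the delete call, using only the standard library. The function name `delete_model` and its `opener` parameter are hypothetical; `opener` exists only so the function can be tested without sending a real request. Error response bodies are not documented on this page, so the sketch relies on `urlopen` raising `urllib.error.HTTPError` for 4xx and 5xx responses.

```python
import urllib.request

MODELS_URL = "https://api.speechgen.com/v1/models"


def delete_model(model_id: str, api_key: str, opener=urllib.request.urlopen) -> bool:
    """Send DELETE for the model and return True on a 204 No Content response."""
    request = urllib.request.Request(
        f"{MODELS_URL}/{model_id}",
        method="DELETE",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    # With the default opener, HTTP errors raise before this block returns.
    with opener(request) as response:
        return response.status == 204
```

For example, `delete_model("vm_abc123xyz", os.environ["SPEECHGEN_API_KEY"])` (with `import os`) would remove the model created in the earlier example.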
