
Meta unveils 'Voicebox': A cutting-edge AI model for speech generation

Meta's Voicebox AI model revolutionizes speech generation by producing high-quality audio clips in various styles and supporting multiple languages, along with advanced features like noise removal and content editing.

Edited by: Saumya Nigam (@snigam04) | New Delhi | Published on: June 18, 2023 18:45 IST

Meta, the company behind Facebook, has unveiled a groundbreaking generative AI model called 'Voicebox' that has the potential to revolutionize speech generation. In a blog post, Meta announced that Voicebox is the first model capable of generalizing to speech-generation tasks with exceptional performance, even without specific training for those tasks.

Unlike generative models that produce images or text, Voicebox specializes in producing high-quality audio clips. It can generate speech in multiple styles, either from scratch or by modifying a provided sample. The model supports speech synthesis in six languages: English, French, German, Spanish, Polish, and Portuguese. Voicebox can also remove noise, edit content, convert style, and generate diverse samples.

What sets Voicebox apart is its learning approach. Voicebox learns directly from raw audio and the accompanying transcriptions. Unlike autoregressive models, which generate audio sequentially and so can only extend the end of a clip, Voicebox can modify any part of a given sample, which makes it more flexible.


Meta explains that Voicebox is trained to predict a speech segment when given the surrounding speech and its corresponding transcript. Once the model grasps the ability to fill in speech based on context, it can be applied to a wide range of speech generation tasks, allowing it to generate specific portions of an audio recording without reproducing the entire recording.
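The fill-in-the-gap setup described above can be illustrated with a small Python sketch. This is not Meta's code: the function names (`make_infilling_example`, `splice`), the use of plain lists of numbers as stand-ins for audio frames, and the random choice of a single contiguous masked span are all assumptions made for illustration. The sketch only shows how a training example could pair the surrounding speech and transcript with a hidden segment, and how a predicted segment could be placed back without regenerating the rest of the recording.

```python
import random


def make_infilling_example(frames, transcript, mask_len, rng):
    """Hide one contiguous span of frames; the hidden span is the target.

    `frames` is a list of numbers standing in for audio frames (an
    illustrative assumption, not a real audio representation).
    """
    if not 0 < mask_len <= len(frames):
        raise ValueError("mask_len must be between 1 and len(frames)")
    start = rng.randrange(len(frames) - mask_len + 1)
    end = start + mask_len
    # Surrounding speech stays visible; the masked span is replaced by None.
    context = [None if start <= i < end else f for i, f in enumerate(frames)]
    return {
        "context": context,
        "transcript": transcript,
        "target": frames[start:end],
        "span": (start, end),
    }


def splice(context, predicted, span):
    """Insert a predicted segment into the gap, leaving other frames as-is."""
    start, end = span
    if len(predicted) != end - start:
        raise ValueError("predicted segment must match the masked span length")
    return context[:start] + list(predicted) + context[end:]


if __name__ == "__main__":
    frames = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    example = make_infilling_example(frames, "hello world", 2, random.Random(0))
    # A trained model would predict example["target"] from the context and
    # transcript; here the true target is reused to show the splice step.
    restored = splice(example["context"], example["target"], example["span"])
    print(restored == frames)
```

In this toy version, only the masked span is ever produced, which mirrors the article's point that specific portions of a recording can be generated without reproducing the whole clip.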

Thanks to its versatility, Voicebox excels in various applications, including in-context text-to-speech synthesis, cross-lingual style transfer, speech denoising and editing, and diverse speech sampling. The model's performance and adaptability offer new possibilities for creative audio generation and advanced speech manipulation.


Meta's Voicebox represents a significant advancement in the field of speech generation, introducing a powerful AI model capable of producing high-quality audio clips and performing various speech-related tasks with exceptional results. As AI technology continues to evolve, Voicebox could open doors to innovative applications in voice-assisted technologies, entertainment, and more.


Inputs from IANS
