Podcasting platform Podcastle launches a text-to-speech model with more than 450 AI voices

Podcast recording and editing platform Podcastle is joining other companies in the AI-powered text-to-speech race by releasing its own AI model, called Asyncflow v1.0. An API for developers will also be available, allowing them to integrate the text-to-speech model directly into their apps.

Thanks to the new model, the company is able to offer more than 450 AI voices that can narrate your text. The startup said that it developed the technology and model in such a way that its training and inference costs are low, giving it an advantage over competitors.

With the move, Podcastle joins a number of startups, including ElevenLabs, Speechify, and WellSaid, that have developed technology and AI models to convert any kind of text into a voice clip narrated by AI. This technology spans use cases like marketing, advertising, content creation, education, and corporate training.

Podcastle’s founder, Arto Yeritsyan, told TechCrunch that the company had always wanted to build a text-to-speech model, but the training costs and data requirements were very high.

“We wanted to build a robust text-to-speech model since our inception. However, the costs of development were very high. Thanks to recent large language model developments, we were able to reach a breakthrough last year to get to a place where we could build a high-quality voice model without needing a ton of data,” Yeritsyan said.

The company was also aided in its efforts by its $13.5 million Series A fundraise last year.

Yeritsyan said that while Podcastle charges around $40 per 500 minutes of text-to-speech conversion, ElevenLabs charges $99 for the same.

Podcastle’s voice cloning feature is getting an upgrade as well, with a quicker training process.

Earlier, the training process involved reading roughly 70 different sentences. Now, it needs just a few seconds of recording from you to create a clone of your voice. The new process also uses Podcastle’s Magic Dust AI, which was released last year, to improve audio recording quality.

In our testing, the voice created with the new process sounded a bit robotic, though it mimicked our tone. The company said that it will improve the feature over time. Plus, you can train different samples of your voice to get different results.

Podcastle said that apart from costs, having tools for audio, video, podcasts, and AI-powered narration under one redesigned website will give it an edge over competitors. Yeritsyan said that while the majority of users use Podcastle to work on audio content, video is catching up as well.
