{"id":18875,"date":"2024-11-25T19:23:57","date_gmt":"2024-11-25T19:23:57","guid":{"rendered":"https:\/\/gpt.m2mbeta.com\/?p=18875"},"modified":"2024-11-25T19:23:57","modified_gmt":"2024-11-25T19:23:57","slug":"playai-clones-voices-on-command","status":"publish","type":"post","link":"https:\/\/gpt.m2mbeta.com\/?p=18875","title":{"rendered":"PlayAI clones voices on command"},"content":{"rendered":"<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\">Back in 2016, Hammad Syed and Mahmoud Felfel, an ex-WhatsApp engineer, thought it\u2019d be neat to build a text-to-speech Chrome extension for Medium articles. The extension, which could read any Medium story aloud, was featured on Product Hunt. A year later, it spawned an entire business.<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe saw a bigger opportunity in helping individuals and organizations create realistic audio content for their applications,\u201d Syed told TechCrunch. \u201cWithout the need to build their own model, they could deploy human-quality speech experiences faster than ever before.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Syed and Felfel\u2019s company, <a href=\"https:\/\/play.ai\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">PlayAI<\/a> (formerly PlayHT), pitches itself as the \u201cvoice interface of AI.\u201d Customers can choose from a number of predefined voices, or clone a voice, and use PlayAI\u2019s API to integrate text-to-speech into their apps.<\/p>\n<p class=\"wp-block-paragraph\">Toggles allow users to adjust the intonation, cadence, and tenor of voices.<\/p>\n<p class=\"wp-block-paragraph\">PlayAI also offers a \u201cplayground\u201d where users can upload a file to generate a read-aloud version and a dashboard for creating more-polished audio narrations and voice-overs. 
Recently, the company got into the \u201c<a href=\"https:\/\/techcrunch.com\/2024\/07\/13\/what-exactly-is-an-ai-agent\/\">AI agents<\/a>\u201d game with tools that can be used to automate tasks such as answering customer calls at a business.<\/p>\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"2000\" height=\"2587\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?w=526\" alt=\"PlayAI\" class=\"wp-image-2912442\" style=\"width:861px;height:auto\" srcset=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif 2000w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=116,150 116w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=232,300 232w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=768,993 768w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=526,680 526w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=928,1200 928w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=990,1280 990w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=332,430 332w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=557,720 557w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=696,900 696w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=618,800 618w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=1187,1536 1187w, 
https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=1583,2048 1583w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=516,668 516w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=290,375 290w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=477,617 477w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/eNawEi6tLf7eaE8Dgkmk8FS10rQ.avif?resize=411,531 411w\" sizes=\"auto, (max-width: 2000px) 100vw, 2000px\"\/><figcaption class=\"wp-element-caption\"><span class=\"wp-element-caption__text\">PlayAI\u2019s agent feature, which builds automation tools around the company\u2019s text-to-speech engine. <\/span><span class=\"wp-block-image__credits\"><strong>Image Credits:<\/strong>PlayAI<\/span><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">One of PlayAI\u2019s more interesting experiments is PlayNote, which transforms PDFs, videos, photos, songs, and other files into podcast-style shows, read-aloud summaries, one-on-one debates, and even children\u2019s stories. Like Google\u2019s <a href=\"https:\/\/techcrunch.com\/2024\/09\/26\/googles-notebooklm-enhances-ai-note-taking-with-youtube-audio-file-sources-sharable-audio-discussions\/\">NotebookLM<\/a>, PlayNote generates a script from an uploaded file or URL and feeds it to a collection of AI models, which together craft the finished product.<\/p>\n<p class=\"wp-block-paragraph\">I gave it a whirl, and the results weren\u2019t half bad. PlayNote\u2019s \u201cpodcast\u201d setting produces clips more or less on par with NotebookLM\u2019s in terms of quality, and the tool\u2019s ability to ingest photos and videos makes for some fascinating creations. Given a picture of a chicken mole dish I had recently, PlayNote wrote a five-minute podcast script about it. 
Truly, we are living in the future.<\/p>\n<p class=\"wp-block-paragraph\">Granted, the tool, like all AI tools, generates odd artifacts and <a href=\"https:\/\/techcrunch.com\/2024\/08\/14\/study-suggests-that-even-the-best-ai-models-hallucinate-a-bunch\/\">hallucinations<\/a> from time to time. And while PlayNote will do its best to adapt a file to the format you\u2019ve chosen, don\u2019t expect, say, a dry legal filing to make for the best source material. See: the <a href=\"https:\/\/techcrunch.com\/2024\/11\/14\/musks-amended-lawsuit-against-openai-names-microsoft-as-defendant\/\">Musk vs. OpenAI lawsuit<\/a> framed as a bedtime story:<\/p>\n<figure class=\"wp-block-embed aligncenter is-type-rich is-provider-soundcloud wp-block-embed-soundcloud\"\/>\n<p class=\"wp-block-paragraph\">PlayNote\u2019s podcast format is made possible by PlayAI\u2019s latest model, PlayDialog, which Syed says can use the \u201ccontext and history\u201d of a conversation to generate speech that reflects the conversation flow. \u201cUsing a conversation\u2019s historical context to control prosody, emotion, and pacing,\u00a0PlayDialog\u00a0delivers conversation with natural delivery and appropriate tone,\u201d he continued.<\/p>\n<p class=\"wp-block-paragraph\">PlayAI, a close rival of ElevenLabs, has been <a href=\"https:\/\/news.ycombinator.com\/item?id=35328698\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">criticized<\/a> in the past for its laissez-faire approach to safety. The company\u2019s voice cloning tool requires that users check a box indicating that they \u201chave all the necessary rights or consent\u201d to clone a voice \u2014 but there isn\u2019t any enforcement mechanism. 
I had no trouble creating a clone of Kamala Harris\u2019 voice from a recording.<\/p>\n<p class=\"wp-block-paragraph\">That\u2019s concerning considering the <a href=\"https:\/\/www.washingtonpost.com\/technology\/2023\/03\/05\/ai-voice-scam\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">potential for scams<\/a> and <a href=\"https:\/\/www.vice.com\/en\/article\/ai-voice-firm-4chan-celebrity-voices-emma-watson-joe-rogan-elevenlabs\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">deepfakes<\/a>.<\/p>\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"3840\" height=\"2160\" src=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?w=680\" alt=\"PlayDialog\" class=\"wp-image-2918681\" srcset=\"https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif 3840w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=150,84 150w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=300,169 300w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=768,432 768w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=680,383 680w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=1200,675 1200w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=1280,720 1280w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=430,242 430w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=720,405 720w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=900,506 900w, 
https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=800,450 800w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=1536,864 1536w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=2048,1152 2048w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=668,375 668w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=1097,617 1097w, https:\/\/techcrunch.com\/wp-content\/uploads\/2024\/11\/ArMeY8S3SxQcxIgVK42xkn9oduw.avif?resize=708,398 708w\" sizes=\"auto, (max-width: 3840px) 100vw, 3840px\"\/><figcaption class=\"wp-element-caption\"><span class=\"wp-element-caption__text\">PlayAI\u2019s PlayDialog model can generate two-way, \u201cduplex\u201d conversations that sound relatively natural. <\/span><span class=\"wp-block-image__credits\"><strong>Image Credits:<\/strong>PlayAI<\/span><\/figcaption><\/figure>\n<p class=\"wp-block-paragraph\">PlayAI also claims that it automatically detects and blocks \u201csexual, offensive, racist, or threatening content.\u201d But that wasn\u2019t the case in my testing. 
I used the Harris clone to generate speech I frankly can\u2019t embed here and never once saw a warning message.<\/p>\n<p class=\"wp-block-paragraph\">Meanwhile, PlayNote\u2019s community portal, which is filled with publicly generated content, has files with <a href=\"https:\/\/play.ai\/playnote\/IMG_0954_1731442821480.png\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">explicit titles<\/a> like \u201cWoman Performing Oral Sex.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Syed tells me that PlayAI responds to reports of voices cloned without consent, <a href=\"https:\/\/archive.ph\/HKjue\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">like this one<\/a>, by blocking the user responsible and removing the cloned voice immediately. He also makes the case that PlayAI\u2019s highest-fidelity voice clones, which require 20 minutes of voice samples, are priced higher ($49 per month billed annually or $99 per month) than most scammers are willing to pay.<\/p>\n<p class=\"wp-block-paragraph\">\u201cPlayAI has several ethical safeguards in place,\u201d Syed said. \u201cWe\u2019ve implemented robust mechanisms to identify whether a voice was synthesized using our technology, for example. If any misuse is reported, we promptly verify the origin of the content and take decisive actions to rectify the situation and prevent further ethical violations.\u201d<\/p>\n<p class=\"wp-block-paragraph\">I\u2019d certainly hope that\u2019s the case \u2014 and that PlayAI moves away from <a href=\"https:\/\/voicebot.ai\/2022\/10\/12\/synthetic-steve-jobs-interviewed-by-deepfake-joe-rogan-for-ai-powered-podcast-scripted-by-gpt-3\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">marketing campaigns featuring dead tech celebrities<\/a>. 
If PlayAI\u2019s moderation isn\u2019t robust, it could face legal challenges in <a href=\"https:\/\/www.dwt.com\/blogs\/artificial-intelligence-law-advisor\/2024\/04\/tennessee-elvis-act-ai-voice-replica\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">Tennessee<\/a>, which has a law on the books preventing platforms from hosting AI to make unauthorized recordings of a person\u2019s voice.<\/p>\n<p class=\"wp-block-paragraph\">PlayAI\u2019s approach to training its voice-cloning AI is also a bit murky. The company won\u2019t reveal where it sourced the data for its models, ostensibly for competitive reasons.<\/p>\n<p class=\"wp-block-paragraph\">\u201cPlayAI uses mostly open datasets, [as well as licensed data] and proprietary datasets that are built in-house,\u201d Syed said. \u201cWe don\u2019t use user data from the products in training, or creators to train models. Our models are trained on millions of hours of real-life human speech, delivering voices in male and female genders across multiple languages and accents.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Most AI models are trained on public web data \u2014 some of which may be copyrighted or under a restrictive license.  Many AI vendors argue that the\u00a0<a href=\"https:\/\/www.copyright.gov\/fair-use\/#:~:text=Fair%20use%20is%20a%20legal,protected%20works%20in%20certain%20circumstances.\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">fair-use<\/a>\u00a0doctrine shields them from copyright claims. 
But that hasn\u2019t stopped data owners\u00a0<a href=\"https:\/\/www.reuters.com\/legal\/music-publishers-ask-court-halt-ai-company-anthropics-use-lyrics-2023-11-17\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">from<\/a>\u00a0<a href=\"https:\/\/www.finnegan.com\/en\/insights\/articles\/insights-from-the-pending-copilot-class-action-lawsuit.html\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">filing class action lawsuits alleging that vendors used their data sans permission<\/a>.<\/p>\n<p class=\"wp-block-paragraph\">PlayAI hasn\u2019t been sued. However, its terms of service <a href=\"https:\/\/play.ai\/terms\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">suggest<\/a> it won\u2019t go to bat for users if they find themselves under legal threat.<\/p>\n<p class=\"wp-block-paragraph\">Voice cloning platforms like PlayAI face criticism from actors who fear that voice work will eventually be replaced by AI-generated vocals and that actors will have little control over how their digital doubles are used.<\/p>\n<p class=\"wp-block-paragraph\">The Hollywood actors\u2019 union SAG-AFTRA has struck deals with some startups, including online talent marketplace Narrativ and Replica Studios, for what it describes as \u201cfair\u201d and \u201cethical\u201d voice cloning arrangements. But even these tie-ups have come under <a href=\"https:\/\/aibusiness.com\/ml\/sag-aftra-deal-with-ai-voice-cloners-angers-many-actors\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">intense scrutiny<\/a>, including from SAG-AFTRA\u2019s own members.<\/p>\n<p class=\"wp-block-paragraph\">In California, laws require that companies relying on a performer\u2019s digital replica (e.g., a cloned voice) give a description of the replica\u2019s intended use and negotiate with the performer\u2019s legal counsel. 
They also require that entertainment employers gain the consent of a deceased performer\u2019s estate before using a digital clone of that person.<\/p>\n<p class=\"wp-block-paragraph\">Syed says that PlayAI \u201cguarantees\u201d that every voice clone generated through its platform is exclusive to the creator. \u201cThis exclusivity is vital for protecting the creative rights of users,\u201d he added. <\/p>\n<p class=\"wp-block-paragraph\">The increasing legal burden is one headwind for PlayAI. Another is the competition. <a href=\"https:\/\/techcrunch.com\/2022\/06\/09\/papercup-raises-20m-for-ai-that-automatically-dubs-videos\/\">Papercup<\/a>, <a href=\"https:\/\/techcrunch.com\/2022\/02\/10\/deepdub-raises-20m-for-a-i-powered-dubbing-that-uses-actors-original-voices\/\">Deepdub<\/a>, <a href=\"https:\/\/techcrunch.com\/2023\/05\/08\/acapela-lets-anyone-back-up-their-own-voice-for-free-in-minutes-just-in-case\/\">Acapela<\/a>, <a href=\"https:\/\/techcrunch.com\/2023\/12\/06\/respeechers-ethics-first-approach-to-ai-voice-cloning-locks-in-new-funding\/\">Respeecher<\/a>, and <a href=\"https:\/\/techcrunch.com\/2023\/06\/30\/voice-ai-raises-6m-as-its-real-time-voice-changer-approaches-500k-users\/\">Voice.ai<\/a>, as well as Big Tech incumbents Amazon, Microsoft, and Google, offer AI dubbing and voice-cloning tools. The aforementioned ElevenLabs, one of the highest-profile voice-cloning vendors, is said to be <a href=\"https:\/\/techcrunch.com\/2024\/10\/03\/investors-are-scrambling-to-get-into-elevenlabs-which-may-soon-be-valued-at-3-billion\/\">raising<\/a> new funds at a valuation over $3 billion.<\/p>\n<p class=\"wp-block-paragraph\">PlayAI isn\u2019t struggling to find investors, though. This month, the Y Combinator-backed company closed a $20 million seed round co-led by 500 Startups and Kindred Ventures, bringing its total capital raised to $21 million. 
Race Capital and 500 Global also participated.<\/p>\n<p class=\"wp-block-paragraph\">\u201cThe new\u00a0capital will be used to invest in our generative AI voice models and voice agent platform, and to shorten the time for businesses to build human-quality speech experiences,\u201d Syed said, adding that PlayAI plans to expand its 40-person workforce.<\/p>\n<\/div>\n<p><\/p>\n<hr style=\"border-top: 2px solid #ccc; margin-top: 20px;\">\n<p><em>Source: <\/em> <em><a href=\"https:\/\/techcrunch.com\/2024\/11\/25\/playai-clones-voices-on-command\/\">techcrunch.com\u2026<\/a><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Back in 2016, Hammad Syed and Mahmoud Felfel, an ex-WhatsApp engineer, thought it\u2019d be neat to build a text-to-speech Chrome extension for Medium articles. The extension, which could read any Medium story aloud, was featured on Product Hunt. A year later, it spawned an entire business. \u201cWe saw a bigger opportunity in helping individuals and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-18875","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=\/wp\/v2\/posts\/18875","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=18875"}],"version-history":[{"count":0,"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=\/wp\/v2\/posts\/18875\/revisions"}],"wp:attachment
":[{"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=18875"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=18875"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gpt.m2mbeta.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=18875"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}