Humanist Discussion Group, Vol. 37, No. 552.
Department of Digital Humanities, University of Cologne
Hosted by DH-Cologne
www.dhhumanist.org
Submit to: humanist@dhhumanist.org

Date: 2024-04-16 13:33:02+00:00
From: Clay Foye <clay.foye@gmail.com>
Subject: Re: [Humanist] 37.550: text to speech?

Dear Maurizio,

In my response, I am assuming you meant that the Text-to-Speech (TTS) model changes its output (tone, emphasis, etc.) based on the input text's typology.

As you might have found, a lot of the closed-source / licensed TTS models are aimed at uses such as customer support. For example, I found IBM's customizable model by navigating to a page about "Transforming your call center with conversational AI technology", where one can use markdown-style tags to customize speaking style, emphasis, and tone. IBM's TTS page: <https://cloud.ibm.com/docs/text-to-speech>

A good place to look for non-mainstream work on AI is open-source models. In particular, I like to use Hugging Face <https://huggingface.co/>, a collection of open-source models that are free to use on your own machine. I browsed their TTS models to get a feel for the state of the publicly available resources.

One particular model that caught my eye was Parler-TTS <https://huggingface.co/parler-tts/parler_tts_mini_v0.1>. This model allows you to describe the desired output in natural language. For example, I can provide a letter as the actual text to be read, while also providing a short description of the kind I might give to a voice actor. Here is the paper/research: https://www.text-description-to-speech.com/.

What I particularly like about Hugging Face models is that you can try them out yourself! The Parler-TTS page on Hugging Face has short instructions for getting the model running on your own computer.

Of course, my description is an oversimplification, and the model has the same Jentschian *unheimlich* we have come to expect from AI.
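To make the text-plus-description idea concrete, here is a minimal sketch in Python of how one might keep a separate voice description for each text typology and pair it with the text to be read. The `VoiceStyle` class, the `STYLES_BY_TYPOLOGY` presets, and `build_request` are hypothetical names of my own, not part of Parler-TTS, and the preset wording is illustrative rather than tuned against the model. The sketch only builds the two strings; passing them to the model would follow the instructions on the Parler-TTS page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceStyle:
    """Hypothetical container for speech features a description can mention."""

    speaker: str        # e.g. "A female speaker"
    pitch: str          # e.g. "slightly low-pitched"
    rate: str           # e.g. "slowly"
    reverberation: str  # e.g. "in a small, quiet room"
    audio_quality: str = "very clear audio"

    def describe(self) -> str:
        """Render the features as one natural-language sentence."""
        return (
            f"{self.speaker} with a {self.pitch} voice speaks {self.rate} "
            f"{self.reverberation}, with {self.audio_quality}."
        )


# Illustrative presets only (an assumption): one description per text typology.
STYLES_BY_TYPOLOGY = {
    "letter": VoiceStyle(
        "A female speaker", "slightly low-pitched", "slowly",
        "in a small, quiet room",
    ),
    "news": VoiceStyle(
        "A male speaker", "moderate-pitched", "at a brisk, even pace",
        "in a studio with very little reverberation",
    ),
}


def build_request(text: str, typology: str) -> dict:
    """Pair the text to be read with the description for its typology."""
    style = STYLES_BY_TYPOLOGY[typology]
    return {"prompt": text, "description": style.describe()}


if __name__ == "__main__":
    request = build_request("Dear Maurizio, I hope this finds you well.", "letter")
    print(request["description"])
```

Keeping the description separate from the text means the same letter can be re-read in a "news" voice simply by changing the typology key, which is roughly the differentiation you asked about.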
It is perhaps disingenuous to describe natural language instructions to a model as the same as those one might give to a human. There are some interesting loose strings which a trained deconstructionist might pull on; they take the form of "Tips" on the Parler-TTS Hugging Face page. From that page:

```
- Include the term "very clear audio" to generate the highest quality audio, and "very noisy audio" for high levels of background noise
- Punctuation can be used to control the prosody of the generations, e.g. use commas to add small breaks in speech
- The remaining speech features (gender, speaking rate, pitch and reverberation) can be controlled directly through the prompt
```

I hope this helps!

Clay

On Tue, Apr 16, 2024 at 1:40 AM Humanist <humanist@dhhumanist.org> wrote:

> Humanist Discussion Group, Vol. 37, No. 550.
> Department of Digital Humanities, University of Cologne
> Hosted by DH-Cologne
> www.dhhumanist.org
> Submit to: humanist@dhhumanist.org
>
> Date: 2024-04-15 09:29:46+00:00
> From: maurizio lana <maurizio.lana@uniupo.it>
> Subject: text to speech?
>
> dear all,
>
> can anyone give me an indication of where in the world research and/or
> application work is being done on text-to-speech differentiated by text
> typology? (a letter rather than a news rather than ...)
> in my experience mainstream text to speech systems are not so able from
> this point of view.
> thank you a lot
>
> Maurizio
>
> ------------------------------------------------------------------------
>
> at this point I must make a confession:
> like my friend Erri De Luca, I am an extremist Europeanist.
> This means that, for me, a united Europe is the only reasonable
> political utopia that we Europeans have coined.
> xavier cercas, opening of the Salone del Libro, Turin 2018
>
> ------------------------------------------------------------------------
> Maurizio Lana
> Università del Piemonte Orientale
> Dipartimento di Studi Umanistici
> Piazza Roma 36 - 13100 Vercelli

_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php