Regular price £173.99 GBP
Free UK Shipping

Freshly Printed - allow 7 days lead time

Improvements in Speech Synthesis
COST 258: The Naturalness of Synthetic Speech

E. Keller (Edited by), E. Keller (Author), G. Bailly (Edited by), A. Monaghan (Edited by), J. Terken (Edited by), M. Huckvale (Edited by)

9780471499855, Wiley

Hardback, published 12 October 2001

416 pages
25 x 17.2 x 2.8 cm, 0.822 kg

Naturalness in synthetic speech is one of the most intractable problems in information technology today.

Although speech synthesis systems have improved considerably over the last 20 years, they rarely sound entirely like human speakers.

Why is this so, and what can be done about it?

  • Prosodic processing must be rendered more varied and more appropriate to the speech situation
  • Timing, melodic control and the relationships between the various prosodic parameters need increased attention
  • Signal processing systems must be developed and perfected that are capable of generating more than just one voice from a database
  • A better understanding must be achieved of what distinguishes one voice from another, and of how speech styles differ between simply reading aloud numbers and sentences and their use in interactive speech
  • New evaluation methodologies should be developed to provide objective and subjective measurements of the intelligibility of the synthetic speech and the cognitive load imposed upon the listener by impoverished stimuli
  • Adequate text markup systems must be proposed and tested with multiple languages in real-world situations
  • Further research is required to integrate speech synthesis systems into larger natural-language processing systems

Improvements in Speech Synthesis presents the latest research in the above areas. Contributors include speech synthesis specialists from 16 countries, with experience in the development of systems for 12 European languages. This volume emerges from a four-year European COST project focused on "The Naturalness of Synthetic Speech", and will be a valuable text for everyone involved in speech synthesis.

List of Contributors ix

Preface xiii

Part I Issues in Signal Generation 1

1 Towards Greater Naturalness: Future Directions of Research in Speech Synthesis 3
Eric Keller

2 Towards More Versatile Signal Generation Systems 18
Gerard Bailly

3 A Parametric Harmonic + Noise Model 22
Gerard Bailly

4 The COST 258 Signal Generation Test Array 39
Gerard Bailly

5 Concatenative Text-to-Speech Synthesis Based on Sinusoidal Modelling 52
Eduardo Rodriguez Banga, Carmen Garcia Mateo and Xavier Fernandez Salgado

6 Shape Invariant Pitch and Time-Scale Modification of Speech Based on a Harmonic Model 64
Darragh O'Brien and Alex Monaghan

7 Concatenative Speech Synthesis Using SRELP 76
Erhard Rank

Part II Issues in Prosody 87

8 Prosody in Synthetic Speech: Problems, Solutions and Challenges 89
Alex Monaghan

9 State-of-the-Art Summary of European Synthetic Prosody R&D 93
Alex Monaghan

10 Modelling F0 in Various Romance Languages: Implementation in Some TTS Systems 104
Philippe Martin

11 Acoustic Characterisation of the Tonic Syllable in Portuguese 120
Joao Paulo Ramos Teixeira and Diamantino R.S. Freitas

12 Prosodic Parameters of Synthetic Czech: Developing Rules for Duration and Intensity 129
Marie Dohalska, Jana Mejvaldova and Tomas Dubeda

13 MFGI, a Linguistically Motivated Quantitative Model of German Prosody 134
Hansjorg Mixdorff

14 Improvements in Modelling the F0 Contour for Different Types of Intonation Units in Slovene 144
Ales Dobnikar

15 Representing Speech Rhythm 154
Brigitte Zellner Keller and Eric Keller

16 Phonetic and Timing Considerations in a Swiss High German TTS System 165
Beat Siebenhaar, Brigitte Zellner Keller and Eric Keller

17 Corpus-based Development of Prosodic Models Across Six Languages 176
Justin Fackrell, Halewijn Vereecken, Cynthia Grover, Jean-Pierre Martens and Bert Van Coile

18 Vowel Reduction in German Read Speech 186
Christina Widera

Part III Issues in Styles of Speech 197

19 Variability and Speaking Styles in Speech Synthesis 199
Jacques Terken

20 An Auditory Analysis of the Prosody of Fast and Slow Speech Styles in English, Dutch and German 204
Alex Monaghan

21 Automatic Prosody Modelling of Galician and its Application to Spanish 218
Eduardo Lopez Gonzalo, Juan M. Villar Navarro and Luis A. Hernandez Gomez

22 Reduction and Assimilatory Processes in Conversational French Speech: Implications for Speech Synthesis 228
Danielle Duez

23 Acoustic Patterns of Emotions 237
Branka Zei Pollermann and Marc Archinard

24 The Role of Pitch and Tempo in Spanish Emotional Speech: Towards Concatenative Synthesis 246
Juan Manuel Montero Martinez, Juana M. Gutierrez Arriola, Ricardo de Cordoba Herralde, Emilia Victoria Enriquez Carrasco and Jose Manuel Pardo Munoz

25 Voice Quality and the Synthesis of Affect 252
Ailbhe Ni Chasaide and Christer Gobl

26 Prosodic Parameters of a 'Fun' Speaking Style 264
Kjell Gustafson and David House

27 Dynamics of the Glottal Source Signal: Implications for Naturalness in Speech Synthesis 273
Christer Gobl and Ailbhe Ni Chasaide

28 A Nonlinear Rhythmic Component in Various Styles of Speech 284
Brigitte Zellner Keller and Eric Keller

Part IV Issues in Segmentation and Mark-up 293

29 Issues in Segmentation and Mark-up 295
Mark Huckvale

30 The Use and Potential of Extensible Mark-up (XML) in Speech Generation 297
Mark Huckvale

31 Mark-up for Speech Synthesis: A Review and Some Suggestions 307
Alex Monaghan

32 Automatic Analysis of Prosody for Multi-lingual Speech Corpora 320
Daniel Hirst

33 Automatic Speech Segmentation Based on Alignment with a Text-to-Speech System 328
Petr Horak

34 Using the COST 249 Reference Speech Recogniser for Automatic Speech Segmentation 339
Narada D. Warakagoda and Jon E. Natvig

Part V Future Challenges 349

35 Future Challenges 351
Eric Keller

36 Towards Naturalness, or the Challenge of Subjectiveness 353
Genevieve Caelen-Haumont

37 Synthesis Within Multi-Modal Systems 363
Andrew Breen

38 A Multi-Modal Speech Synthesis Tool Applied to Audio-Visual Prosody 372
Jonas Beskow, Bjorn Granstrom and David House

39 Interface Design for Speech Synthesis Systems 383
Gudrun Flach

Index 391

Subject Areas: Electronics & communications engineering [TJ]
