Removal of Spectral Discontinuity in Concatenated Speech Waveform

Deepika Singh; Parminder Singh

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 53 - Number 16

Year of Publication: 2012

Authors: Deepika Singh, Parminder Singh

DOI: 10.5120/8504-2250

Deepika Singh, Parminder Singh. Removal of Spectral Discontinuity in Concatenated Speech Waveform. International Journal of Computer Applications. 53, 16 (September 2012), 9-12. DOI=10.5120/8504-2250

@article{10.5120/8504-2250,
  author = {Deepika Singh and Parminder Singh},
  title = {Removal of Spectral Discontinuity in Concatenated Speech Waveform},
  journal = {International Journal of Computer Applications},
  issue_date = {September 2012},
  volume = {53},
  number = {16},
  month = {September},
  year = {2012},
  issn = {0975-8887},
  pages = {9-12},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume53/number16/8504-2250/},
  doi = {10.5120/8504-2250},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:54:15.173379+05:30
%A Deepika Singh
%A Parminder Singh
%T Removal of Spectral Discontinuity in Concatenated Speech Waveform
%J International Journal of Computer Applications
%@ 0975-8887
%V 53
%N 16
%P 9-12
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Speech synthesis systems that concatenate recorded speech units are currently very popular. They are known for producing high-quality, natural-sounding speech because they generate output by joining the waveforms of different speech units, and the approach is quite practical. However, the units being concatenated may have different spectra on either side of a concatenation point. Such spectral mismatches give rise to spectral discontinuity in the concatenated speech waveform. These discontinuities can be very distracting to the listener and degrade the overall quality of the output speech. This paper proposes a speech signal processing technique that addresses spectral discontinuity in concatenated waveform synthesis by post-processing the synthesized speech waveform in the time domain. The technique was implemented on single-channel Punjabi wave audio files created by concatenating different Punjabi syllables. A listening test conducted to evaluate the technique indicated that spectral discontinuity is reduced to a large extent and that the output speech sounds more natural, with less audible noise.
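The abstract does not specify the post-processing algorithm, so the following is only a minimal sketch of one generic time-domain approach: a linear crossfade across each concatenation point. It should not be read as the authors' technique. The function names (crossfade_concat, max_jump, tone), the 16 kHz sample rate, the overlap length, and the synthetic sine "units" are all assumptions made for illustration.

```python
import math


def crossfade_concat(units, overlap):
    """Join sample sequences, blending up to `overlap` samples at each boundary.

    Hypothetical helper: a linear crossfade is assumed here as a generic
    time-domain smoothing, not the method proposed in the paper.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            # Weight of the incoming unit rises linearly across the overlap.
            w = (i + 1) / (n + 1)
            out[start + i] = (1.0 - w) * out[start + i] + w * unit[i]
        # Append the part of the incoming unit that was not blended.
        out.extend(unit[n:])
    return out


def max_jump(samples):
    """Largest absolute difference between neighbouring samples."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


def tone(freq_hz, n, rate=16000, phase=0.0):
    """Synthetic stand-in for a recorded speech unit (assumed 16 kHz)."""
    return [math.sin(2 * math.pi * freq_hz * t / rate + phase) for t in range(n)]


# Two units whose waveforms do not line up at the join.
a = tone(200, 400)
b = tone(300, 400, phase=math.pi / 2)

plain = a + b                                    # abrupt concatenation
smoothed = crossfade_concat([a, b], overlap=80)  # blended join

print("largest sample jump, plain:   ", round(max_jump(plain), 3))
print("largest sample jump, smoothed:", round(max_jump(smoothed), 3))
```

In this sketch the abrupt join produces a sample-to-sample jump far larger than anything inside either unit, while the blended join does not. The crossfade shortens the output by the overlap length, and a sample-level jump is only a rough proxy for the spectral mismatch the paper targets, so a listening test like the one described in the abstract would still be needed to judge audible quality.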

Index Terms

Computer Science

Information Sciences

Keywords

Speech waveform, Concatenative speech synthesis, Spectral discontinuity