Amazon Polly
Amazon Polly converts text into lifelike speech in a variety of voices and languages.
See the Amazon Polly Developer Guide for general guidance on using Polly.
API Access
The IPollyClient interface and its implementation TPollyClient provide access to all Polly operations.
Synthesizing Speech
Use SynthesizeSpeech to convert text to audio. The response contains an AudioStream that can be saved to a file:
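uses
  System.Classes,  // TFileStream
  System.SysUtils, // fmCreate
  AWS.Polly;       // assumed name of the SDK unit that declares the Polly types

var
  Client: IPollyClient;
  Request: IPollySynthesizeSpeechRequest;
  Response: IPollySynthesizeSpeechResponse;
  FileStream: TFileStream;
begin
  Client := TPollyClient.Create;

  // Describe what to synthesize and how.
  Request := TPollySynthesizeSpeechRequest.Create;
  Request.Text := 'Hello from Amazon Polly.';
  Request.VoiceId := 'Amy';
  Request.Engine := 'neural';
  Request.OutputFormat := 'mp3';

  Response := Client.SynthesizeSpeech(Request);

  // Write the returned audio to disk.
  FileStream := TFileStream.Create('output.mp3', fmCreate);
  try
    // A Count of 0 copies the whole source stream from its start. Passing it
    // explicitly also compiles on Delphi versions where Count has no default.
    FileStream.CopyFrom(Response.AudioStream, 0);
  finally
    FileStream.Free;
  end;
end.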
Other Operations
- DescribeVoices lists available voices, optionally filtered by language.
- StartSpeechSynthesisTask starts an asynchronous synthesis task that writes output directly to an S3 bucket, useful for longer texts.
- Pronunciation lexicons can be managed with PutLexicon, GetLexicon, ListLexicons, and DeleteLexicon.
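
As an illustration of the first operation in the list, the sketch below lists the voices for one language. Only IPollyClient, TPollyClient, and the DescribeVoices operation name come from this page. The unit name AWS.Polly, the types IPollyDescribeVoicesRequest, TPollyDescribeVoicesRequest, and IPollyDescribeVoicesResponse, and the LanguageCode, Voices, and Name members are assumptions modelled on the naming pattern of the SynthesizeSpeech example and on the Polly service API, so check them against the SDK sources before relying on them.

```pascal
program ListVoices;

{$APPTYPE CONSOLE}

uses
  AWS.Polly; // assumed unit name for the SDK's Polly types

var
  Client: IPollyClient;
  Request: IPollyDescribeVoicesRequest;   // assumed type name
  Response: IPollyDescribeVoicesResponse; // assumed type name
begin
  Client := TPollyClient.Create;

  Request := TPollyDescribeVoicesRequest.Create; // assumed class name
  Request.LanguageCode := 'en-GB';               // assumed property: optional language filter

  Response := Client.DescribeVoices(Request);

  // Voices and Name are assumed members mirroring the service's response shape.
  // The inline variable avoids naming the element type and needs Delphi 10.3 or later.
  for var Voice in Response.Voices do
    WriteLn(Voice.Name);
end.
```

Leaving the language filter unset should return every available voice, which is how the underlying service operation behaves.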