Floatp Text-to-Speech Protocol – FTTSP/0.1

Introduction

The protocol described in this document enables a client program (the client) to communicate with a server program (the server) in order to use the server's speech synthesizing subsystem (the synthesizer).

The client issues requests for text to be spoken by the synthesizer. The server continually responds with status messages as the synthesizer progresses. The client may use these status messages to update the screen, for example to highlight the point in the text that the synthesizer has reached.

A Unix domain socket, TCP socket, or similar may be used for message transport.

Packets

The protocol uses a trivial-to-parse, yet "human-readable", scheme for message packing. Only capital letters are used, numbers are hexadecimal unsigned integers, and fields are separated by spaces (20H). The ASCII character code table is used for the interchange, with one exception: the text destined for the synthesizer. Any encoding of the user's choosing may be applied to the text to be spoken by the synthesizer, as long as both client and server support it. For clarity, this document uses the ASCII character code table to exemplify "spoken text".

Packet header

A packet header of four characters defines the total packet size, including the packet header itself. The packet size is encoded as a four-digit hexadecimal unsigned integer.
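To illustrate how the size header lets a receiver frame packets on a stream transport, the following Python sketch reads one packet at a time. It assumes that the packet size counts bytes and that packets follow each other without any delimiter; neither is stated in this document. The name read_packet is hypothetical.

```python
import io

HEADER_LEN = 4  # the header is four hexadecimal digits


def read_packet(stream):
    """Read one packet from a binary stream and return its payload as bytes.

    Assumption: the size in the header counts bytes, header included.
    """
    header = stream.read(HEADER_LEN)
    if len(header) < HEADER_LEN:
        raise EOFError("stream ended inside the packet header")
    size = int(header.decode("ascii"), 16)
    if size < HEADER_LEN:
        raise ValueError("packet size is smaller than the header itself")
    rest = stream.read(size - HEADER_LEN)
    if len(rest) < size - HEADER_LEN:
        raise EOFError("stream ended inside the packet payload")
    # Drop the single space that separates the header from the payload.
    return rest[1:] if rest[:1] == b" " else rest


# An in-memory stream stands in for a socket file object here.
stream = io.BytesIO(b"000E 0002 ABRT")
print(read_packet(stream))  # b'0002 ABRT'
```

With a real socket, a binary file object such as the one returned by sock.makefile("rb") could take the place of the in-memory stream.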

Packet payload

The packet payload follows the packet header, separated by a space. The payload is divided into fields, each one separated by a space.
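The packing scheme can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name pack is hypothetical, and the size is assumed to count bytes.

```python
HEADER_LEN = 4  # four hexadecimal digits


def pack(*fields: bytes) -> bytes:
    """Join payload fields with spaces and prepend the size header."""
    payload = b" ".join(fields)
    # Total size = header + separating space + payload (assumed to be bytes).
    size = HEADER_LEN + 1 + len(payload)
    if size > 0xFFFF:
        raise ValueError("packet too large for a four-digit hexadecimal size")
    return b"%04X " % size + payload
```

For example, pack(b"0002", b"ABRT") yields the 14-byte packet 000E 0002 ABRT.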

Packet header and payload fields

  • Packet Size

    The total size of the packet. An unsigned integer in the range 0..FFFFH, encoded as a string of four hexadecimal digits.

  • Request Serial

    The identity of the request. An unsigned integer in the range 0..FFFFH, encoded as a string of four hexadecimal digits.

  • Request Name

    The name of a requested operation. A string of four characters.

    Name  Description                          Remark
    ABRT  Request current speak to be aborted
    HELO  Request to handshake                 Optional
    SPEK  Request text to be spoken

  • Packet Data

    Any data applicable to the packet type.

  • Response Type

    A string of two characters.

    Name  Description                                  Remark
    OK    Successful request resolution                Terminal
    ER    Request rejected due to an error             Terminal
    EV    An event occurred while serving the request  Progression
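One way to read the Terminal and Progression remarks is that a request may produce any number of EV responses and is concluded by a single OK or ER response. The Python sketch below follows that reading, which is an assumption rather than something this document states outright; the function name and the (type, data) tuples are hypothetical.

```python
TERMINAL = {"OK", "ER"}   # assumed to conclude a request
PROGRESSION = {"EV"}      # assumed to precede the terminal response


def collect_until_terminal(responses):
    """Consume (response_type, data) pairs for one request.

    Returns the terminal response type and the event data seen before it.
    """
    events = []
    for kind, data in responses:
        if kind in PROGRESSION:
            events.append(data)
        elif kind in TERMINAL:
            return kind, events
        else:
            raise ValueError(f"unknown response type: {kind!r}")
    raise EOFError("responses ended before a terminal response")
```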

Client requests

Packets sent from the client to the server.

<Packet Size> <Request Serial> <Request Name>[ <Packet Data>]

ABRT – Abort speak request

A client's request for the server to abort the currently served speak request.

<Packet Size> <Request Serial> ABRT

000E 0002 ABRT

HELO – Handshake request

A client's optional request to handshake with the server. The server reports information about its environment in response (see the ENVMT event).

<Packet Size> <Request Serial> HELO

000E 0001 HELO

SPEK – Speak request

A client's request for a text to be spoken by the server's synthesizer subsystem.

<Packet Size> <Request Serial> SPEK <Text>

0024 0002 SPEK Floatp Text-to-Speech
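The three client requests can be built with one small helper, sketched here in Python. The helper name build_request and its parameters are hypothetical. The sketch assumes that the packet size counts bytes, which matters once the text uses an encoding other than ASCII.

```python
REQUEST_NAMES = ("ABRT", "HELO", "SPEK")


def build_request(serial, name, text=None, encoding="ascii"):
    """Build a client request packet as bytes."""
    if not 0 <= serial <= 0xFFFF:
        raise ValueError("request serial out of range 0..FFFFH")
    if name not in REQUEST_NAMES:
        raise ValueError(f"unknown request name: {name!r}")
    body = ("%04X %s" % (serial, name)).encode("ascii")
    if text is not None:
        # Only the spoken text may use a non-ASCII encoding.
        body += b" " + text.encode(encoding)
    size = 4 + 1 + len(body)  # header + separating space + the rest
    if size > 0xFFFF:
        raise ValueError("packet too large for a four-digit hexadecimal size")
    return ("%04X " % size).encode("ascii") + body
```

Called as build_request(2, "SPEK", "Floatp Text-to-Speech"), the helper reproduces the example packet above.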

Server responses

Packets sent from the server to the client.

<Packet Size> <Request Serial> <Request Name> <Response Type>[ <Data>]

ER – Request error response

A server's response packet of type ER reports that an error occurred while processing a request. An error code is supplied as a 3-digit decimal integer in the packet data field. The server then closes the connection.

<Packet Size> <Request Serial> <Request Name> ER <3-Digit Decimal Integer>

0015 0002 SPEK ER 503

Error status codes resemble HTTP status codes.

Code  Description
400   Bad Request
500   Internal Server Error
503   Service Unavailable
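A client might turn the data field of an ER response into a code and a description roughly as follows. This is a Python sketch under assumptions: the function name is hypothetical, and the fallback text for codes not listed above is an invention of this example.

```python
ERROR_DESCRIPTIONS = {
    400: "Bad Request",
    500: "Internal Server Error",
    503: "Service Unavailable",
}


def parse_error_code(data):
    """Parse the data field of an ER response into (code, description)."""
    if len(data) != 3 or not (data.isascii() and data.isdigit()):
        raise ValueError("error code must be a 3-digit decimal integer")
    code = int(data)  # decimal, unlike the other numbers in the protocol
    # "Unknown error" is a placeholder chosen for this sketch.
    return code, ERROR_DESCRIPTIONS.get(code, "Unknown error")
```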

EV – Request event response

An event packet is a server response of type EV. The event name is a five-character word stored in the first field of the packet data, and any event parameters are stored in the following fields.

  • HELO, ENVMT – Environment information event

    Reports information about the server environment.

    ENVMT
    
    0028 0001 HELO EV ENVMT ENCODING "UTF-8"
    
  • SPEK, ABRTD – Speak abortion event

    Reports that the synthesizer has been stopped due to an abort (ABRT) request.

    ABRTD
    
    0017 0001 SPEK EV ABRTD
    
  • SPEK, FNSHD – Speak finished event

    Reports that the synthesizer is finished speaking.

    FNSHD
    
    0017 0003 SPEK EV FNSHD
    
  • SPEK, PRGRS – Speak progress event

    Reports that progress has been made by the synthesizer.

    The parameters are two unsigned integers, each encoded as a string of four hexadecimal digits. Together they form the range in the text for which this event was emitted: the first number is a character offset marking the beginning of the range, and the second is the character count of the range.

    PRGRS <Offset> <Count>
    
    0021 0001 SPEK EV PRGRS 0004 0005
    
  • SPEK, STRTD – Speak started event

    Reports that the synthesizer has started speaking.

    STRTD
    
    0017 0001 SPEK EV STRTD
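The PRGRS event is what lets a client highlight the text being spoken. The following Python sketch shows one way to apply it. It assumes that the offset is zero-based and counts characters of the decoded text; this document does not say so explicitly. The function names are hypothetical.

```python
def progress_range(event_data):
    """Parse the data of a PRGRS event into (offset, count)."""
    name, *params = event_data.split(" ")
    if name != "PRGRS" or len(params) != 2:
        raise ValueError("not a PRGRS event with two parameters")
    offset, count = (int(p, 16) for p in params)
    return offset, count


def highlight(text, event_data):
    """Return the text with the reported range wrapped in brackets."""
    offset, count = progress_range(event_data)
    end = offset + count
    # Assumption: offset is zero-based and counts characters, not bytes.
    return text[:offset] + "[" + text[offset:end] + "]" + text[end:]
```

Under these assumptions, highlight("Say hello world", "PRGRS 0004 0005") returns "Say [hello] world".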
    

OK – Request success response

A server's response packet of type OK reports the successful resolution of a request.

<Packet Size> <Request Serial> <Request Name> OK

0011 0002 SPEK OK
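The response layout described under "Server responses" can be parsed as in the following Python sketch. The Response class and parse_response function are hypothetical names. The sketch works on ASCII text, so that the character count equals the declared packet size.

```python
from dataclasses import dataclass


@dataclass
class Response:
    serial: int  # Request Serial, decoded from hexadecimal
    name: str    # Request Name, e.g. "SPEK"
    kind: str    # Response Type: "OK", "ER" or "EV"
    data: str    # remaining packet data, empty if there is none


def parse_response(packet):
    """Parse one complete server response packet given as ASCII text."""
    # Raises ValueError if the packet has fewer than four fields.
    size_hex, serial_hex, name, rest = packet.split(" ", 3)
    if int(size_hex, 16) != len(packet):
        raise ValueError("declared size does not match the packet length")
    kind, _, data = rest.partition(" ")
    if kind not in ("OK", "ER", "EV"):
        raise ValueError(f"unknown response type: {kind!r}")
    return Response(int(serial_hex, 16), name, kind, data)
```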

Date: 2019-06-10

Author: Gunnar LingegÄrd
