RALCWI Vocoder
CMX608/CMX618/CMX638
5.2. Encoder
The encoder operates on a basic frame of 20ms (160 samples) of audio. One, two, three or four frames may be
collected together and supplied to the host as a single packet. In the case of the 3 and 4 frame packets,
error protection may also be added with the FEC option. The encoder can also detect single tones (STD)
and/or DTMF in the audio stream. If detected, special frames are produced which the decoder will
recognise and deal with accordingly.
The exact rate at which packets are produced is dependent on the accuracy of the CODEC's sample rate.
The nominal rate is every 20ms, or a multiple thereof, depending on the number of frames that make up
the packet. For instance, a packet of 3 frames with FEC will be produced every 60ms. The algorithm used
to encode voice has algorithmic jitter, i.e. it does not take the same amount of time to encode each frame.
Some frames will take longer than others; consequently, the exact time at which a packet becomes available is
not predictable. The encoder will notify the host as soon as a packet becomes available. Over a period of
time the average rate will be every 20ms (or a multiple thereof) according to the CODEC's sample rate.
Once a packet of data becomes available, the host may read it straight away, or it can wait for a period of
time. The packet will remain available until the next one is produced.
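The nominal packet timing described above can be worked through in a short sketch. The function name packet_timing and its arguments are illustrative only and are not part of any device interface; the figures (20ms, 160 samples, FEC restricted to 3 and 4 frame packets) are taken from the text.

```python
FRAME_MS = 20            # basic frame duration, from the text
SAMPLES_PER_FRAME = 160  # audio samples per frame, from the text


def packet_timing(frames_per_packet, fec=False):
    """Return (nominal interval in ms, audio samples covered) for one packet.

    Hypothetical helper for illustration; it only restates the arithmetic
    in the text and does not talk to the device.
    """
    if frames_per_packet not in (1, 2, 3, 4):
        raise ValueError("a packet holds 1, 2, 3 or 4 frames")
    if fec and frames_per_packet not in (3, 4):
        raise ValueError("FEC applies only to 3 and 4 frame packets")
    return (frames_per_packet * FRAME_MS,
            frames_per_packet * SAMPLES_PER_FRAME)


if __name__ == "__main__":
    # A packet of 3 frames with FEC is produced nominally every 60ms.
    print(packet_timing(3, fec=True))
```

The interval returned is the long-term average only; because of the algorithmic jitter, individual packets may become available slightly earlier or later relative to one another.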
5.2.1. Single Frame Packet, without FEC, STD or DTMF
This is the simplest and most basic configuration. The encoder will produce a one-frame raw Vocoder
packet every 20ms. Once the encode instruction is given, the device will collect 20ms (160 samples) worth
of audio. These 160 samples will be given to the encoder to process. The processing of these samples will
take no more than 15ms, therefore the first packet of data will be available no later than 35ms after the
device was instructed to encode.
There are two basic strategies that can be adopted for servicing the encoder:
Event-driven
The host may use the C-BUS interrupt, IRQN, or poll the STATUS register, then read the Vocoder packet
as soon as it becomes available. This is signified by bit 0 (VDA) of the STATUS register being set to '1'.
The host may then choose to hold the packet in a buffer until the correct time to process it arrives. In the
case of a voice recorder, the packet could be put into a storage device immediately. In the case of some
sort of transmission (radio or network), the packet may be held until the correct time-slot arrives.
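A minimal sketch of the polled form of the event-driven strategy is given below. Only the meaning of STATUS bit 0 (VDA) comes from the text. The callables read_status and read_packet stand in for whatever C-BUS access routines the host provides, and FakeVocoder is a purely hypothetical stand-in used to exercise the loop. The sketch also assumes that reading the packet clears VDA, which should be checked against the register description.

```python
import collections

VDA_MASK = 0x01  # STATUS bit 0 (VDA): a Vocoder packet is available


def service_encoder(read_status, read_packet, buffer, max_polls):
    """Poll STATUS up to max_polls times; queue a packet whenever VDA is set.

    read_status and read_packet are host-supplied callables (hypothetical
    names). Returns the number of packets placed in the buffer.
    """
    packets_read = 0
    for _ in range(max_polls):
        if read_status() & VDA_MASK:
            # Hold the packet until the host is ready to store or transmit it.
            buffer.append(read_packet())
            packets_read += 1
    return packets_read


class FakeVocoder:
    """Stand-in device: a packet becomes ready on every third status poll."""

    def __init__(self):
        self._polls = 0
        self._ready = False
        self._count = 0

    def read_status(self):
        self._polls += 1
        if self._polls % 3 == 0:
            self._ready = True
        return VDA_MASK if self._ready else 0x00

    def read_packet(self):
        self._ready = False  # assumption: reading the packet clears VDA
        self._count += 1
        return bytes([self._count])


if __name__ == "__main__":
    dev = FakeVocoder()
    held = collections.deque()
    service_encoder(dev.read_status, dev.read_packet, held, max_polls=9)
    print(list(held))
```

In an interrupt-driven host, the body of the loop would instead run in response to IRQN; the buffer plays the same role in either case, decoupling the moment a packet is read from the moment it is stored or transmitted.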
Timed
Assuming the host has an accurate 20ms timer derived from the same master clock as that supplied to the
audio CODEC (this could be the Vocoder device, or an external CODEC), wait for a timer event and then
instruct the device to encode. Wait for two more timer events, then read the first Vocoder packet. For
every subsequent timer event, read another Vocoder packet. Figure 8 shows the sequence of events.
Figure 8 Single Frame Packet Encoding
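The timed sequence can also be laid out numerically, which shows why the first read happens two timer events after the encode instruction rather than one. The sketch below is illustrative only: the function names are hypothetical, times are measured from the timer event at which the device is instructed to encode, and it assumes the 15ms processing bound applies to every frame, not only the first.

```python
FRAME_MS = 20         # timer period and frame duration, from the text
MAX_PROCESS_MS = 15   # upper bound on encoding time, from the text
FIRST_READ_TICK = 2   # "wait for two more timer events"


def timed_schedule(num_packets):
    """Return (time in ms, action) pairs for the timed strategy."""
    events = [(0, "encode")]
    for k in range(num_packets):
        tick = FIRST_READ_TICK + k
        events.append((tick * FRAME_MS, f"read packet {k}"))
    return events


def worst_case_ready_ms(k):
    """Latest time packet k can become available (k counts from 0).

    Frame k finishes being collected at (k + 1) * 20ms, and encoding is
    assumed to take no more than 15ms after that.
    """
    return (k + 1) * FRAME_MS + MAX_PROCESS_MS


if __name__ == "__main__":
    for time_ms, action in timed_schedule(3):
        print(time_ms, action)
    # Margin between the worst-case ready time and the scheduled read.
    print(FIRST_READ_TICK * FRAME_MS - worst_case_ready_ms(0))
```

Under these assumptions each read falls 5ms after the latest time the corresponding packet can appear. Reading at the first timer event after the encode instruction (20ms) would be too early, since the first packet may take up to 35ms to become available.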
2014 CML Microsystems Plc
D/608_18_38/11