1 #LyX 1.3 created this file. For more info see http://www.lyx.org/
2 \lyxformat 221
3 \textclass article
4 \language english
5 \inputencoding auto
6 \fontscheme default
7 \graphics default
8 \float_placement h
9 \paperfontsize default
10 \spacing single 
11 \papersize Default
12 \paperpackage a4
13 \use_geometry 0
14 \use_amsmath 0
15 \use_natbib 0
16 \use_numerical_citations 0
17 \paperorientation portrait
18 \secnumdepth 3
19 \tocdepth 3
20 \paragraph_separation indent
21 \defskip medskip
22 \quotes_language english
23 \quotes_times 2
24 \papercolumns 1
25 \papersides 1
26 \paperpagestyle headings
27
28 \layout Title
29
30 The Speex Codec Manual
31 \newline 
32 (version 1.0.0)
33 \layout Author
34
35 Jean-Marc Valin
36 \layout Standard
37 \pagebreak_top 
38 Copyright (c) 2002-2003 Jean-Marc Valin.
39 \layout Standard
40
41 Permission is granted to copy, distribute and/or modify this document under
42  the terms of the GNU Free Documentation License, Version 1.1 or any later
43  version published by the Free Software Foundation; with no Invariant Section,
44  with no Front-Cover Texts, and with no Back-Cover Texts.
45  A copy of the license is included in the section entitled
46  "GNU Free Documentation License".
47  
48 \layout Standard
49 \pagebreak_top \pagebreak_bottom 
50
51 \begin_inset LatexCommand \tableofcontents{}
52
53 \end_inset 
54
55
56 \layout Standard
57 \pagebreak_bottom 
58
59 \begin_inset FloatList table
60
61 \end_inset 
62
63
64 \layout Section
65
66 Introduction to Speex
67 \layout Standard
68
69 The Speex project (
70 \family typewriter 
71 http://www.speex.org/
72 \family default 
73 ) was started because there was a need for a speech codec that was
74  open-source and free from software patents.
75  These are essential conditions for being used by any open-source software.
76  There is already Vorbis that does general audio, but it is not really suitable
77  for speech.
78  Also, unlike many other speech codecs, Speex is not targeted at cell phones
79  (not many open-source cell phones anyway :-) ) but rather voice over IP
80  (VoIP) and file-based compression.
81  
82 \layout Standard
83
84 As design goals, we wanted to have a codec that would allow both very
85  good quality speech and low bit-rate (unfortunately not at the same time!),
86  which led us to develop a codec with multiple bit-rates.
87  Of course very good quality also meant we had to do wideband (16 kHz sampling
88  rate) in addition to narrowband (telephone quality, 8 kHz sampling rate).
89 \layout Standard
90
91 Designing for VoIP instead of cell phone use means that Speex must be robust
92  to lost packets, but not to corrupted ones since packets either arrive
93  unaltered or don't arrive at all.
94  Also, the idea was to have a reasonable complexity and memory requirement
95  without compromising too much on the efficiency of the codec.
96 \layout Standard
97
98 All this led us to the choice of CELP
99 \begin_inset LatexCommand \index{CELP}
100
101 \end_inset 
102
103  as the encoding technique to use for Speex.
104  One of the main reasons is that CELP has long proved that it could do the
105  job and scale well to both low bit-rates (think DoD CELP @ 4.8 kbps) and
106  high bit-rates (think G.728 @ 16 kbps).
107  
108 \layout Standard
109
110 The main characteristics can be summarized as follows:
111 \layout Itemize
112
113 Free software/open-source
114 \begin_inset LatexCommand \index{open-source}
115
116 \end_inset 
117
118 , patent
119 \begin_inset LatexCommand \index{patent}
120
121 \end_inset 
122
123  and royalty-free
124 \layout Itemize
125
126 Integration of narrowband
127 \begin_inset LatexCommand \index{narrowband}
128
129 \end_inset 
130
131  and wideband
132 \begin_inset LatexCommand \index{wideband}
133
134 \end_inset 
135
136  in the same bit-stream
137 \layout Itemize
138
139 Wide range of bit-rates available (from 2 kbps to 44 kbps)
140 \layout Itemize
141
142 Dynamic bit-rate switching and Variable Bit-Rate
143 \begin_inset LatexCommand \index{variable bit-rate}
144
145 \end_inset 
146
147  (VBR)
148 \layout Itemize
149
150 Voice Activity Detection
151 \begin_inset LatexCommand \index{voice activity detection}
152
153 \end_inset 
154
155  (VAD, integrated with VBR)
156 \layout Itemize
157
158 Variable complexity
159 \begin_inset LatexCommand \index{complexity}
160
161 \end_inset 
162
163
164 \layout Itemize
165
166 Ultra-wideband mode at 32 kHz (up to 48 kHz)
167 \layout Itemize
168
169 Intensity stereo encoding option
170 \layout Section
171 \pagebreak_top 
172 Feature description
173 \layout Standard
174
175 This section explains the main Speex features, as well as some concepts
176  in speech coding that help better understand the next sections.
177  
178 \layout Subsection*
179
180 Sampling rate
181 \begin_inset LatexCommand \index{sampling rate}
182
183 \end_inset 
184
185
186 \layout Standard
187
188 Speex is mainly designed for 3 different sampling rates: 8 kHz, 16 kHz,
189  and 32 kHz.
190  These are respectively referred to as narrowband
191 \begin_inset LatexCommand \index{narrowband}
192
193 \end_inset 
194
195 , wideband
196 \begin_inset LatexCommand \index{wideband}
197
198 \end_inset 
199
200  and ultra-wideband
201 \begin_inset LatexCommand \index{ultra-wideband}
202
203 \end_inset 
204
205 .
206  
207 \layout Subsection*
208
209 Quality
210 \begin_inset LatexCommand \index{quality}
211
212 \end_inset 
213
214
215 \layout Standard
216
217 Speex encoding is controlled most of the time by a quality parameter that
218  ranges from 0 to 10.
219  In constant bit-rate
220 \begin_inset LatexCommand \index{constant bit-rate}
221
222 \end_inset 
223
224  (CBR) operation, the quality parameter is an integer, while for variable
225  bit-rate (VBR), the parameter is a float.
226  
227 \layout Subsection*
228
229 Complexity
230 \begin_inset LatexCommand \index{complexity}
231
232 \end_inset 
233
234  (variable)
235 \layout Standard
236
237 With Speex, it is possible to vary the complexity allowed for the encoder.
238  This is done by controlling how the search is performed with an integer
239  ranging from 1 to 10 in a way that's similar to the -1 to -9 options to
240  
241 \emph on 
242 gzip
243 \emph default 
244  and 
245 \emph on 
246 bzip2
247 \emph default 
248  compression utilities.
249  For normal use, the noise level at complexity 1 is between 1 and 2 dB higher
250  than at complexity 10, but the CPU requirements for complexity 10 are about
251  5 times higher than for complexity 1.
252  In practice, the best trade-off is between complexity 2 and 4, though higher
253  settings are often useful when encoding non-speech sounds like DTMF
254 \begin_inset LatexCommand \index{DTMF}
255
256 \end_inset 
257
258  tones.
259 \layout Subsection*
260
261 Variable Bit-Rate
262 \begin_inset LatexCommand \index{variable bit-rate}
263
264 \end_inset 
265
266  (VBR)
267 \layout Standard
268
269 Variable bit-rate (VBR) allows a codec to change its bit-rate dynamically
270  to adapt to the 
271 \begin_inset Quotes eld
272 \end_inset 
273
274 difficulty
275 \begin_inset Quotes erd
276 \end_inset 
277
278  of the audio being encoded.
279  In the example of Speex, sounds like vowels and high-energy transients
280  require a higher bit-rate to achieve good quality, while fricatives (e.g.
281  s and f sounds) can be coded adequately with fewer bits.
282  For this reason, VBR can achieve a lower bit-rate for the same quality, or
283  a better quality for a certain bit-rate.
284  Despite its advantages, VBR has two main drawbacks: first, by only specifying
285  quality, there's no guarantee about the final average bit-rate.
286  Second, for some real-time applications like voice over IP (VoIP), what
287  counts is the maximum bit-rate, which must be low enough for the communication
288  channel.
289 \layout Subsection*
290
291 Average Bit-Rate
292 \begin_inset LatexCommand \index{average bit-rate}
293
294 \end_inset 
295
296  (ABR)
297 \layout Standard
298
299 Average bit-rate solves one of the problems of VBR, as it dynamically adjusts
300  VBR quality in order to meet a specific target bit-rate.
301  Because the quality/bit-rate is adjusted in real-time (open-loop), the
302  global quality will be slightly lower than that obtained by encoding in
303  VBR with exactly the right quality setting to meet the target average bit-rate.
304 \layout Subsection*
305
306 Voice Activity Detection
307 \begin_inset LatexCommand \index{voice activity detection}
308
309 \end_inset 
310
311  (VAD)
312 \layout Standard
313
314 When enabled, voice activity detection detects whether the audio being encoded
315  is speech or silence/background noise.
316  VAD is always implicitly activated when encoding in VBR, so the option
317  is only useful in non-VBR operation.
318  In this case, Speex detects non-speech periods and encodes them with just
319  enough bits to reproduce the background noise.
320  This is called 
321 \begin_inset Quotes eld
322 \end_inset 
323
324 comfort noise generation
325 \begin_inset Quotes erd
326 \end_inset 
327
328  (CNG).
329 \layout Subsection*
330
331 Discontinuous Transmission
332 \begin_inset LatexCommand \index{discontinuous transmission}
333
334 \end_inset 
335
336  (DTX)
337 \layout Standard
338
339 Discontinuous transmission is an addition to VAD operation that allows the
340  encoder to stop transmitting completely when the background noise is stationary.
341  In file-based operation, since we cannot just stop writing to the file,
342  only 5 bits are used for such frames (corresponding to 250 bps).
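\layout Standard

The 250 bps figure follows from the frame duration. The sketch below works through that arithmetic, assuming 20 ms frames (this section does not state the frame duration). The helper name is hypothetical and is not part of libspeex.

```c
#include <assert.h>

/* Bit-rate (in bps) of a stream that spends bits_per_frame bits on every
 * frame lasting frame_ms milliseconds.
 * Hypothetical helper for illustration; not a libspeex function. */
int frame_bits_to_bps(int bits_per_frame, int frame_ms)
{
    assert(frame_ms > 0);
    return bits_per_frame * 1000 / frame_ms;
}
```

\layout Standard

Under the 20 ms assumption, 5 bits per frame means 50 frames per second times 5 bits, which is the 250 bps quoted above.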
343 \layout Subsection*
344
345 Perceptual enhancement
346 \begin_inset LatexCommand \index{perceptual enhancement}
347
348 \end_inset 
349
350
351 \layout Standard
352
353 Perceptual enhancement is a part of the decoder which, when turned on, tries
354  to reduce (the perception of) the noise produced by the coding/decoding
355  process.
356  In most cases, perceptual enhancement makes the sound further from the original
357  
358 \emph on 
359 objectively
360 \emph default 
361  (if you use SNR), but in the end it still 
362 \emph on 
363 sounds
364 \emph default 
365  better (subjective improvement).
366 \layout Subsection*
367
368 Algorithmic delay
369 \begin_inset LatexCommand \index{algorithmic delay}
370
371 \end_inset 
372
373
374 \layout Standard
375
376 Every speech codec introduces a delay in the transmission.
377  For Speex, this delay is equal to the frame size, plus some amount of 
378 \begin_inset Quotes eld
379 \end_inset 
380
381 look-ahead
382 \begin_inset Quotes erd
383 \end_inset 
384
385  required to process each frame.
386  In narrowband operation (8 kHz), the delay is 30 ms, while for wideband
387  (16 kHz), the delay is 34 ms.
388  These values don't account for the CPU time it takes to encode or decode
389  the frames.
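\layout Standard

As a rough illustration of how the delay splits into frame size and look-ahead, the sketch below derives the look-ahead from the totals given above. It assumes 20 ms frames in both modes, which this section does not state, and both helper names are hypothetical.

```c
#include <assert.h>

/* Look-ahead (in ms) implied by a total algorithmic delay and a frame
 * duration. Hypothetical helper; not a libspeex function. */
int lookahead_ms(int total_delay_ms, int frame_ms)
{
    assert(total_delay_ms >= frame_ms);
    return total_delay_ms - frame_ms;
}

/* Convert a duration in milliseconds to a sample count at the given
 * sampling rate in Hz. */
int ms_to_samples(int duration_ms, int sampling_rate_hz)
{
    return duration_ms * sampling_rate_hz / 1000;
}
```

\layout Standard

Under that assumption, the 30 ms narrowband delay leaves 10 ms of look-ahead (80 samples at 8 kHz), and the 34 ms wideband delay leaves 14 ms (224 samples at 16 kHz).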
390 \layout Section
391 \pagebreak_top 
392 Command-line encoder/decoder
393 \begin_inset LatexCommand \label{sec:Command-line-encoder/decoder}
394
395 \end_inset 
396
397
398 \layout Standard
399
400 The base Speex distribution includes a command-line encoder (
401 \emph on 
402 speexenc
403 \emph default 
404 ) and decoder (
405 \emph on 
406 speexdec
407 \emph default 
408 ).
409  This section describes how to use these tools.
410 \layout Subsection
411
412
413 \emph on 
414 speexenc
415 \begin_inset LatexCommand \index{speexenc}
416
417 \end_inset 
418
419
420 \layout Standard
421
422 The 
423 \emph on 
424 speexenc
425 \emph default 
426  utility is used to create Speex files from raw PCM or wave files.
427  It can be used by calling: 
428 \layout LyX-Code
429
430 speexenc [options] input_file output_file
431 \layout Standard
432
433 The value '-' for input_file or output_file corresponds respectively to
434  stdin and stdout.
435  The valid options are:
436 \layout Description
437
438 --narrowband\SpecialChar ~
439 (-n) Tell Speex to treat the input as narrowband (8 kHz).
440  This is the default
441 \layout Description
442
443 --wideband\SpecialChar ~
444 (-w) Tell Speex to treat the input as wideband (16 kHz)
445 \layout Description
446
447 --ultra-wideband\SpecialChar ~
448 (-u) Tell Speex to treat the input as 
449 \begin_inset Quotes eld
450 \end_inset 
451
452 ultra-wideband
453 \begin_inset Quotes erd
454 \end_inset 
455
456  (32 kHz)
457 \layout Description
458
459 --quality\SpecialChar ~
460 n Set the encoding quality (0-10), default is 8
461 \layout Description
462
463 --bitrate\SpecialChar ~
464 n Encoding bit-rate (use bit-rate n or lower) 
465 \layout Description
466
467 --vbr Enable VBR (Variable Bit-Rate), disabled by default
468 \layout Description
469
470 --abr\SpecialChar ~
471 n Enable ABR (Average Bit-Rate) at n kbps, disabled by default
472 \layout Description
473
474 --vad Enable VAD (Voice Activity Detection), disabled by default
475 \layout Description
476
477 --dtx Enable DTX (Discontinuous Transmission), disabled by default
478 \layout Description
479
480 --nframes\SpecialChar ~
481 n Pack n frames in each Ogg packet (this saves space at low bit-rates)
482 \layout Description
483
484 --comp\SpecialChar ~
485 n Set encoding speed/quality tradeoff.
486  The higher the value of n, the slower the encoding (default is 3)
487 \layout Description
488
489 -V Verbose operation, print bit-rate currently in use
490 \layout Description
491
492 --help\SpecialChar ~
493 (-h) Print the help
494 \layout Description
495
496 --version\SpecialChar ~
497 (-v) Print version information
498 \layout Subsubsection*
499
500 Speex comments
501 \layout Description
502
503 --comment Add the given string as an extra comment.
504  This may be used multiple times.
505  
506 \layout Description
507
508 --author Author of this track.
509  
510 \layout Description
511
512 --title Title for this track.
513  
514 \layout Subsubsection*
515
516 Raw input options
517 \layout Description
518
519 --rate\SpecialChar ~
520 n Sampling rate for raw input
521 \layout Description
522
523 --stereo Consider raw input as stereo 
524 \layout Description
525
526 --le Raw input is little-endian 
527 \layout Description
528
529 --be Raw input is big-endian 
530 \layout Description
531
532 --8bit Raw input is 8-bit unsigned 
533 \layout Description
534
535 --16bit Raw input is 16-bit signed 
536 \layout Subsection
537
538
539 \emph on 
540 speexdec
541 \begin_inset LatexCommand \index{speexdec}
542
543 \end_inset 
544
545
546 \layout Standard
547
548 The 
549 \emph on 
550 speexdec
551 \emph default 
552  utility is used to decode Speex files and can be used by calling: 
553 \layout LyX-Code
554
555 speexdec [options] speex_file [output_file]
556 \layout Standard
557
558 The value '-' for speex_file or output_file corresponds respectively to
559  stdin and stdout.
560  Also, when no output_file is specified, the file is played to the soundcard.
561  The valid options are:
562 \layout Description
563
564 --enh Enable post-filter (default)
565 \layout Description
566
567 --no-enh Disable post-filter
568 \layout Description
569
570 --force-nb Force decoding in narrowband 
571 \layout Description
572
573 --force-wb Force decoding in wideband 
574 \layout Description
575
576 --force-uwb Force decoding in ultra-wideband 
577 \layout Description
578
579 --mono Force decoding in mono 
580 \layout Description
581
582 --stereo Force decoding in stereo 
583 \layout Description
584
585 --rate\SpecialChar ~
586 n Force decoding at n Hz sampling rate
587 \layout Description
588
589 --packet-loss\SpecialChar ~
590 n Simulate n % random packet loss
591 \layout Description
592
593 -V Verbose operation, print bit-rate currently in use
594 \layout Description
595
596 --help\SpecialChar ~
597 (-h) Print the help
598 \layout Description
599
600 --version\SpecialChar ~
601 (-v) Print version information
602 \layout Section
603 \pagebreak_top 
604 Programming with Speex (the libspeex
605 \begin_inset LatexCommand \index{libspeex}
606
607 \end_inset 
608
609  API
610 \begin_inset LatexCommand \index{API}
611
612 \end_inset 
613
614 )
615 \layout Subsection
616
617 Encoding
618 \layout Standard
619
620 In order to encode speech using Speex, you first need to:
621 \layout LyX-Code
622
623 #include <speex.h>
624 \layout Standard
625
626 You then need to declare a Speex bit-packing struct
627 \layout LyX-Code
628
629 SpeexBits bits;
630 \layout Standard
631
632 and a Speex encoder state
633 \layout LyX-Code
634
635 void *enc_state;
636 \layout Standard
637
638 The two are initialized by:
639 \layout LyX-Code
640
641 speex_bits_init(&bits);
642 \layout LyX-Code
643
644 enc_state = speex_encoder_init(&speex_nb_mode);
645 \layout Standard
646
647 For wideband coding, 
648 \emph on 
649 speex_nb_mode
650 \emph default 
651  will be replaced by 
652 \emph on 
653 speex_wb_mode
654 \emph default 
655 .
656  In most cases, you will need to know the frame size used by the mode you
657  are using.
658  You can get that value in the 
659 \emph on 
660 frame_size
661 \emph default 
662  variable with:
663 \layout LyX-Code
664
665 speex_encoder_ctl(enc_state,SPEEX_GET_FRAME_SIZE,&frame_size);
666 \layout Standard
667
668 Once the initialization is done, for every input frame:
669 \layout LyX-Code
670
671 speex_bits_reset(&bits);
672 \layout LyX-Code
673
674 speex_encode(enc_state, input_frame, &bits);
675 \layout LyX-Code
676
677 nbBytes = speex_bits_write(&bits, byte_ptr, MAX_NB_BYTES);
678 \layout Standard
679
680 where 
681 \emph on 
682 input_frame
683 \emph default 
684  is a 
685 \emph on 
686 (float *)
687 \emph default 
688  pointing to the beginning of a speech frame, 
689 \emph on 
690 byte_ptr
691 \emph default 
692  is a 
693 \emph on 
694 (char *)
695 \emph default 
696  where the encoded frame will be written, 
697 \emph on 
698 MAX_NB_BYTES
699 \emph default 
700  is the maximum number of bytes that can be written to 
701 \emph on 
702 byte_ptr
703 \emph default 
704  without causing an overflow and 
705 \emph on 
706 nbBytes
707 \emph default 
708  is the number of bytes actually written to 
709 \emph on 
710 byte_ptr
711 \emph default 
712  (the encoded size in bytes).
713  Before calling speex_bits_write, it is possible to find the number of bytes
714  that need to be written by calling 
715 \family typewriter 
716 speex_bits_nbytes(&bits)
717 \family default 
718 , which returns a number of bytes.
719  
720 \layout Standard
721
722 After you're done with the encoding, free all resources with:
723 \layout LyX-Code
724
725 speex_bits_destroy(&bits);
726 \layout LyX-Code
727
728 speex_encoder_destroy(enc_state);
729 \layout Standard
730
731 That's about it for the encoder.
732  
733 \layout Subsection
734
735 Decoding
736 \layout Standard
737
738 In order to decode speech using Speex, you first need to:
739 \layout LyX-Code
740
741 #include <speex.h>
742 \layout Standard
743
744 You also need to declare a Speex bit-packing struct
745 \layout LyX-Code
746
747 SpeexBits bits;
748 \layout Standard
749
750 and a Speex decoder state
751 \layout LyX-Code
752
753 void *dec_state;
754 \layout Standard
755
756 The two are initialized by:
757 \layout LyX-Code
758
759 speex_bits_init(&bits);
760 \layout LyX-Code
761
762 dec_state = speex_decoder_init(&speex_nb_mode);
763 \layout Standard
764
765 For wideband decoding, 
766 \emph on 
767 speex_nb_mode
768 \emph default 
769  will be replaced by 
770 \emph on 
771 speex_wb_mode
772 \emph default 
773 .
774  If you need to obtain the size of the frames that will be used by the decoder,
775  you can get that value in the 
776 \emph on 
777 frame_size
778 \emph default 
779  variable with:
780 \layout LyX-Code
781
782 speex_decoder_ctl(dec_state, SPEEX_GET_FRAME_SIZE, &frame_size); 
783 \layout Standard
784
785 There is also a parameter that can be set for the decoder: whether or not
786  to use a perceptual post-filter.
787  This can be set by: 
788 \layout LyX-Code
789
790 speex_decoder_ctl(dec_state, SPEEX_SET_ENH, &enh); 
791 \layout Standard
792
793 where 
794 \emph on 
795 enh
796 \emph default 
797  is an int with value 0 to have the post-filter disabled and 1 to have
798  it enabled.
799 \layout Standard
800
801 Again, once the decoder initialization is done, for every input frame:
802 \layout LyX-Code
803
804 speex_bits_read_from(&bits, input_bytes, nbBytes);
805 \layout LyX-Code
806
807 speex_decode(dec_state, &bits, output_frame);
808 \layout Standard
809
810 where input_bytes is a 
811 \emph on 
812 (char *)
813 \emph default 
814  containing the bit-stream data received for a frame, 
815 \emph on 
816 nbBytes
817 \emph default 
818  is the size (in bytes) of that bit-stream, and 
819 \emph on 
820 output_frame
821 \emph default 
822  is a 
823 \emph on 
824 (float *)
825 \emph default 
826  and points to the area where the decoded speech frame will be written.
827  A NULL value as the second argument (in place of &bits) indicates that we don't have the bits
828  for the current frame.
829  When a frame is lost, the Speex decoder will do its best to "guess" the
830  correct signal.
831 \layout Standard
832
833 After you're done with the decoding, free all resources with:
834 \layout LyX-Code
835
836 speex_bits_destroy(&bits);
837 \layout LyX-Code
838
839 speex_decoder_destroy(dec_state);
840 \layout Subsection
841
842 Codec Options (speex_*_ctl)
843 \layout Standard
844
845 The Speex encoder and decoder support many options and requests that can
846  be accessed through the 
847 \emph on 
848 speex_encoder_ctl
849 \emph default 
850  and 
851 \emph on 
852 speex_decoder_ctl
853 \emph default 
854  functions.
855  These functions are similar to the 
856 \emph on 
857 ioctl
858 \emph default 
859  system call and their prototypes are:
860 \layout LyX-Code
861
862 void speex_encoder_ctl(void *encoder, int request, void *ptr);
863 \layout LyX-Code
864
865 void speex_decoder_ctl(void *decoder, int request, void *ptr);
866 \layout Standard
867
868 The different values of request allowed are (note that some only apply to
869  the encoder or the decoder):
870 \layout Description
871
872 SPEEX_SET_ENH** Set perceptual enhancer
873 \begin_inset LatexCommand \index{perceptual enhancement}
874
875 \end_inset 
876
877  to on (1) or off (0) (integer)
878 \layout Description
879
880 SPEEX_GET_ENH** Get perceptual enhancer status (integer)
881 \layout Description
882
883 SPEEX_GET_FRAME_SIZE Get the frame size used for the current mode (integer)
884 \layout Description
885
886 SPEEX_SET_QUALITY* Set the encoder speech quality (integer 0 to 10)
887 \layout Description
888
889 SPEEX_GET_QUALITY* Get the current encoder speech quality (integer 0 to
890  10)
891 \layout Description
892
893 SPEEX_SET_MODE*
894 \begin_inset Formula $\dagger$
895 \end_inset 
896
897
898 \layout Description
899
900 SPEEX_GET_MODE*
901 \begin_inset Formula $\dagger$
902 \end_inset 
903
904
905 \layout Description
906
907 SPEEX_SET_LOW_MODE*
908 \begin_inset Formula $\dagger$
909 \end_inset 
910
911
912 \layout Description
913
914 SPEEX_GET_LOW_MODE*
915 \begin_inset Formula $\dagger$
916 \end_inset 
917
918
919 \layout Description
920
921 SPEEX_SET_HIGH_MODE*
922 \begin_inset Formula $\dagger$
923 \end_inset 
924
925
926 \layout Description
927
928 SPEEX_GET_HIGH_MODE*
929 \begin_inset Formula $\dagger$
930 \end_inset 
931
932
933 \layout Description
934
935 SPEEX_SET_VBR* Set variable bit-rate (VBR) to on (1) or off (0) (integer)
936 \layout Description
937
938 SPEEX_GET_VBR* Get variable bit-rate
939 \begin_inset LatexCommand \index{variable bit-rate}
940
941 \end_inset 
942
943  (VBR) status (integer)
944 \layout Description
945
946 SPEEX_SET_VBR_QUALITY* Set the encoder VBR speech quality (float 0 to 10)
947 \layout Description
948
949 SPEEX_GET_VBR_QUALITY* Get the current encoder VBR speech quality (float
950  0 to 10)
951 \layout Description
952
953 SPEEX_SET_COMPLEXITY* Set the CPU resources allowed for the encoder (integer
954  1 to 10)
955 \layout Description
956
957 SPEEX_GET_COMPLEXITY* Get the CPU resources allowed for the encoder (integer
958  1 to 10)
959 \layout Description
960
961 SPEEX_SET_BITRATE* Set the bit-rate to use to the closest value not exceeding
962  the parameter (integer in bps)
963 \layout Description
964
965 SPEEX_GET_BITRATE Get the current bit-rate in use (integer in bps)
966 \layout Description
967
968 SPEEX_SET_SAMPLING_RATE Set real sampling rate (integer in Hz)
969 \layout Description
970
971 SPEEX_GET_SAMPLING_RATE Get real sampling rate (integer in Hz)
972 \layout Description
973
974 SPEEX_RESET_STATE Reset the encoder/decoder state to its original state
975  (zeros all memories)
976 \layout Description
977
978 SPEEX_SET_VAD* Set voice activity detection
979 \begin_inset LatexCommand \index{voice activity detection}
980
981 \end_inset 
982
983  (VAD) to on (1) or off (0) (integer)
984 \layout Description
985
986 SPEEX_GET_VAD* Get voice activity detection (VAD) status (integer)
987 \layout Description
988
989 SPEEX_SET_DTX* Set discontinuous transmission
990 \begin_inset LatexCommand \index{discontinuous transmission}
991
992 \end_inset 
993
994  (DTX) to on (1) or off (0) (integer)
995 \layout Description
996
997 SPEEX_GET_DTX* Get discontinuous transmission (DTX) status (integer)
998 \layout Description
999
1000 SPEEX_SET_ABR* Set average bit-rate
1001 \begin_inset LatexCommand \index{average bit-rate}
1002
1003 \end_inset 
1004
1005  (ABR) to a value n in bits per second (integer in bps)
1006 \layout Description
1007
1008 SPEEX_GET_ABR* Get average bit-rate (ABR) setting (integer in bps)
1009 \layout Description
1010
1011 * applies only to the encoder
1012 \layout Description
1013
1014 ** applies only to the decoder
1015 \layout Description
1016
1017
1018 \begin_inset Formula $\dagger$
1019 \end_inset 
1020
1021  normally only used internally
1022 \layout Subsection
1023
1024 Mode queries
1025 \layout Standard
1026
1027 Speex modes have a query system similar to the speex_encoder_ctl and
1028  speex_decoder_ctl calls.
1029  Since modes are read-only, it is only possible to get information about
1030  a particular mode.
1031  The function used to do that is:
1032 \layout LyX-Code
1033
1034 void speex_mode_query(SpeexMode *mode, int request, void *ptr);
1035 \layout Standard
1036
1037 The admissible values for request are (unless otherwise noted, the values
1038  are returned through 
1039 \emph on 
1040 ptr
1041 \emph default 
1042 ):
1043 \layout Description
1044
1045 SPEEX_MODE_FRAME_SIZE Get the frame size (in samples) for the mode
1046 \layout Description
1047
1048 SPEEX_SUBMODE_BITRATE Get the bit-rate for a submode number specified through
1049  
1050 \emph on 
1051 ptr
1052 \emph default 
1053  (integer in bps).
1054  
1055 \layout Subsection
1056
1057 Packing and in-band signalling
1058 \begin_inset LatexCommand \index{in-band signalling}
1059
1060 \end_inset 
1061
1062
1063 \layout Standard
1064
1065 Sometimes it is desirable to pack more than one frame per packet (or other
1066  basic unit of storage).
1067  The proper way to do it is to call speex_encode 
1068 \begin_inset Formula $N$
1069 \end_inset 
1070
1071  times before writing the stream with speex_bits_write.
1072  In cases where the number of frames is not determined by an out-of-band
1073  mechanism, it is possible to include a terminator code.
1074  That terminator consists of the code 15 (decimal) encoded with 5 bits,
1075  as shown in figure 
1076 \begin_inset LatexCommand \ref{cap:quality_vs_bps}
1077
1078 \end_inset 
1079
1080 .
1081  
1082 \layout Standard
1083
1084 It is also possible to send in-band 
1085 \begin_inset Quotes eld
1086 \end_inset 
1087
1088 messages
1089 \begin_inset Quotes erd
1090 \end_inset 
1091
1092  to the other side.
1093  All these messages are encoded as a 
1094 \begin_inset Quotes eld
1095 \end_inset 
1096
1097 pseudo-frame
1098 \begin_inset Quotes erd
1099 \end_inset 
1100
1101  of mode 14 which contains a 4-bit message type code, followed by the message.
1102  Table 
1103 \begin_inset LatexCommand \ref{cap:In-band-signalling-codes}
1104
1105 \end_inset 
1106
1107  lists the available codes, their meaning and the size of the message that
1108  follows.
1109  Most of these messages are requests that are sent to the encoder or decoder
1110  on the other end, which is free to comply or ignore them.
1111  By default, all in-band messages are ignored.
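\layout Standard

To make the bit layout concrete, here is a minimal sketch of how an in-band message and a terminator could be laid out. It is not the libspeex implementation: BitWriter and its functions are hypothetical stand-ins for SpeexBits, bits are assumed to be packed most-significant first, and the mode number 14 is assumed to occupy the same 5-bit field as the terminator code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical MSB-first bit writer standing in for SpeexBits. */
typedef struct {
    unsigned char bytes[64];
    int nbits;                 /* number of bits written so far */
} BitWriter;

void bw_init(BitWriter *bw)
{
    memset(bw, 0, sizeof(*bw));
}

/* Append the nbits least-significant bits of value, most significant first. */
void bw_pack(BitWriter *bw, unsigned value, int nbits)
{
    assert(nbits > 0 && nbits <= 16);
    assert(bw->nbits + nbits <= (int)sizeof(bw->bytes) * 8);
    for (int i = nbits - 1; i >= 0; i--) {
        if ((value >> i) & 1u)
            bw->bytes[bw->nbits / 8] |= (unsigned char)(0x80u >> (bw->nbits % 8));
        bw->nbits++;
    }
}

/* In-band message: mode 14, a 4-bit message type code, then the payload. */
void pack_inband(BitWriter *bw, unsigned code, unsigned payload, int payload_bits)
{
    bw_pack(bw, 14, 5);        /* assumed 5-bit mode field, like the terminator */
    bw_pack(bw, code, 4);
    bw_pack(bw, payload, payload_bits);
}

/* Terminator: the code 15 (decimal) encoded with 5 bits. */
void pack_terminator(BitWriter *bw)
{
    bw_pack(bw, 15, 5);
}
```

\layout Standard

With this layout, a request to turn perceptual enhancement on (code 0, 1-bit payload set to 1) followed by a terminator occupies 5 + 4 + 1 + 5 = 15 bits.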
1112 \layout Standard
1113
1114
1115 \begin_inset Float table
1116 placement htbp
1117 wide false
1118 collapsed false
1119
1120 \layout Standard
1121
1122
1123 \begin_inset  Tabular
1124 <lyxtabular version="3" rows="17" columns="3">
1125 <features>
1126 <column alignment="center" valignment="top" leftline="true" width="0pt">
1127 <column alignment="center" valignment="top" leftline="true" width="0pt">
1128 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
1129 <row topline="true" bottomline="true">
1130 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1131 \begin_inset Text
1132
1133 \layout Standard
1134
1135 Code
1136 \end_inset 
1137 </cell>
1138 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1139 \begin_inset Text
1140
1141 \layout Standard
1142
1143 Size (bits)
1144 \end_inset 
1145 </cell>
1146 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1147 \begin_inset Text
1148
1149 \layout Standard
1150
1151 Content
1152 \end_inset 
1153 </cell>
1154 </row>
1155 <row topline="true">
1156 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1157 \begin_inset Text
1158
1159 \layout Standard
1160
1161 0
1162 \end_inset 
1163 </cell>
1164 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1165 \begin_inset Text
1166
1167 \layout Standard
1168
1169 1
1170 \end_inset 
1171 </cell>
1172 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1173 \begin_inset Text
1174
1175 \layout Standard
1176
1177 Asks decoder to set perceptual enhancement off (0) or on (1)
1178 \end_inset 
1179 </cell>
1180 </row>
1181 <row topline="true">
1182 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1183 \begin_inset Text
1184
1185 \layout Standard
1186
1187 1
1188 \end_inset 
1189 </cell>
1190 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1191 \begin_inset Text
1192
1193 \layout Standard
1194
1195 1
1196 \end_inset 
1197 </cell>
1198 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1199 \begin_inset Text
1200
1201 \layout Standard
1202
1203 reserved
1204 \end_inset 
1205 </cell>
1206 </row>
1207 <row topline="true">
1208 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1209 \begin_inset Text
1210
1211 \layout Standard
1212
1213 2
1214 \end_inset 
1215 </cell>
1216 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1217 \begin_inset Text
1218
1219 \layout Standard
1220
1221 4
1222 \end_inset 
1223 </cell>
1224 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1225 \begin_inset Text
1226
1227 \layout Standard
1228
1229 Asks encoder to switch to mode N
1230 \end_inset 
1231 </cell>
1232 </row>
1233 <row topline="true">
1234 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1235 \begin_inset Text
1236
1237 \layout Standard
1238
1239 3
1240 \end_inset 
1241 </cell>
1242 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1243 \begin_inset Text
1244
1245 \layout Standard
1246
1247 4
1248 \end_inset 
1249 </cell>
1250 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1251 \begin_inset Text
1252
1253 \layout Standard
1254
1255 Asks encoder to switch to mode N for low-band
1256 \end_inset 
1257 </cell>
1258 </row>
1259 <row topline="true">
1260 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1261 \begin_inset Text
1262
1263 \layout Standard
1264
1265 4
1266 \end_inset 
1267 </cell>
1268 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1269 \begin_inset Text
1270
1271 \layout Standard
1272
1273 4
1274 \end_inset 
1275 </cell>
1276 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1277 \begin_inset Text
1278
1279 \layout Standard
1280
1281 Asks encoder to switch to mode N for high-band
1282 \end_inset 
1283 </cell>
1284 </row>
1285 <row topline="true">
1286 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1287 \begin_inset Text
1288
1289 \layout Standard
1290
1291 5
1292 \end_inset 
1293 </cell>
1294 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1295 \begin_inset Text
1296
1297 \layout Standard
1298
1299 4
1300 \end_inset 
1301 </cell>
1302 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1303 \begin_inset Text
1304
1305 \layout Standard
1306
1307 Asks encoder to switch to quality N for VBR
1308 \end_inset 
1309 </cell>
1310 </row>
1311 <row topline="true">
1312 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1313 \begin_inset Text
1314
1315 \layout Standard
1316
1317 6
1318 \end_inset 
1319 </cell>
1320 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1321 \begin_inset Text
1322
1323 \layout Standard
1324
1325 4
1326 \end_inset 
1327 </cell>
1328 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1329 \begin_inset Text
1330
1331 \layout Standard
1332
1333 Request acknowledge (0=no, 1=all, 2=only for in-band data)
1334 \end_inset 
1335 </cell>
1336 </row>
1337 <row topline="true">
1338 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1339 \begin_inset Text
1340
1341 \layout Standard
1342
1343 7
1344 \end_inset 
1345 </cell>
1346 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1347 \begin_inset Text
1348
1349 \layout Standard
1350
1351 4
1352 \end_inset 
1353 </cell>
1354 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1355 \begin_inset Text
1356
1357 \layout Standard
1358
1359 Asks encoder to set VBR off (0), on (1), VAD (2), DTX (3)
1360 \end_inset 
1361 </cell>
1362 </row>
1363 <row topline="true">
1364 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1365 \begin_inset Text
1366
1367 \layout Standard
1368
1369 8
1370 \end_inset 
1371 </cell>
1372 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1373 \begin_inset Text
1374
1375 \layout Standard
1376
1377 8
1378 \end_inset 
1379 </cell>
1380 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1381 \begin_inset Text
1382
1383 \layout Standard
1384
1385 Transmit (8-bit) character to the other end
1386 \end_inset 
1387 </cell>
1388 </row>
1389 <row topline="true">
1390 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1391 \begin_inset Text
1392
1393 \layout Standard
1394
1395 9
1396 \end_inset 
1397 </cell>
1398 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1399 \begin_inset Text
1400
1401 \layout Standard
1402
1403 8
1404 \end_inset 
1405 </cell>
1406 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1407 \begin_inset Text
1408
1409 \layout Standard
1410
1411 Intensity stereo information
1412 \end_inset 
1413 </cell>
1414 </row>
1415 <row topline="true">
1416 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1417 \begin_inset Text
1418
1419 \layout Standard
1420
1421 10
1422 \end_inset 
1423 </cell>
1424 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1425 \begin_inset Text
1426
1427 \layout Standard
1428
1429 16
1430 \end_inset 
1431 </cell>
1432 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1433 \begin_inset Text
1434
1435 \layout Standard
1436
1437 Announce maximum bit-rate acceptable (N in bytes/second)
1438 \end_inset 
1439 </cell>
1440 </row>
1441 <row topline="true">
1442 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1443 \begin_inset Text
1444
1445 \layout Standard
1446
1447 11
1448 \end_inset 
1449 </cell>
1450 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1451 \begin_inset Text
1452
1453 \layout Standard
1454
1455 16
1456 \end_inset 
1457 </cell>
1458 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1459 \begin_inset Text
1460
1461 \layout Standard
1462
1463 reserved
1464 \end_inset 
1465 </cell>
1466 </row>
1467 <row topline="true">
1468 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1469 \begin_inset Text
1470
1471 \layout Standard
1472
1473 12
1474 \end_inset 
1475 </cell>
1476 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1477 \begin_inset Text
1478
1479 \layout Standard
1480
1481 32
1482 \end_inset 
1483 </cell>
1484 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1485 \begin_inset Text
1486
1487 \layout Standard
1488
1489 Acknowledge receiving packet N
1490 \end_inset 
1491 </cell>
1492 </row>
1493 <row topline="true">
1494 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1495 \begin_inset Text
1496
1497 \layout Standard
1498
1499 13
1500 \end_inset 
1501 </cell>
1502 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1503 \begin_inset Text
1504
1505 \layout Standard
1506
1507 32
1508 \end_inset 
1509 </cell>
1510 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1511 \begin_inset Text
1512
1513 \layout Standard
1514
1515 reserved
1516 \end_inset 
1517 </cell>
1518 </row>
1519 <row topline="true">
1520 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1521 \begin_inset Text
1522
1523 \layout Standard
1524
1525 14
1526 \end_inset 
1527 </cell>
1528 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1529 \begin_inset Text
1530
1531 \layout Standard
1532
1533 64
1534 \end_inset 
1535 </cell>
1536 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1537 \begin_inset Text
1538
1539 \layout Standard
1540
1541 reserved
1542 \end_inset 
1543 </cell>
1544 </row>
1545 <row topline="true" bottomline="true">
1546 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1547 \begin_inset Text
1548
1549 \layout Standard
1550
1551 15
1552 \end_inset 
1553 </cell>
1554 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1555 \begin_inset Text
1556
1557 \layout Standard
1558
1559 64
1560 \end_inset 
1561 </cell>
1562 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1563 \begin_inset Text
1564
1565 \layout Standard
1566
1567 reserved
1568 \end_inset 
1569 </cell>
1570 </row>
1571 </lyxtabular>
1572
1573 \end_inset 
1574
1575
1576 \layout Caption
1577
1578 In-band signalling codes
1579 \begin_inset LatexCommand \label{cap:In-band-signalling-codes}
1580
1581 \end_inset 
1582
1583
1584 \end_inset 
1585
1586
1587 \layout Standard
1588
1589 Finally, applications may define custom in-band messages using mode 13.
1590  The size of the message in bytes is encoded with 5 bits, so that the decoder
1591  can skip it if it doesn't know how to interpret it.
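\layout Standard

As a rough illustration of how a decoder might skip requests it does not
 understand, the sketch below transcribes the size column of the table of
 in-band signalling codes into a lookup array.
 The function names are hypothetical and are not part of the Speex API;
 the second helper only applies the rule that the 5-bit size field of a
 custom message counts bytes.

```c
#include <assert.h>

/* Payload size in bits for each in-band request code (0..15),
   transcribed from the table of in-band signalling codes. */
static const int inband_payload_bits[16] = {
    1, 1, 4, 4, 4, 4, 4, 4, 8, 8, 16, 16, 32, 32, 64, 64
};

/* Hypothetical helper: number of payload bits that follow request
   `code`, so an unknown request can be skipped. */
int inband_payload_size(int code)
{
    assert(code >= 0 && code < 16);
    return inband_payload_bits[code];
}

/* Hypothetical helper: a custom message announces its size in bytes
   in a 5-bit field, so at most 31 bytes (248 bits) need skipping. */
int user_message_payload_bits(int size_field)
{
    assert(size_field >= 0 && size_field < 32);
    return 8 * size_field;
}
```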
1592 \layout Section
1593 \pagebreak_top 
1594 Formats and standards
1595 \begin_inset LatexCommand \index{standards}
1596
1597 \end_inset 
1598
1599
1600 \layout Standard
1601
1602 Speex can encode speech in both narrowband and wideband and provides different
1603  bit-rates.
1604  However, not all features need to be supported by a given implementation
1605  or device.
1606  In order to be called 
1607 \begin_inset Quotes eld
1608 \end_inset 
1609
1610 Speex compatible
1611 \begin_inset Quotes erd
1612 \end_inset 
1613
1614  (whatever that means), an implementation must implement at least a basic
1615  set of features.
1616 \layout Standard
1617
1618 At the minimum, all narrowband modes of operation MUST be supported at the
1619  decoder.
1620  This includes the decoding of a wideband bit-stream by the narrowband decoder
1621 \begin_inset Foot
1622 collapsed true
1623
1624 \layout Standard
1625
1626 The wideband bit-stream contains an embedded narrowband bit-stream which
1627  can be decoded alone
1628 \end_inset 
1629
1630 .
1631  If present, a wideband decoder MUST be able to decode a narrowband stream,
1632  and MAY either be able to decode all wideband modes or be able to decode
1633  the embedded narrowband part of all modes (which includes ignoring the
1634  high-band bits).
1635 \layout Standard
1636
1637 For encoders, at least one narrowband or wideband mode MUST be supported.
1638  The main reason why all encoding modes do not have to be supported is that
1639  some platforms may not be able to handle the complexity of encoding in
1640  some modes.
1641 \layout Subsection
1642
1643 RTP
1644 \begin_inset LatexCommand \index{RTP}
1645
1646 \end_inset 
1647
1648  Payload Format 
1649 \layout Standard
1650
1651 The RTP payload draft is included in appendix 
1652 \begin_inset LatexCommand \ref{sec:IETF-draft}
1653
1654 \end_inset 
1655
1656  and the latest version is available at 
1657 \begin_inset LatexCommand \url{http://www.speex.org/drafts/latest}
1658
1659 \end_inset 
1660
1661 .
1662  This draft has been sent (2003/02/26) to the Internet Engineering Task
1663  Force (IETF) and will be discussed at the March 18th meeting in San Francisco.
1664  
1665 \layout Subsection
1666
1667 MIME Type
1668 \layout Standard
1669
1670 Speex will use the MIME type 
1671 \family typewriter 
1672 audio/speex
1673 \family default 
1674 .
1675  We will apply for that type in the near future.
1676 \layout Subsection
1677
1678 Ogg
1679 \begin_inset LatexCommand \index{Ogg}
1680
1681 \end_inset 
1682
1683  file format
1684 \layout Standard
1685
1686 Speex bit-streams can be stored in Ogg files.
1687  In this case, the first packet of the Ogg file contains the Speex header
1688  described in table 
1689 \begin_inset LatexCommand \ref{cap:ogg_speex_header}
1690
1691 \end_inset 
1692
1693 .
1694  All integer fields in the headers are stored as little-endian.
1695  The 
1696 \family typewriter 
1697 speex_string
1698 \family default 
1699  field must contain the 
1700 \begin_inset Quotes eld
1701 \end_inset 
1702
1703
1704 \family typewriter 
1705 Speex
1706 \family default 
1707 \SpecialChar ~
1708 \SpecialChar ~
1709 \SpecialChar ~
1710
1711 \begin_inset Quotes erd
1712 \end_inset 
1713
1714  (with 3 trailing spaces), which identifies the bit-stream.
1715  The next field, 
1716 \family typewriter 
1717 speex_version
1718 \family default 
1719  contains the version of Speex that encoded the file.
1720  For now, refer to speex_header.[ch] for more info.
1721  The 
1722 \emph on 
1723 beginning of stream
1724 \emph default 
1725  (
1726 \family typewriter 
1727 b_o_s
1728 \family default 
1729 ) flag is set to 1 for the header.
1730  The header packet has 
1731 \family typewriter 
1732 packetno=0
1733 \family default 
1734  and 
1735 \family typewriter 
1736 granulepos=0
1737 \family default 
1738 .
1739 \layout Standard
1740
1741 The second packet contains the Speex comment header.
1742  The format used is the Vorbis comment format described here: http://www.xiph.org/
1743 ogg/vorbis/doc/v-comment.html .
1744  This packet has 
1745 \family typewriter 
1746 packetno=1
1747 \family default 
1748  and 
1749 \family typewriter 
1750 granulepos=0
1751 \family default 
1752 .
1753 \layout Standard
1754
1755 The third and subsequent packets each contain one or more (number found
1756  in header) Speex frames.
1757  These are identified with 
1758 \family typewriter 
1759 packetno
1760 \family default 
1761  starting from 2 and the 
1762 \family typewriter 
1763 granulepos
1764 \family default 
1765  is the number of the last sample encoded in that packet.
1766  The last of these packets has the 
1767 \emph on 
1768 end of stream
1769 \emph default 
1770  (
1771 \family typewriter 
1772 e_o_s
1773 \family default 
1774 ) flag set to 1.
1775 \layout Standard
1776
1777
1778 \begin_inset Float table
1779 placement htbp
1780 wide true
1781 collapsed false
1782
1783 \layout Standard
1784
1785
1786 \begin_inset  Tabular
1787 <lyxtabular version="3" rows="16" columns="3">
1788 <features>
1789 <column alignment="center" valignment="top" leftline="true" width="0pt">
1790 <column alignment="center" valignment="top" leftline="true" width="0pt">
1791 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
1792 <row topline="true" bottomline="true">
1793 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1794 \begin_inset Text
1795
1796 \layout Standard
1797
1798 Field
1799 \end_inset 
1800 </cell>
1801 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1802 \begin_inset Text
1803
1804 \layout Standard
1805
1806 Type
1807 \end_inset 
1808 </cell>
1809 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1810 \begin_inset Text
1811
1812 \layout Standard
1813
1814 Size
1815 \end_inset 
1816 </cell>
1817 </row>
1818 <row topline="true">
1819 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1820 \begin_inset Text
1821
1822 \layout Standard
1823
1824 speex_string
1825 \end_inset 
1826 </cell>
1827 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1828 \begin_inset Text
1829
1830 \layout Standard
1831
1832 char[]
1833 \end_inset 
1834 </cell>
1835 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1836 \begin_inset Text
1837
1838 \layout Standard
1839
1840 8
1841 \end_inset 
1842 </cell>
1843 </row>
1844 <row topline="true">
1845 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1846 \begin_inset Text
1847
1848 \layout Standard
1849
1850 speex_version
1851 \end_inset 
1852 </cell>
1853 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1854 \begin_inset Text
1855
1856 \layout Standard
1857
1858 char[]
1859 \end_inset 
1860 </cell>
1861 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1862 \begin_inset Text
1863
1864 \layout Standard
1865
1866 20
1867 \end_inset 
1868 </cell>
1869 </row>
1870 <row topline="true">
1871 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1872 \begin_inset Text
1873
1874 \layout Standard
1875
1876 speex_version_id
1877 \end_inset 
1878 </cell>
1879 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1880 \begin_inset Text
1881
1882 \layout Standard
1883
1884 int
1885 \end_inset 
1886 </cell>
1887 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1888 \begin_inset Text
1889
1890 \layout Standard
1891
1892 4
1893 \end_inset 
1894 </cell>
1895 </row>
1896 <row topline="true">
1897 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1898 \begin_inset Text
1899
1900 \layout Standard
1901
1902 header_size
1903 \end_inset 
1904 </cell>
1905 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1906 \begin_inset Text
1907
1908 \layout Standard
1909
1910 int
1911 \end_inset 
1912 </cell>
1913 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1914 \begin_inset Text
1915
1916 \layout Standard
1917
1918 4
1919 \end_inset 
1920 </cell>
1921 </row>
1922 <row topline="true">
1923 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1924 \begin_inset Text
1925
1926 \layout Standard
1927
1928 rate
1929 \end_inset 
1930 </cell>
1931 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1932 \begin_inset Text
1933
1934 \layout Standard
1935
1936 int
1937 \end_inset 
1938 </cell>
1939 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1940 \begin_inset Text
1941
1942 \layout Standard
1943
1944 4
1945 \end_inset 
1946 </cell>
1947 </row>
1948 <row topline="true">
1949 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1950 \begin_inset Text
1951
1952 \layout Standard
1953
1954 mode
1955 \end_inset 
1956 </cell>
1957 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1958 \begin_inset Text
1959
1960 \layout Standard
1961
1962 int
1963 \end_inset 
1964 </cell>
1965 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1966 \begin_inset Text
1967
1968 \layout Standard
1969
1970 4
1971 \end_inset 
1972 </cell>
1973 </row>
1974 <row topline="true">
1975 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1976 \begin_inset Text
1977
1978 \layout Standard
1979
1980 mode_bitstream_version
1981 \end_inset 
1982 </cell>
1983 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1984 \begin_inset Text
1985
1986 \layout Standard
1987
1988 int
1989 \end_inset 
1990 </cell>
1991 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1992 \begin_inset Text
1993
1994 \layout Standard
1995
1996 4
1997 \end_inset 
1998 </cell>
1999 </row>
2000 <row topline="true">
2001 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2002 \begin_inset Text
2003
2004 \layout Standard
2005
2006 nb_channels
2007 \end_inset 
2008 </cell>
2009 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2010 \begin_inset Text
2011
2012 \layout Standard
2013
2014 int
2015 \end_inset 
2016 </cell>
2017 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2018 \begin_inset Text
2019
2020 \layout Standard
2021
2022 4
2023 \end_inset 
2024 </cell>
2025 </row>
2026 <row topline="true">
2027 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2028 \begin_inset Text
2029
2030 \layout Standard
2031
2032 bitrate
2033 \end_inset 
2034 </cell>
2035 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2036 \begin_inset Text
2037
2038 \layout Standard
2039
2040 int
2041 \end_inset 
2042 </cell>
2043 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2044 \begin_inset Text
2045
2046 \layout Standard
2047
2048 4
2049 \end_inset 
2050 </cell>
2051 </row>
2052 <row topline="true">
2053 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2054 \begin_inset Text
2055
2056 \layout Standard
2057
2058 frame_size
2059 \end_inset 
2060 </cell>
2061 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2062 \begin_inset Text
2063
2064 \layout Standard
2065
2066 int
2067 \end_inset 
2068 </cell>
2069 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2070 \begin_inset Text
2071
2072 \layout Standard
2073
2074 4
2075 \end_inset 
2076 </cell>
2077 </row>
2078 <row topline="true">
2079 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2080 \begin_inset Text
2081
2082 \layout Standard
2083
2084 vbr
2085 \end_inset 
2086 </cell>
2087 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2088 \begin_inset Text
2089
2090 \layout Standard
2091
2092 int
2093 \end_inset 
2094 </cell>
2095 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2096 \begin_inset Text
2097
2098 \layout Standard
2099
2100 4
2101 \end_inset 
2102 </cell>
2103 </row>
2104 <row topline="true">
2105 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2106 \begin_inset Text
2107
2108 \layout Standard
2109
2110 frames_per_packet
2111 \end_inset 
2112 </cell>
2113 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2114 \begin_inset Text
2115
2116 \layout Standard
2117
2118 int
2119 \end_inset 
2120 </cell>
2121 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2122 \begin_inset Text
2123
2124 \layout Standard
2125
2126 4
2127 \end_inset 
2128 </cell>
2129 </row>
2130 <row topline="true">
2131 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2132 \begin_inset Text
2133
2134 \layout Standard
2135
2136 extra_headers
2137 \end_inset 
2138 </cell>
2139 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2140 \begin_inset Text
2141
2142 \layout Standard
2143
2144 int
2145 \end_inset 
2146 </cell>
2147 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2148 \begin_inset Text
2149
2150 \layout Standard
2151
2152 4
2153 \end_inset 
2154 </cell>
2155 </row>
2156 <row topline="true">
2157 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2158 \begin_inset Text
2159
2160 \layout Standard
2161
2162 reserved1
2163 \end_inset 
2164 </cell>
2165 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2166 \begin_inset Text
2167
2168 \layout Standard
2169
2170 int
2171 \end_inset 
2172 </cell>
2173 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2174 \begin_inset Text
2175
2176 \layout Standard
2177
2178 4
2179 \end_inset 
2180 </cell>
2181 </row>
2182 <row topline="true" bottomline="true">
2183 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2184 \begin_inset Text
2185
2186 \layout Standard
2187
2188 reserved2
2189 \end_inset 
2190 </cell>
2191 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2192 \begin_inset Text
2193
2194 \layout Standard
2195
2196 int
2197 \end_inset 
2198 </cell>
2199 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2200 \begin_inset Text
2201
2202 \layout Standard
2203
2204 4
2205 \end_inset 
2206 </cell>
2207 </row>
2208 </lyxtabular>
2209
2210 \end_inset 
2211
2212
2213 \layout Caption
2214
2215 Ogg/Speex header packet
2216 \begin_inset LatexCommand \label{cap:ogg_speex_header}
2217
2218 \end_inset 
2219
2220
2221 \end_inset 
2222
2223
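\layout Standard

The header layout in the table above can be read with a few lines of C.
 The sketch below is only an assumption-laden illustration: the struct,
 enum and function names are hypothetical (the reference code lives in
 speex_header.[ch]), and it assumes the fields are packed in table order
 with no padding, giving 8 + 20 + 13 * 4 = 80 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SPEEX_HEADER_BYTES 80 /* 8 + 20 + 13 * 4, summed from the table */

/* Index of each 4-byte integer field, in table order (hypothetical names). */
enum {
    SH_VERSION_ID, SH_HEADER_SIZE, SH_RATE, SH_MODE,
    SH_MODE_BITSTREAM_VERSION, SH_NB_CHANNELS, SH_BITRATE,
    SH_FRAME_SIZE, SH_VBR, SH_FRAMES_PER_PACKET, SH_EXTRA_HEADERS,
    SH_RESERVED1, SH_RESERVED2, SH_NUM_FIELDS
};

typedef struct {
    char    speex_string[8];
    char    speex_version[20];
    int32_t fields[SH_NUM_FIELDS];
} ParsedSpeexHeader;

/* Reads a 32-bit little-endian integer regardless of host byte order. */
static int32_t read_le32(const unsigned char *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}

/* Returns 0 on success, -1 if the packet is too short or does not
   start with "Speex" followed by three spaces. */
int parse_speex_header(const unsigned char *packet, size_t len,
                       ParsedSpeexHeader *out)
{
    int i;

    assert(packet != NULL && out != NULL);
    if (len < SPEEX_HEADER_BYTES)
        return -1;
    if (memcmp(packet, "Speex   ", 8) != 0)
        return -1;
    memcpy(out->speex_string, packet, 8);
    memcpy(out->speex_version, packet + 8, 20);
    for (i = 0; i < SH_NUM_FIELDS; i++)
        out->fields[i] = read_le32(packet + 28 + 4 * i);
    return 0;
}
```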
2224 \layout Section
2225 \pagebreak_top 
2226 Introduction to CELP Coding
2227 \begin_inset LatexCommand \index{CELP}
2228
2229 \end_inset 
2230
2231
2232 \layout Standard
2233
2234 The following three sections describe the internals of the codec and require
2235  some signal processing knowledge.
2236  If you are only interested in using Speex, they are not required.
2237 \layout Standard
2238
2239 Speex is based on CELP, which stands for Code Excited Linear Prediction.
2240  This section attempts to introduce the principles behind CELP, so if you
2241  are already familiar with CELP, you can safely skip to section 
2242 \begin_inset LatexCommand \ref{sec:Speex-narrowband-mode}
2243
2244 \end_inset 
2245
2246 .
2247  The CELP technique is based on three ideas:
2248 \layout Enumerate
2249
2250 The use of a linear prediction (LP) model to model the vocal tract
2251 \layout Enumerate
2252
2253 The use of (adaptive and fixed) codebook entries as input (excitation) of
2254  the LP model
2255 \layout Enumerate
2256
2257 The search performed in closed-loop in a 
2258 \begin_inset Quotes eld
2259 \end_inset 
2260
2261 perceptually weighted domain
2262 \begin_inset Quotes erd
2263 \end_inset 
2264
2265
2266 \layout Standard
2267
2268 This section describes the basic ideas behind CELP.
2269  Note that it's still incomplete.
2270 \layout Subsection
2271
2272 Linear Prediction (LPC)
2273 \begin_inset LatexCommand \index{linear prediction}
2274
2275 \end_inset 
2276
2277
2278 \layout Standard
2279
2280 Linear prediction is at the base of many speech coding techniques, including
2281  CELP.
2282  The idea behind it is to predict the signal 
2283 \begin_inset Formula $x[n]$
2284 \end_inset 
2285
2286  using a linear combination of its past samples:
2287 \layout Standard
2288
2289
2290 \begin_inset Formula \[
2291 y[n]=\sum_{i=1}^{N}a_{i}x[n-i]\]
2292
2293 \end_inset 
2294
2295 where 
2296 \begin_inset Formula $y[n]$
2297 \end_inset 
2298
2299  is the linear prediction of 
2300 \begin_inset Formula $x[n]$
2301 \end_inset 
2302
2303 .
2304  The prediction error is thus given by:
2305 \begin_inset Formula \[
2306 e[n]=x[n]-y[n]=x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\]
2307
2308 \end_inset 
2309
2310
2311 \layout Standard
2312
2313 The goal of the LPC analysis is to find the best prediction coefficients
2314  
2315 \begin_inset Formula $a_{i}$
2316 \end_inset 
2317
2318  which minimize the quadratic error function:
2319 \begin_inset Formula \[
2320 E=\sum_{n=0}^{L-1}\left[e[n]\right]^{2}=\sum_{n=0}^{L-1}\left[x[n]-\sum_{i=1}^{N}a_{i}x[n-i]\right]^{2}\]
2321
2322 \end_inset 
2323
2324 That can be done by making all derivatives 
2325 \begin_inset Formula $\frac{\partial E}{\partial a_{i}}$
2326 \end_inset 
2327
2328  equal to zero:
2329 \begin_inset Formula \[
2330 \frac{\partial E}{\partial a_{i}}=\frac{\partial}{\partial a_{i}}\sum_{n=0}^{L-1}\left[x[n]-\sum_{j=1}^{N}a_{j}x[n-j]\right]^{2}=0\]
2331
2332 \end_inset 
2333
2334
2335 \layout Standard
2336
2337 The 
2338 \begin_inset Formula $a_{i}$
2339 \end_inset 
2340
2341  filter coefficients are computed using the Levinson-Durbin
2342 \begin_inset LatexCommand \index{Levinson-Durbin}
2343
2344 \end_inset 
2345
2346  algorithm, which starts from the auto-correlation
2347 \begin_inset LatexCommand \index{auto-correlation}
2348
2349 \end_inset 
2350
2351  
2352 \begin_inset Formula $R(m)$
2353 \end_inset 
2354
2355  of the signal 
2356 \begin_inset Formula $x[n]$
2357 \end_inset 
2358
2359 .
2360 \layout Standard
2361
2362
2363 \begin_inset Formula \[
2364 R(m)=\sum_{i=0}^{L-1}x[i]x[i-m]\]
2365
2366 \end_inset 
2367
2368
2369 \layout Standard
2370
2371 For an order 
2372 \begin_inset Formula $N$
2373 \end_inset 
2374
2375  filter, we have:
2376 \begin_inset Formula \[
2377 \mathbf{R}=\left[\begin{array}{cccc}
2378 R(0) & R(1) & \cdots & R(N-1)\\
2379 R(1) & R(0) & \cdots & R(N-2)\\
2380 \vdots & \vdots & \ddots & \vdots\\
2381 R(N-1) & R(N-2) & \cdots & R(0)\end{array}\right]\]
2382
2383 \end_inset 
2384
2385
2386 \begin_inset Formula \[
2387 \mathbf{r}=\left[\begin{array}{c}
2388 R(1)\\
2389 R(2)\\
2390 \vdots\\
2391 R(N)\end{array}\right]\]
2392
2393 \end_inset 
2394
2395
2396 \layout Standard
2397
2398 The filter coefficients 
2399 \begin_inset Formula $a_{i}$
2400 \end_inset 
2401
2402  are found by solving the system 
2403 \begin_inset Formula $\mathbf{Ra}=\mathbf{r}$
2404 \end_inset 
2405
2406 .
2407  What the Levinson-Durbin algorithm does here is making the solution to
2408  the problem 
2409 \begin_inset Formula $\mathcal{O}\left(N^{2}\right)$
2410 \end_inset 
2411
2412  instead of 
2413 \begin_inset Formula $\mathcal{O}\left(N^{3}\right)$
2414 \end_inset 
2415
2416  by exploiting the fact that matrix 
2417 \begin_inset Formula $\mathbf{R}$
2418 \end_inset 
2419
2420  is toeplitz hermitian.
2421  Also, it can be proved that all the roots of 
2422 \begin_inset Formula $A(z)$
2423 \end_inset 
2424
2425  are within the unit circle, which means that 
2426 \begin_inset Formula $1/A(z)$
2427 \end_inset 
2428
2429  is always stable.
2430  This is in theory; in practice because of finite precision, there are two
2431  commonly used techniques to make sure we have a stable filter.
2432  First, we multiply 
2433 \begin_inset Formula $R(0)$
2434 \end_inset 
2435
2436  by a number slightly above one (such as 1.0001), which is equivalent to
2437  adding noise to the signal.
2438  Also, we can apply a window to the auto-correlation, which is equivalent
2439  to filtering in the frequency domain, reducing sharp resonances.
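\layout Standard

The recursion can be sketched in C as follows.
 This is a minimal illustration rather than the Speex implementation:
 the function names and MAX_ORDER are hypothetical, double precision is
 assumed, samples before the start of the frame are taken as zero, and
 the coefficients follow the sign convention used above (prediction equal
 to the sum of a_i times x[n-i]).

```c
#include <assert.h>

#define MAX_ORDER 32 /* hypothetical upper bound on the filter order */

/* R[m] = sum over i of x[i] * x[i-m] for m = 0..order, treating
   samples before the start of the frame as zero. */
void autocorrelation(const double *x, int len, int order, double *R)
{
    int m, i;

    for (m = 0; m <= order; m++) {
        double acc = 0.0;
        for (i = m; i < len; i++)
            acc += x[i] * x[i - m];
        R[m] = acc;
    }
}

/* Solves R a = r in O(order^2) operations. a must hold order + 1
   values; a[1..order] receive the coefficients and a[0] is unused.
   Returns the final prediction error energy, or -1 if R[0] <= 0. */
double levinson_durbin(const double *R, int order, double *a)
{
    double tmp[MAX_ORDER + 1];
    double err = R[0];
    int i, j;

    assert(order >= 1 && order <= MAX_ORDER);
    for (i = 0; i <= order; i++)
        a[i] = 0.0;
    if (err <= 0.0)
        return -1.0;
    for (i = 1; i <= order && err > 0.0; i++) {
        /* Reflection coefficient for order i. */
        double k = R[i];
        for (j = 1; j < i; j++)
            k -= a[j] * R[i - j];
        k /= err;
        /* Update the lower-order coefficients, then append a[i]. */
        for (j = 1; j < i; j++)
            tmp[j] = a[j] - k * a[i - j];
        for (j = 1; j < i; j++)
            a[j] = tmp[j];
        a[i] = k;
        err *= 1.0 - k * k;
    }
    return err;
}
```

\layout Standard

The first stabilisation technique mentioned above would correspond to
 scaling R[0] by a factor such as 1.0001 before calling the hypothetical
 levinson_durbin().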
2440 \layout Standard
2441
2442 The linear prediction model represents each speech sample as a linear combination
2443  of past samples, plus an error signal called the excitation (or residual).
2444 \begin_inset Formula \[
2445 x[n]=\sum_{i=1}^{N}a_{i}x[n-i]+e[n]\]
2446
2447 \end_inset 
2448
2449
2450 \layout Standard
2451
2452 In the 
2453 \emph on 
2454 z
2455 \emph default 
2456 -domain, this can be expressed as
2457 \layout Standard
2458
2459
2460 \begin_inset Formula \[
2461 X(z)=\frac{1}{A(z)}\: E(z)\]
2462
2463 \end_inset 
2464
2465
2466 \layout Standard
2467
2468 where 
2469 \begin_inset Formula $A(z)$
2470 \end_inset 
2471
2472  is defined as
2473 \layout Standard
2474
2475
2476 \begin_inset Formula \[
2477 A(z)=1-\sum_{i=1}^{N}a_{i}z^{-i}\]
2478
2479 \end_inset 
2480
2481
2482 \layout Standard
2483
2484 We usually refer to 
2485 \begin_inset Formula $A(z)$
2486 \end_inset 
2487
2488  as the analysis filter and 
2489 \begin_inset Formula $1/A(z)$
2490 \end_inset 
2491
2492  as the synthesis filter.
2493  The whole process is called short-term prediction as it predicts the signal
2494  
2495 \begin_inset Formula $x[n]$
2496 \end_inset 
2497
2498  using only the 
2499 \begin_inset Formula $N$
2500 \end_inset 
2501
2502  past samples, where 
2503 \begin_inset Formula $N$
2504 \end_inset 
2505
2506  is usually around 10.
2507 \layout Standard
2508
2509 Because LPC coefficients have very little robustness to quantization, they
2510  are converted to Line Spectral Pair
2511 \begin_inset LatexCommand \index{line spectral pair}
2512
2513 \end_inset 
2514
2515  (LSP) coefficients, which behave much better under quantization; in particular,
2516  they make it easy to keep the filter stable.
2517  
2518 \layout Subsection
2519
2520 Pitch Prediction
2521 \begin_inset LatexCommand \index{pitch}
2522
2523 \end_inset 
2524
2525
2526 \layout Standard
2527
2528 During voiced segments, the speech signal is nearly periodic, so it is possible
2529  to take advantage of that property by approximating the excitation signal
2530  
2531 \begin_inset Formula $e[n]$
2532 \end_inset 
2533
2534  by a gain times the past of the excitation:
2535 \layout Standard
2536
2537
2538 \begin_inset Formula \[
2539 e[n]\simeq p[n]=\beta e[n-T]\]
2540
2541 \end_inset 
2542
2543
2544 \layout Standard
2545
2546 where 
2547 \begin_inset Formula $T$
2548 \end_inset 
2549
2550  is the pitch period, 
2551 \begin_inset Formula $\beta$
2552 \end_inset 
2553
2554  is the pitch gain.
2563  We call that long-term prediction since the excitation is predicted from
2564  
2565 \begin_inset Formula $e[n-T]$
2566 \end_inset 
2567
2568  with 
2569 \begin_inset Formula $T\gg N$
2570 \end_inset 
2571
2572 .
2573 \layout Subsection
2574
2575 Innovation Codebook
2576 \layout Standard
2577
2578 The final excitation 
2579 \begin_inset Formula $e[n]$
2580 \end_inset 
2581
2582  will be the sum of the pitch prediction and an 
2583 \emph on 
2584 innovation
2585 \emph default 
2586  signal 
2587 \begin_inset Formula $c[n]$
2588 \end_inset 
2589
2590  taken from a fixed codebook.
2591 \layout Standard
2592
2593
2594 \begin_inset Formula \[
2595 e[n]=p[n]+c[n]=\beta e[n-T]+c[n]\]
2596
2597 \end_inset 
2598
2599 This is where most of the bits in a CELP codec are allocated.
2600  It represents the information that couldn't be obtained either from linear
2601  prediction or pitch prediction.
2602  In the 
2603 \emph on 
2604 z
2605 \emph default 
2606 -domain we can represent the final signal 
2607 \begin_inset Formula $X(z)$
2608 \end_inset 
2609
2610  as 
2611 \begin_inset Formula \[
2612 X(z)=\frac{C(z)}{A(z)\left(1-\beta z^{-T}\right)}\]
2613
2614 \end_inset 
2615
2616
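\layout Standard

The recursion e[n] = beta e[n-T] + c[n] can be sketched as follows. This is only an illustration of the single-tap formula above: the function name and the convention that the list past holds the excitation samples before n = 0 (oldest first) are assumptions of the sketch.

```python
def build_excitation(c, past, T, beta):
    """Return e[0..len(c)-1] with e[n] = beta * e[n - T] + c[n].

    `past` holds the already-known excitation e[n] for n < 0, oldest first.
    """
    if T < 1 or T > len(past):
        raise ValueError("need 1 <= T <= len(past)")
    e = list(past)               # history followed by the new samples
    start = len(e)
    for n in range(len(c)):
        # e[start + n - T] is the sample one pitch period earlier; it may be
        # a sample produced earlier in this same loop when T < len(c).
        e.append(beta * e[start + n - T] + c[n])
    return e[start:]
```

With a zero innovation, a single past pulse is repeated every T samples and scaled by beta each time, which is the periodic structure that long-term prediction exploits.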
2617 \layout Subsection
2618
2619 Analysis-by-Synthesis and Error Weighting
2620 \begin_inset LatexCommand \index{error weighting}
2621
2622 \end_inset 
2623
2624
2625 \begin_inset LatexCommand \index{analysis-by-synthesis}
2626
2627 \end_inset 
2628
2629
2630 \layout Standard
2631
2632 Most (if not all) modern audio codecs attempt to 
2633 \begin_inset Quotes eld
2634 \end_inset 
2635
2636 shape
2637 \begin_inset Quotes erd
2638 \end_inset 
2639
2640  the noise so that it appears mostly in the frequency regions where the
2641  ear cannot detect it.
2642  For example, the ear is more tolerant to noise in parts of the spectrum
2643  that are louder and 
2644 \emph on 
2645 vice versa
2646 \emph default 
2647 .
2648  That's why instead of minimizing the simple quadratic error
2649 \begin_inset Formula \[
2650 E=\sum_{n}\left(x[n]-\overline{x}[n]\right)^{2}\]
2651
2652 \end_inset 
2653
2654 where 
2655 \begin_inset Formula $\overline{x}[n]$
2656 \end_inset 
2657
2658  is the encoded signal, we minimize the error for the perceptually weighted
2659  signal
2660 \begin_inset Formula \[
2661 X_{w}(z)=W(z)X(z)\]
2662
2663 \end_inset 
2664
2665 where 
2666 \begin_inset Formula $W(z)$
2667 \end_inset 
2668
2669  is the weighting filter, usually of the form
2670 \layout Standard
2671
2672
2673 \begin_inset Formula \begin{equation}
2674 W(z)=\frac{A\left(\frac{z}{\gamma_{1}}\right)}{A\left(\frac{z}{\gamma_{2}}\right)}\label{eq:weighting_filter}\end{equation}
2675
2676 \end_inset 
2677
2678
2679 \layout Standard
2680
2681 with control parameters 
2682 \begin_inset Formula $\gamma_{1}>\gamma_{2}$
2683 \end_inset 
2684
2685 .
2686  If the noise is white in the perceptually weighted domain, then in the
2687  signal domain its spectral shape will be of the form
2688 \begin_inset Formula \[
2689 A_{noise}(z)=\frac{1}{W(z)}=\frac{A\left(\frac{z}{\gamma_{2}}\right)}{A\left(\frac{z}{\gamma_{1}}\right)}\]
2690
2691 \end_inset 
2692
2693
2694 \layout Standard
2695
2696 If a filter 
2697 \begin_inset Formula $A(z)$
2698 \end_inset 
2699
2700  has (complex) poles at 
2701 \begin_inset Formula $p_{i}$
2702 \end_inset 
2703
2704  in the 
2705 \begin_inset Formula $z$
2706 \end_inset 
2707
2708 -plane, the filter 
2709 \begin_inset Formula $A(z/\gamma)$
2710 \end_inset 
2711
2712  will have its poles at 
2713 \begin_inset Formula $p_{i}^{'}=\gamma p_{i}$
2714 \end_inset 
2715
2716 , making it a flatter version of 
2717 \begin_inset Formula $A(z)$
2718 \end_inset 
2719
2720 .
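\layout Standard

Substituting z/gamma into the definition of A(z) shows that A(z/gamma) has coefficients a_i gamma^i, so the weighting filter can be built directly from the LPC coefficients. The short Python sketch below illustrates this; the function names are hypothetical.

```python
def bandwidth_expand(a, gamma):
    """Coefficients of A(z/gamma): a_i is replaced by a_i * gamma**i.

    `a` holds a_1..a_N, so list position k corresponds to i = k + 1.
    """
    return [a_i * gamma ** (k + 1) for k, a_i in enumerate(a)]


def weighting_filter(a, gamma1, gamma2):
    """Return (numerator, denominator) coefficient lists of
    W(z) = A(z/gamma1) / A(z/gamma2)."""
    return bandwidth_expand(a, gamma1), bandwidth_expand(a, gamma2)
```

Since gamma is smaller than 1, higher-order coefficients shrink fastest, which pulls the roots towards the origin and flattens the response.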
2721 \layout Section
2722 \pagebreak_top 
2723 Speex narrowband mode
2724 \begin_inset LatexCommand \label{sec:Speex-narrowband-mode}
2725
2726 \end_inset 
2727
2728
2729 \begin_inset LatexCommand \index{narrowband}
2730
2731 \end_inset 
2732
2733
2734 \layout Standard
2735
2736 This section looks at how Speex works for narrowband (
2737 \begin_inset Formula $8\:\mathrm{kHz}$
2738 \end_inset 
2739
2740  sampling rate) operation.
2741  The frame size for this mode is 
2742 \begin_inset Formula $20\:\mathrm{ms}$
2743 \end_inset 
2744
2745 , corresponding to 160 samples.
2746  Each frame is also subdivided into 4 sub-frames of 40 samples each.
2747 \layout Standard
2748
2749 Many design decisions were based on the original goals and assumptions:
2750 \layout Itemize
2751
2752 Minimizing the amount of information extracted from past frames (for robustness
2753  to packet loss)
2754 \layout Itemize
2755
2756 Dynamically-selectable codebooks (LSP, pitch and innovation)
2757 \layout Itemize
2758
2759 Sub-vector fixed (innovation) codebooks
2760 \layout Subsection
2761
2762 LPC Analysis
2763 \begin_inset LatexCommand \index{linear prediction}
2764
2765 \end_inset 
2766
2767
2768 \layout Standard
2769
2770 An LPC analysis is first performed on an (asymmetric Hamming) window that
2771  spans the whole current frame and half a frame in advance.
2772  The LPC coefficients are then converted to Line Spectral Pair
2773 \begin_inset LatexCommand \index{line spectral pair}
2774
2775 \end_inset 
2776
2777  (LSP), a representation that is more robust to quantization.
2778  The LSP's are considered to be associated with the 
2779 \begin_inset Formula $4^{th}$
2780 \end_inset 
2781
2782  sub-frame and the LSP's associated with the first 3 sub-frames are linearly
2783  interpolated using the current and previous LSP's.
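\layout Standard

The interpolation can be sketched as below. The text only says that the interpolation is linear, so the evenly spaced weights (k+1)/4 used here are an assumption; they are chosen so that the last sub-frame uses the current LSP's unchanged. The function name is hypothetical.

```python
def interpolate_lsp(prev_lsp, cur_lsp, n_sub=4):
    """Return one LSP vector per sub-frame.

    Sub-frame k (0-based) uses weight w = (k + 1) / n_sub on the current
    LSP's and (1 - w) on the previous ones (assumed weighting).
    """
    out = []
    for k in range(n_sub):
        w = (k + 1) / n_sub
        out.append([(1.0 - w) * p + w * c for p, c in zip(prev_lsp, cur_lsp)])
    return out
```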
2784 \layout Standard
2785
2786 The LSP's are encoded using 30 bits for higher quality modes and 18 bits
2787  for lower quality, through the use of a multi-stage split-vector quantizer.
2788  For the lower quality modes, the 10 coefficients are first quantized with
2789  6 bits and the error is then divided into two 5-coefficient sub-vectors.
2790  Each of them is quantized with 6 bits, for a total of 18 bits.
2791  For the higher quality modes, the remaining error on both sub-vectors is
2792  further quantized with 6 bits each, for a total of 30 bits.
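\layout Standard

The multi-stage split-vector structure can be sketched as follows. The codebooks are passed in by the caller and are placeholders, not the actual Speex tables; the nearest-neighbour search with a plain squared error is also an assumption of this sketch. With 64-entry codebooks each index costs 6 bits, so one full-vector stage plus one split stage gives 3 indices (18 bits) and a second split stage gives 5 indices (30 bits).

```python
def nearest(codebook, v):
    """Index of the codebook entry closest to v in squared error."""
    def dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, v))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def msvq_encode(v, stage1, split_stages):
    """Quantize v with a full-vector stage, then split stages on the error.

    stage1:       codebook of full-length vectors
    split_stages: list of (low_codebook, high_codebook) pairs, one per stage
    Returns (indices, remaining_error).
    """
    half = len(v) // 2
    indices = [nearest(stage1, v)]
    err = [a - b for a, b in zip(v, stage1[indices[0]])]
    for low_cb, high_cb in split_stages:
        i_lo = nearest(low_cb, err[:half])
        i_hi = nearest(high_cb, err[half:])
        indices += [i_lo, i_hi]
        approx = list(low_cb[i_lo]) + list(high_cb[i_hi])
        err = [a - b for a, b in zip(err, approx)]
    return indices, err
```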
2793 \layout Standard
2794
2795 The perceptual weighting filter 
2796 \begin_inset Formula $W(z)$
2797 \end_inset 
2798
2799  used by Speex is derived from the LPC filter 
2800 \begin_inset Formula $A(z)$
2801 \end_inset 
2802
2803  and corresponds to the one described by eq.
2804  
2805 \begin_inset LatexCommand \ref{eq:weighting_filter}
2806
2807 \end_inset 
2808
2809  with 
2810 \begin_inset Formula $\gamma_{1}=0.9$
2811 \end_inset 
2812
2813  and 
2814 \begin_inset Formula $\gamma_{2}=0.6$
2815 \end_inset 
2816
2817 .
2818  We can use the unquantized 
2819 \begin_inset Formula $A(z)$
2820 \end_inset 
2821
2822  filter since the weighting filter is only used in the encoder.
2823 \layout Subsection
2824
2825 Pitch Prediction (adaptive codebook)
2826 \begin_inset LatexCommand \index{pitch}
2827
2828 \end_inset 
2829
2830
2831 \layout Standard
2832
2833 Speex uses a 3-tap prediction for pitch.
2834  That is, the pitch prediction signal 
2835 \begin_inset Formula $p[n]$
2836 \end_inset 
2837
2838  is obtained from the past of the excitation by:
2839 \begin_inset Formula \[
2840 p[n]=\beta_{0}e[n-T-1]+\beta_{1}e[n-T]+\beta_{2}e[n-T+1]\]
2841
2842 \end_inset 
2843
2844
2845 \layout Standard
2846
2847 where 
2848 \begin_inset Formula $T$
2849 \end_inset 
2850
2851  is the pitch period and the 
2852 \begin_inset Formula $\beta_{i}$
2853 \end_inset 
2854
2855  are the prediction (filter) taps.
2856  It is worth noting that when the pitch period is smaller than the sub-frame size,
2857  we repeat the excitation at a period 
2858 \begin_inset Formula $T$
2859 \end_inset 
2860
2861 .
2862  For example, when 
2863 \begin_inset Formula $n-T+1\geq0$
2864 \end_inset 
2865
2866 , we use 
2867 \begin_inset Formula $n-2T+1$
2868 \end_inset 
2869
2870  instead.
2871  The period and quantized gains are determined in closed loop.
2872  In most modes, the pitch period is encoded with 7 bits in the 
2873 \begin_inset Formula $\left[17,144\right]$
2874 \end_inset 
2875
2876  range and the 
2877 \begin_inset Formula $\beta_{i}$
2878 \end_inset 
2879
2880  coefficients are vector-quantized using 7 bits at higher bit-rates (15 kbps
2881  narrowband and above) and 5 bits at lower bit-rates (11 kbps narrowband
2882  and below).
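\layout Standard

The 3-tap prediction with period repetition can be sketched in Python as follows. The sketch assumes that n is counted from the start of the sub-frame, so that a non-negative tap index points at a sample that is not yet available and is moved back by one more period. Coding the period as an offset from 17 is likewise only an assumption that fits 128 values into 7 bits; all names are hypothetical.

```python
PITCH_MIN, PITCH_MAX = 17, 144   # pitch period range quoted in the text


def pitch_predict(past, T, betas, n_samples=40):
    """p[n] = b0*e[n-T-1] + b1*e[n-T] + b2*e[n-T+1] for one sub-frame.

    `past` holds the excitation before the sub-frame, oldest first, so
    past[-1] is e[-1]. It must contain at least T + 1 samples.
    """
    if not PITCH_MIN <= T <= PITCH_MAX:
        raise ValueError("pitch period out of range")
    if len(past) < T + 1:
        raise ValueError("not enough excitation history")
    p = []
    for n in range(n_samples):
        acc = 0.0
        for k, beta in enumerate(betas):
            j = n - T - 1 + k        # index of the tapped excitation sample
            while j >= 0:            # not available yet: go back one period
                j -= T
            acc += beta * past[j]    # negative j counts back from e[-1]
        p.append(acc)
    return p


def encode_pitch_period(T):
    """Map T in [17, 144] to a 7-bit value (assumed offset coding)."""
    if not PITCH_MIN <= T <= PITCH_MAX:
        raise ValueError("pitch period out of range")
    return T - PITCH_MIN
```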
2883 \layout Subsection
2884
2885 Innovation Codebook
2886 \layout Standard
2887
2888 In Speex, the innovation signal is quantized using shape-only vector quantizatio
2889 n (VQ).
2890  That means that the codebooks that are used represent both the shape and
2891  the gain at the same time.
2892  This saves many bits that would otherwise be allocated for a separate gain
2893  at the price of a slight increase in complexity.
2894  
2895 \layout Subsection
2896
2897 Bit allocation
2898 \layout Standard
2899
2900 There are 7 different narrowband bit-rates defined for Speex, ranging from
2901  200 bps to 18.15 kbps, although the modes below 5.9 kbps should not be used
2902  for speech.
2903  The bit-allocation for each mode is detailed in table 
2904 \begin_inset LatexCommand \ref{cap:bits-narrowband}
2905
2906 \end_inset 
2907
2908 .
2909  Each frame starts with the mode ID encoded with 4 bits which allows a range
2910  from 0 to 15, though only the first 7 values are used (the others are reserved).
2911  The parameters are listed in the table in the order they are packed in
2912  the bit-stream.
2913  All frame-based parameters are packed before sub-frame parameters.
2914  The parameters for a certain sub-frame are all packed before the following
2915  sub-frame is packed.
2916  Note that the 
2917 \begin_inset Quotes eld
2918 \end_inset 
2919
2920 OL
2921 \begin_inset Quotes erd
2922 \end_inset 
2923
2924  in the parameter description means that the parameter is an open loop estimatio
2925 n based on the whole frame.
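\layout Standard

The packing order described above (frame-based parameters first, then all parameters of each sub-frame in turn) can be sketched as follows. The class and function names are hypothetical and this is not the Speex bit-packing API; writing each value most significant bit first and zero-padding the last byte are assumptions of the sketch.

```python
class BitPacker:
    """Collects fixed-width unsigned fields into a bit list."""

    def __init__(self):
        self.bits = []

    def pack(self, value, nbits):
        if not 0 <= value < (1 << nbits):
            raise ValueError("value does not fit in nbits")
        for shift in range(nbits - 1, -1, -1):   # MSB first (assumed)
            self.bits.append((value >> shift) & 1)

    def to_bytes(self):
        padded = self.bits + [0] * (-len(self.bits) % 8)   # zero-pad last byte
        out = bytearray()
        for i in range(0, len(padded), 8):
            byte = 0
            for b in padded[i:i + 8]:
                byte = (byte << 1) | b
            out.append(byte)
        return bytes(out)


def pack_frame(frame_params, subframe_params):
    """frame_params: ordered list of (value, nbits) pairs.
    subframe_params: one such list per sub-frame."""
    bp = BitPacker()
    for value, nbits in frame_params:        # all frame-based parameters
        bp.pack(value, nbits)
    for sub in subframe_params:              # then each sub-frame in turn
        for value, nbits in sub:
            bp.pack(value, nbits)
    return bp
```

The caller supplies the fields in the order of the bit-allocation table, with the widths of the selected mode.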
2926 \layout Standard
2927
2928
2929 \begin_inset Float table
2930 placement h
2931 wide true
2932 collapsed false
2933
2934 \layout Standard
2935
2936
2937 \begin_inset  Tabular
2938 <lyxtabular version="3" rows="12" columns="11">
2939 <features>
2940 <column alignment="center" valignment="top" leftline="true" width="0pt">
2941 <column alignment="center" valignment="top" leftline="true" width="0pt">
2942 <column alignment="center" valignment="top" leftline="true" width="0pt">
2943 <column alignment="center" valignment="top" leftline="true" width="0pt">
2944 <column alignment="center" valignment="top" leftline="true" width="0pt">
2945 <column alignment="center" valignment="top" leftline="true" width="0pt">
2946 <column alignment="center" valignment="top" leftline="true" width="0pt">
2947 <column alignment="center" valignment="top" leftline="true" width="0pt">
2948 <column alignment="center" valignment="top" leftline="true" width="0pt">
2949 <column alignment="center" valignment="top" leftline="true" width="0pt">
2950 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
2951 <row topline="true" bottomline="true">
2952 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2953 \begin_inset Text
2954
2955 \layout Standard
2956
2957 Parameter
2958 \end_inset 
2959 </cell>
2960 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2961 \begin_inset Text
2962
2963 \layout Standard
2964
2965 Update rate
2966 \end_inset 
2967 </cell>
2968 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2969 \begin_inset Text
2970
2971 \layout Standard
2972
2973 0
2974 \end_inset 
2975 </cell>
2976 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2977 \begin_inset Text
2978
2979 \layout Standard
2980
2981 1
2982 \end_inset 
2983 </cell>
2984 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2985 \begin_inset Text
2986
2987 \layout Standard
2988
2989 2
2990 \end_inset 
2991 </cell>
2992 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2993 \begin_inset Text
2994
2995 \layout Standard
2996
2997 3
2998 \end_inset 
2999 </cell>
3000 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3001 \begin_inset Text
3002
3003 \layout Standard
3004
3005 4
3006 \end_inset 
3007 </cell>
3008 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3009 \begin_inset Text
3010
3011 \layout Standard
3012
3013 5
3014 \end_inset 
3015 </cell>
3016 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3017 \begin_inset Text
3018
3019 \layout Standard
3020
3021 6
3022 \end_inset 
3023 </cell>
3024 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3025 \begin_inset Text
3026
3027 \layout Standard
3028
3029 7
3030 \end_inset 
3031 </cell>
3032 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3033 \begin_inset Text
3034
3035 \layout Standard
3036
3037 8
3038 \end_inset 
3039 </cell>
3040 </row>
3041 <row topline="true">
3042 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3043 \begin_inset Text
3044
3045 \layout Standard
3046
3047 Wideband bit
3048 \end_inset 
3049 </cell>
3050 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3051 \begin_inset Text
3052
3053 \layout Standard
3054
3055 frame
3056 \end_inset 
3057 </cell>
3058 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3059 \begin_inset Text
3060
3061 \layout Standard
3062
3063 1
3064 \end_inset 
3065 </cell>
3066 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3067 \begin_inset Text
3068
3069 \layout Standard
3070
3071 1
3072 \end_inset 
3073 </cell>
3074 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3075 \begin_inset Text
3076
3077 \layout Standard
3078
3079 1
3080 \end_inset 
3081 </cell>
3082 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3083 \begin_inset Text
3084
3085 \layout Standard
3086
3087 1
3088 \end_inset 
3089 </cell>
3090 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3091 \begin_inset Text
3092
3093 \layout Standard
3094
3095 1
3096 \end_inset 
3097 </cell>
3098 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3099 \begin_inset Text
3100
3101 \layout Standard
3102
3103 1
3104 \end_inset 
3105 </cell>
3106 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3107 \begin_inset Text
3108
3109 \layout Standard
3110
3111 1
3112 \end_inset 
3113 </cell>
3114 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3115 \begin_inset Text
3116
3117 \layout Standard
3118
3119 1
3120 \end_inset 
3121 </cell>
3122 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3123 \begin_inset Text
3124
3125 \layout Standard
3126
3127 1
3128 \end_inset 
3129 </cell>
3130 </row>
3131 <row topline="true">
3132 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3133 \begin_inset Text
3134
3135 \layout Standard
3136
3137 Mode ID
3138 \end_inset 
3139 </cell>
3140 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3141 \begin_inset Text
3142
3143 \layout Standard
3144
3145 frame
3146 \end_inset 
3147 </cell>
3148 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3149 \begin_inset Text
3150
3151 \layout Standard
3152
3153 4
3154 \end_inset 
3155 </cell>
3156 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3157 \begin_inset Text
3158
3159 \layout Standard
3160
3161 4
3162 \end_inset 
3163 </cell>
3164 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3165 \begin_inset Text
3166
3167 \layout Standard
3168
3169 4
3170 \end_inset 
3171 </cell>
3172 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3173 \begin_inset Text
3174
3175 \layout Standard
3176
3177 4
3178 \end_inset 
3179 </cell>
3180 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3181 \begin_inset Text
3182
3183 \layout Standard
3184
3185 4
3186 \end_inset 
3187 </cell>
3188 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3189 \begin_inset Text
3190
3191 \layout Standard
3192
3193 4
3194 \end_inset 
3195 </cell>
3196 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3197 \begin_inset Text
3198
3199 \layout Standard
3200
3201 4
3202 \end_inset 
3203 </cell>
3204 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3205 \begin_inset Text
3206
3207 \layout Standard
3208
3209 4
3210 \end_inset 
3211 </cell>
3212 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3213 \begin_inset Text
3214
3215 \layout Standard
3216
3217 4
3218 \end_inset 
3219 </cell>
3220 </row>
3221 <row topline="true">
3222 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3223 \begin_inset Text
3224
3225 \layout Standard
3226
3227 LSP
3228 \end_inset 
3229 </cell>
3230 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3231 \begin_inset Text
3232
3233 \layout Standard
3234
3235 frame
3236 \end_inset 
3237 </cell>
3238 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3239 \begin_inset Text
3240
3241 \layout Standard
3242
3243 0
3244 \end_inset 
3245 </cell>
3246 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3247 \begin_inset Text
3248
3249 \layout Standard
3250
3251 18
3252 \end_inset 
3253 </cell>
3254 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3255 \begin_inset Text
3256
3257 \layout Standard
3258
3259 18
3260 \end_inset 
3261 </cell>
3262 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3263 \begin_inset Text
3264
3265 \layout Standard
3266
3267 18
3268 \end_inset 
3269 </cell>
3270 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3271 \begin_inset Text
3272
3273 \layout Standard
3274
3275 18
3276 \end_inset 
3277 </cell>
3278 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3279 \begin_inset Text
3280
3281 \layout Standard
3282
3283 30
3284 \end_inset 
3285 </cell>
3286 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3287 \begin_inset Text
3288
3289 \layout Standard
3290
3291 30
3292 \end_inset 
3293 </cell>
3294 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3295 \begin_inset Text
3296
3297 \layout Standard
3298
3299 30
3300 \end_inset 
3301 </cell>
3302 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3303 \begin_inset Text
3304
3305 \layout Standard
3306
3307 18
3308 \end_inset 
3309 </cell>
3310 </row>
3311 <row topline="true">
3312 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3313 \begin_inset Text
3314
3315 \layout Standard
3316
3317 OL pitch
3318 \end_inset 
3319 </cell>
3320 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3321 \begin_inset Text
3322
3323 \layout Standard
3324
3325 frame
3326 \end_inset 
3327 </cell>
3328 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3329 \begin_inset Text
3330
3331 \layout Standard
3332
3333 0
3334 \end_inset 
3335 </cell>
3336 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3337 \begin_inset Text
3338
3339 \layout Standard
3340
3341 7
3342 \end_inset 
3343 </cell>
3344 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3345 \begin_inset Text
3346
3347 \layout Standard
3348
3349 7
3350 \end_inset 
3351 </cell>
3352 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3353 \begin_inset Text
3354
3355 \layout Standard
3356
3357 0
3358 \end_inset 
3359 </cell>
3360 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3361 \begin_inset Text
3362
3363 \layout Standard
3364
3365 0
3366 \end_inset 
3367 </cell>
3368 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3369 \begin_inset Text
3370
3371 \layout Standard
3372
3373 0
3374 \end_inset 
3375 </cell>
3376 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3377 \begin_inset Text
3378
3379 \layout Standard
3380
3381 0
3382 \end_inset 
3383 </cell>
3384 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3385 \begin_inset Text
3386
3387 \layout Standard
3388
3389 0
3390 \end_inset 
3391 </cell>
3392 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3393 \begin_inset Text
3394
3395 \layout Standard
3396
3397 7
3398 \end_inset 
3399 </cell>
3400 </row>
3401 <row topline="true">
3402 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3403 \begin_inset Text
3404
3405 \layout Standard
3406
3407 OL pitch gain
3408 \end_inset 
3409 </cell>
3410 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3411 \begin_inset Text
3412
3413 \layout Standard
3414
3415 frame
3416 \end_inset 
3417 </cell>
3418 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3419 \begin_inset Text
3420
3421 \layout Standard
3422
3423 0
3424 \end_inset 
3425 </cell>
3426 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3427 \begin_inset Text
3428
3429 \layout Standard
3430
3431 4
3432 \end_inset 
3433 </cell>
3434 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3435 \begin_inset Text
3436
3437 \layout Standard
3438
3439 0
3440 \end_inset 
3441 </cell>
3442 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3443 \begin_inset Text
3444
3445 \layout Standard
3446
3447 0
3448 \end_inset 
3449 </cell>
3450 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3451 \begin_inset Text
3452
3453 \layout Standard
3454
3455 0
3456 \end_inset 
3457 </cell>
3458 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3459 \begin_inset Text
3460
3461 \layout Standard
3462
3463 0
3464 \end_inset 
3465 </cell>
3466 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3467 \begin_inset Text
3468
3469 \layout Standard
3470
3471 0
3472 \end_inset 
3473 </cell>
3474 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3475 \begin_inset Text
3476
3477 \layout Standard
3478
3479 0
3480 \end_inset 
3481 </cell>
3482 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3483 \begin_inset Text
3484
3485 \layout Standard
3486
3487 4
3488 \end_inset 
3489 </cell>
3490 </row>
3491 <row topline="true">
3492 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3493 \begin_inset Text
3494
3495 \layout Standard
3496
3497 OL Exc gain
3498 \end_inset 
3499 </cell>
3500 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3501 \begin_inset Text
3502
3503 \layout Standard
3504
3505 frame
3506 \end_inset 
3507 </cell>
3508 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3509 \begin_inset Text
3510
3511 \layout Standard
3512
3513 0
3514 \end_inset 
3515 </cell>
3516 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3517 \begin_inset Text
3518
3519 \layout Standard
3520
3521 5
3522 \end_inset 
3523 </cell>
3524 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3525 \begin_inset Text
3526
3527 \layout Standard
3528
3529 5
3530 \end_inset 
3531 </cell>
3532 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3533 \begin_inset Text
3534
3535 \layout Standard
3536
3537 5
3538 \end_inset 
3539 </cell>
3540 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3541 \begin_inset Text
3542
3543 \layout Standard
3544
3545 5
3546 \end_inset 
3547 </cell>
3548 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3549 \begin_inset Text
3550
3551 \layout Standard
3552
3553 5
3554 \end_inset 
3555 </cell>
3556 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3557 \begin_inset Text
3558
3559 \layout Standard
3560
3561 5
3562 \end_inset 
3563 </cell>
3564 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3565 \begin_inset Text
3566
3567 \layout Standard
3568
3569 5
3570 \end_inset 
3571 </cell>
3572 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3573 \begin_inset Text
3574
3575 \layout Standard
3576
3577 5
3578 \end_inset 
3579 </cell>
3580 </row>
3581 <row topline="true">
3582 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3583 \begin_inset Text
3584
3585 \layout Standard
3586
3587 Fine pitch
3588 \end_inset 
3589 </cell>
3590 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3591 \begin_inset Text
3592
3593 \layout Standard
3594
3595 sub-frame
3596 \end_inset 
3597 </cell>
3598 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3599 \begin_inset Text
3600
3601 \layout Standard
3602
3603 0
3604 \end_inset 
3605 </cell>
3606 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3607 \begin_inset Text
3608
3609 \layout Standard
3610
3611 0
3612 \end_inset 
3613 </cell>
3614 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3615 \begin_inset Text
3616
3617 \layout Standard
3618
3619 0
3620 \end_inset 
3621 </cell>
3622 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3623 \begin_inset Text
3624
3625 \layout Standard
3626
3627 7
3628 \end_inset 
3629 </cell>
3630 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3631 \begin_inset Text
3632
3633 \layout Standard
3634
3635 7
3636 \end_inset 
3637 </cell>
3638 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3639 \begin_inset Text
3640
3641 \layout Standard
3642
3643 7
3644 \end_inset 
3645 </cell>
3646 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3647 \begin_inset Text
3648
3649 \layout Standard
3650
3651 7
3652 \end_inset 
3653 </cell>
3654 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3655 \begin_inset Text
3656
3657 \layout Standard
3658
3659 7
3660 \end_inset 
3661 </cell>
3662 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3663 \begin_inset Text
3664
3665 \layout Standard
3666
3667 0
3668 \end_inset 
3669 </cell>
3670 </row>
3671 <row topline="true">
3672 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3673 \begin_inset Text
3674
3675 \layout Standard
3676
3677 Pitch gain
3678 \end_inset 
3679 </cell>
3680 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3681 \begin_inset Text
3682
3683 \layout Standard
3684
3685 sub-frame
3686 \end_inset 
3687 </cell>
3688 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3689 \begin_inset Text
3690
3691 \layout Standard
3692
3693 0
3694 \end_inset 
3695 </cell>
3696 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3697 \begin_inset Text
3698
3699 \layout Standard
3700
3701 0
3702 \end_inset 
3703 </cell>
3704 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3705 \begin_inset Text
3706
3707 \layout Standard
3708
3709 5
3710 \end_inset 
3711 </cell>
3712 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3713 \begin_inset Text
3714
3715 \layout Standard
3716
3717 5
3718 \end_inset 
3719 </cell>
3720 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3721 \begin_inset Text
3722
3723 \layout Standard
3724
3725 5
3726 \end_inset 
3727 </cell>
3728 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3729 \begin_inset Text
3730
3731 \layout Standard
3732
3733 7
3734 \end_inset 
3735 </cell>
3736 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3737 \begin_inset Text
3738
3739 \layout Standard
3740
3741 7
3742 \end_inset 
3743 </cell>
3744 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3745 \begin_inset Text
3746
3747 \layout Standard
3748
3749 7
3750 \end_inset 
3751 </cell>
3752 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3753 \begin_inset Text
3754
3755 \layout Standard
3756
3757 0
3758 \end_inset 
3759 </cell>
3760 </row>
3761 <row topline="true">
3762 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3763 \begin_inset Text
3764
3765 \layout Standard
3766
3767 Innovation gain
3768 \end_inset 
3769 </cell>
3770 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3771 \begin_inset Text
3772
3773 \layout Standard
3774
3775 sub-frame
3776 \end_inset 
3777 </cell>
3778 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3779 \begin_inset Text
3780
3781 \layout Standard
3782
3783 0
3784 \end_inset 
3785 </cell>
3786 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3787 \begin_inset Text
3788
3789 \layout Standard
3790
3791 1
3792 \end_inset 
3793 </cell>
3794 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3795 \begin_inset Text
3796
3797 \layout Standard
3798
3799 0
3800 \end_inset 
3801 </cell>
3802 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3803 \begin_inset Text
3804
3805 \layout Standard
3806
3807 1
3808 \end_inset 
3809 </cell>
3810 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3811 \begin_inset Text
3812
3813 \layout Standard
3814
3815 1
3816 \end_inset 
3817 </cell>
3818 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3819 \begin_inset Text
3820
3821 \layout Standard
3822
3823 3
3824 \end_inset 
3825 </cell>
3826 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3827 \begin_inset Text
3828
3829 \layout Standard
3830
3831 3
3832 \end_inset 
3833 </cell>
3834 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3835 \begin_inset Text
3836
3837 \layout Standard
3838
3839 3
3840 \end_inset 
3841 </cell>
3842 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3843 \begin_inset Text
3844
3845 \layout Standard
3846
3847 0
3848 \end_inset 
3849 </cell>
3850 </row>
3851 <row topline="true" bottomline="true">
3852 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3853 \begin_inset Text
3854
3855 \layout Standard
3856
3857 Innovation VQ
3858 \end_inset 
3859 </cell>
3860 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3861 \begin_inset Text
3862
3863 \layout Standard
3864
3865 sub-frame
3866 \end_inset 
3867 </cell>
3868 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3869 \begin_inset Text
3870
3871 \layout Standard
3872
3873 0
3874 \end_inset 
3875 </cell>
3876 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3877 \begin_inset Text
3878
3879 \layout Standard
3880
3881 0
3882 \end_inset 
3883 </cell>
3884 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3885 \begin_inset Text
3886
3887 \layout Standard
3888
3889 16
3890 \end_inset 
3891 </cell>
3892 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3893 \begin_inset Text
3894
3895 \layout Standard
3896
3897 20
3898 \end_inset 
3899 </cell>
3900 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3901 \begin_inset Text
3902
3903 \layout Standard
3904
3905 35
3906 \end_inset 
3907 </cell>
3908 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3909 \begin_inset Text
3910
3911 \layout Standard
3912
3913 48
3914 \end_inset 
3915 </cell>
3916 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3917 \begin_inset Text
3918
3919 \layout Standard
3920
3921 64
3922 \end_inset 
3923 </cell>
3924 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3925 \begin_inset Text
3926
3927 \layout Standard
3928
3929 96
3930 \end_inset 
3931 </cell>
3932 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3933 \begin_inset Text
3934
3935 \layout Standard
3936
3937 10
3938 \end_inset 
3939 </cell>
3940 </row>
3941 <row topline="true" bottomline="true">
3942 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3943 \begin_inset Text
3944
3945 \layout Standard
3946
3947 Total
3948 \end_inset 
3949 </cell>
3950 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3951 \begin_inset Text
3952
3953 \layout Standard
3954
3955 frame
3956 \end_inset 
3957 </cell>
3958 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3959 \begin_inset Text
3960
3961 \layout Standard
3962
3963 5
3964 \end_inset 
3965 </cell>
3966 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3967 \begin_inset Text
3968
3969 \layout Standard
3970
3971 43
3972 \end_inset 
3973 </cell>
3974 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3975 \begin_inset Text
3976
3977 \layout Standard
3978
3979 119
3980 \end_inset 
3981 </cell>
3982 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3983 \begin_inset Text
3984
3985 \layout Standard
3986
3987 160
3988 \end_inset 
3989 </cell>
3990 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3991 \begin_inset Text
3992
3993 \layout Standard
3994
3995 220
3996 \end_inset 
3997 </cell>
3998 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3999 \begin_inset Text
4000
4001 \layout Standard
4002
4003 300
4004 \end_inset 
4005 </cell>
4006 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4007 \begin_inset Text
4008
4009 \layout Standard
4010
4011 364
4012 \end_inset 
4013 </cell>
4014 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4015 \begin_inset Text
4016
4017 \layout Standard
4018
4019 492
4020 \end_inset 
4021 </cell>
4022 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4023 \begin_inset Text
4024
4025 \layout Standard
4026
4027 79
4028 \end_inset 
4029 </cell>
4030 </row>
4031 </lyxtabular>
4032
4033 \end_inset 
4034
4035
4036 \layout Caption
4037
4038 Bit allocation for narrowband modes
4039 \begin_inset LatexCommand \label{cap:bits-narrowband}
4040
4041 \end_inset 
4042
4043
4044 \end_inset 
4045
4046
4047 \layout Standard
4048
4049 So far, no MOS (Mean Opinion Score
4050 \begin_inset LatexCommand \index{mean opinion score}
4051
4052 \end_inset 
4053
4054 ) subjective evaluation has been performed for Speex.
4055  In order to give an idea of the quality achievable with it, table 
4056 \begin_inset LatexCommand \ref{cap:quality_vs_bps}
4057
4058 \end_inset 
4059
4060  presents my own subjective opinion on it.
4061  It should be noted that different people will perceive the quality differently
4062  and that the person who designed the codec often has a bias (one way or
4063  another) when it comes to subjective evaluation.
4064  Finally, it should be noted that for most codecs (including Speex) encoding
4065  quality sometimes varies depending on the input.
4066  Note that the complexity is only approximate (within 0.5 mflops and using
4067  the lowest complexity setting).
4068  Decoding requires approximately 0.5 mflops
4069 \begin_inset LatexCommand \index{complexity}
4070
4071 \end_inset 
4072
4073  in most modes (1 mflops with perceptual enhancement).
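\layout Standard

As a worked check (a sketch, assuming the 20 ms narrowband frame, i.e.
 50 frames per second, which is not restated in this section), the bit-rates
 in table 
\begin_inset LatexCommand \ref{cap:quality_vs_bps}

\end_inset 

 follow from the total number of bits per frame given in table 
\begin_inset LatexCommand \ref{cap:bits-narrowband}

\end_inset 

.
 For mode 5, for example,
\begin_inset Formula \[
R=\frac{B}{T}=\frac{300\,\mathrm{bits}}{0.02\,\mathrm{s}}=15000\,\mathrm{bps}\]

\end_inset 

and, in the same way, the 160 bits per frame of mode 3 give 8,000 bps.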
4074 \layout Standard
4075
4076
4077 \begin_inset Float table
4078 placement h
4079 wide true
4080 collapsed false
4081
4082 \layout Standard
4083
4084
4085 \begin_inset  Tabular
4086 <lyxtabular version="3" rows="17" columns="4">
4087 <features>
4088 <column alignment="center" valignment="top" leftline="true" width="0pt">
4089 <column alignment="center" valignment="top" leftline="true" width="0pt">
4090 <column alignment="center" valignment="top" leftline="true" width="0pt">
4091 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
4092 <row topline="true" bottomline="true">
4093 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4094 \begin_inset Text
4095
4096 \layout Standard
4097
4098 Mode
4099 \end_inset 
4100 </cell>
4101 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4102 \begin_inset Text
4103
4104 \layout Standard
4105
4106 Bit-rate
4107 \begin_inset LatexCommand \index{bit-rate}
4108
4109 \end_inset 
4110
4111  (bps)
4112 \end_inset 
4113 </cell>
4114 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4115 \begin_inset Text
4116
4117 \layout Standard
4118
4119 mflops
4120 \begin_inset LatexCommand \index{complexity}
4121
4122 \end_inset 
4123
4124
4125 \end_inset 
4126 </cell>
4127 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4128 \begin_inset Text
4129
4130 \layout Standard
4131
4132 Quality/description
4133 \end_inset 
4134 </cell>
4135 </row>
4136 <row topline="true">
4137 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4138 \begin_inset Text
4139
4140 \layout Standard
4141
4142 0
4143 \end_inset 
4144 </cell>
4145 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4146 \begin_inset Text
4147
4148 \layout Standard
4149
4150 250
4151 \end_inset 
4152 </cell>
4153 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4154 \begin_inset Text
4155
4156 \layout Standard
4157
4158 N/A
4159 \end_inset 
4160 </cell>
4161 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4162 \begin_inset Text
4163
4164 \layout Standard
4165
4166 No sound (VBR only)
4167 \end_inset 
4168 </cell>
4169 </row>
4170 <row topline="true">
4171 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4172 \begin_inset Text
4173
4174 \layout Standard
4175
4176 1
4177 \end_inset 
4178 </cell>
4179 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4180 \begin_inset Text
4181
4182 \layout Standard
4183
4184 2,150
4185 \end_inset 
4186 </cell>
4187 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4188 \begin_inset Text
4189
4190 \layout Standard
4191
4192 6
4193 \end_inset 
4194 </cell>
4195 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4196 \begin_inset Text
4197
4198 \layout Standard
4199
4200 Vocoder (mostly for comfort noise)
4201 \end_inset 
4202 </cell>
4203 </row>
4204 <row topline="true">
4205 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4206 \begin_inset Text
4207
4208 \layout Standard
4209
4210 2
4211 \end_inset 
4212 </cell>
4213 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4214 \begin_inset Text
4215
4216 \layout Standard
4217
4218 5,950
4219 \end_inset 
4220 </cell>
4221 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4222 \begin_inset Text
4223
4224 \layout Standard
4225
4226 9
4227 \end_inset 
4228 </cell>
4229 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4230 \begin_inset Text
4231
4232 \layout Standard
4233
4234 Very noticeable artifacts/noise, good intelligibility
4235 \end_inset 
4236 </cell>
4237 </row>
4238 <row topline="true">
4239 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4240 \begin_inset Text
4241
4242 \layout Standard
4243
4244 3
4245 \end_inset 
4246 </cell>
4247 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4248 \begin_inset Text
4249
4250 \layout Standard
4251
4252 8,000
4253 \end_inset 
4254 </cell>
4255 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4256 \begin_inset Text
4257
4258 \layout Standard
4259
4260 10
4261 \end_inset 
4262 </cell>
4263 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4264 \begin_inset Text
4265
4266 \layout Standard
4267
4268 Artifacts/noise sometimes noticeable
4269 \end_inset 
4270 </cell>
4271 </row>
4272 <row topline="true">
4273 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4274 \begin_inset Text
4275
4276 \layout Standard
4277
4278 4
4279 \end_inset 
4280 </cell>
4281 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4282 \begin_inset Text
4283
4284 \layout Standard
4285
4286 11,000
4287 \end_inset 
4288 </cell>
4289 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4290 \begin_inset Text
4291
4292 \layout Standard
4293
4294 14
4295 \end_inset 
4296 </cell>
4297 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4298 \begin_inset Text
4299
4300 \layout Standard
4301
4302 Artifacts usually noticeable only with headphones
4303 \end_inset 
4304 </cell>
4305 </row>
4306 <row topline="true">
4307 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4308 \begin_inset Text
4309
4310 \layout Standard
4311
4312 5
4313 \end_inset 
4314 </cell>
4315 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4316 \begin_inset Text
4317
4318 \layout Standard
4319
4320 15,000
4321 \end_inset 
4322 </cell>
4323 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4324 \begin_inset Text
4325
4326 \layout Standard
4327
4328 11
4329 \end_inset 
4330 </cell>
4331 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4332 \begin_inset Text
4333
4334 \layout Standard
4335
4336 Need good headphones to tell the difference
4337 \end_inset 
4338 </cell>
4339 </row>
4340 <row topline="true">
4341 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4342 \begin_inset Text
4343
4344 \layout Standard
4345
4346 6
4347 \end_inset 
4348 </cell>
4349 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4350 \begin_inset Text
4351
4352 \layout Standard
4353
4354 18,200
4355 \end_inset 
4356 </cell>
4357 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4358 \begin_inset Text
4359
4360 \layout Standard
4361
4362 17.5
4363 \end_inset 
4364 </cell>
4365 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4366 \begin_inset Text
4367
4368 \layout Standard
4369
4370 Hard to tell the difference even with good headphones
4371 \end_inset 
4372 </cell>
4373 </row>
4374 <row topline="true">
4375 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4376 \begin_inset Text
4377
4378 \layout Standard
4379
4380 7
4381 \end_inset 
4382 </cell>
4383 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4384 \begin_inset Text
4385
4386 \layout Standard
4387
4388 24,600
4389 \end_inset 
4390 </cell>
4391 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4392 \begin_inset Text
4393
4394 \layout Standard
4395
4396 14.5
4397 \end_inset 
4398 </cell>
4399 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4400 \begin_inset Text
4401
4402 \layout Standard
4403
4404 Completely transparent for voice, good quality music
4405 \end_inset 
4406 </cell>
4407 </row>
4408 <row topline="true">
4409 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4410 \begin_inset Text
4411
4412 \layout Standard
4413
4414 8
4415 \end_inset 
4416 </cell>
4417 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4418 \begin_inset Text
4419
4420 \layout Standard
4421
4422 3,950
4423 \end_inset 
4424 </cell>
4425 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4426 \begin_inset Text
4427
4428 \layout Standard
4429
4430 -
4431 \end_inset 
4432 </cell>
4433 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4434 \begin_inset Text
4435
4436 \layout Standard
4437
4438 Very noticeable artifacts/noise, good intelligibility
4439 \end_inset 
4440 </cell>
4441 </row>
4442 <row topline="true">
4443 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4444 \begin_inset Text
4445
4446 \layout Standard
4447
4448 9
4449 \end_inset 
4450 </cell>
4451 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4452 \begin_inset Text
4453
4454 \layout Standard
4455
4456 N/A
4457 \end_inset 
4458 </cell>
4459 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4460 \begin_inset Text
4461
4462 \layout Standard
4463
4464 N/A
4465 \end_inset 
4466 </cell>
4467 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4468 \begin_inset Text
4469
4470 \layout Standard
4471
4472 reserved
4473 \end_inset 
4474 </cell>
4475 </row>
4476 <row topline="true">
4477 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4478 \begin_inset Text
4479
4480 \layout Standard
4481
4482 10
4483 \end_inset 
4484 </cell>
4485 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4486 \begin_inset Text
4487
4488 \layout Standard
4489
4490 N/A
4491 \end_inset 
4492 </cell>
4493 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4494 \begin_inset Text
4495
4496 \layout Standard
4497
4498 N/A
4499 \end_inset 
4500 </cell>
4501 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4502 \begin_inset Text
4503
4504 \layout Standard
4505
4506 reserved
4507 \end_inset 
4508 </cell>
4509 </row>
4510 <row topline="true">
4511 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4512 \begin_inset Text
4513
4514 \layout Standard
4515
4516 11
4517 \end_inset 
4518 </cell>
4519 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4520 \begin_inset Text
4521
4522 \layout Standard
4523
4524 N/A
4525 \end_inset 
4526 </cell>
4527 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4528 \begin_inset Text
4529
4530 \layout Standard
4531
4532 N/A
4533 \end_inset 
4534 </cell>
4535 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4536 \begin_inset Text
4537
4538 \layout Standard
4539
4540 reserved
4541 \end_inset 
4542 </cell>
4543 </row>
4544 <row topline="true">
4545 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4546 \begin_inset Text
4547
4548 \layout Standard
4549
4550 12
4551 \end_inset 
4552 </cell>
4553 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4554 \begin_inset Text
4555
4556 \layout Standard
4557
4558 N/A
4559 \end_inset 
4560 </cell>
4561 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4562 \begin_inset Text
4563
4564 \layout Standard
4565
4566 N/A
4567 \end_inset 
4568 </cell>
4569 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4570 \begin_inset Text
4571
4572 \layout Standard
4573
4574 reserved
4575 \end_inset 
4576 </cell>
4577 </row>
4578 <row topline="true">
4579 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4580 \begin_inset Text
4581
4582 \layout Standard
4583
4584 13
4585 \end_inset 
4586 </cell>
4587 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4588 \begin_inset Text
4589
4590 \layout Standard
4591
4592 N/A
4593 \end_inset 
4594 </cell>
4595 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4596 \begin_inset Text
4597
4598 \layout Standard
4599
4600 N/A
4601 \end_inset 
4602 </cell>
4603 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4604 \begin_inset Text
4605
4606 \layout Standard
4607
4608 Application-defined, interpreted by callback or skipped
4609 \end_inset 
4610 </cell>
4611 </row>
4612 <row topline="true">
4613 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4614 \begin_inset Text
4615
4616 \layout Standard
4617
4618 14
4619 \end_inset 
4620 </cell>
4621 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4622 \begin_inset Text
4623
4624 \layout Standard
4625
4626 N/A
4627 \end_inset 
4628 </cell>
4629 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4630 \begin_inset Text
4631
4632 \layout Standard
4633
4634 N/A
4635 \end_inset 
4636 </cell>
4637 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4638 \begin_inset Text
4639
4640 \layout Standard
4641
4642 Speex in-band signaling
4643 \end_inset 
4644 </cell>
4645 </row>
4646 <row topline="true" bottomline="true">
4647 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4648 \begin_inset Text
4649
4650 \layout Standard
4651
4652 15
4653 \end_inset 
4654 </cell>
4655 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4656 \begin_inset Text
4657
4658 \layout Standard
4659
4660 N/A
4661 \end_inset 
4662 </cell>
4663 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4664 \begin_inset Text
4665
4666 \layout Standard
4667
4668 N/A
4669 \end_inset 
4670 </cell>
4671 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4672 \begin_inset Text
4673
4674 \layout Standard
4675
4676 Terminator code
4677 \end_inset 
4678 </cell>
4679 </row>
4680 </lyxtabular>
4681
4682 \end_inset 
4683
4684
4685 \layout Caption
4686
4687 Quality versus bit-rate
4688 \begin_inset LatexCommand \label{cap:quality_vs_bps}
4689
4690 \end_inset 
4691
4692
4693 \end_inset 
4694
4695
4696 \layout Subsection
4697
4698 Perceptual enhancement
4699 \begin_inset LatexCommand \index{perceptual enhancement}
4700
4701 \end_inset 
4702
4703
4704 \layout Standard
4705
4706 This part of the codec only applies to the decoder and can even be changed
4707  without affecting inter-operability.
4708  For that reason, the implementation provided and described here should
4709  only be considered as a reference implementation.
4710  The enhancement system is divided into two parts.
4711  First, the synthesis filter 
4712 \begin_inset Formula $S(z)=1/A(z)$
4713 \end_inset 
4714
4715  is replaced by an enhanced filter
4716 \begin_inset Formula \[
4717 S'(z)=\frac{A\left(z/a_{2}\right)A\left(z/a_{3}\right)}{A\left(z\right)A\left(z/a_{1}\right)}\]
4718
4719 \end_inset 
4720
4721 where 
4722 \begin_inset Formula $a_{1}$
4723 \end_inset 
4724
4725  and 
4726 \begin_inset Formula $a_{2}$
4727 \end_inset 
4728
4729  depend on the mode in use and 
4730 \begin_inset Formula $a_{3}=\frac{1}{r}\left(1-\frac{1-ra_{1}}{1-ra_{2}}\right)$
4731 \end_inset 
4732
4733  with 
4734 \begin_inset Formula $r=.9$
4735 \end_inset 
4736
4737 .
4738  The second part of the enhancement consists of using a comb filter to enhance
4739  the pitch in the excitation domain.
4740  
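\layout Standard

As a quick sanity check on the expression for 
\begin_inset Formula $a_{3}$
\end_inset 

 (a sketch, assuming that 
\begin_inset Formula $A(z/\gamma)$
\end_inset 

 denotes 
\begin_inset Formula $A(z)$
\end_inset 

 with its 
\begin_inset Formula $k$
\end_inset 

-th coefficient scaled by 
\begin_inset Formula $\gamma^{k}$
\end_inset 

), setting 
\begin_inset Formula $a_{1}=a_{2}$
\end_inset 

 gives 
\begin_inset Formula $a_{3}=0$
\end_inset 

, so that 
\begin_inset Formula $A(z/a_{3})=1$
\end_inset 

 and 
\begin_inset Formula $S'(z)$
\end_inset 

 reduces to the plain synthesis filter 
\begin_inset Formula $S(z)$
\end_inset 

.
 For purely illustrative values 
\begin_inset Formula $a_{1}=0.7$
\end_inset 

 and 
\begin_inset Formula $a_{2}=0.65$
\end_inset 

 (hypothetical, not taken from any particular mode),
\begin_inset Formula \[
a_{3}=\frac{1}{0.9}\left(1-\frac{1-0.9\times0.7}{1-0.9\times0.65}\right)=\frac{1}{0.9}\left(1-\frac{0.37}{0.415}\right)\approx0.12\]

\end_inset 
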
4741 \layout Section
4742 \pagebreak_top 
4743 Speex wideband mode (sub-band CELP)
4744 \begin_inset LatexCommand \index{wideband}
4745
4746 \end_inset 
4747
4748
4749 \layout Standard
4750
4751 For wideband, the Speex approach uses a 
4752 \emph on 
4753 q
4754 \emph default 
4755 uadrature 
4756 \emph on 
4757 m
4758 \emph default 
4759 irror 
4760 \emph on 
4761 f
4762 \emph default 
4763 ilter
4764 \begin_inset LatexCommand \index{quadrature mirror filter}
4765
4766 \end_inset 
4767
4768  (QMF) to split the band in two.
4769  The 16 kHz signal is thus divided into two 8 kHz signals, one representing
4770  the low band (0-4 kHz), the other the high band (4-8 kHz).
4771  The low band is encoded with the narrowband mode described in section 
4772 \begin_inset LatexCommand \ref{sec:Speex-narrowband-mode}
4773
4774 \end_inset 
4775
4776  in such a way that the resulting 
4777 \begin_inset Quotes eld
4778 \end_inset 
4779
4780 embedded narrowband bit-stream
4781 \begin_inset Quotes erd
4782 \end_inset 
4783
4784  can also be decoded with the narrowband decoder.
4785  Since the low band encoding has already been described, only the high band
4786  encoding is described in this section.
4787 \layout Subsection
4788
4789 Linear Prediction
4790 \layout Standard
4791
4792 The linear prediction part used for the high-band is very similar to what
4793  is done for narrowband.
4794  The only difference is that we use only 12 bits to encode the high-band
4795  LSPs using a multi-stage vector quantizer (MSVQ).
4796  The first level quantizes the 10 coefficients with 6 bits and the error
4797  is then quantized using 6 bits as well.
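\layout Standard

The two-stage structure can be written out as follows (a sketch; the codebook
 symbols 
\begin_inset Formula $c^{(1)}$
\end_inset 

 and 
\begin_inset Formula $c^{(2)}$
\end_inset 

 are notation introduced here for illustration only).
 With 6-bit indices 
\begin_inset Formula $i$
\end_inset 

 and 
\begin_inset Formula $j$
\end_inset 

, the quantized LSP vector is
\begin_inset Formula \[
\hat{x}=c_{i}^{(1)}+c_{j}^{(2)},\qquad0\leq i,j<2^{6}\]

\end_inset 

so that only 
\begin_inset Formula $2\times2^{6}=128$
\end_inset 

 code vectors need to be stored and searched, where a single-stage 12-bit
 quantizer would require 
\begin_inset Formula $2^{12}=4096$
\end_inset 

.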
4798 \layout Subsection
4799
4800 Pitch Prediction
4801 \layout Standard
4802
4803 That part is easy: there's no pitch prediction for the high-band.
4804  There are two reasons for that.
4805  First, there is usually little harmonic structure in this band (above 4
4806  kHz).
4807  Second, it would be very hard to implement since the QMF folds the 4-8
4808  kHz band into 4-0 kHz (reversing the frequency axis), which means that
4809  the locations of the harmonics are no longer at multiples of the fundamental
4810  (pitch).
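\layout Standard

The folding can be illustrated with a small numeric example (a sketch; the
 pitch value is hypothetical).
 A component at frequency 
\begin_inset Formula $f$
\end_inset 

 in the 4-8 kHz band appears at 
\begin_inset Formula $f'=8000-f$
\end_inset 

 Hz in the high-band signal.
 For a fundamental 
\begin_inset Formula $f_{0}=210$
\end_inset 

 Hz, the harmonics 
\begin_inset Formula $20f_{0}=4200$
\end_inset 

 Hz and 
\begin_inset Formula $21f_{0}=4410$
\end_inset 

 Hz become
\begin_inset Formula \[
f'_{20}=8000-4200=3800\,\mathrm{Hz},\qquad f'_{21}=8000-4410=3590\,\mathrm{Hz}\]

\end_inset 

The spacing is still 210 Hz, but 
\begin_inset Formula $3800/210\approx18.1$
\end_inset 

 is not an integer, so the folded harmonics no longer sit on multiples of
 
\begin_inset Formula $f_{0}$
\end_inset 

.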
4811 \layout Subsection
4812
4813 Excitation Quantization
4814 \layout Standard
4815
4816 The high-band excitation is coded in the same way as for narrowband.
4817  
4818 \layout Subsection
4819
4820 Bit allocation
4821 \layout Standard
4822
4823 For the wideband mode, the entire narrowband frame is packed before the high-band
4824  is encoded.
4825  The narrowband part of the bit-stream is as defined in table 
4826 \begin_inset LatexCommand \ref{cap:bits-narrowband}
4827
4828 \end_inset 
4829
4830 .
4831  The high-band follows, as described in table 
4832 \begin_inset LatexCommand \ref{cap:bits-wideband}
4833
4834 \end_inset 
4835
4836 .
4837  This also means that a wideband frame may be correctly decoded by a narrowband
4838  decoder with the only caveat that if more than one frame is packed in the
4839  same packet, the decoder will need to skip the high-band parts in order
4840  to sync with the bit-stream.
4841 \layout Standard
4842
4843
4844 \begin_inset Float table
4845 placement h
4846 wide true
4847 collapsed false
4848
4849 \layout Standard
4850
4851
4852 \begin_inset  Tabular
4853 <lyxtabular version="3" rows="7" columns="7">
4854 <features>
4855 <column alignment="center" valignment="top" leftline="true" width="0pt">
4856 <column alignment="center" valignment="top" leftline="true" width="0pt">
4857 <column alignment="center" valignment="top" leftline="true" width="0pt">
4858 <column alignment="center" valignment="top" leftline="true" width="0pt">
4859 <column alignment="center" valignment="top" leftline="true" width="0pt">
4860 <column alignment="center" valignment="top" leftline="true" width="0pt">
4861 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
4862 <row topline="true" bottomline="true">
4863 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4864 \begin_inset Text
4865
4866 \layout Standard
4867
4868 Parameter
4869 \end_inset 
4870 </cell>
4871 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4872 \begin_inset Text
4873
4874 \layout Standard
4875
4876 Update rate
4877 \end_inset 
4878 </cell>
4879 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4880 \begin_inset Text
4881
4882 \layout Standard
4883
4884 0
4885 \end_inset 
4886 </cell>
4887 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4888 \begin_inset Text
4889
4890 \layout Standard
4891
4892 1
4893 \end_inset 
4894 </cell>
4895 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4896 \begin_inset Text
4897
4898 \layout Standard
4899
4900 2
4901 \end_inset 
4902 </cell>
4903 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4904 \begin_inset Text
4905
4906 \layout Standard
4907
4908 3
4909 \end_inset 
4910 </cell>
4911 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4912 \begin_inset Text
4913
4914 \layout Standard
4915
4916 4
4917 \end_inset 
4918 </cell>
4919 </row>
4920 <row topline="true">
4921 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4922 \begin_inset Text
4923
4924 \layout Standard
4925
4926 Wideband bit
4927 \end_inset 
4928 </cell>
4929 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4930 \begin_inset Text
4931
4932 \layout Standard
4933
4934 frame
4935 \end_inset 
4936 </cell>
4937 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4938 \begin_inset Text
4939
4940 \layout Standard
4941
4942 1
4943 \end_inset 
4944 </cell>
4945 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4946 \begin_inset Text
4947
4948 \layout Standard
4949
4950 1
4951 \end_inset 
4952 </cell>
4953 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4954 \begin_inset Text
4955
4956 \layout Standard
4957
4958 1
4959 \end_inset 
4960 </cell>
4961 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4962 \begin_inset Text
4963
4964 \layout Standard
4965
4966 1
4967 \end_inset 
4968 </cell>
4969 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4970 \begin_inset Text
4971
4972 \layout Standard
4973
4974 1
4975 \end_inset 
4976 </cell>
4977 </row>
4978 <row topline="true">
4979 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4980 \begin_inset Text
4981
4982 \layout Standard
4983
4984 Mode ID
4985 \end_inset 
4986 </cell>
4987 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4988 \begin_inset Text
4989
4990 \layout Standard
4991
4992 frame
4993 \end_inset 
4994 </cell>
4995 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4996 \begin_inset Text
4997
4998 \layout Standard
4999
5000 3
5001 \end_inset 
5002 </cell>
5003 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5004 \begin_inset Text
5005
5006 \layout Standard
5007
5008 3
5009 \end_inset 
5010 </cell>
5011 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5012 \begin_inset Text
5013
5014 \layout Standard
5015
5016 3
5017 \end_inset 
5018 </cell>
5019 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5020 \begin_inset Text
5021
5022 \layout Standard
5023
5024 3
5025 \end_inset 
5026 </cell>
5027 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5028 \begin_inset Text
5029
5030 \layout Standard
5031
5032 3
5033 \end_inset 
5034 </cell>
5035 </row>
5036 <row topline="true">
5037 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5038 \begin_inset Text
5039
5040 \layout Standard
5041
5042 LSP
5043 \end_inset 
5044 </cell>
5045 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5046 \begin_inset Text
5047
5048 \layout Standard
5049
5050 frame
5051 \end_inset 
5052 </cell>
5053 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5054 \begin_inset Text
5055
5056 \layout Standard
5057
5058 0
5059 \end_inset 
5060 </cell>
5061 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5062 \begin_inset Text
5063
5064 \layout Standard
5065
5066 12
5067 \end_inset 
5068 </cell>
5069 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5070 \begin_inset Text
5071
5072 \layout Standard
5073
5074 12
5075 \end_inset 
5076 </cell>
5077 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5078 \begin_inset Text
5079
5080 \layout Standard
5081
5082 12
5083 \end_inset 
5084 </cell>
5085 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5086 \begin_inset Text
5087
5088 \layout Standard
5089
5090 12
5091 \end_inset 
5092 </cell>
5093 </row>
5094 <row topline="true">
5095 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5096 \begin_inset Text
5097
5098 \layout Standard
5099
5100 Excitation gain
5101 \end_inset 
5102 </cell>
5103 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5104 \begin_inset Text
5105
5106 \layout Standard
5107
5108 sub-frame
5109 \end_inset 
5110 </cell>
5111 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5112 \begin_inset Text
5113
5114 \layout Standard
5115
5116 0
5117 \end_inset 
5118 </cell>
5119 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5120 \begin_inset Text
5121
5122 \layout Standard
5123
5124 5
5125 \end_inset 
5126 </cell>
5127 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5128 \begin_inset Text
5129
5130 \layout Standard
5131
5132 4
5133 \end_inset 
5134 </cell>
5135 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5136 \begin_inset Text
5137
5138 \layout Standard
5139
5140 4
5141 \end_inset 
5142 </cell>
5143 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5144 \begin_inset Text
5145
5146 \layout Standard
5147
5148 4
5149 \end_inset 
5150 </cell>
5151 </row>
5152 <row topline="true" bottomline="true">
5153 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5154 \begin_inset Text
5155
5156 \layout Standard
5157
5158 Excitation VQ
5159 \end_inset 
5160 </cell>
5161 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5162 \begin_inset Text
5163
5164 \layout Standard
5165
5166 sub-frame
5167 \end_inset 
5168 </cell>
5169 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5170 \begin_inset Text
5171
5172 \layout Standard
5173
5174 0
5175 \end_inset 
5176 </cell>
5177 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5178 \begin_inset Text
5179
5180 \layout Standard
5181
5182 0
5183 \end_inset 
5184 </cell>
5185 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5186 \begin_inset Text
5187
5188 \layout Standard
5189
5190 20
5191 \end_inset 
5192 </cell>
5193 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5194 \begin_inset Text
5195
5196 \layout Standard
5197
5198 40
5199 \end_inset 
5200 </cell>
5201 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5202 \begin_inset Text
5203
5204 \layout Standard
5205
5206 80
5207 \end_inset 
5208 </cell>
5209 </row>
5210 <row topline="true" bottomline="true">
5211 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5212 \begin_inset Text
5213
5214 \layout Standard
5215
5216 Total
5217 \end_inset 
5218 </cell>
5219 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5220 \begin_inset Text
5221
5222 \layout Standard
5223
5224 frame
5225 \end_inset 
5226 </cell>
5227 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5228 \begin_inset Text
5229
5230 \layout Standard
5231
5232 4
5233 \end_inset 
5234 </cell>
5235 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5236 \begin_inset Text
5237
5238 \layout Standard
5239
5240 36
5241 \end_inset 
5242 </cell>
5243 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5244 \begin_inset Text
5245
5246 \layout Standard
5247
5248 112
5249 \end_inset 
5250 </cell>
5251 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5252 \begin_inset Text
5253
5254 \layout Standard
5255
5256 192
5257 \end_inset 
5258 </cell>
5259 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5260 \begin_inset Text
5261
5262 \layout Standard
5263
5264 352
5265 \end_inset 
5266 </cell>
5267 </row>
5268 </lyxtabular>
5269
5270 \end_inset 
5271
5272
5273 \layout Caption
5274
5275 Bit allocation for high-band in wideband mode
5276 \begin_inset LatexCommand \label{cap:bits-wideband}
5277
5278 \end_inset 
5279
5280
5281 \end_inset 
5282
5283
5284 \layout Standard
5285
5286
5287 \begin_inset ERT
5288 status Open
5289
5290 \layout Standard
5291
5292 \backslash 
5293 clearpage
5294 \end_inset 
5295
5296
5297 \layout Standard
5298
5299
5300 \begin_inset ERT
5301 status Collapsed
5302
5303 \layout Standard
5304
5305 \backslash 
5306 clearpage
5307 \end_inset 
5308
5309
5310 \layout Section
5311 \start_of_appendix 
5312 FAQ
5313 \layout Subsection*
5314
5315 Vorbis is open-source
5316 \begin_inset LatexCommand \index{open-source}
5317
5318 \end_inset 
5319
5320  and patent-free
5321 \begin_inset LatexCommand \index{patent}
5322
5323 \end_inset 
5324
5325 , why do we need Speex?
5326 \layout Standard
5327
5328 Vorbis is a great project, but its goals are not the same as those of Speex.
5329  Vorbis is mostly aimed at compressing music and audio in general, while
5330  Speex targets speech only.
5331  For that reason Speex can achieve much better results than Vorbis on speech,
5332  typically 2-4 times higher compression at equal quality.
5333 \layout Subsection*
5334
5335 Isn't there a GPL implementation of the GSM-FR codec? Why is Speex necessary?
5336 \layout Standard
5337
5338 First of all, it's not clear whether or not GSM-FR is covered by a Philips
5339  patent (see http://kbs.cs.tu-berlin.de/~jutta/toast.html).
5340  Also, GSM-FR offers mediocre quality at a relatively high bit-rate, while
5341  Speex can offer equivalent quality at almost half the bit-rate.
5342  Last but not least, Speex offers a wide range of bit-rates and sampling
5343  rates, while GSM-FR is limited to 8 kHz speech at 13 kbps.
5344 \layout Subsection*
5345
5346 Under what license is Speex released?
5347 \layout Standard
5348
5349 As of version 1.0 beta 1, Speex is released under Xiph's BSD-like license.
5350  This license is one of the most permissive open-source licenses.
5351 \layout Subsection*
5352
5353 Ogg
5354 \begin_inset LatexCommand \index{Ogg}
5355
5356 \end_inset 
5357
5358 , Speex, Vorbis
5359 \begin_inset LatexCommand \index{Vorbis}
5360
5361 \end_inset 
5362
5363 , what's the difference?
5364 \layout Standard
5365
5366 Ogg is a 
5367 \begin_inset Quotes eld
5368 \end_inset 
5369
5370 container format
5371 \begin_inset Quotes erd
5372 \end_inset 
5373
5374  for holding multimedia data.
5375  Vorbis is an audio codec that uses Ogg to store its bit-streams as files,
5376  hence the name Ogg Vorbis.
5377  Speex also uses the Ogg format to store its bit-streams as files, so technicall
5378 y they would be 
5379 \begin_inset Quotes eld
5380 \end_inset 
5381
5382 Ogg Speex
5383 \begin_inset Quotes erd
5384 \end_inset 
5385
5386  files (I prefer to call them just Speex files).
5387  One difference from Vorbis, however, is that Speex is less tied to Ogg.
5388  Actually, if what you do is Voice over IP (VoIP), you don't need Ogg at all.
5389 \layout Subsection*
5390
5391 What's the extension for Speex?
5392 \layout Standard
5393
5394 Speex files have the .spx extension.
5395  Note, however, that none of the Speex tools (speexenc, speexdec) rely
5396  on the extension, so any extension will work.
5397 \layout Subsection*
5398
5399 Can I use Speex for compressing music
5400 \begin_inset LatexCommand \index{music}
5401
5402 \end_inset 
5403
5404 ?
5405 \layout Standard
5406
5407 Just as Vorbis is not really adapted to speech, Speex is not really adapted
5408  to music.
5409  In most cases, you'll be better off with Vorbis when it comes to music.
5410 \layout Subsection*
5411
5412 I converted some MP3s to Speex and the quality is bad.
5413  What's wrong?
5414 \layout Standard
5415
5416 This is called transcoding, and it will always result in much poorer quality
5417  than the original MP3.
5418  Unless you have a really good reason (such as file size) to do so, never transcode speech.
5419  This applies even to self-transcoding (tandeming): if you decode a Speex
5420  file and re-encode it at the same bit-rate,
5421  you will lose quality.
5422 \layout Subsection*
5423
5424 Does Speex run on Windows?