1 #LyX 1.2 created this file. For more info see http://www.lyx.org/
2 \lyxformat 220
3 \textclass article
4 \language english
5 \inputencoding auto
6 \fontscheme default
7 \graphics default
8 \float_placement h
9 \paperfontsize default
10 \spacing single 
11 \papersize Default
12 \paperpackage a4
13 \use_geometry 0
14 \use_amsmath 0
15 \use_natbib 0
16 \use_numerical_citations 0
17 \paperorientation portrait
18 \secnumdepth 3
19 \tocdepth 3
20 \paragraph_separation indent
21 \defskip medskip
22 \quotes_language english
23 \quotes_times 2
24 \papercolumns 1
25 \papersides 1
26 \paperpagestyle headings
27
28 \layout Title
29
30 The Speex Codec Manual
31 \newline 
32 (for version 1.0rc3)
33 \layout Author
34
35 Jean-Marc Valin
36 \layout Standard
37 \pagebreak_top 
38 Copyright (c) 2002 Jean-Marc Valin.
39 \layout Standard
40
41 Permission is granted to copy, distribute and/or modify this document under
42  the terms of the GNU Free Documentation License, Version 1.1 or any later
43  version published by the Free Software Foundation; with no Invariant Sections,
44  with no Front-Cover Texts, and with no Back-Cover Texts.
45  A copy of the license is included in the section entitled "GNU Free Documentati
46 on License".
47  
48 \layout Standard
49 \pagebreak_top \pagebreak_bottom 
50
51 \begin_inset LatexCommand \tableofcontents{}
52
53 \end_inset 
54
55
56 \layout Standard
57 \pagebreak_bottom 
58
59 \begin_inset FloatList table
60
61 \end_inset 
62
63
64 \layout Section
65
66 Introduction to Speex
67 \layout Standard
68
69 The Speex project (
70 \family typewriter 
71 http://www.speex.org/
72 \family default 
73 ) was started because there was a need for a speech codec that was
74  open-source and free from software patents.
75  These are essential conditions for being used by any open-source software.
76  There is already Vorbis that does general audio, but it is not really suitable
77  for speech.
78  Also, unlike many other speech codecs, Speex is not targeted at cell phones
79  (not many open-source cell phones anyway :-) ) but rather voice over IP
80  (VoIP) and file-based compression.
81  
82 \layout Standard
83
84 As design goals, we wanted to have a codec that would allow both very
85  good quality speech and low bit-rate (unfortunately not at the same time!),
86  which led us to developing a codec with multiple bit-rates.
87  Of course very good quality also meant we had to do wideband (16 kHz sampling
88  rate) in addition to narrowband (telephone quality, 8 kHz sampling rate).
89 \layout Standard
90
91 Designing for VoIP instead of cell phone use means that Speex must be robust
92  to lost packets, but not to corrupted ones since packets either arrive
93  unaltered or don't arrive at all.
94  Also, the idea was to have a reasonable complexity and memory requirement
95  without compromising too much on the efficiency of the codec.
96 \layout Standard
97
98 All this led us to the choice of CELP
99 \begin_inset LatexCommand \index{CELP}
100
101 \end_inset 
102
103  as the encoding technique to use for Speex.
104  One of the main reasons is that CELP has long proved that it could do the
105  job and scale well to both low bit-rates (think DoD CELP @ 4.8 kbps) and
106  high bit-rates (think G.728 @ 16 kbps).
107  
108 \layout Standard
109
110 The main characteristics can be summarized as follows:
111 \layout Itemize
112
113 Free software/open-source
114 \begin_inset LatexCommand \index{open-source}
115
116 \end_inset 
117
118 , patent
119 \begin_inset LatexCommand \index{patent}
120
121 \end_inset 
122
123  and royalty-free
124 \layout Itemize
125
126 Integration of narrowband
127 \begin_inset LatexCommand \index{narrowband}
128
129 \end_inset 
130
131  and wideband
132 \begin_inset LatexCommand \index{wideband}
133
134 \end_inset 
135
136  in the same bit-stream
137 \layout Itemize
138
139 Wide range of bit-rates available (from 2 kbps to 44 kbps)
140 \layout Itemize
141
142 Dynamic bit-rate switching and Variable Bit-Rate
143 \begin_inset LatexCommand \index{variable bit-rate}
144
145 \end_inset 
146
147  (VBR)
148 \layout Itemize
149
150 Voice Activity Detection
151 \begin_inset LatexCommand \index{voice activity detection}
152
153 \end_inset 
154
155  (VAD, integrated with VBR)
156 \layout Itemize
157
158 Variable complexity
159 \begin_inset LatexCommand \index{complexity}
160
161 \end_inset 
162
163
164 \layout Itemize
165
166 Ultra-wideband mode at 32 kHz (up to 48 kHz)
167 \layout Itemize
168
169 Intensity stereo encoding option
170 \layout Section
171 \pagebreak_top 
172 Feature description
173 \layout Standard
174
175 This section explains the main Speex features, as well as some concepts
176  in speech coding that help better understand the next sections.
177  
178 \layout Subsection*
179
180 Sampling rate
181 \begin_inset LatexCommand \index{sampling rate}
182
183 \end_inset 
184
185
186 \layout Standard
187
188 Speex is mainly designed for 3 different sampling rates: 8 kHz, 16 kHz,
189  and 32 kHz.
190  These are respectively referred to as narrowband
191 \begin_inset LatexCommand \index{narrowband}
192
193 \end_inset 
194
195 , wideband
196 \begin_inset LatexCommand \index{wideband}
197
198 \end_inset 
199
200  and ultra-wideband
201 \begin_inset LatexCommand \index{ultra-wideband}
202
203 \end_inset 
204
205 .
206  
207 \layout Subsection*
208
209 Quality
210 \begin_inset LatexCommand \index{quality}
211
212 \end_inset 
213
214
215 \layout Standard
216
217 Speex encoding is controlled most of the time by a quality parameter that
218  ranges from 0 to 10.
219  In constant bit-rate
220 \begin_inset LatexCommand \index{constant bit-rate}
221
222 \end_inset 
223
224  (CBR) operation, the quality parameter is an integer, while for variable
225  bit-rate (VBR), the parameter is a float.
226  
227 \layout Subsection*
228
229 Complexity
230 \begin_inset LatexCommand \index{complexity}
231
232 \end_inset 
233
234  (variable)
235 \layout Standard
236
237 With Speex, it is possible to vary the complexity allowed for the encoder.
238  This is done by controlling how the search is performed with an integer
239  ranging from 1 to 10 in a way that's similar to the -1 to -9 options to
240  
241 \emph on 
242 gzip
243 \emph default 
244  and 
245 \emph on 
246 bzip2
247 \emph default 
248  compression utilities.
249  For normal use, the noise level at complexity 1 is between 1 and 2 dB higher
250  than at complexity 10, but the CPU requirements for complexity 10 are about
251  5 times higher than for complexity 1.
252  In practice, the best trade-off is between complexity 2 and 4, though higher
253  settings are often useful when encoding non-speech sounds like DTMF
254 \begin_inset LatexCommand \index{DTMF}
255
256 \end_inset 
257
258  tones.
259 \layout Subsection*
260
261 Variable Bit-Rate
262 \begin_inset LatexCommand \index{variable bit-rate}
263
264 \end_inset 
265
266  (VBR)
267 \layout Standard
268
269 Variable bit-rate (VBR) allows a codec to change its bit-rate dynamically
270  to adapt to the 
271 \begin_inset Quotes eld
272 \end_inset 
273
274 difficulty
275 \begin_inset Quotes erd
276 \end_inset 
277
278  of the audio being encoded.
279  In the example of Speex, sounds like vowels and high-energy transients
280  require a higher bit-rate to achieve good quality, while fricatives (e.g.
281  s, f sounds) can be coded adequately with fewer bits.
282  For this reason, VBR can achieve a lower bit-rate for the same quality, or
283  a better quality for a certain bit-rate.
284  Despite its advantages, VBR has two main drawbacks: first, by only specifying
285  quality, there's no guarantee about the final average bit-rate.
286  Second, for some real-time applications like voice over IP (VoIP), what
287  counts is the maximum bit-rate, which must be low enough for the communication
288  channel.
289 \layout Subsection*
290
291 Average Bit-Rate
292 \begin_inset LatexCommand \index{average bit-rate}
293
294 \end_inset 
295
296  (ABR)
297 \layout Standard
298
299 Average bit-rate solves one of the problems of VBR, as it dynamically adjusts
300  VBR quality in order to meet a specific target bit-rate.
301  Because the quality/bit-rate is adjusted in real-time (open-loop), the
302  global quality will be slightly lower than that obtained by encoding in
303  VBR with exactly the right quality setting to meet the target average bit-rate.
304 \layout Subsection*
305
306 Voice Activity Detection
307 \begin_inset LatexCommand \index{voice activity detection}
308
309 \end_inset 
310
311  (VAD)
312 \layout Standard
313
314 When enabled, voice activity detection detects whether the audio being encoded
315  is speech or silence/background noise.
316  VAD is always implicitly activated when encoding in VBR, so the option
317  is only useful in non-VBR operation.
318  In this case, Speex detects non-speech periods and encodes them with just
319  enough bits to reproduce the background noise.
320  This is called 
321 \begin_inset Quotes eld
322 \end_inset 
323
324 comfort noise generation
325 \begin_inset Quotes erd
326 \end_inset 
327
328  (CNG).
329 \layout Subsection*
330
331 Discontinuous Transmission
332 \begin_inset LatexCommand \index{discontinuous transmission}
333
334 \end_inset 
335
336  (DTX)
337 \layout Standard
338
339 Discontinuous transmission is an addition to VAD operation that allows the
340  encoder to stop transmitting completely when the background noise is stationary.
341  In file-based operation, since we cannot just stop writing to the file,
342  only 5 bits are used for such frames (corresponding to 250 bps).
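\layout Standard

The 250 bps figure can be checked with a little arithmetic.
 The sketch below is illustrative only: the function name is hypothetical and the 20 ms frame duration is an assumption that is not stated in this section.

```c
#include <assert.h>

/* Bit-rate in bits per second for a stream that spends `bits_per_frame`
 * bits on every frame lasting `frame_ms` milliseconds.
 * Hypothetical helper, not part of libspeex. */
int frame_bitrate_bps(int bits_per_frame, int frame_ms)
{
    assert(bits_per_frame >= 0 && frame_ms > 0);
    return bits_per_frame * 1000 / frame_ms;
}
```

Under that assumption, frame_bitrate_bps(5, 20) gives the 250 bps quoted above for DTX frames in file-based operation.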
343 \layout Subsection*
344
345 Perceptual enhancement
346 \begin_inset LatexCommand \index{perceptual enhancement}
347
348 \end_inset 
349
350
351 \layout Standard
352
353 Perceptual enhancement is a part of the decoder which, when turned on, tries
354  to reduce (the perception of) the noise produced by the coding/decoding
355  process.
356  In most cases, perceptual enhancement makes the sound further from the original
357  
358 \emph on 
359 objectively
360 \emph default 
361  (if you use SNR), but in the end it still 
362 \emph on 
363 sounds
364 \emph default 
365  better (subjective improvement).
366 \layout Subsection*
367
368 Algorithmic delay
369 \begin_inset LatexCommand \index{algorithmic delay}
370
371 \end_inset 
372
373
374 \layout Standard
375
376 Every speech codec introduces a delay in the transmission.
377  For Speex, this delay is equal to the frame size, plus some amount of 
378 \begin_inset Quotes eld
379 \end_inset 
380
381 look-ahead
382 \begin_inset Quotes erd
383 \end_inset 
384
385  required to process each frame.
386  In narrowband operation (8 kHz), the delay is 30 ms, while for wideband
387  (16 kHz), the delay is 34 ms.
388  These values don't account for the CPU time it takes to encode or decode
389  the frames.
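\layout Standard

As a rough illustration of how the delay is composed, the sketch below adds the frame size and the look-ahead, both expressed in samples.
 The function name is hypothetical.
 The look-ahead values in the usage note are not given in the text: they are inferred from the 30 ms and 34 ms totals above, assuming 20 ms frames (160 samples at 8 kHz, 320 samples at 16 kHz).

```c
#include <assert.h>

/* Algorithmic delay in milliseconds: one full frame plus the look-ahead,
 * both given in samples at a sampling rate of `rate_hz`.
 * Hypothetical helper, not part of libspeex. */
int algorithmic_delay_ms(int frame_samples, int lookahead_samples, int rate_hz)
{
    assert(frame_samples > 0 && lookahead_samples >= 0 && rate_hz > 0);
    return (frame_samples + lookahead_samples) * 1000 / rate_hz;
}
```

With the assumed values, algorithmic_delay_ms(160, 80, 8000) corresponds to the narrowband case and algorithmic_delay_ms(320, 224, 16000) to the wideband case.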
390 \layout Section
391 \pagebreak_top 
392 Command-line encoder/decoder
393 \begin_inset LatexCommand \label{sec:Command-line-encoder/decoder}
394
395 \end_inset 
396
397
398 \layout Standard
399
400 The base Speex distribution includes a command-line encoder (
401 \emph on 
402 speexenc
403 \emph default 
404 ) and decoder (
405 \emph on 
406 speexdec
407 \emph default 
408 ).
409  This section describes how to use these tools.
410 \layout Subsection
411
412
413 \emph on 
414 speexenc
415 \begin_inset LatexCommand \index{speexenc}
416
417 \end_inset 
418
419
420 \layout Standard
421
422 The 
423 \emph on 
424 speexenc
425 \emph default 
426  utility is used to create Speex files from raw PCM or wave files.
427  It can be used by calling: 
428 \layout LyX-Code
429
430 speexenc [options] input_file output_file
431 \layout Standard
432
433 The value '-' for input_file or output_file corresponds respectively to
434  stdin and stdout.
435  The valid options are:
436 \layout Description
437
438 --narrowband\SpecialChar ~
439 (-n) Tell Speex to treat the input as narrowband (8 kHz).
440  This is the default
441 \layout Description
442
443 --wideband\SpecialChar ~
444 (-w) Tell Speex to treat the input as wideband (16 kHz)
445 \layout Description
446
447 --ultra-wideband\SpecialChar ~
448 (-u) Tell Speex to treat the input as 
449 \begin_inset Quotes eld
450 \end_inset 
451
452 ultra-wideband
453 \begin_inset Quotes erd
454 \end_inset 
455
456  (32 kHz)
457 \layout Description
458
459 --quality\SpecialChar ~
460 n Set the encoding quality (0-10), default is 8
461 \layout Description
462
463 --bitrate\SpecialChar ~
464 n Encoding bit-rate (use bit-rate n or lower) 
465 \layout Description
466
467 --vbr Enable VBR (Variable Bit-Rate), disabled by default
468 \layout Description
469
470 --abr\SpecialChar ~
471 n Enable ABR (Average Bit-Rate) at n kbps, disabled by default
472 \layout Description
473
474 --vad Enable VAD (Voice Activity Detection), disabled by default
475 \layout Description
476
477 --dtx Enable DTX (Discontinuous Transmission), disabled by default
478 \layout Description
479
480 --nframes\SpecialChar ~
481 n Pack n frames in each Ogg packet (this saves space at low bit-rates)
482 \layout Description
483
484 --comp\SpecialChar ~
485 n Set encoding speed/quality tradeoff.
486  The higher the value of n, the slower the encoding (default is 3)
487 \layout Description
488
489 -V Verbose operation, print bit-rate currently in use
490 \layout Description
491
492 --help\SpecialChar ~
493 (-h) Print the help
494 \layout Description
495
496 --version\SpecialChar ~
497 (-v) Print version information
498 \layout Subsubsection*
499
500 Speex comments
501 \layout Description
502
503 --comment Add the given string as an extra comment.
504  This may be used multiple times.
505  
506 \layout Description
507
508 --author Author of this track.
509  
510 \layout Description
511
512 --title Title for this track.
513  
514 \layout Subsubsection*
515
516 Raw input options
517 \layout Description
518
519 --rate\SpecialChar ~
520 n Sampling rate for raw input
521 \layout Description
522
523 --stereo Consider raw input as stereo 
524 \layout Description
525
526 --le Raw input is little-endian 
527 \layout Description
528
529 --be Raw input is big-endian 
530 \layout Description
531
532 --8bit Raw input is 8-bit unsigned 
533 \layout Description
534
535 --16bit Raw input is 16-bit signed 
536 \layout Subsection
537
538
539 \emph on 
540 speexdec
541 \begin_inset LatexCommand \index{speexdec}
542
543 \end_inset 
544
545
546 \layout Standard
547
548 The 
549 \emph on 
550 speexdec
551 \emph default 
552  utility is used to decode Speex files and can be used by calling: 
553 \layout LyX-Code
554
555 speexdec [options] speex_file [output_file]
556 \layout Standard
557
558 The value '-' for speex_file or output_file corresponds respectively to
559  stdin and stdout.
560  Also, when no output_file is specified, the file is played to the soundcard.
561  The valid options are:
562 \layout Description
563
564 --enh enable post-filter (default)
565 \layout Description
566
567 --no-enh disable post-filter
568 \layout Description
569
570 --force-nb Force decoding in narrowband 
571 \layout Description
572
573 --force-wb Force decoding in wideband 
574 \layout Description
575
576 --force-uwb Force decoding in ultra-wideband 
577 \layout Description
578
579 --mono Force decoding in mono 
580 \layout Description
581
582 --stereo Force decoding in stereo 
583 \layout Description
584
585 --rate\SpecialChar ~
586 n Force decoding at n Hz sampling rate
587 \layout Description
588
589 --packet-loss\SpecialChar ~
590 n Simulate n % random packet loss
591 \layout Description
592
593 -V Verbose operation, print bit-rate currently in use
594 \layout Description
595
596 --help\SpecialChar ~
597 (-h) Print the help
598 \layout Description
599
600 --version\SpecialChar ~
601 (-v) Print version information
602 \layout Section
603 \pagebreak_top 
604 Programming with Speex (the libspeex
605 \begin_inset LatexCommand \index{libspeex}
606
607 \end_inset 
608
609  API
610 \begin_inset LatexCommand \index{API}
611
612 \end_inset 
613
614 )
615 \layout Subsection
616
617 Encoding
618 \layout Standard
619
620 In order to encode speech using Speex, you first need to:
621 \layout LyX-Code
622
623 #include <speex.h>
624 \layout Standard
625
626 You then need to declare a Speex bit-packing struct
627 \layout LyX-Code
628
629 SpeexBits bits;
630 \layout Standard
631
632 and a Speex encoder state
633 \layout LyX-Code
634
635 void *enc_state;
636 \layout Standard
637
638 The two are initialized by:
639 \layout LyX-Code
640
641 speex_bits_init(&bits);
642 \layout LyX-Code
643
644 enc_state = speex_encoder_init(&speex_nb_mode);
645 \layout Standard
646
647 For wideband coding, 
648 \emph on 
649 speex_nb_mode
650 \emph default 
651  will be replaced by 
652 \emph on 
653 speex_wb_mode
654 \emph default 
655 .
656  In most cases, you will need to know the frame size used by the mode you
657  are using.
658  You can get that value in the 
659 \emph on 
660 frame_size
661 \emph default 
662  variable with:
663 \layout LyX-Code
664
665 speex_encoder_ctl(enc_state,SPEEX_GET_FRAME_SIZE,&frame_size);
666 \layout Standard
667
668 Once the initialization is done, for every input frame:
669 \layout LyX-Code
670
671 speex_bits_reset(&bits);
672 \layout LyX-Code
673
674 speex_encode(enc_state, input_frame, &bits);
675 \layout LyX-Code
676
677 nbBytes = speex_bits_write(&bits, byte_ptr, MAX_NB_BYTES);
678 \layout Standard
679
680 where 
681 \emph on 
682 input_frame
683 \emph default 
684  is a 
685 \emph on 
686 (float *)
687 \emph default 
688  pointing to the beginning of a speech frame, 
689 \emph on 
690 byte_ptr
691 \emph default 
692  is a 
693 \emph on 
694 (char *)
695 \emph default 
696  where the encoded frame will be written, 
697 \emph on 
698 MAX_NB_BYTES
699 \emph default 
700  is the maximum number of bytes that can be written to 
701 \emph on 
702 byte_ptr
703 \emph default 
704  without causing an overflow and 
705 \emph on 
706 nbBytes
707 \emph default 
708  is the number of bytes actually written to 
709 \emph on 
710 byte_ptr
711 \emph default 
712  (the encoded size in bytes).
713  Before calling speex_bits_write, it is possible to find the number of bytes
714  that need to be written by calling 
715 \family typewriter 
716 speex_bits_nbytes(&bits)
717 \family default 
718 , which returns a number of bytes.
719  
720 \layout Standard
721
722 After you're done with the encoding, free all resources with:
723 \layout LyX-Code
724
725 speex_bits_destroy(&bits);
726 \layout LyX-Code
727
728 speex_encoder_destroy(enc_state);
729 \layout Standard
730
731 That's about it for the encoder.
732  
733 \layout Subsection
734
735 Decoding
736 \layout Standard
737
738 In order to decode speech using Speex, you first need to:
739 \layout LyX-Code
740
741 #include <speex.h>
742 \layout Standard
743
744 You also need to declare a Speex bit-packing struct
745 \layout LyX-Code
746
747 SpeexBits bits;
748 \layout Standard
749
750 and a Speex decoder state
751 \layout LyX-Code
752
753 void *dec_state;
754 \layout Standard
755
756 The two are initialized by:
757 \layout LyX-Code
758
759 speex_bits_init(&bits);
760 \layout LyX-Code
761
762 dec_state = speex_decoder_init(&speex_nb_mode);
763 \layout Standard
764
765 For wideband decoding, 
766 \emph on 
767 speex_nb_mode
768 \emph default 
769  will be replaced by 
770 \emph on 
771 speex_wb_mode
772 \emph default 
773 .
774  If you need to obtain the size of the frames that will be used by the decoder,
775  you can get that value in the 
776 \emph on 
777 frame_size
778 \emph default 
779  variable with:
780 \layout LyX-Code
781
782 speex_decoder_ctl(dec_state, SPEEX_GET_FRAME_SIZE, &frame_size); 
783 \layout Standard
784
785 There is also a parameter that can be set for the decoder: whether or not
786  to use a perceptual post-filter.
787  This can be set by: 
788 \layout LyX-Code
789
790 speex_decoder_ctl(dec_state, SPEEX_SET_ENH, &enh); 
791 \layout Standard
792
793 where 
794 \emph on 
795 enh
796 \emph default 
797  is an int with value 0 to have the post-filter disabled and 1 to have
798  it enabled.
799 \layout Standard
800
801 Again, once the decoder initialization is done, for every input frame:
802 \layout LyX-Code
803
804 speex_bits_read_from(&bits, input_bytes, nbBytes);
805 \layout LyX-Code
806
807 speex_decode(dec_state, &bits, output_frame);
808 \layout Standard
809
810 where input_bytes is a 
811 \emph on 
812 (char *)
813 \emph default 
814  containing the bit-stream data received for a frame, 
815 \emph on 
816 nbBytes
817 \emph default 
818  is the size (in bytes) of that bit-stream, and 
819 \emph on 
820 output_frame
821 \emph default 
822  is a 
823 \emph on 
824 (float *)
825 \emph default 
826  and points to the area where the decoded speech frame will be written.
827  A NULL value as the second argument indicates that we don't have the bits
828  for the current frame.
829  When a frame is lost, the Speex decoder will do its best to "guess" the
830  correct signal.
831 \layout Standard
832
833 After you're done with the decoding, free all resources with:
834 \layout LyX-Code
835
836 speex_bits_destroy(&bits);
837 \layout LyX-Code
838
839 speex_decoder_destroy(dec_state);
840 \layout Subsection
841
842 Codec Options (speex_*_ctl)
843 \layout Standard
844
845 The Speex encoder and decoder support many options and requests that can
846  be accessed through the 
847 \emph on 
848 speex_encoder_ctl
849 \emph default 
850  and 
851 \emph on 
852 speex_decoder_ctl
853 \emph default 
854  functions.
855  These functions are similar to the 
856 \emph on 
857 ioctl
858 \emph default 
859  system call and their prototypes are:
860 \layout LyX-Code
861
862 void speex_encoder_ctl(void *encoder, int request, void *ptr);
863 \layout LyX-Code
864
865 void speex_decoder_ctl(void *decoder, int request, void *ptr);
866 \layout Standard
867
868 The different values of request allowed are (note that some only apply to
869  the encoder or the decoder):
870 \layout Description
871
872 SPEEX_SET_ENH** Set perceptual enhancer
873 \begin_inset LatexCommand \index{perceptual enhancement}
874
875 \end_inset 
876
877  to on (1) or off (0) (integer)
878 \layout Description
879
880 SPEEX_GET_ENH** Get perceptual enhancer status (integer)
881 \layout Description
882
883 SPEEX_GET_FRAME_SIZE Get the frame size used for the current mode (integer)
884 \layout Description
885
886 SPEEX_SET_QUALITY* Set the encoder speech quality (integer 0 to 10)
887 \layout Description
888
889 SPEEX_GET_QUALITY* Get the current encoder speech quality (integer 0 to
890  10)
891 \layout Description
892
893 SPEEX_SET_MODE*
894 \begin_inset Formula $\dagger $
895 \end_inset 
896
897
898 \layout Description
899
900 SPEEX_GET_MODE*
901 \begin_inset Formula $\dagger $
902 \end_inset 
903
904
905 \layout Description
906
907 SPEEX_SET_LOW_MODE*
908 \begin_inset Formula $\dagger $
909 \end_inset 
910
911
912 \layout Description
913
914 SPEEX_GET_LOW_MODE*
915 \begin_inset Formula $\dagger $
916 \end_inset 
917
918
919 \layout Description
920
921 SPEEX_SET_HIGH_MODE*
922 \begin_inset Formula $\dagger $
923 \end_inset 
924
925
926 \layout Description
927
928 SPEEX_GET_HIGH_MODE*
929 \begin_inset Formula $\dagger $
930 \end_inset 
931
932
933 \layout Description
934
935 SPEEX_SET_VBR* Set variable bit-rate (VBR) to on (1) or off (0) (integer)
936 \layout Description
937
938 SPEEX_GET_VBR* Get variable bit-rate
939 \begin_inset LatexCommand \index{variable bit-rate}
940
941 \end_inset 
942
943  (VBR) status (integer)
944 \layout Description
945
946 SPEEX_SET_VBR_QUALITY* Set the encoder VBR speech quality (float 0 to 10)
947 \layout Description
948
949 SPEEX_GET_VBR_QUALITY* Get the current encoder VBR speech quality (float
950  0 to 10)
951 \layout Description
952
953 SPEEX_SET_COMPLEXITY* Set the CPU resources allowed for the encoder (integer
954  1 to 10)
955 \layout Description
956
957 SPEEX_GET_COMPLEXITY* Get the CPU resources allowed for the encoder (integer
958  1 to 10)
959 \layout Description
960
961 SPEEX_SET_BITRATE* Set the bit-rate to use to the closest value not exceeding
962  the parameter (integer in bps)
963 \layout Description
964
965 SPEEX_GET_BITRATE Get the current bit-rate in use (integer in bps)
966 \layout Description
967
968 SPEEX_SET_SAMPLING_RATE Set real sampling rate (integer in Hz)
969 \layout Description
970
971 SPEEX_GET_SAMPLING_RATE Get real sampling rate (integer in Hz)
972 \layout Description
973
974 SPEEX_RESET_STATE Reset the encoder/decoder state to its original state
975  (zeros all memories)
976 \layout Description
977
978 SPEEX_SET_VAD* Set voice activity detection
979 \begin_inset LatexCommand \index{voice activity detection}
980
981 \end_inset 
982
983  (VAD) to on (1) or off (0) (integer)
984 \layout Description
985
986 SPEEX_GET_VAD* Get voice activity detection (VAD) status (integer)
987 \layout Description
988
989 SPEEX_SET_DTX* Set discontinuous transmission
990 \begin_inset LatexCommand \index{discontinuous transmission}
991
992 \end_inset 
993
994  (DTX) to on (1) or off (0) (integer)
995 \layout Description
996
997 SPEEX_GET_DTX* Get discontinuous transmission (DTX) status (integer)
998 \layout Description
999
1000 SPEEX_SET_ABR* Set average bit-rate
1001 \begin_inset LatexCommand \index{average bit-rate}
1002
1003 \end_inset 
1004
1005  (ABR) to a value n in bits per second (integer in bps)
1006 \layout Description
1007
1008 SPEEX_GET_ABR* Get average bit-rate (ABR) setting (integer in bps)
1009 \layout Description
1010
1011 * applies only to the encoder
1012 \layout Description
1013
1014 ** applies only to the decoder
1015 \layout Description
1016
1017
1018 \begin_inset Formula $\dagger $
1019 \end_inset 
1020
1021  normally only used internally
1022 \layout Subsection
1023
1024 Mode queries
1025 \layout Standard
1026
1027 Speex modes have a query system similar to the speex_encoder_ctl and speex_deco
1028 der_ctl calls.
1029  Since modes are read-only, it is only possible to get information about
1030  a particular mode.
1031  The function used to do that is:
1032 \layout LyX-Code
1033
1034 void speex_mode_query(SpeexMode *mode, int request, void *ptr);
1035 \layout Standard
1036
1037 The admissible values for request are (unless otherwise noted, the values
1038  are returned through 
1039 \emph on 
1040 ptr
1041 \emph default 
1042 ):
1043 \layout Description
1044
1045 SPEEX_MODE_FRAME_SIZE Get the frame size (in samples) for the mode
1046 \layout Description
1047
1048 SPEEX_SUBMODE_BITRATE Get the bit-rate for a submode number specified through
1049  
1050 \emph on 
1051 ptr
1052 \emph default 
1053  (integer in bps).
1054  
1055 \layout Subsection
1056
1057 Packing and in-band signalling
1058 \begin_inset LatexCommand \index{in-band signalling}
1059
1060 \end_inset 
1061
1062
1063 \layout Standard
1064
1065 Sometimes it is desirable to pack more than one frame per packet (or other
1066  basic unit of storage).
1067  The proper way to do it is to call speex_encode 
1068 \begin_inset Formula $N$
1069 \end_inset 
1070
1071  times before writing the stream with speex_bits_write.
1072  In cases where the number of frames is not determined by an out-of-band
1073  mechanism, it is possible to include a terminator code.
1074  That terminator consists of the code 15 (decimal) encoded with 5 bits,
1075  as shown in figure 
1076 \begin_inset LatexCommand \ref{cap:quality_vs_bps}
1077
1078 \end_inset 
1079
1080 .
1081  
1082 \layout Standard
1083
1084 It is also possible to send in-band 
1085 \begin_inset Quotes eld
1086 \end_inset 
1087
1088 messages
1089 \begin_inset Quotes erd
1090 \end_inset 
1091
1092  to the other side.
1093  All these messages are encoded as a 
1094 \begin_inset Quotes eld
1095 \end_inset 
1096
1097 pseudo-frame
1098 \begin_inset Quotes erd
1099 \end_inset 
1100
1101  of mode 14 which contains a 4-bit message type code, followed by the message.
1102  Table 
1103 \begin_inset LatexCommand \ref{cap:In-band-signalling-codes}
1104
1105 \end_inset 
1106
1107  lists the available codes, their meaning and the size of the message that
1108  follows.
1109  Most of these messages are requests that are sent to the encoder or decoder
1110  on the other end, which is free to comply or ignore them.
1111  By default, all in-band messages are ignored.
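\layout Standard

To make the bit layout concrete, here is a minimal sketch that packs an in-band message followed by a terminator.
 It deliberately does not use the libspeex SpeexBits functions: BitWriter, bw_init, bw_put, put_inband and put_terminator are hypothetical names made up for this example.
 Two further assumptions are made: bits are written most significant bit first, and the mode 14 marker occupies 5 bits like the terminator code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical minimal bit writer, NOT the libspeex SpeexBits API. */
typedef struct {
    unsigned char bytes[16];
    int nbits;                /* number of bits written so far */
} BitWriter;

void bw_init(BitWriter *bw)
{
    memset(bw, 0, sizeof(*bw));
}

/* Append the `count` low bits of `value`, most significant bit first
 * (the bit order is an assumption of this sketch). */
void bw_put(BitWriter *bw, unsigned value, int count)
{
    assert(count > 0 && count <= 16);
    assert(bw->nbits + count <= (int)sizeof(bw->bytes) * 8);
    for (int i = count - 1; i >= 0; i--) {
        if ((value >> i) & 1u)
            bw->bytes[bw->nbits / 8] |= (unsigned char)(0x80u >> (bw->nbits % 8));
        bw->nbits++;
    }
}

/* In-band pseudo-frame: mode 14 (assumed 5 bits), a 4-bit message type
 * code, then the message itself on `payload_bits` bits. */
void put_inband(BitWriter *bw, unsigned code, unsigned payload, int payload_bits)
{
    bw_put(bw, 14, 5);
    bw_put(bw, code, 4);
    bw_put(bw, payload, payload_bits);
}

/* Terminator: the code 15 (decimal) encoded with 5 bits. */
void put_terminator(BitWriter *bw)
{
    bw_put(bw, 15, 5);
}
```

For instance, put_inband(&bw, 0, 1, 1) would encode a request for the decoder to turn perceptual enhancement on (code 0, 1-bit message), and put_terminator(&bw) would then mark the end of the frames in the packet.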
1112 \layout Standard
1113
1114
1115 \begin_inset Float table
1116 placement htbp
1117 wide false
1118 collapsed false
1119
1120 \layout Standard
1121
1122
1123 \begin_inset  Tabular
1124 <lyxtabular version="3" rows="17" columns="3">
1125 <features>
1126 <column alignment="center" valignment="top" leftline="true" width="0pt">
1127 <column alignment="center" valignment="top" leftline="true" width="0pt">
1128 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
1129 <row topline="true" bottomline="true">
1130 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1131 \begin_inset Text
1132
1133 \layout Standard
1134
1135 Code
1136 \end_inset 
1137 </cell>
1138 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1139 \begin_inset Text
1140
1141 \layout Standard
1142
1143 Size (bits)
1144 \end_inset 
1145 </cell>
1146 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1147 \begin_inset Text
1148
1149 \layout Standard
1150
1151 Content
1152 \end_inset 
1153 </cell>
1154 </row>
1155 <row topline="true">
1156 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1157 \begin_inset Text
1158
1159 \layout Standard
1160
1161 0
1162 \end_inset 
1163 </cell>
1164 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1165 \begin_inset Text
1166
1167 \layout Standard
1168
1169 1
1170 \end_inset 
1171 </cell>
1172 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1173 \begin_inset Text
1174
1175 \layout Standard
1176
1177 Asks decoder to set perceptual enhancement off (0) or on (1)
1178 \end_inset 
1179 </cell>
1180 </row>
1181 <row topline="true">
1182 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1183 \begin_inset Text
1184
1185 \layout Standard
1186
1187 1
1188 \end_inset 
1189 </cell>
1190 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1191 \begin_inset Text
1192
1193 \layout Standard
1194
1195 1
1196 \end_inset 
1197 </cell>
1198 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1199 \begin_inset Text
1200
1201 \layout Standard
1202
1203 reserved
1204 \end_inset 
1205 </cell>
1206 </row>
1207 <row topline="true">
1208 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1209 \begin_inset Text
1210
1211 \layout Standard
1212
1213 2
1214 \end_inset 
1215 </cell>
1216 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1217 \begin_inset Text
1218
1219 \layout Standard
1220
1221 4
1222 \end_inset 
1223 </cell>
1224 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1225 \begin_inset Text
1226
1227 \layout Standard
1228
1229 Asks encoder to switch to mode N
1230 \end_inset 
1231 </cell>
1232 </row>
1233 <row topline="true">
1234 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1235 \begin_inset Text
1236
1237 \layout Standard
1238
1239 3
1240 \end_inset 
1241 </cell>
1242 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1243 \begin_inset Text
1244
1245 \layout Standard
1246
1247 4
1248 \end_inset 
1249 </cell>
1250 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1251 \begin_inset Text
1252
1253 \layout Standard
1254
1255 Asks encoder to switch to mode N for low-band
1256 \end_inset 
1257 </cell>
1258 </row>
1259 <row topline="true">
1260 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1261 \begin_inset Text
1262
1263 \layout Standard
1264
1265 4
1266 \end_inset 
1267 </cell>
1268 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1269 \begin_inset Text
1270
1271 \layout Standard
1272
1273 4
1274 \end_inset 
1275 </cell>
1276 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1277 \begin_inset Text
1278
1279 \layout Standard
1280
1281 Asks encoder to switch to mode N for high-band
1282 \end_inset 
1283 </cell>
1284 </row>
1285 <row topline="true">
1286 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1287 \begin_inset Text
1288
1289 \layout Standard
1290
1291 5
1292 \end_inset 
1293 </cell>
1294 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1295 \begin_inset Text
1296
1297 \layout Standard
1298
1299 4
1300 \end_inset 
1301 </cell>
1302 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1303 \begin_inset Text
1304
1305 \layout Standard
1306
1307 Asks encoder to switch to quality N for VBR
1308 \end_inset 
1309 </cell>
1310 </row>
1311 <row topline="true">
1312 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1313 \begin_inset Text
1314
1315 \layout Standard
1316
1317 6
1318 \end_inset 
1319 </cell>
1320 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1321 \begin_inset Text
1322
1323 \layout Standard
1324
1325 4
1326 \end_inset 
1327 </cell>
1328 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1329 \begin_inset Text
1330
1331 \layout Standard
1332
1333 Request acknowledge (0=no, 1=all, 2=only for in-band data)
1334 \end_inset 
1335 </cell>
1336 </row>
1337 <row topline="true">
1338 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1339 \begin_inset Text
1340
1341 \layout Standard
1342
1343 7
1344 \end_inset 
1345 </cell>
1346 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1347 \begin_inset Text
1348
1349 \layout Standard
1350
1351 4
1352 \end_inset 
1353 </cell>
1354 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1355 \begin_inset Text
1356
1357 \layout Standard
1358
1359 Asks encoder to set VBR off (0), on (1), VAD (2), DTX (3)
1360 \end_inset 
1361 </cell>
1362 </row>
1363 <row topline="true">
1364 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1365 \begin_inset Text
1366
1367 \layout Standard
1368
1369 8
1370 \end_inset 
1371 </cell>
1372 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1373 \begin_inset Text
1374
1375 \layout Standard
1376
1377 8
1378 \end_inset 
1379 </cell>
1380 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1381 \begin_inset Text
1382
1383 \layout Standard
1384
1385 Transmit (8-bit) character to the other end
1386 \end_inset 
1387 </cell>
1388 </row>
1389 <row topline="true">
1390 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1391 \begin_inset Text
1392
1393 \layout Standard
1394
1395 9
1396 \end_inset 
1397 </cell>
1398 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1399 \begin_inset Text
1400
1401 \layout Standard
1402
1403 8
1404 \end_inset 
1405 </cell>
1406 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1407 \begin_inset Text
1408
1409 \layout Standard
1410
1411 Intensity stereo information
1412 \end_inset 
1413 </cell>
1414 </row>
1415 <row topline="true">
1416 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1417 \begin_inset Text
1418
1419 \layout Standard
1420
1421 10
1422 \end_inset 
1423 </cell>
1424 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1425 \begin_inset Text
1426
1427 \layout Standard
1428
1429 16
1430 \end_inset 
1431 </cell>
1432 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1433 \begin_inset Text
1434
1435 \layout Standard
1436
1437 Announce maximum bit-rate acceptable (N in bytes/second)
1438 \end_inset 
1439 </cell>
1440 </row>
1441 <row topline="true">
1442 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1443 \begin_inset Text
1444
1445 \layout Standard
1446
1447 11
1448 \end_inset 
1449 </cell>
1450 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1451 \begin_inset Text
1452
1453 \layout Standard
1454
1455 16
1456 \end_inset 
1457 </cell>
1458 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1459 \begin_inset Text
1460
1461 \layout Standard
1462
1463 reserved
1464 \end_inset 
1465 </cell>
1466 </row>
1467 <row topline="true">
1468 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1469 \begin_inset Text
1470
1471 \layout Standard
1472
1473 12
1474 \end_inset 
1475 </cell>
1476 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1477 \begin_inset Text
1478
1479 \layout Standard
1480
1481 32
1482 \end_inset 
1483 </cell>
1484 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1485 \begin_inset Text
1486
1487 \layout Standard
1488
1489 Acknowledge receiving packet N
1490 \end_inset 
1491 </cell>
1492 </row>
1493 <row topline="true">
1494 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1495 \begin_inset Text
1496
1497 \layout Standard
1498
1499 13
1500 \end_inset 
1501 </cell>
1502 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1503 \begin_inset Text
1504
1505 \layout Standard
1506
1507 32
1508 \end_inset 
1509 </cell>
1510 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1511 \begin_inset Text
1512
1513 \layout Standard
1514
1515 reserved
1516 \end_inset 
1517 </cell>
1518 </row>
1519 <row topline="true">
1520 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1521 \begin_inset Text
1522
1523 \layout Standard
1524
1525 14
1526 \end_inset 
1527 </cell>
1528 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1529 \begin_inset Text
1530
1531 \layout Standard
1532
1533 64
1534 \end_inset 
1535 </cell>
1536 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1537 \begin_inset Text
1538
1539 \layout Standard
1540
1541 reserved
1542 \end_inset 
1543 </cell>
1544 </row>
1545 <row topline="true" bottomline="true">
1546 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1547 \begin_inset Text
1548
1549 \layout Standard
1550
1551 15
1552 \end_inset 
1553 </cell>
1554 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1555 \begin_inset Text
1556
1557 \layout Standard
1558
1559 64
1560 \end_inset 
1561 </cell>
1562 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1563 \begin_inset Text
1564
1565 \layout Standard
1566
1567 reserved
1568 \end_inset 
1569 </cell>
1570 </row>
1571 </lyxtabular>
1572
1573 \end_inset 
1574
1575
1576 \layout Caption
1577
1578 In-band signalling codes
1579 \begin_inset LatexCommand \label{cap:In-band-signalling-codes}
1580
1581 \end_inset 
1582
1583
1584 \end_inset 
1585
1586
1587 \layout Standard
1588
1589 Finally, applications may define custom in-band messages using mode 13.
1590  The size of the message in bytes is encoded with 5 bits, so that the decoder
1591  can skip it if it doesn't know how to interpret it.
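\layout Standard

Because every code in the table has a fixed payload size, a decoder that does not understand a request can still skip over it. The lookup below is only a sketch of that idea. It reads the second column of the table as the payload size in bits, and the size for code 0 is an assumption (1 bit, matching its on/off payload). The function name is hypothetical and is not part of the Speex API.

```c
/* Payload size in bits for in-band signalling codes 0..15, transcribed
 * from the table above. The entry for code 0 is an assumption (1 bit,
 * matching its on/off payload). */
static const int inband_payload_bits[16] = {
    1, 1, 4, 4, 4, 4, 4, 4, 8, 8, 16, 16, 32, 32, 64, 64
};

/* Hypothetical helper: number of payload bits to skip for a given code,
 * or -1 if the code does not fit in the 0..15 range. */
int inband_skip_bits(int code)
{
    if (code < 0 || code > 15)
        return -1;
    return inband_payload_bits[code];
}
```

\layout Standard

A decoder would read the code, act on it if it knows how, and otherwise advance its bit position by the returned amount.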
1592 \layout Section
1593 \pagebreak_top 
1594 Formats and standards
1595 \begin_inset LatexCommand \index{standards}
1596
1597 \end_inset 
1598
1599
1600 \layout Standard
1601
1602 Speex can encode speech in both narrowband and wideband and provides different
1603  bit-rates.
1604  However, not all features need to be supported by a given implementation
1605  or device.
1606  To be called 
1607 \begin_inset Quotes eld
1608 \end_inset 
1609
1610 Speex compatible
1611 \begin_inset Quotes erd
1612 \end_inset 
1613
1614  (whatever that means), an implementation must implement at least a basic
1615  set of features.
1616 \layout Standard
1617
1618 At the minimum, all narrowband modes of operation MUST be supported at the
1619  decoder.
1620  This includes the decoding of a wideband bit-stream by the narrowband decoder
1621 \begin_inset Foot
1622 collapsed true
1623
1624 \layout Standard
1625
1626 The wideband bit-stream contains an embedded narrowband bit-stream which
1627  can be decoded alone
1628 \end_inset 
1629
1630 .
1631  If present, a wideband decoder MUST be able to decode a narrowband stream,
1632  and MAY either be able to decode all wideband modes or be able to decode
1633  the embedded narrowband part of all modes (which includes ignoring the
1634  high-band bits).
1635 \layout Standard
1636
1637 For encoders, at least one narrowband or wideband mode MUST be supported.
1638  The main reason why all encoding modes do not have to be supported is that
1639  some platforms may not be able to handle the complexity of encoding in
1640  some modes.
1641 \layout Subsection
1642
1643 RTP
1644 \begin_inset LatexCommand \index{RTP}
1645
1646 \end_inset 
1647
1648  Payload Format 
1649 \layout Standard
1650
1651 The latest RTP payload draft can be found at 
1652 \begin_inset LatexCommand \url{http://www.speex.org/drafts/latest}
1653
1654 \end_inset 
1655
1656 .
1657  We are (2003/01/14) about to send the latest draft to the IETF for comments.
1658  
1659 \layout Subsection
1660
1661 MIME Type
1662 \layout Standard
1663
1664 Speex will use the MIME type 
1665 \family typewriter 
1666 audio/speex
1667 \family default 
1668 .
1669  We will apply for that type in the near future.
1670 \layout Subsection
1671
1672 Ogg
1673 \begin_inset LatexCommand \index{Ogg}
1674
1675 \end_inset 
1676
1677  file format
1678 \layout Standard
1679
1680 Speex bit-streams can be stored in Ogg files.
1681  In this case, the first packet of the Ogg file contains the Speex header
1682  described in table 
1683 \begin_inset LatexCommand \ref{cap:ogg_speex_header}
1684
1685 \end_inset 
1686
1687 .
1688  All integer fields in the headers are stored as little-endian.
1689  The 
1690 \family typewriter 
1691 speex_string
1692 \family default 
1693  field must contain the 
1694 \begin_inset Quotes eld
1695 \end_inset 
1696
1697
1698 \family typewriter 
1699 Speex
1700 \family default 
1701 \SpecialChar ~
1702 \SpecialChar ~
1703 \SpecialChar ~
1704
1705 \begin_inset Quotes erd
1706 \end_inset 
1707
1708  (with 3 trailing spaces), which identifies the bit-stream.
1709  The next field, 
1710 \family typewriter 
1711 speex_version
1712 \family default 
1713  contains the version of Speex that encoded the file.
1714  For now, refer to speex_header.[ch] for more info.
1715  The 
1716 \emph on 
1717 beginning of stream
1718 \emph default 
1719  (
1720 \family typewriter 
1721 b_o_s
1722 \family default 
1723 ) flag is set to 1 for the header.
1724  The header packet has 
1725 \family typewriter 
1726 packetno=0
1727 \family default 
1728  and 
1729 \family typewriter 
1730 granulepos=0
1731 \family default 
1732 .
1733 \layout Standard
1734
1735 The second packet contains the Speex comment header.
1736  The format used is the Vorbis comment format described here: http://www.xiph.org/
1737 ogg/vorbis/doc/v-comment.html .
1738  This packet has 
1739 \family typewriter 
1740 packetno=1
1741 \family default 
1742  and 
1743 \family typewriter 
1744 granulepos=0
1745 \family default 
1746 .
1747 \layout Standard
1748
1749 The third and subsequent packets each contain one or more (number found
1750  in header) Speex frames.
1751  These are identified with 
1752 \family typewriter 
1753 packetno
1754 \family default 
1755  starting from 2 and the 
1756 \family typewriter 
1757 granulepos
1758 \family default 
1759  is the number of the last sample encoded in that packet.
1760  The last of these packets has the 
1761 \emph on 
1762 end of stream
1763 \emph default 
1764  (
1765 \family typewriter 
1766 e_o_s
1767 \family default 
1768 ) flag set to 1.
1769 \layout Standard
1770
1771
1772 \begin_inset Float table
1773 placement htbp
1774 wide true
1775 collapsed false
1776
1777 \layout Standard
1778
1779
1780 \begin_inset  Tabular
1781 <lyxtabular version="3" rows="16" columns="3">
1782 <features>
1783 <column alignment="center" valignment="top" leftline="true" width="0pt">
1784 <column alignment="center" valignment="top" leftline="true" width="0pt">
1785 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
1786 <row topline="true" bottomline="true">
1787 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1788 \begin_inset Text
1789
1790 \layout Standard
1791
1792 Field
1793 \end_inset 
1794 </cell>
1795 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1796 \begin_inset Text
1797
1798 \layout Standard
1799
1800 Type
1801 \end_inset 
1802 </cell>
1803 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1804 \begin_inset Text
1805
1806 \layout Standard
1807
1808 Size
1809 \end_inset 
1810 </cell>
1811 </row>
1812 <row topline="true">
1813 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1814 \begin_inset Text
1815
1816 \layout Standard
1817
1818 speex_string
1819 \end_inset 
1820 </cell>
1821 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1822 \begin_inset Text
1823
1824 \layout Standard
1825
1826 char[]
1827 \end_inset 
1828 </cell>
1829 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1830 \begin_inset Text
1831
1832 \layout Standard
1833
1834 8
1835 \end_inset 
1836 </cell>
1837 </row>
1838 <row topline="true">
1839 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1840 \begin_inset Text
1841
1842 \layout Standard
1843
1844 speex_version
1845 \end_inset 
1846 </cell>
1847 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1848 \begin_inset Text
1849
1850 \layout Standard
1851
1852 char[]
1853 \end_inset 
1854 </cell>
1855 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1856 \begin_inset Text
1857
1858 \layout Standard
1859
1860 20
1861 \end_inset 
1862 </cell>
1863 </row>
1864 <row topline="true">
1865 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1866 \begin_inset Text
1867
1868 \layout Standard
1869
1870 speex_version_id
1871 \end_inset 
1872 </cell>
1873 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1874 \begin_inset Text
1875
1876 \layout Standard
1877
1878 int
1879 \end_inset 
1880 </cell>
1881 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1882 \begin_inset Text
1883
1884 \layout Standard
1885
1886 4
1887 \end_inset 
1888 </cell>
1889 </row>
1890 <row topline="true">
1891 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1892 \begin_inset Text
1893
1894 \layout Standard
1895
1896 header_size
1897 \end_inset 
1898 </cell>
1899 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1900 \begin_inset Text
1901
1902 \layout Standard
1903
1904 int
1905 \end_inset 
1906 </cell>
1907 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1908 \begin_inset Text
1909
1910 \layout Standard
1911
1912 4
1913 \end_inset 
1914 </cell>
1915 </row>
1916 <row topline="true">
1917 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1918 \begin_inset Text
1919
1920 \layout Standard
1921
1922 rate
1923 \end_inset 
1924 </cell>
1925 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1926 \begin_inset Text
1927
1928 \layout Standard
1929
1930 int
1931 \end_inset 
1932 </cell>
1933 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1934 \begin_inset Text
1935
1936 \layout Standard
1937
1938 4
1939 \end_inset 
1940 </cell>
1941 </row>
1942 <row topline="true">
1943 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1944 \begin_inset Text
1945
1946 \layout Standard
1947
1948 mode
1949 \end_inset 
1950 </cell>
1951 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1952 \begin_inset Text
1953
1954 \layout Standard
1955
1956 int
1957 \end_inset 
1958 </cell>
1959 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1960 \begin_inset Text
1961
1962 \layout Standard
1963
1964 4
1965 \end_inset 
1966 </cell>
1967 </row>
1968 <row topline="true">
1969 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1970 \begin_inset Text
1971
1972 \layout Standard
1973
1974 mode_bitstream_version
1975 \end_inset 
1976 </cell>
1977 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1978 \begin_inset Text
1979
1980 \layout Standard
1981
1982 int
1983 \end_inset 
1984 </cell>
1985 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
1986 \begin_inset Text
1987
1988 \layout Standard
1989
1990 4
1991 \end_inset 
1992 </cell>
1993 </row>
1994 <row topline="true">
1995 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
1996 \begin_inset Text
1997
1998 \layout Standard
1999
2000 nb_channels
2001 \end_inset 
2002 </cell>
2003 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2004 \begin_inset Text
2005
2006 \layout Standard
2007
2008 int
2009 \end_inset 
2010 </cell>
2011 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2012 \begin_inset Text
2013
2014 \layout Standard
2015
2016 4
2017 \end_inset 
2018 </cell>
2019 </row>
2020 <row topline="true">
2021 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2022 \begin_inset Text
2023
2024 \layout Standard
2025
2026 bitrate
2027 \end_inset 
2028 </cell>
2029 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2030 \begin_inset Text
2031
2032 \layout Standard
2033
2034 int
2035 \end_inset 
2036 </cell>
2037 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2038 \begin_inset Text
2039
2040 \layout Standard
2041
2042 4
2043 \end_inset 
2044 </cell>
2045 </row>
2046 <row topline="true">
2047 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2048 \begin_inset Text
2049
2050 \layout Standard
2051
2052 frame_size
2053 \end_inset 
2054 </cell>
2055 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2056 \begin_inset Text
2057
2058 \layout Standard
2059
2060 int
2061 \end_inset 
2062 </cell>
2063 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2064 \begin_inset Text
2065
2066 \layout Standard
2067
2068 4
2069 \end_inset 
2070 </cell>
2071 </row>
2072 <row topline="true">
2073 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2074 \begin_inset Text
2075
2076 \layout Standard
2077
2078 vbr
2079 \end_inset 
2080 </cell>
2081 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2082 \begin_inset Text
2083
2084 \layout Standard
2085
2086 int
2087 \end_inset 
2088 </cell>
2089 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2090 \begin_inset Text
2091
2092 \layout Standard
2093
2094 4
2095 \end_inset 
2096 </cell>
2097 </row>
2098 <row topline="true">
2099 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2100 \begin_inset Text
2101
2102 \layout Standard
2103
2104 frames_per_packet
2105 \end_inset 
2106 </cell>
2107 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2108 \begin_inset Text
2109
2110 \layout Standard
2111
2112 int
2113 \end_inset 
2114 </cell>
2115 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2116 \begin_inset Text
2117
2118 \layout Standard
2119
2120 4
2121 \end_inset 
2122 </cell>
2123 </row>
2124 <row topline="true">
2125 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2126 \begin_inset Text
2127
2128 \layout Standard
2129
2130 extra_headers
2131 \end_inset 
2132 </cell>
2133 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2134 \begin_inset Text
2135
2136 \layout Standard
2137
2138 int
2139 \end_inset 
2140 </cell>
2141 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2142 \begin_inset Text
2143
2144 \layout Standard
2145
2146 4
2147 \end_inset 
2148 </cell>
2149 </row>
2150 <row topline="true">
2151 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2152 \begin_inset Text
2153
2154 \layout Standard
2155
2156 reserved1
2157 \end_inset 
2158 </cell>
2159 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2160 \begin_inset Text
2161
2162 \layout Standard
2163
2164 int
2165 \end_inset 
2166 </cell>
2167 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2168 \begin_inset Text
2169
2170 \layout Standard
2171
2172 4
2173 \end_inset 
2174 </cell>
2175 </row>
2176 <row topline="true" bottomline="true">
2177 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2178 \begin_inset Text
2179
2180 \layout Standard
2181
2182 reserved2
2183 \end_inset 
2184 </cell>
2185 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2186 \begin_inset Text
2187
2188 \layout Standard
2189
2190 int
2191 \end_inset 
2192 </cell>
2193 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
2194 \begin_inset Text
2195
2196 \layout Standard
2197
2198 4
2199 \end_inset 
2200 </cell>
2201 </row>
2202 </lyxtabular>
2203
2204 \end_inset 
2205
2206
2207 \layout Caption
2208
2209 Ogg/Speex header packet
2210 \begin_inset LatexCommand \label{cap:ogg_speex_header}
2211
2212 \end_inset 
2213
2214
2215 \end_inset 
2216
2217
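\layout Standard

The header layout in the table adds up to 80 bytes (8 + 20 + 13 x 4). The following sketch shows one way to read it without depending on the byte order of the host. The struct and function names are hypothetical and only mirror the table; the actual definitions are in speex_header.[ch].

```c
#include <stdint.h>
#include <string.h>

#define OGG_SPEEX_HDR_BYTES 80  /* 8 + 20 + 13 * 4, per the table */

/* Hypothetical struct mirroring the table, not the library's own type. */
typedef struct {
    char    speex_string[8];
    char    speex_version[20];
    int32_t speex_version_id, header_size, rate, mode,
            mode_bitstream_version, nb_channels, bitrate, frame_size,
            vbr, frames_per_packet, extra_headers, reserved1, reserved2;
} OggSpeexHeader;

/* Read one little-endian 32-bit integer, whatever the host byte order. */
static int32_t read_le32(const unsigned char *p)
{
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return (int32_t)v;
}

/* Returns 0 on success, -1 if the packet is too short or does not start
 * with "Speex" followed by three spaces. */
int parse_speex_header(const unsigned char *pkt, size_t len,
                       OggSpeexHeader *h)
{
    int32_t *dst[13];
    int i;

    if (len < OGG_SPEEX_HDR_BYTES || memcmp(pkt, "Speex   ", 8) != 0)
        return -1;
    memcpy(h->speex_string, pkt, 8);
    memcpy(h->speex_version, pkt + 8, 20);

    /* Integer fields in table order, starting at byte offset 28. */
    dst[0] = &h->speex_version_id;   dst[1] = &h->header_size;
    dst[2] = &h->rate;               dst[3] = &h->mode;
    dst[4] = &h->mode_bitstream_version;
    dst[5] = &h->nb_channels;        dst[6] = &h->bitrate;
    dst[7] = &h->frame_size;         dst[8] = &h->vbr;
    dst[9] = &h->frames_per_packet;  dst[10] = &h->extra_headers;
    dst[11] = &h->reserved1;         dst[12] = &h->reserved2;
    for (i = 0; i < 13; i++)
        *dst[i] = read_le32(pkt + 28 + 4 * i);
    return 0;
}
```

\layout Standard

Reading the fields one by one avoids relying on struct padding or on the size of int on a given platform.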
2218 \layout Section
2219 \pagebreak_top 
2220 Introduction to CELP Coding
2221 \begin_inset LatexCommand \index{CELP}
2222
2223 \end_inset 
2224
2225
2226 \layout Standard
2227
2228 The three following sections describe the internals of the codec and require
2229  some signal processing knowledge.
2230  If you are only interested in using Speex, they are not required.
2231 \layout Standard
2232
2233 Speex is based on CELP, which stands for Code Excited Linear Prediction.
2234  This section attempts to introduce the principles behind CELP, so if you
2235  are already familiar with CELP, you can safely skip to section 
2236 \begin_inset LatexCommand \ref{sec:Speex-narrowband-mode}
2237
2238 \end_inset 
2239
2240 .
2241  The CELP technique is based on three ideas:
2242 \layout Enumerate
2243
2244 The use of a linear prediction (LP) model to model the vocal tract
2245 \layout Enumerate
2246
2247 The use of (adaptive and fixed) codebook entries as input (excitation) of
2248  the LP model
2249 \layout Enumerate
2250
2251 The search performed in closed-loop in a 
2252 \begin_inset Quotes eld
2253 \end_inset 
2254
2255 perceptually weighted domain
2256 \begin_inset Quotes erd
2257 \end_inset 
2258
2259
2260 \layout Standard
2261
2262 This section describes the basic ideas behind CELP.
2263  Note that it's still incomplete.
2264 \layout Subsection
2265
2266 Linear Prediction (LPC)
2267 \begin_inset LatexCommand \index{linear prediction}
2268
2269 \end_inset 
2270
2271
2272 \layout Standard
2273
2274 Linear prediction is at the base of many speech coding techniques, including
2275  CELP.
2276  The idea behind it is to predict the signal 
2277 \begin_inset Formula $x[n]$
2278 \end_inset 
2279
2280  using a linear combination of its past samples:
2281 \layout Standard
2282
2283
2284 \begin_inset Formula \[
2285 y[n]=\sum _{i=1}^{N}a_{i}x[n-i]\]
2286
2287 \end_inset 
2288
2289 where 
2290 \begin_inset Formula $y[n]$
2291 \end_inset 
2292
2293  is the linear prediction of 
2294 \begin_inset Formula $x[n]$
2295 \end_inset 
2296
2297 .
2298  The prediction error is thus given by:
2299 \begin_inset Formula \[
2300 e[n]=x[n]-y[n]=x[n]-\sum _{i=1}^{N}a_{i}x[n-i]\]
2301
2302 \end_inset 
2303
2304
2305 \layout Standard
2306
2307 The goal of the LPC analysis is to find the best prediction coefficients
2308  
2309 \begin_inset Formula $a_{i}$
2310 \end_inset 
2311
2312  which minimize the quadratic error function:
2313 \begin_inset Formula \[
2314 E=\sum _{n=0}^{L-1}\left[e[n]\right]^{2}=\sum _{n=0}^{L-1}\left[x[n]-\sum _{i=1}^{N}a_{i}x[n-i]\right]^{2}\]
2315
2316 \end_inset 
2317
2318 That can be done by making all derivatives 
2319 \begin_inset Formula $\frac{\partial E}{\partial a_{i}}$
2320 \end_inset 
2321
2322  equal to zero:
2323 \begin_inset Formula \[
2324 \frac{\partial E}{\partial a_{i}}=\frac{\partial }{\partial a_{i}}\sum _{n=0}^{L-1}\left[x[n]-\sum _{k=1}^{N}a_{k}x[n-k]\right]^{2}=0\]
2325
2326 \end_inset 
2327
2328
2329 \layout Standard
2330
2331 The 
2332 \begin_inset Formula $a_{i}$
2333 \end_inset 
2334
2335  filter coefficients are computed using the Levinson-Durbin
2336 \begin_inset LatexCommand \index{Levinson-Durbin}
2337
2338 \end_inset 
2339
2340  algorithm, which starts from the auto-correlation
2341 \begin_inset LatexCommand \index{auto-correlation}
2342
2343 \end_inset 
2344
2345  
2346 \begin_inset Formula $R(m)$
2347 \end_inset 
2348
2349  of the signal 
2350 \begin_inset Formula $x[n]$
2351 \end_inset 
2352
2353 .
2354 \layout Standard
2355
2356
2357 \begin_inset Formula \[
2358 R(m)=\sum _{i=0}^{L-1}x[i]x[i-m]\]
2359
2360 \end_inset 
2361
2362
2363 \layout Standard
2364
2365 For an order 
2366 \begin_inset Formula $N$
2367 \end_inset 
2368
2369  filter, we have:
2370 \begin_inset Formula \[
2371 \mathbf{R}=\left[\begin{array}{cccc}
2372  R(0) & R(1) & \cdots  & R(N-1)\\
2373  R(1) & R(0) & \cdots  & R(N-2)\\
2374  \vdots  & \vdots  & \ddots  & \vdots \\
2375  R(N-1) & R(N-2) & \cdots  & R(0)\end{array}
2376 \right]\]
2377
2378 \end_inset 
2379
2380
2381 \begin_inset Formula \[
2382 \mathbf{r}=\left[\begin{array}{c}
2383  R(1)\\
2384  R(2)\\
2385  \vdots \\
2386  R(N)\end{array}
2387 \right]\]
2388
2389 \end_inset 
2390
2391
2392 \layout Standard
2393
2394 The filter coefficients 
2395 \begin_inset Formula $a_{i}$
2396 \end_inset 
2397
2398  are found by solving the system 
2399 \begin_inset Formula $\mathbf{Ra}=\mathbf{r}$
2400 \end_inset 
2401
2402 .
2403  The Levinson-Durbin algorithm reduces the cost of solving this system
2404  to 
2405 \begin_inset Formula $\mathcal{O}\left(N^{2}\right)$
2406 \end_inset 
2407
2408  instead of 
2409 \begin_inset Formula $\mathcal{O}\left(N^{3}\right)$
2410 \end_inset 
2411
2412  by exploiting the fact that matrix 
2413 \begin_inset Formula $\mathbf{R}$
2414 \end_inset 
2415
2416  is Toeplitz Hermitian.
2417  Also, it can be proved that all the roots of 
2418 \begin_inset Formula $A(z)$
2419 \end_inset 
2420
2421  are within the unit circle, which means that 
2422 \begin_inset Formula $1/A(z)$
2423 \end_inset 
2424
2425  is always stable.
2426  This is in theory; in practice because of finite precision, there are two
2427  commonly used techniques to make sure we have a stable filter.
2428  First, we multiply 
2429 \begin_inset Formula $R(0)$
2430 \end_inset 
2431
2432  by a number slightly above one (such as 1.0001), which is equivalent to
2433  adding noise to the signal.
2434  Also, we can apply a window to the auto-correlation, which is equivalent
2435  to filtering in the frequency domain, reducing sharp resonances.
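\layout Standard

The steps above (auto-correlation, then the Levinson-Durbin recursion) can be sketched in C as follows. This is an illustration under stated assumptions rather than the code used in Speex: it works in double precision, treats samples before the frame as zero, and the function names and the order limit are hypothetical.

```c
#define LPC_MAX_ORDER 32  /* arbitrary limit for this sketch */

/* Auto-correlation R(m) = sum_i x[i] x[i-m] for lags m = 0..order,
 * over a frame of len samples. Samples before the frame count as zero. */
void autocorr(const double *x, int len, double *R, int order)
{
    int m, i;
    for (m = 0; m <= order; m++) {
        double acc = 0.0;
        for (i = m; i < len; i++)
            acc += x[i] * x[i - m];
        R[m] = acc;
    }
}

/* Levinson-Durbin recursion: solves R a = r in O(order^2) operations.
 * On return a[i-1] holds a_i, with the sign convention of the text
 * (y[n] = sum a_i x[n-i]). Returns the final prediction error energy,
 * or -1 if the input cannot be used. */
double levinson_durbin(const double *R, double *a, int order)
{
    double tmp[LPC_MAX_ORDER];
    double err = R[0];
    int i, j;

    if (order < 1 || order > LPC_MAX_ORDER || err <= 0.0)
        return -1.0;
    for (i = 0; i < order; i++) {
        double acc, k;
        if (err <= 0.0) {       /* nothing left to predict */
            a[i] = 0.0;
            continue;
        }
        acc = R[i + 1];
        for (j = 0; j < i; j++)
            acc -= a[j] * R[i - j];
        k = acc / err;          /* reflection coefficient */
        for (j = 0; j < i; j++)
            tmp[j] = a[j] - k * a[i - 1 - j];
        for (j = 0; j < i; j++)
            a[j] = tmp[j];
        a[i] = k;
        err *= 1.0 - k * k;
    }
    return err;
}
```

\layout Standard

The stabilisation mentioned above would fit between the two calls: multiply R[0] by a factor such as 1.0001 and, if desired, apply a lag window to R before running the recursion.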
2436 \layout Standard
2437
2438 The linear prediction model represents each speech sample as a linear combination
2439  of past samples, plus an error signal called the excitation (or residual).
2440 \begin_inset Formula \[
2441 x[n]=\sum _{i=1}^{N}a_{i}x[n-i]+e[n]\]
2442
2443 \end_inset 
2444
2445
2446 \layout Standard
2447
2448 In the 
2449 \emph on 
2450 z
2451 \emph default 
2452 -domain, this can be expressed as
2453 \layout Standard
2454
2455
2456 \begin_inset Formula \[
2457 X(z)=\frac{1}{A(z)}\: E(z)\]
2458
2459 \end_inset 
2460
2461
2462 \layout Standard
2463
2464 where 
2465 \begin_inset Formula $A(z)$
2466 \end_inset 
2467
2468  is defined as
2469 \layout Standard
2470
2471
2472 \begin_inset Formula \[
2473 A(z)=1-\sum _{i=1}^{N}a_{i}z^{-i}\]
2474
2475 \end_inset 
2476
2477
2478 \layout Standard
2479
2480 We usually refer to 
2481 \begin_inset Formula $A(z)$
2482 \end_inset 
2483
2484  as the analysis filter and 
2485 \begin_inset Formula $1/A(z)$
2486 \end_inset 
2487
2488  as the synthesis filter.
2489  The whole process is called short-term prediction as it predicts the signal
2490  
2491 \begin_inset Formula $x[n]$
2492 \end_inset 
2493
2494  using only the 
2495 \begin_inset Formula $N$
2496 \end_inset 
2497
2498  past samples, where 
2499 \begin_inset Formula $N$
2500 \end_inset 
2501
2502  is usually around 10.
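\layout Standard

As an illustration, the analysis and synthesis filters can be written directly from the equations above. The sketch below assumes zero filter memory at the start of the buffer and uses hypothetical function names. A real codec carries the filter memory from one frame to the next.

```c
/* Analysis filter A(z): e[n] = x[n] - sum_{i=1..N} a_i x[n-i].
 * a[i-1] holds a_i. Samples before the start of x count as zero. */
void lpc_analysis(const double *x, double *e, int len,
                  const double *a, int order)
{
    int n, i;
    for (n = 0; n < len; n++) {
        double pred = 0.0;
        for (i = 1; i <= order && i <= n; i++)
            pred += a[i - 1] * x[n - i];
        e[n] = x[n] - pred;
    }
}

/* Synthesis filter 1/A(z): x[n] = sum_{i=1..N} a_i x[n-i] + e[n].
 * Applied to the output of lpc_analysis with the same coefficients,
 * it gives back the original signal. */
void lpc_synthesis(const double *e, double *x, int len,
                   const double *a, int order)
{
    int n, i;
    for (n = 0; n < len; n++) {
        double pred = 0.0;
        for (i = 1; i <= order && i <= n; i++)
            pred += a[i - 1] * x[n - i];
        x[n] = e[n] + pred;
    }
}
```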
2503 \layout Standard
2504
2505 Because LPC coefficients have very little robustness to quantization, they
2506  are converted to Line Spectral Pair
2507 \begin_inset LatexCommand \index{line spectral pair}
2508
2509 \end_inset 
2510
2511  (LSP) coefficients, which behave much better under quantization; in particular,
2512  it is easy to keep the filter stable.
2513  
2514 \layout Subsection
2515
2516 Pitch Prediction
2517 \begin_inset LatexCommand \index{pitch}
2518
2519 \end_inset 
2520
2521
2522 \layout Standard
2523
2524 During voiced segments, the speech signal is periodic, so it is possible
2525  to take advantage of that property by approximating the excitation signal
2526  
2527 \begin_inset Formula $e[n]$
2528 \end_inset 
2529
2530  by a gain times the past of the excitation:
2531 \layout Standard
2532
2533
2534 \begin_inset Formula \[
2535 e[n]\simeq p[n]=\beta e[n-T]\]
2536
2537 \end_inset 
2538
2539
2540 \layout Standard
2541
2542 where 
2543 \begin_inset Formula $T$
2544 \end_inset 
2545
2546  is the pitch period, 
2547 \begin_inset Formula $\beta $
2548 \end_inset 
2549
2550  is the pitch gain.
2559  We call that long-term prediction since the excitation is predicted from
2560  
2561 \begin_inset Formula $e[n-T]$
2562 \end_inset 
2563
2564  with 
2565 \begin_inset Formula $T\gg N$
2566 \end_inset 
2567
2568 .
2569 \layout Subsection
2570
2571 Innovation Codebook
2572 \layout Standard
2573
2574 The final excitation 
2575 \begin_inset Formula $e[n]$
2576 \end_inset 
2577
2578  will be the sum of the pitch prediction and an 
2579 \emph on 
2580 innovation
2581 \emph default 
2582  signal 
2583 \begin_inset Formula $c[n]$
2584 \end_inset 
2585
2586  taken from a fixed codebook.
2587 \layout Standard
2588
2589
2590 \begin_inset Formula \[
2591 e[n]=p[n]+c[n]=\beta e[n-T]+c[n]\]
2592
2593 \end_inset 
2594
2595 This is where most of the bits in a CELP codec are allocated.
2596  It represents the information that couldn't be obtained either from linear
2597  prediction or pitch prediction.
2598  In the 
2599 \emph on 
2600 z
2601 \emph default 
2602 -domain we can represent the final signal 
2603 \begin_inset Formula $X(z)$
2604 \end_inset 
2605
2606  as 
2607 \begin_inset Formula \[
2608 X(z)=\frac{C(z)}{A(z)\left(1-\beta z^{-T}\right)}\]
2609
2610 \end_inset 
2611
2612
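\layout Standard

As an illustration only, the excitation model above can be sketched in a few lines of Python.
 The function name, argument names and the layout of the history buffer are assumptions made for this sketch; this is not the Speex implementation.

```python
def build_excitation(innovation, beta, pitch_period, history):
    """Return e[n] = beta * e[n - T] + c[n] for one block of samples.

    innovation   -- the c[n] samples for the block
    beta         -- pitch gain
    pitch_period -- T, in samples
    history      -- past excitation, most recent sample last (hypothetical layout)
    """
    if len(history) < pitch_period:
        raise ValueError("history must cover at least one pitch period")
    buf = list(history)
    start = len(buf)
    for c_n in innovation:
        # The sample being produced has index len(buf), so e[n - T] is
        # found T positions back in the same buffer.
        buf.append(beta * buf[len(buf) - pitch_period] + c_n)
    return buf[start:]
```

\layout Standard

Because each new sample is appended to the same buffer it is read from, the sketch keeps working when the block is longer than the pitch period.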
2613 \layout Subsection
2614
2615 Analysis-by-Synthesis and Error Weighting
2616 \begin_inset LatexCommand \index{error weighting}
2617
2618 \end_inset 
2619
2620
2621 \begin_inset LatexCommand \index{analysis-by-synthesis}
2622
2623 \end_inset 
2624
2625
2626 \layout Standard
2627
2628 Most (if not all) modern audio codecs attempt to 
2629 \begin_inset Quotes eld
2630 \end_inset 
2631
2632 shape
2633 \begin_inset Quotes erd
2634 \end_inset 
2635
2636  the noise so that it appears mostly in the frequency regions where the
2637  ear cannot detect it.
2638  For example, the ear is more tolerant to noise in parts of the spectrum
2639  that are louder and 
2640 \emph on 
2641 vice versa
2642 \emph default 
2643 .
2644  That's why instead of minimizing the simple quadratic error
2645 \begin_inset Formula \[
2646 E=\sum _{n}\left(x[n]-\overline{x}[n]\right)^{2}\]
2647
2648 \end_inset 
2649
2650 where 
2651 \begin_inset Formula $\overline{x}[n]$
2652 \end_inset 
2653
2654  is the encoded signal, we minimize the error for the perceptually weighted
2655  signal
2656 \begin_inset Formula \[
2657 X_{w}(z)=W(z)X(z)\]
2658
2659 \end_inset 
2660
2661 where 
2662 \begin_inset Formula $W(z)$
2663 \end_inset 
2664
2665  is the weighting filter, usually of the form
2666 \layout Standard
2667
2668
2669 \begin_inset Formula \begin{equation}
2670 W(z)=\frac{A\left(\frac{z}{\gamma _{1}}\right)}{A\left(\frac{z}{\gamma _{2}}\right)}\label{eq:weighting_filter}\end{equation}
2671
2672 \end_inset 
2673
2674
2675 \layout Standard
2676
2677 with control parameters 
2678 \begin_inset Formula $\gamma _{1}>\gamma _{2}$
2679 \end_inset 
2680
2681 .
2682  If the noise is white in the perceptually weighted domain, then in the
2683  signal domain its spectral shape will be of the form
2684 \begin_inset Formula \[
2685 A_{noise}(z)=\frac{1}{W(z)}=\frac{A\left(\frac{z}{\gamma _{2}}\right)}{A\left(\frac{z}{\gamma _{1}}\right)}\]
2686
2687 \end_inset 
2688
2689
2690 \layout Standard
2691
2692 If a filter 
2693 \begin_inset Formula $A(z)$
2694 \end_inset 
2695
2696  has (complex) poles at 
2697 \begin_inset Formula $p_{i}$
2698 \end_inset 
2699
2700  in the 
2701 \begin_inset Formula $z$
2702 \end_inset 
2703
2704 -plane, the filter 
2705 \begin_inset Formula $A(z/\gamma )$
2706 \end_inset 
2707
2708  will have its poles at 
2709 \begin_inset Formula $p_{i}^{'}=\gamma p_{i}$
2710 \end_inset 
2711
2712 , making it a flatter version of 
2713 \begin_inset Formula $A(z)$
2714 \end_inset 
2715
2716 .
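\layout Standard

The scaling property can be checked numerically.
 The sketch below assumes the filter is stored as a list of coefficients a[k] multiplying z^-k, so that replacing z by z/gamma multiplies a[k] by gamma^k.
 The helper names and the second-order example filter are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import cmath


def expand_bandwidth(a, gamma):
    """Coefficients of A(z/gamma), given A(z) = sum_k a[k] * z**-k."""
    return [a_k * gamma ** k for k, a_k in enumerate(a)]


def second_order_roots(a):
    """Roots of z**2 + a[1]*z + a[2]; assumes a[0] == 1 and len(a) == 3."""
    disc = cmath.sqrt(a[1] ** 2 - 4 * a[2])
    return ((-a[1] + disc) / 2, (-a[1] - disc) / 2)


# Example filter with roots at 0.6 +/- 0.6j (illustrative values only).
a = [1.0, -1.2, 0.72]
original = second_order_roots(a)
scaled = second_order_roots(expand_bandwidth(a, 0.9))
```

\layout Standard

Each root of the scaled filter equals the corresponding root of the original multiplied by gamma, so the roots move towards the origin and the response becomes flatter.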
2717 \layout Section
2718 \pagebreak_top 
2719 Speex narrowband mode
2720 \begin_inset LatexCommand \label{sec:Speex-narrowband-mode}
2721
2722 \end_inset 
2723
2724
2725 \begin_inset LatexCommand \index{narrowband}
2726
2727 \end_inset 
2728
2729
2730 \layout Standard
2731
2732 This section looks at how Speex works for narrowband (
2733 \begin_inset Formula $8\: \mathrm{kHz}$
2734 \end_inset 
2735
2736  sampling rate) operation.
2737  The frame size for this mode is 
2738 \begin_inset Formula $20\: \mathrm{ms}$
2739 \end_inset 
2740
2741 , corresponding to 160 samples.
2742  Each frame is also subdivided into 4 sub-frames of 40 samples each.
2743 \layout Standard
2744
2745 Many design decisions were based on the original goals and assumptions:
2746 \layout Itemize
2747
2748 Minimizing the amount of information extracted from past frames (for robustness
2749  to packet loss)
2750 \layout Itemize
2751
2752 Dynamically-selectable codebooks (LSP, pitch and innovation)
2753 \layout Itemize
2754
2755 Sub-vector fixed (innovation) codebooks
2756 \layout Subsection
2757
2758 LPC Analysis
2759 \begin_inset LatexCommand \index{linear prediction}
2760
2761 \end_inset 
2762
2763
2764 \layout Standard
2765
2766 An LPC analysis is first performed on an (asymmetric Hamming) window that
2767  spans the whole current frame and half a frame in advance.
2768  The LPC coefficients are then converted to Line Spectral Pair
2769 \begin_inset LatexCommand \index{line spectral pair}
2770
2771 \end_inset 
2772
2773  (LSP), a representation that is more robust to quantization.
2774  The LSPs are considered to be associated with the 
2775 \begin_inset Formula $4^{th}$
2776 \end_inset 
2777
2778  sub-frame, and the LSPs associated with the first 3 sub-frames are linearly
2779  interpolated using the current and previous LSPs.
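\layout Standard

A minimal sketch of this interpolation follows.
 The text only states that the interpolation is linear between the previous and the current LSPs; the evenly spaced weights (1/4, 2/4, 3/4, 1) and the function name are assumptions made for illustration.

```python
def interpolate_lsp(prev_lsp, cur_lsp, num_subframes=4):
    """Return one LSP vector per sub-frame.

    The last sub-frame receives cur_lsp unchanged; earlier sub-frames
    receive a linear blend of prev_lsp and cur_lsp (assumed even spacing).
    """
    per_subframe = []
    for k in range(1, num_subframes + 1):
        w = k / num_subframes  # weight of the current frame's LSPs
        per_subframe.append(
            [(1 - w) * p + w * c for p, c in zip(prev_lsp, cur_lsp)]
        )
    return per_subframe
```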
2780 \layout Standard
2781
2782 The LSPs are encoded using 30 bits for higher quality modes and 18 bits
2783  for lower quality, through the use of a multi-stage split-vector quantizer.
2784  For the lower quality modes, the 10 coefficients are first quantized with
2785  6 bits and the error is then divided into two 5-coefficient sub-vectors.
2786  Each of them is quantized with 6 bits, for a total of 18 bits.
2787  For the higher quality modes, the remaining error on both sub-vectors is
2788  further quantized with 6 bits each, for a total of 30 bits.
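\layout Standard

The bit counts above follow directly from the number of codebook indices that are transmitted.
 The short sketch below only reproduces that arithmetic; the function name is hypothetical and no actual quantization is performed.

```python
BITS_PER_CODEBOOK = 6  # each codebook index costs 6 bits (64 entries)


def lsp_bit_budget(high_quality):
    """Total number of bits spent on the LSPs of one frame."""
    # Stage 1: one codebook covering all 10 coefficients.
    # Stage 2: the error is split into two 5-coefficient sub-vectors,
    #          each with its own codebook.
    codebooks = 1 + 2
    if high_quality:
        # Stage 3: the remaining error on both sub-vectors is refined again.
        codebooks += 2
    return codebooks * BITS_PER_CODEBOOK
```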
2789 \layout Standard
2790
2791 The perceptual weighting filter 
2792 \begin_inset Formula $W(z)$
2793 \end_inset 
2794
2795  used by Speex is derived from the LPC filter 
2796 \begin_inset Formula $A(z)$
2797 \end_inset 
2798
2799  and corresponds to the one described by eq.
2800  
2801 \begin_inset LatexCommand \ref{eq:weighting_filter}
2802
2803 \end_inset 
2804
2805  with 
2806 \begin_inset Formula $\gamma _{1}=0.9$
2807 \end_inset 
2808
2809  and 
2810 \begin_inset Formula $\gamma _{2}=0.6$
2811 \end_inset 
2812
2813 .
2814  We can use the unquantized 
2815 \begin_inset Formula $A(z)$
2816 \end_inset 
2817
2818  filter since the weighting filter is only used in the encoder.
2819 \layout Subsection
2820
2821 Pitch Prediction (adaptive codebook)
2822 \begin_inset LatexCommand \index{pitch}
2823
2824 \end_inset 
2825
2826
2827 \layout Standard
2828
2829 Speex uses a 3-tap prediction for pitch.
2830  That is, the pitch prediction signal 
2831 \begin_inset Formula $p[n]$
2832 \end_inset 
2833
2834  is obtained from the past of the excitation by:
2835 \begin_inset Formula \[
2836 p[n]=\beta _{0}e[n-T-1]+\beta _{1}e[n-T]+\beta _{2}e[n-T+1]\]
2837
2838 \end_inset 
2839
2840
2841 \layout Standard
2842
2843 where 
2844 \begin_inset Formula $T$
2845 \end_inset 
2846
2847  is the pitch period and the 
2848 \begin_inset Formula $\beta _{i}$
2849 \end_inset 
2850
2851  are the prediction (filter) taps.
2852  It is worth noting that when the pitch is smaller than the sub-frame size,
2853  we repeat the excitation at a period 
2854 \begin_inset Formula $T$
2855 \end_inset 
2856
2857 .
2858  For example, when the sample at index 
2859 \begin_inset Formula $n-T+1$
2860 \end_inset 
2861
2862  is not yet available (it lies in the current sub-frame), we use 
2863 \begin_inset Formula $n-2T+1$
2864 \end_inset 
2865
2866  instead.
2867  The period and quantized gains are determined in closed loop.
2868  In most modes, the pitch period is encoded with 7 bits in the 
2869 \begin_inset Formula $\left[17,144\right]$
2870 \end_inset 
2871
2872  range and the 
2873 \begin_inset Formula $\beta _{i}$
2874 \end_inset 
2875
2876  coefficients are vector-quantized using 7 bits at higher bit-rates (15 kbps
2877  narrowband and above) and 5 bits at lower bit-rates (11 kbps narrowband
2878  and below).
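\layout Standard

The 3-tap predictor and the repetition rule for short pitch periods can be sketched as follows.
 This is an illustration under stated assumptions, not the Speex source: the function names, the buffer layout (past excitation with the most recent sample last, n counted from the start of the sub-frame) and the mapping of the 7-bit index to the period (index = T - 17) are all hypothetical.

```python
def pitch_prediction(past_exc, taps, pitch_period, subframe_size=40):
    """p[n] = b0*e[n-T-1] + b1*e[n-T] + b2*e[n-T+1] for one sub-frame.

    past_exc must hold at least pitch_period + 1 samples.
    """
    if len(past_exc) < pitch_period + 1:
        raise ValueError("need at least T + 1 past excitation samples")

    def past(index):
        # Indices >= 0 point inside the current sub-frame, where the
        # excitation is not known yet, so step back by whole periods.
        while index >= 0:
            index -= pitch_period
        return past_exc[len(past_exc) + index]

    b0, b1, b2 = taps
    t = pitch_period
    return [
        b0 * past(n - t - 1) + b1 * past(n - t) + b2 * past(n - t + 1)
        for n in range(subframe_size)
    ]


def encode_pitch_period(pitch_period, min_period=17, bits=7):
    """Map a period in [17, 144] to a 7-bit index (assumed offset coding)."""
    index = pitch_period - min_period
    if not 0 <= index < (1 << bits):
        raise ValueError("pitch period outside the encodable range")
    return index
```

\layout Standard

With 7 bits the index covers 128 values, which matches the 128 integer periods between 17 and 144.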
2879 \layout Subsection
2880
2881 Innovation Codebook
2882 \layout Standard
2883
2884 In Speex, the innovation signal is quantized using shape-only vector quantizatio
2885 n (VQ).
2886  That means that the codebooks that are used represent both the shape and
2887  the gain at the same time.
2888  This saves many bits that would otherwise be allocated for a separate gain
2889  at the price of a slight increase in complexity.
2890  
2891 \layout Subsection
2892
2893 Bit allocation
2894 \layout Standard
2895
2896 There are 7 different narrowband bit-rates defined for Speex, ranging from
2897  200 bps to 18.15 kbps, although the modes below 5.9 kbps should not be used
2898  for speech.
2899  The bit-allocation for each mode is detailed in table 
2900 \begin_inset LatexCommand \ref{cap:bits-narrowband}
2901
2902 \end_inset 
2903
2904 .
2905  Each frame starts with the mode ID encoded with 4 bits which allows a range
2906  from 0 to 15, though only the first 7 values are used (the others are reserved).
2907  The parameters are listed in the table in the order they are packed in
2908  the bit-stream.
2909  All frame-based parameters are packed before sub-frame parameters.
2910  The parameters for a certain sub-frame are all packed before the following
2911  sub-frame is packed.
2912  Note that the 
2913 \begin_inset Quotes eld
2914 \end_inset 
2915
2916 OL
2917 \begin_inset Quotes erd
2918 \end_inset 
2919
2920  in the parameter description means that the parameter is an open-loop estimate
2921  based on the whole frame.
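\layout Standard

As a worked example, the size of a frame and the resulting bit-rate can be computed from the per-parameter counts in the bit-allocation table.
 The sketch assumes that frame parameters are sent once per 20 ms frame and sub-frame parameters four times; the dictionary layout and names are hypothetical, and only modes 1 to 3 are copied from the table (wideband bit included).

```python
FRAMES_PER_SECOND = 50    # one frame every 20 ms
SUBFRAMES_PER_FRAME = 4

# Frame parameters:     wideband bit, mode ID, LSP, OL pitch,
#                       OL pitch gain, OL excitation gain
# Sub-frame parameters: fine pitch, pitch gain, innovation gain,
#                       innovation VQ
MODES = {
    1: {"frame": [1, 4, 18, 7, 4, 5], "subframe": [0, 0, 1, 0]},
    2: {"frame": [1, 4, 18, 7, 0, 5], "subframe": [0, 5, 0, 16]},
    3: {"frame": [1, 4, 18, 0, 0, 5], "subframe": [7, 5, 1, 20]},
}


def bits_per_frame(mode):
    """Total bits in one frame of the given mode."""
    alloc = MODES[mode]
    return sum(alloc["frame"]) + SUBFRAMES_PER_FRAME * sum(alloc["subframe"])


def bit_rate(mode):
    """Bit-rate in bits per second."""
    return bits_per_frame(mode) * FRAMES_PER_SECOND
```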
2922 \layout Standard
2923
2924
2925 \begin_inset Float table
2926 placement h
2927 wide true
2928 collapsed false
2929
2930 \layout Standard
2931
2932
2933 \begin_inset  Tabular
2934 <lyxtabular version="3" rows="12" columns="11">
2935 <features>
2936 <column alignment="center" valignment="top" leftline="true" width="0pt">
2937 <column alignment="center" valignment="top" leftline="true" width="0pt">
2938 <column alignment="center" valignment="top" leftline="true" width="0pt">
2939 <column alignment="center" valignment="top" leftline="true" width="0pt">
2940 <column alignment="center" valignment="top" leftline="true" width="0pt">
2941 <column alignment="center" valignment="top" leftline="true" width="0pt">
2942 <column alignment="center" valignment="top" leftline="true" width="0pt">
2943 <column alignment="center" valignment="top" leftline="true" width="0pt">
2944 <column alignment="center" valignment="top" leftline="true" width="0pt">
2945 <column alignment="center" valignment="top" leftline="true" width="0pt">
2946 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
2947 <row topline="true" bottomline="true">
2948 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2949 \begin_inset Text
2950
2951 \layout Standard
2952
2953 Parameter
2954 \end_inset 
2955 </cell>
2956 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2957 \begin_inset Text
2958
2959 \layout Standard
2960
2961 Update rate
2962 \end_inset 
2963 </cell>
2964 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2965 \begin_inset Text
2966
2967 \layout Standard
2968
2969 0
2970 \end_inset 
2971 </cell>
2972 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2973 \begin_inset Text
2974
2975 \layout Standard
2976
2977 1
2978 \end_inset 
2979 </cell>
2980 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2981 \begin_inset Text
2982
2983 \layout Standard
2984
2985 2
2986 \end_inset 
2987 </cell>
2988 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2989 \begin_inset Text
2990
2991 \layout Standard
2992
2993 3
2994 \end_inset 
2995 </cell>
2996 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
2997 \begin_inset Text
2998
2999 \layout Standard
3000
3001 4
3002 \end_inset 
3003 </cell>
3004 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3005 \begin_inset Text
3006
3007 \layout Standard
3008
3009 5
3010 \end_inset 
3011 </cell>
3012 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3013 \begin_inset Text
3014
3015 \layout Standard
3016
3017 6
3018 \end_inset 
3019 </cell>
3020 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3021 \begin_inset Text
3022
3023 \layout Standard
3024
3025 7
3026 \end_inset 
3027 </cell>
3028 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3029 \begin_inset Text
3030
3031 \layout Standard
3032
3033 8
3034 \end_inset 
3035 </cell>
3036 </row>
3037 <row topline="true">
3038 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3039 \begin_inset Text
3040
3041 \layout Standard
3042
3043 Wideband bit
3044 \end_inset 
3045 </cell>
3046 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3047 \begin_inset Text
3048
3049 \layout Standard
3050
3051 frame
3052 \end_inset 
3053 </cell>
3054 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3055 \begin_inset Text
3056
3057 \layout Standard
3058
3059 1
3060 \end_inset 
3061 </cell>
3062 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3063 \begin_inset Text
3064
3065 \layout Standard
3066
3067 1
3068 \end_inset 
3069 </cell>
3070 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3071 \begin_inset Text
3072
3073 \layout Standard
3074
3075 1
3076 \end_inset 
3077 </cell>
3078 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3079 \begin_inset Text
3080
3081 \layout Standard
3082
3083 1
3084 \end_inset 
3085 </cell>
3086 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3087 \begin_inset Text
3088
3089 \layout Standard
3090
3091 1
3092 \end_inset 
3093 </cell>
3094 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3095 \begin_inset Text
3096
3097 \layout Standard
3098
3099 1
3100 \end_inset 
3101 </cell>
3102 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3103 \begin_inset Text
3104
3105 \layout Standard
3106
3107 1
3108 \end_inset 
3109 </cell>
3110 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3111 \begin_inset Text
3112
3113 \layout Standard
3114
3115 1
3116 \end_inset 
3117 </cell>
3118 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3119 \begin_inset Text
3120
3121 \layout Standard
3122
3123 1
3124 \end_inset 
3125 </cell>
3126 </row>
3127 <row topline="true">
3128 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3129 \begin_inset Text
3130
3131 \layout Standard
3132
3133 Mode ID
3134 \end_inset 
3135 </cell>
3136 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3137 \begin_inset Text
3138
3139 \layout Standard
3140
3141 frame
3142 \end_inset 
3143 </cell>
3144 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3145 \begin_inset Text
3146
3147 \layout Standard
3148
3149 4
3150 \end_inset 
3151 </cell>
3152 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3153 \begin_inset Text
3154
3155 \layout Standard
3156
3157 4
3158 \end_inset 
3159 </cell>
3160 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3161 \begin_inset Text
3162
3163 \layout Standard
3164
3165 4
3166 \end_inset 
3167 </cell>
3168 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3169 \begin_inset Text
3170
3171 \layout Standard
3172
3173 4
3174 \end_inset 
3175 </cell>
3176 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3177 \begin_inset Text
3178
3179 \layout Standard
3180
3181 4
3182 \end_inset 
3183 </cell>
3184 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3185 \begin_inset Text
3186
3187 \layout Standard
3188
3189 4
3190 \end_inset 
3191 </cell>
3192 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3193 \begin_inset Text
3194
3195 \layout Standard
3196
3197 4
3198 \end_inset 
3199 </cell>
3200 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3201 \begin_inset Text
3202
3203 \layout Standard
3204
3205 4
3206 \end_inset 
3207 </cell>
3208 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3209 \begin_inset Text
3210
3211 \layout Standard
3212
3213 4
3214 \end_inset 
3215 </cell>
3216 </row>
3217 <row topline="true">
3218 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3219 \begin_inset Text
3220
3221 \layout Standard
3222
3223 LSP
3224 \end_inset 
3225 </cell>
3226 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3227 \begin_inset Text
3228
3229 \layout Standard
3230
3231 frame
3232 \end_inset 
3233 </cell>
3234 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3235 \begin_inset Text
3236
3237 \layout Standard
3238
3239 0
3240 \end_inset 
3241 </cell>
3242 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3243 \begin_inset Text
3244
3245 \layout Standard
3246
3247 18
3248 \end_inset 
3249 </cell>
3250 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3251 \begin_inset Text
3252
3253 \layout Standard
3254
3255 18
3256 \end_inset 
3257 </cell>
3258 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3259 \begin_inset Text
3260
3261 \layout Standard
3262
3263 18
3264 \end_inset 
3265 </cell>
3266 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3267 \begin_inset Text
3268
3269 \layout Standard
3270
3271 18
3272 \end_inset 
3273 </cell>
3274 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3275 \begin_inset Text
3276
3277 \layout Standard
3278
3279 30
3280 \end_inset 
3281 </cell>
3282 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3283 \begin_inset Text
3284
3285 \layout Standard
3286
3287 30
3288 \end_inset 
3289 </cell>
3290 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3291 \begin_inset Text
3292
3293 \layout Standard
3294
3295 30
3296 \end_inset 
3297 </cell>
3298 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3299 \begin_inset Text
3300
3301 \layout Standard
3302
3303 18
3304 \end_inset 
3305 </cell>
3306 </row>
3307 <row topline="true">
3308 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3309 \begin_inset Text
3310
3311 \layout Standard
3312
3313 OL pitch
3314 \end_inset 
3315 </cell>
3316 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3317 \begin_inset Text
3318
3319 \layout Standard
3320
3321 frame
3322 \end_inset 
3323 </cell>
3324 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3325 \begin_inset Text
3326
3327 \layout Standard
3328
3329 0
3330 \end_inset 
3331 </cell>
3332 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3333 \begin_inset Text
3334
3335 \layout Standard
3336
3337 7
3338 \end_inset 
3339 </cell>
3340 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3341 \begin_inset Text
3342
3343 \layout Standard
3344
3345 7
3346 \end_inset 
3347 </cell>
3348 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3349 \begin_inset Text
3350
3351 \layout Standard
3352
3353 0
3354 \end_inset 
3355 </cell>
3356 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3357 \begin_inset Text
3358
3359 \layout Standard
3360
3361 0
3362 \end_inset 
3363 </cell>
3364 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3365 \begin_inset Text
3366
3367 \layout Standard
3368
3369 0
3370 \end_inset 
3371 </cell>
3372 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3373 \begin_inset Text
3374
3375 \layout Standard
3376
3377 0
3378 \end_inset 
3379 </cell>
3380 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3381 \begin_inset Text
3382
3383 \layout Standard
3384
3385 0
3386 \end_inset 
3387 </cell>
3388 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3389 \begin_inset Text
3390
3391 \layout Standard
3392
3393 7
3394 \end_inset 
3395 </cell>
3396 </row>
3397 <row topline="true">
3398 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3399 \begin_inset Text
3400
3401 \layout Standard
3402
3403 OL pitch gain
3404 \end_inset 
3405 </cell>
3406 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3407 \begin_inset Text
3408
3409 \layout Standard
3410
3411 frame
3412 \end_inset 
3413 </cell>
3414 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3415 \begin_inset Text
3416
3417 \layout Standard
3418
3419 0
3420 \end_inset 
3421 </cell>
3422 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3423 \begin_inset Text
3424
3425 \layout Standard
3426
3427 4
3428 \end_inset 
3429 </cell>
3430 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3431 \begin_inset Text
3432
3433 \layout Standard
3434
3435 0
3436 \end_inset 
3437 </cell>
3438 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3439 \begin_inset Text
3440
3441 \layout Standard
3442
3443 0
3444 \end_inset 
3445 </cell>
3446 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3447 \begin_inset Text
3448
3449 \layout Standard
3450
3451 0
3452 \end_inset 
3453 </cell>
3454 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3455 \begin_inset Text
3456
3457 \layout Standard
3458
3459 0
3460 \end_inset 
3461 </cell>
3462 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3463 \begin_inset Text
3464
3465 \layout Standard
3466
3467 0
3468 \end_inset 
3469 </cell>
3470 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3471 \begin_inset Text
3472
3473 \layout Standard
3474
3475 0
3476 \end_inset 
3477 </cell>
3478 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3479 \begin_inset Text
3480
3481 \layout Standard
3482
3483 4
3484 \end_inset 
3485 </cell>
3486 </row>
3487 <row topline="true">
3488 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3489 \begin_inset Text
3490
3491 \layout Standard
3492
3493 OL Exc gain
3494 \end_inset 
3495 </cell>
3496 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3497 \begin_inset Text
3498
3499 \layout Standard
3500
3501 frame
3502 \end_inset 
3503 </cell>
3504 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3505 \begin_inset Text
3506
3507 \layout Standard
3508
3509 0
3510 \end_inset 
3511 </cell>
3512 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3513 \begin_inset Text
3514
3515 \layout Standard
3516
3517 5
3518 \end_inset 
3519 </cell>
3520 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3521 \begin_inset Text
3522
3523 \layout Standard
3524
3525 5
3526 \end_inset 
3527 </cell>
3528 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3529 \begin_inset Text
3530
3531 \layout Standard
3532
3533 5
3534 \end_inset 
3535 </cell>
3536 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3537 \begin_inset Text
3538
3539 \layout Standard
3540
3541 5
3542 \end_inset 
3543 </cell>
3544 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3545 \begin_inset Text
3546
3547 \layout Standard
3548
3549 5
3550 \end_inset 
3551 </cell>
3552 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3553 \begin_inset Text
3554
3555 \layout Standard
3556
3557 5
3558 \end_inset 
3559 </cell>
3560 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3561 \begin_inset Text
3562
3563 \layout Standard
3564
3565 5
3566 \end_inset 
3567 </cell>
3568 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3569 \begin_inset Text
3570
3571 \layout Standard
3572
3573 5
3574 \end_inset 
3575 </cell>
3576 </row>
3577 <row topline="true">
3578 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3579 \begin_inset Text
3580
3581 \layout Standard
3582
3583 Fine pitch
3584 \end_inset 
3585 </cell>
3586 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3587 \begin_inset Text
3588
3589 \layout Standard
3590
3591 sub-frame
3592 \end_inset 
3593 </cell>
3594 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3595 \begin_inset Text
3596
3597 \layout Standard
3598
3599 0
3600 \end_inset 
3601 </cell>
3602 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3603 \begin_inset Text
3604
3605 \layout Standard
3606
3607 0
3608 \end_inset 
3609 </cell>
3610 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3611 \begin_inset Text
3612
3613 \layout Standard
3614
3615 0
3616 \end_inset 
3617 </cell>
3618 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3619 \begin_inset Text
3620
3621 \layout Standard
3622
3623 7
3624 \end_inset 
3625 </cell>
3626 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3627 \begin_inset Text
3628
3629 \layout Standard
3630
3631 7
3632 \end_inset 
3633 </cell>
3634 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3635 \begin_inset Text
3636
3637 \layout Standard
3638
3639 7
3640 \end_inset 
3641 </cell>
3642 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3643 \begin_inset Text
3644
3645 \layout Standard
3646
3647 7
3648 \end_inset 
3649 </cell>
3650 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3651 \begin_inset Text
3652
3653 \layout Standard
3654
3655 7
3656 \end_inset 
3657 </cell>
3658 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3659 \begin_inset Text
3660
3661 \layout Standard
3662
3663 0
3664 \end_inset 
3665 </cell>
3666 </row>
3667 <row topline="true">
3668 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3669 \begin_inset Text
3670
3671 \layout Standard
3672
3673 Pitch gain
3674 \end_inset 
3675 </cell>
3676 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3677 \begin_inset Text
3678
3679 \layout Standard
3680
3681 sub-frame
3682 \end_inset 
3683 </cell>
3684 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3685 \begin_inset Text
3686
3687 \layout Standard
3688
3689 0
3690 \end_inset 
3691 </cell>
3692 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3693 \begin_inset Text
3694
3695 \layout Standard
3696
3697 0
3698 \end_inset 
3699 </cell>
3700 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3701 \begin_inset Text
3702
3703 \layout Standard
3704
3705 5
3706 \end_inset 
3707 </cell>
3708 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3709 \begin_inset Text
3710
3711 \layout Standard
3712
3713 5
3714 \end_inset 
3715 </cell>
3716 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3717 \begin_inset Text
3718
3719 \layout Standard
3720
3721 5
3722 \end_inset 
3723 </cell>
3724 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3725 \begin_inset Text
3726
3727 \layout Standard
3728
3729 7
3730 \end_inset 
3731 </cell>
3732 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3733 \begin_inset Text
3734
3735 \layout Standard
3736
3737 7
3738 \end_inset 
3739 </cell>
3740 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3741 \begin_inset Text
3742
3743 \layout Standard
3744
3745 7
3746 \end_inset 
3747 </cell>
3748 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3749 \begin_inset Text
3750
3751 \layout Standard
3752
3753 0
3754 \end_inset 
3755 </cell>
3756 </row>
3757 <row topline="true">
3758 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3759 \begin_inset Text
3760
3761 \layout Standard
3762
3763 Innovation gain
3764 \end_inset 
3765 </cell>
3766 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3767 \begin_inset Text
3768
3769 \layout Standard
3770
3771 sub-frame
3772 \end_inset 
3773 </cell>
3774 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3775 \begin_inset Text
3776
3777 \layout Standard
3778
3779 0
3780 \end_inset 
3781 </cell>
3782 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3783 \begin_inset Text
3784
3785 \layout Standard
3786
3787 1
3788 \end_inset 
3789 </cell>
3790 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3791 \begin_inset Text
3792
3793 \layout Standard
3794
3795 0
3796 \end_inset 
3797 </cell>
3798 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3799 \begin_inset Text
3800
3801 \layout Standard
3802
3803 1
3804 \end_inset 
3805 </cell>
3806 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3807 \begin_inset Text
3808
3809 \layout Standard
3810
3811 1
3812 \end_inset 
3813 </cell>
3814 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3815 \begin_inset Text
3816
3817 \layout Standard
3818
3819 3
3820 \end_inset 
3821 </cell>
3822 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3823 \begin_inset Text
3824
3825 \layout Standard
3826
3827 3
3828 \end_inset 
3829 </cell>
3830 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3831 \begin_inset Text
3832
3833 \layout Standard
3834
3835 3
3836 \end_inset 
3837 </cell>
3838 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3839 \begin_inset Text
3840
3841 \layout Standard
3842
3843 0
3844 \end_inset 
3845 </cell>
3846 </row>
3847 <row topline="true" bottomline="true">
3848 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3849 \begin_inset Text
3850
3851 \layout Standard
3852
3853 Innovation VQ
3854 \end_inset 
3855 </cell>
3856 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3857 \begin_inset Text
3858
3859 \layout Standard
3860
3861 sub-frame
3862 \end_inset 
3863 </cell>
3864 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3865 \begin_inset Text
3866
3867 \layout Standard
3868
3869 0
3870 \end_inset 
3871 </cell>
3872 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3873 \begin_inset Text
3874
3875 \layout Standard
3876
3877 0
3878 \end_inset 
3879 </cell>
3880 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3881 \begin_inset Text
3882
3883 \layout Standard
3884
3885 16
3886 \end_inset 
3887 </cell>
3888 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3889 \begin_inset Text
3890
3891 \layout Standard
3892
3893 20
3894 \end_inset 
3895 </cell>
3896 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3897 \begin_inset Text
3898
3899 \layout Standard
3900
3901 35
3902 \end_inset 
3903 </cell>
3904 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3905 \begin_inset Text
3906
3907 \layout Standard
3908
3909 48
3910 \end_inset 
3911 </cell>
3912 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3913 \begin_inset Text
3914
3915 \layout Standard
3916
3917 64
3918 \end_inset 
3919 </cell>
3920 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3921 \begin_inset Text
3922
3923 \layout Standard
3924
3925 96
3926 \end_inset 
3927 </cell>
3928 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
3929 \begin_inset Text
3930
3931 \layout Standard
3932
3933 10
3934 \end_inset 
3935 </cell>
3936 </row>
3937 <row topline="true" bottomline="true">
3938 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3939 \begin_inset Text
3940
3941 \layout Standard
3942
3943 Total
3944 \end_inset 
3945 </cell>
3946 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3947 \begin_inset Text
3948
3949 \layout Standard
3950
3951 frame
3952 \end_inset 
3953 </cell>
3954 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3955 \begin_inset Text
3956
3957 \layout Standard
3958
3959 5
3960 \end_inset 
3961 </cell>
3962 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3963 \begin_inset Text
3964
3965 \layout Standard
3966
3967 43
3968 \end_inset 
3969 </cell>
3970 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3971 \begin_inset Text
3972
3973 \layout Standard
3974
3975 119
3976 \end_inset 
3977 </cell>
3978 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3979 \begin_inset Text
3980
3981 \layout Standard
3982
3983 160
3984 \end_inset 
3985 </cell>
3986 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3987 \begin_inset Text
3988
3989 \layout Standard
3990
3991 220
3992 \end_inset 
3993 </cell>
3994 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
3995 \begin_inset Text
3996
3997 \layout Standard
3998
3999 300
4000 \end_inset 
4001 </cell>
4002 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4003 \begin_inset Text
4004
4005 \layout Standard
4006
4007 364
4008 \end_inset 
4009 </cell>
4010 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4011 \begin_inset Text
4012
4013 \layout Standard
4014
4015 492
4016 \end_inset 
4017 </cell>
4018 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4019 \begin_inset Text
4020
4021 \layout Standard
4022
4023 79
4024 \end_inset 
4025 </cell>
4026 </row>
4027 </lyxtabular>
4028
4029 \end_inset 
4030
4031
4032 \layout Caption
4033
4034 Bit allocation for narrowband modes
4035 \begin_inset LatexCommand \label{cap:bits-narrowband}
4036
4037 \end_inset 
4038
4039
4040 \end_inset 
4041
4042
4043 \layout Standard
4044
4045 So far, no MOS (Mean Opinion Score
4046 \begin_inset LatexCommand \index{mean opinion score}
4047
4048 \end_inset 
4049
4050 ) subjective evaluation has been performed for Speex.
4051  In order to give an idea of the quality achievable with it, table 
4052 \begin_inset LatexCommand \ref{cap:quality_vs_bps}
4053
4054 \end_inset 
4055
4056  presents my own subjective opinion of it.
4057  Note that different people will perceive the quality differently
4058  and that the person who designed the codec often has a bias (one way or
4059  another) when it comes to subjective evaluation.
4060  Finally, for most codecs (including Speex), encoding
4061  quality sometimes varies depending on the input.
4062  The complexity figures are only approximate (within 0.5 mflops, using
4063  the lowest complexity setting).
4064  Decoding requires approximately 0.5 mflops
4065 \begin_inset LatexCommand \index{complexity}
4066
4067 \end_inset 
4068
4069  in most modes (1 mflops with perceptual enhancement).
4070 \layout Standard
4071
4072
4073 \begin_inset Float table
4074 placement h
4075 wide true
4076 collapsed false
4077
4078 \layout Standard
4079
4080
4081 \begin_inset  Tabular
4082 <lyxtabular version="3" rows="17" columns="4">
4083 <features>
4084 <column alignment="center" valignment="top" leftline="true" width="0pt">
4085 <column alignment="center" valignment="top" leftline="true" width="0pt">
4086 <column alignment="center" valignment="top" leftline="true" width="0pt">
4087 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
4088 <row topline="true" bottomline="true">
4089 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4090 \begin_inset Text
4091
4092 \layout Standard
4093
4094 Mode
4095 \end_inset 
4096 </cell>
4097 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4098 \begin_inset Text
4099
4100 \layout Standard
4101
4102 Bit-rate
4103 \begin_inset LatexCommand \index{bit-rate}
4104
4105 \end_inset 
4106
4107  (bps)
4108 \end_inset 
4109 </cell>
4110 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4111 \begin_inset Text
4112
4113 \layout Standard
4114
4115 mflops
4116 \begin_inset LatexCommand \index{complexity}
4117
4118 \end_inset 
4119
4120
4121 \end_inset 
4122 </cell>
4123 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4124 \begin_inset Text
4125
4126 \layout Standard
4127
4128 Quality/description
4129 \end_inset 
4130 </cell>
4131 </row>
4132 <row topline="true">
4133 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4134 \begin_inset Text
4135
4136 \layout Standard
4137
4138 0
4139 \end_inset 
4140 </cell>
4141 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4142 \begin_inset Text
4143
4144 \layout Standard
4145
4146 250
4147 \end_inset 
4148 </cell>
4149 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4150 \begin_inset Text
4151
4152 \layout Standard
4153
4154 N/A
4155 \end_inset 
4156 </cell>
4157 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4158 \begin_inset Text
4159
4160 \layout Standard
4161
4162 No sound (VBR only)
4163 \end_inset 
4164 </cell>
4165 </row>
4166 <row topline="true">
4167 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4168 \begin_inset Text
4169
4170 \layout Standard
4171
4172 1
4173 \end_inset 
4174 </cell>
4175 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4176 \begin_inset Text
4177
4178 \layout Standard
4179
4180 2,150
4181 \end_inset 
4182 </cell>
4183 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4184 \begin_inset Text
4185
4186 \layout Standard
4187
4188 6
4189 \end_inset 
4190 </cell>
4191 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4192 \begin_inset Text
4193
4194 \layout Standard
4195
4196 Vocoder (mostly for comfort noise)
4197 \end_inset 
4198 </cell>
4199 </row>
4200 <row topline="true">
4201 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4202 \begin_inset Text
4203
4204 \layout Standard
4205
4206 2
4207 \end_inset 
4208 </cell>
4209 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4210 \begin_inset Text
4211
4212 \layout Standard
4213
4214 5,950
4215 \end_inset 
4216 </cell>
4217 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4218 \begin_inset Text
4219
4220 \layout Standard
4221
4222 9
4223 \end_inset 
4224 </cell>
4225 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4226 \begin_inset Text
4227
4228 \layout Standard
4229
4230 Very noticeable artifacts/noise, good intelligibility
4231 \end_inset 
4232 </cell>
4233 </row>
4234 <row topline="true">
4235 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4236 \begin_inset Text
4237
4238 \layout Standard
4239
4240 3
4241 \end_inset 
4242 </cell>
4243 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4244 \begin_inset Text
4245
4246 \layout Standard
4247
4248 8,000
4249 \end_inset 
4250 </cell>
4251 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4252 \begin_inset Text
4253
4254 \layout Standard
4255
4256 10
4257 \end_inset 
4258 </cell>
4259 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4260 \begin_inset Text
4261
4262 \layout Standard
4263
4264 Artifacts/noise sometimes noticeable
4265 \end_inset 
4266 </cell>
4267 </row>
4268 <row topline="true">
4269 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4270 \begin_inset Text
4271
4272 \layout Standard
4273
4274 4
4275 \end_inset 
4276 </cell>
4277 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4278 \begin_inset Text
4279
4280 \layout Standard
4281
4282 11,000
4283 \end_inset 
4284 </cell>
4285 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4286 \begin_inset Text
4287
4288 \layout Standard
4289
4290 14
4291 \end_inset 
4292 </cell>
4293 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4294 \begin_inset Text
4295
4296 \layout Standard
4297
4298 Artifacts usually noticeable only with headphones
4299 \end_inset 
4300 </cell>
4301 </row>
4302 <row topline="true">
4303 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4304 \begin_inset Text
4305
4306 \layout Standard
4307
4308 5
4309 \end_inset 
4310 </cell>
4311 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4312 \begin_inset Text
4313
4314 \layout Standard
4315
4316 15,000
4317 \end_inset 
4318 </cell>
4319 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4320 \begin_inset Text
4321
4322 \layout Standard
4323
4324 11
4325 \end_inset 
4326 </cell>
4327 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4328 \begin_inset Text
4329
4330 \layout Standard
4331
4332 Need good headphones to tell the difference
4333 \end_inset 
4334 </cell>
4335 </row>
4336 <row topline="true">
4337 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4338 \begin_inset Text
4339
4340 \layout Standard
4341
4342 6
4343 \end_inset 
4344 </cell>
4345 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4346 \begin_inset Text
4347
4348 \layout Standard
4349
4350 18,200
4351 \end_inset 
4352 </cell>
4353 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4354 \begin_inset Text
4355
4356 \layout Standard
4357
4358 17.5
4359 \end_inset 
4360 </cell>
4361 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4362 \begin_inset Text
4363
4364 \layout Standard
4365
4366 Hard to tell the difference even with good headphones
4367 \end_inset 
4368 </cell>
4369 </row>
4370 <row topline="true">
4371 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4372 \begin_inset Text
4373
4374 \layout Standard
4375
4376 7
4377 \end_inset 
4378 </cell>
4379 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4380 \begin_inset Text
4381
4382 \layout Standard
4383
4384 24,600
4385 \end_inset 
4386 </cell>
4387 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4388 \begin_inset Text
4389
4390 \layout Standard
4391
4392 14.5
4393 \end_inset 
4394 </cell>
4395 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4396 \begin_inset Text
4397
4398 \layout Standard
4399
4400 Completely transparent for voice, good quality music
4401 \end_inset 
4402 </cell>
4403 </row>
4404 <row topline="true">
4405 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4406 \begin_inset Text
4407
4408 \layout Standard
4409
4410 8
4411 \end_inset 
4412 </cell>
4413 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4414 \begin_inset Text
4415
4416 \layout Standard
4417
4418 3,950
4419 \end_inset 
4420 </cell>
4421 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4422 \begin_inset Text
4423
4424 \layout Standard
4425
4426 -
4427 \end_inset 
4428 </cell>
4429 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4430 \begin_inset Text
4431
4432 \layout Standard
4433
4434 Very noticeable artifacts/noise, good intelligibility
4435 \end_inset 
4436 </cell>
4437 </row>
4438 <row topline="true">
4439 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4440 \begin_inset Text
4441
4442 \layout Standard
4443
4444 9
4445 \end_inset 
4446 </cell>
4447 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4448 \begin_inset Text
4449
4450 \layout Standard
4451
4452 N/A
4453 \end_inset 
4454 </cell>
4455 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4456 \begin_inset Text
4457
4458 \layout Standard
4459
4460 N/A
4461 \end_inset 
4462 </cell>
4463 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4464 \begin_inset Text
4465
4466 \layout Standard
4467
4468 reserved
4469 \end_inset 
4470 </cell>
4471 </row>
4472 <row topline="true">
4473 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4474 \begin_inset Text
4475
4476 \layout Standard
4477
4478 10
4479 \end_inset 
4480 </cell>
4481 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4482 \begin_inset Text
4483
4484 \layout Standard
4485
4486 N/A
4487 \end_inset 
4488 </cell>
4489 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4490 \begin_inset Text
4491
4492 \layout Standard
4493
4494 N/A
4495 \end_inset 
4496 </cell>
4497 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4498 \begin_inset Text
4499
4500 \layout Standard
4501
4502 reserved
4503 \end_inset 
4504 </cell>
4505 </row>
4506 <row topline="true">
4507 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4508 \begin_inset Text
4509
4510 \layout Standard
4511
4512 11
4513 \end_inset 
4514 </cell>
4515 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4516 \begin_inset Text
4517
4518 \layout Standard
4519
4520 N/A
4521 \end_inset 
4522 </cell>
4523 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4524 \begin_inset Text
4525
4526 \layout Standard
4527
4528 N/A
4529 \end_inset 
4530 </cell>
4531 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4532 \begin_inset Text
4533
4534 \layout Standard
4535
4536 reserved
4537 \end_inset 
4538 </cell>
4539 </row>
4540 <row topline="true">
4541 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4542 \begin_inset Text
4543
4544 \layout Standard
4545
4546 12
4547 \end_inset 
4548 </cell>
4549 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4550 \begin_inset Text
4551
4552 \layout Standard
4553
4554 N/A
4555 \end_inset 
4556 </cell>
4557 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4558 \begin_inset Text
4559
4560 \layout Standard
4561
4562 N/A
4563 \end_inset 
4564 </cell>
4565 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4566 \begin_inset Text
4567
4568 \layout Standard
4569
4570 reserved
4571 \end_inset 
4572 </cell>
4573 </row>
4574 <row topline="true">
4575 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4576 \begin_inset Text
4577
4578 \layout Standard
4579
4580 13
4581 \end_inset 
4582 </cell>
4583 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4584 \begin_inset Text
4585
4586 \layout Standard
4587
4588 N/A
4589 \end_inset 
4590 </cell>
4591 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4592 \begin_inset Text
4593
4594 \layout Standard
4595
4596 N/A
4597 \end_inset 
4598 </cell>
4599 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4600 \begin_inset Text
4601
4602 \layout Standard
4603
4604 Application-defined, interpreted by callback or skipped
4605 \end_inset 
4606 </cell>
4607 </row>
4608 <row topline="true">
4609 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4610 \begin_inset Text
4611
4612 \layout Standard
4613
4614 14
4615 \end_inset 
4616 </cell>
4617 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4618 \begin_inset Text
4619
4620 \layout Standard
4621
4622 N/A
4623 \end_inset 
4624 </cell>
4625 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4626 \begin_inset Text
4627
4628 \layout Standard
4629
4630 N/A
4631 \end_inset 
4632 </cell>
4633 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4634 \begin_inset Text
4635
4636 \layout Standard
4637
4638 Speex in-band signaling
4639 \end_inset 
4640 </cell>
4641 </row>
4642 <row topline="true" bottomline="true">
4643 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4644 \begin_inset Text
4645
4646 \layout Standard
4647
4648 15
4649 \end_inset 
4650 </cell>
4651 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4652 \begin_inset Text
4653
4654 \layout Standard
4655
4656 N/A
4657 \end_inset 
4658 </cell>
4659 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4660 \begin_inset Text
4661
4662 \layout Standard
4663
4664 N/A
4665 \end_inset 
4666 </cell>
4667 <cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
4668 \begin_inset Text
4669
4670 \layout Standard
4671
4672 Terminator code
4673 \end_inset 
4674 </cell>
4675 </row>
4676 </lyxtabular>
4677
4678 \end_inset 
4679
4680
4681 \layout Caption
4682
4683 Quality versus bit-rate
4684 \begin_inset LatexCommand \label{cap:quality_vs_bps}
4685
4686 \end_inset 
4687
4688
4689 \end_inset 
4690
4691
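\layout Standard

The bit-rates listed above appear to follow directly from the per-frame totals in the narrowband bit allocation table, if one assumes a 20 ms frame (50 frames per second).
 The frame duration is not stated in this part of the text, so it is treated as an assumption in the sketch below, and the names FRAME_MS and frame_bits_to_bps are illustrative, not part of the Speex API.

```c
#include <assert.h>

/* Assumption: one narrowband frame lasts 20 ms (50 frames per second). */
#define FRAME_MS 20

/* Convert a per-frame bit total into a bit-rate in bits per second.
 * Illustrative helper, not a Speex library function. */
static int frame_bits_to_bps(int bits_per_frame)
{
    assert(bits_per_frame >= 0);
    return bits_per_frame * 1000 / FRAME_MS;
}
```

Under that assumption, a total of 300 bits per frame gives 15,000 bps and a total of 43 bits gives 2,150 bps, which agrees with the corresponding rows of the table.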
4692 \layout Subsection
4693
4694 Perceptual enhancement
4695 \begin_inset LatexCommand \index{perceptual enhancement}
4696
4697 \end_inset 
4698
4699
4700 \layout Standard
4701
4702 This part of the codec only applies to the decoder and can even be changed
4703  without affecting inter-operability.
4704  For that reason, the implementation provided and described here should
4705  only be considered as a reference implementation.
4706  The enhancement system is divided into two parts.
4707  First, the synthesis filter 
4708 \begin_inset Formula $S(z)=1/A(z)$
4709 \end_inset 
4710
4711  is replaced by an enhanced filter
4712 \begin_inset Formula \[
4713 S'(z)=\frac{A\left(z/a_{2}\right)A\left(z/a_{3}\right)}{A\left(z\right)A\left(z/a_{1}\right)}\]
4714
4715 \end_inset 
4716
4717 where 
4718 \begin_inset Formula $a_{1}$
4719 \end_inset 
4720
4721  and 
4722 \begin_inset Formula $a_{2}$
4723 \end_inset 
4724
4725  depend on the mode in use and 
4726 \begin_inset Formula $a_{3}=\frac{1}{r}\left(1-\frac{1-ra_{1}}{1-ra_{2}}\right)$
4727 \end_inset 
4728
4729  with 
4730 \begin_inset Formula $r=.9$
4731 \end_inset 
4732
4733 .
4734  The second part of the enhancement consists of using a comb filter to enhance
4735  the pitch in the excitation domain.
4736  
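\layout Standard

As a small worked example, the third coefficient can be computed directly from the expression given above.
 The function name enhancer_a3 and the sample values used afterwards are hypothetical; the actual per-mode values of the first two coefficients are not given in this section.

```c
#include <assert.h>

/* a3 = (1/r) * (1 - (1 - r*a1) / (1 - r*a2)), as given in the text.
 * Illustrative helper, not a Speex library function. */
static double enhancer_a3(double a1, double a2, double r)
{
    assert(r != 0.0 && r * a2 != 1.0);
    return (1.0 / r) * (1.0 - (1.0 - r * a1) / (1.0 - r * a2));
}
```

Rearranging the expression shows that a3 is chosen so that (1 - r*a2)(1 - r*a3) = 1 - r*a1.
 In particular, when a1 and a2 are equal, a3 is zero.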
4737 \layout Section
4738 \pagebreak_top 
4739 Speex wideband mode (sub-band CELP)
4740 \begin_inset LatexCommand \index{wideband}
4741
4742 \end_inset 
4743
4744
4745 \layout Standard
4746
4747 For wideband, the Speex approach uses a 
4748 \emph on 
4749 q
4750 \emph default 
4751 uadrature 
4752 \emph on 
4753 m
4754 \emph default 
4755 irror 
4756 \emph on 
4757 f
4758 \emph default 
4759 ilter
4760 \begin_inset LatexCommand \index{quadrature mirror filter}
4761
4762 \end_inset 
4763
4764  (QMF) to split the band in two.
4765  The 16 kHz signal is thus divided into two 8 kHz signals, one representing
4766  the low band (0-4 kHz), the other the high band (4-8 kHz).
4767  The low band is encoded with the narrowband mode described in section 
4768 \begin_inset LatexCommand \ref{sec:Speex-narrowband-mode}
4769
4770 \end_inset 
4771
4772  in such a way that the resulting 
4773 \begin_inset Quotes eld
4774 \end_inset 
4775
4776 embedded narrowband bit-stream
4777 \begin_inset Quotes erd
4778 \end_inset 
4779
4780  can also be decoded with the narrowband decoder.
4781  Since the low band encoding has already been described, only the high band
4782  encoding is described in this section.
4783 \layout Subsection
4784
4785 Linear Prediction
4786 \layout Standard
4787
4788 The linear prediction part used for the high-band is very similar to what
4789  is done for narrowband.
4790  The only difference is that we use only 12 bits to encode the high-band
4791  LSPs using a multi-stage vector quantizer (MSVQ).
4792  The first stage quantizes the 10 coefficients with 6 bits, and the remaining
4793  error is then quantized with another 6 bits.
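\layout Standard

The two-stage scheme can be sketched as follows.
 This is only a minimal illustration of multi-stage vector quantization in general, not the Speex implementation: the codebooks are supplied by the caller, the function names are hypothetical, and packing the two 6-bit indices into a single 12-bit value (first stage in the upper bits) is an assumption made for the example.

```c
#include <assert.h>
#include <float.h>

#define DIM 10                        /* coefficients per vector, per the text */
#define STAGE_BITS 6                  /* bits per stage, per the text */
#define STAGE_SIZE (1 << STAGE_BITS)  /* 64 entries per codebook */

/* Index of the codebook entry closest to x (squared error).
 * cb is a flat array of entries * DIM values. */
static int vq_nearest(const float *x, const float *cb, int entries)
{
    int best = 0;
    float best_dist = FLT_MAX;
    for (int i = 0; i < entries; i++) {
        float dist = 0.0f;
        for (int k = 0; k < DIM; k++) {
            float d = x[k] - cb[i * DIM + k];
            dist += d * d;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return best;
}

/* Stage 1 quantizes x; stage 2 quantizes the stage-1 error.
 * Returns both indices packed into 12 bits (hypothetical layout). */
static unsigned msvq_encode(const float *x, const float *cb1, const float *cb2)
{
    float err[DIM];
    int i1 = vq_nearest(x, cb1, STAGE_SIZE);
    for (int k = 0; k < DIM; k++)
        err[k] = x[k] - cb1[i1 * DIM + k];
    int i2 = vq_nearest(err, cb2, STAGE_SIZE);
    return ((unsigned)i1 << STAGE_BITS) | (unsigned)i2;
}
```

A decoder would add the two selected codebook entries to reconstruct the vector.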
4794 \layout Subsection
4795
4796 Pitch Prediction
4797 \layout Standard
4798
4799 That part is easy: there's no pitch prediction for the high-band.
4800  There are two reasons for that.
4801  First, there is usually little harmonic structure in this band (above 4
4802  kHz).
4803  Second, it would be very hard to implement since the QMF folds the 4-8
4804  kHz band into 4-0 kHz (reversing the frequency axis), which means that
4805  the harmonics are no longer located at multiples of the fundamental
4806  (pitch).
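\layout Standard

The effect of the frequency reversal can be illustrated numerically.
 The sketch below assumes the folding described above, where a component at f Hz in the 4-8 kHz band appears at (8000 - f) Hz after the QMF; the helper names and the 300 Hz fundamental used afterwards are hypothetical.

```c
#include <assert.h>

#define BAND_EDGE_HZ 8000 /* upper edge of the high band, per the text */

/* Frequency at which a high-band component appears after folding. */
static int folded_hz(int f_hz)
{
    assert(f_hz >= 4000 && f_hz <= BAND_EDGE_HZ);
    return BAND_EDGE_HZ - f_hz;
}

/* Non-zero if f_hz is an integer multiple of the fundamental f0_hz. */
static int is_multiple_of(int f_hz, int f0_hz)
{
    assert(f0_hz > 0);
    return f_hz % f0_hz == 0;
}
```

With a 300 Hz fundamental, the harmonics at 4200 Hz and 4500 Hz fold to 3800 Hz and 3500 Hz, neither of which is a multiple of 300 Hz.
 The harmonics stay aligned only in the special case where the fundamental divides 8000 Hz exactly.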
4807 \layout Subsection
4808
4809 Excitation Quantization
4810 \layout Standard
4811
4812 The high-band excitation is coded in the same way as for narrowband.
4813  
4814 \layout Subsection
4815
4816 Bit allocation
4817 \layout Standard
4818
4819 For the wideband mode, the entire narrowband frame is packed before the high-band
4820  is encoded.
4821  The narrowband part of the bit-stream is as defined in table 
4822 \begin_inset LatexCommand \ref{cap:bits-narrowband}
4823
4824 \end_inset 
4825
4826 .
4827  The high-band follows, as described in table 
4828 \begin_inset LatexCommand \ref{cap:bits-wideband}
4829
4830 \end_inset 
4831
4832 .
4833  This also means that a wideband frame may be correctly decoded by a narrowband
4834  decoder with the only caveat that if more than one frame is packed in the
4835  same packet, the decoder will need to skip the high-band parts in order
4836  to sync with the bit-stream.
4837 \layout Standard
4838
4839
4840 \begin_inset Float table
4841 placement h
4842 wide true
4843 collapsed false
4844
4845 \layout Standard
4846
4847
4848 \begin_inset  Tabular
4849 <lyxtabular version="3" rows="7" columns="7">
4850 <features>
4851 <column alignment="center" valignment="top" leftline="true" width="0pt">
4852 <column alignment="center" valignment="top" leftline="true" width="0pt">
4853 <column alignment="center" valignment="top" leftline="true" width="0pt">
4854 <column alignment="center" valignment="top" leftline="true" width="0pt">
4855 <column alignment="center" valignment="top" leftline="true" width="0pt">
4856 <column alignment="center" valignment="top" leftline="true" width="0pt">
4857 <column alignment="center" valignment="top" leftline="true" rightline="true" width="0pt">
4858 <row topline="true" bottomline="true">
4859 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4860 \begin_inset Text
4861
4862 \layout Standard
4863
4864 Parameter
4865 \end_inset 
4866 </cell>
4867 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4868 \begin_inset Text
4869
4870 \layout Standard
4871
4872 Update rate
4873 \end_inset 
4874 </cell>
4875 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4876 \begin_inset Text
4877
4878 \layout Standard
4879
4880 0
4881 \end_inset 
4882 </cell>
4883 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4884 \begin_inset Text
4885
4886 \layout Standard
4887
4888 1
4889 \end_inset 
4890 </cell>
4891 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4892 \begin_inset Text
4893
4894 \layout Standard
4895
4896 2
4897 \end_inset 
4898 </cell>
4899 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4900 \begin_inset Text
4901
4902 \layout Standard
4903
4904 3
4905 \end_inset 
4906 </cell>
4907 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4908 \begin_inset Text
4909
4910 \layout Standard
4911
4912 4
4913 \end_inset 
4914 </cell>
4915 </row>
4916 <row topline="true">
4917 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4918 \begin_inset Text
4919
4920 \layout Standard
4921
4922 Wideband bit
4923 \end_inset 
4924 </cell>
4925 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4926 \begin_inset Text
4927
4928 \layout Standard
4929
4930 frame
4931 \end_inset 
4932 </cell>
4933 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4934 \begin_inset Text
4935
4936 \layout Standard
4937
4938 1
4939 \end_inset 
4940 </cell>
4941 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4942 \begin_inset Text
4943
4944 \layout Standard
4945
4946 1
4947 \end_inset 
4948 </cell>
4949 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4950 \begin_inset Text
4951
4952 \layout Standard
4953
4954 1
4955 \end_inset 
4956 </cell>
4957 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4958 \begin_inset Text
4959
4960 \layout Standard
4961
4962 1
4963 \end_inset 
4964 </cell>
4965 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4966 \begin_inset Text
4967
4968 \layout Standard
4969
4970 1
4971 \end_inset 
4972 </cell>
4973 </row>
4974 <row topline="true">
4975 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4976 \begin_inset Text
4977
4978 \layout Standard
4979
4980 Mode ID
4981 \end_inset 
4982 </cell>
4983 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4984 \begin_inset Text
4985
4986 \layout Standard
4987
4988 frame
4989 \end_inset 
4990 </cell>
4991 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
4992 \begin_inset Text
4993
4994 \layout Standard
4995
4996 3
4997 \end_inset 
4998 </cell>
4999 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5000 \begin_inset Text
5001
5002 \layout Standard
5003
5004 3
5005 \end_inset 
5006 </cell>
5007 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5008 \begin_inset Text
5009
5010 \layout Standard
5011
5012 3
5013 \end_inset 
5014 </cell>
5015 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5016 \begin_inset Text
5017
5018 \layout Standard
5019
5020 3
5021 \end_inset 
5022 </cell>
5023 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5024 \begin_inset Text
5025
5026 \layout Standard
5027
5028 3
5029 \end_inset 
5030 </cell>
5031 </row>
5032 <row topline="true">
5033 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5034 \begin_inset Text
5035
5036 \layout Standard
5037
5038 LSP
5039 \end_inset 
5040 </cell>
5041 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5042 \begin_inset Text
5043
5044 \layout Standard
5045
5046 frame
5047 \end_inset 
5048 </cell>
5049 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5050 \begin_inset Text
5051
5052 \layout Standard
5053
5054 0
5055 \end_inset 
5056 </cell>
5057 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5058 \begin_inset Text
5059
5060 \layout Standard
5061
5062 12
5063 \end_inset 
5064 </cell>
5065 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5066 \begin_inset Text
5067
5068 \layout Standard
5069
5070 12
5071 \end_inset 
5072 </cell>
5073 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5074 \begin_inset Text
5075
5076 \layout Standard
5077
5078 12
5079 \end_inset 
5080 </cell>
5081 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5082 \begin_inset Text
5083
5084 \layout Standard
5085
5086 12
5087 \end_inset 
5088 </cell>
5089 </row>
5090 <row topline="true">
5091 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5092 \begin_inset Text
5093
5094 \layout Standard
5095
5096 Excitation gain
5097 \end_inset 
5098 </cell>
5099 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5100 \begin_inset Text
5101
5102 \layout Standard
5103
5104 sub-frame
5105 \end_inset 
5106 </cell>
5107 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5108 \begin_inset Text
5109
5110 \layout Standard
5111
5112 0
5113 \end_inset 
5114 </cell>
5115 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5116 \begin_inset Text
5117
5118 \layout Standard
5119
5120 5
5121 \end_inset 
5122 </cell>
5123 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5124 \begin_inset Text
5125
5126 \layout Standard
5127
5128 4
5129 \end_inset 
5130 </cell>
5131 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5132 \begin_inset Text
5133
5134 \layout Standard
5135
5136 4
5137 \end_inset 
5138 </cell>
5139 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5140 \begin_inset Text
5141
5142 \layout Standard
5143
5144 4
5145 \end_inset 
5146 </cell>
5147 </row>
5148 <row topline="true" bottomline="true">
5149 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5150 \begin_inset Text
5151
5152 \layout Standard
5153
5154 Excitation VQ
5155 \end_inset 
5156 </cell>
5157 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5158 \begin_inset Text
5159
5160 \layout Standard
5161
5162 sub-frame
5163 \end_inset 
5164 </cell>
5165 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5166 \begin_inset Text
5167
5168 \layout Standard
5169
5170 0
5171 \end_inset 
5172 </cell>
5173 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5174 \begin_inset Text
5175
5176 \layout Standard
5177
5178 0
5179 \end_inset 
5180 </cell>
5181 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5182 \begin_inset Text
5183
5184 \layout Standard
5185
5186 20
5187 \end_inset 
5188 </cell>
5189 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5190 \begin_inset Text
5191
5192 \layout Standard
5193
5194 40
5195 \end_inset 
5196 </cell>
5197 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5198 \begin_inset Text
5199
5200 \layout Standard
5201
5202 80
5203 \end_inset 
5204 </cell>
5205 </row>
5206 <row topline="true" bottomline="true">
5207 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5208 \begin_inset Text
5209
5210 \layout Standard
5211
5212 Total
5213 \end_inset 
5214 </cell>
5215 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5216 \begin_inset Text
5217
5218 \layout Standard
5219
5220 frame
5221 \end_inset 
5222 </cell>
5223 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5224 \begin_inset Text
5225
5226 \layout Standard
5227
5228 4
5229 \end_inset 
5230 </cell>
5231 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5232 \begin_inset Text
5233
5234 \layout Standard
5235
5236 36
5237 \end_inset 
5238 </cell>
5239 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5240 \begin_inset Text
5241
5242 \layout Standard
5243
5244 112
5245 \end_inset 
5246 </cell>
5247 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5248 \begin_inset Text
5249
5250 \layout Standard
5251
5252 192
5253 \end_inset 
5254 </cell>
5255 <cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
5256 \begin_inset Text
5257
5258 \layout Standard
5259
5260 352
5261 \end_inset 
5262 </cell>
5263 </row>
5264 </lyxtabular>
5265
5266 \end_inset 
5267
5268
5269 \layout Caption
5270
5271 Bit allocation for high-band in wideband mode
5272 \begin_inset LatexCommand \label{cap:bits-wideband}
5273
5274 \end_inset 
5275
5276
5277 \end_inset 
5278
5279
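\layout Standard

The totals in the last row of the table above can be checked by hand.
 As a rough sketch, assuming four sub-frames per frame and 4 per-frame header
 bits (the wideband bit and mode ID rows at the top of the table), and writing
 
\begin_inset Formula \( B_{m} \)
\end_inset 

 (notation introduced here only for illustration) for the total in high-band
 mode 
\begin_inset Formula \( m \)
\end_inset 

, the sums are:
\layout Standard


\begin_inset Formula \begin{eqnarray*}
B_{1} & = & 4+12+4\times (5+0)=36\\
B_{2} & = & 4+12+4\times (4+20)=112\\
B_{3} & = & 4+12+4\times (4+40)=192\\
B_{4} & = & 4+12+4\times (4+80)=352
\end{eqnarray*}

\end_inset 


\layout Standard

Mode 0 carries only the 4 header bits, which matches its total of 4.
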
5280 \layout Standard
5281
5282
5283 \begin_inset ERT
5284 status Open
5285
5286 \layout Standard
5287
5288 \backslash 
5289 clearpage
5290 \end_inset 
5291
5292
5293 \layout Standard
5294
5295
5296 \begin_inset ERT
5297 status Collapsed
5298
5299 \layout Standard
5300
5301 \backslash 
5302 clearpage
5303 \end_inset 
5304
5305
5306 \layout Section
5307 \start_of_appendix 
5308 FAQ
5309 \layout Subsection*
5310
5311 Vorbis is open-source
5312 \begin_inset LatexCommand \index{open-source}
5313
5314 \end_inset 
5315
5316  and patent-free
5317 \begin_inset LatexCommand \index{patent}
5318
5319 \end_inset 
5320
5321 , why do we need Speex?
5322 \layout Standard
5323
5324 Vorbis is a great project, but its goals are not the same as those of Speex.
5325  Vorbis is mostly aimed at compressing music and audio in general, while
5326  Speex targets speech only.
5327  For that reason Speex can achieve much better results than Vorbis on speech,
5328  typically 2-4 times higher compression at equal quality.
5329 \layout Subsection*
5330
5331 Under what license is Speex released?
5332 \layout Standard
5333
5334 As of version 1.0 beta 1, Speex is released under Xiph's BSD-like license.
5335  This license is among the most permissive of the open-source licenses.
5336 \layout Subsection*
5337
5338 Ogg
5339 \begin_inset LatexCommand \index{Ogg}
5340
5341 \end_inset 
5342
5343 , Speex, Vorbis
5344 \begin_inset LatexCommand \index{Vorbis}
5345
5346 \end_inset 
5347
5348 , what's the difference?
5349 \layout Standard
5350
5351 Ogg is a 
5352 \begin_inset Quotes eld
5353 \end_inset 
5354
5355 container format
5356 \begin_inset Quotes erd
5357 \end_inset 
5358
5359  for holding multimedia data.
5360  Vorbis is an audio codec that uses Ogg to store its bit-streams as files,
5361  hence the name Ogg Vorbis.
5362  Speex also uses the Ogg format to store its bit-streams as files, so technicall
5363 y they would be 
5364 \begin_inset Quotes eld
5365 \end_inset 
5366
5367 Ogg Speex
5368 \begin_inset Quotes erd
5369 \end_inset 
5370
5371  files (I prefer to call them just Speex files).
5372  One difference from Vorbis, however, is that Speex is less tied to Ogg.
5373  In fact, if what you do is Voice over IP (VoIP), you don't need Ogg at all.
5374 \layout Subsection*
5375
5376 What's the extension for Speex?
5377 \layout Standard
5378
5379 Speex files have the .spx extension.
5380  Note, however, that the Speex tools (speexenc, speexdec) do not rely
5381  on the extension at all, so any extension will work.
5382 \layout Subsection*
5383
5384 Can I use Speex for compressing music
5385 \begin_inset LatexCommand \index{music}
5386
5387 \end_inset 
5388
5389 ?
5390 \layout Standard
5391
5392 Just as Vorbis is not really suited to speech, Speex is really not suited
5393  to music.
5394  In most cases, you'll be better off with Vorbis when it comes to music.
5395 \layout Subsection*
5396
5397 I converted some MP3s to Speex and the quality is bad.
5398  What's wrong?
5399 \layout Standard
5400
5401 This is called transcoding and it will always result in much poorer quality
5402  than the original MP3.
5403  Unless you have a really good (size) reason to do so, never transcode speech.
5404  This even applies to self-transcoding (tandeming): if you decode
5405  a Speex file and re-encode it at the same bit-rate,
5406  you will lose quality.
5407 \layout Subsection*
5408
5409 Does Speex run on Windows?
5410 \layout Standard
5411
5412 As of version 0.8.0, Speex compiles on Windows.
5413  There are also several front-ends available from the web site.
5414 \layout Subsection*
5415
5416 Why is encoding so slow compared to decoding?
5417 \layout Standard
5418
5419 For most kinds of compression, encoding is inherently slower than decoding.
5420  In the case of Speex, encoding consists of finding, for each vector of
5421  5 to 10 samples, the entry that matches the best within a codebook consisting
5422  of 16 to 256 entries.
5423  On the other hand, at decoding all that needs to be done is look up the