1 <?xml version='1.0'?>
2 <!DOCTYPE rfc SYSTEM 'rfc2629.dtd'>
3 <?rfc toc="yes" ?>
4
5 <rfc ipr="full3978" docName="draft-valin-celt-codec-00">
6
7 <front>
8 <title>Constrained-Energy Lapped Transform (CELT) Codec</title>
9
10
11
12 <author initials="J-M" surname="Valin" fullname="Jean-Marc Valin">
13 <organization>Octasic Semiconductor</organization>
14 <address>
15 <email>jean-marc.valin@octasic.com</email>
16 <postal>
17 <street>4101, Molson Street, suite 300</street>
18 <city>Montreal</city>
19 <region>Quebec</region>
20 <code>H1Y 3L1</code>
21 <country>Canada</country>
22 </postal>
23 </address>
24 </author>
25
26 <author initials="et" surname="al." fullname="et al.">
27 <organization></organization>
28 </author>
29
30 <date day="18" month="December" year="2008" />
31
32 <area>General</area>
33 <workgroup>AVT Working Group</workgroup>
34 <keyword>I-D</keyword>
35
36 <keyword>Internet-Draft</keyword>
37 <keyword>CELT</keyword>
38 <abstract>
39 <t>
40 CELT is an open-source voice codec suitable for use in very-low-delay
41 Voice over IP (VoIP) type applications.  This document describes the encoding
42 and decoding processes.
43 </t>
44 </abstract>
45 </front>
46
47 <middle>
48
49 <section anchor="Conventions used in this document" title="Conventions used in this document">
50 <t>
51 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
52 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
53 document are to be interpreted as described in RFC 2119 <xref target="rfc2119"></xref>.
54 </t>
55 </section>
56
57 <section anchor="Overview of the CELT Codec" title="Overview of the CELT Codec">
58
59 <t>
60 CELT stands for "Constrained-Energy Lapped Transform". It applies some of the principles of Code-Excited Linear Prediction (CELP), but operates entirely in the frequency domain, which removes some of CELP's limitations. CELT is suitable for both speech and music and currently features:
61 </t>
62
63 <t>
64 <list style="symbols">
65 <t>Ultra-low latency (typically from 3 to 9 ms)</t>
66 <t>Full audio bandwidth (44.1 kHz and 48 kHz)</t>
67 <t>Support for both voice and music</t>
68 <t>Stereo support</t>
69 <t>Packet loss concealment</t>
70 <t>Constant bit-rates from 32 kbps to 128 kbps and above</t>
71 <t>Free software/open-source</t>
72 </list>
73 </t>
74
75 </section>
76
77 <section anchor="CELT Encoder" title="CELT Encoder">
78
79 <t>Insert encoder overview</t>
80
81 <t>Pre-emphasis</t>
82
83 <section anchor="Range Coder" title="Range Coder">
84 </section>
85
86 <section anchor="Forward MDCT" title="Forward MDCT">
87 </section>
88
89 <section anchor="Energy Envelope Quantization" title="Energy Envelope Quantization">
90 <t>Coarse quantization uses a 6 dB resolution, with prediction and a Laplace distribution model.</t>
91 <t>Fine quantization refines the coarse values using a resolution determined by the bit allocation.</t>
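<t>As an illustration only, the two stages above can be sketched in C as follows. The sketch assumes band energies expressed in dB, a hypothetical fixed inter-frame prediction coefficient (PRED_COEF), and a uniform fine quantizer. It omits the Laplace-model coding of the coarse index. The function names and constants are assumptions made for this example and are not taken from the CELT reference implementation.</t>
<figure><artwork><![CDATA[
```c
/* Hypothetical constants for illustration; not taken from the CELT source. */
#define COARSE_STEP_DB 6.0f  /* coarse resolution stated in the text */
#define PRED_COEF      0.8f  /* assumed inter-frame prediction coefficient */

/* Floor of x as an int, written without libm so the sketch is standalone. */
static int floor_to_int(float x)
{
    int i = (int)x;
    if ((float)i > x)
        i--;
    return i;
}

/* Coarse stage: quantize the prediction residual with a 6 dB step.
 * prev_db is the previous frame's reconstructed energy for this band.
 * Returns the integer index; *recon_db receives the reconstruction. */
static int coarse_quant(float energy_db, float prev_db, float *recon_db)
{
    float pred = PRED_COEF * prev_db;
    int idx = floor_to_int((energy_db - pred) / COARSE_STEP_DB + 0.5f);
    *recon_db = pred + (float)idx * COARSE_STEP_DB;
    return idx;
}

/* Fine stage: split one coarse step into 2^bits uniform cells and pick
 * the cell that contains the remaining error. The number of bits would
 * come from the bit allocation. */
static int fine_quant(float energy_db, float coarse_db, int bits,
                      float *recon_db)
{
    int levels = 1 << bits;
    float offset = (energy_db - coarse_db) / COARSE_STEP_DB + 0.5f;
    int idx = floor_to_int(offset * (float)levels);
    if (idx < 0)
        idx = 0;
    if (idx > levels - 1)
        idx = levels - 1;
    /* Reconstruct at the centre of the selected cell. */
    *recon_db = coarse_db
        + (((float)idx + 0.5f) / (float)levels - 0.5f) * COARSE_STEP_DB;
    return idx;
}
```
]]></artwork></figure>
<t>In this sketch, each additional fine bit halves the width of the reconstruction cell, so the fine stage can spend exactly as many bits per band as the allocation grants.</t>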
92 </section>
93
94 <section anchor="Bit Allocation" title="Bit Allocation">
95 <t>Bit allocation is based only on information available to both the encoder and the decoder.
96 Both sides perform the same calculations in a bit-exact manner to ensure
97 that the result is always exactly the same. Any mismatch would cause an error in the decoded output.</t>
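<t>The bit-exactness requirement can be illustrated with a minimal C sketch. It is not the CELT allocator: the proportional-to-width rule and the name allocate_bits are assumptions chosen for this example. The point it shows is that the routine uses integer arithmetic only and depends solely on values known to both sides, so the encoder and the decoder obtain identical results on any platform.</t>
<figure><artwork><![CDATA[
```c
/* Hypothetical allocator: splits total_bits across nb_bands in
 * proportion to the band widths. Integer arithmetic only, so the
 * result does not depend on floating-point rounding behaviour. */
static void allocate_bits(const int *band_width, int nb_bands,
                          int total_bits, int *bits_out)
{
    int total_width = 0;
    int used = 0;
    int i;

    for (i = 0; i < nb_bands; i++)
        total_width += band_width[i];

    if (total_width <= 0) {
        for (i = 0; i < nb_bands; i++)
            bits_out[i] = 0;
        return;
    }

    /* Proportional share, rounded down. */
    for (i = 0; i < nb_bands; i++) {
        long long share = (long long)total_bits * band_width[i];
        bits_out[i] = (int)(share / total_width);
        used += bits_out[i];
    }

    /* Hand out the bits lost to rounding in a fixed order, so both
     * sides resolve the remainder identically. */
    for (i = 0; used < total_bits && i < nb_bands; i++) {
        bits_out[i]++;
        used++;
    }
}
```
]]></artwork></figure>
<t>Calling this routine with the same band widths and the same bit budget in the encoder and in the decoder yields the same per-band allocation, which is the property the text above requires.</t>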
98 </section>
99
100 <section anchor="Pitch Prediction" title="Pitch Prediction">
101 </section>
102
103 <section anchor="Spherical Vector Quantization" title="Spherical Vector Quantization">
104 <t>CELT uses a Pyramid Vector Quantization (PVQ) [] codebook to quantize the details
105 of the spectrum in each band that have not been predicted by the pitch predictor.</t>
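<t>For orientation, a PVQ codebook in its standard form is the set of integer vectors of dimension N whose absolute values sum to K. The symbols N and K, the function name, and the table limits below are assumptions introduced for this example and do not come from the CELT source. The following C sketch counts the codewords with the usual recurrence V(N,K) = V(N-1,K) + V(N,K-1) + V(N-1,K-1), which gives the range that a codeword index has to cover.</t>
<figure><artwork><![CDATA[
```c
#include <stdint.h>

/* Hypothetical table limits for this sketch. */
#define PVQ_MAX_N 16
#define PVQ_MAX_K 16

/* Number of integer vectors of dimension n whose absolute values
 * sum to k. Returns 0 for arguments outside the table limits. */
static uint64_t pvq_codebook_size(int n, int k)
{
    uint64_t v[PVQ_MAX_N + 1][PVQ_MAX_K + 1];
    int i, j;

    if (n < 0 || k < 0 || n > PVQ_MAX_N || k > PVQ_MAX_K)
        return 0;

    for (i = 0; i <= n; i++) {
        for (j = 0; j <= k; j++) {
            if (j == 0)
                v[i][j] = 1;   /* only the all-zero vector */
            else if (i == 0)
                v[i][j] = 0;   /* no dimensions left to hold pulses */
            else
                v[i][j] = v[i - 1][j] + v[i][j - 1] + v[i - 1][j - 1];
        }
    }
    return v[n][k];
}
```
]]></artwork></figure>
<t>For example, with N = 3 and K = 2 the function counts the 6 vectors that have a single entry of magnitude 2 and the 12 vectors that have two entries of magnitude 1.</t>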
106
107 <section anchor="Index Encoding" title="Index Encoding">
108 </section>
109
110 </section>
111
112 <section anchor="Short windows" title="Short windows">
113 </section>
114
115
116 </section>
117
118 <section anchor="CELT Decoder" title="CELT Decoder">
119
120 <t>
121 Some more text
122 </t>
123
124 <section anchor="Range Decoder" title="Range Decoder">
125 </section>
126
127 <section anchor="Spherical VQ Decoder" title="Spherical VQ Decoder">
128 <t>CELT uses a Pyramid Vector Quantization (PVQ) [] codebook to quantize the details
129 of the spectrum in each band that have not been predicted by the pitch predictor.</t>
130 </section>
131
132 <section anchor="Index Decoding" title="Index Decoding">
133 </section>
134
135
136 <section anchor="Backward MDCT" title="Backward MDCT">
137 </section>
138
139 <section anchor="Packet Loss Concealment" title="Packet Loss Concealment (PLC)">
140 </section>
141
142 <t>De-emphasis</t>
143
144 </section>
145
146
147
148 <section anchor="Security Considerations" title="Security Considerations">
149
150 <t>
151 A potential denial-of-service threat exists for data encodings using
152 compression techniques that have non-uniform receiver-end
153 computational load.  The attacker can inject pathological datagrams
154 into the stream which are complex to decode and cause the receiver to
155 be overloaded.  However, this encoding does not exhibit any
156 significant non-uniformity.
157 </t>
158
159 </section> 
160
161 <section anchor="Evaluation of CELT Implementations" title="Evaluation of CELT Implementations">
162
163 <t>
164 Insert some text here.
165 </t>
166
167 </section>
168
169
170
171 <section anchor="Issues that need to be addressed" title="Issues that need to be addressed">
172
173 <t>
174 <list>
175 <t>Dynamic bit allocation</t>
176 <t>Stereo coupling</t>
177 </list>
178 </t>
179
180 </section>
181
182
183 <section anchor="Acknowledgments" title="Acknowledgments">
184
185 <t>
186 The authors would also like to thank the following members of the 
187 CELT and AVT communities for their input:
188 </t>
189 </section> 
190
191 </middle>
192
193 <back>
194
195 <references title="Normative References">
196
197 <reference anchor="rfc2119">
198 <front>
199 <title>Key words for use in RFCs to Indicate Requirement Levels </title>
200 <author initials="S." surname="Bradner" fullname="Scott Bradner"></author>
201 </front>
202 <seriesInfo name="RFC" value="2119" />
203 </reference> 
204
205 <reference anchor="rfc3550">
206 <front>
207 <title>RTP: A Transport Protocol for Real-Time Applications</title>
208 <author initials="H." surname="Schulzrinne" fullname=""></author>
209 <author initials="S." surname="Casner" fullname=""></author>
210 <author initials="R." surname="Frederick" fullname=""></author>
211 <author initials="V." surname="Jacobson" fullname=""></author>
212 </front>
213 <seriesInfo name="RFC" value="3550" />
214 </reference> 
215
216 <reference anchor="rfc2045">
217 <front>
218 <title>Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies</title>
219 <author initials="" surname="" fullname=""></author>
220 </front>
221 <date month="November" year="1996" />
222 <seriesInfo name="RFC" value="2045" />
223 </reference> 
224
225 <reference anchor="rfc2327">
226 <front>
227 <title>SDP: Session Description Protocol</title>
228 <author initials="V." surname="Jacobson" fullname=""></author>
229 <author initials="M." surname="Handley" fullname=""></author>
230 </front>
231 <date month="April" year="1998" />
232 <seriesInfo name="RFC" value="2327" />
233 </reference> 
234
235 <reference anchor="H323">
236 <front>
237 <title>Packet-based Multimedia Communications Systems</title>
238 <author initials="" surname="" fullname=""></author>
239 </front>
240 <date month="" year="1998" />
241 <seriesInfo name="ITU-T Recommendation" value="H.323" />
242 </reference> 
243
244 <reference anchor="H245">
245 <front>
246 <title>Control of communications between Visual Telephone Systems and Terminal Equipment</title>
247 <author initials="" surname="" fullname=""></author>
248 </front>
249 <date month="" year="1998" />
250 <seriesInfo name="ITU-T Recommendation" value="H.245" />
251 </reference> 
252
253 <reference anchor="rfc3551">
254 <front>
255 <title>RTP Profile for Audio and Video Conferences with Minimal Control</title>
256 <author initials="H." surname="Schulzrinne" fullname=""></author>
257 <author initials="S." surname="Casner" fullname=""></author>
258 </front>
259 <date month="July" year="2003" />
260 <seriesInfo name="RFC" value="3551" />
261 </reference> 
262
263 <reference anchor="rfc3534">
264 <front>
265 <title>The application/ogg Media Type</title>
266 <author initials="L." surname="Walleij" fullname=""></author>
267 </front>
268 <date month="May" year="2003" />
269 <seriesInfo name="RFC" value="3534" />
270 </reference> 
271
272 </references> 
273
274 <references title="Informative References">
275
276 <reference anchor="celt-website">
277 <front>
278 <title>The CELT ultra-low delay audio codec</title>
279 </front>
280 <seriesInfo name="CELT website" value="http://www.celt-codec.org/" />
281 </reference> 
282
283 </references> 
284
285 <section anchor="Reference Implementation" title="Reference Implementation">
286
287 <t>Insert a copy of the CELT source code here.</t>
288 <!--<t><?rfc include="source/celt.c"?></t>
289 <t><?rfc include="source/bands.c"?></t>
290 -->
291 </section>
292
293
294 </back>
295
296 </rfc>