draft-ietf-avt-tones-05.txt   draft-ietf-avt-tones-06.txt 
Internet Engineering Task Force AVT WG Internet Engineering Task Force AVT WG
Internet Draft Schulzrinne/Petrack Internet Draft Schulzrinne/Petrack
draft-ietf-avt-tones-05.txt Columbia U./MetaTel draft-ietf-avt-tones-06.txt Columbia U./MetaTel
December 20, 1999 January 14, 2000
Expires: May 2000 Expires: May 2000
RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals
STATUS OF THIS MEMO STATUS OF THIS MEMO
This document is an Internet-Draft and is in full conformance with This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026. all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
skipping to change at page 1, line 43 skipping to change at page 1, line 43
This memo describes how to carry dual-tone multifrequency (DTMF) This memo describes how to carry dual-tone multifrequency (DTMF)
signaling, other tone signals and telephony events in RTP packets. signaling, other tone signals and telephony events in RTP packets.
1 Introduction 1 Introduction
This memo defines two payload formats, one for carrying dual-tone This memo defines two payload formats, one for carrying dual-tone
multifrequency (DTMF) digits, other line and trunk signals (Section multifrequency (DTMF) digits, other line and trunk signals (Section
3), and a second one for general multi-frequency tones in RTP [1] 3), and a second one for general multi-frequency tones in RTP [1]
packets (Section 4). Separate RTP payload formats are desirable since packets (Section 4). Separate RTP payload formats are desirable since
low-rate voice codecs cannot be guaranteed to reproduce these tone low-rate voice codecs cannot be guaranteed to reproduce these tone
signals accurately enough for automatic recognition. Defining a signals accurately enough for automatic recognition. Defining
separate payload formats also permits higher redundancy while separate payload formats also permits higher redundancy while
maintaining a low bit rate. maintaining a low bit rate.
The payload formats described here may be useful in at least three The payload formats described here may be useful in at least three
applications: DTMF handling for gateways and end systems, as well as applications: DTMF handling for gateways and end systems, as well as
"RTP trunks". In the first application, the Internet telephony "RTP trunks". In the first application, the Internet telephony
gateway detects DTMF on the incoming circuits and sends the RTP gateway detects DTMF on the incoming circuits and sends the RTP
payload described here instead of regular audio packets. The gateway payload described here instead of regular audio packets. The gateway
likely has the necessary digital signal processors and algorithms, as likely has the necessary digital signal processors and algorithms, as
it often needs to detect DTMF, e.g., for two-stage dialing. Having it often needs to detect DTMF, e.g., for two-stage dialing. Having
skipping to change at page 3, line 34 skipping to change at page 3, line 34
ISDN terminals may generate dial tone locally and then send a Q.931 ISDN terminals may generate dial tone locally and then send a Q.931
SETUP message containing the dialed digits. If the terminal just SETUP message containing the dialed digits. If the terminal just
sends a SETUP message without any Called Party digits, then the sends a SETUP message without any Called Party digits, then the
switch does digit collection, provided by the terminal as KEYPAD switch does digit collection, provided by the terminal as KEYPAD
messages, and provides dial tone over the B-channel. The terminal can messages, and provides dial tone over the B-channel. The terminal can
either use the audio signal on the B-channel or can use the Q.931 either use the audio signal on the B-channel or can use the Q.931
messages to trigger locally generated dial tone. messages to trigger locally generated dial tone.
Ringing tone (also called ringback tone) is generated by the local Ringing tone (also called ringback tone) is generated by the local
switch at the callee, with a one-way voice path opened up as soon as switch at the callee, with a one-way voice path opened up as soon as
the callee's phone rings. (This reduces the chance of clipping of the the callee's phone rings. (This reduces the chance of clipping the
called party's response just after answer. It also permits pre-answer called party's response just after answer. It also permits pre-answer
announcements or in-band call-progress-indications to reach the announcements or in-band call-progress indications to reach the
caller before or in lieu of ringing tone.) Congestion tone and caller before or in lieu of a ringing tone.) Congestion tone and
special information tones can be generated by any of the switches special information tones can be generated by any of the switches
along the way, and may be generated by the caller's switch based on along the way, and may be generated by the caller's switch based on
ISUP messages received. Busy tone is generated by the caller's ISUP messages received. Busy tone is generated by the caller's
switch, triggered by the appropriate ISUP message, for analog switch, triggered by the appropriate ISUP message, for analog
instruments, or the ISDN terminal. instruments, or the ISDN terminal.
Gateways which send signalling events via RTP MAY send both named Gateways which send signalling events via RTP MAY send both named
signals (Section 3) and the tone representation (Section 4) as a signals (Section 3) and the tone representation (Section 4) as a
single RTP session, using the redundancy mechanism defined in Section single RTP session, using the redundancy mechanism defined in Section
3.7 to interleave the two representations. It is generally a good 3.7 to interleave the two representations. It is generally a good
skipping to change at page 9, line 21 skipping to change at page 9, line 21
of telephone events at a sampling rate of 8000 Hz. of telephone events at a sampling rate of 8000 Hz.
Including the starting time of previous events allows Including the starting time of previous events allows
precise reconstruction of the tone sequence at a gateway. precise reconstruction of the tone sequence at a gateway.
The scheme is resilient to consecutive packet losses The scheme is resilient to consecutive packet losses
spanning this interval of 2.048 seconds or r digits, spanning this interval of 2.048 seconds or r digits,
whichever is less. Note that for previous digits, only an whichever is less. Note that for previous digits, only an
average loudness can be represented. average loudness can be represented.
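The 2.048-second figure above can be checked with a little arithmetic. The following is a minimal sketch, assuming the 14-bit timestamp offset field of the RFC 2198 redundancy header and the 8000 Hz timestamp rate stated above; the function names are illustrative and do not come from this memo.

```python
TIMESTAMP_RATE_HZ = 8000          # sampling rate stated in the text
OFFSET_FIELD_BITS = 14            # assumption: RFC 2198 timestamp offset width


def units_to_seconds(units, rate_hz=TIMESTAMP_RATE_HZ):
    """Convert RTP timestamp units to seconds."""
    return units / rate_hz


def ms_to_units(milliseconds, rate_hz=TIMESTAMP_RATE_HZ):
    """Convert a duration in milliseconds to RTP timestamp units."""
    return milliseconds * rate_hz // 1000


# Largest span a 14-bit offset can describe: 2**14 = 16384 units.
max_span_seconds = units_to_seconds(2 ** OFFSET_FIELD_BITS)

# A 200 ms digit occupies 1600 timestamp units at 8000 Hz.
digit_units = ms_to_units(200)
```

Under these assumptions, max_span_seconds is 2.048 and digit_units is 1600.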
An encoder MAY treat the event payload as a highly-compressed version An encoder MAY treat the event payload as a highly-compressed version
of the current audio frame. In that mode, each RTP packet during an of the current audio frame. In that mode, each RTP packet during an
even would contain the current audio codec rendition (say, G.723.1 or event would contain the current audio codec rendition (say, G.723.1
G.729) of this digit as well as the representation described in or G.729) of this digit as well as the representation described in
Section 3.5, plus any previous events seen earlier. Section 3.5, plus any previous events seen earlier.
This approach allows dumb gateways that do not understand This approach allows dumb gateways that do not understand
this format to function. See also the discussion in Section this format to function. See also the discussion in Section
1. 1.
3.8 Example 3.8 Example
A typical RTP packet, where the user is just dialing the last digit A typical RTP packet, where the user is just dialing the last digit
of the DTMF sequence "911". The first digit was 200 ms long (1600 of the DTMF sequence "911". The first digit was 200 ms long (1600
skipping to change at page 12, line 13 skipping to change at page 12, line 13
described below, with additional detail in Table 7. described below, with additional detail in Table 7.
ANS: This 2100 +/- 15 Hz tone is used to disable echo ANS: This 2100 +/- 15 Hz tone is used to disable echo
suppression for data transmission [8,9]. For fax machines, suppression for data transmission [8,9]. For fax machines,
Recommendation T.30 [9] refers to this tone as called Recommendation T.30 [9] refers to this tone as called
terminal identification (CED) answer tone. terminal identification (CED) answer tone.
/ANS: This is the same signal as ANS, except that it reverses /ANS: This is the same signal as ANS, except that it reverses
phase at an interval of 450 +/- 25 ms. It disables both phase at an interval of 450 +/- 25 ms. It disables both
echo cancellers and echo suppressors. (In the ITU echo cancellers and echo suppressors. (In the ITU
Recommendation, this signal is rendered as ANS with a bar Recommendation V.25 [8], this signal is rendered as ANS
on top.) with a bar on top.)
ANSam: The modified answer tone (ANSam) [3] is a sinewave signal ANSam: The modified answer tone (ANSam) [3] is a sinewave signal
at 2100 +/- 1 Hz with phase reversals at an interval of 450 at 2100 +/- 1 Hz without phase reversals, amplitude-
+/- 25 ms, amplitude-modulated by a sinewave at 15 +/- 0.1 modulated by a sinewave at 15 +/- 0.1 Hz. This tone is sent
Hz. This tone [10,8] is sent by modems [11] and faxes to by modems if network echo canceller disabling is not
disable echo suppressors. required.
/ANSam: This is the same signal as ANSam, except that it /ANSam: The modified answer tone with phase reversals (/ANSam)
reverses phase at an interval of 450 +/- 25 ms. It disables [3] is a sinewave signal at 2100 +/- 1 Hz with phase
both echo cancellers and echo suppressors. (In the ITU reversals at intervals of 450 +/- 25 ms, amplitude-
Recommendation, this signal is rendered as ANSam with a bar modulated by a sinewave at 15 +/- 0.1 Hz. This tone [10,8]
on top.) is sent by modems [11] and faxes to disable echo
suppressors.
CNG: After dialing the called fax machine's telephone number CNG: After dialing the called fax machine's telephone number
(and before it answers), the calling Group III fax machine (and before it answers), the calling Group III fax machine
(optionally) begins sending a CalliNG tone (CNG) consisting (optionally) begins sending a CalliNG tone (CNG) consisting
of an interrupted tone of 1100 Hz. [9] of an interrupted tone of 1100 Hz. [9]
CRd: Capabilities Request (CRd) [12] is a dual-tone signal with CRd: Capabilities Request (CRd) [12] is a dual-tone signal with
tones at 1375 Hz and 2002 Hz for 400 ms for the tones at 1375 Hz and 2002 Hz for 400 ms for the
initiating side and 1529 Hz and 2225 Hz for the responding initiating side and 1529 Hz and 2225 Hz for the responding
side, followed by a single tone at 1900 Hz for 100 ms. side, followed by a single tone at 1900 Hz for 100 ms.
skipping to change at page 14, line 8 skipping to change at page 14, line 9
frequency shift keying (FSK). It is used by Group 3 fax frequency shift keying (FSK). It is used by Group 3 fax
machines to exchange T.30 information. The calling modem machines to exchange T.30 information. The calling modem
transmits on channel 1 and receives on channel 2; the transmits on channel 1 and receives on channel 2; the
answering modem transmits on channel 2 and receives on answering modem transmits on channel 2 and receives on
channel 1. Each bit value has a distinct tone, so that V.21 channel 1. Each bit value has a distinct tone, so that V.21
signaling comprises a total of four distinct tones. signaling comprises a total of four distinct tones.
In summary, procedures in Table 2 are used. In summary, procedures in Table 2 are used.
Procedure indications Procedure indications
________________________________________________________ ___________________________________________________
V.25 and V.8 ANS, ANS, ... V.25 and V.8 ANS
V.25, echo canceller disabled ANS, /ANS, ANS, /ANS V.25, echo canceller disabled ANS, /ANS, ANS, /ANS
V.8 ANSam, ANSam, ... V.8 ANSam
V.8, echo canceller disabled ANSam, /ANSam, ANSam, ... V.8, echo canceller disabled /ANSam
Table 2: Use of ANS, ANSam and /ANSam in V.x recommendations Table 2: Use of ANS, ANSam and /ANSam in V.x recommendations
Event____________________encoding_(decimal) Event____________________encoding_(decimal)
Answer tone (ANS) 32 Answer tone (ANS) 32
/ANS 33 /ANS 33
ANSam 34 ANSam 34
/ANSam 35 /ANSam 35
Calling tone (CNG) 36 Calling tone (CNG) 36
V.21 channel 1, "0" bit 37 V.21 channel 1, "0" bit 37
skipping to change at page 15, line 48 skipping to change at page 15, line 50
Special information tone: The callee cannot be reached, but the Special information tone: The callee cannot be reached, but the
reason is neither "busy" nor "congestion". This tone should reason is neither "busy" nor "congestion". This tone should
be used before all call failure announcements, for the be used before all call failure announcements, for the
benefit of automatic equipment. benefit of automatic equipment.
Comfort tone: The call is being processed. This tone may be used Comfort tone: The call is being processed. This tone may be used
during long post-dial delays, e.g., in international during long post-dial delays, e.g., in international
connections. connections.
Hold tone: The caller has been placed on hold. Replaced by Hold tone: The caller has been placed on hold.
Greensleeves
Record tone: The caller has been connected to an automatic Record tone: The caller has been connected to an automatic
answering device and is requested to begin speaking. answering device and is requested to begin speaking.
Caller waiting tone: The called station is busy, but has call Caller waiting tone: The called station is busy, but has call
waiting service. waiting service.
Pay tone: The caller, at a payphone, is reminded to deposit Pay tone: The caller, at a payphone, is reminded to deposit
additional coins. additional coins.
skipping to change at page 16, line 49 skipping to change at page 17, line 5
Payphone recognition tone: The person making the call or being Payphone recognition tone: The person making the call or being
called is using a payphone (and thus it is ill-advised to called is using a payphone (and thus it is ill-advised to
allow collect calls to such a person). allow collect calls to such a person).
3.13 Extended Line Events 3.13 Extended Line Events
Table 5 summarizes country-specific events and tones that can appear Table 5 summarizes country-specific events and tones that can appear
on a subscriber line. on a subscriber line.
3.14 Trunk Events
Event encoding (decimal) Event encoding (decimal)
_____________________________________________ _____________________________________________
Off Hook 64 Off Hook 64
On Hook 65 On Hook 65
Dial tone 66 Dial tone 66
PABX internal dial tone 67 PABX internal dial tone 67
Special dial tone 68 Special dial tone 68
Second dial tone 69 Second dial tone 69
Ringing tone 70 Ringing tone 70
Special ringing tone 71 Special ringing tone 71
skipping to change at page 17, line 35 skipping to change at page 17, line 36
Warning tone 83 Warning tone 83
Intrusion tone 84 Intrusion tone 84
Calling card service tone 85 Calling card service tone 85
Payphone recognition tone 86 Payphone recognition tone 86
CPE alerting signal (CAS) 87 CPE alerting signal (CAS) 87
Off-hook warning tone 88 Off-hook warning tone 88
Ring 89 Ring 89
Table 4: E.182 line events Table 4: E.182 line events
3.14 Trunk Events
Table 6 summarizes the events and tones that can appear on a trunk. Table 6 summarizes the events and tones that can appear on a trunk.
Note that a trunk can also carry line events (Section 3.12), as MF Note that a trunk can also carry line events (Section 3.12), as MF
signaling does not include backward signals [15]. signaling does not include backward signals [15].
ABCD transitional: 4-bit signaling used by digital trunks. For ABCD transitional: 4-bit signaling used by digital trunks. For
N-state signaling, the first N values are used. N-state signaling, the first N values are used.
The T1 ESF (extended super frame format) allows 2, 4, and The T1 ESF (extended super frame format) allows 2, 4, and
16 state signalling bit options. These signalling bits are 16 state signalling bit options. These signalling bits are
named A, B, C, and D. Signalling information is sent as named A, B, C, and D. Signalling information is sent as
robbed bits in frames 6, 12, 18, and 24 when using ESF T1 robbed bits in frames 6, 12, 18, and 24 when using ESF T1
framing. A D4 superframe only transmits 4-state signalling framing. A D4 superframe only transmits 4-state signalling
with A and B bits. On the CEPT E1 frame, all signalling is with A and B bits. On the CEPT E1 frame, all signalling is
carried in timeslot 16, and two channels of 16-state (ABCD)
signalling are sent per frame.
Event encoding (decimal) Event encoding (decimal)
___________________________________________________ ___________________________________________________
Acceptance tone 96 Acceptance tone 96
Confirmation tone 97 Confirmation tone 97
Dial tone, recall 98 Dial tone, recall 98
End of three party service tone 99 End of three party service tone 99
Facilities tone 100 Facilities tone 100
Line lockout tone 101 Line lockout tone 101
Number unobtainable tone 102 Number unobtainable tone 102
Offering tone 103 Offering tone 103
skipping to change at page 18, line 27 skipping to change at page 18, line 26
Queue tone 106 Queue tone 106
Refusal tone 107 Refusal tone 107
Route tone 108 Route tone 108
Valid tone 109 Valid tone 109
Waiting tone 110 Waiting tone 110
Warning tone (end of period) 111 Warning tone (end of period) 111
Warning Tone (PIP tone) 112 Warning Tone (PIP tone) 112
Table 5: Country-specific Line events Table 5: Country-specific Line events
carried in timeslot 16, and two channels of 16-state (ABCD)
signalling are sent per frame.
Since this information is a state rather than a changing Since this information is a state rather than a changing
signal, implementations SHOULD use the following triple- signal, implementations SHOULD use the following triple-
redundancy mechanism, similar to the one specified in ITU-T redundancy mechanism, similar to the one specified in ITU-T
Rec. I.366.2 [16], Annex L. At the time of a transition, Rec. I.366.2 [16], Annex L. At the time of a transition,
the same ABCD information is sent 3 times at an interval of the same ABCD information is sent 3 times at an interval of
5 ms. If another transition occurs during this time, then 5 ms. If another transition occurs during this time, then
this continues. After a period of no change, the ABCD this continues. After a period of no change, the ABCD
information is sent every 5 seconds. information is sent every 5 seconds.
Wink: A brief transition, typically 120-290 ms, from on-hook Wink: A brief transition, typically 120-290 ms, from on-hook
skipping to change at page 18, line 50 skipping to change at page 19, line 5
Incoming seizure: Incoming indication of call attempt (off- Incoming seizure: Incoming indication of call attempt (off-
hook). hook).
Seizure: Seizure by answering exchange, in response to outgoing Seizure: Seizure by answering exchange, in response to outgoing
seizure. seizure.
Unseize circuit: Transition of circuit from off-hook to on-hook Unseize circuit: Transition of circuit from off-hook to on-hook
at the end of a call. at the end of a call.
Wink off: A brief transition, typically 100-350 ms, from off-
hook (seized) to on-hook (unseized) and back to off-hook
Event encoding (decimal) Event encoding (decimal)
__________________________________________________ __________________________________________________
MF 0... 9 128... 137 MF 0... 9 128... 137
MF K0 or KP (start-of-pulsing) 138 MF K0 or KP (start-of-pulsing) 138
MF K1 139 MF K1 139
MF K2 140 MF K2 140
MF S0 to ST (end-of-pulsing) 141 MF S0 to ST (end-of-pulsing) 141
MF S1... S3 142... 143 MF S1... S3 142... 143
ABCD signaling (see below) 144... 159 ABCD signaling (see below) 144... 159
Wink 160 Wink 160
skipping to change at page 19, line 29 skipping to change at page 19, line 30
Default continuity tone 166 Default continuity tone 166
Continuity tone (single tone) 167 Continuity tone (single tone) 167
Continuity test send 168 Continuity test send 168
Continuity verified 170 Continuity verified 170
Loopback 171 Loopback 171
Old milliwatt tone (1000 Hz) 172 Old milliwatt tone (1000 Hz) 172
New milliwatt tone (1004 Hz) 173 New milliwatt tone (1004 Hz) 173
Table 6: Trunk events Table 6: Trunk events
Wink off: A brief transition, typically 100-350 ms, from off-
hook (seized) to on-hook (unseized) and back to off-hook
(seized). Used in operator services trunks. (seized). Used in operator services trunks.
Continuity tone send: A tone of 2010 Hz. Continuity tone send: A tone of 2010 Hz.
Continuity tone detect: A tone of 2010 Hz. Continuity tone detect: A tone of 2010 Hz.
Continuity test send: A tone of 1780 Hz is sent by the calling Continuity test send: A tone of 1780 Hz is sent by the calling
exchange. If received by the called exchange, it returns a exchange. If received by the called exchange, it returns a
"continuity verified" tone. "continuity verified" tone.
skipping to change at page 20, line 20 skipping to change at page 20, line 22
o Telephony tones consist of either a single tone, the addition o Telephony tones consist of either a single tone, the addition
of two or three tones or the modulation of two tones. (Almost of two or three tones or the modulation of two tones. (Almost
all tones use two frequencies; only the Hungarian "special all tones use two frequencies; only the Hungarian "special
dial tone" has three.) Tones that are mixed have the same dial tone" has three.) Tones that are mixed have the same
amplitude and do not decay. amplitude and do not decay.
o Tones for telephony events are in the range of 25 (ringing o Tones for telephony events are in the range of 25 (ringing
tone in Angola) to 1800 Hz. CED is the highest used tone at tone in Angola) to 1800 Hz. CED is the highest used tone at
2100 Hz. The telephone frequency range is limited to 3,400 Hz. 2100 Hz. The telephone frequency range is limited to 3,400 Hz.
(The piano has a range from 27.5 to 4186 Hz.)
o Modulation frequencies range from 15 (ANSam tone) to 480 Hz o Modulation frequencies range from 15 (ANSam tone) to 480 Hz
(Jamaica). Non-integer frequencies are used only for (Jamaica). Non-integer frequencies are used only for
frequencies of 16 2/3 and 33 1/3 Hz. (These fractional frequencies of 16 2/3 and 33 1/3 Hz. (These fractional
frequencies appear to be derived from older AC power grid frequencies appear to be derived from older AC power grid
frequencies.) frequencies.)
o Tones that are not continuous have durations of less than four o Tones that are not continuous have durations of less than four
seconds. seconds.
skipping to change at page 20, line 48 skipping to change at page 21, line 4
Recommendation E.180 [18]. Note that there are no specific guidelines Recommendation E.180 [18]. Note that there are no specific guidelines
for these tones. In the table, the symbol "+" indicates addition of for these tones. In the table, the symbol "+" indicates addition of
the tones, without modulation, while "*" indicates amplitude the tones, without modulation, while "*" indicates amplitude
modulation. The meaning of some of the tones is described in Section modulation. The meaning of some of the tones is described in Section
3.12 or Section 3.11 (for V.21). 3.12 or Section 3.11 (for V.21).
4.3 Use of RTP Header Fields 4.3 Use of RTP Header Fields
Timestamp: The RTP timestamp reflects the measurement point for Timestamp: The RTP timestamp reflects the measurement point for
the current packet. The event duration described in Section the current packet. The event duration described in Section
3.5 extends forwards from that time.
4.4 Payload Format
Tone name frequency on period off period Tone name frequency on period off period
______________________________________________________ ______________________________________________________
CNG 1100 0.5 3.0 CNG 1100 0.5 3.0
V.25 CT 1300 0.5 2.0 V.25 CT 1300 0.5 2.0
CED 2100 3.3 -- CED 2100 3.3 --
ANS 2100 3.3 -- ANS 2100 3.3 --
ANSam 2100*15 3.3 -- ANSam 2100*15 3.3 --
V.21 "0" bit, ch. 1 1180 0.033 V.21 "0" bit, ch. 1 1180 0.033
V.21 "1" bit, ch. 1 980 0.033 V.21 "1" bit, ch. 1 980 0.033
V.21 "0" bit, ch. 2 1850 0.033 V.21 "0" bit, ch. 2 1850 0.033
skipping to change at page 21, line 28 skipping to change at page 21, line 28
ITU ringing tone 425 0.67--1.5 3--5 ITU ringing tone 425 0.67--1.5 3--5
U.S._ringing_tone_______440+480________2.0_________4.0 U.S._ringing_tone_______440+480________2.0_________4.0
ITU busy tone 425 ITU busy tone 425
U.S. busy tone 480+620 0.5 0.5 U.S. busy tone 480+620 0.5 0.5
______________________________________________________ ______________________________________________________
ITU congestion tone 425 ITU congestion tone 425
U.S. congestion tone 480+620 0.25 0.25 U.S. congestion tone 480+620 0.25 0.25
Table 7: Examples of telephony tones Table 7: Examples of telephony tones
3.5 extends forwards from that time.
4.4 Payload Format
Based on the characteristics described above, this document defines Based on the characteristics described above, this document defines
an RTP payload format called "tone" that can represent tones an RTP payload format called "tone" that can represent tones
consisting of one or more frequencies. (The corresponding MIME type consisting of one or more frequencies. (The corresponding MIME type
is "audio/tone".) The default timestamp rate is 8,000 Hz, but other is "audio/tone".) The default timestamp rate is 8,000 Hz, but other
rates may be defined. Note that the timestamp rate does not affect rates may be defined. Note that the timestamp rate does not affect
the interpretation of the frequency, just the durations. the interpretation of the frequency, just the durations.
In accordance with current practice, this payload format does not In accordance with current practice, this payload format does not
have a static payload type number, but uses an RTP payload type number have a static payload type number, but uses an RTP payload type number
established dynamically and out-of-band. established dynamically and out-of-band.
It is shown in Fig. 3. It is shown in Fig. 3.
The payload contains the following fields: The payload contains the following fields:
modulation: The modulation frequency, in Hz. The field is a 9- modulation: The modulation frequency, in Hz. The field is a 9-
bit unsigned integer, allowing modulation frequencies up to bit unsigned integer, allowing modulation frequencies up to
511 Hz. If there is no modulation, this field has a value
of zero.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| modulation |T| volume | duration | | modulation |T| volume | duration |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R R R R| frequency |R R R R| frequency | |R R R R| frequency |R R R R| frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R R R R| frequency |R R R R| frequency | |R R R R| frequency |R R R R| frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
...... ......
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|R R R R| frequency |R R R R| frequency | |R R R R| frequency |R R R R| frequency |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 3: Payload format for tones Figure 3: Payload format for tones
511 Hz. If there is no modulation, this field has a value
of zero.
T: If the "T" bit is set (one), the modulation frequency is to T: If the "T" bit is set (one), the modulation frequency is to
be divided by three. Otherwise, the modulation frequency is be divided by three. Otherwise, the modulation frequency is
taken as is. taken as is.
This bit allows frequencies accurate to 1/3 Hz, since This bit allows frequencies accurate to 1/3 Hz, since
modulation frequencies such as 16 2/3 Hz are in modulation frequencies such as 16 2/3 Hz are in
practical use. practical use.
volume: The power level of the tone, expressed in dBm0 after volume: The power level of the tone, expressed in dBm0 after
dropping the sign, with range from 0 to -63 dBm0. (Note: A dropping the sign, with range from 0 to -63 dBm0. (Note: A
skipping to change at page 23, line 48 skipping to change at page 24, line 4
MIME media type name: audio MIME media type name: audio
MIME subtype name: telephone-event MIME subtype name: telephone-event
Required parameters: none. Required parameters: none.
Optional parameters: The "events" parameter lists the events Optional parameters: The "events" parameter lists the events
supported by the implementation. Events are listed as one supported by the implementation. Events are listed as one
or more comma-separated elements. Each element can either or more comma-separated elements. Each element can either
be a single integer or two integers separated by a hyphen. be a single integer or two integers separated by a hyphen.
No white space is allowed in the argument. The integers
designate the event numbers supported by the
implementation. All implementations MUST support events 0
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| V |P|X| CC |M| PT | sequence number | | V |P|X| CC |M| PT | sequence number |
| 2 |0|0| 0 |0| 96 | 31 | | 2 |0|0| 0 |0| 96 | 31 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| timestamp | | timestamp |
| 48000 | | 48000 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| synchronization source (SSRC) identifier | | synchronization source (SSRC) identifier |
skipping to change at page 24, line 42 skipping to change at page 24, line 43
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| modulation=0 |0| volume=5 | duration=12000 | | modulation=0 |0| volume=5 | duration=12000 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|0 0 0 0| frequency=440 |0 0 0 0| frequency=480 | |0 0 0 0| frequency=440 |0 0 0 0| frequency=480 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 4: Combining tones and events in a single RTP packet Figure 4: Combining tones and events in a single RTP packet
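The tone block at the end of Figure 4 can be reproduced with a short sketch. It assumes the field widths drawn in Figure 3: a 9-bit modulation field, the 1-bit T flag, a 6-bit volume, a 16-bit duration, and then one 16-bit word per frequency made of 4 reserved bits and a 12-bit frequency. The function names pack_tone and unpack_tone are hypothetical, and the code covers only the tone payload, not the RTP header or the redundancy headers.

```python
import struct


def pack_tone(modulation, t_bit, volume, duration, frequencies):
    """Build a tone payload: one 32-bit header word, then 16 bits per frequency."""
    if not 0 <= modulation <= 511:
        raise ValueError("modulation must fit in 9 bits")
    if t_bit not in (0, 1):
        raise ValueError("T must be 0 or 1")
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits (0 to -63 dBm0)")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    header = (modulation << 23) | (t_bit << 22) | (volume << 16) | duration
    payload = struct.pack("!I", header)
    for hz in frequencies:
        if not 0 <= hz <= 0x0FFF:
            raise ValueError("frequency must fit in 12 bits")
        payload += struct.pack("!H", hz)  # upper 4 reserved bits stay zero
    return payload


def unpack_tone(payload, timestamp_rate=8000):
    """Decode a tone payload into a dictionary of interpreted values."""
    if len(payload) < 4 or len(payload) % 2:
        raise ValueError("payload is truncated")
    (header,) = struct.unpack("!I", payload[:4])
    modulation = header >> 23
    t_bit = (header >> 22) & 0x1
    volume = (header >> 16) & 0x3F
    duration = header & 0xFFFF
    frequencies = [
        struct.unpack("!H", payload[i:i + 2])[0] & 0x0FFF  # ignore reserved bits
        for i in range(4, len(payload), 2)
    ]
    return {
        # With T set, the modulation frequency is divided by three.
        "modulation_hz": modulation / 3 if t_bit else float(modulation),
        "volume_dbm0": -volume,  # the sign is dropped on the wire
        # The timestamp rate scales durations only, never frequencies.
        "duration_s": duration / timestamp_rate,
        "frequencies_hz": frequencies,
    }


# Values taken from the tone block of Figure 4.
figure4_tone = pack_tone(0, 0, 5, 12000, [440, 480])
decoded = unpack_tone(figure4_tone)
```

With the default 8000 Hz timestamp rate, the duration of 12000 units decodes to 1.5 seconds. A modulation value of 50 with T set would decode to 16 2/3 Hz.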
No white space is allowed in the argument. The integers
designate the event numbers supported by the
implementation. All implementations MUST support events 0
through 15, so that the parameter can be omitted if the through 15, so that the parameter can be omitted if the
implementation only supports these events. implementation only supports these events.
The "rate" parameter describes the sampling rate, in Hertz. The "rate" parameter describes the sampling rate, in Hertz.
The number is written as a floating point number or as an The number is written as a floating point number or as an
integer. If omitted, the default value is 8000 Hz. integer. If omitted, the default value is 8000 Hz.
Encoding considerations: This type is only defined for transfer Encoding considerations: This type is only defined for transfer
via RTP [1]. via RTP [1].
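As an illustration of the "events" and "rate" parameters described above, the following sketch parses both. The names parse_events and parse_rate are hypothetical. Treating an omitted "events" parameter as events 0 through 15 and an omitted "rate" as 8000 Hz follows the text; rejecting a range whose second integer is smaller than the first is an assumption of this sketch.

```python
def _event_number(text):
    """Parse one unsigned decimal integer; reject signs and empty strings."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError("not an event number: %r" % text)
    return int(text)


def parse_events(value=None):
    """Return the set of event numbers named by an "events" parameter."""
    if value is None:
        return set(range(16))  # events 0 through 15 are always supported
    if any(ch.isspace() for ch in value):
        raise ValueError("no white space is allowed in the argument")
    events = set()
    for element in value.split(","):
        first, hyphen, last = element.partition("-")
        low = _event_number(first)
        high = _event_number(last) if hyphen else low
        if high < low:
            raise ValueError("descending range: %r" % element)  # assumption
        events.update(range(low, high + 1))
    return events


def parse_rate(value=None):
    """Return the sampling rate in Hz; integer and floating point forms are accepted."""
    return 8000.0 if value is None else float(value)


supported = parse_events("0-15,32-36,64")
```

Here supported holds the DTMF events 0 through 15, the events 32 through 36, and event 64.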
skipping to change at page 26, line 29 skipping to change at page 26, line 33
example RFC 1890 [19]). This implies that confidentiality of the media example RFC 1890 [19]). This implies that confidentiality of the media
streams is achieved by encryption. Because the data compression used streams is achieved by encryption. Because the data compression used
with this payload format is applied end-to-end, encryption may be with this payload format is applied end-to-end, encryption may be
performed after compression so there is no conflict between the two performed after compression so there is no conflict between the two
operations. operations.
This payload type does not exhibit any significant non-uniformity in This payload type does not exhibit any significant non-uniformity in
the receiver side computational complexity for packet processing to the receiver side computational complexity for packet processing to
cause a potential denial-of-service threat. cause a potential denial-of-service threat.
In older networks employing in-band signaling and lacking appropriate
tone filters, the tones in Section 3.14 may be used to commit toll
fraud.
Additional security considerations are described in RFC 2198 [6]. Additional security considerations are described in RFC 2198 [6].
8 IANA Considerations 8 IANA Considerations
This document defines two new RTP payload formats, named telephone- This document defines two new RTP payload formats, named telephone-
event and tone, and associated Internet media (MIME) types, event and tone, and associated Internet media (MIME) types,
audio/telephone-event and audio/tone. audio/telephone-event and audio/tone.
Within the audio/telephone-event type, additional events MUST be Within the audio/telephone-event type, additional events MUST be
registered with IANA. Registrations are subject to approval by the registered with IANA. Registrations are subject to approval by the
skipping to change at page 27, line 4 skipping to change at page 27, line 11
The meaning of new events MUST be documented either as an RFC or an The meaning of new events MUST be documented either as an RFC or an
equivalent standards document produced by another standardization equivalent standards document produced by another standardization
body, such as ITU-T. body, such as ITU-T.
9 Acknowledgements 9 Acknowledgements
The suggestions of the Megaco working group are gratefully The suggestions of the Megaco working group are gratefully
acknowledged. Detailed advice and comments were provided by Fred acknowledged. Detailed advice and comments were provided by Fred
Burg, Steve Casner, Fatih Erdin, Bill Foster, Mike Fox, Gunnar Burg, Steve Casner, Fatih Erdin, Bill Foster, Mike Fox, Gunnar
Hellstrom, Terry Lyons, Colin Perkins and Steve Magnell. Hellstrom, Terry Lyons, Steve Magnell, Vern Paxson and Colin Perkins.
10 Authors 10 Authors
Henning Schulzrinne Henning Schulzrinne
Dept. of Computer Science Dept. of Computer Science
Columbia University Columbia University
1214 Amsterdam Avenue 1214 Amsterdam Avenue
New York, NY 10027 New York, NY 10027
USA USA
electronic mail: schulzrinne@cs.columbia.edu electronic mail: schulzrinne@cs.columbia.edu
 End of changes. 
