Standing alone
Paul Jones, Rapporteur for Question 12, Study Group 16, ITU-T, outlines how H.323 will evolve to become the Advanced Multimedia System
Paul Jones has been involved in research and
development of protocols and system architectures in
the area of multimedia communications, including
voice, video, and data conferencing over IP networks,
since 1996. In addition to architecture and software
development activities within Cisco Systems' Voice
Technology Group, he has actively participated in a
number of standards and industry organizations,
including the ITU, TIA, IETF, ETSI, and the IMTC.
Most notably, he served as editor of ITU-T
Recommendation H.323 and Rapporteur for ITU-T
Q.2/16 and Q.5/16. He is also an active participant
in accessibility-related work in the ITU and TIA and
serves as the Rapporteur for ITU-T Q.26/16. He has
authored or co-authored several IETF RFCs and ITU
standards, including several focused on text
communication over IP networks.
Q: Why replace H.323 with AMS/H.325, and why
now?
A: With the Advanced Multimedia System (AMS), we
are developing a system that is significantly more
advanced in terms of features and capabilities. Simply
giving it a number wouldn't do it justice. Even with
H.323 there were actually three primary standards
documents, and with AMS we expect there to be
many, many more.
Study Group 16's charter is for multimedia
communication systems, and it has a long history of
doing just that with protocols such as H.320, H.324
and H.323. The last of these, H.323, was first
published in 1996, so it's been some time.
Estimates are that today H.323 carries roughly 15
percent of international voice traffic, and it is
still the dominant protocol used for video within
the LAN. When we developed H.323, our focus was on
Video Teleconferencing (VTC). What we found,
however, was that it had much wider
applicability and could be used for VoIP as well.
Why now? Because we have been looking at
the deficiencies of the existing systems for the
past several years: their weaknesses, the
functionality that users will want, and how we can
bring that about. Today, data conferencing is
usually done with the audio-visual part as a standalone
application, separate from any application sharing.
The objective for AMS is to introduce a
system in which all applications are tightly
integrated, to enable multiple lines of
communication between entities and users. It's not
necessarily a user; it might be that the system is used
for robotic control, so that robots are able to
communicate with each other.
There are also further-reaching applications.
With AMS we will define an architecture and build
a system in which we define the interfaces
between the users and the device. Historically that
device has had a significant amount of
functionality in it, such as all of the audio and
video intelligence. What we are doing with AMS is
ensuring that the end point - the device - is
responsible only for the basic establishment of
communication between two users. The device
doesn't perform any application functionality itself
and has no inherent voice capability, no video, no
whiteboarding and no file transfer. The
applications all connect to the end point device
through some kind of interface. We are also using
IP, so the end point device might communicate
with the various applications over an IP
connection.
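
To make that separation concrete, here is a minimal sketch in Python of how an end point might do nothing more than broker sessions while application modules register with it through some interface. It is purely illustrative; the class and method names are assumptions and come from no published AMS/H.325 text.

```python
# Hypothetical sketch: an AMS-style end point that only establishes sessions
# and delegates all media and application functionality to registered applications.
# All names here are illustrative assumptions, not defined by AMS/H.325.

class EndPoint:
    """Brokers communication between users; holds no media or application logic itself."""

    def __init__(self, user):
        self.user = user
        self.applications = {}  # e.g. "voice", "video", "whiteboard", "file-transfer"

    def register_application(self, name, app):
        # Applications attach through some interface (possibly an IP connection);
        # the end point only needs to know they exist and how to reach them.
        self.applications[name] = app

    def establish_call(self, remote):
        # The end point's only job: basic establishment of communication.
        # Each pair of matching applications negotiates its own session.
        shared = self.applications.keys() & remote.applications.keys()
        for name in shared:
            self.applications[name].start(remote.applications[name])
        return sorted(shared)


class VoiceApp:
    """One possible application; whiteboarding or file transfer would plug in the same way."""

    def start(self, peer_app):
        print("voice session negotiated with peer application")


if __name__ == "__main__":
    alice, bob = EndPoint("alice"), EndPoint("bob")
    alice.register_application("voice", VoiceApp())
    bob.register_application("voice", VoiceApp())
    print("applications in common:", alice.establish_call(bob))
```

The point of the sketch is the design choice, not the code: the end point keeps only a registry and session-establishment logic, so new applications can be attached without touching the device itself.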
Q: Can you give me an example?
A: Imagine a scenario where a telephone call
comes into the device at the user's end point; that
device is in communication with a number of different
applications. Some of the applications may
physically be on the device. AMS is not going to
preclude that. I don't think that there would be
many mobile phone manufacturers who would
produce a phone without voice functionality, for
instance. It might be that I am talking to you on
my mobile device and you have something that
you want to share with me. You initiate an
application sharing session and, because my
mobile device is associated with my PC, that
information appears on my PC, so it is immediately
available for me to view.
You may be able to use your mobile phone to
watch your TV programme while you are on the
subway or metro. You are watching streaming
video, and this application might be just a
specialised form of the video application, with
maybe five or ten seconds of buffering before
playing it out. When you arrive home, it might be
that you haven't finished watching the programme
and you'd like to continue watching it. We actually
want to enable this for every type of multimedia
application. You would have the ability to pause
the video and to tell the mobile device to move
that video from the mobile device to a flat-panel
HDTV as you walk into your home. That application
might be built into the TV, or it might be in the
set-top box that is connected to the TV. The user's
end point is communicating with the video
applications, so when you arrive home it
communicates with a second buffered video
application, the one driving the LCD screen,
coordinates between those two applications, and
moves the video to the HDTV so that playback
continues. Then the video parameters would likely
change, which means that the applications would
probably renegotiate capabilities and perhaps
change the media flow.
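
As a rough illustration of that handover, the sketch below (again Python, with purely hypothetical names; none of this is specified by AMS/H.325) shows a session being paused on one device, its playback state handed to another, and the video capabilities renegotiated for the new screen.

```python
# Hypothetical sketch of the session-mobility idea described above.
# Device names, resolutions and state fields are illustrative assumptions only.

class BufferedVideoApp:
    """A buffered video application attached to one display device."""

    def __init__(self, device, max_height):
        self.device = device
        self.max_height = max_height  # highest resolution the display supports
        self.position = 0             # seconds of the programme already played

    def pause(self):
        # Freeze playback and hand back the state the next device needs.
        return {"position": self.position}

    def resume(self, state, height):
        self.position = state["position"]
        print(f"{self.device}: resuming at {self.position}s, {height}p video")


def negotiate(offered, supported):
    # Capability renegotiation: use the best resolution both sides can handle.
    return min(offered, supported)


def move_session(source, target, offered_height):
    state = source.pause()                                   # pause on the mobile device
    height = negotiate(offered_height, target.max_height)    # media flow may change
    target.resume(state, height)                             # continue on the HDTV


if __name__ == "__main__":
    mobile = BufferedVideoApp("mobile", max_height=480)
    mobile.position = 1250                            # part-way through the programme
    hdtv = BufferedVideoApp("hdtv", max_height=1080)
    move_session(mobile, hdtv, offered_height=1080)   # source can stream up to 1080p
```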
What we are trying to do is collect as many
requirements as possible and to define the
interface between the user end point, the network
and the various applications, so we can enable any
kind of new application without changing
anything on the end point. This is probably one of
the most important differences between AMS and
the systems we thought about previously. If you
take a typical H.323 system and you want to add
new functionality to it, what that generally means is
that you are going to have to upgrade the device:
you are going to have to integrate the new
technology physically into the hardware device
that you are using.
Even adding entirely new capabilities such as
whiteboarding will at the very least require an
upgrade of the software. What we are
looking at for AMS is that application developers
can create applications; it might be
whiteboarding, it might be file transfer, it might be
something like video games. The application
doesn't matter; how the application behaves
doesn't really matter. That gives a lot of autonomy
to the application developers, who can plug into
the system, and it doesn't necessarily require the
user end point to be upgraded in order to
accommodate those new applications.
Q: What is the schedule?
A: The schedule is still fluctuating. We are looking
at having the requirements more fully fleshed out
this year. We hope to have a fully defined
architecture by 2009. My guess is that we will
probably have some protocol elements earlier, as
interfaces would probably be defined in parallel
with the architecture. The goal is to have the
architecture, and some of the protocols defined
for the interfaces used within the architecture,
by the first quarter of 2009. During
the rest of 2009 we will work to harden the
system specification, so it really depends on how
quickly contributions come in. I wouldn't expect
version 1 of the system before 2010, but it might
be that we will have something before then.
Q: In terms of gap analysis, where's the least
work and where's the most?
A: The area where most work has already been
done is basic audio and video transmission
functionality. Clearly we are going to borrow from
what we have done in H.323 and other standards
for audio-visual transmission. Although a lot of
that work has been done, it has been done in such
a way that it is not going to integrate
perfectly into this new system. We know how to
stream voice and video for VoIP or VTC, but the
way it has been done in the past won't be
smoothly integrated. The work area we are going
to spend most of our time on is the problem of
distributed systems. We are not going to build one
system with everything in it. It's a distributed
system where some of the functionality exists in
the end point, but most of the functionality
exists in the user's video or audio terminal or in
some other separate physical device.
We are also looking at doing everything over IP.
That immediately opens up some challenges in
how we deal with multiple networks, such as
high-speed LANs and narrowband wireless, and we
have to get those to work together. That is going
to be complicated.
We don't necessarily think that we will solve
all of those problems on day one. It might be
that the first AMS deliverable has all the audio
and video functionality built into the user device,
simply because we can do QoS control a little bit
better that way. We will, however, define the
interfaces between that device and the external
physical devices so that the potential is there to
move that AV functionality off-board to a separate
physical device. There are also things that are not
real time, such as application sharing
and file transfer. Those could be supported in this
way immediately.
Where we have the most work is figuring out how
to get the separate physical entities to
communicate with each other, and doing so in
such a way that users get the expected quality of
experience. We know that is going to be a
challenge, and it may take time to solve.
For more information visit the
ITU website at www.itu.int or email toby.johnson@itu.int