Multimedia Users Group
Oklahoma State University
July 18, 2000
Jane Carpenter, Bill Elliot, John Gelder
An Introduction To Real and SMIL
Video is just now becoming a hot feature on the Internet.
There are two, maybe three, environments for video over the
Internet: Real (RealNetworks), QuickTime (Apple), and Media Player
(Microsoft).
Historically, video was pretty simple on the web: click on a
link and, as long as you had the player, it would load and play.
Here are two examples, one using the Real player, the other the
QuickTime player.
But the sophistication of the players and the 'movies' has
increased tremendously in the past year. Look at these examples.
The current use of video on the Internet is more interactive
than ever before. More and more tools are becoming available that
allow this interactivity, which can make your video more user
friendly to your audience.
SMIL BASICS
SMIL is an acronym for Synchronized Multimedia Integration
Language. SMIL is a markup language, based on XML, that uses tags
to synchronize different media streams on a 'page'. The page can
be displayed in the RealNetworks player, embedded within a web
page, or even played in QuickTime. The markup language gives the
author more control over the placement of media streams on the
screen and over interactivity with those streams. SMIL 1.0 is the
current version of this powerful yet simple language, with SMIL
2.0 slated for release later this year.
While SMIL may feel like something totally new that you've
never seen in action, you may in fact have used it without
knowing. Many news and entertainment services, such as CNN,
ABCNews, and Take 5, use SMIL to display their multimedia and
provide interactivity on their sites.
So how do we create a SMIL file and what does a SMIL file
look like? Since SMIL is a markup language, you can create SMIL
files with a text editor: Notepad, SimpleText, BBEdit, or any
other text editor will do; even Microsoft Word works. A
familiarity with the tags/elements is useful to get started, but
I think in our case we'll just learn those as we go along.
Perhaps the best and most efficient way to learn SMIL is to look
at SMIL files created by others.
Once you've created a SMIL file, you can use RealPlayer
(RealPlayer Basic) from RealNetworks to view the results locally.
If the media files are not particularly large you can serve your
materials from an HTTP server, but for larger files that need to
be accessed by many users a RealServer is important.
So let's look at the four important tags that should appear in
your SMIL file.
<smil></smil>
<head></head>
<layout></layout>
<body></body>
The basic organization of these four tags is:
<smil>
<head>
<layout>
</layout>
</head>
<body>
</body>
</smil>
Specific information about the design of your screen is
placed between the <layout> tags, and the actual media data
types are placed within the <body> tag. The file is saved
with a '.smi' or '.smil' extension.
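Putting all four tags together, a minimal complete SMIL file might look like the following sketch. The media file name is hypothetical, and the <region> and <video> details are explained in the sections that follow:

<smil>
<head>
<layout>
<root-layout background-color="black" width="320"
height="240"/>
<region id="videoregion" top="0" left="0"
width="320" height="240"/>
</layout>
</head>
<body>
<video src="sample.rm" region="videoregion"/>
</body>
</smil>

Saved as, say, sample.smi, this file should open and play in RealPlayer.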
The layout tag has several important features. Within the
basic layout tag we can define the root window size which will
contain the media elements. An example is,
<layout>
<root-layout
background-color="black" width="600"
height="500"/>
</layout>
The root-layout element defines the window size for the
player; in this case the player window will be 600 pixels wide
and 500 pixels high. Within that window a different element,
<region>, is used to place the media elements. Each region is
identified by an id attribute and controls the position, size,
and scaling of the media elements placed in it. For example, if
within the window defined by the root-layout element there are
two regions for media streams and a region for a background image
to play the media elements over, they would be defined in the
following way:
<layout>
<root-layout
background-color="black" width="600"
height="500"/>
<region id="backregion"
top="0" left="0" width="600"
height="500" z-index="0"/>
<region id="videoregion"
top="78" left="142" width="320"
height="240" z-index="1"/>
<region id="textregion"
top="327" left="142" width="442"
height="161" z-index="1"/>
</layout>
The top, left, width, and height attributes define the
position and size of the regions. Notice the videoregion is 320
pixels by 240 pixels; that is the size of the video clip I
captured. The textregion is 442 pixels wide and 161 pixels high;
those dimensions were arrived at on the basis of the information
I display within that region. I use the textregion to display
slides showing information that I'm discussing in my
presentation. The backregion has the same dimensions as the
root-layout; in it I place a background image which I created in
Fireworks 3.0. The background image looks like:

Within this background image you can see the embossed region
for the video and the slide region (in gray). The z-index
attribute defines the relative layering of the regions: a region
with a higher z-index is drawn on top of one with a lower
z-index.
Now that we've defined the regions in our file, we can
associate the different media streams with those regions. The
following is placed within the <body> </body> tag.
<body>
<seq>
<par>
<ref src="API16800.rm"
region="videoregion"/>
<ref src="apinstitutep1jpegs.rp"
region="textregion" fill="freeze"/>
<ref src="APIbackP1.gif"
region="backregion" fill="freeze">
</ref>
</par>
</seq>
</body>
The <seq> and <par> elements define how the media
streams contained within them are synchronized. In this
particular case all three media streams are played in parallel:
they begin together and end according to the length of the
longest stream. If one of the streams is shorter, the
fill="freeze" attribute freezes the last frame of that
stream. If two media streams are to be played one after the
other, the streams are placed within a <seq> element. For
example,
<body>
<seq>
<ref src="API16800.rm"
region="videoregion"/>
<ref src="apinstitutep1jpegs.rp"
region="textregion" fill="freeze"/>
<ref src="APIbackP1.gif"
region="backregion" fill="freeze">
</ref>
</seq>
</body>
This arrangement plays the three streams one after another in
document order: the video first, then the slides, and finally the
background image.
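If the goal is instead to keep the background image on screen the whole time while the video plays and then the slides follow, <par> and <seq> can be nested. A sketch, using the same file names as above:

<body>
<par>
<ref src="APIbackP1.gif"
region="backregion" fill="freeze"/>
<seq>
<ref src="API16800.rm"
region="videoregion"/>
<ref src="apinstitutep1jpegs.rp"
region="textregion" fill="freeze"/>
</seq>
</par>
</body>

Here the background image plays in parallel with the whole <seq>, so it remains visible while the video and then the slides take their turns.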
Note that each media stream is assigned a particular region id
based on those defined within the <layout> element.
You've already noticed that the background image has several
'buttons'. These areas can be defined as hot spots that advance
the video or load a different SMIL file. Let's add some
additional elements within the <body> element to enhance the
interactivity.
<body>
<seq>
<ref src="API16800.rm"
region="videoregion"/>
<ref src="apinstitutep1jpegs.rp"
region="textregion" fill="freeze"/>
<ref src="APIbackP1.gif"
region="backregion" fill="freeze">
<anchor
href="command:seek(5:30.0)" target="_player"
coords="6,100,126,190"/>
<anchor
href="command:seek(11:30.0)" target="_player"
coords="6,205,126,295"/>
<anchor
href="command:seek(22:54.0)" target="_player"
coords="6,307,126,397"/>
<anchor
href="command:seek(46:45.0)" target="_player"
coords="6,403,126,493"/>
<anchor href="APInstituteP2.smi"
target="_player" coords="471,171,541,220"
begin="0s"/>
<anchor href="APInstituteP3.smi"
target="_player" coords="471,244,541,293"
begin="0s"/>
</ref>
</seq>
</body>
The <anchor> element allows the author to add a
measure of interactivity to the file. We can define a rectangle
so that when the user clicks the mouse within it, a particular
behavior occurs. In the example above two different actions
occur. The "command:seek(x:xx.x)" action causes the
current video to jump to the specified time and continue to
play. The <anchor> element can also load a new file; in
this particular case I've used relative addressing because the
new SMIL file is within the same directory.
It is also possible to make a particular media stream 'hot'.
If I wanted to allow the user to click the mouse within the
streaming video media I would use a slight variation,
<a href="APInstituteP3.smi"
show="new">
<video src="API16800.rm"
region="videoregion"/>
</a>
Notice the <video src> element is different from the
<ref src> I used earlier. These are examples of media
object elements. SMIL allows several different media object
elements, including ref, animation, audio, img, video, text, and
textstream. As I understand it, ref is the most general and will
handle any of these sources.
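As a sketch of the more specific media object elements (the file names here are hypothetical), a slide image, its narration, and a RealText caption stream could be played together like this:

<par>
<img src="slide1.jpg" region="textregion"
dur="30s"/>
<audio src="narration.rm"/>
<textstream src="captions.rt"
region="textregion"/>
</par>

Note the audio stream needs no region, and a static image like img needs a dur attribute to stay on screen for a set time.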
There are many additional attributes allowed for each of the
elements I've discussed. These are described in the SMIL language
specifications. If you are like me, these details are better
understood by looking at examples of different SMIL files and
analyzing what they do.
The particular example I've used in this discussion is one I
currently favor for video captured from my lectures. It allows
students to advance to particular example problems or discussions
of concepts, so the student does not have to guess where in a
50-minute video particular events occurred. This does mean the
author must define those particular time points in the video
by reviewing the video. I've found through experience I can
capture a 50 minute video in...50 minutes. Using Real Producer
Plus the video is compressed while it is captured. A 50 minute
lecture is approximately 100 MB in size. An additional 30 minutes
are required to determine the exact time points within the video
important to the student. Producing the slides is where I spend
the majority of my time; these may take several hours depending
on how much detail you want. After all of the video is captured,
the time points within the video determined and the slides are
created I upload the files to the Real Server and generate a link
on my web page to access the file.
One particular area I've not discussed is the issue of
bitrate, which is very important for the user. This is the speed
at which the video, audio, text, and pictures are delivered,
and it depends on the user's connection: 56K modem, ISDN, T1,
etc. A SMIL file containing media streams encoded for a T1
line will not play very well over a 56K connection. I've not got
all this figured out completely at this point, but hope to by the
time this workshop is ready for the Fall semester.
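SMIL 1.0 does offer one tool for this problem: the <switch> element with the system-bitrate test attribute, which tells the player to pick the first child whose bitrate (in bits per second) fits the user's connection. A sketch, with hypothetical file names and two encodings of the same lecture:

<body>
<switch>
<video src="lecture-t1.rm"
region="videoregion" system-bitrate="80000"/>
<video src="lecture-56k.rm"
region="videoregion" system-bitrate="28000"/>
</switch>
</body>

The player evaluates the children in order and plays the first acceptable one, so the higher-bitrate streams should be listed first.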