INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11

CODING OF MOVING PICTURES AND AUDIO
 

ISO/IEC JTC1/SC29/WG11 N10233
Oct 2008  Busan, Korea

Title:          MAF Overview
Source:      Requirement Group
Editor:       Kyuheon Kim, Florian Schreiner, Klaus Diepold
Status:       approved

 

MAF Overview

 

Table of Contents

1    Motivation for Multimedia Application Formats (MAFs)

1.1      Requirements for MAFs

1.2      Requirements for starting a new MAF

1.3      Template for MAF Candidates

2    MAFs Already Specified

2.1      Music Player Application Format

2.2      Photo-Player Application Format

2.3      Musical Slide Show Application Format

2.4      Media Streaming Application Format

2.5      Open Access Application Format

2.6      Digital Multimedia Broadcasting Application Format

2.7      Professional Archival Application Format

2.8      Video Surveillance Application Format

2.9      Stereoscopic Application Format

3    MAFs under Development

3.1      Protected Musical Slide Show Application Format

3.2      Portable Video Application Format

3.3      Interactive Music Application Format

References

Annex A URN Structure for MAFs

 

1      Motivation for Multimedia Application Formats (MAFs)

This document presents an overview of MPEG’s Multimedia Application Formats (MAF), which provide the framework for integration of elements from several MPEG standards into a single specification that is suitable for specific but widely usable applications. Typically, MAFs specify how to combine metadata with timed media information for a presentation in a well-defined format that facilitates interchange, management, editing, and presentation of the media. The presentation may be ‘local’ to the system or may be accessible via a network or other stream delivery mechanism. Selected Multimedia Application Formats are candidates to become parts of the ISO/IEC 23000 (MPEG-A) specification.

The current situation is:

·         ISO/IEC 23000-1: Purpose of Multimedia Application Formats (TR)

·         ISO/IEC 23000-2: Music Player (FDIS in April 2007)

·         ISO/IEC 23000-3: Photo Player (FDIS in October 2006)

·         ISO/IEC 23000-4: Musical Slide Show (FDIS in April 2007)

·         ISO/IEC 23000-5: Media Streaming (FDIS in October 2007)

·         ISO/IEC 23000-6: Professional Archival (FDIS in October 2008)

·         ISO/IEC 23000-7: Open Access (FDIS in January 2008)

·         ISO/IEC 23000-8: Portable Video (FDIS in April 2008)

·         ISO/IEC 23000-9: Digital Multimedia Broadcasting (FDIS in October 2007)

·         ISO/IEC 23000-10: Video Surveillance (FDIS in July 2008)

·         ISO/IEC 23000-11: Stereoscopic Video  (FDIS in October 2008)

·         ISO/IEC 23000-12: Interactive Music  (in)

1.1    Requirements for MAFs

MAF specifications shall integrate elements from different MPEG standards into a single specification that is useful for specific but very widely used applications. Examples are delivering music, pictures or home videos. MAF specifications may use elements from MPEG-1, MPEG-2, MPEG-4, MPEG-7 and MPEG-21.

Typically, MAF specifications include:

·         The ISO file format family for storage

·         MPEG-7 tools for metadata

·         One or more coding Profiles for representing the media

·         Tools for encoding metadata in either binary or textual form (XML)

MAFs may specify use of:

·         MPEG-21 Digital Item Declaration Language for representing the structure of the media and the metadata

·         Other MPEG-21 tools as they are required

·         non-MPEG coding tools (e.g., JPEG) for representation of "non-MPEG" media

·         Elements from non-MPEG standards that are required to achieve full interoperability

MAF Specifications can contain elements from all existing MPEG Standards. MAF Specifications shall use existing Profiles. In exceptional circumstances, non-Profile tool sets may be used.

MAF Specifications shall fully specify the elements that are used from the MPEG standards (and other standards if applicable) with all their constraints, so that full interoperability can be achieved.

MAF Specifications shall specify a core metadata set to be supported by any implementation of that MAF, and may enable private extensions.

Any URNs defined in MAF Specifications shall conform to the structure defined in Annex A.  This requirement supplements the general requirements for URNs in the MPEG namespace, as detailed in IETF RFC 3614 and MPEG output N8945.

MAF Specifications shall be made available in the form of text and Reference Software. The Reference SW shall enable a rapid uptake of the MAF in question.

 

 Figure 1: MAF Conceptual Overview

Figure 1 illustrates the concept for MAFs. A more complete description of the MPEG-A approach is given in [9]; it is also reflected in the technical report on MPEG-A to be issued as ISO/IEC 23000-1 [4].

1.2    Requirements for starting a new MAF

Before the work on a new MAF (a part of the MPEG-A Standard) can be started, the following items need to be available:

·         Documentation of the technologies that the MAF would encompass, and evidence that these technologies are sufficiently advanced in standardization: the base technologies must be certain to be at FDIS stage or beyond when the MAF reaches FDIS stage.

       This includes MPEG technologies as well as external technologies.

·         Clear statement of Requirements and Value Proposition,

       Including an assessment of the relation to any solutions that may already exist, standards-based or proprietary.

·         A clear explanation of whether technically similar MAFs already exist and, if so, in what respect the requirements of the newly proposed MAF differ from theirs.

·         Documentation of enough industry support to successfully complete the work and deploy the format, in the form of registered MPEG input contributions.

·         Commitments for Reference Software in registered MPEG input contributions. The reference SW should allow a successful launch of the format.

·         Commitments, in the form of one or more registered input contributions, for exchange and crosscheck of content encoded according to the MAF between multiple parties and multiple independent implementations.

       The content should exercise/utilize all the tools in the MAF (but not necessarily in all possible combinations).

·         Commitment, in the form of one or more registered input contributions, to produce relevant marketing materials, including at least one white paper explaining the benefits of the new MAF.

1.3    Template for MAF Candidates

MPEG can start the process for the specification of a new part of MPEG-A based on technical contributions that allow it to assess whether there is sufficient industry support for such a standardization. The corresponding contribution documents shall be based on the template specified in this section.

1.3.1     Application Scenario Description

This section shall contain a description of the application scenario under consideration. The application scenario shall guide the requirements process leading into the subsequent specification process. Note that it is not the application that is the subject of standardization.

1.3.2     Requirements

Based on the application scenario described in the previous section, technical requirements shall be extracted and specified to a level of detail that allows the elements from the MPEG body of standards needed to satisfy the requirements to be identified and quantified. It is conceivable that this process produces requirements for which no appropriate MPEG technologies exist. This situation may trigger new standardization activities within MPEG if this is considered necessary.

1.3.3     List of technologies

MPEG technologies shall be selected to match the requirements compiled in the previous section. Proponents should pick fitting profiles from existing MPEG parts or use this information as a guideline to define new profiles as needed.

1.3.4     Comparison with other MAFs

MPEG shall not duplicate its efforts and shall not create a set of MAFs with a large amount of overlap. In order to distinguish the technical scope of a newly proposed MAF, proponents shall provide a comparison of the new MAF with existing or recently proposed MAFs. This exercise could also help to identify MAF proposals that can be merged to produce a more generic MAF that facilitates a wider range of applications.

1.3.5     Issues

MPEG experts need to identify any issues which are essential to facilitate the application scenario but which may lie outside of MPEG’s jurisdiction. 

1.3.6     Supporting Companies and Organizations

MPEG standards call for a reference implementation in the form of reference software. The software is intended to demonstrate and facilitate the use of the specified MPEG technologies in the market place. The necessary software elements may need to include normative and non-normative parts.


 

2      MAFs Already Specified

This section summarizes MAFs that have already been standardized or have at least progressed close to the finished standard, i.e. IS status.

2.1    Music Player Application Format

2.1.1     Overview

The Music Player MAF specifies a simple and uniform way to carry MP3 coded audio content in MPEG-4 File Format augmented by simple MPEG-7 metadata and a JPEG image for cover art. Specific MPEG-7 Metadata represents data commonly expressed in ID3 tags as follows:

·         Song title – title of the song

·         Album title – title of the album

·         Artist – artist performing the song

·         Year – year of the recording

·         Comment – an account of the content of the resource

·         Track – CD track number of song

·         Genre – a category of artistic, musical, or literary composition characterized by a particular style, form, or content.
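The items listed above can be pictured as a simple record. The following Python sketch is an illustration only: the class name `SongMetadata`, its field names, and the choice to make only the song title mandatory are assumptions made for this example. They are not taken from the MPEG-7 schema, which expresses these items as XML description elements.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SongMetadata:
    """Hypothetical record mirroring the ID3-style items listed above."""
    song_title: str                    # assumed mandatory for this sketch
    album_title: Optional[str] = None
    artist: Optional[str] = None
    year: Optional[int] = None         # year of the recording
    comment: Optional[str] = None
    track: Optional[int] = None        # CD track number
    genre: Optional[str] = None

    def present_fields(self) -> dict:
        """Return only the items that actually carry a value."""
        return {k: v for k, v in asdict(self).items() if v is not None}


example = SongMetadata(song_title="Example Song", artist="Example Artist", track=3)
print(example.present_fields())
```

A real implementation would serialize such a record as MPEG-7 metadata in textual or binary form rather than as a Python dictionary.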

MPEG-1/2 Audio Layer III, also known as MP3, is one of the most widely used MPEG standards. ID3, which was created by appending simple metadata tags such as Artist, Album and Song Title to the end of the MP3 file, has proved very useful because the tags provide a content description of the music clip.

Since that time MPEG has developed a number of standards, all of which strive to serve the needs of consumers and industry. Among those are MPEG-4, a next-generation suite of standards for media compression, and MPEG-7, a suite of standards for meta-data representation. MPEG-4 specifies the ISO/MPEG-4 File Format, while MPEG-7 specifies not only signal-derived meta-data, but also archival meta-data such as Artist, Album and Song Title. MPEG-21 tools are used on top of that to enable album functionality: they allow several song files in the song file format described above to be collected into one album file.

As such, MPEG-4, MPEG-7 and MPEG-21 represent an ideal environment to support the current “MP3 music library” user experience, and, moreover, to extend that experience in new directions.

In the second edition of the Music Player MAF, MPEG has gone one step further by adding protection features: AES-128 counter mode encryption as the default protection tool, and MPEG-21 Intellectual Property Management and Protection (IPMP) and MPEG-21 Rights Expression Language (REL) for flexible and extensible protection and governance description.

2.1.2     Application Scenario for Protection

2.1.2.1       Simple protection scenario with default encryption

Producer Jim records a few new songs and stores them in the format specified in MAF Part 2. To protect his songs against unauthorized copying, he encrypts them as specified in the proposed Music Player MAF, 2nd edition, and sends the files to retailer Tom. Since the content is already protected, transport does not need additional security measures. Jim puts the access information (encryption keys) in a separate container and delivers it to Tom in a secure way (over the Internet, on a CD, or by whatever means he chooses). Tom can now use the protected content for distribution to the end user without modifying the song format. He just needs to add proper license information.

Advantages for the song producer

·         Content that is meant to be distributed is protected from the very beginning

·         If “everyone” uses the same protection (and encoding) format, lots of economically available tools will support this “mastering process”

·         The producer has the full control over the content

Advantages for the retailer

·         Needs only one set of well supported tools, since every producer uses a standardized content encoding and protection format

·         The song can be distributed to the end customer without changing the format (encoding, protection and file format). It is just necessary to add proper license information for the end customer.

·         The protected content format is directly suitable to be used for concepts like super distribution and subscription models that require the separation of content and license.

·         The protected content format is slim and efficient and thus suitable for a broad range of customer devices and for various DRM schemes ranging from simple to sophisticated.

2.1.2.2       Flexible Protection Scenario

Distributor applies protection

Producer Peter records songs. He then enters an agreement with Distributor Annie to widely sell the songs online. Peter sends the songs to Annie without any protection because he trusts her.

Upon receiving the songs, Annie takes the following actions:

·         Protect the songs with the appropriate content protection tool.

·         Create metadata to describe the protection that she applied to the songs. The description identifies the tools (by id and/or the location from which to obtain them) and, if the protection is not applied to the entire song, the parts of the song to which they are applied.

·         Express the rights that govern the use of the protected song. The rights expression reflects the business model that is used. It may describe who has the privilege to consume the song, what the privilege (right) is, under what conditions it applies, etc. Key information to “unprotect” the song might be included in the rights expression as well.

·         Package the protected data, the metadata that describes the protection, and the rights expression into the Music Player MAF format.

·         Make the package available for download/streaming by the end user. Note the business model may require proof of payment for purchase or a valid subscription.

Producer applies protection

Peter signs a separate agreement with Distributor Bill for some of his live recordings. However, Bill only offers hosting services and does not provide content protection services. Peter applies a simple content protection tool to his songs before uploading them to Bill’s site. Peter also packages timed lyrics and photos as “extras”, and protects them as well.

Consumer acquires content

Consumer Charlene sees Peter live at a local pub and approaches him about recordings. He hands her a card with a URL to Bill’s site where she can download some of his songs for free (these are sometimes known as “teasers”.)

Charlene sits in front of her PC at home and downloads the songs from Bill’s website. She uses a player compliant with the Music Player MAF, 2nd edition, which parses the file and identifies and applies the content protection tool(s) necessary to play the song. If the player does not have the protection tool that Peter selected, the player can automatically download the tool for a seamless consumer experience.

Charlene likes what she hears, but Peter has not enabled copy to portable players in these free teasers. She connects to her favourite music download site, run by Annie, and finds songs by Peter for sale. Charlene charges her account and downloads the files. She listens to the songs and then transfers them to her portable music player.

The portable music player also is compliant with the Music Player MAF and can process the files downloaded from Annie’s site. The player already supports all the necessary protection tools and can parse the file and play the music payload.

2.1.3     List of technologies

Up to and including its second edition, the following technologies have been identified as necessary for the Music Player MAF:

·         All technologies needed for ISO/IEC 23000-2, Music Player Application Format

·         Access Unit header (entry point for decryption) as defined in ISMACryp and OMA2.0 DCF to enable random access to encrypted content

·         AES128 CTR encryption tool (default encryption)

·         ProtectionSchemeInfoBox (‘sinf’) of the ISO Base Media File Format 2nd Edition. The ‘sinf’ box is added to the sample description ‘stsd’ to protect the audio track and/or added to the ‘ipro’ box to protect the meta-data. The ‘sinf’ box signals the protection format.

·         Protection format according to the ISMACryp 1.1 specification.

·         MPEG-21 IPMP Components Base Profile.

·         MPEG-21 REL MAM Profile.
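The random access mentioned in the list above follows from how counter mode works: the keystream for any 16-byte block of the content depends only on the key and on that block's counter value, so decryption can begin at an entry point without processing the preceding data. The following Python sketch illustrates this arithmetic only. The function name `ctr_position` and the assumption that the counter is a 128-bit big-endian integer incremented once per block are made for this example; the actual IV and counter layout is defined by the ISMACryp specification and is not reproduced here.

```python
AES_BLOCK_SIZE = 16  # bytes; AES always operates on 128-bit blocks


def ctr_position(initial_counter: bytes, byte_offset: int) -> tuple:
    """Return (counter_block, skip) for starting decryption at byte_offset.

    counter_block is the counter value of the block containing byte_offset;
    skip is the number of keystream bytes to discard inside that block.
    Assumption: the counter is a 128-bit big-endian integer that increases
    by one per block and wraps around at 2**128.
    """
    if len(initial_counter) != AES_BLOCK_SIZE:
        raise ValueError("initial counter must be 16 bytes")
    if byte_offset < 0:
        raise ValueError("byte offset must be non-negative")
    block_index, skip = divmod(byte_offset, AES_BLOCK_SIZE)
    counter = (int.from_bytes(initial_counter, "big") + block_index) % (1 << 128)
    return counter.to_bytes(AES_BLOCK_SIZE, "big"), skip


# Example: seek to byte 35 of a stream whose initial counter is all zeros.
counter_block, skip = ctr_position(bytes(16), 35)
print(counter_block.hex(), skip)
```

The returned counter block would be passed to an AES-128 implementation to produce the keystream block; the first `skip` keystream bytes are discarded before the remainder is XOR-ed with the ciphertext.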

2.1.4     Conformance Points

There are three conformance points that can be taken by applications and/or devices to implement Music Player MAF.

·         Unprotected music player.

·         Protected music player with default protection.

·         Protected music player with flexible protection.

2.2    Photo-Player Application Format

The Photo-Player MAF enables a new experience in creating, sharing and viewing digital photo albums. It specifies a simple and uniform way to carry JPEG images and their associated MPEG-7 metadata in an MPEG-4 file, providing an interoperable solution for digital photo library applications.

The supported metadata include image-acquisition parameters (such as date, time and camera settings), as well as MPEG-7 visual content descriptions. This allows conforming devices to support new, content-enhanced functionality, such as intelligent browsing, content-based search or automatic categorization.

MPEG Photo-Player makes it possible to:

·         Organize photos into categories based on their content – for example, the people and places depicted, or events taking place in photos.

·         Perform advanced content-based searches through collections of photos.

·         Bundle photos, and favourite ways to show them, into a single MPEG-4 file.

·         Link to external resources, such as other images (outside the MPEG-4 file) or web-pages.

The major components of the Photo-Player standard are:

·         A method to encapsulate a set of JPEG compressed images in an MPEG-4 file

·         Concise subsets of MPEG-7 metadata, to describe the individual images and the overall collection

·         A method to embed this binary MPEG-7 metadata in the MPEG-4 file

Digital Cameras are one of the most popular consumer digital devices, with estimates showing over 400 million units sold worldwide (2005). The sales of digital cameras overtook the sales of their analogue counterparts in Japan, US and Europe. JPEG is the common data format used in the great majority of the cameras.

MPEG has developed the MPEG-7 standard, which defines rich meta-data descriptions for audio-visual applications. MPEG also provides the ISO/IEC Media File Format (ISO FF) as a storage container. The Photo Player MAF also takes advantage of the widely supported EXIF meta-data since it has the basic information about picture creation.

As such, Photo Player MAF is an ideal tool to enhance the digital photo library experience.

2.3    Musical Slide Show Application Format

The existing Music Player MAF was designed as a simple format for enhanced MP3 players. It contains MP3 audio data, optional MPEG-7 meta-data and a JPEG still image for cover art. The Photo Player MAF under development combines JPEG still images with MPEG-7 meta-data.

The Musical Slide Show MAF builds on the Music Player MAF and the Photo Player MAF and is meant as a superset of these two MAFs.

The following sections describe the application scenarios and derived requirements for a “Musical Slide Show MAF” format.

2.3.1     Application Scenarios

Foreign language exercise materials

Listening to audio content on MP3 players is popular. Sometimes visual content can provide more detailed information. For example, students learn and practice foreign languages with lecture materials in MP3 audio on their pocket players. The lecture contents include daily conversation, foreign language cartoons, and problem exercises. A user can see images that show the positions of the tongue in the vocal tract to help with the precise pronunciation of certain words. The dialogue audio can also be played together with pictures that describe the dialogue situation. After the dialogue, questions are presented on the screen as text overlaid on top of images.

Figure 2 – Exercising a foreign language with extended Music Player MAF contents

Photo-Music album applications

People can better enjoy music by listening to songs while viewing music album images on the display of their multimedia devices. The music album pictures of an artist can be synchronized to MP3 song tracks. The lyrics can also be displayed in synchronization with the music, so that people can sing along while reading them. This enhances the MP3 experience with improved functionality.

Figure 2 – A photo-music album application

Storytelling application 

An extended Music Player MAF finds very interesting applications in storytelling with multiple JPEG images and text data synchronized to MP3. For example, the stories in Aesop's Fables can be better understood with associated pictures and text synchronized to MP3 audio tracks. This is a clear benefit over the current Music Player MAF, which does not allow for text and allows only a single thumbnail image as metadata.

Figure 3 – A story telling application

Personal Slide Show application

Various single still images are used as a source in the personal slideshow application scenario. The user arranges the images on a timeline and generates the timing information for the presentation of the still images.

The images may be rescaled in size to optimize the resolution for the target output device: higher resolutions for presentations optimized for computer displays, lower resolutions optimized for mobile devices. Either no transitions between the images are used, or transition effects are applied at the presentation time.

An audio soundtrack and a text track (for sub-titles, annotations or intro text) are used to enrich the presentation. If meta-data describing the images is available, it is stored so that the relationship of the meta-data and the image is maintained.

The timed and optionally rescaled images, the meta-data, the audio soundtrack and the timed text are stored to a file.

Simple Karaoke application

Mary’s favourite song is “Blue” by Leann Rimes. She wishes to listen, as well as sing along to her favourite song. Mary uses the karaoke function (= the device is removing the original voice from the music as a post-processing step) on her mobile phone. She performs the following:

·         She chooses one of Leann Rimes’ slideshows with animation effects

·         She hears “Blue” without the voice of Leann Rimes

·         She sees the lyrics

·         She reads the lyrics and sings along to the music

Slide Show+Karaoke application

Now, she wishes to listen, as well as sing along to her favourite song while viewing her images. Mary performs the following:

·         She visits the mobile service provider’s website

·         She chooses “Blue” from the service provider’s music selection

·         She uploads her vacation photos

·         She chooses the animation effects for the photos

·         She synchronizes the photos to the music using a web-based tool provided by the service provider

·         The service provider stores the synchronized photos in the file together with the music track and the lyrics of the song

·         She downloads the file to her mobile phone

After the download process, Mary plays the MAF file on her mobile phone. She performs the following:

·         She listens to “Blue” by Leann Rimes

·         She watches her slideshow with animation effects

·         She sees the lyrics

·         She reads the lyrics and sings along to the music

2.3.2     Derived requirements

·         The Musical Slide Show MAF should support multiple, timed still images.

·         The Musical Slide Show MAF should support audio.

·         The Musical Slide Show MAF should support timed text data.

·         The Musical Slide Show MAF should support synchronization of multiple images, audio and text.

·         The Musical Slide Show MAF should support backward compatibility to the Music Player MAF and the Photo Player MAF under development.

·         The album functionality (= collection of several “Musical Slide Show MAFs” in one single file) of the Music Player MAF should be supported.

·         The Musical Slide Show MAF should support simple animation technology (like filtering and transition effects) with low computational complexity.
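The synchronization requirement above amounts to placing images and text on a common timeline driven by the audio. The following Python sketch shows one way a player could look up what to present at a given playback time. The names `TimedItem` and `active_item`, the example file names, and the rule that an item stays active until the next one starts are assumptions made for illustration; the MAF itself stores this timing in file format tracks, not in Python structures.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TimedItem:
    start: float   # presentation time in seconds on the audio timeline
    payload: str   # e.g. an image file name or a line of text


def active_item(track: List[TimedItem], t: float) -> Optional[TimedItem]:
    """Return the item presented at time t, or None before the first item.

    Assumes the track is sorted by start time and that each item remains
    active until the next one starts.
    """
    starts = [item.start for item in track]
    i = bisect_right(starts, t)
    return track[i - 1] if i > 0 else None


images = [TimedItem(0.0, "beach.jpg"), TimedItem(4.0, "sunset.jpg")]
lyrics = [TimedItem(1.0, "first line"), TimedItem(5.5, "second line")]

for t in (0.5, 4.0, 6.0):
    image = active_item(images, t)
    text = active_item(lyrics, t)
    print(t, image.payload if image else None, text.payload if text else None)
```

Keeping images and text as separate timed tracks mirrors the requirement that each can be synchronized to the audio independently.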

2.3.3     List of technologies

·         ISO Base Media and MPEG-4 File Format

·         JPEG ISO Standard

·         MPEG-7 Meta-data (aligned to Music Player MAF and Photo Player MAF)

·         MPEG-1/-2 Layer 3 (“mp3”) audio coding

·         MPEG-4 Part-17 “Streaming Text Format” for timed text

·         MPEG-21 File Format and DID (aligned to Music Player MAF)

·         MPEG-4 Part-20 “LASeR” Mini Profile

2.3.4     Comparison with other MAFs

Music Player MAF

The Music Player MAF is an audio application that provides the file format for combining MP3 audio, MPEG-7 metadata, and an optional JPEG image. The two MAFs are similar in that both use MP3. However, the Musical Slide Show MAF contains multiple JPEG images, and its focus is on synchronized image slide show presentation with animation effects.

Photo Player MAF

The Photo Player MAF defines the specification for a file format that carries JPEG images with MPEG-7 Visual and MDS metadata. Both MAFs use multiple images. However, in the Musical Slide Show MAF the images must be synchronized, and MP3 audio is supported.

Music Player MAF, 2nd edition

The Music Player MAF, 2nd edition features the same functionalities as the Music Player MAF with the addition of content protection tools.

Professional Archival MAF

The Professional Archival MAF is an application that focuses on handling of audio files in a single archived file. The target domains of the two MAFs are inherently different.

Media Streaming MAF

The Media Streaming MAF standardizes streaming video content and protocols. The target domains of the two MAFs are inherently different.

Open Access MAF

The Open Access MAF is an application that focuses on publication and delivery of content that is governed in a “light-weight” form. The target domains of the two MAFs are inherently different.

Portable Video MAF

The Portable Video MAF is an application that focuses on local playback of prepackaged video content. The target domains of the two MAFs are inherently different.

Digital Video/Cinema MAF

The Digital Video/Cinema MAF is an application that concentrates on the distribution of high-resolution video content to professional users with emphasis on color management information. The target domains of the two MAFs are inherently different.

Video Surveillance MAF

The Video Surveillance MAF is an application that is specifically targeted towards the application domain of surveillance. The two MAFs are similar in that both involve multiple images. However, the target domain of surveillance has little relevance to the Musical Slide Show MAF.

2.3.5     Issues

It is preferable that the audio track and the synchronized JPEG images can alternatively be used without LASeR scene description (losing the transition effects). It should be investigated if the synchronization of audio and the JPEG images can be aligned with the time line of the LASeR scene description.

If all the JPEG images are stored in a single track, it should be investigated how the JPEG images can be referenced by the LASeR scene description.

The proponents of the Musical Slide Show MAF believe that the LASeR mini profile provides the technology to support the required functionalities for animation and scene representation. However, it should be investigated whether this profile needs to be further limited to an even simpler subset in order to accommodate low complexity requirements.

2.3.6     Supporting Companies and Organizations

·         Information and Communications University (ICU)

·         LG Electronics

·         Samsung Electronics

·         Fraunhofer IIS

2.4    Media Streaming Application Format

The Media Streaming MAF specifies how to use specific MPEG technologies to build a full-fledged media player for the streaming of governed and ungoverned content. This standard references the data formats exchanged between a number of devices in a media streaming scenario: a Content Provider Device, a License Provider Device, an IPMP Tool Provider Device, a Domain Management Device and a Media Streaming Player. In the most general case a Media Streaming Player obtains streaming content from a Content Provider Device using a Content Access Protocol. In order to use that content, a Media Streaming Player obtains a license from a License Provider Device using a License Access Protocol. Further, to actually process the content, a Media Streaming Player may need to obtain the appropriate IPMP Tools from an IPMP Tool Provider Device using an IPMP Tool Access Protocol, as shown in the figure below.

2.4.1     Application Scenario Description

This MAF is aimed at applications involving the distribution of governed and ungoverned media resources, metadata and related information over streaming channels to Media Streaming Players, possibly members of a domain. In a domain of Media Streaming Players, content can be securely distributed once stored in a file. Typical examples of such applications are IPTV, digital broadcasting without a return channel, Pay TV and internet television. A more detailed list is given below.

·         Public broadcasting in the digital world

Carrying over the public broadcasting service from the analogue to the digital world entails a number of adaptations that may be implemented using the proposed MS MAF standard. One way to achieve this is illustrated by the following fictitious case.

Daphnis lives on the island of Lesbos, where the local broadcaster GreenRadioTelevision (GRT) has implemented the following form of Digital Terrestrial Television (DTT):

1.      GRT broadcasts digital television in clear text

2.      As long as Daphnis only watches GRT programs, Daphnis’ receiving device behaves like a regular DTT set top box

3.      When Daphnis wants to record a program, the receiving device requests an REL license from GRT

a.       The license gives Daphnis the right to store and subsequently watch and distribute the stored program on the island of Lesbos

b.      The license is stored as part of a Digital Item in the receiving device

c.       The actual resources are encrypted and stored together with the Digital Item.

4.      When Daphnis uses the stored program, the receiving device performs Daphnis’ commands in accordance with the license.
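The last step of the scenario above, where the receiving device acts only within the bounds of the license, can be sketched as a simple check. The Python below is a hypothetical stand-in, not MPEG-21 REL syntax: the class `License`, the function `is_permitted`, the action names, and the comparison location "elsewhere" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    """Hypothetical stand-in for a rights expression; not MPEG-21 REL."""
    rights: frozenset   # actions the license grants
    territory: str      # place where the grant applies


def is_permitted(license_: License, action: str, location: str) -> bool:
    """Allow an action only if it is granted and performed in the territory."""
    return action in license_.rights and location == license_.territory


# License modelled on the scenario: store, watch and distribute on Lesbos.
grt_license = License(rights=frozenset({"store", "watch", "distribute"}),
                      territory="Lesbos")

print(is_permitted(grt_license, "watch", "Lesbos"))
print(is_permitted(grt_license, "distribute", "elsewhere"))
print(is_permitted(grt_license, "modify", "Lesbos"))
```

A real REL license carries richer conditions (for example validity periods) and is evaluated by a conformant REL processor rather than by a string comparison.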

·         Commercial broadcasting in the digital world

The proposed MS MAF standard can be used to address issues of a nature comparable to the case above that exist for commercial television. The following fictitious case illustrates how.

GreenTVToday (GTT) is a commercial broadcaster operating on the island of Lesbos. Each TV program broadcast by GTT is accompanied by a Digital Item transmitted using the Digital Item Streaming technology. The Digital Item contains a license giving the citizens of the island of Lesbos the right to use the content received using their set top boxes. When Chloe, a citizen of the island of Lesbos, receives a TV program, she is immediately informed of her rights, e.g. that she may watch and store the program and share it with other inhabitants of the island of Lesbos for one week. However, she may also acquire more rights from GTT by requesting an extended license.

·         Internet streaming via Multicast

The same features described above for broadcasting over MPEG-2 TS can be obtained from a broadcasting system implemented using the IP Multicast technology.

·         Pay TV

Pay TV systems have been deployed in many countries mostly based on a business model whereby the set top box is proprietary because the content protection technology has been licensed by the Conditional Access System (CAS) provider to the pay TV operator. This model achieves the goal of allowing a tight control of the pay TV service subscribers but carries the high price of set top box manufacturing and distribution.

The proposed MS MAF standard can be used to support a different model, i.e. one where the set top box can still be used as a means to control the subscriber base but can tap a horizontal market of set top box manufacturers, thereby achieving lower device prices.

·         IPTV

The deployment of IPTV services is being considered by many Telco and CATV operators and trialled by some. The proposed MS MAF standard can be used to support the implementation of IPTV services in a way that can decrease some cost elements of the total system. One such cost element is the receiving devices, but some components of the back office (e.g. subscriber management systems, play-out centers, etc.) can also be positively affected.

·         Internet Television

Internet Television is being considered by some as the mirror of today’s model of web surfing applied to video content. An Internet Television user would be able to surf the web and enjoy paid content that is protected in a way that allows its consumption by “internet set top boxes” that can be purchased from the open market and therefore enable new forms of pay audio and video services over the web.

·         Storage and subsequent play of content in a domain

The ability to use content received from a plurality of sources on multiple devices belonging to a home or to its members is a feature that is expected to stimulate new forms of content consumption. The proposed MS MAF standard can be used to support a wide variety of models for domain management.

2.4.2     Requirements

The requirements for the Media Streaming MAF are preceded by the definition of the terms used within the requirements definition.

2.4.2.1       Nomenclature

1.      Media Streaming Content is digital content produced according to the MS MAF standard, including

a.       Resources

b.      Identifiers

c.       Metadata

d.      Licenses

e.       IPMP information

f.       IPMP Tools

g.      Decryption Keys

h.      Digital Signature and Hash values

i.        Device Information

2.      Use of Media Streaming Content includes

a.       Decoding

b.      Presentation

c.       Storage

3.      Media Streaming Player is a device capable of using Media Streaming Content as specified by the rights holder of the content

4.      Domain: a set of Devices sharing some common attributes, such as personal or group ownership that is appropriate for various business models

5.      Media Streaming Protocol is a protocol used by a Media Streaming Player to exchange data with other devices, namely

a.       Content Provider Device, a device capable of interacting with a Media Streaming Player to provide Media Streaming Content

b.      License Provider Device, a device capable of interacting with a Media Streaming Player to provide Licenses

c.       IPMP Tool Provider Device, a device capable of interacting with a Media Streaming Player to provide IPMP Tools

d.      Domain Management Device, a device capable of managing various functions needed for a proper functioning of a domain, e.g.

        i.      Create domain

        ii.     Renew domain

        iii.    Delete domain

        iv.     Add device to domain

        v.      Remove device from domain

6.      Media Streaming Device (MSD): any of the following devices

a.       Media Streaming Player

b.      Content Provider Device

c.       License Provider Device

d.      IPMP Tool Provider Device

e.       Domain Management Device

7.      Media Streaming Content Interoperability: the capability of Media Streaming Content to be used by a Media Streaming Device as specified by the MS MAF standard

8.      Media Streaming Device Interoperability: the capability of Media Streaming Devices to process the data from other Media Streaming Devices as specified by the MS MAF standard

2.4.2.2       General

1.      The MS MAF shall standardize

a.       The Media Streaming Protocols

b.      The Media Streaming Content including its component elements

2.      The MS MAF standard shall enable making Media Streaming Content and Devices so that full interoperability is achieved

3.      The format shall be flexible and usable to implement a variety of business models (e.g. those in some of the scenarios described in M13516)

4.      The standard may need to be structured in profiles targeted to application areas/business models providing for a degree of interoperability between profiles, e.g. in case of hierarchical profiles

5.      The standard shall support the use of a broad range of security technologies, including proprietary ones, without hampering interoperability

6.      The standard shall support the use of content on

a.       A single Media Streaming Player

b.      A Media Streaming Player belonging to a domain

7.      Use of Media Streaming Content is governed by the associated license

8.      Media Streaming Content can be moved or copied between Media Streaming Players as per license terms

9.      The standard shall support the inclusion of Media Streaming Device information used for

a.       Identification

b.      Registration

c.       Acceptance

d.      Revocation

e.       Authentication

10.  The standard shall enable services that employ Media Streaming Players without return channel

11.  The standard shall support incorporation of external “system specifications” matched to support requirements that are specific to application domains and/or world regions

2.4.2.3       Content

1.      Rights

a.       The standard shall support storage of Media Streaming Content for

        i.      later play on the same Media Streaming Player

        ii.     transfer to another Media Streaming Player

        iii.    later play on another Media Streaming Player

b.      The standard shall offer different users (rights holders, service providers etc.) the ability to flexibly determine how the content will be used, e.g.

        i.      Copying Media Streaming Content

        ii.     Moving Media Streaming Content to another Media Streaming Player

        iii.    Modifying Media Streaming Content

2.      Audio and video

a.       The standard shall support the most common video and audio compression formats

b.      The standard shall enable the use of non MS MAF native video and audio compression formats

3.      Metadata

a.       The standard shall support metadata that are most friendly to media streaming

b.      The standard shall support the inclusion of additional application-specific metadata

4.      Interaction protocols

a.       The standard shall define the protocols for a Media Streaming Player to obtain

        i.      Media Streaming Content from a Content Provider Device

        ii.     License from a License Provider Device

        iii.    DRM Tool from a DRM Tool Provider Device

b.      The standard shall define the protocols for a Media Streaming Player (an Administrator) to

        i.      Create a domain

        ii.     Renew a domain

        iii.    Delete a domain

c.       The standard shall define the protocols for a Media Streaming Player to

        i.      Add itself to a domain

        ii.     Renew membership of itself to a domain

        iii.    Remove itself from a domain

5.      Transport protocols

a.       The standard shall support the transport of Media Streaming Content via

        i.      File

        ii.     Stream

2.4.2.4       Interaction Protocols

1.      The standard shall define the protocols for a Media Streaming Player to obtain

a.       Content from a Content Provider Device

b.      Licenses from a License Provider Device

c.       IPMP Tools from an IPMP Tool Provider Device

d.      Decryption Keys from a Content Provider Device

2.4.2.5       Domains of Media Streaming Players

1.      The solution shall provide the means to identify Users, Devices and Domains

2.      The standard shall define the protocols for a Media Streaming Player (an Administrator) to

a.       Create a domain

b.      Renew a domain

c.       Delete a domain

3.      The standard shall define the protocols for a Media Streaming Player to

a.       Add itself to a domain

b.      Renew membership of itself to a domain

c.       Remove itself from a domain

4.      The solution shall allow a Domain Administrator to revoke the Membership of a device or a user in the Domain

5.      The solution shall allow a Domain Administrator to revoke a Domain
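The domain requirements above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the `Domain` class, its method names and the use of plain string identifiers are hypothetical and are not taken from the MS MAF specification, which defines protocols rather than an object model. Domain renewal is left out for brevity.

```python
class DomainError(Exception):
    """Raised when a domain operation is not allowed."""


class Domain:
    """Hypothetical in-memory model of a domain of Media Streaming Players."""

    def __init__(self, domain_id: str, administrator: str):
        self.domain_id = domain_id
        self.administrator = administrator  # identifier of the administrator
        self.members = set()                # identifiers of member devices
        self.revoked = False

    def _require_admin(self, requester: str) -> None:
        if requester != self.administrator:
            raise DomainError("only the Domain Administrator may do this")

    def add_device(self, device_id: str) -> None:
        """A player adds itself to the domain."""
        if self.revoked:
            raise DomainError("domain has been revoked")
        self.members.add(device_id)

    def remove_device(self, device_id: str) -> None:
        """A player removes itself from the domain."""
        self.members.discard(device_id)

    def revoke_membership(self, requester: str, device_id: str) -> None:
        """The administrator revokes one device's membership."""
        self._require_admin(requester)
        self.members.discard(device_id)

    def revoke(self, requester: str) -> None:
        """The administrator revokes the whole domain."""
        self._require_admin(requester)
        self.revoked = True
        self.members.clear()

    def is_member(self, device_id: str) -> bool:
        return not self.revoked and device_id in self.members
```

In a real deployment these operations would be carried out through protocol exchanges with a Domain Management Device rather than direct method calls.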

2.4.2.6       Transport protocols

   The standard shall support the transport of Media Streaming Content via

a.       File

b.      Stream

2.4.2.7       Device information

1.      MPEG-21 IPMP Components shall support expression of information that enables applications to choose an appropriate tool implementation.

The solution shall support the signaling of the IPMP Capabilities of a device that are required to select an appropriate IPMP Tool. A minimum set is the following:

a.       Operating System

        i.      Vendor

        ii.     Model

        iii.    Version

        iv.     Virtual Machine

b.      CPU

        i.      Vendor

        ii.     Model

        iii.    Size

        iv.     Minimum speed

c.       Memory

        i.      Vendor

        ii.     Model

        iii.    Minimum speed

d.      Assistant Hardware

        i.      Smartcard

        ii.     HardKey

e.       Network

f.       RPCMechanism

g.      IPMP Tool Instantiation API

h.      IPMP Tool Messaging API

2.      The solution shall enable a Media Streaming player to request an appropriate IPMP Tool from other Media Streaming players, based on the low level software and hardware characteristics of the requesting Media Streaming player.

2.4.2.8       IPMP Tool package information

The solution shall enable the signaling of the type of package in which the code of the IPMP Tool is delivered in the DIDL (e.g. "application/zip", "application/java-archive", etc.).
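How signalled device information and package types could drive the choice of an IPMP Tool implementation is sketched below. All class, field and function names are hypothetical and cover only a few of the capability items listed in 2.4.2.7; the actual signaling is expressed with MPEG-21 IPMP Components, not with these structures.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DeviceInfo:
    """Hypothetical subset of the signalled IPMP Capabilities of a device."""
    os_model: str
    os_version: str
    virtual_machine: Optional[str]  # None if no virtual machine is present
    has_smartcard: bool


@dataclass(frozen=True)
class ToolImplementation:
    """Hypothetical description of one packaged IPMP Tool implementation."""
    tool_id: str
    required_os_model: Optional[str]  # None means "any operating system"
    required_vm: Optional[str]        # None means "no virtual machine needed"
    needs_smartcard: bool
    package_type: str                 # e.g. "application/zip"


def select_tool(device: DeviceInfo,
                candidates: Iterable[ToolImplementation]
                ) -> Optional[ToolImplementation]:
    """Return the first candidate the device can run, or None."""
    for tool in candidates:
        if tool.required_os_model not in (None, device.os_model):
            continue
        if tool.required_vm not in (None, device.virtual_machine):
            continue
        if tool.needs_smartcard and not device.has_smartcard:
            continue
        return tool
    return None
```

A provider could evaluate `select_tool` against the capabilities sent by the requesting player and deliver the matching package together with its `package_type`.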

2.4.2.9       IPMP Messages

The solution shall support interoperability across IPMP Tools and Media Streaming players.

a.       The solution shall provide interoperability in the communication between the Media Streaming player and IPMP Tools from different providers, and between two IPMP Tools

b.      A minimum set of interoperable operations must cover the following scenarios:

        i.      Authentication between Tools and the MS player and among Tools

        ii.     Instantiation of IPMP Tools

        iii.    IPMP Tool Connection and Disconnection

        iv.     Notification of events to IPMP Tools/player

        v.      Delivery of information to/from IPMP Tools (e.g. keys, licenses, watermarking information, etc.)

        vi.     User Interaction Messages

2.4.2.10   Tool Agent

1.      The solution shall allow the separation between the IPMP algorithms performed by an IPMP Tool (e.g. encryption, watermarking, etc.) and the logic enabling the IPMP Tool to communicate with the other Tools or the Media Player (e.g. the messaging API).

It shall be possible to separate the IPMP algorithm's API from the (standard) IPMP Tool's API.

2.      The solution shall allow the instantiation of IPMP Tools in the various control points exposed by the Media Streaming player without requiring the signaling of which IPMP Tool shall be instantiated in each control point.

2.4.3     List of technologies

1.      The Media Streaming MAF contains elements drawn from existing MPEG standards, namely

a.       MPEG-2 and MPEG-4 IPMP-X

b.      MPEG-21 DID

c.       MPEG-21 DII

d.      MPEG-21 IPMP Components

e.       MPEG-21 REL

f.       MPEG-21 File Format and its reference ISO Base Media File Format

g.      MPEG-21 Binary Format

h.      MPEG-21 DIS

2.      The Media Streaming MAF requires the development of appropriate profiles of MPEG standards suitable for the MS MAF, namely

a.       MPEG-21 DID

b.      MPEG-21 IPMP Components

3.      The Media Streaming MAF requires the extension of existing MPEG standards for the purpose of providing the necessary Media Streaming functionalities, for instance

a.       The payload of IPMP-X Messages expressed as XML, for reasons of compatibility with the rest of the MPEG-21 standards

b.      The IPMP Tool Pack technology that provides additional functionalities to those already provided by the MPEG-2/-4 IPMP-X standards to handle complete IPMP systems

4.      The Media Streaming MAF requires the development of new MPEG standard technology to provide the necessary Media Streaming MAF functionality, for instance Management of Domains made up of different types of Devices
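To make the idea of an XML-expressed message payload in item 3.a concrete, the following sketch builds and parses a small XML message with Python's standard library. The element and attribute names (`IPMPMessage`, `Sender`, `Recipient`, `Payload`, `type`) are invented for illustration and are not the schema defined by the MS MAF or by IPMP-X.

```python
import xml.etree.ElementTree as ET


def build_message(msg_type: str, sender: str, recipient: str,
                  payload: str) -> str:
    """Serialize a hypothetical IPMP message as an XML string."""
    root = ET.Element("IPMPMessage", {"type": msg_type})
    ET.SubElement(root, "Sender").text = sender
    ET.SubElement(root, "Recipient").text = recipient
    ET.SubElement(root, "Payload").text = payload
    return ET.tostring(root, encoding="unicode")


def parse_message(xml_text: str) -> dict:
    """Parse the hypothetical XML message back into a dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "type": root.get("type"),
        "sender": root.findtext("Sender"),
        "recipient": root.findtext("Recipient"),
        "payload": root.findtext("Payload"),
    }
```

Carrying the payload as XML lets the same parsing tools be used for messages as for the other MPEG-21 documents (DID, REL, IPMP Components) that the player already has to process.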

2.4.4     Comparison with other MAFs

·         Music Player MAF

The Music Player Application Format specifies the file format for the combination of MP3 files, a subset of MPEG-7 MDS metadata corresponding to ID3 metadata and JPEG pictures.

As the proposed MS MAF requires a file format and this is based on the same technologies, there is a degree of relationship. However, it is believed that the two MAFs can be considered as largely independent.

 

·         Photo player MAF

The Photo Player MAF specifies the file format for the combination of JPEG files and a subset of MPEG-7 Visual and MDS metadata.

As the proposed MS MAF requires a file format and this is based on the same technologies, there is a degree of relationship. However, it is believed that the two MAFs can be considered as largely independent.

 

·         Music Player MAF, 2nd edition

The general goal of the Music Player MAF, 2nd edition seems to be to add a level of governance information to the Music Player MAF.

As the MS MAF requires governance information there is potentially a relationship for

1.      The definition of the Digital Item used by the Music Player MAF, 2nd edition

2.      The IPMP Component Profile used by the Music Player MAF, 2nd edition

However, it should be borne in mind that this MAF and the proposed MS MAF address very different application domains.

 

·         Musical Slide Show MAF

The Musical Slide Show MAF is expected to be a superset of the Music Player and the Photo Player MAFs. Additional technologies to be added are streaming text and the LASeR Mini Profile.

Therefore this MAF can be considered as largely independent from the proposed MS MAF.

 

·         Portable Video MAF

The Portable Video MAF can be considered as the Video equivalent of the Music Player MAF.

As this MAF will require the selection of audio and video coding, metadata for the video program and the file format to store it, there is potentially a relationship with the proposed MS MAF. However, this MAF seems to address the area of devices that are used to play video content acquired as a file from a web site and therefore is driven by different considerations than the MS MAF.

 

·         Video Surveillance MAF

The Video Surveillance MAF is targeted to the specific application domain of surveillance.

It cannot be excluded that there will be some technologies required for this MAF that are also required for the proposed MS MAF. However, the application-specific nature of this MAF makes the relationship with the proposed MS MAF rather thin.

 

·         Open Access MAF

The Open Access MAF addresses the publication and delivery of governed but unprotected content.

The proposed MS MAF can be considered as a possible vehicle for the distribution of Open Access Content. However, at this stage the relationship appears to be rather thin.

 

·         Professional Archival MAF

The Professional Archival MAF seems to be targeted to the specific application domain of professional archival of audio and possibly video information.

There may be some technologies required for this MAF that are also required for the proposed MS MAF. However, the application-specific nature of this MAF makes the relationship with the proposed MS MAF rather thin.

 

·         Digital Video/Cinema MAF

The Digital Video/Cinema MAF addresses the high range of video distribution to professional users. One example is the use of metadata for colour management of rendering devices typically in movie theatres.

There may be some technologies required for this MAF that are also required for the proposed MS MAF. However, the application-specific nature of this MAF makes the relationship with the proposed MS MAF rather thin.

2.4.5     Issues

The Media Streaming MAF requires referencing a non-MPEG standard, for Metadata: TV Anytime Metadata (ETSI TS 102 822-5 v1.1.1)

2.4.6     Supporting Companies and Organizations

The following list of companies supports the development of the Media Streaming MAF:

·         ADETTI/ISCTE

·         CEDEO.net

·         Enterprise of the Future, Inc.

·         ETRI

·         Panasonic

·         Peking University

·         SDAE

·         SpID

·         Telecom Italia

·         Telefónica

·         University of Tokyo

·         Victor Company of Japan, Ltd.

2.5    Open Access Application Format

The Open Access Application Format is a format to ease the exchange and promotion of open contents. It is designed for the cases where users own rights to a piece of content and have an interest in releasing it in such a way that other users can freely access it. However, the users do not want to make the content public domain. Users want to release a piece of content that is governed in a “light-weight” form. This type of release is called “Open Access” and the set of technologies that support it is called Open Access application format.

Examples of Open Access are publicity material and teasers. One important set of major potential applications is represented by the BBC’s Creative Archive project.

The Open Access MAF is a packaging format designed for the release and exchange of contents. It packages different contents into a single container file and provides a mechanism to attach metadata by using MPEG-7 and MPEG-21 technologies. The MPEG-21 REL is used to model the intentions of the license. MPEG-21 Event Reporting provides a feedback mechanism that can notify the author when a user wants to derive new content or extract an item from the container file.

2.5.1     Application Scenarios Description

Some of the application scenarios the Open Access MAF can be used for are the following:

·         Release of a creative work or other material in a single package to the public or to specific persons. Additional metadata can be easily attached.

·         Feedback to the author: The author can specify if he wants to get a feedback notification about the usage of his content.

·         License Management is supported by the machine-readable licenses. Licenses can be browsed and searched easily. Licenses can also be generated automatically.

·         Publishing of publicly funded research results.

·         Publishing of E-Learning material. This material can be published with an attached human- and/or machine-readable license.

·         Support for a variety of licenses by different organizations, e.g. the licenses provided by Creative Commons.

Figure 1 – Open Access scenario

2.5.2     List of technologies

To support this MAF the following set of MPEG technologies are needed:

·         ISO/IEC 21000-2, MPEG-21, Part 2, Digital Item Declaration

·         ISO/IEC 21000-3, MPEG-21, Part 3, Digital Item Identification

·         ISO/IEC 21000-3 Amd 1, MPEG-21, Part 3 Amendment 1, Related identifier types

·         ISO/IEC 21000-5 Amd 3, MPEG-21, Part 5 Amendment 3, Open Access Content profile

·         ISO/IEC 21000-9, MPEG-21, Part 9, File Format

·         ISO/IEC 21000-15, MPEG-21, Part 15, Event Reporting

·         ISO/IEC 15938-5, MPEG-7, Part 5, Multimedia Description Schemes

 

2.6    Digital Multimedia Broadcasting Application Format

2.6.1     Application Scenario Description

Digital Multimedia Broadcasting (DMB) is the first global mobile TV service based on a digital radio transmission system. DMB provides people with crystal-clear video, theatre-quality audio and other data services on the move, via in-vehicle terminals or hand-held devices such as mobile phones, so that information can be acquired and consumed anywhere.

Due to the availability of multiple DMB broadcasting services, DMB users have many mobile TV programs available on their terminals and they may want to watch their preferred DMB program contents. However, it is not easy for users to consume the DMB programs at a time convenient to them. Therefore, it must be possible to store DMB program contents and play them back at any time. The stored DMB contents are also expected to be exchanged between different DMB terminals. Thus, a standardized DMB file format needs to be specified to guarantee interoperability across DMB terminals from different vendors.

The DMB Multimedia Application Format (MAF) specifies how to combine the variety of DMB contents with associated information for a presentation in a well-defined format that facilitates interchange, management, editing, and presentation of the DMB contents.

Figure 1: A DMB Service Architecture

2.6.1.1       Storage of Mobile Broadcasting Contents

A major application area of DMB MAF is the storage of mobile broadcasting contents. In the DMB services, the DMB-HE (Head End) broadcasts the digital contents through satellite and terrestrial DMB networks. A user can watch the live contents through a DMB player and can store them in a unified form (DMB MAF file) for later use. He can also search and select some scheduled contents using EPG metadata and make a reservation for recording on a local or remote storage.

In addition to the main audio-visual contents, various contents can be acquired through DMB data services such as BWS (broadcasting website), slide show, and TTI (traffic and travel information), or can be signaled by metadata describing the locator (e.g. URI (uniform resource identifier)) of the contents. A user can not only consume the additional contents, he can also select and store them in a packaged form (with or without the main contents) for a next-time usage.

These stored DMB MAF contents can be managed and re-accessed by a DMB-MAF browser which can display the contained item list and description for each DMB MAF content. A user can play, edit, or export the DMB MAF contents on a file basis, an item basis, or a stream segment basis.

In DMB services, the multimedia contents can be governed or protected using protection or governance mechanisms such as IPMP. The stored DMB MAF content shall retain the governance of the originally transferred content.

2.6.1.2       IP Media Services

DMB MAF contents can also be stored to and/or acquired from network storage such as a PDR (personal digital recorder), an NDR (network digital recorder), or a DMB Portal. For example, a user can search and select some DMB contents on a DMB Portal, record them on network storage, and watch them through Internet streaming or just download them for later use.

DMB MAF contents can be created and played not only on DMB-receivable terminals but also on DMB terminals without an RF module, through IP media services and/or removable storage media.

2.6.1.3       Interchange between Terminals

DMB MAF contents can be stored to and interchanged through remote storage or removable storage media. For example, a user can transfer DMB MAF contents between DMB MAF terminals through a CD, a USB drive, or a memory card and consume them there. Some or all of the items contained in a DMB MAF content can be exported. A proper rights management and protection scheme should be involved in this case.

2.6.1.4       User-Creative DMB contents

If they are interpretable on DMB terminals, a user can freely add his own contents or data to an existing DMB MAF content. For example, he can add an image, verbal or textual comments, secondary audio or video, or graphical annotation, and can make some of them spatially and temporally synchronized with the existing contents.

A user can also add his usage history and bookmarks for later use. For example, if information describing which contents/items/segments have already been watched is stored with the contents, he can resume consumption from the point of last consumption at any time. Moreover, if bookmark information is stored with the contents, he can selectively consume the contents by just navigating his favorite bookmarked items or segments. The marked items or segments over multiple DMB MAF contents can be gathered and reformatted into a new MAF and can be shared with other users. For example, sports highlight content can be easily generated by this mechanism.
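The resume and highlight behaviour described above can be sketched as follows. The `UsageRecord` class and `build_highlights` function are hypothetical illustrations; the DMB MAF carries usage history and bookmarks as metadata in the file, not as Python objects, and the field names here are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class UsageRecord:
    """Hypothetical per-content usage history with bookmarks (in seconds)."""
    last_position_s: float = 0.0
    bookmarks: dict = field(default_factory=dict)  # label -> (start_s, end_s)

    def stop_at(self, position_s: float) -> None:
        """Remember where consumption stopped."""
        self.last_position_s = position_s

    def resume_position(self) -> float:
        """Position from which playback should continue."""
        return self.last_position_s

    def add_bookmark(self, label: str, start_s: float, end_s: float) -> None:
        if end_s <= start_s:
            raise ValueError("bookmark must have positive duration")
        self.bookmarks[label] = (start_s, end_s)


def build_highlights(records: dict) -> list:
    """Gather bookmarked segments from several contents.

    `records` maps a content identifier to its UsageRecord; the result is
    a list of (content_id, start_s, end_s) tuples in sorted order.
    """
    segments = []
    for content_id, record in records.items():
        for start_s, end_s in record.bookmarks.values():
            segments.append((content_id, start_s, end_s))
    return sorted(segments)
```

The list returned by `build_highlights` corresponds to the set of marked segments that would be gathered and reformatted into a new MAF file, for example a sports highlight reel.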

2.6.1.5       Metadata Applications with DMB MAF

Since there can be a large amount of DMB MAF content on the Internet, the search and selection process may need to be personalized by filtering or recommendation services. This kind of service requires the DMB MAF to include content description metadata and targeting information.

When the DMB video channel bandwidth is insufficient, EPG or ECG metadata should be transferred to each DMB terminal in alternative ways, e.g. through a DMB data channel or IP network. In this case, a unified and interoperable file format that can carry EPG or ECG information should be supported.

2.6.2     Requirements

2.6.2.1       General Requirements

The DMB MAF specification shall support:

·         The functionalities and components of T-DMB, S-DMB, and IP based DMB services.

·         Protection and governance of DMB MAF contents for creation, delivery, and consumption.

·         Interoperability among DMB MAF compliant services.

The DMB MAF compliant terminals shall support the following:

·         DMB MAF Player is a device or terminal capable of using DMB MAF Contents as specified by the rights holder of the content. If the player has the functionality to transfer internal digital contents to other DMB players, the protection information should remain persistently associated with the transferred protected contents and be reliably applied on the target players.

·         The DMB MAF terminal should provide the functionality of storing DMB contents in DMB MAF format.

·         Basic and enhanced functionalities: the basic functionality includes the main A/V data (IOD/OD, BIFS, H.264/AVC, MPEG-4 ER-BSAC, MPEG-2 AAC+, MPEG-4 HE-AAC); and the enhanced functionality may encompass the IPMP, metadata, and user data, etc.

2.6.2.2       File Format Requirements

For storing and transferring the DMB MAF contents, DMB MAF shall support the following functionalities:

·         The file format shall provide unified storage and interchange of audio-visual data and metadata.

o   The file format shall support the media data such as text, image, audio, video, and metadata. The media data can be obtained from DMB broadcast services or can be retrieved and downloaded through communication networks.

o   The file format shall support synchronization of multiple media.

o   The file format shall support editing functionality of the included contents on an item basis or a stream segment basis.

o   The file format shall support access to the whole or part of the contained data.

o   The file format shall allow for multiple DMB programs in a single DMB MAF file.

·         The file format shall provide the protection for audio-visual data and metadata

o   The file format shall provide the method to carry encrypted/governed resources (audio, video, metadata, digital item, etc.).

·         The file format shall support random access to the media data.

·         The file format shall allow for easy provisions for streaming and media adaptation.

·         The file format shall support the inclusion of private contents such as an image, verbal or textual comments, secondary audio or video, or graphical annotation, as long as they can be processed on DMB terminals.

·         The file format should provide metadata for fast search and preview without parsing all the details, e.g., title, keyword, genre, grade, simple descriptions for embedded digital items, representative image.

·         The file format should allow for easy access to the information about the playability of the file or individual media data, codec types, media data types, versions, etc.

·         The file format shall allow for DMB MAF player’s operations such as Play, Store, Edit, Copy, and Adapt.

2.6.2.3       Requirements for Metadata

The DMB MAF shall support the following functionalities with regard to metadata:

·         Identification and location resolution of contents

·         Description of video and audio resources and usage information:

o   Content description such as Title, Grade, Genre, Recorded time, Station, Broadcast schedule, URL, synopsis, etc.

o   Usage history and user profile

o   Segment metadata

o   ECG (Electronic Content Guide) and EPG (Electronic Program Guide)

·         Description of additional DMB data services

o   MOT (multimedia object transfer) slide show, BWS (broadcasting website), TTI (traffic and travel information), etc.

·         Description of related contents, consumption condition, and targeting information.

2.6.2.4       Requirements for Intellectual Property Management and Protection (IPMP)

The DMB MAF shall support the following functionalities with regard to IPMP:

·         Protection of audio, video and metadata

o   Carriage of protected resources

o   Detection of protection and/or governance

o   Granularity of IPMP signaling such as multi-level protection and partial/whole protection

·         Governed use of protected resource

o   Various usage rules over governed resources such as valid period, maximum usage count, etc.

o   Usage control over protected/governed resources

·         The standard methods for authentication of users/devices/domains/IPMP tools

·         Carriage of protection information such as Rights Expression and IPMP Information

·         Secure access to protection information

·         Renewability of IPMP capabilities and tools

o   Update or upgrade of protection mechanism.

·         Management of copyright of DMB MAF Contents

o   License association with DMB MAF

§  License within DMB MAF/License referenced within DMB MAF/License service referenced within DMB MAF

o   License access and retrieval

§  Location from which the applicable license may be retrieved

§  Method or process for acquiring the applicable license

o   Support for the standard handling process for digital contents without a license

·         IPMP Tools and capability management

·         Flexibility and extensibility of IPMP Tools

·         Management of private information such as usage history, user preference data, etc.
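The usage rules named above (valid period, maximum usage count) can be illustrated with a minimal sketch. The `UsageRule` class and `is_use_permitted` function are hypothetical names introduced here for illustration; they are not the MPEG-21 REL data model, which expresses such rules in XML licenses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class UsageRule:
    """Hypothetical usage rule; not the MPEG-21 REL data model."""
    valid_from: Optional[datetime] = None   # start of the valid period
    valid_until: Optional[datetime] = None  # end of the valid period
    max_uses: Optional[int] = None          # maximum usage count


def is_use_permitted(rule: UsageRule, now: datetime, uses_so_far: int) -> bool:
    """Return True if one more use is allowed under the rule."""
    if rule.valid_from is not None and now < rule.valid_from:
        return False
    if rule.valid_until is not None and now > rule.valid_until:
        return False
    if rule.max_uses is not None and uses_so_far >= rule.max_uses:
        return False
    return True
```

A rule with no fields set permits every use, which corresponds to ungoverned content in this simplified model.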

2.6.3     List of technologies

The DMB MAF will be based on the following technologies:

·         ISO Base Media File Format

·         DMB Video: MPEG-4 AVC Baseline Profile Level 1.3  

·         DMB Audio: MPEG-2 AAC+, MPEG-4 ER-BSAC, MPEG-4 HE-AAC

·         System: IOD/OD, BIFS, MPEG-4 File format, AVC File format

·         DMB Data Services: BWS, MOT slide show, TTI

·         Content description and user metadata: TV-Anytime

·         Protection and Governance information: MPEG-21 IPMP, REL, etc.

Further details of the specification are currently in progress.
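Because the DMB MAF builds on the ISO Base Media File Format, a file can be inspected at the box level without knowledge of the DMB-specific extensions. The following is a minimal sketch, assuming the whole file is available in memory; the function name `iter_top_level_boxes` is introduced here for illustration only. It relies on the general box header layout of the ISO Base Media File Format: a 32-bit big-endian size followed by a four-character type, where a size of 1 signals a 64-bit size field and a size of 0 means the box extends to the end of the file.

```python
import struct
from typing import Iterator, Tuple


def iter_top_level_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box in `data`."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header_length = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header_length = 16
        elif size == 0:
            # The box extends to the end of the file.
            size = len(data) - offset
        if size < header_length:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size
```

A player could use such a scan to locate boxes such as `ftyp`, `moov`, `meta`, and `mdat` before deciding whether the file is playable.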

2.6.4     Comparison with other MAFs

The Musical Slide Show MAF, Photo Player MAF, Music Player MAF, Open Access MAF, and Professional Archival MAF target different applications from the DMB MAF; that is, they are not intended for video applications. It is therefore reasonable to compare the DMB MAF with the Portable Video MAF, Video Surveillance MAF, Digital Video/Cinema MAF, and Media Streaming MAF.

·         Comparison with Media Streaming MAF and Portable Video MAF

 

 

Target market

·         MS MAF: Application scenarios address DTV set-top boxes, IPTV set-top boxes, etc. (not specifically targeted)

·         PVP MAF: Mini-DVD; portal service

·         DMB MAF: DMB application terminals only

Application scenario

·         MS MAF: Wide range of application scenarios related to governance: public broadcasting; commercial broadcasting; Internet streaming; pay TV; IPTV; Internet television; storage and subsequent play of content in a domain

·         PVP MAF: A possible standard format for portable applications that utilize mid-resolution video: mini-DVD/UMD; portable video (PMP); DMB recording (main video and audio only, no DMB data contents, no MPEG-4 IOD/OD/BIFS)

·         DMB MAF: DMB-specific services: storage of DMB contents only with associated metadata; IP media services with DMB contents; exchange of DMB contents across DMB MAF terminals; user-created DMB contents; metadata applications with DMB MAF

File format

·         MS MAF: MPEG-21 File Format (not specified in detail)

·         PVP MAF: ISO Base Media / MPEG-4 / AVC File Format; MPEG-21 File Format; only specified with ftyp, moov, meta, and mdat boxes; xml box used to contain MPEG-4 LASeR; content information carried in an xml box inside the trak box

·         DMB MAF: Derived MPEG-4 File Format with extensions and restrictions (extends the AVC File Format); meta box used to carry MPEG-21 IPMP and DID; content information carried in MPEG-21 DID

Video resources

·         MS MAF: MPEG-2 Video Main Profile; MPEG-4 Visual Advanced Simple Profile; MPEG-4 Advanced Video Coding Main Profile

·         PVP MAF: MPEG-4 AVC Baseline Profile, Level 1.2 for a resolution of 320x240 (4:3); MPEG-4 AVC Baseline Profile, Level 2.1 for a resolution of 480x272 (16:9); MPEG-4 Part 2 Visual

·         DMB MAF: MPEG-4 AVC Baseline Profile, Level 1.3 only

Audio resources

·         MS MAF: MPEG-1 Audio Layer-2; MPEG-4 Advanced Audio Coding; MPEG-4 High Efficiency AAC

·         PVP MAF: MPEG-1/-2 Layer-3; MPEG-4 AAC/AAC+/HE-AAC/BSAC Audio

·         DMB MAF: MPEG-2 AAC+, MPEG-4 ER-BSAC, MPEG-4 HE-AAC

Systems support

·         MS MAF: Not defined

·         PVP MAF: MPEG-4 LASeR is used to represent a menu for selecting content (video, audio, text)

·         DMB MAF: MPEG-4 OD/IOD/BIFS are used for scene description, interaction, and synchronization of multiple objects

Data service resources

·         MS MAF: Not defined

·         PVP MAF: Not defined

·         DMB MAF: BWS, TTI, MOT slide show, Java midlet

Metadata

·         MS MAF: MPEG-21 DID; TV-Anytime Phase 1 metadata (a limited set of elements that includes only the Copyright, Classification, and ProgramDescription elements)

·         PVP MAF: MPEG-7 MDS: 1) Creation DS, 2) UsageHistory DS, 3) HierarchicalSummary DS

·         DMB MAF: MPEG-21 DID to organize DMB contents, metadata, and IPMP; TV-Anytime Phase 1 and Phase 2 Metadata (Part 3)

Streaming framework

·         MS MAF: MPEG-21 DIS

·         PVP MAF: Not defined

·         DMB MAF: Not defined

IPMP

·         MS MAF: MPEG-21 REL DAC Profile; MPEG-21 IPMP Components Media Streaming Profile; MPEG-21 IPMP Message Extensions; Media Streaming Technologies (access protocol, domain management protocol)

·         PVP MAF: Protection technologies to be aligned with the Music Player MAF, 2nd edition

·         DMB MAF: MPEG-21 REL DAC Profile; MPEG-21 IPMP Components Media Streaming Profile

 

·         Comparison with Video Surveillance MAF

o   The purpose of the Video Surveillance MAF is to provide a standard for CCTV (closed-circuit TV) and compatibility between surveillance videos from different CCTV systems. Consequently, the application domain of the Video Surveillance MAF differs from that of the DMB MAF.

 

·         Comparison with Digital Video/Cinema MAF

o   The Digital Video/Cinema MAF is proposed for preserving, by means of metadata, color information that can change with circumstances. With it, a user can view digital video with the color information that was created when the video was produced in the studio. The DMB MAF and the Digital Video/Cinema MAF have independent application domains.

2.6.5     Issues

MPEG experts need to identify any issues which are essential to facilitate the application scenario but which may lie outside of MPEG’s jurisdiction. 

2.6.6     Supporting Companies and Organizations

·         Broadcasters

o   KBS (Korea Broadcasting System)

o   MBC (Munhwa Broadcasting Company)

o   SBS (Seoul Broadcasting System)

·         Telcos

o   KT (Korea Telecom)

o   LG TeleCom

·         Solution & Terminal Vendors

o   Samsung Electronics

o   LG Electronics

o   Pixtree

o   onTimetek

o   DRM Inside

o   ContentGuard

o   KAI MEDIA

o   OnionTech

·         Institutes & Universities

o   ETRI (Electronics and Telecommunications Research Institute)

o   UOS (University of Seoul)

o   ICU (Information and Communications University)

 

2.7    Professional Archival Application Format

2.7.1     Objective and Scope

The purpose of the Professional Archival Application Format (PA-AF) is to provide a standardized packaging format for digital files. This packaging format can also serve as an implementation of the information package specified by the Reference Model of Open Archival Information System (OAIS). The OAIS Reference Model is a framework for understanding and applying concepts necessary for long-term digital information preservation (where “long-term” is long enough to be concerned about changing technologies). In addition, PA-AF can also be used as an intermediate or exchange packaging format for any kind of multimedia content.

 

PA-AF specifies the following: a metadata format to describe the original structure of digital files archived in a PA-AF file; a metadata format to describe context information related to a PA-AF file and digital files archived in it; a metadata format to describe necessary information to reverse the pre-processing processes applied to digital files prior to archiving them in a PA-AF file; and a file format for carriage of the metadata formats and digital files.

 

While a general archival process may include processes ranging from creation, delivery to the archival system, to dissemination to consumers, PA-AF is limited in scope as follows. PA-AF specifies neither how to create input content nor any agreement on how the content should be handled and delivered to the archiving process. PA-AF assumes that input content for the archiving process is available in an appropriate digital format. PA-AF specifies the format of a digital archive produced by the archival process. It does not specify how the archive output by the archival process is disseminated to end-users.

 

Note that the archiving policy and agreements are not included in the scope of PA-AF. PA-AF is independent of any kind of compression scheme or metadata format. PA-AF provides a mechanism for identifying the pre-processing tools applied to the archived content files. Any kind of compression tool or encryption tool can be specified as an external pre-processing tool. In addition, although PA-AF optionally provides a predefined minimum set of descriptive metadata for its archived contents, any kind of application-specific meta-information can be stored in the PA-AF package as a content file or files if the archive's policy or agreements require it. For this purpose, PA-AF provides a mechanism for linking that meta-information to the archived content file.

 

2.7.2     Application Scenario Description

2.7.2.1          General

Application scenarios for the PA-AF are basically categorized into several domains, such as packaging formats for long- and short-term preservation, and a packaging format for information exchange. Though there is no well-known definition of those categories, there is one useful standard called the OAIS Reference Model, which defines a good abstract model for the long-term preservation process. With combinations of appropriate compression tools and a set of meta-information, PA-AF can be applied to a variety of application scenarios. This section provides several example application scenarios that include the OAIS Reference Model, archiving digital information, and audio data processing domain.

 

2.7.2.2          Implementation as the information package specified by the reference model of Open Archival Information System (OAIS)

The PA-AF can be regarded as one implementation of the information package specified by the Reference Model of the Open Archival Information System (OAIS) [1]. In the OAIS Reference Model, the information package is specified as a logical object. OAIS is designed as an abstract model, so that the concept can be applied for any kind of long-term preservation. It gives a guideline for designing agreements between a producer and an archiving system. Whether meta-information should or should not be stored in an archive is up to archive's own policy or agreements. In order to give users freedom to define their own set of meta-information, PA-AF allows users to include any kind of meta-information in a PA-AF package as a file or files. It also provides a mechanism to link meta-information with a certain object file.

 

2.7.2.3          Multimedia Information archiving in digital libraries

Many libraries have started implementing archiving systems for digital media content, including audio and video media. Some libraries have even started archiving web sites, which often contain multimedia data files.

Cost efficiency is one of the most important issues for all archiving sites. Standards for archival tools or systems provided by international standardization organizations can help to share the developed tools and minimize the cost of maintaining those tools. Cross mirroring is one possible way to minimize the risk of losing any content information while keeping the maintenance cost as low as possible, because several archives can share the resources. In order to minimize operating cost, archiving systems should be built with modular building blocks and use standardized tools like PA-AF.

 

In some cases, each archive has defined its own set of meta-information on the basis of its own policy or legacy system. With PA-AF, all archives can use the same tested standard format while each remains free to use its own legacy format for meta-information.

 

Considering the availability of numerous collections of digital archives in a digital library, it is also important to attach richer information in the form of metadata to the archived digital media content to enable searching by metadata. For example, a search for a specific video can be done not only by its title but also by using richer matching information, such as the genre, year of production, country of production, the movie director’s name, and actors or actresses starring in the movie.

 

 

2.7.2.4          Handling assets related to international standardization work

In MPEG standardization, more than 200 documents are uploaded to the MPEG web site for input document registration and more than 150 output documents are issued as results of a meeting.

These documents are used for both short-term and long-term preservation, as well as for file exchange. Once a standard has been specified, the document should be accessible. It may be important to keep the history of changes applied to these documents. In addition, during international standardization work, quite a number of intermediate documents and other digital files, such as tentative drafts, test data, source code, and binary images of an example application, may be handled along with the draft standard documents.

 

The PA-AF can be applied as a packaging format for the storage of standardization assets that have been produced in the course of development of various multimedia specifications within JTC1 or other standardization committees.

 

In order to make standardization assets easily searchable, each archive can carry metadata to describe assets being archived, such as project names, editors’ names, the history of status changes, and the latest status in the standardization.

Since PA-AF provides a mechanism for parsing those metadata, PA-AF enables users to make use of those metadata.

For example, when editors upload documents to the MPEG web server, the server can automatically check whether appropriate metadata are provided with the document. Otherwise, the server can request editors to provide missing information and complete it via the web page.
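The server-side check described above can be sketched as follows. The field names in `REQUIRED_FIELDS` are hypothetical; they are derived from the examples given in the text (project names, editors' names, history of status changes, latest status) and are not defined by the PA-AF specification.

```python
from typing import Dict, List

# Hypothetical field names, derived from the examples in the text.
REQUIRED_FIELDS = ("project_name", "editors", "status_history", "latest_status")


def missing_metadata(metadata: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]
```

An empty result means the upload can be accepted; otherwise the returned names identify what the server would ask the editor to complete.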

When downloading those documents, users can see the table of contents (e.g., index.html), which is automatically generated by the web server. This helps users find the content information needed.

In addition, a PA-AF-compliant browser on a user's PC can automatically generate a table of contents that contains links to all assets archived in several PA-AF files on the local HDD. In other words, users can get a description of all content they have on their HDD.

 

 

2.7.2.5          Packaging file format for medical data

An electronic medical chart is systematic documentation of a patient's medical history and care. It can be regarded as a kind of multimedia content (e.g., bio-medical data files such as X-rays, EEG: electroencephalography, MEG: magnetoencephalography). The medical history can be kept in metadata. The PA-AF could be used for packing such data together with other relevant information.

 


 

 

2.7.2.6          Digital backup and direct delivery of recorded music project

Nashville members of the P&E Wing of The Recording Academy® formed a Delivery Specifications Committee, and the committee has created The Delivery Recommendations for Master Recordings document [3].

 

In the document, the committee says that they expect that direct delivery (via secure connection on the Internet, etc) will be commonplace in the future, and uploading files to very large-scale Digital Libraries will be recommended.

 

The preferred delivery of a recorded music project would include "flattened" continuous Broadcast Wave Files (BWF) of every multi-track and two-track element, without processing or automation. All of the audio tracks should be "flattened", which means to convert small recorded segments into a contiguous sound file padded with silence between the recorded segments, and migrated to the Broadcast Wave file format with the maximum of 1 channel per BWF file.
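The "flattening" step described above can be sketched on plain sample lists. This is an illustration only, assuming each recorded segment is given as a start position (in samples) plus its PCM samples for a single channel; the names `Segment` and `flatten_track` are introduced here and do not come from the Delivery Recommendations document. Writing the result as a Broadcast Wave file is outside the scope of the sketch.

```python
from typing import List, Tuple

Segment = Tuple[int, List[int]]  # (start position in samples, PCM samples)


def flatten_track(segments: List[Segment], total_length: int) -> List[int]:
    """Merge recorded segments into one contiguous track padded with silence (0)."""
    track = [0] * total_length
    for start, samples in segments:
        end = start + len(samples)
        if start < 0 or end > total_length:
            raise ValueError("segment lies outside the track")
        # Later segments overwrite earlier ones where they overlap.
        track[start:end] = samples
    return track
```

Every flattened track of a project then has the same length and start time, so the tracks line up without any editing-tool automation.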

 

In addition to the BWF files, digitized traditional documentations (tracking sheets, engineer notes, set-up notes, sketches of microphone placement, recording map documents, lyrics, charts, orchestral arrangements and parts, mix documentation, and any other data pertinent to the recording project) should be included in the delivery files.

 

Folder-structure-based archiving is suitable for the direct delivery of a recorded music project. It is natural that all files related to the recorded music project should be kept in an appropriate folder. Those files in the folder should be compressed and archived together in a single file (see Fig. 1). This is much safer, since it prevents users from miscopying a file during the delivery process. When uploading or downloading a project by FTP, a single archive file is much easier to transmit than many individual BWF files (see Fig. 2).

Fig. 1: A folder image and an archive file.

 

Fig. 2: An example of application of the Professional Archival AF.

 

2.7.2.7          Compression and archiving intermediate data generated by sound editing tools

Many professional sound editing tools keep work files in folders (e.g., ProTools from DigiDesign [4], Nuendo from Steinberg [5]). Raw waveform data from separate tracks are stored in separate files with a specific file format, such as .wav or .aiff, and those files are kept in a folder or folders. The set of files in the folder for a song, consisting of the waveform files of all tracks, is called a “Project”. A project contains the project information file and individual audio track files.

 

During editing operations, compression and archiving tools that can compress all files in a folder, together with the folder structure, into a single file may be sufficient for this application. Users can keep a snapshot of a version of the edited files in an archive file, so that they can go back to a previous version of the edits if they want.

 

Sometimes, a target file may be a non-audio data file or may already be encoded with a lossy compression tool, such as MP3/AAC. In that case, the file should simply be added to the archive file as is.
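The decision described in this scenario can be sketched as follows. This is an illustration under stated assumptions: the set `WAVEFORM_EXTENSIONS` and the function names are hypothetical, and `zlib` is used only as a stand-in for a lossless audio coder (for example MPEG-4 ALS), since PA-AF itself does not mandate a compression scheme.

```python
import zlib
from pathlib import PurePath
from typing import Tuple

# Assumed extensions of raw waveform files that are worth compressing losslessly.
WAVEFORM_EXTENSIONS = {".wav", ".aiff"}


def prepare_for_archive(name: str, data: bytes) -> Tuple[str, bytes]:
    """Return (method, payload) for one file of a project folder."""
    if PurePath(name).suffix.lower() in WAVEFORM_EXTENSIONS:
        # Stand-in for a lossless audio coder.
        return "deflate", zlib.compress(data)
    # Non-audio files and lossy-coded audio (e.g. MP3/AAC) are added as is.
    return "store", data


def restore(method: str, payload: bytes) -> bytes:
    """Reverse the pre-processing recorded in `method`."""
    return payload if method == "store" else zlib.decompress(payload)
```

Recording the method next to each payload is what lets an extractor reverse the pre-processing and return bit-exact originals.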

 

2.7.2.8          Preservation/Archiving

As stated in the industry input submitted as document M12688 to the 74th MPEG meeting in October 2005 [6], lossless compression of files is becoming very popular because it reduces the demand on storage media for bitwise-exact copies of digitized masters (see Fig. 3).

 

All files related to the content are archived together in a single archive. Sometimes there are tight relations among the files, and sometimes not. Folder-structure-based archiving is more suitable for archiving these kinds of files, since it is too difficult to define a general way to represent relations among the files: the relations depend heavily on the content itself.

Fig. 3: Another example of application of the Professional Archival AF.

 

 


 

 

2.7.3     Requirements

 

 

Requirement

Notes

A: Packaging

A01

Packaging format of PA-AF should be able to serve as an implementation of the information package defined in the OAIS reference model

 

 

A02

PA-AF should pack files and folder structures into a single archive.

 

 

A03

PA-AF should support large files exceeding 4GB.

 

 

A04

PA-AF should support a mechanism to allow the splitting of large archive files into several smaller archive files without loss of information.

 

 

A05

PA-AF should be able to pack any kind of files, including audio, video, images, metadata files.

Which types of files should or should not be stored in an archive shall be controlled by the archive's policy.

 

A06

PA-AF should preserve the original file names, attributes, and the folder structure.

 

 

A07

PA-AF shall support perfect extraction of an archive into its original form.

 

 

A08

Files output from extraction of an archive shall have the same directory structure as that of input files.

 

 

A09

It shall be possible to unpack an archived file losslessly, which means complete reconstruction of original files, including the filenames, folder structures, and attributes of those files or folders.

 

 

A10

Files output from extraction of an archive shall be the same bit for bit as the original input files.

 

 

A11

PA-AF shall support extraction of all files in the archive.

 

 

A12

PA-AF shall support extraction of a single file from a collection of files in the archive.

 

 

A13

PA-AF shall support browsing of archived files without having to extract the files from the archive.

 

B: Cross-platform operation

B01

PA-AF should support cross-platform operation, such as Windows, Mac, and Linux platforms.

 

 

B02

The cross-platform support should include interoperability among different file-systems (file attributes) and character-sets.

 

C: Compression

C01

PA-AF itself shall be compression-scheme independent.

 

 

C02

PA-AF should compress an archive’s input files.

 

 

C03

If any compression scheme is used, PA-AF shall use a lossless compression algorithm.

 

 

C04

PA-AF shall allow different compression algorithms for different data types (e.g., MPEG-4 ALS for audio data, JPEG2000LS for image data, ZIP compression for text data).

 

 

C05

PA-AF shall allow the use of one or more compression algorithms for files with composite data types.

 

D: Meta-information

D01

PA-AF should support association of richer information about files in the archive.

 

 

D02

PA-AF should provide context information about the content in the archive and the archive file itself.

 

 

D03

PA-AF should provide creation context information.

 

 

D04

PA-AF should provide content profile information.

 

 

D05

PA-AF should provide the access and/or modification history of the archive.

 

 

D06

PA-AF shall provide a mechanism to accommodate application-specific context information.

 

 

D07

PA-AF should provide Modality information of the content (e.g., text, image, audio, video, graphics).

 

 

D08

PA-AF should provide File format information of the content (e.g., MP3 audio, AAC audio, MP4 video, JPEG images, GIF images).

 

 

D09

PA-AF should provide Resolution information of the content (e.g., bit rate, spatial resolution, temporal resolution).

 

E: Identification

E01

PA-AF should have a mechanism for detecting the types of files stored in a PA-AF file.

 

 

E02

There shall be a mechanism for providing unique and unambiguous identification of each digital element.

 

G: DRM

G01

PA-AF should support inclusion of Digital Rights Management (DRM) information for trusted exchange of an archive among rights holders, intermediaries, and users.

 

 

G02

PA-AF should support governance of archive usage and distribution.

 

 

G03

There should be a mechanism for storing licensing and intellectual property rights information for each item.

 

 

G04

A mechanism should be available to allow fine-grained access control to all data items (essence, metadata, administrative data) in the archive system.

 

 

G05

PA-AF should protect individual files and file structures at various levels of granularity, including the entire archive.

 

 

G06

PA-AF should support a simple password-lock encryption mechanism for the PA-AF file.

 

 

G07

PA-AF should support detection of post-creation content tampering.

 

 

G08

PA-AF should support validation of PA-AF files.

 

 

G09

PA-AF should support a mechanism for identifying the encapsulated DRM system.

 

 

2.7.4     List of technologies

To realize the above system, PA-AF adopts the following component technologies:

·         MPEG-21 Digital Item Declaration Language 2nd Edition Profile for PA-AF

·         MPEG-21 Digital Item Identifier

·         MPEG-21 Intellectual Property Management and Protection Components Base Profile for PA-AF

·         MPEG-21 Rights Expression Language MAM Profile

·         MPEG-7 Multimedia Description Scheme Profile for PA-AF

·         MPEG-21 File Format

·         Lossless compression tool identifiers

·         Encryption, hash, and digital signature identifiers

·         Additional metadata dedicated for use in PA-AF only

 

2.7.5     Components of ISO/IEC 23000-6

2.7.5.1           Normative Components of a PA-AF File

A PA-AF file consists of a header and content part as illustrated in Fig. 4. The header part contains the metadata (called Preservation Description Information) needed to understand the PA-AF file itself and all files archived in it. The content part contains one or more archived files. These archived files are called Content Information. Content Information includes digital data in its original format as input into the PA-AF file and/or in the format obtained after pre-processing with tools allowed by this specification.

 

Fig. 4:  Logical view of a PA-AF file.

 

Preservation Description Information is in the XML metadata format. It includes Archive Structure Information, Context Information, and Pre-processing Information.

 

Fig. 5: Structure of input files stored in a PA-AF file

 

Archive Structure Information models relationships among the various kinds of Content Information. Archive Structure Information provides three functionalities:

·         It preserves the structural relationships in Content Information. As illustrated in Fig. 5, digital files input into the PA-AF file can be organized hierarchically in one or more directories; however, in the physical PA-AF file format, they are stored in a flat manner. Archive Structure Information preserves the original hierarchical structure of the input digital files so that, when Content Information is extracted from the PA-AF file, the structure of the output digital files is the same as that of the files input into the PA-AF file.

·         It preserves file attributes of Content Information. The attribute values are stored in such a way as to support cross-platform Content Information extraction. In other words, even if Content Information of a PA-AF file is extracted on an operating system other than its original operating system, all file attribute values are still preserved.

·         It provides an entry point to access contents of the PA-AF file.
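The relationship between the hierarchical input and the flat physical storage can be sketched as follows. This is a simplified illustration, not the PA-AF metadata format: the record keys `path`, `item`, and `size` and the function names are hypothetical, and file attributes are omitted for brevity.

```python
from typing import Dict, List, Tuple


def pack_flat(files: Dict[str, bytes]) -> Tuple[List[dict], List[bytes]]:
    """Split a path-to-bytes mapping into structure records and a flat payload list."""
    structure: List[dict] = []
    payloads: List[bytes] = []
    for index, path in enumerate(sorted(files)):
        # The record keeps the original relative path; the payload list is flat.
        structure.append({"path": path, "item": index, "size": len(files[path])})
        payloads.append(files[path])
    return structure, payloads


def unpack_flat(structure: List[dict], payloads: List[bytes]) -> Dict[str, bytes]:
    """Rebuild the original path-to-bytes mapping from the structure records."""
    return {entry["path"]: payloads[entry["item"]] for entry in structure}
```

Because the structure records alone carry the paths, they also serve as the entry point for browsing the archive without touching the payloads.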

 

Context Information describes context information attached to a PA-AF file and Content Information. It includes:

·         PA-AF file creation information and Content Information, such as information about what, how, when, where, who, and why.

·         Profile information of the Content Information, such as the file format, file size, audio and visual profile of the content (if it is audio-visual data).

·         History of access to the Content Information, which records any actions applied to the Content Information, such as archiving and extraction. This record may include who the actor is and when the action was performed.

In addition to pre-defined Context Information, the PA-AF allows inclusion of application-specific context information to satisfy application-specific requirements. Application-specific context information can be manifested in any format and archived in a PA-AF file as Content Information. PA-AF provides a link to this application-specific context information so that applications that can understand this information can read and use it; those that cannot understand it can simply skip it.

 

Pre-processing Information describes profiles of tools that can be used to reverse pre-processing applied to Content Information. It contains information, such as tool identification, parameters required to execute the tool, part of Content Information pre-processed with that tool, and tool location. Pre-processing processes that can be applied to Content Information include data compression, data protection, data integrity checking (authentication of originality), and data governance validation checking.
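Data integrity checking can be illustrated with a digest that is recorded at archiving time and compared at extraction time. This is a minimal sketch, assuming SHA-256 as the hash; the PA-AF specification identifies hash and signature tools by identifier, and the choice made here is only an example. A plain digest detects modification only if the recorded digest itself is trustworthy; protection against deliberate tampering would additionally require a digital signature.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, recorded_digest: str) -> bool:
    """Compare the digest of extracted data with the digest recorded at archiving time."""
    return hmac.compare_digest(digest(data), recorded_digest)
```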

 

The component technologies listed above can be used in combination to achieve the basic functionality and enhanced functionality of PA-AF as illustrated in Fig. 6. The enhanced functionality is optional and implemented on top of the basic functionality. For example, the combination of MPEG-21 File Format, MPEG-21 DIDL 2nd Edition Profile for PA-AF, MPEG-21 DII, and MPEG-7 Creation Information Tool provides solutions to satisfy the basic functionality of PA-AF, which is packaging Content Information in a PA-AF file. By adding MPEG-21 IPMP Components Base Profile for PA-AF, one can add functionality such as compression, protection, and integrity checking to the PA-AF. By adding MPEG-21 REL MAM Profile, one can add license information to govern the usage of the PA-AF file. Finally, by adding MPEG-7 MDS Profile for PA-AF, one can have an interoperable description of Content Information that can be exploited to implement interoperable content searching. The combination of all component technologies provides a full solution for PA-AF.

 

Fig. 6: Basic and enhanced functionality of PA-AF

 

2.7.5.2    Architecture of PA-AF File Authoring Tool

Figure 7 outlines an informative packaging tool that may produce an output file that complies with the PA-AF specification. The tool consists of the following modules:

·         A file structure information generator, which analyzes the input files and generates metadata to model their hierarchical structure.

·         A context information generator, which creates metadata to record context information related to the output PA-AF file and the input files to be archived.

·         A pre-processing information generator, which creates metadata describing the tools, and their execution parameters, required to reverse any pre-processing applied to the input files.

·         An archive header wrapper, which combines all the generated metadata into the PA-AF file header.

·         A file formatter, which takes the header and the input files (original or pre-processed) and wraps them in a file.

 

Fig. 7:  Overview of a PA-AF packaging tool for creating a PA-AF file.
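The five modules of the packaging tool can be sketched as one small pipeline. This is an illustration only: the XML element and attribute names, the 4-byte length prefix, and the use of `zlib` as the single pre-processing tool are all assumptions made for the sketch and do not reflect the actual PA-AF header schema (which is based on MPEG-21 DIDL) or the MPEG-21 File Format.

```python
import xml.etree.ElementTree as ET
import zlib
from typing import Dict


def build_archive(files: Dict[str, bytes], creator: str) -> bytes:
    """Pack a path-to-bytes mapping into one blob: length prefix, XML header, content."""
    header = ET.Element("Header")  # hypothetical element names throughout
    structure = ET.SubElement(header, "Structure")      # 1. file structure information
    ET.SubElement(header, "Context", creator=creator)   # 2. context information
    ET.SubElement(header, "PreProcessing", tool="zlib")  # 3. pre-processing information
    body = b""
    for path in sorted(files):
        payload = zlib.compress(files[path])
        ET.SubElement(structure, "File", path=path,
                      offset=str(len(body)), length=str(len(payload)))
        body += payload
    header_bytes = ET.tostring(header, encoding="utf-8")  # 4. archive header wrapper
    # 5. file formatter: header length, header, then the flat content part
    return len(header_bytes).to_bytes(4, "big") + header_bytes + body


def extract_archive(blob: bytes) -> Dict[str, bytes]:
    """Reverse build_archive and return the original path-to-bytes mapping."""
    header_length = int.from_bytes(blob[:4], "big")
    header = ET.fromstring(blob[4:4 + header_length])
    body = blob[4 + header_length:]
    result = {}
    for entry in header.find("Structure"):
        offset, length = int(entry.get("offset")), int(entry.get("length"))
        result[entry.get("path")] = zlib.decompress(body[offset:offset + length])
    return result
```

Extraction needs only the header to locate and restore each file, which mirrors the split between the header part and the content part of a PA-AF file.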

2.7.6     Comparison with other MAFs

Music Player MAF

The Music Player MAF is an audio application that provides the file format for combining MP3 audio, MPEG-7 metadata, and an optional JPEG image. The target domains of the two MAFs are inherently different.

Photo Player MAF

The Photo Player MAF defines the specification for a file format that carries JPEG images with MPEG-7 Visual and MDS metadata. The target domains of the two MAFs are inherently different.

Music Player MAF, 2nd edition

The Music Player MAF, 2nd edition features the same functionalities as the Music Player MAF with the addition of content protection tools. The target domains of the two MAFs are inherently different.

Musical Slide Show MAF

The Musical Slide Show MAF builds on top of the Music Player and the Photo Player MAF. It is a superset of these two MAFs. The target domains of the two MAFs are inherently different.

Media Streaming MAF

The Media Streaming MAF standardizes streaming video content and protocols. The target domains of the two MAFs are inherently different.

Portable Video MAF

The Portable Video MAF is an application that focuses on local playback of prepackaged video content. The target domains of the two MAFs are inherently different.

Digital Video/Cinema MAF

The Digital Video/Cinema MAF is an application that concentrates on the distribution of high-resolution video content to professional users with emphasis on color management information. The target domains of the two MAFs are inherently different.

Video Surveillance MAF

The Video Surveillance MAF is an application that is specifically targeted towards the application domain of surveillance. Professional Archival MAF may be used as the container for the Video Surveillance MAF but it does not support a mechanism required for the surveillance-domain-specific demands.

Open Access MAF

The Open Access MAF is an application that focuses on publication and delivery of content that is governed in a “light-weight” form. Professional Archival MAF may be used as the container for the Open Access MAF but it does not support a mechanism required for the Open Access specific demands.

 

2.7.7     Supporting Companies and Organizations

·         NTT

·         Technical University of Berlin

·         Philips

·         NIST

·         LG Electronics

·         Information and Communications University (ICU)

 

2.7.8     References

[1] ISO 14721:2003, "Space data and information transfer systems – Open archival information system – Reference model," 2003.

[2] ISO/IEC 14496-3:2005/Amd.2:2006, Audio Lossless Coding (ALS), new audio profiles and BSAC extensions, July 2005.

[3] ISO/IEC 14496-3:2005, Subpart10: Lossless coding of oversampled audio (DST), July 2006.

[4] " The Delivery Recommendations for Master Recordings," P&E Wing Delivery Recommendations 030609. 31 revision. http://www.grammy.com/PDFs/Recording_ Academy/Producers_And_Engineers/DeliveryRecs.pdf

[5] ProTools: http://www.digidesign.com

[6] Nuendo: http://www.steinberg.net

[7] M12688, "Lossless Compression – Applications & Requirements," October 2005, Nice, France.

[8] A short description of WAVE 64: http://media.vcs.de/download/content/show/04345113457

[9] The latest version of the RF64 draft specification: http://www.sr.se/utveckling/tu/bwf/

[10] ISO/IEC 14496-12, ISO Base Media File Format

[11] ISO/IEC 14496-14, MP4 File Format

[12] ISO/IEC 21000-9, MPEG-21 File Format

 

2.8    Video Surveillance Application Format

In this section, video surveillance is introduced as an application domain that will benefit from a MAF that introduces a small number of key improvements to interoperability in the CCTV industry.

The abbreviation CCTV stands for Closed Circuit Television, which indicates that standards have not been considered essential, since this type of TV takes place in a hitherto closed environment with no need for interoperability with systems offered by other companies. On the other hand, the video surveillance industry is subject to a changing environment, and MPEG standards can offer interoperability between different systems as well as access to mature and cost-effective multimedia components for integration.

The proposed Video Surveillance (VS) MAF is not intended to directly accommodate the legacy content and components. Rather, it is intended to provide a lightweight and useful wrapper to the video content from the MPEG technologies that are the best fit for purpose at the date of expected finalisation. However, the description of any relation existing between VS MAF content and other video content will need to be addressed. Currently, JPEG and MPEG-4 Part-2 are arguably the most commonly deployed digital video standards. However, in due course it is expected that AVC will be more commonly deployed, not least because it is understood to be the most ‘fit-for-purpose’ of the available video technologies.

The CCTV application domain undoubtedly has many requirements in the areas of auditing, data integrity and protection. However, this is not included in the VS MAF Application Scenario. Indeed, it is expected that a MAF that did address these requirements would bear a close and hierarchical relationship with the proposed VS MAF.

A further design requirement for the proposed MAF is that it has a small group of key features that will provide tangible benefit to the manufacturer, installer or user. Even though the MPEG standards can provide many more features of possible benefit, there are several reasons why this design is preferred, including the need to constrain the scope of the MAF to manageable proportions and the idiosyncrasies of the CCTV market.

2.8.1     Application Scenario Description

In many countries, notably the USA, the U.K. and other European and East Asian countries, video surveillance in public places is increasingly used for crime prevention and for the detection of criminal and similar incidents. Examples of public places under observation are streets, squares, and railway and subway stations.

More and more cameras are being put in place, forming huge surveillance systems. Within those systems, the required basic functionalities are identical. The video stream needs to be transmitted from the site to an appropriate place where it will be archived. The video might be viewed by a number of persons and, in case of an incident, it could be exported to the appropriate authorities.

In order to identify a requested stream it would be necessary to enhance the pure video stream by appropriate metadata. Here, the information about recording time and place as well as camera parameters used for recording would be sufficient to achieve basic interoperability.

For efficient archival a packaging of the video and metadata information into a file format should be supported. That file format should also provide for the inclusion of user data and possibly additional MPEG-7 metadata. This metadata should provide key functionality to support the activities of the CCTV manufacturers, installers and users.

Recording of audio signals in the context of surveillance is not commonly used, and is therefore not included in this application scenario. Companies in this space have indicated their interest to include such functionality in the future.

General Requirements

The Video Surveillance MAF shall support:

·         Packaging of visual data and associated metadata

·         Selective access to visual data and metadata

File Format Requirements (to Support Packaging)

1)      The file format should support all existing transport networks (transport agnostic).

2)      The file format should support storage of visual data and metadata, and the information necessary to enable these elements to be synchronized.

3)      The file format should support visual content coded by AVC.

4)      The file format should provide a means for referencing an arbitrary external source of digital video content.

Video Requirements

1)      It is required to support the AVC video compression standard.

2)      Sampling Formats and Bitrates:

a.       Scanning Formats: progressive scan, only 4:2:0 chroma-sampling, only 8 bits per pixel.

b.      The video compression shall support up to 4CIF image resolution.

c.       The video compression shall support temporal resolutions of 25 Hz and 30 Hz and also lower resolutions.

d.      The video compression shall support a bit-rate of up to 2 Mbps (4CIF).

3)      Video Functionalities:

a.       Low latency coding (less than 500 ms).

b.      Temporal scalability.

c.       Random access and trick modes (e.g. fast forward, fast reverse).
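
The sampling format and bit-rate requirements above lend themselves to a simple conformance check, sketched below. The class and function names are hypothetical, 4CIF is taken here as 704x576 luma samples and 2 Mbps as 2,000,000 bit/s; the functional requirements (latency, temporal scalability, random access) are not covered by this sketch.

```python
from dataclasses import dataclass

# Limits taken from the Video Requirements list; the numeric
# interpretations of "4CIF" and "2 Mbps" are assumptions of this sketch.
MAX_WIDTH, MAX_HEIGHT = 704, 576
MAX_FRAME_RATE_HZ = 30.0
MAX_BITRATE_BPS = 2_000_000


@dataclass
class StreamParams:
    """Hypothetical description of one video stream."""
    codec: str
    width: int
    height: int
    frame_rate_hz: float
    bitrate_bps: int
    progressive: bool = True
    chroma_format: str = "4:2:0"
    bit_depth: int = 8


def violations(p: StreamParams) -> list:
    """Return a list of human-readable problems; an empty list means
    the stream satisfies the checked requirements."""
    problems = []
    if p.codec != "AVC":
        problems.append("codec must be AVC")
    if p.width > MAX_WIDTH or p.height > MAX_HEIGHT:
        problems.append("resolution exceeds 4CIF")
    if p.frame_rate_hz > MAX_FRAME_RATE_HZ:
        problems.append("frame rate exceeds 30 Hz")
    if p.bitrate_bps > MAX_BITRATE_BPS:
        problems.append("bit-rate exceeds 2 Mbps")
    if not p.progressive:
        problems.append("only progressive scan is allowed")
    if p.chroma_format != "4:2:0":
        problems.append("only 4:2:0 chroma sampling is allowed")
    if p.bit_depth != 8:
        problems.append("only 8 bits per pixel are allowed")
    return problems
```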

Requirements for Metadata

The file format shall support the following metadata. For ‘required’ items, the file is only valid if the item is present. For ‘optional’ items, the file is valid with or without the item. The format should support efficient mechanisms to allow devices to ignore unused optional items. Unless otherwise stated, all items are optional.

A mechanism should be supported to allow the values and labels of user-entered metadata to conform to a user-supplied taxonomy, specific to a particular application scenario, e.g. train station, airport, shopping centre; convention for labeling cameras; or lexicon for describing events.

1)      Information associated with captured data:

a.       Time and date of capture of visual data in a standard unambiguous interoperable format (required).

b.      Identification of the equipment:

                                                              i.      Identification tag for the camera (required).

                                                            ii.      Identification tag for the cluster to which the camera belongs.

                                                          iii.      Identification tags for each of the multiple streams from a single camera.

c.       Description of equipment used and equipment settings:

                                                              i.      Camera make and model.

                                                            ii.      Camera settings such as the aperture and shutter-speed values, peak-to-peak voltage etc.

2)      Metadata for content:

a.       The format should provide a means of annotating events (usually depicted in the video content).

                                                              i.      In a free text format, along with an event ID (unique in the file only) and a timestamp.

                                                            ii.      In a semantically structured format.

b.      The format should provide a means of representing the location of an object of interest in one or more frames of video data, in image co-ordinates.

c.       The format should provide a means of representing the colour appearance of an object of interest, in order to enable retrieval of observations of that object from multiple video sources.

d.      The format should provide a means of storing the identity of objects observed in one or more video sources, to enable cross-referencing.
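
The distinction between ‘required’ and ‘optional’ metadata items can be illustrated with a small validity check. The sketch below is not the VS MAF metadata schema (which is based on MPEG-7); the class names and fields are hypothetical, and a time-zone-aware timestamp is assumed here as one possible reading of "standard unambiguous interoperable format".

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Event:
    """Free-text event annotation (item 2.a.i)."""
    event_id: int          # must be unique within the file only
    timestamp: datetime
    text: str


@dataclass
class CaptureMetadata:
    """Hypothetical container for the metadata items listed above."""
    capture_time: Optional[datetime] = None   # required
    camera_id: Optional[str] = None           # required
    cluster_id: Optional[str] = None          # optional
    camera_model: Optional[str] = None        # optional
    events: List[Event] = field(default_factory=list)  # optional


def validate(meta: CaptureMetadata) -> list:
    """Return the reasons why the metadata would make a file invalid."""
    errors = []
    if meta.capture_time is None:
        errors.append("missing capture time")
    elif meta.capture_time.tzinfo is None:
        errors.append("capture time must carry a time zone")
    if not meta.camera_id:
        errors.append("missing camera identification tag")
    ids = [e.event_id for e in meta.events]
    if len(ids) != len(set(ids)):
        errors.append("event IDs must be unique within the file")
    return errors
```

Optional items left at `None` do not produce errors, so a device that ignores them still sees a valid file.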

2.8.2     List of technologies

·         AVC

·         MPEG-7 simple meta data profile and visual descriptors

·         ISO Media File Format, AVC File Format

2.8.3     Comparison with other MAFs

 

Music Player MAF

The Video Surveillance MAF does not include audio. Also, the VS MAF does not include requirements for protection, so there is no close relation with the 2nd Edition of the Music Player MAF.

 

Photo player MAF

The Photo Player MAF specifies the file format for the combination of JPEG files and a subset of MPEG-7 Visual and MDS metadata. As the proposed VS MAF requires a file format and this is based on the same technologies, there is a degree of relationship, and it may be useful to consider the PP MAF structures for re-use. However, it is believed that the two MAFs can be considered as largely independent due to the differences in required video content and metadata.

 

Musical Slide Show MAF

The Video Surveillance MAF does not include audio. However, there may be some benefit in examining if the structures for representing video and metadata content will be of use in the case of the VS MAF.

 

Professional Archival MAF

There is a sense in which CCTV activity includes the archiving of observations made about the scene under surveillance. However, the details of the Professional Archival MAF address a substantially different application domain, such as lossless archiving of directory structures. It is thought that the procedures for archiving used by the CCTV operators will lie outside the scope of the proposed VS MAF.

 

Media Streaming MAF

The Media Streaming MAF includes extensive requirements for Digital Rights Management, which is not included in the proposed VS MAF. Also, the requirements for video and metadata are substantially different. Therefore, there is no close relation with the MS MAF.

 

Portable Video MAF

The Portable Video MAF includes requirements for Digital Rights Management. This is not included in the VS MAF; therefore, there is no close relation with the PV MAF.

 

Open Access MAF

The Open Access MAF addresses substantially different requirements than the proposed VS MAF.

 

Digital Multimedia Broadcasting MAF

The DMB MAF includes requirements for Digital Rights Management, and the requirements for video content and metadata are significantly different. Therefore, there is no close relation with the DMB MAF.

 

Digital Video/Cinema MAF (under consideration)

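The Digital Video/Cinema MAF concentrates on the distribution of high-resolution video content to professional users with an emphasis on colour management information, so its requirements for video content and metadata are significantly different. Therefore, there is no close relation with the Digital Video/Cinema MAF.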

 

2.8.4     Issues

The question of which video compression standards to include in the proposed VS MAF is an important issue. There are many competing standards (and non-standards) in use in the surveillance industry today, and there is a cost (in terms of decoder complexity and MAF manageability) associated with the inclusion of each additional standard. Therefore the approach adopts a strategy of only including a single video standard. The facility for allowing external references to arbitrary video content will support cases in which the video content is coded in other standards.

Two areas of metadata were excluded from the list of requirements: information about the locality of the camera and its relation to the scene; and more sophisticated surveillance requirements such as support for automatic face recognition systems. These areas were excluded because they were not regarded as sufficiently mature (in terms of standardization and real-world experimental performance, respectively).

2.8.5     Supporting Companies and Organizations

·         Cognitive Video Technologies (CoVi) (US)

·         Ecole Polytechnique Federal Lausanne (EPFL) (CH)

·         EMITALL Surveillance (CH)

·         Mitsubishi (JP)

·         Nine Tiles (UK)

·         QinetiQ (UK)

·         Siemens AG (D)

·         Technische Universität München (TUM) (D)

·         Effective Pictures (UK)

·         Thales (F)

2.9    Stereoscopic Video Application Format

Stereoscopic content is becoming widely used in various applications, including mobile devices. This user demand has led to mobile phones with stereoscopic cameras appearing on the market. A user can take pictures and video sequences and display the acquired stereoscopic content on his/her mobile phone.

However, a user can only store and display the acquired stereoscopic content on his/her own device, because no common file format exists for such content. This limitation prevents a user from sharing content with other users, which makes it difficult for the market for stereoscopic content to expand.

On the basis of this demand from industry, a common file format for stereoscopic content is required, which can be called a Stereoscopic MAF. The Stereoscopic MAF can be achieved and implemented with current MPEG standard technologies, and will ultimately help boost the relevant market.

State of the art

Various devices with larger and smaller LCD screens that support stereoscopic content have been released on the market, and these also provide 2D/3D switching functionality. In particular, a mobile phone with a stereoscopic camera has been released, which allows a user, as a consumer, to have more realistic experiences with or without glasses and, as a content creator, to take stereoscopic images or record stereoscopic video content. It is thus expected that Mobile TV will become one of the indispensable digital devices for getting information and entertainment at any time, and experimental broadcasting for 3D mobile services is being planned in Korea [1, 2]. Along with mobile-device-based 3D services, various applications for higher-resolution stereoscopic content, such as 3D cinema, are under development [2].

2.9.1     Application Scenario Description

As described above, current personal devices such as mobile phones with a stereoscopic camera support taking and displaying stereoscopic images and video. On the basis of such personal devices, various application scenarios are possible, as shown in Figure 1. To realize these application scenarios, a Stereoscopic MAF is required to support the combination of 2D and 3D content and interoperability between them, and to provide information for preventing eye fatigue when viewing stereoscopic content for long periods.

Figure 2 Functionalities for advanced stereoscopic contents.

The functionalities for Stereoscopic MAF are as follows:

2.9.1.1       Combination services of mono and stereoscopic contents

Multimedia content can be composed entirely of stereoscopic content, or of a combination of stereoscopic and monoscopic content. The combination service can attract a user’s interest by providing stereoscopic images or motion pictures for limited periods. For example, advertisement content would be suitable for this combination of mono and stereoscopic content, using the stereoscopic effect for an enriched feeling. Movie clips would be another possible service candidate. As another example, while a user records a B-boy’s performance, a specific dance clip can be recorded with the stereoscopic camera functionality to provide real 3D effects.

2.9.1.2       Interoperability between stereoscopic and monoscopic contents

Stereoscopic mobile phones and devices are, for example, equipped with a stereoscopic digital camera, with or without wired/wireless communication functionality. Using the dual cameras on a device, a user can create his/her own content and share it with other users through communication channels if available, or through storage devices if no communication channel is available. A stereoscopic mobile phone and a stereoscopic mobile device differ in image resolution, display time, and scene description complexity: stereoscopic content for mobile devices has a larger image size, a longer display time, and a more complex scene description than stereoscopic content for mobile phones.

However, monoscopic players are more common than stereoscopic players in the current market, and stereoscopic content does not work on current monoscopic devices. Also, a user may want to watch stereoscopic content without the 3D effect. Therefore, information for interoperability between mono and stereoscopic content is needed. For example, by including information such as the content composition type and the physical location of each side's image/video data in the file format, stereoscopic content can be displayed on monoscopic mobile phones and devices.
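
How a monoscopic player could use such interoperability information is sketched below. The composition type names ("side_by_side", "top_bottom", "line_interleaved") and the frame representation as a list of pixel rows are assumptions made for this example; they are not taken from the Stereoscopic MAF specification.

```python
def mono_view(frame, composition, view="left"):
    """Extract a single view from a decoded frame so that a monoscopic
    device can display it.

    frame       -- list of rows, each row a list of pixel values
    composition -- hypothetical composition type signalled in the file
    view        -- "left" or "right"
    """
    if view not in ("left", "right"):
        raise ValueError("view must be 'left' or 'right'")
    if composition == "mono":
        # Already monoscopic: nothing to extract.
        return frame
    if composition == "side_by_side":
        # Left view in the left half of each row, right view in the right half.
        half = len(frame[0]) // 2
        return [row[:half] if view == "left" else row[half:] for row in frame]
    if composition == "top_bottom":
        # Left view in the upper half of the frame, right view in the lower half.
        half = len(frame) // 2
        return frame[:half] if view == "left" else frame[half:]
    if composition == "line_interleaved":
        # Even rows carry the left view, odd rows the right view.
        start = 0 if view == "left" else 1
        return frame[start::2]
    raise ValueError("unknown composition type: %s" % composition)
```

The same function also covers the case in which a user with a stereoscopic player selects the monoscopic mode.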

2.9.1.3       Interactive services by Scene Representation of the Stereoscopic contents

Currently, stereoscopic content is widely used on mobile devices, which generally offer interactivity. Stereoscopic content can also be combined with 2D media data. In order to enrich interactive and 2D-combined services, scene description functionality is required. By using this functionality, a user can enjoy a variety of services in a limited display environment. For example, a user can access the corresponding web site while watching a stereoscopic video advertisement.

2.9.2     Functionalities in terms of application scenario

2.9.2.1       Creation of the Stereoscopic MAF Contents

Stereoscopic content can be created from a stereoscopic camera. This content will contain stereoscopic still images and/or motion pictures. Stereoscopic content can also be created by using CG animation tools.

Since the combination of mono and stereoscopic contents will be used for various applications, it is required for stereoscopic MAF to contain mono and stereoscopic contents in a single file format.

2.9.2.2       Metadata for the stereoscopic MAF contents

Stereoscopic MAF content can be created combining acquisition/creation information with stereoscopic still images and motion pictures.

In order to identify a stereoscopic content it would be necessary to provide additional information by using stereoscopic metadata. This metadata should provide key functionality to support the activities of the mobile service companies, mobile phone manufacturers, stereoscopic content providers and users.

2.9.2.3       Player for the Stereoscopic MAF Contents

The Stereoscopic MAF content player will be able to play stereoscopic and monoscopic contents as well.  A user will be given the option to play stereoscopic MAF content from the following modes: Monoscopic mode, Stereoscopic mode.

2.9.2.4       Interactive service through the Stereoscopic MAF contents

Interactivity can be included in the Stereoscopic MAF. By selecting or clicking individual components, a user can access extended services or extra information, and thus can enjoy various rich experiences on personal devices.

2.9.2.5       Stereoscopic video safety

Since stereoscopic content is displayed on the basis of different view data, visual discomfort may occur more frequently with stereoscopic content than with monoscopic content. A process is therefore required for preventing visual discomfort by using the parallax information of the Stereoscopic MAF content.
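
A player-side use of parallax information might look like the sketch below. It assumes that disparities are given in pixels, that the representative disparity is the one with the largest magnitude, and that the comfort limit is supplied by the application as a fraction of the frame width; none of these choices, nor any particular limit value, is defined in this document.

```python
def relative_parallax(disparity_px, frame_width_px):
    """Disparity magnitude expressed as a fraction of the frame width."""
    if frame_width_px <= 0:
        raise ValueError("frame width must be positive")
    return abs(disparity_px) / frame_width_px


def representative_disparity(disparities_px):
    """Pick the disparity with the largest magnitude (an assumption of
    this sketch; other definitions are possible)."""
    return max(disparities_px, key=abs)


def exceeds_comfort_limit(disparities_px, frame_width_px, limit):
    """True if the representative disparity exceeds the application's
    own comfort limit, given as a fraction of the frame width."""
    rep = representative_disparity(disparities_px)
    return relative_parallax(rep, frame_width_px) > limit
```

A player could react to a positive result by, for example, switching to the monoscopic mode or warning the user.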

2.9.3     Requirements

General Requirements

·        shall support stereoscopic still image and motion picture services.

·        shall support compatibility with the ISO base media file format, so that the content can be handled by a 2D-based ISO base media file player.

·        should support the mono and stereoscopic contents to be simultaneously displayed in a scene

·        should support random access of media data.

Requirements for metadata

·        shall support information on the construction methods used for generating stereoscopic content, including still images and motion pictures.

·        shall support acquisition/creation information for stereoscopic still images and motion pictures from the stereoscopic acquisition/creation system.

o   For example, the distance between the two cameras, viewing types such as crossed-eye view or parallel-eye view, viewing distance versus use/validity depth ratio, kinds of camera, etc.

·        shall support the efficient contents management method for indexing and searching applications.

·        should support the contents’ information such as author, date, category, manufacturer information and usage history, etc.

Requirements for player

·        shall support the decoder(s) defined in the list of technology.

·        should support the mono and stereoscopic contents to be simultaneously displayed in a scene

Requirements for Scene Representation

·        should provide the scene representation for rich and interactive service.

Requirements for video safety

·        should provide the parallax information such as representative disparity information of the stereoscopic MAF contents

2.9.4     List of technologies

The Stereoscopic MAF will be based on the following technologies:

·        ISO Base Media File Format

·        Video compression format : MPEG-4 part2 (SP@level 3) , MPEG-4 part10 (BP@level 1.3)

·        Audio codec: AAC-LC, AAC+

·        Voice Codec: AMR, EVRC

·        Scene description codec: LASeR

·        Still Image: JPEG, PNG

·        Metadata: MPEG-7 MDS

2.9.5     Comparison with other MAFs

The Music Player MAF, Musical Slide Show MAF, Photo Player MAF, Open Access MAF, and Professional Archival MAF have different goals than the Stereoscopic MAF in that they are not used in video applications. Therefore, it is more reasonable to compare it with the Portable Video MAF, Video Surveillance MAF, Digital Video/Cinema MAF, Digital Multimedia Broadcasting MAF and Media Streaming MAF [3].

·        Comparison with Portable Video MAF (PVP MAF)

o   The Stereoscopic MAF has a very specific application domain, which deals with stereoscopic imaging and metadata specifically designed for such applications, whereas the Portable Video Player MAF provides a “DVD-like” video application file format for portable devices.

·        Comparison with Media Streaming MAF (MS MAF)

o   MS MAF focuses on DRM of contents for streaming service. Stereoscopic MAF is mainly concerned with storing and interchanging Stereoscopic contents. Therefore, the two MAFs have different application domains.

·        Comparison with Digital Multimedia Broadcasting MAF

o   DMB MAF focuses on contents for DMB service. Stereoscopic MAF is mainly concerned with storing and interchanging Stereoscopic contents. Therefore, the two MAFs have different application domains and video codec types.

·        Comparison with Video Surveillance MAF

o   The purpose of Video Surveillance MAF is to make a standard for CCTV (Closed Circuit TV), and provide compatibility between surveillance videos from different CCTVs. Thus, the application domain of Video Surveillance MAF is different from that of Stereoscopic MAF.

·        Comparison with Digital Video/Cinema MAF

o   Digital Video/Cinema MAF’s purpose is to keep colour information which can be changed by the metadata. A user can consume digital video with colour information, which is created when digital video is created in the studio. Stereoscopic MAF and Digital Video/Cinema MAF have independent application domains.

In terms of the media handled and the application services, the Stereoscopic MAF is most closely related to the PV and Digital Multimedia Broadcasting MAFs. It is therefore compared with those MAFs in terms of technical specifications as follows:

 

Target market

·  PV MAF: Mini-DVD; portal service

·  DMB MAF: DMB Application Terminals only

·  Stereoscopic MAF: Stereoscopic file download service, stereoscopic UCC

Application scenario

·  PV MAF: A possible standard format for portable applications that utilize mid-resolution video: mini-DVD/UMD; portable video (PV)

·  DMB MAF: DMB specific services: storage of DMB contents only with associated metadata; IP Media Services with DMB contents; exchange of DMB contents across DMB MAF Terminals; User-Creative DMB contents; Metadata Applications with DMB MAF

·  Stereoscopic MAF: File format of wide application: creation of the Stereoscopic MAF contents; player of the Stereoscopic MAF contents

File format

·  PV MAF: ISO Base Media / MPEG-4 / AVC File Format; MPEG-21 File Format; only specified with ftyp, moov, meta, mdat boxes; use xml box to contain MPEG-4 LASeR; content information is carried in xml box inside trak box

·  DMB MAF: Derived MPEG-4 File Format with extension and restrictions (extend AVC File Format); use meta box to carry MPEG-21 IPMP & DID; content information is carried in MPEG-21 DID

·  Stereoscopic MAF: ISO Base Media file format

Video resources

·  PV MAF: MPEG-4 AVC Baseline Profile, Level 1.2 for a resolution of 320x240 (4:3); MPEG-4 AVC Baseline Profile, Level 2.1 for a resolution of 480x272 (16:9); MPEG-4 Part2 Visual

·  DMB MAF: MPEG-4 AVC Baseline profile level 1.3 only

·  Stereoscopic MAF: MPEG-4 part2 (SP@level 3), MPEG-4 part10 (BP@level 1.3)

Audio resource

·  PV MAF: MPEG-1/-2 Layer-3; MPEG-4 AAC/AAC+/HE-AAC/BSAC Audio

·  DMB MAF: MPEG-2 AAC+, MPEG-4 ER-BSAC, MPEG-4 HE-AAC

·  Stereoscopic MAF: MPEG-4 AAC-LC, AAC+

Voice codec

·  PV MAF: (no entry)

·  DMB MAF: (no entry)

·  Stereoscopic MAF: AMR, EVRC

Systems support

·  PV MAF: Use MPEG-4 LASeR to represent menu to select content (video, audio, text)

·  DMB MAF: MPEG-4 OD/IOD/BIFS are used for scene descriptions, interactions, and synchronizations of multiple objects

·  Stereoscopic MAF: LASeR

Data service resources

·  PV MAF: Not defined

·  DMB MAF: BWS, TTI, MOT Slideshow, Java midlet

·  Stereoscopic MAF: Not required

Metadata

·  PV MAF: MPEG-7 MDS: 1) Creation DS, 2) UsageHistory DS, 3) HierarchicalSummary DS

·  DMB MAF: MPEG-21 DID to organize DMB contents, metadata, and IPMP; TV-Anytime Phase 1 Phase 2 Metadata (Part 3)

·  Stereoscopic MAF: MPEG-7 MDS

IPMP

·  PV MAF: Protection technologies to be aligned with the Music Player MAF, 2nd edition

·  DMB MAF: MPEG-21 REL DAC Profile; MPEG-21 IPMP Components Media Streaming Profile

·  Stereoscopic MAF: Not required

2.9.6     Issues

The Stereoscopic MAF is motivated by the needs of industry. Related products are already on the market, so a Stereoscopic MAF file format will boost that market.

The current technology set should therefore reflect current implementations, taking into account standardization progress and time-to-market requirements.

In the future, however, it should be investigated whether adding technologies increases terminal complexity or helps to broaden the range of applications, again taking into account market requirements and the standardization process.

2.9.7     Supporting Companies and Organizations

·         Next generation broadcasting forum, Korea

·         SK Telecom

·         Samsung Electronics co., Ltd

·         LG Electronics

·         Samsung Electro-Mechanics co., Ltd

·         MBC(Munhwa Broadcasting Corp.)

·         ETRI(Electronics and Telecommunications Research Institute)

·         KETI(Korea Electronics Technology Institute)

·         KIST(Korea Institute of Science and Technology)

·         Samsung SDI co., Ltd

·         Samsung Techwin co., Ltd

·         ECT(Enhanced Chip Technology Inc.)

·         KWU (Kwangwoon University)

·         TU Media Corp.

·         3D Project Inc.

·         Kyung Hee University

2.9.8     References

[1]   Stereoscopic Multimedia Processor, http://www.ect.co.kr

[2]   BigIEntertainment, http://www.bigient.com

[3]   ISO/IEC JTC1/SC29/WG11/N9164, MAFs Overview, Lausanne, Switzerland, July 2007.

 


 

3      MAFs under Development

This section describes the MAFs under development, that is, the MAFs that have been under consideration and for which consensus has been reached on the value of their specification as a standard.

3.1    Protected Musical Slide Show Application Format

ISO/IEC 23000-4 (Musical Slide Show MAF) specifies a file format for packaging MP3 audio data, JPEG images, MPEG-7 Simple Profile metadata, timed text, and MPEG-4 LASeR script. Musical Slide Show MAF content is presented by synchronizing all resources with the animation defined by the MPEG-4 LASeR script. The Musical Slide Show MAF will enrich the user’s experience in consuming different resources together. However, the current Musical Slide Show MAF does not yet consider any protection and governance mechanism, which is essential for deploying it in real business.

In this section, a protection mechanism to enhance the current Musical Slide Show MAF is considered. The proposed protection mechanism utilizes IPMP technology so that it is flexible and extensible enough to use any protection tool.

3.1.1     Application Scenario Description

The resources inside the Musical Slide Show MAF can be protected as a whole with a protection tool chosen by the creator. Before exercising or playing back the resources, the player needs to be able to unprotect them by using the correct protection tool. If the protection tool and key required to unprotect the resources are not available and cannot be retrieved, the player should notify the user.

Moreover, the protection can be applied selectively. For example, one might want to protect only the MP3 audio data, or the audio data together with the slide show, or the animation for the slide show. It is also possible to protect only some segments of the MP3 audio data, several JPEG images in the slide show, or even a region within a JPEG image.

The following are the possible use-cases of protecting the Musical Slide Show MAF:

·         Protecting all resources at the same time: MP3 audio, JPEG images, LASeR script animation, timed text, and MPEG-7 metadata

·         Protecting a combination of the aforementioned resources, e.g. the MP3 audio and the slide show, the MP3 audio and the animation, or the JPEG images and the animation

·         Protecting a certain segment of the MP3 audio

·         Protecting one or more JPEG images

·         Protecting a certain region of a JPEG image
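
The player behaviour described in this scenario can be illustrated with a minimal sketch. It is not taken from the IPMP specification: the `Resource` class, the function name, the file names and the tool identifiers are all hypothetical, and the sketch only separates the resources that can be unprotected from those for which the user should be notified.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Resource:
    """A resource in the file; tool_id is None when it is not protected."""
    name: str
    tool_id: Optional[str] = None


def split_by_availability(
    resources: Iterable[Resource], available_tools: Set[str]
) -> Tuple[List[str], List[str]]:
    """Return (playable, blocked) resource names for the tools a player has."""
    playable: List[str] = []
    blocked: List[str] = []
    for resource in resources:
        if resource.tool_id is None or resource.tool_id in available_tools:
            playable.append(resource.name)
        else:
            blocked.append(resource.name)
    return playable, blocked


# Selective protection: only the audio and one image are protected.
# The tool identifiers are placeholders, not registered IPMP tool IDs.
slide_show = [
    Resource("audio.mp3", tool_id="tool-A"),
    Resource("slide1.jpg"),
    Resource("slide2.jpg", tool_id="tool-B"),
    Resource("animation.laser"),
]
playable, blocked = split_by_availability(slide_show, available_tools={"tool-A"})
for name in blocked:
    print(f"Cannot unprotect {name}: required protection tool is not available")
```

In this sketch a player holding only "tool-A" can present the audio, the first slide and the animation, and reports the second slide to the user.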

3.1.2     Requirements

The requirements to support the above use cases are as follows:

·         The protection mechanism should be able to protect the MP3 audio, JPEG images, LASeR script, timed text, and MPEG-7 metadata individually and/or all together.

·         The protection mechanism should be able to accommodate various protection tools flexibly.

·         The protection mechanism should be able to protect selected JPEG images only.

·         The protection mechanism should be able to protect a certain segment of the MP3 audio.

·         The protection mechanism should be able to protect a certain region of a JPEG image.

3.1.3     List of technologies

The potential technologies to support implementation of this MAF are the following:

·         Technologies used in ISO/IEC 23000-4

·         MPEG-21 DIDL

·         MPEG-21 IPMP Base Profile

·         MPEG-21 REL MAM Profile

·         MPEG-21 Fragment Identifier

3.1.4     Comparison with other MAFs

At present, only the Music Player MAF has protection capability in its specification. Other MAFs may intend to define a protection scheme, but these are not yet mature.

3.1.4.1       Protected Music Player MAF

Compared with the Protected Music Player MAF, the MAF under consideration is similar in the following respects:

·         Both use MPEG-21 DIDL and the MPEG-21 IPMP Components Base Profile to signal the resource structure and the type and location of the protection applied to the resources.

·         Both use the MPEG-21 IPMP Components Base Profile to support flexible protection.

·         Both use the MPEG-21 REL MAM Profile for the license description.

 

The differences are as follows:

·         The two MAFs have different file structures

o   The Protected Music Player MAF supports two file structures (track in MP4 FF and track(s) in MP21 FF), while the (Protected) Musical Slide Show MAF supports only MP4 FF.

o   The location of the protection description differs between the two MAFs because of the file structure.

o   The Protected Music Player MAF supports both a default protection tool (AES-128 Counter Mode) and flexible protection, while the protection mechanism currently proposed for the Protected Musical Slide Show MAF leaves the choice of protection tool to the content creator (flexible protection).

3.1.5     Issues

No major issue has been encountered yet.

Note that document M14175 contains preliminary thoughts on how the technologies listed in sub-clause 3.1.3 are used for the MAF under consideration.

3.1.6     Supporting Companies and Organizations

·         Information and Communications University (ICU)

·         Korean Broadcasting System (KBS)

·         LG Electronics

Support by other companies is currently under investigation.

 

3.2    Portable Video Application Format

Recently, more and more portable music players (MP3 players) have been extended into multimedia players capable of displaying images and videos, because the necessary processing power and displays are now available for this kind of device.

Unfortunately, there is no standard format for mobile, mid-resolution video comparable to the DVD format for standard definition. Devices therefore have to support a wide variety of audio codecs (e.g. MP3, AAC), video codecs (e.g. MPEG-1, MPEG-4, AVC) and file formats (e.g. AVI, MP4), because content is distributed in many different combinations of these. Even so, there is no guarantee that every device can play back all available kinds of content. This uncertainty, and the lack of interoperability in general, inhibits the market.

Both industry and users could therefore greatly benefit from a Portable Video MAF that enables interoperability between these mobile multimedia devices.

This section describes application scenarios and derived requirements for a “Portable Video MAF”.

3.2.1     Application Scenarios

Portable video players (PVP) today can play back good-quality video on 3 to 5 inch displays at the usual resolutions of 320x240 or 480x272 pixels. Flash memory or hard disk storage can hold many hours of video. Small pre-recorded media such as mini-DVD/UMD (1.8 GB capacity) are also used in portable devices. As an example, a total movie file size of 512 MB corresponds to 134 min at a bit rate of 500 kbps.
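
The relation between file size, bit rate and playing time behind this example can be checked with a short calculation. The sketch below is illustrative only and its function name is hypothetical. It is idealised in that it counts every stored bit as media payload at a constant rate, so it yields about 136.5 min for decimal megabytes, slightly more than the 134 min quoted above. The lower figure would be consistent with part of the file being taken by container overhead or additional tracks, although this document does not state how it was derived.

```python
def playback_minutes(file_size_bytes: int, bit_rate_bps: int) -> float:
    """Idealised playing time in minutes for a constant-bit-rate stream."""
    if bit_rate_bps <= 0:
        raise ValueError("bit rate must be positive")
    total_bits = file_size_bytes * 8
    return total_bits / bit_rate_bps / 60


# 512 MB interpreted as decimal megabytes, 500 kbps as 500,000 bit/s.
decimal_minutes = playback_minutes(512 * 10**6, 500_000)
# The same size interpreted as binary mebibytes.
binary_minutes = playback_minutes(512 * 2**20, 500_000)
print(f"{decimal_minutes:.1f} min (decimal MB), {binary_minutes:.1f} min (binary MiB)")
```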

Three different scenarios apply:

·         Playback of content the users generated on their own for the PVP (downsize and encode), e.g. personal recordings or home video

·         Playback of content sold on disk media (mini-DVD), comparable to the DVD business

·         Playback of content sold over the internet, comparable to the online music store business.

To enable a business of selling content for PVP devices, interoperability at the codec level is not sufficient; interoperable digital rights management is also required to protect the rights of the content owners.

Premium content will support advanced features such as multiple audio tracks in different languages and subtitles (as known from DVDs). Metadata such as the film title or the names of the actors should be included. The movie poster can be included as a still image.

Simple past TV program “download & play” application
John’s favourite television show is “Friends.” He decides to download last season’s episodes from the website (owned by the television network) and store them on his portable media player so that he can watch them during his daily subway commute to work. John performs the following:

·         He logs on to the website owned by the television network

·         He chooses the episodes of “Friends” that he wants to download

·         He pays for his selections

·         He downloads the files to his computer

·         He moves the files to his portable media player for later viewing

On his way to work the next morning, John takes out his portable media player and opens the file “friends01_season8.mp4.” Once the file loads, he sees a visually enhanced, interactive menu screen. He performs the following:

·         He browses the menu options (“Main feature” and “Extra features”)

·         He chooses the “Extra features” option and presses the “play” button

·         He sees another sub-menu screen that has links to “previews” and “behind-the-scene” clips

·         He presses the “cancel” button and returns to the main menu screen

·         He chooses the “Main feature” option and presses the “play” button

·         He begins to watch the episode

Simple playback application utilizing “Usage History Description”
John has an appointment with his clients at their office, and he has to take the subway, which is the most reliable means of transportation in the city. He has to make two transfers in order to get to his clients’ office. John takes out his portable media player once again, and opens another file, “friends02_season8.mp4”. He performs the following:

·         He chooses the “Main feature” option and presses the “play” button

John arrives at a station where he has to make his first transfer.

·         He presses the “stop” button, and he exits the program.

The program saves John’s usage history description to a separate file. He transfers to another subway line, and then reloads the file “friends02_season8.mp4” on his PMP. The system automatically loads the usage history description.

·         He chooses the “Continue” option from the menu screen and presses the “play” button

·         He begins to watch the episode from where he left off previously
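
The "stop and continue" behaviour in this scenario can be sketched as follows. This is only an illustration under stated assumptions: a JSON sidecar file stands in for the usage history description (it is not the MPEG-7 UsageHistory DS encoding), and the function names, the sidecar naming and the stored position are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def history_path(media_file: Path) -> Path:
    """Sidecar file that sits next to the media file (naming is hypothetical)."""
    return media_file.with_name(media_file.name + ".history.json")


def save_position(media_file: Path, position_seconds: float) -> None:
    """Record where playback stopped, as the player does when John presses "stop"."""
    record = {"media": media_file.name, "last_position_seconds": position_seconds}
    history_path(media_file).write_text(json.dumps(record), encoding="utf-8")


def resume_position(media_file: Path) -> float:
    """Return the saved position, or 0.0 when no usage history exists yet."""
    path = history_path(media_file)
    if not path.exists():
        return 0.0
    record = json.loads(path.read_text(encoding="utf-8"))
    return float(record["last_position_seconds"])


# Demonstration in a temporary directory; no real media file is needed.
with tempfile.TemporaryDirectory() as folder:
    episode = Path(folder) / "friends02_season8.mp4"
    first_load = resume_position(episode)      # nothing saved yet
    save_position(episode, 754.0)              # John presses "stop"
    after_transfer = resume_position(episode)  # "Continue" after the transfer
print(first_load, after_transfer)
```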

Simple playback application utilizing “Hierarchical Summary Description”
Susan has missed several episodes of her favourite television newsmagazine program “60 Minutes” while she was away on a business trip. She decides to log on to the TV network’s website, download the past episodes and play them on her portable media player during her subway commute to work. Susan performs the following:

·         She logs on to the website owned by the television network.

·         She chooses the episodes of “60 Minutes” that she wants to download.

·         She pays for her selections.

·         She downloads the files to her computer.

·         She moves the files to her portable media player for later viewing.

On her way to work the next day, Susan loads the file “60min_050510.mp4” on her PMP. Among the options provided in the menu screen, she chooses the “Summary Segments” option and presses the “play” button. She sees a sub-menu screen that displays three key frame images with key words or key phrases representing the three stories presented in the episode of “60 Minutes” that she has chosen. The key phrases are “Climate change”, “X-treme sports”, and “Violence in television”.
·         She chooses “Climate change” from the menu and presses the “play” button.
·         She browses a series of key frames related to the “Climate change” story. The key frames are labelled as “Global warming”, “Water pollution”, and “Rain forest”.
·         She chooses one of the key frames labelled “Global warming” and presses the “play” button.
·         She begins to watch a clip on global warming instead of watching the whole story on “Climate change.”
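
The two-level navigation in this scenario amounts to walking a small tree of labelled segments. The sketch below illustrates that idea only; it does not reproduce the MPEG-7 HierarchicalSummary DS, the class and function names are hypothetical, and the start times are invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SummarySegment:
    """One node of the summary: a labelled key frame plus optional sub-segments."""
    label: str
    start_seconds: float
    children: List["SummarySegment"] = field(default_factory=list)


def find_segment(node: SummarySegment, path: List[str]) -> Optional[SummarySegment]:
    """Follow a sequence of labels down the hierarchy, as Susan does in the menu."""
    for label in path:
        node = next((c for c in node.children if c.label == label), None)
        if node is None:
            return None
    return node


# Labels come from the scenario; the start times are invented for illustration.
episode = SummarySegment("60min_050510.mp4", 0.0, [
    SummarySegment("Climate change", 60.0, [
        SummarySegment("Global warming", 95.0),
        SummarySegment("Water pollution", 410.0),
        SummarySegment("Rain forest", 780.0),
    ]),
    SummarySegment("X-treme sports", 1200.0),
    SummarySegment("Violence in television", 2400.0),
])

clip = find_segment(episode, ["Climate change", "Global warming"])
print(clip.start_seconds if clip else "not found")
```
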
Simple mobile broadcast “save & replay” application
John is a high school student who frequently watches educational television programs covering the 10th grade curriculum. He used to record them on VHS tapes and watch them again whenever he needed to. Now that he has a mobile phone capable of receiving DMB signals, he uses his mobile phone instead of his television set. He performs the following:
·         John selects the DMB function on his mobile phone
·         He chooses the EBS (Educational Broadcasting System) channel
·         He begins watching a program on 10th grade “Advanced Mathematics”
·         He presses the button that corresponds to the “SAVE” menu on his phone
·         He makes notes using the memo pad function on his phone
After 20 minutes, the program ends, and John performs the following:
·         He presses the button that corresponds to the “END” menu on his phone
The media content is saved as a Portable Video MAF file, along with metadata that contains his notes, for later viewing. Because the content is saved as a file, John can watch it later on any multimedia device that supports this format, and the metadata makes browsing easier when there are many files.

3.2.2     Derived requirements

3.2.2.1       File Format:

The Portable Video MAF should use a file format for the following purposes:
·         Local playback on devices
·         Web downloads
·         Content storage
·         Distribution

 

3.2.2.2       Audio and Video:

·         The Portable Video MAF should support commonly available audio and video coding schemes.
·         The Portable Video MAF should support multiple video tracks for:
-          Extra materials (e.g. movie previews)
·         The Portable Video MAF should support multiple audio tracks for:
-          Multilanguage support (audio dubbing)

 

3.2.2.3       Textual presentation:

·         The Portable Video MAF should support timed text data.
·         The Portable Video MAF should support multiple text tracks for:
-          Multilanguage support (subtitles)
·         Textual presentation should be aligned with Musical Slide Show MAF

 

3.2.2.4       Metadata:

·        The Portable Video MAF should support content creation information description, including:

-          Title (title of the movie)

-          Artist (creator who took the clip/movie)

-          Clip description (general description of the clip/movie)

-          User comment (user annotation/text)

-          File date/time (date/time in which the clip was taken)

·        The Portable Video MAF should support usage history description for:

-          Profiling user actions on devices

·        The Portable Video MAF should support hierarchical summary description for:

-          Video summary

-          Organizing video clips into chapters

 

3.2.2.5       Menu scheme

·        The Portable Video MAF should support animated graphics for:

-          Actual implementation of the menu screen

·        The Portable Video MAF should support still image data for:

-          Preview images (to be used with the menu screen)

 

3.2.2.6       Protection:

The DRM components of the Portable Video MAF should be aligned with the “Music Player MAF, 2nd edition.”

3.2.3     List of technologies

3.2.3.1       File format:

ISO Base Media/MPEG-4/AVC File Format

3.2.3.2       Audio and Video:

·         MPEG-1/-2 Layer-3
·         MPEG-4 HE-AAC
·         MPEG-4 BSAC
·         MPEG-4 AVC

3.2.3.3       Textual presentation:

MPEG-4 Streaming Text Format

3.2.3.4       Metadata:

·         MPEG-7 Multimedia description scheme, Creation DS
·         MPEG-7 Multimedia description scheme, UsageHistory DS
·         MPEG-7 Multimedia description scheme, HierarchicalSummary DS

3.2.3.5       Menu scheme:

·         JPEG ISO Standard for still images
·         MPEG-4 LASeR

3.2.3.6       Protection:

The protection technologies involved in the Portable Video MAF should be aligned with the “Music Player MAF, 2nd edition”.

3.2.4     Comparison with other MAFs

The following MAFs are inherently different from the Portable Video MAF in terms of the main technologies that are used to define the characteristics of each MAF:

 

Music Player MAF

The Music Player MAF is an audio application that provides the file format for combining MP3 audio, MPEG-7 metadata, and an optional JPEG image.

 

Photo Player MAF

The Photo Player MAF defines the specification for a file format that carries JPEG images with MPEG-7 Visual and MDS metadata.

 

Music Player MAF, 2nd edition

The Music Player MAF, 2nd edition features the same functionalities as the Music Player MAF, with the addition of content protection tools.

 

Musical Slide Show MAF

The Musical Slide Show MAF is a file format that carries MP3 audio, multiple JPEG images, text, and LASeR for animation.

 

Professional Archival MAF

The Professional Archival MAF is an application that focuses on handling of audio files in a single archived file.

 

Open Access MAF

The Open Access MAF is an application that focuses on publication and delivery of content that is governed in a “light-weight” form.

 

The following MAFs are video applications that involve similar technologies, but are different in terms of target application domain and devices:

 

Media Streaming MAF

The Media Streaming MAF standardizes streaming content and protocols.

 

Video Surveillance MAF

The Video Surveillance MAF is an application that is specifically targeted towards the application domain of surveillance.

 

Digital Video/Cinema MAF

The Digital Video/Cinema MAF is an application that concentrates on the distribution of high-resolution video content to professional users with emphasis on color management information.

3.2.5     Issues

It should be investigated whether adding animated graphics technologies such as MPEG-4 Part 20 (LASeR) to this MAF would

·         add a significant complexity burden to this MAF, or

·         help to broaden the application scenarios, provided that the additional complexity is not a burden.

3.2.6     Supporting Companies and Organizations

·         Fraunhofer IIS

·         LG Electronics

·         MediaLive

·         Samsung

·         Streamezzo

 

3.3    Interactive Music Application Format

In modern digital music production, the producer or music engineer creates music content through recording, mixing and mastering processes. Because it is not efficient to record multiple instruments simultaneously, the producer records a number of audio tracks separately. These audio tracks may contain the different musical instruments of a band, vocalists, the sections of an orchestra, announcers and journalists, crowd noises, and so on. After the mixing and mastering processes, the producer generates a single final track for distribution. It is therefore impossible for users to control a specific instrument or vocal as they wish, because the audio tracks have already been mixed into one track.

Interactive music content is packaged with the audio tracks as they exist before the mixing process, so users can freely control each individual audio track. In addition, producers can create several versions of one music piece (producer mixing 1, producer mixing 2, karaoke, rhythmic, and so on) using the metadata structure for mixing information. Interactive music content thus provides rich interactivity to producers as well as users.

The following table shows the main characteristics of an interactive music service compared with a traditional one. An interactive music service needs to provide just one file, the interactive music file, and users can then enjoy all services with an interactive music player.

 

                 Traditional music (single music)           Interactive music (multi-music)

Features         Single track (L/R)                         Multi-track (instruments, L/R)
                 Fixed style (producer’s music)             Free style (user’s music)

Potential        Limited                                    Unlimited (flexible music)
                 Users can only listen; a different type    Flexible for value-added services
                 of music (e.g. MIDI) is needed for         (karaoke, style mix, remix, etc.)
                 value-added services

The interactive music service provides preset and hierarchy functionality as well as user control over each audio track. A preset is a set of mixing information that is provided together with the audio track data; a file can carry multiple presets. A hierarchy is an organisation of relationships between tracks or elements. It uses the principle of levelled relations, for instance a “parent” element (a track or a group of tracks) and its related “children”. The hierarchy opens a new level of interactivity, as it allows actions on groups of elements and not only on tracks. All the functionalities can then be applied to the different levels of the hierarchy. Figure 1 shows a snapshot of an interactive music album player. In Figure 1 (a), five presets (original mix version, voice cut version, a cappella version, simple mix version, rhythmic version) are offered to the user as the names of producer tracks in this interactive music content. Figure 1 (b) shows an example of a hierarchy with 3 levels.

 


Figure 1. A snapshot of interactive music album player:
(a) hierarchy with 1 level; (b) hierarchy with 3 levels.

Figure 2. Conceptual model with presets of Interactive Music File

 

3.3.1     Application Scenario Description

1.    Listening to music selected/controlled to the user’s taste

A.      Users listen to music in which the sound of an arbitrary track, for example vocal, keyboard and/or drums, is emphasized to their taste (or as needed).

B.      Users experience a richer musical mood by emphasizing the chorus track.

C.      Users listen to a different musical mood by selecting the a cappella version or remix version from the presets.

2.    Karaoke and performance practice

A.      Users learning to play the guitar practise along with the music while the guitar track is muted.

B.      Users practise singing a song along with the accompaniment by selecting the karaoke version from the presets.

3.    Music UGC (User Generated Content) production by control/addition of audio tracks

A.      After adjusting the audio tracks to their taste (or as needed), users produce stereo content, i.e. Music UGC of the recomposed music, by downmixing.

B.      After substituting the musician’s vocal with their own voice or adding a new audio track to the music, users produce stereo content, i.e. Music UGC of the recomposed music, by downmixing.

 

 

3.3.2     Requirements

3.6.2.1.       General Requirements

IM AF shall support the following functionalities:

·       unified storage and interchange of multiple audio tracks with the associated data either in a separate or combined form

·       random access to the stored multiple audio tracks

·       storage of multiple different presets

·       storage of content description metadata such as Title, Artists, Genre, Creation date/time, Duration, Related Material, etc.

·       storage of one or more still images associated with the music data, audio tracks and presets

·       storage of one or more timed text data for lyrics

·       storage of a collection of several IM AF contents in one IM AF file

·       unique identification of an IM AF file

·       mono, stereo or multichannel audio output configurations

·       streaming

 

IM AF shall be able to describe the restrictions on the following functionalities:

·       addition of new tracks to an IM AF file

·       removal of tracks from an IM AF file

·       editing of the existing tracks in an IM AF file

·       display of information contained in preset

·       addition of presets to an IM AF file

·       removal of presets from an IM AF file

·       editing of the existing presets in an IM AF file

 

IM AF may support presentation information (scene description).

If a default mix exists in an IM AF file, it should be playable by a non-interactive player.

 

3.6.2.2.       Brand and Audio Coding

IM AF shall have a brand that can be implemented in current mobile devices such as mobile phones and portable audio players.

IM AF brands shall define limits on the maximum number of tracks to be decoded simultaneously.

IM AF brands shall specify compressed (lossy or lossless) and/or uncompressed audio coding.

Compressed audio bitstreams for IM AF should be able to provide high audio quality.

3.6.2.3.       Interactivity Requirements

IM AF shall support the following interactivity types:

·       Selection interactivity: the listener can choose the tracks or elements of the hierarchy he/she wants to listen to.

·       Volume interactivity: the listener can change the volume level of each track or element of the hierarchy.

IM AF should support the following interactivity types:

·       Equalization interactivity: the listener can modify the frequency band level of each track or element of the hierarchy.

NOTE: According to the ISO base media file format standard (ISO/IEC 14496-12), the ‘track’ is defined as “A collection of related samples (q.v.) in an ISO base media file. For media data, a track corresponds to a sequence of images or sampled audio. For hint tracks, a track corresponds to a streaming channel.” Thus, an ‘object’ or ‘channel’ in an audio coding scheme can be considered as a track in an ISO base media file.

 

3.6.2.4.       Preset Requirements

IM AF shall support the following preset functionalities:

·       the number of stored audio tracks shall be included in an IM AF file

·       the number of stored presets shall be included in an IM AF file

·       each preset in an IM AF file shall be able to signal which audio track is to be mixed or muted

·       each preset in an IM AF file shall be able to describe the volume of each audio track

·       each preset in an IM AF file shall be able to describe the preset name

·       the audio track name shall be provided for each audio track in an IM AF file.

·       each preset in an IM AF file shall be able to describe equalization information (including the frequency band level of tracks).

 

The preset should be able to describe the temporal variation of parameters for interactivity.
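
How a preset that signals mixed or muted tracks and per-track volume could drive a mix is sketched below. This is an assumption-laden illustration, not the IM AF preset syntax: the `Preset` class, the linear gain model, the track names and the sample values are all hypothetical, and temporal variation of parameters is not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Preset:
    """Hypothetical preset: a name plus one linear gain per track (0.0 = muted)."""
    name: str
    gains: Dict[str, float]


def mix(tracks: Dict[str, List[float]], preset: Preset) -> List[float]:
    """Mix equal-length mono tracks into one output using the preset gains."""
    lengths = {len(samples) for samples in tracks.values()}
    if len(lengths) > 1:
        raise ValueError("all tracks must have the same number of samples")
    length = lengths.pop() if lengths else 0
    output = [0.0] * length
    for track_name, samples in tracks.items():
        gain = preset.gains.get(track_name, 0.0)  # tracks not listed are muted
        for i, sample in enumerate(samples):
            output[i] += gain * sample
    return output


# Tiny made-up sample values; real tracks would hold decoded PCM audio.
tracks = {
    "vocal": [0.5, 0.5, -0.5],
    "guitar": [0.25, -0.25, 0.25],
    "drums": [0.0, 1.0, 0.0],
}
karaoke = Preset("karaoke", {"vocal": 0.0, "guitar": 1.0, "drums": 0.5})
print(mix(tracks, karaoke))
```

Storing several such presets next to the unmixed tracks is what lets one file offer, for example, both an original mix and a karaoke version.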

 

3.6.2.5.       Hierarchy of Tracks Requirements

IM AF shall be able to contain hierarchical relationships between tracks.
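
A hierarchy of tracks can be pictured as a tree in which an action on a group affects all tracks below it. The sketch below is one possible model and not the IM AF data structure: the `Element` class, the grouping and the choice to multiply volumes along the path from the root are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Element:
    """A node of the hierarchy: a leaf is a track, an inner node is a group."""
    name: str
    volume: float = 1.0
    children: List["Element"] = field(default_factory=list)


def effective_gains(element: Element, inherited: float = 1.0) -> Dict[str, float]:
    """Return the gain of every leaf track, multiplying volumes along the path."""
    gain = inherited * element.volume
    if not element.children:
        return {element.name: gain}
    gains: Dict[str, float] = {}
    for child in element.children:
        gains.update(effective_gains(child, gain))
    return gains


# Three levels: song -> groups -> tracks (the grouping itself is made up).
song = Element("song", children=[
    Element("vocals", children=[Element("lead"), Element("chorus")]),
    Element("rhythm", children=[Element("bass"), Element("drums")]),
])

song.children[1].volume = 0.5              # one action on the "rhythm" group
song.children[0].children[1].volume = 0.0  # mute only the chorus track
print(effective_gains(song))
```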

 

3.6.2.6.       Interactivity Constraint Requirements

·       IM AF shall support constraint functionality, i.e. the signaling of restriction rules on the interactivity defined in sub-clause 3.6.2.3.

·       These restrictions shall be applicable to the hierarchy of tracks defined in sub-clause 3.6.2.5.
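
One simple form of restriction rule is a permitted volume range per track or group, which a player applies to every listener request. The sketch below assumes such a rule purely for illustration; the `VolumeRule` class, its fields and the values are hypothetical, and the actual signaling is left to the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeRule:
    """Hypothetical restriction rule for one track or group of the hierarchy."""
    minimum: float = 0.0
    maximum: float = 1.0


def apply_volume_request(requested: float, rule: VolumeRule) -> float:
    """Return the volume a player would actually use for a listener request."""
    return min(max(requested, rule.minimum), rule.maximum)


# A producer might forbid muting the lead vocal by setting a non-zero minimum.
lead_vocal_rule = VolumeRule(minimum=0.25, maximum=1.0)
print(apply_volume_request(0.0, lead_vocal_rule))
print(apply_volume_request(0.6, lead_vocal_rule))
print(apply_volume_request(1.5, lead_vocal_rule))
```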

 

3.6.2.7.       Player Requirements

IM AF player shall support the following functionalities:

·       simultaneous decoding of multiple audio tracks

·       mixing of multiple decoded audio tracks according to the selected preset

IM AF player should support the following functionalities:

·       synchronized play of music and the associated text

·       display of information contained in preset

·       display of content description metadata

·       binaural output for multichannel configurations

 

3.3.3     List of Technologies

The IM AF may be based on the following technologies:

·       File Format: ISO Base Media File Format, MPEG-21 File Format

·       Audio

-          MPEG-4 Audio AAC profile

-          MPEG-D SAOC

-          MPEG-1 Audio Layer III

-          PCM (Uncompressed)

·       Image: JPEG

·       Text: 3GPP Timed-Text Format

·       Metadata: MPEG-7 MDS CreationDS

 

 

 

 

 

 

 
References

[1]   ISO/IEC JTC1/SC29/WG11/N7069, Draft PDTR ISO/IEC 23000-2, Purpose of Multimedia Application Formats, Busan, Korea, April 2005.

[2]   ISO/IEC 23000-2, Music Player Application Format, FDIS.

[3]   ISO/IEC JTC1/SC29/WG11/N7564, Committee Draft 23000-3 Photo Player Multimedia Application Format, Nice, France, October 2005.

[4]   ISO/IEC 21000-4 IPMP Components FCD, ISO/IEC JTC1/SC29 WG11 MPEG68/N7196, Busan, Korea, April 2005.

[5]   ISO/IEC 14496-12 & 15444-12: ISO Base media File Format, Amendment 1, ISO/IEC JTC1/SC29/WG11/N6596, Redmond, USA, June 2004.

[6]   K.Diepold, F.Pereira, W.Chang. MPEG-A: Multimedia Application Formats. IEEE Multimedia, Vol. 12, No.4, 2005, p.34-41.

[7]   Exchangeable Image File Format for Digital Still Cameras: EXIF Version 2.2, JEITA CP-3451, Standard of Japan Electronics and Information Technology Industries Assoc., Apr. 2002.

[8]   House of Lords, Digital Images as Evidence, Government Response, Select Committee on Science and Technology, Eighth Report, June 1998.

[9]   M. Pramateftakis, T.Oelbaum, K.Diepold: Authentication of MPEG-4 based Surveillance Video. International Conference on Image Processing (ICIP), Singapore, 2004.

 

Annex A

URN Structure for MAFs

urn:mpeg:maf:{usage}:{maf-identifier}:{name-string}:{year}

where

{usage}             is regulated by MPEG N8945

{maf-identifier}    is one of the terms listed in Table A.1, identifying the relevant part of ISO/IEC 23000

{name-string}       is an optional US-ASCII string, unique within a given MAF, to describe the function of this URN, as necessary

{year}              is regulated by MPEG N8945

 

Table A.1 – List of MAF identifiers

Specification       {maf-identifier}

ISO/IEC 23000-2     musicplayer

ISO/IEC 23000-3     photoplayer

ISO/IEC 23000-5     mediastreaming

 

 

EXAMPLE 1      urn:mpeg:maf:schema:photoplayer:collection:2006

EXAMPLE 2      urn:mpeg:maf:cs:musicplayer:CollectionElementsCS:2007

EXAMPLE 3      urn:mpeg:maf:schema:mediastreaming:tva:2006
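
The URN structure and the examples above can be exercised with a small helper. This is a sketch under stated assumptions: the function names are hypothetical, only the full seven-component form shown in the examples is handled (this annex does not show how a URN without the optional {name-string} is written), and the values of {usage} and {year} are not validated because they are regulated by MPEG N8945.

```python
from typing import Dict

# Identifiers taken from Table A.1 of this annex.
MAF_IDENTIFIERS = {
    "musicplayer": "ISO/IEC 23000-2",
    "photoplayer": "ISO/IEC 23000-3",
    "mediastreaming": "ISO/IEC 23000-5",
}


def build_maf_urn(usage: str, maf_identifier: str, name_string: str, year: int) -> str:
    """Assemble a URN in the full form shown in the examples."""
    if maf_identifier not in MAF_IDENTIFIERS:
        raise ValueError(f"unknown MAF identifier: {maf_identifier}")
    return f"urn:mpeg:maf:{usage}:{maf_identifier}:{name_string}:{year}"


def parse_maf_urn(urn: str) -> Dict[str, str]:
    """Split a full-form MAF URN into its named components."""
    parts = urn.split(":")
    if len(parts) != 7 or parts[:3] != ["urn", "mpeg", "maf"]:
        raise ValueError(f"not a full-form MAF URN: {urn}")
    usage, maf_identifier, name_string, year = parts[3:]
    if maf_identifier not in MAF_IDENTIFIERS:
        raise ValueError(f"unknown MAF identifier: {maf_identifier}")
    return {
        "usage": usage,
        "maf-identifier": maf_identifier,
        "specification": MAF_IDENTIFIERS[maf_identifier],
        "name-string": name_string,
        "year": year,
    }


parsed = parse_maf_urn("urn:mpeg:maf:schema:photoplayer:collection:2006")
print(parsed["specification"], parsed["name-string"], parsed["year"])
print(build_maf_urn("cs", "musicplayer", "CollectionElementsCS", 2007))
```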