Monthly Archives: May 2006

Riven DVD Edition’s MPEG 1/2 Layer II decompression problem fixed

Thanks to the awesome work of Christian Walther, the MPEG 1/2 Layer II decompression problem has been addressed. This entry explains the aforementioned problem and how it was solved.

The problem

MPEG 1/2 Layer II packets (frames in MPEG terminology, to which I do not adhere in this instance) always contain 1152 audio frames (an audio frame contains one sample for every channel at a given time). Consequently, encoders have to decide how to handle the situation where an input signal’s frame length is not an integer multiple of 1152.
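To make the arithmetic concrete, here is a minimal sketch in Python. The function name `packet_layout` and the sample clip length are illustrative assumptions; only the 1152 figure comes from the format itself.

```python
FRAMES_PER_PACKET = 1152  # fixed by MPEG 1/2 Layer II

def packet_layout(signal_frames: int) -> tuple[int, int]:
    """Return (packets needed, slack frames left over in the last packet)."""
    # Integer ceiling division: round up to a whole number of packets.
    packets = (signal_frames + FRAMES_PER_PACKET - 1) // FRAMES_PER_PACKET
    slack = packets * FRAMES_PER_PACKET - signal_frames
    return packets, slack

# Hypothetical clip: 3 seconds at 44100 Hz, i.e. 132300 audio frames.
packets, slack = packet_layout(132300)
print(packets, slack)
```

The slack is what the encoder has to account for in some way, whether by padding or by trimming the input.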

Now there may be some manner of standard or convention as to what should be done in such a case, for example padding the beginning or the end of the signal with silence, or padding with silence half at the beginning and half at the end. But none of those is the case for Riven DVD’s MPEG 1/2 Layer II audio resources. I quickly became aware of that fact when I finished the Core Media release and tested a number of cards with looping ambiance effects: there was a very noticeable gap in the audio playback when an MPEG 1/2 Layer II resource looped back.

The solution

So I started examining DVD audio resources versus their ADPCM counterparts from the CD edition. Let’s look at the beginning of the waveforms of such a pair.

Riven CD waveform

Riven DVD waveform

As you can see, the DVD version is clearly the same signal as the ADPCM version, only with some amount of garbage (not pure silence) at the beginning. The fact that it’s not pure silence made it far more difficult to determine the number of frames to drop at the beginning of the DVD resources. Things began to linger for weeks, weeks turned into months, with no solution in sight.

And then it came to me: we have the “original” resources from the CD edition. We could run a cross-correlative analysis on a large number of CD-DVD pairs and see if a number comes out on top statistically. Having never done such an analysis, I posted a message on the Riven X development mailing list, in the vain hope that someone there might be able to help. To my great surprise, Christian Walther offered to perform the analysis. Here are his results, for 2 sample resources.

Cross-correlation 31

Cross-correlation 35

The results couldn’t have been better. For all the files he analyzed, 481 always came out as the point of maximum correlation, and generally in a very sharp manner. We had our number.
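For readers unfamiliar with the technique, here is a rough sketch of the idea in Python. It is not Christian’s actual analysis, whose details are not described here: the function `best_lag`, the synthetic signals and the search range are all assumptions made for illustration. The “CD” signal is random noise, and the “DVD” signal is the same noise preceded by 481 frames of low-level garbage.

```python
import random

def best_lag(reference, delayed, max_lag):
    """Return the lag (in frames) at which `delayed` best matches `reference`."""
    best, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate reference[i] with delayed[i + lag] over their overlap.
        n = min(len(reference), len(delayed) - lag)
        score = sum(reference[i] * delayed[i + lag] for i in range(n))
        if score > best_score:
            best, best_score = lag, score
    return best

# Synthetic stand-in for one CD/DVD pair (not real Riven data).
rng = random.Random(0)
cd = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
dvd = [rng.uniform(-0.05, 0.05) for _ in range(481)] + cd

print(best_lag(cd, dvd, max_lag=600))
```

On this synthetic pair the search should recover the 481-frame offset that was built into the data. With real resources the DVD signal has been through lossy compression, so the peak is less clean, which is why running the analysis over many pairs matters.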

What about the end of DVD resources? Because a fixed number of frames has to be removed at the beginning, something must happen at the end as well to fit inside the 1152-frames-per-packet restriction. Again, Christian provided the answer.

Waveform end comparison

It would seem that the DVD resources were simply trimmed to fit. I’m not entirely sure how they made that work for resources meant to be looped, but empirical tests with updated Riven X and MHKKit did not reveal audible gaps in cards that used to exhibit them.
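In code, the fix at the beginning amounts to very little. The sketch below is an assumption-laden illustration in Python, not the actual MHKKit code: the helper names are hypothetical, and a plain list stands in for decoded audio frames.

```python
LEADING_GARBAGE_FRAMES = 481  # offset found by the cross-correlation analysis
FRAMES_PER_PACKET = 1152      # fixed by MPEG 1/2 Layer II

def usable_frames(packet_count: int) -> int:
    """Frames left in a decoded resource once the leading garbage is dropped."""
    return packet_count * FRAMES_PER_PACKET - LEADING_GARBAGE_FRAMES

def drop_leading_garbage(decoded):
    """Return the decoded frames without the first 481."""
    return decoded[LEADING_GARBAGE_FRAMES:]

# Stand-in for a decoded two-packet resource: frame i simply holds the value i.
decoded = list(range(2 * FRAMES_PER_PACKET))
cleaned = drop_leading_garbage(decoded)
print(len(cleaned), cleaned[0])
```

Since the decoded length is always a multiple of 1152, the usable length after the drop is always 481 frames short of one, which is consistent with the original signal having been trimmed at the end to fit.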

I’m going to be checking things more thoroughly in the coming days, but I’m cautiously optimistic that this problem has been solved.

Serenity in H.264 glory

I am a big fan of Firefly and Serenity. So I chose the title scene at the beginning of Serenity to make sample bitstreams for my internship presentation. I realize I may be violating some copyright laws by posting these files, however they are short and the scene contains screen text for the various people who worked on Serenity. A fair trade, I say.

The source material is the Serenity widescreen DVD, title 3 of the main feature, which has a duration of 00:05:17.86. The MPEG-2 video elementary stream has a bit rate of approximately 5900 kbps and a resolution of 720 by 304 (2.35:1). The AC-3 audio elementary stream has a bit rate of approximately 340 kbps and 6 channels (5.1 layout).

The first file is a high-quality encode using the main profile. The characteristics are as follows:

Video
Bit rate: 2000 kbps (CBR mode)
Resolution: 720 x 304 (2.35:1)
Frames per second: 23.98 (NTSC FILM)
Encoder: x264
Coding tools: main profile, level 5.1, B slices (max one), CABAC, trellis, intra-picture 4×4 analysis, hexagon ME pattern, two pass

Audio
Bit rate: 192 kbps (CBR mode)
Channel layout: stereo
Sampling rate: 48000 Hz
Encoder: FAAC
Coding tools: low complexity profile

Download the Serenity high quality clip (82.31 MB).

The second file respects the DMB specification in terms of coding tools, bit rate and resolution. It gives a feel for what people may expect to see on their cell phones, PDAs and other mobile devices within a few years. The characteristics are as follows:

Video
Bit rate: 300 kbps (CBR mode)
Resolution: 320 x 128 (2.35:1)
Frames per second: 23.98 (NTSC FILM)
Encoder: x264
Coding tools: baseline profile, level 1.3, intra-picture 4×4 analysis, hexagon ME pattern, two pass

Audio
Bit rate: 96 kbps (CBR mode)
Channel layout: stereo
Sampling rate: 48000 Hz
Encoder: FAAC
Coding tools: low complexity profile (does not respect the DMB specification, but no free HE-AAC encoder is available on Mac OS X)

Download the Serenity DMB clip, Serenity_dmb_300.mov (15.19 MB).
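As a rough sanity check on the two file sizes, the expected payload can be estimated from the bit rates and the duration. This Python sketch assumes that each clip spans the full 00:05:17.86 of the source title, that 1 kbps means 1000 bits per second, and that container overhead is negligible. The function name is illustrative.

```python
def stream_size_bytes(video_kbps: float, audio_kbps: float, seconds: float) -> float:
    """Payload size in bytes, taking 1 kbps as 1000 bits per second."""
    return (video_kbps + audio_kbps) * 1000 * seconds / 8

DURATION_S = 5 * 60 + 17.86  # 00:05:17.86

for name, video, audio in [("high quality", 2000, 192), ("DMB", 300, 96)]:
    size = stream_size_bytes(video, audio, DURATION_S)
    # Show both decimal megabytes and binary mebibytes, since "MB" is ambiguous.
    print(f"{name}: {size / 1e6:.2f} MB, {size / 2**20:.2f} MiB")
```

Both estimates land in the same ballpark as the posted sizes, which is about all one can expect given that the rate control of either encoder is not exact.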

Internship report and presentation at CRC

This entry was originally written in French because it mainly concerns Université Laval people.

In addition to my research report, I had to write an internship report and prepare an oral presentation for Université Laval. The presentation took place yesterday, with mixed success. Let’s just say I had a lot of content… far too much. Then again, I wasn’t sure what technical level the university expected, so I aimed high.

So here is my research report in PDF format (digital signature) and my oral presentation in PDF format (digital signature).

Winter 2006 internship research report

Update May 24: Added Amendment 1 and corrected a sequence diagram.
Update May 22: I have added a GPG digital signature.
Update May 21: I have corrected a few mistakes in the research report.

I finished writing my research report for my winter 2006 internship at CRC a few days ago. It’s a rather lengthy document that covers the project I worked on, Digital Multimedia Broadcasting, H.264 and MythTV. It’s available from this blog in PDF format (digital signature) and is covered by a Creative Commons license.

On a related note, I will be posting some sample movies using H.264 video coding and AAC audio coding that comply with the DMB specification, to show what kind of quality we’re really talking about.

Summer game plan

It’s long past time I wrote a little bit in here. Truth be told, I’ve been extremely busy these past 3 weeks, finishing my internship at CRC and writing my research report. But all that is coming to an end now, so let’s talk about what’s coming up, shall we?

  • Summer internship: I’ll be doing some web programming this summer for the CHUQ, a high-visibility medical research center affiliated with my university. I’m really happy to have gotten that one, and I’ll most likely be able to work from home for extra convenience.
  • Riven X: Work on Riven X has pretty much stalled for the aforementioned reasons. As soon as things have died down a little bit, I’ll pick up the pace again. My first objective is to come up with a solution to the MPEG-2 Audio Layer II decompression problem, which I will discuss in greater detail in an upcoming entry. Beyond that, my mid-term goal is to have first playable done by the time WWDC 2006 comes around. Although I still don’t know whether I’m going or not (it all depends on the WWDC student scholarship), if I do end up going, I want to have as much stuff done as possible to make full use of the expertise that will be available to me there.
  • libblp: I’m going to update libblp sometime in the (not too distant) future to clean up its API and add BLP2 support. This is partly motivated by the next item.
  • Develop a better DKP system for my World of Warcraft guild: World of Warcraft has been my time killer for a while now, and I’m interested as much in its technology as in its gameplay. My guild, Alexstraza Dragon Riders, uses a somewhat custom-made DKP system which is pretty much entirely manual right now and in my opinion is wholly inadequate. In any case, I’m planning the development of a DKP system that will use the same basic mechanism as the one we’re currently using (DKP auctions, essentially), but is a lot easier to use, both on the raid leader side and on the client side, and has other interesting features like a web component and a useful GUI. I’ve just started the design work, but I expect to have more to say about this project in the future.

Although one is never certain about the future, this is probably the gist of my summer.

Back home

I’ve been back home since late Saturday, but as you can imagine, I have had a lot of things to take care of since then. Unpacking isn’t even done, but right now my priority is finishing my internship report. Once that’s done (I expect that to be tomorrow), I’ll come back and talk about the state of things for Riven X.

As they say, it’s good to be home.