October 16, 2013

What encoding parameters for video representations in adaptive streaming?

Dynamic Adaptive Streaming over HTTP (DASH) is a technology that was implemented and deployed before any scientific literature on it existed. Simply put, the server offers several representations of the same video; clients choose the representation that best fits their capacities. Since 2008, many researchers have deciphered the global behavior of client-based adaptive mechanisms. However, one key piece of the theoretical cake is still missing: what is the optimal set of video representations the server should offer?

As far as we know, there are no commonly accepted rules on how to choose the encoding parameters of each representation (resolution and rate). Providers typically use somewhat arbitrary rules of thumb or follow manufacturers’ recommendations (e.g. Apple and Microsoft), which take into account neither the nature of the video streams nor the characteristics of the user base. These parameters can, however, have a large influence on both user QoE and delivery cost.

With fellow researchers from EPFL (Laura and Pascal), we have recently investigated this topic from an optimization standpoint. The objective is to maximize the average user satisfaction. We formulated an optimization problem with the following inputs, which any content provider hopefully knows:
  • for each video in the catalog, the expected QoE of users for any rate-resolution pair. This can easily be obtained from a rate-distortion curve computed on a sample of the video at every resolution.
  • for each video in the catalog, the characteristics of the population of viewers, meaning the client device (tablet, TV, smartphone, ...) and the available bandwidth of the network connection (xDSL, fiber, 3G, ...). This requires a priori knowledge of the viewer population, but we guess it can be obtained from previous statistics.
  • the minimum ratio of viewers that must be served, i.e. the users who actually get a video, even at a relatively bad quality.
  • for the delivery part, the overall bandwidth budget that can be provisioned. Typically, we consider that the cost of the CDN should be bounded, and so the overall used bandwidth is bounded too.
  • finally, the total number of representations that we want to encode. The idea here is to limit the storage and encoding costs, and to avoid huge, hard-to-administer Manifest files.
We solved the problem on a set of synthetic configurations (the above inputs). Our goal was twofold: (i) measure the performance of the recommended sets of representations, and (ii) provide guidelines for content providers.
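
To make the shape of the problem concrete, here is a minimal sketch in Python. It is not the solver or the model we used in the study: it simply enumerates every subset of a small candidate list. All the numbers, the `toy_qoe` function, and the assumption that each viewer class plays the representation giving it the best QoE are made up for illustration.

```python
from itertools import combinations

# Hypothetical candidate representations: (height in pixels, bitrate in kbps).
CANDIDATES = [(360, 400), (360, 800), (720, 1500),
              (720, 3000), (1080, 4500), (1080, 8000)]

# Hypothetical viewer classes: (population share, display height, bandwidth in kbps).
VIEWERS = [(0.3, 360, 1000), (0.4, 720, 3500), (0.3, 1080, 10000)]


def toy_qoe(height, rate):
    """Made-up saturating QoE in [0, 1); a stand-in for a real rate-distortion curve."""
    return (height / 1080) * rate / (rate + 2 * height)


def evaluate(reps, viewers):
    """Return (served share, average QoE, average bandwidth) for a set of representations."""
    served = avg_qoe = avg_rate = 0.0
    for share, display, bandwidth in viewers:
        # A class can only play what fits both its screen and its connection.
        playable = [(h, r) for h, r in reps if h <= display and r <= bandwidth]
        if not playable:
            continue  # this class gets no video at all
        h, r = max(playable, key=lambda rep: toy_qoe(*rep))
        served += share
        avg_qoe += share * toy_qoe(h, r)
        avg_rate += share * r
    return served, avg_qoe, avg_rate


def best_set(candidates, viewers, max_reps, min_served, rate_budget):
    """Exhaustively search all subsets of at most max_reps representations."""
    best, best_qoe = None, -1.0
    for k in range(1, max_reps + 1):
        for reps in combinations(candidates, k):
            served, avg_qoe, avg_rate = evaluate(reps, viewers)
            feasible = served >= min_served - 1e-9 and avg_rate <= rate_budget
            if feasible and avg_qoe > best_qoe:
                best, best_qoe = reps, avg_qoe
    return best, best_qoe  # best is None when no subset is feasible
```

For example, `best_set(CANDIDATES, VIEWERS, max_reps=3, min_served=1.0, rate_budget=2500)` returns the best set of at most three representations under an average bandwidth budget of 2500 kbps per viewer. Exhaustive search only works for tiny instances like this one; a realistic catalog needs a proper optimization solver.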

Regarding the first goal, our observation is that recommended sets are not that bad in terms of average QoE but, for a given expected quality, the number of representations in these recommended sets is almost twice the number of representations in the optimal solutions. In other words, the average QoE is obtained at the price of more video representations, which means more encoders, more storage, more delivery bandwidth in the CDN infrastructure, and more complexity in the management. We also showed that these recommended sets perform poorly for more specific configurations. For instance, a content provider specialized in live e-sport videos or a content provider targeting mobile phones should definitely not follow these recommendations.

We also derived a series of guidelines from our analysis. Some of them may be obvious, but it is never bad to recall obvious things, especially when nobody seems to follow them.
  1. How many representations per video? The distribution of representations among videos needs to be content-aware. Put emphasis on the videos that are the most complex to encode (e.g. sports).
  2. For a given video, how many representations per resolution? It mainly follows the distribution of devices in the user population. Put a slight emphasis on the highest resolutions.
  3. How to decide bit-rates for representations in a given resolution? The higher the resolution, the wider the range of rates should be. Put emphasis on lower rates.
  4. How to save CDN bandwidth? Reduce the range of rates for representations in a resolution. Reduce the number of representations at high resolution.
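
As an illustration of guideline 3, one simple way to put emphasis on lower rates is to space the bit-rates geometrically, so that the steps are small at the bottom of the range and large at the top. This spacing rule is an assumption made for the example, not something our analysis prescribes, and the function name and numbers are hypothetical.

```python
def geometric_ladder(min_kbps, max_kbps, count):
    """Return `count` bit-rates from min to max with a constant ratio between neighbours."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if count == 1:
        return [float(min_kbps)]
    ratio = (max_kbps / min_kbps) ** (1 / (count - 1))
    return [min_kbps * ratio ** i for i in range(count)]


# Hypothetical ranges: the higher resolution gets the wider range of rates.
ladder_360p = geometric_ladder(400, 1000, 3)
ladder_1080p = geometric_ladder(2000, 8000, 5)
```

With a constant ratio, the absolute gap between consecutive rates grows with the rate, so more representations end up in the lower part of the range.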
These first results are just preliminary tests. We have plenty of new topics to explore. Stay tuned!
