The paper I'd like to emphasize is:
Kiran Misra, Andrew Segall, Michael Horowitz, Shilin Xu, Arild Fuldseth, and Minhua Zhou, “An Overview of Tiles in HEVC”, IEEE Journal of Selected Topics in Signal Processing, Vol. 7, No. 6, December 2013.
The High Efficiency Video Coding (HEVC) standard significantly improves coding efficiency (reported bit-rate savings are around 50% at comparable quality when compared to the previous state of the art, H.264/MPEG-4 AVC), and is thus expected to become popular despite the increase in computational complexity. HEVC also provides various new features, which can be exploited to improve multimedia delivery systems. Among them, the concept of tiles is in my opinion a promising novelty that is worth attention. The paper "An Overview of Tiles in HEVC" provides an excellent introduction to this concept.
The goal of a video decoder is to convert a video bit-stream into a sequence of arrays of pixel values; the encoder does the reverse, converting the original sequence of arrays of pixel values into a bit-stream. The main idea now adopted in video compression is the hierarchical structure of video stream data. The bit-stream is cut into independent Groups of Pictures (GOPs), and each GOP is cut into frames, which have temporal dependencies according to their types: Intra (I), Predicted (P) or Bidirectional (B) pictures. Finally, each frame is cut into independent sets of macroblocks, called slices in the previous standards.
The novelty brought by HEVC is the concept of tile, which is at the same "level" as slice in the hierarchical structure of video stream data.
The motivations for both slices and tiles are, at least, twofold: error resilience and parallel computing. First, having an independently parsable unit within a frame can break the propagation of errors. Indeed, due to the causal dependency between frames, an error in a frame can make the decoder unable to process a significant portion of the frames occurring after the loss event. Slices and tiles limit, at least from a spatial perspective, the propagation of an error across the whole frame. Second, the complexity of recent video and the resulting demand for high CPU clock speeds (which unfortunately require power and generate heat) can be partially addressed by parallelizing the decoding task across multiple computing units, regardless of whether these are cores in many-core architectures or computing units in Graphics Processing Units (GPUs). The independence of slices and tiles is expected to facilitate the implementation of video decoders on parallel architectures.
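To make the parallelism argument concrete, here is a minimal sketch of how independently decodable units can be dispatched to a pool of workers. It is only an illustration under stated assumptions: `decode_tile` is a hypothetical stand-in (a simple byte inversion) for the real per-tile work, and `decode_frame_parallel` is a name I introduce here, not something taken from the paper or from any HEVC decoder.

```python
from concurrent.futures import ThreadPoolExecutor


def decode_tile(payload: bytes) -> bytes:
    """Hypothetical stand-in for decoding one tile.

    A real decoder would perform entropy decoding, prediction and
    reconstruction here; this toy version just inverts every byte.
    """
    return bytes(b ^ 0xFF for b in payload)


def decode_frame_parallel(tile_payloads, max_workers=4):
    """Decode every tile of one frame independently.

    Because no tile needs data from another tile of the same frame,
    each payload can be handed to a separate worker. pool.map returns
    results in input order, so the tiles can be reassembled directly.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(decode_tile, tile_payloads))


if __name__ == "__main__":
    payloads = [b"\x00\x01", b"\x10", b""]
    print(decode_frame_parallel(payloads))
```

Note that threads in CPython do not run pure-Python code on several cores at once, so this sketch shows the structure of the dispatch rather than an actual speed-up. The point is that the loop over tiles has no ordering constraint, which is exactly the property that slices and tiles are meant to provide.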
Unfortunately, the concept of slices suffers in practice from serious weaknesses, which tiles are expected to fix.
In the paper, the authors introduce the main differences between tiles and slices, two concepts that can be confused at first glance. They focus on the motivation for parallel computation.
The first part of the paper explains in detail the main principles of both approaches, in particular the fact that tiles are aligned with the boundaries of Coding Tree Blocks (CTBs), which provides more flexibility in the partitioning. This brings several benefits: a tile is more compact, which leads to a better correlation between pixels within a tile when compared to the correlation between pixels in a slice. Tiles also require fewer headers, among other advantages.
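The alignment of tiles with CTB boundaries can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are mine, the frame and CTB sizes are example values, and the integer-division rule used to split the CTB grid evenly is, to the best of my knowledge, the one HEVC applies when uniform tile spacing is requested.

```python
def uniform_tile_sizes(size_in_ctbs, num_tiles):
    """Split a dimension measured in CTBs into num_tiles parts.

    The integer-division form spreads the remainder so that part
    sizes differ by at most one CTB.
    """
    return [
        ((i + 1) * size_in_ctbs) // num_tiles - (i * size_in_ctbs) // num_tiles
        for i in range(num_tiles)
    ]


def tile_grid(pic_width, pic_height, ctb_size, num_cols, num_rows):
    """Return tiles as (x, y, width, height) tuples, all in CTB units.

    Tiles are listed row by row; every tile boundary falls on a CTB
    boundary by construction.
    """
    # Round up: a partially covered CTB at the right/bottom edge still counts.
    width_in_ctbs = -(-pic_width // ctb_size)
    height_in_ctbs = -(-pic_height // ctb_size)
    col_widths = uniform_tile_sizes(width_in_ctbs, num_cols)
    row_heights = uniform_tile_sizes(height_in_ctbs, num_rows)

    tiles = []
    y = 0
    for h in row_heights:
        x = 0
        for w in col_widths:
            tiles.append((x, y, w, h))
            x += w
        y += h
    return tiles


if __name__ == "__main__":
    # Example values (assumed): a 1920x1080 frame, 64x64 CTBs, 3x2 tiles.
    for tile in tile_grid(1920, 1080, 64, num_cols=3, num_rows=2):
        print(tile)
```

With these example values the frame is 30 by 17 CTBs, so the three tile columns are 10 CTBs wide each and the two tile rows are 8 and 9 CTBs high. Each tile is a rectangle, which is what makes it more compact than a slice covering the same number of CTBs in raster-scan order.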
The authors also introduce the known constraints to be taken into account when one wants to use tiles today. Section 3 is devoted entirely to the tile proposal in HEVC and the main challenges to be addressed for wide adoption. Next, the authors present some examples where tiles are useful. Both parts are written so that a reader who is only broadly familiar with the concepts can understand both the limitations behind the concept of tiles and how these weaknesses have been addressed in practice.
The last part of the paper, in Section 5, deals with some experiments, which demonstrate the efficacy of HEVC for lightweight bit-streams and parallel architectures. The authors first assess parallelization and the sensitivity to network parameters, including the Maximum Transmission Unit (MTU), of the performance of slices versus tiles. They finally measure the performance of stream rewriting for both approaches.
In short, the paper shows that tiles appear to be more efficient than slices in a number of respects. The paper offers a rigorous, in-depth introduction to the main advantages of tiles. This can foster research on the integration of tiles into next-generation multimedia delivery systems.