Multimedia Technology and Telecommunications Lab


Research activities involving data compression fall into three main areas: compression of depth data, compression of 3D video, and compression of big multimedia data.

Compression of Depth Data

Depth maps, which contain the three-dimensional structure of the scene, can be coded with standard image compression tools, but the peculiar nature of this data requires ad-hoc coding schemes to obtain optimal performance.

In recent work carried out in collaboration with the signal processing research group at the University of New South Wales (Australia), a scalable breakpoint field combined with a breakpoint-adaptive wavelet decomposition is proposed for coding the depth field [6,7].

We also introduced a novel strategy for the compression of depth maps in [2]. The scheme describes the depth map progressively: the geometry of the scene is initially characterized via a billboard representation, and its accuracy is then progressively increased.
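The idea of a coarse billboard description that is progressively refined can be illustrated with a minimal sketch. It is not the coder of [2]: as a stand-in, it assumes that each "billboard" is a flat layer at constant depth obtained by uniform quantization, and that refinement simply doubles the number of layers. The function names, the toy depth map and the depth range are hypothetical.

```python
def quantize_depth(depth, levels, d_min, d_max):
    """Snap every depth sample to the centre of one of `levels` uniform bins.

    Each bin acts as a flat, constant-depth layer (a crude 'billboard').
    This is an illustrative simplification, not the scheme of [2].
    """
    step = (d_max - d_min) / levels
    approx = []
    for row in depth:
        new_row = []
        for d in row:
            idx = min(int((d - d_min) / step), levels - 1)  # clamp d == d_max
            new_row.append(d_min + (idx + 0.5) * step)
        approx.append(new_row)
    return approx


def max_abs_error(a, b):
    """Largest per-sample absolute difference between two depth maps."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical toy depth map (arbitrary units): near object, far object, ramp.
depth = [
    [1.0, 1.1, 3.9, 4.0],
    [1.0, 1.2, 3.8, 4.0],
    [2.0, 2.1, 2.2, 4.0],
]
d_min, d_max = 1.0, 4.0

# Progressive refinement: 2 layers, then 4, then 8.
layer_counts = (2, 4, 8)
errors = []
for levels in layer_counts:
    approx = quantize_depth(depth, levels, d_min, d_max)
    errors.append(max_abs_error(depth, approx))
    print(levels, "layers -> max error", errors[-1])
```

With uniform layers the worst-case error is bounded by half the layer spacing, so each refinement pass halves the bound. A real progressive coder would transmit only the information needed to move from one level of detail to the next.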

The depth coding scheme in [8] starts with a segmentation step that identifies and extracts the edges and the main objects, and then applies an efficient compression strategy to the shapes of the segmented regions.

Compression of 3D Video

Free viewpoint video applications and autostereoscopic displays require the transmission of multiple views of a scene together with depth maps. Current compression and transmission solutions handle these two data streams separately, using techniques derived from standard image and video compression tools. However, better performance can be obtained with ad-hoc solutions that explicitly target this kind of data and jointly compress depth and color.

The award-candidate approach in [3] tries to exploit the information about the objects in the scene to perform a cognitive 3D compression of both color and depth streams.

An alternative approach has been introduced in [5], where the depth map is compressed together with the color information and the segmentation of the compressed color data is used to aid the depth compression. In particular, the segmentation is used to recognize the different surfaces in the scene, and the roughly planar regions are then approximated with planes.
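Approximating a roughly planar segment with a plane can be sketched as a least-squares fit of z = a*x + b*y + c to the depth samples of each segment. The sketch below is only an illustration under that assumption; the function names, the label map and the toy depth values are hypothetical, and the published scheme may estimate and encode the planes differently.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(samples):
    """Least-squares fit of z = a*x + b*y + c to a list of (x, y, z) samples."""
    sxx = sxy = sx = syy = sy = sxz = syz = sz = 0.0
    for x, y, z in samples:
        sxx += x * x; sxy += x * y; sx += x
        syy += y * y; sy += y
        sxz += x * z; syz += y * z; sz += z
    # Normal equations M [a, b, c]^T = r, solved with Cramer's rule.
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(samples))]]
    r = [sxz, syz, sz]
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("segment is degenerate (collinear or too few samples)")
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for i in range(3):
            mc[i][col] = r[i]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)


def fit_segments(depth, labels):
    """Fit one plane per segment label; returns {label: (a, b, c)}."""
    groups = {}
    for y, (drow, lrow) in enumerate(zip(depth, labels)):
        for x, (z, lab) in enumerate(zip(drow, lrow)):
            groups.setdefault(lab, []).append((x, y, z))
    return {lab: fit_plane(pts) for lab, pts in groups.items()}


# Hypothetical 4x4 example: segment 0 follows z = 0.5*x + 2,
# segment 1 follows z = -0.25*y + 10.
labels = [[0, 0, 1, 1] for _ in range(4)]
depth = [[0.5 * x + 2.0 if x < 2 else -0.25 * y + 10.0 for x in range(4)]
         for y in range(4)]

planes = fit_segments(depth, labels)
print(planes)
```

Each segment is then described by three coefficients instead of one depth value per pixel, which is where the compression gain of a planar model comes from; the residual with respect to the plane would still have to be coded where the surface is not truly planar.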

Depth maps contain key information on the scene structure that can be effectively exploited to improve the performance of multi-view coding schemes. In [1] and [4] we introduce a novel coding architecture that replaces the inter-view motion prediction operation with a 3D warping approach based on depth information in order to improve coding performance. The pixels of the different warped views are packed into a stack of aligned views, which can be efficiently coded with transform coding techniques. Occluded areas are handled with ad-hoc solutions.
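Depth-based warping of one view onto another can be sketched for a single image row. The sketch assumes rectified cameras separated by a purely horizontal baseline, so that a pixel at depth Z shifts by a disparity of focal * baseline / Z pixels; the function name, the camera parameters and the toy data are hypothetical, and the architecture in the cited papers is more general.

```python
def warp_row(colors, depths, focal, baseline):
    """Forward-warp one image row to a neighbouring rectified view.

    Assumes a purely horizontal camera shift, so each pixel moves left by
    disparity = focal * baseline / depth. A z-buffer keeps the nearest
    surface when two source pixels land on the same target position.
    Positions that receive no pixel stay None (holes to be filled later).
    """
    width = len(colors)
    warped = [None] * width
    zbuf = [float("inf")] * width
    for x, (color, z) in enumerate(zip(colors, depths)):
        disparity = int(round(focal * baseline / z))
        xt = x - disparity
        if 0 <= xt < width and z < zbuf[xt]:
            warped[xt] = color
            zbuf[xt] = z
    return warped


# Hypothetical row: background at depth 10, a foreground object at depth 5.
colors = ["b0", "b1", "b2", "f3", "f4", "b5", "b6", "b7"]
depths = [10.0, 10.0, 10.0, 5.0, 5.0, 10.0, 10.0, 10.0]

warped = warp_row(colors, depths, focal=100.0, baseline=0.1)
print(warped)
# ['b1', 'f3', 'f4', None, 'b5', 'b6', 'b7', None]
```

In this toy row the foreground pixel f3 overwrites the background pixel b2 (an occlusion), while position 3 receives no pixel because the area behind the object was not visible in the source view. Once several views are warped onto a common grid in this way, co-located pixels are highly correlated, which is what makes a transform across the stack effective.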

Multimedia Big Data

The possibility of sharing multimedia content in an easy and ubiquitous way has led to the creation of multiuser photo/video galleries. Pictures and video sequences taken by different people attending the same social events (concerts, sport competitions, etc.) are gathered into huge sets of heterogeneous multimedia data. These databases require effective compression strategies that exploit the visual information shared across views of the scene while effectively compensating for the differences due to acquisition viewpoints, camera models, and acquisition times.
Only a few preliminary works have been published on the subject, which poses new compression challenges.

Related papers:

[1] M. Zamarin, S. Milani, P. Zanuttigh, G.M. Cortelazzo, "A Novel Multi-View Image Coding Scheme based on View-Warping and 3D-DCT", Journal of Visual Communication and Image Representation, Special Issue on "Multi-Camera Imaging, Coding and Innovative Display: Techniques and Systems", vol. 21, pp. 462-473, July-August 2010, DOI: 10.1016/j.jvcir.2009.09.008, ISSN: 1047-3203

[2] S. Milani, G. Calvagno, "A Depth Image Coder Based on Progressive Silhouettes", IEEE Signal Processing Letters, vol. 17, no. 8, pp. 711-714, Aug. 2010 (software is available here).

[3] S. Milani, G. Calvagno, "A Cognitive Approach for Effective Coding and Transmission of 3D Video", ACM Transactions on Multimedia Computing, Communications, and Applications (ACM TOMCCAP), vol. 7S, No. 1, Oct. 2011, pp. 23:1-23:21, ISSN: 1551-6857, DOI: 10.1145/2037676.2037680 (video demo is available here).

[4] M. Zamarin, P. Zanuttigh, S. Milani, G.M. Cortelazzo, S. Forchhammer, "A Joint Multi-View Plus Depth Image Coding Scheme Based on 3D-Warping", ACM International Workshop on 3D Video Processing (3DVP2010), Firenze, Italy, October 2010

[5] S. Milani, P. Zanuttigh, M. Zamarin, S. Forchhammer, "Efficient depth map compression exploiting segmented color data", IEEE Int. Conference on Multimedia and Expo (ICME), July 2011

[6] R. Mathew, P. Zanuttigh, D. Taubman, "Highly Scalable Coding of Depth Maps with Arc Breakpoints", Data Compression Conference (DCC), pp. 42-51, April 2012

[7] R. Mathew, D. Taubman, P. Zanuttigh, "Scalable Coding of Depth Maps with R-D Optimized Embedding", IEEE Transactions on Image Processing, vol. 22, no. 5, pp. 1982-1995, 2013

[8] P. Zanuttigh, G.M. Cortelazzo, "Compression of depth information for 3D rendering", in Proceedings of 3DTV Conference, pp. 1-4, Potsdam, Germany, May 2009

For all the publications, the copyright is held by the corresponding publisher. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes, to create new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the publisher.