Paper
Time-of-Flight (ToF) sensors and stereo vision systems are two of the most widespread depth acquisition devices for commercial and industrial applications. They have complementary strengths and weaknesses, so combining the data acquired from these devices can improve the final depth estimation performance. This paper introduces a dataset acquired with a multi-camera system composed of a Kinect v2 ToF sensor, an Intel RealSense R200 active stereo sensor and a ZED passive stereo camera. The acquired scenes include indoor settings under different external lighting conditions. The depth ground truth has been acquired for each scene of the dataset using a line laser. The data can be used to develop fusion and denoising algorithms for depth estimation and to test them under different lighting conditions. A subset of the data has already been used for the experimental evaluation of the stereo-ToF fusion method of Agresti et al. [2].
Dataset
The multi-camera acquisition system used to acquire the proposed dataset is arranged as follows: the ZED camera in the center is the reference system, with the Kinect underneath the ZED and the RealSense R200 above it. The three cameras are held in place by a plastic mount specifically designed to fit them. The depth camera of the Kinect is approximately horizontally aligned with the left camera of the ZED, with a 40 mm vertical displacement, while its color camera lies approximately between the cameras of the passive stereo pair. The RealSense R200 is placed approximately 20 mm above the ZED camera, with its two IR cameras and its color camera inside the baseline of the passive stereo pair.
The subjects of the 10 scenes in the REAL3EXT dataset are chosen to stress various failure modes of the stereo and ToF systems. Critical situations include, for example, the lack of texture for the passive stereo system and the presence of low-reflectivity elements and external illumination for the active sensors. The scenes are composed of flat surfaces with and without texture, plants, and objects of various materials such as plastic, paper and cotton fabric. These are characterized by various specularity properties, ranging from reflective and glossy surfaces to rough materials. Each scene was recorded under 4 different external lighting conditions: with no external light; with regular lighting; with stronger light; and with an additional incandescent light source. Each lighting condition can highlight the weaknesses and strengths of the different depth estimation algorithms. We added the acquisitions with the additional incandescent light source since its spectrum covers, in the IR wavelengths, the working range of the active depth cameras, which is a known problem for those devices.
A zipped archive with the dataset can be downloaded from here. It contains one folder for each of the 10 scenes, each containing a sub-folder for each of the considered external illumination conditions.
Each of these sub-folders contains the following data (a short Python loading sketch follows the list):
- left color image from the ZED stereo system (zed_left.png)
- right color image from the ZED stereo system (zed_right.png)
- depth map, measured in millimeters, from the Kinect v2 ToF sensor (kinect_depth.mat)
- amplitude map from the Kinect v2 ToF sensor (kinect_amplitude.mat)
- color image from the Kinect v2 color camera (kinect_color.png)
- left IR image from the R200 active stereo system (r200_left.png)
- right IR image from the R200 active stereo system (r200_right.png)
- color image from the R200 color camera (r200_color.png)
- depth ground truth, measured in millimeters, from the left camera in the ZED stereo system (gt.mat)
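Below is a minimal Python sketch of how the files of a single acquisition could be loaded. The scene and illumination folder names used here are hypothetical (check the extracted archive for the actual naming), and the variable names stored inside the .mat files are not documented above, so the helper simply returns the first non-metadata entry.

```python
import cv2
import scipy.io


def load_mat_array(path):
    # .mat files store arrays under a named variable; since the variable
    # names are not documented here, pick the first non-metadata entry.
    data = scipy.io.loadmat(path)
    key = next(k for k in data if not k.startswith("__"))
    return data[key]


# Hypothetical folder names: check the extracted archive for the actual ones.
scene_dir = "scene01/regular_lighting"

zed_left = cv2.imread(f"{scene_dir}/zed_left.png")
zed_right = cv2.imread(f"{scene_dir}/zed_right.png")
kinect_color = cv2.imread(f"{scene_dir}/kinect_color.png")
r200_left = cv2.imread(f"{scene_dir}/r200_left.png", cv2.IMREAD_GRAYSCALE)
r200_right = cv2.imread(f"{scene_dir}/r200_right.png", cv2.IMREAD_GRAYSCALE)
r200_color = cv2.imread(f"{scene_dir}/r200_color.png")

kinect_depth = load_mat_array(f"{scene_dir}/kinect_depth.mat")          # millimeters
kinect_amplitude = load_mat_array(f"{scene_dir}/kinect_amplitude.mat")
gt_depth = load_mat_array(f"{scene_dir}/gt.mat")                        # millimeters, left ZED view
```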
Finally, the calibrationREAL.xml file contains the intrinsic and extrinsic parameters of the employed setup. The calibration data is stored in the format used by the OpenCV computer vision library; refer to the OpenCV documentation for more details.
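As a rough sketch, the calibration file can be parsed with OpenCV's FileStorage API. The node names queried below (K_zed_left, dist_zed_left) are assumptions, since the actual names depend on how the file was written; list the real ones with fs.root().keys() first.

```python
import cv2

fs = cv2.FileStorage("calibrationREAL.xml", cv2.FILE_STORAGE_READ)
print(fs.root().keys())  # inspect the actual node names stored in the file

# Hypothetical node names -- replace with the ones printed above.
K_zed_left = fs.getNode("K_zed_left").mat()        # 3x3 intrinsic matrix
dist_zed_left = fs.getNode("dist_zed_left").mat()  # distortion coefficients
fs.release()
```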
Downloads
At the address http://lttm.dei.unipd.it/nuovo/datasets.html you can find other ToF and stereo datasets from our research group.
Contacts
For any information on the data, you can write to lttm@dei.unipd.it. Have a look at our website http://lttm.dei.unipd.it for other works and datasets on this topic.
References
[1] G. Marin, G. Agresti, L. Minto and P. Zanuttigh, "A multi-camera dataset for depth estimation in an indoor scenario", submitted to Elsevier Fusion Journal (under review).
[2] G. Agresti, L. Minto, G. Marin and P. Zanuttigh, "Stereo and ToF Data Fusion by Learning from Synthetic Data", Information Fusion (2018).