ROBOVIS 2022 Abstracts


Area 1 - Computer Vision

Full Papers
Paper Nr: 2
Title:

Eye Tracking Calibration based on Smooth Pursuit with Regulated Visual Guidance

Authors:

Yangyang Li, Lili Guo, Guangbin Sun, Rongrong Fu, Zhen Yan and Ji Liang

Abstract: Eye tracking calibration based on smooth pursuit is fast and convenient, but most smooth pursuit calibration methods rely on spontaneous, passive gazes. The spatial-temporal characteristics of the target movement can significantly affect tracking performance, yet few works have performed calibration that considers both the spatial and the temporal variance of the smooth pursuit target. Therefore, we propose an offline smooth pursuit calibration featuring actively regulated speed under specially designed visual guidance paths. In our preliminary experiments, we found a clear correlation between eye movement velocity and the error of gaze point measurement. In particular, when the gaze velocity exceeded 6°/s, the accuracy and precision of the eye-tracking system were noticeably lower. Based on these findings, the visual guidance trajectory was regulated, with the speed kept below 6°/s. The smooth pursuit calibration was combined with a neural network learning method. The results showed that the mean absolute error was reduced from 1.0° to 0.4°, and the full calibration process took only approximately 45 seconds.
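
The speed regulation described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the waypoint representation, and the small-target visual angle approximation are all assumptions made for illustration. Only the 6°/s limit comes from the abstract. The sketch stretches the time intervals of a guidance path so that no segment moves faster than the limit.

```python
import math

MAX_SPEED_DEG_PER_S = 6.0  # speed limit reported in the abstract


def visual_angle_deg(p, q, viewing_distance):
    """Approximate visual angle (degrees) between two on-screen points.

    Assumes both points and the viewing distance share the same length unit
    and that the segment is roughly centred on the line of sight.
    """
    separation = math.dist(p, q)
    return math.degrees(2.0 * math.atan2(separation / 2.0, viewing_distance))


def regulate_timestamps(points, timestamps, viewing_distance,
                        max_speed=MAX_SPEED_DEG_PER_S):
    """Return new timestamps so that no segment exceeds max_speed (deg/s).

    Segments that are already slow enough keep their original duration;
    faster segments are stretched to the minimum admissible duration.
    """
    new_times = [timestamps[0]]
    for i in range(1, len(points)):
        angle = visual_angle_deg(points[i - 1], points[i], viewing_distance)
        original_dt = timestamps[i] - timestamps[i - 1]
        min_dt = angle / max_speed
        new_times.append(new_times[-1] + max(original_dt, min_dt))
    return new_times
```

For example, a 10 cm segment viewed from 60 cm spans roughly 9.5°, so a one-second traversal would be stretched to about 1.6 seconds, while a five-second traversal would be left unchanged.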

Paper Nr: 3
Title:

Monocular 3D Detection and reID-enhanced Tracking of Multiple Traffic Participants

Authors:

Alexander Sing, Csaba Beleznai and Kai Göbel

Abstract: Autonomous driving is becoming a major scientific challenge and an applied domain of significant impact, which also creates demand for enhanced safety of vulnerable road users such as cyclists and pedestrians. Recent developments in Deep Learning have demonstrated that monocular 3D pose estimation is a potential detection modality in safety-related task domains such as perception for autonomous driving and automated traffic monitoring. Deep Learning offers enhanced ways to represent targets in terms of their location, shape, appearance and motion. Learning can capture the significant variations seen in the training data while retaining class- or target-specific cues. Learning even allows for discovering specific correlations within an image of a 3D scene, as a perspective image contains many hints about an object’s 3D location, orientation, size and identity. In this paper we propose an attention-based representational enhancement to improve the spatial accuracy of 3D pose and the temporal stability of multi-target tracking. The presented methodology is evaluated on the KITTI multi-target tracking benchmark. It demonstrates competitive results, both against other recent techniques and against a baseline relying solely on a Kalman-filter-based kinematic association step.

Paper Nr: 7
Title:

Exploiting AirSim as a Cross-dataset Benchmark for Safe UAV Landing and Monocular Depth Estimation Models

Authors:

Jon A. Iñiguez De Gordoa, Javier Barandiaran and Marcos Nieto

Abstract: As there is a lack of publicly available datasets with depth and surface normal information from a drone’s view, in this paper we introduce the synthetic and photorealistic AirSimNC dataset. This dataset is used as a benchmark to test the zero-shot cross-dataset performance of monocular depth and safe drone landing area estimation models. We analysed state-of-the-art Deep Learning networks and trained them on the SafeUAV dataset. While the depth models achieved very satisfactory results on the SafeUAV dataset, they showed a scaling error on the AirSimNC benchmark. We also compared the performance of networks trained on the KITTI and NYUv2 datasets, in order to test how training the networks on a bird’s eye view affects performance on our benchmark. The safe landing estimation models, surprisingly, showed barely any zero-shot cross-dataset penalty in the precision of horizontal surfaces.

Short Papers
Paper Nr: 1
Title:

Application of Deep Learning Techniques in Negative Road Anomalies Detection

Authors:

Jihad Dib, Konstantinos Sirlantzis and Gareth Howells

Abstract: Negative road anomalies (potholes, cracks, and other road anomalies) have long posed a risk for drivers. In this paper, we apply deep learning techniques to implement a YOLO-based (You Only Look Once) network that detects and identifies potholes in real time, providing fast and accurate detection and sufficient time for safe navigation and avoidance of potholes. This system can be used in conjunction with any existing system and can be mounted on moving platforms such as autonomous vehicles. Our results show that the system is able to reach real-time processing (29.34 frames per second) with a high level of accuracy (mAP of 82.05%) and a detection accuracy of 89.75% when mounted on an Electric-Powered Wheelchair (EPW).

Area 2 - Robotics

Full Papers
Paper Nr: 4
Title:

Mental and Physical Training for Elderly Population using Service Robots

Authors:

Christopher Ruff, Isaac Henderson, Tibor Vetter and Andrea Horch

Abstract: In this paper we present the implementation and evaluation of mental and physical exercise applications on a humanoid service robot for use in an elderly care setting. For the mental exercise application, a personalized multimedia quiz was designed and implemented using information from the participants' biographies. The robot acts as the quiz master, interacting with the participants in a natural and encouraging way. For the physical exercise, a variant of the “charade” game was implemented that uses machine learning from previously collected video samples and computer vision on the robot to identify the activities that participants enact. Both applications were evaluated successfully in a real-life setting and highlight the potential of using service robots in elderly care settings.

Paper Nr: 6
Title:

RGB-D Structural Classification of Guardrails via Learning from Synthetic Data

Authors:

Kai Göbel, Csaba Beleznai, Alexander Sing, Jürgen Biber and Christian Stefan

Abstract: Vision-based environment perception is a key sensing and analysis modality for mobile robotic platforms. Modern learning concepts allow for interpreting a scene in terms of its objects and their spatial relations. This paper presents a specific analysis pipeline targeting the structural classification of guardrail structures within roadside environments from a mobile platform. Classification implies determining the type label of an observed structure, given a catalog of all possible types. To this end, the proposed concept employs semantic segmentation learned fully in the synthetic domain, and stereo depth data analysis for estimating the metric dimensions of key structural elements. The paper introduces a Blender-based procedural data generation pipeline that aims at a narrow sim-to-real gap, so that synthetic training images can be used to train models that are valid in the real-world domain. The paper evaluates two semantic segmentation schemes for the part segmentation task, and presents a temporal tracking and propagation concept to aggregate single-frame estimates. Results demonstrate that the proposed analysis framework is applicable to real scenarios and can be used as a tool for digitally mapping safety-critical roadside assets.

Paper Nr: 9
Title:

Marine Snow Removal in Underwater Images

Authors:

Bogdan Smolka and Monika Mendrela

Abstract: In this paper, two methods of marine snow detection in underwater images are presented. The proposed techniques are based on pixel corruption measures, which enable the identification of clusters forming the marine snow. As the detection of marine snow contaminating the images must be followed by an inpainting step, various techniques for restoring images with missing regions were evaluated. The experiments revealed that the restoration quality of the applied inpainting techniques depends on the image structure and the size of the regions to be restored, and that their overall efficiency is comparable. Therefore, the faster algorithms should be preferred. To assess the quality of marine snow removal techniques, a database of images with 5 levels of contamination was created. The experiments performed on this database showed that the proposed marine snow detection techniques coupled with fast inpainting methods yield very satisfactory results, superior to the techniques already known from the literature.