For the current research, a ‘Spot the Face in a Crowd Test’ (SFCT) comprising six video clips depicting target-actors and multiple bystanders was loaded on TooManyEyes, a bespoke multimedia platform adapted here for the human-directed identification of individuals in CCTV footage. To test the utility of TooManyEyes, police ‘super-recognisers’ (SRs), who may possess exceptional face recognition ability, and police controls attempted to identify the target-actors from the SFCT. As expected, SRs correctly identified more target-actors, and with higher confidence, than controls. As such, the TooManyEyes system provides a useful platform for uploading tests used to select police or security staff for CCTV review deployment.
In this paper we present the results of the 3D Shape Retrieval Contest 2011 (SHREC'11) track on generic shape retrieval. The aim of this track is to evaluate the performance of 3D shape retrieval algorithms that can operate on arbitrary 3D models. The benchmark dataset consists of 1000 3D objects classified into 50 categories. The 3D models are classified mainly on the basis of visual shape similarity, and each class contains an equal number of models to reduce possible bias in the evaluation results. Two groups participated in the track, with six methods in total.
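To make the evaluation protocol concrete, the following is a minimal Python sketch (not the official SHREC'11 evaluation code; the toy distance matrix and labels are purely illustrative) of two retrieval measures commonly reported for such benchmarks, Nearest Neighbour and First Tier, computed from a pairwise dissimilarity matrix and the class labels:

```python
# Minimal sketch of standard retrieval scoring: Nearest Neighbour accuracy and
# First Tier, given an (N, N) dissimilarity matrix and per-model class labels.
import numpy as np

def nn_and_first_tier(dist, labels):
    """dist: (N, N) dissimilarity matrix; labels: length-N class labels."""
    labels = np.asarray(labels)
    n = len(labels)
    nn_hits, ft_scores = 0, []
    for q in range(n):
        order = np.argsort(dist[q])
        order = order[order != q]              # drop the query itself
        relevant = labels[order] == labels[q]
        class_size = np.sum(labels == labels[q]) - 1
        nn_hits += int(relevant[0])            # is the closest model of the same class?
        ft_scores.append(relevant[:class_size].sum() / class_size)
    return nn_hits / n, float(np.mean(ft_scores))

# Toy 4-model, 2-class example (hypothetical values):
d = np.array([[0.0, 0.2, 0.9, 0.8],
              [0.2, 0.0, 0.7, 0.9],
              [0.9, 0.7, 0.0, 0.1],
              [0.8, 0.9, 0.1, 0.0]])
print(nn_and_first_tier(d, ["chair", "chair", "cup", "cup"]))   # -> (1.0, 1.0)
```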
This paper proposes a novel framework for 3-D object retrieval, taking into account most of the factors that may affect retrieval performance. Initially, a novel 3-D model alignment method is introduced, which achieves accurate rotation estimation through the combination of two intuitive criteria, plane reflection symmetry and rectilinearity. After the pose normalization stage, a low-level descriptor extraction procedure follows, using three different types of descriptors that have been proven effective. Then, a novel procedure for combining the above descriptors takes place, which achieves higher retrieval performance than each descriptor does separately. The paper also provides an in-depth study of the factors that can further improve 3-D object retrieval accuracy. These include the selection of an appropriate dissimilarity metric, feature selection/dimensionality reduction on the initial low-level descriptors, and manifold learning for re-ranking of the search results. Experiments performed on two 3-D model benchmark datasets confirm our assumption that future research in 3-D object retrieval should focus more on the efficient combination of low-level descriptors and on the selection of the best features and matching metrics than on the search for a single optimal 3-D object descriptor.
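As an illustration of the kind of descriptor fusion the paper argues for, the sketch below combines per-descriptor dissimilarities by a weighted sum after a simple scale normalisation; the descriptor names, the L1 metric, the median normalisation and the weights are assumptions for illustration, not the paper's actual choices:

```python
# Hypothetical sketch of low-level descriptor fusion: per-descriptor distances are
# normalised to a comparable scale and merged by a weighted sum.
import numpy as np

def combined_dissimilarity(query_desc, db_desc, weights):
    """query_desc: dict name -> 1-D feature vector for the query model.
    db_desc: dict name -> (N, d) feature matrix for the database models.
    weights: dict name -> relative weight of each descriptor type."""
    total = None
    for name, w in weights.items():
        # L1 distance between the query and every database model for this descriptor
        d = np.abs(db_desc[name] - query_desc[name]).sum(axis=1)
        d = d / (np.median(d) + 1e-12)         # scale normalisation before fusion
        total = w * d if total is None else total + w * d
    return total                                # lowest values = most similar models

# Toy usage with two hypothetical descriptor types:
rng = np.random.default_rng(0)
db = {"depth_buffer": rng.random((5, 8)), "silhouette": rng.random((5, 6))}
q = {"depth_buffer": rng.random(8), "silhouette": rng.random(6)}
ranking = np.argsort(combined_dissimilarity(q, db, {"depth_buffer": 0.6, "silhouette": 0.4}))
print(ranking)
```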
The availability of semantically annotated image and video assets constitutes a critical prerequisite for the realisation of intelligent knowledge management services pertaining to realistic user needs. Given the extent of the challenges involved in the automatic extraction of such descriptions, manually created metadata play a significant role, further strengthened by their deployment in training and evaluation tasks related to the automatic extraction of content descriptions. The different views taken by the two main approaches towards semantic content description, namely the Semantic Web and MPEG-7, as well as the traits particular to multimedia content due to the multiplicity of information levels involved, have resulted in a variety of image and video annotation tools adopting varying description aspects. Aiming to provide a common framework of reference and, furthermore, to highlight open issues, especially with respect to the coverage and the interoperability of the produced metadata, in this chapter we present an overview of the state of the art in image and video annotation tools.
In this paper, a novel framework for 3D object retrieval is presented. The paper focuses on the investigation of an accurate 3D model alignment method, which is achieved by combining two intuitive criteria, plane reflection symmetry and rectilinearity. After proper positioning in a coordinate system, a set of 2D images (multi-views) is automatically generated from the 3D object by taking views from uniformly distributed viewpoints. For each image, a set of flip-invariant shape descriptors is extracted. Taking advantage of both the pose estimation of the 3D objects and the flip-invariance property of the extracted descriptors, a new matching scheme for fast computation of 3D object dissimilarity is introduced. Experiments conducted on the SHREC 2009 benchmark show the superiority of the pose estimation method over similar approaches, as well as the efficiency of the new matching scheme.
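The multi-view step can be sketched as follows; the Fibonacci-spiral sampling shown here is an assumption (the paper may use a different uniform scheme, e.g. views placed on the vertices of a regular polyhedron), and the rendering of the aligned model from each viewpoint is left out:

```python
# Sketch of sampling roughly uniform viewpoints on a sphere around the aligned
# 3D object; each returned position would serve as a camera looking at the origin.
import numpy as np

def uniform_viewpoints(n_views=32, radius=1.0):
    """Return an (n_views, 3) array of camera positions on a sphere."""
    i = np.arange(n_views)
    golden = (1 + 5 ** 0.5) / 2
    z = 1 - 2 * (i + 0.5) / n_views             # uniform spacing in height
    r = np.sqrt(1 - z ** 2)
    theta = 2 * np.pi * i / golden              # golden-angle spacing in azimuth
    return radius * np.stack([r * np.cos(theta), r * np.sin(theta), z], axis=1)

views = uniform_viewpoints(32)
print(views.shape)   # (32, 3); one row per rendered 2D view of the model
```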
In this paper, a software-based system for the real-time synchronization of images captured by a low-cost camera framework is presented. It is best suited to cases where special hardware cannot be used (e.g. remote or wireless applications) and where cost efficiency is critical. The proposed method uses message exchange to establish a consensus on the time of image acquisition, together with NTP synchronization of the computer clocks. It also provides an error signal in case the synchronization fails. The evaluation of the proposed algorithm using a precise LED array system (1 ms accuracy) demonstrates the effectiveness of this method.
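A minimal, hypothetical sketch of such a message-based agreement is given below: a coordinator proposes a capture instant slightly in the future, each camera node (whose clock is assumed to be NTP-disciplined) waits for that instant, triggers its capture, and reports the actual trigger time, from which a deviation beyond a tolerance yields the error signal. All names, the UDP/JSON message format and the tolerance are illustrative only:

```python
# Illustrative consensus-on-capture-time protocol over UDP; clocks are assumed
# to already be synchronized via NTP on every participating machine.
import json, socket, time

TOLERANCE_S = 0.001          # 1 ms tolerance, matching the LED-array evaluation accuracy

def coordinator(node_addrs, lead_time_s=0.1):
    """Broadcast a capture instant and check the nodes' reported trigger times."""
    capture_at = time.time() + lead_time_s
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for addr in node_addrs:
        sock.sendto(json.dumps({"capture_at": capture_at}).encode(), addr)
    sock.settimeout(lead_time_s + 1.0)
    reports = []
    for _ in node_addrs:
        data, _ = sock.recvfrom(1024)
        reports.append(json.loads(data)["triggered_at"])
    spread = max(reports) - min(reports)
    return spread <= TOLERANCE_S, spread       # (ok?, spread); failure raises the error signal

def camera_node(listen_addr, trigger_capture):
    """Wait for the agreed instant on the local (NTP-synchronized) clock, then capture."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(listen_addr)
    data, coord = sock.recvfrom(1024)
    capture_at = json.loads(data)["capture_at"]
    while time.time() < capture_at:
        time.sleep(0.0002)                     # short sleeps near the deadline
    trigger_capture()                          # user-supplied camera grab call
    sock.sendto(json.dumps({"triggered_at": time.time()}).encode(), coord)
```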
Mobile interconnectivity and telemedicine are important issues for achieving effectiveness in health care, since medical information can be transmitted more quickly, allowing physicians to make faster diagnoses and treatment decisions. In this context, the OTELO project (OTELO, 2001) aims to develop a fully integrated end-to-end mobile tele-echography system using an ultra-light, remote-controlled robot, for population groups that are not served locally by medical experts. An expert located at the expert center will perform the echographic diagnosis. Only a “non-sonographer” person will be present at the isolated site, and the wireless transmission system will be the only link between the two sites. At the master station site, the clinical expert’s role is to control and tele-operate the distant robot by holding a fictive probe.
Ultrasound imaging allows evaluation of the degree of emergency of a patient. However, in some instances, a well-trained sonographer is unavailable to perform such an echography. To cope with this issue, the Mobile Tele-Echography Using an Ultralight Robot (OTELO) project aims to develop a fully integrated end-to-end mobile tele-echography system using an ultralight remote-controlled robot for population groups that are not served locally by medical experts. This paper focuses on the user interface of the OTELO system, which consists of the following parts: an ultrasound video transmission system providing real-time images of the scanned area, an audio/video conference facility for communicating with the paramedical assistant and with the patient, and a virtual-reality environment providing visual and haptic feedback to the expert while capturing the expert’s hand movements. These movements are reproduced by the robot at the patient site while it holds the ultrasound probe against the patient’s skin. In addition, the user interface includes an image processing facility for enhancing the received images and the possibility of storing them in a database.