11.5 million image kitchen dataset released
Computer science researchers at the University of Bristol have released EPIC-KITCHENS, a dataset filmed in thirty-two kitchens across four cities.
The University of Bristol reports that the footage, which comprises 11.5 million images, has been annotated with forty thousand action examples and half a million objects. The dataset will help machines learn from first-person video and advance first-person vision, enabling improvements in robotics, healthcare and augmented reality.
EPIC-KITCHENS is the largest wearable-camera video dataset available to the academic research community for the automatic understanding of object interactions in daily living. It aims to advance the field of first-person vision: perceiving the world from the wearer’s perspective, along with the wearer’s intentions and interactions. Wearable vision is believed to be the next step beyond handheld computer vision.
Dr Dima Damen, Senior Lecturer at the University of Bristol’s Department of Computer Science, said: “First-person vision has been hindered for years by the unavailability of big data. EPIC-KITCHENS will allow the training of data-intensive machine learning algorithms. It offers a bed of interesting challenges, from typical object detection (locating objects in the video) to behaviour analysis and action anticipation.”
EPIC-KITCHENS consists of 11.5 million images, recorded by thirty-two individuals in their own homes over several consecutive days. The dataset is fully annotated for actions and objects in these videos: around forty thousand action examples and half a million objects have been labelled. The annotation is unique in that it is based on the participants narrating their own videos, thus reflecting true intention. The ground truth was then crowdsourced based on these narrations.
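To illustrate, narration-based annotations of this kind are typically distributed as simple records pairing each narrated action with a time span and verb/noun labels. Below is a minimal sketch in Python, assuming a hypothetical CSV layout (the column names and sample rows are illustrative, not the actual release format):

```python
import csv
import io
from collections import Counter

# Hypothetical annotation rows: each narrated action is stored with its
# video, a start/stop time span, the participant's free-form narration,
# and the verb/noun labels derived from it (illustrative layout only).
SAMPLE = """video_id,start_sec,stop_sec,narration,verb,noun
P01_01,12.4,15.9,open the fridge,open,fridge
P01_01,16.2,20.1,take out the milk,take,milk
P02_03,3.0,6.5,open the cupboard,open,cupboard
"""

def load_actions(csv_text):
    """Parse annotation rows into dicts with numeric time spans."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["start_sec"] = float(row["start_sec"])
        row["stop_sec"] = float(row["stop_sec"])
        rows.append(row)
    return rows

actions = load_actions(SAMPLE)
verb_counts = Counter(a["verb"] for a in actions)
print(verb_counts["open"])  # 2 actions labelled with the verb "open"
```

Aggregating over verbs and nouns in this way is how action classes are typically derived from free-form narrations before crowdsourced refinement.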
EPIC-KITCHENS is the outcome of a twelve-month collaboration between the University of Bristol, the University of Toronto (a leading research lab in deep learning and computer vision) and the University of Catania (a highly active research group in first-person vision).
Dr Antonino Furnari, a senior research fellow at the University of Catania, said: “I believe EPIC-KITCHENS will accelerate research in the field by providing realistic data useful to study and understand human-object interactions.”
The collaborators invite research groups worldwide to compete on available challenges and introduce new ones. Dr Sanja Fidler, Assistant Professor at the University of Toronto, said the plan is to “track the community’s progress on established challenges, with held-out test ground truth, via online leaderboards.”
The effort to collect, annotate, benchmark and release EPIC-KITCHENS required the dedication of eleven researchers across the three universities, for a full year. Will Price, PhD student at the University of Bristol, said “As a first year PhD student, I have found participating in the collection and annotation of a dataset enlightening. Working with a dataset of this size introduced me to the necessity of high performance computing to process data in a timely fashion, and challenged my skills in training and modifying neural networks including architectural improvements.”
EPIC-KITCHENS made use of Bristol’s BlueCrystal4 as well as GW4’s JADE high performance computers to process such a large-scale dataset. Dima Damen said: “This is the largest dataset to be released by Bristol to date, in terms of size. The professional effort of the Data.Bris team has been instrumental in today’s release.”
The annotation of EPIC-KITCHENS was made possible via a charitable donation from Nokia Technologies to Dima Damen, as well as seed funding from Bristol’s Jean-Golding Institute for data intensive research.