Research article
First published online January 28, 2021

AIR-Act2Act: Human–human interaction dataset for teaching non-verbal social behaviors to robots

Abstract

To interact better with users, a social robot should understand their behavior, infer their intention, and respond appropriately. Machine learning is one way of implementing robot intelligence: it provides the ability to learn and improve from experience automatically, rather than requiring every behavior to be explicitly programmed. Social skills can also be learned by watching human–human interaction videos. However, human–human interaction datasets that cover interactions in a variety of situations are relatively scarce. Moreover, we aim to use service robots in the elderly care domain, yet no interaction dataset has been collected for this domain. For these reasons, we introduce a human–human interaction dataset for teaching non-verbal social behaviors to robots. It is the only interaction dataset in which elderly people have participated as performers. We recruited 100 elderly people and 2 college students to perform 10 interactions in an indoor environment. The entire dataset contains 5,000 interaction samples, each comprising depth maps, body indexes, and 3D skeletal data captured with three Microsoft Kinect v2 sensors. In addition, we provide the joint angles of a humanoid NAO robot, converted from the human behaviors that robots need to learn. The dataset and useful Python scripts are available for download at https://github.com/ai4r/AIR-Act2Act. It can be used not only to teach social skills to robots but also to benchmark action recognition algorithms.
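
A minimal sketch of the kind of computation the 3D skeletal data supports: deriving an elbow angle from three joint positions, the sort of quantity involved when converting human motion into humanoid joint angles. This is illustrative only; the actual file formats and the human-to-NAO conversion are defined by the Python scripts in the repository linked above, and the function name and coordinates below are assumptions.

    import numpy as np

    def elbow_angle(shoulder, elbow, wrist):
        """Angle (radians) at the elbow between the upper-arm and forearm vectors."""
        upper = np.asarray(shoulder, dtype=float) - np.asarray(elbow, dtype=float)
        fore = np.asarray(wrist, dtype=float) - np.asarray(elbow, dtype=float)
        cos_ang = np.dot(upper, fore) / (np.linalg.norm(upper) * np.linalg.norm(fore) + 1e-8)
        return float(np.arccos(np.clip(cos_ang, -1.0, 1.0)))

    # Hypothetical 3D joint positions (metres) for a single frame of one performer.
    shoulder = [0.20, 1.40, 2.00]
    elbow = [0.20, 1.10, 2.00]
    wrist = [0.20, 1.10, 1.70]
    print(elbow_angle(shoulder, elbow, wrist))  # ~1.57 rad, i.e. a right-angled elbow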

Published In

The International Journal of Robotics Research
Article first published online: January 28, 2021
Issue published: April 2021

Keywords

Social robot, machine learning, human–human interaction, the elderly

Rights and permissions

© The Author(s) 2021.

Authors

Affiliations

Woo-Ri Ko
Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
Minsu Jang
Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
Jaeyeon Lee
Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea
Jaehong Kim
Electronics and Telecommunications Research Institute (ETRI), Daejeon, Republic of Korea

Notes

Woo-Ri Ko, ETRI, 218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, KR. Email: [email protected]
