Dataset: Simulated Customer-Shopkeeper Interactions with Periodically Changing Products

1 Overview

This webpage contains supplementary data and files to accompany “Data-driven Imitation Learning for a Shopkeeper Robot with Periodically Changing Product Information” (see Reference). It contains: (1) the simulated dataset of shopkeeper-customer interactions (in English); and (2) our neural network code. [9.37MB]

2 Simulated Interaction Dataset and Code

To train and evaluate a system that can deal with periodically changing products, we needed a dataset of in-store interactions collected over a prolonged period, so that it includes such product changes. In the future, it will be possible to collect social interaction data in stores to train social service robots. For example, passive sensors, such as person-tracking 3-D range sensors and voice-recording microphone arrays, could record customer-employee dialogues about products in the store. However, such a dataset does not yet exist, and collecting one, even in a laboratory setting, would be prohibitively time-consuming and costly.

Therefore, to demonstrate the proposed system as a proof-of-concept, we simulated customer-shopkeeper interactions in a camera shop scenario, where the cameras on display periodically change. This was accomplished by creating an interaction simulator that combined elements of previously collected human-human interactions, hand-designed new products, and new utterances collected via crowdsourcing.

2.1 File and Directory Descriptions

  • ‘2020-01-08_advancedSimulator9’ contains the simulated interactions. Each of the 11 CSV files consists of 1000 interactions simulated with the database snapshot indicated by the number in the filename (e.g., “0-00”). The field names are:
    • TRIAL – ID of the trial. A trial starts when the customer enters the store and ends when they leave.
    • DATABASE_ID – ID of the database snapshot that was used to simulate the interaction.
    • DATABASE_CONTENTS – The contents of the database that were used in the shopkeeper utterance. (Used for simulation.)
    • TURN_COUNT – The number of the current turn within the interaction.
    • CURRENT_CAMERA_OF_CONVERSATION – The camera that is being talked about by the shopkeeper or customer. (Used for simulation.)
    • PREVIOUS_CAMERAS_OF_CONVERSATION – The cameras that the shopkeeper and customer have previously talked about. (Used for simulation.)
    • PREVIOUS_FEATURES_OF_CONVERSATION – The features (attributes) of the cameras that were previously talked about. (Used for simulation.)
    • CUSTOMER_ACTION – The abstract customer action. (Used for simulation.)
    • CUSTOMER_LOCATION – The customer’s location in the store.
    • CUSTOMER_TOPIC – The customer’s topic of conversation. (Used for simulation.)
    • CUSTOMER_SPEECH – The customer’s utterance.
    • SPATIAL_STATE – The spatial state of the shopkeeper at the time of the customer’s action.
    • STATE_TARGET – In the case of the PRESENT_OBJECT spatial state, this indicates which object is being presented.
    • OUTPUT_SHOPKEEPER_ACTION – The abstract shopkeeper action. (Used for simulation.)
    • OUTPUT_SHOPKEEPER_LOCATION – The location that the shopkeeper goes to after the customer’s action.
    • SHOPKEEPER_TOPIC – The topic of the shopkeeper’s speech.
    • SHOPKEEPER_SPEECH – The shopkeeper’s utterance.
    • OUTPUT_SPATIAL_STATE – The spatial state of the shopkeeper after the customer’s action.
    • OUTPUT_STATE_TARGET – See above.
    • SHOPKEEPER_SPEECH_DB_ENTRY_RANGE – The substring start and end indices of the shopkeeper utterance which matched with contents from the product information database.
    • SHOPKEEPER_SPEECH_WITH_SYMBOLS – The shopkeeper’s utterance after the matching substring is replaced by an attribute symbol.
    • SHOPKEEPER_SPEECH_STRING_SEARCH_TOKENS – The sequence of tokens that represent the shopkeeper utterance and were used for approximate string search.
    • SYMBOL_MATCH_SUBSTRINGS – Substrings of the shopkeeper utterance which matched with contents from the product information database.
    • SYMBOL_CANDIDATE_DATABASE_INDICES – The database indices of the database contents that matched substrings of the shopkeeper utterance.
    • SYMBOL_CANDIDATE_DATABASE_CONTENTS – The contents of the database that approximately matched the shopkeeper utterance.
  • ‘database_snapshots’ contains the database snapshots used to simulate the interactions. The rows correspond to the different cameras and the columns correspond to the camera attributes.
  • ‘shopkeeper_keywords_with_symbols’ contains the keywords used for speech clustering, which were extracted from the shopkeeper utterances with symbols (for the proposed system). Keywords were identified using an online API (Watson).
  • ‘shopkeeper_keywords_without_symbols’ similarly contains the keywords which were identified in the shopkeeper utterances without symbols (for the baseline system).
  • ‘20200109_withsymbols shopkeeper-tristm-3wgtkw-9wgtsym-3wgtnum-mc2-sw2-eucldist- speech_clusters.csv’ contains the shopkeeper speech clusters used in the proposed system.
  • ‘20200116_nosymbols shopkeeper-tristm-3wgtkw-9wgtnum-mc2-sw3-eucldist- speech_clusters.csv’ contains the shopkeeper speech clusters used in the baseline system.
  • ‘’ contains the Proposed, Baseline, and COREQA models whose results were reported in the paper.
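
As a rough illustration of how the interaction CSV files described above might be read, the following Python sketch groups rows into trials using the TRIAL and TURN_COUNT fields, and shows how a matched substring could be swapped for an attribute symbol. It is only a sketch under several assumptions: the sample rows are invented, TURN_COUNT is assumed to be an integer, the start and end indices are assumed to follow Python's end-exclusive slicing, and the symbol “<price>” and both function names are hypothetical. The actual encoding of SHOPKEEPER_SPEECH_DB_ENTRY_RANGE should be checked against the files themselves.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-in for one interaction CSV.
# The values are invented for illustration; only the column names come from the dataset.
SAMPLE_CSV = """TRIAL,TURN_COUNT,CUSTOMER_SPEECH,SHOPKEEPER_SPEECH
0,1,Hello,Welcome to the shop
0,0,,
1,0,How much is this one?,It costs $550
"""


def group_turns_by_trial(csv_file):
    """Return {trial_id: [row, ...]} with each trial's rows ordered by TURN_COUNT.

    Assumes TURN_COUNT holds an integer in every row.
    """
    trials = defaultdict(list)
    for row in csv.DictReader(csv_file):
        trials[row["TRIAL"]].append(row)
    for rows in trials.values():
        rows.sort(key=lambda r: int(r["TURN_COUNT"]))
    return dict(trials)


def replace_span_with_symbol(utterance, start, end, symbol):
    """Replace utterance[start:end] with symbol (end-exclusive indices are assumed)."""
    return utterance[:start] + symbol + utterance[end:]


trials = group_turns_by_trial(io.StringIO(SAMPLE_CSV))
for trial_id, rows in trials.items():
    print(trial_id, [r["SHOPKEEPER_SPEECH"] for r in rows])

print(replace_span_with_symbol("It costs $550", 9, 13, "<price>"))
```

For a real file, the same function can be given a file object opened with open(path, newline="", encoding="utf-8") in place of the in-memory sample.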

3 The Code

The open-source neural network code (‘’) is intended to provide additional model architecture details that were not included in the paper (e.g., hyperparameters). Therefore, the code is not meant to be run; we open-source only the code that defines the neural network architecture. Readers who would like additional code and/or processed data in order to actually train/test the networks should contact us (below).

4 Usage

This dataset and code are free to use for research purposes only. This is not a production version; please use it at your own risk. If you use either the dataset or the code in your work, please be sure to cite the reference below.

5 Reference

Malcolm Doering, Dražen Brščić, and Takayuki Kanda. 2021. Data-driven Imitation Learning for a Shopkeeper Robot with Periodically Changing Product Information. ACM Trans. Hum.-Robot Interact. (in press).

6 Related Work

We encourage you to also look at the following works from our lab about learning interaction techniques from human-human data:

6.1 Publications

  • Nanavati, A., Doering, M., Brščić, D., and Kanda, T., 2020. Autonomously Learning One-To-Many Social Interaction Logic from Human-Human Interaction Data. In Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction (HRI’20).
  • Doering, M., Liu, P., Glas, D.F., Kanda, T., Kulić, D., and Ishiguro, H., 2019. Curiosity Did Not Kill the Robot: A Curiosity-based Learning System for a Shopkeeper Robot. ACM Trans. Hum.-Robot Interact.
  • Doering, M., Kanda, T., and Ishiguro, H., 2019. Neural-network-based Memory for a Social Robot: Learning a Memory Model of Human Behavior from Data. ACM Trans. Hum.-Robot Interact.
  • Doering, M., Glas, D.F. and Ishiguro, H., 2019. Modeling Interaction Structure for Robot Imitation Learning of Human Social Behavior. IEEE Transactions on Human-Machine Systems.
  • Liu, P., Glas, D.F., Kanda, T. and Ishiguro, H., 2018. Learning proactive behavior for interactive social robots. Autonomous Robots, pp.1-19.
  • Glas, D.F., Doering, M., Liu, P., Kanda, T. and Ishiguro, H., 2017, March. Robot’s Delight: A Lyrical Exposition on Learning by Imitation from Human-human Interaction. In Proceedings of the Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction (pp. 408-408). ACM.
  • Liu, P., Glas, D. F., Kanda, T., & Ishiguro, H., Data-Driven HRI: Learning Social Behaviors by Example from Human-Human Interaction, in IEEE Transactions on Robotics. 2016. doi:10.1109/tro.2016.2588880

6.2 Datasets

You can find similar camera shop interaction datasets here:

7 Contact

Please contact doering [at] with any questions, comments, concerns, and/or inquiries about this dataset, code, or research project more generally.