Knowledge-based Programming by Demonstration using semantic action models for industrial assembly
IROS 2024
*Please direct correspondence to ding@fortiss.org
Abstract
In this paper, we introduce a knowledge-based Programming by Demonstration (kb-PbD) paradigm to facilitate robot programming in small and medium-sized enterprises (SMEs). PbD in production scenarios requires the recognition of product-specific actions, but is hampered by the lack of suitable and comprehensive datasets, owing to the large variety of hand actions involved across different production scenarios. To address this issue, we utilize standardized grasp types as the fundamental feature for recognizing basic hand movements, where a Long Short-Term Memory (LSTM) network is employed to recognize grasp types from hand landmarks. The product-specific actions, aggregated from the basic hand movements, are formally modeled in a semantic description language based on the Web Ontology Language (OWL). Description Logic (DL) is used to define the actions with their characteristic properties, which enables the efficient classification of new action instances by an OWL reasoner.
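To make the grasp-recognition step concrete, here is a minimal sketch of such an LSTM classifier over hand-landmark sequences. It assumes 21 three-dimensional landmarks per frame (e.g., as produced by a hand-tracking library such as MediaPipe Hands); the hidden size and the number of grasp-type classes are illustrative placeholders, not the configuration used in the paper.

import torch
import torch.nn as nn

class GraspTypeLSTM(nn.Module):
    # LSTM over flattened (x, y, z) hand-landmark sequences; all sizes
    # below are illustrative placeholders, not the paper's configuration.
    def __init__(self, n_landmarks=21, n_grasp_types=16, hidden_size=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_landmarks * 3,
                            hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_grasp_types)

    def forward(self, x):
        # x: (batch, frames, n_landmarks * 3)
        _, (h_n, _) = self.lstm(x)
        return self.head(h_n[-1])  # logits over grasp types

model = GraspTypeLSTM()
window = torch.randn(1, 30, 21 * 3)  # dummy 30-frame landmark window
logits = model(window)               # shape: (1, 16)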
The semantic models of hand actions, robot tasks, and workcell resources are interconnected and stored in a Knowledge Base (KB), which enables the efficient pair-wise translation between hand actions and robot tasks. For the reproduction of human assembly processes, actions are converted to robot tasks via skill descriptions, while reusing the action parameters of the involved objects to ensure product integrity. We showcase and evaluate our method in an industrial production setting for control cabinet assembly.
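The DL-based classification of new action instances can be sketched with owlready2 as follows. The ontology IRI, the sub-action classes, and the property hasSubAction are our own illustrative assumptions; only PickAndPlaceAction is taken from the paper, and the actual KB schema may differ.

from owlready2 import get_ontology, Thing, sync_reasoner

onto = get_ontology("http://example.org/kbpbd.owl")  # hypothetical IRI

with onto:
    class Action(Thing): pass
    class GraspAction(Action): pass
    class MoveAction(Action): pass
    class ReleaseAction(Action): pass

    class hasSubAction(Action >> Action): pass  # assumed property

    # Assumed DL definition: an action that aggregates a grasp, a move,
    # and a release is equivalent to a PickAndPlaceAction.
    class PickAndPlaceAction(Action):
        equivalent_to = [Action
                         & hasSubAction.some(GraspAction)
                         & hasSubAction.some(MoveAction)
                         & hasSubAction.some(ReleaseAction)]

# An action instance as produced by the perception pipeline.
observed = onto.Action("observed_action_1")
observed.hasSubAction = [onto.GraspAction(), onto.MoveAction(),
                         onto.ReleaseAction()]

with onto:
    sync_reasoner()  # HermiT reclassifies the new instance

# Under these assumptions, the instance is inferred to be a PickAndPlaceAction.
print(onto.PickAndPlaceAction in observed.is_a)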
Demonstration video
A human's assembly process for an electrical control cabinet is captured and converted into a robot assembly process, where the product-specific parameters, i.e., the type and position of the terminal block, are reused to keep the same product configuration.
Concept of our kb-PbD paradigm with the industrial use case of control cabinet assembly. For a correct reproduction with robots, the product-specific actions have to be recognized and linked to robot skills.
Overview of the kb-PbD paradigm with a perception pipeline for hand action recognition from RGB-D images (left). The KB semantically encodes and stores the hand action instances, within which a reasoning engine classifies the skill-based actions via logical inference (top right). Robot tasks are constructed from hand actions and executed in the workcell (bottom right).
An example of action recognition from the RGB-D video. We utilize grasp types and hand velocities (with a threshold ϵ set to 0.3 m/s) as criteria to instantiate the PrimitiveActions in the KB. A PickAndPlaceAction is composed from the primitive actions and inferred to be of the product-specific type, which is defined in DL (Listing 2) over the action parameters.
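One possible reading of these instantiation criteria, as a Python sketch: per-frame grasp labels and hand speeds are mapped to primitive-action labels, and consecutive frames with the same label are merged into one PrimitiveAction segment. Only the threshold ϵ = 0.3 m/s is taken from the caption; the concrete label mapping and function names are assumed for illustration.

EPSILON = 0.3  # hand-velocity threshold in m/s (from the caption)

def instantiate_primitive_actions(grasp_labels, speeds, timestamps):
    # Map each frame to a primitive-action label; this mapping is an
    # assumed example, not the paper's exact rule set.
    def frame_label(grasp, speed):
        if grasp == "none":  # empty hand
            return "IdleAction" if speed < EPSILON else "ReachAction"
        return "GraspAction" if speed < EPSILON else "TransportAction"

    labels = [frame_label(g, s) for g, s in zip(grasp_labels, speeds)]
    actions, start = [], 0
    for i in range(1, len(labels) + 1):
        # Close a segment when the label changes or the sequence ends.
        if i == len(labels) or labels[i] != labels[start]:
            actions.append({"type": labels[start],
                            "t_start": timestamps[start],
                            "t_end": timestamps[i - 1]})
            start = i
    return actions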
An example of the pair-wise translation from a hand action to a robot task, both of which are semantically connected to the skill domain. The construction and parameterization of the robot task are enabled by a SPARQL query (Listing 6), which efficiently reuses the action parameters (marked in red) to ensure a correct reproduction.
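A sketch of such a translation query with rdflib is shown below. The namespace, the property names (actsOn, hasGoalPose), and the exported KB file name are assumptions, as Listing 6 itself is not reproduced on this page.

from rdflib import Graph

g = Graph()
g.parse("knowledge_base.ttl")  # assumed export of the KB

# Hypothetical vocabulary; the paper's exact property names may differ.
QUERY = """
PREFIX kb: <http://example.org/kbpbd#>
SELECT ?action ?objectType ?goalPose WHERE {
  ?action a kb:PickAndPlaceAction ;
          kb:actsOn ?object ;
          kb:hasGoalPose ?goalPose .
  ?object a ?objectType .
}
"""

for row in g.query(QUERY):
    # Reuse the demonstrated parameters (object type, goal pose) to
    # parameterize the matching robot skill.
    robot_task = {"skill": "PickAndPlace",
                  "object_type": str(row.objectType),
                  "goal_pose": str(row.goalPose)}
    print(robot_task)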
Demonstration video
Traditional GUI programming
Programming efficiency comparison with a classical GUI-based approach for the same product with 5 terminal blocks.
Poster
BibTeX
@inproceedings{Ding24-PbD,
author = {Ding, Junsheng and Zhang, Haifan and Li, Weihang and Zhou, Liangwei and Perzylo, Alexander},
title = {Knowledge-based Programming by Demonstration using semantic action models for industrial assembly},
booktitle = {IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
year = {2024},
month = oct,
address = {Abu Dhabi, UAE},
}