Exploiting visual cues for safe and flexible cyber-physical production systems

Human workers are envisioned to work alongside robots and other intelligent factory modules, and fulfill supervision tasks in future smart factories. Technological developments, during the last few years, in the field of smart factory automation have introduced the concept of cyber-physical systems, which further expanded to cyber-physical production systems. In this context, the role of collaborative robots is significant and depends largely on the advanced capabilities of collision detection, impedance control, and learning new tasks based on artificial intelligence. The system components, collaborative robots, and humans need to communicate for collective decision-making. This requires processing of shared information keeping in consideration the available knowledge, reasoning, and flexible systems that are resilient to the real-time dynamic changes on the industry floor as well as within the communication and computer network infrastructure. This article presents an ontology-based approach to solve industrial scenarios for safety applications in cyber-physical production systems. A case study of an industrial scenario is presented to validate the approach in which visual cues are used to detect and react to dynamic changes in real time. Multiple scenarios are tested for simultaneous detection and prioritization to enhance the learning surface of the intelligent production system with the goal to automate safety-based decisions.


Introduction
The current era is experiencing a revolution in production systems, driven by interconnected equipment, process automation, real-time data processing for decision tools, and human-machine collaboration. These activities are collectively termed the fourth industrial revolution, or Industry 4.0. Such systems are considered highly flexible and productive. The concept is evolving from automation to intelligence, although it is still at a nascent stage. A cyber-physical system (CPS) forms the basis of these production centers and is defined as machines connected over a network and controlled by computers/users via sophisticated algorithms. Therefore, the CPS can be termed a smart system incorporating physical and computational elements. 1 These elements can be divided into four parts, that is, sensing, networking, analysis, and application. 2 In the realm of Industry 4.0, the term cyber-physical production systems (CPPSs) emerged in Germany, proposing complete automation of production systems that incorporate interconnected physical elements such as robots, conveyors, sensors, and actuators controlled by computational elements. These systems are flexible to the extent that they incorporate changes which are already stated or provided through decision rules. 3 The Internet of things (IoT) is an emerging communication paradigm through which the elements of a CPPS, each having a unique identity, can interact with each other. 4 This further leads to the concept of production systems that are autonomous through the use of artificial intelligence (AI) and IoT, commonly known as smart factories. 
5 Although robots and computers take a major share in the CPS, human presence is essential for productivity, either for supervision or for complicated jobs that robots cannot undertake. A conceptual smart factory based on an anthropocentric CPS is presented in Pirvu et al., 6 which states the crucial role of the human factor in any cyber-physical engineering artifact. With the increased interaction between humans and machines, the latest CPSs face issues in the social domain, and systems are now designed to cater for these aspects. 7 The transfer of information is no longer limited to computational and physical elements; social elements are also incorporated, which may be physical and verbal signs or social gaze. This requires monitoring the human operator's physical and cognitive activity to determine the operator's intentions which, after analysis, may be converted into tasks to be performed by the CPS. The collaborative robots (cobots) introduced here should perform efficiently and legibly while undergoing joint operations in a social environment, for example with respect to proxemics. 8 The key concept is derived from social intelligence in humans and other social animals. The role and qualities of a cobot depend on the specific application; for example, a robot working in an industry should possess different traits than one serving old or disabled people at home. 9 A similar concept of social robots in the food industry is presented in Khan et al. 
10 which describes their utility in the food industry, ranging from service provision to production; this requires various social skills such as social interaction and cleanliness. One of the basic elements of any social interaction among humans or animals is the assurance of safety from others. In any operation involving human-robot interaction, real-time safety assurance is required. Safe human-robot collaboration (HRC) for heavy payload industrial robots is proposed in Khalid et al., 11 and a CPS is suggested incorporating off-the-shelf sensors to implement both security and safety. Secured data and health monitoring at different nodes is proposed for the safety of a CPS in Khalid et al. 12 In medical science, safety is divided into two categories, as stated by McEwen and Stellar, 13 namely physical and psychological safety. The first assures the avoidance of physical contact or of damage due to contact, whereas the latter assures the avoidance of discomfort or stress due to repeated interaction. A conceptual system assuring both physical and psychological safety is presented in Lasota et al. 14 The idea was based on real-time evaluation of the separation distance, which does not require any major modification of existing system hardware. This real-time measurement is used to control and adjust the speed of the robot. A matching idea for human psychological comfort due to the effect of robot motion is presented in Dragan et al. 
15 The motion of the robot is divided into three categories: functional, predictable, and legible motion. These were then analyzed to gauge the human comfort level when subjected to each type of motion. Legible motion is given preference over predictable motion in collaborative tasks; both are types of functional motion. The concept of legibility in the article states that humans can infer the robot's goal with ease from its motion, which makes them feel more comfortable during collaborative tasks. In predictable motion, the goal is known prior to the move; however, the operator does not feel comfortable due to the robot's initial path. The concept of legibility stated above is limited to the robot's motion; there is a need to design a system which can provide legibility of the complete processes involved in the CPS. As Industry 4.0 recommends the use of intelligent robots, the concept of comfort to human users, that is, both physical and psychological safety, is equally valid for intelligent robots. Here, by intelligent robots, we mean the ''CPS'' as the complete system that makes these robots intelligent. This can be done by increasing flexibility in the system to handle any contingency in the task, for example, finding a bolt while performing an operation on a nut in an assembly line.
The production centers involve many processes, from the supply of raw material to manufacturing, assembly, packaging, delivery, and so on, requiring both machines and humans. Changing scenarios that divert from the main task affect the efficiency of the system and must be looked after. ][18][19] Mainly, two broad types of sensors are used: one based on vision systems and the other based on proximity/contact. The safety system presented comes into action as soon as the human operator comes into contact with, or into the near vicinity of, the robot.
A list of state-of-the-art cobots is presented in Khalid et al. 20 showing their capabilities for safe human and robot collaboration. The list shows that force sensors, torque sensors, and visual/infrared (IR) cameras are used for collision detection. However, these systems do not provide a way to differentiate the user, nor do they take into account the interaction with any foreign object. One important aspect that must be catered for is the change in human intention. An object classification technique was, however, used in Sharma et al. 21 to identify a human body and some objects available in a workspace. The objective is to classify objects in the area of interest of the robot.
Ontology management is widely used to describe what exists in a system and what relationships exist among its elements. It is based on memories that integrate informal, semi-formal, and formal knowledge in order to facilitate its access, sharing, and reuse by members of the system for solving their individual or collective tasks. This requires the knowledge of engineers, domain experts, analysts, and interviewers, among others. 22 Ontology can serve as the common basis for communication between humans and machines. Marvel et al. 23 utilized a flexible ontology for risk assessment during HRC. This was done by characterizing and decomposing tasks to the subcomponent level. Safety was assessed at each subtask component by evaluating the base elements. Djuric et al. 24 state that the effectiveness of a collaborative task in an advanced production environment depends on how well the technology is integrated. The Zachman framework, a management technique generally used for enterprise architecture design, is modified by the authors for cobot integration. It provides a foundation for a four-tier framework comprising the system, work cell, machine, and worker levels. What, how, where, and why questions are answered at each tier. The ontology approach is proposed for task assessment, which should be evaluated through task base elements and subtask components. Risk assessment and productivity are considered key factors. Another ontology approach is presented in Sadik and Urban 25 for an HRC-based manufacturing work cell. The authors define a common language to address the communication protocol between the operator and the robot in the shared environment. Understanding and reasoning of the common language is worked out so that the collaborative work cell can adapt to changes in production demands. A collective system is formulated by teaming together the artificial agents in a flexible and distributed arrangement to overcome issues beyond the capability of a single agent. A software 
agent is a computer system situated in a specific environment that is capable of performing autonomous actions in this environment in order to meet its design objective. Olszewska et al. 26 report the first IEEE RAS ontology standard for autonomous robotics developed by the IEEE RAS Autonomous Robotics (AuR) Study Group, for which the first implementation is successfully validated for a human-robot interaction scenario. This article involves a complex industrial scenario that is handled by the cobot using visual cues, incorporating object detection, pose estimation, and location using AI, where the decision-making happens in a real-time complex social space completely defined by the ontology-based framework. Due to the complex nature of real-world scenarios, in which a large number of role players, their interactions, and intricate relationships form a dynamic system, the system cannot be modeled using a traditional mathematical model and an ontology-based strategy is preferred. The cobot implementation in the CPPS and solving for a social space to tackle issues related to physical and psychological safety become a new dimension to explore. Although the elements of the system are conventional, their integration forms a system that can be trained to handle complex dynamic situations. Section ''Scenario'' of the article describes the problem statement posed by an HRC-based CPPS, section ''Methodology'' describes the methodology to counter the stated problem, section ''Case ontology'' describes the specific case considered for the ontology-based solution, section ''Indexing of the anxiety factor'' describes the indexing of the anxiety factor, and section ''Experimental setup'' describes the experimental setup to validate the approach and the results obtained.

Scenario
An industrial scenario in a factory is considered where different types of parts arrive at a station. A cobot is envisaged to perform operations in the presence of a human operator who has both supervisory and collaborative roles. A set of assigned tasks is dedicated to the cobot and the operator to complete the desired operation. However, an unforeseen event that deviates from the intended situation may arise, for which the system lacks flexibility. Physical and psychological safety for both the cobot and the human operator must be kept in consideration while planning for this flexibility. Physical safety here means the avoidance of harmful physical contact between the robot and the operator, and seamless safe operation between the equipment and the human. Psychological safety is the comfort both the operator and the CPPS may feel in case of an unforeseen event, such that any contingency will not compromise the main goal while assuring physical safety to the system. It should be added that the legibility of the operator's intention to the robot and of the robot's intention to the operator may further optimize the goal and psychological safety. The work zone, considered as a social space, is designed in a safe and secure way with the help of integrated devices, IoT, and AI. Rigorous training of the robot within the environment can bring about a situation where almost all unforeseen events can be handled. Changes in the defined set of scenarios are catered for first by detecting them and then by addressing them according to the situation. Data training through AI can build rigorous scenarios to bring more flexibility. These should then be updated in the existing schedule of tasks. A problem also arises where multiple situations occur at a single moment. An illustration of the aforementioned industrial scenario is shown in Figure 1. As an example, a packaging work cell scenario is considered. The possible list of situations considered for a 
limited model is as follows: right item in place/right feed for packaging (intended situation), imminent collision between the cobot and the operator (inherent resilience is present in the latest cobots), wrong feed of parts, no feed of parts, interference by the human operator who detected wrong positioning of parts or a quality issue in parts, displacement of other affiliated items such as the packaging box in the example, entry of a person other than the operator, and entry of foreign objects. Here, each scenario can lead to multiple subscenarios not defined in the existing system, which can be detected and disposed of through AI; for example, various previously unregistered parts may arrive, or a variety of foreign objects unknown to the conventional system may enter the workspace, and these can be detected and handled through object detection.

Methodology
A new concept is presented based on a virtual domain for physical and psychological safety of CPS. The complex real-world industrial scenario comprises a large number of elements that interact in a non-linear way with each other and exhibit the emergence of unplanned activities, lack of complete knowledge, and ethical and safety issues. This is a new domain in which the conventional visual cues method and ontology-based modeling are implemented with AI to manage industrial operations in an intelligent way that can provide a thinking base to the CPPS. The CPPS architecture and characteristics show that connectivity, sociability, flexibility, adaptability, and a highly automated nature are inherent parts of its operation. These characteristics and properties clearly indicate that a complex industrial CPPS cannot fully operate based on a conventional mathematical control model. The framework of the methodology used is presented in Figure 2.
Visual cues are used to identify the current scenario/change through object/pose detection, and the controller adopts the contingency plan accordingly. The scenarios for object detection can be right/wrong/no feed of part, unidentified person entry, and entry of foreign objects into the workspace. YOLO, an object detection algorithm, 27 is used for the detection of items in use, that is, persons, bottles, cans, and so on (see Figure 3). It is a fast detection algorithm that detects objects by running a neural network on a pre-trained classifier network. The state-of-the-art version of the algorithm, namely, YOLOv3, 28 is trained to detect custom objects. 29 Interference of the human operator is another scenario, which is ascertained through pose detection of the operator, that is, whether or not the pose is in line with the pose to pick/dispose of the object. OpenPose 30 is used to detect the human operator's pose; it is an algorithm that estimates the position of human limbs in a two-dimensional (2D) image (see Figure 4). The authors used a neural network to first detect different body parts and then find their association with each other, referred to as a full-body pose. Each position of a limb produces a 2D vector for a specific pose, so a pose is described by a set of 2D vectors comprising the positions of all limbs. In this way, machines can have an understanding of operators working alongside them when performing a specific job in a specific pose.
Combinations of poses can be verified to detect different situations, such as the combination of two poses to differentiate disposal of a wrong item from disposal of a quality-issue-related item. This can be done by detecting a pose first for the pickup location and then for the location of the disposal. As discussed previously, the psychological safety of the CPS is under consideration and a new factor, that is, the anxiety of a CPS, is introduced. The name ''anxiety'' is chosen to relate to a human-like capability that can determine which situation is affecting the system most or creating an uncomfortable situation, in case of multiple scenarios at one time. It is important to highlight that there is no single equivalent criterion to gauge the priority of all situations in the context of the overall scenario. A management technique based on a qualitative analysis, that is, Ishikawa analysis, is used to calculate the anxiety. Whenever one or more changes in the scenario are detected, the ''anxiety'' of the CPS is ascertained. On deciding the contingency from the anxiety factor, the control algorithm determines what actions are to be performed according to the ontology related to that particular contingency and commands the cobot to perform the specific tasks. The change in the process is registered and the sequence of tasks to be performed later is modified in the algorithm. The ontology in this article is formed for a particular case study to provide a general idea which can later be expanded to cater for multipurpose scenarios for CPSs.

Case ontology
A packaging industry scenario is considered in which bottles and cans are to be packed in a carton. The bottles and cans come from the production bay on a conveyor, where a human operator segregates them and places them at designated locations. From these locations, the items are to be picked up by the cobot and packaged in a carton. The carton is placed at a fixed location where, after packaging, it is replaced by the operator. The sequence of packaging is arbitrary; the operator decides whether to place bottles or cans first. The item to be packed first decides the status of that item in that scenario, that is, the current or next packaging item. In our case, six bottles and six cans are to be packed by the cobot, either of which can be first, and all are placed in a sequence by the human operator, that is, the bottles are placed in a row ahead of a row of cans.
Two machine vision cameras and a proximity sensor for the carton location are used to detect input/change in the scenario. If the right item is in place, the object is detected from the camera input and packaged by the cobot at the dedicated place in the box. If the wrong item is in place, the item is picked up by the cobot and placed at a spot dedicated to redundant items. The vacant space is then filled by the operator and the cobot moves to the specific location. At every subsequent location, the item is checked to determine whether it is right or wrong. The system also requires protection from hazards such as collision with unidentified persons or any foreign object in the workspace. These are detected through object detection techniques using the camera input. The dedicated operator is identified by the marked helmet he or she is wearing; in our case, a person detected together with the cross mark is the authorized person. There could be different interferences/changes in the intended scenario in the form of the human operator's intervention or the displacement of the carton from the intended location.
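The detection rules described above can be sketched as a small mapping from detector labels to scenarios. This is only an illustrative sketch: the label names (`person`, `marked_helmet`, `bottle`, `can`), the scenario names, and the rule that each authorized operator contributes exactly one `marked_helmet` detection are assumptions made here and are not taken from the article's implementation.

```python
from collections import Counter

# Hypothetical label sets; the article does not list its class names.
KNOWN_ITEMS = {"bottle", "can"}
PERSON_LABELS = {"person", "marked_helmet"}


def scenarios_from_labels(labels, expected_item):
    """Translate a list of detected labels into a set of scenario names."""
    counts = Counter(labels)
    scenarios = set()

    # Assumption: one "marked_helmet" detection per authorized operator.
    if counts["person"] > counts["marked_helmet"]:
        scenarios.add("unidentified_person")

    # Anything that is neither a known item nor a person is treated as foreign.
    if any(l not in KNOWN_ITEMS and l not in PERSON_LABELS for l in labels):
        scenarios.add("foreign_object")

    items = [l for l in labels if l in KNOWN_ITEMS]
    if not items:
        scenarios.add("no_feed")
    elif expected_item in items:
        scenarios.add("right_item")
    else:
        scenarios.add("wrong_item")
    return scenarios
```

Returning a set rather than a single label reflects the point made in the text that several situations may be detected at the same moment and must then be prioritized.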
The human intervention can be due to two situations: either the operator finds the item damaged/broken, or the operator assesses that the cobot may not be able to pick up the object due to its intended movement or wrong placement of the object. In the first case, the item is placed by the operator at the damaged/broken item spot; however, the count is not increased and the cobot returns to the same location on completion of the task. In the second case, the operator picks the item himself or herself and places it in the box, and the control system increases the count so that the cobot may move to the next location. These two cases are verified by the combination of two pose detections, that is, if pose 1 and pose 2 are in combination, then a damaged/broken item is removed, and if pose 1 and pose 3 are in combination, then the operator has interfered and placed the item in the carton either to improve efficiency or to bypass an imminent error in the system (see Figure 7). Pose 1 is the pose of the operator to access the stage from where items are to be picked by the cobot. Pose 2 is the pose of the operator to place broken/damaged items at their spot. Pose 3 is the pose of the operator while placing the item in the carton, as mentioned in the second case. In case the carton is displaced from the dedicated location, which can be detected through proximity sensors/light-dependent resistors (LDRs), the cobot moves the carton to its proper place by pushing it to the fixed enclosure. In case two or more scenarios are detected or overlap, the action to be taken is decided based on the priority decided by the CPS. The priority in this case is set by the anxiety factor of the CPS, the indexing of which is explained in subsequent paragraphs.
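A minimal sketch of the pose-combination rule follows, assuming the pose detector output has already been reduced to a sequence of pose labels. The string labels, the returned names, and the requirement that the two poses appear consecutively in that sequence are assumptions for illustration; the article only specifies that pose 1 with pose 2 means a damaged item was removed and pose 1 with pose 3 means the operator packed the item.

```python
# Hypothetical labels for the three poses named in the text.
ACCESS_STAGE = "pose_1"      # operator reaches the pickup stage
PLACE_DAMAGED = "pose_2"     # operator places an item at the damaged/broken spot
PLACE_IN_CARTON = "pose_3"   # operator places an item in the carton


def classify_intervention(poses):
    """Map a sequence of pose labels to (intervention, increment_count).

    Returns (None, False) when no known combination is present.
    """
    for first, second in zip(poses, poses[1:]):
        if first == ACCESS_STAGE and second == PLACE_DAMAGED:
            # Damaged item removed: the cobot returns to the same location.
            return "damaged_item_removed", False
        if first == ACCESS_STAGE and second == PLACE_IN_CARTON:
            # Operator packed the item: the cobot moves to the next location.
            return "operator_packed_item", True
    return None, False
```

The boolean in the result mirrors the count handling described above: the count stays unchanged for a damaged item and is increased when the operator packs the item.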
The above statement is now converted into formal ontology diagrams as shown in Figures 5-7. The Integrated Definition for Process Description Capture Method (IDEF3) 31 is used for the formulation of the described ontology. It is a process description modeling method, that is, it captures the knowledge of how a system works. The whole ontology can be described in one single diagram, but for the sake of understanding and to avoid cluttering, some subprocesses are converted into submodules. The main module is shown in Figure 5. The submodule that identifies whether the item in place is right or wrong is shown in Figure 6. The scenario implementation submodule is shown in Figure 7.

Indexing of the anxiety factor
As described previously, in case of multiple scenarios, the CPS acts on the scenario having maximum anxiety. The problem was to establish a criterion to rank each scenario's priority. When it comes to problem solving in the behavioral and social domain, computers cannot surpass the human mind owing to its intrinsic properties of consciousness, perception, judgment, and thinking. Therefore, a management technique based on a brainstorming tool is used to cater for the case under consideration, which exists in the social domain. Ishikawa analysis is used to find the anxiety of the CPS and to allot weights to each scenario. Six experts in the domain were consulted for brainstorming and for giving weights to each scenario in comparison to every other scenario. The Ishikawa analysis is shown in Figure 8, which shows the anxiety factor that is to be determined on the right-hand side of the fishbone diagram. The main headings on the left show the scenarios under consideration. The subheadings under each scenario are stated to give a weight to each in relation to the main heading; this could be 1 or 0. Each weight incorporated is based on the voting of the six experts mentioned earlier.
The ranking of all scenarios is done based on the total weights assigned, as shown in Table 1. The calculated indexes are then used in the ontology diagram.
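One way to turn binary expert votes into a ranked index is sketched below. The normalization used here (sum of 0/1 votes divided by the number of votes) is an assumption for illustration; the article states only that weights are 0 or 1, come from six experts, and are totalled to rank the scenarios. The votes in the example are invented and are not the data behind Table 1.

```python
def anxiety_index(votes):
    """Fraction of expert votes (each 0 or 1) assigned to a scenario."""
    if not votes:
        raise ValueError("at least one vote is required")
    if any(v not in (0, 1) for v in votes):
        raise ValueError("votes must be 0 or 1")
    return sum(votes) / len(votes)


def rank_scenarios(votes_by_scenario):
    """Return (scenario, index) pairs sorted from highest to lowest anxiety."""
    indexed = {name: anxiety_index(v) for name, v in votes_by_scenario.items()}
    return sorted(indexed.items(), key=lambda item: item[1], reverse=True)


# Invented votes from six hypothetical experts, for illustration only.
example_votes = {
    "displaced_carton": [1, 0, 1, 0, 1, 0],
    "unidentified_person": [1, 1, 1, 1, 1, 0],
    "wrong_item": [1, 1, 0, 0, 0, 0],
}
ranking = rank_scenarios(example_votes)
```

With these invented votes, `ranking` lists `unidentified_person` first, followed by `displaced_carton` and `wrong_item`.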

Experimental setup
A setup is established to implement the case under consideration, which involves a Universal Robots UR5 cobot, a machine vision camera, a Robotiq kit composed of a camera and a gripper, an IR proximity sensor, and a computer with an Intel Core i5-2430M CPU and 6 GB RAM. Python is used to run the object detection/pose detection algorithms and to connect the components/algorithm output with the UR5 software (PolyScope). The control box of the UR5 has both digital and analog inputs/outputs to interface with digital and analog devices. The interface among the elements of the CPPS is shown in Figure 9.
The main scenario of packaging is programmed in PolyScope to pick bottles and cans from particular locations in a specific sequence and then to place them in the crate at their dedicated locations. For this, the waypoints for each of the pickup and drop-off locations are fed into PolyScope. Similarly, the waypoints for drop-off locations pertinent to other scenarios are set under the condition in which the specific scenario is performed. The setup established to implement the case study is shown in Figure 10.
The live streams from the vision cameras connected to the i5 computer are monitored through the object detection and pose detection algorithms. Our object detection algorithm, YOLOv3, is trained to detect not only the objects used in the scenario but also general day-to-day objects. The outputs from the object detection algorithm applied to the distant camera and to the robot's gripper camera are analyzed at every cycle when the cobot is at a location, ready to pick an item. On detection of a particular case stated in the ontology diagrams shown in Figures 5-7, a signal based on a digital code is sent to PolyScope, which takes action accordingly, and a subsequent command is given to the cobot. The detection of the operator through the distant camera and of the items through the camera installed on the UR5 gripper is shown in Figure 11. The bounding boxes around the objects show the accuracy of match with which the objects are detected by the trained YOLOv3.
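The idea of signalling a detected case to the robot controller as a digital code can be sketched as follows. The code values, the three-line width, and the bit order are assumptions chosen for illustration; the article does not list the codes it uses, and the sketch deliberately stops at computing the line states rather than calling any robot interface.

```python
# Hypothetical scenario codes; the actual values are not given in the article.
SCENARIO_CODES = {
    "right_item": 0,
    "wrong_item": 1,
    "no_feed": 2,
    "operator_intervention": 3,
    "displaced_carton": 4,
    "unidentified_person": 5,
    "foreign_object": 6,
}


def to_digital_lines(scenario, width=3):
    """Return the scenario code as booleans, least significant bit first.

    Each boolean stands for the state of one digital input line.
    """
    code = SCENARIO_CODES[scenario]
    if code >= 2 ** width:
        raise ValueError("code does not fit in the available digital lines")
    return tuple(bool((code >> bit) & 1) for bit in range(width))
```

For instance, `to_digital_lines("unidentified_person")` yields `(True, False, True)` under this assumed encoding, since the code 5 is 101 in binary.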
Similarly, particular poses of the operator are detected via the camera installed above the setup, covering the workspace. Specific poses of the operator are detected, for example, the pose of the operator when not interfering in the task (normal position), the pose of the operator while interfering in the task (pose 1), and the subsequent pose of the operator while disposing of the damaged/broken item (pose 2); the algorithm gives input to the UR5 control box accordingly (see Figure 12).
Here we want to show the efficacy of our system by showing results first for individual scenarios other than the main scenario and later for multiple scenarios detected at one time. However, it may be noted that, when the operation is in progress, scenarios other than the main one are usually detected along with it; for example, an unidentified person will mostly be detected together with the right item scenario. Therefore, when multiple scenarios are detected, these are addressed one by one based on the anxiety level; for example, if two scenarios are detected, once the one with higher priority has been addressed, the second one becomes an individual scenario. The individual scenario in which an unidentified person enters the workspace is shown in Figure 13; as per the ontology-based strategy recorded in the algorithm, the robot has stopped working. A caution is also displayed on the computer screen, so that the operator may ask the other person to move away from the hazardous area.
Another scenario, that of a wrong item feed, is shown in Figure 14; on detection of the wrong item, the CPPS commanded the cobot to place the item at the redundant item spot.
As an example of multiple scenarios, two individual scenarios are considered, that is, a displaced carton/crate and an unidentified person's entry. An unidentified person is shown entering the workspace and has mistakenly displaced the crate from its original position (see Figure 15). The proximity sensor reports the displaced crate to the main program and YOLO reports the unidentified person. As the two inputs are detected by the algorithm, it finds the anxiety factor of each situation; for the unidentified person it is 0.83, and for the displaced carton it is 0.5. Based on the larger anxiety factor, the algorithm chooses the action for the unidentified person scenario and the cobot stops working. The algorithm keeps checking the situation at each iteration and, once the unidentified person scenario is no longer detected, the CPPS chooses from the remaining situations, that is, the displaced carton. The signal is then given to PolyScope for the individual scenario of the displaced carton, which commands the cobot to resume operation and move the crate to its original location.
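The prioritization in this example can be sketched as a loop that, at each iteration, acts on the detected scenario with the largest anxiety factor. The two anxiety values are the ones quoted above; the action names and the representation of detections as sets of scenario names are hypothetical stand-ins for the signals exchanged with PolyScope.

```python
# Anxiety factors quoted in the text for the two scenarios of this example.
ANXIETY = {"unidentified_person": 0.83, "displaced_carton": 0.5}

# Hypothetical action names standing in for the signals sent to PolyScope.
ACTIONS = {
    "unidentified_person": "stop_cobot",
    "displaced_carton": "push_carton_to_enclosure",
}


def select_scenario(detected):
    """Pick the detected scenario with the largest anxiety factor, or None."""
    known = [s for s in detected if s in ANXIETY]
    if not known:
        return None
    return max(known, key=ANXIETY.get)


def run_iterations(detections_per_iteration):
    """Return the action chosen at each iteration of the monitoring loop."""
    chosen = []
    for detected in detections_per_iteration:
        scenario = select_scenario(detected)
        chosen.append(ACTIONS[scenario] if scenario else "continue_packaging")
    return chosen


log = run_iterations([
    {"unidentified_person", "displaced_carton"},  # both detected at once
    {"displaced_carton"},                          # the person has left
    set(),                                         # nothing abnormal remains
])
```

Because the selection is re-evaluated at every iteration, the lower-priority scenario is handled automatically once the higher-priority one disappears, which matches the one-by-one handling described in the text.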
We also want to highlight that the complete setup is developed using low-cost sensors/devices to implement the case study, which means that flexibility in the CPPS can be achieved by incorporating low-cost devices for detection and implementation. In future, we aim to expand our system for the case where a large number of multiple scenarios emerge in the CPPS. There is a need to find an optimized solution using machine learning techniques.

Conclusion
An ontology-based approach is presented to solve industrial scenarios for safety applications in CPPSs. A case study of an industrial scenario is presented to validate the approach, in which visual cues are used to detect and react to dynamic changes in real time. The intelligent automated CPPS is designed with its inherent advanced capabilities for a safe work environment. The production system can decide on improvised situations which are real time, continuous, and complex in nature through AI-based methods that match the requirement of human intervention. The system anxiety factor is introduced for possible scenarios, so that the flexible physical assets, which in this case are a cobot and the coordinating human, can anticipate well and react effectively with better awareness of the situation. The developed ontology has supported building the understanding of the industrial context at hand, where the anxiety index can be seen as the perception of hazards posed to both the collaborative parties as well as to other decision-making elements in the CPPS. The technique has an edge in developing safety mechanisms for smart factories and can be further integrated with security assessments and the appropriate mitigation strategies and defensive mechanisms to safeguard costly physical assets and prevent accidents involving human workers.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figure 3. Left: detection of bottles; right: detection of person.

Figure 4. Left: pose of a standing person; right: pose of a person picking a bottle.

Figure 5. Ontology diagram for the case study-main module.

Figure 6. Ontology diagram for the right/wrong item module.

Figure 7. Ontology diagram for the scenario implementation module.

Figure 8. Ishikawa diagram for the anxiety calculation.

Figure 10. Setup and implementation of the approach for the case under consideration.

Figure 11. Left: detection of the operator; right: detection of bottles.

Figure 12. Picture showing the detection of (a) the normal pose, (b) pose 1, and (c) pose 2 of the operator.

Figure 13. Scenario for entry of an unidentified person.

Figure 14. Left: detection of the wrong item; right: disposal of the wrong item.

Table 1. Anxiety level and index for scenarios.