A Real-Time Bilateral Teleoperation Control System over Imperfect Network

Functionality and performance of modern machines are directly affected by the implementation of real-time control systems. Especially in networked teleoperation applications, force feedback control and networked control are two of the most important factors, which determine the performance of the whole system. In force feedback control, generally it is necessary but difficult and expensive to attach sensors (force/torque/pressure sensors) to detect the environment information in order to drive properly the feedback force. In networked control, there always exist inevita‐ ble random time-varying delays and packet dropouts, which may degrade the system performance and, even worse, cause the system instability. Therefore in this chapter, a study on a real-time bilateral teleoperation control system (BTCS) over an imperfect network is discussed. First, current technologies for teleoperation as well as BTCSs are briefly reviewed. Second, an advanced concept for designing a bilateral teleoper‐ ation networked control (BTNCS) system is proposed, and the working principle is clearly explained. Third, an approach to develop a force-sensorless feedback control (FSFC) is proposed to simplify the sensor requirement in designing the BTNCS, while the correct sense of interaction between the slave and the environment can be ensured. Fourth, a robust-adaptive networked control (RANC)-based master controller is introduced to deal with control of the slave over the network contain‐ ing both time delays and information loss. Case studies are carried out to evaluate the applicability of the suggested methodology.


Introduction
Teleoperation, which allows a human operator to interact with the environment remotely, extends humans' sensing, decision making, and operation beyond direct physical contact. Since its introduction in the late 1940s, teleoperation systems have been deployed worldwide in numerous domains ranging from space exploration, underwater operation, and hazardous assignment to micro-assembly, and minimally invasive surgery.
In common, control schemes for teleoperation systems can be classified as either compliance control or bilateral control. In the compliance control [1,2], the contact force sensed by the slave device is not reflected back to the operator, but is used for the compliance control of the slave device. On the contrary, in the bilateral control [3][4][5], the contact force is reflected back to the operator. The operator is able to achieve physical perception of interactions at the remote site similar to as directly working at this site. Consequently, it improves the accuracy and safety in teleoperation. Thus, the bilateral control has drawn a lot of attention. Figure 1 shows a generic configuration of a bilateral teleoperation system which includes five components: operator, master, communication network, slave, and environment. The master is capable of acquiring information about the desired manipulation actions and assigning the tasks to the slave. It normally consists of an input component, such as a joystick, console, or tactile device, and a force-reflecting mechanism (FRM) to exert the reflected forces on the human operator. The slave is a robotic device that takes the place of the human operator to carry out the required tasks at the remote environment. It is usually equipped with sensors to acquire information about the task process to feed back to the master. There are two common control architectures of bilateral teleoperation systems: positionposition and position-force architectures. In the first approach, the master position is passed to the slave device, and the slave position is passed back to the master side. The reflected force applied to the operator is derived from the position difference between the two devices and, therefore, this approach is not desirable in cases of free motion. In contrast, the position-force approach uses directly the force measured at the remote site rather than the position error. In this architecture, the contact force, sensed by a force/torque sensor mounted on the slave device, is scaled by a force-reflecting gain (FRG), and this scaled force is reflected back to the operator via the master device. This method then provides the operator a better perception of tasks execution at the remote site.
In order to derive sufficiently the FRG, two important tasks are required: first, to detect the environment and, second, to determine the contact force at the slave site. Many studies have been carried out to optimize the FRG [4,5]. Although the reported algorithms showed some remarkable results, there remain some drawbacks such as how to determine the FRG appropriately with unknown environments; and especially, it is difficult and expensive to attach proper sensors (force/torque/pressure sensors) to detect the loading conditions. Additionally, the use of these kinds of sensors is cost-ineffective and difficult to be installed in practice. Especially, it is easy to be damaged when the system operates under hazard conditions. Incorporation of teleoperation and force feedback requires bidirectional information exchange between the master and slave via the network. In contrast to the advantages such as cost saving, power consumption, easy implementation, and maintenance, a networked control system (NCS) leads to two major problems, inevitable random time-varying delays and packet dropouts because of restrictions imposed by the transmission data rate and channel bandwidth. Due to sensitivity of bilateral teleoperation systems to time delays and packet dropouts, even a small time delay can destabilize the system [6].
Disregarding the packet loss problem, numerous methods have been proposed [7][8][9] to minimize the bad effect of time delay. Herein, the controllers were designed based on the assumptions that time delay was constant [7], bounded [8], or had a probability distribution function [9]. Moreover, these assumptions just considered that the time delay in the closed control loops was less than one sampling period while in practice, it is random and irregular. To compensate large and uncertain delays, other studies suggested adaptive control schemes using variable sampling periods, which were based on neural networks (NNs) or prediction theories [10][11][12]. Although the performances were improved, there were requirements on acquiring the real delay data for the training processes, which were not appropriately discussed.
To adapt with NCSs compromising not only time delays but also packet dropouts, many important methodologies, such as state feedback control [13], robust control [14][15][16], predictive and model-based predictive control [17,18], were proposed. By using these techniques, the NCS performances were remarkably improved over the traditional methods. However in most studies, time delays and packet dropouts were assumed to be priorly known and bounded to design the controllers. The sampling period was always fixed in these studies and, subsequently, limited the control performances. Furthermore, computation delay normally at the master side is also one important factor affecting directly the system performance. Recently, an advanced robust variable sampling period control approach has been developed for nonlinear systems containing both the kinds of delays and packet dropouts [19][20][21]. The applicability of this method was proved through real-time experiments. Therefore to fully overcome the above-mentioned problems, a simple and cost-effective realtime bilateral teleoperation networked control system (BTNCS) is introduced in this chapter. The advanced concept for designing a bilateral master-slave teleoperation networked control system is proposed, and the working principle is clearly addressed. Next, an approach to develop a force-sensorless feedback control (FSFC) is proposed to simplify the sensor requirement in designing the BTNCS while the correct sense of interaction between the slave and the environment can be ensured. To deal with the networked control of the slave, a robust-adaptive networked control (RANC)-based master controller is suggested to compensate for the problems of time delays and information loss. Case studies are carried out to evaluate the applicability of the suggested methodology. Without loss of generality, the architecture of the BTNCS is suggested in Figure 2. The design for this BTNCS should address the following issues:

Bilateral teleoperation networked control system
• The local master (LM) controller functions as the main control unit of the system to ensure that the slave manipulator could execute robustly and accurately the tasks given by the operator via a human interface (HI), disregarding the impacts of networked communication problems and environments. The local slave (LS) controller is, therefore, an optional design. It can be just a buffer or a zero-order holder (ZOH) to receive the control input derived by the LM controller and distribute sequentially to the manipulator.
• There is no force/pressure/torque sensor to be attached at the manipulator end to minimize the cost and risks. Only an internal sensor, as displacement sensor, is used to monitor the manipulator trajectory and send back to the LM controller to form a closed control loop.
• The interaction between the manipulator and environment is, therefore, estimated by the FSFC module at the master side. Here, the FSFC should consist of two parts: first, an environmental interaction (EI) estimator to detect the environment characteristics as well as to derive properly a reflecting force required to be applied to a physical device in the HI module (as a joystick); second, an FRM controller to drive the FRM to generate the desired reflecting force.
• The FRM is suggested to be constructed using pneumatic rotary actuators. This use brings some advantages over the traditional design with DC electric motors. Compared with an electric motor, a pneumatic actuator provides a higher ratio of force-mass, and can produce larger reflected force without using any reduction mechanism, such as gearbox. In addition, with the pneumatic solution, the FRM is able to work under safe conditions without damage from the operator.
• Due to having different control modules and other functions at the master side, computation delays τ com need to be taken into account when designing the LM controller.
• The communication network has unavoidable delays, τ ca and τ sc , and packet dropouts, represented by "virtual" switches, S ca and S sc , in the forward and backward channels, respectively. S ca (S sc ) is opened (or 1) when a packet loss event exits.
The following are attempted to address the two important issues in designing the BTNCS which are the FSFC and the LM controller.

Design of FSFC
As stated in Section 2, the FSFC compromises the two main modules: EI estimator and FRM controller. The FRM controller can be selected as a typical feedback controller in which the reference input is the desired reflecting force sent from the environment classifier and the feedback signal can be the force/torque/pressure at the physical device of the HI module.
The EI estimator is suggested as in Figure 3. This estimator with two inputs and one output contains four blocks: Learning Vector Quantitative Neural Network (LVQNN) classifier, slave dynamics, environment dynamics, and fuzzy-based FRG tuner. Without loss of generative, the interactive environment can be represented by two factors: damping c e and stiffness k e . Thus, The LVQNN classifier is firstly designed with two inputs and two outputs to classify the environment. The two inputs are the command and response from the slave while the two outputs are predicted values of the environment damping and stiffness, ĉ e and k e , respectively.
Next, these predicted values are fed into both the environment dynamics and fuzzy-based FRG tuner. Synchronously, the slave command is also input to the slave dynamics. The interaction between the slave dynamics and environment dynamics-loading force-is, therefore, estimated and denoted as F e . Meanwhile, the fuzzy-based FRG tuner is designed with two inputs, ĉ e and k e , and one output which is value of the FRG. Finally, the desired reflected force, F dr , is derived from the estimated loading force and the FRG.

Learning vector quantitative Neural Network
NN is one of the powerful artificial intelligent techniques that emulates the activity of biological NNs in the human brain. LVQNN is a hybrid network which uses advantages of competitive learning and bases on Kohonen self-organizing map or Kohonen feature map to form the classification [22]. Figure 4 shows a generic structure of an LVQNN with four layers: one input layer with m nodes, first hidden layer named competitive layer with S 1 nodes, second hidden layer named linear layer with S 2 nodes, and one output layer with n nodes (in this case, S 2 ≡ n). The core of the LVQNN is based on the nearest-neighbor method by calculating the Euclidean distance weight function, D, for each node, n j , in the competitive layer as in the following: where X is the input vector; W 1 (j,i) is the weight of node j th in the competitive layer corresponding to element i th of the input vector.
Next, the Euclidean distances are fed into function C which is a competitive transfer function. This function returns an output vector o 1 , with 1, where each net input vector has its maximum value, and 0 elsewhere. This vector is then input to the linear layer to derive an output vector o 2 , where each element corresponded to each node of the output layer and computed as where W 2 (k,j) is the weight of node k th in the linear layer corresponding to element j th of the competitive output vector; k W (k) is linearized gain of node k th in the linear layer.
In the learning process, the weights of LVQNN are updated by the well-known Kohonen rule, which is shown in the following equation: where μ is the learning ratio with positive and decreasing with respect to the number of

Design of LVQNN classifier
Here, the LVQNN classifier is designed and implemented into the FSFC to distinguish different working environments at the slave side in an online manner. To enhance the given task with the limited number of input information, the input vector of the classifier is constructed as a vector of current and several historical values of four signals: the slave driving command, For applications to classify the environment which is varied with both the damping and stiffness values, the output from the classifier should be the mix of classes with different ratios.
In order to enhance this task, a so-called smooth switching algorithm is proposed here. The environment class is, therefore, determined by the smooth combination of the current class detected by the LVQNN (Y) and the previous class using a forgetting factor, λ Similarly, the estimated values of the damping coefficient and stiffness are produced by Additionally, in order to avoid influences of noises on the classification performance, the forgetting factor is online tuned with respect to the changing speed of the classifier outputs: Step 1: Set the initial value for the forgetting factor, λ=0.5; define a small positive threshold, 0 < γ 1 < γ 2 , for the classifier output changing speed, v Y , which is defined by the number of sampling periods when Y continuously changes.
Step 2: For each step, check v Y and update λ by comparing v Y with γ using the following rule:

Test rig setup
To evaluate the effectiveness of the LVQNN classifier, an experimental system has been designed as in Figure 5. One joystick with one Degree of freedom (DOF) is used to generate driving commands for the slave. An FRM which employs a valve-driven mini pneumatic rotary actuator and two bias springs is attached to the opposite side of the joystick handle. The slave employs an asymmetrically pneumatic cylinder as its manipulator. The displacement of the cylinder rod is sensed by a linear variable displacement transducer (LVDT). The interaction between the master and slave is enhanced by the communication module which can perform either wire-by-wire or wireless protocol. To generate environmental conditions for the slave manipulator, compression springs with different stiffness values are installed in serial with the cylinder rod. An air compressor is used to supply the pressurized air for both the FRM and the slave. A compatible PC and a multifunction data acquisition device are employed to perform the communication between the PC and master. The system specifications are listed in Table 1.

LVQNN classifier training and verification
In order to train the network, the prior task is acquiring the target data. Real-time teleoperation experiments without force feedback concept were performed on the test rig. Both the three springs mentioned in Table 1 were used to generate the environmental conditions. For each condition, a trajectory for the slave manipulator was randomly given by the operator. The slave information, including the valve driving command and cylinder rod displacement, with respect to each spring was acquired with a sampling period of 0.02 s.
Next, each of the acquired slave data sets was used to perform the input vector, while the correspondingly selected spring was used as the target output class (1, 2, or 3 Real-time teleoperation experiments were performed in order to investigate the ability of the LVQNN classifier in practice. Other three experiments with the three loading conditions were carried out with the same cylinder trajectories given from the master using an open-loop control. The classification results were then achieved as displayed in Figure 6-8. The results imply that the proposed classifier could online detect well the loading conditions and, therefore, is capable of producing precisely decisions on the environment characteristics.

Problem and RANC design concept
In this section, the RANC approach for a generic system is introduced to support the design of the LM controller. Consider a discrete-time plant [20,21]: where x k ∈ R n is the state vector, u k ∈ R m is the control input, y k ∈ R p is the controlled output, d k ∈ R d is the environment impact, and A, B, C and E are matrices with appropriate dimensions.
The RANC-based LM controller in Figure 2 is then designed based on the following issues [20,21]: • Both the actuators and controller are event-driven.
• The sensors are time-driven with variable sampling period T. For step (k + 1) th , this sampling period T k+1 is online optimized depending on the total system time delay τ k+1 to make sure: Using the results in references [20,21], the configuration of an NSC using the proposed RANC controller is clarified in Figure 9. Herein, the RANC mainly consists of five modules: Time Delay and Packet Detector (TDPD), Time Delay Predictor (TDP), Variable Sampling Period Adjuster (VSPA), Quantitative Feedback Theory (QFT), and Robust State Feedback Controller (RSFC). The TDPD module is firstly used to detect the network problems at the current state. This information is then sent to the TDP to perform one-step-ahead prediction of system delays which are the inputs of the VSPA to adjust effectively the sampling period. The two modules, QFT and RSFC, are employed to construct the so-called hybrid controller to compensate for the influences of delays and packet dropouts. A smart switch (SSW) is employed to switch the hybrid controller to QFT or RSFC based on the outputs of the TDPD detector and the TDP predictor with rule: • The QFT is selected (SSW = 0) once there is no packet loss and all delay components are less than their pre-defined threshold values: ( * is ceiling function to return the nearest integer) , , , , • Otherwise, the RSFC is chosen (SSW = 1).

Design of VSPA
By using the TDPD and TDP, the sampling period for the coming step (k + 1) is defined as ( ) where τ k +1 com , τ k +1 ca and τ k +1 sa are the delays estimated by the TDP.

Design of TDPD
To measure accurately time delays and packet dropouts, the TDPD employs a micro-control unit (MCU) with proper logic. Here, PIC18F4620 MCU from microchip equipped with a 4 MHz oscillator is suggested to be used [19]. The real-time measurement accuracy of 50μs [19] which is suitable for this application. For each step k th , the TDPD enhances the following tasks: • Derive the real working time using the MCU, t k .
• For a command u k sent from the controller to the plant, a time stamp is encapsulated into this packet to detect the forward delay, τ k ca , and packet dropouts if having, p k ca .
• For the signals sent from the sensors to the controller, they are combined with time stamps to detect the delay, τ k sc , and packet dropouts if having, p k sc .
• The total number of continuous packet dropouts is then determined as: • Depend on the time stamps to detect the computation delay at the controller side, τ k com .

Design of TDP
The TDP based on a so-called adaptive gray model with single-variable first-order, AGM (1,1) to estimate the system delays in the coming step, τ k +1 com , τ k +1 ca , and τ k +1 sa . The AGM (1,1) prediction procedure is as follows: Step ] coefficient matrix of the spline function going through the object data set {(t 1 , y Object (t 1 )), …, (t m , y Object (t m ))} .
Theorem 1: There always exist two non-negative additive factors, c 1 and c 2 , to convert any raw sequence (12) to a gray sequence which satisfies both the gray checking conditions.
Proof: The proof of this theorem can be found in reference [19].
Thus from (12), the gray sequence is derived using Theorem 1: Step 2: Use the 1-AGO to obtain a new series y (1) from y (0) : Step 3: Create the background series z (1) from y (1) by the following general algorithm: where α aver is the average weight and set as 0.5 [23,24]; β is a momentum rate within range [0, 1] and tuned by a Lyaponov-based SISO fuzzy mechanism (LFM) in order to guarantee a robust prediction performance (see the proof of this theory in reference [20]) Step 4: Establish the gray differential equation: Step 5: Setup the AGM (1,1) prediction as follows: Step 6: Perform the predicted value of y at step (n + p) th : where p is the step size of the grey predictor. In this case, p = 1.

Design of QFT controller
Denotes the transfer functions of the plant and the controller in the NCS as G p (s) and G c (s), respectively. The closed-loop transfer function from input R(s) to output Y(s) including delays can be expressed as And the open-loop transfer function is then derived as Next, the procedure to design this robust controller can be expressed as follows [25,26]: Step 1: The QFT controller should be designed on how the tracking signal meets the acceptable variation range with respect to a reference (for each value of ?i of interest) Step 2: Establish bounds for robust stability of the closed-loop system ( ) Step 3: A sensitivity function is defined as Establish bounds for noise and disturbance rejection of the closed-loop system Step 4: Define a working frequency range of the NCS.
Step 5: Bases on the nominal plant to design the controller, G QFT (s), with initial poles and zeros.
Step 6: Use the loop-shaping method to refine the controller designed in Step 5. The principle to guarantee a robust control performance is the loop gain value is always on or above the boundaries defined by constraints (28)-(31) and, is to the right or on the robustness forbidden region for any critical frequency within the range defined in Step 4.
Step 7: The controller resulted from Step 6 could ensure that the variation in magnitude of T NCS (s) (26) is satisfied at the desired constraints. However, it does not guarantee that the magnitude of T NCS (s) actually lies between the given bounds T l NCS ( jω i ) and T u NCS ( jω i )for the whole frequency range. Therefore, a pre-filter F QFT (s) is necessary to be designed and placed in front of the controller to compensate for this problem.

Design of RSFC
The RSFC is designed to deal with the NCS in cases that the delays are greater than the threshold values defined in Eq. (9), and/or there exists packet dropouts. To simplify the analysis, it can be assumed that r k = 0 (Figure 9) during the system modeling and RSFC design.
The system discrete form for step (k + 1) th can be expressed through the following three cases: Case 1: There is no packet loss in network transmission during both current period and previous period: Case 2: There exists i packet dropouts up to step k th : Case 3: There were p d continuous packet dropouts just passed up to step (k − 1) th : Since, the general discrete form of the system can be established as where δ l is an activation function: 1, If subsystem is activated, {0,1, 2} 0, Otherwise.
In case of no delays, the control rule is designed as Then, Eq. (35) can be re-written in the following form: Theorem 2: For given constants ξ i j > 0, if there exist positive definite matrices P i i ∈ 0, 1, …, p k +1 , j = {0, 1, 2} such that following inequalities: hold, then the NCS (39) is exponentially stable with a positive decay rate by using the RSFC control gain derived by Proof: The proof of this theorem can be found in reference [20].

Networked control system setup
To validate the RANC applicability, a simple networked control system was set up based on the configuration in Figure 9. Here, the main controller was built in an experimental PC equipped with the MCU (presented in Section 4.2. 2) The network module includes one coordinator and one router employed the ZigBee protocol [19]. The multifunction card was used to perform the communications with the TDPD module and the network. The control objective is speed tracking control of a servomotor system via the setup network. Here, the servomotor system was RE-max series manufactured by Maxon Corporation. The transfer function from the input to output of this system can be expressed as [20] ( ) To evaluate the capability of the proposed RANC, a comparative study of the RANC with three existing approaches, a QFT-based robust controller (QFTRC), a static state feedback controller (SSFC), and a hybrid QFT-SSFC as shown in Table 3, has been investigated. To make control challenges, disturbance loads and computation delays were randomly generated. Here, the loads were created by putting the system into a varied magnetic field which affected on the motor shaft. Meanwhile, an arbitrary matrix-based complex calculating procedure representing a complex application was added to the controller at the PC side. Based on Section 4.2.5 and by setting the initial bounds for the total delay and packet dropouts as: τ 0 = 0.1s > τ QFT , p 0 = 2; the initial RSFC gain was computed as K RSFC 0 = 0.3796 0.2642 .
The initial sampling period of the RANC was selected as 10 ms, while that of the other controllers must be fixed to 0.15 s to cover all the delays.

Real-time experiments
In this section, real-time speed tracking of a servomotor system over the setup network has been investigated in which the reference speed was selected as 10 rad/s. The experiments using the comparative controllers were carried out and, consequently, the results were displaced in Figure 10. An analysis on the control performances using four evaluation criteria, PO (percent overshoot), ST (settling time), SSE (steady state error) and MSE (mean square error), is then performed as demonstrated in Table 4.

Controller
Step  Table 4. Comparison of NCS performances using different controllers.

Figure 10.
Step speed performances using different controllers.
From Figure 10 and Table 4, it is clear that the worst-case performance was with the QFTRC. It is because the QFTRC was designed for an NCS in which the delays are bounded by constraint (9) and there is no packet dropout. Hence when facing with the networked servomotor including both the large delays and packet dropouts, the controller could not guarantee the robust tracking (SSE was 13.9%). In case of using the SSFC for which the control gain was designed for the worst network conditions (large delays and highest number of continuous packet dropouts), SSE was remarkably reduced to 6.18%. However due to this fixed control gain use, the system response was much slower than that of the QFTRC. Additionally, due to lack of disturbance rejection capability, the SSFC could not ensure a smooth tracking (PO = 8.67% and ST = 2.1s). The QFT--SSFC, by taking advantages of both the QFT and SSFC, could improve the tracking result. Nonetheless, the QFT-SSFC with fixed control gains and fixed sampling period restricted the system adaptability to the sharp variations of network problems and disturbances (SSE was slightly reduced to 5.88%).
The best tracking result was achieved by using the RANC (see Figure 10 and Table 4). It comes as no surprise because the RANC possesses all the advantages of the QFT and state feedback designs and, the high adaptability to the system changes by using the adaptive control gains and adaptive sampling period. Figure 11 demonstrates the delay and packet dropout detection and prediction results of the TDPD and TDP modules, respectively. During the operation, these were supported by the VSPA and hybrid QFT-RSFC modules to regulate properly the sampling period and control gains. Step speed case: performances of TDPD and TDP.

Conclusions
In summary, an advanced bilateral teleoperation NCS has been introduced. In addition, the two important issues in designing the BTNCS system, FSFC and RANC, have been clearly addressed. The LVQNN-based FSFC shows the safe and cost-effective solution in controlling the FRM while the RANC-based LM controller is a feasible choice to drive the slave manipulator over the network with delay and packet loss problems.
The effectiveness of the LVQNN classifier as well as the RANC controller has been confirmed through the two case studies and analysis. One of our future research topics would be the full design of the proposed BTNCS to investigate the applicability of this method in a practical application such as remote construction equipment.