Mixed Logic Dynamical Modeling and On Line Optimal Control of Biped Robot

Human beings adapt their walking automatically to changes in the environment. By comparison, the autonomy of biped robots is limited and remains an open problem. One basic reason is the lack of a unified modeling framework for the various types of biped motion in a varying environment, one that allows the control input to be derived on line. In this paper, we propose to recast the types of biped motion in the framework of hybrid systems and to express them as a mixed logic dynamical model. This point of view has the advantage that it encompasses all the motions (such as single support motion and the impact phenomenon) in a single model, and thus allows the control system to be designed optimally even when an impact occurs at an interior point of the optimization horizon.


I. INTRODUCTION
Walking control of biped robots is classically performed in two steps: (i) the synthesis of walking patterns and (ii) the control of the robot to track the prescribed trajectory [1], [2], [3]. The problem with these approaches is that the movements cannot adapt to the environment and unexpected events cannot be pre-computed. In this sense, despite the large body of work on the topic, biped control is still not efficient, especially in terms of stability, adaptability and robustness. In contrast, a human being decides his path on line, adapting to the environment rather than predicting it a long time ahead. One idea inspired by human behavior is to introduce path changes on line while guaranteeing stability and continued motion in the environment. Achieving autonomy for biped robots is a difficult and challenging problem.
The control problem of on-line walking adaptation has been studied by several researchers. In [4], the zero moment point (ZMP) was controlled to preserve dynamic equilibrium. However, this adaptation can involve only small changes to the original trajectory. In [5], a trajectory better adapted to the encountered situation was chosen on line from a set of stored trajectories. However, switching from one trajectory to another may lead to unexpected effects in control. To cope with this problem, in [6], a continuous set of parameterized trajectories was used as the set of candidates. The switches were suppressed, but the set of trajectories has to be defined beforehand.
Unlike the approaches above, a notable alternative is to apply (nonlinear) model predictive control (MPC) to walking adaptation. In [8], [9], by optimizing the joint motion on line over a receding control horizon, biped robots were controlled without a pre-computed reference trajectory and without switches at the algorithm level. The walking adapts automatically to environment changes. However, in [9], as an impact point approaches, the length of the optimization horizon is shortened step by step within the single support motion, so the impact point is never taken into account as an interior point of the optimization horizon. Hence the effect of the impact cannot be handled actively. In [8], it is not stated clearly how the occurrence of an impact is predicted and how its effect is then compensated by nonlinear MPC (NMPC). We believe that the main difficulty here is the lack of a unified expression for the hybrid motion of the biped.
A model of a biped walker should encapsulate phases of continuous motion, switching between types of motion, and the occurrence of impacts. The overall system is naturally a hybrid one. In this paper, we provide a unified modeling framework for biped motion from the hybrid system point of view, and model the biped movement as a mixed logic dynamical (MLD) system with emphasis on the impact and its effect on the walking dynamics. Based on the MLD model, we apply MPC to the on-line walking control of the biped robot. The MPC of an MLD system can be solved using powerful mixed integer quadratic programming (MIQP) algorithms. Its solution corresponds to the objective-oriented optimization of the gaits. Simulation results show that, compared with MPC based on a non-MLD model, the proposed MLD approach results in faster walking with smaller torque, which validates the effectiveness of our proposal.

II. MODELING OF PLANAR BIPED ROBOTS
The robot consists of rigid links connected by revolute joints, as shown in Fig. 1. The model considered is a planar open kinematic chain connected at the hip joint, comprising two identical legs. The dynamics of the feet are neglected. Torques are applied at the hip joint and between the stance leg and the support foot. Motions are assumed to take place in the sagittal plane and to consist of successive phases of single support and collision events, without slip.
The two phases of the walking cycle naturally lead to a mathematical model of the biped walker consisting of two parts: the differential equations describing the dynamics during the single support phase, and a velocity discontinuity caused by the impact phase. The model equations and their basic features are described in the following.

A. Dynamic equation of swing phase
In the single support phase, one leg is pivoted to the ground while the other swings in the forward direction. The dynamic equation of the biped robot during the swing phase is given by (1), where $\theta = [\theta_1\ \theta_2]^T$ is the vector of joint angles. The details of the coefficient matrices are given in the appendix.
In general, the control variable is bounded because of the power limit of the actuators, $m_\tau \le \tau \le M_\tau$ (2), where $M_\tau$ and $m_\tau$ are the upper and lower bounds of $\tau$, respectively.

B. Impact model
For simplicity, the collision between the swing leg and the ground is assumed to be inelastic. The contact event produces two simultaneous events: 1) an impact, which causes a discontinuity in the velocity component of the state while the configuration remains continuous; 2) a switching, due to the transfer of the pivot to the point of contact.
Following the law of conservation of angular momentum, we obtain the transition equation (3) for the angular velocity, where $\dot\theta^-$ and $\dot\theta^+$ denote the pre-impact and post-impact angular velocities, respectively, $\alpha$ is the half inter-leg angle at the transition instant, and $\lambda(\alpha)$ (see appendix) is the velocity reduction ratio at the transition, which depends on the position of the impact point.
The switching of the pivot to the point of contact is described by (4). The transition from one leg to the other is assumed to take place in an infinitesimal length of time. Hence there is no double support phase.
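The two simultaneous events of the impact phase can be illustrated with a minimal sketch. The function name `apply_impact`, the scalar ratio `lam` standing in for $\lambda(\alpha)$, and the plain swap of the two leg labels are assumptions made for illustration only; the actual $\lambda(\alpha)$ is defined in the appendix and the actual switching rule in (4).

```python
import numpy as np

def apply_impact(theta, dtheta, lam):
    """Hypothetical impact map: velocity reset followed by a leg-role swap.

    theta, dtheta : length-2 sequences [stance, swing] just before impact.
    lam           : assumed scalar stand-in for the reduction ratio lambda(alpha).
    """
    theta = np.asarray(theta, dtype=float)
    dtheta = np.asarray(dtheta, dtype=float)
    # The configuration is continuous across the impact; only the labels
    # of stance and swing leg are exchanged (assumed form of the switching).
    theta_plus = theta[::-1].copy()
    # The velocity jumps: it is scaled by lam and relabeled in the same way.
    dtheta_plus = lam * dtheta[::-1]
    return theta_plus, dtheta_plus

# Example call with illustrative numbers (not taken from the paper).
theta_plus, dtheta_plus = apply_impact([0.3, -0.3], [1.0, 2.0], lam=0.5)
```

Because the reset is instantaneous, it fits the assumption above that no double support phase exists.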

C. A mixed logic dynamical model
The model of a biped walker is hybrid in nature, consisting of continuous dynamics (single support phase) separated by abrupt changes of velocity (impact phase). How the hybrid system is described is very important, since it directly affects the synthesis of the biped motion and the control performance.
For a general hybrid dynamical system, one may take several points of view: continuous-time, discrete-event, or mixed. Which of these applies best depends largely on the task. For the biped walking problem, we emphasize the walking motion transitions, the adaptability to the environment, and the robustness to unexpected disturbances. To this end, we consider the continuous and discrete dynamics simultaneously, including motion mode switching, and express all the motions in a unified framework, which allows the walking system to be synthesized in a systematic way.
Define the state variables as in (5). Then, (1) becomes the state equation (6).
On the other hand, by (3), (4), the pre-impact and post-impact states satisfy the relation (7). Assume the collision event occurs at time t. By (6), (7), we obtain the post-impact state evolution (8).
To describe both (6) and (8) uniformly, we introduce an auxiliary logic variable $\delta$ associated with the collision event, defined in (9). Then, for both the single support mode and the impact mode, the state of the biped evolves according to the unified dynamical equation (10), where z is the auxiliary variable defined in (11). Definition (11) can be equivalently expressed by the inequalities (12) [7]. On the other hand, the occurrence of a collision can be measured experimentally or determined mathematically from the toe position of the swing leg. For walking on a smooth plane, a collision occurs if $y_{toe} = \cos\theta_1 - \cos\theta_2 = 0$, i.e., $\theta_1 = -\theta_2$. In addition, to distinguish the erect posture of the biped from a collision, another condition is necessary, $\theta_1 > 0$. Therefore, the collision condition on a smooth plane can be set as (13), where $\epsilon_1 > 0$ and $\epsilon_2$ are arbitrarily small positive constants.
Note that (27) is exact for an impact occurring exactly at the sampling time k. If the impact occurs between k and k + 1, the state at the impact point is approximated by the state at k. In this case, a short sampling period gives a good approximation.
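The collision test on a smooth plane, evaluated at a sampling instant, can be sketched as follows. The function name `collision_flag`, the tolerance form (an absolute-value band around $\theta_1 = -\theta_2$ and a lower threshold on $\theta_1$), and the default tolerance values are assumptions; they only mirror the two conditions stated in the text with two small positive constants.

```python
def collision_flag(theta1, theta2, eps1=1e-3, eps2=1e-3):
    """Return 1 if a collision is detected at this sample, else 0.

    Assumed tolerance form of the collision condition on a smooth plane:
      |theta1 + theta2| <= eps2   (swing toe at ground level, theta1 = -theta2)
      theta1 >= eps1              (excludes the erect posture theta1 = theta2 = 0)
    """
    toe_on_ground = abs(theta1 + theta2) <= eps2
    legs_apart = theta1 >= eps1
    return int(toe_on_ground and legs_apart)

# Example calls with illustrative angles in radians.
flags = [collision_flag(0.3, -0.3), collision_flag(0.0, 0.0), collision_flag(0.3, 0.1)]
```

With a sampled model the exact crossing is rarely hit, which is why a tolerance band rather than an equality is checked here.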
The MLD model describes both the continuous dynamics and the impact event within one framework. This method is general and suitable for modeling the motion of either a fully actuated or an underactuated biped walker.
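The step from a logic-dependent auxiliary variable to linear inequalities can be checked numerically. The sketch below assumes the common product form $z = \delta f$ with a bounded scalar $f \in [m, M]$, which is the standard construction in the MLD literature such as [7]; whether (11) has exactly this form is an assumption, and the names `bigm_feasible`, `m` and `M` are illustrative.

```python
def bigm_feasible(z, delta, f, m, M, tol=1e-9):
    """Check the four linear inequalities that encode z = delta * f.

    delta is 0 or 1, and f is assumed to satisfy m <= f <= M.
    """
    return (
        z <= M * delta + tol
        and z >= m * delta - tol
        and z <= f - m * (1 - delta) + tol
        and z >= f - M * (1 - delta) - tol
    )

# With delta = 1 the inequalities pin z to f; with delta = 0 they pin z to 0.
m, M, f = -2.0, 3.0, 1.5
checks = {
    "delta=1, z=f": bigm_feasible(f, 1, f, m, M),
    "delta=1, z=0": bigm_feasible(0.0, 1, f, m, M),
    "delta=0, z=0": bigm_feasible(0.0, 0, f, m, M),
    "delta=0, z=f": bigm_feasible(f, 0, f, m, M),
}
```

The point of this encoding is that the product of a binary and a continuous quantity becomes a set of linear constraints, which is what makes the MIQP formulation possible.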

III. PROGRESSING CONSTRAINT
Successful walking should be stable and should progress forward step after step. To this end, conditions for stable and successive walking have to be taken into account as constraints in the optimal gait generation problem.

A. Erected body
To support the body against gravity, the hip should stay above a positive height to avoid falling. This can be ensured by (28), which results in a hip position satisfying $\cos\theta^* \le y_{hip}$.
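A small numerical check can make the link between an angle bound and the hip height concrete. It assumes a unit leg length, so that $y_{hip} = \cos\theta_1$ (consistent with the toe height expression used earlier), and assumes that (28) bounds the stance angle as $|\theta_1| \le \theta^*$; both the form of the bound and the function names are hypothetical.

```python
import math

def hip_height(theta1, leg_length=1.0):
    """Hip height above the pivot for a straight stance leg (assumed geometry)."""
    return leg_length * math.cos(theta1)

def is_erect(theta1, theta_star):
    """Assumed form of the erect-body constraint: |theta1| <= theta_star."""
    return abs(theta1) <= theta_star

theta_star = 0.5  # illustrative bound in radians
# Every admissible stance angle keeps the hip at or above cos(theta_star).
heights = [hip_height(t) for t in (-0.5, -0.2, 0.0, 0.3, 0.5) if is_erect(t, theta_star)]
lowest_allowed = math.cos(theta_star)
```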

B. Progression
To propel the body in the intended direction with a steady progression rhythm, the horizontal velocity of the hip should be kept positive, so that the horizontal position of the hip increases strictly monotonically.
For that, we set the constraint (29), where $v > 0$. Then the horizontal velocity of the hip remains positive.

C. Environment
For walking on non-smooth ground, the walking surface can be expressed mathematically as $y = \psi(x)$. (30) When (30) is used, the impact occurrence condition (13) has to be modified accordingly. Furthermore, the constraint $y_{toe} \ge \psi(x)$ has to be added for generating the biped motion.
Constraints (28) to (30) can be included in (26). The resulting MLD model becomes a unified description of both the physical phenomena and the control constraints.

IV. OPTIMAL CONTROL
The problem is to generate the biped motion on line without a pre-defined trajectory, by simultaneously synthesizing and controlling the walking gait. The optimal gait for the current environment consists of the continuous trajectory within each single support motion and the transition time of the stance leg.
Based on the MLD model, we can handle the control objective using a systematic method such as model predictive control (MPC). The MPC of an MLD system can be transformed into a mixed integer quadratic programming (MIQP) problem [7]. Its solution corresponds to the continuous control input and the motion mode.
The criterion to be minimized is given by (31), which is subject to the MLD model (27). N is the optimization horizon. $Q_1, \ldots, Q_4$ are weighting matrices. $x(k+i+1|k)$ is the future state predicted from $x(k)$ by the MLD model. The subscript r denotes the desired value. $x_r$ is set as a desired pre-impact state, which can be time-varying or time-invariant. For a given $x_r$, $z_r$ and $\delta_r$ are uniquely determined. The second term on the right side of (31) penalizes the control input, so that the input effort is minimized.
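To make the role of the weighting matrices concrete, the sketch below evaluates a quadratic horizon cost for given predicted sequences. The assignment of $Q_1$ to the state error, $Q_2$ to the input, $Q_3$ to the auxiliary variable and $Q_4$ to the logic variable is an assumption about the form of (31), chosen so that the second term is the input penalty; the function names and the numbers are illustrative.

```python
import numpy as np

def quad(v, Q):
    """Weighted squared norm v^T Q v."""
    v = np.asarray(v, dtype=float)
    return float(v @ np.asarray(Q, dtype=float) @ v)

def horizon_cost(xs, taus, zs, deltas, x_r, z_r, delta_r, Q1, Q2, Q3, Q4):
    """Sum of weighted squared deviations over the horizon (assumed form).

    xs[i] is the predicted state x(k+i+1|k); taus, zs, deltas are the
    corresponding input, auxiliary and logic sequences.
    """
    cost = 0.0
    for x, tau, z, d in zip(xs, taus, zs, deltas):
        cost += quad(np.asarray(x, dtype=float) - np.asarray(x_r, dtype=float), Q1)
        cost += quad(tau, Q2)  # input penalty (the "second term")
        cost += quad(np.asarray(z, dtype=float) - np.asarray(z_r, dtype=float), Q3)
        cost += quad(np.asarray(d, dtype=float) - np.asarray(delta_r, dtype=float), Q4)
    return cost

# Illustrative two-step horizon with small dimensions.
J = horizon_cost(
    xs=[[1.0, 0.0], [0.0, 0.0]], taus=[[1.0, 1.0], [0.0, 0.0]],
    zs=[[0.0], [0.0]], deltas=[[1.0], [0.0]],
    x_r=[0.0, 0.0], z_r=[0.0], delta_r=[1.0],
    Q1=np.eye(2), Q2=0.5 * np.eye(2), Q3=np.eye(1), Q4=2.0 * np.eye(1),
)
```

Raising the input weight relative to the state weight trades tracking accuracy for smaller torques.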
The MPC needs to calculate the future control inputs $\tau(k+i)$ that minimize the criterion (31), together with the resulting future variables $\delta(k+i)$, $z(k+i)$. Note that (27) is a nonlinear MLD model in the sense that it has nonlinear and time-varying coefficients such as $B_2(k)$. The optimization of a nonlinear MLD system is difficult to solve because of the system nonlinearity and the curse of dimensionality.
In this study, we linearize the nonlinear MLD system to avoid this computational difficulty. The linearization is carried out by freezing the coefficient matrices at their values for the current state $x(k)$ over the next N steps. For the approximately linearized MLD system, we can solve the optimization problem with an MIQP solver to obtain the control inputs for the next N steps, $\tau(k), \ldots, \tau(k+N-1)$. Then, we apply only the first control input $\tau(k)$ to the actual nonlinear mechanical system for the next sampling step, which results in $x(k+1)$. At this updated working point, the above linearization and optimization are repeated. This process continues.
The control procedure is summarized as follows.
1. Set the sampling period $T_s$ and the optimization horizon N.
2. Substitute the current state $x(k)$ into the nonlinear MLD model (27), and freeze the coefficient matrices at their current values for the next N steps.
3. Based on the linear MLD model, solve the optimization problem with an MIQP solver.
4. Select only the control input for the next step and apply it to the nonlinear biped robot. Then obtain the updated state $x(k+1)$.
5. Shift the time from k to k + 1. Then repeat steps 2 to 5.
Note that the proposed linearization and the MPC are based on the updated state. This feedback makes the control system robust, so that the nonlinear model error and external disturbances can be handled effectively. On the other hand, a larger N results in better control performance, but makes the optimization problem more complex and increases the computational burden exponentially. The computational burden obstructs real-time implementation. The use of a lookup table and the mp-MIQP technique [11], which moves the complex computation off line, will drastically reduce the on-line processing time.
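The receding-horizon procedure can be sketched on a toy problem. Everything specific here is an assumption made for illustration: the scalar plant $x^+ = x + T_s(-x^3 + u)$ has no impact mode, the numerical values are arbitrary, and an exhaustive search over a coarse input grid stands in for the MIQP solver. Only the loop structure (freeze, optimize, apply the first input, shift) follows the procedure above.

```python
import itertools

TS = 0.1                    # sampling period (assumed value)
N = 3                       # optimization horizon (assumed value)
U_GRID = (-1.0, 0.0, 1.0)   # coarse input grid replacing the MIQP search
RHO = 0.01                  # input weight (assumed value)

def plant(x, u):
    """Hypothetical nonlinear plant used as the 'real' system."""
    return x + TS * (-x**3 + u)

def frozen_model(x_now):
    """Freeze the state-dependent coefficients at the current state."""
    a = 1.0 - TS * x_now**2  # plant rewritten as x+ = a(x) * x + b * u
    b = TS
    return a, b

def solve_horizon(x_now, x_ref):
    """Minimize the quadratic cost over the frozen linear model."""
    a, b = frozen_model(x_now)
    best_cost, best_seq = float("inf"), None
    for seq in itertools.product(U_GRID, repeat=N):
        x, cost = x_now, 0.0
        for u in seq:
            x = a * x + b * u
            cost += (x - x_ref) ** 2 + RHO * u**2
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq

def run_mpc(x0, x_ref, steps):
    """Apply only the first input, measure the new state, shift and repeat."""
    traj = [x0]
    for _ in range(steps):
        u_seq = solve_horizon(traj[-1], x_ref)
        traj.append(plant(traj[-1], u_seq[0]))
    return traj

traj = run_mpc(x0=1.0, x_ref=0.0, steps=30)
```

The search space of this sketch grows as the grid size to the power N, which mirrors the exponential growth of the computational burden with the horizon noted above.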

V. APPLICATION TO A BIPED WALKER SYSTEM
The proposed MLD modeling and MPC approach are applied to a 2-D.O.F. planar biped robot system as shown in Fig. 1. The physical parameters of the biped system are as follows: the mass of the hip is 10 kg, and the masses of links 1 and 2 are 5 kg. The lengths of link 1 and link 2 are 1 m. The distance between the hip and the center of mass of link i is 0.5 m, for i = 1, 2. For simplicity, the walking ground is assumed to be a smooth plane.
All simulations are executed on a computer with a Pentium 3.20 GHz CPU and 3 GB of memory. The sampling period is 10 ms. The calculation for the synthesis of the walking motion is carried out using Matlab and the free MIQP solver [10]. Note that these simulations are preliminary studies for validating the effectiveness of the MLD modeling and the feasibility of the MPC; the computation time is not optimized at all and the MIQP solver used is not fast.
Simulation 1: The MLD modeling and MPC procedure are implemented. The results are shown in Fig. 2 and Fig. 3.
Fig. 2(a) shows the trajectory of joint angle $\theta_1(t)$ as a solid line and that of $\theta_2(t)$ as a dotted line. Fig. 2(b) shows the angular velocity $\dot\theta_1(t)$ as a solid line and $\dot\theta_2(t)$ as a dotted line. Fig. 3 compares the applied torques: the solid lines in Fig. 3(a),(b) denote the applied torques $\tau_1$, $\tau_2$ of simulation 1, and the dotted lines denote those of simulation 2.
Simulation 2: To verify the effectiveness of the MLD model by comparison, we also carried out a simulation using the single support model (1) instead of the MLD model (10) for prediction. This means that the mode transition is ignored within each N-step optimization horizon.
The simulation results are shown in Fig. 4 and Fig. 3. Fig. 4(a) shows the trajectory of joint angle $\theta_1(t)$ as a solid line and that of $\theta_2(t)$ as a dotted line. Fig. 4(b) shows the angular velocity $\dot\theta_1(t)$ as a solid line and $\dot\theta_2(t)$ as a dotted line. In Fig. 3, the dotted lines show the applied torques $\tau_1$, $\tau_2$ of simulation 2.
Comparing the results of simulations 1 and 2 in Fig. 3, we see that the applied torques of the two methods are the same before the first impact (t ≤ 0.5 s). The results after the first impact differ. The torque applied by the MLD method (solid lines) is smaller than that of the non-MLD model (dotted lines), with a smaller jump at the impact point. In both methods, the applied torque $\tau_2$ reached its lower bound. From Fig. 2, Fig. 4, and Fig. 3, we see that the walking period of the MLD approach is shorter than that of the non-MLD approach. The advantages of the MLD method come from the fact that it predicts the mode transition within each N-step optimization horizon and then produces a minimal control input, which results in good control performance.
Simulation 3: A simulation is also carried out to check the robustness to an external disturbance added to the applied torque. For the MLD model based MPC, a pulse-type disturbance vector $w = [4\,\mathrm{Nm}\ \ 10\,\mathrm{Nm}]^T$ is added to the torque vector $\tau$ at 1 s. The simulation results are shown in Fig. 5.
It is seen that both torques converged to their stable trajectories after the disturbance disappeared.
From these simulation results, we see that the MLD approach results in fast walking with a relatively small torque jump at the impact point, and is robust to external disturbances. These results show the importance and necessity of the MLD model (the unified model for the dynamical equation and the impact effect), and validate the effectiveness of including the impact point in the prediction within the optimization. However, the computation time of the MIQP problem depends exponentially on the dimension of the logical variable $\delta$. For the proposed MLD model with a 5-dimensional vector $\delta$, the computation time of simulations 1 and 3 is about 88 s, which is longer than that of general non-MLD model based MPC. This raises another important research topic: how to decrease the computation time for real-time implementation.
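The exponential dependence can be illustrated by counting the binary assignments a solver faces in the worst case. The count $2^{n_\delta N}$ is an upper bound on the combinations over an N-step horizon; practical branch-and-bound solvers usually explore far fewer. The function name and the horizon values are illustrative, since the horizon used in the simulations is not restated here.

```python
def worst_case_binary_combinations(n_delta, horizon):
    """Number of 0/1 assignments of an n_delta-dimensional logic vector
    over a horizon of the given length (worst-case enumeration bound)."""
    return 2 ** (n_delta * horizon)

# A 5-dimensional logic vector, for a few illustrative horizon lengths.
counts = {n: worst_case_binary_combinations(5, n) for n in (1, 2, 3)}
```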

VI. CONCLUSIONS
In this paper, we proposed an MLD modeling and MPC approach for the on-line optimization of biped motion. This modeling approach has the advantage that it describes both the continuous dynamics and the impact event within one framework, and provides a unified approach for mathematical, numerical and control investigations. The MLD model allows predictive control, and subsequently stability, to be addressed from a numerical analysis viewpoint using powerful MIQP solvers. Hence the biped robot can be controlled on line without a pre-defined trajectory. The optimal solution corresponds to the optimal gait for the current environment and control objective. The usefulness and necessity of the MLD model are shown by comparative simulations. In parallel with this modeling work, we are developing algorithms to effectively decrease the computation time, with which we expect to implement this approach in real time.
Finally, a human uses a predictive function based on an internal model together with a feedback function for motion, which is considered a motor control model of the cerebellum [12]. Motivated by this, a general theoretical study of motion control of hybrid systems based on the MLD model is reported in [13]. We are further developing this theory, which will be useful for realizing complex motions of bio-mimetic robots.

Fig. 5. Obtained profiles by Simulation 3: the applied torques against the external disturbances.