Hierarchical Quadratic Programming for Robot Control

Davide De Benedittis

Università di Pisa

July 31, 2025

Mathematical Framework: Hierarchical Quadratic Programming

Model-Based Control

Nonlinear Model Predictive Control

\[ \begin{align} & \min_x \; && f (x) \\ & \text{s.t.} \; && x \in X \end{align} \]

Convex Model Predictive Control

\[ \begin{align} & \min_x \; && f (x) \\ & \text{s.t.} \; && x \in X \end{align} \]

\(f\) is convex

\(X\) is a convex set.

Quadratic Programming (QP)

Subset of Convex Programming.

Both efficient to solve and descriptive.

\[ \begin{align} & \min_x \; && x^T Q x + p^T x \\ & \text{s.t.} \; && A x = b \\ & && C x \leq d \end{align} \]

Problem 🤕

Problem: solve a set of tasks that cannot be satisfied concurrently.

E.g., maintain an end-effector pose (task), avoid obstacles (safety), and minimize energy consumption (efficiency).

Problem Formulation

Problem: solve a set of tasks that cannot be satisfied concurrently.

Solution: a tradeoff needs to be made among the tasks.

Two approaches can be adopted:

Weighted Approach

  • A weight is assigned to each task
  • All tasks solved together
  • Tasks compromise each other

Hierarchical Approach

  • A priority is assigned to each task
  • Solved in a hierarchical way
  • Low-priority tasks do not compromise higher-priority ones

Solution Approaches

Techniques to deal with redundancy:

  • Standard approach
  • Singularity-robust approach
  • Reverse priority approach
  • Hierarchical Quadratic Programming (HQP)

Tasks Definition

Consider a set of tasks \(T_p\) with \(p = 1, \dots, n_p\) in the general form \[ T_p = \left\lbrace \begin{matrix} A_p x - b_p = v_p \\ C_p x - d_p \leq w_p \end{matrix} \right. \]

where \(v_p\) and \(w_p\) are slack variables to be minimized.

(If the task is feasible, it will be \(v_p = 0\) and \(w_p = 0\).)

Tasks Reformulation

A single task \(T\) is formulated as the following optimization problem \[ \begin{aligned} \min_{x, w} \quad & \|{Ax - b}\|_2^2 + \| w \|_2^2 \\ \textrm{s.t.} \quad & C x - d \leq w. \end{aligned} \]

Hierarchical Quadratic Programming (HQP)

The solution is computed recursively, starting from the highest priority task \(T_1\).

  • The global solution is denoted as \(\overline{x}^*_{p}\) and accounts for all higher priority tasks.
  • The local solution is denoted as \(x^*_{p}\) and accounts only for the current task \(T_p\) and higher priority inequality constraints.

Hierarchical Quadratic Programming (HQP)

The solution is computed recursively, starting from the highest priority task \(T_1\).

The solution up to task \(T_{p+1}\) is computed as \[ \overline{x}_{p+1}^* = \overline{x}_{p}^* + \overline{N}_p x_{p+1}^* \]

The null space projector \(\overline{N}_p\) is computed iteratively with \[ \overline{N}_p = \operatorname{Null} \left(\overline{A}_p \right) = \overline{N}_{p-1} \operatorname{Null} \left(A_p \overline{N}_{p-1} \right), \]

where \(\overline{A}_p = \begin{bmatrix} A_1^T & \cdots & A_p^T \end{bmatrix}^T\).

A null space projector can be computed using the SVD decomposition.

The algorithm is initialized with \(\overline{N}_0 = I\).

Hierarchical Quadratic Programming (HQP)

\(\xi_{p+1}^* = \begin{bmatrix} x_{p+1}^* \\ w_{p+1}^* \end{bmatrix}\) is the solution of a QP problem whose non-null matrices are

\[ \begin{alignedat}{3} &Q_{p+1} &&= \begin{bmatrix} \overline{N}_p^T A_{p+1}^T A_{p+1} \overline{N}_p & 0 \\ 0 & I_{n_{w_{p+1}}} \end{bmatrix}, \quad &&p_{p+1} = \begin{bmatrix} \overline{N}_p^T A_{p+1}^T (A_{p+1}^T \hat{x}_p^* - b_{p+1}) \\ 0 \end{bmatrix}, \\ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% &\hat{C}_{p+1} &&= \begin{bmatrix} C_{p+1} \overline{N}_p & - I_{n_{w_{p+1}}} \\ \overline{C}_p \overline{N}_p & 0 \\ 0 & - I_{n_{w_{p+1}}} \end{bmatrix}, \quad &&\hat{d}_{p+1} = \begin{bmatrix} d_{p+1} - C_{p+1} \hat{x}_p^* \\ \overline{d}_{p} - \overline{C}_{p} \hat{x}_p^* + \overline{w}_p^* \\ 0 \end{bmatrix}, \\ %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% & \overline{C}_{p} &&= \begin{bmatrix} C_p \\ \vdots \\ C_1 \end{bmatrix}, \quad && \overline{d}_p = \begin{bmatrix} d_p \\ \vdots \\ d_1 \end{bmatrix}, \quad \overline{w}_p^* = \begin{bmatrix} w_p^* \\ \vdots \\ w_1^* \end{bmatrix}. \end{alignedat} \]

Soft Bilinear Inverted Pendulum 🦿

Classic Control Scheme

Two-Blocks Control Scheme

Introduction

Research Goal: consider the contact compliance in the control design.

Contributions:

  • The Soft Bilinear Inverted Pendulum (SBIP) model is proposed and used in the Motion Planner design.​
  • A Whole Body Controller is developed using Hierarchical Optimization and a soft contact model.​
  • The improvements granted by considering the softness in the MP and the WBC are evaluated.​
  • The proposed approach is tested in simulations and experiments.

Soft Bilinear Inverted Pendulum Model

Equations of motion of the SBIP model: \[ \begin{align} \ddot{x}_{CoM} &= \frac{g + \ddot{z}_{CoM}}{h_{CoM}} (x_{CoM} - x_{ZMP}) \\ \ddot{y}_{CoM} &= \frac{g + \ddot{z}_{CoM}}{h_{CoM}} (y_{CoM} - y_{ZMP}) \\ \ddot{z}_{CoM} &= - \ddot{\delta} = - g + 1/m (k_p \delta + k_d \dot{\delta}) \end{align} \]

Simplified model used to plan the base and feet trajectories over a long horizon.

The planner is a QP-based MPC that tracks a base velocity reference.

Tracking Controller

Full robot model used to plan over a short horizon with higher fidelity.

The equations of motion of a legged robot are \[ M(q) \ddot{q} + h(q, \dot{q}) = S^T \tau + J^T (q) f \] which can be split into the underactuated and actuated part as \[ \begin{align} M_a (q) \ddot{v} + h_a (q, v) &= \tau + J_a^T (q) f \\ M_u (q) \ddot{q} + h_u (q, v) &= J_u^T (q) f \end{align} \]

The optimization vector is \[ x_{opt} = \begin{bmatrix} \dot{v}^T & f^T & \delta_d^T \end{bmatrix}^T \]

From the optimization vector and the robot model, the joint torques are \[ \tau^* = M_a (q) \dot{v}^* + h_a (q, v) - J_a^T (q) f^* \]

Priority Tasks
1 Physical consistency
2 Actuation torque limits
Contact friction cone limits
Force modulation
3 Soft contact constraints
4 Base linear trajectory tracking
Base angular trajectory tracking
Swing feet trajectory tracking
5 Energy minimization
Contact forces minimization

Validation in Simulation

ANYmal C robot

SOLO12 robot

Comparison in Simulation

State of the art -> failure

Proposed approach -> success

Comparison in Simulation

Four controller configurations compared using four KPIs and two different scenarios.

Experimental Results

Tested with the SOLO12 robot on a foam rubber terrain.

Flat ground

Inclined ground

SBIP Conclusion

  • Considering the terrain characteristics in both components of a two-blocks control scheme is beneficial.
  • SBIP model is a good compromise between fidelity and computational efficiency.
  • HQP deals with tasks priorities effectively.

Keep It Cool 🔥

Problem

Actuators overheating during prolonged use.

Introduction

Research Goal: guarantee thermal safety and optimal performance using control.

Contributions:

  • Consider the thermal dynamics in the control design.
  • HQP based approach for constraints prioritization.
  • Ablation study to evaluate the impact on performance.

Control Architecture

First order thermal model \[ \dot{T} = - \alpha (T - T_\text{amb}) + \beta | \tau | \]

Control Architecture

Optimization vector: \[ x = \begin{bmatrix} q^T & \dot{q}^T & T^T & \tau_s^T \end{bmatrix}^T \]

Tasks:

  • Dynamic consistency, Thermal model, Epigraph reformulation
  • Thermal safety, Safety limits
  • Primary task (e.g., end-effector pose)
  • Secondary tasks (e.g., control energy minimization)

Results - Reach and Hold

Results - Circle

Results - Lemniscate

Conclusion

  • Considering the temerature dynamics in the control can avoid overdimensioning the hardware or downsizing the task.
  • HQP effectively guarantees thermal safety.
  • Other approaches (e.g., CBFs) can further enhance performance in certain scenarios.

Conclusion 🏁

Conclusion

  • HQP allows to solve a set of tasks while enforcing a priority order
  • It has been successfully applied to:
    • Legged locomotion on compliant terrains [1]
    • Thermal safety in robots [2]
  • Other potential application:
    • Multi-robot control [3]
  • Future works:
    • Non-uniform time sampling [4]

Future Works – Multi-Robot Control

  • Multi-robot setting
  • HQP to manage multiple concurrent and potentially conflicting tasks

Heterogeneous coverage and battery recharging

Obstacle avoidance > centroid velocity reference > formation control

Future Works – Non-Uniform Time Sampling

Why: Constrained LQR (as most optimal control problems) assumes a uniform time sampling.

Core Idea: Embed the sampling durations into the optimization problem. Derive closed-form cost function and gradients that can be precomputed.

How: Use matrix exponentials to propagate the dynamics exactly.

Key Results: Provide both Single Shooting and Multiple Shooting formulations with analytical gradients.

Take-Away: Let the optimizer decide how to sample the horizon.

References 📚

[1]
D. De Benedittis, F. Angelini, and M. Garabini, “Soft bilinear inverted pendulum: A model to enable locomotion with soft contacts,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 2, pp. 1478–1491, 2025, doi: 10.1109/TSMC.2024.3504342.
[2]
D. De Benedittis, F. Angelini, R. Incerpi, L. Pallottino, and M. Garabini, “Keep it cool: Optimal control of robotic systems under temperature constraints,” IEEE Transactions on Robotics, Under Review.
[3]
D. De Benedittis, M. Garabini, and L. Pallottino, “Managing conflicting tasks in heterogeneous multi-robot systems through hierarchical optimization,” IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 5305–5312, 2025, doi: 10.1109/LRA.2025.3559843.
[4]
P. Gori, D. De Benedittis, M. Pierallini, F. Angelini, G. Pannocchia, and M. Garabini, “Optimal non-uniform time sampling in continuous-time constrained LQR,” IEEE Transactions on Automatic Control, Under Review.

Grazie Per L’Attenzione (◕‿◕)

Contact me at:

  • davide.debenedittis@phd.unipi.it