In this blog article, we present an overview of adaptive and learning control, from historical perspectives to open challenges and potential future solutions. The focus is threefold: first, we give a historical overview of adaptive and learning algorithms; second, we contrast model-based approaches with data-driven approaches; and finally, we discuss open challenges and potential future research areas.

#### 1. Historical Perspectives

Model-based adaptive systems were first investigated in the early 1950s in order to improve the performance of autopilot designs at the time. As flight envelopes grew and higher performance levels were desired, interest in methods such as Model Reference Adaptive Control (MRAC) using the MIT rule grew rapidly. Neither stability in feedback nor adaptation itself was well understood, which, coupled with flight-test failures, solidified the need to focus on stability.

The state-space system formulation, controllability and observability, and the use of Lyapunov stability theory for control systems were introduced in the early 1960s, followed by the first MRAC law with Lyapunov-based adaptation. The 1960s also saw the introduction of an adaptive variant of sliding mode control, which is still a popular and important topic today. Adaptive Pole Placement and Self-Tuning Regulators, which allowed control of non-minimum phase systems, were introduced in the 1970s.

While stability theory had matured, stability issues in adaptation were again faced when researchers showed problems due to small disturbances, high gains, high frequencies, fast adaptation, and time-varying parameters. The late 1970s and 1980s were spent addressing these issues in adaptive control (and observation) through methods such as *σ*- and *e*-modification, parameter projection, and dead-zone. The ideas of persistent excitation and sufficient richness were also developed during this period.
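
To make two of the modifications named above concrete, here is a textbook-style sketch for a gradient-type adaptive law. The notation is an assumption for illustration and is not taken from this article: $\hat{\theta}$ is the parameter estimate, $\phi$ the regressor, $e = \hat{\theta}^\top \phi - y$ the prediction error, $\gamma > 0$ the adaptation gain, $\sigma > 0$ a leakage coefficient, and $e_0 > 0$ a dead-zone threshold.

```latex
% Assumed notation (illustrative only):
%   \hat{\theta}: parameter estimate,  \phi: regressor,
%   e = \hat{\theta}^\top \phi - y: prediction error,  \gamma > 0: adaptation gain.
\begin{align*}
\text{unmodified gradient law:} \quad
  & \dot{\hat{\theta}} = -\gamma\,\phi\,e \\
\sigma\text{-modification (leakage):} \quad
  & \dot{\hat{\theta}} = -\gamma\,\phi\,e - \gamma\,\sigma\,\hat{\theta},
    \qquad \sigma > 0 \\
\text{dead-zone:} \quad
  & \dot{\hat{\theta}} =
    \begin{cases}
      -\gamma\,\phi\,e, & |e| > e_0 \\
      0, & |e| \le e_0
    \end{cases}
\end{align*}
```

The leakage term keeps the estimate bounded when disturbances would otherwise make it drift, at the price of biasing it toward zero. The dead-zone simply stops adaptation once the error is within the assumed disturbance level $e_0$.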

It was not until the 1970s that the work on controllability and observability was extended to non-linear systems (affine and non-affine in the control) through the use of Lie Theory. The early to mid-1980s saw the formulation of Feedback Linearization as a natural extension of this Lie Theory approach, which had become quite popular. Once the theory of non-linear control had matured somewhat, the stage was set for a more integrated approach using Lyapunov theory that could address some of the pitfalls associated with Feedback Linearization, such as higher derivatives of parameter estimates in the control law.

The Backstepping approach, developed in the early 1990s, allowed the control designer to guarantee stability while including helpful non-linearities for stability through judicious use of Lyapunov stability theory. This provided a systematic approach to handling uncertainties, as well as issues in Feedback Linearization. In the spirit of the robust adaptive control pursuit of the 1980s a new method, Adaptive Robust Control, combined ideas from robust control and adaptive Backstepping.

Non-model-based adaptive control approaches, such as Extremum Seeking and Neural Networks, have their own timeline of development. The first known publication on Extremum Seeking (ES) appeared in the early 1920s, well before the controls community was interested in adaptation. Work on ES was sparse until the early 2000s, when the first rigorous stability proof was provided. The next ten years were fruitful for the subject, with extensions to discrete-time systems, slope seeking, and global properties. Extremum Seeking has since been applied successfully in a wide variety of interesting applications.

The Neural Networks concept was introduced in the early 1940s, with the first learning rule being published in 1949. Arguably, the next most important accomplishment in the field was the introduction of the Perceptron and its convergence theorem in 1958. This laid the foundation for much of the work on Multi-Layer Perceptron structures as well as the Back-Propagation algorithm, which is still used to this day. The late 1960s through the 1970s saw a period of diminished funding and interest until the creation of Recurrent Neural Networks (RNN) in the early 1980s, which showed that information could be stored in these networks. Radial Basis Function (RBF) networks and Support Vector Machines (SVM) dominated the field from the late 1980s through the early 2000s. In the last decade, significant technological improvements, as well as the popularization of deep learning for faster training, led the community to return to earlier methods such as Multi-Layered and Recurrent Networks (and variations of these).

While we cannot cover every event that has taken place in the adaptive control community, our goal is to inform on the major events and trends that have taken place over the last century. For a more detailed discussion along with references, please refer to [Black, et al.(2014)].

#### 2. Model-based vs. Data-Driven approaches

We give here a succinct discussion of model-based vs. data-driven methods in adaptive and learning control; we refer the reader to [Benosman(2018)] for a more thorough overview of this topic. One way to classify adaptive control methods is to group them according to how much they rely on a model of the system. We propose to decompose adaptive control theory into two main classes: model-based adaptive control and data-driven adaptive control.

Figure 1: A Classification of Adaptive Control

Let us first define what we mean by each class. In this section, when we refer to model-based adaptive controllers we mean controllers that are *entirely or partially* based on a given model of the system. The model can be motivated by physics or be an input-output model. The key point is that all or part of the controller is based on a model of the system. Alternatively, when we refer to data-driven adaptive controllers we mean controllers that do not rely on any model of the system; instead, they are fully based on learning from direct interaction with the environment, e.g., trial-and-error approaches. Some examples are: data-driven extremum seekers (ES), model-free reinforcement learning (RL), genetic algorithms, neural network and deep learning algorithms, kernel function-based parameterization, particle filters, and model-free iterative learning control (ILC).

Furthermore, we can decompose the model-based class into two subclasses, namely, fully model-based or classical adaptive control, and partially model-based or learning-based adaptive control. In fully model-based adaptive control, the controller is entirely based on a model of the system: both the controller and the adaptation filters are derived from the model. Some examples are: model reference adaptive control (MRAC), multiple-model adaptive switching control, concurrent adaptive control, adaptive regulation or disturbance rejection, $\mathcal{L}_1$ adaptive control, speed-gradient-based control, passivity-based adaptive control, composite (or combined) adaptive control, retrospective cost adaptive control, set-theoretic-based adaptive control, and robust adaptive control.
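
As a small illustration of the fully model-based class, the following sketch simulates a scalar MRAC loop in which both the control law and the adaptation law are derived from an assumed plant structure. Everything specific here is an assumption made for the example and does not come from the cited references: the first-order plant `x' = a*x + b*u`, the reference model, the numerical values, the square-wave reference, the forward-Euler integration, and the function name `simulate_mrac`.

```python
import math


def simulate_mrac(a=1.0, b=2.0, a_m=-2.0, b_m=2.0, gamma=5.0,
                  dt=1e-3, t_end=40.0):
    """Forward-Euler simulation of a scalar MRAC loop (illustrative values).

    Plant:            x'   = a * x + b * u        (a unknown to the controller)
    Reference model:  x_m' = a_m * x_m + b_m * r  (a_m < 0)
    Control law:      u    = k_x * x + k_r * r
    Adaptation:       k_x' = -gamma * e * x * sign(b)
                      k_r' = -gamma * e * r * sign(b),  with e = x - x_m
    """
    x = x_m = 0.0                     # plant and reference-model states
    k_x = k_r = 0.0                   # adaptive feedback / feedforward gains
    sign_b = math.copysign(1.0, b)    # the law only uses the sign of b
    ise = 0.0                         # integral of squared tracking error
    for i in range(int(round(t_end / dt))):
        t = i * dt
        r = 1.0 if math.sin(0.5 * t) >= 0.0 else -1.0  # square-wave reference
        e = x - x_m                   # tracking error
        u = k_x * x + k_r * r         # certainty-equivalence control
        # Lyapunov-based gain updates
        k_x -= dt * gamma * e * x * sign_b
        k_r -= dt * gamma * e * r * sign_b
        # Propagate plant and reference model one Euler step
        x += dt * (a * x + b * u)
        x_m += dt * (a_m * x_m + b_m * r)
        ise += dt * e * e
    return ise, k_x, k_r


if __name__ == "__main__":
    ise, k_x, k_r = simulate_mrac()
    print(f"integrated squared error: {ise:.3f}, k_x={k_x:.3f}, k_r={k_r:.3f}")
```

For this structure, the Lyapunov function V = e²/2 + |b|(Δk_x² + Δk_r²)/(2γ) satisfies V' = a_m e² in continuous time, so the integrated squared tracking error is bounded by V(0)/|a_m|, which is 0.325 for these defaults (up to discretization error). The ideal gains would be (a_m - a)/b and b_m/b. Whether the adapted gains actually approach them depends on how rich the reference signal is.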

In contrast, in the partially model-based or learning-based subclass, the controller is based on a model of the system, but the adaptation filters are data-driven, e.g., based on machine learning algorithms. The key point is that the adaptation layer relies solely on direct interaction with the environment, without any assumption on the system model, i.e., the design of the adaptation filters is not based on the model. Some examples are: ES-based modular adaptive control, Gaussian-process modular adaptive control, NN-based modular adaptive control, model-based RL control, model-based approximate/adaptive dynamic programming (ADP) control, and model-based neuro-dynamic programming (NDP) control.

Each of these classes can be further decomposed into subclasses. When talking about (fully or partially) model-based adaptive control, we can identify the following main subclasses: direct adaptive control and indirect adaptive control. Other, deeper subclasses can be identified based on the mathematical nature of the model, e.g., linear, nonlinear, continuous, discrete, or hybrid. For the sake of clarity, we have summarized this classification in Figure 1.

Fully model-based adaptive control has a long history of theoretical analysis and is considered very mature in terms of theoretical guarantees. However, because of its model-based formulation, the results obtained are restricted to certain known types of models, and extending them to more general models is very challenging. Some of these restrictions have been relaxed in the partially model-based, or learning-based, paradigm, where only part of the model is needed to design the adaptive controller and the unmodeled part is handled by data-driven optimization and learning algorithms. Still, this added flexibility relative to the fully model-based approaches remains limited in comparison with the fully data-driven adaptive methods.

Fully data-driven methods learn the best control policies by direct interaction with the system, without any prior knowledge of the system model. This allows a great deal of flexibility, but it comes at the cost of extensive measurement and probing of the system, i.e., data collection. These methods also tend to need considerable computation power: by discarding any form of model, they use no prior knowledge of the physics of the system and thus have to explore a larger space of actions to find the optimal policies. In contrast, the learning-based methods, which rely on some prior knowledge of the system through a partial model, explore a smaller or parameterized space of actions to search for optimal policies or optimal parameters. As a result, they prove to be faster or less computationally demanding than the fully data-driven methods.

Finally, an important point of comparison among all these adaptive control approaches is stability and performance guarantees, which have been obtained for the fully model-based and the learning-based methods but are largely lacking for the fully data-driven approaches.

#### 3. Open Challenges and Potential Future Solutions

As documented by the large number of monographs and papers on adaptive control reported in [Black, et al.(2014), Benosman(2018)], this field is well studied. However, many challenging problems remain open for investigation. For instance, we found very few papers addressing adaptive control for hybrid dynamical systems modeled by the general mathematical class of differential inclusions. We can also underline one common drawback of all data-driven and learning-based adaptive controllers. By design, they are more forgiving in terms of knowledge of the system’s model; however, their learning algorithms often rely on a proper choice of weight functions and other coefficients defined in the learning algorithm, e.g., excursion amplitudes and dither frequencies in ES algorithms, or the choice of basis functions in NN algorithms. This in some sense defeats the purpose of not needing a well-tuned model in the first place. An interesting direction for improving these algorithms may be to merge tools from robust control theory with learning-based and data-driven adaptive control, in order to design learning and adaptive algorithms that are robust with respect to their tuning parameters. This would make these adaptive algorithms less sensitive to tuning by the designer or user. Furthermore, fully data-driven approaches such as deep-learning methods would benefit immensely from the theoretical tools of dynamical systems theory, as well as nonlinear and robust control theory, to achieve constructive designs aiming for stability and performance guarantees.
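
The sensitivity to tuning mentioned above can be seen in a minimal perturbation-based extremum seeker. This is a sketch under stated assumptions, not a reference implementation: the static quadratic cost, the first-order high-pass filter, the forward-Euler integration, the function name `extremum_seek`, and all numerical values are hypothetical choices made for illustration.

```python
import math


def extremum_seek(cost, theta0, a, omega, k, h=1.0, dt=1e-3, t_end=60.0):
    """Perturbation-based extremum seeking that minimizes a static map.

    a     : dither (excursion) amplitude
    omega : dither frequency in rad/s
    k     : adaptation gain
    h     : cut-off of the high-pass filter that removes the mean of the cost
    """
    theta_hat = theta0
    eta = cost(theta0)                   # low-pass state, started at the initial cost
    for i in range(int(round(t_end / dt))):
        s = math.sin(omega * i * dt)     # dither signal
        y = cost(theta_hat + a * s)      # probe the cost with the dithered input
        y_hp = y - eta                   # high-pass filtered measurement
        eta += dt * h * (y - eta)        # update the low-pass state
        theta_hat -= dt * k * y_hp * s   # demodulate and descend the gradient estimate
    return theta_hat


if __name__ == "__main__":
    # Hypothetical cost with its minimum at theta = 2.
    cost = lambda th: (th - 2.0) ** 2 + 1.0
    well_tuned = extremum_seek(cost, 0.0, a=0.2, omega=10.0, k=1.0)
    small_dither = extremum_seek(cost, 0.0, a=0.02, omega=10.0, k=1.0)
    print(well_tuned, small_dither)
```

For this quadratic cost, the averaged dynamics converge at a rate of roughly `k * a`. With `a = 0.2` the estimate settles close to the minimizer within the simulated 60 seconds, while with `a = 0.02` and the same gain it is still far from it. No model of the cost is needed, but the outcome hinges on the choice of `a`, `omega`, and `k`.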

One characteristic of adaptive control is asymptotic convergence to zero for tracking errors but not for parameters. With real systems, we often care more about transient performance, but it is generally not possible to give a-priori performance bounds for adaptive controllers. For example, initial parameter errors are typically unknown, which makes it difficult to estimate the total error of the system and predict performance. Using system identification to estimate initial parameter values is a good idea (if possible). Methods such as projection have been used with backstepping to bound transient performance, but also require bounds on parameters which may defeat the true purpose of adaptive control. At this point we look back and ask, what is the class of problems for which there is great enough parametric uncertainty which cannot be predicted, measured, or bounded a-priori? Moreover, how many variables can be realistically and safely addressed in adaptation?

It is known that the error dynamics of an adaptively controlled system have an equilibrium point at zero, but without persistent excitation this equilibrium point is not unique. Depending on the design goals, parameter convergence to the true values may not be required. However, many boundedness theorems require exponential stability at the origin, and adaptive control is a nonlinear control approach that is likely to violate small-neighborhood assumptions. It is here that we may wish to study the stability properties of only part of the states, which is known as the partial stability problem. We believe studying partial stability in adaptive control would be not only interesting but also beneficial for the community.
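
The non-uniqueness of the equilibrium without persistent excitation can be reproduced with a two-parameter gradient estimator. The sketch below is an illustrative assumption rather than an example from the cited references: the linear-in-parameters output `y = theta1*phi1 + theta2*phi2`, the two regressors, the gain, and the name `gradient_estimator` are all hypothetical.

```python
import math


def gradient_estimator(regressor, theta_true=(2.0, -1.0), gamma=1.0,
                       dt=1e-3, t_end=60.0):
    """Gradient parameter estimator for y = theta_true . phi(t), forward Euler.

    Returns the final estimate and the magnitude of the last prediction error.
    """
    theta_hat = [0.0, 0.0]
    e = 0.0
    for i in range(int(round(t_end / dt))):
        phi = regressor(i * dt)
        y = theta_true[0] * phi[0] + theta_true[1] * phi[1]     # measured output
        e = theta_hat[0] * phi[0] + theta_hat[1] * phi[1] - y   # prediction error
        theta_hat[0] -= dt * gamma * phi[0] * e                 # gradient update
        theta_hat[1] -= dt * gamma * phi[1] * e
    return theta_hat, abs(e)


def constant_regressor(t):
    """Not persistently exciting: only theta1 + theta2 is identifiable."""
    return (1.0, 1.0)


def rich_regressor(t):
    """Persistently exciting for two parameters."""
    return (1.0, math.sin(t))


if __name__ == "__main__":
    print(gradient_estimator(constant_regressor))
    print(gradient_estimator(rich_regressor))
```

With the constant regressor, the prediction error goes to zero while the estimate settles at approximately (0.5, 0.5). That point lies on the line `theta1 + theta2 = 1`, which matches the data, but it is not the true parameter vector (2, -1). With the richer regressor, the same update drives the estimate toward the true values.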

The decreasing size and cost of sensors, coupled with design for redundancy and reliability, have resulted in many sensor-rich systems, such as vehicles, that can use a variety of functions of sensor measurements instead of conventional sensor fusion. These methods can account for nonlinearity in both sensor and system dynamics. Controllability and observability (along with their metrics) were extended to a narrow class of nonlinear systems via Lie Theory, but new methods may help expand them to other system classes. Investigations into these areas may yield unexpected progress on other classical questions, such as obtaining explicit a-priori bounds on the transient performance of adaptive systems under a variety of conditions.

#### Bibliography

[Black, et al.(2014)] Black, W. S., Haghi, P., Ariyur, K. B., Adaptive Systems: History, Techniques, Problems, and Perspectives. Systems, 2(4), 606-660, 2014.

[Benosman(2018)] Benosman, M., Model-based vs data-driven adaptive control: An overview. Int. Journal of Adaptive Control and Signal Processing, DOI: 10.1002/acs.2862, 2018.

#### Author

K. Ariyur, M. Benosman, W. Black