Beyond Resilience

Resilience is a fashionable word. Being able to withstand the slings and arrows of outrageous fortune is something everyone wants. The ability to recover quickly, to spring back into the original shape, is certainly desirable in any field. Nature builds resilient things. Engineers are sometimes able to create resilient designs, particularly when such designs are simple.

Before we proceed, it is important to underline that resilience is a well-defined physical property of systems. It is not subject to discussion. It is not the result of a point of view. It is not something you figure out by filling out questionnaires or by interviewing management. There is plenty of science behind resilience. In mechanical engineering, the ability of a material to absorb energy when deformed elastically and to return it when unloaded is called resilience. The modulus of resilience is the strain energy per unit volume required to stress the material from zero stress to the yield stress. The modulus of resilience for various materials is illustrated below.

From https://www.totalmateria.com/page.aspx?ID=CheckArticle&site=kts&NM=41

The above numbers are obtained scientifically, not in discussions, meetings or interviews.
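To make the definition concrete, here is a minimal sketch of how the modulus of resilience can be computed. It assumes linear-elastic behaviour up to the yield point, in which case the area under the stress-strain curve is the yield stress squared divided by twice Young's modulus. The function name and the material values are illustrative assumptions, not figures from the table referenced above.

```python
def modulus_of_resilience(yield_stress_pa: float, youngs_modulus_pa: float) -> float:
    """Strain energy per unit volume (J/m^3) absorbed up to the yield point.

    Assumes linear-elastic behaviour: U_r = sigma_y**2 / (2 * E).
    """
    if yield_stress_pa < 0 or youngs_modulus_pa <= 0:
        raise ValueError("yield stress must be >= 0 and Young's modulus > 0")
    return yield_stress_pa ** 2 / (2.0 * youngs_modulus_pa)


# Illustrative round numbers only (roughly a mild steel):
# yield stress 250 MPa, Young's modulus 200 GPa.
u_r = modulus_of_resilience(250e6, 200e9)
print(f"{u_r / 1e3:.2f} kJ/m^3")  # prints "156.25 kJ/m^3"
```

The point of the sketch is simply that the number follows from two measured material properties, with no room for opinion.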

Ontonix has developed a scientific method for measuring the resilience of generic systems and processes, based on information flow topology, complexity and critical complexity. We offer a hardware solution that measures resilience in real time and is installed on our clients' mission-critical military equipment. This solution has been developed in collaboration with SAIC.

Ruggedized complexity and resilience monitoring device.

So, resilience is a good thing and the more you have the better. In theory. In fact, there are a few things to keep in mind before you invest a lot of money in your own resilience or the resilience of your business.

First of all, resilience has a ‘static’ flavor in that it is something you hard-wire into a system. Resilience may mean high costs. Since you cannot insure yourself against everything that an increasingly complex environment can throw at you, resilient systems may be ‘obese’. Living in an underground bunker, surrounded by meters of concrete, makes you immune to many things but it has obvious disadvantages.

Resilient systems resist change. Resilience can limit how fast a system can react or adapt to a new set of circumstances.

When things become very complex, resilience is not easy to achieve. Think of computer operating systems, flight control software in modern civilian and military aircraft, the software in modern cars, or the internet. These systems can become unstable, unreliable and cause plenty of headaches. Making them highly resilient is impossible if their design neglects complexity, their salient feature. Today, engineers try to design highly complex systems but they neglect complexity. This is truly extraordinary.

The bottom line:

People demand resilience but are not willing to measure it. They prefer to talk about it, to fill out questionnaires, to speculate about it and to come up with their own definitions thereof.

Engineers, for their part, are not willing to take into account the most important cause of fragility (i.e. the opposite of resilience), which is the excessive complexity of the systems they design.

What can one do? What lies beyond resilience, which is an expensive and static property? Does it make sense to invest in being resilient, knowing that the environment changes quickly and that today’s resilience may be insufficient tomorrow? Think of cyber-resilience, nowadays a popular topic and buzzword. One can put in place an expensive infrastructure to protect oneself from cyber-attacks or other forms of aggression, but what happens if attack techniques change? Will old infrastructures guarantee protection and resilience? Besides, resilience does not prevent attacks; it only provides the ability to resume the original state after the fact. In theory.

So, what alternative is there? We believe that a more modern means of protection, beyond resilience, lies in anomaly detection and reaction to these anomalies. Being ‘statically resilient’ and hoping that past lessons and remedies will defend you tomorrow is risky. One cannot design resilience today for tomorrow’s (unknown) attacks. A vaccine against an unknown disease hasn’t been invented yet.

Anomaly detection, or early detection of attacks, is another popular subject. While it is paramount to have an early warning at one's disposal, it is also important to know what constitutes an anomaly. And how many anomalies are there? How many things can go wrong in a piece of software with tens of millions of lines of code? How many forms of attack are there? Do we have sufficient examples of anomalies to learn to recognize them? Is there enough time to use Machine Learning to recognize them? And what about zero-day attacks or zero-day vulnerabilities? Is Machine Learning the right approach? We think it is not.

Machine learning is fine for many applications, but not for all applications. It is slow and expensive, especially if examples of anomalies or failures are costly and/or rare.

Ontonix has developed a generic means of detecting anomalies, especially the nasty ones, those that have a systemic connotation, and in particular anomalies that have never been witnessed before. The approach is based on complexity.

We know that complexity tends to grow or develop sharp spikes before a crisis or any destabilizing phenomenon of endogenous or even exogenous nature. This defines an anomaly. Recognizing anomalies defined in this manner doesn’t require any form of learning, only complexity monitoring. An example is shown below. It shows the massive drop in the DOW (red curve) in February 2020, induced by the COVID infodemic. The blue curve is the corresponding complexity.

DOW Jones index and index complexity. Horizontal axis units are hours.

The spike in complexity, which offers excellent opportunities to short the index, occurred approximately 100 hours before the plunge. The system was never trained to react to this kind of situation. It all happened on the fly. The only information used to produce the early warning signal was the DOW index itself, sampled at 60-minute intervals.
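The monitoring step described above can be sketched in a few lines. The sketch assumes that a complexity value is already available for each sample (how QCM computes it is not described here) and it flags a spike whenever the latest value rises well above a trailing baseline. The function name, the window length and the threshold are illustrative assumptions, and the simple z-score rule is a stand-in, not the Ontonix method. Note that no training set of past anomalies is involved.

```python
from collections import deque
from statistics import mean, stdev


def detect_spikes(series, window=24, threshold=3.0):
    """Return indices whose value exceeds the trailing-window mean
    by more than `threshold` standard deviations.

    `series` is assumed to be a stream of complexity values computed
    elsewhere; `window` and `threshold` are hypothetical tuning knobs.
    """
    history = deque(maxlen=window)  # only the most recent `window` samples
    alerts = []
    for i, value in enumerate(series):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # Upward excursions only: the text is about spikes in complexity.
            if sigma > 0 and (value - mu) / sigma > threshold:
                alerts.append(i)
        history.append(value)
    return alerts


# Synthetic data for illustration: a quiet series with one sharp spike.
complexity = [1.0, 1.1] * 20 + [5.0] + [1.0, 1.1] * 5
print(detect_spikes(complexity))
```

With hourly samples, a window of 24 corresponds to one day of history. Because the baseline is recomputed from the stream itself, the detector needs nothing beyond the live data.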

The infrastructure needed to implement a fast complexity-based anomaly and attack detection capability is a set of sensors that makes it possible to monitor a given system and produce a stream of live data. QCM does the rest.

QCM – Quantitative Complexity Management – is a trademark of Ontonix S.r.l.
