Category: Opinion

  • “Only with investment can we build more resilient infrastructures ready for the increasing number of cyberattacks”

    In light of the recent cyberattacks on the Agência para a Modernização Administrativa (AMA), we spoke with INESC-ID researcher Ricardo Chaves, from the High-performance Computing Architectures and Systems research area.

    Cyberattacks are becoming increasingly frequent, as was the case with the attack on AMA. Why is that?
    Cyberattacks are becoming more frequent, whether motivated by financial gain or by political or ideological reasons. Ransomware attacks, such as the one on AMA, are typically financially motivated. The more valuable the target, the higher the potential gain for the criminal. In AMA’s case, we are talking about critical infrastructure responsible for one of the most essential processes in the electronic world: individual identification and authentication. Specifically for AMA, this involves citizen identification, not only for state services but also for every part of civil society that links a service to a person’s identity. Compromising this essential service means compromising the entire chain that depends on it, which explains the impact of this attack.

    Why did it take so long, over a week, to restore all functionalities?
    In this type of attack, the criminal blocks access to the data and, consequently, to the correct functioning of the system, then demands money for unlocking it. Typically, the attacker encrypts the data with a cryptographic key known only to them, and only after payment is this key provided, allowing the victim to decrypt and regain access to their data. In these situations, there are two main ways to recover. One is to pay the attacker, thus encouraging this kind of attack. The other is to have the capability to restore the entire system using functional system backups.
    In this case, the State, in my view wisely, did not pay. This left the option of restoring the system, though it appears that no functional backup was ready to immediately go online. They likely had to configure parts of the system, involving a thorough process of verification and credentialing to ensure that it could be safely and effectively restored for such a critical service.
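
    The verification step mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a manifest of SHA-256 digests was recorded when the backup was taken and stored away from the backup itself; the file names and the manifest format are hypothetical and say nothing about how AMA actually manages its backups.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir: Path) -> dict[str, str]:
    """Record one digest per file at backup time (hypothetical manifest format).

    In practice the manifest would be kept somewhere the attacker cannot reach.
    """
    return {
        str(p.relative_to(backup_dir)): sha256_of(p)
        for p in sorted(backup_dir.rglob("*"))
        if p.is_file()
    }


def verify_backup(backup_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files that are missing or whose content no longer matches."""
    problems = []
    for name, expected in manifest.items():
        candidate = backup_dir / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            problems.append(name)
    return problems


def demo() -> tuple[list[str], list[str]]:
    """Verify a toy backup before and after one file is overwritten."""
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp)
        (backup / "citizens.db").write_bytes(b"record-1\nrecord-2\n")
        (backup / "config.ini").write_bytes(b"mode=production\n")
        manifest = build_manifest(backup)
        clean = verify_backup(backup, manifest)
        # Simulate an attacker encrypting one file in place.
        (backup / "citizens.db").write_bytes(b"\x00\x13\x37 unreadable")
        tampered = verify_backup(backup, manifest)
    return clean, tampered
```

    Calling demo() returns an empty list for the untouched backup and the name of the overwritten file afterwards. A check of this kind only shows that the files are intact; it does not replace the configuration and credentialing work described above.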

    Should citizens be concerned about the consequences of this attack and future attacks?
    Although I am not familiar with the specific infrastructure used by AMA, this type of service is generally divided into two main layers: the interface layer, known as the frontend, and the server layer, known as the backend. The frontend handles the outside world, receiving user requests and interacting with the backend’s functional components. The frontend, by nature, is the most exposed part of the service and consequently more susceptible to attack. The backend runs functional processes related to identity management and citizen authentication.
    To ensure a very high level of security, the critical core of the authentication process is handled in systems known as HSMs, or Hardware Security Modules. HSMs are very robust components that perform only a few simple operations; the cryptographic keys associated with each citizen are stored and used inside them and never leave them. The simplicity of these components allows them to be designed to withstand almost any physical or logical attack, achieving an extremely high security level. It is therefore highly unlikely that these systems have been compromised.
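
    The idea that keys are used inside the module but never leave it can be shown with a toy sketch. Everything here is an assumption made for illustration: the class name ToyHsm, the citizen identifier, and the use of HMAC-SHA256 in place of the asymmetric keys a real HSM would typically hold. Python's private attributes are only a convention, whereas a real HSM enforces this boundary in hardware.

```python
import hashlib
import hmac
import secrets


class ToyHsm:
    """Toy stand-in for an HSM: keys are created inside and never returned."""

    def __init__(self) -> None:
        # Name-mangled attribute; this is NOT real protection, only a
        # convention that mimics the hardware boundary of an actual HSM.
        self.__keys: dict[str, bytes] = {}

    def create_key(self, citizen_id: str) -> None:
        """Generate a fresh 256-bit key for a citizen; nothing is returned."""
        self.__keys[citizen_id] = secrets.token_bytes(32)

    def sign(self, citizen_id: str, message: bytes) -> bytes:
        """Return an authentication tag computed with the citizen's key."""
        return hmac.new(self.__keys[citizen_id], message, hashlib.sha256).digest()

    def verify(self, citizen_id: str, message: bytes, tag: bytes) -> bool:
        """Check a tag in constant time; unknown citizens never verify."""
        if citizen_id not in self.__keys:
            return False
        return hmac.compare_digest(self.sign(citizen_id, message), tag)


hsm = ToyHsm()
hsm.create_key("citizen-42")  # hypothetical identifier
tag = hsm.sign("citizen-42", b"login request")
```

    The backend only ever sees tags and yes/no answers. Because the interface offers no operation that exports a key, compromising the frontend or the rest of the backend does not, by itself, reveal the citizens' keys.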

    What can we, as citizens, do to prevent this from happening again?
    There is little that individual citizens can do to prevent this type of attack. However, we should always be aware of the risks and act to avoid unnecessary exposure. At a national level, we can exert political pressure to encourage greater investment in security and in the people who ensure it. Only with investment can we build more resilient infrastructures ready for the increasing number of cyberattacks. In the case of AMA and similar services, preparation is crucial. While we cannot prevent such attacks from recurring, we must be prepared for immediate recovery, such as having backup systems ready to take over when the main system goes down. Naturally, this comes at a cost, but as we saw with this attack, the cost of downtime is far greater.

    Ricardo Chaves | INESC-ID researcher

  • “Our dependence on IT systems is growing, and therefore, the problems affect us more, seem to happen more, and have more visibility”

    Millions of computers affected, with airports, supermarkets, and TV stations worldwide having their activities compromised. All due to a software failure. INESC-ID board member Miguel Pupo Correia, from the Distributed Parallel and Secure Systems research area and head of the Computer Science and Engineering Department of Técnico, explains what we can do to prevent such a blackout and reveals what has been learned from this episode, which was not the first of its kind and certainly will not be the last.

    A failure at Microsoft resulted in a ‘blackout’ that is said to have affected 8.5 million computers. What kind of failure are we talking about?

    In reality, it wasn’t one failure but two. The one with the most impact wasn’t at Microsoft, but at a cybersecurity company called CrowdStrike. As far as we know, someone at this company left a bug in a software product, which was propagated to millions of computers worldwide and caused them to stop working. So, the company made several errors: the bug itself, a lack of testing that would have detected it, and the propagation of the buggy software version to computers worldwide. Worse, all these computers had to be, and still have to be, fixed manually, one by one. The second failure was indeed at Microsoft, specifically in its cloud service, Microsoft Azure, which had a data center down for several hours during the night of July 18 to 19.

    Was it of malicious or accidental origin? How can one distinguish between the two?

    The causes appear to have been accidental, or rather, there is no reason to believe they were intentional or malicious. Distinguishing them is not easy. The distinction concerns the presence of intention on the part of those who caused them. What we know is that neither company presented the case as having an intentional cause. It also seems evident that if they had been intentional, the perpetrator or perpetrators would be easily identified and would suffer the associated consequences.

    Despite the impact, only about one percent of companies using Windows were affected. What do these companies have in common, with the most notable examples being from the aviation sector?

    According to Gartner, a market research firm, CrowdStrike’s cybersecurity software (“endpoint protection”) is currently the market leader in this type of product. Therefore, the companies that fell victim to the problem were those concerned with the cybersecurity of their systems to the point of investing in and using the most sophisticated product available. Apparently, the choice was not the best from a reliability standpoint, although it might have been from a cybersecurity perspective.

    How can this type of problem be prevented?

    The problem cannot be completely avoided. It must be managed, and the risk of it happening must be kept at an acceptable level. The scientific field that studies the problem of avoiding failures like this – Dependability – has existed for several decades and is a very active research area. In this field, we know well that there are four complementary categories of mechanisms to avoid system failures: 1) fault prevention, which tries to avoid the occurrence and introduction of faults in systems (the bug in CrowdStrike’s case); 2) fault tolerance, which aims to prevent faults from leading to failures (the stoppage of computers in this case); 3) fault removal, which attempts to reduce the number and severity of faults; 4) fault forecasting, which aims to estimate the number, future incidence, and consequences of faults.
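
    The distinction between a fault and a failure, and what fault tolerance does about it, can be made concrete with a small sketch. The two detector functions and the wrapper are hypothetical and are not meant to represent CrowdStrike's software; a fault in kernel-level code cannot be contained with an exception handler like this one, so the sketch only illustrates the principle.

```python
from typing import Callable

Detector = Callable[[str], bool]


def detector_v1(filename: str) -> bool:
    """Last known-good version: flags files ending in .exe."""
    return filename.endswith(".exe")


def detector_v2(filename: str) -> bool:
    """New version containing a fault: it crashes on names without a dot."""
    _, extension = filename.rsplit(".", 1)  # ValueError when there is no "."
    return extension in {"exe", "scr"}


def tolerant(primary: Detector, fallback: Detector) -> Detector:
    """Wrap primary so that a fault in it does not become a service failure."""

    def guarded(filename: str) -> bool:
        try:
            return primary(filename)
        except Exception:
            # The fault is still present, but the service keeps answering
            # by falling back to the previous version.
            return fallback(filename)

    return guarded


scan = tolerant(detector_v2, detector_v1)
```

    Here scan("README") triggers the bug in detector_v2, yet the caller still receives an answer from detector_v1. Fault prevention and fault removal would instead aim at never shipping detector_v2 with the bug, for example through testing before release.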

    We have witnessed the impact at the business level. But this type of problem can also affect citizens. What can each of us, individually, do to avoid suffering such a blackout?

    Both companies and individuals are increasingly dependent on computers and, I would say, want to depend more and more on them. In the case of companies this is evident, but citizens also increasingly depend on personal computers: mobile phones, laptops, tablets, smartwatches, etc. What can be done is to avoid critical dependence. There are numerous examples. One I see as a professor: students who keep their thesis presentation in cloud software (usually Google Slides). Because it is in the cloud, being able to use this software depends on the Internet being available. It seems to me a bad idea to depend on the Internet at an important moment like a master’s or doctoral defence, not to mention that it is unnecessary, since the presentation can simply be downloaded in advance. Identifying these dependencies is not trivial, but it is necessary to ask whether, at an important moment, I will depend on IT and what I can do to avoid that. I once heard Admiral Gameiro Marques, who is the National Security Authority, say that companies should maintain the ability to perform much of their activity manually, without using computers. This may be possible in some cases and impossible in others, but it seems to me a good principle. He was thinking specifically about one company, the IMPRESA group, whose IT infrastructure suffered a devastating cyberattack and which lost the ability to edit the Expresso newspaper with the IT systems it had been using for several years. They might have thought it was impossible to continue producing the newspaper manually, but they had no choice.

    What do we learn – companies and citizens – from this incident?

    A few decades ago, public and business services worked quite poorly. Today we are used to them generally working well, efficiently, and without major delays. What we need to learn is that reality is not perfect and that, at certain times, something that seemed as routine as catching a plane can be delayed by hours or even days, or even become impossible. We need to learn to manage our expectations.

    There has been talk of an increase in the occurrence of such problems – whether accidental or malicious in origin. Is this your opinion? If so, is it an inevitable fact, or can precautions prevent its occurrence?

    I agree that there has been talk and that there is a perception that the occurrence of such problems, both for accidental and intentional reasons, has increased. However, I have no certainty that this is true. This type of problem has always happened. Our dependence on such IT systems is growing, and therefore, the problems affect us more, seem to happen more, and have more visibility in the media.