Gadi Naveh, Threat Prevention Evangelist, Check Point, outlines how Artificial Intelligence (AI) is transforming the way cyberthreats can be discovered.
Artificial Intelligence is already shaping up to be the next Industrial Revolution. Billions of dollars have been invested in AI technologies and the startups that build them. Personal assistants such as Siri, Cortana and Alexa are still in their infancy. Yet they are becoming genuine companions, capable of human-like conversation.
The drive for AI
Whether you realise it or not, AI technologies are already present virtually everywhere you look, addressing almost every aspect of our modern and not-so-modern lives. Speech recognition, image recognition and autonomous cars all rely on AI technology. The financial sector is moving to AI-based insurance risk analysis, credit scores and loan eligibility. We’re also seeing the emergence of AI-based robot lawyers and AI-based medical diagnostics and prognoses. And all of this is just the beginning.
In general, there are three driving forces involved in this progression towards AI:
- Storage: We can now store enormous amounts of data at a fraction of what it used to cost
- Compute power: The capacity now available lets us process mountains of data
- Mathematics: Maths and algorithms drive AI. Machine learning, deep learning and Big Data analytics have all seen major breakthroughs in the past several years
AI technologies have moved from being purely a tool for academic research to something practical that companies can actually build into their commercial products. But can we trust AI to make the right choices?
Assessing current AI capabilities
This is a hard question to answer as, to date, AI solutions have delivered mixed results. For example, Tay was an AI-based Twitter chatbot from Microsoft that went online in March 2016. It took only a few hours of free chatting on the Internet for it to learn the drill. Since the Internet has all sorts of ‘teachers,’ what the bot quickly learned and excelled at were profanity and racial bias. After 16 hours, Microsoft realised the catastrophe it had created and shut the bot down for good.
A few months ago, Mashable ran an article about Google Translate and how it handles Turkish, a gender-neutral language with no distinction between male and female forms: the single pronoun ‘o’ covers both ‘he’ and ‘she.’ But when Turkish is translated into English, the machine-driven algorithm shows bias: she is a cook, he is a doctor; she is a teacher, he is a soldier. And, seemingly apropos of nothing, he is happy while she is unhappy. It’s not that Google’s engineers are sexist. They simply fed their machines all the pre-existing texts they could find and let the tool reach its own conclusions.
As such, it would appear we are still decades away from a magical engine that takes data in and produces the correct decision. But does this make AI useless?
In short, the answer is no. It is just a matter of striking the right balance between the two elements most crucial for AI to work as it should – data and expertise. First and foremost, this requires lots of data covering the entire spectrum of the problem you are trying to solve, providing enough material from which to derive the right conclusions. That data then needs to be supported with the correct expertise, both in the mathematics that drives AI and in the specific domain being addressed; without it, you cannot get the most out of the data in question.
AI and cybersecurity
In cybersecurity, AI has great potential, although it does not come without limitations. Unsurprisingly, these are identical to those mentioned above – a lack of data and a lack of expertise. On top of this, AI systems do not explain their decisions, meaning you must either validate each verdict manually or trust it blindly, and the technology is notorious for fairly high misclassification rates. In cybersecurity this is simply not an option, as we all know that missed detections and false positives can have disastrous consequences.
Despite these limitations, AI, machine learning, deep learning and Big Data analytics are letting us mechanise tasks previously handled only by our scarcest resource – the smartest human analysts. These technologies can make sense of our gigantic mountains of data logs, providing visibility where we were previously blind.
At Check Point, for example, we are exploring AI’s role in cybersecurity by deploying AI-based engines across our threat prevention platform in a number of different capacities. The first is campaign hunting, an AI engine whose goal is to enhance our threat intelligence. A human analyst looking at malicious elements would typically trace the origins of those elements and incriminate similar instances (e.g. domains registered by the same person at the same time with the same lexicographic pattern).
Improving network security with AI
By using AI technologies to emulate – and mechanise – an analyst’s intuition, these algorithms can now analyse millions of known indicators of compromise and hunt for additional, similar ones. As a result, we’re able to produce an additional threat intelligence feed that offers first-time prevention of attacks we’ve never seen before. More than 10% of the cyberattacks we block today are based on intelligence gained solely through campaign hunting.
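To make that intuition concrete, here is a minimal, hypothetical sketch in Python of the kind of rule being mechanised. It assumes simple WHOIS-style registration records and a crude string-similarity measure; the names and thresholds are illustrative of the technique, not Check Point’s actual engine.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher
from typing import List

@dataclass
class DomainRecord:
    name: str
    registrant: str
    registered: datetime

def lexical_similarity(a: str, b: str) -> float:
    """Crude lexicographic similarity between two domain names."""
    return SequenceMatcher(None, a, b).ratio()

def hunt_campaign(seed: DomainRecord,
                  candidates: List[DomainRecord],
                  window: timedelta = timedelta(days=1),
                  threshold: float = 0.7) -> List[DomainRecord]:
    """Flag candidates that share the seed's registrant, were registered
    around the same time, and look lexicographically similar to it."""
    return [c for c in candidates
            if c.registrant == seed.registrant
            and abs(c.registered - seed.registered) <= window
            and lexical_similarity(c.name, seed.name) >= threshold]

# One known-bad seed domain and a handful of registrations to sift.
seed = DomainRecord("secure-login-update1.com", "x@mail.example",
                    datetime(2018, 5, 1, 9, 0))
candidates = [
    DomainRecord("secure-login-update2.com", "x@mail.example",
                 datetime(2018, 5, 1, 9, 5)),
    DomainRecord("totally-unrelated.org", "y@mail.example",
                 datetime(2017, 1, 1, 0, 0)),
]
for hit in hunt_campaign(seed, candidates):
    print("suspected campaign domain:", hit.name)
```

The real engine hunts across millions of indicators rather than a toy list, but the principle is the same: a few cheap signals that an analyst would check by hand, applied at machine scale.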
A second engine, Huntress, looks for malicious executables, one of the toughest problems in cybersecurity. By nature, an executable can do almost anything once it runs, without breaching any obvious boundaries, which makes it hard to tell whether it is trying to do something malicious.
The good news, though, is that cyberattackers rarely, if ever, write everything from scratch. That means similarities to previously known malicious executables are likely to surface, though they are often hidden from the human eye.
But when we use a machine-driven algorithm, our scope of analysis broadens. Using a sandbox as a dynamic analysis platform, we let the executables run and collect hundreds of runtime parameters. We then feed that data to the AI-based engine, trained beforehand on millions of known-good and known-bad executables, and ask it to categorise them.
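Since the engine’s internals are not public, the sketch below is only a generic illustration of this supervised-learning approach, using scikit-learn and randomly generated stand-in data in place of real sandbox feature vectors.

```python
# Generic supervised-learning sketch (not the actual Huntress engine):
# each sandboxed run is assumed to be summarised as a numeric feature
# vector (API-call counts, files touched, registry writes, and so on),
# labelled 1 for known-bad and 0 for known-good.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in data: 1,000 runs x 50 runtime parameters. In practice these
# would come from the sandbox, not from a random generator.
X = rng.random((1000, 50))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Probability that a fresh executable's runtime profile is malicious.
new_run = rng.random((1, 50))
print("P(malicious) =", clf.predict_proba(new_run)[0, 1])
print("held-out accuracy:", clf.score(X_test, y_test))
```

With random stand-in data the accuracy is of course meaningless; the point is the shape of the pipeline: sandbox runs become feature vectors, labelled history becomes training data, and the trained model scores what it has never seen.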
The results are quite astounding. We end up with a dynamic engine, capable of detecting malicious executables beyond what antivirus and static analysis would find. In fact, 13% of the detected malicious executables are based on findings solely from this engine. If it were not for Huntress, we would not have known to block them.
Another example is CADET, Context-Aware Detection. This gives us access and visibility into all parts of the IT infrastructure – networks, data centres, cloud environments, endpoint devices and mobile devices. Rather than inspecting isolated elements, we can look at the full session context and ask whether the element came through email or as a web download, whether the link was sent in an email or a text message on a mobile device, who sent it, and when the domain was registered and by whom.
Essentially, we are extracting thousands of parameters from the inspected element and its context. By using the CADET AI engine, we can reach a single, accurate, context-informed verdict. So far, our testing shows a two-fold reduction in our missed-detection rate and a staggering 10-fold reduction in the false-positive rate. Keep in mind that these are not just nice mathematical results; in real-life cybersecurity, engine accuracy is crucial.
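As an illustration of the principle only (the feature names, weights and threshold below are entirely hypothetical, not CADET’s), here is how several weak contextual signals can be flattened into one vector and combined into a single verdict:

```python
from datetime import datetime, timedelta

def context_features(session: dict) -> list:
    """Flatten a session's context into a numeric feature vector."""
    age_days = (datetime.now() - session["domain_registered"]).days
    return [
        1.0 if session["channel"] == "email" else 0.0,
        1.0 if session["channel"] == "sms" else 0.0,
        1.0 if session["sender_known"] else 0.0,
        min(age_days, 365) / 365.0,  # older domains score as safer
    ]

def verdict(features: list,
            weights=(-0.5, -2.0, 1.5, 3.0), bias=-0.5) -> str:
    """A toy linear scorer standing in for the real engine: a negative
    combined score means the session looks malicious."""
    score = bias + sum(w * f for w, f in zip(weights, features))
    return "block" if score < 0 else "allow"

# A link delivered by text message, from an unknown sender, pointing at
# a domain registered two days ago: each signal alone is weak, but
# together they tip the verdict.
session = {
    "channel": "sms",
    "sender_known": False,
    "domain_registered": datetime.now() - timedelta(days=2),
}
print(verdict(context_features(session)))
```

The value of the context is exactly this combination: no single signal would justify blocking on its own, but seen together in one vector they produce a confident decision.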
In summary, the examples above illustrate how the combination of expertise and vast amounts of data offers the best approach to making cybersecurity practical, using the entire arsenal of available technologies. It is when AI is used as an additional layer, on top of a mixture of expert engines designed to cover the entire attack landscape, that it really comes into its own.
Ultimately, cybersecurity must be practical, and as we move further along the AI continuum, these technologies are bringing us ever closer to smarter, more practical threat defences.