What if your boss or client came to you with the following question:
“Can you tell me how Twitter hashtags, our current road construction and the weather conditions are affecting the sales of Product X?”
Could you do it?
Enter: Machine learning.
Machine learning is the idea of finding patterns in collections of data sets whose size, diversity and complexity make them difficult to process using traditional applications. These systems, coupled with artificial intelligence, specialize in organizing unstructured data, where connections are more difficult to find and correlate.
So when the need arises to correlate sets of data that likely have something to do with one another but can’t be assessed using traditional databases and methods, companies are still able to make these connections—through machine learning.
Using machine learning for security incidents: Now and then
In the cybersecurity world, machine learning draws on multiple sources of data to determine whether a security threat exists. The industry looks to machine learning to marry security incidents with threat intelligence. The result is a term the industry has coined: machine learning with Big Data analytics.
It used to be that attackers used flaws in software to exploit systems. Attacks grew sophisticated over the years by taking advantage of software flaws as well as business and logic flaws combined with social engineering attacks.
Security professionals played cat-and-mouse games between attackers and their systems; they patched their systems for vulnerabilities before attackers could exploit them. Over time, security devices got smarter; they were able to detect new variants of the same attack, making the perimeter and edge system more difficult to breach.
Now, the defense tactic has changed, mainly due to advanced persistent threats (APT) and zero-day attacks (custom threats that exploit people and business processes). Unlike threats of the past, they do not cause damage initially—they fly under the radar partly because IT organizations are understaffed and overworked (and partly because these attacks do not cause significant changes or denial-of-service to the organization they are attempting to breach).
As Lex Luthor said, “True power is best kept concealed.”
APTs and zero-day attacks require attackers to research an organization’s staff, business processes and valuable assets. They do not rely solely on software flaws to launch attacks. Therefore, the traditional signature-based solutions used by security professionals are mostly rendered useless against APTs and zero-day attacks.
Machine learning and SIEM solutions
Most security professionals concentrate on alerts detected by SIEM (security information and event management) solutions and ignore all other data. They are mostly trying to find the “signal in the noise.”
These systems work by correlating event information from multiple devices, using predefined rules that disregard irrelevant information and surface incidents that are more frequent or present a higher risk based on their behavior.
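To make that concrete, here is a minimal sketch of rule-based correlation. It is not how any particular SIEM product works; the event fields, the weights and the threshold are all made up for illustration.

```python
from collections import Counter

# Hypothetical, simplified events as they might arrive from different devices.
EVENTS = [
    {"device": "firewall", "type": "deny", "src": "10.0.0.5"},
    {"device": "vpn", "type": "login_failed", "src": "10.0.0.5"},
    {"device": "ad", "type": "login_failed", "src": "10.0.0.5"},
    {"device": "ad", "type": "login_ok", "src": "10.0.0.9"},
    {"device": "proxy", "type": "heartbeat", "src": "10.0.0.9"},
    {"device": "ad", "type": "login_failed", "src": "10.0.0.5"},
]

# Predefined rules (assumed values): what to ignore, how risky each event
# type is, and the score at which a source becomes an incident.
IRRELEVANT_TYPES = {"heartbeat", "login_ok"}
RISK_WEIGHTS = {"login_failed": 2, "deny": 1}
ALERT_THRESHOLD = 5


def correlate(events):
    """Group events by source, score them, and return sources over threshold."""
    scores = Counter()
    devices = {}
    for event in events:
        if event["type"] in IRRELEVANT_TYPES:
            continue  # rule: disregard irrelevant information
        scores[event["src"]] += RISK_WEIGHTS.get(event["type"], 0)
        devices.setdefault(event["src"], set()).add(event["device"])
    return [
        {"src": src, "score": score, "devices": sorted(devices[src])}
        for src, score in scores.most_common()
        if score >= ALERT_THRESHOLD
    ]


incidents = correlate(EVENTS)
for incident in incidents:
    print(incident)
```

Notice that everything here is static: the rules only catch what someone thought to write down in advance.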
Traditional SIEMs often provide an incomplete picture of the risks facing an organization. That’s because SIEMs only collect information from portions of the IT infrastructure, leaving critical blind spots. For example, is a user who signed on to a laptop (or network) actually authorized to do so? Are the tools and commands being used consistent with the past behavior of that administrator or of others with similar access, and do those tools or commands pose any known threats?
This is an example of correlating multiple types of activities that most SIEM tools will not capture, which is the essential promise and value of machine learning applications in cybersecurity.
Is it about full-packet capture?
In no way am I saying you need full-packet capture enabled on your network to determine if security threats exist. In many instances, analysis of metadata and flow data will reveal whether risks exist, going far beyond the capabilities of traditional SIEM products.
However, machine learning is all about data. Traditionally, the more data you have access to analyze, the more accurate the intelligence picture you can paint. The advantage of full-packet capture is session reconstruction to detect and investigate how attackers infiltrated the environment and what they did once inside.
Artificial intelligence and machine learning solutions in cybersecurity are about combining multiple sources of data. They provide a platform that automatically ingests threat intelligence from external sources, providing valuable views of the threat environment outside the enterprise and comparing them with the current behavior of events occurring inside an organization.
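As a rough illustration of comparing outside intelligence with inside activity, the sketch below matches a threat feed against internal connection records. The feed contents, the record fields and the function name are hypothetical, and the addresses come from reserved documentation ranges; a real platform would ingest feeds automatically and at far greater scale.

```python
# Hypothetical external threat-intelligence feed: indicator -> reason.
THREAT_FEED = {
    "203.0.113.7": "known command-and-control server",
    "198.51.100.23": "phishing host",
}

# Hypothetical internal telemetry: who connected to which destination.
INTERNAL_CONNECTIONS = [
    {"user": "alice", "dest": "192.0.2.10"},
    {"user": "bob", "dest": "203.0.113.7"},
    {"user": "carol", "dest": "198.51.100.23"},
    {"user": "bob", "dest": "203.0.113.7"},
]


def match_indicators(connections, feed):
    """Return one record per (user, destination) pair that appears in the feed."""
    hits = {}
    for conn in connections:
        reason = feed.get(conn["dest"])
        if reason is None:
            continue  # destination is not a known indicator
        key = (conn["user"], conn["dest"])
        hit = hits.setdefault(
            key,
            {"user": conn["user"], "dest": conn["dest"], "reason": reason, "count": 0},
        )
        hit["count"] += 1
    return list(hits.values())


matches = match_indicators(INTERNAL_CONNECTIONS, THREAT_FEED)
for match in matches:
    print(match)
```

A plain lookup like this only covers indicators someone has already published, which is why the behavioral comparison matters as well.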
Moreover, machine learning, when combined with Big Data security analytics, creates a platform for collecting security data from multiple sources, both internal and external to an organization, that go far beyond traditional log information. Detection is not based on signatures or static correlation rules but on dynamic comparisons to normal baseline behaviors for individuals or groups that have similar job functions and requirements.
Behavior outside the normal baseline flags suspicious activities that may indicate attacker activity. This speeds up the identification of threats that have not yet been categorized by security vendors and helps an organization assess the risk of security events more efficiently.
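Here is a minimal sketch of what comparing against a peer-group baseline can look like. It uses a simple z-score as a stand-in for whatever model a real product would use, and the groups, numbers and threshold are invented for illustration.

```python
from statistics import mean, stdev

# Hypothetical history: daily data downloaded (in MB) by users in each peer group.
HISTORY = {
    "finance": [12, 15, 11, 14, 13, 16, 12, 15],
    "engineering": [40, 55, 48, 52, 45, 50, 47, 53],
}

# Hypothetical observations for today: (user, peer group, MB downloaded).
TODAY = [
    ("dana", "finance", 14),
    ("eli", "finance", 95),
    ("fay", "engineering", 58),
]

Z_THRESHOLD = 3.0  # assumed cutoff: standard deviations away from the group mean


def flag_outliers(today, history, threshold=Z_THRESHOLD):
    """Flag users whose activity deviates strongly from their group's baseline."""
    flagged = []
    for user, group, value in today:
        baseline_mean = mean(history[group])
        baseline_std = stdev(history[group])
        # Guard against a perfectly flat history, where the z-score is undefined.
        z = (value - baseline_mean) / baseline_std if baseline_std else 0.0
        if abs(z) > threshold:
            flagged.append((user, group, round(z, 1)))
    return flagged


flagged = flag_outliers(TODAY, HISTORY)
for user, group, z in flagged:
    print(f"{user} ({group}) is {z} standard deviations from the group baseline")
```

With this made-up data, 95 MB is far outside the norm for finance, while 58 MB is within the normal range for engineering. No signature is involved; the same number can be normal for one group and suspicious for another.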
What do you think? Should more organizations utilize AI and machine learning when building their cybersecurity and threat detection programs? I’d love to hear your thoughts.
Interested in learning more? Join me for my sponsored session at the 2019 Secure360 Conference, “The things that go bump in the night” on Tuesday, May 14 at 11:15AM CST.