Security Information and Event Management (SIEM) is a type of computer security software solution that combines real-time monitoring of IT infrastructure for security threats with the gathering and analysis of log data and event data collected from disparate components of the IT environment.
The goals of using SIEM are to obtain actionable information, such as notifications of security threats for incident response or preventative action; to reduce false positives for security events; and to implement centralized policy-based rules for automated and enhanced regulatory compliance.
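To make the idea of centralized policy-based rules concrete, here is a minimal sketch in Python. The rule names, event fields (action, user, via_vpn, attempts), and thresholds are all hypothetical and chosen only for illustration; real SIEM products define rules in their own formats.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A single policy rule: a name, a severity, and a predicate over an event."""
    name: str
    severity: str
    matches: Callable[[Dict], bool]


# Hypothetical rules kept in one central list so every event source
# is checked against the same policy.
RULES = [
    Rule(
        "root-login-outside-vpn",
        "high",
        lambda e: e.get("action") == "login"
        and e.get("user") == "root"
        and not e.get("via_vpn", False),
    ),
    Rule(
        "repeated-auth-failure",
        "medium",
        lambda e: e.get("action") == "auth_failure" and e.get("attempts", 0) >= 5,
    ),
]


def evaluate(event: Dict, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules that the event matches."""
    return [rule.name for rule in rules if rule.matches(event)]
```

Calling evaluate on each incoming event yields the names of the rules it triggered; an empty list means the event matched no policy and needs no notification.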
The term SIEM was initially used at Gartner as far back as 2005, when it was defined as a technology that provides real-time event management and historical analysis of security data from a wide set of heterogeneous sources. SIEM tools have since evolved to help modern organizations comply with a range of important regulations governing data, such as GDPR and HIPAA.
SIEM solutions are available as dedicated enterprise software or as managed services.
Big data and machine learning are two important fields that have huge potential to connect with SIEM software and enhance its functionality. This article outlines the connection between SIEM, big data, and machine learning (for more detail on SIEM uses, see the "What is SIEM?" resource by Exabeam).
SIEM and Big Data
The emergence of large volumes of fast-moving unstructured data, such as data generated by web, email, and social media applications, poses a challenge to SIEM in terms of its ability to unearth correlations and other important information about your current security status. Such data might be relevant for IT security, but older types of SIEM systems might not be designed to work with it.
The power of big data technology is such that distributed computing environments and frameworks like Hadoop are now available to easily store and analyze huge amounts of unstructured data. Big data analytics can extend the capabilities of SIEM in threat detection by giving access to correlations among pools of data that were previously inaccessible. This big data can encompass log files and events from internal systems in addition to external sources such as threat intelligence feeds, vulnerability databases, and social media data.
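As a rough illustration of correlating internal events with an external source, the sketch below matches log events against a threat intelligence feed of known-bad IP addresses. The event fields (source, src_ip, action) and the feed contents are made up for this example, and the IP addresses come from ranges reserved for documentation. On a big data platform the same idea would typically run as a distributed join rather than an in-memory lookup.

```python
def correlate_with_threat_intel(events, bad_ips):
    """Return the events whose source IP appears in the threat feed."""
    bad = set(bad_ips)  # set membership keeps each lookup cheap
    return [event for event in events if event.get("src_ip") in bad]


# Hypothetical events collected from different internal systems.
events = [
    {"source": "firewall", "src_ip": "203.0.113.7", "action": "allow"},
    {"source": "webserver", "src_ip": "198.51.100.23", "action": "GET /login"},
    {"source": "vpn", "src_ip": "203.0.113.7", "action": "connect"},
]

# Hypothetical external threat intelligence feed.
threat_feed = ["203.0.113.7", "192.0.2.99"]

hits = correlate_with_threat_intel(events, threat_feed)
for hit in hits:
    print(hit["source"], hit["src_ip"], hit["action"])
```

Here the same flagged address shows up in both the firewall and VPN logs, which is the kind of cross-source correlation that is hard to see when each log is inspected in isolation.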
The volume of security event data is so huge that the largest modern enterprise security operations teams might see billions of events per day. A SIEM solution that integrates with big data infrastructure helps organizations avoid the potential embarrassment of collecting all the data and events required to detect a security breach, but lacking the infrastructure to analyze that information and find the breach.
Big data infrastructure, with its huge storage capacity, also makes it possible to retain security-related data for longer. This means that organizations have more history from which to analyze trends and build a detailed picture of their baseline normal network activity, which helps them proactively combat future security threats.
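One simple way to think about a baseline of normal activity is as summary statistics over retained historical data. The sketch below, which assumes hypothetical hourly event counts and an arbitrary threshold of three standard deviations, flags a new count that strays far from the historical mean. It is only an illustration of the idea, not how any particular SIEM product computes baselines.

```python
from statistics import mean, stdev


def build_baseline(hourly_counts):
    """Summarize historical activity as (mean, sample standard deviation)."""
    return mean(hourly_counts), stdev(hourly_counts)


def is_unusual(count, baseline, threshold=3.0):
    """Flag a count that lies more than `threshold` deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # No variation in the history: anything different is unusual.
        return count != mu
    return abs(count - mu) / sigma > threshold


# Hypothetical events-per-hour figures from retained historical logs.
history = [120, 130, 125, 118, 122, 127, 131, 119]
baseline = build_baseline(history)

print(is_unusual(126, baseline))  # close to the historical mean
print(is_unusual(400, baseline))  # far above anything seen before
```

The longer the retained history, the more representative the baseline becomes, which is where the storage capacity of big data infrastructure pays off.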
SIEM and Machine Learning
Machine learning techniques use algorithms to give computer systems the ability to automatically learn and improve from experience without being explicitly programmed. While machine learning applied to SIEM is still a nascent field, there is clear potential for this branch of AI to augment SIEM capabilities.
A pertinent and effective attack vector used by cybercriminals is the compromise of user credentials. Advanced persistent threats (APTs), in which an intruder gains access to a network and remains undetected, typically involve phishing and social engineering to obtain user credentials, and such attacks can cost millions of dollars and/or lead to the compromise of extremely sensitive information.
Defending against these types of threats is a huge battle that machine learning can assist with. When a machine learning model is fed input data on user behavior, including location, login times, usage habits, and history, it can become proficient at establishing a clear picture of what constitutes normal user behavior across your systems.
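The sketch below shows the shape of this idea in Python. It is deliberately not a real machine learning model: it is a simple stand-in that memorizes each user's past locations and login hours, then reports how a new login deviates from them. The class name, method names, and the one-hour tolerance are all assumptions made for illustration; a production system would replace this with a trained model over many more features.

```python
from collections import defaultdict


class UserBehaviorProfile:
    """Toy per-user profile of known locations and login hours (0-23)."""

    def __init__(self):
        self.locations = defaultdict(set)
        self.login_hours = defaultdict(set)

    def learn(self, user, location, hour):
        """Record one observed login as part of the user's normal behavior."""
        self.locations[user].add(location)
        self.login_hours[user].add(hour)

    def anomaly_reasons(self, user, location, hour, hour_tolerance=1):
        """Return a list of reasons why this login looks abnormal, if any."""
        reasons = []
        if location not in self.locations.get(user, set()):
            reasons.append("unfamiliar location")
        known_hours = self.login_hours.get(user, set())
        # Compare hours on a 24-hour clock so 23:00 and 00:00 count as adjacent.
        near_known_hour = any(
            min(abs(hour - h), 24 - abs(hour - h)) <= hour_tolerance
            for h in known_hours
        )
        if not near_known_hour:
            reasons.append("unusual login time")
        return reasons


profile = UserBehaviorProfile()
for hour in (8, 9, 10):
    profile.learn("alice", "London", hour)

print(profile.anomaly_reasons("alice", "London", 9))
print(profile.anomaly_reasons("alice", "Lagos", 3))
```

Returning the reasons rather than a bare true or false makes it easier to attach the result to a SIEM alert so that an analyst can see why a login was flagged.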
Combining the information gleaned from machine learning models with the log and event data SIEM software collects means that when anomalies arise in user behavior, you can sniff them out much more effectively.
Machine learning does more than help identify stealthy attacks like APTs. When combined with SIEM, it can also provide predictive analytics, in which intelligent analysis of historical and present data helps predict and prevent future attacks on your IT systems.
Wrap Up
SIEM systems need to evolve to cope with the ever-increasing sources of heterogeneous data passing through enterprise systems. This data is relevant in a security context and could prove crucial in helping to combat attacks.
Machine learning and big data infrastructure both have important roles to play in terms of helping SIEM evolve so it can detect the range of complex cybersecurity threats faced by modern organizations while still fulfilling its much-needed primary functions.

