As is the case with so many business intelligence tools, a Security Information & Event Manager (SIEM) is only as effective as the data it has access to for analysis. A SIEM platform that only has access to some of the relevant security data will be unable to adequately detect all threats, and will also be less capable of providing the analysis required to adequately investigate those few threats that are identified. For example, performing a root cause analysis without knowing all of the potential threat vectors will result in forensic gaps that either miss potential paths via which an attack might propagate, or provide investigative 'dead ends' that waste valuable time during the forensic process. When you consider the necessary context within which network forensics and regulatory compliance must be evaluated, best practices teach the 'more is better' guide for data collection; any and all information that adds greater perspective to an individual event helps to identify potential risk exposure, reduces unnecessary threat propagation and provides improved visibility to assets and applications throughout the enterprise that may require additional safeguards to ensure their protection.
~ Denise Dubie, Network World
More data may be better, but unfortunately it may not be possible to collect all data. There are virtually limitless sources of relevant data throughout each and every enterprise; event and log sources ranging in scope from security appliances (firewall and IDS/IPS platforms) and network infrastructure (switches and routers) to hosts and application servers. While most successful SIEM installations begin with grandiose plans for mass-scale event collection and analysis, all too often the well-intentioned security professional finds that with each additional data source processed, the infrastructure hosting the analysis platform suffers the burden of increased storage, elevated application processing overhead and exponentially more complex database administration requirements. The unfortunate reality is that most SIEM deployments are constrained not by the resourcefulness of their operators or by the availability of relevant data sources for greater context, but rather by the scalability and performance attributes of the host environment on which the analytics must be performed. For every new data source collected, additional resources must be allocated for the processing and storage of its events, flows or logs.
~ J. Michael Butler, SANS
Determining which data sources to collect for use in a SIEM platform can be one of the most difficult decisions to make, both during the initial deployment and throughout the lifecycle of the platform. A compromise needs to be made, weighing the value of each data source against the impact of that data on collection, storage, and reporting performance. To adequately benchmark a SIEM, first determine which data sources are important, and then assess the total performance impact that they will incur. For these purposes, please refer to the SANS whitepaper, "Benchmarking SIEM." This document outlines how to establish benchmarks based on expected average and peak event rates, and the total event volumes that must be analyzed over time.
Log files are generated by almost every network device, and in many cases regulations such as PCI, HIPAA, FISMA, and others impose specific requirements on how logs must be collected and stored. When considering logs from a data analysis standpoint, keep in mind that half of the work may already be done by a Log Management system, such as NitroView ELM. Deciding which logs will be available for analysis, however, requires an understanding of the total volume of collected log files, as well as the rate at which they are generated.
While log files are typically written in real time, they may or may not be made available to a SIEM or other device in real time. For example, syslog messages can be sent directly to a SIEM as they are generated, or log files can be batch processed. In either case, the rate at which the logs are produced should be considered. "Logs" and "events," therefore, can be considered the same for purposes of evaluating the total impact on a SIEM.
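As a minimal sketch of treating batched logs and real-time events alike, the snippet below converts one batch of log entries into an equivalent average event rate. The function name and the sample figures are illustrative assumptions, not values from the SANS benchmark.

```python
def batch_event_rate(event_count: int, batch_interval_seconds: float) -> float:
    """Average events per second represented by one batch of log entries."""
    if batch_interval_seconds <= 0:
        raise ValueError("batch interval must be positive")
    return event_count / batch_interval_seconds


# Hypothetical example: 5,400 log entries delivered to the SIEM once per hour.
hourly_batch_eps = batch_event_rate(5_400, 3_600)
print(f"{hourly_batch_eps:.2f} events per second")  # prints "1.50 events per second"
```

Note that a batch still arrives all at once, so the collector must absorb the full batch size even when the average rate is low.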
Host Logs -- include Windows logs and local operating system logs from end-user workstations. Host logs can indicate local login failures, unusual application usage, anti-virus warnings, attempted privilege changes, network access errors, and a number of other indicators that could be symptoms of a larger threat (either internal, from a legitimate user, or external, through the compromise of that host by a hacker). Important information obtained from host logs might include: IP address, host name, username, and any number of relevant events.
Impact: The impact of host logs is typically low, anywhere from 0.001 to 0.15 events per second (note: this includes increased DNS activity, which is collected from DNS servers; see note 1).
Server Logs -- like host logs, server logs can indicate any number of symptoms that may be relevant to much larger security concerns. Because servers by definition provide shared information or applications, they are typical targets of an attack, which makes their logs more important forensically and more relevant to security analysis. Important information obtained from server logs might include: IP address, host name, username, application name, and any number of relevant events.
Impact: Again, the impact of server logs is typically low, anywhere from 0.12 to 21.8 events per second for *nix, and 0.6 to 218 events per second for Windows.
Database Logs -- Servers and applications ultimately depend upon some sort of stored information, and it is that information which is typically the goal of a hacker. Database activity logs should be considered separately because of their high importance. Database logs can indicate successful and unsuccessful data access, user privilege changes, and any number of other highly relevant activities. Important information obtained from database logs might include: username, application, data access, and any number of relevant events.
Impact: Database activity is typically measured in events/second or transactions/second. Expect 20 transactions per second (with compression) per database server during peak hours. Session information may increase this value.
NOTE: Database logs can be altered if the database server is compromised (or if data is stolen by a privileged administrator). For extremely sensitive data, such as Personally Identifiable Information (PII) or credit card information, a Database Activity Monitor (DBM) such as NitroView DBM should be used. Important information obtained from a DBM includes the same types of information and events that can be found in database audit logs, but with greater trust, and often with greater detail. A DBM might be able to provide database server activity, privilege escalations, session information, and other information that may not be logged by the database.
Application Logs -- Often, an application is simply the interface to a database, and as such, database logging and/or the use of a DBM can overlap with the collection and correlation of application-specific logs. However, in many cases, application logs can also provide valuable information for linking network and event activity, especially in the case of web server logs. Important information obtained from application logs might include: username, relevant database(s), database username(s) and account information, etc.
Impact: Application logs vary widely by application. Consult the application documentation for specific log file sizes and expected volume.
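The per-source rates quoted above can be combined into a rough aggregate estimate. The following is a sketch only: the device counts are hypothetical, the class and function names are illustrative, and the database entry uses the quoted peak-hour figure for both ends of the range as a simplification. Application logs are left out because no rate is given for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    name: str
    low_eps: float   # low end of the quoted events-per-second range
    high_eps: float  # high end of the quoted events-per-second range
    count: int       # number of devices of this type (hypothetical)


def aggregate_eps(profiles):
    """Return (low, high) total events per second across all sources."""
    low = sum(p.low_eps * p.count for p in profiles)
    high = sum(p.high_eps * p.count for p in profiles)
    return low, high


# Rates are taken from the text; device counts are assumptions.
inventory = [
    SourceProfile("host", 0.001, 0.15, 500),
    SourceProfile("*nix server", 0.12, 21.8, 20),
    SourceProfile("Windows server", 0.6, 218.0, 10),
    SourceProfile("database server", 20.0, 20.0, 4),  # peak-hour figure only
]

low_total, high_total = aggregate_eps(inventory)
print(f"Estimated log load: {low_total:.1f} to {high_total:.1f} events per second")
```

Even with modest device counts, the high end of the range is dominated by a handful of busy servers rather than by the large population of hosts.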
Because a Log Management system (typically responsible for encryption and storage of raw log files) and a Log Analysis system (typically a SIEM, responsible for parsing the log and providing correlation and analysis of its contents) handle logs differently, they are often kept as separate systems. Because of this, "events per second" evaluation is often considered to be less important to the collection of log files. However, as Log Management, Log Analysis and SIEM functionality converge, event rates (per second) and event load (over time) should be considered.
In some cases, such as with NitroView ELM and NitroView ESM, these functions are tightly integrated, offering greater value when used together than if they were used as stand-alone point solutions. In this case, log management functions are supplemented with an initial analysis of each log file, providing alerts to NitroView ESM when something of interest is detected. NitroView ESM, if allowed to directly collect logs, will parse each log for more robust analysis and reporting. However, because not all logs are as important to security analysis as they are for compliance, it doesn't always make sense to manage a particular log with a SIEM when it is already being managed by a Log Management system. In these cases, limiting NitroView ESM's exposure only to those logs that are suspect in some way (as determined by the log management system) can improve the overall security capabilities of the SIEM (by adding additional, relevant data), while reducing the total volume of log data that must be parsed and stored for analysis by the SIEM.
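The idea of exposing the SIEM only to suspect logs can be sketched as a simple filter in front of the analysis platform. This is an illustration under stated assumptions, not a description of how NitroView ELM or ESM work internally: the marker strings and function names are hypothetical, and a real log management system would apply far richer criteria.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical markers of interest; real criteria depend on the environment.
SUSPECT_MARKERS = ("failed login", "privilege change", "access denied")


def simple_marker_check(line: str) -> bool:
    """Flag a log line if it contains any of the hypothetical markers."""
    lowered = line.lower()
    return any(marker in lowered for marker in SUSPECT_MARKERS)


def forward_suspect_logs(
    logs: Iterable[str], is_suspect: Callable[[str], bool]
) -> Iterator[str]:
    """Yield only the log lines that should be passed on to the SIEM."""
    for line in logs:
        if is_suspect(line):
            yield line


sample = [
    "Jan 1 sshd: Failed login for root",
    "Jan 1 cron: job completed",
    "Jan 1 kernel: ACCESS DENIED to /etc/shadow",
]
forwarded = list(forward_suspect_logs(sample, simple_marker_check))
print(f"{len(forwarded)} of {len(sample)} log lines forwarded")  # prints "2 of 3 log lines forwarded"
```

All lines would still be retained by the log management system for compliance; only the forwarded subset adds to the parsing and storage load of the SIEM.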
Network Data is any information provided by a network switch, a router, or in some cases a dedicated network probe (e.g., the NitroGuard IPS includes flow collection in addition to in-line intrusion prevention). Network data involves network communications (flows), access control, or other network-centric information. "Flows," for the sake of comparison, can be considered "events concerning the network," and should be tallied along with other events. Flows typically occur in real time, and at relatively high rates.
Flows -- Network flow information includes information relevant to any network conversation. At a minimum, a "flow" should include source and destination pairs for the MAC, IP, and port addresses, as well as the time/duration of the conversation and the amount of information transferred (in bytes). Flow information is extremely important when piecing together the vector (or vectors) of a particular attack. One important example is a virus outbreak: once a machine is infected, the destination IP address of every subsequent conversation from that machine is exposed and at risk. While Intrusion Prevention Systems (IPS) can often detect and block a virus, an infected machine inside the network may not communicate through the IPS, firewall, or other security device, masking subsequent infections from detection. Only network flow data, as it relates to one or more events, allows the virus to be tracked along its entire vector. Important information obtained from network flow data includes: Source IP, Dest IP, Source Port, Dest Port, Duration, Bytes, as well as potential anomalies that may be detected (traffic spikes, etc.).
Impact: Flow data accumulates very rapidly, creating new flows for every network conversation. Expect rates of 55 to 75 flows per second using NetFlow.
Router Logs -- Outside of flow information, many network devices (especially routers) keep logs that can prove useful for security analysis. Of particular concern are logs containing information about Access Control Lists (ACLs). Router logs might also be useful in determining malicious activity performed from a compromised router, such as man-in-the-middle attacks, spoofing, etc.
Impact: Cisco IOS will generate logs at a rate of about 0.1 to 0.7 events per second.
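The virus outbreak example in the discussion of flows can be sketched as a walk over flow records in time order: any host contacted by an infected or exposed machine after that machine's own exposure is treated as exposed. The record layout below is a hypothetical subset of the flow fields listed above, and the function is an illustration rather than the method used by any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    start_time: float  # seconds since some reference point
    src_ip: str
    dst_ip: str
    dst_port: int
    byte_count: int


def exposed_hosts(flows, infected_ip, infected_at):
    """Map each potentially exposed host to its earliest exposure time."""
    exposure = {infected_ip: infected_at}
    # Processing flows in time order means the first qualifying flow to a
    # host is also the earliest time at which that host was exposed.
    for flow in sorted(flows, key=lambda f: f.start_time):
        src_time = exposure.get(flow.src_ip)
        if src_time is not None and flow.start_time >= src_time:
            exposure.setdefault(flow.dst_ip, flow.start_time)
    return exposure


# Hypothetical flow records for illustration.
flows = [
    Flow(10.0, "10.0.0.1", "10.0.0.2", 445, 2_048),
    Flow(5.0, "10.0.0.2", "10.0.0.3", 445, 1_024),   # before 10.0.0.2 was exposed
    Flow(20.0, "10.0.0.2", "10.0.0.4", 445, 4_096),
    Flow(30.0, "10.0.0.5", "10.0.0.6", 80, 512),     # unrelated conversation
]
print(exposed_hosts(flows, "10.0.0.1", infected_at=0.0))
```

Flows that share an identical timestamp are processed in their input order here, so a real implementation would need a finer-grained tie-break.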
Event Data is any information provided by a specific security appliance or security software solution. "Events," in this case, are alerts generated by these devices. They typically occur in real-time, and at varying rates that must be considered when planning a SIEM deployment.
Security Device Alerts -- Security devices include Intrusion Prevention Systems (IPS), firewalls, Unified Threat Managers (UTM), anti-virus solutions, and other devices. The collection and management of all security device alerts should be considered mandatory. Important information obtained from security device alerts includes: intrusion attempts, malware, viruses, portscans, trojans, exploits, and many other highly relevant events.
Impact: The event rates of security devices will vary widely based on the device, the network, and the types of traffic that are typical on that network. However, for sizing purposes, you can expect events from an Internet-facing firewall to range from 5 events per second to over 800 events per second during an attack. An IDS or IPS device will likewise average a fairly small amount of event data (0.6 events per second) under normal conditions, but may produce as many as 2400 events per second during an attack.
Security Device Logs -- Security device logs are essentially the same as security device alerts, only instead of being collected in real time (where the device sends the alert directly to the SIEM for analysis as it occurs), log files containing spans of event data are made available. This structure can help to reduce the strain on a SIEM caused by devices which produce very high event rates, at the cost of real-time notification (note: event correlation looks for patterns over a period of time n. Therefore, event logs need to be collected at a rate
Agents -- In some cases, agent-based probes may be used in place of network-based security devices. Agents are typically highly specialized, providing intrusion detection, database monitoring, or some other focused activity on a specific host. Agents are installed on specific servers or hosts, and unlike network-based devices, may be widely distributed throughout a network for relatively low cost. Agents represent broad visibility into the infrastructure and are therefore valuable to overall security management. However, because of their distributed nature, they may result in larger numbers of total events, especially when deployed widely.
Impact: The impact of a host agent is typically similar to its network-based counterpart. In the case of database monitoring, the volume may increase slightly, as the agent may have visibility to more information. In the case of an IDS or IPS, host agents typically generate a lower number of events per second because they are monitoring a single host rather than a network segment. At the same time, the total number of agents must be considered, which can make the total event load as large as, if not larger than, that of a network IPS.
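To illustrate how many low-volume agents can add up to more than a single network device, the sketch below multiplies a per-agent rate by the number of deployed agents and compares the result with the 0.6 events per second average quoted above for a network IDS or IPS. The per-agent rate and agent count are hypothetical assumptions chosen only for the arithmetic.

```python
NETWORK_IPS_AVERAGE_EPS = 0.6  # average IDS/IPS rate quoted in the text


def total_agent_eps(agent_count: int, eps_per_agent: float) -> float:
    """Combined events per second produced by all deployed agents."""
    if agent_count < 0 or eps_per_agent < 0:
        raise ValueError("agent count and per-agent rate must be non-negative")
    return agent_count * eps_per_agent


# Hypothetical deployment: 200 host agents at 0.05 events per second each.
agent_load = total_agent_eps(200, 0.05)
print(f"Agents: {agent_load:.1f} eps, network IPS: {NETWORK_IPS_AVERAGE_EPS:.1f} eps")
print("Agents exceed network IPS load:", agent_load > NETWORK_IPS_AVERAGE_EPS)
```

The same multiplication applies to peak rates, which matter more for sizing than the averages used here.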
Application meta-data refers to information about how an application is being used, and what it is doing. The concept of application meta-data is considered separately from what is made available by in-line application monitoring (e.g., a layer-7 IPS), or from application logs. This is because obtaining meta-data requires full application decoding in real-time. Especially if this information is going to be stored for analysis or for application-replay, it almost immediately presents issues in terms of scale. Because of this, a separate Application Monitoring system is typically required, which will bear the burden of application monitoring, the capture and decoding of application data and meta-data, and (potentially) the replay of application events. Application data is included as a potential data source for SIEM because certain SIEM solutions, such as NitroView ESM, do support the performance and scalability required to collect and analyze application sessions. In addition, using application data within a SIEM provides significant value, and is often worth the extra consideration that should be taken in terms of sizing & deployment requirements. Benefits include fraud detection, data leakage protection, and improved visibility into how sensitive data is being accessed and manipulated--useful for both information security and regulatory compliance.
To properly benchmark your SIEM's capability to support your required data source(s), you must consider several metrics against your specific goals for SIEM. For example, if using SIEM solely for compliance, or solely for incident response, certain performance metrics should be weighted more heavily. To support the "complete" functionality of SIEM, all metrics should be considered a hard requirement.
These metrics include:
Application Meta-Data
Benchmarking Security Information & Event Management (SIEM)
Fortunately, the SANS Institute has done work to help establish realistic benchmark requirements. Their report "Benchmarking Security Information & Event Management" provides baseline network event rates as well as a means to extrapolate those results so that they will be more relevant to your specific network. What we see from this research is that even in mid-sized networks, event loads can become considerable, and SIEM performance should always be considered prior to implementation.
"Benchmarking Security Information & Event Management," SANS, Feb. 2009
This report is available online from NitroSecurity at: nitrosecurity.com/information/whitepapers or from the SANS Institute at sans.org/reading_room/analysts_program
The entire discussion of how to select the right data sources for management is necessary because there is too much data to manage, both in terms of system performance and of a security analyst's ability to make sense of the results. The job of a SIEM, after collecting sufficient volumes of data at sufficient rates, is to analyze that data to produce usable, actionable information for a security analyst. This is done in two ways:
~ David Swift, SANS Institute
Either method is useful on its own: the first simplifies the detection of an incident or threat; the second makes it easier for a security analyst to detect unknown threats (for which a correlation rule does not exist; see note 2) and to respond to them.
This highlights the final consideration of selecting data sources: the ability to report on that data as quickly as possible. Running reports should be fast enough to support manual correlation of all of the collected data. However, as data sets grow, the time required to run a query against the data set increases, often to minutes or even hours. To produce actionable results on 90 days of collected security data (see note 3), the SIEM must be able to analyze and report on that data within 3 to 5 minutes. In a mid-sized network, that could mean producing multiple reports against 1.2 billion events (see notes 4 and 5).
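The figures in this paragraph imply some simple arithmetic, sketched below. The function names are illustrative, and the scan-rate calculation assumes the worst case in which a report must touch every stored event; indexing or pre-aggregation would lower that requirement.

```python
SECONDS_PER_DAY = 86_400


def average_eps(total_events: int, days: int) -> float:
    """Average events per second implied by a total volume over a period."""
    return total_events / (days * SECONDS_PER_DAY)


def required_scan_rate(total_events: int, report_seconds: int) -> float:
    """Events per second a report must process to finish in the given time."""
    return total_events / report_seconds


TOTAL_EVENTS = 1_200_000_000  # 90-day volume quoted in the text

print(f"Average collection rate: {average_eps(TOTAL_EVENTS, 90):.2f} eps")
print(f"Scan rate for a 5-minute report: {required_scan_rate(TOTAL_EVENTS, 5 * 60):,.0f} eps")
print(f"Scan rate for a 3-minute report: {required_scan_rate(TOTAL_EVENTS, 3 * 60):,.0f} eps")
```

Under these assumptions, a report over the full 90 days must process events several orders of magnitude faster than they were collected, which is why reporting performance deserves separate attention from collection performance.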
Event and flow volumes are growing, producing ever-greater quantities of security data that must be collected, normalized, analyzed, correlated and stored for forensic discovery or compliance reporting. Relevant information can be found from a variety of sources, and in a variety of formats. Ideally, all information would be collected; it would be retained forever; and it would be instantly accessible for analysis and reporting. Realistically, each source of security information must be assessed for its importance to the overall goals of information security. Each source will generate data at a given rate, and those events, over time, will accumulate at a given rate, creating predictable volumes of information that will need to be correlated and analyzed. Depending upon the specific security and compliance requirements within your organization, certain information sources will provide greater value than others; the determination of which data sources can and should be collected and stored is the first step in a successful SIEM deployment. Once it has been determined what information needs to be collected and stored, the next step is to estimate the performance impact of that information, so that it can be paired with a SIEM platform capable of managing the total event load.
1 Event-per-second baseline data used in this document is sourced from the SANS Institute's research published in the document "Benchmarking SIEM," February, 2009. Peak values represent event rates during an incident.
2 For this reason Manual Correlation is sometimes referred to as "Zero Day" Correlation.
3 The PCI Data Security Standard (PCI DSS) requires that audit trail history be retained for at least one year, with a minimum of three months (roughly 90 days) immediately available for reporting and analysis.
4 Using the baseline network as defined in the SANS paper, "Benchmarking SIEM," 1.2 billion events will be collected over 90 days.
5 Please refer to the NitroSecurity whitepaper, "The Fundamental Requirements of SIEM" for a more detailed discussion of SIEM performance requirements.