Network Computing is part of the Informa Tech Division of Informa PLC

DIY Big Data Security Analytics: OpenSOC

Big data analytics provides scalable, high-performance analysis of large data sets. It allows large volumes of data to be examined to discover patterns, behaviors and correlations that can be used to drive decision making. Typically focused on business applications, big data analytics is now also being used for security event monitoring and threat detection.

Today, several vendors offer big data-centric products that consume network and device telemetry, such as asset information and logs, to give users a correlated view of their network activity and to identify threats. Cisco Systems has one such system: Managed Threat Defense, which uses an open source technology known as OpenSOC, short for open security operations center.

OpenSOC, originally developed by Cisco, defines a DIY framework for building a real-time, big data-centric analysis and storage system using parallel computational tools on a scalable Hadoop architecture. Building an analytics solution in-house is non-trivial and requires knowledge of data science and complex systems. The OpenSOC framework is a starting point for understanding how to build your own solution. The technology behind OpenSOC consists of:

Telemetry capture layer: Apache Flume

Flume agents aggregate telemetry data from dynamic and static sources through customized parsers (e.g. Syslog, NetFlow, CSV files). Each unit of data is an event that moves from a Source to a Sink via a Channel within an agent. Sinks identify the next step in the processing path, for example a Kafka topic.
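The Source, Channel and Sink flow described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not Flume's own API: the simplified syslog pattern, `parse_syslog` and `run_agent` are hypothetical names invented for this example.

```python
import queue
import re

# Hypothetical, simplified syslog pattern: "<PRI>HOST MESSAGE".
SYSLOG_RE = re.compile(r"^<(?P<pri>\d+)>(?P<host>\S+) (?P<msg>.*)$")


def parse_syslog(line):
    """Custom parser: turn one raw line into an event dict, or None if it does not match."""
    m = SYSLOG_RE.match(line)
    if m is None:
        return None
    pri = int(m.group("pri"))
    return {
        "facility": pri // 8,  # syslog PRI = facility * 8 + severity
        "severity": pri % 8,
        "host": m.group("host"),
        "message": m.group("msg"),
    }


def run_agent(raw_lines, sink):
    """Move events Source -> Channel -> Sink for one toy agent."""
    channel = queue.Queue()
    # Source: parse raw telemetry and put well-formed events on the channel.
    for line in raw_lines:
        event = parse_syslog(line)
        if event is not None:
            channel.put(event)
    # Sink: drain the channel and hand each event to the next hop.
    while not channel.empty():
        sink(channel.get())


delivered = []
run_agent(["<34>fw01 denied tcp 10.0.0.5:4431", "not a syslog line"], delivered.append)
print(delivered)
```

Here the sink is simply `delivered.append`; in a deployment like the one described, the next hop would be a Kafka topic.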

Data bus: Apache Kafka

Kafka is a distributed messaging system organized into user-defined topics, each specific to a type of message published by Producers. The output of a Flume sink is published to a topic, which provides an ordered, normalized sequence of messages that Brokers replicate and continuously append to a commit log in the Kafka server cluster. Consumers, such as Storm, subscribe to topics and process the published messages.
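The commit-log idea can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions (one partition, no replication, no brokers) and does not use any Kafka client library; `ToyTopicLog` and the consumer names are hypothetical.

```python
from collections import defaultdict


class ToyTopicLog:
    """In-memory stand-in for a single-partition topic: an append-only list of messages."""

    def __init__(self):
        self._log = []
        self._offsets = defaultdict(int)  # consumer name -> next offset to read

    def publish(self, message):
        """Append a message to the log and return its offset."""
        self._log.append(message)
        return len(self._log) - 1

    def poll(self, consumer, max_messages=10):
        """Return unread messages for this consumer, in order, and advance its offset."""
        start = self._offsets[consumer]
        batch = self._log[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch


topic = ToyTopicLog()
for msg in ("event-a", "event-b", "event-c"):
    topic.publish(msg)

first = topic.poll("storm", max_messages=2)   # first two messages
second = topic.poll("storm")                  # picks up where "storm" left off
archiver = topic.poll("archiver")             # independent consumer sees the full log
```

Because messages are appended rather than removed, each consumer tracks its own position and can read the same ordered sequence independently.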

Stream processor: Apache Storm

Storm processes streaming data in real time. It consumes messages from Kafka topics via Spouts, which emit them as streams; Bolts then apply functions to these messages to produce events. How Spouts and Bolts are wired together for each stream type is defined in a Topology. In OpenSOC, Bolts can be used to apply analytics such as machine learning or to generate enriched events by adding intelligence information.
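As a rough illustration of the Spout, Bolt and Topology roles, the following sketch chains Python generators. It is not Storm's API, and the `THREAT_INTEL` table, field names and function names are assumptions made up for the example.

```python
# Hypothetical threat-intelligence table used only for this illustration.
THREAT_INTEL = {"203.0.113.7": "known-scanner"}


def spout(messages):
    """Spout: emit each incoming message into the topology as a separate event."""
    for message in messages:
        yield dict(message)


def enrich_bolt(stream):
    """Bolt: add an intelligence tag to each event based on its source IP."""
    for event in stream:
        event["intel"] = THREAT_INTEL.get(event["src_ip"], "unknown")
        yield event


def alert_bolt(stream):
    """Bolt: flag events whose source IP matched the intelligence table."""
    for event in stream:
        event["alert"] = event["intel"] != "unknown"
        yield event


def run_topology(messages):
    """Topology: wire spout -> enrich_bolt -> alert_bolt."""
    return list(alert_bolt(enrich_bolt(spout(messages))))


events = run_topology([
    {"src_ip": "203.0.113.7", "dst_port": 22},
    {"src_ip": "198.51.100.4", "dst_port": 443},
])
```

Each bolt does one small job, so enrichment steps can be added or reordered by changing only the wiring in `run_topology`.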

Real-time index and search: Elasticsearch

Events are moved from Storm to Elasticsearch, which indexes and stores them, allowing for real-time correlation and analytics methods such as anomaly detection.
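Why indexing makes correlation fast can be shown with a toy inverted index, which maps each field value to the events that contain it. This is a conceptual sketch only; it is far simpler than Elasticsearch and does not use its API. `ToyIndex` and the event fields are hypothetical.

```python
from collections import defaultdict


class ToyIndex:
    """Minimal inverted index: (field, value) -> set of document ids."""

    def __init__(self):
        self._docs = []
        self._postings = defaultdict(set)

    def index(self, doc):
        """Store a document and record each of its field values in the postings."""
        doc_id = len(self._docs)
        self._docs.append(doc)
        for field, value in doc.items():
            self._postings[(field, value)].add(doc_id)
        return doc_id

    def search(self, **terms):
        """Return the documents that match every field=value term."""
        ids = None
        for field, value in terms.items():
            matches = self._postings.get((field, value), set())
            ids = matches if ids is None else ids & matches
        return [self._docs[i] for i in sorted(ids or [])]


idx = ToyIndex()
idx.index({"src_ip": "203.0.113.7", "action": "deny"})
idx.index({"src_ip": "203.0.113.7", "action": "allow"})
idx.index({"src_ip": "198.51.100.4", "action": "deny"})
hits = idx.search(src_ip="203.0.113.7", action="deny")
```

A query intersects small sets of document ids instead of scanning every stored event, which is what makes interactive correlation across fields practical.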

Long-term data store: Apache Hive

Storm feeds into Hive, which provides data summarization and querying using an SQL-like language. For example, Hive can store compressed metadata in indexed tables in ORC format, or raw data in tabular form. Data stored in Hive may also be queried using a MapReduce job.
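To give a feel for the kind of summarization query involved, the sketch below runs a plain SQL aggregate against an in-memory SQLite database. SQLite is used purely as a stand-in so the example is runnable; it is not Hive, and the `events` table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for a long-term event table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (src_ip TEXT, dst_port INTEGER, bytes INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("203.0.113.7", 22, 1200),
        ("203.0.113.7", 22, 800),
        ("198.51.100.4", 443, 5000),
    ],
)

# Summarize traffic per source address, largest talkers first.
summary = conn.execute(
    """
    SELECT src_ip, COUNT(*) AS event_count, SUM(bytes) AS total_bytes
    FROM events
    GROUP BY src_ip
    ORDER BY total_bytes DESC
    """
).fetchall()
conn.close()

for row in summary:
    print(row)
```

The same style of `GROUP BY` aggregation is what an analyst would typically run over months of stored events to build baselines or reports.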

Long-term packet store: Apache HBase

HBase is a scalable and distributed database that supports structured data storage for large data sets such as PCAP tables.
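One way to picture a PCAP table is as a store sorted by row key, where packets for the same flow sit next to each other and can be fetched with a prefix scan. The sketch below models that idea in plain Python; it is not the HBase API, and the row-key layout (`src_ip|dst_ip|timestamp`) is an assumption chosen for illustration rather than OpenSOC's actual schema.

```python
import bisect


class ToyPacketStore:
    """Sorted row-key store standing in for a PCAP table."""

    def __init__(self):
        self._keys = []  # row keys kept in sorted order
        self._rows = {}  # row key -> raw packet bytes

    @staticmethod
    def row_key(src_ip, dst_ip, ts_ms):
        # Hypothetical key layout; zero-padding keeps timestamps in numeric order.
        return f"{src_ip}|{dst_ip}|{ts_ms:013d}"

    def put(self, src_ip, dst_ip, ts_ms, payload):
        """Insert or overwrite the packet stored under the derived row key."""
        key = self.row_key(src_ip, dst_ip, ts_ms)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = payload

    def scan_prefix(self, prefix):
        """Return (key, payload) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        result = []
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            result.append((key, self._rows[key]))
        return result


store = ToyPacketStore()
store.put("10.0.0.5", "203.0.113.7", 1700000000200, b"\x02")
store.put("10.0.0.5", "203.0.113.7", 1700000000100, b"\x01")
store.put("10.0.0.9", "203.0.113.7", 1700000000150, b"\x03")

flow = store.scan_prefix("10.0.0.5|203.0.113.7|")
payloads = [payload for _, payload in flow]
```

Because keys are sorted, the scan returns one flow's packets in time order without touching unrelated rows.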

Visualization platform: Kibana

Kibana is an open source data visualization platform that provides powerful graphics and the ability to build custom dashboards.

Although ideal for security threat analysis, the OpenSOC framework can be tailored to ingest, analyze and view any type of telemetry for a variety of other business functions. For companies, data scientists, and anyone considering building their own big data solution, OpenSOC is worth a look.
