Know thyself and thy network stuff
Suricata is an OpenSource Network Intrusion Detection System (NIDS), developed by the Open Information Security Foundation (OISF). The project's intention is to secure the network via the inherent transparency of OpenSource.
Many companies have an (N)IDS, with flashing lights, which they bought because a vendor told them "it's awesome"[tm]. Few companies get more value out of an IDS than a tick-box in an assessment.
Suricata can be the one thing, and/or the other. A valuable IDS which contributes to situational network awareness. Or just a tick-box in an assessment. Or both.
"Suri" is much like Snort: it's a daemon that needs network traffic and rules to generate alert messages about possible network security issues. Easy, huh?
Open Source Network Intrusion Detection in an enterprise security setting
Commercial NIDS appliances can be difficult to deal with as well. But usually the management is simpler, and there is more support. Not necessarily more support in regards to the NIDS system itself, but more related to the enterprise requirements.
Most of the traffic that matters in a modern enterprise network is SSL/TLS encrypted. Suricata has no support for SSL/TLS offloading. If you don't have an appliance that can do the offloading for you, there is your problem.
I have setup Suricata with F5 BigIP ClonePools and these newish Forcepoint (Websense) mirror ports. It's possible, if you are up for it. But with Suri it's not as easy as it could be. It works, but it's not nice.
I have integrated Suricata with SIEMs, like IBM QRadar. This can be quite difficult, because you need to build upon Suricata's Snort compatibility when it comes to the logging. That means you are limited to Syslog plus maybe some of the JSON fields if you are good at RegEx.
At the end of the day every SIEM, even something lightweight like Splunk or Sumo Logic, requires you to know RegEx. If you want to work with Suricata rules seriously, you need to be a master of RegEx.
If you want to get there, start practicing soon. Because the hard way pays off big time.
What not to do with an OpenSource IDS
I don't get the obsession "OpenSource IDS people" have with Barnyard2 and unified2. Some even pipe unified2 through some crazy parsers into MySQL.
Such a tool stack is unreliable and complex in my experience. Instead of Barnyard2 I use Suricata's EVE.json - via Syslog. This includes the information fields, like the matched decoded payload, and other technical data. I don't see any value in NIDS alerts without decoded payloads. In fact I see only limited use even for these payload snippets, because in 99.9% of all cases they are too limited to perform a root cause analysis.
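For reference, a minimal sketch of the EVE output section in suricata.yaml that ships alerts - including the decoded payload fields - via Syslog. The option names follow Suricata's eve-log schema; the values here are examples, not my production config:

```yaml
outputs:
  - eve-log:
      enabled: yes
      filetype: syslog          # send EVE JSON to the local Syslog daemon
      identity: suricata
      types:
        - alert:
            payload: yes            # base64-encoded matched payload
            payload-printable: yes  # printable payload snippet
```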
Are you an OpenSource security guy?
Don't underestimate the complexity of an OpenSource NIDS deployment and the management effort. I have been running multi-Gigabit Suricata sensors for a couple of years. The effort is not in the Linux setup. You can fire up an Ubuntu Server box on a Dell or IBM server in the data center in ~30 minutes.
The effort is in the monitoring and analysis of the Suricata sensor, and in determining if it's healthy and efficient. That plays directly into the results of the NIDS. Packet loss can be situational: related to a DDoS or to a deployment. But if the sensor's packet acquisition is not working properly, your alerts will be garbage.
You should have sound Linux System Engineering skills as well as Network Forensics abilities in your team. And good insights into Security Management. Like: what is the business case again? Right: getting network security metrics. Enabling Incident Response. Preventing data loss. How do you do that with Suricata? Can your Incident Response guys access the ring buffer on the relevant IDS sensor with their Bash skills? Meh... maybe.
Workflow 1 : understand the difference between the receiver application and the NIC driver
ethtool is your friend
NIC drivers matter for an NIDS, whether it's Solarflare or Intel IXGB(E). Personally I chose to use Intel Corporation 82599ES PCIe adapters in IBM System X standalone servers. But it really doesn't matter if it's a Supermicro box or a Dell server. What matters is whether you can set the appropriate driver options.
Almost all common Linux distributions ship the IXGBE driver as an included module, which is useful for common tasks. Like hosting a virtualization environment or for a storage server.
Many NIC driver options cannot be set via ethtool with the commonly shipped IXGBE driver. For an NIDS this is not good. So it's time for some kernel hacking to open this up.
Know your Linux kernel - and its modules
For 10 GbE it can look like this:
- Network signals go from the cable to an SFP, which essentially is a transceiver
- From the SFP we go to the NIC. We configure the NIC hardware via the driver (IXGBE). The IXGBE driver sets the RX / TX buffers on the NIC, and the IRQs as well
- With "RX OS" I mean the userland and library components that interact with the network data streams
- The receiver application is Suricata, BroIDS, tcpdump etc. - preferably something multi-threaded
NIC driver considerations
In my opinion you need to replace the IXGBE drivers in the vanilla Linux kernel. I use the PF_RING ZC drivers. These have features you do not commonly find in NIC drivers:
- they are NUMA enabled, if you want to use that (PF_RING queues will do this for you)
- they export a lot of options to ethtool (that is why we are doing this in the first place)
- we can still use af_packet in the receiver application (Suricata), but af_packet is meh: even though it is technically similar to PF_RING, it performs worse in my cases
- and optimally you use multi-threaded RX queues with PF_RING with a multi-threaded Suricata (or BroIDS) receiver app
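The multi-threaded RX setup maps into the pfring capture section of suricata.yaml roughly like this - a sketch: the cluster values come from my command line later in this article, and the thread count is an assumption that should match your RX queue count:

```yaml
pfring:
  - interface: enp22s0f1
    threads: 12                 # match the NIC RX queue count
    cluster-id: 99
    cluster-type: cluster_flow  # per-flow load balancing across threads
```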
It's important to understand that these drivers are available on GitHub as OpenSource distributions. But as far as I know a single developer maintains them. I have not security-reviewed the code.
Build the Linux kernel AND Suricata for high-performance packet acquisition
In order to be able to replace the Linux NIC driver I build the commonly distributed IXGBE in the Linux kernel as a module.
You can go to /usr/src/linux or wherever your sources are. Issue a make menuconfig and wait until the menu appears. Now press / like in Vim and search for "ixgbe". Then navigate to Device Drivers -> ... as indicated by the results.
I build these as modules because I want to be able to fall back. I boot up the system and load the PF_RING ZC drivers manually. That is possible because the 10 GbE interfaces aren't used for IDS sensor management. Otherwise you couldn't unload them directly; you'd have to do that via IMM or IPMI.
Use an alternative IXGBE driver for more control
I kick in the following:
rmmod ixgbe # the official one goes out
insmod /root/Source/pf_ring/PF_RING/drivers/ZC/intel/ixgbe/ixgbe-zc/src/ixgbe.ko FdirPballoc=3 vxlan_rx=0
That is not the cleanest solution. But if you want to auto-load drivers at boot for a PCIe device, some servers cause issues. I do this after the boot process is finished. That works for me[tm].
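If you still want this to happen automatically after the boot process has finished, a systemd oneshot unit is one way. This is a sketch with a hypothetical unit name and the module path from above; the `-` prefix lets the rmmod fail gracefully if the stock driver isn't loaded:

```ini
# /etc/systemd/system/pfring-zc-ixgbe.service (hypothetical unit name)
[Unit]
Description=Swap in the PF_RING ZC ixgbe driver
After=network.target

[Service]
Type=oneshot
ExecStart=-/sbin/rmmod ixgbe
ExecStart=/sbin/insmod /root/Source/pf_ring/PF_RING/drivers/ZC/intel/ixgbe/ixgbe-zc/src/ixgbe.ko FdirPballoc=3 vxlan_rx=0
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```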
With the alternative IXGBE driver we have more options for ethtool. As promised:
ifconfig enp22s0f1 up
ifconfig enp22s0f1 mtu 1534
ifconfig enp22s0f1 promisc
ethtool -K enp22s0f1 tso off
ethtool -K enp22s0f1 gro off
ethtool -K enp22s0f1 ufo off
ethtool -K enp22s0f1 lro off
ethtool -K enp22s0f1 gso off
ethtool -K enp22s0f1 rx off
ethtool -K enp22s0f1 tx off
ethtool -K enp22s0f1 sg off
ethtool -K enp22s0f1 rxvlan off
ethtool -K enp22s0f1 txvlan off
ethtool -N enp22s0f1 rx-flow-hash udp4 sdfn
ethtool -N enp22s0f1 rx-flow-hash udp6 sdfn
ethtool -C enp22s0f1 rx-usecs 1 rx-frames 0
ethtool -C enp22s0f1 adaptive-rx off
This deactivates features of the hardware chips in the Intel NIC, which we won't need for our receiver application (Suri).
Take a look at the kernel ring buffer via dmesg:
[255492.958094] ixgbe 0000:16:00.1 enp22s0f1: MAC: 2, PHY: 9, SFP+: 3, PBA No: G70XXX-00X
[255492.958098] ixgbe 0000:16:00.1 enp22s0f1: Enabled Features: RxQ: 12 TxQ: 12 FdirHash LRO
[255492.965553] ixgbe 0000:16:00.1 enp22s0f1: Intel(R) 10 Gigabit Network Connection
[255493.031320] ixgbe 0000:16:00.1: registered PHC device on enp22s0f1
[255493.132355] ixgbe 0000:16:00.1 enp22s0f1: changing MTU from 1500 to 1534
[255493.391576] device enp22s0f1 entered promiscuous mode
[255493.397769] ixgbe 0000:16:00.1 enp22s0f1: enabling UDP RSS: fragmented packets may arrive out of order to the stack above
[255493.452194] ixgbe 0000:16:00.1 enp22s0f1: detected SFP+: 3
tl;dr: we have configured the Linux kernel to be modular, and adapted it for the Suricata NIDS. The NIDS is our receiver application. We use ethtool options to adjust the Intel NIC behavior.
Now we take a look at the RX OS and the receiver application.
Hyperscan, PF_RING and GCC flags - fire in the wire
To keep this short and sweet: I use Hyperscan, and it works with the ET Pro rules. I use the PF_RING libs and some Gentoo optimizations. Because that works for me.
Hyperscan: emerge -aq dev-libs/boost && emerge -aq dev-util/ragel, then build with cmake -DBUILD_STATIC_AND_SHARED=1 ../ - otherwise the Suricata rule parser fails.
PF_RING: install the ZC drivers, plus the lib and userland tools, the make install way.
GCC: for an Intel Xeon CPU:
more /etc/portage/make.conf | grep CPU
CPU_FLAGS_X86="aes avx mmx mmxext popcnt sse sse2 sse3 sse4_1 sse4_2 ssse3"
And finally for Suricata:
./configure --enable-unix-socket --enable-python --enable-geoip --enable-pfring --enable-luajit --prefix=/usr/bin --sysconfdir=/etc/ --localstatedir=/var/ (--with-libhs-includes=/usr/local/include/hs/ --with-libhs-libraries=/usr/local/lib/)
Note: for low throughput scenarios you can roll with the ebuild in Portage, maybe. Or apt-get etc.
The command line for the watchdog / init daemon can look like this:
sudo /usr/bin/suricata -c /opt/ids_rules/sensor_configs/.../suricata/suricata.yaml --pfring-int=enp17s0f1 --pfring-int=enp17s0f0 --set mpm-algo=hs --pfring-cluster-id=99 --pfring-cluster-type=cluster_flow --user suri --group suri
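To illustrate what cluster_flow does: PF_RING balances packets across the Suricata worker threads per flow, so both directions of a TCP session land on the same thread. A toy sketch of the idea in Python (not PF_RING's actual hash function):

```python
# Toy per-flow load balancing, illustrating the cluster_flow idea:
# all packets of one flow must map to the same worker thread.
N_THREADS = 12  # e.g. one thread per RX queue

def flow_key(src, dst, sport, dport, proto):
    # Sort the endpoints so both directions of a session hash identically.
    return tuple(sorted([(src, sport), (dst, dport)])) + (proto,)

def worker_for(src, dst, sport, dport, proto):
    return hash(flow_key(src, dst, sport, dport, proto)) % N_THREADS

a = worker_for("10.0.0.1", "10.0.0.2", 44123, 443, "tcp")
b = worker_for("10.0.0.2", "10.0.0.1", 443, 44123, "tcp")  # reverse direction
print(a == b)  # True: same flow, same thread, so stream reassembly works
```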
If you build the receiver application the right way, and fire it up with the right options, you can:
- capture traffic on multi-Gigabit networks without much packet-loss (you can use RX queues and multi-threading)
- inspect the network traffic in real-time (OpenSource Deep Packet Inspection on 10 GbE nets)
Workflow 2 : use OpenSource network measurement tools to determine the right config for the receiver application; Suricata
Measure the MTU and set it correctly in Suricata and on the interface
You might be inclined to just set the MTU of the capture interface to the maximum; e.g. 9000. I don't recommend that, because you will run into issues with the stream reassembly engine in Suricata / any receiver application.
Measure the MTU with tcpstat:
tcpstat -i eno5 -l -o "Time:%S\tn=%n\tavg=%a\tstddev=%d\tbps=%b\tMaxPacketSize=%M\n" 5
Time:1487330908 n=46530 avg=996.32 stddev=647.74 bps=74174027.20 MaxPacketSize=1518
Time:1487330913 n=26477 avg=493.90 stddev=596.93 bps=20923276.80 MaxPacketSize=1518
Time:1487330918 n=136229 avg=1276.56 stddev=518.63 bps=278247740.80 MaxPacketSize=1518
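Note that MaxPacketSize already includes the Ethernet header, so it maps directly to Suricata's default-packet-size. A quick sanity check of the frame math (the 14-byte Ethernet header and the optional 4-byte 802.1Q tag are general assumptions, not measured values):

```python
# Frame size math: link MTU vs. the packet sizes the sensor actually sees.
mtu = 1500       # IP payload capacity of a standard Ethernet link
eth_header = 14  # dst MAC (6) + src MAC (6) + ethertype (2)
vlan_tag = 4     # optional 802.1Q tag

print(mtu + eth_header)             # 1514, Suricata's default-packet-size
print(mtu + eth_header + vlan_tag)  # 1518, matching MaxPacketSize above
```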
The Suricata NIDS sensor config for this in suricata.yaml should be:
# Preallocated size for packet. Default is 1514 which is the classical
# size for on ethernet. You should adjust this value to the highest
# packet size (MTU + hardware header) on your system.
default-packet-size: 1518
And the respective MTU for the network interface should be:
ifconfig eno5 mtu 1518
The check is:
suricata --dump-config | grep default-packet-size
default-packet-size = 1518
The monitoring for this:
My recommendation is to use Scirius. The least amount of work... more on that later.
One indicator of a wrong MTU is dropped packets at the kernel level. That creates a domino effect, and can cause decoder fails in the Suricata engine. If you see a strong trend in these counters, you may have a wrong MTU config.
If the IDS engine reports a strong trend of TCP reassembly gaps, this means that the TCP sessions have not been inspected. That can be related to decoding issues. Decoding issues mean that a protocol cannot get dissected. If the protocol is HTTP via TCP, you will see the count in TCP reassembly gaps AND in Decoder invalid increasing.
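You can watch these counters via Suricata's EVE stats events (or stats.log). A minimal sketch that computes the kernel drop ratio from one stats record - the field names (capture.kernel_packets, capture.kernel_drops, tcp.reassembly_gap) follow the EVE stats schema, the numbers are made up:

```python
import json

# One EVE "stats" event; normally you would read this from eve.json.
sample = """{"event_type": "stats", "stats": {
  "capture": {"kernel_packets": 1000000, "kernel_drops": 2500},
  "tcp": {"reassembly_gap": 12}}}"""

event = json.loads(sample)
cap = event["stats"]["capture"]
drop_ratio = cap["kernel_drops"] / cap["kernel_packets"]
print("kernel drop ratio: %.2f%%" % (drop_ratio * 100))  # 0.25%
print("tcp.reassembly_gap:", event["stats"]["tcp"]["reassembly_gap"])
```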
Generally any monitoring system can give you charts and trends for these values. Like Solarwinds (even via SNMP) or Zabbix. You name it. You use it. You check it.
tl;dr: The workflow must be to measure, to check, and then to configure. Don't configure random values, and don't configure maximum settings "to be on the safe side". This will not help you and most likely cause confusion and chaos.
It's very important that you measure capturing and NIDS stream reassembly continuously. Modern networks are dynamic. Your "coverage" is only as good as the capture quality.
- Packet loss is not inevitable and can be avoided with a workflow that includes monitoring.
- Packet loss can cause domino effects, and limit your NIDS capabilities.
- Suricata logs various statistical properties; you need to understand what they mean in context.
Workflow 3 : PCAPs, not rocket science
Suricata can keep the PCAPs, and split them by size. Just like daemonlogger back in the day.
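A sketch of the pcap-log section in suricata.yaml for a size-split ringbuffer - the option names are Suricata's, the sizes are examples:

```yaml
outputs:
  - pcap-log:
      enabled: yes
      filename: log.pcap   # files become log.pcap.<epoch timestamp>
      limit: 1000mb        # split PCAPs by size
      max-files: 2000      # ring buffer: the oldest file gets overwritten
      mode: normal
      use-stream-depth: no # capture full packets, not just the inspected depth
```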
I like to investigate "complex multi-vector high severity" offenses (compound alerts) with a complete data set to drill down to the root cause. A useful way to do this is to utilize the NIDS ringbuffer in conjunction with logs and threat intelligence.
Storage is no big deal
Let's say a 2-3 TB ringbuffer can keep ~2-3 hours of data. In a RAID 10 that isn't a big storage problem. 4-5 disks fit into a 1 HU rack server. If your server RAID controller supports it, you can even go for a RAID 6. Ringbuffer data isn't business critical; usually.
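The retention claim is easy to sanity-check. Assuming a sustained captured rate of 2 Gbit/s (an assumption; measure yours, e.g. with tcpstat):

```python
# Back-of-the-envelope ringbuffer retention.
ringbuffer_tb = 2.5   # usable ringbuffer size in TB
link_gbit_s = 2.0     # sustained captured traffic in Gbit/s (assumption)

bytes_per_s = link_gbit_s * 1e9 / 8                 # 250 MB/s
hours = ringbuffer_tb * 1e12 / bytes_per_s / 3600.0
print("%.1f hours of retention" % hours)            # ~2.8 hours
```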
Then I simply use ext4, because it's a mature and reliable file-system every Linux admin knows and can fix.
SSD backed RAIDs on Linux
For high-throughput situations (volume based DDoS for example) I use SSDs as a page-buffer with enhanceio. It sits transparently in between the RAID 10 and the SSDs. I have the SSDs in a RAID 0, because the page-buffer is volatile and all I want from the SSDs is their write speed. The only thing you need to keep in mind is that you need to sync out the dirty pages before you reboot the sensor. Otherwise you will end up with a broken file-system. And we don't want that, do we?
Use separate partitions
Suricata will store the PCAPs in the default log directory; something like /var/log/suricata. I softlink this to /data, which sits on the separate SSD-backed RAID 10.
Really? Daily PCAP analysis - that sucks!
The problem with ringbuffer analysis is that you won't do it often if you have to download gigabytes of PCAP data.
The downloads take time, and having that much data on an analysis workstation or laptop is messy. In order to overcome this issue I use an appliance called Cloudshark. You can upload PCAPs from an IDS sensor with curl using the Cloudshark API. The PCAPs get sent to a box, on premise. Then you have a web frontend similar to Wireshark to filter down the PCAP in correspondence with the alert data. These days there are automated functions in Cloudshark for Threat Analysis.
Since Cloudshark is a commercial appliance, it's on them to give you a feature tour. The point is that I do not recommend buying or setting up any IDS that does not give you the data. If a root cause analysis workflow is not possible, it's an amateur system.
My workflow here is:
- Get the offense from QRadar - extract the time-stamp
- Let the IDS sensor push the PCAPs to the Cloudshark appliance (+/- 15m)
- add the links to the PCAPs in Cloudshark as a note to the QRadar offense
You can do that similarly with ArcSight or LogRhythm as well.
Suricata PCAPs have timestamps - you can use them to get the recorded traffic from the time of the offense
Take a short look at this hacky code:
import datetime
import paramiko

file_list = []  # array for the files we want to upload
ids_server_ip = "126.96.36.199" # this is your sensor
# user_name, keyfile and epoch_timestamp are defined elsewhere
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect(ids_server_ip, username=user_name, key_filename=keyfile)
command = "ls /data/log.pcap.*"
stdin, stdout, stderr = ssh.exec_command(command)
for line in stderr.readlines():
    print(line.strip())
for line in stdout.readlines():
    line = line.strip()
    # file names look like /data/log.pcap.<epoch timestamp>
    time_stamp = int(line.split(".")[-1])
    if time_stamp > epoch_timestamp:
        file_list.append(line)
exit_status = stdout.channel.recv_exit_status() # blocking call
if exit_status == 0:
    print(str(datetime.datetime.today()) + ": Command finished successfully. File list is present.")
You can modify the if statement to if ( time_stamp >= epoch_timestamp - (15 * 60) ) and ( time_stamp <= epoch_timestamp + (15 * 60) ) - for a 15 minute window in both directions. I don't want to do the upload via Python, so I loop over the list with lftp. The script is not on the sensor. It remains local.
If you are really interested in the "QRadar -> Offense -> Cloudshark links -> add note" approach, shoot me a mail. QRadar has an API, which makes this easily possible. I am not 100% finished.
- A security analyst can upload ringbuffer PCAPs to an appliance and perform Wireshark-style root cause analysis
- This way Suricata can become part of security incident response, given that you mature the scripts which import the data into the analysis systems. QRadar is a good SIEM example, because it's rooted in the network context.
Workflow 4 : mind your IDS rules - centrally and daily
I use Scirius to manage a globally distributed Suricata IDS sensor network.
Like almost everyone who uses Suricata, I use ET Pro rules with the professional subscription from Proofpoint.
If you import the rules into Scirius it may look like this in the real world:
- These rulesets get managed for all appliances centrally. Or individually if you want that.
- Many of these generic ET rules do not fit my environments. Sometimes the rules have confusing names, which sound like they indicate big issues. In reality they can also cause false positive alerts. At the end of the day Suricata is nothing more than ngrep and RegEx pattern matching.
- Suricata is well suited to supply custom IOCs, from Malware Analysis. You can have as many custom rules as you want. The rule DSL is as powerful as Snort's.
Since Scirius is a commercial project, I can leave it to Stamus Networks to give you a feature tour. It's very useful to me, and I was allowed to test the product for 30 days freely. It works based on Ansible and SSH. The web frontend is a bit sluggish and occasionally the ELK stuff is laggy and slow. But it gets the job done: it rolls out the rules to all appliances and gathers the stats to fine-tune the systems.
Examples of custom Suricata rules / custom IOCs
I think I copied these from a Blackhat paper from 2003... they might work.
`alert tcp $EXTERNAL_NET any -> $HOME_NET any (msg:"CUSTOM: MSSQL SQLi Attempt"; flow:to_server,established; pcre:"/exec\s+(s|x)p\w+/Ui"; threshold:type threshold, track by_src, count 10, seconds 60; classtype:web-application-attack; sid:209007000; rev:1;)`
`alert http $EXTERNAL_NET any -> $HOME_NET any (msg:"CUSTOM: Paranoid XSS Attempt Logging"; flow:to_server,established; pcre:"/((\%3C)|<)[^\n]+((\%3E)|>)/Ui"; threshold:type threshold, track by_src, count 10, seconds 60; classtype:web-application-attack; sid:209003000; rev:5;)`
- You can have enterprise level features like centralized IDS sensor management and continuous monitoring with Scirius Enterprise
- You can get professionally supported rules from Proofpoint, and I have read that even the VRT rules can work with Suricata. I do not use these, because ET Pro officially supports Suricata
- You can supply custom IOCs centrally via Scirius, e.g. network ranges of attackers or payload information like odd User Agents, bad URLs etc.
- You can use LuaJIT for more complex detection rules
The big deal about this is that you can find out what is normal for your networks.
If you have that, make sure that you do not have 100s of IDS alerts each hour. You can adapt the rules, rule-sets and network ranges. And phase out custom IOCs, once they have become outdated.
Workflow 5 : use a watchdog to auto-restart the software in case of crashes
Suricata might crash. That is very unfortunate and should be monitored. You should do a root-cause analysis and determine why that happened. This doesn't mean that you need to face NIDS downtime.
Supervisord can help to keep the NIDS up and running
Supervisord is an OpenSource process control system, which I use with Suricata. It will automatically restart it after a crash.
command=/bin/bash -c "/usr/bin/suricata -c /opt/ids_rules/sensor_configs/.../suricata/suricata.yaml --af-packet=eno5 --af-packet=enp22s0f1 --user suri --group suri not ip6"
- This example will attempt to restart Suricata if it crashed (it's for af_packet and not pf_ring - but the principle is the same for the watchdog)
- The path from which the yaml is loaded is related to the Git workflow
- This also assumes that you forward the Syslog output to a log server and generate alerts on segfaults and crashes.
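Putting it together, the relevant supervisord program section can look like this - a sketch: the program name and retry values are my choices, the command line is the one from above:

```ini
# /etc/supervisord.conf excerpt (program name is an example)
[program:suricata]
command=/bin/bash -c "/usr/bin/suricata -c /opt/ids_rules/sensor_configs/.../suricata/suricata.yaml --af-packet=eno5 --af-packet=enp22s0f1 --user suri --group suri not ip6"
autostart=true
autorestart=true        ; restart Suricata after a crash
startretries=3
stdout_logfile=syslog   ; forward output to Syslog
stderr_logfile=syslog
```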
Workflow 6 : configuration management via Git
I manage the Suricata configuration via GitHub. You can edit and commit directly from the web.
git fetch --all
git reset --hard origin/master
if [ $? -eq 0 ]; then
    logger -i -t ids_rules "Update successful"
else
    logger -i -t ids_rules "Update failed"
fi
This overwrites local changes and sets the config back to the Git head. The folder name ids_rules is confusing: I used to roll out IDS rules like this, before I had Scirius.
- You can manage the configs via the web[tm], like on a commercial IDS system
- You can roll out configs via Git for Suricata sensors
- You can log config updates via Syslog for reference
Workflow 7 : dealing with ClonePools
A ClonePool is an F5 concept. It's similar to a SPAN port, but usually a ClonePool clones the traffic of F5 objects. It has got plenty of options, because F5 BigIP systems are very powerful.
A ClonePool can be used to mirror SSL / TLS offloaded traffic, for the websites which are managed on a BigIP. Using ClonePools for NIDS systems works with Cisco Sourcefire appliances. Why shouldn't it work with our Suricata NIDS sensor?
The good news is that the ethtool part is not too different; I mentioned it in an earlier workflow. Further good news is that you don't actually need to modify Suricata. I recommend disabling promiscuous mode on the capture interfaces.
You need to setup an IP on the interface as the receiving endpoint from the F5, like:
ifconfig enp17s0f1 192.168.99.XX netmask 255.255.255.0
I also add the F5 sender network to a BPF drop statement in Suricata, like not net 192.168.99.0/24 - given that you don't route that, and given that this is your VLAN for the ClonePool traffic. You can pass the BPF expression as the trailing argument on the Suricata command line, or from a file via the -F option. I do this because the keep-alive packets from the F5 box aren't part of my IDS scope.
- Suricata works with ClonePools, SPAN ports, TAPs etc. - no problems
- The NIC options are not too different. But promiscuous network interfaces are not necessary for F5 ClonePools
Workflow 8 : split routing, asynchronous routing and PF_RING clusters
Workflow 9 : integrate Suricata with a SIEM once your alerts make sense
I wrote a short howto to integrate Suri with IBM QRadar. It's important to understand that a SIEM isn't a logging system. If you just want to log your IDS alerts, use Scirius / ELK as a garbage bin. Once you are over this phase, try lightweight log-correlation systems like Splunk or Sumo Logic.
Among the next iterations comes "security intelligence". That means making sense of logs, and taking action upon nearly all of the results. If you have a SIEM, chances are good that you will have a SOC, 24/7 security etc. Before you can have a SOC you need to know a couple of key things in your infosec department:
- what is normal for your environment?
- In context of the NIDS: for your networks.
- what are the threats and risks? How do you weigh them?
- What are the results of these threat models and risk assessments? How quick do you need to resolve a potential incident?
- what are the key assets? Which state are they in?
- Do you have information assets? Do these play into your analysis workflows? As patterns maybe? In configs?
- what is the logging strategy... security strategy... do you have any compliance requirements... ?
At the end of the day your sensitive NIDS alerts, from well-adjusted Suricata sensors, which perfectly fit the given network environments, can log into a SIEM.
tl;dr: the motto I follow is: only SIEM, what makes sense.
There are Machine Learning advocates. But if there is one thing you can take away from OpenSource, it's that an honest step-by-step approach is much more efficient than a snakeoil machine learning solution that tries to statistically filter garbage into security intelligence. People who recommend that to you have nicely wrapped amateur solutions. And in many cases a root cause analysis will be impossible.
If you plan your Suricata NIDS well and get along with the OpenSource methods, it's a useful fit into an enterprise environment. Not everyone can use these kinds of systems.
- Suricata has an impressive feature list, and you can manage sensors in multi-Gigabit environments with ease after reading this article
- Central management features are available via commercial OpenSource solutions
- SIEM integration is possible