Thursday, December 2, 2010

Network security monitoring with KVM

This blog post talks about how built-in Linux functionality can be used to implement a network security monitoring solution for a KVM based hypervisor. This gives an "inside the box" view for people who like to know more about the internals of Linux and KVM.

Disclaimer: I am part of the product marketing team for virtualization at Red Hat, but the opinions expressed herein are my own. The functionality as described here may or may not become part of high level management systems like Red Hat Enterprise Virtualization at some point.

With Linux/KVM, the most frequently used network setup is one where a number of virtual machines are attached to a software bridge. At least one physical network adapter is added to the bridge as well, which takes care of the uplink to the rest of the network. The bridge acts as a (virtual) switch: traffic is only sent to the switch port that connects to the destination MAC address.

Recently, I came across the following need: suppose you want to use a network security system (such as an intrusion detection system - IDS) to detect, or even thwart, security attacks that are in progress in the virtual environment. In the physical world, there's a well established way to do this: you use security appliance "black boxes", and connect those to mirror ports on your switch infrastructure. A mirror port is a specially configured switch port that gets a copy of all frames that pass through the switch. Inside the security appliance, the network card is put in promiscuous mode, which allows the security software to read and analyse all packets that arrive on it. Port mirroring goes by different names. Cisco calls this a Switched Port ANalyzer (SPAN) port - other vendors have different names.

What we need for a network security system in the virtual world is exactly the same: we need a port mirror on all our virtual switches. This port mirror should copy all frames on the virtual switch to a (in this case virtual) security appliance that runs on the same host. So the question is: is it possible to create mirror ports for Linux bridges?

We tried various approaches, including approaches using "iptables" and "ebtables". There's also a very ugly approach that sets the bridge's "ageing time" to 0, making it effectively a hub. The approach that I liked best in the end was to use the Linux traffic shaping capabilities to create a true port mirror. Traffic shaping on Linux is configured using the "tc" command, which is part of the iproute2 package. A lot of good documentation on traffic shaping can be found in the Linux Advanced Routing and Traffic Control Howto by Bert Hubert. It does a good job of explaining traffic shaping, but unfortunately it doesn't talk about port mirroring. That's why I decided to write this blog entry and explain how it works.

First, a very quick overview of how traffic shaping works on Linux. This is a very high level description and it leaves out some important details. For a more thorough introduction, see Bert Hubert's howto above. In essence, traffic shaping is the process of deciding if, when, and which packets are sent out on a network interface. The key object in traffic shaping is the queueing discipline, or qdisc. A qdisc has two main operations: enqueue packets, and dequeue them. A packet can always be enqueued, but it is the decision of the qdisc if, when and how to dequeue one. For example, a qdisc may decide to make packets available immediately, but it may also decide to re-arrange packets (to prioritize certain traffic), or to delay/drop packets (e.g. to enforce bandwidth controls). Two types of qdiscs exist: simple "classless" ones and more powerful "classful" ones. The difference is that the classful ones can use matching logic to classify packets into different categories, in order to provide different policies for them. The picture below illustrates the two types of queueing disciplines:
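To make the distinction concrete, here is a small sketch of both kinds of qdisc. The device name "eth0", the rate, and the SSH example are purely illustrative:

```shell
# Classless qdisc: a token bucket filter that rate-limits all traffic
# equally; it cannot distinguish between kinds of packets.
tc qdisc add dev eth0 root tbf rate 1mbit burst 32kbit latency 400ms

# Classful qdisc: "prio" creates three bands (classes 1:1, 1:2, 1:3),
# and filters can steer packets into them. Here, traffic to TCP port 22
# is classified into the highest-priority band.
tc qdisc replace dev eth0 root handle 1: prio
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dport 22 0xffff flowid 1:1
```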

Queueing disciplines are attached to network devices. Any network device will work, be it a physical network device (eth0, etc), a bridge, or a bridge interface that connects to a virtual machine. Normally queueing disciplines are attached to the output, or egress direction of a device. This is how traffic shaping normally works: we do not control what others send to us (like with email), but we can control how we send packets to others. Input, or ingress, qdiscs do exist though, and we will use them below.
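As a sketch of the two attachment points (again with a hypothetical "eth0"):

```shell
# Egress (the normal case): the qdisc attaches at the root of the
# device's output path. Here, netem delays outbound packets by 50ms.
tc qdisc add dev eth0 root netem delay 50ms

# Ingress: a special hook on the input path. It cannot queue packets,
# but filters (and their actions) attached to it are evaluated on
# every inbound packet.
tc qdisc add dev eth0 ingress

# Show both qdiscs on the device.
tc qdisc show dev eth0
```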

Let's get down to it now. Below are the commands that can be used to set up the port mirror. It uses the tc filter action "mirred," which was written specifically for the task of mirroring packets (thanks to Herbert Xu for the pointer).

  1. # tc qdisc add dev vnet1 ingress
  2. # tc filter add dev vnet1 parent ffff: \
          protocol ip u32 match u8 0 0 \
          action mirred egress mirror dev vnet0
  3. # tc qdisc replace dev vnet1 parent root prio
  4. # tc filter add dev vnet1 parent 8002: \
          protocol ip u32 match u8 0 0 \
          action mirred egress mirror dev vnet0

Easy, isn't it? :) Let's go through these in detail. In the code above we have assumed that the security appliance is attached to the bridge interface "vnet0", and the VM to be monitored is attached to bridge interface "vnet1".

  1. # tc qdisc add dev vnet1 ingress

What we do here is create a new qdisc of the special type "ingress". As mentioned above, qdiscs normally don't work on ingress, so this is really a special qdisc that you can consider an "alternate root" for inbound packets.

  2. # tc filter add dev vnet1 parent ffff: \
          protocol ip u32 match u8 0 0 \
          action mirred egress mirror dev vnet0

This is where we copy packets that are generated by the VM. This line says: add a new filter, and attach it to node "ffff:". The ID "ffff:" is the fixed ID of the ingress qdisc. Normally nodes are dynamically allocated, but not for ingress (I assume because you can have just one). The filter only matches for IP packets ("protocol ip"). The part "u32 match u8 0 0" specifies a matching expression. In this case, we use the "u32" matcher, with arguments "u8 0 0". This means match any packet where the first byte, when ANDed with the value 0, returns 0. In other words, all packets are selected. When the filter matches, the action "mirred" is executed with arguments "egress mirror dev vnet0". This tells mirred to copy the packet to the device "vnet0".
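The u32 "match" triplet can be read as size, value, mask: a packet matches when (field AND mask) == value. You can inspect the filter, or write a more selective one; the TCP example below is purely illustrative:

```shell
# Inspect the filter attached to the ingress qdisc (node ffff:).
tc filter show dev vnet1 parent ffff:

# A more selective variant: mirror only TCP traffic. "ip protocol 6 0xff"
# tests the IP protocol field against 6 (TCP) with a full-byte mask.
tc filter add dev vnet1 parent ffff: \
    protocol ip u32 match ip protocol 6 0xff \
    action mirred egress mirror dev vnet0
```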

  3. # tc qdisc replace dev vnet1 parent root prio

Here we replace the qdisc that is directly attached to the root node with a new qdisc of type "prio". You may select another qdisc if you desire, but the reason why we replace it is to make sure that we attach a classful qdisc. By default, the classless qdisc "pfifo_fast" is used, and being a classless qdisc, it doesn't evaluate filters.

  4. # tc filter add dev vnet1 parent 8002: \
          protocol ip u32 match u8 0 0 \
          action mirred egress mirror dev vnet0

This line copies packets that are destined towards the virtual machine. The filter is attached to the egress side of the bridge interface, which is where normally all qdiscs operate. The filter is added to the qdisc with node ID 8002:. This may be different on your system. After step 3 you should check the ID that has been allocated with "tc qdisc show dev vnet1". The protocol, match and action parameters are identical to step 2.
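Rather than reading the handle off by eye, it can be extracted from the "tc qdisc show" output. This is a sketch; the awk pattern assumes the prio qdisc set up in step 3 is the only one matching "prio" on the device:

```shell
# Discover the auto-allocated handle of the prio qdisc (e.g. "8002:")
# and attach the egress mirror filter to it.
HANDLE=$(tc qdisc show dev vnet1 | awk '/prio/ {print $3}')
tc filter add dev vnet1 parent "$HANDLE" \
    protocol ip u32 match u8 0 0 \
    action mirred egress mirror dev vnet0
```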

That's it! To monitor the traffic for all virtual machines, these steps have to be executed for all bridge interfaces. Inside the virtual machine that is connected to vnet0, you can use a tool like "wireshark" to confirm that you're indeed getting a copy of all network traffic.
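The four steps can be wrapped in a small script and repeated per interface. The interface names below are hypothetical; in practice you would substitute the bridge interfaces of the VMs you want to monitor:

```shell
#!/bin/sh
# Mirror every listed VM-facing bridge port to the appliance port.
APPLIANCE=vnet0

for DEV in vnet1 vnet2 vnet3; do
    # Steps 1+2: mirror packets the VM sends (ingress side).
    tc qdisc add dev "$DEV" ingress
    tc filter add dev "$DEV" parent ffff: \
        protocol ip u32 match u8 0 0 \
        action mirred egress mirror dev "$APPLIANCE"

    # Steps 3+4: mirror packets sent to the VM (egress side).
    tc qdisc replace dev "$DEV" parent root prio
    HANDLE=$(tc qdisc show dev "$DEV" | awk '/prio/ {print $3}')
    tc filter add dev "$DEV" parent "$HANDLE" \
        protocol ip u32 match u8 0 0 \
        action mirred egress mirror dev "$APPLIANCE"
done
```

To undo the setup on an interface, deleting the two qdiscs also removes their filters: "tc qdisc del dev vnet1 ingress" and "tc qdisc del dev vnet1 root".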

Improvements / open questions
  1. It would be nice if a filter could match any protocol, rather than just one at a time. Of course it is very unlikely that your router would route anything other than IP to your host, so this limitation does not matter much for threats from the outside. It does, however, allow virtual machines on the same host or on the same LAN to communicate with each other undetected, if they use a non-IP protocol.
  2. It would be nice if there were a simpler way to specify a match that is always true, rather than the not very obvious match "u32 match u8 0 0".
  3. The security appliance could be put in-line very easily with the same approach, by passing the "redirect" command to the mirred action instead of the "mirror" command. This would not copy the packet but instead forward the original packet. It would be the responsibility of the security appliance to forward the packet back to the original destination (or not).
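As a sketch of that in-line variant, only the last word of the action changes (shown here for the ingress side of vnet1):

```shell
# "redirect" moves the original packet to vnet0 instead of copying it;
# the appliance then decides whether to forward it onward.
tc filter add dev vnet1 parent ffff: \
    protocol ip u32 match u8 0 0 \
    action mirred egress redirect dev vnet0
```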

The goal of this article was twofold. First, it shows how to use the Linux traffic shaping functionality to implement a port mirror for a virtual switch based on a Linux bridge. Second, it illustrates my personal belief that KVM is a very advanced hypervisor architecture. By being based on Linux, it allows you to put all the goodies that have been developed for Linux over the past decade or so to good use, without re-inventing the wheel.


Joy Leima said...

What if I wanted to monitor everything received by the kvm bridge and not another VM guest? I want to take all the pkts received by br0 on the host & forward them to my security appliance. In vmware esx, I just choose the 'Pro 192.10.16.x' Network. I don't really want to monitor a particular VM as much as I want to monitor the entire subnet.

Unknown said...

Thank you so much for this explanation... I was looking for something similar to this, and you are the first who has been able to explain this in a way that makes sense.

I'll give you the scenario:
-Two bonded connections
-802.1q VLANs on the bond
-Needed to mirror traffic from one of those vlans.

Normally, in a non-redundant non-vlan scenario, I'd just pull that off of a span on the switch. That wasn't going to work in this case.

I looked at iptables, I looked at bridges, and nothing I saw seemed to be able to do this out of the box on CentOS/RHEL 6.

Thanks to this guide, I was able to set up mirrored traffic from the bonded vlan to a free physical port, and it works wonderfully.

I'm using something like this to automate the parent discovery:

PARENT=`tc qdisc show dev bond0.1203 | grep prio | cut -d ' ' -f 3`
tc filter add dev bond0.1203 parent $PARENT \
protocol ip u32 match u8 0 0 \
action mirred egress mirror dev em2

I'll eventually have a mirror $src $dest script.

smirnon said...

Thank you so much for the wonderful explanation! I am looking for something similar, with the difference being I also want to send out packets on the mirrored interface (in your example - vnet0) out to some IP address in the same LAN.
Essentially, do this for multiple machines that I want to monitor and capture all their traffic to this one machine that can analyze this traffic. I guess I would have to change the destination IP, right?
Any ideas/thoughts on how to do that ? Can it be done ?
Thanks in advance

hellt said...

Thank you for this article.
I wonder if it is possible to mirror non IP packets (layer 2 frames)?
Things like LLDP, LACP and BPDU