Python Script to Parse pfSense DHCP Log

I have a captive portal set up on my pfSense box that lets my laptops and various other devices connect over Wi-Fi. While looking through the DHCP logs that pfSense provides the other day, I realized I needed a way to verify the MAC addresses that were requesting IP addresses. I put together a Python script that parses the log and tries to match each MAC address it finds against the ones I know. Enjoy the code, and note that the MAC addresses have been changed.

Here is a sample of the DHCP log file generated by pfSense:

Apr 16 09:19:22 	dhcpd: DHCPACK on 192.168.0.203 to bc:ae:c5:4c:1a:73 (desktop) via vr0
Apr 16 09:19:22 	dhcpd: DHCPREQUEST for 192.168.0.203 (192.168.0.1) from bc:ae:c5:4c:1a:73 (desktop) via vr0
Apr 16 09:19:22 	dhcpd: DHCPOFFER on 192.168.0.203 to bc:ae:c5:4c:1a:73 (desktop) via vr0
Apr 16 09:19:21 	dhcpd: DHCPDISCOVER from bc:ae:c5:4c:1a:73 (desktop) via vr0
Apr 16 09:18:11 	dhcpd: DHCPACK on 192.168.177.238 to 00:00:1b:4e:00:b7 (Wii) via vr1
Apr 16 09:18:11 	dhcpd: DHCPREQUEST for 192.168.177.238 from 00:00:1b:4e:00:b7 (Wii) via vr1
Apr 16 08:59:20 	dhcpd: DHCPACK on 192.168.177.238 to 00:00:1b:4e:00:b7 (Wii) via vr1
Apr 16 08:59:20 	dhcpd: DHCPREQUEST for 192.168.177.238 from 00:00:1b:4e:00:b7 (Wii) via vr1
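
Each line carries the message type, the IP address, the MAC address, the hostname in parentheses and the interface. As a minimal sketch of how the MAC address can be pulled out of one of these lines, here is a pattern equivalent to the one used in the script further down. The helper name `extract_mac` is my own for illustration, and the snippet assumes Python 3, whereas the script itself targets Python 2.

```python
import re

# Six colon-separated hex pairs, not followed by another hex digit or colon.
# Timestamps such as 09:18:11 have only three pairs, so they never match.
MAC_PATTERN = re.compile(r'((?:[0-9A-F]{2}:){5}[0-9A-F]{2})(?![:0-9A-F])',
                         re.IGNORECASE)


def extract_mac(line):
    """Return the first MAC address in a log line, or None (hypothetical helper)."""
    match = MAC_PATTERN.search(line)
    return match.group(1) if match else None


sample = ("Apr 16 09:18:11 \tdhcpd: DHCPACK on 192.168.177.238 "
          "to 00:00:1b:4e:00:b7 (Wii) via vr1")
print(extract_mac(sample))  # 00:00:1b:4e:00:b7
```

IP addresses are dot-separated, so the first match on a line is the client's MAC address.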

Here are the contents of valid_machines.csv:

MAC address, Computer
00:00:1b:4e:00:b7, wii
01:03:30:b4:23:8c, dsi xl
bc:ae:c5:4c:1a:73, desktop
01:0c:04:d1:A3:a5, voip
02:14:4c:23:d2:BC, desktop 2
02:0d:78:40:cA:dE, desktop 3
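
Note that every comma in this file is followed by a space and that the MAC addresses mix upper and lower case. A minimal sketch of reading such a file so that neither detail gets in the way, assuming Python 3 and an inlined copy of two rows so the snippet is self-contained (`load_known` is a hypothetical helper, not part of the script):

```python
import csv
import io

# Header and two rows of valid_machines.csv, inlined for illustration.
CSV_TEXT = """MAC address, Computer
00:00:1b:4e:00:b7, wii
01:0c:04:d1:A3:a5, voip
"""


def load_known(csv_file):
    """Map lower-cased MAC address -> computer name (hypothetical helper)."""
    # skipinitialspace=True drops the blank after each comma, so the second
    # column is keyed as 'Computer' rather than ' Computer'.
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    return {row['MAC address'].lower(): row['Computer'] for row in reader}


known = load_known(io.StringIO(CSV_TEXT))
print(known['01:0c:04:d1:a3:a5'])  # voip
```

Without `skipinitialspace=True`, the second column would be keyed as `' Computer'` (with a leading space) and the names would carry the same leading space.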

Here is the Python script:

#!/usr/bin/env python
#-*- coding:utf-8 -*-

"""
The purpose of this script is to parse the DHCP log from pfSense, looking for
mac addresses that aren't listed in the valid_machines.csv file.

Simply copy and paste the DHCP log text into a text file and process it with
this script.

License:
The MIT License
Copyright (c) 2011 Troy Williams

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
"""

import sys

#Constants
__uuid__ = 'c0a17e00-11bb-4c56-9191-ba4d561feb0a'
__version__ = '0.1'
__author__ = 'Troy Williams'
__email__ = 'troy.williams@bluebill.net'
__copyright__ = 'Copyright (c) 2011, Troy Williams'
__date__ = '2011-04-16'
__maintainer__ = 'Troy Williams'

def load_macs(mac_file):
    """
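    The list of known mac addresses is stored in a csv file. Load them
    into a list of dictionaries, one per row.
    """
    import csv
    #skipinitialspace drops the blank after each comma in the csv file, so the
    #second column is keyed as 'Computer' rather than ' Computer'
    with open(mac_file, 'rb') as csv_file:
        data_reader = csv.DictReader(csv_file, skipinitialspace=True)
        #convert the data_reader into a list of dictionaries so we can
        #iterate over it after the file is closed
        return [row for row in data_reader]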


def read_log(file_name):
    """
    Takes a file name and returns the contents of the file as a list of lines
    """
    with open(file_name, "rb") as text_file:
        return text_file.readlines()


def find_macs(lines):
    """
    Takes a list of strings and returns the set of mac addresses found in them.
    Only the first mac address on each line is captured.

    Note: mac regex from here http://txt2re.com
    """

    import re

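    #six colon-separated hex pairs, not followed by another hex digit or colon
    mac_pattern = '((?:[0-9A-F][0-9A-F]:){5}(?:[0-9A-F][0-9A-F]))(?![:0-9A-F])'
    rg = re.compile(mac_pattern, re.IGNORECASE)

    macs = set()
    for line in lines:
        #search() scans the whole line and stops at the first mac address
        m = rg.search(line)
        if m:
            macs.add(m.group(1))
    return macs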


def main():
    """
    Orchestrates the whole shebang.
    """

    #load valid_machines.csv into a list of dictionaries, one per known machine
    valid_macs = load_macs('valid_machines.csv')

    #read the log file into a list of lines
    lines = read_log('dhcp.log.txt')

    #extract the unique mac addresses that appear in the log
    macs = find_macs(lines)

    #process the list of macs from the log file against the list of valid macs
    for mac in macs:
        found_address = False
        for vmac in valid_macs:
            if mac.lower() == vmac['MAC address'].lower():
                found_address = True
                break
        #Check to see if the mac address was found
        if not found_address:
            print mac, 'not found'
        else:
            print mac, '=', vmac['Computer']


if __name__ == '__main__':
    sys.exit(main())
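
The nested loop in main() compares every MAC address from the log against every known one. For a handful of machines that is fine. As an alternative sketch (not what the script above does), the known machines can be kept in a dictionary keyed by lower-cased MAC address so that each lookup is a single step. The snippet assumes Python 3, `report` is a hypothetical helper, and `de:ad:be:ef:00:01` is a made-up address standing in for an unknown device.

```python
def report(log_macs, known):
    """Return (mac, name-or-None) pairs sorted by MAC (hypothetical helper).

    `known` maps lower-cased MAC address -> computer name.
    """
    return [(mac, known.get(mac.lower())) for mac in sorted(log_macs)]


# Illustrative data: two known machines and one made-up unknown address.
known = {'00:00:1b:4e:00:b7': 'wii', 'bc:ae:c5:4c:1a:73': 'desktop'}
log_macs = {'BC:AE:C5:4C:1A:73', '00:00:1b:4e:00:b7', 'de:ad:be:ef:00:01'}

for mac, name in report(log_macs, known):
    if name is None:
        print(mac, 'not found')
    else:
        print(mac, '=', name)
```

Lower-casing both sides keeps the comparison case-insensitive, just like the `.lower()` calls in main().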