A Microsoft Exchange timeline for Incident Response
– By Bart Roos – Principal Cyber Security Consultant / Team Lead CERT
Timelines are one of the most useful tools for a forensic investigator. Although there are many off-the-shelf tools available, sometimes you end up writing your own code to quickly get the answers you are looking for. In this blog post I will discuss a simple tool I wrote to generate a timeline of Microsoft Exchange mailbox access based on IIS web server logs. It turned out that using this method an investigator can easily spot malicious mailbox access.
The case
During one of our incident response investigations we needed to examine a Microsoft Exchange mail server for traces of malicious mailbox access. In this specific case the Exchange server was only reachable through the IIS web server over HTTPS. We figured that by creating a timeline based on the available web server logs we should be able to spot any irregularities.
We were specifically interested in finding signs of:
- Access originating from foreign countries
- Access from non-standard devices or user agents
- Access from different locations at the same time
We found that the following types of access can be identified within the IIS web server logs, each by its own request path (see the classification sketch after this list):
- Outlook Web Access (OWA). This service gives users access to their mailboxes through their web browser and the OWA web interface.
- Exchange web connections over HTTP/HTTPS (EWS). Used by client software such as Microsoft Outlook and Apple Mail.
- Microsoft ActiveSync. Used by mobile devices such as smartphones and tablets.
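Each of these access types shows up in the logs with a distinct URI stem, visible in the request examples further down. A minimal classification sketch in Python, using the paths from those examples (other Exchange versions or configurations may expose slightly different paths):

# Map IIS cs-uri-stem values to the Exchange access types listed above.
# The paths are taken from the log examples in this post; they may differ
# per Exchange version and configuration.
ACCESS_TYPES = {
    '/owa/': 'Outlook Web Access',
    '/EWS/Exchange.asmx': 'Exchange web connection (EWS)',
    '/Microsoft-Server-ActiveSync/default.eas': 'ActiveSync',
}

def classify(uri_stem):
    """Return the access type for a cs-uri-stem, or None if not Exchange related."""
    return ACCESS_TYPES.get(uri_stem)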
The log files
The IIS log files are located in inetpub\logs\LogFiles\W3SVC1. The number 1 can vary based on the website ID configured in the IIS configuration. Our Microsoft Internet Information Services 7.0 web server used the following default logging format:
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) sc-status sc-substatus sc-win32-status time-taken
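Since IIS W3C logs are plain space-separated text, mapping the #Fields header onto each line gives you named values to work with; this is essentially what the script below does with csv.DictReader. A minimal sketch on a single shortened, hypothetical log line:

# Field names copied from the #Fields header above.
FIELDS = ['date', 'time', 's-ip', 'cs-method', 'cs-uri-stem', 'cs-uri-query',
          's-port', 'cs-username', 'c-ip', 'cs(User-Agent)', 'sc-status',
          'sc-substatus', 'sc-win32-status', 'time-taken']

# A shortened example log line (hypothetical values, user agent trimmed).
sample = ('2015-12-31 00:19:48 10.11.2.3 GET /owa/ - 443 '
          'domain.local\\johndoe 77.123.22.98 Mozilla/5.0 200 0 0 31')

# W3C logs are space separated, so zipping the field names onto a split
# line yields a dict keyed by field name.
row = dict(zip(FIELDS, sample.split(' ')))
print(row['cs-username'], row['c-ip'], row['cs-uri-stem'])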
Let’s take a look at the various access types we identified earlier.
An Outlook Web Access request:
2015-12-31 00:19:48 10.11.2.3 GET /owa/ &prfltncy=16&prfrpccnt=14&prfrpcltncy=15&prfldpcnt=0&prfldpltncy=0&prfavlcnt=0&prfavlltncy=0 443 domain.local\johndoe 77.123.22.98 Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/39.0.2171.95+Safari/537.36 200 0 0 31
An Exchange web connection request:
2015-12-31 14:40:38 10.11.2.3 POST /EWS/Exchange.asmx SoapAction=GetUserOofSettingsResponse;MailboxRPCRequests=10;MailboxRPCLatency=15;ADRequests=2; ADLatency=0;TimeInGetUserOOFSettings=28; 443 domain.local\johndoe 10.10.5.2 Microsoft+Office/14.0+(Windows+NT+6.1;+Microsoft+Outlook+14.0.7128;+Pro) 200 0 0 46
A Microsoft ActiveSync request:
2015-01-01 22:16:03 192.168.16.3 POST /Microsoft-Server-ActiveSync/default.eas User=johndoe&DeviceId=Appl2312SDKJ2313&DeviceType=iPhone&Cmd=Ping 443 domain.local\johndoe 77.123.22.98 Apple-iPhone5C2/1104.257 200 0 0 62
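Note that the ActiveSync query string carries the user name, device ID and device type, which makes it easy to spot unknown devices. A minimal sketch that extracts those parameters with the standard library (the query string is taken from the request above):

try:
    from urllib.parse import parse_qs  # Python 3
except ImportError:
    from urlparse import parse_qs      # Python 2

# cs-uri-query from the ActiveSync request above.
query = 'User=johndoe&DeviceId=Appl2312SDKJ2313&DeviceType=iPhone&Cmd=Ping'

# parse_qs returns a list of values per parameter; take the first of each.
params = parse_qs(query)
user = params.get('User', ['-'])[0]
device_id = params.get('DeviceId', ['-'])[0]
device_type = params.get('DeviceType', ['-'])[0]
print(user, device_id, device_type)  # johndoe Appl2312SDKJ2313 iPhone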
Parsing the logs
We considered various generic tools such as an Elasticsearch stack or the excellent Microsoft Log Parser tool. These tools are great for parsing web server log data, but require more tweaking when you want to parse the actual content of the requests or do things like GeoIP lookups. We figured that writing a simple script ourselves would be the shortest path to what we needed.
I wrote a simple Python script that filters the requests containing traces of Exchange access, removes duplicates (one line per unique combination of date, user, access type, user agent and IP) and does a GeoIP lookup:
#!/usr/bin/env python
import collections
import csv
import sys

import GeoIP

# Read the IIS server logs from stdin.
reader = csv.DictReader(
    sys.stdin,
    fieldnames=[
        'date', 'time', 's-ip', 'cs-method', 'cs-uri-stem', 'cs-uri-query',
        's-port', 'cs-username', 'c-ip', 'csUser-Agent', 'sc-status',
        'sc-substatus', 'sc-win32-status', 'time-taken'],
    restkey='field',
    delimiter=' ')

# Load the GeoIP database.
gi = GeoIP.open(
    "GeoLiteCity.dat",
    GeoIP.GEOIP_INDEX_CACHE | GeoIP.GEOIP_CHECK_CACHE)

# Initialize the output.
fields = ['date', 'user', 'type', 'user_agent', 'ip', 'country', 'city']
writer = csv.writer(sys.stdout)
writer.writerow(fields)
OutRecord = collections.namedtuple('OutRecord', fields)

# Check for ActiveSync, Exchange Web Services and OWA access.
WHITELIST = [
    '/Microsoft-Server-ActiveSync/default.eas',
    '/EWS/Exchange.asmx',
    '/owa/',
]

seen = set()
for line in reader:
    if line["cs-uri-stem"] not in WHITELIST:
        continue

    # Do a GeoIP lookup.
    gidata = gi.record_by_name(line['c-ip'])
    if gidata:
        city = gidata['city']
        country = gidata['country_name']
    else:
        city = country = 'Unknown'

    item = OutRecord(
        date=line['date'],
        user=line['cs-username'],
        type=line['cs-uri-stem'],
        user_agent=line['csUser-Agent'],
        ip=line['c-ip'],
        country=country,
        city=city)

    # Skip duplicates so we get one timeline entry per unique combination.
    if item in seen:
        continue
    writer.writerow(item)
    seen.add(item)
The script reads the log files from standard input and writes the resulting CSV content to standard output. To process all log files with the .log extension in the current directory, with the Python script saved as exchangetimeline.py and the results written to results.csv, use the following command:
cat *.log | ./exchangetimeline.py > results.csv
Please note that this script uses the python-geoip Debian package and requires the MaxMind GeoIP database (GeoLiteCity.dat), which can be downloaded from the MaxMind website.
The resulting CSV file contains the following columns:
date,user,type,user_agent,ip,country,city
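Because the output contains one line per unique combination of date, user, access type, user agent and IP, a few extra lines of Python are enough to flag the third item on our list of interests: the same account showing up from different countries on the same day. A minimal sketch, assuming the results.csv file produced above:

#!/usr/bin/env python
import collections
import csv

# Collect the set of countries seen per (date, user) pair.
countries = collections.defaultdict(set)
with open('results.csv') as fh:
    for row in csv.DictReader(fh):
        countries[(row['date'], row['user'])].add(row['country'])

# Report any account seen from more than one country on the same day.
for (date, user), seen in sorted(countries.items()):
    if len(seen) > 1:
        print('%s %s accessed from: %s' % (date, user, ', '.join(sorted(seen))))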
Analyzing the results
Finally, we use our basic Microsoft Excel skills to sort the data, select specific users and search for odd locations and user agents.
In our case we could quickly point out some odd requests from a foreign country, and we investigated requests coming from non-standard user agents. This provided us with the pointers we needed to focus our investigation further.
With our standard Excel toolkit we can also easily generate some statistical graphs using a PivotTable.
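For those who prefer to stay in Python, roughly the same overview can be generated with pandas. A minimal sketch, assuming the results.csv file produced above, that counts timeline entries per user and country much like a PivotTable would:

import pandas as pd

df = pd.read_csv('results.csv')

# One row per user, one column per country; cells hold the number of
# timeline entries (unique date/user agent/IP combinations).
pivot = df.pivot_table(index='user', columns='country', values='date',
                       aggfunc='count', fill_value=0)
print(pivot)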
Conclusion
Even a simple log parsing script and some basic Excel skills can provide you with the insight needed to find malicious behaviour during incident response investigations. When complex analysis on very large datasets is required, tools like Kibana and Elasticsearch come to mind, but in many cases a simple Python script does the job just fine.