Adam Crosby

Overview of Disk Forensics on AWS EC2

30 Oct 2016 - Norfolk, VA

(nb: As the title mentions, this post is for AWS EC2 only, and does not extend to S3, RDS, RedShift or any of the other hundreds of services they offer, except where specifically noted.)

TOC

  1. Analysis Strategies
  2. Overview
  3. EBS Disk Capture
  4. Ephemeral Disk Capture
  5. EFS Disk Capture
  6. Tools

Analysis Strategies

Have a plan.

It’s important to have a strategy for performing your analysis prior to beginning an incident response. The major decision points that must be accounted for in this plan are the location of the analysis work, the desired impact on the systems under scrutiny and the source of expertise.

Cloud-side or offline

Know where you’re going to do your analysis.

The major decision point that should be understood ahead of time is whether you are comfortable with cloud-side forensic analysis, or whether your workflows and tooling dictate that you do your analysis offline.

The rest of this article (and the guides it links to) will attempt to show approaches for both, although it’s a good tactical idea for most analysis teams to become comfortable with cloud-side analysis (if for no other reason than that your time to respond will be dramatically shorter, and your response capability significantly more agile).

Isolate or minimize impact

Don’t take things offline if you don’t have to (unless you want to).

Another strategy here is minimizing impact to the running system vs. isolation/interruption of a compromised service. For the purposes of disk forensics, I will focus on minimal/no-interruption to the running system in all cases.

For this reason, the focus is on obtaining (as close as possible) sound, verified copies of files or disk block stores for analysis apart from the actual running instance. Memory and live-system forensic approaches are out of scope for this discussion.

Internal or external expertise

WHO is going to do the analysis?

Finally, it’s crucial to know ahead of time who is going to be doing your forensic analysis. In-house folks? Are they familiar with EC2 and all of its storage options (are you the in-house team? Keep reading…)? Comfortable working with the cloud environment, APIs, etc.?

Do you outsource your incident response to a QIRA/QSA/PFI? Does your provider have analysts who understand EC2? Do they ask for the hardware to perform their analysis on, instead of images, etc? Does your contract have provisions for the analysis in the cloud? Do you know who pays the data transfer fee if the investigation team wants to pull the data out of the cloud?

This is a huge area of concern in itself, but, like memory forensics, is outside of the scope of this article.

Overview of EC2 Storage Options

With that said, we’re going to focus on minimally invasive, online/offline methods of storage-only analysis.

Before delving into technical how-to, it’s important for an analyst to understand how EC2 storage works. There are two types of block/disk storage, and one type of network storage available to an EC2 instance today:

Elastic Block Store

This is the primary storage mechanism used by most AWS users today (and is the only block store available on a few instance types, such as the T2 family). It functions much like a SAN - volumes are attached to the instance and show up as raw block devices (such as /dev/xvdf on Linux). These volumes can be formatted, given a filesystem, and used like any other block disk. Importantly for forensic analysts, these EBS volumes can be (and frequently are) used in software RAID arrays or JBOD configurations to provide increased performance/redundancy/etc. to the EC2 instance, so a single logical filesystem may span several volumes that must be captured together.
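Because a software RAID set has to be captured as a whole, it helps to know which block devices belong to which array before taking snapshots. The following is a minimal sketch, assuming a Linux instance using md software RAID, that parses the text of /proc/mdstat into array names and member devices. The function name parse_mdstat and the sample text are illustrative, not part of any AWS tooling.

```python
import re


def parse_mdstat(text):
    """Parse /proc/mdstat text into {array: {"state", "level", "members"}}."""
    arrays = {}
    for line in text.splitlines():
        # Array lines look like: "md0 : active raid0 xvdg[1] xvdf[0]"
        match = re.match(r"^(md\S*)\s*:\s*(\S+)\s+(.*)$", line)
        if not match:
            continue
        name, state, rest = match.groups()
        tokens = rest.split()
        # The RAID level (raid0, raid1, linear, ...) is the first token that
        # is neither a member ("xvdf[0]") nor a flag like "(auto-read-only)".
        level = next(
            (t for t in tokens if "[" not in t and not t.startswith("(")), None
        )
        # Member tokens carry a role index in brackets; strip it off.
        members = sorted(t.split("[")[0] for t in tokens if "[" in t)
        arrays[name] = {"state": state, "level": level, "members": members}
    return arrays


if __name__ == "__main__":
    with open("/proc/mdstat") as handle:
        for array, info in parse_mdstat(handle.read()).items():
            print(array, info["level"], info["members"])
```

Reading /proc/mdstat does require access to the instance, so this is only an option where some interaction with the running system is acceptable.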

Instance Ephemeral Store

Historically, the instance ephemeral store was the only storage available to an EC2 instance. As its name implies, ephemeral store is ephemeral: content written to it survives an operating-system reboot, but is lost when the instance stops or terminates, or when the underlying host fails (on purpose or not). The ephemeral storage is the ‘local’ disk (magnetic or SSD, depending on instance type) directly attached to the physical host running your EC2 instance. It’s intended for scratch, swap or other storage purposes where speed and latency are important, but persistence is not required. The block device mapping supplied at launch (for example, to the ec2-run-instances command) maps the individual disks on the instance to Linux/Windows devices on boot. Some instances have no ephemeral storage (e.g. t2.nano), others have as many as two dozen disks available for use (e.g. d2.8xlarge).

Elastic File System (EFS)

A recent entry for AWS, EFS presents an NFSv4-based network-attached file storage capability to EC2 instances. Multiple EC2 instances can simultaneously mount this storage and use it to share files, state, etc. It shows up as a standard NFS mount point, defined as an IP address and filesystem name combo in a VPC, which you then mount with an NFS client on an EC2 instance. There is no way to snapshot or otherwise read EFS at the block level, as only file-based operations are supported. For more information on file vs. block store, refer to this article.

EBS Disk Capture

Overview

The general steps to getting an EBS disk image for use in analysis are:

  1. Determine the EBS volumes attached to the instance, and their mount points
  2. Take snapshot(s) (optionally, quiesce the instance volumes)
  3. Create EBS volume from snapshot
  4. For cloud-side analysis, mount the volumes read-only on your analysis workstation
  5. For offline analysis, create consumable images and store the images in S3
    • Mount the volumes read-only on your capture instance
    • Use dd or similar tool to generate a file based image of the block device.
    • Generate integrity checking hash/HMAC of disk image
    • Upload image to S3
    • Download image from S3 to offline analysis system
  6. Perform analysis
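Steps 1 through 3 above can be sketched roughly as follows. This assumes a boto3 EC2 client is passed in as ec2 (for example, one created with boto3.client("ec2")); the function names, the case_id label and the returned dictionary layout are my own illustration. Pagination of describe_volumes and error handling are left out for brevity.

```python
def volumes_for_instance(ec2, instance_id):
    """Step 1: map each attached EBS volume ID to (device name, zone)."""
    response = ec2.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )
    mapping = {}
    for volume in response["Volumes"]:
        for attachment in volume["Attachments"]:
            if attachment["InstanceId"] == instance_id:
                mapping[volume["VolumeId"]] = (
                    attachment["Device"],
                    volume["AvailabilityZone"],
                )
    return mapping


def snapshot_and_clone(ec2, instance_id, case_id, target_az):
    """Steps 2 and 3: snapshot each volume, then build a volume from it."""
    evidence = []
    attached = volumes_for_instance(ec2, instance_id)
    for volume_id, (device, _source_az) in attached.items():
        snapshot = ec2.create_snapshot(
            VolumeId=volume_id,
            Description=f"{case_id}: {instance_id} {device}",
        )
        snapshot_id = snapshot["SnapshotId"]
        # Block until the snapshot is usable before cloning it.
        ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snapshot_id])
        # target_az should be the zone of the analysis workstation, because
        # a volume can only be attached to an instance in the same zone.
        clone = ec2.create_volume(
            SnapshotId=snapshot_id, AvailabilityZone=target_az
        )
        ec2.get_waiter("volume_available").wait(VolumeIds=[clone["VolumeId"]])
        evidence.append(
            {
                "source_volume": volume_id,
                "source_device": device,
                "snapshot": snapshot_id,
                "analysis_volume": clone["VolumeId"],
            }
        )
    return evidence
```

The returned list is meant to be kept with the case notes, so that each analysis volume can be traced back to the snapshot and source volume it came from. Attaching the new volumes to the workstation (attach_volume) and mounting them read-only is left as the next step.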

With the exception of quiescing the volumes, all of the above steps (in either path) can be done without actually interacting with the instance itself (no logging in, no network connections, etc.) - they’re all handled out of band by the AWS EBS infrastructure. Especially if doing cloud-side analysis, this can be highly advantageous. Getting a disk image from a physical server is traditionally a simple yet very invasive process (requiring either physical access or resource-intensive disk I/O on the live system), while on EBS it is more complicated, but much less invasive.
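For the offline path (step 5 in the list above), the imaging and hashing sub-steps can be combined so the source device is only read once. The sketch below is a plain-Python stand-in for dd plus a hashing tool, not a replacement for a validated forensic imager. The function names, the 4 MiB chunk size and the choice of SHA-256 are assumptions for illustration, and the HMAC key is whatever secret your team uses for evidence integrity.

```python
import hashlib
import hmac


def image_device(source_path, image_path, hmac_key, chunk_size=4 * 1024 * 1024):
    """Copy source_path to image_path, hashing the data as it is read."""
    digest = hashlib.sha256()
    mac = hmac.new(hmac_key, digestmod=hashlib.sha256)
    total = 0
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            image.write(chunk)
            digest.update(chunk)
            mac.update(chunk)
            total += len(chunk)
    return {
        "bytes": total,
        "sha256": digest.hexdigest(),
        "hmac_sha256": mac.hexdigest(),
    }


def verify_image(image_path, expected_sha256, chunk_size=4 * 1024 * 1024):
    """Re-hash an image (e.g. after download from S3) and compare digests."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return hmac.compare_digest(digest.hexdigest(), expected_sha256)
```

On the capture instance, source_path would be the attached block device (something like /dev/xvdf, read as root). The recorded digest travels with the image, and verify_image is run again on the offline analysis system after the download. The upload itself could be done with a boto3 S3 client’s upload_file call, which is not shown here.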

Ephemeral Disk Capture

(Insert note here to academic paper on data hiding using ephemeral disks)

Overview

Capturing ephemeral disk contents is a little more intrusive, and more complicated. The general steps for ephemeral storage capture are:

  1. Determine the ephemeral instance store -> mount point mapping by querying instance meta-data.
  2. Add tooling to instance (or use existing tooling such as dd) to create a consumable disk image of each mapped ephemeral storage drive.
    • Use dd or similar tool to generate a file based image of the block device.
    • Generate integrity checking hash/HMAC of disk image
    • Upload image to S3
  3. For cloud-side analysis, grab the image from S3 from your analysis workstation
  4. For offline analysis, download the image from S3 to your offline analysis system
  5. Perform analysis
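Step 1 above relies on the instance metadata service, which is only reachable from inside the instance. Here is a minimal sketch of that query, using the block-device-mapping path under the standard metadata address. The helper names are my own. It uses plain unauthenticated requests, so instances that enforce session tokens for metadata access (IMDSv2) would need a token header that is not shown.

```python
from urllib.request import urlopen

METADATA_URL = "http://169.254.169.254/latest/meta-data/block-device-mapping/"


def fetch_text(url, timeout=2):
    """Fetch a metadata URL and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("ascii")


def ephemeral_mapping(fetch=fetch_text):
    """Return {virtual name: device name} for ephemeral entries only."""
    # The listing contains one entry per line, e.g. ami, ephemeral0, root.
    names = fetch(METADATA_URL).split()
    return {
        name: fetch(METADATA_URL + name).strip()
        for name in names
        if name.startswith("ephemeral")
    }


if __name__ == "__main__":
    for virtual_name, device in sorted(ephemeral_mapping().items()):
        # The reported name (e.g. "sdb") may appear under a different name
        # in the guest OS (e.g. /dev/xvdb), so confirm before imaging.
        print(virtual_name, device)
```

The fetch parameter exists so the lookup can be exercised away from EC2 with a stub. On a real instance the default is used.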

EFS Disk Capture

As of this writing, it is not possible to capture a forensic raw image of an EFS filesystem. For analysis of file content, the following procedure should be used:

  1. Grant access to EFS filesystem via IAM to ec2-based analysis workstation.
  2. Launch analysis workstation
  3. Mount the EFS filesystem read-only (the ro mount option) via the standard NFSv4 mount command
  4. Copy filesystem content (files, directories, etc.) as needed to analysis storage
    • Generate checksum data during the copy so that any later modification of the copied files can be detected.
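The copy-and-checksum step can be sketched like this, assuming the EFS filesystem is already mounted read-only at some path on the analysis workstation. The function names and the manifest layout (relative path to SHA-256 digest) are illustrative choices. Symbolic links and special files are skipped in this sketch.

```python
import hashlib
import os
import shutil


def sha256_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_manifest(source_root, dest_root):
    """Copy regular files from source_root to dest_root, hashing each one."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(source_root):
        rel_dir = os.path.relpath(dirpath, source_root)
        target_dir = os.path.normpath(os.path.join(dest_root, rel_dir))
        os.makedirs(target_dir, exist_ok=True)
        for name in sorted(filenames):
            source = os.path.join(dirpath, name)
            if os.path.islink(source) or not os.path.isfile(source):
                continue  # skip symlinks, sockets, device nodes, etc.
            target = os.path.join(target_dir, name)
            shutil.copy2(source, target)  # copy2 also keeps timestamps
            source_hash = sha256_file(source)
            if sha256_file(target) != source_hash:
                raise IOError(f"copy of {source} does not match its source")
            manifest[os.path.normpath(os.path.join(rel_dir, name))] = source_hash
    return manifest
```

Store the manifest separately from the copied files. Re-running sha256_file over the analysis copy later and comparing against the manifest shows whether anything was altered after collection.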

While EFS supports many CloudWatch metrics, they are currently limited to performance and capacity dimensions for each filesystem. EFS events are limited to calls to the HTTP API for managing the service; no events are emitted for reads or writes of individual files, so there is no forensic way to reconstruct the chain of events for file access.

Tools

Content forthcoming.