Benchmarking With IOZone
This page goes into detail on how to measure filesystem I/O performance using iozone, a free benchmarking tool for Linux/BSD.
To test the speed of both the existing EMC and EqualLogic SANs, as well as the service latency of NFS servers 1, 2 and 11, 12, I have created two servers (VMs) with this software installed:
- = Network affinity. I have reserved the IP address in IPPlan.
CentOS: You have to download the .RPM and install it manually using RPM:
cd /tmp
wget http://www.iozone.org/src/current/iozone-3-303.i386.rpm
rpm -ivh iozone-3-303.i386.rpm
FreeBSD: This is included in ports:
cd /usr/ports/benchmarks/iozone
make config
make install
...It's just that easy folks...
The iozone.org website kinda sucks, but there are plenty of other websites that do a good job of explaining how to use IOZone. Tests can take 4 hours or longer. Run the command with "time" if possible to see how long the test took. Add "&" to the end of the statement to run it in the background.
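Because a run can last for hours, it may help to record the elapsed time in the log itself instead of relying on the terminal staying open. The following is a minimal sketch, not part of the original procedure: the function name run_timed and the log path in the example are hypothetical.

```sh
#!/bin/sh
# run_timed LOGFILE COMMAND [ARGS...]
# Runs COMMAND, appends its output to LOGFILE, then appends the elapsed
# wall-clock time in seconds. Returns the exit status of COMMAND.
run_timed() {
    log=$1
    shift
    start=$(date +%s)
    "$@" >>"$log" 2>&1
    status=$?
    end=$(date +%s)
    echo "elapsed_seconds=$((end - start)) exit_status=$status" >>"$log"
    return "$status"
}

# Example (hypothetical log path); the trailing & runs it in the background:
# run_timed /tmp/iozone-run.log /opt/iozone/bin/iozone -R -a -g 1G &
```

The last line of the log then shows how long the test took, even if the shell that started it has been closed.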
Here is the built-in help output (iozone -h), which covers the same options as the man page:
[root@NFSTestCentos-1 tmp]# /opt/iozone/bin/iozone -h
iozone: help mode

Usage: iozone [-s filesize_Kb] [-r record_size_Kb] [-f [path]filename] [-h]
              [-i test] [-E] [-p] [-a] [-A] [-z] [-Z] [-m] [-M] [-t children]
              [-l min_number_procs] [-u max_number_procs] [-v] [-R] [-x] [-o]
              [-d microseconds] [-F path1 path2...] [-V pattern] [-j stride]
              [-T] [-C] [-B] [-D] [-G] [-I] [-H depth] [-k depth] [-U mount_point]
              [-S cache_size] [-O] [-L cacheline_size] [-K] [-g maxfilesize_Kb]
              [-n minfilesize_Kb] [-N] [-Q] [-P start_cpu] [-e] [-c] [-b Excel.xls]
              [-J milliseconds] [-X write_telemetry_filename] [-w] [-W]
              [-Y read_telemetry_filename] [-y minrecsize_Kb] [-q maxrecsize_Kb]
              [-+u] [-+m cluster_filename] [-+d] [-+x multiplier] [-+p # ]
              [-+r] [-+t] [-+X] [-+Z] [-+w percent dedupable] [-+y percent_interior_dedup]
              [-+C percent_dedup_within]

  -a  Auto mode
  -A  Auto2 mode
  -b Filename  Create Excel worksheet file
  -B  Use mmap() files
  -c  Include close in the timing calculations
  -C  Show bytes transferred by each child in throughput testing
  -d #  Microsecond delay out of barrier
  -D  Use msync(MS_ASYNC) on mmap files
  -e  Include flush (fsync,fflush) in the timing calculations
  -E  Run extension tests
  -f  filename to use
  -F  filenames for each process/thread in throughput test
  -g #  Set maximum file size (in Kbytes) for auto mode (or #m or #g)
  -G  Use msync(MS_SYNC) on mmap files
  -h  help
  -H #  Use POSIX async I/O with # async operations
  -i #  Test to run (0=write/rewrite, 1=read/re-read, 2=random-read/write
        3=Read-backwards, 4=Re-write-record, 5=stride-read, 6=fwrite/re-fwrite
        7=fread/Re-fread, 8=random_mix, 9=pwrite/Re-pwrite, 10=pread/Re-pread
        11=pwritev/Re-pwritev, 12=preadv/Re-preadv)
  -I  Use VxFS VX_DIRECT, O_DIRECT,or O_DIRECTIO for all file operations
  -j #  Set stride of file accesses to (# * record size)
  -J #  milliseconds of compute cycle before each I/O operation
  -k #  Use POSIX async I/O (no bcopy) with # async operations
  -K  Create jitter in the access pattern for readers
  -l #  Lower limit on number of processes to run
  -L #  Set processor cache line size to value (in bytes)
  -m  Use multiple buffers
  -M  Report uname -a output
  -n #  Set minimum file size (in Kbytes) for auto mode (or #m or #g)
  -N  Report results in microseconds per operation
  -o  Writes are synch (O_SYNC)
  -O  Give results in ops/sec.
  -p  Purge on
  -P #  Bind processes/threads to processors, starting with this cpu
  -q #  Set maximum record size (in Kbytes) for auto mode (or #m or #g)
  -Q  Create offset/latency files
  -r #  record size in Kb
        or -r #k .. size in Kb
        or -r #m .. size in Mb
        or -r #g .. size in Gb
  -R  Generate Excel report
  -s #  file size in Kb
        or -s #k .. size in Kb
        or -s #m .. size in Mb
        or -s #g .. size in Gb
  -S #  Set processor cache size to value (in Kbytes)
  -t #  Number of threads or processes to use in throughput test
  -T  Use POSIX pthreads for throughput tests
  -u #  Upper limit on number of processes to run
  -U  Mount point to remount between tests
  -v  version information
  -V #  Verify data pattern write/read
  -w  Do not unlink temporary file
  -W  Lock file when reading or writing
  -x  Turn off stone-walling
  -X filename  Write telemetry file. Contains lines with (offset reclen compute_time) in ascii
  -y #  Set minimum record size (in Kbytes) for auto mode (or #m or #g)
  -Y filename  Read telemetry file. Contains lines with (offset reclen compute_time) in ascii
  -z  Used in conjunction with -a to test all possible record sizes
  -Z  Enable mixing of mmap I/O and file I/O
  -+K  Sony special. Manual control of test 8.
  -+m Cluster_filename  Enable Cluster testing
  -+d  File I/O diagnostic mode. (To troubleshoot a broken file I/O subsystem)
  -+u  Enable CPU utilization output (Experimental)
  -+x #  Multiplier to use for incrementing file and record sizes
  -+p #  Percentage of mix to be reads
  -+r  Enable O_RSYNC|O_SYNC for all testing.
  -+t  Enable network performance test. Requires -+m
  -+n  No retests selected.
  -+k  Use constant aggregate data set size.
  -+q  Delay in seconds between tests.
  -+l  Enable record locking mode.
  -+L  Enable record locking mode, with shared file.
  -+B  Sequential mixed workload.
  -+D  Enable O_DSYNC mode.
  -+A #  Enable madvise. 0 = normal, 1=random, 2=sequential 3=dontneed, 4=willneed
  -+V  Enable shared file. No locking.
  -+X  Enable short circuit mode for filesystem testing ONLY
       ALL Results are NOT valid in this mode.
  -+Z  Enable old data set compatibility mode. WARNING.. Published
       hacks may invalidate these results and generate bogus, high
       values for results.
  -+w ##  Percent of dedup-able data in buffers.
  -+y ##  Percent of dedup-able within & across files in buffers.
  -+C ##  Percent of dedup-able within & not across files in buffers.
To test the filesystem performance of the EMC, old NFS and new NFS, I went ahead and made the following test mount directories:
mkdir /test
mkdir /test/emc
mkdir /test/nfs11
mkdir /test/nfs1
mkdir /test/nfs12
mkdir /test/nfs2
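If the directories need to be recreated on more than one test VM, the same layout can be produced in a repeatable way. This is only a sketch: the function name make_test_dirs is hypothetical, and it relies on mkdir -p, which creates missing parent directories and does not complain if a directory already exists.

```sh
#!/bin/sh
# make_test_dirs ROOT
# Creates one mount directory per storage target under ROOT.
# Safe to run repeatedly because mkdir -p tolerates existing directories.
make_test_dirs() {
    root=$1
    for target in emc nfs1 nfs2 nfs11 nfs12; do
        mkdir -p "$root/$target" || return 1
    done
}

# Example: make_test_dirs /test
```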
This should give us at least three specific mount directories:
- 1. EMC - Test the EMC speed as a Benchmark for the other two. Mount that share up on /test/emc
mount <emcstuff> /test/emc
- 2. NFS1 - Test the "old" SAN speed and compare it to the "new" SAN. Mount that share up on /test/nfs1. Optional: Mount a share from NFS2 up in /test/nfs2.
mount 192.168.44.230:/export/es-test /test/test2
- 3. NFS11 - Test the "new" SAN speed and compare it to the other statistics. Mount that share up on /test/nfs11. This is a cluster, so you should mount only the VIP shared by the two cluster hosts.
mount 192.168.55.56:/zfstest /test/nfs11
mount 192.168.99.83:/zfstest /test/nfs11
On BSD servers, use this command to mount up NFS shares from the Solaris Cluster, or RPC Mounter will fail:
mount_nfs -o rw,tcp,intr,soft,bg 192.168.55.56:/zfstest /test/nfs11
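If a mount fails silently, iozone will happily benchmark the local disk under the empty directory instead of the share. A quick check before each run can guard against that. The sketch below is an assumption on my part and not part of the original procedure: the function name is_nfs_mounted is hypothetical, and the default of /proc/mounts only exists on Linux (on the BSD guests you would need to inspect the output of mount instead).

```sh
#!/bin/sh
# is_nfs_mounted DIR [MOUNTS_FILE]
# Succeeds (exit 0) if DIR is listed as an NFS mount point in MOUNTS_FILE.
# MOUNTS_FILE defaults to /proc/mounts, which is Linux-specific.
# Each line there is: device mountpoint fstype options dump pass
is_nfs_mounted() {
    dir=$1
    mounts=${2:-/proc/mounts}
    awk -v dir="$dir" '
        $2 == dir && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "$mounts"
}

# Example:
# is_nfs_mounted /test/nfs11 || echo "/test/nfs11 is not an NFS mount" >&2
```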
Test #1: General Performance
time /opt/iozone/bin/iozone -R -a -g 1G | tee -a /tmp/iozone-nfs1-1.txt &
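The line above runs iozone in automatic mode, which steps through the read and write tests over a range of record and file sizes up to the 1 GB maximum set with -g, and prints the results as an Excel-compatible table. No -f option is given, so iozone writes its temporary file to the current working directory; cd into the mount you want to test (for example /test/nfs1) before running it. The output is then piped through tee to a text file on the guest (the server you are running iozone from), which we will later use to generate some graphs.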
-R = Excel Spreadsheet
-a = Automatic mode
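-g = Set the maximum file size for automatic mode. Ideally this is twice your system's main memory size so that caching cannot hold the whole file; that works out to 4G here, although the command above is set to 1G.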
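The "twice main memory" value for -g can be worked out from the guest itself instead of being typed by hand. The sketch below assumes a Linux guest, where /proc/meminfo reports MemTotal in kB; the function name iozone_max_kb is hypothetical. The result is in Kbytes, which is the unit -g expects when no suffix is given.

```sh
#!/bin/sh
# iozone_max_kb [MEMINFO_FILE]
# Prints twice the MemTotal value (in kB) found in MEMINFO_FILE.
# MEMINFO_FILE defaults to /proc/meminfo, which is Linux-specific.
iozone_max_kb() {
    meminfo=${1:-/proc/meminfo}
    awk '$1 == "MemTotal:" { printf "%.0f\n", $2 * 2; exit }' "$meminfo"
}

# Example:
# /opt/iozone/bin/iozone -R -a -g "$(iozone_max_kb)"
```

For a guest reporting MemTotal of 2097152 kB (2 GB), this prints 4194304, which is the 4G figure mentioned above.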
Test #2: NFS Specific Performance
/opt/iozone/bin/iozone -R -a -c
------------------From Another Site--------------------------
/opt/iozone/bin/iozone -R -r 4 -s 60m -l 1 -u 50 -i 0 -i 1 -i 8 -+p 60 -+m nodelist -b excel_noflush.wks -C -O | tee -a /tmp/iozone-nfs1-2.txt &
--------------------------------------------------------------
The One I am using:
iozone -Rac -g 1G -U /test/nfs11 -f /test/nfs11/iozone3 -b /tmp/iozone-nfs11-wload-1.xls
Copy the .XLS spreadsheet onto your workstation and try to make heads or tails of the data.
Testing with Bonnie
I ran some tests using another app called Bonnie, which provides simpler stats on disk I/O performance.
On each server:
bonnie -d <directory> -s <size> -m <machine name>
I used five different sets of mount options to determine whether or not the performance gain was worth the effort:
FreeBSD (mount_nfs flags):
Default = "-o -c"
EO = rw,async,-d,-U,-3,-s,-i,noatime,-c
EO4 = rw,async,-r=4098,-w=4098,-d,-U,-3,-s,-i,noatime,-c
EO8 = rw,async,-r=8196,-w=8196,-d,-U,-3,-s,-i,noatime,-c
EO16 = rw,async,-r=16392,-w=16392,-d,-U,-3,-s,-i,noatime,-c
CentOS (Linux mount options):
Default = "rw"
EO = rw,async,noatime,nfsvers=3,udp,soft,intr
EO4 = rw,rsize=4098,wsize=4098,async,noatime,nfsvers=3,udp,soft,intr
EO8 = rw,rsize=8196,wsize=8196,async,noatime,nfsvers=3,udp,soft,intr
EO16 = rw,rsize=16392,wsize=16392,async,noatime,nfsvers=3,udp,soft,intr
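To avoid typos when cycling through the option sets, the mount commands can be generated instead of typed. The sketch below only prints the commands for review and does not run them. It covers the Linux-style option strings listed above, with the rsize/wsize values copied as given; the function name print_mount_plan is hypothetical, and the server and mount point in the example are the ones used earlier on this page.

```sh
#!/bin/sh
# print_mount_plan SERVER_EXPORT MOUNT_POINT
# Prints (does not execute) one mount command per option set:
# Default, EO, EO4, EO8, EO16, in that order.
print_mount_plan() {
    src=$1
    dir=$2
    common="async,noatime,nfsvers=3,udp,soft,intr"
    echo "mount -t nfs -o rw $src $dir"             # Default
    echo "mount -t nfs -o rw,$common $src $dir"     # EO
    for size in 4098 8196 16392; do                 # EO4, EO8, EO16
        echo "mount -t nfs -o rw,rsize=$size,wsize=$size,$common $src $dir"
    done
}

# Example:
# print_mount_plan 192.168.55.56:/zfstest /test/nfs11
```

Between runs, unmount the share, mount it with the next printed command, and rerun bonnie against the same directory so the results stay comparable.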
I have attached a spreadsheet with the results of my findings to this wiki: