In my previous post I set up Oracle 12c on both 8 * X25-E in RAID 0 and SSD 910 disks. Following on from other tests, I wanted to see how they would perform under an Oracle workload. We had previously noted that some Oracle tests suggested mixed results; however, perhaps neither the test nor the configuration was the best choice for an Oracle environment. I installed HammerDB on a client system, as we know it gives consistent results, and created an OLTP database ready to test. I ran a large number of users to drive up the transaction rate on the 2 socket E5 system and captured an AWR report. I ran the test multiple times to ensure consistency, and this is what I saw for the redo and transactional throughput.
| 8 * X25-E | Per Second | Per Transaction |
|---|---|---|
| Redo size (bytes): | 93,789,841.7 | 5,327.5 |
| Transactions: | 17,605.0 | |

| SSD 910 | Per Second | Per Transaction |
|---|---|---|
| Redo size (bytes): | 93,240,125.3 | 5,318.7 |
| Transactions: | 17,530.7 | |
So the 8 * X25-E in RAID 0 with all cache enabled did 1,056,300 TPM (transactions per minute) and the SSD 910 did 1,051,842 TPM, within 0.5% of each other. I know the system can deliver a higher transaction rate; however, I was interested in capturing all of the statistics and comparing the disk redo performance under the same workload, and for both disk configurations redo was generated at 5.5GB / minute. (Remember that all of the data is on the SSD 910 as well.)
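As a quick sanity check, the per-minute figures follow directly from the AWR per-second values; a minimal Python sketch (all inputs are the AWR numbers quoted above, nothing is measured live):

```python
def tpm(transactions_per_second: float) -> int:
    """Convert an AWR 'Transactions per second' figure to TPM."""
    return round(transactions_per_second * 60)

def redo_gb_per_min(redo_bytes_per_second: float) -> float:
    """Convert an AWR 'Redo size (bytes) per second' figure to GB/minute."""
    return redo_bytes_per_second * 60 / 1e9

print(tpm(17605.0))                            # X25-E: 1056300
print(tpm(17530.7))                            # SSD 910: 1051842
print(round(redo_gb_per_min(93789841.7), 2))   # X25-E redo rate in GB/min
```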
Looking at the log writer performance from the AWR, the Reqs per sec (equivalent to IOPS) are 10,559 for the SSD 910 and 13,097 for the X25-E RAID configuration, so we know there is much more potential for write performance with the SSD 910, which is rated at up to 75,000 write IOPS.
8 * X25-E

| Function Name | Writes: Data | Reqs per sec | Data per sec | Waits: Count | Avg Tm(ms) |
|---|---|---|---|---|---|
| LGWR | 28.1G | 13097.48 | 95.512M | 2.5M | 0.00 |

SSD 910

| Function Name | Writes: Data | Reqs per sec | Data per sec | Waits: Count | Avg Tm(ms) |
|---|---|---|---|---|---|
| LGWR | 32.9G | 10559.18 | 111.736M | 0 | 0.00 |
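Dividing Data per sec by Reqs per sec gives the approximate average LGWR write size on each configuration; a small sketch, assuming the AWR "M" suffix here means 10^6 bytes (the exact unit convention is an assumption):

```python
def avg_write_bytes(data_per_sec_mb: float, reqs_per_sec: float) -> float:
    """Approximate average bytes per LGWR write request."""
    return data_per_sec_mb * 1e6 / reqs_per_sec

print(round(avg_write_bytes(95.512, 13097.48)))   # X25-E: ~7.3KB per write
print(round(avg_write_bytes(111.736, 10559.18)))  # SSD 910: ~10.6KB per write
```

The larger average write on the SSD 910 is consistent with its 4KB sector size discussed under redo wastage below.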
The top wait events were also similar; however, the log file sync time of 2ms appeared a little high.
8 * X25-E

| Event | Waits | Total Wait Time (sec) | Wait Avg(ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| DB CPU | | 9032.9 | | 55.3 | |
| log file sync | 2,980,901 | 4620.3 | 2 | 28.3 | Commit |

SSD 910

| Event | Waits | Total Wait Time (sec) | Wait Avg(ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| DB CPU | | 8978.2 | | 52.4 | |
| log file sync | 3,621,973 | 5664 | 2 | 33.0 | Commit |
However, the AWR report showed CPU busy at over 98%, and checking the log file parallel write time can help determine the time actually spent writing to disk. In this case, for both configurations the disk was not a significant component of the log file sync; instead, with the CPU so busy, more of the log file sync time is utilised on CPU scheduling.
8 * X25-E

| Event | Waits | %Time -outs | Total Wait Time (s) | Avg wait (ms) | Waits /txn | % bg time |
|---|---|---|---|---|---|---|
| log file parallel write | 2,449,867 | 0 | 174 | 0 | 0.46 | 61527766.78 |

SSD 910

| Event | Waits | %Time -outs | Total Wait Time (s) | Avg wait (ms) | Waits /txn | % bg time |
|---|---|---|---|---|---|---|
| log file parallel write | 2,055,279 | 0 | 208 | 0 | 0.39 | 80190288.85 |
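Comparing total log file parallel write time against total log file sync time from the tables above shows how small the disk component actually is; a minimal Python check:

```python
def disk_fraction_pct(parallel_write_secs: float, sync_secs: float) -> float:
    """Percentage of 'log file sync' wait time spent in the actual
    disk write ('log file parallel write'), per the AWR totals."""
    return parallel_write_secs / sync_secs * 100

print(round(disk_fraction_pct(174, 4620.3), 1))  # X25-E: ~3.8% of sync time
print(round(disk_fraction_pct(208, 5664), 1))    # SSD 910: ~3.7% of sync time
```

In other words, on both configurations under 4% of the log file sync wait was spent writing to disk; the rest is attributable to the heavily loaded CPU.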
Taking a brief snapshot of the iostat data also confirms that for both configurations we could drive redo throughput much higher: we are saturating CPU before disk.
8 * X25-E
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdf 0.00 0.00 0.00 15137.67 0.00 96.30 13.03 0.64 0.04 0.02 37.8
SSD 910
avg-cpu: %user %nice %system %iowait %steal %idle
86.23 0.00 8.44 0.63 0.00 4.70
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
sdc 0.00 0.00 25.67 2558.67 0.40 27.67 22.24 0.19 0.07 0.05 13.77
sdb 0.00 0.00 17.33 2578.67 0.27 27.79 22.14 0.18 0.07 0.05 13.47
sda 0.00 0.00 18.67 2574.67 0.28 27.67 22.08 0.18 0.07 0.05 13.30
sdd 0.00 0.00 21.00 2551.33 0.33 27.92 22.49 0.19 0.07 0.05 13.87
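Summing the per-device iostat figures for the four SSD 910 devices above also cross-checks reasonably well against the AWR LGWR totals (111.736M data per sec and 10,559 reqs per sec):

```python
# Per-device iostat values for sdc, sdb, sda, sdd, copied from the snapshot above.
wmb_per_sec = [27.67, 27.79, 27.67, 27.92]             # wMB/s
writes_per_sec = [2558.67, 2578.67, 2574.67, 2551.33]  # w/s

print(round(sum(wmb_per_sec), 2))     # aggregate write MB/s across the array
print(round(sum(writes_per_sec), 2))  # aggregate write requests/s
```

The small gap between the iostat write rate and the AWR request rate is expected, since the iostat sample covers a brief window rather than the whole AWR interval.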
One other area of note in the AWR was that redo wastage was higher for the SSD 910. Oracle aligns redo with the chosen disk sector size, in this case 512 bytes for the X25-E and 4KB for the SSD 910; the log writer never re-reads a block that is not completely filled and always begins writing at a block boundary. With a larger sector and redo block size we see more wastage; however, this is not of great concern, as we have already seen that the transaction throughput is almost the same.
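To make the wastage mechanics concrete, here is a small illustration: each write is padded up to a whole number of redo blocks, so a larger block size wastes more of the final, partially filled block. The payload size below is hypothetical, purely for demonstration:

```python
import math

def wastage(payload_bytes: int, block_size: int) -> int:
    """Bytes of padding when a redo payload is rounded up to whole blocks."""
    blocks = math.ceil(payload_bytes / block_size)
    return blocks * block_size - payload_bytes

payload = 10000  # hypothetical redo payload for a single LGWR write
print(wastage(payload, 512))   # 512-byte redo blocks (X25-E): 240 bytes wasted
print(wastage(payload, 4096))  # 4KB redo blocks (SSD 910): 2288 bytes wasted
```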
Finally, the new performance view V$LGWRIO_OUTLIER in 12c allowed me to check whether any of the I/O requests were taking more than 500ms to complete; surprisingly, there were a few entries in here for the X25-E configuration but none for the SSD 910.
SQL> /
IO_SIZE WAIT_EVENT
---------- ----------------------------------------------------------------
FILE_NAME
--------------------------------------------------------------------------------
IO_LATENCY
----------
1024 log file parallel write
+X25E/SBDB1/ONLINELOG/group_3.256.821098037
620735830
10 log file parallel write
+X25E/SBDB1/ONLINELOG/group_3.256.821098037
622271590
...
Is it worth investigating the X25-E outliers? No; as we have noted before, this configuration was brought out of retirement for comparison and is an EOL product.
So, in summary, should you continue to put your Oracle redo (and data) on SSDs? Yes, absolutely. Current products at the time of writing, such as the SSD 910, raise the bar sufficiently for Oracle redo whatever your throughput demands, and if you check the product brief you can also see that this product, along with current data center products such as the DC S3700, comes with the Enhanced Power Loss Data Protection feature.
The post Comparing Performance of Oracle Redo on Solid State Disks (SSDs) appeared first on Blogs@Intel.