Unexpectedly poor performance (20MBps) when writing to spindles and SSDs
Added by Matt S about 1 year ago
Hi, glad to see there's a dedicated performance forum recently added here. In that spirit, I'm hoping to get some advice on what might be wrong with my Nexenta setup. I've been doing a lot of troubleshooting on this system as a side project for months, and I have little to show for it.
First, here are the system's vital stats:
Server: HP ProLiant N36L MicroServer, 8GB ECC RAM
Boot Disk: Seagate 320GB, SATA
Data Disks (RAID-Z2): 4x500GB, SATA, various vendors
Alternate Data Disks: 2xCrucial 64GB SATA II SSD
All data disks attached to Intel SASUC8I flashed with IT firmware
I have found over numerous tests that I'm unable to get good performance out of this system as it currently stands -- frequently 20-25MB/s under both iSCSI and scp copies, on a direct gigabit link between the NAS appliance and another machine. This is in stark contrast with test results produced by bonnie++, showing the 4x500GB RAID-Z2 spindle, as well as a single SSD, achieving much higher "native" speeds within the OS:
4x500GB spindles RAID-Z2:
Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
RAID-Z2         16G 47092  73 90524  33 51425  22 51952  97 135622 22 573.4   6
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 17913  99 +++++ +++ 30935  99 13371  99 +++++ +++ 29204  99
Single SSD (no zvols assigned to this device):
Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ssd             16G 58621  89 228767 73 145331 51 52318  91 295407 35  5647  32
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 17185  99 +++++ +++ 29919 100 15497  99 +++++ +++ 24514  90
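To put the two bonnie++ runs on the same scale as the 20-25MB/s network copies, the block-I/O columns can be converted to MB/s. This is only a rough sketch: it assumes bonnie++'s "K" means 1024 bytes, and the dictionary labels, function names, and the 20 MB/s baseline are illustrative choices, not anything bonnie++ produces.

```python
# Convert bonnie++ block-I/O figures (K/sec) to MB/s for comparison with
# the 20-25 MB/s observed over the network.
# Assumption: bonnie++ "K" is 1024 bytes.

# Block write / block read columns copied from the two result tables.
BONNIE_BLOCK_KPS = {
    "RAID-Z2 write": 90524,
    "RAID-Z2 read": 135622,
    "SSD write": 228767,
    "SSD read": 295407,
}

GIGABIT_MBPS = 1000 / 8  # 125 MB/s raw line rate, before protocol overhead


def kps_to_mbps(kps: float) -> float:
    """Convert KiB/s to decimal MB/s."""
    return kps * 1024 / 1_000_000


def summarize(observed_mbps: float = 20.0) -> dict:
    """Return {label: (MB/s, ratio to the observed network copy speed)}."""
    out = {}
    for label, kps in BONNIE_BLOCK_KPS.items():
        mbps = kps_to_mbps(kps)
        out[label] = (round(mbps, 1), round(mbps / observed_mbps, 1))
    return out


if __name__ == "__main__":
    for label, (mbps, ratio) in summarize().items():
        print(f"{label:14s} {mbps:6.1f} MB/s  ({ratio}x the observed 20 MB/s)")
    print(f"Gigabit line rate: {GIGABIT_MBPS:.0f} MB/s")
```

Under that assumption, the RAID-Z2 block write works out to roughly 93 MB/s and the SSD block write to roughly 234 MB/s, several times the speed seen over iSCSI and scp.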
At first I was sure this was due to iSCSI implementation issues, but I'm seeing similar low performance using the scp method as well, even to a single SSD (i.e. no extra overhead from parity calculation). Default compression and sync request values for new dataset creation and for zvol creation were used. Default zvol block size was used. Drive write cache was enabled.
Adding another SSD as ZIL doesn't seem to help much at all. Playing with compression settings helps a bit, but still gets nowhere near the "native" performance that bonnie++ reports is possible. It is unclear to me if both the dataset and the zvol used for iSCSI should have compression enabled or not.
I should also note that when copying to the spindle array (either via iSCSI or scp), I can hear that there are periods where the disks are quiet, where my machine's copy window indicates that the transfer is proceeding. When the disks are written out to, though (as evidenced by hearing them churn), the machine indicates that no copy progress is made during this time -- effectively, writing to spindle seems to impede any data transfer between the machine and the NAS. This pattern occurs periodically and regularly throughout a file copy.
At this point I'm utterly stumped -- any advice is sorely appreciated.
Replies
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Peter Lu about 1 year ago
Can you post the results of zpool status -D poolname and zpool list?
Why do you need a RAID-Z2 with four drives? You would double your write performance by switching to striping across two mirrored pairs, I think. If I were in your situation, I'd start over, and create a new zpool with something like:
zpool create poolname mirror disk1 disk2
zpool add poolname mirror disk3 disk4
Then re-run your benchmarks. I'd expect that to give you much better write performance.
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Dan Swartzendruber about 1 year ago
I don't think that's correct. For a raid10, there are effectively two spindles for writes, so I think that's a wash? The big win for raid10 is reads. Anyway, an advantage of 4-drive raidz2 is allowing any two drives to fail. With raid10, if a drive fails the chance of a second failure killing you is 33%.
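The 33% figure can be checked by enumerating every two-disk failure in a four-disk pool. The sketch below is a simplified model that assumes all disks are equally likely to fail; the disk labels and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical disk labels; the striped mirrors pair (d1, d2) and (d3, d4).
DISKS = ("d1", "d2", "d3", "d4")
MIRRORS = (frozenset({"d1", "d2"}), frozenset({"d3", "d4"}))


def striped_mirrors_survive(failed: frozenset) -> bool:
    """The pool survives as long as no mirror pair has lost both members."""
    return not any(pair <= failed for pair in MIRRORS)


def raidz2_survives(failed: frozenset) -> bool:
    """A single RAID-Z2 vdev tolerates any two failed disks."""
    return len(failed) <= 2


def fatal_fraction(survives) -> float:
    """Fraction of two-disk failure combinations that lose the pool."""
    pairs = [frozenset(c) for c in combinations(DISKS, 2)]
    fatal = sum(1 for p in pairs if not survives(p))
    return fatal / len(pairs)


if __name__ == "__main__":
    print("striped mirrors:", fatal_fraction(striped_mirrors_survive))
    print("raidz2:         ", fatal_fraction(raidz2_survives))
```

Of the six possible two-disk failures, two take out a whole mirror pair, which is the one-in-three chance quoted above; none of them are fatal to the RAID-Z2 layout.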
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Peter Lu about 1 year ago
Dan: I agree with you that for four drives in mirrored pairs, you get two spindles of write performance. But I think you only get one for a RAIDZ or RAIDZ2, because it has to distribute all of the parity information over all of the drives. It's my impression, and if I'm wrong correct me, that the reason that RAIDZ and RAIDZ2 aren't used when performance matters, is that, no matter how many drives, the write performance is limited to basically that of a single drive...
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Dan Swartzendruber about 1 year ago
In general yes, but if you are doing, say, a streaming write, you do in fact use all N data spindles. Try it sometime. Create both kinds of pools, then write, say, an 8GB file from /dev/zero, then copy it back to /dev/null. What I saw was raidZ faster than raid10 for writes but 1/2 the speed for reads.
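The streaming test described above can be approximated with a small script instead of a copy from /dev/zero. This is a hedged sketch, not the exact procedure used in the post: the function names, the 1 MiB chunk size, and the target path are assumptions. Two caveats apply: an all-zero file compresses to almost nothing if compression is enabled on the dataset, and a read-back of a file smaller than RAM may be served from cache rather than from disk.

```python
import os
import time

CHUNK = 1024 * 1024  # 1 MiB per write call (arbitrary choice)


def stream_write(path: str, total_bytes: int) -> float:
    """Write total_bytes of zeros sequentially; return MB/s including fsync."""
    buf = bytes(CHUNK)
    start = time.perf_counter()
    with open(path, "wb") as f:
        remaining = total_bytes
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(buf[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the data has actually been flushed
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1_000_000


def stream_read(path: str) -> float:
    """Read the file back sequentially and discard it; return MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / elapsed / 1_000_000
```

A run might look like `stream_write("/volumes/tank-spindle/testfile", 8 * 1024**3)` followed by `stream_read(...)` on the same path, repeated on each pool layout; the path here is a placeholder for wherever the pool is mounted.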
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Peter Lu about 1 year ago
Agreed. In support of your point, I assume that if you did a zfs send / receive stream operation, it would also be faster on the RAIDZ than the striped mirrors... but for iSCSI, I'd still go with the mirrors. Also, the original box has only 8 GB of RAM, so that may also be a bottleneck...
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Florian Haake about 1 year ago
Have you tested the configuration with two mirrors? If so, how fast is it?
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Matt S about 1 year ago
Peter Lu wrote:
Can you post the results of zpool status -D poolname and zpool list?
Why do you need a RAID-Z2 with four drives? You would double your write performance by switching to striping across two mirrored pairs, I think. If I were in your situation, I'd start over, and create a new zpool with something like:
[...]
Then re-run your benchmarks. I'd expect that to give you much better write performance.
# zpool status -D tank-spindle
  pool: tank-spindle
 state: ONLINE
  scan: none requested
config:
        NAME            STATE     READ WRITE CKSUM
        tank-spindle    ONLINE       0     0     0
          raidz2-0      ONLINE       0     0     0
            c0t4d0      ONLINE       0     0     0
            c0t5d0      ONLINE       0     0     0
            c0t6d0      ONLINE       0     0     0
            c0t7d0      ONLINE       0     0     0
errors: No known data errors
# zpool status -D tank-ssd
  pool: tank-ssd
 state: ONLINE
  scan: none requested
config:
        NAME        STATE     READ WRITE CKSUM
        tank-ssd    ONLINE       0     0     0
          c0t0d0    ONLINE       0     0     0
errors: No known data errors
--------------
# zpool list
NAME           SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
syspool        148G  7.22G   141G   4%  1.00x  ONLINE  -
tank-spindle  1.81T   284K  1.81T   0%  1.00x  ONLINE  -
tank-ssd      59.5G   118K  59.5G   0%  1.00x  ONLINE  -
I am using the 4 drives in RAID-Z2 configuration as I am concerned with MTBF / MTTDL figures for this dataset down the road. At 500GB per drive this is not currently an issue, but I intend to upgrade-in-place with larger drives in the future, and rebuild time for a failed disk will eventually mean that striped mirrors or RAID-Z1 in a 4-disk set could prove to be insufficient protection.
That said, RAID-Z2 vs. striped mirrors seems to be rather beside the point -- as far as I can tell, there is no good reason for me to be seeing write speeds as low as 20MBps when writing out to a dataset comprised of a single SSD. Both the spindle and SSD sets were created with default compression and sync values under Nexenta using the GUI. What could be going wrong here?
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Florian Haake about 1 year ago
Maybe a mirror is a lot faster because the CPU isn't stressed as much as it is in a raid-z. A "simple" copy from one disk to the other is easier than calculating parity. The N36L/N40L don't have such a fast CPU...
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Matt S about 1 year ago
Florian Haake wrote:
Maybe a mirror is a lot faster because the CPU isn't stressed as much as it is in a raid-z. A "simple" copy from one disk to the other is easier than calculating parity. The N36L/N40L don't have such a fast CPU...
That's exactly the problem, though -- neither copy is faster than the other. Doing a simple scp to the RAID-Z2 array and the single SSD both clock in at roughly the same speed, 20-25MBps, though the computational load for writing to a RAID-Z2 set should be substantially higher than the load for writing to a single SSD. A mirrored setup should not be appreciably faster than writing to a fast single disk, in this case my MLC SSD. Writing to a striped array, or a striped array of mirrors, should be faster, but that's not an apples-to-apples comparison anymore.
Given this finding, I'm inclined to think that this is not a CPU-bound issue. Again, bonnie++ is able to turn in much higher throughput on disk, with moderate CPU utilization (though as I understand it, the bonnie++ CPU load estimate is a "back of the hand" calculation of sorts).
If there are any commands that someone can suggest I run to monitor CPU load and other properties during the copy, I'd be happy to try them.
RE: Unexpectedly poor performance (20MBps) when writing to spindles and SSDs - Added by Marcus S about 1 year ago
With SCP you're likely limited by your CPU performance. I generally peg my Core i7 at between 40-50MB/s, and it's much beefier than your box. It takes a lot to do encryption, and it's not multithreaded. I'd suggest trying netcat for a low overhead network copy (man the nc command).
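As a stand-in for the netcat test (this is not nc itself, just the same idea of an unencrypted TCP stream), a plain-socket sender and receiver can show how fast the link moves data when no encryption and no disk are involved. The function names, the 64 KiB chunk size, and the loopback demo are assumptions made for illustration; between two real machines the receiver would listen on the NAS address rather than 127.0.0.1.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send/recv call (arbitrary choice)


def receive_all(server: socket.socket, result: dict) -> None:
    """Accept one connection, discard everything received, record byte count."""
    conn, _ = server.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(CHUNK)
            if not data:
                break
            total += len(data)
    result["bytes"] = total


def send_zeros(host: str, port: int, total_bytes: int) -> float:
    """Send total_bytes of zeros over plain TCP; return elapsed seconds."""
    buf = bytes(CHUNK)
    start = time.perf_counter()
    with socket.create_connection((host, port)) as s:
        remaining = total_bytes
        while remaining > 0:
            n = min(CHUNK, remaining)
            s.sendall(buf[:n])
            remaining -= n
    return time.perf_counter() - start


def loopback_demo(total_bytes: int = 8 * 1024 * 1024) -> tuple:
    """Run receiver and sender on 127.0.0.1; return (bytes received, MB/s)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    result = {}
    t = threading.Thread(target=receive_all, args=(server, result))
    t.start()
    elapsed = send_zeros("127.0.0.1", port, total_bytes)
    t.join()
    server.close()
    return result["bytes"], total_bytes / elapsed / 1_000_000


if __name__ == "__main__":
    received, rate = loopback_demo()
    print(f"received {received} bytes at about {rate:.0f} MB/s")
```

If a transfer like this runs close to gigabit line rate between the two machines while scp stays at 20-25MB/s, that would point at encryption overhead rather than at the pool or the network.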
I'm actually on my way to testing the same thing. It's about more than just being able to lose ANY two disks: with 2-4TB disks, the chances of being able to rebuild from a single good copy of data are getting pretty scary. This can be offset somewhat by frequent scrubs, but the idea of still having redundancy when a disk is lost, and having two copies to rebuild from, is very appealing if the performance is acceptable.