Read performance bottleneck
Added by Raj Jethnani about 1 year ago
Hi all, I've been struggling with a read performance issue that's been going on for close to a year now. I've tried various troubleshooting methodologies and still have trouble narrowing down the cause. To give an overview of the environment:
- Asus P8H67-M Pro motherboard (socket 1155)
- Intel Core i3-2100
- 8GB Corsair DDR3 1333 MHz memory
- 2 x BR10i LSI cards flashed with IT firmware for JBOD setup
- 8 x 1TB Samsung HD103SJ connected to one of the BR10i cards (2 x 4 raidz1 pool)
- 4 x 2TB Samsung EcoGreen F4 HD204UI on the other BR10i card (RAID 10 mirrored ZFS pool)
The OS is sitting on a 5400 RPM WD laptop drive connected to one of the BR10i breakout SATA connections.
Performing a dd test from the console to the Samsung 1TB drives:
dd if=/dev/zero of=testfile bs=1M count=10240
10737418240 bytes (11 GB) copied, 22.5374 seconds, 476 MB/s
dd if=testfile of=/dev/null
10737418240 bytes (11 GB) copied, 97.7537 seconds, 110 MB/s ---> what the heck?!
Samsung 2TB:
dd if=/dev/zero of=testfile bs=1M count=10240
10737418240 bytes (11 GB) copied, 129.264 seconds, 83.1 MB/s
dd if=testfile of=/dev/null
10737418240 bytes (11 GB) copied, 95.5551 seconds, 112 MB/s
The 2TB pool behaves as expected (write < read), but both pools seem bottlenecked at around 110 MB/s on reads. I've even tried an SSD and saw the same behavior, with read speeds bottlenecked.
The SAN is also attached to 2 ESXi whiteboxes, and based on the IO analyzer VM running there, I'm seeing the same behavior in the hosts' IOmeter tests.
Any suggestions would be really appreciated. I can post any additional data needed. Thanks again!!
Replies
RE: Read performance bottleneck - Added by Linda Kateley about 1 year ago
Do you have your ssd added to the pool as l2arc?
RE: Read performance bottleneck - Added by Jeff Gibson about 1 year ago
What does
dd if=testfile of=/dev/null bs=1M
give? I don't know if dd reading has a way to decide what blocksize to read at and may be defaulting to a low value.
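For context, a rough sketch of why the block size could matter here. POSIX dd falls back to 512-byte blocks when bs= is not given, so reading the 10737418240-byte test file from the original post takes millions of small reads instead of a few thousand large ones. The helper name `read_calls` is made up for illustration, and the one-read-per-block model is a simplification.

```python
# Rough count of read() calls dd needs to cover a file.
# Assumption: one read per block, and dd uses 512-byte blocks
# when bs= is omitted (the POSIX default).

FILE_BYTES = 10737418240   # size of testfile in the dd output of the original post
DEFAULT_BS = 512           # dd default block size, in bytes
ONE_MIB = 1024 * 1024      # block size requested by bs=1M


def read_calls(file_bytes: int, block_bytes: int) -> int:
    """Reads needed to cover the file, rounding the last block up."""
    return -(-file_bytes // block_bytes)


default_calls = read_calls(FILE_BYTES, DEFAULT_BS)
mib_calls = read_calls(FILE_BYTES, ONE_MIB)

print(f"bs=512 (default): {default_calls} reads")
print(f"bs=1M:            {mib_calls} reads")
print(f"ratio:            {default_calls // mib_calls}x")
```

Each small read carries its own system call overhead, which is one plausible reason a default-block-size read test can understate what the pool can deliver.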
RE: Read performance bottleneck - Added by Linda Kateley about 1 year ago
can we see what your cache hit rate looks like?
nmc@myhost:/$ show performance arc
RE: Read performance bottleneck - Added by Linda Kateley about 1 year ago
Also, do you have jumbo frames enabled? We have seen this in places where the switches and clients weren't all configured the same.
RE: Read performance bottleneck - Added by Jeff Gibson about 1 year ago
Are you still working on this Raj?
I think this is a local issue first before we troubleshoot network. What does
echo ::interrupts -d | mdb -k
provide? Can you try with a single controller in the host, since your mobo might have some type of PCIe issue (reaching on this one, I know)?
RE: Read performance bottleneck - Added by Raj Jethnani about 1 year ago
Sorry everyone, thanks for chiming in. I've been swallowed up by work and will have some time to post more information on this during the weekend. I'll run the commands and post the results by this weekend and thanks for the assistance.
I know MTU is set at 1500 throughout the environment.
RE: Read performance bottleneck - Added by Raj Jethnani about 1 year ago
@Jeff Gibson, wow that actually brought some interesting results. Now they're in line with what I expected from the array side. Looks like this testing scenario was an ID10T problem (should've known better than to use DD for performance testing).
1TB drives:
root@nexentastor:/volumes/morpheus/morph# dd if=/dev/zero of=testfile bs=1M count=10240
10737418240 bytes (11 GB) copied, 23.619 seconds, 455 MB/s
root@nexentastor:/volumes/morpheus/morph# dd if=testfile of=/dev/null
10737418240 bytes (11 GB) copied, 61.3164 seconds, 175 MB/s -->WTF?!
root@nexentastor:/volumes/morpheus/morph# dd if=testfile of=/dev/null bs=1M
10737418240 bytes (11 GB) copied, 19.0273 seconds, 564 MB/s -->looks great!!
2TB drives:
root@nexentastor:/volumes/neo/neo# dd if=/dev/zero of=testfile bs=1M count=10240
10737418240 bytes (11 GB) copied, 59.4525 seconds, 181 MB/s
root@nexentastor:/volumes/neo/neo# dd if=testfile of=/dev/null
10737418240 bytes (11 GB) copied, 60.2635 seconds, 178 MB/s ---> WTF
root@nexentastor:/volumes/neo/neo# dd if=testfile of=/dev/null bs=1M
10737418240 bytes (11 GB) copied, 33.3021 seconds, 322 MB/s --->That's more like it!!!
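As a sanity check on the read figures above, here is a minimal sketch that recomputes the MB/s values from the byte counts and elapsed times dd printed. It assumes dd's "MB" means 10^6 bytes, which matches the numbers it reported; the function and dictionary names are illustrative only.

```python
# Recompute dd's reported read rates from bytes copied and elapsed seconds.
# Assumption: dd's "MB/s" uses 1 MB = 1,000,000 bytes.

TOTAL_BYTES = 10737418240  # bytes copied in every run above


def dd_rate_mb_s(total_bytes: int, seconds: float) -> float:
    """Throughput in MB/s, with 1 MB = 10**6 bytes."""
    return total_bytes / seconds / 1_000_000


# Elapsed seconds for the read runs, copied from the output above.
read_runs = {
    "1TB pool, default bs": 61.3164,
    "1TB pool, bs=1M": 19.0273,
    "2TB pool, default bs": 60.2635,
    "2TB pool, bs=1M": 33.3021,
}

for label, seconds in read_runs.items():
    print(f"{label}: {dd_rate_mb_s(TOTAL_BYTES, seconds):.0f} MB/s")

# How much faster the bs=1M read was on each pool.
print(f"1TB pool speedup: {61.3164 / 19.0273:.2f}x")
print(f"2TB pool speedup: {60.2635 / 33.3021:.2f}x")
```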
RE: Read performance bottleneck - Added by Raj Jethnani about 1 year ago
There are no SSDs added as l2arc or ZIL at this point.
So here's the next step. Based on the results above, the array is behaving as expected with higher reads than writes.
However, what I'm using this array for is still showing the issue where reads are slower than writes. The array was built to connect to 2 VMware ESXi 5.0 Update 1 servers. I'm running a home lab with a couple of whitebox builds for work.
The storage connection is through the iSCSI software adapter on the esxi host with a dedicated intel nic --> netgear 8 port switch -->dedicated intel nic on the nexentastor box.
So each host has a dedicated NIC to the array (2 port Intel NIC, same card). Based on IOmeter testing from a Windows 7 VM with a direct connection to the underlying LUNs via RDM, I'm seeing the following reads and writes:
16K 100% read: 2100 IOs, 33 MB/s throughput
16K 100% write: 3000 IOs, 47 MB/s throughput
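The IOmeter figures are internally consistent, which a quick calculation can confirm: throughput is just IOs per second times the block size. This sketch assumes the "IOs" values are per-second rates, that "16K" means 16 KiB, and that IOmeter's MB is 2^20 bytes; the function name is illustrative.

```python
# Throughput implied by an IO rate and a fixed block size.
# Assumptions: "IOs" are per second, 16K = 16 * 1024 bytes,
# and the reported MB/s uses 1 MB = 2**20 bytes.

BLOCK_BYTES = 16 * 1024


def throughput_mib_s(ios_per_second: int, block_bytes: int = BLOCK_BYTES) -> float:
    """Bytes moved per second, expressed in MiB/s."""
    return ios_per_second * block_bytes / (1024 * 1024)


read_mib_s = throughput_mib_s(2100)
write_mib_s = throughput_mib_s(3000)

print(f"100% read:  {read_mib_s:.1f} MiB/s")
print(f"100% write: {write_mib_s:.1f} MiB/s")
```

At a 16K block size the test is bound by IO rate, so these numbers are not directly comparable to the bs=1M dd results.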
RE: Read performance bottleneck - Added by Raj Jethnani about 1 year ago
Linda, I ran the show performance arc command and the results are shown below. I ran it during an IOmeter test with 100% reads.
Current ARC Size: 2826MB
Min ARC Size (zfs_arc_min): 1396MB
Max ARC Size (zfs_arc_max): 11168MB
Cache hits and misses (total):
  Cache Hits: 66%
  Cache Misses: 11%
Cache hits by type:
  Demand Data: 51%
  Prefetch Data: 20%
  Demand Metadata: 23%
  Prefetch Metadata: 3%
Cache misses by type:
  Demand Data: 28%
  Prefetch Data: 70%
  Demand Metadata: 1%
  Prefetch Metadata: 0%
RE: Read performance bottleneck - Added by Raj Jethnani about 1 year ago
root@nexentastor:/volumes# echo ::interrupts -d | mdb -k
IRQ  Vect IPL Bus Trg Type  CPU Share APIC/INT# Driver Name(s)
4    0xb0 12  ISA Edg Fixed 0   1     0x0/0x4   asy#0
5    0x42 5   ISA Edg Fixed 3   1     0x0/0x5   ecpp#0
9    0x81 9   PCI Lvl Fixed 1   1     0x0/0x9   acpi_wrapper_isr
11   0xd1 14  PCI Lvl Fixed 2   1     0x0/0xb   hpet_isr
16   0x60 6   PCI Lvl Fixed 2   1     0x0/0x10  e1000g#1
17   0x61 6   PCI Lvl Fixed 3   1     0x0/0x11  e1000g#2
19   0x62 6   PCI Lvl Fixed 3   1     0x0/0x13  e1000g#0
23   0x82 9   PCI Lvl Fixed 1   2     0x0/0x17  ehci#1, ehci#0
24   0x40 5   PCI Edg MSI   3   1     -         mpt#0
25   0x41 5   PCI Edg MSI   0   1     -         mpt#1
26   0x83 8   PCI Edg MSI   1   1     -         audiohd#0
32   0x20 2       Edg IPI   all 1     -         cmi_cmci_trap
160  0xa0 0       Edg IPI   all 0     -         poke_cpu
208  0xd0 14      Edg IPI   all 1     -         kcpc_hw_overflow_intr
209  0xd3 14      Edg IPI   all 1     -         cbe_fire
210  0xd4 14      Edg IPI   all 1     -         cbe_fire
240  0xe0 15      Edg IPI   all 1     -         xc_serv
241  0xe1 15      Edg IPI   all 1     -         apic_error_intr
RE: Read performance bottleneck - Added by Raj Jethnani about 1 year ago
To summarize, the performance within the array is now working as expected. The troubleshooting can now move on to networking as the main suspect. I'm going to try to take that 2 port NIC out and put it in a Linux box for some iperf / jperf testing. Thanks for the help everyone.
RE: Read performance bottleneck - Added by Jeff Gibson about 1 year ago
Raj, here is a link to using iperf on your esxi host. Try to do this test without any other network traffic so you can see what the nexenta <-> ESXi host throughput is. Test this in both directions (server on one machine then the other).
RE: Read performance bottleneck - Added by Jeff Gibson about 1 year ago
Another thought: if you're purely testing for throughput on your ESX guests, you should use 1MB or larger block sizes (to match the block size you were using in dd). If you're looking for pure IOPS, use 512-byte 100% random. If you want to estimate "real world" performance, use 8k blocks, 65% read, 60% random, at a queue depth of at least 8, or 32 if you expect a heavily loaded system.
RE: Read performance bottleneck - Added by Jarret Lavallee about 1 year ago
Raj, How are the tests going? Are you getting better performance?
RE: Read performance bottleneck - Added by Raj Jethnani 12 months ago
Jarret, sorry for the late update buddy. I re-ran the cables between the ESX hosts and the array, bypassing the switch completely. I ran iperf across the following combinations (array, VMs running on ESX, external Windows machine), with each of the three taking a turn as server and as client.
In each test, across both of the array's physical NICs, I get ~80 MB/s when the array acts as the server, but when it acts as the client the traffic consistently drops to ~33 MB/s. That is exactly the behavior I was seeing from IOmeter earlier, which limits the issue to reads from the array. It looks like the problem is definitely related to networking on the array. The odd thing is that these are 2 different cards, and I've already swapped out one of them, which didn't change the behaviour.
This now leads me to believe there's something very wrong with the motherboard itself that's affecting PCI cards plugged into it. The RAID cards are plugged into PCI-e slots while the NICs are plugged into regular PCI slots. I'm thinking of putting in a PCI-e NIC and seeing its effect once tested.
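For a sense of scale, here is a back-of-the-envelope sketch comparing the iperf results with two theoretical ceilings. It assumes the board's legacy slots are conventional 32-bit / 33 MHz PCI (typical for desktop boards, not confirmed in this thread) and reads the iperf figures above as MB/s. Protocol overhead is ignored, so real-world limits are lower than these peaks.

```python
# Theoretical ceilings vs. the observed iperf numbers.
# Assumptions: conventional PCI at 32 bits / ~33.33 MHz, shared by every
# device on the bus; gigabit Ethernet at 1 Gbit/s; 1 MB = 10**6 bytes.

PCI_CLOCK_HZ = 33_333_333
PCI_WIDTH_BYTES = 4
GBE_BITS_PER_SECOND = 1_000_000_000

pci_peak_mb_s = PCI_CLOCK_HZ * PCI_WIDTH_BYTES / 1_000_000
gbe_line_mb_s = GBE_BITS_PER_SECOND / 8 / 1_000_000

# Observed iperf throughput from the post above, in MB/s.
observed = {
    "array as iperf server": 80,
    "array as iperf client": 33,
}

print(f"PCI bus peak (shared): {pci_peak_mb_s:.0f} MB/s")
print(f"GbE line rate:         {gbe_line_mb_s:.0f} MB/s")
for label, mb_s in observed.items():
    print(f"{label}: {mb_s} MB/s = {mb_s / gbe_line_mb_s:.0%} of GbE line rate")
```

Even a healthy shared PCI bus leaves little headroom above a single gigabit link, so testing with a PCI-e NIC is a reasonable way to take the PCI slots out of the picture.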
Stay tuned and thanks for all the assistance and recommendations. By the way, I've also upgraded to 3.1.3 and can't wait for 4.0!!
-Raj
RE: Read performance bottleneck - Added by Raj Jethnani 11 months ago
I've just plugged in a PCI-e NIC and ran iperf. It ran perfectly... I can't believe it, it's the PCI slots on the damn motherboard that are the issue. Over 6 months of pain and troubleshooting and it's the damn PCI slots! Note for anyone building a nexentastor box from off-the-shelf parts... TEST EVERY SINGLE COMPONENT INDIVIDUALLY!!!!
Time to buy a new mobo...any recommendations for a replacement for a desktop board that could possibly work with nexentastor? Thanks!!!