Possible iSCSI problem with ESXi?

Added by Dan Swartzendruber over 2 years ago

So, I've been trying to migrate VMs off NFS to iSCSI with 3.1.2 as the backend. The first time, I was copying 2-3 VMs at a time and everything locked up solid. The NexentaStor appliance was itself a VM, so I couldn't get useful info. Since then, I have disabled everything (I think) relating to the acceleration settings, including SCSI UNMAP. I've held off moving the rest of the production VMs until I get my new motherboard so I can unvirtualize the SAN. I did try installing CentOS 6.2 to play with openldap, and while it was running, I saw a number of these messages in the ESXi log:

2012-03-12T23:01:04.744Z cpu6:2141)VSCSI: 2419: handle 8205(vscsi0:0):Completing reset (0 outstanding commands)
2012-03-12T23:20:39.238Z cpu3:128830)VSCSI: 2346: handle 8205(vscsi0:0):Reset request on FSS handle 26477806 (0 outstanding commands)
2012-03-12T23:20:39.238Z cpu6:2141)VSCSI: 2621: handle 8205(vscsi0:0):Reset [Retries: 0/0]
2012-03-12T23:20:39.238Z cpu6:2141)VSCSI: 2419: handle 8205(vscsi0:0):Completing reset (0 outstanding commands)
2012-03-12T23:20:52.144Z cpu4:128830)VSCSI: 2346: handle 8205(vscsi0:0):Reset request on FSS handle 26477806 (0 outstanding commands)
2012-03-12T23:20:52.144Z cpu6:2141)VSCSI: 2621: handle 8205(vscsi0:0):Reset [Retries: 0/0]
2012-03-12T23:20:52.144Z cpu6:2141)VSCSI: 2419: handle 8205(vscsi0:0):Completing reset (0 outstanding commands)
2012-03-12T23:20:52.228Z cpu4:128830)VSCSI: 2346: handle 8205(vscsi0:0):Reset request on FSS handle 26477806 (0 outstanding commands)
2012-03-12T23:20:52.228Z cpu6:2141)VSCSI: 2621: handle 8205(vscsi0:0):Reset [Retries: 0/0]
2012-03-12T23:20:52.228Z cpu6:2141)VSCSI: 2419: handle 8205(vscsi0:0):Completing reset (0 outstanding commands)
2012-03-12T23:58:25.102Z cpu4:128830)NetPort: 1426: disabled port 0x1000016
2012-03-12T23:58:25.102Z cpu4:128834)VSCSI: 6179: handle 8205(vscsi0:0):Destroying Device for world 128830 (pendCom 0)
2012-03-12T23:58:25.628Z cpu7:128830)NetPort: 1239: enabled port 0x1000016 with mac 00:0c:29:e0:68:65
2012-03-12T23:58:25.690Z cpu6:128834)VSCSI: 3620: handle 8206(vscsi0:0):Using sync mode due to sparse disks
2012-03-12T23:58:25.690Z cpu6:128834)VSCSI: 3661: handle 8206(vscsi0:0):Creating Virtual Device for world 128830 (FSS handle 23250164)

I have no idea if these indicate a particular problem or not (google is not real helpful in that respect - LOL). Any thoughts would be appreciated...


Replies

RE: Possible iSCSI problem with ESXi? - Added by Linda Kateley over 2 years ago

I sent your question to the back room, lemme see what they say

RE: Possible iSCSI problem with ESXi? - Added by Dan Swartzendruber over 2 years ago

Thanks. I am thinking it might be a red herring. I moved all of the VMs off the zvol back to the NFS datastore and still occasionally see these, but the pending count is always zero, and nothing bad seems to be happening. I am guessing the guests are doing SCSI bus resets on boot? Anyway, the thing that got me scared about this was when I was earlier moving several VMs from NFS to the iSCSI zvol and everything froze. Since the Nexenta SAN is currently a VM (I am moving it to a separate box later this week), I could not gather useful info. I am speculating it was the SCSI UNMAP bugaboo, but until I can run the SAN separately, I am not willing to take chances...
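
For anyone wanting to check the "pending count is always zero" observation across a longer log, here is a rough Python sketch that tallies VSCSI reset requests per virtual disk and reports the largest outstanding-command count seen. The regex is inferred only from the lines pasted above and the function name summarize_resets is made up for illustration, so treat it as an assumption about the log format rather than anything official.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not a documented format). It matches "Reset request" and "Completing reset"
# entries and captures the "(N outstanding commands)" count.
RESET_RE = re.compile(
    r"VSCSI: \d+: handle (?P<handle>\d+)\((?P<dev>[^)]+)\):"
    r"(?P<event>Reset request|Completing reset).*?"
    r"\((?P<outstanding>\d+) outstanding commands\)"
)

def summarize_resets(log_text):
    """Return (reset requests per device, max outstanding commands seen)."""
    requests = Counter()
    max_outstanding = 0
    for m in RESET_RE.finditer(log_text):
        if m.group("event") == "Reset request":
            requests[m.group("dev")] += 1
        max_outstanding = max(max_outstanding, int(m.group("outstanding")))
    return requests, max_outstanding

# Two lines copied from the original post, used as sample input.
SAMPLE = (
    "2012-03-12T23:20:39.238Z cpu3:128830)VSCSI: 2346: handle 8205(vscsi0:0):"
    "Reset request on FSS handle 26477806 (0 outstanding commands)\n"
    "2012-03-12T23:20:39.238Z cpu6:2141)VSCSI: 2419: handle 8205(vscsi0:0):"
    "Completing reset (0 outstanding commands)\n"
)

if __name__ == "__main__":
    requests, max_outstanding = summarize_resets(SAMPLE)
    for dev, count in requests.items():
        print(f"{dev}: {count} reset request(s)")
    print(f"max outstanding commands at reset: {max_outstanding}")
```

If the maximum stays at zero, that fits the guess that these are harmless guest-initiated resets. A nonzero value would be worth a closer look.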

RE: Possible iSCSI problem with ESXi? - Added by Jeff Gibson over 2 years ago

I just found the patch today that is supposed to help at least some of the iSCSI issues. ESXi500-201112001.zip was just installed on our cluster to bring us up to build 515841 from 469512. You might try that and see if it helps any.

http://www.vmware.com/patchmgr/download.portal

RE: Possible iSCSI problem with ESXi? - Added by Dan Swartzendruber over 2 years ago

Hmmm, I may give that a try, thanks! I'm not doing anything with the current build, since I will be doing a fresh install on the physical box sometime in the next couple of days...
