SNS Users' Forums

zmark

Problems with globalSAN initiator connecting to target under development

Hello,

I am currently developing an open-source iSCSI/SCSI target implementation and I am having some difficulty getting the globalSAN initiator to play well with our target, which was written in-house.

Currently, our target is compatible with the Microsoft Software iSCSI Initiator, as well as Open-iSCSI on Linux. We are attempting to round out our offering by making our software compatible with the globalSAN initiator.

To be more specific about my troubles, I attempted to perform two operations with the globalSAN initiator against our target:

1) Target discovery via SendTargets

2) Manual login to the target via pre-defined IQN.

No security was enabled in either case. Header and data digests were disabled.
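For readers following along, here is a rough sketch of what the text parameters in such a login look like on the wire. RFC 3720 encodes them as key=value pairs, each terminated by a NUL byte, with the data segment padded to a 4-byte boundary. The helper names and the initiator IQN below are made up for illustration; this is not code from our target or from the globalSAN initiator.

```python
def encode_text_params(params):
    """Encode key=value pairs as NUL-terminated strings.

    The result is padded with NUL bytes to a 4-byte boundary, as the data
    segment is on the wire. Note that DataSegmentLength in the PDU header
    counts the unpadded bytes only.
    """
    data = b"".join(f"{k}={v}".encode("utf-8") + b"\x00" for k, v in params.items())
    pad = (-len(data)) % 4
    return data + b"\x00" * pad


def decode_text_params(data):
    """Split a data segment back into a dict, ignoring padding bytes."""
    out = {}
    for item in data.split(b"\x00"):
        if not item:
            continue  # empty chunk produced by a terminator or padding
        key, _, value = item.decode("utf-8").partition("=")
        out[key] = value
    return out


# Parameters matching the setup described above: no security, digests off.
# The IQN is a hypothetical placeholder.
login_params = {
    "InitiatorName": "iqn.2008-12.com.example:initiator",
    "SessionType": "Discovery",
    "HeaderDigest": "None",
    "DataDigest": "None",
}
segment = encode_text_params(login_params)
```

Once a discovery session reaches the full feature phase, the initiator would send `SendTargets=All` in a Text Request using the same encoding.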

Both cases produce the same symptoms. I have packet captures taken with Wireshark that show the following behavior:

1) The initiator sends a login PDU to our target, as expected. The T bit is set to 1, indicating that the initiator would like to enter the full feature phase after this PDU.

2) Our target responds with status SUCCESS along with some operational parameters, and leaves the T bit (indicating the desire to transition to the full feature phase) set to 1.

3) The initiator sends a TCP RST (connection reset) which our target acknowledges, and the connection is reset. A log message appears in syslog ("invalid login response") and a dialog box appears with the message "Could not login to target <iqn> globalSAN error code 120D." I take it that error code 120D is the code for an invalid login response (I was unable to find any documentation of the error codes).
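In case it helps anyone reading a similar capture, below is a minimal sketch of pulling the relevant fields out of a Login Response header. The byte offsets follow the 48-byte basic header segment layout in RFC 3720 (flags in byte 1, TSIH in bytes 14-15, status class and detail in bytes 36-37). The function name and the sample bytes are invented for illustration and are not taken from the actual packet captures.

```python
import struct
from collections import namedtuple

LoginResponse = namedtuple(
    "LoginResponse",
    "opcode transit cont csg nsg tsih status_class status_detail",
)


def parse_login_response(bhs):
    """Decode selected fields from a 48-byte Login Response header."""
    if len(bhs) < 48:
        raise ValueError("a basic header segment is 48 bytes")
    flags = bhs[1]
    (tsih,) = struct.unpack_from(">H", bhs, 14)  # big-endian 16-bit TSIH
    return LoginResponse(
        opcode=bhs[0] & 0x3F,          # 0x23 for a Login Response
        transit=bool(flags & 0x80),    # T bit
        cont=bool(flags & 0x40),       # C bit
        csg=(flags >> 2) & 0x03,       # current stage
        nsg=flags & 0x03,              # next stage (3 = full feature phase)
        tsih=tsih,
        status_class=bhs[36],          # 0 = success
        status_detail=bhs[37],
    )


# Synthetic header: T=1, CSG=1 (operational negotiation), NSG=3, TSIH=0.
sample = bytearray(48)
sample[0] = 0x23
sample[1] = 0x80 | (1 << 2) | 3
resp = parse_login_response(bytes(sample))
```

Printing `resp` for each login response in a capture makes it easy to compare the T bit, stages, and TSIH side by side across initiators.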

The reason I focus so much on the T bit is that by adjusting our implementation to set the T bit to 0 on the login response, the initiator begins sending SCSI commands (as if the login were successful).

Is our target implementation incorrect in setting the T bit to 1 in this login response? I've looked through RFC 3720 extensively on this matter and haven't found any specific wording. The Microsoft, Open-iSCSI, and Solaris initiators do not exhibit this behavior.

I can send along the packet captures if requested.

Thank you,

Zachary Mark

Software Engineer

Cleversafe, Inc.

http://www.cleversafe.org


Any update on this from SNS (or any other protocol experts for that matter)? I have packet captures ready to send.


I'm now getting this 120D error when trying to connect to OpenSolaris build 100...

Any help would be appreciated.

Jeb


I'm seeing the same error, using globalSAN initiator 3.3.0.43 to connect to a shareiscsi ZVOL on OpenSolaris snv_99 x86. Everything seems normal from the target's point of view:

-bash-3.2$ iscsitadm list target -v
Target: data/timemachine/applecore
    iSCSI Name: iqn.1986-03.com.sun:02:076c43a9-b18d-ee5a-d788-ad7c5c55b8e2
    Alias: data/timemachine/applecore
    Connections: 0
    ACL list:
    TPGT list:
    LUN information:
        LUN: 0
            GUID: 0
            VID: SUN
            PID: SOLARIS
            Type: disk
            Size: 120G
            Backing store: /dev/zvol/rdsk/data/timemachine/applecore
            Status: online

-bash-3.2$ zfs get shareiscsi data/timemachine/applecore
NAME                        PROPERTY    VALUE  SOURCE
data/timemachine/applecore  shareiscsi  on     inherited from data/timemachine


As a point of reference (and just for the heck of it) I decided to try using FreeNAS as an intermediary initiator between globalSAN and Solaris. Sure enough, the FreeNAS initiator is able to use the Solaris iSCSI target fine. Then I shared that disk out as a FreeNAS target and was able to connect to it from my laptop using the globalSAN initiator. Sick and twisted, but it works.

Regardless, it's just a fun proof-of-concept. The globalSAN initiator still won't connect with the Solaris target directly.


Seeing that the same problem occurs with the Solaris target, I'm fairly confident that the globalSAN initiator is protocol-incorrect. The Solaris iSCSI platform seems (to me) to be built on pedantic levels of protocol compliance, which is precisely why I enjoy testing with their initiator: it brings out compliance bugs that I never would have caught otherwise. Granted, their target may have been developed by a different team, but the general Sun philosophy comes into play here, and I'm far more confident that Sun's implementation is correct than that of SNS.


Quick update:

I've been able to initiate to OpenSolaris snv_104. I was going to disconnect and recreate the zvol to a larger size for Time Machine testing, but it seems to be hung on disconnect. The globalSAN iSCSI prefs window is just sitting there with the spinning ball.

Update: I also noticed these errors in system.log, just fyi...

Dec 16 11:51:01 Macintosh-4 kernel[0]: SCSIPressurePathManager: Timed out waiting for inactive/error path to become active, loops = 10


The previous problem I reported while disconnecting must have been user-inflicted. I've been running through a number of use cases with my MacBook and it has been running great so far. I think we're going to upgrade our OpenSolaris server up to snv_104 and begin using the globalSAN initiator for all of our Mac clients for Time Machine backups.

I entirely agree with you with regard to how accurate Sun tries to be in terms of standards compliance.

If you presume that SNS has excellent programmers who are perfectly capable of making the globalSAN initiator for Mac OS work flawlessly with Solaris (Solaris 10 U6 10/08), and yet for so long it has not worked, this leaves only one plausible conclusion:

THEY INTEND IT TO BE BROKEN WITH SOLARIS!

Perhaps they are just hiding behind it "being a bug", but in fact they have no desire for Solaris to work with their product.

Either way, whether it is buggy code or intentional dysfunction, I have recommended this software (and SNS generally) to none of my clients.

If this is their outlook concerning this product, one can only imagine that other products would be supported in an equally offhanded manner.

Stuart

I rather doubt this.

There is no "perfect implementation" of iSCSI. Solaris tries, everyone tries -- but the full specification is insanely complex, and insanely annoying. (Note that there are bugs filed against the iSCSI target in OpenSolaris.) Studio Network is a small company (1-50 employees, so their employees probably don't even qualify for COBRA health benefits), and probably has only a few developers on this project. IBM wrote the spec. They had hundreds, if not thousands, of developers available to devote to implementing it *precisely*.

I know I wouldn't want to try to code something as complex as this. The fact that it works with as many iSCSI targets as it does is frankly amazing.

Also, one last thing: "never attribute to malice that which can adequately be explained by lack of resources."

(and http://www.spoke.com/info/c3Gp1bB/StudioNetworkSolutions is where I'm getting my company data.)

Get off your high horse. As a private concern, they have other issues to worry about.

-KyAnHamilton

After a while on the back burner, I resolved the compatibility issues. It turns out it was a bug in our target, which didn't properly set the TSIH on the last login response of the login phase (the issue was discovered when testing with a hardware HBA card). It exposed a bug in the globalSAN initiator as well as in our development target. It explains why setting the T bit on the last login response to 0 allowed the login to continue:

1) Our target was using a hardcoded TSIH of 0 which was correct given that it was the leading login request for a session.

2) The globalSAN protocol validator module looked at the login response PDU, saw the 0 TSIH (which matches what it used to log in) and a T bit of 0, and processed the PDU as valid. The globalSAN initiator did not do a higher-level validation of the PDU because the login proceeded. As far as I can tell, RFC 3720 indicates that the negotiation should continue given that protocol sequence.
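The check that would have flagged this can be sketched roughly as follows. RFC 3720 reserves the TSIH value 0 and says that for a new session the target must generate a non-zero TSIH and return it only in the final login response; other login responses should echo the TSIH from the request. The function and its argument names are hypothetical, and this is only a guess at the kind of rule a validator might apply, not how globalSAN or Solaris actually implement it.

```python
FULL_FEATURE_PHASE = 3  # NSG value meaning "enter full feature phase"


def check_login_response_tsih(transit, nsg, status_class, request_tsih, response_tsih):
    """Return a list of TSIH problems found in one login response."""
    problems = []
    # A final response is a successful one with T=1 moving to full feature phase.
    is_final = transit and nsg == FULL_FEATURE_PHASE and status_class == 0
    if is_final and request_tsih == 0 and response_tsih == 0:
        # New session (request TSIH 0): the target must assign a non-zero handle.
        problems.append("final response for a new session carries TSIH 0")
    if not is_final and response_tsih != request_tsih:
        # Intermediate responses should simply echo the request's TSIH.
        problems.append("non-final response does not echo the request TSIH")
    return problems


# T=1 with a hardcoded TSIH of 0: rejected, as in the original failure.
failing = check_login_response_tsih(True, FULL_FEATURE_PHASE, 0, 0, 0)
# T=0 with TSIH 0 echoes the request, so nothing is flagged at this level.
passing = check_login_response_tsih(False, 1, 0, 0, 0)
```

This matches the behavior described above: the same TSIH of 0 is acceptable while T is 0 and only becomes invalid on the response that ends the login phase.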

I'm surprised that the Solaris PDU validators didn't catch this given their general pedantry about protocol. But RFC 3720 is so poorly written that issues like this can easily slip through the cracks.

Quick update:

I've been able to initiate to OpenSolaris snv_104. I was going to disconnect and recreate the zvol to a larger size for Time Machine testing, but it seems to be hung on disconnect. The globalSAN iSCSI prefs window is just sitting there with the spinning ball.

Update: I also noticed these errors in system.log, just fyi...

Dec 16 11:51:01 Macintosh-4 kernel[0]: SCSIPressurePathManager: Timed out waiting for inactive/error path to become active, loops = 10

Were you able to resolve this problem? I'm having the same problem and it's annoying as ######. I planned to use an iSCSI target for Time Machine, but it's so annoying to have to restart my MacBook every time I synchronize Time Machine :(


Hi,

We've just announced a beta version of globalSAN v4.0. This version addresses many incompatibilities with third-party targets. We'd love to hear if this v4.0 beta helps with this target. Go here for additional details and download link. I look forward to everyone's feedback.

-ryan

