MODE SELECT ignores mode pages in develop branch, v23.11.01 works #1427

uweseimet · 2024-01-31T15:56:59Z

This ticket refers to commit a23d642 (current develop branch).

When you send a sequence of mode pages with MODE SELECT(6), some pages are ignored. If one of the pages is a format device page for a sector size change (configured in the page data, not in the descriptor), and when this page is located somewhere at the end of the list, the change is ignored. The log entry about the sector size change does not appear. When you move this page up in the page list the log entry appears. Whether a page is evaluated or not seems to depend on its position in the list of mode pages.

With release v23.11.01 this is correct, i.e. for the same command parameters the sector size change request is logged (for both MODE SELECT(6) and MODE SELECT(10)), regardless of the size of the page list.

rdmark · 2024-02-17T23:37:49Z

Thanks for reporting, Uwe.

@kkaempf At a glance, this seems to be a specific scenario that we didn't consider when reviewing your ModeSelect PR series.

#1405
#1406

rdmark · 2024-03-20T13:57:45Z

@uweseimet Does the unit test added here adequately represent the test case that you describe in this issue? #1443

uweseimet · 2024-03-20T22:08:25Z

@rdmark No, it doesn't.. The test needs to send multiple pages (a list of pages) with a single MODE SELECT. Then, depending on where in the list the sector size is changed, the change does not have any effect. You can see this in the logfile: piscsi in the develop branch does not log that the sector size has been changed, v23.11.01 does. This shows that the code parsing the page list passed with MODE SELECT is broken in the develop branch.

Please note that I do not intend to spend more time on this and the other MODE SELECT bug. I thought I'd report what I found, so that it does not end up in a release, but I am not interested in defending why I think that a program that crashes has a bug.

rdmark · 2024-03-21T04:52:44Z

@uweseimet Can you please share a MODE SELECT(6) payload that you used to demonstrate the bug? This would help avoid making incorrect assumptions or mistakes based of a misunderstanding of SCSI behavior.

uweseimet · 2024-03-21T08:25:02Z

The payload (list of pages) was similar to the one below. It may even have been the same, but I don't know anymore.

02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
02:01:00
03:16:00:08:00:00:00:00:00:00:00:19:08:00:00:01:00:0b:00:14:00:00:00:00

uweseimet · 2024-03-24T20:07:47Z

@rdmark Out of curiosity I diffed scsi_command_util.cpp against the last release and found this:

-               length -= size;
+               length -= size + 1;
                offset += size;

This looks quite wrong to me. Why is the length now decremented faster than the offset is incremented? This means offset and length are out of sync, aren't they? IMHO this results in the loop being terminated too early, before all data have been parsed.

rdmark · 2024-03-26T23:13:15Z

@uweseimet if you look at the git blame for that change you find this commit: ad5eae9

The commit message lays out the justification for the logic change in detail. This is the relevant passage. Do you have a different interpretation of this data?

* Fix page length computation in ModeSelect

tl;dr

The 'skip to next ModeSelect page' computation was off-by-one, either not
taking the page code itself into account or missing the fact that the
page length is given as `n - 1`.

Fix:

Add 1 to the computed length.

Details:

OpenVMS Alpha sends a ModeSelect6 as follows
~~~
command:

ModeSelect6, CDB $151000001900

payload:

 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
00 00 00 08 00 00 00 00 00 00 02 00 01 0a 24 00 00 00 00 00 00 00 00 00 00
~~~

This translates to (accoring to [1], Section 8.3.3)

~~~
Mode Data Length 0
Medium Type      0
Device-specific  0
Block desc len   8
~~~

with the following offset / length computation _before_ the `while` loop
~~~
offset = 12
length = 13
~~~

The first payload section is
~~~
 4  5  6  7  8  9 10 11
00 00 00 00 00 00 02 00
~~~

translating to

~~~
Density Code     0
Number of blks   0
Reserved         0
Block length   0x200 512
~~~

Then follows a pagecode 1 as
~~~
12 13 14 15 16 17 18 19 20 21 22 23 24
01 0a 24 00 00 00 00 00 00 00 00 00 00
~~~

translating to
~~~~
Page code        1
Page length -1  10
Mode parameters 24 00 00 00 00 00 00 00 00 00 00
~~~

computing (inside the `while` loop, as `// Advance to the next page`)

~~~
size =  10 + 2 = 12
~~~

followed by new `offset` and `length` values

~~~
offset = 25
length = 1
~~~

So it stays in the `while` loop (and has a larger-than-buffer `offset`
value)

uweseimet · 2024-03-27T09:01:37Z

@rdmark IMO the current code is wrong, independent of the data. If you iterate over an array and remember the array index and the number of remaining bytes, is it correct to add/subtract different values? Wouldn't the sum of the index and the remaining byte count always have to be the same? If the number of remaining bytes is decremented faster than the index is incremented you are out of sync and terminate the loop too early. This would perfectly explain the bug I have observed. Please correct me if I am missing something.

https://github.com/uweseimet/scsi2pi/blob/main/cpp/test/scsi_hd_test.cpp has some unit tests with mode page lists, e.g. ModeSelect6_Multiple. (The page list size is not sufficient to completely cover this ticket, though.) In https://github.com/uweseimet/scsi2pi/blob/main/cpp/devices/disk.cpp in the ModeSelect() method both offset and remaining byte count are adjusted by the same value. I recommend to add similar unit tests to PiSCSI. Payload examples for such tests I have already provided, and https://github.com/uweseimet/scsi2pi/blob/main/cpp/test/scsi_hd_test.cpp provides more examples.
Of course, if it turns out that something is wrong with my unit tests I would appreciate feedback.

Maybe I should give an abstract example on what I think is wrong. Assume a page list of 10 pages with a length of 2 bytes each. At the beginning offset is 0, length is 20. After parsing a page the offset is incremented by 2 (size), the length decremented by 3 (size + 1). You get this sequence:

offset = 0, length = 20
offset = 2, length = 17
offset = 4 length = 14
offset = 6, length = 11
offset = 8, length = 8
offset = 10, length = 5
offset = 12, length = 2
offset = 14, length = -1

Parsing stops here because length is < 0. The pages at offsets 16 and 18 have not yet been parsed, i.e. they are ignored.

kkaempf · 2024-03-29T11:18:13Z

The payload (list of pages) was similar to the one below. It may even have been the same, but I don't know anymore.
02:01:00
...
02:01:00
03:16:00:08:00:00:00:00:00:00:00:19:08:00:00:01:00:0b:00:14:00:00:00:00

Where is this data coming from ?

I'd assume this would be "mode pages". Looking at the definition (Table 94 in Small Computer System Interface - 2 rev 10L.pdf - page 148 in document, 184 in pdf) it has

page code (1 byte)
page length n - 1 (!) (1 byte)
page data (n bytes)

So the above should rather be

02:00:00
...
02:00:00
03:15:00:08:00:00:00:00:00:00:00:19:08:00:00:01:00:0b:00:14:00:00:00:00

Commit ad5eae9 was based on actual captured SCSI data with 2 pages.

I'm going to extend the testcase now to cover > 2 pages. It might well be that the offset / length computation still is wrong for > 2 pages.

uweseimet · 2024-03-29T11:39:05Z

The specification says: "The page length field specifies the length in bytes of the mode parameters that follow. If the initiator does not set this value to the value that is returned for the page by the MODE SENSE command, the target shall terminate the command with CHECK CONDITION status."

The length field for MODE SELECT is the same as the length field reported by MODE SENSE for the same page.

When the page data are 02:01:00 there is 1 byte following the page length field, i.e. the page length field must be 1. Please look at the mode page examples in the specification, e.g. table 96. In table 96 there are 6 bytes following the length field and the length field is 6. This is consistent with the mode page data I used in my tests. Correct me if I am wrong, but if you were right the page length in table 96 would have to be 5, which is not the case.

But this is not related to the potential offset/length issue I mentioned, because the affected length is not an individual page length but the total length of the payload data.

kkaempf · 2024-03-29T13:45:25Z

Please look at the mode page examples in the specification, e.g. table 96. In table 96 there are 6 bytes following the length field and the length field is 6. This is consistent with the mode page data I used in my tests. Correct me if I am wrong, but if you were right the page length in table 96 would have to be 5, which is not the case.

Looks like we have a conflict between an example in the spec (length byte == number of byte following) and the actual data captured from the DEC BIOS (length byte == number of bytes following minus 1) 😉

Do we need a DEC BIOS flag ? 🤔

Anyone with more captured data to look at ?

uweseimet · 2024-03-29T14:01:38Z

@kkaempf Before you try to add any flag I recommend to verify that the DEC BIOS really violates the specification. I doubt it, because in this case it would not be compatible with any SCSI drive that correctly implements the specification. I bet the vast majority of drives do it right, because otherwise they would not work with most of the available drivers/tools. It is more likely that you have misinterpreted something.

Do you have any real SCSI drive that is compatible with your DEC BIOS? In this case I recommend that you verify the behavior of this drive regarding MODE SELECT. If you are right, in order to be compatible with the DEC BIOS the drive would have to violate the specification. You can use your PISCSI board and the s2pexec tool of https://www.scsi2pi.net/, for instance, to run any SCSI command against this drive and to evaluate the results.

If you are convinced that your BIOS is a special case I would be interested if https://www.scsi2pi.net/, which has a more elaborate MODE page handling than PiSCSI, works with your board or not. Testing this does not take much time, you can simply install an SCSI2Pi binary package.

In any case, what the specification says should be the default implementation.

@rdmark All in all, it's up to you, of course, but as you can see from my comments I doubt once again that any kind of "quirks" mode is needed. Real SCSI drives also do not have this kind of mode and work with all kinds of platforms/drivers. Usually in such a case there is just a misunderstanding of the specification or what the drivers do.

uweseimet · 2024-03-29T17:45:29Z

@kkaempf I thought I'd run a test with your payload and the SCSI2Pi in_process_test (see https://github.com/uweseimet/scsi2pi/wiki/Advanced-Testing) myself, as a proof of concept. This is the result:

>./bin/in_process_test -s "-i 0 zip.hds"
s2pexec>-i 0 -c 151000001900 -d 000000080000000000000200010a2400000000000000000000
[17:52:57.144] [warning] Device reported CHECK CONDITION (status code $02)
Error: ILLEGAL REQUEST (Sense Key $05), INVALID FIELD IN PARAMETER LIST (ASC $26), ASCQ $00
s2pexec>

The error is expected because your payload appears to contain a vendor-specific mode page 0, which is not supported by the default configuration. As soon as I add a user-defined mode page 0 (see https://www.scsi2pi.net/en/properties.html) your payload is accepted and the error is gone:

>./bin/in_process_test -s "-i 0 zip.hds"
s2pexec>-i 0 -c 151000001900 -d 000000080000000000000200010a2400000000000000000000
s2pexec>

This confirms that nothing is wrong with how PiSCSI v23.11.01 parses the mode pages (offset/length). But PiSCSI does not support mode page 0, and this appears to cause the problem with your payload.

Note that for a valid solution PiSCSI has to support mode page 0 for both MODE SENSE and MODE SELECT.
User-configurable (on device level) mode page data are IMO the only valid solution, because each device that supports mode page 0 can have different mode page data. If you just implement support for mode page 0 as it is needed by your setup, as soon as the there is another user with a setup that needs a different mode page 0, you have a conflict: Without a user-definable mode page, why should PiSCSI support mode page 0 just like you need it, and not just like the other user(s) need it?
There is also a conflict if two drivers for two different devices each expect vendor-specific data. If there is just a single (default) implementation for mode page 0, one of the drivers will fail. This can even happen with the other mode pages, because they use hard-coded data.

Follow-up on PiSCSI#1402, PiSCSI#1405, PiSCSI#1427 This is the first (yet incomplete) test to cover this behavior. Signed-off-by: Klaus Kämpf <[email protected]>

OpenVMS Alpha sends a strange ModeSelect payload, apparently one byte too large. This was 'fixed' by a (wrong) length calculation in PiSCSI#1405, breaking PiSCSI#1427. This PR - fixes the wrong length calculation - improves the loop test in scsi_command_util::ModeSelect to prevent a buffer overflow. (Remaining length was checked for > 0, but buffer access is at offset and offset + 1, effectively requiring 2 bytes.) - the loop test fix makes PiSCSI#1402 pass - adds a testcase for PiSCSI#1402 - adds a testcase for PiSCSI#1427 Fixes issue PiSCSI#1427 Signed-off-by: Klaus Kämpf <[email protected]>

kkaempf · 2024-04-01T15:08:18Z

@uweseimet, I stand corrected on the offset/length calculation in scsi_command_util.cpp. See #1447 for a fix.

However, I don't get your "page 0" arguments about the Alpha VMS payload.
000000080000000000000200010a2400000000000000000000 splits up into
00000008 0000000000000200 010a2400000000000000000000 as explained in #1427 (comment)

Looking at the 010a2400... - it's page 0x01 request with 0x0a bytes following. However, 0x0b bytes actually follow.
The "trailing" byte 0 could be interpreted as a "page 0" but it's missing the length byte. 🤷🏻‍♂️

uweseimet · 2024-04-01T15:13:11Z

@kkaempf Page 0 does not have/require a length byte. Please check the SCSI specification. This page does not have any well-defined format. This is why it must be the last page of a page list, because otherwise you would not be able to determine the start of the next page.

rdmark · 2024-04-04T10:47:51Z

To recap, the reason we wanted to implement the DEC vendor specific ModeSelect page 0 was because the DEC machine cannot boot from a piscsi volume without it:

#1402

The question is if there’s another way to handle the vendor page, I e ignoring or retuning some kind of stub value that makes the DEC boot without introducing SCSI spec breaking behavior.

rdmark · 2024-04-13T10:42:06Z

Whole changeset reverted with #1451

* Allow 'empty' ModeSelect6 tl;dr Treat a computed length of 0 as `has_valid_page_code`. Details: The SRM console (aka 'BIOS') of DEC Alpha sends an empty ModeSelect6 with the following data: ~~~ ModeSelect6, CDB $151000000c00 ~~~ That makes 12 byte(s) as follows ~~~ 0 1 2 3 4 5 6 7 8 9 10 11 00 00 00 08 00 00 00 00 00 00 02 00 ~~~ decoding it (accoring to [1], Section 8.3.3, Table 94) gives us Mode Data Length 0 Medium Type 0 Device-specific 0 Block desc len 8 Density Code 0 Number of blks 0 Reserved 0 Block length 512 `scsi_command_util::ModeSelect` computes ~~~ offset = 4 + buf[3]; ~~~ giving 12 and ~~~ length -= offset; ~~~ giving 0. Thus it never enters the `while` loop and `has_valid_page_code` stays `false`, raising an error. [1] [Small Computer System Interface - 2 rev 10L.pdf](https://dn790004.ca.archive.org/0/items/SCSISpecificationDocumentsSCSIDocuments/Small%20Computer%20System%20Interface%20-%202%20rev%2010L.pdf) Signed-off-by: Klaus Kämpf <[email protected]> * Allow ModeSelect with page code 1 OpenVMS Alpha (the operating system, not the SRM BIOS) uses ModeSelect6 with a page code of 1. The semantics are unknown, just accepting it works for me. Signed-off-by: Klaus Kämpf <[email protected]> * Fix page length computation in ModeSelect tl;dr The 'skip to next ModeSelect page' computation was off-by-one, either not taking the page code itself into account or missing the fact that the page length is given as `n - 1`. Fix: Add 1 to the computed length. Details: OpenVMS Alpha sends a ModeSelect6 as follows ~~~ command: ModeSelect6, CDB $151000001900 payload: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 00 00 00 08 00 00 00 00 00 00 02 00 01 0a 24 00 00 00 00 00 00 00 00 00 00 ~~~ This translates to (accoring to [1], Section 8.3.3) ~~~ Mode Data Length 0 Medium Type 0 Device-specific 0 Block desc len 8 ~~~ with the following offset / length computation _before_ the `while` loop ~~~ offset = 12 length = 13 ~~~ The first payload section is ~~~ 4 5 6 7 8 9 10 11 00 00 00 00 00 00 02 00 ~~~ translating to ~~~ Density Code 0 Number of blks 0 Reserved 0 Block length 0x200 512 ~~~ Then follows a pagecode 1 as ~~~ 12 13 14 15 16 17 18 19 20 21 22 23 24 01 0a 24 00 00 00 00 00 00 00 00 00 00 ~~~ translating to ~~~~ Page code 1 Page length -1 10 Mode parameters 24 00 00 00 00 00 00 00 00 00 00 ~~~ computing (inside the `while` loop, as `// Advance to the next page`) ~~~ size = 10 + 2 = 12 ~~~ followed by new `offset` and `length` values ~~~ offset = 25 length = 1 ~~~ So it stays in the `while` loop (and has a larger-than-buffer `offset` value) Signed-off-by: Klaus Kämpf <[email protected]> Fix length calculation in scsi_command_util::ModeSelect OpenVMS Alpha sends a strange ModeSelect payload, apparently one byte too large. This was 'fixed' by a (wrong) length calculation in PiSCSI#1405, breaking PiSCSI#1427. This PR - fixes the wrong length calculation - improves the loop test in scsi_command_util::ModeSelect to prevent a buffer overflow. (Remaining length was checked for > 0, but buffer access is at offset and offset + 1, effectively requiring 2 bytes.) - the loop test fix makes PiSCSI#1402 pass - adds a testcase for PiSCSI#1402 - adds a testcase for PiSCSI#1427 Fixes issue PiSCSI#1427 Signed-off-by: Klaus Kämpf <[email protected]> Improve buffer overflow checking in scsi_command_util::ModeSelect Signed-off-by: Klaus Kämpf <[email protected]>

OpenVMS Alpha sends a strange ModeSelect payload, apparently one byte too large. This was 'fixed' by a (wrong) length calculation in PiSCSI#1405, breaking PiSCSI#1427. This PR - fixes the wrong length calculation - improves the loop test in scsi_command_util::ModeSelect to prevent a buffer overflow. (Remaining length was checked for > 0, but buffer access is at offset and offset + 1, effectively requiring 2 bytes.) - the loop test fix makes PiSCSI#1402 pass - adds a testcase for PiSCSI#1402 - adds a testcase for PiSCSI#1427 Fixes issue PiSCSI#1427 Signed-off-by: Klaus Kämpf <[email protected]>

uweseimet added the bug Something isn't working label Jan 31, 2024

rdmark mentioned this issue Mar 20, 2024

Add test for sequential repeated MODE SELECT(6) sector size change #1443

Closed

kkaempf added a commit to kkaempf/piscsi that referenced this issue Apr 1, 2024

Add SetSectorSize test in SCSIHD

09ea598

Follow-up on PiSCSI#1402, PiSCSI#1405, PiSCSI#1427 This is the first (yet incomplete) test to cover this behavior. Signed-off-by: Klaus Kämpf <[email protected]>

kkaempf mentioned this issue Apr 1, 2024

Fix length calculation in scsi_command_util::ModeSelect #1447

Closed

rdmark closed this as not planned Won't fix, can't repro, duplicate, stale Apr 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

MODE SELECT ignores mode pages in develop branch, v23.11.01 works #1427

MODE SELECT ignores mode pages in develop branch, v23.11.01 works #1427

uweseimet commented Jan 31, 2024

rdmark commented Feb 17, 2024

rdmark commented Mar 20, 2024

uweseimet commented Mar 20, 2024

rdmark commented Mar 21, 2024

uweseimet commented Mar 21, 2024

uweseimet commented Mar 24, 2024 •

edited

Loading

rdmark commented Mar 26, 2024

uweseimet commented Mar 27, 2024 •

edited

Loading

kkaempf commented Mar 29, 2024

uweseimet commented Mar 29, 2024 •

edited

Loading

kkaempf commented Mar 29, 2024

uweseimet commented Mar 29, 2024

uweseimet commented Mar 29, 2024 •

edited

Loading

kkaempf commented Apr 1, 2024

uweseimet commented Apr 1, 2024

rdmark commented Apr 4, 2024

rdmark commented Apr 13, 2024

MODE SELECT ignores mode pages in develop branch, v23.11.01 works #1427

MODE SELECT ignores mode pages in develop branch, v23.11.01 works #1427

Comments

uweseimet commented Jan 31, 2024

rdmark commented Feb 17, 2024

rdmark commented Mar 20, 2024

uweseimet commented Mar 20, 2024

rdmark commented Mar 21, 2024

uweseimet commented Mar 21, 2024

uweseimet commented Mar 24, 2024 • edited Loading

rdmark commented Mar 26, 2024

uweseimet commented Mar 27, 2024 • edited Loading

kkaempf commented Mar 29, 2024

uweseimet commented Mar 29, 2024 • edited Loading

kkaempf commented Mar 29, 2024

uweseimet commented Mar 29, 2024

uweseimet commented Mar 29, 2024 • edited Loading

kkaempf commented Apr 1, 2024

uweseimet commented Apr 1, 2024

rdmark commented Apr 4, 2024

rdmark commented Apr 13, 2024

uweseimet commented Mar 24, 2024 •

edited

Loading

uweseimet commented Mar 27, 2024 •

edited

Loading

uweseimet commented Mar 29, 2024 •

edited

Loading

uweseimet commented Mar 29, 2024 •

edited

Loading