Test tape device with a VAXstation 3100 running OpenVMS 7.2 #111

Open
uweseimet opened this issue Dec 4, 2024 · 77 comments

Labels: help wanted, on hold

@uweseimet
Owner

uweseimet commented Dec 4, 2024

This is a follow-up ticket for #100, dealing with issues found when testing the tape device. The OS reports "tape is not valid ANSI format" when trying to list files after a foreign mount operation.

@uweseimet
Owner Author

uweseimet commented Dec 4, 2024

These are the logs and scripts for this ticket.

  • Content list test script, trying to read a block of 32 bytes from a tape with blocks of 4 bytes:
-i 3:0 -L trace
-c 12:00:00:00:ff:00
-c 00:00:00:00:00:00
-c 0a:00:00:00:04:00 -d 01:02:03:04 -o 90
-c 0a:00:00:00:04:00 -d 01:02:03:04 -o 90
-c 0a:00:00:00:04:00 -d 01:02:03:04 -o 90
-c 11:00:ff:ff:fe:00 -o 90
-n -c 08:00:00:00:20:00 -o 90
-c 03:00:00:00:ff:00
  • Logfile when sending this script to s2p with s2ptool:
    mount_s2p.txt

  • Logfile when sending this script to the Sony tape drive:
    mount.txt

@Pacjunk Please add whatever may be missing. The latest s2p log triggered by your foreign mount operation should be added I think.

@Pacjunk

Pacjunk commented Dec 4, 2024

Correction: The mount/foreign works correctly. The error occurs when using the backup/list command to list the contents of the first saveset. If the tape is mounted normally (files11), then the first file can be copied from tape to disk and read successfully - which shows that the tape is formatted correctly, and the files are valid. Simh can read the same .tap file (although not emulating SCSI), and a physical SCSI tape drive also works correctly.

@uweseimet
Owner Author

uweseimet commented Dec 4, 2024

I see, I confused the commands. s2p can also read the file, but it has to report a SCSI error because of the block mismatch. s2p does not have an issue with the file itself, but with the combination of the block sizes in the file and the SCSI command used to read these blocks. s2psimh does not know anything about this context; it just lists the file structure/contents.
The read command sent by your OS tries to read 90 bytes from the current block position, but the first block at position 0 of the tape only has 80.
Can you please attach the s2p log that results in your OS complaining?
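
To make the mismatch concrete, here is a minimal sketch of the length check for a variable-length READ, assuming the SCSI-2 rule for the case where the SILI bit is clear. The names `ReadOutcome` and `check_variable_read` are hypothetical and are not taken from the s2p sources, and the sketch models only the reported status, not the data transfer.

```python
from dataclasses import dataclass


@dataclass
class ReadOutcome:
    status: str        # "GOOD" or "CHECK CONDITION"
    ili: bool          # incorrect length indicator in the sense data
    information: int   # requested length minus actual block length


def check_variable_read(requested: int, block_length: int) -> ReadOutcome:
    """Model the status of a variable-length READ with SILI = 0."""
    if requested == block_length:
        return ReadOutcome("GOOD", False, 0)
    # Incorrect-length block: report the difference in the INFORMATION field.
    return ReadOutcome("CHECK CONDITION", True, requested - block_length)


# The case discussed here: 90 bytes requested, 80-byte block on the tape.
print(check_variable_read(90, 80))
```

With these assumptions, a request for 90 bytes against an 80-byte block yields CHECK CONDITION with a difference of 10.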

@Pacjunk

Pacjunk commented Dec 4, 2024

This is the trace log for the mount/foreign command - which completes successfully.
mount.txt

This is the log for the backup/list (what I was calling foreignlist) - which fails and produces the error "tape is not valid ANSI format". This is because it attempts to read the label block and s2p signals an error because of block size mismatch.

list.txt

@uweseimet
Owner Author

uweseimet commented Dec 4, 2024

Just to ensure that the test scenario for this ticket is correct: First mount.txt has to be applied, then list.txt. With your drive this sequence of commands works fine, with s2p it results in an error. Please confirm or correct.

@Pacjunk

Pacjunk commented Dec 4, 2024

Correct

@uweseimet
Owner Author

uweseimet commented Dec 4, 2024

Good, but these were only the logs. Can you please add the matching scripts generated by s2p for these logs?

@uweseimet
Owner Author

uweseimet commented Dec 4, 2024

A question regarding the ANSI format: The initial blocks in the .tap files we use for testing have 80 bytes.

Offset 0 ($0): Class 0, good data record, record length 80 ($50)
Offset 88 ($58): Class 0, good data record, record length 80 ($50)
Offset 176 ($b0): Class 0, good data record, record length 80 ($50)

This is the same for the .tap files from your download link as for a blank .tap file that you have initialized.
Is this ANSI format? I am asking because your OS reports an ANSI-related error when trying to read 90 byte blocks from such a file. If the 80 byte blocks are ANSI format, trying to read 90 bytes can never succeed, not even with a real tape drive.
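
For readers unfamiliar with the listing above, the following sketch shows how such offsets can be derived from a simh-style image. It assumes the commonly documented layout: a 32-bit little-endian marker whose upper 4 bits are the class and whose lower 28 bits are the record length, followed by the data (padded to an even length) and a trailing copy of the marker. `walk_tap` is a hypothetical helper, not part of s2psimh.

```python
import struct

TAPE_MARK = 0x00000000
END_OF_MEDIUM = 0xFFFFFFFF


def walk_tap(image: bytes):
    """Yield (offset, description, length) for each object in a simh-style image."""
    pos = 0
    while pos + 4 <= len(image):
        (marker,) = struct.unpack_from("<I", image, pos)
        if marker == TAPE_MARK:
            yield pos, "tape mark", 0
            pos += 4
        elif marker == END_OF_MEDIUM:
            yield pos, "end of medium", 0
            return
        else:
            object_class = marker >> 28       # upper 4 bits
            length = marker & 0x0FFFFFFF      # lower 28 bits
            if object_class != 0:
                # Other classes (bad records, private markers) are not modelled here.
                yield pos, f"class {object_class} object", length
                return
            yield pos, "good data record", length
            # Leading marker, data padded to an even length, trailing marker.
            pos += 4 + length + (length & 1) + 4


# Three 80-byte records followed by two tape marks.
record = struct.pack("<I", 80) + b" " * 80 + struct.pack("<I", 80)
image = record * 3 + struct.pack("<I", TAPE_MARK) * 2
for offset, kind, length in walk_tap(image):
    print(f"Offset {offset} (${offset:x}): {kind}, length {length}")
```

Each 80-byte record occupies 88 bytes in the image, which is why the records start at offsets 0, 88 and 176.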

@Pacjunk

Pacjunk commented Dec 4, 2024

Yes, that is ANSI.

I found some information on VMS 8.4 (quite a bit later, and not using this architecture, but tape formats haven't changed for years). Link is https://community.hpe.com/t5/operating-system-openvms/openvms-8-4-backup-log-disk-test-mga0-test-bck-save-not-working/td-p/4830493. The person is getting the same error messages as me. One of the replies states the following:

"backup sets up a 90-byte buffer for reading the VOL1 header and EXPLICITLY checks the IO status block for 80 bytes being returned from the QIO and returns BACKUP$_NOTANSI, if that check fails."
It contains the magic number "90". I know this is not a proper solution, but what would happen if s2p just returned the data in the block (80 bytes) when the request is for 90 bytes?

@uweseimet
Owner Author

Where would s2p take the additional data from? Note that your tape drive also does not return more data than available, but signals the mismatch in the response, in the same way s2p does. REQUEST SENSE then provides the details, with custom bytes added at the end by your drive.

@Pacjunk

Pacjunk commented Dec 4, 2024

What additional data? The above statement says that the device just returns the 80 bytes (and I assume there is a length field somewhere indicating the length of the response?)

The tape drive must at some stage return the required data though, otherwise it wouldn't work.

@uweseimet
Owner Author

uweseimet commented Dec 4, 2024

The drive in its sense data provides the difference of requested vs. available data, and that's it. When the requested size is bigger than the block size it will not return any data from the block at all, not even the 80 bytes that exist. The spec says:

"If the SILI bit is zero and an incorrect-length logical block is read, CHECK CONDITION status shall be returned. The ILI and VALID bits shall be set to one in the sense data and the additional sense code shall be set to NO ADDITIONAL SENSE INFORMATION."
"If the FIXED bit is zero, the INFORMATION field shall be set to the requested transfer length minus the actual logical block length."

That is what both s2p and your drive do. This is an s2p log:

[warning] Device reported CHECK CONDITION (status code $02)
...
[trace] (ID:LUN 0:0) - GOOD: NO SENSE (Sense Key $00), NO ADDITIONAL SENSE INFORMATION (ASC $00), ASCQ $00, ILI: 1, INFORMATION: 10
...
00000000  f0:00:20:00:00:00:0a:0a:00:00:00:00:00:00:00:00  '.. ...@.........'
00000010  ca:00                                            '..'

This is my HP drive:

00000000  f0:00:20:00:00:00:0a:0b:00:00:00:00:00:00:00:00  '.. .............'
00000010  00:00:00                                         '...'

This is your drive:

[warning] Device reported CHECK CONDITION (status code $02)
...
00000000  f0:00:20:00:00:00:1c:12:00:00:00:00:00:00:00:00  '.. .............'
00000010  ca:00:00:00:00:00:00:1d:1e:3a                    '.........:'

The only difference is the number of mismatched bytes, because this test was run with 4 byte blocks and 32 bytes were requested. And we have the custom bytes at the end.
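
The sense dumps above can be decoded with a few lines of Python. This is a sketch assuming the SCSI-2 fixed sense data format (response code $70); `decode_fixed_sense` is a hypothetical helper, and it ignores the vendor-specific bytes beyond offset 13.

```python
def decode_fixed_sense(hex_bytes: str) -> dict:
    """Decode the fields of fixed-format sense data given as 'f0:00:20:...'."""
    data = bytes.fromhex(hex_bytes.replace(":", ""))
    if len(data) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    return {
        "valid": bool(data[0] & 0x80),
        "response_code": data[0] & 0x7F,
        "filemark": bool(data[2] & 0x80),
        "ili": bool(data[2] & 0x20),
        "sense_key": data[2] & 0x0F,
        # INFORMATION is a signed big-endian value (requested minus actual length).
        "information": int.from_bytes(data[3:7], "big", signed=True),
        "additional_length": data[7],
        "asc": data[12],
        "ascq": data[13],
    }


s2p = decode_fixed_sense("f0:00:20:00:00:00:0a:0a:00:00:00:00:00:00:00:00:ca:00")
sony = decode_fixed_sense(
    "f0:00:20:00:00:00:1c:12:00:00:00:00:00:00:00:00:ca:00:00:00:00:00:00:1d:1e:3a"
)
print(s2p["ili"], s2p["information"])    # True 10
print(sony["ili"], sony["information"])  # True 28
```

Under this reading, both responses carry the same flags and sense key and differ only in the INFORMATION value (10 for 90 vs. 80 bytes, 28 for 32 vs. 4 bytes) and in the additional length.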

IMO the conclusion is that when your drive is connected, your OS never tries to read 90 bytes from the freshly initialized tape, which only has blocks of 80 bytes. Otherwise your drive would report the same error as s2p, except for the custom bytes.

If the custom bytes have no relevance, it means that something must have happened before, which caused your OS to do what it does in the s2p case, but to do a different thing, i.e. NOT trying to read 90 byte blocks, when your drive is connected.

Would you agree on this conclusion?

I checked all commands in the logs again. There is one READ POSITION, which seems to ensure that the drive is at the beginning of the tape. s2p returns data that say that it is. I don't think anything is wrong with that. My tape drive returns the same data as s2p.
The MODE SENSE/MODE SELECT sequences your OS sends are more interesting, but refer to page 0, which is mostly vendor-specific. Your drive returns this for page 0:

00000000  0b:00:90:08:13:00:00:00:00:00:02:00              '............'

s2p says (page 0 default settings for tape devices):

00000000  0b:00:00:08:00:00:00:00:00:00:02:00              '............'

This is what my HP drive says:

00000000  0b:00:10:08:13:00:00:00:00:00:00:00              '............'

Since the contents of page 0 are vendor-specific, there are no general default values. Each device that provides these data may need its own set of values. The first bytes are mentioned in the spec, but IMO the description is a bit fuzzy.
Fortunately s2p supports fully customizable mode pages, see https://www.scsi2pi.net/en/properties.html. Please add this line to /etc/s2p.conf:

mode_page.0.SCSI2Pi:SCSI TAPE=0b:00:90:08:13:00:00:00:00:00:02:00

It tells s2p to return the specified data for page 0 for each device that matches the INQUIRY product data in the property key.
Please run your list test again after updating /etc/s2p.conf and provide the logs. Note that this test has to use the current sources in the issue_111 branch. Please run "git pull", "git checkout issue_111" before compiling.
Whatever the outcome is, please provide the logfile. In any case this setting eliminates a difference between s2p and your drive.
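
To compare the three 12-byte responses field by field, the following sketch may help. It assumes that the data consist of a 4-byte MODE SENSE(6) parameter header followed by one 8-byte block descriptor (byte 3 being 08 suggests this), with the SCSI-2 layout for sequential-access devices. `decode_mode_sense6` is a hypothetical helper.

```python
def decode_mode_sense6(hex_bytes: str) -> dict:
    """Decode a MODE SENSE(6) header and the first block descriptor, if present."""
    data = bytes.fromhex(hex_bytes.replace(":", ""))
    header, rest = data[:4], data[4:]
    result = {
        "mode_data_length": header[0],
        "medium_type": header[1],
        # Device-specific parameter for sequential-access devices:
        "write_protected": bool(header[2] & 0x80),
        "buffered_mode": (header[2] >> 4) & 0x07,
        "speed": header[2] & 0x0F,
        "block_descriptor_length": header[3],
    }
    if header[3] >= 8 and len(rest) >= 8:
        result["density_code"] = rest[0]
        result["number_of_blocks"] = int.from_bytes(rest[1:4], "big")
        result["block_length"] = int.from_bytes(rest[5:8], "big")
    return result


sony = decode_mode_sense6("0b:00:90:08:13:00:00:00:00:00:02:00")
s2p = decode_mode_sense6("0b:00:00:08:00:00:00:00:00:00:02:00")
hp = decode_mode_sense6("0b:00:10:08:13:00:00:00:00:00:00:00")
for name, decoded in (("Sony", sony), ("s2p", s2p), ("HP", hp)):
    print(name, decoded)
```

Under these assumptions, byte 2 of the Sony data ($90) decodes as the WP bit set with buffered mode 1, and the last three bytes give a block length of 512 for the Sony drive and s2p and 0 for the HP drive.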

Regarding the mode pages more changes may be required, because your OS might change your drive's behavior by modifying vendor-specific data. You can see this in the commands sent by your OS, which are reflected by this script:

-L trace
-i 3:0 -c 00:00:00:00:00:00 # Clear the drive status
-i 3:0 -c 1a:00:00:00:ff:00 # Read mode page 0 data
-i 3:0 -c 1a:00:00:00:ff:00 # Again
-i 3:0 -c 15:00:00:00:0c:00 -d 00:00:10:08:00:00:00:00:00:00:00:00 # MODE SELECT for page 0
-i 3:0 -c 1a:00:00:00:ff:00 # Read data from page 0 again, maybe expecting a change
-i 3:0 -c 1a:00:10:00:ff:00 # Read device configuration page
-i 3:0 -c 1a:00:50:00:ff:00 # Read changeable values of device configuration page
-i 3:0 -c 08:00:00:00:5a:00 -o 90 # The offending READ command with the size mismatch

It reads mode page 0, then writes data for mode page 0, then reads mode page 0 again. Quite likely it assumes that after the MODE SELECT the page data have changed. In the case of my HP drive nothing is changed by this command. The OS also checks which data of the device configuration page are changeable, but does not send any MODE SELECT to change them.
Please send the script above to your drive using s2pexec. There must be a tape in the drive, but no data are written.
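
As a cross-check of the last line of the script, this sketch decodes a READ(6) CDB for a sequential-access device, assuming the SCSI-2 layout (SILI and FIXED bits in byte 1, transfer length in bytes 2 to 4). `decode_read6` is a hypothetical helper.

```python
def decode_read6(cdb_hex: str) -> dict:
    """Decode a sequential-access READ(6) CDB given as '08:00:00:00:5a:00'."""
    cdb = bytes.fromhex(cdb_hex.replace(":", ""))
    if len(cdb) != 6 or cdb[0] != 0x08:
        raise ValueError("not a 6-byte READ command")
    return {
        "sili": bool(cdb[1] & 0x02),    # suppress incorrect length indicator
        "fixed": bool(cdb[1] & 0x01),   # 0 = variable-length block mode
        "transfer_length": int.from_bytes(cdb[2:5], "big"),
    }


print(decode_read6("08:00:00:00:5a:00"))
```

For the offending command this gives a variable-length read (FIXED = 0, SILI = 0) with a transfer length of $5a = 90 bytes.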

@Pacjunk

Pacjunk commented Dec 5, 2024

With the mode page stuff set, the only difference is the mount now reports that the tape is write locked.

mount.txt
list.txt

I will run the script soon...

@uweseimet
Owner Author

uweseimet commented Dec 5, 2024

OK, thanks a lot.

I may just have found something relevant while re-checking all logs. I still don't know why my streamer does not work with Linux and the Atari. When using s2pexec with it I can successfully run the init script and can read the data it writes. When doing this I noticed that in the mount log there is a difference in the error data returned when the first filemark is hit. The OS reads the first 3 blocks, then there is another read command. This one has to fail because after the third block there is a filemark. This raises an error (with your drive, with my drive, and with s2p), but s2p does not return the requested block size in the error message, whereas my drive does. The spec also says this has to be returned. I am in the process of updating s2p accordingly.

@Pacjunk

Pacjunk commented Dec 5, 2024

I ran the script twice. The first time it produced a message "Error: UNIT ATTENTION (Sense Key $06), NOT READY TO READY TRANSITION (MEDIUM MAY HAVE CHANGED) (ASC $28), ASCQ $00". The second time, no such message.

output.txt
output2.txt

@uweseimet
Owner Author

uweseimet commented Dec 5, 2024

Wrt your first comment: I don't know what you mean by "mode page stuff set".
Edit: Now I know, but the mode page does/should not deal with write protection. Anyway, I will check.

Wrt the next comment: The UNIT ATTENTION was expected. This is why there was a TEST UNIT READY at the beginning of the script, to clear the unit attention status. The second time you ran the script the status was already cleared, therefore there was no unit attention anymore.

@uweseimet
Owner Author

uweseimet commented Dec 5, 2024

Running the script on your drive has revealed what the MODE SELECT does. With the MODE SENSE before the MODE SELECT your drive says that it has a default block size of 512 bytes. The MODE SELECT changes this. After the MODE SELECT your drive returns 0 as default block size with the next MODE SENSE. The spec says:
"The block length specifies the length in bytes of each logical block described by the block descriptor. For sequential-access devices, a block length of zero indicates that the logical block size written to the medium is specified by the transfer length field in the command descriptor block (see 10.2.4 and 10.2.14)"
This is not yet what s2p returns in its MODE SENSE data. I will change this.

@uweseimet
Owner Author

uweseimet commented Dec 5, 2024

The block size change works now. Can you please check if this already improves something after updating and re-compiling issue_111? Important: Remove the mode page line from /etc/s2p.conf.
Please provide the log from the list test after this test.

@Pacjunk

Pacjunk commented Dec 5, 2024

The mount crashes s2p with floating point exception

mount.txt

@uweseimet
Owner Author

uweseimet commented Dec 5, 2024

I just committed a fix so that when hitting a filemark the requested length is returned. The exception is most likely also fixed in the very latest issue_111 sources. If not, please provide an up to date mount script with the commands that result in the exception. The mount operation also sends the MODE SELECT, this is why it is also affected by the MODE SELECT change.
Just in case: by the latest sources I mean commit ID 05ebbe9.

@Pacjunk

Pacjunk commented Dec 5, 2024

Same floating point exception

mount.txt

@uweseimet
Owner Author

I see, please provide the script, so that I can reproduce the issue with my setup.

@uweseimet
Owner Author

uweseimet commented Dec 5, 2024

I may have found what's wrong without a script (but would still like to have it). Please update once more and re-test with commit ID 408538d.

@Pacjunk

Pacjunk commented Dec 5, 2024

That fixed the floating point exception. Logs for mount:
mount.txt
mount_script.txt

Backup/list still fails. Logs:
list.txt
list script.txt

@uweseimet
Owner Author

The sense data when hitting the filemark in the mount test are now correct.

Please configure a custom mode page 0 again, this time with a block size 0, by updating /etc/s2p.conf:

mode_page.0.SCSI2Pi:SCSI TAPE=0b:00:90:08:13:00:00:00:00:00:00:00

Then please run the list test and provide script and log.

@Pacjunk

Pacjunk commented Dec 5, 2024

list.txt
list script.txt

I'm packing up for the night. I will check in tomorrow. Cheers,

@Pacjunk

Pacjunk commented Dec 7, 2024

simh does more than just VMS, but they are mostly minicomputer type systems with a heavy leaning towards DEC gear. Written by a DEC guy, so I suppose this makes sense. I think the tape drive stuff emulates a TQK50 controller. Certainly the device name (MU) supports that. If it was SCSI it would be MK. There is a fork of simh that supports later workstations (including mine), but it only supports SCSI disks. Tape is still non-SCSI.

What would happen (as a test) if the s2p code didn't generate an error and just returned the 80 bytes? Would be interesting to see.

@uweseimet
Owner Author

Your OS requests 90 bytes in the DATA IN phase. Just returning 80 will cause issues, e.g. a blocked bus. We simply have to accept that the real drives also do not return 80 bytes in this case, and that this is compliant with the specification. The drives and s2p do the right thing in this context.

@Pacjunk

Pacjunk commented Dec 7, 2024

Looks like we're never going to find out then!

If this is a dead end, I suggest we move on to writing files to the tape (reading seems to work).

@uweseimet
Owner Author

@bog-dan-ro You mentioned that you wanted to implement tape support for PiSCSI. This appears to be stalling, but anyway: I would appreciate your help with testing the SCSI2Pi tape support. The current state is already far beyond what your suggested code for PiSCSI provides. It also supports non-fixed block sizes and simh-compatible image files.
@Pacjunk and I have spent a lot of time with testing s2p against OpenVMS, but we appear to be in a dead end. This is why an additional tester, with an additional platform, would be very helpful. Maybe you would like to help us? In this case, please note that I would need rather prompt feedback on a regular basis.

@uweseimet
Owner Author

uweseimet commented Dec 7, 2024

@Pacjunk I agree, let's continue with writing. I would not be surprised if we learn something when dealing with other features, i.e. writing, that helps us with the current problem.

I am not as pessimistic as you ;-). For software problems there is always a solution. What's hard in this case is to collect all the data we need in order to make a full analysis of the problem. I bet it is just a minor detail which is causing issues.

@uweseimet
Owner Author

@sidick @ppuskari You also expressed interest in tape support. If you are still interested, this is the right time to help with testing.

@uweseimet
Owner Author

uweseimet commented Dec 7, 2024

@Pacjunk I assume that you are going to start with a new test case. If there is something that does not work, please post the required data - you have a lot of practice with this now ;-) - and I will probably create a new, specific ticket for this case.

@Pacjunk

Pacjunk commented Dec 7, 2024

OK. Here are some logs. I initialised a tape in the normal way (don't think we need logs for that now). The .tap file is 64KB in size.

I then mounted the tape (not foreign like we were doing). This worked.
mount.txt
mount script.txt

Then I tried to copy a small file (probably only 40 bytes of real data) to the tape...

copy.txt
copy script.txt

The main error message I got said "magnetic tape position lost". I assume it didn't find the right place to append the ANSI header and/or the data for the file.

@uweseimet
Owner Author

@Pacjunk Thank you. I will investigate this. I created #112 for dealing with writing data to a tape with your platform. From now on, let's please use the new ticket for anything related to writing data, so that the current ticket remains focused on the 90 bytes etc. issue when reading.

@uweseimet
Owner Author

uweseimet commented Dec 7, 2024

@Pacjunk Regarding the list operation, let me make some statements. Please comment on what you think is wrong and why. This is not so much about s2p, but about the operations executed by the OS.

  1. Your OS knows about 90 byte blocks, even though they may not be ANSI. Otherwise it would not try at all to read exactly 90 bytes. The 90 is not a random number.
  2. Tapes that do not have 80 byte blocks at the beginning cannot be mounted.
  3. Successfully mounting a tape is a precondition for the list operation.
  4. Because a tape could be mounted, the OS knows that the first block has 80 bytes.
  5. After mounting a tape, it is positioned at the beginning, because the mount operation executes a rewind when it is done.
  6. Despite knowing that the first block has 80 bytes, the OS tries to read 90 bytes from the beginning with the list operation.

We have two different second blocks in our scenarios:

00000000  48:44:52:31:20:20:20:20:20:20:20:20:20:20:20:20  'HDR1            '
00000010  20:20:20:20:20:54:45:53:54:33:33:30:30:30:31:30  '     TEST3300010'
00000020  30:30:30:30:30:30:31:30:30:30:32:34:33:32:39:30  '0000001000243290'
00000030  32:34:33:32:39:20:30:30:30:30:30:30:44:45:43:46  '24329 000000DECF'
00000040  49:4c:45:31:31:41:20:20:20:20:20:20:20:20:20:20  'ILE11A          '

This above is from your latest mount script (not foreign).
In some of your scripts I see this instead:

00000000  48:44:52:31:56:4d:53:30:35:34:2e:41:20:20:20:20  'HDR1VMS054.A    '
00000010  20:20:20:20:20:56:4d:53:30:35:34:30:30:30:31:30  '     VMS05400010'
00000020  30:30:31:30:30:30:31:30:30:20:39:30:32:30:31:20  '001000100 90201 '
00000030  39:30:32:30:31:20:30:30:30:30:30:30:44:45:43:56  '90201 000000DECV'
00000040  4d:53:42:41:43:4b:55:50:20:20:20:20:20:20:20:20  'MSBACKUP        '

This is from the foreign mount, isn't it? I guess there are different ways of labeling, and these data determine what the OS can later do with a tape?
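
To compare the two HDR1 blocks field by field, the following sketch splits an 80-byte label into named fields. The column positions are an assumption based on the usual ANSI X3.27 HDR1 layout and `parse_hdr1` is a hypothetical helper; the sample bytes are copied from the second dump above.

```python
HDR1_FIELDS = [
    # (name, first column, last column), 1-based and inclusive
    ("label_id", 1, 4),
    ("file_id", 5, 21),
    ("file_set_id", 22, 27),
    ("section_number", 28, 31),
    ("sequence_number", 32, 35),
    ("generation", 36, 39),
    ("generation_version", 40, 41),
    ("creation_date", 42, 47),
    ("expiration_date", 48, 53),
    ("accessibility", 54, 54),
    ("block_count", 55, 60),
    ("system_code", 61, 73),
]


def parse_hdr1(label: bytes) -> dict:
    """Split an 80-byte HDR1 label into its fields, stripping padding blanks."""
    if len(label) != 80:
        raise ValueError("an ANSI label block has exactly 80 bytes")
    text = label.decode("ascii")
    if not text.startswith("HDR1"):
        raise ValueError("not an HDR1 label")
    return {name: text[first - 1:last].strip() for name, first, last in HDR1_FIELDS}


sample = bytes.fromhex(
    "48445231564d533035342e4120202020"
    "2020202020564d533035343030303130"
    "30303130303031303020393032303120"
    "39303230312030303030303044454356"
    "4d534241434b55502020202020202020"
)
for name, value in parse_hdr1(sample).items():
    print(f"{name}: {value!r}")
```

With this layout the second block has the file identifier VMS054.A and the system code DECVMSBACKUP, whereas the first block has a blank file identifier and the system code DECFILE11A.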

This is a test scenario that will provide more insight into the 80/90 bytes problem:

  1. Connect both your streamer and the Pi. Do not launch s2p, so that the Pi is not active on the SCSI bus.
  2. Boot and mount with a real tape drive.
  3. Run "-L trace -i 3 -c 34:00:00:00:00:00:00:00:00:00" with s2pexec. The result shows the tape position after mounting. This is a piece of information we do not yet have because your OS does not run this command at this point.
  4. Now switch off the streamer, launch s2p (same SCSI ID as your streamer).
  5. Run the list operation.
  6. Does your OS try to read 80 or 90 bytes now?
  7. And vice versa: Switch off your streamer and launch s2p when booting. Then run the mount command with s2p. Now stop s2p, switch on your streamer and run the list operation.

From the sequence of commands sent I am quite sure that your OS will not notice that the device has changed. It always reconfigures the drive for its needs before really doing something.
By having s2p only execute some of the operations we can find out more about what triggers the problem. If the OS reads 90 bytes when the mount was done with the real drive, this means that there is an issue with s2p and the list operation. If the OS reads 90 bytes when the mount was done with s2p, this means that there is an issue with s2p and the mount operation.

@ppuskari

ppuskari commented Dec 7, 2024 via email

@Pacjunk

Pacjunk commented Dec 8, 2024

Thinking about this 90 byte block, I think this is a quick test to see if the tape is in ANSI format. Here is my theory:

The backup command requires the tape to be in ANSI format, so it requests a 90 byte read.
When this generates an error (most likely):
  • If the reply says that the block is only 80 bytes (short by 10), then the utility goes on to process the ANSI headers and file system.
  • If the reply says the block is any other length (or succeeds), then the tape is not ANSI and backup cannot continue.

I think this is what must be happening. Mounting a tape foreign doesn't care about the format. If you mount a zero filled file, it reports no label, but it still mounts it. This is for transferring data to/from a "foreign" format. Backup requires a foreign mount, but does care about ANSI format, so must test it as above.

I think the issue must be about how s2p handles the response to the invalid block request.

I will run your tests soon, but I wanted to put forward this theory!
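
The theory above can be sketched in a few lines. This is purely a model of the hypothesis, not of the actual BACKUP implementation; `looks_like_ansi` and its parameters are hypothetical names.

```python
ANSI_LABEL_LENGTH = 80   # size of a VOL1/HDR label block
PROBE_LENGTH = 90        # size of the probing read issued by backup


def looks_like_ansi(status: str, ili: bool, information: int,
                    requested: int = PROBE_LENGTH) -> bool:
    """Return True if the probing read reveals an 80-byte first block."""
    if status != "CHECK CONDITION" or not ili:
        # The read succeeded or failed for another reason: not an 80-byte label.
        return False
    # INFORMATION is the requested length minus the actual block length.
    return requested - information == ANSI_LABEL_LENGTH


print(looks_like_ansi("CHECK CONDITION", True, 10))  # True
print(looks_like_ansi("GOOD", False, 0))             # False
```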

@Pacjunk

Pacjunk commented Dec 8, 2024

I ran your check position script. The first time it generated errors, so I ran it again.
check.txt

I then mounted the real tape foreign, switched off the tape drive, started s2p, then did the backup/list. Same error - which is expected as it is the backup command that generates the 90 byte read. As I said in the last comment, the foreign mount doesn't really care what format the tape is in. If it is ANSI, then it will display the label. If not, it just says no label and continues anyway.

@Pacjunk

Pacjunk commented Dec 8, 2024

I tried it the other way around (mount the s2p tape, then read the real tape), but this failed as it said the drive contains the wrong tape. There must be a serial number on it, because despite having the same label, it knows it has been changed. In theory your test should work as it is the backup/list command that causes the issue.

@uweseimet
Owner Author

uweseimet commented Dec 8, 2024

"I think the issue must be about how s2p handles the response to the invalid block request."

I'm afraid this collides with what the logs say: They all say that s2p does the same as your drive does when there is a block that has 80 bytes but 90 bytes are requested. It is also the same as my drive does, and matches what the specification says must happen. When receiving this response your OS stops sending commands. I don't think I have missed something when looking at the errors reported by the devices/s2p. The sense data are only two lines, and the command sent before IMO is also identical, requesting 90 bytes.
Until now the logs have exactly reflected what has happened and where s2p and your drive differ, so that I could address the problem. Maybe I misunderstand you: Do you mean that in this particular case the logs are misleading/wrong/incomplete? The sense data with the error are definitely complete.

@uweseimet
Owner Author

uweseimet commented Dec 8, 2024

The check script shows that the drive is at position 0. This is expected, but we needed to verify this. Too bad that there appears to be a serial number. But as mentioned before there are indeed different numbers in the block contents logged by s2p.
But we are smarter than your OS: We can create simh files with identical numbers. In order to do that, can you please label a tape with your OS and then read its first 3 blocks with s2pexec. We already had a script somewhere, but it's hard to find in all our comments. This should do it:

-i 3 -L trace
-c 12:00:00:00:ff:00
-c 00:00:00:00:00:00
-c 01:00:00:00:00:00 -o 90
-c 03:00:00:00:50:00 -o 90
-c 03:00:00:00:50:00 -o 90
-c 03:00:00:00:50:00 -o 90
-c 03:00:00:00:50:00 -o 90

INQUIRY, TEST UNIT READY, REWIND, 4x READ. The last READ should fail and your drive will report errors, which can be ignored. With the output we can create a .tap file with exactly the same contents, i.e. the same serial numbers. Then when you replace the drive with s2p, the OS will think the tape is the same.

@Pacjunk

Pacjunk commented Dec 8, 2024

I think you meant 08 there, rather than 03.

Anyway, with a blank tape I got:
blank.txt

With the tape with the backups on it (which is the one we should be working with) I got:
full.txt

I can't see a serial number there, and I initialised a tape twice and the results are identical. Somehow the OS knew that I had switched the tapes.

@uweseimet
Owner Author

uweseimet commented Dec 8, 2024

Yes, 08, you are right.

Regarding the OS, it can only know that something was switched in 3 cases:

  1. The new device reports UNIT ATTENTION, POWER ON on first read access. Your drive does this, but s2p does (intentionally) not.
  2. The OS looks at the data transferred. If the data s2p uses are identical with the data on the tape, there is no difference from the OS perspective.
  3. The drive name has changed.

I will create a .tap file with the same content as the full.txt tape, and then let's see what happens. This tape has more than 4 blocks and no tape mark on it, but I hope that the first 4 are sufficient for this kind of test, because the error we are dealing with happened before any attempt to read the fourth block.

@uweseimet
Owner Author

uweseimet commented Dec 8, 2024

One more question: The data of full.txt are completely different from the data we used previously. We have at least 4 data blocks, but no tape marks at the beginning. Previously, for the list test, we were using these data:

Offset 0 ($0): Class 0, good data record, record length 80 ($50)
Offset 88 ($58): Class 0, good data record, record length 80 ($50)
Offset 176 ($b0): Class 0, good data record, record length 80 ($50)
Offset 264 ($108): Class 0, tape mark
Offset 268 ($10c): Class 0, tape mark
Offset 272 ($110): Class 0, good data record, record length 80 ($50)
Offset 360 ($168): Class 0, good data record, record length 80 ($50)
Offset 448 ($1c0): Class 0, tape mark
Offset 452 ($1c4): Class 0, tape mark
Offset 456 ($1c8): Class 7, private marker (SCSI2Pi end-of-data object)
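
The offsets in this listing follow directly from the object sizes. As a sketch, assuming that a data record occupies 4 + length (padded to even) + 4 bytes and a tape mark occupies 4 bytes, they can be recomputed like this (`object_offsets` is a hypothetical helper):

```python
def object_offsets(layout):
    """Return the start offset of each object and the offset after the last one.

    layout is a list of record lengths, with None standing for a tape mark.
    """
    offsets = []
    pos = 0
    for item in layout:
        offsets.append(pos)
        if item is None:
            pos += 4                            # tape mark: a single 32-bit marker
        else:
            pos += 4 + item + (item & 1) + 4    # marker, padded data, marker
    return offsets, pos


# Three labels, two tape marks, two labels, two tape marks.
layout = [80, 80, 80, None, None, 80, 80, None, None]
offsets, end = object_offsets(layout)
print(offsets, end)
```

With these assumptions the computed offsets match the listing, and the end offset of 456 is where the private end-of-data marker is located.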

full.txt starts like this, in simh terms:

Offset 0 ($0): Class 0, good data record, record length 80 ($50)
Offset 88 ($58): Class 0, good data record, record length 80 ($50)
Offset 176 ($b0): Class 0, good data record, record length 80 ($50)
Offset 264 ($108): Class 0, good data record, record length 80 ($50)

I guess I am missing something here. Just to ensure that we are talking about the same: The new test is supposed to look like this:

  1. Boot your OS with the Sony drive connected, and with a tape that is suitable for running the list test.
  2. Keep the Sony drive connected until the list command.
  3. Before the list command disconnect the drive and launch s2p with an image file (the one I am going to create) that has the same contents as the tape present when booting. The s2p drive name has to be the same as that of your drive (/etc/s2p.conf).
  4. Now run the list command with the OS.
  5. The OS will not have noticed that s2p is now there instead of your drive, for the reasons given above. Even if my reasoning was wrong, we would notice that in the logs.
  6. With this setup we will learn whether whatever goes wrong with s2p is triggered by something that happened before the list command, or whether it is really the list command itself where something goes wrong.

@Pacjunk

Pacjunk commented Dec 8, 2024

The steps that you listed do not detect the tape change.

Mounted the real tape, then swapped to the emulated for the list: (attempts the 90 byte read)
list1.txt

The tape change issue occurs the other direction. ie mount the tape with s2p, then try to list with the real tape drive.

I think the issue is the backup/list command itself. As I stated before, the mount/foreign doesn't really care about the tape format.

Anyway, packing up for the night.

@uweseimet
Owner Author

uweseimet commented Dec 8, 2024

Thank you for your help today. The last test indeed says that it is the list command and nothing that happened before.

But let's verify this with the other test, because we have to ensure with all the means we have that we do not miss anything.
The OS can detect that the drive has changed because, in contrast to s2p, the drive reports UNIT ATTENTION after power on. Again I claim that we are smarter than the OS. We will snatch this information away from it.
In order to do that, after switching on the real drive and before having the OS execute the list command, launch s2pexec. Manually run a TEST UNIT READY on the s2pexec command line:

s2pexec>-i 3 -c 00:00:00:00:00:00

The drive will report UNIT ATTENTION, but it will only do this once. You can run the command above twice to verify. Now quit s2pexec and run the list operation with your OS. Note that if the tape/image contents are not identical, the OS may still notice the change. I am prepared to provide a faked .tap file if required.
Also ensure that s2p uses the same drive name as your real drive. Use the latest issue_112 sources for this test, because I will tweak s2p again to signal support of synchronous commands, just like your drive. In theory your OS might not just compare names, but the full INQUIRY data.

We expect that the list operation will work with this test scenario, don't we? s2p has done the mount, and the real drive will execute the list operation.
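
The "reported only once" behaviour described above can be illustrated with a small model. This is only a sketch, not code from s2p or from the drive firmware: the class `TapeTarget` and its method names are hypothetical. Only the numeric values (status GOOD 0x00, status CHECK CONDITION 0x02, sense key 0x06 UNIT ATTENTION, additional sense code 0x29 for power on/reset) are taken from the SCSI specification.

```python
GOOD = 0x00
CHECK_CONDITION = 0x02
NO_SENSE = 0x00
UNIT_ATTENTION = 0x06
ASC_POWER_ON_RESET = 0x29


class TapeTarget:
    """Hypothetical target model, only for illustrating UNIT ATTENTION."""

    def __init__(self, reports_unit_attention=True):
        # Assumption: a real drive has a unit attention condition pending
        # after power-on, whereas the emulation may not report one.
        self.pending_unit_attention = reports_unit_attention
        self.sense = (NO_SENSE, 0x00)

    def test_unit_ready(self):
        if self.pending_unit_attention:
            # The condition is reported once and is then cleared.
            self.pending_unit_attention = False
            self.sense = (UNIT_ATTENTION, ASC_POWER_ON_RESET)
            return CHECK_CONDITION
        self.sense = (NO_SENSE, 0x00)
        return GOOD


drive = TapeTarget()
first = drive.test_unit_ready()   # consumed manually, e.g. with s2pexec
second = drive.test_unit_ready()  # what the OS would see afterwards
print(hex(first), hex(second))
```

In this model the first TEST UNIT READY consumes the unit attention condition, so the next initiator command (the one sent by the OS) no longer sees it.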

Just in case it should be needed, these are the first 4 blocks of your backup file full.txt as simh file:
backup.tap.zip
The s2psimh dump is:

Offset 0 ($0): Class 0, good data record, record length 80 ($50)
00000000  56:4f:4c:31:56:4d:53:30:35:34:20:20:20:20:20:20  'VOL1VMS054      '
00000010  00:20:20:20:20:20:20:20:20:20:20:20:20:20:20:20  '.               '
00000020  20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:20  '                '
00000030  20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:20  '                '
00000040  20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:33  '               3'
Offset 88 ($58): Class 0, good data record, record length 80 ($50)
00000000  48:44:52:31:56:4d:53:30:35:34:2e:41:20:20:20:20  'HDR1VMS054.A    '
00000010  20:20:20:20:20:56:4d:53:30:35:34:30:30:30:31:30  '     VMS05400010'
00000020  30:30:31:30:30:30:31:30:30:20:39:30:32:30:31:20  '001000100 90201 '
00000030  39:30:32:30:31:20:30:30:30:30:30:30:44:45:43:46  '90201 000000DECF'
00000040  49:4c:45:31:31:41:20:20:20:20:20:20:20:20:20:20  'ILE11A          '
Offset 176 ($b0): Class 0, good data record, record length 80 ($50)
00000000  48:44:52:32:46:30:38:31:39:32:30:38:31:39:32:20  'HDR2F0819208192 '
00000010  20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:20  '                '
00000020  20:20:20:20:4d:20:20:20:20:20:20:20:20:20:20:20  '    M           '
00000030  20:20:30:30:20:20:20:20:20:20:20:20:20:20:20:20  '  00            '
00000040  20:20:20:20:20:20:20:20:20:20:20:20:20:20:20:20  '                '
Offset 264 ($108): Class 0, good data record, record length 80 ($50)
00000000  48:44:52:33:32:30:30:30:30:30:30:31:30:30:30:30  'HDR3200000010000'
00000010  30:30:30:30:30:30:30:31:30:30:30:30:30:30:30:30  '0000000100000000'
00000020  30:30:30:30:30:30:30:30:32:30:30:30:30:30:30:30  '0000000020000000'
00000030  30:30:30:30:30:30:30:30:30:30:30:30:30:30:30:30  '0000000000000000'
00000040  30:30:30:30:20:20:20:20:20:20:20:20:20:20:20:20  '0000            '
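
For readers unfamiliar with the simh format, the offsets in the dump above (0, 88, 176, 264) follow from the record layout: a 4-byte little-endian length, the data, and a trailing copy of the length, i.e. 80 + 8 bytes per record. The following is a minimal sketch of a reader for such a file, not the s2psimh implementation. The function names `walk_tap` and `make_record` are hypothetical, the label contents are synthetic (padded with spaces only), and only plain data records and tape marks are handled.

```python
import io
import struct


def walk_tap(stream):
    """Yield (offset, record_class, length, data) for each data record."""
    offset = 0
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return
        (meta,) = struct.unpack("<I", header)
        if meta == 0xFFFFFFFF:  # end-of-medium marker
            return
        if meta == 0:  # tape mark: no data and no trailing length
            offset += 4
            continue
        record_class = meta >> 28       # upper 4 bits, 0 = good data
        length = meta & 0x0FFFFFFF      # lower 28 bits
        data = stream.read(length)
        stream.read(length & 1)         # data is padded to an even size
        stream.read(4)                  # trailing copy of the length
        yield offset, record_class, length, data
        offset += 4 + length + (length & 1) + 4


def make_record(data):
    """Build one synthetic data record (class 0) for the demonstration."""
    length = struct.pack("<I", len(data))
    return length + data + b"\x00" * (len(data) & 1) + length


# Synthetic image with two 80-byte label records, not the real tape contents
image = make_record(b"VOL1VMS054".ljust(80)) + make_record(b"HDR1VMS054.A".ljust(80))
records = list(walk_tap(io.BytesIO(image)))
for offset, record_class, length, data in records:
    print(f"Offset {offset} (${offset:x}): Class {record_class}, "
          f"record length {length} (${length:x}), starts with {data[:4]!r}")
```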

A tool that creates simh .tap files from real tapes would be useful, wouldn't it? Then you do not have to do things like that manually, and can convert complete tapes. #113

@Pacjunk

Pacjunk commented Dec 9, 2024

That would be fantastic. Then we would know the tapes were identical.

I have been looking at the scsimon trace (I captured doing a list to the real tape drive) in GTKWAVE and found the read command for 90 bytes. Data seems to flow according to the trace...

The command:
1
I assume the zero after the msg line goes high is a success status? Don't know what the 04 is.

2
3
Start of the data flowing in... (at the 'V'):
4
Obvious reading of the label
5

There are a couple of other examples of the 5A read and they return the HDR1 and HDR2 blocks. Same codes in between the read command and data flowing.
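
The "5A read" mentioned above is a READ(6) with a transfer length of 0x5a = 90 bytes. As a sketch of how such a command block decodes (sequential-access READ(6): opcode 0x08, FIXED and SILI bits in byte 1, 24-bit big-endian transfer length in bytes 2 to 4): the function name `decode_read6` is hypothetical, and the example CDB is an assumption built by analogy with the `08:00:00:00:20:00` command in the test script of this ticket, not copied from the trace.

```python
def decode_read6(cdb):
    """Decode a sequential-access READ(6) command descriptor block."""
    if len(cdb) != 6 or cdb[0] != 0x08:
        raise ValueError("not a READ(6) CDB")
    return {
        "fixed": bool(cdb[1] & 0x01),  # 0 = variable-length block mode
        "sili": bool(cdb[1] & 0x02),   # suppress incorrect length indicator
        "transfer_length": int.from_bytes(cdb[2:5], "big"),
    }


# Assumed CDB for the 90 byte read, in the notation used by the test scripts
cdb = bytes.fromhex("08:00:00:00:5a:00".replace(":", ""))
decoded = decode_read6(cdb)
print(decoded)
```

With FIXED = 0 the transfer length is a byte count, so this command asks for up to 90 bytes from the next block.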

@uweseimet
Owner Author

uweseimet commented Dec 9, 2024

At least for the current tests, where I faked the tape contents, we can be sure that everything is identical. s2psimh and s2pexec show that. If you read 4 blocks from a tape and the contents are the same as what s2psimh displays for a file with 4 blocks, everything is fine.

@Pacjunk

Pacjunk commented Dec 10, 2024

Any comment on the scsimon trace above?

@uweseimet
Owner Author

uweseimet commented Dec 10, 2024

I am not sure. In case you think these data show that a 90 byte read for an 80 byte block is successful, you would have to resolve the contradiction with the specification, and explain why the two streamers we use for testing both report CHECK CONDITION when you try to read 90 bytes from an 80 byte block.
If you have an explanation for that it would be great, of course. But whatever happens, it must be consistent with the specification, and with what our streamers do.
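
As a sketch of what the specification describes for this case (variable-length mode, FIXED = 0 and SILI = 0): the block is transferred, but the command ends with CHECK CONDITION, and the sense data carry the ILI bit, sense key NO SENSE and the difference between requested and actual length in the information field. The function name `read_variable_block` is hypothetical and this is not the s2p code. The sense data use the fixed format with response code 0x70.

```python
GOOD = 0x00
CHECK_CONDITION = 0x02
ILI = 0x20        # incorrect length indicator, byte 2 of fixed sense data
NO_SENSE = 0x00


def read_variable_block(requested, block):
    """Model a READ(6) with FIXED=0, SILI=0 for a single tape block.

    Returns (status, data, sense), where sense is None for GOOD status.
    """
    data = block[:requested]
    if requested == len(block):
        return GOOD, data, None
    # Information field: requested length minus actual block length,
    # stored as a 32-bit two's complement value.
    residue = (requested - len(block)) & 0xFFFFFFFF
    sense = (
        bytes([0x80 | 0x70, 0x00, ILI | NO_SENSE])  # VALID bit + fixed format
        + residue.to_bytes(4, "big")
        + bytes([10])                               # additional sense length
        + bytes(10)
    )
    return CHECK_CONDITION, data, sense


status, data, sense = read_variable_block(90, bytes(80))
print(hex(status), len(data), int.from_bytes(sense[3:7], "big"))
```

For a 90 byte request on an 80 byte block this model returns all 80 bytes together with CHECK CONDITION and an information field of 10, which is consistent with data being visible in a bus trace even though the command does not end with GOOD status.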

Or do you think this shows that the 90 byte read command actually is sent by the OS? Yes, this is probably what the data show. This would mean that I was wrong to consider it more likely that it is not sent. And then the question remains: why does the OS react differently to the respective CHECK CONDITION situation with s2p than with your streamer, even though the logs show that the same data are returned?

@Pacjunk

Pacjunk commented Dec 10, 2024

I don't know what it means, but I have not left out any of the trace. It goes straight from sending the command to retrieving the data. I don't know what the 00,04,00,08,48,80 values are in between the command and the data. You would know better than me!

@uweseimet
Owner Author

uweseimet commented Dec 10, 2024

I'm afraid I do not know. There cannot be any data values that are not also visible somewhere in the s2p logs. 00, 04 are rather ubiquitous. 80 may be 80 bytes as part of a read command (if the value is decimal), but can also be something else. 48 is the ASCII code of "0" if read as decimal (read as hexadecimal, 0x48 would be "H"), also ubiquitous.

Please run the test with the switched drive. It would at least provide a bit more information. I don't think that the positioning fixes done for #112 are also relevant for this ticket, but checking whether something has improved with the latest code also for #111 may be worth trying.

@uweseimet
Owner Author

Just for fun I tried scsimon with Linux and my tape drive. The sg_inq Linux tool sends two INQUIRY commands and, depending on the outcome, a REQUEST SENSE. A very simple scenario.
scsimon misses about 2 bytes out of 3. Therefore, the HTML output is different each time, but always incomplete and wrong.
A really useless tool, just as expected. A Pi 4 is much too slow to capture data fast enough. In theory it might work with a really slow initiator and/or target, but this is not a realistic scenario.

Labels
help wanted Extra attention is needed on hold
Projects
Status: In progress
Development

No branches or pull requests

3 participants