ZoteroObjectUploadError:highlight annotation must be a PDF attachment #45

Open
Amniotic3 opened this issue Nov 15, 2024 · 13 comments

@Amniotic3

I checked the local SQLite database for this PDF document: in the itemAttachments table, the value of contentType is application/pdf. I also read the relevant code and found no obvious logic error, so I am reporting this as a bug. It affects batch reading of PDFs.

https://github.com/uniuuu/dataserver/blob/172584aa38a17a7e6ecca51e2bbc25b00450fad9/model/Item.inc.php#L1477

// Note, highlight, and underline supported for PDFs, EPUBs, and snapshots
if (in_array($this->annotationType, ["note", "highlight", "underline"])) {
	if (!in_array($parentItem->attachmentContentType, ['application/pdf', 'application/epub+zip', 'text/html'])) {
		throw new Exception(
			// TEMP
			//"Parent item $parentItem->libraryKey of $this->annotationType annotation must be a PDF, EPUB, or HTML attachment",
			"Parent item $parentItem->libraryKey of $this->annotationType annotation must be a PDF attachment",
			Z_ERROR_INVALID_INPUT
		);
	}
}

@uniuuu
Owner

uniuuu commented Nov 15, 2024

Hi @Amniotic3
Good day.
I suggest posting this upstream. However, they prefer that everything be posted in the forum: https://forums.zotero.org/discussions

You can see the original repository has the same code: https://github.com/zotero/dataserver/blob/master/model/Item.inc.php#L1477

https://github.com/uniuuu/dataserver/blob/172584aa38a17a7e6ecca51e2bbc25b00450fad9/model/Item.inc.php#L1477
The link above points to the fork, so its code is identical to upstream.

Custom changes are made via the files in https://github.com/uniuuu/zotprime/tree/development/stack/dataserver/config

Could you please share how to reproduce this error?

@Amniotic3
Author

Amniotic3 commented Nov 18, 2024

@uniuuu Good afternoon, I've been busy getting continuing education credits lately, so I apologize for not getting back in a timely manner.

I'm using a client built from the ZotPrime 2.8.2-rc/production branch. After logging into my admin account, I click the sync button, drag and drop a PDF copy of a journal article into the library, open it in Zotero's native reader, add highlight annotations, and click sync again. The error then comes up.

@Amniotic3
Author

Amniotic3 commented Nov 23, 2024

I fixed this bug.

Look at the log:

[Sat Nov 23 08:00:46.779931 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Executing SQL: SELECT mimeType FROM itemAttachments WHERE itemID=? with itemID=3847
[Sat Nov 23 08:00:46.779959 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Query Result mimeType: application/pdf
[Sat Nov 23 08:00:46.779994 2024] [php:warn] [pid 2516:tid 2516] [client 10.5.5.1:33522] PHP Warning:  iconv(): Wrong encoding, conversion from "UTF-8" to "ASCII//IGNORE" is not allowed in /var/www/zotero/model/Item.inc.php on line 3177
[Sat Nov 23 08:00:46.780139 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [ERROR] Invalid mimeType format: . Setting to empty.
[Sat Nov 23 08:00:46.783755 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Parent item details: Zotero_Item Object ...
[Sat Nov 23 08:00:46.783799 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Parent item attachmentContentType:
[Sat Nov 23 08:00:46.783809 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] [DEBUG]Invalid attachment type:  for annotationType: highlight
[Sat Nov 23 08:00:46.784476 2024] [php:notice] [pid 2516:tid 2516] [client 10.5.5.1:33522] Parent item 1/TZD7L6UZ of highlight annotation must be a PDF attachment new in /var/www/zotero/model/Item.inc.php:1490 (POST /users/1/items) (390762127e)

The problem is here:

[Sat Nov 23 08:00:46.779994 2024] [php:warn] [pid 2516:tid 2516] [client 10.5.5.1:33522] PHP Warning:  iconv(): Wrong encoding, conversion from "UTF-8" to "ASCII//IGNORE" is not allowed in 

https://github.com/uniuuu/dataserver/blob/6ed5455e6b2c7a23d0c4c547c72e2486a596d469/model/Item.inc.php#L3136

	/**
	 * Get the MIME type of an attachment (e.g. 'text/plain')
	 */
	private function getAttachmentMIMEType() {
		if (!$this->isAttachment()) {
			trigger_error("attachmentMIMEType can only be retrieved for attachment items", E_USER_ERROR);
		}
		
		if ($this->attachmentData['mimeType'] !== null) {
			return $this->attachmentData['mimeType'];
		}
		
		if (!$this->id) {
			return '';
		}
		
		$sql = "SELECT mimeType FROM itemAttachments WHERE itemID=?";
		$stmt = Zotero_DB::getStatement($sql, true, Zotero_Shards::getByLibraryID($this->libraryID));
		$mimeType = Zotero_DB::valueQueryFromStatement($stmt, $this->id);
		if (!$mimeType) {
			$mimeType = '';
		}
		
		// TEMP: Strip some invalid characters
		$mimeType = iconv("UTF-8", "ASCII//IGNORE", $mimeType);
		$mimeType = preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '', $mimeType);
		
		$this->attachmentData['mimeType'] = $mimeType;
		return $mimeType;
	}

I rewrote it to this:

private function getAttachmentMIMEType() {
    if (!$this->isAttachment()) {
        trigger_error("attachmentMIMEType can only be retrieved for attachment items", E_USER_ERROR);
    }

    // Return the cached value if it has already been loaded
    if ($this->attachmentData['mimeType'] !== null) {
        return $this->attachmentData['mimeType'];
    }

    if (!$this->id) {
        return '';
    }

    $sql = "SELECT mimeType FROM itemAttachments WHERE itemID=?";
    $stmt = Zotero_DB::getStatement($sql, true, Zotero_Shards::getByLibraryID($this->libraryID));
    $mimeType = Zotero_DB::valueQueryFromStatement($stmt, $this->id);

    if (!$mimeType) {
        $mimeType = '';
    } else {
        try {
            // Fix: use mb_convert_encoding() instead of iconv() with "ASCII//IGNORE"
            $mimeType = mb_convert_encoding($mimeType, "ASCII", "UTF-8");
        } catch (Throwable $e) {
            // Throwable also covers the ValueError that PHP 8 throws for an unknown encoding name
            error_log("[ERROR] mb_convert_encoding failed for mimeType: $mimeType. Error: " . $e->getMessage());
        }

        $mimeType = preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '', $mimeType);

        // Allow "+" and "_" so that types such as application/epub+zip pass the format check
        if (!preg_match('/^[a-zA-Z0-9\-\.\+_]+\/[a-zA-Z0-9\-\.\+_]+$/', $mimeType)) {
            error_log("[ERROR] Invalid mimeType format: $mimeType. Setting to empty.");
            $mimeType = '';
        }
    }

    $this->attachmentData['mimeType'] = $mimeType;
    return $mimeType;
}
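
One detail worth checking in a format check of this kind: application/epub+zip, one of the content types the annotation check accepts, contains a "+". Below is a minimal sketch of a validation pattern that accepts it. The helper name isValidMimeTypeFormat is hypothetical and not part of the dataserver code; the character set is taken from the restricted-name characters in RFC 6838.

```php
<?php
// Hypothetical helper for illustration; not part of zotero/dataserver.
function isValidMimeTypeFormat(string $mimeType): bool
{
    // Characters allowed in a type or subtype name (RFC 6838 restricted-name-chars).
    // The hyphen is placed last so it is read as a literal inside the class.
    $name = '[a-zA-Z0-9!#$&^_.+-]+';
    return preg_match('/^' . $name . '\/' . $name . '$/', $mimeType) === 1;
}

var_dump(isValidMimeTypeFormat('application/pdf'));      // bool(true)
var_dump(isValidMimeTypeFormat('application/epub+zip')); // bool(true)
var_dump(isValidMimeTypeFormat(''));                     // bool(false)
```

A pattern that omits "+" would reset EPUB attachments to an empty type and trigger the same annotation error for them.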

@Amniotic3
Author

Checking the server-side database, I can see that mimeType holds a concrete value,

but after the "// TEMP: Strip some invalid characters" step runs, the value is replaced with an empty one.

When we then add an annotation and sync it, the upload is rejected with an error,

because in the check in_array($parentItem->attachmentContentType, ['application/pdf', 'application/epub+zip', 'text/html']), attachmentContentType is already empty.

So whether or not you are annotating a PDF document, the server reports that the parent item is not a PDF attachment.

https://github.com/uniuuu/dataserver/blob/6ed5455e6b2c7a23d0c4c547c72e2486a596d469/model/Item.inc.php#L1476
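
The failure chain described above can be reproduced in isolation. This is a minimal sketch, assuming the iconv() call fails on the server as the PHP warning in the log indicates. iconv() returns false on failure, so the sketch substitutes false directly rather than depending on the platform's iconv implementation. The variable names are illustrative.

```php
<?php
// Stand-in for a failed iconv("UTF-8", "ASCII//IGNORE", ...) call:
// iconv() returns false when the conversion cannot be performed.
$mimeType = false;

// The same character-stripping step as in getAttachmentMIMEType().
// In PHP's default (non-strict) typing mode, false is coerced to an empty string here.
$mimeType = preg_replace('/[^\x{0009}\x{000a}\x{000d}\x{0020}-\x{D7FF}\x{E000}-\x{FFFD}]+/u', '', $mimeType);

// The annotation check then compares the empty string against the allowed types.
$allowed = ['application/pdf', 'application/epub+zip', 'text/html'];
$accepted = in_array($mimeType, $allowed);

var_dump($mimeType);  // string(0) ""
var_dump($accepted);  // bool(false)
```

The value stored in the database never reaches the annotation check, which is why the error appears even for a genuine PDF attachment.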

@uniuuu uniuuu reopened this Dec 12, 2024
@uniuuu
Owner

uniuuu commented Dec 12, 2024

Hi @Amniotic3

The custom files made within zotprime are located in stack/dataserver/config/.

However, stack/dataserver/dataserver points to a fork of the original dataserver repository, https://github.com/zotero/dataserver

Therefore, the best way to fix this is to submit a bug report to Zotero.
Alternatively, you can make the changes via custom zotprime files, but then they will not be reflected in the parent Zotero dataserver repository.

@Amniotic3 I can see that there is a new commit from the Zotero dataserver maintainers, zotero/dataserver@6ed5455, titled "Fix "0" for annotation field being removed".

@uniuuu
Owner

uniuuu commented Dec 12, 2024

@Amniotic3 I can see that there is a new commit from the Zotero dataserver maintainers, zotero/dataserver@6ed5455, titled "Fix "0" for annotation field being removed".

I added this commit under https://github.com/uniuuu/zotprime/tree/v2.8.4
@Amniotic3 Please try it.

@uniuuu uniuuu reopened this Dec 13, 2024
@uniuuu
Owner

uniuuu commented Dec 13, 2024

Hi @Amniotic3
Please let me know how you solved it.

@Amniotic3
Author

@uniuuu

Oh dear friend, sorry, I thought I had made this clear.

Here is what happened: with the help of GPT, I added several debug statements and found that the highlight's parent item loses its content type because of an error in the getAttachmentMIMEType function during its character-stripping step.

The problem is in this code:

https://github.com/zotero/dataserver/blob/31c5db7d5ab853399062af273e7042b6ef7111e8/model/Item.inc.php#L3159

$mimeType = iconv("UTF-8", "ASCII//IGNORE", $mimeType);

At GPT's suggestion, I replaced the iconv call with mb_convert_encoding, which solved the problem.

$mimeType = mb_convert_encoding($mimeType, "ASCII", "UTF-8");

After that, the metadata replacement step no longer produces errors.
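
Whichever conversion function is used, it helps to check its result before passing it on. Below is a hedged sketch, not the upstream implementation: stripNonAscii is a hypothetical helper that tries iconv() and falls back to a plain regex filter if iconv() is unavailable or returns false. Note also that mb_convert_encoding() does not drop unconvertible characters by default; it substitutes "?" unless mb_substitute_character() has been changed.

```php
<?php
// Hypothetical helper for illustration; not part of zotero/dataserver.
function stripNonAscii(string $value): string
{
    $converted = false;
    if (function_exists('iconv')) {
        // iconv() returns false (and emits a warning or notice) on failure;
        // @ silences that diagnostic because the failure is handled below.
        $converted = @iconv('UTF-8', 'ASCII//IGNORE', $value);
    }
    if ($converted === false) {
        // Fallback that does not depend on the platform's iconv: drop every non-ASCII byte.
        $converted = preg_replace('/[^\x00-\x7F]/', '', $value);
    }
    return $converted ?? '';
}

var_dump(stripNonAscii('application/pdf')); // string(15) "application/pdf"
```

With an explicit check for false, a failed conversion can no longer silently turn a valid MIME type into an empty string.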

@Amniotic3
Author

Aha, I've been poring over how to do HTTPS deployments for the past few days. It's super buggy and a headache to change 😭

@Amniotic3
Author

I'm even wondering if it's possible to replicate the functionality of Zotero's official website, such as account registration, password changes, maintenance of group permissions, and so on. The command-line approach is still pretty geeky, and other members of the team have been asking me whether it could be given a UI.

@Amniotic3
Author

@uniuuu

> Therefore, the best way to fix this is to submit a bug report to Zotero.

I'm really not very experienced with GitHub. Besides, this account is blank, with no history at all, so I guess it would not easily gain trust. I won't try, haha.

Amniotic3 added a commit to Amniotic3/zotprime that referenced this issue Dec 17, 2024
@uniuuu
Owner

uniuuu commented Dec 17, 2024

@Amniotic3 It's done through the forum at https://forums.zotero.org/discussions, not GitHub. Please check.

> I'm really not very experienced with GitHub. Besides, this account is blank, with no history at all, so I guess it would not easily gain trust. I won't try, haha.

@uniuuu
Owner

uniuuu commented Dec 17, 2024

Let's keep this issue open. I'll be reviewing it too.
