[WIP] Possible Archive.org check implementation #176

Closed
67 changes: 66 additions & 1 deletion src/js/core/background.js
@@ -61,6 +61,13 @@ const bookmarksorganizer = {
*/
disableConfirmations : false,

/**
* Disables automatic replacement with archive.org URL. It defaults to false, there is no user setting (yet).
*
* @type {boolean}
*/
disableAutomaticArchiveOrgReplacement : false,

/**
* Internal variable. It's only true while a check is running.
*
@@ -540,6 +547,60 @@ const bookmarksorganizer = {
}
},

/**
* This method sends a GET request to the archive.org API to check if there is a snapshot available for a given URL.
* It replaces the current broken URL with the archived URL.
Review comment by @simonsan (Author), Jan 21, 2022:

That is currently not true: I thought it would be cleaner to set the archived URL as `newUrl` and take the redirect approach.

*
* @param {bookmarks.BookmarkTreeNode} bookmark
* @returns {bookmarks.BookmarkTreeNode} - the bookmark object
*/
async checkArchiveOrgSnapshotAvailability (bookmark) {
const archive_api = 'https://archive.org/wayback/available?url=';
const snapshot_base_url = 'https://web.archive.org/web/';

try {
const controller = new AbortController();
const { signal } = controller;

setTimeout(() => controller.abort(), bookmarksorganizer.TIMEOUT_IN_MS);

const response = await fetch(`${archive_api}${bookmark.url}`, {
cache : 'no-store',
method : 'GET',
mode : 'no-cors',
signal : signal
});

const archived_snapshots = await response.json();

console.log(archived_snapshots);
Review comment by the author:

Debug


// Check for available snapshots
if (archived_snapshots.keys('archived_snapshots').length === 0) {
bookmark.status = STATUS.NOT_FOUND;
}
else {
bookmark.newUrl = `${snapshot_base_url}${bookmark.url}`;
bookmark.status = STATUS.REDIRECT;
}
Review comment by the author on lines +582 to +585:

Maybe as an addition here, we could rename the bookmark, though I'm not sure that works well with the approach of treating it as a possible redirect.
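
For reference, a minimal sketch of reading the availability response. It assumes the JSON shape documented for the Wayback Machine availability API: an `archived_snapshots` object that is empty when nothing is archived and otherwise holds a `closest` entry with `available` and `url` fields. The helper name `pickSnapshotUrl` is hypothetical and not part of this PR.

```javascript
// Hypothetical helper, not part of the PR: returns the closest snapshot URL
// from an availability API payload, or null if no snapshot is available.
function pickSnapshotUrl (payload) {
  const closest = payload && payload.archived_snapshots && payload.archived_snapshots.closest;

  if (!closest || closest.available !== true || typeof closest.url !== 'string') {
    return null;
  }

  return closest.url;
}

// Example payloads in the assumed response shape
const found = {
  archived_snapshots : {
    closest : {
      available : true,
      status : '200',
      timestamp : '20130919044612',
      url : 'http://web.archive.org/web/20130919044612/http://example.com/'
    }
  }
};
const missing = { archived_snapshots : {} };

console.log(pickSnapshotUrl(found));
console.log(pickSnapshotUrl(missing));
```

With a helper like this, the returned URL could be stored as `bookmark.newUrl`, which would point at a concrete snapshot instead of the generic `https://web.archive.org/web/` prefix.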


return bookmark;
}
catch (error) {
let cause = 'snapshot-availability-fetch-error';

if (error.name === 'AbortError') {
// eslint-disable-next-line require-atomic-updates
bookmark.status = STATUS.TIMEOUT;
cause = 'timeout';
}
else {
// eslint-disable-next-line require-atomic-updates
bookmark.status = STATUS.FETCH_ERROR;
}
}
},
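
A small sketch of how the request URL could be assembled, assuming the bookmark URL should be percent-encoded before it is appended as the `url` query parameter. Without encoding, a `&` or `#` inside the bookmark URL would be parsed as part of the API request itself. `buildAvailabilityUrl` and `ARCHIVE_API` are hypothetical names used only for this illustration.

```javascript
// Same endpoint as in the diff above
const ARCHIVE_API = 'https://archive.org/wayback/available?url=';

// Hypothetical helper: encodes the bookmark URL so that its own query string
// and fragment stay inside the `url` parameter.
function buildAvailabilityUrl (bookmarkUrl) {
  return `${ARCHIVE_API}${encodeURIComponent(bookmarkUrl)}`;
}

console.log(buildAvailabilityUrl('https://example.com/?a=1&b=2#top'));
```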

/**
* This method sends a fetch request to check if a bookmark is broken or not, called by checkForBrokenBookmark().
*
@@ -612,7 +673,11 @@ const bookmarksorganizer = {
});
}

if (bookmark.status > STATUS.REDIRECT) {
if (bookmark.status == STATUS.NOT_FOUND) {
Review comment by @cadeyrn (Owner), Jan 25, 2022:

Please always use strict equality operators (`===` instead of `==`).

By the way: you can run the linter with `npm run lint:js`. It complains about things like that. :)
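
A short illustration of why linters flag `==`: loose equality coerces its operands before comparing, so values of different types can compare equal. The `STATUS` value and the `statusFromText` variable below are made up for the example and are not the project's constants.

```javascript
// Made-up values for illustration only
const STATUS = { NOT_FOUND : 404 };
const statusFromText = '404';

// Loose equality converts the string to a number before comparing
console.log(statusFromText == STATUS.NOT_FOUND);  // true

// Strict equality also compares types, so string and number differ
console.log(statusFromText === STATUS.NOT_FOUND); // false
```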

console.log('Checking for archive.org availability...');
await bookmarksorganizer.checkArchiveOrgSnapshotAvailability(bookmark);
}
else if (bookmark.status > STATUS.NOT_FOUND) {
if (bookmark.attempts < bookmarksorganizer.MAX_ATTEMPTS) {
await bookmarksorganizer.checkHttpResponse(bookmark, 'GET');
}