Need Help: Getting this Error: `open_http': 400 BAD REQUEST #307
Comments
I also found a similar error. $ wayback_machine_downloader http://www.ogurayui.jp/ Getting snapshot pages....../usr/share/ruby/open-uri.rb:364:in |
same issue ! |
Hi guys, I found a solution for this. Use this updated version and it works. Credits to ShiftaDeband.
You can also uninstall the original nonfunctional gem, if you installed it previously, with `gem uninstall wayback_machine_downloader`. Note: I tried this and it's working now (tested on 6th October 2024). Don't forget to give a star to ShiftaDeband. |
Thanks hupictz, but it doesn't work for me. |
Use these commands in Windows PowerShell, after installing Ruby and downloading wayback-machine-downloader: |
Thank you very much, it has taken effect ! |
Nice! Does your site open correctly, as shown on web.archive.org? Or do you have the same problem after moving the downloaded files to your remote site, like in my previous screenshot? |
It only downloads HTML files; CSS and image files are not downloaded. I can't find the reason why this is happening. |
Yes, the same with different sites. P.S. Everything is okay now; it depends on the site. |
@intercoop I am able to download all types of files: .png, .css, .js, .gif, etc. Check this screenshot. |
Here is my error message from Kali Linux:
I'll look at the fork. Thanks! |
@Jacek216 No, for me, it works fine. I am not facing any issues. |
Also worked for me thank you @hupictz
|
Facing same issue. |
@muhammadbaqirjafari try this one: #307 (comment). It is working for everyone. |
Hello hupictz: Getting snapshot pages. found 0 snaphots to consider. No files to download. Why is this happening when I download sites? |
@intercoop I am also facing the same issue. There is no issue with the code. The problem is on the Internet Archive side: it's temporarily offline. My guess is that when it comes back online, it should work. One of these has to come back online for it to start working: https://web.archive.org/cdx/search/xd |
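To tell a service outage apart from a problem with the tool, you can probe the endpoint directly. Below is a minimal sketch, not code from the gem: `cdx_probe_uri` and `cdx_status` are hypothetical helper names, and the `url` and `limit` query parameters are assumptions based on the public Wayback CDX server rather than on anything shown in this thread.

```ruby
require "net/http"
require "uri"

# Endpoint quoted in the comment above. The query parameters used below
# (url, limit) are assumptions, not taken from the gem's source.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/xd"

# Build a small probe URL that asks for at most one capture of `site`.
def cdx_probe_uri(site, endpoint = CDX_ENDPOINT)
  uri = URI(endpoint)
  uri.query = URI.encode_www_form(url: site, limit: 1)
  uri
end

# Return the HTTP status code as a string, or a short description
# if the service cannot be reached at all.
def cdx_status(site, endpoint = CDX_ENDPOINT)
  uri = cdx_probe_uri(site, endpoint)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https",
                  open_timeout: 10, read_timeout: 30) do |http|
    http.get(uri.request_uri).code
  end
rescue StandardError => e
  "unreachable (#{e.class})"
end

# Example (performs a real network request):
#   puts cdx_status("example.com")
```

A `200` would suggest the endpoint is answering again, a `400` matches the error reported in this issue, and an "unreachable" result points to the archive being offline.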
The download is working now (confirmed on 30th Oct 2024). If you guys have any issues, please report them here. I'm happy to help. |
Hello, hupictz : |
Hello, I am trying to download this specific website which is now taken down
The method you outlined works wonders and manages to download the files. However, when I click index.html and open it in a browser, it just loads forever. Any ideas why? This is the command I used:
Thanks |
The script doesn't download URLs with suffixes like this: It apparently considers them erroneous and skips them, and on many sites they make up half of the content. As an example, these folders from the domain wellmetmk.ru |
Maybe an issue with timestamps? If you're providing timestamps (with |
@intercoop I can download all the images without any issues. If you share more details, I will try to help. |
If I open everything manually, the files exist. But the script with the --all key should download the files through all 301 redirects, and the redirects lead exactly to such files in folders with labels. Because of these labels, the script thinks that the timestamped folder does not match, but it does; it just has a suffix: _im, _js, _cs and so on. |
And even if you do not specify any restrictions, the redirect to such a folder will not be followed either, because the script does not process such folders and treats them as an error. The script starts by downloading a normal folder without any labels; inside it there is a redirect to the labeled folders, and it does not download them. |
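To illustrate the labeled folders described above, here is a rough sketch of how a capture URL could be split into timestamp, label, and original URL. It is not how the gem works today. `WAYBACK_PATH` and `parse_wayback_url` are hypothetical names, and the pattern assumes the label sits directly after the timestamp in the form `im_`, `js_`, `cs_` (two lowercase letters and an underscore), which is how I understand Wayback URLs to look.

```ruby
# Matches the path of a Wayback Machine capture URL:
#   /web/<timestamp><optional label>/<original URL>
# Assumption: a label is two lowercase letters followed by "_"
# (for example im_, js_, cs_).
WAYBACK_PATH = %r{/web/(?<timestamp>\d{4,14})(?<modifier>[a-z]{2}_)?/(?<original>.+)\z}

# Split a capture URL into its parts; returns nil if it is not a capture URL.
def parse_wayback_url(url)
  match = WAYBACK_PATH.match(url)
  return nil unless match

  {
    timestamp: match[:timestamp],
    modifier: match[:modifier], # nil when the folder has no label
    original: match[:original]
  }
end

# Example:
#   parse_wayback_url("https://web.archive.org/web/20190220201426im_/http://example.com/logo.png")
```

With something like this, a redirect target whose folder carries a label could be treated as the same snapshot timestamp instead of being rejected as an error.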
Got latest version from gem, ran plainly with no options:
|
@milescrawford, have you tried the solution from @hupictz above that uses the fork from @ShiftaDeband? I got the same error as you when running the latest version of the gem (2.3.1), but when I downloaded the fork and ran it using the instructions from @hupictz, I was able to use it without issue. Note: If you're on Mac, Linux, or BSD, you can use whatever terminal program you like to run the fork in Ruby instead of PowerShell, which is Windows exclusive. |
Thank you ! |
Holee crud this worked! thanks in perpetuity...this made my life. I had given up on resurrecting my old content... this changes my life... |
I installed this version of wayback-machine-downloader: This web site I am trying to get is all asp based, so it may be that asp breaks this tool. The site I am interested in is this: https://web.archive.org/web/20190220201426/http://www.twinsaabs.com/index.asp
I have never created a website with active server pages and am not sure how it is supposed to be, but this does not seem right. Could it be that the downloader is interpreting the asp and not downloading the file? |
including extra config settings, a proper rate limit, and a logger. Fixes: hartator#307 hartator#291 hartator#281 hartator#269 and probably others too
Hello, I have the 400 BAD REQUEST issue, and this method didn't work for me. When I use ruby wayback_machine_downloader, PowerShell doesn't recognize "ruby" as a command, even though it is installed along with the gem. And if I use Ruby instead of PowerShell, from the bin folder of ShiftaDeband's wayback_machine_downloader, I get the 400 bad request error. I also tried this method:
But it doesn't work. When I use cd downloads, PowerShell doesn't recognize the path. From where am I supposed to use this command? |
It works!!! thanks!!! I am using it in Windows trying to restore wordpress site |
@vmackey First of all, it sounds like Ruby isn't on your system PATH. If you run the Ruby installer, does it give you an option like "add Ruby executables to your PATH"? If so, make sure that's selected. Or if you can't do that, add Ruby to your PATH manually:
I think you can ignore the You want to be inside |
Actually it is |
Getting snapshot pages.../System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:359:in `open_http': 400 BAD REQUEST (OpenURI::HTTPError)
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:737:in `buffer_open'
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:212:in `block in open_loop'
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:210:in `catch'
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:210:in `open_loop'
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:151:in `open_uri'
from /System/Library/Frameworks/Ruby.framework/Versions/2.3/usr/lib/ruby/2.3.0/open-uri.rb:717:in `open'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader/archive_api.rb:13:in `get_raw_list_from_api'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:92:in `block in get_all_snapshots_to_consider'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in `times'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:91:in `get_all_snapshots_to_consider'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:105:in `get_file_list_curated'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:164:in `get_file_list_by_timestamp'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:309:in `file_list_by_timestamp'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/lib/wayback_machine_downloader.rb:192:in `download_files'
from /Library/Ruby/Gems/2.3.0/gems/wayback_machine_downloader-2.3.1/bin/wayback_machine_downloader:72:in `<top (required)>'
from /usr/local/bin/wayback_machine_downloader:22:in `load'
from /usr/local/bin/wayback_machine_downloader:22:in