Chapter 2 - Download the Data #178

Open
mzeman1 opened this issue May 25, 2020 · 11 comments

@mzeman1

mzeman1 commented May 25, 2020

This code doesn't work for me:

import os
import tarfile
import urllib
DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    os.makedirs(housing_path, exist_ok=True)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

@Cpauls35

Same issue here, running on my Jetson Nano. When I run this code I get a urllib request error. Importing urllib.request fixed that; however, even after calling the function, no directory is created. I am currently investigating the path, as that doesn't work either.
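
For readers landing here with the same first error: `import urllib` on its own does not reliably expose the `urllib.request` submodule, so importing it explicitly avoids the `AttributeError`. Below is a minimal sketch of the fetch function with that import and a `with` block around the archive. The name `fetch_archive` and its parameters are illustrative assumptions, not the notebook's own code.

```python
import os
import tarfile
import urllib.request  # import the submodule explicitly; `import urllib` alone may not expose it


def fetch_archive(url, dest_dir, archive_name="housing.tgz"):
    """Download a .tgz archive from `url` into `dest_dir` and extract it there.

    Hypothetical helper for illustration; mirrors the steps of the
    fetch_housing_data() function quoted in this thread.
    """
    os.makedirs(dest_dir, exist_ok=True)            # create the target directory if needed
    archive_path = os.path.join(dest_dir, archive_name)
    urllib.request.urlretrieve(url, archive_path)   # raises on HTTP or network errors
    with tarfile.open(archive_path) as archive:     # closes the archive even if extraction fails
        archive.extractall(path=dest_dir)
    return archive_path
```

Note that nothing is downloaded or created until the function is actually called, for example `fetch_archive(HOUSING_URL, HOUSING_PATH)` with the constants from the original post.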

@dgmorrow19

Did you call the function (which is in the next cell)?
fetch_housing_data()

@Cpauls35

fetch_housing_data() called... Error output
HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

@NVivek

NVivek commented May 26, 2020

from __future__ import division, print_function, unicode_literals

import numpy as np
import os
import pandas as pd
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

@mzeman1
Author

mzeman1 commented May 26, 2020

I didn't. But now, it gave me this error:

URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1108)>

@2807754

2807754 commented Jun 12, 2020

Hello community, I just started this interesting book, but I ran into a problem with the following code:

%matplotlib inline
import matplotlib.pyplot as plt
housing.hist(bins=50, figsize=(20,15))
save_fig("attribute_histogram_plots")
plt.show()

When I run it, it shows the following error:


AttributeError Traceback (most recent call last)
in
----> 1 get_ipython().run_line_magic('matplotlib', 'inline')
2 import matplotlib.pyplot as plt
3 housing.hist(bins=50, figsize=(20,15))
4 save_fig("attribute_histogram_plots")
5 plt.show()

/opt/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
2305 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
2306 with self.builtin_trap:
-> 2307 result = fn(*args, **kwargs)
2308 return result
2309

</opt/anaconda3/lib/python3.7/site-packages/decorator.py:decorator-gen-108> in matplotlib(self, line)

/opt/anaconda3/lib/python3.7/site-packages/IPython/core/magic.py in (f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):

/opt/anaconda3/lib/python3.7/site-packages/IPython/core/magics/pylab.py in matplotlib(self, line)
97 print("Available matplotlib backends: %s" % backends_list)
98 else:
---> 99 gui, backend = self.shell.enable_matplotlib(args.gui.lower() if isinstance(args.gui, str) else args.gui)
100 self._show_matplotlib_backend(args.gui, backend)
101

/opt/anaconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py in enable_matplotlib(self, gui)
3405 gui, backend = pt.find_gui_and_backend(self.pylab_gui_select)
3406
-> 3407 pt.activate_matplotlib(backend)
3408 pt.configure_inline_support(self, backend)
3409

/opt/anaconda3/lib/python3.7/site-packages/IPython/core/pylabtools.py in activate_matplotlib(backend)
304
305 import matplotlib
--> 306 matplotlib.interactive(True)
307
308 # Matplotlib had a bug where even switch_backend could not force

AttributeError: module 'matplotlib' has no attribute 'interactive'

Need help to solve this exercise.
Many thanks

@zkDreamer

I had the same error

@ageron
Owner

ageron commented Mar 26, 2021

Hi there,

@mzeman1, you're running into a very common problem linked to how Python is installed on macOS. You need to install the SSL certificates. I explain how in the FAQ.

@Cpauls35, getting an HTTP 404 error is weird. It means that the URL is invalid. The only explanation I can see is that there's a typo in your code. Please make sure you're running exactly the same code as in the notebook. If it still doesn't work, please check your network settings: perhaps a firewall or proxy is messing things up. In any case, if you run the notebook in Colab, you will see that everything works fine.

@2807754 and @zkDreamer, this StackOverflow question seems to have an accepted answer that may fix your problem: in short, uninstall matplotlib and reinstall it.

Hope this helps.
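
To tell the two download failures in this thread apart (a 404 from the server versus an SSL or network problem), it can help to catch them separately. This is a rough sketch; the function name `diagnose_download` and its return strings are assumptions made for illustration. `HTTPError` is a subclass of `URLError`, so it has to be caught first.

```python
import urllib.error
import urllib.request


def diagnose_download(url, timeout=10):
    """Try to open `url` and report which kind of failure occurred, if any.

    Hypothetical helper for illustration only.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return "OK"
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404:
        # check the URL for typos.
        return f"HTTP error {err.code}: {err.reason}"
    except urllib.error.URLError as err:
        # No valid response at all: certificate verification failures,
        # DNS problems, proxies and firewalls end up here.
        return f"URL error: {err.reason}"
```

Calling `diagnose_download(HOUSING_URL)` with the URL from the notebook should then indicate whether the URL itself or the local setup is the more likely culprit.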

@AlejandorLazaro

AlejandorLazaro commented Jul 19, 2022

@mzeman1, I just had this same error (on macOS Monterey 12.2.1 (21D62) on an M1 MacBook Air), and the following GitHub answer solved the problem for me.

Cadene/pretrained-models.pytorch#193 (comment)

I reworked the data fetching logic for Chapter 2 into the following, which worked on my machine:

def fetch_data(url, path, archive_name):
    # Workaround for https://github.com/Cadene/pretrained-models.pytorch/issues/193#issuecomment-635730515
    import ssl
    ssl._create_default_https_context = ssl._create_unverified_context

    os.makedirs(path, exist_ok=True)
    archive_path = os.path.join(path, archive_name)
    urllib.request.urlretrieve(url, archive_path)
    archive = tarfile.open(archive_path)
    archive.extractall(path)
    archive.close()

@ageron
Owner

ageron commented Sep 26, 2022

@AlejandorLazaro, please don't do this! It deactivates all SSL verification, basically destroying all SSL security. It's not the right solution. Instead, please install the root certificates by opening a terminal and running the following command (change 3.10 to whatever Python version you are using):

/Applications/Python\ 3.10/Install\ Certificates.command

This will install the certifi bundle of root certificates and solve the problem without destroying all security.

If you installed Python using MacPorts, then run sudo port install curl-ca-bundle instead.
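
As a quick way to see where Python looks for root certificates before and after installing them, the standard `ssl` module can report its default verification paths. This is only a diagnostic sketch, and `describe_ca_setup` is a hypothetical name. A count of zero loaded certificates is a hint rather than proof of a problem, since certificates in a `capath` directory are loaded on demand.

```python
import ssl


def describe_ca_setup():
    """Return the CA locations Python's ssl module uses by default.

    Hypothetical helper for illustration only.
    """
    paths = ssl.get_default_verify_paths()
    context = ssl.create_default_context()  # verification stays enabled
    return {
        "cafile": paths.cafile,                  # CA bundle file in use, or None if not found
        "capath": paths.capath,                  # CA directory in use, or None if not found
        "openssl_cafile": paths.openssl_cafile,  # OpenSSL's compiled-in default location
        "loaded_ca_certs": context.cert_store_stats()["x509_ca"],
    }


if __name__ == "__main__":
    for key, value in describe_ca_setup().items():
        print(f"{key}: {value}")
```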

@AlejandorLazaro

Whoops! Thanks for the response and correction there!
