@drbenvincent The function below currently has a docstring, and I presume that the DATASETS dict holds the names of the datasets. Do you want the docstring to be more detailed, as in describing the directory change and then the reading of the CSV?
As for the DATASETS dictionary, do you want short descriptions of the datasets to be added to it?
def load_data(dataset: str = None) -> pd.DataFrame:
    """Loads the requested dataset and returns a pandas DataFrame.

    :param dataset: The desired dataset to load
    """
    if dataset in DATASETS:
        data_dir = _get_data_home()
        datafile = DATASETS[dataset]
        file_path = data_dir / datafile["filename"]
        return pd.read_csv(file_path)
    else:
        raise ValueError(f"Dataset {dataset} not found!")
So the idea is simply to provide a more informative docstring that lists out all the datasets (valid values of the dataset kwarg).
Actually, that might involve quite a lot of duplication and could be error-prone as new datasets are added. Just an idea, but how about a helper function called something like list_datasets, which prints out the info in DATASETS in a readable way? The docstring of load_data could then refer to list_datasets for the full list of available datasets.
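To make the list_datasets idea concrete, here is a minimal sketch. The DATASETS contents below are a hypothetical stand-in: the dataset names and the "description" key are assumptions for illustration, and only the "filename" key is known from load_data. The helper returns a string rather than printing directly, so the caller decides whether to print it.

```python
# Hypothetical stand-in for the real DATASETS dict. The dataset names and
# the "description" key are assumptions made purely for illustration.
DATASETS = {
    "example_a": {
        "filename": "example_a.csv",
        "description": "First example dataset",
    },
    "example_b": {
        "filename": "example_b.csv",
        "description": "Second example dataset",
    },
}


def list_datasets() -> str:
    """Return a human-readable summary of the available datasets."""
    # Pad names to a common width so the descriptions line up in a column.
    width = max(len(name) for name in DATASETS)
    lines = [
        f"{name.ljust(width)}  {info.get('description', '(no description)')}"
        for name, info in sorted(DATASETS.items())
    ]
    return "\n".join(lines)


print(list_datasets())
```

Because the summary is built from DATASETS at call time, a newly added dataset would show up without any docstring edits.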
That idea isn't so good actually, because for the API docs on the website to be useful, the info has to be in the docstring itself. I wonder if there's a Sphinx command to automatically generate docstring text from the DATASETS dict? If not, then I guess it's a case of manually entering the info into the docstring.
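One possible way to avoid manual entry, sketched here under assumptions, is to build the docstring text at import time with a small decorator. As far as I know, Sphinx autodoc imports the module and reads `__doc__`, so text appended this way should appear in the rendered API docs, but that is worth verifying on the actual docs build. The decorator name document_datasets, the dataset names, and the "description" key are all hypothetical, and the body of load_data is omitted because only the docstring matters here.

```python
# Hypothetical stand-in for the real DATASETS dict; names and the
# "description" key are assumptions made for illustration.
DATASETS = {
    "example_a": {"filename": "example_a.csv", "description": "First example dataset"},
    "example_b": {"filename": "example_b.csv", "description": "Second example dataset"},
}


def _dataset_bullets() -> str:
    """Build a reStructuredText bullet list from the DATASETS dict."""
    return "\n".join(
        f"    * ``{name}``: {info['description']}"
        for name, info in sorted(DATASETS.items())
    )


def document_datasets(func):
    """Append the list of available datasets to the function's docstring."""
    # func.__doc__ is None when Python runs with -OO, hence the fallback.
    func.__doc__ = (
        (func.__doc__ or "")
        + "\n    Available datasets:\n\n"
        + _dataset_bullets()
        + "\n"
    )
    return func


@document_datasets
def load_data(dataset=None):
    """Loads the requested dataset and returns a pandas DataFrame.

    :param dataset: The desired dataset to load
    """
    raise NotImplementedError("body omitted; only the docstring matters here")


print(load_data.__doc__)
```

The trade-off is that the full docstring is no longer visible when reading the source, only at runtime and in the generated docs.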