Question regarding dataset #38

Open
MinaBasem opened this issue Apr 21, 2024 · 0 comments

Hello Philip,

I am working on a GUI mock data generation project that (as the name states) generates fake data such as first names, last names, countries, etc.

I was looking for a more realistic way to generate names that match their corresponding countries, and I came across your repository. I've tried tinkering with the API, but the execution time is too long for mass data generation.

My question is whether there is a way to request multiple names in a single API call. If not, I am considering using the original dataset to create my own algorithm that doesn't need API calls. Before doing that, I wanted to check whether the 3.3GB file contains duplicate rows and, if so, see some examples of what is duplicated (I currently cannot download the dataset on my machine).

The point is that if a significant share of the data is duplicated, I might try to shrink the dataset by removing as many duplicates as I can, so that the algorithm can run locally, which would be much faster than waiting for API responses.
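
To make the idea concrete, here is a rough sketch of the kind of deduplication pass I have in mind. I have not seen the file yet, so this assumes it is a UTF-8 text file with one record per line; the function names and file paths are placeholders of mine, not anything from the repository. It streams the file line by line and remembers only a 16-byte hash of each row, so the whole 3.3GB never has to sit in memory at once (the set of hashes still grows with the number of distinct rows).

```python
import hashlib
from typing import Iterable, Iterator


def unique_lines(lines: Iterable[str], stats: dict) -> Iterator[str]:
    """Yield each distinct row once, in first-seen order.

    Fills stats["total"] and stats["duplicates"] while iterating.
    """
    seen = set()  # holds 16-byte digests rather than the full rows
    stats["total"] = 0
    stats["duplicates"] = 0
    for line in lines:
        row = line.rstrip("\r\n")
        if not row:
            continue  # ignore blank lines
        stats["total"] += 1
        digest = hashlib.blake2b(row.encode("utf-8"), digest_size=16).digest()
        if digest in seen:
            stats["duplicates"] += 1
            continue
        seen.add(digest)
        yield row


def dedupe_file(src_path: str, dst_path: str) -> dict:
    """Copy src_path to dst_path without repeated rows and return the counts."""
    stats: dict = {}
    with open(src_path, encoding="utf-8", newline="") as src, \
            open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        for row in unique_lines(src, stats):
            dst.write(row + "\n")
    return stats


if __name__ == "__main__":
    # Tiny in-memory sample standing in for the real file (made-up rows).
    sample = ["Anna,Schmidt,DE\n", "John,Smith,US\n", "Anna,Schmidt,DE\n"]
    counts: dict = {}
    for kept in unique_lines(sample, counts):
        print(kept)
    print(counts)
```

This only catches rows that are identical character for character. Rows that differ in casing or whitespace would need to be normalized first, and whether that is appropriate depends on what the dataset actually contains.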

Regards.
