-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEATURE] issue-44: Add support for file_encoding in sources #46
base: main
Are you sure you want to change the base?
Conversation
277d1d6
to
04e024f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution Anand. I tried running the code in my local but ran into an error. It'd be great if you could fix those.
Thanks, @shpiyu I have updated the code to resolve the issues. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, @shpiyu Pls review and test.
@@ -31,6 +31,8 @@ def __init__(self, source, params_map): | |||
self._src = source | |||
else: | |||
self._src = self.format_file_path(source, params_map) | |||
if not source.get('file_encoding'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ChillarAnand I was thinking to remove this default setting here and add defaults in pd.read_csv calls in file_reader and xml_reader. Like encoding=src.get('file_encoding', 'utf-8')
. The benefit is, if we set default closer to where it is being used it'd be easier for the reader to understand that utf-8 is the default encoding. Let me know your thoughts?
Description
Allow users to define custom encoding for files.
Changes Made
Add file_encoding support in sources.
Definition of Done
Before submitting this pull request, please ensure that the following criteria have been met:
Additional Notes
closes #44