Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring code for better performance and code quality #280

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

maldil
Copy link

@maldil maldil commented Jun 29, 2022

Hi
Thank you very much for your excellent work in tensorflow/transform

I am a graduate student at the University of Colorado, studying the best practices of evolving ML codes. From our research, one of the most common evolution best practice in ML code is the migration of loop-based computations, since it improves performance and code quality. We made the following changes in tensorflow/transform, which remove the FOR loop and use NumPy APIs and List Comp. I carefully checked the modification to ensure that it does not break the code. I will gladly contribute. Please help me to merge this.

Copy link
Member

@zoyahav zoyahav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

@@ -14,7 +14,7 @@
"""Utilities for information-theoretic preprocessing algorithms."""

import math

import numpy as np
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this is a library based on tensorflow, so all of the math operations done below are deferred by constructing the tensorflow graph, and should not be using numpy.
There are tensorflow alternatives for these operations though.

Were you able to run the tests in this repo to make sure that they don't break with the change?

Copy link
Author

@maldil maldil Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your response. I'm new to this repository. Before adding the new import, I checked to see if this repo makes use of numpy APIs (https://github.com/tensorflow/transform/search?q=numpy). After confirming that , I automatically analyzed the code to identify any missed chances to use the APIs. Please close the PR if this script is not intended to use numpy APIs. Regarding test. No I did not. Now I'm trying to run the test. However, I am having difficulty locating the test scripts. Any assistance is much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants