Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove Redundancy of Fields among curves #41

Merged
merged 12 commits into from
Mar 5, 2019

Conversation

Bhargavasomu
Copy link
Contributor

What was wrong?

As part of the refactoring, the first step is to remove the redundant field_elements.py and optimized_field_elements.py files and make it possible to develop an API which any curve can use, instead of mere copy pasting with few properties changes.

How was it fixed?

The field_properties has been generated. Also the field_elements and the optimized_field_elements files generate the fields FQ or FQP by making use of the curve name. Further the tests corresponding to these have been created in another file named test_field_elements.py which has been taken out from test_bn128_and_bls12_381.py (It shouldn't be hard to corelate the tests, just changed the syntax which is caused due to the breaking API change).

Note

This API hasn't still been integrated into the curves logic. I have just made the API change and mimiced the corresponding changes to the test cases. If this API is ok, then I will integrate them with the other curves. Please let me know the exhaustive list of references or explanations if you cannot co-relate anything to the already existing codebase. I will give you an explanation for all the questions posed.

Cute Animal Picture

put a cute animal picture link inside the parentheses

@Bhargavasomu Bhargavasomu force-pushed the refactor_v2 branch 2 times, most recently from 1f9a433 to c86163e Compare January 13, 2019 15:11
@Bhargavasomu
Copy link
Contributor Author

@pipermerriam @carver review please

@carver
Copy link
Collaborator

carver commented Jan 18, 2019

First thought (I'll come back and review more, soon):

I think you might be able to fit the new curves into the old tests by modifying this area:

@pytest.fixture(params=[bn128, optimized_bn128, bls12_381, optimized_bls12_381])
def lib(request):
return request.param
@pytest.fixture
def FQ(lib):
return lib.FQ

To do something like:

from functools import partial

@pytest.fixture(params=[
  bn128, optimized_bn128, bls12_381, optimized_bls12_381,
  "bn128", "optimized_bn128"])
def curve(request):
    return request.param

@pytest.fixture
def FQ(curve):
  if isinstance(curve, str):
    if curve.startswith('optimized_'):
      return partial(optimized_field_elements.FQ, curve_name=curve[10:])
    else:
      return partial(field_elements.FQ, curve_name=curve)
  else:
    return lib.FQ

That would allow testing the new models and the old models side by side, using the same tests, which is ideal during a refactor.

BTW, note that the partial only currently works with a keyword argument. It would work positionally if the argument order of the init were switched. That would be a very reasonable thing to do.

Copy link
Collaborator

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that's as far as I got for the day, I will come back again to continue another time.

py_ecc/fields/field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/field_elements.py Outdated Show resolved Hide resolved
IntOrFQ = Union[int, "FQ"]


class FQ(object):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we can just delete this FQ and import it from the regular (unoptimized) implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@carver I have observed this prior, but have just copy/pasted because to maintain symmetry between the optimized and the non-optimized version in terms of type hinting, imports etc at this point of time.

How about I change this after we complete this whole refactor thing?

py_ecc/fields/field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think that wraps it up for this round of reviews. Thanks for your work!

py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Outdated Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Show resolved Hide resolved
py_ecc/fields/optimized_field_elements.py Show resolved Hide resolved
@Bhargavasomu
Copy link
Contributor Author

Bhargavasomu commented Jan 25, 2019

@carver instead of using partial, I have manually created the classes to all the Fields of the curves, (thank guido that there is a type method :P). The code is now visually better to read and now visually a feast to those referring to the codebase. @pipermerriam I think you would be happy with this change, as I remember you asking to change this earlier as part of the other refactoring PR.

Further, I have removed the mypy checks, as they are messy right now. I will later integrate mypy, after this whole refactor thing is done with.

Also please note that in the curve files such as py_ecc/bn128/bn128_curve.py, I have done
from py_ecc.fields import bn128_FQ as FQ. I have done that so as to make the PR small and easily relatable for you guys (Recently experienced, how tough reviewing is). I know that this could cause debugging issues. Hence after this gets merged (preferabbly tonight), I would like to open another PR which would directly import bn128_FQ and use it with that name only, wherever apt (This also would be happening tonight).

Please review and let me know if any other changes are needed to done.

py_ecc/fields/field_elements.py Show resolved Hide resolved
if len(set(
[len(lm), len(hm), len(low), len(high), len(nm), len(new), self.degree + 1]
)) != 1:
raise Exception("Mismatch between the lengths of lm, hm, low, high, nm, new")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it is more work but it would be ideal to actually include which one was the mismatch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also please note that, we could display an error if only one of them all, don't match with the rest. Assuming that would be the scenario, I could cook something up.

o[i] += int(temp[degb + i] / b[degb])
for c in range(degb + 1):
temp[c + i] -= o[c]
return cast(Tuple[IntOrFQ], tuple(o[:deg(o) + 1]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this type is supposed to be Tuple[IntOrFQ, ...] or is this really a length-1 tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pipermerriam this actually should be Tuple[IntOrFQ, ...], but as I mentioned, is it ok to skip the type hinting as part of this PR?

@Bhargavasomu
Copy link
Contributor Author

@pipermerriam made the changes, please review again. Thankyou.

@Bhargavasomu
Copy link
Contributor Author

@carver could you please review with the new changes made? Thankyou.

@Bhargavasomu Bhargavasomu mentioned this pull request Feb 19, 2019
@Bhargavasomu
Copy link
Contributor Author

When it comes to the proficiency, this refactored code is taking 70 seconds longer than the code on master to run all the tests using pytest. I did a quick profiling and observed that, this increase is because of the following reasons.

  • We are trying to normalize the given input at many places in the FQ operations. For example, to add an FQ point with another number (or) another FQ point, we are first checking the type with an isinstance check. This is causing a lot of overhead as the whole library boils down to usage of FQ operations (IIUC).
  • Introduction of Error Handling in various functions, which is further done by introducing a lot of isinstance checks.

@ChihChengLiang @carver could you please review this?
/cc @pipermerriam

@Bhargavasomu
Copy link
Contributor Author

@pipermerriam @carver @ChihChengLiang the main devil causing more time here was, making the function call **normalize_FQ_point**. This is a costly operation because those functions are called a lot of times which makes it time consuming. On removing that function and pasting the
actual function code wherever needed, made the code way faster. The time taken to run the tests by pytest is now nearly the same or faster than master now. Please verify this and review if possible. Thankyou.

@ChihChengLiang
Copy link
Contributor

I know it might be too late to say and you might have already tried, but not sure if you like this idea:

If we want field elements to be a common API, in the py_ecc/fields they shouldn't know anything about the curve. So the proposed elements API should look like:

# py_ecc/fields/field_elements.py


# Can use type variable in base class methods
TFQ = TypeVar("TFQ", bound="BaseFQ")


class BaseFQ(ABC):

    @property
    @abstractmethod
    def field_modulus(self):
        pass


class BaseFQP(ABC):
    @property
    @abstractmethod
    def degree(self):
        pass


class FQ2(FQP):
    degree = 2
    

class FQ12(FQP):
    degree = 12

And in every curve, we keep some_curve_field_elements.py and let it inherit base field elements.

# py_ecc/bls12_381/bls12_381_field_elements.py
from .constants import field_modulus
from py_ecc.fields.field_elements import BaseFQ

class FQ(BaseFQ):
    field_modulus = field_modulus

@Bhargavasomu
Copy link
Contributor Author

@ChihChengLiang thankyou for reviewing this. I see what you are saying about Fields should not know anything about the curve. I could just remove the need to pass the curve_name in the fields, which is pretty useless.

Another major point, that I wanted to raise is that, I am minorly 👎 on keeping the <some_curve>_field_elements.py. I want all of those to be as part of the Fields API. That is why you can find the fields of all the curves in __init__.py of the Fields Module in this PR.

One more thing about the Base* convention is that, I don't think that we need a Base* class here. This is because, even if we were to include a Base class, it would have a lot of functionality and not just the abstract functions, which is kind of adverse to the definition of Base Classes IIUC (Please correct me if wrong).

Copy link
Member

@pipermerriam pipermerriam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Let me know when you're ready for me to click the merge button

py_ecc/fields/__init__.py Show resolved Hide resolved
"""
n = None # type: int
field_modulus = None
# curve_name can be either 'bn128' or 'bls12_381'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about using an enum.Enum for the curve names. This would give us much stronger type safety and prevent certain types of errors like misspelling the dictionary key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The curve_name attribute has been removed from the API. This is because the fields should not be knowing about the curve which is going to use the fields. Hence it has been removed which makes this passing around of curve_name obsolete.

def __add__(self, other: IntOrFQ) -> "FQ":
if isinstance(other, FQ):
on = other.n
elif isinstance(other, int):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if this should be experimented with before merging this PR or something we try after but... what if we just made FQ a subclass of int. My intuition says it could solve a whole swath of these various type checks that occur since we could treat FQ instances as integers.

What isn't clear to me is how difficult or necessary it would be to ensure that an FQ object is what is return returned from operations like a * b and b * a where one of a, b is an FQ and one is an int. Stated differently, it seems potentially problematic if it is difficult to reason about whether an object we are interacting with is an integer or an FQ as well as whether the return value of various mathmatical operations will be an int or FQ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pipermerriam I see what you are saying. It should work as per theory. But I would like this experimentation to be done after the refactoring. Opened #54 for this.


@classmethod
def one(cls) -> "FQ":
return cls(1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR but a possible performance gain might be available if we interned (cached) these return values so that we don't need to re-instantiate the 1 and 0 values each time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #55 for this

.format(type(other))
)

for c1, c2 in zip(self.coeffs, other.coeffs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that since we have the isinstance check up top we're safe here, but worth pointing out that zip discards extra values if one of the values is longer so unless we know that these are the same length there's an error edge cases here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pipermerriam #45 should take care of this. (Probably a separate PR for this.)

@Bhargavasomu
Copy link
Contributor Author

@pipermerriam this should be ready to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants