-
-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More on patterns overhaul #1110
Conversation
If the changes have settled down, let me know and I will pull down the code, try it, and try to understand it in detail. Thanks. |
@rocky, look at this PR when you have the time. There are much more things to do, but it is already quite large, despite most of the changes are trivial. |
Could you outline those much more things? We eventually want to use is a trie structure or some other pattern-matching data structure , analogous to regular-expression patterns, to do the bulk of the work. So we should not do fancy restructuring or pattern detection as code that makes this harder. |
I am just talking about making trivial changes that reduce the linter complaints. The first thing that comes to my mind is to decompose Mathis.builtin.patterns into smaller submodules. |
mathics/builtin/patterns.py
Outdated
@@ -99,9 +94,17 @@ class Rule_(BinaryOperator): | |||
operator = "->" | |||
attributes = A_SEQUENCE_HOLD | A_PROTECTED | |||
grouping = "Right" | |||
messages = {"argrx": "Rule called with `1` arguments; 2 arguments are expected."} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something I discovered recently - this is not needed since it appears in General messages. However, the message() call needs to be adjusted.
I will put in a PR to merge in here since there are a couple of changes that need to go on and my linter says that importing Callable
is not needed.
It would be nice to go over the code base to reduce the unnecessary duplication of message keys.
mathics/builtin/patterns.py
Outdated
|
||
if max.get_name() == "System`Infinity": | ||
if maxidx.get_name() == "System`Infinity": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases, this is no longer correct because Infinity can now be DirectedInfinity[1]
. This happens when a 3rd Infinity
argument is passed explicitly.
Since this is less trivial than the other changes it is not in #1117.
Misc lint-like stuff
@@ -588,9 +621,14 @@ def yield_match(vars_2, rest): | |||
else: | |||
yield_func(vars_2, None) | |||
|
|||
self.pattern.match(yield_match, expression, vars, evaluation) | |||
pattern_context = pattern_context.copy() | |||
pattern_context["yield_func"] = yield_match |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why the copy
and pattern_context
now is needed when it wasn't before. Was the previous code wrong? Please enlighten me - thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why the
copy
andpattern_context
now is needed when it wasn't before. Was the previous code wrong? Please enlighten me - thanks.
pattern_context
just collects all the arguments that match
receives in the master's version. I could use **kwargs instead, or call it **bunch_of_other_parameters
but I thought that using a fixed parameter would be less prone to errors, and that this name represents well what these parameters are. Also, this is the way in which pattern matching in ExpReduce was implemented.
On the other hand, we gain something by not rebuilding the dict on each call: most of the time, all the parameters are passed untouched. However, in some places, some parameters change: then the dict is copied and modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that using a fixed parameter is less prone to error, and allows for automated error checking.
As for the efficiency part, unless this very much slows things down, let's revisit this. It may be that some future way of handling patterns will make all of this moot.
It would be good to note somewhere in the code that a copy is needed because parameters can sometimes get altered, maybe with an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for the efficiency part, unless this very much slows things down, let's revisit this. It may be that some future way of handling patterns will make all of this moot.
Also, using copy() avoids explicitly rebuilding each entry in the dict, which makes the code shorter.
It would be good to note somewhere in the code that a copy is needed because parameters can sometimes get altered, maybe with an example.
I thought I had put a comment in BasePattern.match
, but it seems it was lost in one of my failed attempts that this works. Also, I have to adjust its docstring. Maybe in the next round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like match_numericq()
, match_real_numberq()
and other match_xxx()
could be DRY'd by having a common function that takes a yield_match
function in as a parameter.
The other good thing about doing something like this is that it makes it clearer what the difference is between these routiens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rocky, as I mentioned before, there are a lot of things to do just to make the code more readable and easy to understand and debug, but a larger PR would make it harder to review. If you agree, in this round we can leave a TODO comment and face it in the following round.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. But since I am looking at this, I want to make sure I note what I see in this pass.
I am now done making a pass over the PR. The final changes suggested are in the more_patterns_more
branch.
mathics/builtin/patterns.py
Outdated
self.pattern.match(expression, pattern_context) | ||
|
||
def match_real_numberq(self, expression: Expression, pattern_context: dict): | ||
"""Match function for RealValudNumberQ""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RealValudNumberQ -> RealValuedNumberQ and will be in next commit of more-patterns-more
mathics/builtin/patterns.py
Outdated
if builtin is not None and isinstance(builtin, Test): | ||
return builtin.test(candidate) | ||
|
||
from mathics.core.builtin import Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be moved to the top of the file and will make our linter overlords happy. Will be in the next commit of more_patterns_more
mathics/builtin/patterns.py
Outdated
|
||
def get_match_count( | ||
self, vars_dict: OptionalType[dict] = None | ||
) -> Union[int, tuple]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return type annotation is not correct. In particular, if self.alternatives
is empty, then tuple[None]
is returned.
mathics/builtin/patterns.py
Outdated
range = list(sub) | ||
sub = alternative.get_match_count(vars_dict) | ||
if range_lst is None: | ||
range_lst = list(sub) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be tuple(sub)
. I will add to more_patters_more. PR.
vars_dict: Optional[dict] = None, | ||
): | ||
self, elements: Tuple[BaseElement], pattern_context: dict | ||
) -> Union[int, tuple]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The actual return: return (1,1)
does match this return annotation. It should be Tuple[int, int]
. The change will be in more_patterns_more PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding this, I wonder if we should not make the return value of this method a little bit more consistent, but again, it would imply making more changes about how the code works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree - the Union[int, tuple]
should probably become `tuple``. And yes, this might be done in a future PR. But let's correct what is there now. This is done in the more_pattern_more branch.
"""Get the possible wrappings""" | ||
def get_wrappings(self, yield_func: Callable, items: Tuple, pattern_context: dict): | ||
""" | ||
Get the possible wrappings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand and possibly don't like the term "wrappings" which is vague.
What exactly is a wrapping? Changing the name can be left for the future though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I neither fully understand/like this term. What is true is that this function is called a lot, and seems to be related to the different orders/ grouping that the same expression must be (tried to) match against a pattern. The docstring has the best understand I could get from looking at the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok.
See conversation tab in #1110
More lint stuff
Ok. With this set of changes, I think we're ready to merge. #1006 getting looked at next. |
This PR includes many tiny changes over mathics.core.pattern and mathics.builtin.patterns.
pattern_context
parameter that collects all the parameters. The linter seems to be happier with this change.Rule
andRuleDelayed
now check if they receive two parameters. If they receive more or less parameters, show a warning message.The work is not complete but is closer to being: it remains to remove ** from the match calls, as well as some other adjusts.