-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IDEA] Generation of problem files with mixed-representation index sets #602
Conversation
These tests verify that implicitly ordered sets work as expected, including when the set members are not comparable.
Only sort data when the sorted order is set.
Also updating a baseline test where ordering changed with the default set order
Instead of randomizing, this simple inserts components in order
1. Updating files for small16 tests 2. Disabling GAMS/Baron tests by changing their testname 3. Updating small16 GAMS/Baron baselines
@jsiirola @ghackebeil I'm marking this as WIP because there are known test failures in PySP and GDP. The tests pass for pyomo.repn and pyomo.solvers, so I'm cautiously optimistic that we can safely disable the sorting logic in the writers. However, I suspect that PySP and GDP are generating models nondeterministically (e.g. through reformulations), and hence we have baseline test failures. NOTE: I've currently disabled the sorting logic. However, an alternative is to leave that logic in place, but make the default writer behavior to not sort. This would allow testing of components where models are generated nondeterministically. However, I think that we do ultimately want determinism throughout Pyomo. |
@jsiirola @ghackebeil Hold off on reviewing this and resolving issues. I did some more careful testing, and there's an issue that I haven't accounted for yet. While we can easily make sets ordered, it's harder to make sparse indexing of components ordered the same way without using OrderedDict (which has performance impliactions, I think). |
FWIW, I would rather work this issue on the set-rewrite branch instead of continuing to backpatch the current implementation, because in my experience working there, we need some rather significant changes to the design of sets to cleanly support this. |
The iterator now depends on the set data iterator, which reflects the fixed ordering of the underlying set.
Updated the small16 problem to ensure that it could generate differences when nondeterminism exists in the model.
Has an indexed constaint that is sparsely populated.
@jsiirola @ghackebeil OK. I think I resolved the issue. |
@jsiirola I don't quite see that this is a backpatch. I think these changes here are self-sufficient. I understand that we're working on the same code, but I'm expecting that the changes in that branch will simply replace the logic in sets.py. The changes to other files in this branch are orthogonal to that change. Right? |
BTW, my last change didn't require the use of OrderedDict. It was merely a change to the iterator for Set objects. |
@jsiirola This last change may resolved the GDP tests. I'll try to confirm tomorrow. |
@jsiirola It looks like the GDP test failures are gone in Python 3.6, but there are some test failures in Python 3.5. This is likely due to nondeterminism in the use of dict objects. I'll look into this. |
@ghackebeil On Python 3.6, there remain some test failures for PySP under tests/convert. I'm having trouble replicating these on old-sisu. Let me know if you have any thoughts about what might be going on here. |
They failed locally for me, and the baseline updates I applied made sense to me. If they were not failing on another 3.6 build, that might indicate that there is another source of non-determinism. Would you mind testing on that machine again with the new baselines and let me know if it fails? |
Cannot tests small16 and small17 with file_determinism=3
@ghackebeil I just reran the Travis tests, and there are 4 failures for PySP in Python 3.7/3.6, and 13 failures for Python 3.5/2.7. I can't replicate these tests on old-sisu. Do you know if I need to have a specific package installed to do that? I could imagine that there are additional sources of nondeterminism in PySP, though pyomo.core/pyomo.repn tests seem stable. That's where the components are defined and the problem writers are defined. |
This should obviate testing baseline changes outside of pyomo.repn. The small16 and small17 tests are tested with determinism=0, since that is the only mode suitable for models with mixed representations.
@ghackebeil I changed the logic to default to determinism=1 (sorting of indices). Hence, I reverted your baseline changes. |
So are the changes in this PR now solely related to Set pprint baselines? |
For python version < 3.6, don't run these tests.
@ghackebeil I do have tests that confirm that writers work in Python 3.6, 3.7 with determinism=0. |
@jsiirola @ghackebeil I think my latest commits will resolve the outstanding test failures in this PR. In recent changes, I've made determinism=1 again the default for writers. Thus, the current code will fail if a user uses an incomparable index set in Python3.x and tries to write a problem file. As John has noted, we could use the sorted_robust() function in the code (in the Block class?) to resolve this issue. However, I did some simple testing that shows that sorted_robust() is 5-10x slower for sets that contain string and integer values:
On my Mac, I get statistics like:
So, this shouldn't impact our test performance, and I would expect it wouldn't impact most users. But using sorted_robust() would create a situation where a user with an incomparable set sees a nontrivial slowdown when moving from Python 2.7 to 3.x. And that is despite the fact that sorting isn't necessary to generate a reproducible problem representation (which I think is really the main point of the determinism option). So, I'm reluctant to make this change. Thoughts? |
Codecov Report
@@ Coverage Diff @@
## master #602 +/- ##
=========================================
- Coverage 67.21% 67.2% -0.01%
=========================================
Files 392 392
Lines 62208 62215 +7
=========================================
+ Hits 41812 41814 +2
- Misses 20396 20401 +5
Continue to review full report at Codecov.
|
@jsiirola @ghackebeil Can we finalize this PR? |
I assumed you were still working on it (it is marked I also do not understand the point of this PR. You are adding a bunch of tests, but they seem to only apply to one version of python (IMHO, not good). You are also changing the behavior of Set's |
Using sorted_robust() to sort all sets. This method is slow for mixed representations, but that doesn't matter when writing diagnostic output.
This should never have been committed.
@jsiirola @carldlaird I reworked the PR description to clarify the scope/intent of this change. Note that this PR does not change the default behavior in Pyomo, as was originally intended. That proved problematic (see #677). |
I am not in favor of this PR, as my understanding is that it will potentially lead to Pyomo generating different solver input files on different versions of Python (before/after Python 3.6). Please correct me if I am mistaken. FWIW, the design I am experimenting with in the set-rewrite branch could address sorting through a different approach: Sets would support two methods: |
I updated #567 to document that we are closing this PR because of inactivity. |
Fixes #567 (partial).
Summary/Motivation:
We recently made a small change that makes the default representation of Sets in Pyomo rely on insertion order. This PR completes that activity by resolving the simple test failures outlined in #567. Specifically, these changes allow mixed-representation sets to be used as index sets. The result is that problem files can be generated with determinism=0, which does no sorting of index values.
However, these changes only work with CPython (3.6, 3.7) and PyPy. Starting with Python3.6, the CPython and PyPy Python implementations have deterministic key orderings in their dictionary representations. And as of Python 3.7 this property is a part of the Python language specification. This feature is exploited to provide deterministic file generation without sorting index values.
NOTE: This is a partial fix of #567, since it only applies to more recent versions of Python. However, there is not a clear motivation for extending this fix for older versions of Python. However, this PR is motivated by the fact that sorting is not done during file generation, and hence files are generated more quickly (especially for models with constraints that have large index sets).
NOTE: Sorted ordering of mixed-representation sets often works for Python 2.7 (using determinism=1), since that version allows for comparison of more data types.
NOTE: This PR does not simply use the sorted_robust() function to sort when generating problem files. The sorted_robust() function is significantly slower than sorted() with mixed-representation data. Thus using sorted_robust() with Python 3.x would make it appear that Python3 is slower than Python2, when in fact there are faster alternatives.
Changes proposed in this PR:
Legal Acknowledgement
By contributing to this software project, I agree to the following terms and conditions for my contribution: