Removes use of statsmodels.tsa.arima_model.ARMA #1896

Merged (9 commits) on Aug 2, 2022

Conversation

j-bryan
Collaborator

@j-bryan j-bryan commented Jul 18, 2022


Pull Request Description

What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)

#1872

What are the significant changes in functionality due to this change request?

Removes use of statsmodels.tsa.arima_model.ARMA and replaces it with statsmodels.tsa.arima.model.ARIMA, making additional changes as needed. Some attributes in the armaResultsProxy class (in ravenframework/SupervisedLearning/ARMA.py) were changed to better match the ARIMAResults object returned by the new ARIMA model class being used; these changes will break existing pickled ARMA ROMs!


For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
  • 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
  • 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added setting, in XML block, the node <internalParallel> to True.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed or added, is the analytic documentation updated or added?
  • 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) has been changed, the associated documentation must be reviewed to ensure the text matches the example.

Collaborator

@dylanjm dylanjm left a comment

Just a few small changes. Other than that, it looks good. Maybe @wangcj05 or @PaulTalbot-INL would like to take a second look.

ravenframework/SupervisedLearning/ARMA.py (outdated review thread, resolved)
@PaulTalbot-INL
Collaborator

PaulTalbot-INL commented Jul 18, 2022

I don't know if these will pass without updating the pinned version of statsmodels in RAVEN, in raven/dependencies.xml.

@dylanjm
Collaborator

dylanjm commented Jul 18, 2022

@j-bryan You can update that dependency in this PR if you'd like. Or is there already a PR updating statsmodels?

@j-bryan
Collaborator Author

j-bryan commented Jul 18, 2022

@dylanjm @PaulTalbot-INL Would statsmodels need to be updated now? Moving to the new class allows us to use newer versions of statsmodels, but the ARIMA class we're using has been there since 0.11.0 I think, so we shouldn't need to update right now.

Is there a way for me to see why the checks are failing now? They were passing on here earlier, and all of the relevant tests still seem to be passing on my (Windows) machine. I was able to run some tests on a Linux machine, and I had a test fail due to a value falling just outside of a tolerance, so maybe some check tolerances need to be loosened a little?

@dylanjm
Collaborator

dylanjm commented Jul 18, 2022

@j-bryan It looks like the ARMAParallel test is failing. Does it pass on your machine?

@j-bryan
Collaborator Author

j-bryan commented Jul 19, 2022

@dylanjm Trying around on different operating systems I have access to here, ARMAparallel passes on Windows but fails on Mac and Linux (Ubuntu-based). It's failing with a pretty hairy-looking error being kicked out from ray (ValueError: buffer source array is read-only). I can get you the stack trace if you'd like.

I'm also getting that (negligible) diff I mentioned for the ARMA/Basic test on Linux only. It passes on Windows and Mac.

@dylanjm
Collaborator

dylanjm commented Jul 19, 2022

@j-bryan ooof 🙃

> I can get you the stack trace if you'd like.

I think I can see the RAY error stack trace coming from the testing machines. I'm not quite sure how we should fix this.

@joshua-cogliati-inl does this RAY error look familiar to you?

@joshua-cogliati-inl
Contributor

joshua-cogliati-inl commented Jul 19, 2022

ray.exceptions.RayTaskError: ray::evaluateSample() (pid=2997, ip=10.53.30.122)
  At least one of the input arguments for this task could not be computed:
ray.exceptions.RaySystemError: System error: buffer source array is read-only
traceback: Traceback (most recent call last):
  File "/home/civet/pip_envs/raven_libraries/lib/python3.6/site-packages/ray/serialization.py", line 281, in deserialize_objects
    obj = self._deserialize_object(data, metadata, object_ref)
  File "/home/civet/pip_envs/raven_libraries/lib/python3.6/site-packages/ray/serialization.py", line 194, in _deserialize_object
    return self._deserialize_msgpack_data(data, metadata_fields)
  File "/home/civet/pip_envs/raven_libraries/lib/python3.6/site-packages/ray/serialization.py", line 172, in _deserialize_msgpack_data
    python_objects = self._deserialize_pickle5_data(pickle5_data)
  File "/home/civet/pip_envs/raven_libraries/lib/python3.6/site-packages/ray/serialization.py", line 160, in _deserialize_pickle5_data
    obj = pickle.loads(in_band, buffers=buffers)
  File "statsmodels/tsa/statespace/_initialization.pyx", line 227, in statsmodels.tsa.statespace._initialization.dInitialization.__init__
  File "stringsource", line 658, in View.MemoryView.memoryview_cwrapper
  File "stringsource", line 349, in View.MemoryView.memoryview.__cinit__
ValueError: buffer source array is read-only

Yes, this error means that something is being serialized (pickled) and deserialized that was not designed to be serialized. (How to fix it, on the other hand, might be a challenge.)

(A first guess would be modifying __getstate__ and __setstate__ in ARMA to not serialize something, but rebuild it some other way.)
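A minimal sketch of that idea, using only the standard library: the class below is a hypothetical stand-in (it is not RAVEN's ARMA ROM), and `_workspace` plays the role of whatever member does not survive pickling. `__getstate__` drops it from the serialized state and `__setstate__` rebuilds it from the data that was kept.

```python
import pickle


class FittedModelProxy:
  """Hypothetical stand-in for a ROM that holds a non-serializable member."""

  def __init__(self, params):
    self.params = list(params)                # plain data, safe to pickle
    self._workspace = self._buildWorkspace()  # memoryview objects cannot be pickled

  def _buildWorkspace(self):
    # Stands in for a buffer owned by a compiled extension.
    return memoryview(bytearray(8 * len(self.params)))

  def __getstate__(self):
    # Serialize everything except the workspace.
    state = self.__dict__.copy()
    state.pop('_workspace', None)
    return state

  def __setstate__(self, state):
    # Restore the plain data, then rebuild the workspace from it.
    self.__dict__.update(state)
    self._workspace = self._buildWorkspace()


original = FittedModelProxy([0.6, 0.3])
restored = pickle.loads(pickle.dumps(original))
print(restored.params, restored._workspace.readonly)
```

Because the workspace is rebuilt locally after unpickling, it is a fresh writable buffer rather than a view onto memory owned by the serialization layer.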

@j-bryan
Collaborator Author

j-bryan commented Jul 19, 2022

Hmmm that's a tricky one. I'll start looking at the ARMA ROM pickling, but this could take a sec to figure out. Some other tests seem to successfully pickle and unpickle the ROM, and it's only failing in the parallel test. In the meantime, let me know if you have any other ideas of what might be causing the error.

@joshua-cogliati-inl
Contributor

Yes, it can take some time to figure out. I think the biggest hint is that statsmodels/tsa/statespace/_initialization.pyx is roughly where it is failing.

@joshua-cogliati-inl
Contributor

FYI: the last time I saw this error message: scikit-learn/scikit-learn#21685

@j-bryan j-bryan requested a review from dylanjm July 25, 2022 22:23
Collaborator

@dylanjm dylanjm left a comment

Changes look good to me. No major functionality is changed.

@dylanjm
Collaborator

dylanjm commented Jul 27, 2022

Looks like we are getting a failure in pickleTests:

Here's the test name

 Failed tests/framework/ROM/pickleTests/eRl_setAdditionalParams

Here's the stack trace:

Output of'tests/framework/ROM/pickleTests/eRl_setAdditionalParams':
Loading plugin "TEAL" at /Users/civet/civet/build_0/raven/plugins/TEAL
 ... successfully imported "TEAL" ...
Loading plugin "HERON" at /Users/civet/civet/build_0/raven/plugins/HERON
 ... successfully imported "HERON" ...
Loading plugin "SR2ML" at /Users/civet/civet/build_0/raven/plugins/SR2ML
 ... successfully imported "SR2ML" ...
Loading plugin "LOGOS" at /Users/civet/civet/build_0/raven/plugins/LOGOS
 ... successfully imported "LOGOS" ...
Loading plugin "SRAW" at /Users/civet/civet/build_0/raven/plugins/SRAW
 ... successfully imported "SRAW" ...
Loading plugin "FARM" at /Users/civet/civet/build_0/raven/plugins/FARM
 ... successfully imported "FARM" ...
Loading plugin "ExamplePlugin" at /Users/civet/civet/build_0/raven/plugins/ExamplePlugin
 ... successfully imported "ExamplePlugin" ...
InputData: Using param spec "pickledROM" to read XML node "loadedROM.
(  884.45 sec) Interp. Cluster ROM      : DEBUG           -> Evaluating interpolated ROM ...
(  884.45 sec) Interp. Cluster ROM      : DEBUG           ->  ... evaluating macro step "0" (1 / 10)
(  884.45 sec) Clustered ROM            : DEBUG           -> Sampling from 3 segments ...
(  884.45 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 0
(  884.92 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 1
(  884.93 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 2
(  884.94 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 3
(  884.95 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 4
(  884.96 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 5
(  884.96 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 6
(  884.97 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 7
(  884.97 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 8
(  884.98 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 9
(  884.99 sec) Interp. Cluster ROM      : DEBUG           ->  ... evaluating macro step "1" (2 / 10)
(  884.99 sec) Clustered ROM            : DEBUG           -> Sampling from 3 segments ...
(  884.99 sec) Clustered ROM            : DEBUG           -> Evaluating ROM segment 0
Traceback (most recent call last):
  File "eRl_setAddtlParams.py", line 111, in <module>
    main()
  File "eRl_setAddtlParams.py", line 106, in main
    results = check(runner, before, after, signal)
  File "eRl_setAddtlParams.py", line 49, in check
    res = runner.evaluate(inp)[0]
  File "/Users/civet/civet/build_0/raven/scripts/externalROMloader.py", line 147, in evaluate
    output.append(self.rom.evaluate({k:np.asarray(v[index]) for k,v in request.items()}))
  File "/Users/civet/civet/build_0/raven/ravenframework/Models/ROM.py", line 406, in evaluate
    resultsDict = self.supervisedContainer[0].run(request)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/SupervisedLearning.py", line 323, in run
    return self.evaluate(edict)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ROMCollection.py", line 1818, in evaluate
    subResult = model.evaluate(edict) # TODO same input for all macro steps? True for ARMA at least...
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ROMCollection.py", line 896, in evaluate
    result = Segments.evaluate(self, edict)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ROMCollection.py", line 368, in evaluate
    result = self._evaluateBySegments(edict)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ROMCollection.py", line 425, in _evaluateBySegments
    subResults = rom.evaluate(evaluationDict)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/SupervisedLearning.py", line 362, in evaluate
    return self.__evaluateLocal__(featureValues)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ARMA.py", line 710, in __evaluateLocal__
    return self._evaluateCycle(featureVals)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ARMA.py", line 819, in _evaluateCycle
    randEngine = self.randomEng)
  File "/Users/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ARMA.py", line 926, in _generateARMASignal
    burnin=2*max(self.P,self.Q)) # @alfoa, 2020
  File "/Users/civet/.conda/envs/raven_libraries/lib/python3.7/site-packages/statsmodels/tsa/arima_process.py", line 107, in arma_generate_sample
    return signal.lfilter(ma, ar, eta, axis=axis)[fslice]
  File "/Users/civet/.conda/envs/raven_libraries/lib/python3.7/site-packages/scipy/signal/signaltools.py", line 1907, in lfilter
    return sigtools._linear_filter(b, a, x, axis)
MemoryError: Could not create zfzfilled

@moosebuild

Job Mingw Test on 9b2450d : invalidated by @joshua-cogliati-inl

civet settings changed.

@moosebuild

Job Mingw Test on 9b2450d : canceled by @wangcj05

waiting for #1913

@moosebuild

Job Mingw Test on 9b2450d : invalidated by @wangcj05

@moosebuild

Job Mingw Test on 9b2450d : canceled by @wangcj05

@j-bryan
Collaborator Author

j-bryan commented Jul 28, 2022

@dylanjm Which tests are failing now? I've rerun the tests on my Mac and Windows machines, and everything is passing for me.

@dylanjm
Collaborator

dylanjm commented Jul 28, 2022

@j-bryan It looks like the test is failing due to something being broken upstream. It looks like #1913 fixes this issue. If you're curious about the failing test it is:

Diff doc/workshop/optimizer/GeneticAlgorithms/Inputs/GA_MaxwoRepConstrained

Once that fix is merged, you'll have to update RAVEN, rebase, and then force push to get this PR to pass tests.

dylanjm
dylanjm previously approved these changes Jul 28, 2022
@j-bryan
Collaborator Author

j-bryan commented Jul 28, 2022

Okay, I'll keep an eye out for that. Thanks!

@moosebuild

All jobs on 9b2450d : invalidated by @wangcj05

@PaulTalbot-INL
Collaborator

It looks like some small numerical error is preventing this merge in the ARMA "Basic" test:

Mismatch between tests/framework/ROM/TimeSeries/ARMA/Basic/romMeta.xml and tests/framework/ROM/TimeSeries/ARMA/gold/Basic/romMeta.xml
    No match for gold node DataObjectMetadata/arma/Speed/ARMA_params/std
               tag: std
              attr: {}
              text: 0.258080826596
  Nearest unused match:     DataObjectMetadata/arma/Speed/ARMA_params/std
    <std> text does not match: "0.258080826596" vs "0.258081109144"
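To put the size of this mismatch in perspective, the two values from the log above can be compared directly. This is only a sketch: the tolerances tried here are illustrative and are not the ones configured for the RAVEN test.

```python
import math

gold = 0.258080826596  # value stored in the gold romMeta.xml
new = 0.258081109144   # value produced by the test run

# Relative difference between the two reported standard deviations.
relDiff = abs(new - gold) / abs(gold)
print(f"relative difference: {relDiff:.3e}")

# Illustrative tolerances: a loose one accepts the value, a tight one rejects it.
print("rel_tol=1e-5:", math.isclose(new, gold, rel_tol=1e-5))
print("rel_tol=1e-8:", math.isclose(new, gold, rel_tol=1e-8))
```

The relative difference is roughly 1.1e-6, so whether the comparison passes depends entirely on the tolerance the check uses.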

@joshua-cogliati-inl
Contributor

joshua-cogliati-inl commented Aug 2, 2022

Hm, this causes HERON and FARM to fail: https://civet.inl.gov/job/1125079/:

File "/home/civet/civet/build_0/raven/ravenframework/SupervisedLearning/ARMA.py", line 1391, in writeXML
    armaNode.append(xmlUtils.newNode('std', text=arma.sigma))
  File "/home/civet/.conda/envs/raven_libraries/lib/python3.7/site-packages/statsmodels/base/wrapper.py", line 34, in __getattribute__
    obj = getattr(results, attr)
AttributeError: 'ARMAResults' object has no attribute 'sigma'

So SupervisedLearning/ARMA.py, line 1391, probably needs to be updated.

@@ -1382,7 +1388,7 @@ def writeXML(self, writeTo, targets=None, skip=None):
 root.append(targetNode)
 armaNode = xmlUtils.newNode('ARMA_params')
 targetNode.append(armaNode)
-armaNode.append(xmlUtils.newNode('std', text=np.sqrt(arma.sigma2)))
+armaNode.append(xmlUtils.newNode('std', text=arma.sigma))
Contributor

This line is failing in HERON tests: https://civet.inl.gov/job/1125079/

Collaborator

I think this is because we don't have a way to update the trained ARMAs that we use for testing as prerequisites to running the HERON cases, so if ever the ARMA ROM gets updated, we have to manually update those serialized testing ARMAs. This may not be avoidable, but we should make sure that @j-bryan can update those trained ARMAs in HERON immediately after merging this. Will that work?
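The mismatch can be illustrated with two stand-in objects: one shaped like a results object stored before this change (exposing `sigma2`) and one shaped like the updated proxy (exposing `sigma`). The helper `noiseStd` is hypothetical and is not part of RAVEN; the thread's chosen remedy is to retrain the serialized ARMAs, and this sketch only shows why code that reads `sigma` fails on an older object.

```python
import math
from types import SimpleNamespace


def noiseStd(results):
  """Hypothetical helper: read the noise standard deviation from either layout."""
  if hasattr(results, 'sigma'):
    return results.sigma
  # Older layout: only the variance is stored.
  return math.sqrt(results.sigma2)


oldStyle = SimpleNamespace(sigma2=0.0625)  # stands in for a pre-change pickled result
newStyle = SimpleNamespace(sigma=0.25)     # stands in for the updated proxy

print(noiseStd(oldStyle), noiseStd(newStyle))

try:
  oldStyle.sigma  # direct access, as in the failing writeXML line
except AttributeError as err:
  print("AttributeError:", err)
```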

Contributor

@joshua-cogliati-inl joshua-cogliati-inl Aug 2, 2022

Yes, that is fine with me. (Is there written documentation on how to do this? I checked Troubleshooting and some other pages in the HERON wiki and didn't see how to update those ARMAs.)

Collaborator

It probably should be documented better. There are RAVEN XML files in the ARMA test folder along with the associated training data, so it should be a simple matter of running those training XML files with the new changes in RAVEN present.

@PaulTalbot-INL
Collaborator

@j-bryan this is ready to merge, but I want to make sure first that you'll be able to follow up with the HERON fixes that @joshua-cogliati-inl has noted. Let me know if there are questions on that.

@j-bryan
Collaborator Author

j-bryan commented Aug 2, 2022

@PaulTalbot-INL I should be able to handle that. Just to check my understanding of the issue, some of the HERON (and FARM?) tests rely on ARMA models which need to be retrained. I’ll need to go and retrain those models manually using the new version of the ARMA ROM. Is that correct?

@PaulTalbot-INL
Collaborator

That's right, the training workflows should all be there along with the data in tests/integration_tests/ARMA (iirc), so train them, re-add them, and push on a branch. I'll merge this PR and you can start on that, then we'll update the HERON submod ASAP.

@PaulTalbot-INL PaulTalbot-INL merged commit 0dc33a1 into idaholab:devel Aug 2, 2022
dgarrett622 added a commit to idaholab/HERON that referenced this pull request Aug 3, 2022
Updates pickled ARMA ROMs due to changes from idaholab/raven#1896
@dgarrett622 dgarrett622 linked an issue Aug 5, 2022 that may be closed by this pull request
10 tasks
Development

Successfully merging this pull request may close these issues.

[TASK] Remove use of statsmodels.tsa.arima_model.ARMA
5 participants