
Questions about the initial guess for scipy.optimize.minimize #18

Open
luxin-tian opened this issue Feb 6, 2020 · 7 comments


luxin-tian commented Feb 6, 2020

Dear Dr. @rickecon ,

In HW5 Q1.c, we are asked to perform a two-step GMM estimation using an optimized weighting matrix based on the result of a first GMM estimation that uses the identity weighting matrix. For the first estimation, I set the initial guess to mu=11, sigma=0.5 and got a satisfying result that fits the data well. However, when I use the same initial guess for the second GMM estimation with the two-step optimized weighting matrix, the result is clearly worse than the first one. If I instead use the first-stage GMM result as the initial guess for the second estimation, the result ends up very close to that initial guess (the first-stage result).
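For reference, here is a minimal sketch of the two-step workflow described above. The data, the moment choices (mean and variance), and the plain-normal model are all assumptions to keep the example self-contained; the homework's actual data and model will differ.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=11.0, scale=0.5, size=1000)  # stand-in for the HW data

def moment_errors(params, data):
    """Model moments minus data moments (mean and variance)."""
    mu, sigma = params
    return np.array([mu - data.mean(), sigma**2 - data.var()])

def criterion(params, data, W):
    """GMM criterion: e' W e."""
    e = moment_errors(params, data)
    return e @ W @ e

# Step 1: identity weighting matrix
theta0 = np.array([11.0, 0.5])
res1 = minimize(criterion, theta0, args=(data, np.eye(2)),
                method="L-BFGS-B", bounds=[(None, None), (1e-6, None)])

# Step 2: optimal W = inverse covariance of the individual moment
# contributions, evaluated at the step-1 estimates
mu1, sigma1 = res1.x
contrib = np.vstack([data - mu1, (data - mu1) ** 2 - sigma1**2])  # 2 x N
W2 = np.linalg.inv(contrib @ contrib.T / data.size)
res2 = minimize(criterion, res1.x, args=(data, W2),
                method="L-BFGS-B", bounds=[(None, None), (1e-6, None)])
print("step 1:", res1.x, " step 2:", res2.x)
```

With two moments and two parameters the problem is exactly identified, so in this toy setting both steps should land on essentially the same estimates regardless of the weighting matrix.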

It seems that the optimizer depends heavily on the initial guess, and even when I change the tolerance threshold, the problem persists. Is it true that the SciPy optimizer actually needs a somewhat "accurate" initial guess, or are there other methods to avoid this problem? (PS: in all cases the optimizer reports convergence with "success: True".)
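One way to probe whether the sensitivity comes from the optimizer or from the criterion surface is to run the same minimization from several starting points and compare where each run lands. The criterion below is a toy stand-in for the real GMM objective:

```python
import numpy as np
from scipy.optimize import minimize

def criterion(params):
    """Toy smooth stand-in for the GMM objective (the real one comes
    from the homework's moment conditions)."""
    mu, sigma = params
    return (mu - 11.0) ** 2 + (sigma - 0.5) ** 2

starts = [(9.0, 0.3), (11.0, 0.5), (13.0, 1.0)]
results = [minimize(criterion, s, method="Nelder-Mead",
                    options={"xatol": 1e-10, "fatol": 1e-10})
           for s in starts]
for s, r in zip(starts, results):
    print(s, "->", r.x.round(6), r.fun)
```

If the spread across the returned `r.x` values is much larger than the optimizer tolerance, the criterion surface (scaling, flat regions) is the more likely culprit than the optimizer itself.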

Thank you!

@linghui-wu

I encountered the same problem: the parameters we are supposed to estimate depend heavily on choosing the "true" values as the initial guess. I was wondering whether such inconsistency in the resulting estimates across different initial values is acceptable.

Thank you in advance, Dr. Evans!

@rickecon
Contributor

@luxin-tian @linghui-wu . This is a really good question, and it is one of the main points of this problem. In part (f), I ask you to compare them. However, the answer is that you cannot directly compare them using the criterion function because in each case, either the vector of moments or the weighting matrix is different. The criterion functions are fundamentally different.

So you will get some answers for parts (a) through (e). My question at the end in part (f) is a hard one. Which of your estimations do you like the best? Is there a scientific way to compare them? How would you justify your best fit to an academic audience?

@rickecon
Contributor

@luxin-tian @linghui-wu . Also a hint: All your estimated distributions should fit pretty closely to each other, with only small changes. Plot your estimated distributions to make sure this is true.

@hesongrun

@luxin-tian @linghui-wu I think the main reason is that, if you take a look at the weighting matrix, it puts a very small weight on the second moment (on the order of 1e-9). As a result, the second moment barely matters in the optimization, and the optimization is dominated by numerical approximation error.
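The scale problem is easy to reproduce in miniature. The numbers below are hypothetical (lognormal data standing in for income-like observations): when one moment is on the scale of the mean and the other on the scale of the variance, the inverse-covariance weighting matrix has wildly different orders of magnitude on its diagonal, and rescaling the moment errors to percent deviations evens it out:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.lognormal(mean=11.0, sigma=0.5, size=5000)  # income-like scale

# moment contributions in levels: mean and variance
mu_hat, var_hat = data.mean(), data.var()
contrib = np.vstack([data - mu_hat, (data - mu_hat) ** 2 - var_hat])
W = np.linalg.inv(contrib @ contrib.T / data.size)
print(np.diag(W))  # diagonal entries differ by many orders of magnitude

# one common fix: express moment errors as percent deviations so both
# diagonal entries of W are comparable
contrib_pct = np.vstack([(data - mu_hat) / mu_hat,
                         ((data - mu_hat) ** 2 - var_hat) / var_hat])
W_pct = np.linalg.inv(contrib_pct @ contrib_pct.T / data.size)
print(np.diag(W_pct))
```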

@hesongrun

Theoretically speaking, the optimization should have a global optimum since it is a convex optimization problem, i.e., no matter what initial value you choose, you should always get the same answer. The reason you get different answers for different initial values is numerical error.


hesongrun commented Feb 10, 2020

@luxin-tian @linghui-wu Another potential issue with your code (I made this mistake myself): using the standard deviation as the model moment while using the variance as the data moment to calculate the weighting matrix. This would explain why the scale of the variance-covariance matrix is so different for the mean and the standard deviation.
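The mismatch in miniature (all numbers below are illustrative, not the homework's): comparing the standard deviation on the model side against the variance on the data side leaves a large moment error even at the true parameters, while the consistent pairing does not.

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(11.0, 0.5, size=2000)
mu, sigma = 11.0, 0.5  # candidate model parameters (the true values here)

# inconsistent: model side uses sigma, data side uses the variance
bad_error = np.array([mu - data.mean(), sigma - data.var()])
# consistent: both sides use the variance
good_error = np.array([mu - data.mean(), sigma**2 - data.var()])
print("inconsistent:", bad_error, " consistent:", good_error)
```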

@luxin-tian
Author

> Theoretically speaking, the optimization should have a global optimum since it is a convex optimization problem, i.e., no matter what initial value you choose, you should always get the same answer. The reason you get different answers for different initial values is numerical error.

Thank you @hesongrun! I agree with you that this is due to approximation error. Even though the differences between the results under different initial guesses seem intuitively unacceptable, they are all approximations to the same optimum of this convex minimization problem. This seems somewhat unavoidable when using a numerical optimizer.
