
How to do a polynomial? #20

Closed

jseabold opened this issue Jun 25, 2013 · 13 comments

@jseabold
Member

I've seen https://groups.google.com/forum/#!topic/pystatsmodels/96cMRgFXBaA, but why doesn't something like this work?

import numpy as np  # needed for np.power below
from patsy import dmatrices

data = dict(y=range(1, 11), x1=range(21, 31), x2=range(11, 21))

dmatrices("y ~ x1 + x2**2", data)

or

dmatrices("y ~ x1 + I(x2**2)", data)

This works

dmatrices("y ~ x1 + np.power(x2, 2)", data)
@njsmith
Member

njsmith commented Jun 25, 2013

The reason x2**2 doesn't work is that x2**2 is shorthand for x2 * x2, which is shorthand for x2 + x2 + x2:x2, which collapses down to just plain x2, because self-interactions don't make sense. See
https://patsy.readthedocs.org/en/latest/formulas.html#the-formula-language

There's an argument that for numerical predictors, x2:x2 should be the same
as np.multiply(x2, x2), just like x1:x2 is the same as np.multiply(x1, x2).
But even then x2**2 would mean x2 + np.multiply(x2, x2).

In your example, I(x2**2) doesn't work because x2 is a python list, not a
numpy array. patsy has no crystal ball that tells it that lists should be
coerced to arrays while interpreting arbitrary python code. Use DataFrame()
instead of dict() and it will just work...
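
(A minimal sketch of that fix, assuming pandas is available:)

import pandas as pd
from patsy import dmatrices

# DataFrame columns are array-backed, so x2**2 inside I() works elementwise
data = pd.DataFrame({"y": range(1, 11), "x1": range(21, 31), "x2": range(11, 21)})
dmatrices("y ~ x1 + I(x2**2)", data)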


@jseabold
Member Author

Is it possible/desirable, if the RHS of ** is an integer (or a numeric term), to skip the interaction expansion, since that's surely not what someone wants?

@njsmith
Member

njsmith commented Jun 25, 2013

I don't understand. The only legal RHS for ** is an integer, anything else
raises an error in the formula evaluator...?

The intended usage is to say things like "give me all 1, 2, or 3-way
interactions, but not anything higher than that":
(a + b + c + d + e + f)**3

-n
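
(A quick way to see the expansion without any data, sketched with patsy's ModelDesc:)

from patsy import ModelDesc

# ** expands to all interactions up to the given order,
# with self-interactions collapsed away
print(ModelDesc.from_formula("y ~ (a + b + c)**2").describe())
# prints something like: y ~ a + b + c + a:b + a:c + b:c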


@jseabold
Member Author

Yeah I was thinking of *. Hmm. Oh well.

@njsmith
Member

njsmith commented Jun 25, 2013

It's better to use something like poly(x, 3) anyway, right? Because that can produce orthogonalized columns. The people I know are always excited about testing whether the higher-order terms are significant, and that's most easily done if you use orthogonalized columns as your polynomial basis. (Of course this requires that we implement poly(), but that's not hard; it's just that no one has gotten around to it.)


@jseabold
Member Author

I suppose, if for ease of inference, though I rarely see it in practice. Might just be where I hang out. Different strokes.

@jseabold
Member Author

Translation of Stata's orthpoly.ado, so likely not fit for inclusion anywhere, but I'll leave it here. // cc @josef-pkt

https://gist.github.com/jseabold/5859121

@njsmith
Member

njsmith commented Jun 25, 2013

I think those are orthogonal polynomials, which are different. Orthogonal
polynomials are ones where if you evaluate them at every point in their
domain, the resulting vectors are orthogonal. For regression, we want
polynomials such that if you evaluate them at every point where there is
data, then the resulting vectors are orthogonal.

There's already code for doing this using qr in the patsy Poly class for
categorical polynomial coding, but it hasn't been adapted to be a numerical
stateful transform.
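
(For the record, here is a rough, hypothetical sketch of what such a numerical stateful transform could look like. The OrthPoly name and the shortcut of rebuilding the basis from the memorized training data are made up; a real poly() would also need to project new data onto the same basis:)

import numpy as np
from patsy import stateful_transform

class OrthPoly(object):
    # Hypothetical sketch following patsy's stateful transform protocol
    def __init__(self):
        self._chunks = []

    def memorize_chunk(self, x, degree=3):
        self._chunks.append(np.asarray(x, dtype=float))

    def memorize_finish(self):
        self._x = np.concatenate(self._chunks)

    def transform(self, x, degree=3):
        # QR-orthogonalize the Vandermonde basis of the *memorized* data and
        # drop the constant column. Caveat: this ignores the incoming x, so
        # it only behaves correctly during the original fit, not prediction.
        q = np.linalg.qr(np.vander(self._x, degree + 1)[:, ::-1])[0]
        return q[:, 1:]

orth_poly = stateful_transform(OrthPoly)
# hypothetical usage: dmatrix("orth_poly(x, degree=3)", data)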

@josef-pkt

I find it confusing that polynomials should be QR-orthogonalized. I think the main polynomials should be the numpy.polynomial.xxxvander functions.

It's a question of choosing names so that most users (and not just R users) understand what the different versions mean.

@jseabold
Member Author

AFAICT, the Poly class is only intended for equally spaced categorical variables. At least, I couldn't figure out how to replicate what orthpoly gives with Poly for a continuous regressor, but I just wanted to work on something else for an hour.

I'm not sure I understand your distinction between what's orthogonal to what. I think orthpoly is like a scaled version of legvander that Josef was after.

import numpy as np
from statsmodels.tools.tools import webuse

dta = webuse("auto")
weight = dta["weight"].values.astype(float)
orth = orthpoly(weight, 4)  # orthpoly as defined in the gist above
np.allclose(np.dot(orth[:, 0], orth[:, 1]), 0)  # True: the columns are orthogonal

http://www.stata.com/help11.cgi?orthpoly

@njsmith
Member

njsmith commented Jun 25, 2013

I only looked at your code for a few seconds, so I could be confused :-)

This looks like a good skimmable paper on why orthogonal is better than
vander in many cases (and I'm not aware of any situation where it's worse):
http://www.jstor.org/stable/1403204

The confusion is that the term "orthogonal polynomial" almost always refers
to something else, that afaict is totally irrelevant to linear regression.
In ordinary mathematical usage, two polynomials f1 and f2 are orthogonal
iff the integral of f1(x)f2(x)dx = 0, evaluated over the whole domain.

What we want are polynomials so that sum f1(x)f2(x) = 0 when summing over
our data points.

Poly() implements the latter (and only uses it for categorical variables treated as integers, but the code would work for non-integer data just as well).
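
(A small sketch of the distinction, with made-up data points; legvander evaluates the classically orthogonal Legendre polynomials:)

import numpy as np
from numpy.polynomial.legendre import legvander

x = np.array([-0.9, -0.5, 0.1, 0.2, 0.8])  # arbitrary, non-uniform data
V = legvander(x, 2)  # P0, P1, P2 evaluated at the data points

# Legendre polynomials are orthogonal as *functions* on [-1, 1], but their
# evaluations at these particular points are not orthogonal as *vectors*:
print(np.allclose(np.dot(V.T, V), np.diag(np.diag(np.dot(V.T, V)))))  # False

# A QR factorization gives columns that ARE orthogonal over the data points:
Q = np.linalg.qr(V)[0]
print(np.allclose(np.dot(Q.T, Q), np.eye(3)))  # True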

What Stata does I don't know. The manual page you pointed me to strongly implies that they produce the second kind of orthogonal polynomials (when it says "orthog and orthpoly create a set of variables such that the 'effects' of all the preceding variables have been removed from each variable"). But AFAICT the Christoffel-Darboux recurrence is for producing the first kind of orthogonal polynomials.

Are the predictor vectors that you get out of orthpoly actually orthogonal
as vectors?


@jseabold
Member Author

Just parking this one-liner here, since I came back around to this. It's an equivalent of R's poly for continuous variables, AFAIK, though not really numerically sound.

import numpy as np

# QR-orthogonalize the Vandermonde basis [1, x, x**2, ...] and drop the constant column
poly = lambda x, degree: np.linalg.qr(np.vander(x, degree + 1)[:, ::-1])[0][:, 1:]
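
(A quick sanity check, with made-up inputs, just to show the columns come out orthonormal:)

x = np.linspace(0.0, 1.0, 50)  # made-up sample points
P = poly(x, 3)                 # 50 x 3 basis covering degrees 1..3
# Q from a reduced QR has orthonormal columns, so this prints True:
print(np.allclose(np.dot(P.T, P), np.eye(3)))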

@matthewwardrop
Collaborator

I'm going to close this one out since a PR exists that adds a poly implementation (#92), and there's no further action to be taken.
