discrete choice modeling blogpost #11

drbenvincent · 2024-10-31T14:46:33Z

This PR adds a notebook that will form the second blog post in the Colgate client write-up series.

The first post was Causal sales analytics: Are my sales incremental or cannibalistic?

NOTE: I'll be pretty aggressive about hiding most of the code cells in the final blogpost in order to maximise readability.

Current state: At this point (2024/10/31) I've basically written the first half of the blog post. It outlines the basic discrete choice model and sets up the core limitation of producing uninteresting cannibalization effects.

TODO

We might want to play with the random seed to get nicer-looking synthetic data.
We might also want to tweak the synthetic price data to allow for better parameter identifiability.
Potentially add a manufacturer (or benefit) effect to really show the lack of interesting cannibalization effects.
I'm hoping that either @ricardoV94 or @lucianopaz or @cluhmann will take over the reins and continue the blog post to talk about the core innovations of what we did. We are allowed to talk about the maths of the nested logit, but we're not allowed to present code to implement it.
Hoping someone can write a nice overview of the cool new stuff that was done. I'll then come back in and wrap it up with an executive summary at the start and a conclusion at the end.

review-notebook-app · 2024-10-31T14:46:38Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

ricardoV94 · 2024-11-18T08:41:04Z

@drbenvincent I'm leaving some comments on your part of the blogpost, before I start on the second part:

A single intercept doesn't make sense. I assume you meant \beta_0^i (one intercept per item), as you have in the model description below?
In the model description, the line u_{i,t} = a = b doesn't make mathematical sense. You introduce the log of the price, which is clearly not equivalent to the original expression in price. The brackets are also strange: you are not multiplying the intercept by the price, I assume?
The Multinomial model is very confident because total_sales is high. We obviously had more noise in our data, which is why we had to switch to the DirichletMultinomial model instead: the Multinomial cannot explain such large errors with these high total sales. May be worth mentioning?
"So this is all great, but it's the kind of output that data scientists would enjoy." Is there irony in this sentence, or is it missing a "not"?
What-if scenario. Needs a bit more text explaining what results we can see in the 5 plots?
I don't like the plot showing the market share before and after as the distance from the x=y line. This will never show anything interesting unless you remove an item that has a sizeable portion of the market share (which you would never do anyway). It's also mostly wasted white space on the plot. I would rather show the ratio of market share before and after as a bar plot, which will clearly show everything going up by the same %. Conversely: imagine you remove an item with a 1% market share and another item takes all of this market share (very interesting perfect cannibalization). It would go from x to x+1%, which would still look super boring on the plot you defined. The plot is not good for showing what you want.
The prior for the intercepts should be zero-sum, otherwise there's one parameter too many.
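
To make the points about the ratio plot and the zero-sum intercepts concrete, here is a minimal NumPy sketch. It assumes a plain multinomial-logit (softmax) share model. The utility values and the names `market_shares`, `utilities`, and `removed` are hypothetical illustrations, not taken from the notebook.

```python
import numpy as np


def market_shares(utilities):
    """Softmax (multinomial-logit) market shares for a vector of utilities."""
    z = utilities - np.max(utilities)  # subtract the max for numerical stability
    expz = np.exp(z)
    return expz / expz.sum()


# Hypothetical per-item utilities (illustrative values only)
utilities = np.array([1.2, 0.4, -0.3, 0.9, -1.5])
shares_before = market_shares(utilities)

# What-if scenario: remove one item and recompute shares over the rest
removed = 2
keep = np.arange(len(utilities)) != removed
shares_after = market_shares(utilities[keep])

# Ratio of after/before share for each surviving item.
# Under a plain softmax model this is the same number for every item.
ratio = shares_after / shares_before[keep]
print("share ratios after removal:", np.round(ratio, 4))

# Shift invariance: adding a constant to every utility leaves shares unchanged,
# so one intercept per item has one redundant degree of freedom.
shifted_shares = market_shares(utilities + 3.7)
print("max share change after shift:", np.abs(shifted_shares - shares_before).max())

# One way to remove the redundancy: constrain the intercepts to sum to zero.
centered = utilities - utilities.mean()
print("sum of centered utilities:", centered.sum())
```

In this sketch every surviving item's share is scaled by the same factor, 1 / (1 - share of the removed item), which is the uniform increase a bar plot of ratios would make visible. The shift check shows why one intercept is redundant unless the intercepts are constrained, for example to sum to zero.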

Obligatory message: I think overall the blog is in a pretty nice shape!

ricardoV94 · 2024-11-18T10:41:00Z

I'm going to push a second NB that uses data pre-generated according to the NLM (nested logit model). I think this will streamline the blogpost, showing where the basic model fails and why the NLM can address it. Not changing the original NB so we can compare, because git changes suck for NBs.

initial commit of discrete choice modeling blogpost

40cce0e

Use data generated in script and bridge to NLM model

8083847
