
chore: Add metric for price estimates deviation #1921

Closed
sunce86 opened this issue Oct 5, 2023 · 7 comments
Labels
question Further information is requested team:solvers Issues / PRs scoped to solvers

Comments


sunce86 commented Oct 5, 2023

Background

For a group of price estimates, add a metric that tracks the standard score (z-score) of the winning solution. CompetitionEstimator would apparently be a good place to add this metric.

Later on, we can add a Grafana alert and check whether the standard score is too large, which would indicate that the winning solution is far away from the consensus (what most other estimators return).

@sunce86 sunce86 added the oncall Issue/PR for consideration during oncall rotation label Oct 5, 2023
acanidio-econ commented Oct 5, 2023

Standard scores (or z-scores) are a common way to detect outliers whenever there are several observations from a population. For our purposes, suppose that "executable prices" for a token pair follow a normal distribution. If we have enough observations, we can compute a mean and standard deviation. We can then consider a specific price and ask: what is the probability that this specific price comes from the distribution of executable prices? If the answer is too low (10% or 5%), we may flag this price as suspicious, that is, as not really executable. For example, a common z-score threshold is 1.28 because, in a standard normal distribution, there is a 10% probability that an observation lies more than 1.28 standard deviations above the mean.
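The 10% figure for z = 1.28 can be checked directly from the standard normal CDF; a minimal sketch using only the Python standard library:

```python
# Upper-tail probability of a standard normal at a given z-score,
# illustrating why z = 1.28 corresponds to roughly a 10% chance of
# lying that far above the mean.
from statistics import NormalDist

def upper_tail_probability(z: float) -> float:
    """P(X > z) for a standard normal X."""
    return 1.0 - NormalDist().cdf(z)

print(round(upper_tail_probability(1.28), 3))  # ~0.1, i.e. about 10%
```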

Things get a bit more complicated when we have few observations. First, there are two ways to compute the standard deviation (one derived from a biased estimator of the variance, one from the unbiased estimator). The two are equivalent when we have many observations but differ when there are few. The "biased" formula reports lower values than the "unbiased" formula and is hence more likely to trigger the alert.
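To see the biased/unbiased difference concretely, here is a small sketch using Python's statistics module on a hypothetical three-price sample:

```python
# Biased (population) vs. unbiased (sample) standard deviation on a small
# sample. With few observations the biased value is noticeably smaller, so
# dividing by it yields larger z-scores and a more trigger-happy alert.
from statistics import pstdev, stdev

prices = [1.0, 2.0, 4.0]    # hypothetical price estimates
biased = pstdev(prices)     # divides the sum of squared deviations by T
unbiased = stdev(prices)    # divides by T - 1
print(biased < unbiased)    # True for any sample with T > 1 and nonzero variance
```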

Most importantly, there is no grand theory that can guide us in the interpretation of the test. For example, with only two observations, the z-test either always fails or always passes (so it has no information content). We may consider this a reasonable outcome in the sense that, with only two observations, it is unclear whether we can use a statistical test at all. For three observations, we can run some simulations to see when the test triggers the alert. The result is below (using the unbiased formula for the standard deviation, with the lowest observation x1 normalized to 1). For example, the graph shows that setting the z-score threshold to 1.1 would trigger an alert whenever the second observation is 2 (or lower) and the highest observation is 4 (or higher).

(simulation plot: 3-obs)
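The two-observation degeneracy is easy to demonstrate: with the unbiased formula, the max z-score of any two distinct observations is always 1/sqrt(2) ≈ 0.707 regardless of the values, so a fixed threshold either always triggers or never does. A quick sketch:

```python
# With exactly two observations, the max z-score (unbiased standard
# deviation) is a constant ~0.7071 no matter how far apart the values are,
# so the test carries no information in that case.
from statistics import mean, stdev

def max_z_score(obs):
    m, s = mean(obs), stdev(obs)
    return max((x - m) / s for x in obs)

print(round(max_z_score([1.0, 2.0]), 4))     # 0.7071
print(round(max_z_score([1.0, 1000.0]), 4))  # 0.7071
```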

@acanidio-econ

We can repeat the same simulation with 4 observations. Again, the lowest observation is 1. If the second-lowest observation is 2, the simulation results are:

(simulation plot: 4-obs-2)

Hence, if z=1.2, then observations [1,2,2,3] or [1,2,3,4.5] would trigger an alert.

If the second-lowest observation is 3, the simulation results are:

(simulation plot: 4-obs-3)

Hence, if z=1.2, then observations [1,3,3,5] or [1,3,4,6] would trigger an alert.

If the second-lowest observation is 4, the simulation results are:

(simulation plot: 4-obs-4)

Hence, if z=1.2, then observations [1,4,4,7] or [1,4,5,8] would trigger an alert.
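The quadruples above can be checked against the z = 1.2 rule of thumb with a few lines of Python (unbiased standard deviation, as in the simulations):

```python
# Verify that each example quadruple has a max z-score above 1.2 and
# would therefore trigger the alert.
from statistics import mean, stdev

def max_z_score(obs):
    m, s = mean(obs), stdev(obs)
    return max((x - m) / s for x in obs)

examples = [
    [1, 2, 2, 3], [1, 2, 3, 4.5],
    [1, 3, 3, 5], [1, 3, 4, 6],
    [1, 4, 4, 7], [1, 4, 5, 8],
]
for obs in examples:
    print(obs, round(max_z_score(obs), 3), max_z_score(obs) > 1.2)
```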

@acanidio-econ

Anyway, the formula for the standard deviation (using the unbiased estimator of the variance) is:

`s := sqrt( ( (x1-xm)^2 + (x2-xm)^2 + ... + (xT-xm)^2 ) / (T-1) )`

where T is the number of observations and xm is the sample average. The z-score of a given observation xi is then:

`(xi - xm) / s`

We are interested in the max z-score. We could start from the rule of thumb of considering a max z-score above 1.2 as problematic.
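A from-scratch sketch of this formula, applied to a hypothetical set of price estimates (taking the maximum as the "winning" estimate is an illustrative assumption, not how the estimator necessarily picks winners):

```python
# The unbiased standard deviation and z-score written out without library
# helpers, matching the formula above term by term.
import math

def unbiased_stdev(xs):
    xm = sum(xs) / len(xs)  # sample average
    return math.sqrt(sum((x - xm) ** 2 for x in xs) / (len(xs) - 1))

def z_score(x, xs):
    xm = sum(xs) / len(xs)
    return (x - xm) / unbiased_stdev(xs)

prices = [1.0, 2.0, 2.0, 3.0]              # hypothetical estimates
winning = max(prices)                      # assumed winner for illustration
print(round(z_score(winning, prices), 3))  # 1.225 > 1.2 -> would alert
```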

@sunce86 sunce86 self-assigned this Oct 13, 2023

This issue has been marked as stale because it has been inactive a while. Please update this issue or it will be automatically closed.

@github-actions github-actions bot added the stale label Feb 14, 2024
@harisang harisang reopened this Apr 10, 2024
@harisang harisang added team:solvers Issues / PRs scoped to solvers and removed stale oncall Issue/PR for consideration during oncall rotation labels Apr 10, 2024
@fleupold

Do we still want to implement this suggestion in a world where we have verified quotes (i.e. we simulate the calldata a solver reports in their quote and only use the quote if the simulation matches the amount they claim)?

This would make it very unlikely for solvers to accidentally report unrealistic quotes (only if they e.g. underwrote private liquidity orders for it).

My main worry with this approach is that it will likely cost significant engineering time to monitor and fine-tune this system so that we are not accidentally ignoring good quotes (we have seen cases where only two or three solvers respond with significant differences in their results, yet all of them are valid).

It is true that verified quotes do not work until the user has actually connected their wallet (and the wallet has sufficient funds to make the trades they are quoting). Until then, bogus estimates could still be returned. However, I'd argue that in those cases the user is unlikely to place an order anyway, so what's the value in showing them a realistic quote? I'd rather show a message in the UI for unverified quotes saying that those are only indicative and might change once they connect a wallet with sufficient funds.

If at all, I'd like us to not filter out any quotes that have been verified, no matter how good they look (to avoid false negatives in our detection). But in that case I wonder if this task is worth the cost of building it.

@fleupold fleupold added the question Further information is requested label Apr 11, 2024

github-actions bot commented Jul 8, 2024

This issue has been marked as stale because it has been inactive a while. Please update this issue or it will be automatically closed.

@github-actions github-actions bot added the stale label Jul 8, 2024
@mfw78 mfw78 removed the stale label Jul 10, 2024

This issue has been marked as stale because it has been inactive a while. Please update this issue or it will be automatically closed.
