It is unclear why 20 evenly spaced values are created with ns = np.linspace(0, n_max, 20).astype(int) to set the number of labelled data points used to calculate the PPI, when in the end only the PPI value calculated from n_max is retained, as seen from avg_ci = ci.mean(axis=0)[-1].
I also don't understand the purpose of conducting multiple trials. Since only the PPI value calculated from n_max is retained, shouldn't it be constant across the trials?
    ci_classical[j, i, :] = binomial_iid(n, alpha, y.mean())
except:
    avg_ci_classical = None
avg_ci = ci.mean(axis=0)[-1]
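To make the indexing concrete, here is a minimal sketch of what those two expressions do. The values of n_max and num_trials, the array shape (trials, sample sizes, 2) and the placeholder intervals are assumptions for illustration, not data or code from the repository.

```python
import numpy as np

n_max = 190       # hypothetical number of labelled examples
num_trials = 5    # hypothetical number of trials

# 20 evenly spaced (not random) sample sizes running from 0 to n_max.
ns = np.linspace(0, n_max, 20).astype(int)

# Placeholder intervals with assumed shape (trials, sample sizes, 2),
# where the last axis holds [lower, upper].
rng = np.random.default_rng(0)
centers = rng.uniform(0.4, 0.6, size=(num_trials, len(ns)))
ci = np.stack([centers - 0.05, centers + 0.05], axis=-1)

# Average over trials, then keep only the row for the last sample size.
avg_ci = ci.mean(axis=0)[-1]

# The same value is obtained from the n_max column alone,
# so the other 19 columns never influence the result.
only_last = ci[:, -1, :].mean(axis=0)
print(ns[-1] == n_max, np.allclose(avg_ci, only_last))
```

Under these assumptions, the intervals computed for the 19 smaller sample sizes are discarded by the final [-1] index.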
Also, I saw that there are other functions in the original PPI repository (PPI bootstrap, cross PPI, etc.). What were your considerations when choosing which PPI function to use?
Correct me if I am wrong, but I think this code resulted from a misinterpretation of the code examples from the PPI paper. There, the experiments show how PPI compares to classical inference by running multiple trials with labelled samples drawn from different parts of the dataset and at different sizes. In practice, one would simply use the complete labelled dataset.
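A rough sketch of that point, assuming each trial picks its labelled subset by random permutation (an assumption about the trial setup, not something confirmed here). The helper ppi_mean_ci is hypothetical: it is a plain normal-approximation interval for a mean written for this example, not the function the repository or the PPI library uses, and all data below are synthetic.

```python
import numpy as np
from statistics import NormalDist

def ppi_mean_ci(y, yhat, yhat_unlabeled, alpha=0.05):
    """Hypothetical helper: normal-approximation PPI interval for a mean."""
    n, N = len(y), len(yhat_unlabeled)
    rectifier = y - yhat  # prediction error measured on the labelled data
    theta = yhat_unlabeled.mean() + rectifier.mean()
    se = np.sqrt(yhat_unlabeled.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return np.array([theta - z * se, theta + z * se])

# Synthetic binary labels and noisy predictions (assumed data).
rng = np.random.default_rng(0)
n_max, N = 200, 2000
y_labeled = rng.integers(0, 2, n_max).astype(float)
flip = rng.random(n_max) < 0.2
yhat_labeled = np.where(flip, 1 - y_labeled, y_labeled)
yhat_unlabeled = rng.integers(0, 2, N).astype(float)

# Experiment style: several trials, each using a permuted subset of size n_max.
trial_cis = np.array([
    ppi_mean_ci(y_labeled[idx], yhat_labeled[idx], yhat_unlabeled)
    for idx in (rng.permutation(n_max)[:n_max] for _ in range(10))
])

# Practical style: one call on the complete labelled dataset.
direct_ci = ppi_mean_ci(y_labeled, yhat_labeled, yhat_unlabeled)

# With n = n_max every permutation selects the same examples,
# so every trial reproduces the direct interval.
print(np.allclose(trial_cis, direct_ci))
```

In this sketch the trials only differ when the subset size is smaller than n_max, which is the setting the comparison experiments care about; at n_max a single call gives the same interval.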
Code referenced in the original question: ARES/ares/RAG_Automatic_Evaluation/LLMJudge_RAG_Compared_Scoring.py, lines 238 to 281 in 2684d47.