Unclear about the code of calculate_ppi function #64

kitkhai · 2024-07-23T02:03:54Z

Hi

It is unclear why the code creates 20 evenly spaced values, ns = np.linspace(0, n_max, 20).astype(int), to set the number of labelled data points used to calculate the PPI, when in the end only the PPI value calculated from n_max is retained, as seen from avg_ci = ci.mean(axis=0)[-1].

I don't understand the purpose of conducting multiple trials. Since only the PPI value calculated from n_max is retained, and at n = n_max every trial uses the full labelled set (only shuffled), shouldn't that value be constant across trials?

ARES/ares/RAG_Automatic_Evaluation/LLMJudge_RAG_Compared_Scoring.py

Lines 238 to 281 in 2684d47

    
    def calculate_ppi(Y_labeled: np.ndarray, Yhat_labeled: np.ndarray,
                      Yhat_unlabeled: np.ndarray, alpha: float, num_trials: int) -> tuple:
        """
        Calculate prediction-powered inference (PPI) and classical inference intervals.
        Parameters:
        Y_labeled (np.ndarray): Labeled ground truth values.
        Yhat_labeled (np.ndarray): Predictions for the labeled data.
        Yhat_unlabeled (np.ndarray): Predictions for the unlabeled data.
        alpha (float): Significance level for the confidence intervals.
        num_trials (int): Number of trials to run for the inference.
        Returns:
        tuple: A tuple containing the average PPI confidence interval, the average classical confidence interval, and the imputed-only confidence interval.
        """
        n_max = Y_labeled.shape[0]
        ns = np.linspace(0, n_max, 20).astype(int)
        # Imputed-only estimate
        imputed_estimate = (Yhat_labeled.sum() + Yhat_unlabeled.sum()) / (Yhat_labeled.shape[0] + Yhat_unlabeled.shape[0])
        # Initialize arrays to store confidence intervals
        ci = np.zeros((num_trials, ns.shape[0], 2))
        ci_classical = np.zeros((num_trials, ns.shape[0], 2))
        # Run prediction-powered inference and classical inference for many values of n
        for j in tqdm(range(num_trials), desc="Trials"):  # Wrap the outer loop with tqdm for the progress bar
            for i, n in enumerate(ns):  # Iterate over ns with an index
                rand_idx = np.random.permutation(Y_labeled.shape[0])
                f = Yhat_labeled.astype(float)[rand_idx[:n]]
                y = Y_labeled.astype(float)[rand_idx[:n]]
                output = pp_mean_iid_asymptotic(y, f, Yhat_unlabeled, alpha)
                ci[j, i, :] = output
                # Classical interval
                try:
                    if n == 0:
                        ci_classical[j, i, :] = [0, 0]
                    else:
                        ci_classical[j, i, :] = binomial_iid(n, alpha, y.mean())
                except:
                    avg_ci_classical = None
        avg_ci = ci.mean(axis=0)[-1]
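The concern about the last grid entry can be checked with a small standard-library sketch. It assumes the interval depends on the labelled subsample only through order-independent statistics (its size and the mean and variance of y - f), which is the case for the usual asymptotic PPI interval for a mean; whether pp_mean_iid_asymptotic in ARES behaves exactly this way is not verified here. The data and the helper name `summary` are made up for illustration.

```python
import random
import statistics

random.seed(0)

# Hypothetical labelled data: binary ground truth and judge predictions.
n_max = 50
y_labeled = [random.randint(0, 1) for _ in range(n_max)]
yhat_labeled = [random.randint(0, 1) for _ in range(n_max)]


def summary(idx):
    """Order-independent statistics of the labelled subsample selected by idx."""
    rectifier = [y_labeled[i] - yhat_labeled[i] for i in idx]
    return (len(idx), statistics.fmean(rectifier), statistics.pvariance(rectifier))


summaries = set()
for _ in range(100):  # 100 "trials"
    # Stand-in for np.random.permutation(n_max)
    rand_idx = random.sample(range(n_max), n_max)
    # With n == n_max the slice keeps every labelled point, only reordered.
    summaries.add(summary(rand_idx[:n_max]))

# Number of distinct summaries seen across all trials
print(len(summaries))
```

Because every trial yields the same summary at n = n_max, averaging that column over trials would not change the value under this assumption.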

Also, I saw that there were other functions in the original PPI repository (PPI bootstrap, cross PPI, etc.). What were your considerations when choosing which PPI function to use?


WJ44 · 2024-09-13T10:01:20Z

Correct me if I am wrong, but I think this code resulted from a misinterpretation of the code examples from the PPI paper. There, the authors run experiments to show how PPI compares to classical inference by running multiple trials with labelled samples of different sizes drawn from different parts of the dataset. In practice, one would simply use the complete labelled dataset.
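As a rough illustration of that last point, the sketch below computes a single PPI interval for a mean from the complete labelled set, with no trials and no grid of sample sizes. It uses the standard asymptotic formula (mean of the unlabelled predictions plus the mean rectifier y - f, with the two variance terms added) and only the Python standard library. The function name `ppi_mean_ci`, the helper `judge`, and the synthetic data are assumptions for this example; this is not the ARES or ppi_py implementation.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def ppi_mean_ci(y, yhat, yhat_unlabeled, alpha=0.05):
    """Asymptotic PPI confidence interval for a mean (simplified sketch)."""
    n, N = len(y), len(yhat_unlabeled)
    # Rectifier: how far the predictions are from the truth on labelled data.
    rectifier = [yi - fi for yi, fi in zip(y, yhat)]
    point = fmean(yhat_unlabeled) + fmean(rectifier)
    se = math.sqrt(variance(yhat_unlabeled) / N + variance(rectifier) / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return point - z * se, point + z * se


random.seed(0)


def judge(label):
    """Hypothetical judge that agrees with the true label 85% of the time."""
    return label if random.random() < 0.85 else 1 - label


# Synthetic data: 300 labelled points, 5000 unlabelled predictions.
y = [1 if random.random() < 0.7 else 0 for _ in range(300)]
yhat = [judge(v) for v in y]
yhat_unlabeled = [judge(1 if random.random() < 0.7 else 0) for _ in range(5000)]

# One call on the complete labelled set.
lo, hi = ppi_mean_ci(y, yhat, yhat_unlabeled, alpha=0.05)
print(f"PPI interval: [{lo:.3f}, {hi:.3f}]")
```

Subsampling over a grid of n, as in the paper's experiments, is useful for plotting interval width against the number of labels, but a single evaluation only needs the one call above.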
