Refactor code expanding/exploding regions into a single utility function #206

siddharth-krishna · 2024-03-01T12:48:10Z

There's code in generate_uc_properties that expands allregions and comma-separated region lists:

Lines 749 to 767 in b12287d

    # TODO: Can this (until user_constraints.explode) become a utility function?
    # Handle allregions by substituting it with a list of internal regions
    index = user_constraints["region"].str.lower() == "allregions"
    if any(index):
        user_constraints["region"][index] = [model.internal_regions]
    # Handle comma-separated regions
    index = user_constraints["region"].str.contains(",").fillna(value=False)
    if any(index):
        user_constraints["region"][index] = user_constraints.apply(
            lambda row: [
                region
                for region in str(row["region"]).split(",")
                if region in model.internal_regions
            ],
            axis=1,
        )
    # Explode regions
    user_constraints = user_constraints.explode("region", ignore_index=True)

which is very similar to code in process_transform_tables:

xl2times/xl2times/transforms.py

Lines 1708 to 1745 in 3720c7e

    # Handle Regions:
    if set(df.columns).isdisjoint(
        {x.lower() for x in regions} | {"allregions"}
    ):
        if "region" not in df.columns:
            # If there's no region information at all, this table is for all regions:
            df["region"] = ["allregions"] * len(df)
        # Else, we only have a "region" column so handle it below
    else:
        if "region" in df.columns:
            raise ValueError(
                "ERROR: table has a column called region as well as columns with"
                f" region names:\n{table}\n{df.columns}"
            )
        # We have columns whose names are regions, so gather them into a "region" column:
        region_cols = [
            col_name
            for col_name in df.columns
            if col_name in set([x.lower() for x in regions]) | {"allregions"}
        ]
        other_columns = [
            col_name for col_name in df.columns if col_name not in region_cols
        ]
        df = pd.melt(
            df,
            id_vars=other_columns,
            var_name="region",
            value_name="value",
            ignore_index=False,
        )
        df = df.sort_index().reset_index(drop=True)  # retain original row order
    # This expands "allregions" into one row for each region:
    df["region"] = df["region"].map(
        lambda x: regions if x == "allregions" else x
    )
    df = df.explode(["region"])
    df["region"] = df["region"].str.upper()

and there's also an explode function in utils.py.

It would be good to have all the region-exploding code in one place, for code reuse and conciseness, and so that any optimizations apply everywhere.

(Link to original discussion: https://github.com/etsap-TIMES/xl2times/pull/179/files/4ea76267c9558b3a08d09ec282b7a5fcaa458f8c#r1487242195)
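
As a starting point for discussion, here is a rough sketch of what a shared helper could look like. The name `expand_regions`, its signature, and the case-insensitive matching are all assumptions for illustration and not existing xl2times code. It only covers the steps the two snippets have in common (expanding `allregions`, splitting comma-separated lists, and exploding); the melt of region-named columns and the final upper-casing in `process_transform_tables` would remain at the call site.

```python
import pandas as pd


def expand_regions(df, all_regions, column="region"):
    """Return a copy of df with one row per region.

    Hypothetical helper (not part of xl2times): expands "allregions" and
    comma-separated region lists found in `column`.
    """
    # Assumption: region names are matched case-insensitively.
    valid = {r.lower(): r for r in all_regions}

    def to_list(value):
        if not isinstance(value, str):
            return value  # leave NaN / missing values untouched
        if value.strip().lower() == "allregions":
            return list(all_regions)
        if "," in value:
            # Keep only the listed regions that are known to the model.
            parts = [p.strip().lower() for p in value.split(",")]
            return [valid[p] for p in parts if p in valid]
        return value  # a single region name is passed through unchanged

    out = df.copy()
    out[column] = out[column].map(to_list)
    # List-valued cells become one row each; scalar cells are kept as-is.
    return out.explode(column, ignore_index=True)


example = pd.DataFrame(
    {"region": ["allregions", "R1,R3", "R2"], "value": [1, 2, 3]}
)
print(expand_regions(example, ["R1", "R2"]))
```

Working on a copy and assigning the whole column avoids the chained `df["region"][index] = ...` assignments used in `generate_uc_properties`. One behaviour to decide on: a comma-separated cell with no valid regions becomes an empty list, which `explode` turns into a single row with a missing region.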
