summary_plot: also applied typehints and snake_case (refer issue #50 ) #58

Tanvi-Jain01 · 2023-07-10T09:18:54Z

@nipunbatra, @patel-zeel
This PR fixes issue #50 and also applies @patel-zeel's suggestion to use snake_case names and type hints.

BEFORE:

CODE:

import xarray as xr
import numpy as np
import pandas as pd
import geopandas as gpd
import plotly.express as px
import matplotlib.pyplot as plt

np.random.seed(42)  

start_date = pd.to_datetime('2022-01-01')
end_date = pd.to_datetime('2022-12-31')

dates = pd.date_range(start_date, end_date)

pm25_values = np.random.rand(365)  # Generate 365 random values
o3_values = np.random.rand(365) 
nox_values = np.random.rand(365)
co_values = np.random.rand(365)
pm10_values = np.random.rand(365)

# Pollutant column names: "pm10", "pm25", "sox", "co", "o3", "nox", "pb", "nh3"
df = pd.DataFrame({
    'date': dates,
    'pm25': pm25_values,
    'o3': o3_values,
    'nox': nox_values,
    'co': co_values,
    'pm10': pm10_values
})

df['date'] = df['date'].dt.strftime('%Y-%m-%d')  # Convert date format to 'YYYY-MM-DD'

print(df)

from vayu.summaryPlot import summaryPlot

print(df.columns)
summaryPlot(df)

ERROR:

KeyError                                  Traceback (most recent call last)
File ~\anaconda3\lib\site-packages\pandas\core\indexes\base.py:3802, in Index.get_loc(self, key, method, tolerance)
   3801 try:
-> 3802     return self._engine.get_loc(casted_key)
   3803 except KeyError as err:

File ~\anaconda3\lib\site-packages\pandas\_libs\index.pyx:138, in pandas._libs.index.IndexEngine.get_loc()

File ~\anaconda3\lib\site-packages\pandas\_libs\index.pyx:165, in pandas._libs.index.IndexEngine.get_loc()

File pandas\_libs\hashtable_class_helper.pxi:5745, in pandas._libs.hashtable.PyObjectHashTable.get_item()

File pandas\_libs\hashtable_class_helper.pxi:5753, in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'so2'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
Cell In[4], line 15
     12 #benzene.dropna(inplace=True)
     13 benzene.fillna(0, inplace=True)
---> 15 summaryPlot(benzene)

File ~\anaconda3\lib\site-packages\vayu\summaryPlot.py:126, in summaryPlot(df)
    124 plt.subplot(9, 3, sub)
    125 sub = sub + 1
--> 126 a = df_all[dataPoints[i]].plot.line(color="gold")
    127 a.axes.get_xaxis().set_visible(False)
    128 a.yaxis.set_label_position("left")

File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3807, in DataFrame.__getitem__(self, key)
   3805 if self.columns.nlevels > 1:
   3806     return self._getitem_multilevel(key)
-> 3807 indexer = self.columns.get_loc(key)
   3808 if is_integer(indexer):
   3809     indexer = [indexer]

File ~\anaconda3\lib\site-packages\pandas\core\indexes\base.py:3804, in Index.get_loc(self, key, method, tolerance)
   3802     return self._engine.get_loc(casted_key)
   3803 except KeyError as err:
-> 3804     raise KeyError(key) from err
   3805 except TypeError:
   3806     # If we have a listlike key, _check_indexing_error will raise
   3807     #  InvalidIndexError. Otherwise we fall through and re-raise
   3808     #  the TypeError.
   3809     self._check_indexing_error(key)

KeyError: 'so2'
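
The last frames show `summaryPlot` indexing `df_all[dataPoints[i]]` with the name 'so2', a column that the DataFrame passed in does not contain. A minimal sketch of that failure mode and of a membership guard follows. The `expected` list is a hypothetical stand-in, since only 'so2' is confirmed by the traceback.

```python
import pandas as pd

# Hypothetical stand-in for a hard-coded pollutant list; only 'so2' is
# confirmed by the traceback above.
expected = ["pm25", "so2"]

df = pd.DataFrame({"pm25": [10.0, 20.0, 30.0]})

# Selecting a column that is absent raises KeyError, as in the traceback.
try:
    df["so2"]
    raised = False
except KeyError:
    raised = True

# Guard: keep only the names that are actually present in the frame.
present = [name for name in expected if name in df.columns]
print(raised, present)
```

Checking membership before indexing is the same idea the AFTER code uses with `if pollutant in df_all.columns`.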

OUTPUT:

AFTER:

CODE:

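import pandas as pd
import matplotlib.pyplot as plt


def summary_plot(df: pd.DataFrame) -> None:
    """Plot a line chart, histogram, and pie chart per column of df, which must include a 'date' column."""
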
    # Initialize variables
    pollutants = ["pm10", "pm25", "sox", "co", "o3", "nox", "pb", "nh3"]
    categories = ["s", "m", "h"]

    counts = {pollutant: {category: 0 for category in categories} for pollutant in pollutants}

    
    # Index by date on a copy so the caller's DataFrame is left untouched
    df_all = df.copy()
    df_all.index = pd.to_datetime(df_all["date"])
    df_all = df_all.drop("date", axis=1)
    # Forward-fill gaps in the readings
    df_all = df_all.ffill()

    # Calculate counts for each pollutant category
    for pollutant in pollutants:
        if pollutant in df_all.columns:
            column_data = df_all[pollutant]
            # Series.items() works on pandas 1.x and 2.x; iteritems() was removed in 2.0
            for _, data in column_data.items():
                if pollutant in ["pm10", "pm25"]:
                    if data < 100:
                        counts[pollutant]["s"] += 1
                    elif data < 250:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
                elif pollutant == "co":
                    if data < 2:
                        counts[pollutant]["s"] += 1
                    elif data < 10:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
                elif pollutant == "sox":
                    if data <= 80:
                        counts[pollutant]["s"] += 1
                    elif data <= 380:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
                elif pollutant == "o3":
                    if data < 100:
                        counts[pollutant]["s"] += 1
                    elif data < 168:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
                elif pollutant == "nox":
                    if data < 80:
                        counts[pollutant]["s"] += 1
                    elif data < 180:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
                elif pollutant == "pb":
                    if data <= 1:
                        counts[pollutant]["s"] += 1
                    elif data <= 2:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
                elif pollutant == "nh3":
                    if data <= 400:
                        counts[pollutant]["s"] += 1
                    elif data <= 800:
                        counts[pollutant]["m"] += 1
                    else:
                        counts[pollutant]["h"] += 1
         
                

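    # Plot line, histogram, and pie charts for each recognised pollutant column
    plot_columns = [column for column in df_all.columns if column in counts]
    # squeeze=False keeps axes two-dimensional even when only one column is plotted
    fig, axes = plt.subplots(len(plot_columns), 3, figsize=(25, 25), squeeze=False)

    for i, pollutant in enumerate(plot_columns):
        ax_line = axes[i, 0]
        ax_hist = axes[i, 1]
        ax_pie = axes[i, 2]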

        df_all[pollutant].plot.line(ax=ax_line, color="gold")
        ax_line.axes.get_xaxis().set_visible(False)
        ax_line.yaxis.set_label_position("left")
        ax_line.set_ylabel(pollutant, fontsize=30, bbox=dict(facecolor="whitesmoke"))

        ax_hist.hist(df_all[pollutant], bins=50, color="green")

        labels = ["Safe", "Moderate", "High"]
        sizes = [counts[pollutant][category] for category in categories]
        explode = [0, 0, 1]

        ax_pie.pie(sizes, explode=explode, labels=labels, autopct="%1.1f%%", shadow=False, startangle=90)
        ax_pie.axis("equal")

        ax_pie.set_xlabel("Statistics")
      
        print(f"{pollutant}\nmin = {df_all[pollutant].min():.2f}\nmax = {df_all[pollutant].max():.2f}\nmissing = {df_all[pollutant].isna().sum()}\nmean = {df_all[pollutant].mean():.2f}\nmedian = {df_all[pollutant].median():.2f}\n95th percentile = {df_all[pollutant].quantile(0.95):.2f}\n")

    plt.savefig("summaryPlot.png", dpi=300, format="png")
    plt.show()
    print("Your plots have also been saved.")
    plt.close()
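
The `if`/`elif` chains in `summary_plot` repeat the same three-way comparison with different limits. As a sketch only, and not part of this PR, the limits could live in a table and be applied with `np.digitize`. `THRESHOLDS` and `count_categories` are hypothetical names; the limits and the `<` versus `<=` choices are copied from the branches above.

```python
import numpy as np
import pandas as pd

# (safe_limit, moderate_limit, inclusive) per pollutant, copied from the
# branches in summary_plot; inclusive=True means the limits use <=, else <.
THRESHOLDS = {
    "pm10": (100, 250, False),
    "pm25": (100, 250, False),
    "co": (2, 10, False),
    "sox": (80, 380, True),
    "o3": (100, 168, False),
    "nox": (80, 180, False),
    "pb": (1, 2, True),
    "nh3": (400, 800, True),
}


def count_categories(values: pd.Series, pollutant: str) -> dict:
    """Count safe ("s"), moderate ("m"), and high ("h") readings (hypothetical helper)."""
    safe, moderate, inclusive = THRESHOLDS[pollutant]
    # digitize maps each reading to 0, 1, or 2; right=True makes the limits inclusive.
    idx = np.digitize(values.to_numpy(dtype=float), [safe, moderate], right=inclusive)
    tallies = np.bincount(idx, minlength=3)
    return {category: int(n) for category, n in zip(["s", "m", "h"], tallies)}


print(count_categories(pd.Series([50, 100, 249, 250, 300]), "pm25"))
# → {'s': 1, 'm': 2, 'h': 2}
```

Keeping the limits in one table makes it easier to review them against the intended air-quality bands and to add a pollutant without another `elif` branch.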

USAGE:

summary_plot(df)

OUTPUT:

Tanvi-Jain01 added 9 commits June 30, 2023 08:59

enhanced code of scatterPlot(refer issue sustainability-lab#43)

03f403a

timplot: modifying plots using plotly and

a2ea7c7

Adding visualization using plotly

d7d2e6b

modifying the code of googleMaps

fd7c45c

modifying googlemaps sustainability-lab#38

5b48d31

Commit message: timeplot using plotly and subplot error solved

043b730

googleMaps code enhanced and errors solved

273342d

code extended with group and time_period

4087590

applied typehints and camelcase

b53edd1
