API migration scripts changes
- Condensing 12 Python scripts into 4 with more distinctive purposes
- One bash script that calls them and deletes unnecessary CSVs
- Pending: Update README, setup script
niccolopaganini authored Sep 22, 2023
1 parent 03f4b71 commit 58fdf6a
Showing 7 changed files with 405 additions and 0 deletions.
45 changes: 45 additions & 0 deletions bin/API_migration_scripts/API_cf.sh
@@ -0,0 +1,45 @@
#!/bin/bash
read -p "Enter the URL which contains the API changes: " url

python c1_m2_u3_s4.py "$url"
sleep 4
python o5_up6.py
sleep 10
python l7_sl8_cs9_cswc10.py "$url"
sleep 4
python me11_mo12.py

# Delete the intermediate CSVs once the pipeline has finished
for f in classes.csv modified_csv.csv unique_classes.csv simplify_classes.csv \
    unique_packages.csv links.csv simplified_links.csv changed_scraped.csv \
    changed_scraped_without_column.csv; do
    rm -f "$f"
done
Binary file added bin/API_migration_scripts/CSVs.png
83 changes: 83 additions & 0 deletions bin/API_migration_scripts/README.md
@@ -0,0 +1,83 @@

# ReadMe - API Migration Automation script

This README explains how to run the API Migration Automation scripts. The pipeline searches all directories for plugins and then stores the changed methods & fields in a CSV. It DOES NOT change code; it automates the process of finding the packages that require attention during the API migration.


## File(s) Description

### Python Scripts
The twelve logical steps below are condensed into four scripts; each file name encodes the steps it performs (e.g. `c1_m2_u3_s4.py` covers steps 1-4).

| Step | Description |
| ------------- | ------------- |
| classes | scrapes packages/classes |
| modified_csv | removes unnecessary info |
| unique_classes | compiles a unique list from the above |
| simplify_classes | removes unnecessary info from the above |
| output | compares the above CSV with all the plugins from the given directory |
| unique_packages | removes the path column and compiles a unique list |
| links | scrapes similarly to classes but stores the results as URLs |
| simplified_links | removes unnecessary info |
| changed_scraped | gets the methods with their URLs and descriptions |
| changed_scraped_without_column | filters down to the methods |
| matching_entries | gets the matching packages |
| merged_output | gets the name of the packages, file location, and method name (as a whole) |

The above Python scripts create the following CSVs.

### CSVs

Each step writes a CSV named after the step that produces it (see the table above for what each one contains). The shell script deletes the intermediate CSVs once the pipeline finishes; the end product is `merged_output.csv`.

#### Image 1 Below

![CSV_Files_Importance](https://github.com/niccolopaganini/e-mission-phone-nvsr-iter1/blob/16b6ba09b3e6dc37e0927b6ff338400e3236e28b/bin/API_migration_scripts/screenshots/CSVs.png)

## Setup / How to run

### Setup
__1. Navigate to the directory__
```
.../e-mission-phone/bin/API_migration_scripts
```

**2. Grab API changes' link**

Copy the link to the page that lists the API changes.
_The link should look something like this:_
```
https://developer.android.com/sdk/api_diff/33/changes/alldiffs_index_changes
```

__3. Run the shell script, which in turn executes the Python scripts__
```
bash API_cf.sh
```
## Expected Output
The expected output is a CSV file (```merged_output.csv```). Opened in Excel, each row is in the following format:
```
<package/ class> | <File Location> | <Method name/ link>
```

For example:
```
content | /Users/nseptank/Downloads/e-mission-phone/plugins/cordova-plugin-file/src/android/FileUtils.java | ../../../../reference/android/companion/CompanionDeviceManager.html#hasNotificationAccess(android.content.ComponentName)
```
I formatted it this way because the method carries its "full extension": keeping the directory location in the middle lets one scan the classes on the left for a high-level view and scroll right for more detail (a quality-of-life choice from my perspective).
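
For instance, a minimal sketch of filtering the output by package with pandas (assuming pandas is installed and `merged_output.csv` is in the working directory):
```python
import pandas as pd

# merged_output.csv is written without a header row; name the columns here
df = pd.read_csv("merged_output.csv", header=None,
                 names=["Package", "Location", "Method"])

# e.g. every changed method that touches the "content" package
print(df[df["Package"] == "content"]["Method"].to_string(index=False))
```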

#### _Reasons why I didn't print the changes_
1. The whole process took about 4 minutes to run on average.
2. Quality of life - a CSV looks more presentable and is easier to understand.
3. The migration work itself spans much longer, and this script only has to be run once; after that you are just referring to the CSV.
82 changes: 82 additions & 0 deletions bin/API_migration_scripts/c1_m2_u3_s4.py
@@ -0,0 +1,82 @@
import csv
import requests
import bs4
import sys

def main():
    if len(sys.argv) != 2:
        print("Usage: python c1_m2_u3_s4.py <URL>")
        sys.exit(1)

    url = sys.argv[1]
    response = requests.get(url)
    soup = bs4.BeautifulSoup(response.content, "html.parser")

    # Collect the links to the per-class change pages
    links = []
    for a in soup.find_all("a"):
        if a.has_attr("href") and a["href"].startswith("/sdk/api_diff/33/changes/"):
            links.append(a["href"])

    # Write the raw package/class names
    with open("classes.csv", "w", newline="") as csv_file:
        csv_writer = csv.DictWriter(csv_file, fieldnames=["Package Class"])
        csv_writer.writeheader()
        for link in links:
            css_class = link.split("/")[-1]
            csv_writer.writerow({"Package Class": css_class})

    # Modify the CSV: strip anchors and stray punctuation
    with open("classes.csv", "r") as f:
        reader = csv.DictReader(f)
        with open("modified_csv.csv", "w", newline="") as fw:
            writer = csv.DictWriter(fw, fieldnames=["Package Class"])
            writer.writeheader()
            for row in reader:
                class_name = row["Package Class"]
                class_name = class_name.split("#")[0]
                class_name = class_name.lstrip(".")
                class_name = class_name.strip("[]")
                class_name = class_name.rstrip(".")
                row["Package Class"] = class_name
                writer.writerow(row)

    # Create unique classes
    with open("modified_csv.csv", "r") as f:
        classes = set()
        for line in f:
            class_name = line.strip().split(",")[0]
            classes.add(class_name)

    with open("unique_classes.csv", "w", newline="") as fw:
        writer = csv.writer(fw)
        for class_name in classes:
            writer.writerow([class_name])

    # Simplify classes: keep only the text after the last "."
    def simplify(csv_file, output_file):
        with open(csv_file, "r") as f:
            reader = csv.reader(f)
            lines = []
            for row in reader:
                new_row = []
                for item in row:
                    end_index = item.rfind(".")
                    if end_index == -1:
                        new_row.append(item)
                    else:
                        new_row.append(item[end_index + 1:])
                lines.append(new_row)

        with open(output_file, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerows(lines)

    simplify("unique_classes.csv", "simplify_classes.csv")

if __name__ == "__main__":
    main()
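
For reference, a toy illustration of the simplify step above (the input value is hypothetical):
```python
# Keep only the text after the last "." in a fully qualified name
item = "android.content.ContentResolver"
end_index = item.rfind(".")
simplified = item if end_index == -1 else item[end_index + 1:]
print(simplified)  # ContentResolver
```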
91 changes: 91 additions & 0 deletions bin/API_migration_scripts/l7_sl8_cs9_cswc10.py
@@ -0,0 +1,91 @@
import csv
import requests
import bs4
import sys

# Define functions for each part of the original scripts

def get_links(url):
    """Collect every href on the API-diff index page."""
    response = requests.get(url)
    soup = bs4.BeautifulSoup(response.content, "html.parser")

    links = []
    for a in soup.find_all("a"):
        if "href" in a.attrs:
            links.append(a["href"])

    return links

def filter_links(links, output_file):
    """Keep only the /sdk change-page links and save them as one column."""
    filtered_links = [link for link in links if link.startswith("/sdk")]

    with open(output_file, "w", newline="") as f:
        csvwriter = csv.writer(f)
        csvwriter.writerows([[link] for link in filtered_links])

def get_changed_content(url):
    """Scrape the 'Changed Methods'/'Changed Fields' tables on a change page."""
    response = requests.get(f"https://developer.android.com{url}")

    if response.status_code != 200:
        return []

    soup = bs4.BeautifulSoup(response.content, "html.parser")
    tables = soup.find_all("table", summary=lambda s: s and ("Changed Methods" in s or "Changed Fields" in s))

    contents = []
    for table in tables:
        for row in table.find_all("tr"):
            cells = row.find_all("td")
            if len(cells) == 3:
                method_name = cells[0].find("a", href=True)
                if method_name:
                    method_url = method_name["href"]
                    method_name_text = method_name.text
                    description = cells[1].text.strip()

                    contents.append([method_name_text, method_url, description])

    return contents

def main():
    if len(sys.argv) != 2:
        print("Usage: python l7_sl8_cs9_cswc10.py <URL>")
        sys.exit(1)

    url = sys.argv[1]

    # Get links from the URL
    links = get_links(url)

    # Filter and save relevant links
    filter_links(links, "simplified_links.csv")

    # Scrape changed content and save it under one header
    scraped_content_file = "changed_scraped.csv"
    with open(scraped_content_file, "w", newline="") as f:
        csv.writer(f).writerow(["Method Name", "Method URL", "Description"])

    for link in links:
        changed_content = get_changed_content(link)
        if changed_content:
            with open(scraped_content_file, "a", newline="") as f:
                csv.writer(f).writerows(changed_content)

    # Keep only the Method URL column from the scraped content
    with open(scraped_content_file, "r") as input_file:
        data = [row for row in csv.reader(input_file)]

    new_data = [[row[1]] for row in data]

    with open("changed_scraped_without_column.csv", "w", newline="") as output_file:
        csv.writer(output_file).writerows(new_data)

if __name__ == "__main__":
    main()
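
A minimal sketch of the `summary=lambda ...` attribute filter the scraper relies on, run against toy markup (the HTML here is hypothetical):
```python
import bs4

# A toy table carrying the same summary attribute the scraper matches on
html = '<table summary="Changed Methods"><tr><td>a</td><td>b</td><td>c</td></tr></table>'
soup = bs4.BeautifulSoup(html, "html.parser")

# find_all accepts a callable as an attribute filter; it receives the
# attribute value, or None when the attribute is absent
tables = soup.find_all("table", summary=lambda s: s and "Changed Methods" in s)
print(len(tables))  # 1
```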
46 changes: 46 additions & 0 deletions bin/API_migration_scripts/me11_mo12.py
@@ -0,0 +1,46 @@
import csv
import re
import pandas as pd

try:
    # Load unique packages into a set
    unique_packages = set()
    with open("unique_packages.csv", "r") as unique_file:
        reader = csv.reader(unique_file)
        for row in reader:
            if row:
                unique_packages.add(row[0].strip())  # Remove leading/trailing whitespace

    # Load the scraped URLs and find entries that mention a unique package
    matching_entries = []
    with open("changed_scraped_without_column.csv", "r") as scraped_file:
        reader = csv.reader(scraped_file)
        for row in reader:
            if row:
                url = row[0]
                # Extract words from the URL using a regular expression
                words = re.findall(r"\w+", url)
                # Keep packages whose name appears as a whole word in the URL
                matching_packages = [package for package in unique_packages if package in words]
                if matching_packages:
                    matching_entries.append([", ".join(matching_packages), url])

    # Write the matching entries to a new CSV file
    with open("matching_entries.csv", "w", newline="") as matching_file:
        writer = csv.writer(matching_file)
        writer.writerow(["Matching Packages", "Matching Content"])
        writer.writerows(matching_entries)

    # Merge the output CSV and the matching entries CSV on the package name;
    # skiprows=1 drops the header row written above so it is not treated as data
    output_df = pd.read_csv("output.csv", header=None)
    matching_entries_df = pd.read_csv("matching_entries.csv", header=None, skiprows=1)

    merged_df = output_df.merge(matching_entries_df, left_on=0, right_on=0, how="inner")

    merged_df.columns = ["Package", "Location", "Description"]
    merged_df.drop_duplicates(inplace=True)

    merged_df.to_csv("merged_output.csv", index=False, header=False)

except Exception as e:
    print(f"An error occurred: {str(e)}")