Commit

log reset added
robjharrison committed May 31, 2024
1 parent cc16bca commit 8317ef2
Showing 5 changed files with 4,099 additions and 10 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/static.yml
@@ -1,8 +1,8 @@
 # Simple workflow for deploying static content to GitHub Pages
-name: Deploy static content to Pages
+# name: Deploy static content to Pages

-# Run events
-on:
+# # Run events
+# on:
 # # On push
 # push:
 # branches: ["main"]
2 changes: 1 addition & 1 deletion index.html
@@ -26,7 +26,7 @@
<h1>Ofsted ILACS Summary</h1>
<p>Summarised outcomes of published short and standard ILACS inspection reports by Ofsted, refreshed daily.<br/>An expanded version of the shown summary sheet, refreshed concurrently, is available to <a href='ofsted_childrens_services_overview.xlsx'>download here</a> as an .xlsx file. <br/>Data summary is based on the original <i>ILACS Outcomes Summary</i> published periodically by the ADCS:<a href='https://adcs.org.uk/inspection/article/ilacs-outcomes-summary'>https://adcs.org.uk/inspection/article/ilacs-outcomes-summary</a>.</p>
<p>Disclaimer: This summary is built from scraped data direct from https://reports.ofsted.gov.uk/ and the published PDF inspection report files. As a result of the nuances|variance within the inspection report content or pdf encoding, we're noting some problematic data extraction for a small number of LA's*.<br/> <a href="mailto:[email protected]?subject=Ofsted-Scrape-Tool">Feedback</a> on specific problems|inaccuracies|suggestions welcomed.<br/>**LA reports with issues: southend-on-sea, [overall, help_and_protection_grade,care_leavers_grade], nottingham,[inspection_framework, inspection_date], redcar and cleveland,[inspection_framework, inspection_date], knowsley,[inspector_name], stoke-on-trent,[inspector_name]</p>
-<p><b>Summary data last updated: 29 05 2024 13:50</b></p>
+<p><b>Summary data last updated: 31 05 2024 07:19</b></p>
<p><b>LA inspections last updated: []</b></p>
<div class="container">
<table border="1" class="dataframe">
19 changes: 13 additions & 6 deletions ofsted_childrens_services_inspection_scrape.py
@@ -76,7 +76,6 @@
 from datetime import datetime
 import nltk
 import json
-import logging
 import git

 # pdf search/data extraction
@@ -105,17 +104,25 @@
 except ModuleNotFoundError:
     print("Please install 'scikit-learn' using pip")

+# Configure logging/logging module
+import warnings
+import logging
+
+# wipe / reset the logging file
+with open('output.log', 'w'):
+    # comment out if maintaining ongoing/historic log
+    pass
+
 # Keep warnings quiet unless priority
-import warnings
 logging.getLogger('org.apache.pdfbox').setLevel(logging.ERROR)
 warnings.filterwarnings('ignore')

-# Configure the logging module
-
 logging.basicConfig(filename='output.log', level=logging.INFO, format='%(asctime)s - %(message)s')

-nltk.download('punkt')
-nltk.download('stopwords')
+
+# text analysis libs
+nltk.download('punkt') # tokeniser models/sentence segmentation
+nltk.download('stopwords') # stop words ready for text analysis|NLP preprocessing



Binary file modified ofsted_childrens_services_overview.xlsx
Binary file not shown.
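
Note on the log reset in ofsted_childrens_services_inspection_scrape.py: the commit truncates output.log by opening it in write mode before logging.basicConfig runs. A possible alternative, offered only as a sketch and not as what the script does, is to let basicConfig handle the truncation through its filemode argument. The helper name configure_logging and its log_path and reset parameters are hypothetical and do not appear in the repository; force=True assumes Python 3.8 or later.

```python
import logging


def configure_logging(log_path: str, reset: bool = True) -> None:
    """Send root-logger output to log_path (hypothetical helper, not in the repo).

    reset=True opens the file in write mode, which truncates any earlier log.
    reset=False opens it in append mode, which keeps the ongoing/historic log.
    """
    logging.basicConfig(
        filename=log_path,
        filemode="w" if reset else "a",
        level=logging.INFO,
        format="%(asctime)s - %(message)s",
        force=True,  # Python 3.8+: replace handlers already attached to the root logger
    )


# Example usage (same file name and format string as the commit):
# configure_logging("output.log", reset=True)
# logging.info("scrape started")
```

Switching reset to False would play the role of the "comment out if maintaining ongoing/historic log" hint in the commit, without needing to edit the file-opening block.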