minor html reformatting
robjharrison committed Aug 14, 2024
1 parent 37c2d87 commit c4e0e38
Showing 4 changed files with 16 additions and 8 deletions.
4 changes: 2 additions & 2 deletions index.html
@@ -25,8 +25,8 @@
<body>
<h1>Ofsted CS JTAI Inpections Overview</h1>
<p>Summarised outcomes of published JTAI inspection reports by Ofsted, refreshed weekly.<br/>An expanded version of the shown summary sheet, refreshed concurrently, is available to <a href="ofsted_childrens_services_jtai_overview.xlsx">download here</a> as an .xlsx file. <br/>Data summary is based on the original <i>JTAI Outcomes Summary</i> published periodically by the ADCS: <a href="https://www.adcs.org.uk/inspection-of-childrens-services/">https://www.adcs.org.uk/inspection-of-childrens-services/</a>. <a href="https://github.com/data-to-insight/ofsted-ilacs-scrape-tool/blob/main/README.md">Read the source ILACS tool/project background details and future work.</a>.</p>
-<p>Disclaimer: This summary is built from scraped data direct from https://reports.ofsted.gov.uk/ published PDF inspection report files. As a result of the nuances|variance within the inspection report content or pdf encoding, we're noting some problematic data extraction for a small number of LAs*.<br/> *Known extraction issues: JTAI report structure varies pre|post 2023. ADCS published inspection Themes unavailable via current scrape process. Publication date is based on CSS tag data and may not always reflect actual report publication. Where 1+ case studies are reported on, only 1 is pulled through.<br/><a href="mailto:[email protected]?subject=Ofsted-Scrape-Tool">Feedback</a> on specific problems|inaccuracies|suggestions welcomed.*</p>
-<p><b>Summary data last updated: 13 08 2024 17:53</b></p>
+<p>Disclaimer: This summary is built from scraped data direct from https://reports.ofsted.gov.uk/ published PDF inspection report files.<br/>As a result of the nuances|variance within the inspection report content or pdf encoding, we're noting problematic data extraction for a small number of LAs*.<br/> *Known extraction issues: <ul><li>JTAI report structure varies pre|post 2023(?), hence sparse|mixed summary columns until improved|agreed approach finalised.</li><li>ADCS published inspection Themes unavailable via current scrape process. This being worked on currently.</li><li>Publication date, isn't available within inspection reports and is therefore based on CSS tag data and may not always reflect actual report publication.</li><li>Where 1+ case studies are reported on (e.g. Peterborough City), only 1 summary is pulled through.</li></ul><a href="mailto:[email protected]?subject=Ofsted-Scrape-Tool">Feedback</a> highlighting problems|inaccuracies|suggestions welcomed.</p>
+<p><b>Summary data last updated: 14 08 2024 09:36</b></p>
<p><b>LA inspections last updated: []</b></p>
<div class="container">
<table border="1" class="dataframe">
18 changes: 12 additions & 6 deletions ofsted_childrens_services_inspection_scrape.py
100644 → 100755
@@ -1035,10 +1035,15 @@ def save_to_html(data, column_order, local_link_column=None, web_link_column=Non
)

disclaimer_text = (
-'Disclaimer: This summary is built from scraped data direct from https://reports.ofsted.gov.uk/ published PDF inspection report files. '
-'As a result of the nuances|variance within the inspection report content or pdf encoding, we\'re noting some problematic data extraction for a small number of LAs*.<br/> '
-'*Known extraction issues: JTAI report structure varies pre|post 2023. ADCS published inspection Themes unavailable via current scrape process. Publication date is based on CSS tag data and may not always reflect actual report publication. Where 1+ case studies are reported on, only 1 is pulled through.<br/>'
-'<a href="mailto:[email protected]?subject=Ofsted-Scrape-Tool">Feedback</a> on specific problems|inaccuracies|suggestions welcomed.*'
+'Disclaimer: This summary is built from scraped data direct from https://reports.ofsted.gov.uk/ published PDF inspection report files.<br/>'
+'As a result of the nuances|variance within the inspection report content or pdf encoding, we\'re noting problematic data extraction for a small number of LAs*.<br/> '
+'*Known extraction issues: <ul>'
+'<li>JTAI report structure varies pre|post 2023(?), hence sparse|mixed summary columns until improved|agreed approach finalised.</li>'
+'<li>ADCS published inspection Themes unavailable via current scrape process. This being worked on currently.</li>'
+'<li>Publication date, isn\'t available within inspection reports and is therefore based on CSS tag data and may not always reflect actual report publication.</li>'
+'<li>Where 1+ case studies are reported on (e.g. Peterborough City), only 1 summary is pulled through.</li>'
+'</ul>'
+'<a href="mailto:[email protected]?subject=Ofsted-Scrape-Tool">Feedback</a> highlighting problems|inaccuracies|suggestions welcomed.'
)

# # testing
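
Note on the disclaimer hunk above: the known issues are now written out as individual `<li>` string literals. A minimal sketch of one alternative, assuming the issues were instead held as a plain Python list and rendered in one place. `KNOWN_ISSUES`, `build_issue_list` and `disclaimer_fragment` are hypothetical names, not part of the repository, and only two of the four issues are repeated here for brevity.

```python
from html import escape

# Hypothetical container for the known issues; the wording is copied from the commit.
KNOWN_ISSUES = [
    "ADCS published inspection Themes unavailable via current scrape process. "
    "This being worked on currently.",
    "Where 1+ case studies are reported on (e.g. Peterborough City), "
    "only 1 summary is pulled through.",
]


def build_issue_list(issues):
    """Return a <ul> fragment with one <li> per issue.

    Each issue is HTML-escaped (quotes left alone) so that characters such as
    '&' or '<' in the text cannot break the surrounding markup.
    """
    items = "".join(f"<li>{escape(issue, quote=False)}</li>" for issue in issues)
    return f"<ul>{items}</ul>"


# Assumed usage: splice the rendered list into the disclaimer text.
disclaimer_fragment = "*Known extraction issues: " + build_issue_list(KNOWN_ISSUES)
```

With this shape, adding or rewording an issue touches one list entry rather than the markup. One thing that may be worth checking separately: HTML does not allow a `<ul>` inside a `<p>`, so browsers will close the paragraph before the list when this text is wrapped in `<p>...</p>` as in index.html.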
@@ -1069,8 +1074,9 @@ def save_to_html(data, column_order, local_link_column=None, web_link_column=Non
# # If a web link column is specified, convert that column's values to HTML hyperlinks
# # Shortening the hyperlink text by taking the part after the last '/'
if web_link_column:
-data[web_link_column] = data[web_link_column].apply(lambda x: f'<a href="{x}">ofsted.gov.uk/{x.rsplit("/", 1)[-1]}</a>') # publ_date
-# if web_link_column:
+data[web_link_column] = data[web_link_column].apply(lambda x: f'<a href="{x}">ofsted.gov.uk/{x.rsplit("/", 1)[-1]}</a>')
+
+# if web_link_column: # if the link is a bytes obj, this might be problematic
# data[web_link_column] = data[web_link_column].apply(lambda x: f'<a href="{x}">ofsted.gov.uk/{x.rsplit("/", 1)[-1]}</a>' if isinstance(x, str) else x) # publ_date

# Convert column names to title/upper case
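
Note on the web link hunk above: the active lambda calls `x.rsplit` on every value, which will raise if a cell is `None`, NaN or a bytes object (the case the new comment worries about). A minimal sketch of the guarded variant that the commented-out line hints at, written as a named function. `to_short_link` is a hypothetical name and the example URL is made up for illustration.

```python
def to_short_link(url):
    """Wrap a URL string in an anchor tag, shortening the visible text.

    The link text is the part after the last '/', prefixed with 'ofsted.gov.uk/'.
    Non-string values (None, NaN, bytes) are returned unchanged rather than
    raising, mirroring the isinstance guard in the commented-out line.
    """
    if not isinstance(url, str):
        return url
    return f'<a href="{url}">ofsted.gov.uk/{url.rsplit("/", 1)[-1]}</a>'


# Made-up example values: one string URL, one missing value, one bytes object.
example_links = ["https://reports.ofsted.gov.uk/provider/44/12345", None, b"raw"]
converted = [to_short_link(value) for value in example_links]
```

Assuming `data` is a pandas DataFrame as in `save_to_html`, the function could be applied as `data[web_link_column].apply(to_short_link)`. Whether silently passing non-strings through is preferable to failing loudly is a design choice for the project.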
Binary file modified ofsted_childrens_services_jtai_overview.xlsx
Binary file not shown.
2 changes: 2 additions & 0 deletions setup.sh
@@ -6,6 +6,8 @@ sudo apt-get update
# Install Python dependencies
pip install -r requirements.txt

+# Ensure run permissions on scrape script
+chmod +x ofsted_childrens_services_inspection_scrape.py


# Install the Python extension for Visual Studio Code
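
Note on the setup.sh hunk above: the added `chmod +x` matches the file mode change recorded for the scrape script in this commit (100644 → 100755). A minimal sketch, assuming one wanted to apply or verify the same permission from Python instead of the shell. `make_executable` and `is_executable` are hypothetical helper names, and unlike the shell command this version ignores the umask.

```python
import os
import stat


def make_executable(path):
    """Add execute permission for user, group and others (roughly `chmod +x`)."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def is_executable(path):
    """Return True if the owner execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```

A call such as `make_executable("ofsted_childrens_services_inspection_scrape.py")` would then stand in for the shell line on POSIX systems.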