Commit

Data refresh 231224
robjharrison committed Dec 23, 2024
1 parent bbb3699 commit 9cd3fb9
Showing 3 changed files with 3 additions and 4 deletions.
4 changes: 2 additions & 2 deletions index.html
@@ -26,8 +26,8 @@
<h1>Ofsted CS JTAI Inspections Overview</h1>
<p>Summarised outcomes of published JTAI inspection reports by Ofsted, refreshed weekly.<br/>An expanded version of the shown summary sheet, refreshed concurrently, is available to <a href="ofsted_childrens_services_jtai_overview.xlsx">download here</a> as an .xlsx file. <br/>Data summary is based on the original <i>JTAI Outcomes Summary</i> published periodically by the ADCS: <a href="https://www.adcs.org.uk/inspection-of-childrens-services/">https://www.adcs.org.uk/inspection-of-childrens-services/</a>. </p>
<p>Disclaimer: This summary is built from scraped data direct from https://reports.ofsted.gov.uk/ published PDF inspection report files.<br/><br/>Nuanced|variable inspection report content, structure and pdf encoding occasionally results in problematic data extraction for a small number of LAs.<br/> Known extraction issues: <ul><li>JTAI report structure varies pre|post 2023(?), resulting in sparse|mixed summary columns. [In development].</li><li>ADCS published inspection Themes unavailable via current scrape process. [In development].</li><li>Publication date, isn't available within inspection reports and is therefore based on CSS tag data and may not always reflect actual report publication.</li><li>Where 1+ case studies are reported on (e.g. Peterborough City), only 1 summary is pulled through.</li></ul><a href="mailto:[email protected]?subject=Ofsted-JTAI-Scrape-Tool">Feedback</a> highlighting problems|inaccuracies|suggestions welcomed.<a href="https://github.com/data-to-insight/ofsted-ilacs-scrape-tool/blob/main/README.md">Read the source ILACS tool/project for background details and future work.</a>.</p>
-<p><b>Summary data last updated: 16 12 2024 13:45</b></p>
-<p><b>LA inspections last updated: ['80475_hertfordshire/joint area child protection inspection - 12 december 2024']</b></p>
+<p><b>Summary data last updated: 23 12 2024 11:29</b></p>
+<p><b>LA inspections last updated: []</b></p>
<div class="container">
<table border="1" class="dataframe">
<thead>
3 changes: 1 addition & 2 deletions ofsted_childrens_services_inspection_scrape.py
@@ -263,7 +263,6 @@ def extract_dates_from_text(text):
# ..
"""
# print("Debug: Starting date extraction")

if not text:
print("Debug: Input text is empty or None.")
@@ -851,7 +850,7 @@ def process_provider_links(provider_links):
# get/format date(s) (as dt objects)
## revised to capture 'actual' published date from css tag data
report_published_date = format_date(report_published_date_str, '%d %B %Y', '%d/%m/%y')
-print(f"Debug: Report Published Date: {report_published_date} was {publication_date}")
+# print(f"Debug: Report Published Date: {report_published_date} was {publication_date}")


# Now get the in-document data
Binary file modified ofsted_childrens_services_jtai_overview.xlsx
Binary file not shown.