5 summarise clusters #6

Merged: 29 commits, Nov 12, 2024
Changes from 23 commits
Commits
896906c
Save topic info and produce names and descriptions for clusters
RFOxbury Oct 8, 2024
866bcb2
Lint files
RFOxbury Oct 8, 2024
3f104f9
Add notebook on semantic chunking
RFOxbury Oct 9, 2024
98b08c2
Check buffer sizes
RFOxbury Oct 9, 2024
b490c46
Merge branch '7-semantic-chunking' into 5-summarise-clusters
RFOxbury Oct 10, 2024
ab0be63
Expand the pipeline:
RFOxbury Oct 11, 2024
1b03982
Lint files
RFOxbury Oct 11, 2024
756eac0
Link interview prompt questions in app
RFOxbury Oct 11, 2024
ae5bec7
Reorganise pipeline:
RFOxbury Oct 22, 2024
8030d22
Small updates
RFOxbury Oct 30, 2024
94e28e4
Rename app
RFOxbury Oct 31, 2024
05b9b4b
Rename again because I was on the wrong branch
RFOxbury Oct 31, 2024
52221ba
Merge branch 'dev' into 5-summarise-clusters
RFOxbury Nov 4, 2024
8a6315a
Add csvs to gitignore
RFOxbury Nov 5, 2024
b4a7353
Add poetry.lock to gitignore
RFOxbury Nov 6, 2024
6e469ff
Add types and docstrings
RFOxbury Nov 6, 2024
fc44cfd
Update readme
RFOxbury Nov 6, 2024
37bd674
Untrack poetry lock
RFOxbury Nov 6, 2024
6042f3d
Add dependencies and remove unused scripts
RFOxbury Nov 7, 2024
af28698
Add readme and add small fixes
RFOxbury Nov 7, 2024
55a5434
Remove error saving
RFOxbury Nov 11, 2024
d5b6fd3
Add ollama instructions to readme
RFOxbury Nov 11, 2024
512f2e8
adding tests
beingkk Nov 12, 2024
9d2c07c
Merge branch '5-summarise-clusters' of https://github.com/nestauk/dsp…
beingkk Nov 12, 2024
71b77ce
Merge branch '5-summarise-clusters' of github.com:nestauk/dsp_intervi…
RFOxbury Nov 12, 2024
5230b9d
Remove deprecated script
RFOxbury Nov 12, 2024
4e5cbf3
Add chaining for readability
RFOxbury Nov 12, 2024
39c1eb9
Update dsp_interview_transcripts/pipeline/name_clusters.py
RFOxbury Nov 12, 2024
61e9751
Update dsp_interview_transcripts/pipeline/README.md
RFOxbury Nov 12, 2024
8 changes: 7 additions & 1 deletion .gitignore
@@ -99,7 +99,8 @@ ipython_config.py
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
@@ -171,3 +172,8 @@ outputs/*
!/outputs/*.md
# DS store in any folder
**/.DS_Store

# data files
*.csv
poetry.lock
poetry.lock
48 changes: 24 additions & 24 deletions .pre-commit-config.yaml
@@ -44,12 +44,12 @@ repos:
types: [file, python]
stages: [commit]

- id: flake8
name: Run Flake8
entry: poetry run pflake8
language: system
types: [file, python]
stages: [commit]
# - id: flake8
# name: Run Flake8
# entry: poetry run pflake8
# language: system
# types: [file, python]
# stages: [commit]

- id: yamllint
name: Run Yamllint
@@ -58,21 +58,21 @@ repos:
types: [file, yaml]
stages: [commit]

- id: bandit
name: Run Bandit
entry: poetry run bandit
language: system
types: [file, python]
args:
[
--configfile,
pyproject.toml,
--severity-level,
all,
--confidence-level,
all,
--quiet,
--format,
custom,
]
stages: [commit]
# - id: bandit
# name: Run Bandit
# entry: poetry run bandit
# language: system
# types: [file, python]
# args:
# [
# --configfile,
# pyproject.toml,
# --severity-level,
# all,
# --confidence-level,
# all,
# --quiet,
# --format,
# custom,
# ]
# stages: [commit]
12 changes: 7 additions & 5 deletions dsp_interview_transcripts/__init__.py
@@ -1,28 +1,30 @@
import logging
import os
from pathlib import Path
from typing import Optional
import yaml

from pathlib import Path
from typing import Optional

import dotenv
import yaml


dotenv.load_dotenv()


def get_yaml_config(file_path: Path) -> Optional[dict]:
    """Fetch yaml config and return as dict if it exists."""
    if file_path.exists():
        with open(file_path, "rt") as f:
            return yaml.load(f.read(), Loader=yaml.FullLoader)


# Define project base directory
PROJECT_DIR = Path(__file__).resolve().parents[1]

# Define logger
# Define logger
logging.basicConfig(level=logging.INFO, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s")
logger = logging.getLogger(__name__)

# base/global config
_base_config_path = Path(__file__).parent.resolve() / "config/base.yaml"
config = get_yaml_config(_base_config_path)
config = get_yaml_config(_base_config_path)
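
Review note: `get_yaml_config` is typed `Optional[dict]` and falls through to `None` when the file is missing, so a later `config["questions"]` would raise a `TypeError` if `config/base.yaml` were absent. The sketch below is one possible way a caller could guard against that, not the code merged in this PR. `load_questions` is a hypothetical helper, the temporary YAML file stands in for `base.yaml`, and `yaml.safe_load` is swapped in for `FullLoader` as an assumption that the config only holds plain YAML types.

```python
import tempfile
from pathlib import Path
from typing import Optional

import yaml


def get_yaml_config(file_path: Path) -> Optional[dict]:
    """Mirror of the helper in the diff: returns None when the file is missing."""
    if file_path.exists():
        with open(file_path, "rt") as f:
            # safe_load is a substitution made for this sketch; the diff uses FullLoader.
            return yaml.safe_load(f)
    return None


def load_questions(file_path: Path) -> list[str]:
    """Hypothetical helper: return the `questions` list, or [] if the config is absent."""
    config = get_yaml_config(file_path)
    if config is None:
        return []
    return config.get("questions", [])


# Stand-in for config/base.yaml, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "base.yaml"
    path.write_text('---\nquestions: ["First question?",\n  "Second question?"]\n')
    found = load_questions(path)
    missing = load_questions(Path(tmp) / "absent.yaml")
```

Returning an empty list keeps downstream loops over the questions safe. Raising a clear error instead would be an equally reasonable choice if a missing config should stop the pipeline.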
22 changes: 11 additions & 11 deletions dsp_interview_transcripts/config/base.yaml
@@ -1,12 +1,12 @@

---
questions: ["What, if anything, do you know about the Boiler Upgrade Scheme? If you don't know anything about the scheme, just give it your best guess.",
"What, if anything, do you know about the process of applying for Boiler Upgrade Scheme funding?",
"What do you think are the eligibility requirements for someone to use this scheme?",
"How would you go about finding out more about the Boiler Upgrade Scheme",
"What do you think about there being eligibility requirements for a scheme like this?",
"As a homeowner, where do you see yourself in relation to the eligibility requirements?",
"What, if any, type of work do you think needs to be done to a house to replace fossil fuel heating systems?",
"What types of home upgrades would you consider getting done to your house to improve the efficiency of your heating system?",
"What types of work to your house wouldn't you consider?",
"What are some energy-efficient heating systems that you could consider, apart from the one currently in use at your home?",
"Is there anything we've talked about you'd like to discuss further?"]
"What, if anything, do you know about the process of applying for Boiler Upgrade Scheme funding?",
"What do you think are the eligibility requirements for someone to use this scheme?",
"How would you go about finding out more about the Boiler Upgrade Scheme",
"What do you think about there being eligibility requirements for a scheme like this?",
"As a homeowner, where do you see yourself in relation to the eligibility requirements?",
"What, if any, type of work do you think needs to be done to a house to replace fossil fuel heating systems?",
"What types of home upgrades would you consider getting done to your house to improve the efficiency of your heating system?",
"What types of work to your house wouldn't you consider?",
"What are some energy-efficient heating systems that you could consider, apart from the one currently in use at your home?",
"Is there anything we've talked about you'd like to discuss further?"]
@@ -242,7 +242,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.10"
"version": "3.11.4"
}
},
"nbformat": 4,