🗞️ Extra extra! This is Finite News: the mindful, personalized newspaper.
I happily pay for subscriptions to quality news sources and support essential journalism! But increasingly news websites and newsletters are filled with clickbait, pop-ups, and attention vampires.
I made Finite News to deliver a lean, personalized daily news email. Its goal is to reduce distractions and focus on what's happening in the world.
Finite News can...
- Give you the day's headlines from your trusted APIs, feeds, and websites.
- Enforce strict limits on the volume of news.
- Leave out ads and links.
- Apply rules and large language models (LLMs) to remove opinions and clickbait, consolidate related headlines, and only show news you haven't seen before.
- Forecast your local weather.
- Get you the latest XKCD comic and James Webb photo.
- Deliver custom alerts, like when your favorite team plays tonight.
- List upcoming events of interest to you.
Finite News is Python code that's set up as a Google Cloud Run job. It could be run locally as a cron job or deployed on other platforms, too.
- Publication: The general processes that are shared by every issue and subscription.
- Subscription: The customizations that personalize Finite News for a single person (subscriber).
- Issue: One email delivered to one subscriber.
- Set up the local code environment on your computer or server. This is where you'll work on the newspaper and deploy it to the cloud.
- Clone this repo as a directory locally.
- Install uv on your computer.
- Run `uv sync` to create the virtual environment.
- Run `uv run pre-commit install`.
- Configure your newspaper (see "Designing your newspaper" section).
- Set up a Google Cloud account.
- You can follow the general setup steps for a Google Cloud job, like the early parts of this quickstart.
- Create a new Google Cloud project for Finite News.
- Install the gcloud command line utility on your computer.
- Make a new Google Cloud bucket. Add the files you created in "Designing your newspaper".
- Create an account on sendgrid.com. This lets you send the emails (via an API).
- Store your secrets.
- You'll need the following secrets:
- `SENDGRID_API_KEY`
- `FN_BUCKET_NAME`
- (Optional) `OPENAI_API_KEY`, if you optionally add GPT to filter headlines (see below).
- Any API keys you signed up for to access custom news sources.
- Store each secret in two places:
- In Google Cloud Secret Manager. Once deployed as a Cloud Run Job, we'll expose these secrets as environment variables.
- As local environment variables on your computer (e.g. in `.zshrc`: `export SECRET_NAME="secret_value"`).
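Since the secrets end up as environment variables both locally and in Cloud Run, a small check at startup can catch a missing one early. The sketch below is an illustration, not the project's actual code: the function name `load_secrets` and the fake values are assumptions, and only the secret names come from the list above.

```python
import os
from typing import Mapping

# Secret names taken from the setup steps above.
REQUIRED_SECRETS = ("SENDGRID_API_KEY", "FN_BUCKET_NAME")
OPTIONAL_SECRETS = ("OPENAI_API_KEY",)


def load_secrets(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect secrets from environment variables (hypothetical helper).

    Raises RuntimeError if any required secret is missing or empty.
    """
    missing = [name for name in REQUIRED_SECRETS if not environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required secrets: {', '.join(missing)}")
    secrets = {name: environ[name] for name in REQUIRED_SECRETS}
    # Optional secrets are included only when they are set.
    for name in OPTIONAL_SECRETS:
        if environ.get(name):
            secrets[name] = environ[name]
    return secrets


# Example with a fake environment (placeholder values, not real keys).
example = load_secrets({"SENDGRID_API_KEY": "sg-123", "FN_BUCKET_NAME": "my-bucket"})
print(sorted(example))
```

Passing the environment in as a parameter keeps the check easy to try out without touching your real shell variables.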
- (Optional) Download a free language model to enable the Smart Deduper.
- The Smart Deduper removes headlines that are similar to others in the same issue. It uses a language model to measure the similarity (in meaning) of headlines.
- I like to use the model `paraphrase-multilingual-MiniLM-L12-v2`. It works in multiple languages.
- But you can use another model supported by https://huggingface.co/sentence-transformers
- Download the language model to the project folder. Here's one way:
- Run the following block of code in this project's virtual environment. You can plop it in a cell in the Jupyter notebook `dev.ipynb`.

```python
from sentence_transformers import SentenceTransformer

# Instantiating the model downloads its files to the local cache.
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
```

- That will download all the model files to a central location on your computer: `~/.cache/hub/sentence-transformers/{MODEL}`. Example: `~/.cache/hub/sentence-transformers/models--sentence-transformers--paraphrase-multilingual-MiniLM-L12-v2`
- Move that `{MODEL}` subdirectory into the root Finite News project folder, under `models/smart-deduper/{MODEL}`.
- In `publication_config.yml`, specify the path to the model in the Finite News project folder, specifically the `snapshots/{HASH}` folder inside it.
- Example: `path_to_model: "models/smart-deduper/models--sentence-transformers--paraphrase-multilingual-MiniLM-L12-v2/snapshots/8d6b950845285729817bf8e1af1861502c2fed0c"`
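If you'd rather not dig through the folder by hand to find the `{HASH}`, a few lines of Python can locate it. This is a minimal sketch that assumes the `models/smart-deduper/{MODEL}/snapshots/{HASH}` layout described above; the helper name `find_snapshot_path` is made up for illustration and is not part of Finite News.

```python
from pathlib import Path


def find_snapshot_path(model_dir: Path) -> Path:
    """Return the single snapshots/{HASH} folder inside a downloaded model directory."""
    snapshots_dir = model_dir / "snapshots"
    if not snapshots_dir.is_dir():
        raise FileNotFoundError(f"No snapshots folder found in {model_dir}")
    snapshots = sorted(p for p in snapshots_dir.iterdir() if p.is_dir())
    if len(snapshots) != 1:
        raise ValueError(f"Expected exactly one snapshot in {model_dir}, found {len(snapshots)}")
    return snapshots[0]


# Run from the project root after moving the model folder into place.
model_dir = Path(
    "models/smart-deduper/models--sentence-transformers--paraphrase-multilingual-MiniLM-L12-v2"
)
if model_dir.exists():
    # Prints a line you can paste into publication_config.yml.
    print(f'path_to_model: "{find_snapshot_path(model_dir).as_posix()}"')
```

The helper refuses to guess when it finds zero or several snapshot folders, since in that case you need to pick the path yourself.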
- (Optional) Create an API account on openai.com to use GPT to remove low-quality headlines (clickbait, etc.).
- 💁♂️ Tip: I find that manual rules, configured in `substance_rules.yml` in your newspaper files, do more to improve the quality of headlines than the GPT feature. It takes trial and error to find the keywords that exclude the junk.
- Note: Using the OpenAI API will incur charges to your OpenAI account.
- If you do this, add the secret `OPENAI_API_KEY` to the secrets you created above.
- And update the calls to `run_finite_news()` (in `run.py` and `dev.ipynb`) by adding the argument `disable_gpt=False`.
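To give a feel for the keyword-based approach from the tip above, here is a rough sketch of dropping headlines that contain unwanted keywords. It is only an illustration: the function name `drop_low_substance` and the plain list of keywords are assumptions, and the real schema of `substance_rules.yml` is whatever the sample file defines.

```python
def drop_low_substance(headlines: list[str], banned_keywords: list[str]) -> list[str]:
    """Keep only headlines that contain none of the banned keywords (case-insensitive)."""
    lowered = [kw.lower() for kw in banned_keywords]
    return [h for h in headlines if not any(kw in h.lower() for kw in lowered)]


# Made-up headlines and keywords, for illustration only.
headlines = [
    "City council approves new transit budget",
    "You won't believe what this celebrity did next",
    "Opinion: Why the budget is a mistake",
]
print(drop_low_substance(headlines, ["you won't believe", "opinion:"]))
# → ['City council approves new transit budget']
```

Simple substring matching like this is blunt, which is why finding the right keywords takes some trial and error.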
- Test locally using the notebook `dev.ipynb`.
- Select the virtual environment `.venv` in the project folder (created when you did `uv sync`).
- To run Python scripts directly from your local environment, you can simply use `uv run run.py`.
- Deploy.
- If you use a local cron job, schedule a command that runs `uv run run.py` from this project directory.
- To run it in the cloud as a Google Cloud Run job:
- Follow the general steps of this quickstart.
- Enable Google Cloud Run and configure your computer to operate it with the `gcloud` command line utility.
- Ensure the Cloud Run job has permissions to access the secrets and Cloud Storage bucket in your project.
- Deploy the code from your computer to a new Cloud Run job.
- You can use the bash script `./deploy-finite-news.sh`. Update as necessary for your region etc.
- This script will build a container out of your code, upload it to Google Cloud's Artifact Registry, and create a Cloud Run job.
- The script will tell you when it's done.
- You can run the job! Either:
- Execute the new job as a one-off using the Google Cloud Console or `gcloud` command line.
- Or create a Scheduler Trigger to run the job on a schedule, such as once a day.
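For the one-off execution, the `gcloud run jobs execute` command does the work. The sketch below only assembles that command in Python so you can inspect it before running it. The job name `finite-news` and region `us-east1` are placeholders, not values from this project; use whatever your deploy script created.

```python
import shlex


def build_execute_command(job_name: str, region: str) -> list[str]:
    """Build the argument list for a one-off Cloud Run job execution."""
    return ["gcloud", "run", "jobs", "execute", job_name, "--region", region]


# Placeholder job name and region; replace with your own.
cmd = build_execute_command("finite-news", "us-east1")
print(shlex.join(cmd))
# To actually run it (requires an installed and authenticated gcloud):
#   import subprocess
#   subprocess.run(cmd, check=True)
```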
- To update the configuration (such as adding a new subscriber config file or changing `publication_config.yml`): Upload the changed/new files to your existing Google Cloud Storage bucket.
- To update the code:
- To add, remove, or update dependencies, use `uv` commands.
- If you haven't used `uv` before, it's awesome. Use its commands like you would use `pip` or `conda`.
- To deploy new code to the Google Cloud job:
- Commit code changes to your local git repo.
- When you commit, a `uv` pre-commit hook will update the `requirements.txt` file if any dependencies have changed.
- Run `./deploy-finite-news.sh` to deploy the new code.
- This will build a new version of the container, upload it to Google Cloud's Artifact Registry, and point the existing Cloud Run job to the new version of the container.
- The deployment will use the `.python-version` and `requirements.txt` files to install the right version of Python and the dependencies in the container.
- If you set up a Scheduler Trigger to run the job on a schedule, no changes should be needed!
- The job should automatically point to the new version of the container.
- You may want to delete the old version of the container in the Artifact Registry, if you don't need it.
- Cuz they charge you for keeping containers up there, like a storage unit.
🚨🚨 Comply with the Terms of Service of your sources and APIs.
Create the following files. See the `samples_files` folder for examples. Later, your files will go in your Google Cloud Storage Bucket.
- `publication_config.yml`: General choices for how to run Finite News.
- This includes setting up individual news sources. See the sample `publication_config.yml` for instructions.
- If you need an API key to access a particular news source, add `api_key_name: {NAME OF YOUR API KEY}` under that source in `publication_config.yml`. Then add a new secret/environment variable for `{NAME OF YOUR API KEY}`, as described above in Set Up for the other secrets.
- To disable GPT, delete the `gpt` section in `publication_config.yml`.
- `config_*.yml`: Configuration for each subscription (each daily email).
- To add a new subscriber, create a new `config_their_name.yml` file and upload it to the bucket.
- Finite News creates the list of subscribers by looking for all YML files in the bucket that begin with `config_`.
- `template.htm`: The layout for the email. The parts in `[[ ]]` are populated by the code at runtime.
- `substance_rules.yml`: Policies for identifying "low substance" headlines to always drop. You can add rules to remove headlines on topics you don't want to hear about or recurring noise.
- `thoughts_of_the_day.yml`: (optional) Shared list of jokes and quotes sampled for Thought of the Day. To enable, in `config_*.yml` file(s) set `add_shared_thoughts=True`.
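The subscriber discovery rule described above (every YML file in the bucket whose name begins with `config_`) can be sketched as a simple filter over file names. This is an illustration under that stated rule, not the project's actual code: the function name `find_subscriber_configs` and the example listing are made up.

```python
def find_subscriber_configs(object_names: list[str]) -> list[str]:
    """Return the file names that look like subscriber configs: config_*.yml."""
    return sorted(
        name
        for name in object_names
        if name.startswith("config_") and name.endswith(".yml")
    )


# Made-up bucket listing, for illustration only.
bucket_listing = [
    "publication_config.yml",
    "config_alice.yml",
    "config_bob.yml",
    "template.htm",
    "substance_rules.yml",
]
print(find_subscriber_configs(bucket_listing))
# → ['config_alice.yml', 'config_bob.yml']
```

Note that `publication_config.yml` is not picked up, because the name has to start with `config_`, not merely contain it.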
You're awesome, thank you! The best way is to create a new Issue.