Exploratory script for parsing astro.ph feed #2

Draft
wants to merge 4 commits into base: main


james-trayford

Currently this is just a script to demonstrate parsing the astro.ph feed, constructing posts, and printing them out. Some features:

  • Pull today's papers
  • TeX => Unicode conversion in titles
  • Author name parsing & arrangement: use "et al." for more than 3 authors. There is an option to abbreviate names, but it seems hard to make reliable given the many edge cases, so names are currently used as entered on arXiv.
  • Truncate titles if the post is above the 300 char limit (TODO: check grapheme counting)
  • Indicative emoji for sub-feeds, i.e.
subcats = ['CO','EP','GA','HE','IM','SR']
subcat_emoji = ['🔮','🪐','🌀','🎆','🛠️','✨']
emojidict = dict(zip(subcats, subcat_emoji))
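As a rough illustration of the author arrangement and emoji lookup described above (not the code in this PR), the two steps could be sketched as follows. The function names `category_emoji` and `format_authors` are hypothetical, and the sketch assumes categories arrive as plain strings such as 'astro-ph.CO' and authors as a list of name strings.

```python
subcats = ['CO', 'EP', 'GA', 'HE', 'IM', 'SR']
subcat_emoji = ['🔮', '🪐', '🌀', '🎆', '🛠️', '✨']
emojidict = dict(zip(subcats, subcat_emoji))


def category_emoji(categories):
    """Return one emoji per astro-ph sub-category, in listing order.

    Hypothetical helper: assumes strings like 'astro-ph.CO'; anything
    outside astro-ph (e.g. 'gr-qc') is silently skipped.
    """
    out = []
    for cat in categories:
        prefix, _, sub = cat.partition('.')
        if prefix == 'astro-ph' and sub in emojidict:
            out.append(emojidict[sub])
    return ''.join(out)


def format_authors(names, max_listed=3):
    """Join up to max_listed names, otherwise give 'First Author et al.'.

    Hypothetical helper: names are used exactly as entered, with no
    abbreviation attempted.
    """
    if not names:
        return ''
    if len(names) > max_listed:
        return f'{names[0]} et al.'
    if len(names) == 1:
        return names[0]
    return ', '.join(names[:-1]) + ' & ' + names[-1]
```

For example, `format_authors(['Roberto Soler', 'Andrew Hillier'])` gives 'Roberto Soler & Andrew Hillier', and a cross-listed paper in CO and IM would get both emoji appended in order.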

@emilyhunt
Contributor

emilyhunt commented Dec 12, 2024

Looks awesome! This handles tons of the arXiv side of things now 😄

One thought I have before going further is about Python tooling - probably easier to do now than later.

pip freeze struggles with cross-platform support once more packages are added (as it dumps everything in the environment). Could you try specifying dependencies in e.g. a pyproject.toml instead? (Link goes to an example one in the astrofeed-server repo)

For consistency with the other repos & making it easier, you could try using uv. It generates platform-independent lockfiles & updates the pyproject.toml dependencies automatically, too.

Would be as simple as:

  1. Make a new pyproject.toml without dependencies in it - just package name etc (you may need to delete the old requirements.txt and start fresh)
  2. Install uv
  3. Pick a Python version - probably latest 3.12, so do uv python pin 3.12
  4. Add dependencies with uv add (docs link). I guess you'd just need to add arxiv and pylatexenc, which will then be the only dependencies listed in pyproject.toml (it's smart enough not to list transitive dependencies such as requests, unlike pip freeze)

You can then do uv run python <script_name> while in the project folder to run scripts using a virtual environment created automatically by uv from the pyproject.toml dependencies list. It's a really easy way to work :D
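For orientation, after steps 1-4 the pyproject.toml might look roughly like the fragment below. This is a sketch only: the package name, version and description are placeholders, and uv would normally record a version bound next to each dependency, which is left out here rather than guessed.

```toml
[project]
name = "astro-ph-feed-poster"    # hypothetical package name
version = "0.1.0"                # placeholder
description = "Exploratory script for parsing the astro-ph feed"
requires-python = ">=3.12"
dependencies = [
    "arxiv",
    "pylatexenc",
]
```

The Python version chosen with uv python pin is kept in a separate .python-version file rather than in this table.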

@james-trayford
Author

Great! Yeah, should get this working with uv early - pip freeze was just the quickest way I knew to dump dependencies to start with. Will follow your steps.

@james-trayford
Author

Current output and formatting - here using a reduced character limit of 150 to show truncation happening (much rarer at 300 chars)

output
Probing a diffuse flux of axion-like particles from galactic supernovae with...
David Alonso-González et al.
http://arxiv.org/pdf/2412.09595v1 🔮
[posted 2024-12-12 18:58:24+00:00 - was 176 of 150 characters, reduced by 32]

JWST PRIMER: strong evidence for the environmental quenching of low-mass galaxies...
M. L. Hamadouche et al.
http://arxiv.org/pdf/2412.09592v1 🌀
[posted 2024-12-12 18:58:01+00:00 - was 153 of 150 characters, reduced by 9]

Limits on dark matter, ultralight...
Sara Rufrano Aliberti, Gaetano Lambiase & Tanmay Kumar Poddar 
http://arxiv.org/pdf/2412.09575v1 🔮
[posted 2024-12-12 18:53:22+00:00 - was 203 of 150 characters, reduced by 68]

Asymmetric Temperature Variations In Protoplanetary disks: I....
Zhaohuan Zhu, Shangjia Zhang & Ted Johnson 
http://arxiv.org/pdf/2412.09571v1 🪐
[posted 2024-12-12 18:52:05+00:00 - was 178 of 150 characters, reduced by 34]

A glitch in gravity: cosmic Lorentz-violation from fiery Big Bang to glacial heat death
Robin Y. Wen et al.
http://arxiv.org/pdf/2412.09568v1 🔮
[posted 2024-12-12 18:50:56+00:00 - was 143 of 150 characters, reduced by 0]

How Many Bursts Does it Take to Form a Core at the Center of a Galaxy?
Olivia Mostow et al.
http://arxiv.org/pdf/2412.09566v1 🌀
[posted 2024-12-12 18:50:04+00:00 - was 127 of 150 characters, reduced by 0]

Nonlinear evolution of fluting oscillations in coronal flux tubes
Roberto Soler & Andrew Hillier 
http://arxiv.org/pdf/2412.09547v1 ✨
[posted 2024-12-12 18:37:25+00:00 - was 133 of 150 characters, reduced by 0]

Kinetic simulations of the Kruskal-Schwarzchild...
William Groger, Hayk Hakobyan & Lorenzo Sironi 
http://arxiv.org/pdf/2412.09541v1 🎆
[posted 2024-12-12 18:32:11+00:00 - was 211 of 150 characters, reduced by 77]

A universal and physically motivated threshold for...
Edward Olex, Wojciech A. Hellwing & Alexander Knebe 
http://arxiv.org/pdf/2412.09531v1 🔮
[posted 2024-12-12 18:21:21+00:00 - was 191 of 150 characters, reduced by 49]

A new and flexible design method for Symmetric Quadrature Hybrid...
Arjun Ghosh & Ritoban Basu Thakur 
http://arxiv.org/pdf/2412.09528v1 🛠️
[posted 2024-12-12 18:17:18+00:00 - was 176 of 150 characters, reduced by 37]

Primary Beam Chromaticity in HIRAX: I. Characterization from Simulations and Power...
Ajith Sampath et al.
http://arxiv.org/pdf/2412.09527v1 🔮🛠️
[posted 2024-12-12 18:14:19+00:00 - was 163 of 150 characters, reduced by 19]

FLAMINGO: Galaxy formation and feedback effects on the gas density and velocity...
Lurdes Ondaro-Mallea et al.
http://arxiv.org/pdf/2412.09526v1 🔮🌀
[posted 2024-12-12 18:14:09+00:00 - was 151 of 150 characters, reduced by 4]

A Comparative Test of the LCDM and R_h=ct Cosmologies Based on Upcoming Redshift Drift Measurements
Fulvio Melia 
http://arxiv.org/pdf/2412.09489v1 🔮
[posted 2024-12-12 17:39:12+00:00 - was 149 of 150 characters, reduced by 0]

SIRI-2 Detection of the Gamma-ray Burst 221009A
Lee J. Mitchell et al.
http://arxiv.org/pdf/2412.09476v1 🎆
[posted 2024-12-12 17:21:30+00:00 - was 106 of 150 characters, reduced by 0]

Constraints on Pre-Big-Bang Cosmology from Advanced LIGO and Advanced Virgo's First Three...
Qin Tan et al.
http://arxiv.org/pdf/2412.09461v1 🔮
[posted 2024-12-12 17:12:56+00:00 - was 155 of 150 characters, reduced by 12]

On the evolutionary nature of massive B-type supergiants: a modern empirical...
Abel de Burgos 
http://arxiv.org/pdf/2412.09454v1 ✨🌀
[posted 2024-12-12 17:06:50+00:00 - was 178 of 150 characters, reduced by 46]

Revisiting the Galactic Winds in M82 I: the recent starburt and launch of outflow in...
Tian-Rui Wang et al.
http://arxiv.org/pdf/2412.09452v1 🌀✨
[posted 2024-12-12 17:05:31+00:00 - was 154 of 150 characters, reduced by 9]

On the Relation between the Inclination Angle of the Accretion Disk and the Broad-line...
Rong Du et al.
http://arxiv.org/pdf/2412.09451v1 🎆
[posted 2024-12-12 17:05:16+00:00 - was 170 of 150 characters, reduced by 30]

FBQ 0951+2635: time delay and...
Vyacheslav N. Shalyapin, Luis J. Goicoechea & Eleana Ruiz-Hinojosa 
http://arxiv.org/pdf/2412.09435v1 🌀
[posted 2024-12-12 16:39:47+00:00 - was 170 of 150 characters, reduced by 34]

Meteoroid Ablation within the Jovian Atmosphere: Implications on the Oxygen Delivery to...
C. A. Mehta et al.
http://arxiv.org/pdf/2412.09426v1 🪐
[posted 2024-12-12 16:32:54+00:00 - was 169 of 150 characters, reduced by 24]

Radial evolution of a density structure within a solar wind magnetic sector boundary
Etienne Berriot et al.
http://arxiv.org/pdf/2412.09395v1 ✨
[posted 2024-12-12 16:00:12+00:00 - was 143 of 150 characters, reduced by 0]

Extended Skyrme effective interactions with higher-order momentum-dependence for...
Si-Pei Wang et al.
http://arxiv.org/pdf/2412.09393v1 🎆
[posted 2024-12-12 15:59:56+00:00 - was 170 of 150 characters, reduced by 32]

Searching for New Physics in Ultradense...
Francesco Grippa, Gaetano Lambiase & Tanmay Kumar Poddar 
http://arxiv.org/pdf/2412.09381v1 🎆🔮
[posted 2024-12-12 15:48:40+00:00 - was 193 of 150 characters, reduced by 56]

A BCool survey of stellar magnetic cycles
S. Bellotti et al.
http://arxiv.org/pdf/2412.09365v1 ✨
[posted 2024-12-12 15:31:15+00:00 - was 96 of 150 characters, reduced by 0]

Behind the dust veil: A panchromatic view of an optically dark galaxy at z=4.82
Nikolaj B. Sillassen et al.
http://arxiv.org/pdf/2412.09363v1 🌀
[posted 2024-12-12 15:29:59+00:00 - was 143 of 150 characters, reduced by 0]

Galaxy Morphological Classification with Manifold Learning
Vasyl Semenov et al.
http://arxiv.org/pdf/2412.09358v1 🌀
[posted 2024-12-12 15:26:29+00:00 - was 115 of 150 characters, reduced by 0]

Insights from Modeling Magnetar-driven Light Curves of Stripped-envelope Supernovae
Amit Kumar 
http://arxiv.org/pdf/2412.09357v1 🎆
[posted 2024-12-12 15:26:23+00:00 - was 131 of 150 characters, reduced by 0]

Using nebular near-IR spectroscopy to measure asymmetric chemical distributions in...
J. O'Hora et al.
http://arxiv.org/pdf/2412.09352v1 ✨🎆
[posted 2024-12-12 15:19:46+00:00 - was 173 of 150 characters, reduced by 34]

Is cosmological data suggesting a nonminimal coupling between...
Miguel Barroso Varela & Orfeu Bertolami 
http://arxiv.org/pdf/2412.09348v1 🔮
[posted 2024-12-12 15:15:52+00:00 - was 158 of 150 characters, reduced by 17]

Long term optical variations in Swift J1858.6-0814: evidence for ablation and comparisons...
L. Rhodes et al.
http://arxiv.org/pdf/2412.09347v1 🎆
[posted 2024-12-12 15:15:21+00:00 - was 162 of 150 characters, reduced by 17]

Electromagnetic radiation from a relativistic jet...
Vladimir Epp, Konstantin Osetrin & Elena Osetrina 
http://arxiv.org/pdf/2412.09326v1 🎆
[posted 2024-12-12 14:50:58+00:00 - was 174 of 150 characters, reduced by 35]

A first glimpse at the MeerKAT DEEP2 field at S-band
S. Ranchod et al.
http://arxiv.org/pdf/2412.09314v1 🌀
[posted 2024-12-12 14:30:26+00:00 - was 106 of 150 characters, reduced by 0]

Constraining primordial black hole abundance with Insight-HXMT
Chen Yang & Xin Zhang 
http://arxiv.org/pdf/2412.09297v1 🔮🎆
[posted 2024-12-12 14:12:19+00:00 - was 122 of 150 characters, reduced by 0]

A theoretical framework for BL Her stars III. A case study: Robust light curve...
Susmita Das et al.
http://arxiv.org/pdf/2412.09287v1 ✨🌀
[posted 2024-12-12 13:56:37+00:00 - was 158 of 150 characters, reduced by 21]

Journey of complex organic molecules: Formation and transport in protoplanetary disks
T. Benest Couzinou et al.
http://arxiv.org/pdf/2412.09271v1 🪐
[posted 2024-12-12 13:32:37+00:00 - was 147 of 150 characters, reduced by 0]

Exploring nuclear force with pulsar glitch observation
Zhong-Hao Tu & Ang Li 
http://arxiv.org/pdf/2412.09219v1 🎆✨
[posted 2024-12-12 12:20:06+00:00 - was 114 of 150 characters, reduced by 0]

Fast-rotating A- and F-type stars with Hα emissions in NGC 3532,...
Chenyu He, Chengyuan Li & Gang Li 
http://arxiv.org/pdf/2412.09217v1 ✨🌀
[posted 2024-12-12 12:13:35+00:00 - was 160 of 150 characters, reduced by 21]

Evolution, speed, and precession of the parsec-scale jet in the 3C 84 radio galaxy
M. Foschi et al.
http://arxiv.org/pdf/2412.09215v1 🎆
[posted 2024-12-12 12:10:23+00:00 - was 135 of 150 characters, reduced by 0]

Magnetic Reconnection between a Solar Jet and a Filament Channel
Garima Karki et al.
http://arxiv.org/pdf/2412.09206v1 ✨
[posted 2024-12-12 12:00:41+00:00 - was 120 of 150 characters, reduced by 0]

Quantifying Roche Lobe Overflow in the Formation of Merging Black Hole Binaries
David Dickson 
http://arxiv.org/pdf/2412.09172v1 🎆✨
[posted 2024-12-12 11:02:24+00:00 - was 131 of 150 characters, reduced by 0]

Constraining Schwarzschild Models with Orbit Classifications
Richard J. Long 
http://arxiv.org/pdf/2412.09167v1 🌀
[posted 2024-12-12 10:54:58+00:00 - was 113 of 150 characters, reduced by 0]

herakoi: a sonification experiment for astronomical data
Michele Ginolfi, Luca Di Mascolo & Anita Zanella 
http://arxiv.org/pdf/2412.09152v1 🛠️
[posted 2024-12-12 10:37:33+00:00 - was 143 of 150 characters, reduced by 0]

Chemical Evolution of R-process Elements in Stars (CERES). III. Chemical abundances...
L. Lombardo et al.
http://arxiv.org/pdf/2412.09141v1 🌀
[posted 2024-12-12 10:24:46+00:00 - was 180 of 150 characters, reduced by 39]

Modified gravity in galaxy clusters: Joint analysis of Hydrostatics and Caustics
Minahil Adil Butt et al.
http://arxiv.org/pdf/2412.09134v1 🔮
[posted 2024-12-12 10:14:31+00:00 - was 141 of 150 characters, reduced by 0]

Chemodynamic evolution of Sun-like stars in nearby moving groups
Christian Lehmann et al.
http://arxiv.org/pdf/2412.09128v1 ✨🌀
[posted 2024-12-12 10:04:16+00:00 - was 126 of 150 characters, reduced by 0]

Betwixt Annihilation and Decay: The Hidden Structure of...
Jonah Barber, Keith R. Dienes & Brooks Thomas 
http://arxiv.org/pdf/2412.09123v1 🔮
[posted 2024-12-12 09:59:59+00:00 - was 158 of 150 characters, reduced by 17]

Spectroscopic observations of flares and superflares on AU Mic
P. Odert et al.
http://arxiv.org/pdf/2412.09113v1 ✨🪐
[posted 2024-12-12 09:44:28+00:00 - was 115 of 150 characters, reduced by 0]

Molecular chemistry induced by J-shock toward supernova remnant W51C
Tian-Yu Tu et al.
http://arxiv.org/pdf/2412.09092v1 🌀🎆
[posted 2024-12-12 09:21:11+00:00 - was 123 of 150 characters, reduced by 0]

Substellar candidates at the earliest stages: the SUCANES database
A. M. Pérez-García et al.
http://arxiv.org/pdf/2412.09091v1 ✨🪐🌀🛠️
[posted 2024-12-12 09:19:54+00:00 - was 132 of 150 characters, reduced by 0]

Probing the low-energy particle content of blazar jets through MeV observations
F. Tavecchio et al.
http://arxiv.org/pdf/2412.09089v1 🎆
[posted 2024-12-12 09:15:58+00:00 - was 135 of 150 characters, reduced by 0]

Insights into the Properties of Type Ibn/Icn Supernovae and Their...
Yusuke Inoue & Keiichi Maeda 
http://arxiv.org/pdf/2412.09066v1 🎆🌀✨
[posted 2024-12-12 08:53:00+00:00 - was 176 of 150 characters, reduced by 40]

A comprehensive numerical study on four categories of holographic dark energy models
Jun-Xian Li & Shuang Wang 
http://arxiv.org/pdf/2412.09064v1 🔮
[posted 2024-12-12 08:51:54+00:00 - was 147 of 150 characters, reduced by 0]
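The truncation reported in the bracketed lines above could be approximated along these lines. This is a minimal sketch under stated assumptions, not the script's actual logic: `truncate_title` and `build_post` are hypothetical names, the three-line layout (title, authors, link plus emoji) is inferred from the output, and `len()` counts code points rather than graphemes, so the grapheme TODO still applies.

```python
def truncate_title(title, budget, ellipsis='...'):
    """Shorten title to at most `budget` characters, cutting at a word boundary."""
    if len(title) <= budget:
        return title
    room = budget - len(ellipsis)
    if room <= 0:
        # No space for any title text: return as much ellipsis as fits
        return ellipsis[:max(budget, 0)]
    cut = title[:room]
    # Back up to the last complete word, unless the cut already ends on one
    if title[room] != ' ' and ' ' in cut:
        cut = cut[:cut.rfind(' ')]
    return cut.rstrip() + ellipsis


def build_post(title, authors, url, emoji, limit=300):
    """Assemble a three-line post, shortening only the title to fit `limit`."""
    tail = f'\n{authors}\n{url} {emoji}'
    # Whatever the fixed tail does not use is the budget left for the title
    title = truncate_title(title, limit - len(tail))
    return title + tail
```

Only the title is shortened here, so a post whose author line and link alone exceed the limit would still overflow. Emoji with variation selectors (such as 🛠️) count as two code points under `len()`, which makes this count conservative compared with a grapheme-based one.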

@james-trayford
Author

Do we want abstract or PDF links?

@emilyhunt
Contributor

I can think of three benefits of linking to the arXiv abstract page over the PDF:

  • arXiv is starting to roll out HTML versions of papers, which some people prefer reading (+ generally much more accessible for people with sight difficulties etc)
  • the arXiv page has links to other services (like ADS)
  • you can always go from arXiv page to the pdf, but it's harder to go the other way

@james-trayford
Author

PDF URL => abs URL, and strip the unneeded http:// prefix

Probing a diffuse flux of axion-like particles from galactic supernovae with neutrino...
David Alonso-González et al.
arxiv.org/abs/2412.09595v1 🔮
[posted 2024-12-12 18:58:24+00:00 - was 169 of 150 characters, reduced by 23]

JWST PRIMER: strong evidence for the environmental quenching of low-mass galaxies out to 𝐳≃ 2
M. L. Hamadouche et al.
arxiv.org/abs/2412.09592v1 🌀
[posted 2024-12-12 18:58:01+00:00 - was 146 of 150 characters, reduced by 0]

...
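The link change shown above amounts to a small string rewrite, which could be sketched as below. The helper name `abs_link` is hypothetical and only the standard library is used. The arxiv package may also expose the abstract URL directly on each result, which would make the /pdf/ swap unnecessary, but that is not assumed here.

```python
import re


def abs_link(pdf_url):
    """Turn an arXiv PDF URL into a scheme-less abstract-page link."""
    # Drop a leading http:// or https://
    link = re.sub(r'^https?://', '', pdf_url)
    # Swap only the first /pdf/ path component for /abs/
    return link.replace('/pdf/', '/abs/', 1)
```

With the first paper from the earlier output, `abs_link('http://arxiv.org/pdf/2412.09595v1')` gives 'arxiv.org/abs/2412.09595v1'.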


2 participants