Authors: Juan R. Loaiza (URosario / HU Berlin) and Miguel González Duque (ITU Copenhagen)
In this repository we track progress on a research project in which we apply text mining to philosophy journals in Latin America. Our aim is to provide insights into the history of philosophy in Latin America using a data-driven approach.
We are starting with articles published in Ideas y Valores (Colombia) between 2009 and 2017. We plan to expand later to more years and to other journals such as Crítica (Mexico) and Análisis Filosófico (Argentina).
.
├── data                # Data files (omitted from Git repository for the moment)
│   ├── raw_html        # Raw HTML files directly as scraped with metadata
│   └── clean_json      # Parsed HTML files and metadata in JSON format
├── utils               # Helper utilities
├── notebooks           # Notebooks with preprocessing and analyses
│   └── wordlists       # Stopwords and protected words lists
└── README.md
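To illustrate the step from `raw_html` to `clean_json`, here is a minimal sketch using only the Python standard library. It is not the project's actual parser: the `citation_*` meta tags, the use of `<p>` elements for body text, and the names `ArticleParser` and `html_to_record` are all assumptions made for this example.

```python
import json
from html.parser import HTMLParser


class ArticleParser(HTMLParser):
    """Collects citation_* meta tags and paragraph text (assumed page layout)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if tag == "meta" and name.startswith("citation_"):
            self.metadata[name] = attrs.get("content") or ""
        elif tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Join text fragments and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def html_to_record(html):
    """Return a JSON-serializable dict with metadata and plain text."""
    parser = ArticleParser()
    parser.feed(html)
    parser.close()
    return {"metadata": parser.metadata, "text": "\n".join(parser.paragraphs)}


# Hypothetical input; real pages will be larger and messier.
sample = """<html><head>
<meta name="citation_title" content="Título de ejemplo">
<meta name="citation_date" content="2015">
</head><body><p>Primer   párrafo.</p><p>Segundo <em>párrafo</em>.</p></body></html>"""

record = html_to_record(sample)
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` keeps accented Spanish characters readable in the resulting JSON.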
- Extract view information from main HTML page.
- Calibrate the number of topics for the LDA model.
- Implement LDA in gensim and use topic coherence measures to calibrate the number of topics.
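To make the calibration item more concrete, the sketch below shows what a topic coherence measure computes and how it could be used to compare candidate numbers of topics. It is a dependency-free, UMass-style score written for illustration. In practice gensim's `CoherenceModel` would likely be used instead, and library implementations may differ in details such as smoothing. The toy documents and the candidate top-word lists are hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(top_words, documents):
    """UMass-style coherence of one topic.

    Averages log((D(w_hi, w_lo) + 1) / D(w_hi)) over all word pairs, where
    w_hi is ranked above w_lo in the topic and D counts documents that
    contain the given word(s).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(all(w in d for w in words) for d in doc_sets)

    pair_scores = []
    for hi, lo in combinations(top_words, 2):  # hi precedes lo in the ranking
        denom = doc_freq(hi)
        if denom == 0:
            continue  # word never occurs in the reference documents
        pair_scores.append(math.log((doc_freq(hi, lo) + 1) / denom))
    return sum(pair_scores) / len(pair_scores) if pair_scores else float("nan")


def mean_coherence(topics, documents):
    """Average the per-topic coherence over all topics of one model."""
    return sum(umass_coherence(t, documents) for t in topics) / len(topics)


# Hypothetical tokenized documents and top words from two candidate models.
docs = [
    ["kant", "razón", "moral"],
    ["kant", "moral", "libertad"],
    ["platón", "alma", "virtud"],
    ["platón", "sócrates", "alma"],
]
candidates = {
    2: [["kant", "moral"], ["platón", "alma"]],
    3: [["kant", "alma"], ["platón", "moral"], ["razón", "virtud"]],
}

scores = {k: mean_coherence(topics, docs) for k, topics in candidates.items()}
best_k = max(scores, key=scores.get)  # higher (closer to zero) is better
print(scores, best_k)
```

Here the 2-topic candidate scores higher because its top words actually co-occur in the documents, which is the kind of signal that would guide the choice of the number of topics.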
The following plots are only proofs of concept. We are using a temporary LDA model with 10 topics to find out which visualizations work best; the model itself still needs to be fully optimized. The table below lists the 10 most salient words for each topic in this temporary model.
Topic 0 | Topic 1 | Topic 2 | Topic 3 | Topic 4 | Topic 5 | Topic 6 | Topic 7 | Topic 8 | Topic 9 |
---|---|---|---|---|---|---|---|---|---|
lenguaje | kant | religioso | ser | creencia | ser | político | acción | alma | político |
interpretación | concienciar | religión | cuerpo | mundo | mundo | formar | moral | ser | derecho |
teoría | ser | ciudad | formar | ser | hegel | vida | ser | platón | moral |
experiencia | concepto | filosofía | heidegger | teoría | filosofía | ser | accionar | filosofía | ser |
wittgenstein | objetar | historia | modo | propiedad | dios | filosofía | agente | conocimiento | justicia |
filosofía | experiencia | siglo | aristóteles | término | bien | nietzsche | personar | sócrates | bien |
ser | arte | cultura | ente | contener | vida | foucault | desear | hombre | social |
problema | husserl | tradición | naturaleza | concepto | razón | social | intención | virtud | sociedad |
autor | trascendental | ciencia | bien | físico | hombre | crítico | bien | bien | teoría |
filosófico | modo | obrar | existencia | objeto | pensar | pensamiento | libertar | obrar | razón |
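
One quick diagnostic on the table above is to count how many topics each word appears in. The sketch below does this with the table's own rows. Reading words shared by most topics as candidates for the stopword list is our suggestion for illustration, not something the analysis has established, and the threshold of 5 topics is arbitrary.

```python
from collections import Counter

# Rows of the table above; each row holds one word per topic (Topic 0..9).
rows = [
    "lenguaje kant religioso ser creencia ser político acción alma político",
    "interpretación concienciar religión cuerpo mundo mundo formar moral ser derecho",
    "teoría ser ciudad formar ser hegel vida ser platón moral",
    "experiencia concepto filosofía heidegger teoría filosofía ser accionar filosofía ser",
    "wittgenstein objetar historia modo propiedad dios filosofía agente conocimiento justicia",
    "filosofía experiencia siglo aristóteles término bien nietzsche personar sócrates bien",
    "ser arte cultura ente contener vida foucault desear hombre social",
    "problema husserl tradición naturaleza concepto razón social intención virtud sociedad",
    "autor trascendental ciencia bien físico hombre crítico bien bien teoría",
    "filosófico modo obrar existencia objeto pensar pensamiento libertar obrar razón",
]

# Transpose the rows so that each tuple holds the words of one topic.
topics = list(zip(*(row.split() for row in rows)))

# Count, for every word, the number of distinct topics that list it.
topic_counts = Counter(word for topic in topics for word in set(topic))

# Keep words that appear in at least 5 of the 10 topics (arbitrary cutoff).
shared = {word: n for word, n in topic_counts.items() if n >= 5}
print(sorted(shared.items(), key=lambda kv: (-kv[1], kv[0])))
```

Words that dominate many topics at once carry little discriminating information, so a check like this can help decide what to add to the lists in `wordlists`.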