CRAN task view proposal: ArchaeologicalScience #64

Open
benmarwick opened this issue Sep 3, 2024 · 6 comments
@benmarwick

Here is our proposal for a CTV on Archaeological Science. We have organised the proposal into the sections requested in the instructions (scope, packages, overlap, maintainers):

Scope:

Archaeological science is the study of past human behaviours and relationships using techniques, concepts, and methods from the natural and life sciences. This task view is a list of packages useful for archaeological science. It is intended to help archaeologists in their search for packages relevant to their research and teaching. It includes packages written by archaeologists for working with distinctive types of archaeological data, such as radiocarbon ages, data from various types of artefacts (lithics, pottery, etc.), faunal remains, geoarchaeological and landscape data. It also includes packages containing archaeological datasets that are useful for teaching, and packages from closely related sciences, such as environmental science, that are widely used by archaeologists.

If you have any questions feel free to reach out to the task view maintainers or the maintainers of specific packages. If there is an archaeological science package on CRAN or elsewhere that we have missed, please let us know. Contributions are always welcome, and encouraged -- please see the linked GitHub repository for details.

Packages:

Within each thematic section, you will find first the packages distributed on CRAN, followed by a selection of code projects in other repositories. Please keep in mind that the order in which packages are listed should not be taken as an indicator of quality or endorsement.

Analysis of Dates and Chronological Patterns

Radiometric Dating

Radiocarbon ages can be calibrated using many of the packages in this section:

  • r pkg("rcarbon", priority = "core") is useful for calibration, and also contains extensively documented functions for hypothesis testing and modelling radiocarbon ages. See Crema and Bevan (2021) for an introduction. Basic calibration is also possible with r pkg("rintcal").

  • r pkg("Bchron") adds various calibration curves (including user generated ones); also does age-depth modelling, relative sea level rate estimation incorporating time uncertainty in polynomial regression models, and non-parametric phase modelling via Gaussian mixtures as a means to determine the activity of a site (and as an alternative to the OxCal function SUM). r pkg("clam") similarly does 'classical' age-depth modelling of deposits.

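  • Bayesian age-depth modelling of radiocarbon dates is available in r pkg("rbacon"). r pkg("nimbleCarbon") provides models and utility functions for Bayesian analyses of radiocarbon dates with the NIMBLE framework.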

  • r pkg("coffee") uses Bayesian methods to enforce the chronological ordering of radiocarbon and other dates, for example for trees with multiple radiocarbon dates spaced at exactly known intervals.

  • r pkg("oxcAAR") allows you to use R to connect to a local installation of the OxCal software to calibrate radiocarbon dates and a variety of other OxCal operations.

  • r pkg("ArchaeoPhases") allows you to post-process Markov Chain Monte Carlo (MCMC) simulations from ChronoModel, OxCal or BCal. It provides statistical tools to analyze and to estimate archaeological phases from the posterior distribution of a sequence of dates and includes testing procedures to check for the presence of a gap between two successive phases.

  • r pkg("spDates") allows analysis of spatial gradients in radiocarbon dates.

  • r pkg("IsoplotR") offers a statistical toolbox for radiometric geochronology.

  • r github("ropensci/c14bazAAR") facilitates retrieval and preparation of large radiocarbon datasets.

  • r github("joeroe/c14") provides basic classes and functions for radiocarbon data in R. It makes it easier to combine methods from several existing packages (e.g. rcarbon, Bchron, oxcAAR, c14bazAAR, ArchaeoPhases, stratigraphr) together and work with them in a tidy data workflow.

  • r github("tonydoss/UThwigl") computes closed- and open-system uranium-thorium (U-Th) ages of geological and archaeological samples.

  • The r github("UCL/ADMUR") package provides tools to directly model underlying population dynamics using chronological datasets (radiocarbon and other) with a variety of models, including the Continuous Piecewise Linear (CPL) model framework, and a model comparison framework using BIC.

  • r github("ArchaeoStat/ArchaeoChron") allows basic radiocarbon calibration and Bayesian combination of radiocarbon dates.

Dendrochronology

  • r pkg("dendroNetwork") creates dendrochronological networks based on the similarity between tree-ring series or chronologies.
  • r github("ropensci/fellingdater") offers a suite of functions designed to assist dendrochronologists in inferring estimates for felling dates, derived from dated tree-ring series.

Luminescence Dating

  • Various R functions for Luminescence Dating data analysis are in the r pkg("Luminescence") and r pkg("numOSL") packages, including equivalent dose calculation, annual dose rate determination, growth curve fitting, decay curve decomposition, statistical age model optimization, and statistical plot visualization.
  • r pkg("BayLum") provides chronological Bayesian models integrating Optically Stimulated Luminescence and radiocarbon age dating.

Paleoenvironmental Proxies

  • r pkg("tidypaleo") provides a set of functions for age-depth model management, stratigraphic visualization, and common statistical transformations.
  • r pkg("shoredate") offers methods to shoreline date coastal sites based on their present-day elevation and the trajectory of past relative sea-level change.

Archaeological Time Series

  • r pkg("era") provides a consistent representation of year-based time scales as a numeric vector with an associated era. r pkg("aion") contains a toolkit for handling archaeological time series.

  • r pkg("aoristic"), r pkg("kairos") and r github("ISAAKiel/aoristAAR") provide functions for the aoristic analysis of archaeological data (takes into account the uncertainty of the exact moment that an event occurred when examining the overall incidence of events over time).

  • r pkg("kairos") provides functions for mean ceramic date estimation.

  • r pkg("SPARTAAS") and r pkg("kairos") provide methods for statistical pattern recognition, time range plotting and seriation plots of archaeological artefacts.

  • r pkg("datplot") converts date ranges into dating 'steps' to ease the visualization of changes in e.g. pottery consumption, style and other variables over time.

  • The r github("davidcorton/archSeries") package makes chronologies from information from multiple entities with varying chronological resolution and overlapping date ranges.
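
As an illustration of the aoristic idea mentioned above, here is a minimal base R sketch. It does not use the API of any of the listed packages; the `finds` data, the 50-year bins and the helper `aoristic_weights()` are all hypothetical, and it assumes an event is equally likely at any moment within its date range.

```r
# Hypothetical finds, each dated only to a range of years (negative = BCE).
finds <- data.frame(
  id    = c("a", "b", "c"),
  start = c(-400, -300, -350),
  end   = c(-200, -250, -150)
)

# Hypothetical 50-year time bins covering all date ranges.
breaks   <- seq(-400, -150, by = 50)
bin_from <- head(breaks, -1)
bin_to   <- tail(breaks, -1)

# Aoristic weight: the share of one find's date range that falls in each bin,
# assuming a uniform probability over the range (an assumption of this sketch).
aoristic_weights <- function(start, end, bin_from, bin_to) {
  overlap <- pmax(0, pmin(end, bin_to) - pmax(start, bin_from))
  overlap / (end - start)
}

# One row per find, one column per bin; each row sums to 1.
weights <- t(mapply(aoristic_weights, finds$start, finds$end,
                    MoreArgs = list(bin_from = bin_from, bin_to = bin_to)))
dimnames(weights) <- list(finds$id, paste(bin_from, bin_to, sep = " to "))

# The aoristic sum per bin is the expected number of events in that bin.
aoristic_sum <- colSums(weights)
print(round(aoristic_sum, 2))
```

Because every find contributes a total weight of one, the aoristic sums add up to the number of finds, which is a quick sanity check on any binning scheme.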

Artefact Analysis

  • r pkg("tabula", priority = "core") provides several tests and measures of diversity: heterogeneity and evenness (Brillouin, Shannon, Simpson, etc.), richness and rarefaction (Chao1, Chao2, ACE, ICE, etc.), turnover and similarity (Brainerd-Robinson, etc.). The package makes it easy to visualize count data and statistical thresholds: rank vs. abundance plots, heatmaps, Ford (1962) and Bertin (1977) diagrams.

  • r github("zoometh/iconr") for modeling prehistoric iconographic compositions and preparing for further analysis (clustering, typology tree, Harris diagram, etc.).

  • r github("ISAAKiel/quantAAR") contains tidy wrappers and useful utility functions for common applications of exploratory statistics in archaeology.

  • r github("ISAAKiel/shapAAR") for the extraction, analysis and classification of (not only) archaeological objects derived from scanned images. Aims especially at the analysis of the shapes/profiles of e.g. ceramic vessels or arrow heads.

  • r github("yesdavid/outlineR") for the fast and easy extraction of single outline shapes of, for example, stone tools from images containing several of them, such as those in archaeological publications.

  • r github("cornelmpop/Lithics3D") for working with 3D scans of archaeological lithics (clean triangular meshes and existing landmarks).

  • r github("maciejkasinski/quantatools") for analysis of quantum (common measurement units) in archaeological data with cosine quantogram and related statistical methods.

  • r github("Johanna-Mestorf-Academy/sdsanalysis") for exploration and visualization of lithic datasets recorded using the 'Systematic and digital documentation of stone artefacts' recording system.
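
For readers unfamiliar with the diversity and similarity measures named above, the following base R sketch computes the Shannon index and the Brainerd-Robinson coefficient by hand. It is not the interface of r pkg("tabula"); the `counts` matrix and both helper functions are hypothetical and only illustrate the standard formulas.

```r
# Hypothetical artefact counts for two assemblages (rows) and three types.
counts <- rbind(
  site_A = c(bowl = 10, jar = 10, plate = 0),
  site_B = c(bowl = 5,  jar = 5,  plate = 10)
)

# Shannon heterogeneity: H = -sum(p * log(p)) over the types that are present.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Brainerd-Robinson similarity: 200 minus the summed absolute differences
# between type percentages (200 = identical, 0 = no types in common).
brainerd_robinson <- function(x, y) {
  200 - sum(abs(100 * x / sum(x) - 100 * y / sum(y)))
}

h  <- apply(counts, 1, shannon)
br <- brainerd_robinson(counts["site_A", ], counts["site_B", ])
print(h)
print(br)
```

Working with proportions rather than raw counts keeps both measures independent of assemblage size, although small assemblages will still give unstable estimates.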

Cultural Evolutionary Analysis

  • r github("ercrema/cTransmission") for an Approximate Bayesian Computation Framework for inferring patterns of cultural transmission from frequency data.
  • r github("ercrema/HERAChp.KandlerCrema") enables the reproduction of the analysis and associated figures for the book chapter Analysing cultural frequency data: neutral theory by Anne Kandler and Enrico Crema for the volume Handbook of Evolutionary Research in Archaeology, edited by Anna Prentiss. The package contains two main functions for simulating cultural transmission.
  • r github("benmarwick/signatselect") provides two functions useful for investigating change over time in artefact assemblages (and genetic time-series data).
  • r github("benmarwick/roev") provides functions for analysing and visualizing rates of evolution following the methods in Philip D. Gingerich’s 2019 book Rates of Evolution: A Quantitative Synthesis.

Zooarchaeological Analysis

  • r pkg("zoolog", priority = "core") to generate and manipulate log-ratios (also known as log size index (LSI) values) from measurements obtained on zooarchaeological material.
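
The log-ratio idea can be sketched in a few lines of base R. This is not the r pkg("zoolog") interface; the `specimens` data frame and its measurement values are hypothetical, and the sketch assumes base-10 logarithms, a common convention for LSI values.

```r
# Hypothetical measurements (mm) on archaeological specimens, alongside the
# same measurements taken on a reference (standard) animal.
specimens <- data.frame(
  measure  = c("GL", "Bd", "Bp"),
  observed = c(150.0, 28.0, 33.5),
  standard = c(150.0, 25.0, 30.0)
)

# Log size index: log10 of the ratio between specimen and standard.
# 0 means the same size as the standard; positive values mean larger.
specimens$lsi <- log10(specimens$observed / specimens$standard)
print(specimens)
```

Expressing each measurement relative to a standard lets different skeletal elements be pooled on one scale, which is the main reason for using log-ratios with fragmentary material.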

Mortuary Analysis

  • r pkg("mortAAR", priority = "core") calculates a life table based on archaeological demographic data.
  • r github("nevrome/varnastats") for bi- and multivariate analysis of matrices of archaeological data. Developed and used for the analysis of Varna Necropolis (Bulgaria).

Geoarchaeological Analysis

  • r pkg("aqp") simplifies the quantitative analysis of soil profile data. It allows soil profile visualization, aggregation, and classification. r pkg("G2Sd"), r github("mauricio-camargo/rysgran") and r pkg("EMMAgeo") can be used for working with sedimentary grain-size data in logarithmic (phi) and geometric (micrometers) scales, based on various methods.

  • r pkg("SIBER"), r pkg("MixSIAR"), r pkg("simmr") and r pkg("IsotopeR") provide methods for working with isotope data.

  • r pkg("munsell") provides methods for working with sediment colour.

  • r pkg("nexus") provides tools for compositional data analysis, chemical fingerprinting and source tracking of ancient materials.

  • r pkg("isopleuros") allows creation of ternary plots. It also includes common ternary diagrams useful for the archaeologist (e.g. soil texture charts, ceramic phase diagram).

  • r github("ISAAKiel/magAAR") analyzes geomagnetic data from archaeological contexts.

  • r github("Andros-Spica/cerUB") for multivariate statistic protocols for integrating archaeometric data (geochemical, mineralogical, petrographic).

Landscape Analysis

  • r pkg("leastcostpath", priority = "core") calculates Least Cost Paths (LCPs) using numerous time- and energy-based cost functions that approximate the difficulty of moving across a landscape.

  • r pkg("skyscapeR") for data reduction, visualization and analysis in skyscape archaeology, archaeoastronomy and cultural astronomy.

  • r github("SCSchmidt/percopackage") implements percolation analysis as a 2D point pattern analysis technique for identifying clusters of any size and form (e.g. of archaeological sites).

  • r github("ISAAKiel/lecAAR") for calculating the largest empty circles and estimating the archaeological sites theoretically expected in a region of interest.

  • r github("ISAAKiel/pathAAR") to reconstruct paths using archaeological monuments, model parameters of infrastructure, and evaluate those parameters.

  • r github("mrecos/klrfome") for archaeological site location modeling; maps a single scalar outcome (e.g. presence/absence; 0/1) to a distribution of features.

  • r github("eScienceCenter/SiteExploitationTerritories") calculates a non-isotropic spatial relationship by integrating human energy expenditure in terrain based estimations.

  • r github("nevrome/bleiglas") calculates three-dimensional Voronoi diagrams from input point clouds for spatiotemporal applications in archaeology.

  • r github("wccarleton/lamap") calculates the Locally Adaptive Model of Archaeological Potential (LAMAP).

Survey, Excavation, and Stratigraphic Analysis

  • r pkg("archeoViz") is a packaged R Shiny application for the 2D and 3D visualization, exploration, and web communication of spatial data from archaeological excavations.

  • r pkg("archeofrag") for refitting and stratigraphic analysis in archaeology.

  • r pkg("recexcavAAR") for 3D reconstruction and analysis of excavations, provides methods to reconstruct natural and artificial surfaces based on field measurements. This makes it possible to spatially contextualize documented subunits and features.

  • r github("joeroe/stratigraphr") provides a tidy framework for working with archaeological stratigraphy and chronology in R. It includes tools for reading, analysing, and visualising stratigraphies (Harris matrices) and sequences as directed graphs; helper functions for using radiocarbon dates in a tidy data analysis; and an R interface to OxCal's Chronological Query Language (CQL).

  • r github("joeroe/fieldwalkr") for designing and evaluating sampling strategies in spatial survey (fieldwalking in archaeological jargon). It contains functions for simulating the effect of different survey units, sampling methods and detection functions on the estimation of randomly generated or observed point processes.

  • r github("mrecos/signboardr") uses the Google Vision API to extract text from archaeological photos containing a sign board.

Data Management and Cleaning

  • r pkg("unstruwwel") can detect and transform strings containing historic dates (e.g. "3rd century CE") to numeric values.

Datasets

  • r pkg("archdata") contains eleven archaeological datasets from around the world reported in published studies. These represent typical forms of archaeological data (and so are useful for teaching).

  • r pkg("binford") contains more than 200 variables coding aspects of hunter-gatherer subsistence, mobility, and social organization for 339 ethnographically documented groups of hunter-gatherers, as used in Binford (2001) Constructing Frames of Reference: An Analytical Method for Archaeological Theory Building Using Ethnographic and Environmental Data Sets.

  • r pkg("folio") provides several types of data related to broad topics (cultural evolution, radiocarbon dating, paleoenvironments, etc.), which can be used to illustrate statistical methods in the classroom (multivariate data analysis, compositional data analysis, diversity measurement, etc.).

  • r pkg("gsloid") contains published data sets for global benthic d18O data for 0-5.3 Myr and global sea levels based on marine sediment core data for 0-800 ka.

  • r pkg("chemometrics") contains a dataset of elemental concentrations for 180 archaeological glass vessels excavated from 15th - 17th century contexts in Antwerp.

  • r pkg("datplot") contains a data set on inscriptions from Bithynia, as used in B. Weissova (2019) Regional Economy, Settlement Patterns and the Road System in Bithynia (4th Century BC-6th Century AD). Spatial and Quantitative Analysis. The data set is suitable for geographical and chronological analysis.

  • r github("geanes/bioanth") contains three osteometric datasets useful for biological and forensic anthropology.

  • r github("sfsheath/cawd") contains 15 datasets of ancient Greek, Roman and Persian maps and digital atlas data.

  • r github("benmarwick/evoarchdata") contains four published datasets widely used in archaeological studies of cultural evolution.

  • r github("joeroe/islay") includes various datasets relating to the prehistoric archaeology of the Scottish island of Islay and recorded by the ‘Southern Hebrides Mesolithic Project’ (Mithen et al. 2000).

  • r github("DCPollard94/knossoscemeteries") includes artefact data from two Early Iron Age cemeteries at Knossos, Crete.

  • r github("joeroe/swapdata") is a collection of archaeological datasets and tools related to the prehistory of Southwest Asia.

  • r github("benmarwick/mjbnaturepaper") contains stone artefact and total station data for excavations at Madjedbebe, a rock shelter in northern Australia.

  • r github("lsteinmann/clayringsmiletus") contains data on clay rings from the Sanctuary of Dionysos in the ancient city of Miletus on the western coast of Anatolia.

  • r github("bischrob/Rosegate-Projectile-Points-in-the-Fremont-Region") contains frequency data for different types of projectile points across 23 sites in the Fremont Region of the American Southwest.

Radiocarbon Datasets and APIs

  • r pkg("BSDA") contains a dataset of 60 radiocarbon ages from an archaeological site with four phases of occupation.
  • r github("ropensci/c14bazAAR") contains over 20 datasets of radiocarbon ages from around the world.
  • r github("people3k/p3k14c") contains a global dataset of radiocarbon dates.
  • r github("xronos-ch/xronos.R") accesses XRONOS, a global dataset of radiocarbon and other chronometric dates.
  • r github("joeroe/rintchron") accesses IntChron, an indexing service for chronological information.
  • r github("ArchaeoStat/ArchaeoData") contains two datasets for chronological modelling with r pkg("ArchaeoPhases").

Overlap:

This list does not include packages for general purpose tasks such as importing/exporting data, data manipulation, common forms of data analysis and visualization, and doing reproducible research. These may be found in other CRAN Task Views, especially r view("Environmetrics"), r view("Spatial"), r view("Multivariate"), r view("Phylogenetics"), r view("Cluster"), r view("ReproducibleResearch"), r view("WebTechnologies"), r view("MachineLearning"), and r view("SpatioTemporal"). To minimise overlap we do not include those packages in this list. Given that this CTV is currently fairly short we see little potential for splitting at this stage, but potentially radiometric dating could be a good candidate for having its own CTV in the future.

Maintainers:

Here are the co-maintainers, who have all agreed to the duty and contributed to the composition of the list: Ben Marwick, Bjørn Peare Bartholdy, Nicolas Frerebeau, Sam Leggett, Sarah Pederzani, Joe Roe, Sophie C. Schmidt, Laure Spake, Lisa Steinmann, Liying Wang, Jesse Wolfhagen. We have assembled a diverse group in terms of gender, origin, scientific field, and geographic location.

Thank you for considering our proposal. We look forward to your feedback.

@tuxette
Contributor

tuxette commented Sep 5, 2024

Thanks @benmarwick for this proposal!

  • I am far from being a specialist in the field but, to me, the topic is close to the Paleontology proposal in issue #57 (CRAN task view proposal: Paleontology). Can you explain precisely whether this is a completely different topic or whether there are some overlaps? In the latter case, could the two task views be somehow merged? Your input, @willgearty, would also be helpful.
  • An additional important point is that the proportion of GitHub packages not published on CRAN seems quite high. CRAN task views typically include a few non-CRAN packages, but the proportion should remain low, firstly because these are CRAN task views, and secondly to keep the task view more manageable in terms of maintenance.

@jpiaskowski

Thank you, @tuxette, these echo my concerns.

@willgearty
Contributor

While I agree that one might expect that the scope of this proposed CTV and the scope of the proposed Paleontology CTV could overlap quite a bit, there is actually very little package overlap due to the very different timescales and subject matters of the two fields. In fact, from a very cursory check it looks like we only have 2 shared packages, folio and tidypaleo. I'd argue the two fields are quite distinct in most cases, and that the two proposed CTVs are quite distinct and complementary and therefore shouldn't be merged.

I did also notice that a large proportion of the packages were non-CRAN packages. Perhaps as part of the Archaeology CTV effort, the maintainers could push for more of these packages to be published to CRAN?

@benmarwick
Author

Thank you for your quick reviews and feedback.

Yes, archaeology and paleontology are distinctly different topics. Archaeology is the study of past human behaviors and relationships (i.e. all about people), while paleontology is the study of fossil animals and plants (i.e. excludes people). At US universities, archaeology is almost always part of Anthropology departments, and paleontology is typically in Earth sciences or Biology. Someone with a PhD in paleontology will not be successful in applying for a job as an archaeologist and vice versa.

As @willgearty noted, looking at the two CTV proposals we can see that only two packages appear in both. If these two areas overlapped substantially, we would expect many more than two packages to appear in both CTV proposals. In my opinion, merging the two would make researchers in both fields less likely to trust the CTV as a reliable source of information about useful R packages.

Regarding the high proportion of non-CRAN packages, we can ask the developers to submit to CRAN to increase the proportion of CRAN packages, and we can also remove some non-CRAN packages (i.e. those that have not been updated in a while). I know that some developers on our list are keen on alternatives like r-universe and drat and might never put their package on CRAN. @tuxette and @jpiaskowski, if you could please give us some guidance about what is an acceptable proportion (e.g. under 25% or under 5%, etc.), that would be very much appreciated and would help us strategize for getting more packages onto CRAN.

Thank you again so much for considering our proposal and your feedback.

@tuxette
Contributor

tuxette commented Sep 11, 2024

About your second question, I don't think that there is a formally acceptable proportion of non-CRAN packages but, as a rule of thumb, I would say that non-CRAN packages should probably represent no more than 20 or 25% of packages in the task view.

@benmarwick
Author

@tuxette Thank you so much for that feedback. I'll discuss with my co-maintainers how we can get our percentage of non-CRAN packages under 20% and follow up here when I have some news.
