
CTS URNs and Work Identifiers: Overview and Perseus Catalog Usage

Alison Babeu edited this page Aug 13, 2015 · 3 revisions

CITE, CTS URNs and the Perseus Catalog

CITE, shorthand for the CITE architecture, stands for “Collections, Indexes, Texts, and Extensions.” It was originally developed to support the Homer Multitext Project and “defines a framework for scholarly reference to the unique cultural phenomena that humanists study”. This architecture includes two different URN formats for citing digital resources: CTS URNs for texts and CITE Collection URNs for “discrete objects”. URNs are defined by RFC 2141 as “persistent, location-independent, resource identifiers.” The CITE architecture also defines two corresponding network services, both of which the Perseus Catalog makes use of: the “canonical text service” or CTS, which uses CTS URNs to uniquely identify and support the retrieval of digital versions of texts, and the CITE Collection service, which uses “Collection URNs to identify and retrieve digital representations of discrete objects.” The identifiers in the Perseus Catalog are thus essentially a collection of CTS-compliant URNs.

The Perseus Catalog uses CTS URNs as the basis for unique identifiers to represent all of the textgroups in the catalog. Textgroup is a term used in the CTS architecture; textgroups represent “traditional, convenient groupings of texts such as ‘authors’ for literary works, or corpus collections for epigraphic or papyrological texts.” While CTS URNs include unique identifiers, they also support multiple titles (in order to support multi-lingual collections). As an example, the textgroup for Homer has the URN urn:cts:greekLit:tlg0012; the work Iliad has the URN urn:cts:greekLit:tlg0012.tlg001; and lastly, an edition of the Iliad, published in 1920, edited by Thomas W. Allen and available in the Perseus Digital Library, has the URN urn:cts:greekLit:tlg0012.tlg001.perseus-grc1. The CTS URN makes use of the preexisting domain identifier tlg0012, which is the identifier for the author Homer in the TLG.
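The three levels in the Homer example (textgroup, work, edition) can be separated mechanically. The following is a minimal Python sketch, not code from the Perseus Catalog; the names `CtsUrn` and `parse_cts_urn` are hypothetical, and the sketch handles only the three levels shown above, ignoring any passage reference that might follow a further colon.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """Hypothetical container for the parts of a CTS URN (illustration only)."""
    namespace: str
    textgroup: str
    work: Optional[str] = None
    version: Optional[str] = None


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into namespace, textgroup, work and version."""
    fields = urn.split(":")
    # Expect "urn", "cts", a namespace, a work component, and optionally
    # a passage component (which this sketch does not interpret).
    if len(fields) not in (4, 5) or fields[0] != "urn" or fields[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace = fields[2]
    # The work component is dot-separated: textgroup[.work[.version]]
    levels = fields[3].split(".")
    if not namespace or not 1 <= len(levels) <= 3 or not all(levels):
        raise ValueError(f"malformed work component in: {urn!r}")
    return CtsUrn(namespace, *levels)


# The three URNs quoted above, from least to most specific.
for example in (
    "urn:cts:greekLit:tlg0012",
    "urn:cts:greekLit:tlg0012.tlg001",
    "urn:cts:greekLit:tlg0012.tlg001.perseus-grc1",
):
    print(parse_cts_urn(example))
```

Levels that are absent from a URN are left as `None`, so the same function accepts a textgroup, a work, or an edition URN.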

The CTS URNs used in the Perseus Catalog reference identifiers for authors and works from the TLG and PHI canons and, in some cases, the Stoa Latin Text Inventory. Such identifiers were used because they have domain-specific meaning for members of the classical community and provide a semantic cue as to the author or work being referenced. They do not indicate that a specific edition from the TLG or PHI canon is being referenced.

The Need To Create New Unique Work Identifiers

Over the course of cataloging in the last nine years, many works have been encountered that lack any canonical identifiers, particularly anonymous works, later classical Latin works (due to the relatively early end date of the PHI), fragmentary works, and works by authors later determined to be fictitious. During the first few years of cataloging, the basic procedure was simply to catalog the work with no identifier. Starting in 2011, when the catalog first became available online through the eXtensible catalog implementation, unique identifiers became increasingly important, ideally so that all cataloged works could ultimately be aggregated.

Because the STOA Registry of Latin literature is expandable, identifiers have been created for Latin works (both cataloged and non-cataloged): those that had previously been cataloged but had no PHI or existing STOA identifier, and new Latin works identified as part of the creation and ongoing expansion of the Authors-Abbreviations-Editions spreadsheet. For Greek works that were not found within the TLG canon (a much smaller number), a basic pattern of tlg-author name (e.g. for the fragmentary lyric poet Sacadas the MODS record included <identifier type="tlg">tlg-sacadas</identifier>) had been used as a placeholder to help keep track of such authors until a more formal system of identifier creation for fragmentary and fictitious authors could be decided upon.

A Provisional Solution

A solution for fragmentary historians without identifiers was developed that makes use of a spreadsheet of data created by the Digital Fragmenta Historicorum Graecorum (DFHG) project. As CTS URNs require a textgroup, it was decided to use FHG as the top-level textgroup for those fragmentary historians within this collection that lacked TLG identifiers. The spreadsheet referenced above was then used to create numeric identifiers for all of the historians within the collection. The numeric identifiers were used as the top-level FHG identifier for each historian, with their works numbered sequentially as with other textgroups. For example, for the historian Evanoridas Eleus, the FHG identifier of 0412 was used, and the one existing work within the collection was encoded as <identifier type="fhg">fhg0412.fhg001</identifier>.
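As an illustration of how such an identifier might be read back out of a MODS record, here is a minimal Python sketch. It assumes the record uses the standard MODS namespace (http://www.loc.gov/mods/v3); the bare `<mods>` wrapper in `SAMPLE_RECORD` and the function name `identifiers_of_type` are hypothetical, and a real record would carry many more elements.

```python
import xml.etree.ElementTree as ET
from typing import List

# Standard MODS v3 namespace; the element name is written in Clark notation.
MODS_IDENTIFIER = "{http://www.loc.gov/mods/v3}identifier"

# Hypothetical, heavily trimmed record containing only the identifier above.
SAMPLE_RECORD = """<mods xmlns="http://www.loc.gov/mods/v3">
  <identifier type="fhg">fhg0412.fhg001</identifier>
</mods>"""


def identifiers_of_type(mods_xml: str, id_type: str) -> List[str]:
    """Return the text of every <identifier> whose type attribute matches."""
    root = ET.fromstring(mods_xml)
    found = []
    for element in root.iter(MODS_IDENTIFIER):
        if element.get("type") == id_type and element.text:
            found.append(element.text.strip())
    return found


print(identifiers_of_type(SAMPLE_RECORD, "fhg"))
```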

Because there have been numerous instances of fragmentary and other collections with authors lacking any formal identifiers, it has provisionally been decided to adapt the current system to accept certain types of patterns for work identifiers, rather than requiring certain semantic identifiers such as TLGs, PHIs or STOAs. Essentially, as long as there is a consistent **textgroup.work** pattern, the system will generate a CTS URN that will allow a work to be ingested into the catalog. For example, a number of authors from other collections, such as the Epicorum Graecorum Fragmenta and Poetae Lyrici Graeci, will make use of textgroups such as EGF and PLG.
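The pattern check described here can be sketched roughly as follows. This is not the catalog's actual ingestion code: the regular expression (letters followed by digits on each side of the dot), the function name `work_urn`, and the choice to pass the CTS namespace in as a parameter are all assumptions made for illustration.

```python
import re

# Assumed shape of a textgroup.work identifier: letters then digits on
# each side of a single dot, e.g. "tlg0012.tlg001" or "fhg0412.fhg001".
TEXTGROUP_WORK = re.compile(r"[A-Za-z]+[0-9]+\.[A-Za-z]+[0-9]+")


def work_urn(identifier: str, namespace: str) -> str:
    """Build a work-level CTS URN from a textgroup.work identifier."""
    if not TEXTGROUP_WORK.fullmatch(identifier):
        raise ValueError(f"not a textgroup.work identifier: {identifier!r}")
    return f"urn:cts:{namespace}:{identifier}"


# Reproduces the Iliad work URN quoted earlier on this page.
print(work_urn("tlg0012.tlg001", "greekLit"))

# An FHG identifier; using the greekLit namespace here is an assumption.
print(work_urn("fhg0412.fhg001", "greekLit"))

# A placeholder such as "tlg-sacadas" does not fit the pattern.
try:
    work_urn("tlg-sacadas", "greekLit")
except ValueError as error:
    print(error)
```

Keeping the check purely structural mirrors the provisional decision above: the prefix carries no required meaning, so new textgroups such as EGF or PLG need no special handling.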

The need for automatically created opaque identifiers to support the use of CTS URNs has been discussed at various times. It will likely become more pressing if the Perseus Catalog is to scale up, whether by supporting broader participation or by adding large numbers of authors and/or works without standard identifiers.

A Suggested Satirical Reading Order:

Home for an overview of the repositories

Basic Steps--overview of what to do

Searching the Catalog--Is this edition already cataloged?

Finding MODS Records--Let's go get some MODS Records!

Saving and Naming MODS Records--Where does my MODS record go?

Enhancing MODS Records--What do I put in my MODS record?

Analytical Cataloging--So what exactly is this FRBR you speak of?

Sample MODS Records

Finding and Downloading Authority Records--What do you mean my author isn't in LCNAF?

Creating and Enhancing Authority Records--Templates, schemplates...

CTS URNs and Work Identifiers--My kingdom for a preexisting canonical work identifier!

Sample MAD Records--So that's what an authority record looks like!
