Harvard LibraryCloud (JSON API)

vogelsgesang edited this page Apr 9, 2014 · 6 revisions

We decided to use the Harvard LibraryCloud API, which is offered under the Creative Commons Zero license (CC0).

This repository contains information from the Harvard Library Bibliographic Dataset, which is provided by the Harvard Library under its Bibliographic Dataset Use Terms and also includes data made available by, among others, OCLC Online Computer Library Center, Inc. and the Library of Congress. This dataset contains over 12 million bibliographic records. The data is also available for bulk download. More in-depth documentation of the original data is available here.

Requests to this API are sent as GET requests, and data is returned in JSON format. The returned data is extracted from Harvard's bibliographic data, which is stored in the older MARC 21 format. The API offers search functionality on top of the MARC 21 records and renames MARC 21 fields to human-readable names. In addition, the original MARC 21 record is returned.

There are two ways to access the data:

  • Using the LibraryCloud Query Builder: it helps you quickly build requests and view responses without leaving your browser.
  • Querying from the browser: use the following base URI:
http://librarycloud.harvard.edu/v1/api/item/
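
Because the API is a plain GET endpoint that returns JSON, the base URI can also be queried from a script. The following is a minimal sketch using only the Python standard library. The function name `fetch_items` and its `opener` parameter are illustrative assumptions, not part of LibraryCloud; the `opener` hook exists only so the function can be exercised without network access.

```python
import json
import urllib.request

# Base URI as given on this page.
BASE_URI = "http://librarycloud.harvard.edu/v1/api/item/"


def fetch_items(query_string, opener=urllib.request.urlopen):
    """Send a GET request to the item endpoint and decode the JSON body.

    `query_string` is everything after the "?", for example
    "filter=keyword:internet". `opener` is a hypothetical hook that
    defaults to urllib's urlopen and can be swapped out for testing.
    """
    url = BASE_URI + "?" + query_string
    with opener(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_items("filter=keyword:internet")` would issue the same request as entering the corresponding URL in the browser, assuming the service is reachable.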

Querying techniques

Basic Query

The API allows 3 queries per second from a single IP address. To run a search, specify the filter parameter, followed by the field to search against and the search term, as shown below.

http://librarycloud.harvard.edu/v1/api/item/?filter=keyword:internet

The result of this query is returned in JSON format. Query terms must follow certain rules, and the rules for creating well-formed query terms differ according to whether the field performs exact or keyword matching.

Keyword searches | Exact searches
Case insensitive | Case sensitive
Truncation only at word boundaries allowed | Only full values accepted
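
A basic query URL of the form shown above can also be assembled programmatically. The sketch below is only an illustration: the helper name `basic_query_url` is an assumption, and it percent-encodes the search term the same way the URLs on this page do (spaces become %20, the colon between field and term is left as is).

```python
from urllib.parse import quote

# Base URI as given on this page.
BASE_URI = "http://librarycloud.harvard.edu/v1/api/item/"


def basic_query_url(field, term):
    """Build a single-filter query URL such as ?filter=keyword:internet.

    The colon separating field and term is kept unescaped; other
    reserved characters (for example spaces) are percent-encoded.
    """
    return BASE_URI + "?filter=" + quote(f"{field}:{term}", safe=":")


print(basic_query_url("keyword", "internet"))
```

Since exact-matching fields are case sensitive and accept only full values, the term should be passed through unchanged for those fields rather than being lowercased or truncated.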

Base Fields

The following table lists some of the fields that can be searched. Complete documentation is available on the web.

Field name | Field description
keyword | Almost all of a record's fields get copied to this field. This is the place to start if you don't know where to start. Keyword matching.
id | The identifier given to the item here in LibraryCloud. Exact matching.
title | The title and/or subtitle of the item. Exact matching.
title_keyword | The title and/or subtitle of the item. Keyword matching.

Here is an example of a subject search:

http://librarycloud.harvard.edu/v1/api/item/?filter=lcsh_keyword:computer%20networks

This query searches for records whose subject headings contain the term computer networks. The response reports 16935 matching records (num_found). By default, the result set starts at record 0 and is limited to 25 records; both values can be changed.

{
  num_found: 16935,
  start: "0",
  limit: "25",
  sort: "shelfrank desc",
  filter: "lcsh_keyword:computer\ networks",
  docs: [
    {
      lcsh: [
        "Transplantation immunology Periodicals.",
        "Transplantation of organs, tissues, etc. Periodicals.",
        "Transplantation.",
        "Transplantation Immunology.",
        "Periodicals.",
        "Computer network resources.",
        "Electronic journals."
      ],
      <snip>
    },
  ]
}
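
To work with such a response in code, the JSON body can be decoded and the paging information read from its top-level keys. The following sketch uses a trimmed copy of the response above, rewritten as strict JSON; the helper name `summarize` is an assumption for illustration. Note that start and limit arrive as strings in the response shown, so they are converted to integers here.

```python
import json

# Trimmed copy of the response shown above, rewritten as strict JSON
# (quoted keys, no <snip>). A real response carries more fields per record.
SAMPLE_RESPONSE = """
{
  "num_found": 16935,
  "start": "0",
  "limit": "25",
  "sort": "shelfrank desc",
  "docs": [
    {"lcsh": ["Computer network resources.", "Electronic journals."]}
  ]
}
"""


def summarize(response_text):
    """Extract the paging information and all subject headings."""
    data = json.loads(response_text)
    return {
        "num_found": int(data["num_found"]),
        "start": int(data["start"]),   # returned as a string, e.g. "0"
        "limit": int(data["limit"]),   # returned as a string, e.g. "25"
        # Flatten the lcsh lists of all returned records into one list.
        "subjects": [s for doc in data.get("docs", []) for s in doc.get("lcsh", [])],
    }


summary = summarize(SAMPLE_RESPONSE)
print(summary["num_found"], summary["subjects"])
```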

Other Parameters

The API allows filtering the results on multiple criteria and supports pagination and sorting of the results.

Parameter name | Parameter description
filter | You can narrow queries by using filters. Syntax: fieldname:filter (example: language:English). Multiple filter parameters can be provided.
limit | Number of records to return. Default is 25. Max is 250.
start | The starting point in the result set. Default is 0.
sort | Specifies the sort order. Default: “shelfrank desc”. This parameter is undocumented and not officially supported!
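
The parameters in this table can be combined into one query string. The sketch below shows one possible way to do that with the Python standard library. The function name `build_query_url` is an assumption, and the limit check only mirrors the maximum of 250 stated in the table; how the API itself reacts to larger values is not covered here.

```python
from urllib.parse import quote, urlencode

# Base URI as given on this page.
BASE_URI = "http://librarycloud.harvard.edu/v1/api/item/"
MAX_LIMIT = 250  # maximum number of records per request, per the table above


def build_query_url(filters, limit=None, start=None, sort=None):
    """Combine several filters with optional limit, start, and sort.

    `filters` is a list of "fieldname:filter" strings; each one becomes
    its own filter parameter. Parameters left as None are omitted, so
    the API defaults apply.
    """
    if limit is not None and not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    params = [("filter", f) for f in filters]
    if limit is not None:
        params.append(("limit", limit))
    if start is not None:
        params.append(("start", start))
    if sort is not None:
        params.append(("sort", sort))
    # Keep ":" readable and encode spaces as %20, as in the URLs on this page.
    return BASE_URI + "?" + urlencode(params, safe=":", quote_via=quote)


print(build_query_url(["keyword:internet", "language:German"],
                      sort="score_downloads desc"))
```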

Here are some examples:

http://librarycloud.harvard.edu/v1/api/item/?filter=keyword:internet&start=0&filter=language:German&sort=score_downloads%20desc

Result:

num_found: 1830,
start: "0",
limit: "25",
sort: "score_downloads desc",
filter: [
  "keyword:internet",
  "language:German"
],

http://librarycloud.harvard.edu/v1/api/item/?filter=keyword:internet&filter=note_keyword:michigan&facet=holding_libs
http://librarycloud.harvard.edu/v1/api/item/?filter=keyword:internet&limit=10&start=30&sort=score_downloads%20desc

The last query searches for items containing the term internet, sorts them in descending order by score_downloads, and returns 10 items starting at record 30.
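
The limit and start parameters can be combined to walk through a complete result set page by page. The following is a rough sketch under two assumptions: that start is a zero-based record offset (consistent with its default of 0), and that pausing a fixed third of a second between requests is an acceptable way to stay within the limit of 3 queries per second. The names `page_starts`, `iterate_pages`, and the `fetch_page` callback are hypothetical.

```python
import time

MAX_LIMIT = 250            # maximum page size, per the parameter table
MIN_INTERVAL = 1.0 / 3.0   # pause between requests: 3 queries per second per IP


def page_starts(num_found, limit=25):
    """Return the start offsets needed to cover num_found records."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    return range(0, num_found, limit)


def iterate_pages(fetch_page, num_found, limit=25, sleep=time.sleep):
    """Call fetch_page(start, limit) for every page, pausing in between.

    `fetch_page` is a caller-supplied function (hypothetical here) that
    performs the actual request for one page and returns its result.
    """
    for index, start in enumerate(page_starts(num_found, limit)):
        if index > 0:
            sleep(MIN_INTERVAL)  # simple, conservative throttle
        yield fetch_page(start, limit)


# Offsets needed to read 60 records with the default page size of 25.
print(list(page_starts(60)))
```

In practice, num_found would be taken from the first response, and `fetch_page` would build and send a request with the given start and limit values.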