craig-petch/analytics-udd

Jisc Learning Analytics Unified Data Definitions v1.3.2

Introduction

The Unified Data Definitions (UDD) of the Jisc learning analytics project is a vocabulary of the chief data entities of interest to learning analytics: students, courses, modules, and so on, as well as their characteristics. The data coded with this vocabulary is typically extracted from the student record system of a particular college or university.

Along with xAPI recipes, the UDD makes up the core data specification of the Jisc learning analytics architecture.

The main folder (jiscdev/analytics-udd) contains:

  • this ReadMe file that gives an overview of the UDD
  • the details of the UDD licensing arrangements
  • a UDD entity-relationship diagram
  • a link to spreadsheets listing the differences between the previous version of the UDD and version 1.3
  • a consolidated list of the descriptions of each UDD entity.

In addition to the main folder, there are 4 sub-folders:

  • udd: the heart of the specification, with a file for each entity describing its properties in detail. Refer to these files to design data for import into the Learning Data Hub.
  • media: various supporting files, including the E-R diagram source, the changes spreadsheet, guides to the relative importance of UDD properties for the applications, products and services that use the UDD, and a copy of the JACS3 subject classification system.
  • utilities: code fragments and snippets to support the development and use of the UDD.
  • implementation: matters that are not part of the formal UDD specification but are closely related to it, for example the mechanism for handling unofficial extensions to properties in the UDD, and filename conventions for adding data into the Learning Data Hub.

Differences between versions

The development of v1.3 has involved a number of additions and changes. This overview page provides a mapped listing of each change between version 1.2.7 and version 1.3.0. For differences between v1.3.0 and v1.3.1, see the Release Notes.

Data format

UDD data must be UTF-8 encoded. JSON is the preferred data format, but XML and TSV data are also supported. Other formats are not supported.

When providing UDD data, supply the data for different entities in separate files, 1 file per entity, using the UDD filename conventions.
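As a minimal sketch of the two rules above (UTF-8 encoding, one file per entity), the following Python writes each entity's records to its own JSON file. The function name `write_entity_files` and the `<entity>.json` filename pattern are illustrative assumptions, not part of the UDD; substitute the UDD filename conventions when preparing real data.

```python
import json
from pathlib import Path


def write_entity_files(records_by_entity, out_dir):
    """Write one UTF-8 encoded JSON file per entity and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for entity, records in records_by_entity.items():
        # Hypothetical filename pattern; substitute the UDD filename convention.
        target = out_path / f"{entity}.json"
        with target.open("w", encoding="utf-8") as handle:
            # ensure_ascii=False keeps non-ASCII characters as real UTF-8 text.
            json.dump(records, handle, ensure_ascii=False, indent=2)
        written.append(target)
    return written
```

Calling `write_entity_files({"student": [...], "module": [...]}, "out")` would produce `out/student.json` and `out/module.json`, each holding the records for a single entity.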

Diagram

Two entity-relationship diagrams provide an overview of the specification: a brief E-R diagram containing just the primary keys, constraints and foreign key properties, and a full E-R diagram with all the properties.

Core sections

Primary keys

Some entities have uniqueness constraints across multiple properties; for example student_on_course_instance has STUDENT_COURSE_MEMBERSHIP_ID plus COURSE_INSTANCE_ID. The MD files for these entities contain a note to this effect. These entities have a single primary key for ease of processing and to enable field-level extensibility. The data supplier may choose to provide the single primary key, or may choose to leave it blank, in which case it will be generated by the Learning Data Hub loading mechanism.
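A supplier might check the multi-property uniqueness constraint before sending data, along the lines of the sketch below. It is only an illustration: the function name, the `PRIMARY_KEY` placeholder field name, and the use of a random UUID for blank keys are assumptions, and the Learning Data Hub loading mechanism may generate keys quite differently.

```python
import uuid


def fill_primary_keys(rows, key_field, unique_fields):
    """Check composite uniqueness, then fill any blank single primary key."""
    seen = set()
    for row in rows:
        combo = tuple(row[field] for field in unique_fields)
        if combo in seen:
            raise ValueError(f"Duplicate combination for {unique_fields}: {combo}")
        seen.add(combo)
        if not row.get(key_field):
            # Assumption: a random UUID stands in for whatever the
            # Learning Data Hub loader would generate for a blank key.
            row[key_field] = str(uuid.uuid4())
    return rows


# "PRIMARY_KEY" is a placeholder name for the entity's single primary key.
rows = fill_primary_keys(
    [
        {"PRIMARY_KEY": "", "STUDENT_COURSE_MEMBERSHIP_ID": "m1", "COURSE_INSTANCE_ID": "c1"},
        {"PRIMARY_KEY": "k2", "STUDENT_COURSE_MEMBERSHIP_ID": "m1", "COURSE_INSTANCE_ID": "c2"},
    ],
    key_field="PRIMARY_KEY",
    unique_fields=["STUDENT_COURSE_MEMBERSHIP_ID", "COURSE_INSTANCE_ID"],
)
```

Rows that supply their own key keep it; only blank keys are filled, and a repeated combination of the constrained properties raises an error.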

Additional sections

There are also files of code lists extracted from the MD files for machine processing.

Mandatory and optional properties

The properties of the UDD are required in compliant datasets to different degrees. The guide "Mandatory properties in the UDD" outlines the different categories of UDD property. It is available as both Excel and ODF spreadsheets.

Code lists

Some UDD properties consist of code lists. Some have values derived from HESA tables (for HE) or ILR tables (for FE). In general these code lists are mapped to generic UDD code lists, so that they are standardised across data from multiple institutions. To extract code lists from the UDD MD files, you may wish to use the Python utility provided here.

Some code lists will be specific to one institution or to a limited group of institutions. These lists are not included in the UDD and can be generated by the vendor. They can be loaded into the Learning Data Hub via a standard JSON format or can be handled via extensions (see below). An example of the JSON format is:

{"MOD_LEVEL": {"A": null, "C": null, "B": null, "E": null, "D": null, "1": null, "0": null, "3": null, "2": null, "5": null, "7": null, "6": null, "9": null}}
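An institution-specific code list of this shape, a JSON object keyed by property name whose codes each map to null, could be produced with a few lines of Python. This is a sketch only: `build_code_list` is a hypothetical helper, and because the meaning of the null values is not stated here, the code simply reproduces them.

```python
import json


def build_code_list(property_name, codes):
    """Return a JSON string mapping each code of one property to null."""
    # Codes are converted to strings and sorted for a stable order;
    # each code maps to None, which json.dumps serialises as null.
    payload = {property_name: {str(code): None for code in sorted(codes, key=str)}}
    return json.dumps(payload)


code_list_json = build_code_list("MOD_LEVEL", ["A", "B", "C", 0, 1, 2])
```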

Extensions

There is provision for data extensions at the level of property (in other words, "field-level extensions"). Although not strictly part of the UDD, a separate entity is provided and described at extension.md.

Specification development workflow

The simplest way of contributing to the UDD is as follows:

  1. add an issue to the issue tracker to alert everyone to what you are working on and why.
  2. tag the issue with the version milestone of which you'd like the patch to be a part.
  3. make an edit or add a file in this repository, and save it to your own branch. If you prefer, you can fork the whole repository and work in your own repository.
  4. send a pull request once you're done.
  5. the pull request will be discussed at our regular meetings and either merged, or kept in the queue, depending on whether more work is required.

You can do all this through the GitHub GUI, but you're welcome to use any other git tool you prefer.

Particular release versions will get their own branches, but the master branch will always contain the latest agreed release. Releases will be made after the review group has come to an agreement.

Versioning broadly follows the pattern majorVersion.minorVersion.patch. Major versions indicate major data model changes. Minor versions denote changes that can break applications, such as the deletion of properties that were valid in earlier versions. Patches can include the addition of new properties.

Note that some properties will be marked as 'deprecated'. This means that the property is still valid, but will be removed by the next minor version update.
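An application that consumes UDD data could use the versioning scheme described above to decide whether an upgrade needs attention. The sketch below is illustrative only; the function names are hypothetical, and it relies solely on the stated rule that major and minor version changes can break applications while patches do not remove properties.

```python
def parse_version(text):
    """Split a version such as 'v1.3.2' or '1.3.2' into three integers."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return major, minor, patch


def change_level(old, new):
    """Name the most significant version component that differs."""
    old_v, new_v = parse_version(old), parse_version(new)
    for name, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return name
    return "none"


def may_break(old, new):
    """Major and minor changes can break applications; patches should not."""
    return change_level(old, new) in ("major", "minor")
```

For example, `may_break("1.2.7", "1.3.0")` is true because the minor version changed, whereas `may_break("1.3.0", "1.3.1")` is false.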

Acknowledgements

Many thanks to all contributors who have raised issues, sent pull requests, commented and made suggestions. The UDD specification is the achievement of all of you.

  • @alanepaull
  • @andrewhickey
  • @arc12
  • @christoffballard
  • @ds10
  • @gryglbrt
  • @ht2
  • @huwrobertsjisc
  • @jfmullaney
  • @michaelwebjisc
  • @MiroslavKratchounov
  • @robwynj
  • @ryansmith94
  • @sandeepmjay
  • @wilmTap

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
