Having a common vocabulary for identifying places in Indonesia is essential for synergising development efforts across multiple stakeholders. However, at present, different organizations refer to the same places by different names. Additionally, existing efforts to identify places in Indonesia, such as GeoNames, are generally incomplete and may not reflect the actual structure of administrative divisions in Indonesia. Thankfully, Linked Data makes it possible to align these disparate representations using predicates like owl:sameAs.
This repository aims to create a reference for identifying administrative divisions in Indonesia for use in Linked Data applications, such as BenangMerah. BenangMerah uses this data to link places in Indonesia with statistics about the places as well as social projects and organizations active in those places.
The contents of this repository are as follows:
- A script to generate RDF triples from reference documents, using node.js.
- Reference documents to generate the triples from.
- The resulting RDF triples, in Turtle format.
Additionally, a set of URI conventions is used to identify the Indonesian administrative divisions referenced in the triples. They are described in this readme.
A custom (i.e., not directly based on any other ontology) OWL ontology (Tbox) is used to describe the concepts needed to describe administrative divisions in Indonesia. OWL classes are used to represent the classes of administrative divisions: Provinsi, Kabupaten, Kota, Kecamatan, Distrik, Desa, Kelurahan, etc. OWL object properties are used to denote the parent-child relationships in the hierarchy of administrative divisions.
At the moment, the ontology is available in Turtle format from this repository. However, in the future, the BenangMerah ontology will be split off into a different repo. The URIs used will stay the same.
The instances RDF graph (Abox) is generated by a custom node.js script from a CSV file. The CSV was extracted using Tabula from the PDF of Buku Induk Kode dan Wilayah Administrasi Pemerintahan Per Provinsi, Kabupaten/Kota dan Kecamatan Seluruh Indonesia, which was legalised as Lampiran I Permendagri No. 18/2013. Several mistruncated words were corrected based on information in the document itself, and abbreviations were expanded.
Note that the Permendagri does not include recently established divisions, such as the province of Kalimantan Utara and many kabupaten around Indonesia. Nonetheless, this knowledge base uses the Permendagri as its basis. Other possible sources, such as http://kodepos.nomor.net/, will be incorporated in the future.
URIs are used to identify Linked Data resources, in this case the Indonesian administrative divisions. Each division is referred to by two URIs, with the equivalence of the URIs asserted using owl:sameAs.
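As a rough illustration of this equivalence, the sketch below formats one owl:sameAs statement in Turtle for a pair of URIs. The helper name sameAsTurtle is hypothetical, and the example pair (code 32 alongside jawa-barat) is assumed for illustration only; it is not read from the generated data.

```javascript
// Hypothetical helper: formats a single owl:sameAs statement in Turtle.
// It does no escaping, so it assumes both arguments are plain, valid URIs.
function sameAsTurtle(uriA, uriB) {
  return '<' + uriA + '> <http://www.w3.org/2002/07/owl#sameAs> <' + uriB + '> .';
}

// Illustrative pair only; not taken from the repository's output.
var statement = sameAsTurtle(
  'http://benangmerah.net/place/idn/bps/32',
  'http://benangmerah.net/place/idn/jawa-barat'
);
console.log(statement);
```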
The Indonesian government maintains numeric codes for administrative divisions. These codes are reused by other governmental bodies in their own datasets. As such, URIs based on these codes are preferred for linking government-sourced datasets. The URI pattern is as follows:
http://benangmerah.net/place/idn/bps/[bps-code]
bps-code refers to the BPS code, which is a two-digit (for provinces), four-digit (for kabupaten/kota), or six-digit (for kecamatan) number.
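The pattern above can be sketched in a few lines of JavaScript. The helper names bpsLevel and bpsUri are hypothetical and not part of this repository's script; the sketch assumes codes are passed as strings of digits and infers the level only from the code length described above.

```javascript
var BPS_BASE = 'http://benangmerah.net/place/idn/bps/';

// Hypothetical helper: infers the division level from the code length.
function bpsLevel(code) {
  if (!/^\d+$/.test(code)) {
    throw new Error('BPS code must consist of digits only: ' + code);
  }
  switch (code.length) {
    case 2: return 'provinsi';
    case 4: return 'kabupaten-kota';
    case 6: return 'kecamatan';
    default: throw new Error('Unexpected BPS code length: ' + code);
  }
}

// Hypothetical helper: builds the code-based URI after validating the code.
function bpsUri(code) {
  bpsLevel(code); // throws if the code is malformed
  return BPS_BASE + code;
}

console.log(bpsUri('3273'), bpsLevel('3273'));
```

Passing codes as strings rather than numbers keeps the digit count intact, which is what the level check relies on.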
Since administrative divisions follow a hierarchy, much like files and directories in a filesystem, a similar way of addressing is used. The base URI pattern is as follows:
http://benangmerah.net/place/idn/[provinsi]/[kabupaten-kota]/[kecamatan]
Where:
- provinsi is the slugified-name of the province, according to the Permendagri. Note that:
  - Daerah Istimewa Yogyakarta, referred to as Daista Yogyakarta in the Permendagri and DI Yogyakarta by BPS, is written as di-yogyakarta, not daerah-istimewa-yogyakarta, daista-yogyakarta, yogyakarta, nor diy.
  - DKI Jakarta, on the other hand, is written as dki-jakarta, not daerah-khusus-ibukota-jakarta, jakarta, nor dki.
  - Aceh is written as aceh, as it is its official name according to UU No. 11/2006.
- kabupaten-kota is the slugified-name of the kabupaten/kota, including the word kabupaten or kota. The abbreviation Kab. in the Permendagri is expanded. Note that the subdivisions of DKI Jakarta are officially termed "Kota Administratif" and "Kabupaten Administratif".
- kecamatan is the slugified-name of the kecamatan/distrik, not including the word kecamatan nor distrik.
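A minimal sketch of how the hierarchical pattern composes, assuming the segments have already been slugified according to the rules above. The helper name placeUri is hypothetical, and the example segments are illustrative rather than taken from the generated triples.

```javascript
var PLACE_BASE = 'http://benangmerah.net/place/idn/';

// Hypothetical helper: joins already-slugified segments into a hierarchical URI.
// kabupatenKota and kecamatan are optional, but a kecamatan needs its parent.
function placeUri(provinsi, kabupatenKota, kecamatan) {
  if (!provinsi) {
    throw new Error('provinsi is required');
  }
  if (kecamatan && !kabupatenKota) {
    throw new Error('kecamatan requires a kabupaten-kota segment');
  }
  var segments = [provinsi];
  if (kabupatenKota) { segments.push(kabupatenKota); }
  if (kecamatan) { segments.push(kecamatan); }
  return PLACE_BASE + segments.join('/');
}

// Illustrative segments only.
console.log(placeUri('di-yogyakarta'));
console.log(placeUri('jawa-barat', 'kota-bandung', 'coblong'));
```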
The slugified-name form of a place name is generated using the slugify function of underscore.string.
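To give a feel for what a slugified name looks like, here is a simplified stand-in. It is an approximation written for this readme, not the underscore.string implementation, and it may behave differently on accented or otherwise unusual characters. The name simpleSlugify is hypothetical.

```javascript
// Simplified stand-in for a slugify function (NOT underscore.string itself):
// lowercases, turns each run of non-alphanumeric characters into one hyphen,
// and trims hyphens from both ends.
function simpleSlugify(name) {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

console.log(simpleSlugify('DKI Jakarta'));
console.log(simpleSlugify('Kabupaten Bandung Barat'));
```

Note that abbreviations such as Kab. would need to be expanded before slugifying, since this function only normalises characters.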
These URI conventions can be compared to other ontologies/resources:
- GeoNames uses codes for places, appended to the base GeoNames URI.
- DBPedia uses Wikipedia titles.
Each resource has an rdfs:label giving its name according to the Permendagri.
The instances RDF graph is available in Turtle format from this repository.
As a CLI script:
node main.js [-o turtle_output_file_name]
As a module:
var wilayah = require('benangmerah-wilayah');

// Load the triples into an in-memory store.
wilayah.getTripleStore(function(err, tripleStore) {
  if (err) { throw err; }
  // tripleStore is an instance of N3Store containing the triples
});

// Write the triples to a Turtle file.
wilayah.writeTriples('turtle_output_file_name', function(err) {
  if (!err) {
    // Turtle successfully written
  }
});
BenangMerah is an effort to collect data on social development in Indonesia into a knowledge base based on Semantic Web/Linked Data technologies.
BenangMerah is developed by Andhika Nugraha, a student at Institut Teknologi Bandung.