-
Notifications
You must be signed in to change notification settings - Fork 4
/
dsquery.1.html
157 lines (155 loc) · 5.62 KB
/
dsquery.1.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
<!DOCTYPE html>
<html>
<head>
<title>Dataset Project</title>
<link href='https://fonts.googleapis.com/css?family=Open+Sans' rel='stylesheet' type='text/css'>
<link rel="stylesheet" href="https://caltechlibrary.github.io/css/site.css">
</head>
<body>
<header>
<a href="http://library.caltech.edu" title="link to Caltech Library Homepage"><img src="https://caltechlibrary.github.io/assets/liblogo.gif" alt="Caltech Library logo"></a>
</header>
<nav>
<ul>
<li><a href="/">Home</a></li>
<li><a href="index.html">README</a></li>
<li><a href="LICENSE">LICENSE</a></li>
<li><a href="install.html">INSTALL</a></li>
<li><a href="user_manual.html">User Manual</a></li>
<li><a href="about.html">About</a></li>
<li><a href="search.html">Search</a></li>
<li><a href="https://github.com/caltechlibrary/dataset">GitHub</a></li>
</ul>
</nav>
<section>
<h1 id="name">NAME</h1>
<p>dsquery</p>
<h1 id="synopsis">SYNOPSIS</h1>
<p>dsquery <a href="#options">OPTIONS</a> C_NAME SQL_STATEMENT
[PARAMS]</p>
<h1 id="description">DESCRIPTION</h1>
<p><strong>dsquery</strong> is a tool to support SQL queries of dataset
collections. Pairtree based collections should be index before trying to
query them (see ‘-index’ option below). Pairtree collections use the
SQLite 3 dialect of SQL for querying. For collections using a SQL
storage engine (e.g. SQLite3, Postgres and MySQL), the SQL dialect
reflects the SQL of the storage engine.</p>
<p>The schema is the same for all storage engines. The scheme for the
JSON stored documents have a four column scheme. The columns are “_key”,
“created”, “updated” and “src”. “_key” is a string (aka VARCHAR),
“created” and “updated” are timestamps while “src” is a JSON column
holding the JSON document. The table name reflects the collection name
without the “.ds” extension (e.g. data.ds is stored in a database called
data having a table also called data).</p>
<p>The output of <strong>dsquery</strong> is a JSON array of objects.
The order of the objects is determined by the your SQL statement and SQL
engine. There is an option to generate a 2D grid of values in JSON, CSV
or YAML formats. See OPTIONS for details.</p>
<h1 id="parameters">PARAMETERS</h1>
<dl>
<dt>C_NAME</dt>
<dd>
If harvesting the dataset collection name to harvest the records to.
</dd>
<dt>SQL_STATEMENT</dt>
<dd>
The SQL statement should conform to the SQL dialect used for the JSON
store for the JSON store (e.g. Postgres, MySQL and SQLite 3). The SELECT
clause should return a single JSON object type per row.
<strong>dsquery</strong> returns an JSON array of JSON objects returned
by the SQL query.
</dd>
<dt>PARAMS</dt>
<dd>
Is optional, it is any values you want to pass to the SQL_STATEMENT.
</dd>
</dl>
<h1 id="sql-store-scheme">SQL Store Scheme</h1>
<dl>
<dt>_key</dt>
<dd>
The key or id used to identify the JSON documented stored.
</dd>
<dt>src</dt>
<dd>
This is a JSON column holding the JSON document
</dd>
<dt>created</dt>
<dd>
The date the JSON document was created in the table
</dd>
<dt>updated</dt>
<dd>
The date the JSON document was updated
</dd>
</dl>
<h1 id="options">OPTIONS</h1>
<dl>
<dt>-help</dt>
<dd>
display help
</dd>
<dt>-license</dt>
<dd>
display license
</dd>
<dt>-version</dt>
<dd>
display version
</dd>
<dt>-pretty</dt>
<dd>
pretty print the resulting JSON array
</dd>
<dt>-sql SQL_FILENAME</dt>
<dd>
read SQL from a file. If filename is “-” then read SQL from standard
input.
</dd>
<dt>-grid STRING_OF_ATTRIBUTE_NAMES</dt>
<dd>
Returns list as a 2D grid of values. This options requires a comma
delimited string of attribute names for the outer object to include in
grid output. It can be combined with -pretty options.
</dd>
<dt>-csv STRING_OF_ATTRIBUTE_NAMES</dt>
<dd>
Like -grid this takes our list of dataset objects and a list of
attribute names but rather than create a 2D JSON array of values it
creates CSV representation with the first row as the attribute names.
</dd>
<dt>-yaml STRING_OF_ATTRIBUTE_NAMES</dt>
<dd>
Like -grid this takes our list of dataset objects and a list of
attribute names but rather than create a 2D JSON of values it creates
YAML representation.
</dd>
<dt>-index</dt>
<dd>
This will create a SQLite3 index for a collection. This enables dsquery
to query pairtree collections using SQLite3 SQL dialect just as it would
for SQL storage collections (i.e. don’t use with postgres, mysql or
sqlite based dataset collections. It is not needed for them). Note the
index is always built before executing the SQL statement.
</dd>
</dl>
<h1 id="examples">EXAMPLES</h1>
<p>Generate a list of JSON objects with the <code>_key</code> value
merged with the object stored as the <code>._Key</code> attribute. The
colllection name “data.ds” which is implemented using Postgres as the
JSON store. (note: in Postgres the <code>||</code> is very helpful).</p>
<pre><code>dsquery data.ds "SELECT jsonb_build_object('_Key', _key)::jsonb || src::jsonb FROM data"</code></pre>
<p>In this example we’re returning the “src” in our collection by
querying for a “id” attribute in the “src” column. The id is passed in
as an attribute using the Postgres positional notatation in the
statement.</p>
<pre><code>dsquery data.ds "SELECT src FROM data WHERE src->>'id' = $1 LIMIT 1" "xx103-3stt9"</code></pre>
</section>
<footer>
<span>© 2022 <a href="https://www.library.caltech.edu/copyright">Caltech Library</a></span>
<address>1200 E California Blvd, Mail Code 1-32, Pasadena, CA 91125-3200</address>
<span><a href="mailto:[email protected]">Email Us</a></span>
<span>Phone: <a href="tel:+1-626-395-3405">(626)395-3405</a></span>
</footer>
</body>
</html>