Merge branch 'main' of github.com:caltechlibrary/dataset into gh-pages
R. S. Doiel committed Sep 19, 2024
2 parents c6725d5 + cb5806f commit a34a7a2
Showing 42 changed files with 103 additions and 108 deletions.
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -22,6 +22,6 @@ maintainers:
orcid: "https://orcid.org/0000-0001-9266-5146"

repository-code: "https://github.com/caltechlibrary/dataset"
version: 2.1.18
version: 2.1.19
license-url: "https://data.caltech.edu/license"
keywords: [ "GitHub", "metadata", "data", "software", "json" ]
63 changes: 23 additions & 40 deletions TODO.html
@@ -25,18 +25,19 @@
<section>
<h1 id="action-items">Action Items</h1>
<h2 id="bugs">Bugs</h2>
<h2 id="next-prep-for-v2.1.20">Next (prep for v2.1.20)</h2>
<ul class="task-list">
<li><label><input type="checkbox"
checked="" /><code>dataset help init</code> should include examples of
forming a dsn for SQL store dataset collections using SQLite3, MySQL and
PostgreSQL from docs/init.md</label></li>
<li><label><input type="checkbox" />Update datasetd to support
urlencoded data submissions in additional to application/json</label>
<ul>
<li>this would allow a simple data entry system to be build directly
from HTML without the need for JavaScript in the browser</li>
<li>the urlencoded data should support embedded YAML in text areas for
extrapolating more complex data structures</li>
</ul></li>
</ul>
<h2 id="next-prep-for-v2.1.14">Next (prep for v2.1.14)</h2>
<h2 id="someday-maybe">Someday, Maybe</h2>
<ul class="task-list">
<li><label><input type="checkbox" />Need to add getting updated Man
pages using the <code>dataset help ...</code> command</label></li>
<li><label><input type="checkbox" />Need to map path parts to parameter
sequence for calling sql functions in datasetd</label></li>
<li><label><input type="checkbox" />My current approach to versioning is
too confusing, causing issues in implementing py_dataset, versioning
needs to be automatic with a minimum set of methods explicitly
@@ -223,56 +224,42 @@ <h2 id="next-prep-for-v2.1.14">Next (prep for v2.1.14)</h2>
<li><label><input type="checkbox" />prune, remove attachments (including
all versions) from an JSON object in the collection</label></li>
</ul></li>
<li><label><input type="checkbox" />Document example Shell access to
datasetd via cURL</label></li>
<li><label><input type="checkbox" />take KeyMap out of collection.json
so collection.json is smaller</label>
<ul>
<li>support for segmented key maps (to limit memory consumption for very
large collections)</li>
</ul></li>
<li><label><input type="checkbox" />Add support for segmented key maps
(to limit memory consumption for very large collections) settings in
collection.json using keywords of patch, minor, major</label></li>
<li><label><input type="checkbox" />Auto-version attachments by patch,
minor or major release per settings in collection.json using keywords of
patch, minor, major</label></li>
</ul>
<h2 id="someday-maybe">Someday, Maybe</h2>
<ul class="task-list">
<li><label><input type="checkbox" />Look at <a
href="https://github.com/metacall/golang-typescript-example">Metacall</a>
and consider TypeScrit integration into dataset/datasetd</label></li>
minor or major release per</label></li>
<li><label><input type="checkbox" />Need to add getting updated Man
pages using the <code>dataset help ...</code> command</label></li>
<li><label><input type="checkbox" />Allow a WASM module to be used to
validate objects in the collection. It needs to me integrate such that
it “travels” will the dataset collection</label>
<ul>
<li>this would let our JSON collections support explicit JSON structures
as well as ad-hoc JSON objects</li>
<li>could use the YAML model approach in Newt to define the
structures</li>
</ul></li>
<li><label><input type="checkbox" />Review <a
href="https://go-app.dev/">Go-app</a> and see if this would be a way to
create a local client UI for working with datasets and enabling LunrJS
for search</label></li>
<li><label><input type="checkbox" />Document an example Python 3 http
client support for web API implementing a drop in replacement for
py_dataset using the web service or cli</label></li>
<li><label><input type="checkbox" checked="" />Missing tests for
AttachStream()</label></li>
<li><label><input type="checkbox" />Implement a wrapping logger that
takes a verboseness level for output (e.g. 0 - quiet, 1 progress
messages, 2 warnings, errors should always show)</label></li>
<li><label><input type="checkbox" checked="" />Memory consumption is
high for attaching, figure out how to improve memory usage, switched to
using streams where possible</label></li>
<li><label><input type="checkbox" />Add support for https:// based
datasets (in addition to local disc and s3://)</label></li>
<li><label><input type="checkbox" />dsbagit would generate a “BagIt” bag
for preservation of collection objects</label></li>
<li><label><input type="checkbox" />dsgen would take a model described
in YAML and generate HTML and browser side ES6 for quick prototyping
with datasetd</label></li>
<li><label><input type="checkbox" />OAI-PMH importer to prototype iiif
service based on Islandora content driven by a dataset
collection</label></li>
<li><label><input type="checkbox" />Implement version support in the web
service</label></li>
<li><label><input type="checkbox" />Implement an integrated UI for
datasetd</label>
<li><label><input type="checkbox" />Implement an integrated a web UI for
managing dataset collections and their data structures</label>
<ul class="task-list">
<li><label><input type="checkbox" />Form pages could be expressed in
Markdown+YAML for forms and embedded in the datasetd settings YAML
@@ -292,10 +279,6 @@ <h2 id="someday-maybe">Someday, Maybe</h2>
web approach to embedding forms in Markdown combined with some JS glue
code to knit the two together</label></li>
</ul></li>
<li><label><input type="checkbox" />Consider updating datasetd to
support urlencoded data submissions in additional to application/json,
this might make it easier to quicklt develop browser side UI for
datasetd web services</label></li>
</ul>
</section>

35 changes: 14 additions & 21 deletions TODO.md
@@ -5,13 +5,16 @@ Action Items
Bugs
----

- [x] `dataset help init` should include examples of forming a dsn for SQL store dataset collections using SQLite3, MySQL and PostgreSQL from docs/init.md

Next (prep for v2.1.14)
Next (prep for v2.1.20)
-----------------------

- [ ] Need to add getting updated Man pages using the `dataset help ...` command
- [ ] Need to map path parts to parameter sequence for calling sql functions in datasetd
- [ ] Update datasetd to support urlencoded data submissions in additional to application/json
- this would allow a simple data entry system to be build directly from HTML without the need for JavaScript in the browser
- the urlencoded data should support embedded YAML in text areas for extrapolating more complex data structures
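
The urlencoded item above could start from something like the following. This is only a sketch under stated assumptions, not datasetd code: `formToObject` is a hypothetical helper name, and the embedded-YAML handling the item mentions is left out because it would need a third-party YAML package. It uses `url.ParseQuery` from the Go standard library to turn a form body into a map that can be encoded as JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// formToObject is a hypothetical helper (not part of dataset/datasetd).
// It converts an application/x-www-form-urlencoded body into a map that
// can be marshaled as a JSON object. Fields that appear once become
// strings; repeated fields become string slices.
func formToObject(body string) (map[string]interface{}, error) {
	values, err := url.ParseQuery(body)
	if err != nil {
		return nil, err
	}
	obj := make(map[string]interface{}, len(values))
	for key, vals := range values {
		if len(vals) == 1 {
			obj[key] = vals[0]
		} else {
			obj[key] = vals
		}
	}
	return obj, nil
}

func main() {
	// Example body as a browser would submit it from a plain HTML form.
	obj, err := formToObject("title=Field+notes&tag=geology&tag=maps")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	src, err := json.Marshal(obj)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(src))
}
```

In a real handler the body would more likely come from `r.ParseForm()` and `r.PostForm` on the incoming `*http.Request`; the string input here just keeps the sketch easy to run on its own.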

Someday, Maybe
--------------

- [ ] My current approach to versioning is too confusing, causing issues in implementing py_dataset, versioning needs to be automatic with a minimum set of methods explicitly supporting it otherwise versioning should just happen in the back ground and only be supported at the package and libdataset levels.
- [ ] create, read, update, list operations should always reflect the "current" version (objects or attachments), delete should delete all versions of objects as should prune for attachments, this is because versioning suggests things never really get deleted, just replaced.
- [ ] Common dataset verbs (dataset/datasetd)
@@ -103,39 +106,29 @@ Next (prep for v2.1.14)
- [ ] attach, add an attachment to a JSON object in the collection, respect versioning if enabled
- [ ] detach, retrieve an attachment from the JSON object in the collection
- [ ] prune, remove attachments (including all versions) from an JSON object in the collection
- [ ] Document example Shell access to datasetd via cURL
- [ ] take KeyMap out of collection.json so collection.json is smaller
- support for segmented key maps (to limit memory consumption for very
large collections)
- [ ] Auto-version attachments by patch, minor or major release per
- [ ] Add support for segmented key maps (to limit memory consumption for very large collections)
settings in collection.json using keywords of patch, minor, major

Someday, Maybe
--------------

- [ ] Look at [Metacall](https://github.com/metacall/golang-typescript-example) and consider TypeScrit integration into dataset/datasetd
- [ ] Auto-version attachments by patch, minor or major release per
- [ ] Need to add getting updated Man pages using the `dataset help ...` command
- [ ] Allow a WASM module to be used to validate objects in the collection. It needs to me integrate such that it "travels" will the dataset collection
- this would let our JSON collections support explicit JSON structures as well as ad-hoc JSON objects
- [ ] Review [Go-app](https://go-app.dev/) and see if this would be a way to create a local client UI for working with datasets and enabling LunrJS for search
- could use the YAML model approach in Newt to define the structures
- [ ] Document an example Python 3 http client support for web API implementing a drop in replacement for py_dataset using the web service or cli
- [X] Missing tests for AttachStream()
- [ ] Implement a wrapping logger that takes a verboseness level for
output (e.g. 0 - quiet, 1 progress messages, 2 warnings, errors
should always show)
- [X] Memory consumption is high for attaching, figure out how to improve
memory usage, switched to using streams where possible
- [ ] Add support for https:// based datasets (in addition to local disc
and s3://)
- [ ] dsbagit would generate a "BagIt" bag for preservation of collection
objects
- [ ] dsgen would take a model described in YAML and generate HTML and browser side ES6 for quick prototyping with datasetd
- [ ] OAI-PMH importer to prototype iiif service based on Islandora
content driven by a dataset collection
- [ ] Implement version support in the web service
- [ ] Implement an integrated UI for datasetd
- [ ] Implement an integrated a web UI for managing dataset collections and their data structures
- [ ] Form pages could be expressed in Markdown+YAML for forms and embedded in the datasetd settings YAML file
- See my notes on my text oreinted web experiment, yaml2webform.go
- Forms could be render into the htdocs auto-magically saving development effort
- The same forms could then be used server side for validation based on descriptors and JavaScript converted to WASM code
- [ ] A standard JavaScript library could be used to knit the forms to the datasetd web service (sort of a mini-newt)
It would be nice if citesearch was defined by the citesearch.yaml file and some markdown documents taking a text oriented web approach to embedding forms in Markdown combined with some JS glue code to knit the two together
- [ ] Consider updating datasetd to support urlencoded data submissions in additional to application/json, this might make it easier to quicklt develop browser side UI for datasetd web services
2 changes: 1 addition & 1 deletion about.html
@@ -24,7 +24,7 @@

<section>
<h1 id="about-this-software">About this software</h1>
<h2 id="dataset-2.1.18">dataset 2.1.18</h2>
<h2 id="dataset-2.1.19">dataset 2.1.19</h2>
<h3 id="authors">Authors</h3>
<ul>
<li>R. S. Doiel</li>
4 changes: 2 additions & 2 deletions about.md
@@ -14,7 +14,7 @@ authors:
orcid: "https://orcid.org/0000-0001-9266-5146"

repository-code: "https://github.com/caltechlibrary/dataset"
version: 2.1.18
version: 2.1.19
license-url: "https://data.caltech.edu/license"
keywords: [ "GitHub", "metadata", "data", "software", "json" ]

@@ -23,7 +23,7 @@ keywords: [ "GitHub", "metadata", "data", "software", "json" ]
About this software
===================

## dataset 2.1.18
## dataset 2.1.19

### Authors

7 changes: 7 additions & 0 deletions api.go
@@ -114,6 +114,13 @@ func staticRouter(next http.Handler) http.Handler {
responseLogger(r, 403, fmt.Errorf("Forbidden, requested a dot path"))
return
}
// See if we need to set a header of JavaScript or TypeScript files.
if strings.HasSuffix(r.URL.Path, ".js") || strings.HasSuffix(r.URL.Path, ".mjs") {
w.Header().Add("Content-Type", "application/javascript; charset=utf-8")
}
if strings.HasSuffix(r.URL.Path, ".ts") {
w.Header().Add("Content-Type", "application/typescript; charset=utf-8")
}
// If we make it this far, fall back to the default handler
next.ServeHTTP(w, r)
})
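
The header logic added in this hunk can be exercised in isolation. The following is a minimal sketch, not the project's code: `contentTypeFor` and `withScriptTypes` are hypothetical names, and it uses `Header().Set` and `path.Ext` where the commit uses `Header().Add` and `strings.HasSuffix`. The idea is the same: choose the Content-Type before handing the request to the next handler, which should then leave an already-set type alone.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"path"
)

// contentTypeFor returns the Content-Type to force for a request path,
// or "" when the wrapped handler should decide. Hypothetical helper;
// the matching is case-sensitive, like the suffix checks in the hunk.
func contentTypeFor(urlPath string) string {
	switch path.Ext(urlPath) {
	case ".js", ".mjs":
		return "application/javascript; charset=utf-8"
	case ".ts":
		return "application/typescript; charset=utf-8"
	}
	return ""
}

// withScriptTypes is a hypothetical middleware that sets the header
// before calling the wrapped handler.
func withScriptTypes(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if ct := contentTypeFor(r.URL.Path); ct != "" {
			w.Header().Set("Content-Type", ct)
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	// Stand-in for a static file handler: it writes a body and does
	// not choose a Content-Type itself.
	next := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "export {};")
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest("GET", "/modules/app.mjs", nil)
	withScriptTypes(next).ServeHTTP(rec, req)
	fmt.Println(rec.Header().Get("Content-Type"))
}
```

`Set` replaces any earlier value, so the response cannot end up carrying two Content-Type values if an outer handler has already added one; that is the only reason it is used here in place of `Add`.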
2 changes: 1 addition & 1 deletion clean.bat
@@ -2,7 +2,7 @@
REM This is a Windows 10 Batch file for building dataset command
REM from the command prompt.
REM
REM It requires: go version 1.12.4 or better and the cli for git installed
REM It requires: go version 1.23.1 or better and the cli for git installed
REM
DEL /S bin
RMDIR /S bin
4 changes: 2 additions & 2 deletions codemeta.json
@@ -5,11 +5,11 @@
"releaseNotes": "Updated Go version to 1.22 and related packages, query path added to datasetd",
"name": "dataset",
"dateRelease": "2023-07-18",
"dateModified": "2024-09-10",
"dateModified": "2024-09-18",
"codeRepository": "https://github.com/caltechlibrary/dataset",
"issueTracker": "https://github.com/caltechlibrary/dataset/issues",
"license": "https://data.caltech.edu/license",
"version": "2.1.18",
"version": "2.1.19",
"author": [
{
"@type": "Person",
2 changes: 1 addition & 1 deletion dataset.1.html
@@ -130,7 +130,7 @@ <h1 id="examples">EXAMPLES</h1>
dataset delete my_objects.ds &quot;345&quot;

dataset keys my_objects.ds</code></pre>
<p>dataset 2.1.18</p>
<p>dataset 2.1.19</p>
</section>

<footer>
6 changes: 3 additions & 3 deletions dataset.1.md
@@ -1,6 +1,6 @@
%dataset(1) user manual | version 2.1.18 72a96f4
%dataset(1) user manual | version 2.1.19 6f4ea86
% R. S. Doiel and Tom Morrell
% 2024-09-10
% 2024-09-19

# NAME

@@ -114,6 +114,6 @@ implements.
dataset keys my_objects.ds
~~~

dataset 2.1.18
dataset 2.1.19


2 changes: 1 addition & 1 deletion datasetd.1.html
@@ -233,7 +233,7 @@ <h1 id="examples">EXAMPLES</h1>
<pre><code> curl http://localhost:8485/api/t1.ds/keys</code></pre>
<p>In the shell session where datasetd is running press “ctr-C” to
terminate the service.</p>
<p>datasetd 2.1.18</p>
<p>datasetd 2.1.19</p>
</section>

<footer>
6 changes: 3 additions & 3 deletions datasetd.1.md
@@ -1,6 +1,6 @@
%datasetd(1) user manual | version 2.1.18 72a96f4
%datasetd(1) user manual | version 2.1.19 6f4ea86
% R. S. Doiel
% 2024-09-10
% 2024-09-19

# NAME

@@ -222,6 +222,6 @@ In the shell session where datasetd is running press "ctr-C"
to terminate the service.


datasetd 2.1.18
datasetd 2.1.19


4 changes: 2 additions & 2 deletions dsimporter.1.md
@@ -1,6 +1,6 @@
%dsimporter(1) dataset user manual | version 2.1.18 72a96f4
%dsimporter(1) dataset user manual | version 2.1.19 6f4ea86
% R. S. Doiel and Tom Morrell
% 2024-09-10
% 2024-09-19

# NAME

4 changes: 2 additions & 2 deletions dsquery.1.md
@@ -1,6 +1,6 @@
%dsquery(1) dataset user manual | version 2.1.18 72a96f4
%dsquery(1) dataset user manual | version 2.1.19 6f4ea86
% R. S. Doiel and Tom Morrell
% 2024-09-10
% 2024-09-19

# NAME

4 changes: 2 additions & 2 deletions go.mod
@@ -1,11 +1,11 @@
module github.com/caltechlibrary/dataset/v2

go 1.22.5
go 1.23.1

require (
github.com/caltechlibrary/dotpath v0.0.4
github.com/caltechlibrary/dsv1 v0.0.0-20220817192039-7c2741c5699d
github.com/caltechlibrary/pairtree v1.0.3
github.com/caltechlibrary/pairtree v1.0.4
github.com/caltechlibrary/semver v0.0.0-20220817184719-a504da2d5c6a
github.com/glebarez/go-sqlite v1.22.0
github.com/go-sql-driver/mysql v1.8.1
4 changes: 2 additions & 2 deletions go.sum
@@ -4,8 +4,8 @@ github.com/caltechlibrary/dotpath v0.0.4 h1:ghc3XZefLPDhQnUo9oXqvz8vfh52QOE0BjK5
github.com/caltechlibrary/dotpath v0.0.4/go.mod h1:rAu0NPuhTaEa9szxXq92x/JmMmedjsEEdHl5uTxzzs0=
github.com/caltechlibrary/dsv1 v0.0.0-20220817192039-7c2741c5699d h1:SGz3rTkjsp/8tEkFXTRHaJxqyeeDiL210QD69Bn7bd0=
github.com/caltechlibrary/dsv1 v0.0.0-20220817192039-7c2741c5699d/go.mod h1:ajUo9ZOowgXjLfDXeUdMmJit9CZyHUzgaPIHUh9TBcg=
github.com/caltechlibrary/pairtree v1.0.3 h1:ykaydbmdyI1Doszaw0rvScKSXcU7HbotCQpNTlotX7s=
github.com/caltechlibrary/pairtree v1.0.3/go.mod h1:7jeP5TyT9ilM+TTRklwrIbUWI/uGuQFm06vrhmgcS5U=
github.com/caltechlibrary/pairtree v1.0.4 h1:eMr4Ku6BFmrpv5vvnxQ1SDMcNveH8TZn8MWRVPaP7dg=
github.com/caltechlibrary/pairtree v1.0.4/go.mod h1:7jeP5TyT9ilM+TTRklwrIbUWI/uGuQFm06vrhmgcS5U=
github.com/caltechlibrary/semver v0.0.0-20220817184719-a504da2d5c6a h1:3q6ct6FfFDF2dEiW06Ran7iEOZ4d9HBiMbkqJUOO2oU=
github.com/caltechlibrary/semver v0.0.0-20220817184719-a504da2d5c6a/go.mod h1:LxzDpCilL3QqjL8qdirgmlo8riDkSFNsTALfIYYyQtE=
github.com/dustin/go-humanize v1.0.1 h1:GzkhY7T5VNhEkwH0PVJgjz+fX1rhBrR7pRT3mDkpeCY=
2 changes: 1 addition & 1 deletion installer.ps1
@@ -6,7 +6,7 @@
#
param(
[Parameter()]
[String]$VERSION = "2.1.18"
[String]$VERSION = "2.1.19"
)
[String]$PKG_VERSION = [Environment]::GetEnvironmentVariable("PKG_VERSION")
if ($PKG_VERSION) {
2 changes: 1 addition & 1 deletion installer.sh
@@ -4,7 +4,7 @@
# Set the package name and version to install
#
PACKAGE="dataset"
VERSION="2.1.18"
VERSION="2.1.19"
GIT_GROUP="caltechlibrary"
RELEASE="https://github.com/$GIT_GROUP/$PACKAGE/releases/tag/v$VERSION"
if [ "$PKG_VERSION" != "" ]; then
7 changes: 3 additions & 4 deletions libdataset/Makefile
@@ -52,7 +52,6 @@ clean:
@if [ -f "$(LIB_NAME)-amd64.so" ]; then rm "$(LIB_NAME)-amd64.so"; fi
@if [ -f "$(LIB_NAME)-arm64.so" ]; then rm "$(LIB_NAME)-arm64.so"; fi
@if [ -f "$(LIB_NAME).h" ]; then rm "$(LIB_NAME).h"; fi
@if [ -f "$(LIB_NAME)-js.wasm" ]; then rm "$(LIB_NAME)-js.wasm"; fi
@if [ -d "dist" ]; then rm -fR dist; fi
@if [ -d "testout" ]; then rm -fR testout; fi

@@ -65,8 +64,8 @@ save:

# WASM code build is experimental, Python maybe able to load WASM code via wasmer-python, https://github.com/wasmerio/wasmer-python
# This would let me avoid having at have seperate machines to build a libdataset C-shared library.
wasm: $(LIB_NAME).go
env CGO_ENABLED=1 GOOS=js GOARCH=wasm go build -o $(LIB_NAME)-js.wasm $(LIB_NAME).go
#wasm: $(LIB_NAME).go
# env GOOS=js GOARCH=wasm go build -o $(LIB_NAME).wasm $(LIB_NAME).go

release: $(LIB_NAME)$(EXT)
mkdir -p dist/man/man3
@@ -79,7 +78,7 @@ release: $(LIB_NAME)$(EXT)
go build -buildmode=c-shared -o "$(LIB_NAME)$(EXT)" "$(LIB_NAME).go"
cp -v $(LIB_NAME)$(EXT) dist/
cp -v $(LIB_NAME).h dist/
cd dist && zip $(LIB_NAME)-v$(VERSION)-$(OS)-$(ARCH).zip $(LIB_NAME)$(EXT) $(LIB_NAME).h $(LIB_NAME)-js.wasm codemeta.json CITATION.cff README.md LICENSE INSTALL.md
cd dist && zip $(LIB_NAME)-v$(VERSION)-$(OS)-$(ARCH).zip $(LIB_NAME)$(EXT) $(LIB_NAME).h codemeta.json CITATION.cff README.md LICENSE INSTALL.md


.FORCE:
2 changes: 1 addition & 1 deletion libdataset/make.bat
@@ -3,7 +3,7 @@ REM
REM A simple batch file to build the c-shared library and
REM package the Python3 module from the Windows 10 command prompt.
REM
REM Requires: Go v1.23.4 or better
REM Requires: Go v1.23.1 or better
REM Miniconda Python 3.7 or better.
REM Using conda: `conda install git` `conda install m2w64-gcc`
REM