Showing 2 changed files with 31 additions and 0 deletions.
@@ -0,0 +1,30 @@
# Data Concept

## Motivation
We would like to see ORCESTRA as **our common** field campaign.
Everyone should be able to use the gathered data, together and for mutual benefit.
The goals below draw on what worked and what didn't work during the EUREC4A field campaign and other projects.

### Goals

The goals are sorted by decreasing priority (i.e. 1 is the most important). We **aim for all of them**, but if we have to cut, we should cut from the end.

1. **a *single* list of existing datasets**<br/>
We want a common data collection for our field campaign.
Everyone interested in ORCESTRA should be able to find the available datasets.
For clarity and consistency, there must be exactly one list.
2. **the datasets in the list are *accessible***<br/>
Once someone has found a dataset in the list, that dataset should be usable.
That is, the information in the list must be sufficient for anyone to open the dataset with common tools and little effort.
3. **datasets are *well-formed* and *analysis-ready***<br/>
Useful datasets are typically written once and read often.
Spending a bit more time on creating a dataset reduces the overall effort if it makes later use easier.
4. **incremental backups are possible**<br/>
We expect the ORCESTRA data collection to be a valuable contribution to our scientific field.
We should be able to keep a backup of this collection.
Realistically, the list will evolve over time, so we will have to update any backups incrementally.
5. **datasets are on a shared, distributed system**<br/>
We want the data system to be used in actual scientific work (not only for "data publication").
Traditional systems are often too complicated or slow for day-to-day usage.
A distributed system increases availability and performance (e.g. through local caches and redundant servers), which makes using one's own published data convenient, fast and fun.
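
To make goals 1 and 4 a little more concrete, here is a minimal sketch of how a single dataset list could drive an incremental backup. It is only an illustration under stated assumptions: the `DatasetEntry` fields, the `plan_incremental_backup` helper, the example URLs and the idea of storing a content checksum per dataset are all hypothetical and not part of this concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    """One row of the (hypothetical) single dataset list."""
    name: str      # unique identifier within the list
    url: str       # where the dataset can be opened (assumed field)
    checksum: str  # content fingerprint of the dataset (assumed field)


def plan_incremental_backup(current, backed_up):
    """Compare the current list with the state of an existing backup.

    current   -- iterable of DatasetEntry, the single list as it is now
    backed_up -- dict mapping dataset name to the checksum stored in the backup

    Returns (to_copy, removed): names that are new or changed and must be
    copied, and names that are in the backup but no longer in the list.
    """
    current = list(current)
    to_copy = sorted(
        e.name for e in current if backed_up.get(e.name) != e.checksum
    )
    current_names = {e.name for e in current}
    removed = sorted(n for n in backed_up if n not in current_names)
    return to_copy, removed


if __name__ == "__main__":
    # Placeholder entries; names, URLs and checksums are made up.
    dataset_list = [
        DatasetEntry("dropsondes", "https://example.org/dropsondes.zarr", "c1"),
        DatasetEntry("radiosondes", "https://example.org/radiosondes.zarr", "c2"),
    ]
    previous_backup = {"dropsondes": "c1", "radiosondes": "old"}
    to_copy, removed = plan_incremental_backup(dataset_list, previous_backup)
    print("copy:", to_copy)
    print("removed from list:", removed)
```

Because the plan is derived only from the list and the stored checksums, a backup run has to transfer just the datasets that were added or changed since the previous run.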