Compendium metadata¶
Basics¶
Metadata in a compendium is stored in the directory .erc. This directory contains the normative metadata documents, named according to the scheme <PREFIX>_<MODEL>_<VERSION>.<FORMAT>. The parts of the name are filled in from the metadata mapping files found in the broker tool of the o2r metadata tool suite. The default prefix is metadata, e.g. metadata_o2r_1.json, metadata_zenodo_1.json, or metadata_datacite_41.xml. The file with the extracted raw metadata is not versioned and is always named metadata_raw.json.
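The naming scheme can be illustrated with a small parser. This is only a sketch: the function name, the regular expression, and the assumption that versions are purely numeric are not part of the o2r tools and are chosen here for illustration.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern for <PREFIX>_<MODEL>_<VERSION>.<FORMAT>.
# The version part is optional so that the unversioned raw file
# (metadata_raw.json) matches as well. Numeric versions are an assumption.
_NAME = re.compile(
    r"^(?P<prefix>[^_]+)_(?P<model>.+?)(?:_(?P<version>\d+))?\.(?P<format>[^.]+)$"
)


class MetadataFileName(NamedTuple):
    prefix: str
    model: str
    version: Optional[str]
    format: str


def parse_metadata_filename(name: str) -> Optional[MetadataFileName]:
    """Split a metadata file name into its parts, or return None if it does not fit."""
    match = _NAME.match(name)
    if match is None:
        return None
    return MetadataFileName(
        match["prefix"], match["model"], match["version"], match["format"]
    )


for name in ("metadata_o2r_1.json", "metadata_datacite_41.xml", "metadata_raw.json"):
    print(name, "->", parse_metadata_filename(name))
```

For metadata_raw.json the version field is None, which reflects that the raw file carries no version.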
A copy of the files in this directory is kept in the database for easier access, so every compendium returned by the API can contain different sub-properties in the metadata property. This API always returns the database copy of the metadata elements. You can download the respective files to access the normative metadata documents.
Metadata formats¶
The files are available on demand, but metadata variants are created after each metadata update.
The sub-properties of metadata and their content are:
raw
- contains raw metadata extracted automatically
o2r
- holds the main information for display and is modelled according to the o2r metadata model. This metadata is reviewed by the user and is the basis for translating to other metadata formats and also for search.
zenodo
- holds Zenodo metadata for shipments made to Zenodo and is brokered from o2r metadata
zenodo_sandbox
- holds Zenodo metadata for shipments made to Zenodo Sandbox, i.e. a clone of zenodo metadata
Note
The information in each sub-property is subject to independent workflows and may differ from one sub-property to another. The term brokering is used for translation from one metadata format into another.
Metadata validation¶
Only valid metadata can be saved to a compendium.
The o2r metadata element is validated against a JSON Schema using the validate tool of o2r-meta.
The schema file is included in the o2r-meta repository: https://raw.githubusercontent.com/o2r-project/o2r-meta/master/schema/json/o2r-meta-schema.json.
Get all compendium metadata¶
curl https://…/api/v1/compendium/$ID
GET /api/v1/compendium/:id
Abbreviated example response:
200 OK
{
"id":"12345",
"metadata": {
"raw": {
"title": "Programming with Data. Springer, New York, 1998. ISBN 978-0-387-98503-9.",
"author": "John M. Chambers",
…
},
"o2r": {
"title": "Programming with Data",
"creators": [
{
"name": "John M. Chambers"
}
],
"publication_date": 1998,
…
},
"zenodo": {
…
}
},
"created": …,
"files": …
}
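Because the set of sub-properties can differ between compendia, a client may want to check which ones are present before using them. The following is a minimal sketch that works on an already parsed response body; the function name and the shortened sample body are illustrative assumptions.

```python
import json


def metadata_variants(compendium: dict) -> list:
    """Return the names of the metadata sub-properties present in a response."""
    metadata = compendium.get("metadata") or {}
    return sorted(metadata.keys())


# Shortened version of the example response, reduced to a few fields.
body = """
{
  "id": "12345",
  "metadata": {
    "raw": {"author": "John M. Chambers"},
    "o2r": {"title": "Programming with Data"},
    "zenodo": {}
  }
}
"""

compendium = json.loads(body)
print(metadata_variants(compendium))          # ['o2r', 'raw', 'zenodo']
print(compendium["metadata"]["o2r"]["title"])  # Programming with Data
```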
Get o2r metadata¶
The following endpoint returns only the normative o2r metadata element:
curl https://…/api/v1/compendium/$ID/metadata
GET /api/v1/compendium/:id/metadata
200 OK
{
"id":"compendium_id",
"metadata": {
"o2r": {
…
}
}
}
URL parameters¶
:id
- compendium id
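As a sketch of how a client could address this endpoint, the snippet below builds the request with Python's standard library without sending it. The base URL https://example.org and the helper name are placeholders, not part of the API.

```python
from urllib.parse import quote
from urllib.request import Request


def o2r_metadata_request(base_url: str, compendium_id: str) -> Request:
    """Build (but do not send) a GET request for the o2r metadata of a compendium."""
    # The id is percent-encoded so that it is safe to use as a path segment.
    path = f"/api/v1/compendium/{quote(compendium_id, safe='')}/metadata"
    return Request(
        base_url.rstrip("/") + path,
        headers={"Accept": "application/json"},
        method="GET",
    )


# Hypothetical host and id, for illustration only.
request = o2r_metadata_request("https://example.org", "12345")
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would perform the call; the parsed body then contains the o2r element under metadata, as in the response shown above.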
Spatial metadata¶
For discovery purposes, the metadata includes extracted GeoJSON bounding boxes based on data files in a workspace.
Currently supported spatial data sources:
The following structure is made available per file:
"spatial": {
"files": [
{
"geojson": {
"bbox": [
-2.362060546875,
52.0862573323384,
-1.285400390625,
52.649729197309426
],
"geometry": {
"coordinates": [
[
[
-2.362060546875,
52.0862573323384
],
[
-1.285400390625,
52.649729197309426
]
]
],
"type": "Polygon"
},
"type": "Feature"
},
"source_file": "path/to/file1.geojson"
},
{
"geojson": {
"bbox": [
7.595369517803192,
51.96245837645124,
7.62162297964096,
51.96966694957956
],
"geometry": {
"coordinates": [
[
[
7.595369517803192,
51.96245837645124
],
[
7.62162297964096,
51.96966694957956
]
]
],
"type": "Polygon"
},
"type": "Feature"
},
"source_file": "path/to/file2.shp"
}
],
"union": {
"geojson": {
"bbox": [
-2.362060546875,
51.96245837645124,
7.62162297964096,
52.649729197309426
],
"geometry": {
"coordinates": [
[
-2.362060546875,
51.96245837645124
],
[
7.62162297964096,
51.96245837645124
],
[
7.62162297964096,
52.649729197309426
],
[
-2.362060546875,
52.649729197309426
]
],
"type": "Polygon"
},
"type": "Feature"
}
}
}
The spatial key has a union bounding box that wraps all extracted bounding boxes.
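How a wrapping bounding box can be derived from the per-file boxes is sketched below. This is an assumption about the computation, not the service's actual implementation; it takes the usual GeoJSON order [min_lon, min_lat, max_lon, max_lat] and ignores boxes that cross the antimeridian.

```python
def union_bbox(bboxes):
    """Return the smallest [min_lon, min_lat, max_lon, max_lat] box containing all inputs."""
    if not bboxes:
        raise ValueError("at least one bounding box is required")
    min_lons, min_lats, max_lons, max_lats = zip(*bboxes)
    return [min(min_lons), min(min_lats), max(max_lons), max(max_lats)]


# The two per-file bounding boxes from the example structure.
file_bboxes = [
    [-2.362060546875, 52.0862573323384, -1.285400390625, 52.649729197309426],
    [7.595369517803192, 51.96245837645124, 7.62162297964096, 51.96966694957956],
]

print(union_bbox(file_bboxes))
# [-2.362060546875, 51.96245837645124, 7.62162297964096, 52.649729197309426]
```

The result spans the same corner coordinates as the union polygon in the example.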
Update metadata¶
The following endpoint can be used to update the o2r metadata element.
All other metadata sub-properties, i.e. the brokered metadata, are only updated by the service itself.
After creation, the metadata is persisted to both files and database. Updating the metadata via this endpoint can therefore be used to trigger a brokering process, and the different metadata formats can then be retrieved either via this metadata API or by downloading the respective file using the download endpoint.
Metadata update rights
Only authors of a compendium or users with the required user level can update a compendium's metadata.
Metadata update request¶
curl -H 'Content-Type: application/json' \
-X PUT \
--cookie "connect.sid=<code string here>" \
-d '{ "o2r": { "title": "Blue Book" } }' \
/api/v1/compendium/:id/metadata
The request overwrites the existing metadata properties, so the full o2r metadata must be sent as a JSON object with the element o2r at the root, even if only specific fields are changed.
Note
This endpoint only allows updating the metadata.o2r element. All other metadata sub-properties are updated only by the service itself.
URL parameters¶
:id
- compendium id
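Since the request replaces the stored o2r metadata as a whole, a client would typically start from the current o2r element, apply its changes, and send the complete object back. The sketch below builds such a PUT request without sending it; the helper name, the host, and the sample metadata values are assumptions made for illustration.

```python
import copy
import json
from urllib.request import Request


def build_update_request(url, current_o2r, changes, session_cookie):
    """Build (but do not send) a PUT request carrying the full, updated o2r metadata."""
    o2r = copy.deepcopy(current_o2r)  # leave the caller's copy untouched
    o2r.update(changes)               # apply only the changed top-level fields
    body = json.dumps({"o2r": o2r}).encode("utf-8")  # root element must be "o2r"
    return Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Cookie": f"connect.sid={session_cookie}",
        },
    )


request = build_update_request(
    "https://example.org/api/v1/compendium/12345/metadata",  # hypothetical host and id
    current_o2r={"title": "Programming with Data", "description": "Sample text"},
    changes={"title": "Blue Book"},
    session_cookie="<code string here>",
)
print(request.get_method(), request.data.decode("utf-8"))
```

Fields that were not changed, such as the description in this sample, stay in the payload and are therefore not lost when the server overwrites the stored metadata.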
Metadata update response¶
The response contains an excerpt of a compendium with only the o2r metadata property.
200 OK
{
"id":"compendium_id",
"metadata": {
"o2r": {
"title": "Blue Book"
}
}
}
Metadata update error responses¶
401 Unauthorized
{"error":"not authorized"}
400 Incomplete metadata (description property missing)
{
"error":"Error updating metadata file, see log for details",
"log": "[o2rmeta] 20180302.085940 received arguments: {'debug': True, 'tool': 'validate', 'schema': 'schema/json/o2r-meta-schema.json', 'candidate': '/tmp/o2r/compendium/1cAIr/data/.erc/metadata_o2r_1.json'}
[o2rmeta] 20180302.085940 launching validator
[o2rmeta] 20180302.085940 checking metadata_o2r_1.json against o2r-meta-schema.json
[o2rmeta] 20180302.085940 !invalid: None is not of type 'string'
Failed validating 'type' in schema['properties']['description']:
{'type': 'string'}
On instance['description']:
None"
}
400 Bad Request
"SyntaxError [...]"
422 Unprocessable Entity
{"error":"JSON with root element 'o2r' required"}
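A client can turn these responses into a short message for the user. The following sketch assumes only the body shapes shown above (a JSON object with error and optionally log, or a plain string); the function name and the output format are illustrative choices, and the "!invalid:" marker is taken from the example log.

```python
import json


def describe_update_error(status: int, body: str) -> str:
    """Summarise an error response of the metadata update endpoint in one line."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = body  # not JSON: fall back to the raw text
    if not isinstance(payload, dict):
        return f"{status}: {payload}"
    message = payload.get("error", "unknown error")
    # Validation failures appear in the log on lines containing "!invalid:".
    reasons = [
        line.split("!invalid:", 1)[1].strip()
        for line in payload.get("log", "").splitlines()
        if "!invalid:" in line
    ]
    if reasons:
        return f"{status}: {message} ({'; '.join(reasons)})"
    return f"{status}: {message}"


print(describe_update_error(401, '{"error":"not authorized"}'))
print(describe_update_error(422, '{"error":"JSON with root element \'o2r\' required"}'))
```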
Other metadata properties¶
Besides the metadata element, a compendium persists some additional properties to reduce computation on the server and to allow client applications to improve the user experience.
bag
- a boolean showing if the uploaded artefact was detected as a BagIt bag (detection file: bagit.txt)
compendium
- a boolean showing if the uploaded artefact was detected as a compendium (detection file: erc.yml)
Example:
(Properties metadata and files not shown for brevity.)
{
"id": "U9IZ7",
"metadata": {},
"created": "2017-01-01T00:00:42.000Z",
"user": "0000-0002-1825-0097",
"bag": false,
"compendium": false,
"files": {}
}