Ingest by File System is a way to ingest hundreds of data collections or whole data repositories at once. It is intended for automated ingests for which the manual Ingest by WebDAV would be too time-consuming.
In order to ingest data collections, the file system MUST conform to the following pre-OCFL structure:
[root]
└── {repository}
    └── data
        ├── 0=ocfl_1.0
        ├── {collection} (1)
        │   ├── {bundle} (2)
        │   │   ├── 0=ocfl_object_1.0 (3)
        │   │   ├── (4)
        │   │   ├── (5)
        │   │   └── v1 (6)
        │   │       ├── (7)
        │   │       ├── (8)
        │   │       └── content (9)
        │   │           ├── dir1 (10)
        │   │           │   ├── file1 (10)
        │   │           │   ├── file2 (10)
        │   │           │   ├── ...
        │   │           ├── dir2 (10)
        │   │           │   ├── file1 (10)
        │   │           │   ├── ...
        │   │           ├── file1 (10)
        │   │           ├── file2 (10)
        │   │           ├── ...
        │   │
        │   ├── {bundle}
        │   │   ├── 0=ocfl_object_1.0
        │   │   └── v1
        │   │       └── content
        │   │           ├── ...
        │   ├── ...
        │
        ├── {collection}
        │   ├── {bundle}
        │   │   ├── 0=ocfl_object_1.0
        │   │   ├── ...
./{collection}/{bundle} is a placeholder for arbitrary folder names.
v1 MUST NOT contain an OCFL inventory file.
v1 MUST NOT contain an OCFL inventory digest file.
v1 MUST contain exactly one folder named content.
Use the Ingest API in order to transform the pre-OCFL structure into a full OCFL structure.
It is also possible to mix pre-OCFL objects with already full OCFL objects under one OCFL storage root. In this case, the already full OCFL objects are ignored by the Ingest API.
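The rules stated for v1 can be checked on the file system before the Ingest API is called. The following is only a minimal sketch: the function name check_pre_ocfl_bundle is hypothetical and not part of the Ingest API, the inventory file names are assumed to start with inventory.json, and "exactly one folder named content" is read here as "v1 holds one folder, and that folder is called content".

```shell
#!/bin/sh
# Hypothetical helper (not part of the Ingest API): checks one {bundle}
# folder against the rules stated for v1. Prints one message per violation
# and returns non-zero if any rule is broken.
# Note: plain POSIX sh has no local variables, so the names below are global.
check_pre_ocfl_bundle() {
    bundle=$1
    status=0

    # The rules presuppose that a v1 folder exists.
    if [ ! -d "$bundle/v1" ]; then
        echo "$bundle: missing v1 folder"
        return 1
    fi

    # v1 MUST NOT contain an inventory file or an inventory digest file.
    # Assumption: both are named inventory.json*.
    for f in "$bundle"/v1/inventory.json*; do
        if [ -e "$f" ]; then
            echo "$bundle: unexpected $f"
            status=1
        fi
    done

    # v1 MUST contain exactly one folder named content
    # (interpreted as: one folder in total, and it is called content).
    folders=$(find "$bundle/v1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
    if [ "$folders" -ne 1 ] || [ ! -d "$bundle/v1/content" ]; then
        echo "$bundle: v1 must contain exactly one folder named content"
        status=1
    fi

    return "$status"
}
```

The function takes the path of a single {bundle} folder, so it can be called in a loop over all bundle folders of a collection. Objects that are already full OCFL objects would be reported as violations by this sketch, because their v1 folder contains inventory files.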
If pre-OCFL content files are added, deleted, or changed at a later time, you MUST restart the file system ingest from the beginning by first deleting all OCFL inventory files and OCFL inventory digest files. For example, on Linux:
linux:[root]/{repository}/data$ find -name 'inventory.json*' -delete
To reindex only the OCFL objects of one collection, or only a single OCFL object, run the same command in the corresponding folder:
linux:[root]/{repository}/data/{collection}$ find -name 'inventory.json*' -delete
linux:[root]/{repository}/data/{collection}/{bundle}$ find -name 'inventory.json*' -delete
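The find commands above match by file name only, so a content file that happens to be called inventory.json would be deleted as well. The following sketch lists the matches first and skips everything stored inside a content folder. The function name reset_inventories and its --delete argument are hypothetical, and the sketch assumes that no {collection} or {bundle} folder is itself named content.

```shell
#!/bin/sh
# Hypothetical helper: lists (default) or deletes OCFL inventory files and
# OCFL inventory digest files below a start folder, skipping any file that
# lives inside a content folder.
# Usage: reset_inventories <start-folder> [--delete]
reset_inventories() {
    start=$1
    action=${2:-list}
    case $action in
        --delete)
            # Same pattern as the commands above, limited to regular files
            # outside content folders.
            find "$start" -type f -name 'inventory.json*' ! -path '*/content/*' -delete
            ;;
        *)
            # Preview only: print what would be deleted.
            find "$start" -type f -name 'inventory.json*' ! -path '*/content/*' -print
            ;;
    esac
}
```

Calling the function without --delete prints the files that would be removed, which makes it possible to review the list before running it again with --delete.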
After the reindex, each OCFL object contains OCFL inventory files with information about identity, fixity, and paths, as well as the creator and creation time of all OCFL content files. For each OCFL inventory file there is an OCFL inventory digest file. Use the WebDAV API to browse and check the resulting OCFL structure.
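Where direct file system access is available, an inventory digest file can also be compared with its inventory file locally. The sketch below is an illustration under assumptions, not a description of what the Ingest API produces: it assumes the digest file is named inventory.json.sha512 and holds the hex digest followed by whitespace and the file name, as the OCFL 1.0 specification describes for digest sidecar files, and that sha512sum from GNU coreutils is installed. The function name verify_inventory_digest is hypothetical. If another digest algorithm is in use, the file suffix and the checksum tool need to be swapped accordingly.

```shell
#!/bin/sh
# Hypothetical helper: compares the digest stored in inventory.json.sha512
# with a freshly computed SHA-512 of inventory.json in the same folder.
# Returns 0 on a match, 1 on a mismatch, 2 if one of the files is missing.
verify_inventory_digest() {
    dir=$1
    [ -f "$dir/inventory.json" ] && [ -f "$dir/inventory.json.sha512" ] || return 2

    # First field of the digest file is the expected digest (compared in lower case).
    expected=$(awk '{ print tolower($1); exit }' "$dir/inventory.json.sha512")
    # sha512sum prints "<digest>  <file name>"; keep only the digest.
    actual=$(sha512sum "$dir/inventory.json" | awk '{ print $1 }')

    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

The function takes the folder that holds both files, for example an OCFL object folder or one of its version folders.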
It is not possible to ingest versioned data collections with this file system ingest. To ingest versioned data collections via the file system, use one of the many available OCFL clients and OCFL validators, and make sure that your OCFL structure conforms to the OCFL specification.