Ingesting

There are four mechanisms for ingesting metadata and data into MyTARDIS:

  1. User Interface: appropriate for an end user ingesting a single experiment with a relatively small amount of data.
  2. REST API: the RESTful API allows creation of Experiments, Datasets and Datafiles, including metadata. Files can be added by POSTing them, through a staging area, or through a shared data storage area. See RESTful API for MyTardis for more information.
  3. Staging Area: appropriate for an end user ingesting a single experiment with larger amounts of data.
  4. Batch Ingestion: typically used by facilities that automatically ingest all metadata from one or more instruments into MyTARDIS.

MyTARDIS supports two XML schemas for importing metadata: METS and a MyTARDIS-specific XML format. METS is the preferred format because it is supported by systems other than MyTARDIS, so it will provide more versatility in the long run.

METS

The Metadata Encoding and Transmission Standard (METS) was recommended by Monash Librarians and ANU in 2008 as the XML description format for MyTARDIS datasets.

For details about the METS format and how it is used by MyTARDIS, please see METS File Format.

Ingestion Script

Metadata can be ingested with a short script that sends a POST request:

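#!/bin/bash
# Register an XML or METS metadata file with a MyTARDIS instance.

file="$1"
username="localdb_admin"
password="secret"
host="http://localhost:8000"
owner="$username"

# Quote every expansion so paths and values containing spaces survive.
curl -F "username=$username" -F "password=$password" -F "xmldata=@$file" -F "experiment_owner=$owner" "$host/experiment/register/"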

To use this script, paste it into a new file (for example, register.sh), make it executable with chmod +x register.sh, then run it as ./register.sh file.xml. The tardis test suite contains several example XML and METS files.
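When several metadata files need registering, the same POST request can be wrapped in a loop. The following is a minimal sketch, not part of MyTARDIS itself: the register_all function, the MYTARDIS_USER, MYTARDIS_PASSWORD and MYTARDIS_HOST environment variables, and the CURL override are all illustrative assumptions. Only the endpoint and form fields come from the script above.

```shell
#!/bin/bash
# Sketch: register every XML file in a directory via /experiment/register/.
# The MYTARDIS_* variable names and the CURL override are assumptions.

username="${MYTARDIS_USER:-localdb_admin}"
password="${MYTARDIS_PASSWORD:-secret}"
host="${MYTARDIS_HOST:-http://localhost:8000}"
CURL="${CURL:-curl}"   # set CURL=echo for a dry run that only prints the command

register_all() {
    local dir="$1" file failed=0
    for file in "$dir"/*.xml; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$file" ] || continue
        echo "Registering $file" >&2
        # --fail makes curl return a non-zero status on HTTP errors.
        if ! $CURL --fail -F "username=$username" -F "password=$password" \
                -F "xmldata=@$file" -F "experiment_owner=$username" \
                "$host/experiment/register/"; then
            echo "Failed: $file" >&2
            failed=$((failed + 1))
        fi
    done
    # Return non-zero if any file failed to register.
    [ "$failed" -eq 0 ]
}

# Run only when a directory is given on the command line.
if [ "$#" -gt 0 ]; then
    register_all "$1"
fi
```

Saved as, say, register_all.sh, this could be run as ./register_all.sh /path/to/xml, or as CURL=echo ./register_all.sh /path/to/xml to preview the requests without sending them. Reading the credentials from the environment keeps them out of the script file, although they still appear in the curl command line.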

Post Processing

MyTARDIS uses the Django signal framework to post-process files. The only post-processing step enabled by default operates on newly created Dataset Files.

Staging Hook

The staging hook is responsible for moving files from the staging area to the data store. It is connected to the django.db.models.signals.post_save signal and runs only for newly created files.

The staging hook is triggered only for files that have a protocol of staging, which signifies that the file is in the TARDIS staging area.

EXIF Metadata extraction

See also

http://www.loc.gov/standards/mets/
Metadata Encoding and Transmission Standard
