REMI Database: A Data Management System for Nuclear Medicine Studies

Antall Johann Fernandes

Florida Institute of Technology

afernandes2010@my.fit.edu

30th November, 2011

Antall Johann Fernandes (Florida Tech.) REMI Database 30th November, 2011 1 / 43

REMI

Facts about REMI

Developed for the Nuclear Imaging Group of the Life Sciences Division at Lawrence Berkeley National Laboratory, California.

Supported by a subcontract on NIH grant 5-R01-EB007219-04, "Molecular Imaging of Cardiac Hypertrophy Using microPET and Pinhole SPECT".

Currently in production, housing 180 experiments and 8105 data files.

Motivation

Typical Work Process by the Nuclear Imaging Group

Perform a SPECT, DT-MRI or Micro-PET experiment.

Copy experimental data to individual researcher’s workstation.

Perform data analysis, image reconstruction and various other processing tasks on experimental data.

Extract results from processed data.

Publish results and findings based on results.

What happens to the experiment raw data or processed data?

Data resides on the individual researcher's workstation under a researcher-generated directory structure.

Data is moved or copied to a central file server, into a researcher-generated location and directory structure.

Data about the experimental data may or may not be captured.

Knowledge about Data

What experimental data has been collected?

Where does all of the experimental data reside?

Data Sharing

Can we share the experimental data with others?

The Need for a System

Aware of all the experimental data captured over the years.

Knowledge of the location of the experimental data.

Provide the capabilities to share the experimental data.

Sharing Experimental Data

Data should be stored in a standardized manner.

Data should be query-able.

Meta-data needs to be captured in a timely manner. (Meta-data is explained later.)

Users should not need to deal with file systems.

Design Considerations

Typical Scientific Data Management Systems should have...

Creation of logical collections - physical data to logical collections.

Physical data handling - storage of the physical data files.

Security support - data access authorization and change verification.

Data ownership - who is responsible for data quality and meaning.

Persistence - data lifetime.

Knowledge and information discovery - identify useful information inside the data collection.

Meta-data collection, management and access.

REMI’s Design Considerations

Open access to downloading of experimental data.

Users should be able to access the database from anywhere.

Authorized researchers should be able to upload experimental data.

REMI should handle the physical storage of data files (explained later).

Understanding Experiments and Their Data

SPECT Experiment

Data and Meta-Data

Patient Coordinate System

Detector Coordinate System

Meta-data

What is Meta-data?

Information about data.

Describes how data is measured, acquired, and computed.

Enables data browsing, data transfer, and data documentation.

Makes it possible to build data-independent automated tools.

Entities and Relationships

Entity

A "thing" or "object" in the real world that is distinguishable from all other objects, e.g. a Person or a Vehicle.

An entity has a set of properties whose values identify it.

Relationship

An association among several entities, e.g. a Person owns a Vehicle.

Entity-Relationship Data Model

Based on a perception of the real world as a set of basic objects, called entities, and of relationships among these objects.

Mapping Cardinalities

Represents constraints on the relationship.

Expresses the number of entities to which another entity can be associated via a relationship.
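
As a rough illustration of the Person/Vehicle example and of a one-to-many mapping cardinality, the sketch below models both entities as Python dataclasses. The class layout, the `plate` property and the `owns` method are illustrative assumptions, not part of REMI's schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vehicle:
    plate: str  # hypothetical property whose value identifies the entity


@dataclass
class Person:
    name: str
    # One-to-many "owns" relationship: one Person, zero or more Vehicles.
    vehicles: List[Vehicle] = field(default_factory=list)

    def owns(self, vehicle: Vehicle) -> None:
        """Associate a Vehicle with this Person."""
        self.vehicles.append(vehicle)


alice = Person("Alice")
alice.owns(Vehicle("ABC-123"))
alice.owns(Vehicle("XYZ-789"))
print([v.plate for v in alice.vehicles])
```

A one-to-one cardinality would replace the list with a single optional reference; many-to-many would need a reference list on both sides or a separate association record.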

REMI Meta-data Database Development

Logical Entity Relationship Diagram

Physical Table Design Diagram

Physical Table Design Diagram: Machine

Physical Table Design Diagram: Tracer

Physical Table Design Diagram: File Tag

Physical Table Design Diagram: File Association

Owner ID maps to the entity which owns the File.

Physical Table Design Diagram: User

REMI File Storage Development

REMI’s Design Consideration (mentioned earlier)

REMI should handle the physical storage of data files.

File Storage Schemes

Single Directory: all files under a single directory.

Directory Hierarchy: an individual directory for each experiment.

REMI File Storage

Concatenate the experiment ID and filename and generate a 64-character SHA-256 hex string.

Create the directory structure based on the SHA-256 string, directory levels, and directory name size.

The parameter directory name size determines how many characters of the hash value are used for each directory name.

The parameter directory levels determines the depth of the directory structure.

Example 1

SHA256 string: fb80f1735b3153e7a41d38390de5d9773c35259965..
directory levels : 2
directory name size : 2
=> ROOT DIR/fb/80/<experiment id> <filename>.<extension>

REMI File Storage: Example 2

SHA256 string: fb80f1735b3153e7a41d38390de5d9773c35259965..
directory levels : 2
directory name size : 3
=> ROOT DIR/fb8/0f1/<experiment id> <filename>.<extension>

REMI File Storage: Example 3

SHA256 string: fb80f1735b3153e7a41d38390de5d9773c35259965..
directory levels : 3
directory name size : 2
=> ROOT DIR/fb/80/f1/<experiment id> <filename>.<extension>
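
The three examples can be reproduced with a short Python sketch. It is not REMI's actual code: the function names are hypothetical, and two details the slides do not specify are assumed here, namely that the experiment ID and filename are joined without a separator before hashing, and that the stored file name joins them with an underscore.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_dirs(digest, directory_levels, directory_name_size):
    """Cut the leading characters of a hex digest into directory names."""
    return [
        digest[i * directory_name_size:(i + 1) * directory_name_size]
        for i in range(directory_levels)
    ]


def storage_path(root, experiment_id, filename,
                 directory_levels=2, directory_name_size=2):
    """Build a storage path from the SHA-256 of experiment ID + filename."""
    # Assumption: ID and filename are concatenated with no separator.
    key = f"{experiment_id}{filename}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()  # 64 hex characters
    dirs = hashed_dirs(digest, directory_levels, directory_name_size)
    # Assumption: the stored name joins ID and filename with an underscore.
    return PurePosixPath(root, *dirs, f"{experiment_id}_{filename}")


# Truncated digest taken from the examples above.
prefix = "fb80f1735b3153e7a41d38390de5d9773c35259965"
print(hashed_dirs(prefix, 2, 2))  # ['fb', '80']
print(hashed_dirs(prefix, 2, 3))  # ['fb8', '0f1']
print(hashed_dirs(prefix, 3, 2))  # ['fb', '80', 'f1']
```

With two levels of two hex characters each, files spread over at most 256 x 256 = 65,536 leaf directories, which keeps any single directory small without requiring a per-experiment hierarchy.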

File Storage

directory levels : 2
directory name size : 2

REMI Application Development

Rails Model-View-Controller (MVC) Architecture

System Design

System Component Design

Reasons to upload Data Files in Parts

Web Browsers can upload a maximum of 4 GB of data per request.

Certain data files are in excess of 50 GB in size.

File Upload Process

File Upload Process: Save the chunk files

File Upload Process: Concatenate the chunk files
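
The two steps named above, saving the chunk files and then concatenating them, might look roughly like the following Python sketch. REMI itself is a Rails application, so this only illustrates the idea; the chunk naming scheme, the function names and the tiny chunk size are assumptions.

```python
import tempfile
from pathlib import Path


def save_chunks(data, chunk_size, chunk_dir):
    """Write each slice of the payload to its own numbered chunk file."""
    paths = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        path = chunk_dir / f"chunk_{index:06d}"  # hypothetical naming scheme
        path.write_bytes(data[start:start + chunk_size])
        paths.append(path)
    return paths


def concatenate_chunks(chunk_paths, target):
    """Append the chunk files to the target file in upload order."""
    with target.open("wb") as out:
        for path in chunk_paths:
            out.write(path.read_bytes())


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    payload = bytes(range(256)) * 10  # 2560 bytes standing in for a data file
    chunks = save_chunks(payload, 1000, tmp_dir)
    final = tmp_dir / "assembled.dat"
    concatenate_chunks(chunks, final)
    print(len(chunks), final.read_bytes() == payload)
```

In a real upload each chunk would arrive in its own HTTP request, so no single request would come near the per-request limit even for files larger than 50 GB.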

REMI User Interface Development

JSON (JavaScript Object Notation)

A lightweight data-interchange format.

Easy for humans to read and write.

Built on two structures.

A collection of name/value pairs. Realized as an object, record, struct, dictionary, hash table, keyed list, or associative array.

An ordered list of values. Realized as an array, list, or sequence.
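
The two structures map directly onto Python's dict and list types, as the small round trip below shows using the standard `json` module. The field names and values are made up for illustration.

```python
import json

# An object: a collection of name/value pairs (a dict in Python).
experiment = {"modality": "SPECT", "files": 3}

# An array: an ordered list of values (a list in Python).
modalities = ["SPECT", "DT-MRI", "Micro-PET"]

# Serialize both structures to JSON text, then parse the text back.
text = json.dumps({"experiment": experiment, "modalities": modalities})
restored = json.loads(text)
print(restored["modalities"][0])
```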

JSON Syntax

(e) Object Syntax

(f) Array Syntax

(g) Value Syntax

REMI Menu

id: Translates to the element ID on the HTML page.

value: Display name shown on the user interface.

link: URL to be called.

submenu: Array to contain sub-menus.
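
A menu definition using these four fields could look like the JSON in the sketch below, which also walks the nested submenu arrays. Only the field names come from the slide; the entries, IDs and URLs are invented for illustration, and the `flatten` helper is hypothetical.

```python
import json

# Hypothetical menu definition using the four fields described above.
menu_json = """
[
  {"id": "menu-download", "value": "Download", "link": "/download",
   "submenu": [
     {"id": "menu-search", "value": "Search", "link": "/download/search",
      "submenu": []}
   ]}
]
"""


def flatten(entries, depth=0):
    """Walk the menu tree depth-first, recording each entry's nesting level."""
    rows = []
    for entry in entries:
        rows.append((depth, entry["id"], entry["value"], entry["link"]))
        rows.extend(flatten(entry["submenu"], depth + 1))
    return rows


rows = flatten(json.loads(menu_json))
for depth, element_id, label, link in rows:
    print("  " * depth + f"{label} -> {link} (#{element_id})")
```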

Download Menu Options

Searching for Experiment based on Modality

Creating a New SPECT Experiment

Various Sections within the SPECT Experiment

Weaknesses

Weaknesses within REMI

Lack of full support for Semi-Structured Data.

Semi-structured data does not conform to the formal structure of tables and data models associated with relational databases.

Meta-data is all semi-structured data.

Resume functionality for uploading and downloading data files.

Server side file compression on large files.

Meta-data Templates and User Personalization.

Future Enhancements

Provide support for Semi-Structured Data.

Relational databases with XML support.

Move from a relational model to Entity-Attribute-Value model.

Document based databases.

User Personalization.

Provide user specific views of owned experiments.

Support saving of personalized search queries.

Meta-data

Best approach to modeling meta-data.

Make the meta-data capture process less cumbersome.

Reading meta-data from header files.
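
To make the Entity-Attribute-Value option listed above more concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table name, column names and sample values are assumptions for illustration and do not reflect REMI's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (entity, attribute) pair instead of one column per attribute.
conn.execute(
    "CREATE TABLE metadata ("
    " entity_id INTEGER NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (entity_id, attribute))"
)
sample_rows = [
    (1, "modality", "SPECT"),
    (1, "tracer", "example-tracer"),
    (2, "modality", "Micro-PET"),
    (2, "frame_count", "24"),  # an attribute that entity 1 does not have
]
conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", sample_rows)


def attributes_of(entity_id):
    """Collect every attribute stored for one entity into a dict."""
    cur = conn.execute(
        "SELECT attribute, value FROM metadata"
        " WHERE entity_id = ? ORDER BY attribute",
        (entity_id,),
    )
    return dict(cur.fetchall())


print(attributes_of(2))
```

The trade-off is that new meta-data fields need no schema change, but every value is stored as text and type checks move into the application.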

Resume functionality for Data File Uploads.

Files uploaded in chunks.

Query the server for existing file size on the server.

Transfer only the file chunks that lie beyond the size already on the server.

Resume functionality for Data File Downloads.

Look at HTTP Byte Range Retrieval Extensions.

See how the resume functionality affects the compression process on the server.
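
Both resume ideas reduce to simple offset arithmetic, sketched below in Python. The function names are hypothetical, and the upload half assumes the server can report how many bytes of the file it already holds. The download half only builds the standard HTTP `Range` request header and does not send a request.

```python
def remaining_chunks(file_size, chunk_size, size_on_server):
    """Return (offset, length) pairs for the bytes the server still lacks."""
    offset = min(size_on_server, file_size)
    chunks = []
    while offset < file_size:
        length = min(chunk_size, file_size - offset)
        chunks.append((offset, length))
        offset += length
    return chunks


def range_header(bytes_already_downloaded):
    """Build a Range header asking for everything from the given offset on."""
    return {"Range": f"bytes={bytes_already_downloaded}-"}


# Upload: 2500-byte file, 1000-byte chunks, server already has 1000 bytes.
print(remaining_chunks(2500, 1000, 1000))  # [(1000, 1000), (2000, 500)]

# Download: 4096 bytes are already on disk, so ask for the rest.
print(range_header(4096))  # {'Range': 'bytes=4096-'}
```

A server that honors the range request answers with status 206 (Partial Content); one that ignores it returns the whole file with status 200, so a client should check the status before appending to a partial file.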

Questions

Demonstration

REMI Website

Thank You
