Introducing Photon-HDF5

In this post I introduce Photon-HDF5, a file format I co-authored over the past year. For a more complete overview, you can read the recently published paper (get the bioRxiv preprint or the version just published in Biophysical Journal).

Briefly, Photon-HDF5 is a file format for storing single-molecule fluorescence data based on photon timestamps and other per-photon data. In essence, it is a convention for laying out this class of data in HDF5 files, which facilitates data sharing and long-term archival.

The format was initially designed to store freely-diffusing single-molecule FRET data, but it has evolved to store any measurement which consists of streams of “photon-data” (e.g. timestamps, detectors, TCSPC nanotimes, etc.).

xkcd comic: standards

from https://xkcd.com/927/

Design and Features

Since Photon-HDF5 is based on HDF5, it inherits all of HDF5's advantages. In particular, it is an open standard, multi-platform and multi-language. It is also an efficient binary format that supports compression transparently.

In designing Photon-HDF5, we followed a set of generic principles that may also be useful for other scientific formats:

  • self-describing: each data field embeds a description explaining the purpose of the field;
  • self-contained: it contains all the information necessary to analyze the data;
  • suitable for long-term archival: rich metadata records experimental details, provenance, author and software version;
  • supports arbitrary user data;
  • all support software is open source (under MIT license).

Finally, the following features make Photon-HDF5 suitable for a wide range of single-molecule fluorescence data:

  • supports any number of spectral, polarization or beam-split channels.
  • supports single- and multi-spot data.
  • extensible: the bulk “photon-data” (present in all types of measurements) is logically separated from data specific to a single measurement type.

Thanks to this extensible structure, new measurement types can be defined in a backward-compatible manner. In fact, we encourage users to propose new measurement types (use the Photon-HDF5 mailing list).

Open Development

All Photon-HDF5 development (both specification documents and software) takes place publicly on GitHub. We encourage users to join the effort by providing feedback or by submitting Issues or Pull Requests. Your feedback and ideas can shape the future development of Photon-HDF5, and we acknowledge all contributions.

Supporting Software

Photon-HDF5 files can be opened in HDFView and read in exactly the same way as any other HDF5 file. To help newcomers, we posted reading examples in Python, MATLAB and LabVIEW.
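To give a flavor of what "reading it like any other HDF5 file" means, here is a minimal sketch in Python. It assumes the h5py and NumPy packages are installed, and it is not one of the posted examples. The field names (photon_data/timestamps, photon_data/detectors, photon_data/timestamps_specs/timestamps_unit) and the TITLE attribute holding each field's description reflect my understanding of the layout, so check them against the specification. The helper write_toy_file is hypothetical: it writes a tiny stand-in file, which is not a valid Photon-HDF5 file, only so that the reading code has something to open.

```python
import os
import tempfile

import h5py
import numpy as np


def write_toy_file(path):
    """Write a tiny HDF5 file mimicking a few Photon-HDF5 field names.

    Hypothetical helper: the result is NOT a valid Photon-HDF5 file,
    it only lets the reading code below run without real data.
    """
    with h5py.File(path, "w") as f:
        ph = f.create_group("photon_data")
        ts = ph.create_dataset(
            "timestamps", data=np.array([10, 25, 40, 90], dtype=np.int64))
        # Assumed convention: the description lives in a TITLE attribute.
        ts.attrs["TITLE"] = "Array of photon timestamps."
        ph.create_dataset(
            "detectors", data=np.array([0, 1, 0, 1], dtype=np.uint8))
        specs = ph.create_group("timestamps_specs")
        # Made-up clock period, in seconds per timestamp unit.
        specs.create_dataset("timestamps_unit", data=12.5e-9)


def read_photon_data(path):
    """Read timestamps, detectors, time unit and the timestamps description."""
    with h5py.File(path, "r") as f:
        ph = f["photon_data"]
        timestamps = ph["timestamps"][:]   # load the whole array in memory
        detectors = ph["detectors"][:]
        unit = float(ph["timestamps_specs/timestamps_unit"][()])  # scalar
        title = ph["timestamps"].attrs["TITLE"]
        if isinstance(title, bytes):       # some writers store byte strings
            title = title.decode()
    return timestamps, detectors, unit, title


path = os.path.join(tempfile.mkdtemp(), "toy_photon.h5")
write_toy_file(path)
timestamps, detectors, unit, title = read_photon_data(path)

times_s = timestamps * unit          # photon arrival times in seconds
times_det0 = times_s[detectors == 0]  # photons seen by detector 0

print(title)
print(len(times_s), "photons,", len(times_det0), "on detector 0")
```

With a real file, only read_photon_data would be needed. The same group and dataset paths are what you would browse in HDFView.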

To write valid Photon-HDF5 files from scratch, we provide a small open source Python library called phconvert. Phconvert also includes Jupyter notebooks to convert common file formats to Photon-HDF5. For all other languages, the easiest and most robust way of writing Photon-HDF5 files is to use an ad-hoc script called phforge.

Try it Online!

To get a taste of it, you can use the online conversion service to convert an existing data file (Becker Hickl SPC or PicoQuant HT3) to Photon-HDF5. This service uses MyBinder.org to provide access to the Jupyter notebooks implementing the file conversion.

Enjoy!
