qcodes.data.hdf5_format

class qcodes.data.hdf5_format.HDF5Format[source]

Bases: qcodes.data.format.Formatter

HDF5 formatter for saving qcodes datasets.

Capable of storing (write) and recovering (read) qcodes datasets.
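
A minimal, hedged usage sketch (assuming the legacy qcodes.data API, i.e. new_data and load_data; the location string is illustrative):

    # Round-trip a legacy qcodes DataSet through HDF5Format.
    from qcodes.data.data_set import new_data, load_data
    from qcodes.data.hdf5_format import HDF5Format

    formatter = HDF5Format()

    data = new_data(location='data/demo', formatter=formatter)
    # ... populate data with DataArrays, e.g. by running a Loop ...
    data.write()       # delegates to HDF5Format.write
    data.finalize()    # flushes and closes the underlying hdf5 file

    loaded = load_data(location='data/demo', formatter=formatter)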

close_file(data_set: DataSet)[source]

Closes the hdf5 file open in the dataset.

Parameters

data_set – DataSet object

read(data_set: DataSet, location=None)[source]

Reads an hdf5 file specified by location into a data_set object.

Parameters
  • data_set – the data to read into. Should already have attributes io (an io manager), location (string), and arrays (a dict of {array_id: array}; it can be empty or already contain some or all of the arrays, in which case they will be overwritten)

  • location (None or str) – Location to read the data from. If no location is provided, the location specified in the dataset is used.

write(data_set, io_manager=None, location=None, force_write=False, flush=True, write_metadata=True, only_complete=False)[source]

Writes a data_set to an hdf5 file.

Parameters
  • data_set – qcodes data_set to write to hdf5 file

  • io_manager – io manager used to provide the path

  • location – can be used to specify a custom location

  • force_write (bool) – if True creates a new file to write to

  • flush (bool) – whether to flush after writing, can be disabled for testing or performance reasons

  • write_metadata (bool) – If True write the dataset metadata to disk

  • only_complete (bool) – Not used by this formatter, but must be included in the call signature to avoid an “unexpected keyword argument” TypeError.

N.B. It is recommended to close the file after writing. This can be done by calling HDF5Format.close_file(data_set), or data_set.finalize() if the data_set formatter is set to an hdf5 formatter. Note that this is not required if the dataset is created from a Loop, as that includes a data_set.finalize() call.

The write function consists of two parts: writing DataArrays and writing metadata.

  • The main part of write consists of writing and resizing arrays; the resizing provides support for incremental writes.

  • write_metadata is called at the end of write and dumps a dictionary to the hdf5 file. Any existing metadata is deleted and overwritten with the current metadata.
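
As a hedged sketch, an explicit write followed by the recommended close (reusing the data and formatter objects from the sketch at the top of this page; the flags shown are the defaults):

    formatter.write(data, io_manager=data.io, location=data.location,
                    force_write=False, flush=True, write_metadata=True)
    formatter.close_file(data)   # or equivalently: data.finalize()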

write_metadata(data_set, io_manager=None, location=None, read_first=True, **kwargs)[source]

Writes the metadata of the dataset to file using the write_dict_to_hdf5 method.

Note that io_manager and location are accepted only for backwards compatibility with the Loop; this formatter uses the io and location specified for the main dataset. The read_first argument is ignored.

write_dict_to_hdf5(data_dict, entry_point)[source]

Write a (nested) dictionary to HDF5

Parameters
  • data_dict (dict) – Dictionary to be written

  • entry_point (object) – open hdf5 file or group to write into
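
For example, a nested dictionary can be dumped into an open h5py group (a sketch assuming h5py; the file name and dictionary contents are illustrative):

    import h5py
    from qcodes.data.hdf5_format import HDF5Format

    fmt = HDF5Format()
    with h5py.File('metadata_demo.hdf5', 'w') as f:
        grp = f.create_group('metadata')
        fmt.write_dict_to_hdf5({'station': {'name': 'fridge1', 'T_mK': 20}}, grp)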

read_metadata(data_set: DataSet)[source]

Reads in the metadata. This is also called at the end of a read statement, so there should be no need to call it explicitly.

Parameters

data_set – Dataset object to read the metadata into

read_dict_from_hdf5(data_dict, h5_group)[source]

Read a dictionary from HDF5

Parameters
  • data_dict (dict) – dictionary to read the data into

  • h5_group (object) – open HDF5 file or group to read from
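
The counterpart of the sketch under write_dict_to_hdf5 reads the group back into a (possibly empty) dictionary:

    with h5py.File('metadata_demo.hdf5', 'r') as f:
        meta = fmt.read_dict_from_hdf5({}, f['metadata'])
    print(meta['station']['name'])   # 'fridge1'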

class ArrayGroup(shape, set_arrays, data, name)

Bases: tuple

Create new instance of ArrayGroup(shape, set_arrays, data, name)

count(value, /)

Return number of occurrences of value.

property data

Alias for field number 2

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

property name

Alias for field number 3

property set_arrays

Alias for field number 1

property shape

Alias for field number 0

group_arrays(arrays)

Find the sets of arrays which share all the same setpoint arrays.

Some Formatters use this grouping to determine which arrays to save together in one file.

Parameters

arrays (Dict[str, DataArray]) – all the arrays in a DataSet

Returns

namedtuples giving:

  • shape (Tuple[int]): dimensions as in numpy

  • set_arrays (Tuple[DataArray]): the setpoints of this group

  • data (Tuple[DataArray]): measured arrays in this group

  • name (str): a unique name of this group, obtained by joining the setpoint array ids.

Return type

List[Formatter.ArrayGroup]
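
A hedged sketch of the grouping (assuming the legacy DataArray class; names and data are illustrative):

    import numpy as np
    from qcodes.data.data_array import DataArray
    from qcodes.data.hdf5_format import HDF5Format

    x = DataArray(name='x', array_id='x_set', is_setpoint=True,
                  preset_data=np.linspace(0, 1, 5))
    y = DataArray(name='y', array_id='y', set_arrays=(x,),
                  preset_data=np.zeros(5))

    fmt = HDF5Format()
    groups = fmt.group_arrays({'x_set': x, 'y': y})
    for group in groups:
        print(group.name, group.shape, [a.array_id for a in group.data])
    # e.g. -> x_set (5,) ['y']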

match_save_range(group, file_exists, only_complete=True)

Find the save range that joins all changes in an array group.

Matches all full-sized arrays: the data arrays plus the inner loop setpoint array.

Note: if an outer loop has changed values (without the inner loop or measured data changing) we won’t notice it here. We assume that before an iteration of the inner loop starts, the outer loop setpoint gets set and then does not change later.

Parameters
  • group (Formatter.ArrayGroup) – a namedtuple containing the arrays that go together in one file, as tuple group.data.

  • file_exists (bool) – Does this file already exist? If True, and all arrays in the group agree on last_saved_index, we assume the file has been written up to this index and we can append to it. Otherwise we will set the returned range to start from zero (so if the file does exist, it gets completely overwritten).

  • only_complete (bool) – Should we write all available new data, or only complete rows? If True, we write only the range of array indices which all arrays in the group list as modified, so that future writes will be able to do a clean append to the data file as more data arrives. Default True.

Returns

the first and last raveled indices that should be saved. Returns None if:
  • no data is present

  • no new data can be found

Return type

Tuple[int, int]
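
Continuing the grouping sketch above, a hypothetical call (the outcome depends on the arrays' modified_range bookkeeping):

    # Ask which raveled indices of the first group still need saving.
    # Returns None when no array in the group is marked as modified.
    save_range = fmt.match_save_range(groups[0], file_exists=False)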

read_one_file(data_set: DataSet, f, ids_read)

Read data from a single file into a DataSet.

Formatter subclasses that break a DataSet into multiple data files may choose to override either this method, which handles one file at a time, or read, which finds matching files on its own.

Parameters
  • data_set – the data we are reading into.

  • f – a file-like object to read from, as provided by io_manager.open.

  • ids_read (set) – array_ids that we have already read. When you read an array, check that it’s not in this set (except setpoints, which can be in several files with different inner loops) then add it to the set so other files know it should not be read again.

Raises

ValueError – if a duplicate array_id of measured data is found
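
A hedged sketch of a Formatter subclass taking the read_one_file route (the class and its one-array-per-file layout are hypothetical; it assumes the legacy DataArray.init_data helper):

    import numpy as np
    from qcodes.data.format import Formatter

    class OneArrayPerFileFormat(Formatter):
        """Hypothetical: each file holds an array_id header line, then values."""

        def read_one_file(self, data_set, f, ids_read):
            array_id = f.readline().strip()   # first line names the array
            if array_id in ids_read:
                raise ValueError('duplicate array_id: ' + array_id)
            data_set.arrays[array_id].init_data(np.loadtxt(f))
            ids_read.add(array_id)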

class qcodes.data.hdf5_format.HDF5FormatMetadata[source]

Bases: qcodes.data.hdf5_format.HDF5Format

metadata_file = 'snapshot.json'

write_metadata(data_set: DataSet, io_manager=None, location=None, read_first=False, **kwargs)[source]

Write all metadata in this DataSet to storage.

Parameters
  • data_set – the data we’re storing

  • io_manager (io_manager) – the base location to write to

  • location (str) – the file location within io_manager

  • read_first (Optional[bool]) – read previously saved metadata before writing? The current metadata will still be used where there are changes, but if the saved metadata has information not present in the current metadata, it will be retained. Default False.

  • kwargs (dict) – The key sort_keys is extracted from this dictionary (default value: False). If True, the keys of the metadata will be stored sorted in the json file. Note: sorting is only possible if the keys of the metadata dictionary can be compared.
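
For example, a hedged sketch reusing the data object from the sketch at the top of this page:

    from qcodes.data.hdf5_format import HDF5FormatMetadata

    meta_fmt = HDF5FormatMetadata()
    data.metadata['notes'] = 'demo run'
    meta_fmt.write_metadata(data, sort_keys=True)   # written to snapshot.json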

read_metadata(data_set)[source]

Reads in the metadata. This is also called at the end of a read statement, so there should be no need to call it explicitly.

Parameters

data_set – Dataset object to read the metadata into

qcodes.data.hdf5_format.str_to_bool(s)[source]
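
Convert the strings 'True' and 'False' to the corresponding bool; any other value raises a ValueError. This helper is used when recovering booleans that were stored as string attributes in the hdf5 file. For example:

    from qcodes.data.hdf5_format import str_to_bool

    str_to_bool('True')    # -> True
    str_to_bool('False')   # -> False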