Dear All,
I’m using a Pupil Labs Pupil Core eye tracker. Having solved one problem with data not being recorded (a threshold was being applied incorrectly), I now find that the output .hdf5 file contains many duplicate timestamps across consecutive samples, i.e. up to 4 different measurements can share the same timestamp. I used Pupil Labs Service 3.5.1 to configure the “glasses” to record at 60 Hz, so the time increment between distinct timestamps looks plausible (see the example below).
timestamp | pupil_size | pupil_x | pupil_y |
---|---|---|---|
90.0416797 | 24.65824 | 0.46113947 | 0.6341769 |
90.0416797 | 24.659914 | 0.46109676 | 0.6342506 |
90.0416797 | 26.743896 | 0.27378133 | 0.6250556 |
90.0416797 | 26.742664 | 0.2737806 | 0.62503195 |
90.0566602 | 24.782402 | 0.46105444 | 0.6332933 |
90.0566602 | 24.783678 | 0.46104982 | 0.6333272 |
90.0566602 | 27.196869 | 0.27445492 | 0.6252303 |
90.0566602 | 27.198088 | 0.27445367 | 0.6252611 |
90.0769303 | 24.869741 | 0.4606103 | 0.63383394 |
90.0769303 | 27.065561 | 0.2749664 | 0.62616295 |
90.0769303 | 24.869802 | 0.46061468 | 0.63383234 |
90.0769303 | 27.066133 | 0.27496833 | 0.6261733 |
90.0990981 | 27.110886 | 0.27536234 | 0.6255919 |
90.0990981 | 24.889042 | 0.4603511 | 0.6338006 |
90.0990981 | 27.112803 | 0.27537686 | 0.62561965 |
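To put numbers on the duplication, here is a minimal sketch that counts how many rows share each timestamp and measures the step between distinct timestamps. It uses only the timestamps from the sample rows above; the function name `summarise_timestamps` is made up for illustration and is not part of any library.

```python
from collections import Counter

# Timestamps (seconds) copied from the sample rows in the table above.
timestamps = [
    90.0416797, 90.0416797, 90.0416797, 90.0416797,
    90.0566602, 90.0566602, 90.0566602, 90.0566602,
    90.0769303, 90.0769303, 90.0769303, 90.0769303,
    90.0990981, 90.0990981, 90.0990981,
]


def summarise_timestamps(ts):
    """Return (rows per timestamp, intervals in ms between distinct timestamps)."""
    counts = Counter(ts)
    unique = sorted(counts)
    # Difference between each pair of neighbouring distinct timestamps, in ms.
    intervals_ms = [round((b - a) * 1000, 2) for a, b in zip(unique, unique[1:])]
    return counts, intervals_ms


counts, intervals_ms = summarise_timestamps(timestamps)
print(dict(counts))                   # rows sharing each timestamp
print(intervals_ms)                   # step between distinct timestamps (ms)
print(round(1000 / 60, 2), "ms is the nominal step at 60 Hz")
```

For the rows shown, this gives four rows for each of the first three timestamps and three for the last (the excerpt ends there), with steps of roughly 15, 20 and 22 ms against a nominal 16.67 ms at 60 Hz.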
Here’s the code I’ve used to convert the .hdf5 to .csv (with inspiration from Becca - thank you!):
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sun Aug 11 16:22:57 2024
@author: JCWB
"""
import h5py
import pandas as pd
import os
from glob import glob

infolder='/Users/gnx20mmu/PROJECTS/PUPILLOMETRY/Data/'
for hdf5_file in glob(infolder + '*.hdf5'):
    # participant ID is the part of the file name before the first underscore
    participant_ID = os.path.basename(hdf5_file).split('_')[0]
    print(participant_ID)
    with h5py.File(hdf5_file, "r") as f:
        # get the list of eyetracker measures available in the hdf5
        eyetracker_measures = list(f['data_collection']['events']['eyetracker'])
        for measure in ['MonocularEyeSampleEvent']:
            print('Extracting events of type: ', measure)
            data_collection = list(f['data_collection']['events']['eyetracker'][measure])
            if len(data_collection)>0:
                # column names come from the compound dtype of the first row
                column_headers = data_collection[0].dtype.descr
                cols = []
                data_dict = {}
                for ch in column_headers:
                    cols.append(ch[0])
                    data_dict[ch[0]] = []
                # copy every row into the per-column lists
                for row in data_collection:
                    for i, col in enumerate(cols):
                        data_dict[col].append(row[i])
                pd_data = pd.DataFrame.from_dict(data_dict)
                pd_data.to_csv(infolder+participant_ID+'_'+measure+'.csv', index = False)
            else:
                print('No data for type', measure, ' moving on')
        # get the list of experiment measures available in the hdf5
        experiment_measures = list(f['data_collection']['events']['experiment'])
        for measure in ['MessageEvent']:
            print('Extracting events of type: ', measure)
            data_collection = list(f['data_collection']['events']['experiment'][measure])
            if len(data_collection)>0:
                # column names come from the compound dtype of the first row
                column_headers = data_collection[0].dtype.descr
                cols = []
                data_dict = {}
                for ch in column_headers:
                    cols.append(ch[0])
                    data_dict[ch[0]] = []
                # copy every row into the per-column lists
                for row in data_collection:
                    for i, col in enumerate(cols):
                        data_dict[col].append(row[i])
                pd_data = pd.DataFrame.from_dict(data_dict)
                pd_data.to_csv(infolder+participant_ID+'_'+measure+'.csv', index = False)
            else:
                print('No data for type', measure, ' moving on')
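One way to see what distinguishes the rows that share a timestamp is to list, for each repeated timestamp, the columns whose values differ. The sketch below is only illustrative: `varying_columns` is a made-up helper, the default column name `timestamp` is taken from the table in this post and may not match the header in the exported CSV, and the inline sample stands in for the real file.

```python
import csv
import io
from collections import defaultdict


def varying_columns(csv_file, time_col="timestamp"):
    """Map each repeated timestamp to the columns that differ across its rows.

    `time_col` is an assumption; change it to the CSV's actual time column.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[time_col]].append(row)

    result = {}
    for t, rows in groups.items():
        if len(rows) > 1:  # only timestamps that occur more than once
            result[t] = [
                col for col in rows[0]
                if col != time_col and len({r[col] for r in rows}) > 1
            ]
    return result


# Stand-in for the exported CSV: the first four sample rows from this post.
sample = io.StringIO(
    "timestamp,pupil_size,pupil_x,pupil_y\n"
    "90.0416797,24.65824,0.46113947,0.6341769\n"
    "90.0416797,24.659914,0.46109676,0.6342506\n"
    "90.0416797,26.743896,0.27378133,0.6250556\n"
    "90.0416797,26.742664,0.2737806,0.62503195\n"
)
report = varying_columns(sample)
print(report)
```

To run it on a real export, pass an open file instead, e.g. `with open(path, newline="") as fh: varying_columns(fh)`. If an identifying column (for example one that labels the eye) appears in the list, the rows are probably distinct samples rather than true duplicates.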
Any ideas?
Cheers, Jon