Bytes, Reloaded

I might have been a overzealous in my previous post about using Julia as the best way for parsing binary files in the most efficient, pleasant-to-use manner. I think I was also too eager to forgo using Python in my analysis due to slow looping. I'll save most of my reasons in switching from Julia to Python in another post, but I wanted to revisit how I parsed the WaveDump output files and do the same thing in Python. Mostly as a way for me to get back into writing posts but also to provide the stepping stone for the next post on switching languages.

To recap WaveDump outputs binary files in a structure where the first six 32-bit values constitute the header, giving information on the number of waveform samples coming next and waveform number count. Then based on the header info we can read the next n 16-bit values as the waveform ADC values. I looked at several different ways to accomplish this in Python, one method using NumPy, another using the built-in module struct, but the one I found easiest to do this in was with the built-in module array.

The code looks like

import array
import numpy as np


def read_binary_file_array(file_path):
    with open(file_path, "rb") as binary_file:
        header = array.array("I")
        data = array.array("H")
        while binary_file.peek() != b'':
            header.fromfile(binary_file, 6)
            data.fromfile(binary_file, 500)

    return np.array(header).reshape(-1, 6), np.array(data).reshape(-1, 500)

The nice thing about the array module is continuing to read into an already-created array will just append the values, whereas the other two methods I tried you need to do the appending yourself. I return as numpy arrays and can reshape them into however many waveforms were taken during the data run. There should be some more work done on determining how long the waveforms are and making this function more general, but for the current purpose I think it shows how easy it is to accompish this same task of reading custom binary files in Python as it is in other languages.

Find me here