- do we know the format of the data? (even better, is it expressed in XML?)
- how much data is produced per run?
- how do we get data off the instrument?
- is any post-processing or data sampling required?
We are of course assuming that in this case the data is structured numerical data, in some sort of known format such as CSV. We make no assumptions about the data content: it could be a record of villeins' wages during the Black Death, word frequencies, or daily temperature readings. The key points are that the data is both structured and numerical.
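As a rough illustration of what "structured and numerical" could mean in practice, here is a minimal Python sketch that checks a CSV file's contents. It assumes a single header row, that every data row should have the same number of fields, and that every field should parse as a number. The function name `is_structured_numeric` and the sample values are invented for this example.

```python
import csv
import io


def is_structured_numeric(text, has_header=True):
    """Return True if all data rows have equal width and only numeric fields.

    Assumption: at most one header row, which is skipped when has_header is True.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if has_header:
        rows = rows[1:]
    if not rows:
        return False  # nothing to ingest

    width = len(rows[0])
    for row in rows:
        # Structured: every row has the same number of fields.
        if len(row) != width:
            return False
        # Numerical: every field parses as a float.
        for field in row:
            try:
                float(field)
            except ValueError:
                return False
    return True


# Invented sample: daily temperature readings.
sample = "day,temp_c\n1,11.5\n2,12.0\n"
print(is_structured_numeric(sample))
```

A check like this says nothing about what the numbers mean, only that the file has the shape we expect before we try to ingest it.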
Being structured and numerical does not, of course, have any implications for the interpretation of the data: for example, the FITS image format is structured and numerical. However, it’s quite possible that the raw data format is not the format we want to ingest, meaning that we will have a transformation stage …
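To make the idea of a transformation stage concrete, here is a small Python sketch. The raw format is purely hypothetical: it assumes the instrument writes semicolon-delimited values with `#` comment lines, and that we want plain CSV for ingest. The function name `transform` and the sample data are invented for illustration; a real instrument's output will differ.

```python
import csv
import io


def transform(raw_text):
    """Convert hypothetical raw output to CSV text.

    Assumptions: fields are separated by ';', lines starting with '#'
    are comments, and blank lines carry no data.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in raw_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        # Split on the raw delimiter and trim padding around each value.
        writer.writerow(field.strip() for field in line.split(";"))
    return out.getvalue()


# Invented sample of raw instrument output.
raw = "# instrument run (invented sample)\n1; 11.5\n2; 12.0\n"
print(transform(raw))
```

Keeping the transformation as a separate step means the raw files can be kept untouched and the conversion rerun if the ingest format changes.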