Extracting data from messy text file

Alison - 2022-04-09T12:19:35+00:00
Question: Extracting data from messy text file

There is a header followed by row names. I want to extract the numeric data for Time, Area, and Volume, then group them together in a convenient format for analysis. I've tried textscan and sscanf. I haven't tried regexp because I've never used it before! Many thanks in advance!

Expert Answer

Prashant Kumar answered . 2025-11-20

It's just a repetitive application of textscan...

 

 

fmt1='Time       [T] %f';
fmt2='Area    [V] %f %f %f Volume  [V] %f %f %f';
fid=fopen('Data.txt');
if fid<0, error('Could not open Data.txt'); end
% read the first set separately, as it has a unique number of header lines
time=cell2mat(textscan(fid, fmt1,'headerlines',10));  % 1st time value
data=cell2mat(textscan(fid, fmt2, ...
              'headerlines',3,'collectoutput',true,'delimiter','\n'));
% the second set also has a unique number of lines to skip...
time=[time; cell2mat(textscan(fid, fmt1,'headerlines',5))];
data=[data; cell2mat(textscan(fid, fmt2, 'headerlines',3, ...
                     'collectoutput',true,'delimiter','\n'))];
% the remaining sets repeat one layout: skip 7 lines before Time, 3 before Area
while ~feof(fid)
  time=[time; cell2mat(textscan(fid, fmt1,'headerlines',7))];
  data=[data; cell2mat(textscan(fid, fmt2, 'headerlines',3, ...
                        'collectoutput',true,'delimiter','\n'))];
end
fclose(fid);

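At the end you'll have an Nx1 vector of times and an Nx6 array of areas and volumes; following the order in fmt2, columns 1-3 hold the Area values and columns 4-6 the Volume values. You could either concatenate time and data into one array or separate out A and V based on the columns in data; your choice.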
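Either option takes only a line or two. The sketch below uses small stand-in values for time and data (illustrative numbers only, not read from the actual file) so that it runs on its own; the names area, volume, combined and results are my own choices, not anything from the original code.

```matlab
% Stand-in values (illustrative only); in practice time and data
% come from the textscan calls above.
time = [0; 1054; 2108];
data = [17221 16475 746  995  987 9;
        17221 16475 746 1089 1081 8;
        17221 16475 746 1102 1093 8];

% Option 1: split data by column, following the field order in fmt2
area   = data(:, 1:3);   % the three Area values of each record
volume = data(:, 4:6);   % the three Volume values of each record

% Option 2: one N-by-7 numeric array with time in the first column
combined = [time data];

% Optionally bundle everything in a struct for later analysis
results = struct('time', time, 'area', area, 'volume', volume);
```

Separate arrays (or the struct) tend to be easier to read in later analysis code, while the single array is convenient for a quick look at the command line.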

At the command line the above gives me

>> [time data]
ans =
 1.0e+04 *
       0    1.7221    1.6475    0.0746    0.0995    0.0987    0.0009
  0.1054    1.7221    1.6475    0.0746    0.1089    0.1081    0.0008
  0.2108    1.7221    1.6475    0.0746    0.1102    0.1093    0.0008
  0.3162    1.7221    1.6475    0.0746    0.1111    0.1103    0.0008
  0.4216    1.7221    1.6475    0.0746    0.1118    0.1110    0.0008
  0.5270    1.7221    1.6475    0.0746    0.1124    0.1116    0.0008
  0.6324    1.7221    1.6475    0.0746    0.1129    0.1120    0.0008
  0.7379    1.7221    1.6475    0.0746    0.1134    0.1126    0.0008
  0.8433    1.7221    1.6475    0.0746    0.1139    0.1130    0.0008
  ...
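Since the question mentions regexp: if counting header lines turns out to be fragile, a pattern-based approach is one possible alternative. This is only a sketch under assumptions. The sample text below is hypothetical, guessed from the labels in fmt1 and fmt2, and the real Data.txt may be laid out differently; for the real file you would replace the sprintf call with txt = fileread('Data.txt').

```matlab
% Hypothetical sample text, assumed from the labels in fmt1/fmt2.
txt = sprintf(['Some header line\n', ...
               'Time       [T] 0\n', ...
               'another line to ignore\n', ...
               'Area    [V] 17221 16475 746\n', ...
               'Volume  [V] 995 987 9\n', ...
               'Time       [T] 1054\n', ...
               'Area    [V] 17221 16475 746\n', ...
               'Volume  [V] 1089 1081 8\n']);

% Capture the numbers that follow each label; 'tokens' returns the
% parenthesised groups of every match as nested cell arrays.
tTok = regexp(txt, 'Time\s+\[T\]\s+(\S+)', 'tokens');
aTok = regexp(txt, 'Area\s+\[V\]\s+(\S+)\s+(\S+)\s+(\S+)', 'tokens');
vTok = regexp(txt, 'Volume\s+\[V\]\s+(\S+)\s+(\S+)\s+(\S+)', 'tokens');

% Flatten the nested cells, convert to numbers, one row per record
time   = str2double([tTok{:}]).';                  % N-by-1
area   = reshape(str2double([aTok{:}]), 3, []).';  % N-by-3
volume = reshape(str2double([vTok{:}]), 3, []).';  % N-by-3
```

This ignores every line that does not carry one of the three labels, so it does not depend on how many lines sit between records. It does assume that each Area and Volume line carries exactly three values.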

 

