7 Wide vs. long representation

When acoustic data are extracted from programs such as Praat (Boersma and Weenink 2018), they are often stored in so-called ‘wide-format’ spreadsheets and tables such as Table 7.1.

Table 7.1: A sample dataframe in wide-format representation.
subjID label duration f0 F1 F2
subj1 con1 0.07 149.777 515.864 1584.73
subj1 voc1 0.07616 144.938 591.371 1574.50
subj1 con2 0.04845 143.508 260.772 1507.64
subj1 voc2 0.04937 126.02 469.582 1794.32
subj2 con1 0.06765 137.835 327.424 1456.00
subj2 voc1 0.09672 140.127 617.94 1551.74
subj2 con2 0.03057 133.868 365.43 1563.14
subj2 voc2 0.1099 110.316 441.188 1824.32
subj3 con1 0.07215 127.312 628.841 1862.74
subj3 voc1 0.10034 126.121 526.501 1495.03
subj3 con2 0.03043 123.797 201.437 1658.92
subj3 voc2 0.10579 98.4136 384.419 1784.46
  • In wide format, each measured variable (here duration, f0, F1 and F2) has its own column.

Although wide-format representation is convenient for computing statistics with some R functions and MS Excel (or other spreadsheet-based software), it is a poor fit for many other functions in R. For example, many plotting and modeling functions in R expect the data in long-format representation.

In long-format representation, the variable names are stored in one column and their values in another. The information presented in Table 7.1 can therefore be represented in long format as shown in Table 7.2.

Table 7.2: The same data in long-format representation.
subjID label type value
subj1 con1 dur 0.07000
subj1 voc1 dur 0.07616
subj3 con2 dur 0.03043
subj3 voc2 dur 0.10579
subj1 con1 f0 149.7770
subj1 voc1 f0 144.9380
subj3 con2 f0 123.7970
subj3 voc2 f0 98.41360
subj1 con1 F1 515.864
subj1 voc1 F1 591.371
subj3 con2 F1 201.437
subj3 voc2 F1 384.419
subj1 con1 F2 1584.73
subj1 voc1 F2 1574.50
subj3 con2 F2 1658.92
subj3 voc2 F2 1784.46
  • The function melt() from the data.table library converts a wide-format table to long format; dcast() from the same library converts in the other direction.
  • Alternatively, some programs, such as VoiceSauce, export data in long format.
  • The degree of ‘wideness’ depends on your needs.
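
The conversion described above can be sketched as follows. This is a minimal illustration rather than a fixed recipe: the object names (wide, long, wide_again, semi_long) are made up for this example, only four rows of Table 7.1 are typed in by hand, and the column names "formant" and "Hz" in the last step are arbitrary choices. Note that melt() takes the labels in the type column from the original column names, so duration appears as "duration" rather than "dur".

```r
library(data.table)

# Four rows of Table 7.1, entered by hand for illustration only
wide <- data.table(
  subjID   = c("subj1", "subj1", "subj3", "subj3"),
  label    = c("con1", "voc1", "con2", "voc2"),
  duration = c(0.07, 0.07616, 0.03043, 0.10579),
  f0       = c(149.777, 144.938, 123.797, 98.4136),
  F1       = c(515.864, 591.371, 201.437, 384.419),
  F2       = c(1584.73, 1574.50, 1658.92, 1784.46)
)

# Wide to long: subjID and label identify each observation;
# the four measurement columns are stacked into type/value pairs
long <- melt(
  wide,
  id.vars       = c("subjID", "label"),
  measure.vars  = c("duration", "f0", "F1", "F2"),
  variable.name = "type",
  value.name    = "value"
)

# Long back to wide: one column per level of type
wide_again <- dcast(long, subjID + label ~ type, value.var = "value")

# An intermediate degree of 'wideness': only the formants are stacked,
# while duration and f0 stay as columns of their own
semi_long <- melt(
  wide,
  id.vars       = c("subjID", "label", "duration", "f0"),
  measure.vars  = c("F1", "F2"),
  variable.name = "formant",
  value.name    = "Hz"
)
```

The last step shows one way to read the remark about ‘wideness’: which columns go into id.vars and which into measure.vars depends on what the later plotting or modeling step expects.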

References

Boersma, Paul, and David Weenink. 2018. “Praat: doing phonetics by computer.”