7 Wide vs. long representation
When extracting acoustic data from programs such as Praat (Boersma and Weenink 2018), the data are often collected in so-called ‘wide-format’ spreadsheets and tables such as Table 7.1.
| subjID | label | duration | f0 | F1 | F2 |
|---|---|---|---|---|---|
| subj1 | con1 | 0.07 | 149.777 | 515.864 | 1584.73 |
| subj1 | voc1 | 0.07616 | 144.938 | 591.371 | 1574.50 |
| subj1 | con2 | 0.04845 | 143.508 | 260.772 | 1507.64 |
| subj1 | voc2 | 0.04937 | 126.02 | 469.582 | 1794.32 |
| subj2 | con1 | 0.06765 | 137.835 | 327.424 | 1456.00 |
| subj2 | voc1 | 0.09672 | 140.127 | 617.94 | 1551.74 |
| subj2 | con2 | 0.03057 | 133.868 | 365.43 | 1563.14 |
| subj2 | voc2 | 0.1099 | 110.316 | 441.188 | 1824.32 |
| subj3 | con1 | 0.07215 | 127.312 | 628.841 | 1862.74 |
| subj3 | voc1 | 0.10034 | 126.121 | 526.501 | 1495.03 |
| subj3 | con2 | 0.03043 | 123.797 | 201.437 | 1658.92 |
| subj3 | voc2 | 0.10579 | 98.4136 | 384.419 | 1784.46 |
- In wide-format, there is a column for each variable.
Although wide-format representation is convenient for computing statistics with some R functions and with MS Excel (or other spreadsheet-based software), it is a disadvantage for many other functions in R. For example, many plotting and modeling functions in R expect the data in long-format representation.
In long-format representation, the variable types are stored in one column and their values in another. The same information presented in Table 7.1 can therefore be represented in long format, as shown in Table 7.2.
| subjID | label | type | value |
|---|---|---|---|
| subj1 | con1 | dur | 0.07000 |
| subj1 | voc1 | dur | 0.07616 |
| … | | | |
| subj3 | con2 | dur | 0.03043 |
| subj3 | voc2 | dur | 0.10579 |
| subj1 | con1 | f0 | 149.7770 |
| subj1 | voc1 | f0 | 144.9380 |
| … | | | |
| subj3 | con2 | f0 | 123.7970 |
| subj3 | voc2 | f0 | 98.41360 |
| subj1 | con1 | F1 | 515.864 |
| subj1 | voc1 | F1 | 591.371 |
| … | | | |
| subj3 | con2 | F1 | 201.437 |
| subj3 | voc2 | F1 | 384.419 |
| subj1 | con1 | F2 | 1584.73 |
| subj1 | voc1 | F2 | 1574.50 |
| … | | | |
| subj3 | con2 | F2 | 1658.92 |
| subj3 | voc2 | F2 | 1784.46 |
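To illustrate why the long layout is convenient, here is a minimal sketch in base R. It assumes a small data frame, hypothetically named `long_df`, typed in by hand from the `subj1` rows for `con1` and `voc1` in Table 7.2. Because all values sit in one column, a single call can summarise every measure at once.

```r
# A few rows of Table 7.2, entered by hand for illustration only
long_df <- data.frame(
  subjID = rep("subj1", 8),
  label  = rep(c("con1", "voc1"), times = 4),
  type   = rep(c("dur", "f0", "F1", "F2"), each = 2),
  value  = c(0.07000, 0.07616,    # dur
             149.777, 144.938,    # f0
             515.864, 591.371,    # F1
             1584.73, 1574.50)    # F2
)

# One call computes the mean of every measure, grouped by the 'type' column
means <- aggregate(value ~ type, data = long_df, FUN = mean)
print(means)
```

With the wide layout, the same summary would need one call (or one column reference) per measure.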
- The function `melt(data)` from the `data.table` library can be used to convert wide-format data to long format (the reverse conversion is done with `dcast()`).
- Alternatively, there are some programs such as VoiceSauce that export data in long-format.
- The degree of ‘wideness’ depends on your needs.
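The conversion mentioned above can be sketched as follows. This assumes the `data.table` package is installed; the object names `wide`, `long`, and `semi_wide` are made up for illustration, and only the `subj1` rows of Table 7.1 are typed in by hand. Note that `melt()` takes the values of the `type` column from the wide column names, so the duration rows are labelled `duration` rather than `dur` as in Table 7.2.

```r
library(data.table)

# The subj1 rows of Table 7.1, entered by hand for illustration only
wide <- data.table(
  subjID   = rep("subj1", 4),
  label    = c("con1", "voc1", "con2", "voc2"),
  duration = c(0.07, 0.07616, 0.04845, 0.04937),
  f0       = c(149.777, 144.938, 143.508, 126.02),
  F1       = c(515.864, 591.371, 260.772, 469.582),
  F2       = c(1584.73, 1574.50, 1507.64, 1794.32)
)

# Wide -> long: keep subjID and label as identifiers and stack the
# four measurement columns into a 'type' column and a 'value' column
long <- melt(
  wide,
  id.vars       = c("subjID", "label"),
  variable.name = "type",
  value.name    = "value"
)
print(long)

# Long -> partly wide: one row per subject and measure, one column per label.
# This is one possible intermediate degree of 'wideness'.
semi_wide <- dcast(long, subjID + type ~ label, value.var = "value")
print(semi_wide)
```

Which of these layouts is most useful depends on the function that will consume the data.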
References
Boersma, Paul, and David Weenink. 2018. “Praat: doing phonetics by computer.”