6 Terminology

Each language has its own jargon, and R is no exception. These are some of the most common terms, with their meanings and representations:

6.1 Vector

An ordered collection of values of the same type, usually numbers. E.g., x <- c(3, 1, 4, 1, 5, 9). The function c() combines its arguments into a vector; the ‘c’ can be thought of as collection or column.
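As a minimal sketch of what "ordered" means here, the vector from the example can be built and inspected as follows. The name y is introduced only for illustration.

```r
# Build the example vector with c().
x <- c(3, 1, 4, 1, 5, 9)

length(x)   # number of elements: 6
x[1]        # first element: 3 (R indexes from 1)
x[4]        # fourth element: 1

# c() also accepts vectors, so it can append to an existing one.
y <- c(x, 2, 6)
length(y)   # 8
```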

  • character vectors: store strings of characters.
  • logical vectors: a collection of FALSE and TRUE values. E.g.:
## [1] 6 5 4 3 2 1 0
## [1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE
  • rev() returns a reversed version of its argument.
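The code that produced the output above is not shown. A sketch that reproduces it, assuming the vector was built with rev(0:6) and compared against 3, could look like this (the names x and s are illustrative):

```r
# Reverse the sequence 0, 1, ..., 6.
x <- rev(0:6)
x          # 6 5 4 3 2 1 0

# A comparison is applied element by element and yields a logical vector.
x < 3      # FALSE FALSE FALSE FALSE TRUE TRUE TRUE

# A character vector, for contrast.
s <- c("alpha", "beta")
is.character(s)   # TRUE
is.logical(x < 3) # TRUE
```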

  • A missing value is represented by NA. For example:

## [1] FALSE FALSE  TRUE FALSE FALSE

  • is.na() returns TRUE for missing values.
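The input behind the output above is not shown. One vector that would give the same result, assuming a single NA in the third position (the other values are made up), is:

```r
# A numeric vector with one missing value in position 3.
y <- c(1, 2, NA, 4, 5)

# is.na() tests each element for missingness.
is.na(y)        # FALSE FALSE TRUE FALSE FALSE

# Counting the missing values.
sum(is.na(y))   # 1
```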

6.2 Dataframes

A dataframe is a collection of equal-length vectors, stored as columns, which can be of different types. Usually, each row holds one observation, and each column holds a different aspect of that observation.
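A small sketch of a dataframe whose columns have different types. The data and the names people, name, age and member are hypothetical, chosen only for illustration.

```r
# Three observations (rows), three aspects of each (columns).
people <- data.frame(
  name   = c("Ana", "Ben", "Cy"),   # character column
  age    = c(31, 45, 27),           # numeric column
  member = c(TRUE, FALSE, TRUE),    # logical column
  stringsAsFactors = FALSE          # keep strings as character in any R version
)

# Each column keeps its own type.
sapply(people, class)

# Dimensions: 3 rows, 3 columns.
dim(people)
```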

6.3 Factor

A categorical variable in a dataframe may be considered a factor, and each of its categories a level.

##   c1  c2       c3          c4
## 1  a Yes  English  0.34134757
## 2  a Yes Japanese -0.63737743
## 3  a  No  English -0.01157832
## 4  a  No Japanese -0.24766104
## 5  b Yes  English -0.72545116
## 6  b Yes Japanese -1.14623683
## 7  b  No  English  0.35646986
## 8  b  No Japanese -0.37289745
## [1] a a a a b b b b
## Levels: a b
  • rep(x) replicates the values in x
  • cbind() combines its arguments by columns
  • as.data.frame(x) coerces x into a dataframe
  • $ is used to access a column (here, a factor) of a dataframe
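The code behind the output above is not shown. The following is one way to build a dataframe of the same shape from the functions listed, offered as a sketch rather than the original code. It assumes stringsAsFactors = TRUE so that the text columns become factors (since R 4.0 this is no longer the default), and the random values in c4 will differ from those printed above.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# rep() builds the repeating category patterns.
c1 <- rep(c("a", "b"), each = 4)
c2 <- rep(c("Yes", "No"), each = 2, times = 2)
c3 <- rep(c("English", "Japanese"), times = 4)

# cbind() joins the vectors as columns of a character matrix;
# as.data.frame() coerces that matrix into a dataframe of factors.
df <- as.data.frame(cbind(c1, c2, c3), stringsAsFactors = TRUE)

# Add the numeric column afterwards so that it stays numeric.
df$c4 <- rnorm(8)

df
df$c1          # a factor
levels(df$c1)  # "a" "b"
```

Adding c4 after the coercion is a deliberate choice: passing it through cbind() together with the character vectors would turn the numbers into text.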

6.4 Indexing

A row of a dataframe can be selected using square brackets [], with the row index placed before the comma:

##   c1 c2      c3          c4
## 3  a No English -0.01157832
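A sketch of the row selection, assuming the dataframe is called df and rebuilt here so the example runs on its own (the c4 values are random and will not match the output above):

```r
df <- data.frame(
  c1 = rep(c("a", "b"), each = 4),
  c2 = rep(c("Yes", "No"), each = 2, times = 2),
  c3 = rep(c("English", "Japanese"), times = 4),
  c4 = rnorm(8),
  stringsAsFactors = TRUE
)

# Row index before the comma, nothing after it: all columns of row 3.
row3 <- df[3, ]
row3
```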

A column may be selected the same way, with the column index placed after the comma:

## [1] Yes Yes No  No  Yes Yes No  No 
## Levels: No Yes
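A sketch of the column selection, again assuming a dataframe named df with the same layout (rebuilt here without the numeric column to keep the example short):

```r
df <- data.frame(
  c1 = rep(c("a", "b"), each = 4),
  c2 = rep(c("Yes", "No"), each = 2, times = 2),
  c3 = rep(c("English", "Japanese"), times = 4),
  stringsAsFactors = TRUE
)

# Nothing before the comma, column index after it: all rows of column 2.
col2 <- df[, 2]
col2

# The same column can be reached by name.
identical(col2, df[, "c2"])  # TRUE
identical(col2, df$c2)       # TRUE
```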

Row and column indexing can be combined:

##   c1      c3
## 1  a English
## 3  a English
## 5  b English
## 7  b English
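The exact expression behind this output is not shown. One possibility, assuming the rows were picked with a logical condition on c3 and the columns by name, is sketched below; selecting by position gives the same result.

```r
df <- data.frame(
  c1 = rep(c("a", "b"), each = 4),
  c2 = rep(c("Yes", "No"), each = 2, times = 2),
  c3 = rep(c("English", "Japanese"), times = 4),
  stringsAsFactors = TRUE
)

# Rows where c3 is "English", columns c1 and c3 only.
sub <- df[df$c3 == "English", c("c1", "c3")]
sub

# Equivalent selection by row and column positions.
sub_pos <- df[c(1, 3, 5, 7), c(1, 3)]
identical(sub, sub_pos)  # TRUE
```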