In this chapter, we introduce the concept of objects in R, which is fundamental to working effectively with the language. Objects can take many forms, and those we create during a session are kept in a central workspace called the global environment. Assigning valid and meaningful names to objects using the assignment operator (<-) is essential for writing clear, organized, and shareable code.
R works with objects, which include everything we interact with or encounter in R, including numbers, data structures, functions, and outputs such as plots. Objects can either come from R packages or be created by the user. User-created objects are assigned names specified by the user. R stores these objects in the global environment, enabling easy access and manipulation throughout the R session.
In R, objects typically have several properties, known as attributes, which define their structure and guide how R interprets them. We can access an object’s attributes using the attributes() function. However, not all R objects have attributes; in such cases, attributes() returns NULL.
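As a minimal sketch of the NULL case (the vectors plain and named are illustrative and not taken from the chapter), a plain numeric vector carries no attributes, whereas naming its elements adds a names attribute:

```r
# A plain vector has no attributes, so attributes() returns NULL
plain <- c(1, 2, 3)
attributes(plain)

# Naming the elements adds a "names" attribute
named <- c(a = 1, b = 2, c = 3)
attributes(named)
```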
For example, the well-known anscombe dataset, which contains 11 rows and 8 columns, has three key attributes. These include names (the column names), class (which is data.frame), and row.names (which represent the row identifiers).
attributes(anscombe)
$names
[1] "x1" "x2" "x3" "x4" "y1" "y2" "y3" "y4"
$class
[1] "data.frame"
$row.names
[1] 1 2 3 4 5 6 7 8 9 10 11
Two related functions that reveal a dataset’s structure are class() and dim().
Note that dim() is particularly useful for a data frame because its dimensions are not stored as an explicit attribute; instead, they are derived from the lengths of row.names and names.
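The following sketch applies class() and dim() to the built-in anscombe data frame and then checks, under the description above, that the dimensions can be recovered from row.names and names:

```r
# Class and dimensions of the built-in anscombe data frame
class(anscombe)
dim(anscombe)          # number of rows, then number of columns

# "dim" is not among the stored attributes of a data frame
"dim" %in% names(attributes(anscombe))

# The same two numbers recovered from row.names and names
c(length(row.names(anscombe)), length(names(anscombe)))
```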
In R, the recommended way to assign a value to an object is by using the left-arrow assignment operator (<-), which combines the less-than sign (<) and the hyphen (-). In RStudio, the keyboard shortcut is Alt + - on Windows/Linux and Option + - on Mac.
For example, to store the value 1/40 in an object named x, we assign it as follows:
x <- 1/40 # assignment (right to left)
Note that assignment does not automatically print the result to the Console. Instead, R stores the value in the object x for later use. If we check the Environment tab in RStudio, we will see x and its corresponding value listed there. To retrieve the stored value, we can simply call the object x, as shown below:
x
[1] 0.025
However, if the assignment expression is enclosed in parentheses, R will evaluate the expression and display the result immediately. For example:
(x <- 1/40)
[1] 0.025
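Outside RStudio, the workspace can also be inspected from the Console. The sketch below uses base R's ls(), exists(), and rm(); the object tmp_value is a hypothetical name introduced only for this illustration, so the chapter's x is left untouched:

```r
tmp_value <- 1/40          # hypothetical object for this illustration

"tmp_value" %in% ls()      # ls() lists the names of objects in the workspace
exists("tmp_value")        # TRUE while the object is stored

rm(tmp_value)              # remove the object again
exists("tmp_value")        # no longer found after removal
```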
It is important to include spaces before and after comparison and assignment operators. For example, consider evaluating the expression “x less than -1/50” (note that the current value of x is 1/40):
x < -1/50 # with spaces
[1] FALSE
The result is FALSE because the value of x (which is 1/40) is greater than -1/50.
x<-1/50 # without spaces
x
[1] 0.02
If we omit the spaces, R reads <- as the assignment operator rather than as a comparison followed by a minus sign, so the line is equivalent to x <- 1/50. As a result, the value of x, which was originally 1/40, is reassigned to 1/50 (0.02).
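One way to guard against this pitfall, offered here as a suggestion rather than something the chapter prescribes, is to wrap a negative right-hand side in parentheses. R then reads the expression as a comparison even without spaces. The object z is a hypothetical name, chosen so that the current value of x is not disturbed:

```r
z <- 1/40                  # hypothetical object for this illustration

# Parentheses around the negative value force a comparison, spaces or not
z<(-1/50)

# z is unchanged because no assignment took place
z
```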
It is important to note that the object x can also be used in place of a numeric value in any function that expects a numeric input. For example:
log(x) # given that x is currently 1/50, this computes log(1/50)
[1] -3.912023
Additionally, assignment expressions can include the object being assigned to, as demonstrated below:
x <- x + 1
x
[1] 1.02
In this expression, x on the right-hand side refers to the current value of x (1/50), and the expression adds 1 to it before assigning the result back to x.
It is also possible to use the equal sign (=) for assignment or the right-arrow assignment operator (->) for rightwards assignment, though most R users prefer <- over both.
For example:
x = 1/40 # assignment (right to left), equivalent to <-
x
[1] 0.025
or
1/40 -> x # assignment (left to right)
x
[1] 0.025
It’s advisable to be consistent with the assignment operator we use.
Valid object names must adhere to specific rules: they must start with a letter (or with a period that is not followed by a digit) and may include letters, numbers, underscores (_), and periods (.). However, they cannot contain spaces or be reserved words such as TRUE, FALSE, or NA, as these have special meanings in R. When naming an object with multiple words, various naming conventions can be used. Common styles include periods.between.words, underscores_between_words, and camelCaseToSeparateWords.
The choice of what we use is up to us, but it’s crucial to maintain consistency. If assistance is needed, we can refer to:
??make.names
??clean_names
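As a rough way to test a candidate name, base R's make.names() turns any string into a syntactically valid name, so a string that comes back unchanged was already valid. The helper is_valid_name() and the vector candidates below are hypothetical conveniences for this illustration, not base R objects:

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid_name <- function(name) {
  make.names(name) == name
}

candidates <- c("my_var", "my.var", "myVar", "my var", "2x", "TRUE", "NA")
is_valid_name(candidates)

# make.names() also shows how each invalid name would be repaired
make.names(candidates)
```

In this sketch the first three candidates pass, while the name containing a space, the name starting with a digit, and the two reserved words do not.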
It is also important to note that R is case-sensitive, meaning it distinguishes between uppercase and lowercase letters.
Y <- 50
Y
[1] 50
If we try to access the lowercase y instead, R will return an error because it does not recognize it as the same object:
y
Error: object ‘y’ not found
Objects can store more than just numbers, including strings of characters, which must be enclosed in quotes. For example:
sentence <- "the Fellowship of the Ring consisted of Frodo plus eight others"
The data type stored in an object determines which operations can be performed on it. For instance, if a number is stored as a character string instead of a numeric type, arithmetic operations won’t work. Consider this example:
eight <- "8"
eight + 1
Since eight is a character string here, R cannot perform addition, resulting in this error:
Error in eight + 1: non-numeric argument to binary operator
To apply arithmetic operations, the object must contain a numeric value. We can convert the character “8” to numeric with the as.numeric() function before adding (see Chapter @ref(rvectors) for the concept of “explicit coercion”).
eight <- "8"
as.numeric(eight) + 1
[1] 9
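Before converting, it can help to check how a value is stored. The sketch below uses class() and is.numeric() for that check; the object word is an assumption added for illustration. It also shows that as.numeric() yields NA, with a warning (suppressed here), when the text does not look like a number:

```r
eight <- "8"
class(eight)               # the value is stored as text
is.numeric(eight)

# Conversion succeeds when the string looks like a number
as.numeric(eight) + 1

# Hypothetical object: text that is not a number converts to NA
word <- "eight"
suppressWarnings(as.numeric(word))
```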
Notice that R commands are typically written on separate lines, but they can also be combined on a single line by separating them with a semicolon (;). For example, the previous two commands could be written on one line like this:
eight <- "8"; as.numeric(eight) + 1 # R commands can be separated by a semicolon
[1] 9
While this approach is syntactically valid, it is generally discouraged because it reduces code readability and makes debugging more challenging.