1 R and RStudio
1.1 About R and RStudio
R is free software for statistical computing and graphics, although its use is not limited to statistics.
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It includes:
an effective data handling and storage facility;
a suite of operators for calculations on arrays, in particular matrices;
a large, coherent, integrated collection of intermediate tools for data analysis;
graphical facilities for data analysis and display either on-screen or on hardcopy;
a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input and output facilities.
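As a rough illustration of the facilities listed above, the following minimal sketch stores some data, applies a matrix operator and defines a recursive function. The names heights, m, m_squared and factorial_rec are hypothetical and chosen only for this example.

```r
# Data handling and storage: a numeric vector kept in a variable
heights <- c(1.62, 1.75, 1.80)

# Operators on arrays: a 2 x 2 matrix (filled column by column)
# and its matrix product with itself
m <- matrix(c(1, 2, 3, 4), nrow = 2)
m_squared <- m %*% m

# Programming language: a user-defined recursive function
# that uses a conditional
factorial_rec <- function(n) {
  if (n <= 1) {
    return(1)
  }
  n * factorial_rec(n - 1)
}

mean(heights)      # average of the stored values
m_squared          # result of the matrix product
factorial_rec(5)   # gives 120
```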
1.2 Installing R and RStudio
Go to https://cran.r-project.org/ - this site will offer you all necessary information on R and its packages.
Download R for Windows or macOS to install the base distribution.
Install R.
Go to https://rstudio.com/products/rstudio/ - this site will offer you all necessary information on RStudio.
Download RStudio for Windows or macOS to install the desktop distribution.
Install RStudio.
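Once both programs are installed, one way to check that R is working is to open RStudio and type the following in the console. This is only a suggested check, not part of the installation steps above. The output depends on the installed version and operating system.

```r
# Print the version string of the running R installation
R.version.string

# Show the R version, operating system and attached packages
sessionInfo()
```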
1.3 The prompt
R has a command-line interface that accepts commands typed at the prompt, which is marked by a \(>\) symbol. If you type a command and press return, R will evaluate it and print the result for you.
Try now! Write
- Lines starting with # are ignored by R and can be used to insert comments in the script.
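As a minimal sketch of what one might type at the prompt, the lines below show a comment and two simple commands. The expressions are illustrative assumptions and could be replaced by any other calculation.

```r
# This whole line is a comment and is ignored by R
1 + 2        # R evaluates the expression and prints [1] 3
sqrt(16)     # calls the built-in square-root function and prints [1] 4
```

The text after a # on the same line as a command is also treated as a comment.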
1.4 Assignments
The expression x <- 1 creates a variable called x and assigns the value 1 to it. Note that the value on the right is assigned to the variable on the left. The left hand side must contain only a single variable name.
To get the “<-” write the “<” sign and the “-” sign: “x <- 1”.
It is possible (and actually a good idea) to leave spaces between the variable name, the assignment operator and the value, but there must be no space between the \(<\) and \(-\) signs: x < - 1 is read as a comparison of x with -1, not as an assignment.
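One can also assign with = (or with ->, which assigns the value on its left to the name on its right). However, in order to avoid confusion, it is common to use <-, which is clearly distinct from the equality operator == and from the = used to name function arguments.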
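The different forms of assignment can be compared in a short sketch. The variable names x, y and z are arbitrary.

```r
x <- 1        # the usual form: assign the value 1 to x
y = 2         # also an assignment, but easy to confuse with ==
3 -> z        # rightward assignment: the value is on the left

x == 1        # equality test, not an assignment; gives TRUE

x < - 1       # with a space this is the comparison x < -1; gives FALSE
x             # x is unchanged and still holds 1
```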
1.5 Rules for defining variables
Names of variables can be chosen quite freely in R. They can be built from letters, digits, the period (dot) symbol and the underscore symbol (_).
However, one should pay attention to the following:
do not start a name with a digit, an underscore, or a period followed by a digit;
R is case sensitive, so “A” and “a” refer to different variables;
be consistent with variable names throughout the program;
avoid names that provide no description, e.g., single-letter names, unless they are parameters;
some names are already used by the system, e.g., FALSE, TRUE, exp and sum, and should not be reused for variables.
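The naming rules can be sketched as follows. All variable names here are hypothetical. The invalid cases are left as comments because running them would stop the script with an error.

```r
# Valid names built from letters, digits, periods and underscores
mean_height <- 1.75
sample.size <- 30
x2 <- 4

# Case matters: a and A are two different variables
a <- 1
A <- 2

# Invalid names (syntax errors if uncommented):
# 2x <- 4       # starts with a digit
# .2x <- 4      # starts with a period followed by a digit
# _x <- 4       # starts with an underscore

# TRUE and FALSE are reserved words and cannot be reassigned:
# TRUE <- 0     # error

# exp and sum are built-in functions, so a distinct, descriptive
# name is preferable for the result
total <- sum(c(1, 2, 3))
total             # gives 6
```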