A Taste of R
R is widely accepted as a powerful yet easy-to-learn language for data analysis. Although Python has NumPy, pandas, SymPy, Matplotlib, SciPy, and other handy modules, R can still serve as a sharp weapon for dealing with data.
This post is mainly my digest of P. Haschke's R course. Some other documents, such as the official ones, are drawn on as well.
Basics
Extremely basic operations are not covered in this text. All comments start with `#`.
- To get help, type `?<anything>` or `help(<anything>)`. `?log` is shorthand for `help(log)`, but it cannot fully replace it, since `help()` also accepts extra arguments such as `package`.
- To find objects whose names contain a given string, use `apropos("<anything>")`. For example, `apropos("mean")`. Looks familiar: Python has `dir()`.
- With `install.packages("<packageName>")`, which is better run in the console, you can get the libraries you want from a CRAN mirror. To use anything from a package in your script or session, load it with `library("<packageName>")`.
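The lookup and package commands in the list above can be tried in one short session. This is only a sketch: the package name `jsonlite` is an illustrative assumption, not something the course prescribes, so those two lines are left commented out.

```r
# apropos() returns a character vector of object names that match a pattern
mean_like <- apropos("mean")
print(mean_like)

# help() returns an object; printing that object is what opens the page,
# so assigning the result keeps a script quiet
log_help <- help("log")

# Installing and loading a package ("jsonlite" is just a placeholder choice)
# install.packages("jsonlite")   # run once, ideally in the console
# library("jsonlite")            # run in every session that needs it
```

At the console, typing `?log` or `help(log)` directly opens the help page straight away.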
Introduction
- Object types are: Vectors (one-dimensional, elements of the same type), Matrices & Arrays (two or more dimensions, same type), Lists (like vectors, but elements of different types are allowed), Data Frames (two-dimensional, table-like), Factors, and Functions (as you know).
- Modes are: integer, numeric (real numbers), complex, character (AKA strings), logical (AKA Bool).
- Assignment: `<variableName> <- <toBeAssigned>`. To get the type, use `is(variableName)`; to list and remove variables, use `ls()` and `rm(<variableToBeRemoved>)`. I have to admit this part is weird, bash and lambda?
- The function `c()` concatenates elements into a vector.
- Some functions that were new to me: `sd()` (standard deviation), `var()` (variance), `cov()` (covariance), `cor()` (correlation coefficient), and `unique()`; `which()` is rather tricky. `prod()` is awesome. The weirdest thing, I believe, is how variables are named and accessed, like `<var>.<v>` and `<var>$<v>`.
- `seq()` creates regular sequences as vectors, and `rep()` repeats. Indexing is similar to Python but more powerful: simply `Vector[Vector >= 1]` can save you lots of trouble. `summary()` shows the basic information you need. `subset()` works perfectly, much like the `filter()` function in Python.
- `print()`, `paste()`, and `cat()` handle text output. `paste()` joins values of multiple modes into one character string, and `cat()` writes to stdout without creating an object in active memory.
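Most of the functions in the list above fit into one small session. The sketch below uses made-up data; the names `scores`, `my.value`, and `person` are hypothetical and chosen only for illustration.

```r
# Assignment with <- and concatenation with c()
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)   # hypothetical data
class(scores)                          # "numeric"

# Descriptive helpers
avg      <- mean(scores)               # 5
spread   <- sd(scores)                 # sample standard deviation
distinct <- unique(scores)             # 2 4 5 7 9
above    <- which(scores > 4)          # positions 5 6 7 8, not values
product  <- prod(c(1, 2, 3, 4))        # 24

# Sequences and repetition
steps  <- seq(1, 10, by = 3)           # 1 4 7 10
repeat2 <- rep(c("a", "b"), times = 2) # "a" "b" "a" "b"

# Logical indexing and subset() select the same elements
kept_index  <- scores[scores >= 5]         # 5 5 7 9
kept_subset <- subset(scores, scores >= 5)

# A dot is just part of a name; $ reaches into a list or data frame
my.value <- 10
person   <- list(name = "Ann", age = 30)
person$age                             # 30

# paste() builds a string, cat() only writes it out
label <- paste("n =", length(scores))  # "n = 8"
cat(label, "\n")

# List the workspace, then remove one variable
ls()
rm(my.value)
```

Note that `which()` returns positions rather than values, which is probably why it feels tricky at first.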
Matrices
- `source()` runs your previously saved code. `save()` writes R objects to a file.
- Matrices are built with `matrix()`, which has three main arguments: data (an R object), nrow (number of rows), ncol (number of columns). Two more optional arguments are byrow and dimnames.
- `rbind()` & `cbind()` can be useful. `rownames()` & `colnames()` can be used to set and get names. `diag()`, you knew it already.
- Matrices use `[]` for indexing; black magic like `Matrix[Matrix[ , 2] > 4, ]` can be astonishing. `t()` for transpose, `solve()` for inverse, `det()` for determinant, `chol()` for Cholesky decomposition, `eigen()` is perfect, `crossprod()` for the matrix cross-product (`t(X) %*% Y`), `%*%` for matrix multiplication.
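The matrix functions above can be checked on a tiny example. This is a minimal sketch; the matrix `A` and its row and column names are invented for illustration, and `A` is chosen symmetric and positive definite so that `chol()` applies.

```r
# Build a 2 x 2 matrix row by row, with names on both dimensions
A <- matrix(c(2, 1,
              1, 3),
            nrow = 2, ncol = 2, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

# Stack an extra row underneath, or an extra column alongside
taller <- rbind(A, c(5, 6))   # 3 x 2
wider  <- cbind(A, c(7, 8))   # 2 x 3

# Keep only the rows whose second column exceeds 2
picked <- A[A[, 2] > 2, ]     # row r2, returned as a named vector

# Linear algebra helpers
A_t   <- t(A)                 # transpose
A_det <- det(A)               # 2 * 3 - 1 * 1 = 5
A_inv <- solve(A)             # inverse
ident <- A %*% A_inv          # identity matrix, up to rounding
A_chl <- chol(A)              # upper-triangular Cholesky factor
A_eig <- eigen(A)$values      # eigenvalues
A_cp  <- crossprod(A)         # same as t(A) %*% A
```

`picked` drops to a plain vector because only one row matches; adding `drop = FALSE` inside the brackets would keep it a matrix.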
Data Frames
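- `data()` lists and loads the datasets that ship with packages: first `library()`, then `data()`. `class()` is like `is()`; `names()` is self-explanatory.
- `read.csv()` reads CSV files, and `download.file()` fetches files. `DataFrame$name` extracts a particular column by name.
- `str()` and `describe()` give summaries (`describe()` is not in base R; add-on packages such as `psych` and `Hmisc` provide one).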
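The data frame functions above can be exercised without downloading anything. In this sketch the data frame `people` is invented, and `write.csv()` with a temporary file is my own addition so that `read.csv()` has something to read; `mtcars` is a dataset that ships with base R.

```r
# A small hand-made data frame (hypothetical values)
people <- data.frame(
  name   = c("Ann", "Bob", "Cid"),
  height = c(165, 180, 172),
  stringsAsFactors = FALSE
)

class(people)              # "data.frame"
names(people)              # "name" "height"
heights <- people$height   # extract one column by name
str(people)                # compact structure overview
summary(people)            # per-column summary

# Write to a temporary CSV file, then read it back
csv_path <- tempfile(fileext = ".csv")
write.csv(people, csv_path, row.names = FALSE)
people_again <- read.csv(csv_path, stringsAsFactors = FALSE)

# Load a built-in dataset and peek at one column
data(mtcars)
head(mtcars$mpg)
```

For a file on the web, `download.file(<url>, <destfile>)` would take the place of the `write.csv()` step.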
Graphics
- ggplot2 must be installed in advance. To add more layers, do it like this: `plot1 <- plot1 + geom_point(aes(x = FEhighway, y = FEcity))`, pretty ugly to me. :( Other options can be found in the ggplot2 documentation.
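The layering line above needs an existing `plot1` and a data frame with `FEhighway` and `FEcity` columns. A minimal sketch of the whole sequence follows; the data frame `fuel` and its values are made up, and the code skips the plot if ggplot2 is not installed.

```r
# Hypothetical fuel-economy data with the two columns used in the text
fuel <- data.frame(
  FEcity    = c(18, 22, 27, 31),
  FEhighway = c(26, 30, 36, 40)
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)

  # Start from an empty plot bound to the data, then add layers with +
  plot1 <- ggplot(fuel)
  plot1 <- plot1 + geom_point(aes(x = FEhighway, y = FEcity))
  plot1 <- plot1 + labs(x = "Highway fuel economy", y = "City fuel economy")
}
```

Typing `plot1` at the console, or calling `print(plot1)` in a script, draws the figure.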
Programs
- `if` & `ifelse()`; the other control-flow constructs are C-style and can be written in one line.
- User-defined functions are JavaScript-style; consider this:

```r
# Returns its argument added to itself
MyFunction <- function(Object) {
  Object + Object
}
```
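To see the control flow and the function above in action, here is a small sketch. The variables `x`, `values`, and the helper `Scale` are hypothetical names introduced only for this example.

```r
# One-line if / else
x <- 7
if (x > 5) size <- "big" else size <- "small"

# ifelse() is the vectorised version: one answer per element
values <- c(-2, 0, 3)
signs  <- ifelse(values >= 0, "non-negative", "negative")

# C-style loops, also written in one line
total <- 0
for (v in values) total <- total + v

i <- 0
while (i < 3) i <- i + 1

# The function from the text: the last expression is the return value
MyFunction <- function(Object) {
  Object + Object
}
doubled_number <- MyFunction(21)        # 42
doubled_vector <- MyFunction(c(1, 2))   # 2 4

# A hypothetical variant with a default argument
Scale <- function(Object, Factor = 2) {
  Object * Factor
}
tripled <- Scale(5, Factor = 3)         # 15
```

Because arithmetic in R is vectorised, `MyFunction` works on whole vectors without any loop.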