Monday, November 5, 2018

Plotting functions in R

Exploratory data visualization is one of R's main strengths. For quick, exploratory plots, R is often more flexible and efficient than other data visualization tools, such as the JavaScript library D3. In this post we will look at the graphical facilities available in R.

The built-in graphical facilities in R make it possible to display a wide variety of statistical graphs and also to build entirely new types of graph. R plotting commands can be divided into 3 basic groups:
  • High-level plotting functions create a new plot on the graphics device, possibly with axes, labels, titles and so on.
  • Low-level plotting functions add more information to an existing plot, such as extra points, lines and labels.
  • Interactive graphics functions allow you to interactively add information to, or extract information from, an existing plot, using a pointing device such as a mouse.

In addition, R maintains a list of graphical parameters that can be manipulated to customize your plots. In this post we will cover only the high-level plotting functions.

plot() function
One of the most frequently used plotting functions in R is plot(). It is a generic function, and the type of plot it produces depends on the type or class of its first argument.

Usage: plot(x, y, ...)
x -> the coordinates of points in the plot. Alternatively, a single plotting structure, function or any R object with a plot method can be provided.
y -> the y coordinates of points in the plot, optional if x is an appropriate structure.
... -> arguments to be passed to methods, such as graphical parameters.
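
To illustrate how the plot produced depends on the class of the first argument, here is a minimal sketch that calls plot() on four kinds of object. The objects are chosen for illustration only and do not come from the original post; state.region and iris are data sets that ship with R.

```r
# Numeric vectors: a scatter plot of y against x.
x <- seq(0, 2 * pi, length.out = 50)
plot(x, sin(x))

# A factor: a bar chart of the counts in each level.
plot(state.region)

# A function: the curve drawn over the given range.
plot(cos, from = 0, to = 2 * pi)

# A data frame of numeric columns: a matrix of pairwise scatter plots.
plot(iris[, 1:4])
```

Each call goes through the same generic, which then picks a method based on the class of the first argument.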

Refer to the documentation (?plot) for more details on the usage of this function.
Let’s take a data set of US murders by state. We can plot the total murders against the population in millions using plot(), and the plot suggests that states with larger populations tend to have more murders.
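
The post does not show where its data set comes from, so the sketch below uses a stand-in, which is an assumption: the state.x77 matrix and state.region factor that ship with R (figures from the 1970s). The data frame name states and its columns population and total are hypothetical names made up for this example.

```r
# Stand-in data (an assumption): R's built-in 1970s state data,
# not the data set used in the original post.
states <- data.frame(
  state      = rownames(state.x77),
  population = state.x77[, "Population"] * 1000,  # stored in thousands
  rate       = state.x77[, "Murder"],             # murders per 100,000 residents
  region     = state.region
)

# Approximate total murders, derived from the rate and the population.
states$total <- states$rate * states$population / 1e5

# Population in millions on the x axis, total murders on the y axis.
plot(states$population / 1e6, states$total,
     xlab = "Population (millions)",
     ylab = "Total murders",
     main = "Murders against population by state")
```

The xlab, ylab and main arguments are graphical parameters passed through the ... argument described above.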



hist() function
Histograms are another plot we can make quickly. A histogram is a powerful graphical summary of a list of numbers. We can use the hist() function to see the distribution of murder rates across US states.
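
A minimal sketch of such a histogram follows. As an assumption, it uses the murder rates in R's built-in state.x77 matrix (1970s figures) as a stand-in for the post's data, which is not shown.

```r
# Stand-in data (an assumption): murder rates per 100,000 residents
# from R's built-in state.x77 matrix, not the post's data set.
rate <- state.x77[, "Murder"]

# hist() draws the plot and also returns the binning it used.
h <- hist(rate,
          xlab = "Murders per 100,000 residents",
          main = "Distribution of state murder rates")

# Number of states that fall into each bin.
h$counts
```

By default hist() chooses the bin boundaries itself; the breaks argument can be used to suggest a different number of bins.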



boxplot() function
Another powerful plotting function is boxplot(). A boxplot provides a terser summary than a histogram, but boxplots are easier to stack against each other, so we can see many distributions in one plot. Here we use them to compare the murder rates of different regions. In our example we stratify the state murder rates by region and generate a boxplot for each stratum. Use the formula rate ~ region to create the strata, and specify the data set with the data argument.
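
The formula interface can be sketched as follows. The column names rate and region follow the formula in the text; the data itself is a stand-in and an assumption, built from the state.x77 matrix and state.region factor that ship with R, and the data frame name states is hypothetical.

```r
# Stand-in data (an assumption): one row per state, with its murder
# rate per 100,000 residents and its census region.
states <- data.frame(
  rate   = state.x77[, "Murder"],
  region = state.region
)

# rate ~ region reads "rate, split by region": one box per region.
boxplot(rate ~ region, data = states,
        xlab = "Region",
        ylab = "Murders per 100,000 residents")
```

Because the formula names columns rather than vectors, the data argument tells boxplot() where to look them up.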



Refer to the R manual for more documentation on these plotting functions. In future posts we will explore the ggplot2 package, which is more powerful and shows what R can really do when it comes to data visualization.