Hello All,
In this blog, let's get into the nitty-gritty of R. We will briefly discuss the different data types used in R and some basic operations on them. We will then get hands-on: loading data frames in R, removing fill values or missing values (NA) from these data frames, and summarising the datasets.
1.# R Data Types: Objects and Attributes
Everything in R is an object.
R has 5 basic classes of objects:
character
numeric (real numbers)
integer
complex
logical (TRUE/FALSE)
Data may belong to any one of the above classes, or objects may be combined to form data structures. The most basic object is a vector. An atomic vector contains objects of the same class: it holds only characters, or only numerics, or only one of the other classes listed above.
There is one exception: a list, which is also represented as a vector, can contain objects of different classes.
Below are examples of atomic character vectors, numeric vectors, integer vectors, etc.
character: "a", "swc"
numeric: 2, 15.5
integer: 2L (the L tells R to store this as an integer)
logical: TRUE, FALSE
complex: 1+4i (complex numbers with real and imaginary parts)
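The values above can be combined into vectors with c(). The following is a minimal sketch (the variable names are made up for illustration) that also shows the difference between an atomic vector and a list when classes are mixed:

```r
# One atomic vector per basic class; c() combines values into a vector.
chr <- c("a", "swc")
num <- c(2, 15.5)
int <- c(2L, 7L)
lgl <- c(TRUE, FALSE)
cpl <- c(1+4i, 2-1i)

# Mixing classes inside an atomic vector coerces every element
# to one common class (here: character).
mixed_vec <- c(1, "a", TRUE)
class(mixed_vec)            # "character"

# A list keeps each element's own class.
mixed_list <- list(1, "a", TRUE)
sapply(mixed_list, class)   # "numeric" "character" "logical"
```

Note that R does not raise an error when classes are mixed in c(); it silently converts, which is worth keeping in mind when a numeric column unexpectedly turns into text.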
R provides some very handy functions to examine the features of vectors and other objects:
class() - what kind of object is it (high-level)?
typeof() - what is the object’s data type (low-level)?
length() - how long is it? (for one-dimensional objects)
attributes() - does it have any metadata?
Consider the following examples for clarity:
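A short sketch of these four functions on two small vectors. The vectors x and y are made-up sample values, not data from the original post:

```r
x <- c(2L, 5L, 9L)   # integer vector
class(x)             # "integer"
typeof(x)            # "integer"
length(x)            # 3
attributes(x)        # NULL, a plain vector carries no metadata

y <- c(2, 15.5)      # numeric vector
class(y)             # "numeric" (high-level)
typeof(y)            # "double"  (low-level storage type)

# Giving the elements names adds metadata that attributes() reports.
names(y) <- c("first", "second")
attributes(y)        # a list with one element, $names
```

The numeric vector shows why both class() and typeof() exist: the class is "numeric", while the underlying storage type is "double".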
2.# R Data Structures:
R has many data structures. These include:
atomic vector
list
matrix
data frame
factor
Both lists and atomic vectors are types of vectors. However, as explained above, a list can combine objects of different classes, while an atomic vector consists of objects of a single class.
An empty vector is created with vector().
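As a small illustration (variable names are hypothetical), vector() can be called with no arguments or with a mode and a length:

```r
empty <- vector()                          # defaults to an empty logical vector
typeof(empty)                              # "logical"
length(empty)                              # 0

chars <- vector("character", length = 3)   # three empty strings
nums  <- vector("numeric", length = 2)     # two zeros
lst   <- vector("list", length = 2)        # a list holding two NULLs
```

Pre-allocating a vector of known length like this is a common way to prepare storage before filling it in a loop.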
Similarly, the other data structures (matrix, data frame and factor) can be created by calling the base functions matrix(), data.frame() and factor().
We will now focus on data frames, with the objective of loading a data frame in R, exploring it and removing its NA values.
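A minimal sketch of the three constructors, using made-up values purely for illustration:

```r
# matrix(): values are filled column by column by default.
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)        # 2 3

# data.frame(): each argument becomes a column.
df <- data.frame(id = 1:3, name = c("a", "b", "c"))
dim(df)       # 3 2

# factor(): categorical data; levels are sorted alphabetically by default.
f <- factor(c("low", "high", "low"))
levels(f)     # "high" "low"
```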
A data frame is a two-dimensional data structure in R. It is a special case of a list in which every component has equal length.
Many of R's data input functions, such as read.table(), read.csv(), read.delim() and read.fwf(), read data into a data frame, as does read.xlsx() from add-on packages such as openxlsx.
In this example we will load a comma-separated values (.csv) file into RStudio, explore the data and try removing the NA values.
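The "list with equal-length components" idea can be checked directly. This is a small sketch with invented column names and values:

```r
df <- data.frame(id = 1:3, score = c(9.5, 7, 8.25))

is.list(df)          # TRUE, a data frame is stored as a list
typeof(df)           # "list"
sapply(df, length)   # every column (component) has length 3

# Columns whose lengths cannot be recycled to match are rejected.
bad <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
inherits(bad, "try-error")   # TRUE
```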
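The original dataset is not included here, so the sketch below writes a tiny, made-up CSV to a temporary file first. The file contents, the column names and the variable names (csv_path, weather, clean) are all assumptions for illustration; with real data, only the read.csv() call and the steps after it would be needed.

```r
# Hypothetical sample data, written to a temporary file so the
# example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,temp,city",
  "1,21.5,Pune",
  "2,NA,Delhi",
  "3,19.0,",
  "4,25.2,Mumbai"
), csv_path)

# na.strings lists the text values that should be read as missing (NA).
weather <- read.csv(csv_path, na.strings = c("NA", ""), stringsAsFactors = FALSE)

# Explore the data.
str(weather)
head(weather)
colSums(is.na(weather))     # number of missing values per column

# Drop every row that contains at least one NA.
clean <- na.omit(weather)
# An equivalent alternative: weather[complete.cases(weather), ]

nrow(clean)                 # 2 rows remain out of 4
summary(clean)              # quick per-column summary
```

na.omit() removes whole rows, which is the simplest option. If a dataset uses a fill value such as -999 for missing readings, adding that value to na.strings converts it to NA at load time so the same step handles it.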
In this way, we can remove NA values from our data and take a quick look at its summary.
With that, we have achieved our objective of cleaning the data and viewing its summary.
Thank you all !!!!!
Take care