A vector is a sequence of data elements of the same basic data type. You cannot mix different data types in a single vector; if you try, R converts (coerces) all the elements to the most general of the types involved.
In R, you will find the following basic data types:
- Numeric (e.g. 2, 2.5, -45.4, etc)
- Integer (e.g. 1L, 5L, -10L, etc)
- Character (e.g. 'R', 'Professionals', etc)
- Logical (e.g. TRUE, FALSE, T, F)
- Complex (e.g. 5+3i, where i is the imaginary unit and 3i is the imaginary component)
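As a quick check of the list above, the class function reports the basic type of a value. This is only a small sketch; the literals are arbitrary examples.

```r
# class() returns the basic data type of each value
class(2.5)     # "numeric"
class(5L)      # "integer" (the L suffix marks an integer literal)
class("R")     # "character"
class(TRUE)    # "logical"
class(5+3i)    # "complex"
```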
Creating a vector
The c function combines elements, lists or vectors into a vector. Using the assignment operator, <-, you can create an initial vector as shown below:
student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)
print(student.grades)
The output looks like the line below. (Keep each statement on its own line; R reports a syntax error if two statements share a line without a semicolon between them.)
[1] 9.00 9.10 8.10 9.50 9.45
Creating a vector with potentially mixed data types
Suppose you create a vector with mixed data types, as shown below:
mixed.datatypes <- c(TRUE, F, 10, 'Skill', 9L, 8+5i)
print(mixed.datatypes)
It gives the following output:
[1] "TRUE" "FALSE" "10" "Skill" "9" "8+5i"
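Further, when you check the data type of the vector, for example with class(mixed.datatypes), R reports "character".
This indicates that R has converted the rest of the elements to the character data type, because the vector contains one character element.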
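The following sketch shows the coercion described above. The two shorter vectors are added here only for illustration; they show which type wins when no character element is present.

```r
mixed.datatypes <- c(TRUE, F, 10, 'Skill', 9L, 8+5i)
class(mixed.datatypes)    # "character": one string forces every element to character

# Illustrative extras (not from the example above):
class(c(TRUE, 10, 9L))    # "numeric": logical and integer are promoted to numeric
class(c(TRUE, 9L))        # "integer": logical is promoted to integer
```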
Naming a vector
While the c function lets us combine elements, attaching a name to each element makes the data more meaningful. The names function lets us do that.
When you execute the following statement, the names get assigned to the vector elements:
names(student.grades) <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')
Printing student.grades afterwards shows each grade under its name.
The names attribute should have the same length as the vector.
- If you give more names than the vector has elements, you get an error that looks like the one below:
- Error in names(student.grades) <- c("Aayush", "Pratyush", "John", "Alisha", : 'names' attribute must be the same length as the vector
- If you give fewer names than the vector length, R uses <NA> for the missing names
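Both cases in the list above can be tried safely with a sketch like the following. The short names A to F and the variable too.many are made up for this illustration, and tryCatch is used only so that the error does not stop the script.

```r
student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)

# Fewer names than elements: the missing names become NA
names(student.grades) <- c('Aayush', 'Pratyush', 'John')
names(student.grades)

# More names than elements: R raises an error, captured here as a string
too.many <- tryCatch({
  names(student.grades) <- c('A', 'B', 'C', 'D', 'E', 'F')
  "no error"
}, error = function(e) "error")
too.many    # "error"
```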
Alternatively, we can first define the names (in our example, the student names) and then assign them to the vector. The following example demonstrates this:
> student.names <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')
> names(student.grades) <- student.names
> print(student.grades)
  Aayush Pratyush     John   Alisha    Peter
    9.00     9.10     8.10     9.50     9.45
Vector elements can be accessed by index as well as by name.
Accessing using Names and Indexes
The following example shows the element with the name "John" as well as the first element:
> student.grades["John"]
John
 8.1
> student.grades[1]
Aayush
     9
Further, you can apply functions like sort to get the sorted names and the corresponding values. The following example shows one such application:
> student.names <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')
> student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)
> names(student.grades) <- student.names
> print(student.grades[sort(student.names)])
  Aayush   Alisha     John    Peter Pratyush
    9.00     9.50     8.10     9.45     9.10
- Unlike arrays in some languages (which are 0-based), vector indexes in R are 1-based. This means that the first element is at index 1.
Accessing using negative index
While this sounds unusual, you can indeed access a vector using a negative index. In that case, the element at the corresponding absolute position is left out and the rest of the elements are returned. The original vector itself is not modified.
For example, accessing the vector below using -4 leaves out the fourth element and returns the rest of the vector:
> student.names <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')
> student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)
> names(student.grades) <- student.names
> student.grades[-4]
  Aayush Pratyush     John    Peter
    9.00     9.10     8.10     9.45
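To see that negative indexing returns a new vector rather than changing the original, you can try a sketch like this one. The variable names without.fourth and without.first.and.fourth are made up for this illustration.

```r
student.names <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')
student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)
names(student.grades) <- student.names

# Leave out the fourth element (Alisha)
without.fourth <- student.grades[-4]
names(without.fourth)     # "Aayush" "Pratyush" "John" "Peter"

# Several negative indexes can be combined with c
without.first.and.fourth <- student.grades[c(-1, -4)]
names(without.first.and.fourth)    # "Pratyush" "John" "Peter"

# The original vector still has all five elements
length(student.grades)    # 5
```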
Accessing out-of-range index
As you would expect, an out-of-range index does not return a real value. When you try to access an element beyond the end of the vector (for student.grades, any index greater than 5), R does not raise an error. Instead it returns NA, and the name of the result is shown as <NA>:
<NA>
  NA
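A small sketch of this behaviour follows. The index 6 and the name "Nobody" are arbitrary choices for illustration; any index past the end, or any name that is not present, behaves the same way.

```r
student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)
names(student.grades) <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')

out.of.range <- student.grades[6]          # index past the end: NA, no error
unknown.name <- student.grades["Nobody"]   # name not present: also NA

is.na(out.of.range)     # TRUE
is.na(unknown.name)     # TRUE
```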
Accessing more than one element in the order of your choice
There may be situations where you need to access the same element more than once, or a few elements in a given order. You can use the combine (c) function to build the index for this. The following example demonstrates this:
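One way this might look is sketched below. The chosen positions (5, 1, 1, 3) and the variable name picked are arbitrary and only for illustration.

```r
student.grades <- c(9, 9.1, 8.1, 9.5, 9.45)
names(student.grades) <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')

# Fifth element first, then the first element twice, then the third
picked <- student.grades[c(5, 1, 1, 3)]
names(picked)    # "Peter" "Aayush" "Aayush" "John"

# The same works with names instead of positions
student.grades[c('Peter', 'Aayush', 'Aayush', 'John')]
```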
Adding two vectors
When you add two vectors with the same data type, you may have one of the following situations:
- Vectors are of equal length or
- They are of different lengths
When they are of equal length, the element values at the same index get added. However, if the two vectors are of different lengths, then the elements of the shorter vector are reused from the beginning (recycled) until the length of the longer vector is covered.
Let’s look at these two situations through an example.
Same length vectors
Let’s look at the value of student.netscore below, where we add two vectors of equal length:
student.math.grade <- c(9.5, 9.4, 9.1, 9.8, 9.7)
student.science.grade <- c(9.1, 9.6, 8.5, 8.8, 8.7)
student.netscore <- student.math.grade + student.science.grade
names(student.netscore) <- student.names
As expected, the output shows that the science and maths grades have been added for each student.
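The following sketch repeats the statements above in a self-contained form and prints the result, so that you can compare each sum with the two inputs.

```r
student.names <- c('Aayush', 'Pratyush', 'John', 'Alisha', 'Peter')
student.math.grade <- c(9.5, 9.4, 9.1, 9.8, 9.7)
student.science.grade <- c(9.1, 9.6, 8.5, 8.8, 8.7)

# Element-by-element addition: 9.5 + 9.1, 9.4 + 9.6, and so on
student.netscore <- student.math.grade + student.science.grade
names(student.netscore) <- student.names
print(student.netscore)    # values 18.6 19.0 17.6 18.6 18.4 under the student names
```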
Different Length Vectors
Let’s add two vectors of different lengths and observe the outcome:
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)
student.netscore <- student.netscore + student.wholeclass.maths
names(student.netscore) <- student.names
After executing the above statements, the output has ten elements, which indicates that the five student.netscore elements were repeated (i.e. recycling happens).
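Recycling can be checked directly with a sketch like the one below. The net score values are typed in by hand from the equal-length example, and the variables combined and recycle.warning are made up for this illustration. The last statement shows that R only warns when the longer length is not a multiple of the shorter one.

```r
student.netscore <- c(18.6, 19.0, 17.6, 18.6, 18.4)
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)

# The 5-element vector is reused for elements 6 to 10 of the longer one
combined <- student.netscore + student.wholeclass.maths
length(combined)    # 10

# Lengths 3 and 4: 4 is not a multiple of 3, so R issues a warning;
# tryCatch captures the warning text instead of printing it
recycle.warning <- tryCatch(
  c(1, 2, 3) + c(10, 20, 30, 40),
  warning = function(w) conditionMessage(w)
)
recycle.warning
```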
- Subtraction works in a similar way: the elements at a given index get subtracted, and the result is a vector of the element-level differences.
- Further, in the case of vectors of different lengths, similar recycling happens. You may like to play with this to become more comfortable.
- Similarly, when you divide one vector by another, the element at each index in the dividend gets divided by the element at the same index in the divisor.
- Multiplication of two vectors works in exactly the same way. In a nutshell, arithmetic operations between two vectors are performed member-by-member.
- Note that assigning a new value to a vector variable replaces the whole vector, including its names. If the new value does not carry names itself, you need to assign the names again.
When you multiply (or divide) a vector by a single number, all the elements of the vector get multiplied (or divided) by that number.
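The member-by-member behaviour of these operations can be sketched as follows, reusing the maths and science grades from the earlier example. The result variable names are made up for this illustration.

```r
student.math.grade <- c(9.5, 9.4, 9.1, 9.8, 9.7)
student.science.grade <- c(9.1, 9.6, 8.5, 8.8, 8.7)

# Each operation pairs up the elements at the same index
grade.difference <- student.math.grade - student.science.grade
grade.product <- student.math.grade * student.science.grade
grade.ratio <- student.math.grade / student.science.grade

# A single number is applied to every element
math.percent <- student.math.grade * 10    # 95 94 91 98 97
math.half <- student.math.grade / 2
```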
Slicing a Vector
Many times you would need a slice of the vector data to be able to do some calculation.
Accessing a range of elements
Let’s consider the following vector:
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)
names(student.wholeclass.maths) <- student.names
student.wholeclass.maths
To take a slice of a vector, use the colon (:) operator between the first and last index inside the square brackets. For example, if I need the grades of the students from index 3 to index 7 (both included), the following statement gets that slice:
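A minimal sketch of such a slice is shown below. The names are left out here to keep the output short, and grade.slice is a made-up variable name.

```r
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)

# 3:7 expands to the indexes 3, 4, 5, 6, 7
grade.slice <- student.wholeclass.maths[3:7]
grade.slice    # 9.1 9.8 7.7 6.5 6.4
```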
Accessing multiple ranges
You can specify multiple ranges using the combine (c) function, separating the ranges with commas. For example, the following code selects the elements at indexes 2 to 4 and 6 to 8:
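One possible form of this selection is sketched below, again without names; two.ranges is a made-up variable name.

```r
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)

# c(2:4, 6:8) expands to the indexes 2, 3, 4, 6, 7, 8
two.ranges <- student.wholeclass.maths[c(2:4, 6:8)]
two.ranges    # 9.4 9.1 9.8 6.5 6.4 6.1
```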
Applying Filters on a vector
By putting a conditional statement inside the square brackets ([]), you can apply a logical filter on a vector. For example, the code below filters all the students whose maths grade is more than 9:
Further, you can combine conditions using the & (and) and | (or) operators. For example, the statement below shows the students whose grades are either more than 9 or less than 7:
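Both filters can be sketched as follows. The names are left out, and above.nine and extremes are made-up variable names.

```r
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)

# Grades above 9
above.nine <- student.wholeclass.maths[student.wholeclass.maths > 9]
above.nine    # 9.5 9.4 9.1 9.8

# Grades above 9 or below 7; | combines the two conditions element by element
extremes <- student.wholeclass.maths[student.wholeclass.maths > 9 |
                                     student.wholeclass.maths < 7]
extremes      # 9.5 9.4 9.1 9.8 6.5 6.4 6.1 6.7
```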
Logical Vector Index
When you applied filters on the vector, you passed a conditional statement inside the square brackets. Let’s take a look at what these conditions evaluate to on their own:
In the above statements, the conditional statements like student.wholeclass.maths>9, student.wholeclass.maths<7 and (student.wholeclass.maths>9 | student.wholeclass.maths<7) are logical vectors. In fact, grade.filter is a vector created out of one of these logical operations.
So, essentially your filter is a logical vector and eventually the indexes where the value is TRUE, get returned as the output of the filter activity.
Run the following command to see the data type of the grade.filter vector:
> class(grade.filter)
[1] "logical"
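The sketch below builds grade.filter explicitly. Which condition the text used for grade.filter is not stated, so the choice of student.wholeclass.maths > 9 here is an assumption made for illustration.

```r
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)

# Assumption: grade.filter holds the "more than 9" condition
grade.filter <- student.wholeclass.maths > 9
grade.filter           # TRUE for the first four elements, FALSE for the rest
class(grade.filter)    # "logical"

# Indexing with the logical vector keeps the positions that are TRUE
student.wholeclass.maths[grade.filter]    # 9.5 9.4 9.1 9.8
which(grade.filter)                       # 1 2 3 4
```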
Using Aggregate Functions
You can use the following aggregate functions on a vector:
- mean: arithmetic average of all the elements of the vector
- max: maximum value of all the elements
- min: minimum value of all the elements
- sd: standard deviation, which is the square root of the variance
- var: variance, a numerical measure of how the data values are dispersed around the mean
- median: the value at the middle when the data is sorted in ascending order
- range: the smallest and largest data values, returned as a vector of two elements (their difference can be obtained with diff(range(x)))
The following example shows their execution and the corresponding results:
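A minimal sketch of these functions on the whole-class maths grades is given below. The result variable names are made up for this illustration; the sd and var results are left without expected values so that you can run them and compare.

```r
student.wholeclass.maths <- c(9.5, 9.4, 9.1, 9.8, 7.7, 6.5, 6.4, 6.1, 7.8, 6.7)

grade.mean <- mean(student.wholeclass.maths)        # 7.9
grade.max <- max(student.wholeclass.maths)          # 9.8
grade.min <- min(student.wholeclass.maths)          # 6.1
grade.median <- median(student.wholeclass.maths)    # 7.75, the average of 7.7 and 7.8
grade.range <- range(student.wholeclass.maths)      # 6.1 9.8 (smallest, largest)
grade.spread <- diff(grade.range)                   # largest minus smallest

grade.var <- var(student.wholeclass.maths)          # sample variance
grade.sd <- sd(student.wholeclass.maths)            # square root of the variance
```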
I will cover the complete list of functions when I delve into statistics using R. However, in the context of vectors, you should now be able to use aggregate and statistical functions.