Computing Covariance and Correlation of Online Data in R-Project

Collecting online data can be tedious, and financial data is especially awkward because prices change from moment to moment. The R-Project example below downloads price history for any number of listed stocks, currencies, derivatives, etc. from around the world, straight to your screen. Only three tickers are shown here: National Thermal Power Corporation Limited (NTPC.NS), Australia and New Zealand Banking Group (ANZ.AX) and the BSE Sensex (^BSESN). The script then computes log returns together with their covariance and correlation matrices, so the output can be studied further.

Install R-Project from the link below

Once installed, please copy and paste the code below into the R Console window.

# Step 1: Creating a function called anomalies_functions
# so that we don't have to write the code again.
# Once the function is created, we can use it in future analyses as well.

anomalies_functions = function(x)
 {

# Required library: install quantmod if it is missing, then load it
list.of.packages <- c("quantmod")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages, dependencies=TRUE, repos='http://cran.rstudio.com/')
print("TUTORIAL BY ANOMALIES POST : Computing Covariance and Correlation of Online data in R-Project")
 library (quantmod)

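 # Creating a connection (this also checks that the internet is working) and
 # sourcing the code behind it. spl(), getSymbols.extra(), bt.start.dates() and
 # bt.prep() used below are not quantmod functions; they are expected to come from this file.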
 con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
 source(con)
 close(con)

 # Get ticker values from the list
 tickers = spl(x)
 data <- new.env() # creating a new data environment

 # Defining the from date: 3 years before today (365 x 3 = 1095 days)
 dt <- as.Date(format(Sys.time(), "%Y/%m/%d"))
 new.dt <- dt - as.difftime(1095, units="days")

 getSymbols.extra(tickers, src = 'yahoo', from = new.dt,to = Sys.Date(), env = data, auto.assign = T)

 bt.start.dates(data)
 data$symbolnames = spl(x)
 for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
 bt.prep(data, align='keep.all')
 
 cof <- data.frame(na.omit(data$prices))

 print(".........Head of Data.............") 
 print(head(cof))
 
 # Saving the data in CSV format and reading it back in
 write.csv(cof, file = "test.csv")
 cof1 = read.csv("test.csv", header=TRUE)
 
 # Dropping the row-name (date) column, which read.csv labels X
 cof1$X <- NULL
 n1 = ncol(cof1)
 
 returns = unlist(sapply(cof1, function(x) diff(log(x))))
 print(".........Summary of Data.............")
 print(summary(returns))
 print("...........Log Returns...........")
 print(head(returns))
 print("..........Correlation Matrix............")
 print(cor(returns))
 print("..........Covariance Matrix............")
 print(cov(returns))
 print(".........Average Returns.............")

 y1 = ncol(returns)
 areturns = c()
 # Printing the mean daily log return of each security
 for (i in 1:y1)
 {
 print(paste(colnames(cof1)[i], round(mean(returns[,i]), digits=6)))
 areturns = cbind(areturns, mean(returns[,i]))
 }
 
cat("TUTORIAL OUTPUT END : ANOMALIES POST")
 }
 # End of function

# Step 2: Creating a dummy list of stocks
# Multiple stocks and securities can be added here
dummy_list = c("NTPC.NS", "ANZ.AX", "^BSESN")

# Step 3: Calling the function created above on the dummy list
anomalies_functions(dummy_list)
Please feel free to contact me at abhishek.kanther@uq.net.au if you have any queries.
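
For readers who want to check the calculation without an internet connection, here is a minimal sketch of the same log-return, covariance and correlation steps. The prices and the column names A, B and C are made up for illustration and are not real market data.

```r
# Hypothetical closing prices for three made-up securities (not real market data)
prices <- data.frame(
  A = c(100, 102, 101, 105, 107),
  B = c(50, 49, 51, 52, 50),
  C = c(20, 21, 21.5, 22, 23)
)

# Daily log returns: r_t = log(P_t) - log(P_(t-1)), one column per security
returns <- sapply(prices, function(p) diff(log(p)))

cov_mat <- cov(returns)  # sample covariance matrix of the returns
cor_mat <- cor(returns)  # Pearson correlation matrix of the returns

# Correlation is covariance divided by the product of the standard deviations
sds <- apply(returns, 2, sd)
rescaled <- cov_mat / outer(sds, sds)

print(cov_mat)
print(cor_mat)
print(all.equal(cor_mat, rescaled, check.attributes = FALSE))
print(colMeans(returns))  # average log return per security
```

Five prices give four returns per security, and the last check confirms that the correlation matrix is simply the covariance matrix rescaled by the standard deviations.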

#Example Output

[1] "TUTORIAL BY ANOMALIES POST : Computing Covariance and Correlation of Online data in R-Project"
[1] ".........Head of Data............."
           NTPC.NS   ANZ.AX  X.BSESN
2013-08-19  125.83 22.86195 18307.52
2013-08-20  125.27 22.95514 18246.04
2013-08-21  120.21 23.08716 17905.91
2013-08-22  120.67 22.90855 18312.94
2013-08-23  118.74 23.01726 18519.44
2013-08-26  121.82 23.15705 18558.13
[1] ".........Summary of Data............."
    NTPC.NS               ANZ.AX                X.BSESN
 Min.   :-0.1248523   Min.   :-0.0778454   Min.   :-0.0611971
 1st Qu.:-0.0088761   1st Qu.:-0.0067493   1st Qu.:-0.0043928
 Median : 0.0006691   Median : 0.0009281   Median : 0.0005266
 Mean   : 0.0003390   Mean   : 0.0002141   Mean   : 0.0005768
 3rd Qu.: 0.0109150   3rd Qu.: 0.0075698   3rd Qu.: 0.0060836
 Max.   : 0.1011435   Max.   : 0.0541337   Max.   : 0.0370346
[1] "...........Log Returns..........."
          NTPC.NS       ANZ.AX      X.BSESN
[1,] -0.004460382  0.004067920 -0.003363860
[2,] -0.041231195  0.005734742 -0.018817184
[3,]  0.003819334 -0.007766416  0.022477049
[4,] -0.016123318  0.004734166  0.011213078
[5,]  0.025608317  0.006054901  0.002087053
[6,] -0.061882390  0.002009893 -0.032311160
[1] "..........Correlation Matrix............"
          NTPC.NS    ANZ.AX   X.BSESN
NTPC.NS 1.0000000 0.1047407 0.4711408
ANZ.AX  0.1047407 1.0000000 0.3174380
X.BSESN 0.4711408 0.3174380 1.0000000
[1] "..........Covariance Matrix............"
             NTPC.NS       ANZ.AX      X.BSESN
NTPC.NS 3.320287e-04 2.478399e-05 8.420792e-05
ANZ.AX  2.478399e-05 1.686305e-04 4.043351e-05
X.BSESN 8.420792e-05 4.043351e-05 9.621191e-05
[1] ".........Average Returns............."
[1] "NTPC.NS 0.000339"
[1] "ANZ.AX 0.000214"
[1] "X.BSESN 0.000577"
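
The two matrices in the output are linked: each correlation is the matching covariance divided by the product of the two standard deviations. As a rough check, the sketch below retypes the covariance matrix printed above and rescales it with base R's cov2cor(). The variable names are my own, and the result should agree with the printed correlation matrix up to rounding.

```r
# Covariance matrix copied by hand from the example output above
assets <- c("NTPC.NS", "ANZ.AX", "X.BSESN")
cov_mat <- matrix(
  c(3.320287e-04, 2.478399e-05, 8.420792e-05,
    2.478399e-05, 1.686305e-04, 4.043351e-05,
    8.420792e-05, 4.043351e-05, 9.621191e-05),
  nrow = 3, byrow = TRUE, dimnames = list(assets, assets)
)

# Daily standard deviation of each return series (square root of the diagonal)
daily_sd <- sqrt(diag(cov_mat))

# Rescale covariances to correlations: cor_ij = cov_ij / (sd_i * sd_j)
cor_from_cov <- cov2cor(cov_mat)

print(round(daily_sd, 6))
print(round(cor_from_cov, 7))
```

The diagonal of the covariance matrix also shows that NTPC.NS has the largest daily variance of the three series and the Sensex the smallest.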
