I have been spending some time working with the 'rmarkdown' package, and it has really changed the way I approach analyses in R. Where before I was copying code from a text editor into the console window to run analyses in piecemeal steps, I can now use RMarkdown to make fully documented write-ups of analyses. All you have to do at the end is hit 'Knit', and it generates HTML, PDF, or Word output of the full write-up. I've copied the Markdown document from a practice session where I worked on generating random data from known distributions and practiced writing functions. The RMarkdown document follows:
Data Simulation
Dan Walker
September 8, 2015
This document is a practice exercise in using RMarkdown, generating simulated data, plotting with ggplot2, and writing functions.
Packages to include: stats, ggplot2
- Generating new data from given distributions
- Normally Distributed - n = 100, mean = 0, SD = 1
library(stats)
library(ggplot2)
Normaldist <- rnorm(100, 0, 1)
qplot(Normaldist, geom = 'histogram', main = 'Normally Distributed Data')
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
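The stat_bin message above goes away once a bin width is given explicitly, and fixing the random seed makes the draws repeat on every knit. A minimal sketch, assuming the ggplot() interface rather than qplot(); the seed value, the 0.25 bin width, and the names normal_draws and normal_hist are illustrative choices, not part of the original session:

```r
library(ggplot2)

# Fix the seed so the same 100 draws come back on every knit (42 is arbitrary)
set.seed(42)
normal_draws <- rnorm(100, mean = 0, sd = 1)

# An explicit binwidth replaces the range/30 default that triggers the message
normal_hist <- ggplot(data.frame(value = normal_draws), aes(x = value)) +
  geom_histogram(binwidth = 0.25) +
  ggtitle('Normally Distributed Data')

normal_hist
```

A narrower bin width shows more detail but gets noisy with only 100 points, so the value is worth adjusting by eye.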
- Poisson Distribution - n = 1000, mean = 50
Poisdist <- rpois(1000, 50)
qplot(Poisdist, geom = 'histogram', main = 'Poisson Distributed Data')
## stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.
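A quick sanity check on simulated Poisson data is that the sample mean and the sample variance should both land near the rate parameter, since a Poisson distribution has mean and variance equal to lambda. A small sketch in base R; the seed and the names pois_draws and pois_check are assumptions made for illustration:

```r
# Arbitrary seed, used only so the check is reproducible
set.seed(42)
pois_draws <- rpois(1000, lambda = 50)

# For a Poisson distribution the mean and the variance both equal lambda,
# so both values should be close to 50 (not exactly 50, this is a sample)
pois_check <- c(mean = mean(pois_draws), variance = var(pois_draws))
pois_check
```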
- Binomial Distribution - 10 replicates of 100 coin tosses - ntotal = 1000, chance of success (heads) = 0.5
cointoss <- rbinom(10, 100, 0.5)
trials <- c(1,2,3,4,5,6,7,8,9,10)
trials.success <- data.frame(trials, cointoss)
cointossplot <- qplot(trials, cointoss, geom = 'bar', stat='identity', xlab = 'Trial Number', ylab = 'Number of Heads', main = 'Number of Heads in 1000 Trials')
cointossplot + scale_x_discrete(levels(trials))
#The previous step adds space for an 11th trial, so I state the x-axis labels explicitly
cointossplot + scale_x_discrete(limits=c(1,2,3,4,5,6,7,8,9,10))
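Another way to avoid the extra axis position is to store the trial number as a factor, so the x-axis is discrete from the start and gets exactly one position per trial. This is a sketch, assuming a ggplot2 version recent enough to include geom_col() (the shorthand for bars with stat = 'identity'); the names toss_data and toss_plot and the plot title are illustrative:

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only for reproducibility
toss_data <- data.frame(
  trial = factor(1:10),          # factor: one axis position per trial
  heads = rbinom(10, 100, 0.5)   # 10 trials of 100 tosses each
)

# geom_col() draws bar heights straight from the data
toss_plot <- ggplot(toss_data, aes(x = trial, y = heads)) +
  geom_col() +
  labs(x = 'Trial Number', y = 'Number of Heads',
       title = 'Number of Heads per 100-Toss Trial')

toss_plot
```

With the trial stored as a factor there is no need to set the axis limits by hand.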
- Function to generate summary statistics about the Normal dataset.
#A function that returns a table of the mean, variance, and standard deviation of the input dataset
sum.stat1 <- function(x){
  mean <- mean(x)
  variance <- var(x)
  standarddev <- sqrt(var(x))
  result <- data.frame(mean, variance, standarddev)
  return(result)
}
#Running the function with the normally distributed simulated data
fxn1 <- sum.stat1(Normaldist)
fxn1
##        mean  variance standarddev
## 1 0.1409396 0.9489135   0.9741219
#Compare the results of my function to the summary() function
fxn2 <- summary(Normaldist)
fxn2
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
## -2.3670 -0.5213  0.1364  0.1409  0.8603  2.7350
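The comparison can also be done in code rather than by eye. The sketch below is a hypothetical variant of the function (named summary_stats here, not the original sum.stat1) that uses the built-in sd(), then checks it against sqrt(var(x)) and prints the mean next to the one reported by summary(). The seed and the variable names are assumptions:

```r
# Same three statistics as above, using the built-in sd()
summary_stats <- function(x) {
  data.frame(mean = mean(x), variance = var(x), standarddev = sd(x))
}

set.seed(42)  # arbitrary seed, only for reproducibility
x <- rnorm(100, 0, 1)
stats_x <- summary_stats(x)
stats_x

# sd(x) is the square root of var(x), so these two should agree
sd_matches <- isTRUE(all.equal(stats_x$standarddev, sqrt(stats_x$variance)))
sd_matches

# Print the function's mean beside the 'Mean' entry of summary();
# summary() may display fewer digits, so compare them loosely
c(function_mean = stats_x$mean, summary_mean = summary(x)[['Mean']])
```

Naming the data.frame columns explicitly also avoids reusing mean as a variable name inside the function, which works but is easy to misread.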