Create "Week_Start" variable in R

I have a dataframe that looks like the one below. bus_date <- as.Date(c('2017-04-03', '2017-04-04', '2017-04-06', '2017-04-11', '2017-04-13', '2017-04-17')) sales <- c(100, 110, 120, 200, 300, 100) daily_sales <- data.frame(bus_date, sales...

2017-04-18 21:04 (2) Answers

categorize based on date ranges in R

How do I categorize each row in a large R dataframe (>2 million rows) based on date range definitions in a separate, much smaller R dataframe (12 rows)? My large dataframe, captures, looks similar to this when called via head(captures) : id...

2017-04-18 21:04 (3) Answers

dplyr for rowwise quantiles

I have a df of strata, each of which has 1000 samples from a posterior distribution of the estimates from that stratum. mydf <- as.data.frame(lapply(seq(1, 1000), rnorm, n=100)) colnames(mydf) <- paste('s', seq(1, ncol(mydf)), sep='') I wan...

2017-04-18 21:04 (3) Answers

R -- Grouping observations based on a code list

I have a dataset where each observation has an integer "code" variable, which I would like to convert to a character "class" variable. Here is a simple example to illustrate what I am trying to do. code.list <- data.frame(code = 1:10, ...

2017-04-18 19:04 (1) Answers

How to sort list in R?

I have a list of lists similar to this: a <- list( list(day = 5, text = "foo"), list(text = "bar", day = 1), list(text = "baz", day = 3), list(day = 2, text = "quux") ) with an unknown number of fields, and the fields may be out of order. how...

2017-04-18 18:04 (2) Answers
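
The excerpt above is cut off before it states the sort key, so the following is only a minimal base R sketch under the assumption that the list should be ordered by each element's `day` field. The names `days` and `sorted` are hypothetical and not from the question.

```r
# Example data copied from the question excerpt.
a <- list(
  list(day = 5, text = "foo"),
  list(text = "bar", day = 1),
  list(text = "baz", day = 3),
  list(day = 2, text = "quux")
)

# Assumption: sort by the `day` field. Fields are looked up by name,
# so their position inside each element does not matter.
days <- vapply(a, function(item) item$day, numeric(1))

# Reorder the outer list by ascending day.
sorted <- a[order(days)]

# Collect the text fields in the new order for inspection.
sorted_text <- vapply(sorted, function(item) item$text, character(1))
print(sorted_text)
```

Because `vapply` extracts the key by name, the approach does not depend on how many other fields each element carries.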

Create a colour blind test with ggplot

I would like to create a colour blind test, similar to that below, using ggplot. The basic idea is to use geom_hex (or perhaps a voronoi diagram, or possibly even circles as in the figure above) as the starting point, and define a dataframe that,...

2017-04-18 17:04 (0) Answers

Aggregate character/string variables

I've got a data frame which has a list of Person References and a Trade for each; each person reference can have multiple trades. I want to aggregate so that it shows the data as below: Record Person Ref Trade Code 1 512 elec, plumbing...

2017-04-18 16:04 (1) Answers

dplyr summarise with condition

I would like to apply the dplyr function summarise_all to calculate the mean in each column. However, I want to omit the 0 values, so I need to build in a conditional statement. df <- data.frame(x = c(1,0,4,6,0,9), y = c(12,42,8,0,11,2)) df %&...

2017-04-18 11:04 (1) Answers

Error while loading RTextTools library in Mac

I get an error while loading the RTextTools library on a Mac; however, it works on a Windows PC. Here is the error: library(RTextTools) # For removeSparseTerms() Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : there is no packag...

2017-04-18 05:04 (0) Answers

How to plot a function family in ggplot2

I need to plot a family of functions varying according to a set of parameters, say, a family of normal distribution curves that depend on the mean and standard deviation. I found here a code snippet that almost does the task: p9 <- ggplot(data.fr...

2017-04-18 04:04 (2) Answers

Milliseconds separated by comma

I have the following data.frame called "data" (it is much larger, but I just give the first lines as an example): Timestamp Weight Degrees 1 30-09-2016 11:45:00,000 38.19 40.00 2 01-10-2016 06:19:57,860 39.12 40.00 3 01-10-2016 06:20:46,393 42.11...

2017-04-17 23:04 (1) Answers

R searching for information within column

I have two tables, formatted as shown below. One of them is table A: students|Test Score A | 100 B | 81 C | 92 D | 88 The other, table B, looks like this: Class | Students 1 | {A,D}...

2017-04-17 23:04 (5) Answers

Custom sorting and calling for columns in R

I have 100 columns with currency data that are not sorted in a specific order. However, there is one column of spot rates and one of forward rates for each currency. They can all be identified by their names, which are in the format USDXXX (XXX=ticker for currenc...

2017-04-17 22:04 (0) Answers

Negative lookahead in R not behaving as expected

I am trying to replace instances in a string which begin with abc in a text I'm working with in R. The output text is highlighted in HTML over a couple of passes, so I need the replacement to ignore text inside HTML carets. The following seems to w...

2017-04-17 21:04 (1) Answers

Divide many columns by another column

Ok with A <- c(1:10) B <- c(2:11) C <- c(3:12) df1 <- data.frame(A,B,C) I do not understand this error: df2 <- df1 / df1[,"C"] df2 <- df1[1:3,] / df1[1:3,"C"] a <- subset (df1, select = c(A, B)) b <- subset (df1, select ...

2017-04-17 20:04 (2) Answers

R: Shuffle dataframe columnwise

This link answers a part of my question: How to randomize (or permute) a dataframe rowwise and columnwise?. > df1 a b c 1 1 1 0 2 1 0 0 3 0 1 0 4 0 0 0 Column-wise shuffle gives me below output df3, which is reordering the columns > df3 &...

2017-04-17 19:04 (1) Answers

Accessing a R user defined function in Python

So I need to do Principal Component Regression with cross validation and I could not find a package in Python that would do so. I wrote my own PCR class but when tested against R's pls package it performs significantly worse and is much slower on hig...

2017-04-17 17:04 (1) Answers

number of rows of .SD

If I have the following simple datatable: DT <- data.table(VAL = sample(c(1, 2, 3), 10, replace = TRUE),Group = c(rep("A",5),rep("B",5))) I can calc the mean via: DT[,lapply(.SD,function(x){mean(x)}),by=Group] I could also use: DT[,lapply(....

2017-04-17 16:04 (1) Answers

Count the number of duplicated values

I am trying to write a function that will take one parameter, which is a list of numbers, and count the duplicates. Example: number <- c(1, 1, 1, 2, 2, 4, 5) In this case, the output will be 2, because only 1 and 2 have duplicates. I know how to u...

2017-04-17 03:04 (2) Answers
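
One way to read the question above is "how many distinct values occur more than once". A minimal base R sketch under that reading follows; `count_duplicated` is a hypothetical helper name, not something given in the question.

```r
# Example vector copied from the question excerpt.
number <- c(1, 1, 1, 2, 2, 4, 5)

# Hypothetical helper: count the distinct values that appear at least twice.
count_duplicated <- function(x) {
  counts <- table(x)   # frequency of each distinct value
  sum(counts > 1)      # number of values with frequency above one
}

print(count_duplicated(number))
```

For the example vector this gives 2, matching the output described in the question, since only 1 and 2 are repeated.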

read.fwf reverses sign on all numbers

For some inexplicable reason, read.fwf is reversing the signs of all the numbers in the table I'm reading in. My code is like this: person.widths <- c(6, 8, 3, 6, rep(7, 8), 6, 12) meas <- - read.fwf(file=myfile, header=FALSE, ...

2017-04-17 02:04 (1) Answers