## Why does Decision Tree code written in Python predict differently from the code written in R?

I am working with the load_iris data set from sklearn in Python and R (it's just called iris in R). I built the model in both languages using the "gini" index, and in both languages I am able to test the model properly when the test data is taken directly from...

## Read a large (1.5 GB) file in h2o R

I am using the h2o package for modelling in R. For this I want to read a dataset of about 1.5 GB using h2o.importFile(). I start the h2o server using the lines library(h2oEnsemble) h2o.init(max_mem_size = '1499m',nthreads=-1) This pro...

## How to create new function from output of previous function in R?

I am ignorant when it comes to R programming and programming in general but I have two pieces of code that have come across a similar problem (for me). Here we go... (A) I currently have a function that returns record(s) of a patient, trial number,...

## Defining exponential distribution in R to estimate probabilities

I have a bunch of random variables (X1,....,Xn) which are i.i.d. Exp(1/2) and represent the duration of time of a certain event. So this distribution obviously has an expected value of 2, but I am having problems defining it in R. I did some research...

## Find the first set of consecutive integers in a vector

I have a dataset where I am trying to find the first instance in a consecutive set of rows that are identical. So let's say given this dataset: df <- data.frame(trial = c(1:16), DV = c(2, 3, 2, 3, 3, 4, 4, 4, 4, 4, 2, 2, 2, 2, 2, 1)) If I were...

## Write a wrapper function to successfully take additional arguments (like subset) via ellipsis (...)

I am writing a function that calls another function (e.g. lm), and I would like to pass other arguments to it using ellipsis (...). However, the data to be used is not in the global environment, but inside a list. A minimal example: L <- list(...

## Prevent showing the year several times unnecessarily with time series

I would like to display months (in abbreviated form) along the horizontal axis, with the corresponding year printed once. I know how to display month-year: The un-needed repetition of the year clutters the labels. Instead I would like something l...

## Convert JSON array keys to CSV column names and values

I am parsing JSON data to write a CSV file. I am using the tidyjson package to do this work. At some point I need to print all the subject values below in separate columns, with the score as the value. Meaning Physics, Mathematics will be a column name and ...

## How to build libRmath.dylib on Mac

I am a Mac user. I am trying to install libRmath.dylib, the standalone Rmath library from R. I found that R does not compile libRmath by default. So I tried to compile the Rmath library from source. But I can only find a .tar file for Windows/Unix on C...

## Statistics of multiple similarly named columns

I have a huge dataset with multiple columns like x1, x2, x3......x25, y1, y2, y3......y50, z1, z2.......z10 etc. which looks something like this: x1 x2 x3 x4 y1 y2 y3 1 2 1 2 1 1 2 2 1 1 1 3 1 1 1 2 2 1 1...

## R POSIXct returns NA with "03/12/2017 02:17:13"

I have a data set containing the following date, along with several others 03/12/2017 02:17:13 I want to put the whole data set into a data table, so I used read_csv and as.data.table to create DT which contained the date/time information in date....

## Write a for loop in a function to get a matrix

I need to put delta.vec and sigma.vec values through my required.replicates function and store them in my practice1 matrix. But I get NULL. sigma.vec <- c(2,4,6,8,10,12) delta.vec <- c(1,2,5,8,10) practice1 <- matrix(0, nrow=length(delta...

## Sort list of strings by order of numeric parts

I have a list of paths of files, which I want to sort in ascending order based on the first path of each list. The list of paths is shown \$`HG-U133_Plus_2` [1] "C:\\Users\\agaz\\AppData\\Local\\Temp\\Rtmp0wZI21/008947515435900b4d1a0b8d/22" [2] "C:\\U...

## Extract data from multiple webpages from a website which reloads automatically in R

I have seen other posts which show how to extract data from multiple webpages. But the problem is that on my website, when I scroll to see how many pages the data is divided into, the page automatically refr...

## How to Duplicate Rows Based on Character in a String of Multiple Columns

I have a data frame like the below which contains commas in columns x & y: df <- data.frame(var1=letters[1:5], var2=letters[6:10], var3=1:5, x=c('apple','orange,apple', 'grape','apple,orange,grape','cherry,peach'), y=c('wine', 'wine', 'juice'...

## ggplot2: make the points on the line a darker color than the line color

I would like to make each point on the graph a different color from the line. Here is sample data. df <- structure(list(yrmonth = structure(c(17167, 17167, 17167, 17198, 17198, 17198, 17226, 17226, 17226, 17257, 17257, 17257), class = "Date"), ...

## dplyr: sum inside consecutive mutate

library(dplyr) tib <- tibble(a = c(1,2,3)) The following work as expected: tib %>% mutate(b = a^2, c = sqrt(b)) # A tibble: 3 x 3 a b c <dbl> <dbl> <dbl> 1 1 1 1 2 2 4 2 3 3 9...

## Subsetting if contains multiple variables in a certain order

In my dataframe, I have two columns of interest: id and name - my goal is to only keep records of id where id has more than one value in name and where the final value in name is 'B' . The sample data would look like this: > test id name...

## How to get week ending next Sunday from date

I know lubridate has a function ceiling_date but it provides week ending on the next Saturday from a given date. How can I change it to get the week ending next Sunday instead? > ceiling_date(as.Date('2017-06-16'), 'week') [1] "2017-06-17 20:00:0...

## How to create a customized theme similar to theme_bw in ggplot2?

I have the following code in ggplot2: require("ggplot2") df <- data.frame(x=factor(rep(1:2,5)), y=rnorm(10)) ggplot(df , aes(x,y)) + geom_point(size = 3) + theme(axis.text.x = element_text(angle = 40, hjust = 1, colour = "black", size=12), ...

## removing the central white circle?

I have a circular plot and I would like to find a way to remove the little white circle in the middle. Here is my code: ggplot(d5)+geom_tile(aes(x=x, y=y, fill=CNP))+ scale_y_continuous(expand=c(0,0),breaks=NULL,limits=c(0,3.6))+ scale_fill_...

## Extract text from inner-most nested parentheses of string

From the text string below, I am trying to extract a specific string subset. string <- c("(Intercept)", "scale(AspectCos_30)", "scale(CanCov_500)", "scale(DST50_30)", "scale(Ele_30)", "scale(NDVI_Tin_250)", "scale(Slope_500)", ...

## Apply FUN row-wise on data frame with integer and character variables

A completely basic question - and forgive me if it is a duplicate. set.seed(1) df <- data.frame(id=c('a', 'a', 'b', 'b', 'a'), a=sample(1:10, size=5, replace=T), b=sample(1:10, size=5, replace=T), c=sampl...

## julia: outer product function

In R, the function outer structurally allows you to take the outer product of two vectors x and y while providing a number of options for the actual function applied to each combination. For example outer(x,y,'-') creates an "outer product" matrix of...

## Is the regex \\L supported in R 3.5.0?

I am experiencing difficulty with the perl expression \\L\\1 in very particular circumstances on R-dev (2017-06-06 and 2017-06-16 r72796 builds): bib <- readLines("https://raw.githubusercontent.com/HughParsonage/TeXCheckR/master/tests/testthat/li...

## Group to Group division

Data Set: date bal 1/31/2013 10 1/31/2013 11 1/31/2013 12 1/31/2013 13 1/31/2013 14 2/28/2013 20 2/28/2013 30 2/28/2013 40 2/28/2013 50 2/28/2013 60 3/30/2013 10 3/30/2013 11 3/30/2013 12 ...

## generate random sequences of NA of random lengths in a vector

I want to generate missing values in a vector so that the missing values are grouped in sequences, to simulate periods of missing data of different length. Let's say I have a vector of 10 000 values and I want to generate 12 sequences of NA at random...

## Using ggplot's sec.axis with a non-monotonic transformation

I would like to use ggplot's sec.axis option to produce a secondary X-axis (call it Z) showing the transformation Z = X + sqrt( X^2 - X ). This transformation is not monotonic in general, but is monotonic over the range of X that is possible in my ap...

## Spaghetti plot in R with repeated measurements

I would like to create a spaghetti plot in R to visualize differences between two conditions for every single participant. I computed a repeated measurements ANOVA with the factors CRITICALITY (critical, non-critical) and LATERALITY (ipsilateral, co...

## Concatenating two vectors in R

I want to concatenate two vectors one after the other in R. I have written the following code to do it: > a = head(tracks_listened_train) > b = head(tracks_listened_test) > a [1] cc1a46ee0446538ecf6b65db01c30cd8 19acf9a5cbed34743ce0ee42ef...

## for loop/function to create new variable by conditional checking of multiple character strings

I am not a new R user but have never had to write loops, and I would like to learn, as in this case I think it will save time and make more sense. I have a large data set that has data on visit frequency to different forest types, a simplified subset...

## How to create a running total, according to factor, in R?

I want to create a running scoreline margin for sport data. For example, consider my data as follows: df <- data.frame(Club = c("O", "H", "H", "O", "H", "O", "O"), TimeOfScore = c("1:30", "2:06", "7:09", "9:09", "11:08", "1...

## Return two objects from lapply

I have created a function which takes a little while to run (lots of crunching going on) and there are two distinct outputs that I need to return from this function. The inputs into these outputs are the same which is why I have combined them in the ...

## Select or highlight data on map by click on legend

Is there any way to select or highlight data on a leaflet map by clicking on the legend in R Shiny? Example code: library(shiny) library(leaflet) library(RColorBrewer) library(leafletGeocoderRshiny) ui <- fluidPage( leafletOutput("map"), p...