gsub: replace word if not wrapped in brackets

Question

I would like to use gsub to replace a word, but only where it is not wrapped in brackets.

x <- c("hello","[hello]")

I would like gsub(regex,"test",x) to return c("test","[hello]"), but I am having trouble creating the correct regex statement.

A naive implementation is gsub("^(?!\\[).*$","test",x, perl=TRUE). It works in the above case, but only because each string is a single word. It fails for x <- "hello [hello]", for example, which I want to become test [hello].
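
A quick check of that pattern on all three inputs (a minimal sketch using only base R's gsub) shows the problem: the anchored .* consumes the whole string, so in the mixed case the bracketed word is lost as well.

```r
# The anchored pattern matches the entire string whenever it does not start with "[".
naive <- "^(?!\\[).*$"
res_naive <- gsub(naive, "test", c("hello", "[hello]", "hello [hello]"), perl = TRUE)
print(res_naive)
# [1] "test"    "[hello]" "test"
```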

I've tried a bunch of different lookaheads to no avail. Any help would be appreciated.


Input

x <- c("hello", "[hello]", "hello [hello]")

Desired

# [1] "test"         "[hello]"      "test [hello]"

Tags: R, regex | 2017-01-06 17:01

Answers ( 2 )

  1. 2017-01-06 17:01

    You can use negative lookarounds to constrain what sits next to the word boundaries. For instance, (?<!\\[)\\b\\w+\\b(?!\\]) replaces a word only if it is neither immediately preceded by [ nor immediately followed by ]:

    gsub("(?<!\\[)\\b\\w+\\b(?!\\])", "test", x, perl = TRUE)
    # [1] "test"         "[hello]"      "test [hello]"
    

    \\b\\w+\\b matches a whole word. The negative look-behind (?<!\\[) and the negative look-ahead (?!\\]) then require that the character before the word is not [ and the character after it is not ].
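
As a minimal sketch of the same lookaround idea aimed at one specific word rather than any \\w+, the helper below builds the pattern with paste0. The name replace_unwrapped is hypothetical, and it assumes word contains only word characters (no regex metacharacters). The second call shows a limitation worth knowing: the lookarounds inspect only the adjacent characters, so a word inside a longer bracketed phrase is still replaced.

```r
# Hypothetical helper: replace `word` unless it is directly wrapped as [word].
# Assumes `word` has no regex metacharacters.
replace_unwrapped <- function(x, word, replacement) {
  # Look-behind rejects a preceding "[", look-ahead rejects a following "]".
  pattern <- paste0("(?<!\\[)\\b", word, "\\b(?!\\])")
  gsub(pattern, replacement, x, perl = TRUE)
}

x <- c("hello", "[hello]", "hello [hello]")
res <- replace_unwrapped(x, "hello", "test")
print(res)
# [1] "test"         "[hello]"      "test [hello]"

# Limitation: only the characters touching the word are checked.
res_inner <- replace_unwrapped("[say hello now]", "hello", "test")
print(res_inner)
# [1] "[say test now]"
```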

  2. 2017-01-06 17:01

    We can do this easily with grep, which selects the elements that contain no [:

    x[grep("^[^[]+$", x)] <- "test"
    x
    #[1] "test"    "[hello]"
    

    Or with sub, which replaces the leading run of non-[ characters:

    sub("^[^[]+", "test", x)
    #[1] "test"    "[hello]"
    

    For the second case, where a bracketed word follows in the same string:

    sub("^\\b[^[]+\\b", "test", x1)
    #[1] "test [hello]"
    

    data

    x <- c("hello","[hello]")
    x1 <- "hello [hello]"
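
The sub calls above only touch text at the start of the string. If the word can also appear after a bracketed group, one alternative (not taken from either answer, offered as a sketch) is the PCRE (*SKIP)(*FAIL) idiom: match and discard every [...] group first, then replace the word wherever it remains. This assumes brackets are balanced and not nested; the fourth test string is made up for illustration.

```r
# First alternative consumes a whole [...] group and then forces a failure,
# so the regex engine skips past it; second alternative is the word to replace.
pattern <- "\\[[^\\]]*\\](*SKIP)(*FAIL)|\\bhello\\b"

x_all <- c("hello", "[hello]", "hello [hello]", "[say hello now] hello")
res_skip <- gsub(pattern, "test", x_all, perl = TRUE)
print(res_skip)
# res_skip is c("test", "[hello]", "test [hello]", "[say hello now] test")
```

Unlike the adjacent-character lookarounds, this leaves everything between [ and ] untouched, at the cost of requiring perl = TRUE.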
    