lead or lag function to get several values, not just the nth

I have a tibble with one word per row. I want to create a new variable from a function that searches for a keyword and, wherever it finds the keyword, creates a string composed of the keyword plus-and-minus 3 words.

The code below is close, but rather than grabbing all three words before and after my keyword, it grabs only the single word 3 positions behind and the single word 3 positions ahead.

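    library(dplyr)  # provides tibble(), mutate(), lag(), lead() and %>%

    df <- tibble(words = c("it", "was", "the", "best", "of", "times",
                           "it", "was", "the", "worst", "of", "times"))

    # lag(words, 3) and lead(words, 3) each return only the single word
    # 3 positions away, not the 3 words in between
    df <- df %>%
      mutate(chunks = ifelse(words == "times",
                             paste(lag(words, 3), words, lead(words, 3), sep = " "),
                             NA))
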
The most intuitive solution would be for lead (or lag) to accept a vector of offsets, as in lead(words, 1:3), but that doesn't work.

Obviously I could do this by hand fairly quickly (paste(lag(words, 3), lag(words, 2), lag(words, 1), ..., lead(words, 3))), but eventually I want to grab the keyword plus-and-minus 50 words, which is too much to hand-code.

A tidyverse solution would be ideal, but any solution would be helpful.

r dplyr lag lead

asked 1 hour ago by wscampbell (edited 1 hour ago)

4 Answers

One option would be sapply:

    library(dplyr)

    df %>%
      mutate(
        chunks = ifelse(
          words == "times",
          # for each row, paste rows (x - 3) to (x + 3), clamped to 1..nrow
          sapply(seq_len(nrow(.)),
                 function(x) paste(words[pmax(1, x - 3):pmin(x + 3, nrow(.))], collapse = " ")),
          NA
        )
      )

Output:

    # A tibble: 12 x 2
       words chunks
       <chr> <chr>
     1 it    NA
     2 was   NA
     3 the   NA
     4 best  NA
     5 of    NA
     6 times the best of times it was the
     7 it    NA
     8 was   NA
     9 the   NA
    10 worst NA
    11 of    NA
    12 times the worst of times

Although not an explicit lead or lag function, it can often serve the purpose as well.
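
Since the question mentions eventually needing plus-and-minus 50 words, the same clamped-index idea can be wrapped in a function with the window size as a parameter. This is only a minimal base R sketch; `context_window` and its argument names are hypothetical, not part of dplyr or of the answer above.

```r
# Hypothetical helper: for every element equal to `keyword`, paste it together
# with up to `n` words on either side; all other positions stay NA.
context_window <- function(words, keyword, n = 3L) {
  len  <- length(words)
  out  <- rep(NA_character_, len)
  hits <- which(words == keyword)
  out[hits] <- vapply(hits, function(i) {
    # clamp the window to the valid index range 1..len
    paste(words[max(1L, i - n):min(len, i + n)], collapse = " ")
  }, character(1))
  out
}

words <- c("it", "was", "the", "best", "of", "times",
           "it", "was", "the", "worst", "of", "times")
chunks <- context_window(words, "times", n = 3L)
chunks[6]   # "the best of times it was the"
chunks[12]  # "the worst of times"
```

Inside a pipeline this could presumably be called as mutate(chunks = context_window(words, "times", 50)). Only the keyword rows are visited, so the cost grows with the number of matches rather than the number of rows.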

answered 1 hour ago by arg0naut

• Works perfectly, arg0naut. Thanks a bunch! Really, really helpful. – wscampbell, 48 mins ago
• You're welcome! Consider accepting the answer if it helped. – arg0naut, 41 mins ago

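Similar to @arg0naut's answer, but without dplyr:

    r  = 1:nrow(df)                       # valid row numbers
    w  = which(df$words == "times")       # rows matching the keyword
    # window of row numbers around each match, clipped to the valid range
    wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

    df$chunks <- NA_character_
    # paste each window together, grouped by the keyword row it belongs to
    df$chunks[w] <- tapply(df$words[unlist(wm)], rep(w, lengths(wm)), FUN = paste, collapse=" ")
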
    # A tibble: 12 x 2
       words chunks
       <chr> <chr>
     1 it    <NA>
     2 was   <NA>
     3 the   <NA>
     4 best  <NA>
     5 of    <NA>
     6 times the best of times it was the
     7 it    <NA>
     8 was   <NA>
     9 the   <NA>
    10 worst <NA>
    11 of    <NA>
    12 times the worst of times

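The data.table translation:

    library(data.table)
    DT = data.table(df)

    r  = seq_len(nrow(DT))                          # valid row numbers
    w  = DT["times", on = "words", which = TRUE]    # rows matching the keyword
    wm = lapply(w, function(wi) intersect(r, seq(wi - 3L, wi + 3L)))

    # paste each window together, grouped by the keyword row it belongs to
    DT[w, chunks := DT[unlist(wm), paste(words, collapse = " "), by = rep(w, lengths(wm))]$V1]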

answered 1 hour ago by Frank

data.table::shift accepts a vector for the n (lag) argument and outputs a list, so you can pass that list to do.call(paste, ...) to paste the list elements together. However, unless you're on data.table version >= 1.12, I don't think it will let you mix negative and positive n values (as below).

With data.table:

    library(data.table)
    setDT(df)

    # shift() returns one vector per offset in 3:-3 (3 lags, the word itself, 3 leads);
    # fill = '' pads the edges, and trimws() removes the resulting extra spaces
    df[, chunks := trimws(ifelse(words != "times", NA, do.call(paste, shift(words, 3:-3, fill = ''))))]

    #     words                       chunks
    #  1:    it                         <NA>
    #  2:   was                         <NA>
    #  3:   the                         <NA>
    #  4:  best                         <NA>
    #  5:    of                         <NA>
    #  6: times the best of times it was the
    #  7:    it                         <NA>
    #  8:   was                         <NA>
    #  9:   the                         <NA>
    # 10: worst                         <NA>
    # 11:    of                         <NA>
    # 12: times           the worst of times

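With dplyr, using data.table only for the shift function:

    library(dplyr)

    df %>%
      # paste the 3 lags, the word itself and the 3 leads for every row
      mutate(chunks = do.call(paste, data.table::shift(words, 3:-3, fill = '')),
             # keep the pasted string only on keyword rows
             chunks = trimws(ifelse(words != "times", NA, chunks)))
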
    # # A tibble: 12 x 2
    #    words chunks
    #    <chr> <chr>
    #  1 it    NA
    #  2 was   NA
    #  3 the   NA
    #  4 best  NA
    #  5 of    NA
    #  6 times the best of times it was the
    #  7 it    NA
    #  8 was   NA
    #  9 the   NA
    # 10 worst NA
    # 11 of    NA
    # 12 times the worst of times

• That's right. Thanks! – IceCreamToucan, 22 mins ago

Here is another tidyverse solution using lag and lead:

    library(dplyr)
    library(tidyr)  # for unite()

    # build named call strings such as "lag (.,  3 , default = '')" for funs_()
    laglead_f <- function(what, range)
      setNames(paste(what, "(., ", range, ", default = '')"), paste(what, range))

    df %>%
      mutate_at(vars(words), funs_(c(laglead_f("lag", 3:0), laglead_f("lead", 1:3)))) %>%
      unite(chunks, -words, sep = " ") %>%
      mutate(chunks = ifelse(words == "times", trimws(chunks), NA))
    ## A tibble: 12 x 2
    #   words chunks
    #   <chr> <chr>
    # 1 it    NA
    # 2 was   NA
    # 3 the   NA
    # 4 best  NA
    # 5 of    NA
    # 6 times the best of times it was the
    # 7 it    NA
    # 8 was   NA
    # 9 the   NA
    #10 worst NA
    #11 of    NA
    #12 times the worst of times

The idea is to store the three lagged and three leading vectors in new columns with mutate_at and a set of named functions, unite those columns, and then keep the result only where words == "times" (every other row becomes NA).
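
The same "one shifted copy per offset, then paste across" idea can also be sketched in base R without building function calls from strings. This is an illustration under assumptions, not the answer's own method: `shift_by` is a hypothetical helper written for this example, and `n` is the window half-width.

```r
# Hypothetical helper: shift `x` by `k` positions, padding with `fill`.
# k > 0 behaves like a lag, k < 0 like a lead, k == 0 returns x unchanged.
shift_by <- function(x, k, fill = "") {
  len <- length(x)
  if (k == 0L) return(x)
  if (abs(k) >= len) return(rep(fill, len))
  if (k > 0L) {
    c(rep(fill, k), x[seq_len(len - k)])
  } else {
    c(x[(1L - k):len], rep(fill, -k))
  }
}

words <- c("it", "was", "the", "best", "of", "times",
           "it", "was", "the", "worst", "of", "times")

n <- 3L
# one shifted copy per offset: n lags, the word itself, then n leads
shifted <- lapply(n:-n, function(k) shift_by(words, k))
# paste the copies element-wise and drop the padding left at the edges
pasted <- trimws(do.call(paste, shifted))
chunks <- ifelse(words == "times", pasted, NA_character_)
chunks[6]   # "the best of times it was the"
chunks[12]  # "the worst of times"
```

Changing `n` to 50 should widen the window without any further code changes, at the cost of holding 2 * n + 1 shifted copies of the column in memory.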
              4 Answers
              4






              active

              oldest

              votes








              4 Answers
              4






              active

              oldest

              votes









              active

              oldest

              votes






              active

              oldest

              votes









              4














              One option would be sapply:



              library(dplyr)

              df %>%
              mutate(
              chunks = ifelse(words == "times",
              sapply(1:nrow(.),
              function(x) paste(words[pmax(1, x - 3):pmin(x + 3, nrow(.))], collapse = " ")),
              NA)
              )


              Output:



              # A tibble: 12 x 2
              words chunks
              <chr> <chr>
              1 it NA
              2 was NA
              3 the NA
              4 best NA
              5 of NA
              6 times the best of times it was the
              7 it NA
              8 was NA
              9 the NA
              10 worst NA
              11 of NA
              12 times the worst of times


              Although not an explicit lead or lag function, it can often serve the purpose as well.






              share|improve this answer



















              • 1





                Works pefectly, arg0naut. Thanks a bunch! Really, really helpful

                – wscampbell
                48 mins ago











              • You're welcome! Consider accepting the answer if it helped.

                – arg0naut
                41 mins ago
















              4














              One option would be sapply:



              library(dplyr)

              df %>%
              mutate(
              chunks = ifelse(words == "times",
              sapply(1:nrow(.),
              function(x) paste(words[pmax(1, x - 3):pmin(x + 3, nrow(.))], collapse = " ")),
              NA)
              )


              Output:



              # A tibble: 12 x 2
              words chunks
              <chr> <chr>
              1 it NA
              2 was NA
              3 the NA
              4 best NA
              5 of NA
              6 times the best of times it was the
              7 it NA
              8 was NA
              9 the NA
              10 worst NA
              11 of NA
              12 times the worst of times


              Although not an explicit lead or lag function, it can often serve the purpose as well.






              share|improve this answer



















              • 1





                Works pefectly, arg0naut. Thanks a bunch! Really, really helpful

                – wscampbell
                48 mins ago











              • You're welcome! Consider accepting the answer if it helped.

                – arg0naut
                41 mins ago














              4












              4








              4







              One option would be sapply:



              library(dplyr)

              df %>%
              mutate(
              chunks = ifelse(words == "times",
              sapply(1:nrow(.),
              function(x) paste(words[pmax(1, x - 3):pmin(x + 3, nrow(.))], collapse = " ")),
              NA)
              )


              Output:



              # A tibble: 12 x 2
              words chunks
              <chr> <chr>
              1 it NA
              2 was NA
              3 the NA
              4 best NA
              5 of NA
              6 times the best of times it was the
              7 it NA
              8 was NA
              9 the NA
              10 worst NA
              11 of NA
              12 times the worst of times


              Although not an explicit lead or lag function, it can often serve the purpose as well.






              share|improve this answer













              One option would be sapply:



              library(dplyr)

              df %>%
              mutate(
              chunks = ifelse(words == "times",
              sapply(1:nrow(.),
              function(x) paste(words[pmax(1, x - 3):pmin(x + 3, nrow(.))], collapse = " ")),
              NA)
              )


              Output:



              # A tibble: 12 x 2
              words chunks
              <chr> <chr>
              1 it NA
              2 was NA
              3 the NA
              4 best NA
              5 of NA
              6 times the best of times it was the
              7 it NA
              8 was NA
              9 the NA
              10 worst NA
              11 of NA
              12 times the worst of times


              Although not an explicit lead or lag function, it can often serve the purpose as well.







              share|improve this answer












              share|improve this answer



              share|improve this answer










              answered 1 hour ago









              arg0nautarg0naut

              5,2741319




              5,2741319








              • 1





                Works pefectly, arg0naut. Thanks a bunch! Really, really helpful

                – wscampbell
                48 mins ago











              • You're welcome! Consider accepting the answer if it helped.

                – arg0naut
                41 mins ago














              • 1





                Works pefectly, arg0naut. Thanks a bunch! Really, really helpful

                – wscampbell
                48 mins ago











              • You're welcome! Consider accepting the answer if it helped.

                – arg0naut
                41 mins ago








              1




              1





              Works pefectly, arg0naut. Thanks a bunch! Really, really helpful

              – wscampbell
              48 mins ago





              Works pefectly, arg0naut. Thanks a bunch! Really, really helpful

              – wscampbell
              48 mins ago













              You're welcome! Consider accepting the answer if it helped.

              – arg0naut
              41 mins ago





              You're welcome! Consider accepting the answer if it helped.

              – arg0naut
              41 mins ago













              3














              Similar to @arg0naut but without dplyr:



              r  = 1:nrow(df)
              w = which(df$words == "times")
              wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

              df$chunks <- NA_character_
              df$chunks[w] <- tapply(df$words[unlist(wm)], rep(w, lengths(wm)), FUN = paste, collapse=" ")

              # A tibble: 12 x 2
              words chunks
              <chr> <chr>
              1 it <NA>
              2 was <NA>
              3 the <NA>
              4 best <NA>
              5 of <NA>
              6 times the best of times it was the
              7 it <NA>
              8 was <NA>
              9 the <NA>
              10 worst <NA>
              11 of <NA>
              12 times the worst of times


              The data.table translation:



              library(data.table)
              DT = data.table(df)

              w = DT["times", on="words", which=TRUE]
              wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

              DT[w, chunks := DT[unlist(wm), paste(words, collapse=" "), by=rep(w, lengths(wm))]$V1]





              share|improve this answer




























                3














                Similar to @arg0naut but without dplyr:



                r  = 1:nrow(df)
                w = which(df$words == "times")
                wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

                df$chunks <- NA_character_
                df$chunks[w] <- tapply(df$words[unlist(wm)], rep(w, lengths(wm)), FUN = paste, collapse=" ")

                # A tibble: 12 x 2
                words chunks
                <chr> <chr>
                1 it <NA>
                2 was <NA>
                3 the <NA>
                4 best <NA>
                5 of <NA>
                6 times the best of times it was the
                7 it <NA>
                8 was <NA>
                9 the <NA>
                10 worst <NA>
                11 of <NA>
                12 times the worst of times


                The data.table translation:



                library(data.table)
                DT = data.table(df)

                w = DT["times", on="words", which=TRUE]
                wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

                DT[w, chunks := DT[unlist(wm), paste(words, collapse=" "), by=rep(w, lengths(wm))]$V1]





                share|improve this answer


























                  3












                  3








                  3







                  Similar to @arg0naut but without dplyr:



                  r  = 1:nrow(df)
                  w = which(df$words == "times")
                  wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

                  df$chunks <- NA_character_
                  df$chunks[w] <- tapply(df$words[unlist(wm)], rep(w, lengths(wm)), FUN = paste, collapse=" ")

                  # A tibble: 12 x 2
                  words chunks
                  <chr> <chr>
                  1 it <NA>
                  2 was <NA>
                  3 the <NA>
                  4 best <NA>
                  5 of <NA>
                  6 times the best of times it was the
                  7 it <NA>
                  8 was <NA>
                  9 the <NA>
                  10 worst <NA>
                  11 of <NA>
                  12 times the worst of times


                  The data.table translation:



                  library(data.table)
                  DT = data.table(df)

                  w = DT["times", on="words", which=TRUE]
                  wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

                  DT[w, chunks := DT[unlist(wm), paste(words, collapse=" "), by=rep(w, lengths(wm))]$V1]





                  share|improve this answer













                  Similar to @arg0naut but without dplyr:



                  r  = 1:nrow(df)
                  w = which(df$words == "times")
                  wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

                  df$chunks <- NA_character_
                  df$chunks[w] <- tapply(df$words[unlist(wm)], rep(w, lengths(wm)), FUN = paste, collapse=" ")

                  # A tibble: 12 x 2
                  words chunks
                  <chr> <chr>
                  1 it <NA>
                  2 was <NA>
                  3 the <NA>
                  4 best <NA>
                  5 of <NA>
                  6 times the best of times it was the
                  7 it <NA>
                  8 was <NA>
                  9 the <NA>
                  10 worst <NA>
                  11 of <NA>
                  12 times the worst of times


                  The data.table translation:



                  library(data.table)
                  DT = data.table(df)

                  w = DT["times", on="words", which=TRUE]
                  wm = lapply(w, function(wi) intersect(r, seq(wi-3L, wi+3L)))

                  DT[w, chunks := DT[unlist(wm), paste(words, collapse=" "), by=rep(w, lengths(wm))]$V1]






                  share|improve this answer












                  share|improve this answer



                  share|improve this answer










                  answered 1 hour ago









                  FrankFrank

                  55k659133




                  55k659133























                      3














                      data.table::shift accepts a vector for the n (lag) argument and outputs a list, so you can use that and do.call(paste the list elements together. However, unless you're on data.table version >= 1.12, I don't think it will let you mix negative and positive n values (as below).



                      With data table:



                      library(data.table)
                      setDT(df)

                      df[, chunks := trimws(ifelse(words != "times", NA, do.call(paste, shift(words, 3:-3, ''))))]

                      # words chunks
                      # 1: it <NA>
                      # 2: was <NA>
                      # 3: the <NA>
                      # 4: best <NA>
                      # 5: of <NA>
                      # 6: times the best of times it was the
                      # 7: it <NA>
                      # 8: was <NA>
                      # 9: the <NA>
                      # 10: worst <NA>
                      # 11: of <NA>
                      # 12: times the worst of times


                      With dplyr and only using data.table for the shift function:



                      library(dplyr)

                      df %>%
                      mutate(chunks = do.call(paste, data.table::shift(words, 3:-3, fill = '')),
                      chunks = trimws(ifelse(words != "times", NA, chunks)))

                      # # A tibble: 12 x 2
                      # words chunks
                      # <chr> <chr>
                      # 1 it NA
                      # 2 was NA
                      # 3 the NA
                      # 4 best NA
                      # 5 of NA
                      # 6 times the best of times it was the
                      # 7 it NA
                      # 8 was NA
                      # 9 the NA
                      # 10 worst NA
                      # 11 of NA
                      # 12 times the worst of times





                      share|improve this answer


























                      • That's right. Thanks!

                        – IceCreamToucan
                        22 mins ago
















                      3














                      data.table::shift accepts a vector for the n (lag) argument and outputs a list, so you can use that and do.call(paste the list elements together. However, unless you're on data.table version >= 1.12, I don't think it will let you mix negative and positive n values (as below).



                      With data table:



                      library(data.table)
                      setDT(df)

                      df[, chunks := trimws(ifelse(words != "times", NA, do.call(paste, shift(words, 3:-3, ''))))]

                      # words chunks
                      # 1: it <NA>
                      # 2: was <NA>
                      # 3: the <NA>
                      # 4: best <NA>
                      # 5: of <NA>
                      # 6: times the best of times it was the
                      # 7: it <NA>
                      # 8: was <NA>
                      # 9: the <NA>
                      # 10: worst <NA>
                      # 11: of <NA>
                      # 12: times the worst of times


                      With dplyr and only using data.table for the shift function:



                      library(dplyr)

                      df %>%
                      mutate(chunks = do.call(paste, data.table::shift(words, 3:-3, fill = '')),
                      chunks = trimws(ifelse(words != "times", NA, chunks)))

                      # # A tibble: 12 x 2
                      # words chunks
                      # <chr> <chr>
                      # 1 it NA
                      # 2 was NA
                      # 3 the NA
                      # 4 best NA
                      # 5 of NA
                      # 6 times the best of times it was the
                      # 7 it NA
                      # 8 was NA
                      # 9 the NA
                      # 10 worst NA
                      # 11 of NA
                      # 12 times the worst of times





                      share|improve this answer


























                      • That's right. Thanks!

                        – IceCreamToucan
                        22 mins ago














                      3












                      3








                      3







                      data.table::shift accepts a vector for the n (lag) argument and outputs a list, so you can use that and do.call(paste the list elements together. However, unless you're on data.table version >= 1.12, I don't think it will let you mix negative and positive n values (as below).



                      With data table:



                      library(data.table)
                      setDT(df)

                      df[, chunks := trimws(ifelse(words != "times", NA, do.call(paste, shift(words, 3:-3, ''))))]

                      # words chunks
                      # 1: it <NA>
                      # 2: was <NA>
                      # 3: the <NA>
                      # 4: best <NA>
                      # 5: of <NA>
                      # 6: times the best of times it was the
                      # 7: it <NA>
                      # 8: was <NA>
                      # 9: the <NA>
                      # 10: worst <NA>
                      # 11: of <NA>
                      # 12: times the worst of times


                      With dplyr and only using data.table for the shift function:



                      library(dplyr)

df %>%
  mutate(chunks = do.call(paste, data.table::shift(words, 3:-3, fill = '')),
         chunks = trimws(ifelse(words != "times", NA, chunks)))

                      # # A tibble: 12 x 2
                      # words chunks
                      # <chr> <chr>
                      # 1 it NA
                      # 2 was NA
                      # 3 the NA
                      # 4 best NA
                      # 5 of NA
                      # 6 times the best of times it was the
                      # 7 it NA
                      # 8 was NA
                      # 9 the NA
                      # 10 worst NA
                      # 11 of NA
                      # 12 times the worst of times
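
For readers who cannot install data.table 1.12 or later, the same shift-then-paste pattern can be sketched in base R. This is only an illustration under stated assumptions: shift_by is a hypothetical helper written for this example (it is not part of data.table or any other package), and it only mimics the sign convention of the shift(words, 3:-3) call above, where positive offsets lag and negative offsets lead.

```r
# Hypothetical helper (not from any package): returns a list with one shifted
# copy of `x` per offset in `n`, padded with `fill`.
# Positive offsets lag, negative offsets lead, as in shift(words, 3:-3).
shift_by <- function(x, n, fill = "") {
  len <- length(x)
  lapply(n, function(k) {
    if (k == 0) {
      return(x)
    }
    pad <- rep(fill, min(abs(k), len))
    if (k > 0) {
      c(pad, head(x, max(len - k, 0)))   # lag: pad at the start
    } else {
      c(tail(x, max(len + k, 0)), pad)   # lead: pad at the end
    }
  })
}

words <- c("it", "was", "the", "best", "of", "times",
           "it", "was", "the", "worst", "of", "times")

k <- 3  # window half-width; the question's 50 would be passed the same way

# Paste the shifted copies together row by row, then trim the padding.
chunks <- trimws(do.call(paste, shift_by(words, k:-k)))

# Keep the window only on keyword rows.
chunks[words != "times"] <- NA
```

Because the window size is just the vector k:-k, widening it to plus-and-minus 50 words only means changing k.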













answered 43 mins ago by IceCreamToucan (edited 23 mins ago)













                      • That's right. Thanks!

                        – IceCreamToucan
                        22 mins ago

































Here is another tidyverse solution using lag and lead:



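library(dplyr)
library(tidyr)  # for unite()

# Build one named call string per offset, to be evaluated by funs_()
laglead_f <- function(what, range) {
  setNames(paste(what, "(., ", range, ", default = '')"), paste(what, range))
}

# Note: funs_() may be deprecated or unavailable in newer dplyr releases
df %>%
  mutate_at(vars(words), funs_(c(laglead_f("lag", 3:0), laglead_f("lead", 1:3)))) %>%
  unite(chunks, -words, sep = " ") %>%
  mutate(chunks = ifelse(words == "times", trimws(chunks), NA))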
                      ## A tibble: 12 x 2
                      # words chunks
                      # <chr> <chr>
                      # 1 it NA
                      # 2 was NA
                      # 3 the NA
                      # 4 best NA
                      # 5 of NA
                      # 6 times the best of times it was the
                      # 7 it NA
                      # 8 was NA
                      # 9 the NA
                      #10 worst NA
                      #11 of NA
                      #12 times the worst of times


The idea is to store the lagged and leading vectors (offsets 0 to 3 backwards, 1 to 3 forwards) in new columns with mutate_at and a named set of functions, unite those columns into a single string, and then keep that string only where words == "times", setting every other row to NA.
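
If funs_() is not available in your dplyr version, the same idea can be sketched with plain dplyr::lag and dplyr::lead calls collected in a list. This is a minimal sketch rather than a rewrite of the answer above: window_cols is a hypothetical helper introduced only for this example, and the data frame is rebuilt here so the block runs on its own.

```r
library(dplyr)

df <- tibble(words = c("it", "was", "the", "best", "of", "times",
                       "it", "was", "the", "worst", "of", "times"))

k <- 3  # window half-width

# Hypothetical helper: one shifted copy of `x` per offset from -k to k.
# Negative offsets lag, positive offsets lead, offset 0 is `x` itself.
window_cols <- function(x, k) {
  lapply(-k:k, function(o) {
    if (o < 0) {
      dplyr::lag(x, -o, default = "")
    } else if (o > 0) {
      dplyr::lead(x, o, default = "")
    } else {
      x
    }
  })
}

df <- df %>%
  mutate(chunks = trimws(do.call(paste, window_cols(words, k))),
         chunks = ifelse(words == "times", chunks, NA_character_))
```

This skips the intermediate columns and the unite() step, since do.call(paste, ...) joins the shifted vectors directly.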














answered 13 mins ago by Maurits Evers (edited 8 mins ago)





















