Implementing try (look-ahead) and untilStop with parser combinators


I am following this tutorial for implementing Parser Combinators (a la parsec) in Haskell. I implemented everything of the NanoParsec mentioned throught this post.

For some hours now, I am struggeling to implement

-- try p. If p fails continue without consuming anything
try :: Parser a -> Parser a
try p = ...

-- Parser for everything, until the character-sequence stop appears. 
-- If stop does not appear at all: fail
untilStop :: String -> Parser String
untilStop stop = ...

My best attempt to implement untilStop looks like somewhat like this and does not quite work

untilStop :: String -> Parser String
untilStop (c : cs) = do
  s <- some $ satisfy (/= d)
  string (c : cs) <|> do
    i <- item
    untilStop (d : ds)
  -- maybe use msum from MonadPlus to combine?

I could not figure out how to combine s, i and the recursive call without failing everthing because of string are not getting everything together.

I think once I have try, untilStop should be straightforward. Can someone point me in the right direction or implement it (try) for me?

Right now I am still learning about Monads, Applicative and related stuff so trying to understand the sourcecode of parsec was impossible for me.

Show source
| haskell   | monads   | parsing   2017-01-01 14:01 1 Answers

Answers ( 1 )

  1. 2017-01-01 16:01

    As I said in a comment, I think you don't need to have a Parsec-like try.

    For the untilStop, check this:

    untilStop :: String -> Parser String
    untilStop [] = everything
    untilStop (c:cs) = item >>= fun
      where fun i = do { s <- untilStop cs;
                         if i == c && s == "" then return "" else failure } <|> do
                         s <- untilStop (c : cs)
                         return (i : s)

    First, if the stop string is empty, you parse everything. Where everything is:

    everything :: Parser String
    everything = Parser (\inp -> [(inp,"")])

    Otherwise, if it is of the form c:cs, then parse a character i and consider two cases:

    • The stop string is right in the front of the parsing stream (because c == i and parsing with the rest of the string cs gives an empty result), then return "". Or,

    • It is somewhere in the stream, so you look for it further.

    Note that the <|> operator is used to backtrack. If untilStop cs fails to be what we want, we need to reparse, using untilStop (c:cs) instead.

◀ Go back