Answers ( 3 )

  1. 2016-12-12 08:12

    Haskell expressions are referentially transparent. This means that if readFile really had the type FilePath -> String, then the expression readFile "a.txt" would always yield the same result. Even if you read the file, changed it, and then read it again, you would get the contents as they were in its first state.

    Thus, we need to distinguish between values and actions, and this is what IO is for. It doesn't let you use the result of readFile "a.txt" in other expressions until you perform the action associated with it. As a consequence, after changing your file you have to perform the reading action again to get the file contents, and that is why you will be able to see the changes.
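
    The point can be seen in a small sketch. The helper names readStrict and readTwice, and the scratch file passed in, are made up for illustration. The contents are forced with evaluate because readFile is lazy, and the file should be fully read before it is written to again.

    ```haskell
    import Control.Exception (evaluate)

    -- Read a file and force its whole contents, so the lazily opened handle
    -- is closed before the file is written to again.
    readStrict :: FilePath -> IO String
    readStrict path = do
      contents <- readFile path
      _ <- evaluate (length contents)
      pure contents

    -- Perform the same reading action before and after changing the file.
    -- The path is a hypothetical scratch file chosen by the caller.
    readTwice :: FilePath -> IO (String, String)
    readTwice path = do
      writeFile path "first version"
      before <- readStrict path
      writeFile path "second version"
      after <- readStrict path
      pure (before, after)
    ```

    The expression readStrict path is the same action both times; only performing it again yields the new contents.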

  2. 2016-12-12 09:12

    We write functions that create computer programs

    It should be noted that Haskell is a functional programming language. Functions, in the mathematical sense, always produce the same values for the same inputs.

    Now this requirement to always produce the same result constrains things quite a bit, since a function to read a file would somehow have to produce the same result every time, even if the file was later changed. That's obviously not what we really want.

    There is, however, a way to make a functional programming language that can handle reading a changing file. What you do is to write a function that produces some action the computer should perform. So you might perform an action composed of the following steps:

    Read the file
    Break it into lines
    Change the even-numbered lines to uppercase
    Output the lines to the screen
    

    These four actions aren't performed yet. They're just a sequence of actions that we might perform. A function can return that exact same sequence of potential actions every time it's called, which makes it a proper mathematical function.
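
    As a rough sketch, the four steps could be written like this. The names upperEvenLines and showProcessed are invented here, and lines are assumed to be numbered from 1.

    ```haskell
    import Data.Char (toUpper)

    -- Pure step: uppercase the even-numbered lines (counting from 1).
    upperEvenLines :: [String] -> [String]
    upperEvenLines = zipWith step [1 :: Int ..]
      where
        step n l
          | even n    = map toUpper l
          | otherwise = l

    -- The composed action: read the file, break it into lines,
    -- transform them, and output them. Calling showProcessed only
    -- builds this action; nothing is read or printed until it is run.
    showProcessed :: FilePath -> IO ()
    showProcessed path = do
      contents <- readFile path
      mapM_ putStrLn (upperEvenLines (lines contents))
    ```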

    In Haskell, main :: IO a is the action that the program should perform. Strictly speaking, main is a value rather than a function, but the point carries over: it always denotes the same action, just as a proper mathematical function always returns the same result. When the program is run, the computer evaluates main, producing the action the computer should perform, and the computer then executes the action.
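
    To illustrate that an action is just a value until something executes it, here is a minimal sketch using a mutable counter. The name planThenRun is hypothetical, and IORef is used only so that the effect can be observed.

    ```haskell
    import Data.IORef (modifyIORef, newIORef, readIORef)

    -- Build a list of actions, check that none has run, then run them all.
    planThenRun :: IO (Int, Int)
    planThenRun = do
      counter <- newIORef (0 :: Int)
      let bump = modifyIORef counter (+ 1)  -- describes an increment; runs nothing
          plan = [bump, bump, bump]         -- a list of actions; still nothing has run
      before <- readIORef counter           -- counter is still 0
      sequence_ plan                        -- now the three increments are performed
      after <- readIORef counter
      pure (before, after)
    ```

    Performing planThenRun yields (0, 3): defining bump and plan changed nothing, and only sequence_ plan inside the running program did.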

    Do notation

    Do notation takes the strangeness out of the process, giving you the feel of a much more standard programming language. You have three options:

    1. Perform an action and do nothing with its results
    2. Perform an action and store its results
    3. Process data using only functions (no actions)

    These are done in the following ways, respectively:

    1. action args
    2. result <- action args
    3. let result = f . g . h . whateverCalculation $ value

    This is similar to an imperative language like C where you do, respectively:

    1. action(args);
    2. result = action(args);
    3. result = f(g(h(whateverCalculation(value))));
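
    Put together, the three forms might look like this in a single do block. The function firstLineUpper is a made-up example, not something from a library.

    ```haskell
    import Data.Char (toUpper)

    -- Return the first line of a file in uppercase.
    firstLineUpper :: FilePath -> IO String
    firstLineUpper path = do
      putStrLn ("Reading " ++ path)    -- 1. perform an action, ignore its result
      contents <- readFile path        -- 2. perform an action, store its result
      let result = map toUpper . takeWhile (/= '\n') $ contents
                                       -- 3. pure calculation, no action
      pure result
    ```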

  3. 2016-12-12 20:12

    For (lines . readFile) path to work, the type of readFile would need to be FilePath -> String. That, however, doesn't make sense in Haskell. A Haskell function is supposed to always produce the same results when given the same arguments. If the result type of readFile were String, however, that would not happen, as readFile "foo.txt" would have to, for any useful implementation of such a readFile, produce different strings depending on the contents of the foo.txt file.

    The solution to this issue is giving readFile the type FilePath -> IO String. An IO String is not a string, but a program that can be executed by the computer and that, when executed, somehow materialises a String into memory. While the String thus produced might be different each time the program is executed, the program itself remains the same, and therefore readFile always returns the same results when given the same arguments (and so, for instance, readFile "foo.txt" is always the same program).

    This trick of manipulating a program that produces an I/O-dependent result instead of the result itself only works if the I/O-dependent result is kept opaque; that is, if there is no way of directly extracting it. In other words, there cannot be, for instance, an IO String -> String function -- for one, it would allow us to implement a readFile with the inappropriate type FilePath -> String that we have discussed above. There are, however, indirect ways of using the I/O-dependent result that do not lead to trouble. One of them is using it to create a second program, whose I/O-dependent result is just as opaque as the first one was. The Monad interface allows us to express this usage pattern:

    (>>=) :: Monad m => m a -> (a -> m b) -> m b
    

    Specialising (>>=) to IO, we get:

    (>>=) @IO :: IO a -> (a -> IO b) -> IO b
    

    The first program has type IO a, and the function that produces the second program using the I/O-dependent result of the first one has type a -> IO b. The result of (>>=) is a program which executes the first program and the second, newly generated, one in sequence. For instance...

    readFile "foo.txt" >>= putStrLn
    

    ... is a program which reads the contents of foo.txt and then displays these contents.
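
    For comparison, here is a small sketch of the same pattern written with (>>=) and with do notation, which is shorthand for it. The names printFile, printFileDo and countChars are hypothetical.

    ```haskell
    -- Read a file and display it, chaining the two programs with (>>=).
    printFile :: FilePath -> IO ()
    printFile path = readFile path >>= putStrLn

    -- The same program in do notation.
    printFileDo :: FilePath -> IO ()
    printFileDo path = do
      contents <- readFile path
      putStrLn contents

    -- The second program can also compute something from the contents;
    -- the result still comes wrapped in IO.
    countChars :: FilePath -> IO Int
    countChars path = readFile path >>= \contents -> pure (length contents)
    ```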


    P.S.: With respect to your example involving lines, it is worth noting that both (readFile >>= lines) path, as you have written it, and (\p -> readFile p >>= lines) path are rejected by the type checker, because lines :: String -> [String] does not have the type that (>>=) expects for its second argument. Something that does work is:

    (fmap lines . readFile) path
    

    In it, we are making indirect use of the file contents in a different way. If we have a program which produces an I/O-dependent result, we can turn it into a program which produces a modified version of this result. That is done through fmap, from the Functor class:

    fmap :: Functor f => (a -> b) -> f a -> f b
    

    Or, specialising to IO:

    fmap @IO :: (a -> b) -> IO a -> IO b
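
    As a sketch of how the two approaches relate, the fmap version can also be written with (>>=), provided the pure result of lines is wrapped back into IO. The names readLines and readLinesBind are made up for this example.

    ```haskell
    -- Using fmap: modify the result of the reading program.
    readLines :: FilePath -> IO [String]
    readLines = fmap lines . readFile

    -- Using (>>=): the function passed in must return an IO action,
    -- so the plain [String] from lines is wrapped with pure.
    readLinesBind :: FilePath -> IO [String]
    readLinesBind p = readFile p >>= \contents -> pure (lines contents)
    ```

    Both definitions describe a program that produces the same list of lines for the same file contents.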
    