Finding the longest path through a DFS Forest in Haskell

Question

I have a graph that I'm trying to find the longest path through but I'm unsure how to go about it. I'm using the standard Graph and Edge types from Data.Graph, and the graph was generated using the buildG function. My initial thought was to use dfs to perform a depth-first search, but this yielded a Forest object that I'm unsure how to manipulate.

If it at all helps (and I'm sorry I can't pretty print this, showForest only prints strings apparently), this is the output of the dfs command on my graph:

[Node {rootLabel = 87, subForest = [Node {rootLabel = 82, subForest = [Node {rootLabel = 70, subForest = []},Node {rootLabel = 83, subForest = [Node {rootLabel = 66, subForest = [Node {rootLabel = 72, subForest = [Node {rootLabel = 88, subForest = []}]}]},Node {rootLabel = 79, subForest = [Node {rootLabel = 69, subForest = [Node {rootLabel = 85, subForest = []},Node {rootLabel = 84, subForest = [Node {rootLabel = 86, subForest = []},Node {rootLabel = 73, subForest = [Node {rootLabel = 81, subForest = []}]}]}]}]}]},Node {rootLabel = 89, subForest = []}]}]}]

I've found a few different functions for finding longest paths through Trees, such as in this answer, but they only work for Trees, not Forests, and I'm unsure if it's even possible to convert between the two.

Thanks!


Show source
| haskell   | graph   | tree   2016-11-17 06:11 2 Answers

Answers ( 2 )

  1. 2016-11-17 18:11

    I will first explain what dfs does. As you mentioned in your comments, you use buildG function so I will also use it in my answer.

    dfs takes 2 arguments: graph and list of root nodes. And then dfs runs depth-first search starting from each vertex number in list, returning Tree of reachable vertices from those nodes in list. For example:

    1. You have graph of 1 vertex and you want find paths from 1. You will then have list with only one Tree where root label is 1 and subforest is empty.
    2. You have graph of 2 vertices and edge from 1 to 2 and you want find paths from 1. You will then have list with only one Tree where root label is 1 and subforest is list with 2 as root label.
    3. You have graph of 3 vertices and edges (1,2) and (1,3) and you want find paths from 1. You will then have list with only one Tree where root label is 1 and subforest is list with two trees with 2 and 3 as root labels correspondingly.

    All these examples (and even some more) are demonstrated in next ghci session:

    λ: let g = buildG (1,1) []
    λ: dfs g [1]
    [Node {rootLabel = 1, subForest = []}]
    λ: dfs g [2]
    *** Exception: Ix{Int}.index: Index (2) out of range ((1,1))
    λ: let g = buildG (1,2) [(1,2)]
    λ: dfs g [1]
    [Node {rootLabel = 1, subForest = [Node {rootLabel = 2, subForest = []}]}]
    λ: dfs g [2]
    [Node {rootLabel = 2, subForest = []}]
    λ: let g = buildG (1,3) [(1,2), (1,3)]
    λ: dfs g [1]
    [Node {rootLabel = 1, subForest = [Node {rootLabel = 3, subForest = []},Node {rootLabel = 2, subForest = []}]}]
    

    But, unfortunately, dfs can't help you with your task: finding furthest vertex from given one. In the way algorithm works it is just not capable to solve such problems. Suppose you have graph with 4 vertices and edges:

    1 → 2, 2 → 3, 3 → 4, 1 → 4

    Vertex that has longest path from 1 is 3 (this path has 2 edges). But dfs will return you different results depending on order of edges you specified.

    λ: let g = buildG (1,4) [(1,2), (1,4), (2,3), (3,4)]
    λ: dfs g [1]
    [Node {rootLabel = 1, subForest = [Node {rootLabel = 4, subForest = []},Node {rootLabel = 2, subForest = [Node {rootLabel = 3, subForest = []}]}]}]
    λ: let g = buildG (1,4) [(1,4), (1,2), (2,3), (3,4)]
    λ: dfs g [1]
    [Node {rootLabel = 1, subForest = [Node {rootLabel = 2, subForest = [Node {rootLabel = 3, subForest = [Node {rootLabel = 4, subForest = []}]}]}]}]
    

    It is not the problem of particular implementation. It is just how dfs works. And you can't use dfs in general case to solve your particular problem. Also, this problem can't be fixed with just rearranging edges in list because you don't know how to rearrange them in general.

    What you really need to use is bfs algorithm — breadth-first search. Unfortunately, it is not implemented in Data.Graph library. So you need to implement whole algo from zero or find implementation somewhere (here some dicussions: Traversing a graph breadth first, marking visited nodes in Haskell).

  2. 2016-11-18 00:11

    As Shersh explains in their answer, depth-first search is not enough to solve your problem (if it was, you could use the generalFold from the answer you linked to to, for instance, reconstruct the longest path in each tree of the forest). One alternative is switching from Data.Graph to fgl, which offers a wide assortment of graph algorithms, including breadth-first search. Note that The Haddock documentation of FGL is rather laconic, and so you will also want to consult the helpful User Guide available though the package homepage. In the demo below, I will use the bft function to get a breadth-first spanning tree of a graph, which should be enough to get you started:

    GHCi> import Data.Graph.Inductive.Graph
    GHCi> import Data.Graph.Inductive.PatriciaTree 
    GHCi> import Data.Graph.Inductive.Query.BFS
    GHCi> :{
    GHCi| test :: UGr
    GHCi| test = mkUGraph [1..7] [(1,3),(1,2),(2,4),(3,5),(2,6),(5,2),(5,7)]
    GHCi| :}
    GHCi> prettyPrint test
    1:()->[((),2),((),3)]
    2:()->[((),4),((),6)]
    3:()->[((),5)]
    4:()->[]
    5:()->[((),2),((),7)]
    6:()->[]
    7:()->[]
    GHCi> bft 1 test
    [[1],[2,1],[3,1],[4,2,1],[6,2,1],[5,3,1],[7,5,3,1]]
    

    The spanning tree is essentially a list of paths from a node to all reachable nodes. If you, for instance, just want to find one of the paths with maximum length and don't care about draws, all you need is:

    GHCi> import Data.Ord
    GHCi> import Data.List
    GHCi> maximumBy (comparing length) (bft 1 test)
    [7,5,3,1]
    

    If you do care about draws, you will need to do some juggling with things like groupBy, but it won't be fundamentally more difficult.

◀ Go back