## Finding the longest path through a DFS Forest in Haskell

Question

I have a graph that I'm trying to find the longest path through but I'm unsure how to go about it. I'm using the standard `Graph` and `Edge` types from `Data.Graph`, and the graph was generated using the `buildG` function. My initial thought was to use `dfs` to perform a depth-first search, but this yielded a `Forest` object that I'm unsure how to manipulate.

If it at all helps (and I'm sorry I can't pretty print this, `showForest` only prints strings apparently), this is the output of the `dfs` command on my graph:

`[Node {rootLabel = 87, subForest = [Node {rootLabel = 82, subForest = [Node {rootLabel = 70, subForest = []},Node {rootLabel = 83, subForest = [Node {rootLabel = 66, subForest = [Node {rootLabel = 72, subForest = [Node {rootLabel = 88, subForest = []}]}]},Node {rootLabel = 79, subForest = [Node {rootLabel = 69, subForest = [Node {rootLabel = 85, subForest = []},Node {rootLabel = 84, subForest = [Node {rootLabel = 86, subForest = []},Node {rootLabel = 73, subForest = [Node {rootLabel = 81, subForest = []}]}]}]}]}]},Node {rootLabel = 89, subForest = []}]}]}]`

I've found a few different functions for finding longest paths through Trees, such as in this answer, but they only work for `Tree`s, not `Forest`s, and I'm unsure if it's even possible to convert between the two.

Thanks!

Show source

## Answers to Finding the longest path through a DFS Forest in Haskell ( 2 )

1. I will first explain what `dfs` does. As you mentioned in your comments, you use `buildG` function so I will also use it in my answer.

`dfs` takes 2 arguments: graph and list of root nodes. And then `dfs` runs depth-first search starting from each vertex number in list, returning `Tree` of reachable vertices from those nodes in list. For example:

1. You have graph of 1 vertex and you want find paths from 1. You will then have list with only one `Tree` where root label is `1` and subforest is empty.
2. You have graph of 2 vertices and edge from 1 to 2 and you want find paths from 1. You will then have list with only one `Tree` where root label is `1` and subforest is list with 2 as root label.
3. You have graph of 3 vertices and edges (1,2) and (1,3) and you want find paths from 1. You will then have list with only one `Tree` where root label is `1` and subforest is list with two trees with 2 and 3 as root labels correspondingly.

All these examples (and even some more) are demonstrated in next ghci session:

``````λ: let g = buildG (1,1) []
λ: dfs g [1]
[Node {rootLabel = 1, subForest = []}]
λ: dfs g [2]
*** Exception: Ix{Int}.index: Index (2) out of range ((1,1))
λ: let g = buildG (1,2) [(1,2)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 2, subForest = []}]}]
λ: dfs g [2]
[Node {rootLabel = 2, subForest = []}]
λ: let g = buildG (1,3) [(1,2), (1,3)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 3, subForest = []},Node {rootLabel = 2, subForest = []}]}]
``````

But, unfortunately, `dfs` can't help you with your task: finding furthest vertex from given one. In the way algorithm works it is just not capable to solve such problems. Suppose you have graph with 4 vertices and edges:

1 → 2, 2 → 3, 3 → 4, 1 → 4

Vertex that has longest path from 1 is 3 (this path has 2 edges). But `dfs` will return you different results depending on order of edges you specified.

``````λ: let g = buildG (1,4) [(1,2), (1,4), (2,3), (3,4)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 4, subForest = []},Node {rootLabel = 2, subForest = [Node {rootLabel = 3, subForest = []}]}]}]
λ: let g = buildG (1,4) [(1,4), (1,2), (2,3), (3,4)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 2, subForest = [Node {rootLabel = 3, subForest = [Node {rootLabel = 4, subForest = []}]}]}]}]
``````

It is not the problem of particular implementation. It is just how `dfs` works. And you can't use `dfs` in general case to solve your particular problem. Also, this problem can't be fixed with just rearranging edges in list because you don't know how to rearrange them in general.

What you really need to use is `bfs` algorithm — breadth-first search. Unfortunately, it is not implemented in `Data.Graph` library. So you need to implement whole algo from zero or find implementation somewhere (here some dicussions: Traversing a graph breadth first, marking visited nodes in Haskell).

2. As Shersh explains in their answer, depth-first search is not enough to solve your problem (if it was, you could use the `generalFold` from the answer you linked to to, for instance, reconstruct the longest path in each tree of the forest). One alternative is switching from `Data.Graph` to fgl, which offers a wide assortment of graph algorithms, including breadth-first search. Note that The Haddock documentation of FGL is rather laconic, and so you will also want to consult the helpful User Guide available though the package homepage. In the demo below, I will use the `bft` function to get a breadth-first spanning tree of a graph, which should be enough to get you started:

``````GHCi> import Data.Graph.Inductive.Graph
GHCi> import Data.Graph.Inductive.PatriciaTree
GHCi> import Data.Graph.Inductive.Query.BFS
GHCi> :{
GHCi| test :: UGr
GHCi| test = mkUGraph [1..7] [(1,3),(1,2),(2,4),(3,5),(2,6),(5,2),(5,7)]
GHCi| :}
GHCi> prettyPrint test
1:()->[((),2),((),3)]
2:()->[((),4),((),6)]
3:()->[((),5)]
4:()->[]
5:()->[((),2),((),7)]
6:()->[]
7:()->[]
GHCi> bft 1 test
[[1],[2,1],[3,1],[4,2,1],[6,2,1],[5,3,1],[7,5,3,1]]
``````

The spanning tree is essentially a list of paths from a node to all reachable nodes. If you, for instance, just want to find one of the paths with maximum length and don't care about draws, all you need is:

``````GHCi> import Data.Ord
GHCi> import Data.List
GHCi> maximumBy (comparing length) (bft 1 test)
[7,5,3,1]
``````

If you do care about draws, you will need to do some juggling with things like `groupBy`, but it won't be fundamentally more difficult.