Tuesday, 24 November 2015

Understanding Depth first Search DFS using Java (for trees and Graphs)

Hello Friends,

DFS is one of the algorithm that we can use to traverse the elements of a Graph.

Lets try to build the algorithm for DFS below. For simplicity lets use tree with multiple children (tree is also a special type of graph).





Lets try to follow below steps and see how we traverse the above tree.


Lets have 3 basic entities to implement DFS algorithm:
1. currentNode
2. currentConnectedList (List to contain connected nodes)
3. visited[]  (Array of type boolean and  size = number of nodes in the graph and all cells initialized with false, let us have node with name 'i' be mapped with cell with index as 'i-1')

 

Algorithm :
1. Pick a node and mark it as currentNode
2. For the currentNode, find the connected nodes and add them in its currentConnectedList
3. Set the cell value in array visited as true for the cell corresponding to the currentNode
4. print the visited node name print(nodeName)
5. pick the first Non visited node from the current node's currentConnectedList and mark it as currentNode
In case currentConnectedList has no nodes in it or all the nodes have been visited then go to parent Node of the currentNode and pick next non visited node. In case the parent node is null then break the loop.
[we have put the condition to pick the non visited ones, when working with Graphs with cycles then it is this condition that prevents the algorithm from running infinitely by being stuck in the cycle.]
6. Go to step 2.



currentNode = 1
visited = [true][false][false][false][false][false][false][false][false][false][false][false][false][false]
print(1) 
currentConnectedList = 2->7->14 [1st non visited node is 2 so
2 will be picked next and we will come back for 7 and 14 later]

currentNode = 2
visited = [true][true][false][false][false][false][false][false][false][false][false][false][false][false]
print(2)
currentConnectedList= 3->5

currentNode = 3
visited = [true][true][true][false][false][false][false][false][false][false][false][false][false][false]
print(3)
currentConnectedList= 4

currentNode = 4
visited = [true][true][true][true][false][false][false][false][false][false][false][false][false][false]
print(4)
currentConnectedList= NULL

As currentConnectedList has no nodes so we go to parent of 4, i.e. 3. Lets go to 3 and see who is left unvisited. At currentNode = 3 we find only one element in
connectedList = 4, and 4 already has been visited (we see in the Visited array) so nothing left to pick from here

Lets go to parent of 3, i.e. 2 and see who is left unvisited. for 2 we see connectedList = 3->5, from visited array we find 3 already have been visited so now we pick 5

currentNode = 5
visited = [true][true][true][true][true][false][false][false][false][false][false][false][false][false]
print(5)
currentConnectedList= 6

currentNode
= 6
visited = [true][true][true][true][true][true][false][false][false][false][false][false][false][false]
print(6)
currentConnectedList= NULL

Similar to above again we have a situation when connectedList do not have any element.
Lets go to parent of 6 i.e. 5 and there we see only connected element was 6 which has already
been visited so we go to parent of 5 i.e. 2 and again we find connectedList = 3->5 and these
both elements have been visited. So now lets see parent of 2 i.e 1, for 1 we see connectedList = 2->7->14.
As per our rule, lets pick first non visited node from the connected List i.e. 7

currentNode = 7
visited = [true][true][true][true][true][true][true][false][false][false][false][false][false][false]
print(7)
currentConnectedList= 8->10->13

currentNode = 8
visited = [true][true][true][true][true][true][true][true][false][false][false][false][false][false]
print(8)
currentConnectedList= 9

currentNode = 9
visited = [true][true][true][true][true][true][true][true][true][false][false][false][false][false]
print(9)
currentConnectedList= NULL

going to 8 we see no more node left unvisited so lefts go to 7 and there we see connectedList = 8->10
and 8 has already been visited so next unvisited node is 10

currentNode = 10
visited = [true][true][true][true][true][true][true][true][true][true][false][false][false][false]
print(10)
currentConnectedList= 11->12

currentNode = 11
visited = [true][true][true][true][true][true][true][true][true][true][true][false][false][false]
print(11)
currentConnectedList= null

Going to 10 , we find connectedList = 11->12 and as 11 has already been visited so next unvisited node is 12

currentNode = 12
visited = [true][true][true][true][true][true][true][true][true][true][true][true][false][false]
print(12)
currentConnectedList= NULL

Going to parent of 12 i.e. 10 we find all the nodes in its connectedList have been visited
so going to parent of 10 i.e. 7 we find connectedList = 8->10->13, 8 and 10 have already been visited so next we have 13 to pick

currentNode = 13
visited = [true][true][true][true][true][true][true][true][true][true][true][true][true][false]
print(13)
currentConnectedList= NULL

Going back to 7 we find no more unvisited node in the connectedList, so going to parent of 7 i.e. 1 we see its connectedList = 2->7->14.
As 2 and 7 have already been visited, we are left with 14.

currentNode = 14
visited = [true][true][true][true][true][true][true][true][true][true][true][true][true][true]
print(14)
currentConnectedList= NULL

As no element in the connectedList, we go to the parent of 14, i.e. 1 and 1 has all its nodes in the connectedList 2->7->14 as visited.
Parent of 1 is Null hence we have reached the starting Node, so we are done with visiting all the nodes.

Now having understood a basic idea about the DFS, lets see a graph below and do some analysis.








Here we see the graph contains 3 sub graphs, 1,2,3,4,5,6,7 are connected then 9,10,11 are connected and then 8 is alone. These 3 sub graph make up our complete graph. Suppose this graphs represents good baseball players from 3 different teams (say Team A, Team B and Team C) who are coming to some separate city be part of some super team. Now suppose we wish to pass some information to all these players then it is enough if we pass that information to any 1 player from each team (so at-least  we need to pass information to 3 players).

Here DFS can come handy. We can start a for loop set for 11 iterations and then run the DFS for the node (representing a player) if it is not visited. DFS will visit all the connected nodes (players). Then coming back to the next iterations of the loop, as another non visited node is encountered (meaning we have a node from another sub graph) we need to  execute DFS  again  to visit the nodes of that sub graph.

In above case 3 times DFS will run 1st for Node-1, 2nd for Node 8 and 3rd for Node-9.

Now lets see the implementation of DFS below:


Below lets see the code for DFS in graph implemented using Adjacency matrix and then using Adjacency list

Code using Adjacency Matrix :

/**
 * 
 * @author krishna kumar
 * 
 *         Below code shows basic implementation of DFS in a graph. Graph has
 *         been implemented using adjacency matrix
 *
 */
public class Graph {
    private int[][] adjacencyMatrix;
    private int size; // number of nodes in the graph
    private boolean[] visited;

    public Graph(int size) {
        this.size = size;
        adjacencyMatrix = new int[size][size];// Initialized with all 0s
        visited = new boolean[size];
    }

    public void addEdge(int sourceName, int destinationName) {
        int sourceIndex = sourceName - 1;
        int destinationIndex = destinationName - 1;
        adjacencyMatrix[sourceIndex][destinationIndex] = 1;

        // the graph is non directional so if from S, D is reachable then vice
        // versa is also true
        adjacencyMatrix[destinationIndex][sourceIndex] = 1;
    }

    public void traverseGraph() {
        System.out.println("Using DFS Traversing the graph");
        /*
         * If graph contains disconnected nodes or sub graphs then below loops
         * makes sure that all such nodes or sub graphs are also visited
         * 
         * In a single connected graph below loop is not needed and simply
         * dfs(1); can be called and the graph will be visited.
         */
        for (int i = 1; i <= size; i++) {
            if (!visited[i - 1]) {
                dfs(i);
            }
        }
    }

    private void dfs(int currentNodeName) {
        visited[currentNodeName - 1] = true;
        System.out.println(currentNodeName);
        for (int i = 0; i < size; i++) {
            if (adjacencyMatrix[currentNodeName - 1][i] != 0) {
                if (!visited[i]) {
                    dfs(i + 1);
                }
            }
        }
    }

    public static void main(String[] args) {
        Graph graph = new Graph(11);

        graph.addEdge(1, 3);
        graph.addEdge(1, 4);
        graph.addEdge(2, 3);
        graph.addEdge(3, 5);
        graph.addEdge(3, 6);
        graph.addEdge(4, 5);
        graph.addEdge(4, 7);
        graph.addEdge(9, 11);
        graph.addEdge(10, 11);
        graph.traverseGraph();
    }

}


Below lets use adjacency List and see the implementation. For Linked List implementation of Stack styled Linked list have been used [ holding the reference of the last added element in the graph]
We will use class Node to represent the graph-node so  instead of separate array 'visited', we can simply declare a boolean field : 'visited' in class Node itself.

package com.graph.adjacencylist;

/**
 * 
 * @author krishna kumar
 * 
 *         This is a basic implementation of Graph using Adjacency list.
 *
 */
public class Graph {

    private Node[] vertices; // stores the nodes of the graph
    private int size; // number of nodes in the graph

    public Graph(int size) {
        this.size = size;
        vertices = new Node[size];
    }

    public class Node {
        int name;
        boolean visited = false;
        Neighbour neighbourList;

        Node(int name) {
            this.name = name;
        }
    }

    public class Neighbour {
        int index;
        Neighbour next;

        public Neighbour(int index, Neighbour next) {
            this.index = index;
            this.next = next;
        }
    }

    public void addNode(int name) {
        vertices[name - 1] = new Node(name);
    }

    public void addEdge(int sourceName, int destiName) {
        int srcIndex = sourceName - 1;
        int destiIndex = destiName - 1;
        Node srcNode = vertices[srcIndex];
        Node destiNode = vertices[destiIndex];
        srcNode.neighbourList = new Neighbour(destiIndex, srcNode.neighbourList);
        // the graph is non directional so if from S, D is reachable then vice
        // versa is also true
        destiNode.neighbourList = new Neighbour(srcIndex, destiNode.neighbourList);
    }

    public void traverseGraph() {
        /*
         * If graph contains disconnected nodes or sub graphs then below loops
         * makes sure that all such nodes or sub graphs are also visited
         * 
         * In a single connected graph below loop is not needed and simply
         * dfs(vertices[0]); can be called and the graph will be visited.
         */
        for (int i = 0; i < size; i++) {
            if (!vertices[i].visited) {
                dfs(vertices[i]);
            }
        }
    }

    private void dfs(Node currentNode) {
        currentNode.visited = true;
        System.out.println(currentNode.name);
        Neighbour temp = currentNode.neighbourList;
        while (temp != null) {
            if (!vertices[temp.index].visited) {
                dfs(vertices[temp.index]);
            }
            temp = temp.next;
        }
    }

    public static void main(String[] args) {
        Graph graph = new Graph(11);
        graph.addNode(1);
        graph.addNode(2);
        graph.addNode(3);
        graph.addNode(4);
        graph.addNode(5);
        graph.addNode(6);
        graph.addNode(7);
        graph.addNode(8);
        graph.addNode(9);
        graph.addNode(10);
        graph.addNode(11);
        graph.addEdge(1, 3);
        graph.addEdge(1, 4);
        graph.addEdge(2,3);
        graph.addEdge(3,5);
        graph.addEdge(3, 6);
        graph.addEdge(4, 5);
        graph.addEdge(4,7);
        graph.addEdge(9,11);
        graph.addEdge(10,11);
        graph.traverseGraph();
    }
}