Path-finding Webapp Part 2: Back-end

This project was focused around developing a few different algorithms for path-finding. By following a series of directions and rules, an algorithm could search across an area (grid) and find the end. This project also focused on the development of an application where a user can create a maze on an interactive gird, choose and algorithm, and send it back for processing. This post goes into depth the algorithms and functions used in the back end of the application.


Endpoint:

Once a user accesses the route (as a GET request), a 30 X 30 numpy array is created. This array used in the creation of the table for the front end. After users have created their maze and customized their options, the chosen algorithm is called and the altered array passed to it. The data that is anticipated is a JSON that houses data relating to the array, the chosen algorithm, and the start and end points.


@application.route("/pathfinding", methods=["GET","POST"])
def pathfinding():

    if request.method == "GET":
        cells = np.zeros((30,30))
        search_form = SearchForm()

        return render_template("webapp_templates/pathfinding.html",cells=cells,search_form=search_form)

    if request.method == "POST":
        data = request.json['data']
        algorithm = request.json["algorithm"]
        start = tuple(request.json["start"])
        end = tuple(request.json["end"])
        data_np = np.array(data,dtype=np.int)

        if algorithm == "breadth_first_search":
            visited_list, path_list = bfs(data_np, start, end)
        elif algorithm == "depth_first_search":
            visited_list,path_list = dfs(data_np,start,end)
        elif algorithm == "a_star_search":
            visited_list,path_list = a_star(data_np,start=start,end=end)

        return json.dumps({"status":"OK","visited_list":visited_list,"path_list":path_list})


Data Structures and Classes:

Several classes were used to contain and encapsulate certain methods and variables. The Maze class was designed to house the grid and produce a grid shape.

class Maze:
    def __init__(self,grid:np.array) -> None:
        self.grid = grid
        self.grid_shape = grid.shape

    def __repr__(self) -> str:
        return f"grid: {self.grid} \n bounds: {self.grid_shape}"


The Node class is a child class of the maze class and contains various attributes and methods designed around node in the maze (i.e. cell). These attributes include:


  • parent: the parent node of the current node
  • pos: the position of the current node
  • cost: the cost of getting to the current node (10 for straight line and 14 for diagonal)
  • heuristic: distance between the current node and the end point
  • children: child nodes of the current node
  • info_dict: a dictionary object to hold various information
  • total_cost: the total cost being the cost + heuristic

A method call _get_children_grid was designed to automatically collect all the child cells surrounding the node. The method would iterate over a series of directions (e.g. NWSE) and add the appropriate indices together to get a potential child cell. For example, if the current node's position was set within the grid at (0,0), then a potential child could be found immediately south of it, or at position (1,0). The calculation for this example would be to (current_node_row + 1, current_node_col + 0). Below is a dictionary that helps illustrate the directions and the appropriate calculations. The children nodes will be used to create new nodes, adding them to the data structure that is required by the algorithm chosen.

class Node(Maze):

    def __init__(self, grid:np.array, pos:tuple, parent:Union[tuple,None] = None, neighbors:str= "4_wind", cost:Union[int,None] = None,
                 heuristic:Union[int,None] = None) -> None:
        Maze.__init__(self,grid=grid)

        self.parent = parent
        self.pos = pos
        self.cost = cost
        self.heuristic = heuristic
        self.total_cost = self._get_total_cost()
        self.children = self._get_children_grid(neighbors)
        self.info_dict = {"current":self.current,"parent":self.parent,"children": self.children}

    def _get_total_cost(self) -> int:

        if self.cost is not None and self.heuristic is not None:
            return self.cost + self.heuristic


    def _get_children_grid(self, neighbors:str) -> list:


        def _iterate_over_directions(neighbors_dict:dict) -> list:


            children = list()
            for key,value in neighbors_dict.items():
                child_node = (self.current[0] + value["calc"][0],self.current[1] + value["calc"][1])
                if child_node == self.parent or child_node[0] < 0 or child_node[1] < 0 or child_node[0] >= self.grid_shape[0] or child_node[1] >= self.grid_shape[1]:
                    pass
                else:
                    if self.grid[child_node[0]][child_node[1]] == 0:
                        access = True
                    else:
                        access = False
                    children.append({"node":child_node,"accessibility":access,"cost":value["cost"]})
            return children

        diagonal_line_cost = 14
        straight_line_cost = 10

        # help visualize directions
        four_wind = {
            "north": {
                "calc": [-1, 0],
                "cost": straight_line_cost
            },
            "east": {
                "calc": [0, 1],
                "cost": straight_line_cost
            },
            "south": {
                "calc": [1, 0],
                "cost": straight_line_cost
            },
            "west": {
                "calc": [0, -1],
                "cost": straight_line_cost
            }
        }

        eight_wind = {
            "north_west": {
                "calc": [-1, -1],
                "cost": diagonal_line_cost
            },
            "north": {
                "calc": [-1, 0],
                "cost": straight_line_cost
            },
            "north_east": {
                "calc": [-1, 1],
                "cost": diagonal_line_cost
            ...
        }

        if neighbors == "4_wind":
            return _iterate_over_directions(four_wind)
        if neighbors == "8_wind":
            return  _iterate_over_directions(eight_wind)

    def __repr__(self):
        return f"Node{self.current}"


The Stack class is a data structure designed to take in a node and pop the first node within it. The stack utilizes a last in first out procedure, pushing nodes into it and typically removing them immediately. The class uses a push method to push nodes into a list at the first position of a list (nodes). When removing the node from the stack, the class uses the pop method to remove a node from that first position.


class Stack:

    def __init__(self) -> None:
        self.nodes = list()

    def push(self,node: Node) -> None:

        self.nodes.insert(0,node)

    def pop(self) -> Node:

        return self.nodes.pop(0)

    def __repr__(self):
        return f"current stack is: {self.nodes}"


The Queue class is a data structure that acts very similar to the Stack class. However, the major difference is the order by which nodes are inserted into the queue; the data structure utilizes a first in first out (similar to a queue at an amusement park). The class utilizes the same methods as the Stack class, except the push method appended a new node object at the end of the list (nodes). The pop method will remove and return a node from the first element of the list.


class Queue:

    def __init__(self) -> None:
        self.queued_nodes = list()
        self.queue_list = list()

    def push(self, node: Node) -> None:

        self.queued_nodes.append(node)
        self.queue_list.append(node.pos)

    def pop(self) -> Node:

        self.queue_list.pop(0)
        return self.queued_nodes.pop(0)

    def __repr__(self):
        return f"current queue is: {self.queued_nodes}"


The PriorityQueue class is a child class of the Queue class and behaves similarly. The only major difference is the _prioritize method, which rearranges the nodes in the queue, based on total cost. This structure makes use of a heuristic to determine which node should be the next in queue. It only rearranges the queue in order to get node with the least cost to the front of the queue.


class PriorityQueue(Queue):

    def __init__(self) -> None:
        super().__init__()

    def push(self,node:Node) -> None:

        self.queued_nodes.append(node)
        self.queue_list.append(node.pos)
        self._prioritize()

    def _prioritize(self) -> None:

        least_total_cost = None
        for node in self.queued_nodes:
            if least_total_cost == None or node.total_cost < least_total_cost:
                self.queued_nodes.remove(node)
                self.queued_nodes.insert(0, node)
                least_total_cost = node.total_cost

    def __repr__(self):
        return f"Queue is {self.queued_nodes}"


Finally, the VisitedNodes class is a class designed for the specific purpose of collecting all the nodes that were visited during the process. The class was mainly designed with returning an output that could be used to visualize the process. The class stores nodes in a series of lists, one to store the actual node, another to store the node's position, and a final list to store information about the node. A create_path method was created to recreate a path from the end point to the start point iterating in reverse through each node and their parent node.


class VisitedNodes:

    def __init__(self) -> None:
        self.nodes = list()
        self.visited_nodes = list()
        self.node_info = dict()

    def _store_node(self, node:Node) -> None:

        self.nodes.append(node)
        self.visited_nodes.append(node.pos)
        self.node_info[node.current] = node.info_dict

    def create_path(self,current_node_position,start_position) -> Tuple[list,list]:

        path_list = list()
        while current_node_position != start_position:
            path_list.append(self.node_info[current_node_position]["pos"])
            current_node_position = self.node_info[current_node_position]["parent"]

        path_list.append(self.node_info[current_node_position]["pos"])

        return self.visited_nodes,path_list

    def __repr__(self):
        return f"visited nodes are: {self.nodes}"



Breadth-First Algorithm:

If the breadth-first algorithm was chosen in the drop-down, then the breadth_first_search function will be called. The function first creates a Queue object (queue) and a VisitedNodes object (visited). The first node at the start position is pushed into the queue and the process begins. While their are objects in the queue, the first node in the queue is popped and appended in visited. The same node is then checked to see if it coincides with the end position; if it does, then a visited list and a path list is return. Otherwise, each child of the node is iterated upon, and checked to see if it is in visited or queue, and is accessible (if its value in the grid is equal to zero). If true, then the child and its corresponding data is pushed into the queue, and the process repeats. One thing to make not is that the child's parent (the current node) is included with it and the other information pushed into the queue. This is later used to recreate a path back from the end to the starting position.


def bfs(grid:np.array,start:tuple,end:tuple) -> Tuple[list,list]:

    queue = Queue()
    queue.push(Node(grid,start))
    visited= VisitedNodes()

    while queue:
        node = queue.pop()
        visited._store_node(node)
        if node.current == end:
            visited_list, path_list = visited.create_path(node.pos,start)
            return visited_list, path_list
        else:
            for child in node.children:
                if child["node"] not in visited.visited_nodes and child["node"] not in queue.queue_list and child["accessibility"]:
                    queue.push(Node(grid, child["node"], parent=node.current))


Depth-First Algorithm:

The second algorithm that was developed was based on the depth-first search algorithm. This algorithm is similar to the Breadth-First algorithm with the major difference being a different data structure used that being a stack. A Stack object is created an the first node is pushed into it; a VisitedNodes object (visited) is also created. While there are nodes in the stack, the first node is popped and appended to visited. The node's position is checked to see if it is equal the ending position. If so, then a visited list and a path list is returned. Otherwise, each child of the node is iterated upon and check to see if they are in visited and the child is accessible. If both conditions are found to be true, then the child node and its associated information is pushed into the stack.


def dfs(grid:np.array, start:tuple, end:tuple, _stack: Stack = Stack()) -> Tuple[list, list]:

    _stack.push(Node(grid, start))
    visited = VisitedNodes()

    while _stack:
            node = _stack.pop()
            visited._store_node(node)
            if node.current == end:
                visited_list,path_list = visited.create_path(node.pos,start)
                return visited_list,path_list
            else:
                for child in node.children:
                    if child["node"] not in visited.visited_nodes and child["accessibility"]:
                            _stack.push(Node(grid, child["node"], parent=node.current))



A* Algorithm:

The final algorithm that was used in this project was the A* search algorithm. The main difference with this algorithm is the use of a heuristic and the cost to travel from cell to cell. The algorithm makes use of the sum of the two (heuristic + cost) to get a total cost value. That value paired with a priority queue data structure encourages the algorithm to move to cells that are closer to the end point. A PriorityQueue object (priority_queue) and a VisitedNodes (visited) object is created. The first node is pushed into priority_queue. While there are nodes in the priority_queue, the first node is popped and appended to visited. If that node's position is equal to the end position, then return a visited list and a path list. Otherwise, each child of the node is iterated upon and checked to see if they are already in visited or in the priority_queue, and that the child is accessible. If found true, then the child is pushed into the priority_queue.


def a_star(grid:np.array,start:tuple,end:tuple) -> Tuple[list,list]:

  def diagonal_distance(child_node,end):
      dx = abs(child_node[0] - end[0])
      dy = abs(child_node[1] - end[1])
  
      return 10 * (dx+dy) + (14 - 2 * 10) * min(dx,dy)
  
  visited = VisitedNodes()
  priority_queue = PriorityQueue()
  priority_queue.push(Node(grid, start, neighbors="8_wind", cost=0, heuristic=diagonal_distance(start, end)))
  
  while priority_queue:
      node = priority_queue.pop()
      print(f"on {node}")
      visited._store_node(node)
      if node.current == end:
          visited_list,path_list = visited.create_path(node.pos,start)
          return visited_list,path_list
      else:
          for child in node.children:
              if child["node"] not in visited.visited_nodes and child["node"] not in priority_queue.queue_list and child["accessibility"]:
                  priority_queue.push(Node(grid, child["node"], parent=node.current, neighbors="8_wind", cost=child["cost"],
                                           heuristic=diagonal_distance(child["node"], end)))



Conclusion:

The back-end of the project revolved on the use of different type of data structures and various algorithms that made use of those structures. Data structures such as a queue, stack, and priority queue, can be used to handle and organize the nodes, systematically processing and checking them. Since the algorithms rely so heavily on the structure that is used, patterns between them can be easily recognized. For example, the dfs and bfs are essentially the same algorithm, however they both have different approaches in which order the node is returned when moving on to the next node. In future iterations of the project, certain functions will be aggregated, and more algorithms and features will be added.



Comments: