Travelling Salesman Problem

Travelling Salesman Problem

The Traveling Salesman Problem (TSP) can be viewed as finding a Hamiltonian cycle in a weighted graph (where the weights represent distances or costs), with the additional constraint that this cycle must have the minimum total weight. In other words, TSP seeks the shortest possible tour that visits each city exactly once and returns to the starting city.

Pre Requisities

Hamiltonian Path and Hamiltonian Cycle:

Hamiltonian Path: A Hamiltonian path in a graph is a path that visits each vertex exactly once. In other words, it’s a way to traverse the graph such that every node is visited once and only once. If a Hamiltonian path exists that starts at one vertex and ends at another, it is known as a Hamiltonian circuit.

Hamiltonian Cycle: A Hamiltonian cycle is a special case of a Hamiltonian path. It is a cycle that visits each vertex exactly once, and it returns to the starting vertex, forming a closed loop.

Real World Problem Statement:

A real-world problem related to Hamiltonian paths and cycles could be the Delivery Route Problem:

Problem Statement: Imagine a delivery person needs to visit a set of locations (represented as nodes in a graph) to drop off packages. The goal is to find the shortest route that starts and ends at the same location, while visiting each location exactly once.

In this problem, the locations can be represented as nodes in a graph, and the connections between them (representing valid paths) are the edges. The graph would be undirected because the delivery person can travel in both directions along a road.

Assumptions

The Traveling Salesman Problem (TSP) is typically formulated with several assumptions:

  1. Complete Graph: This implies that the problem is defined on a complete graph, where every city is directly reachable from every other city.

  2. Symmetric Distances: The distance or cost of traveling between two cities is assumed to be the same regardless of the direction. This means that the distance from City A to City B is the same as the distance from City B to City A.

  3. Triangle Inequality: In the context of TSP, this means that the direct path between any two cities must be shorter than or equal to the sum of the direct paths through any other city. In other words, AB <= AC + CB.

  4. Non-Negative Distances: Distances or costs between cities are non-negative. This is a practical assumption, as distances cannot be negative in real-world scenarios.

  5. Single Salesman: There is only one salesman who is responsible for visiting all the cities.

graph LR
    A ---|20| B
    A ---|42| C
    A ---|25| D
    B ---|30| C
    B ---|34| D
    C ---|10| D

Brute Force Approach

What Needs to Be Found Out

The objective is to find a Hamiltonian cycle with minimum weight in this context. This would be the optimal route for the delivery person to visit all locations exactly once and return to the starting point.

Hamiltonian Cycle vs TSP

In this case, Hamiltonian tour exists (because the graph is complete) and in fact, many such tours exist, the problem is to find the minimum weight Hamiltonian cycle.

The Path A - B - C - D - A is a hamiltonian cycle. There exists many such cycles. Since this is a complete graph, there are n! possibilities.

Starting from Node A:

RowPathWeight
1A -> B -> C -> D -> A85
2A -> B -> D -> C -> A106
3A -> C -> B -> D -> A131
4A -> C -> D -> B -> A106
5A -> D -> B -> C -> A131
6A -> D -> C -> B -> A85

Starting from Node B:

RowPathWeight
7B -> A -> C -> D -> B106
8B -> A -> D -> C -> B85
9B -> C -> A -> D -> B131
10B -> C -> D -> A -> B85
11B -> D -> A -> C -> B131
12B -> D -> C -> A -> B106

Starting from Node C:

RowPathWeight
13C -> A -> B -> D -> C106
14C -> A -> D -> B -> C131
15C -> B -> A -> D -> C85
16C -> B -> D -> A -> C131
17C -> D -> A -> B -> C85
18C -> D -> B -> A -> C106

Starting from Node D:

RowPathWeight
19D -> A -> B -> C -> D85
20D -> A -> C -> B -> D131
21D -> B -> A -> C -> D106
22D -> B -> C -> A -> D131
23D -> C -> A -> B -> D106
24D -> C -> B -> A -> D85

If we know the starting node, the total number of Hamiltonian paths in a complete graph with n nodes can be calculated as: (n−1) × (n−2) × (n−3) × … × 1, which is equivalent to (n−1)!. Note that the weight of a Hamiltonian cycle is equal to that of its reverse due to symmetrical distances. So, the complexity of this approach is (n-1)! / 2

Recursion Tree

graph TB
  subgraph S["min(85, 106, 85) = 85"]
    direction TB
    A -->|"min(65, 86) = 65 \n 20 + 65 = 85"| B
    A -->|"min(89, 64) = 64 \n 42 + 64 = 106"| C
    A -->|"min(106, 60) = 60 \n 25 + 60 = 85"| D

    B -->|"30 + 35 = 65"| C1[C]
    B -->|"34 + 52 = 86"| D1[D]

    C -->|"30 + 59 = 89"| B1[B]
    C -->|"10 + 54 = 64"| D2[D]

    D -->|"34 + 72 = 106"| B2[B]
    D -->|"10 + 50 = 60"| C2[C]

    C1 -->|"10 + 25 = 35"| D3[D]
    D1 -->|"10 + 42 = 52"| C3[C]

    B1 -->|"34 + 25 = 59"| D4[D]
    D2 -->|"34 + 20 = 54"| B3[B]

    B2 -->|"30 + 42 = 72"| C4[C]
    C2 -->|"30 + 20 = 50"| B4[B]

    D3 -.25.-> A1[A]
    C3 -.42.-> A2[A]
    D4 -.25.-> A3[A]
    B3 -.20.-> A4[A]
    C4 -.42.-> A5[A]
    B4 -.20.-> A6[A]
  end

Implementation

Complexity Analysis

Time Complexity

  • The time complexity of this approach is O(n^2 * 2^n). Here, n is the number of nodes in the graph.

Space Complexity

  • The space complexity is O(n * 2^n). This is due to the space required for the recursion stack (which can have at most n levels deep), and the additional space used for the visited array which keeps track of visited nodes.

Dynamic Programming

From the above implementation it is clear that we are depending on two variables i.e., visited and current. Using a straightforward approach, we might need a 2D array to store the state of all possible subsets of nodes. This can be extremely memory-intensive and impractical for large graphs.

// visited it self is an array
dp [visited][current]

Bit masking is used to efficiently represent and track the state of visited nodes in a more memory-efficient manner. By using bit masking, we can represent the set of visited nodes as a binary number. Each bit corresponds to a node, where 1 indicates the node has been visited and 0 indicates it hasn’t. This representation significantly reduces the memory footprint compared to a 2D array.

Bit Masking

Bit Masking is a technique used in programming where we manipulate individual bits of a number. It involves performing bitwise operations (like AND, OR, XOR) on binary representations of numbers.

In the context of the Traveling Salesman Problem (TSP), we can use bit masking to represent the visited nodes. Instead of using an array of booleans to keep track of visited nodes, we can use an integer where each bit represents the visited status of a node.

For example, if we have 4 nodes, we can represent the visited status as follows:

Visited ArrayBinary RepresentationBitmask NumComment
[0, 0, 0, 0]00000All nodes unvisited
[0, 0, 0, 1]00011Only node A is visited
[0, 0, 1, 0]00102Only node B is visted
[0, 0, 1, 1]00113Nodes A and B are visited
[1, 1, 1, 1]111115All nodes are visited

Using bit masking in TSP reduces the space complexity, especially when dealing with a large number of nodes.

Implementation

Complexity Analysis

This implementation has a similar time complexity of O(n^2 * 2^n), but the space complexity is improved to O(n * 2^n) due to the reduced space required for the bitmask.

Dp with Bit Mask