Optimal Binary Search Tree



In this article, we will discuss a classic Dynamic Programming problem that involves constructing an optimal binary search tree for a given set of keys with their search probabilities.

This article assumes basic familiarity with binary search trees and dynamic programming.

Optimal Binary Search Tree Problem

In this problem, you are given:

  • A sorted array keys[] of size n that contains the keys of the binary search tree.
  • An array freq[] of size n, where freq[i] is the number of times keys[i] is searched.

The task is to find the minimum cost of a binary search tree that can be constructed using the given keys and their frequencies. The cost of a binary search tree is defined as the sum, over all keys, of each key's frequency multiplied by its level in the tree, where the root is at level 1. The tree must keep the BST property (an in-order traversal visits the keys in sorted order).

$$ \text{Cost}(\text{node}) = \text{freq}(\text{node}) \times \text{level}(\text{node}) $$

$$ \text{Total Search Cost} = \sum_{i=1}^{n} \text{freq}(i) \cdot \text{level}(i) $$

Here, level(i) is the level of the node that holds the i-th key (the root has level 1), and freq(i) is the frequency of that key. Return the minimum value possible for the total search cost.

Scenario

Input: Keys = {10, 12, 20}, Frequency = {34, 8, 50}
Output: Minimum cost: 142

Explanation: The following are the five possible binary search trees for the given keys:

[Figure: Optimal Binary Search Trees]

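For case 1 (10 at the root, 12 below it, 20 at the bottom), the cost is: (34*1) + (8*2) + (50*3) = 200.
For case 2 (12 at the root, 10 and 20 as its children), the cost is: (8*1) + (34*2) + (50*2) = 176.
Similarly, for case 5 (20 at the root, 10 below it, 12 at the bottom), the cost is: (50*1) + (34*2) + (8*3) = 142 (Minimum).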

We have three algorithms to solve this problem. We will discuss each of them in detail.

Recursive Algorithm to Find Optimal Binary Search Tree

In this approach, we will use recursion to explore all possible binary search trees that can be formed with the given keys and their frequencies. Here are the steps to implement the recursive algorithm:

  • Try each key r between i and j (inclusive) as the root.
  • Recursively find the optimal cost of the left subtree (i to r-1).
  • Recursively find the optimal cost of the right subtree (r+1 to j).
  • Add the sum of frequencies of all keys in [i..j]. Hanging the two subtrees under the root pushes each of their keys one level down, and the root itself sits at level 1, so every key in the range contributes its frequency one more time.
  • Keep the minimum total over all choices of r.
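The steps above can be summarized as a recurrence. The name cost(i, j) is introduced here for illustration and stands for the minimum cost of a tree built from keys i through j:

$$ \text{cost}(i, j) = \min_{i \le r \le j} \big[ \text{cost}(i, r-1) + \text{cost}(r+1, j) \big] + \sum_{k=i}^{j} \text{freq}(k), \qquad \text{cost}(i, j) = 0 \ \text{if } i > j $$

Applied to the scenario above (frequencies 34, 8, 50 at indices 0, 1, 2):

$$ \text{cost}(0, 1) = \min(0 + 8,\ 34 + 0) + 42 = 50 $$

$$ \text{cost}(1, 2) = \min(0 + 50,\ 8 + 0) + 58 = 66 $$

$$ \text{cost}(0, 2) = \min(0 + 66,\ 34 + 50,\ 50 + 0) + 92 = 142 $$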

Example

Following is the C++ implementation for finding the optimal binary search tree using recursion:

#include <iostream>
#include <algorithm> // std::min
#include <climits>   // INT_MAX
using namespace std;

int sum(int freq[], int i, int j) {
   int total = 0;
   for (int k = i; k <= j; k++) {
      total += freq[k];
   }
   return total;
}

int optimalBST(int keys[], int freq[], int i, int j) {
   if (i > j) return 0; // Base case: no keys in this range
   if (i == j) return freq[i]; // Base case: only one key

   int minCost = INT_MAX;

   // Try each key as root
   for (int r = i; r <= j; r++) {
      // Cost of left subtree
      int leftCost = optimalBST(keys, freq, i, r - 1);
      // Cost of right subtree
      int rightCost = optimalBST(keys, freq, r + 1, j);
      // Total cost with current root
      int totalCost = leftCost + rightCost + sum(freq, i, j);

      minCost = min(minCost, totalCost);
   }
   return minCost;
}

int main() {
   int keys[] = {10, 12, 20};
   int freq[] = {34, 8, 50};
   int n = sizeof(keys) / sizeof(keys[0]);

   int minCost = optimalBST(keys, freq, 0, n - 1);
   cout << "Minimum cost: " << minCost << endl;
   return 0;
}

The output of the above code will be:

Minimum cost: 142

DP Memoization to Find Optimal Binary Search Tree

If you look at the recursive solution, you will notice that it has overlapping sub-problems. For example, with three keys, the cost of subtree (1, 1) is computed once while evaluating subtree (0, 1) and again while evaluating subtree (1, 2), both of which are reached from (0, 2). To avoid this repeated work, we can use DP memoization to store the results of previously calculated sub-problems.

  • Use a 2D array, dp, to store the minimum cost for each range of keys.
  • Check if the value for the current range is already calculated. If yes, return it.
  • If not, calculate it using the same logic as in the recursive approach and store the result in dp.

Example

Here is the C++ code implementation for finding the optimal binary search tree using dynamic programming memoization:

#include <iostream>
#include <algorithm> // std::min
#include <climits>   // INT_MAX
#include <cstring>   // memset
using namespace std;

int sum(int freq[], int i, int j) {
   int total = 0;
   for (int k = i; k <= j; k++) {
      total += freq[k];
   }
   return total;
}

int optimalBST(int keys[], int freq[], int i, int j, int dp[][100]) {
   if (i > j) return 0; // Base case: no keys in this range
   if (i == j) return freq[i]; // Base case: only one key

   if (dp[i][j] != -1) return dp[i][j]; // Return cached result

   int minCost = INT_MAX;
   // The frequency sum is the same for every root, so compute it once
   int rangeSum = sum(freq, i, j);

   // Try each key as root
   for (int r = i; r <= j; r++) {
      // Cost of left subtree
      int leftCost = optimalBST(keys, freq, i, r - 1, dp);
      // Cost of right subtree
      int rightCost = optimalBST(keys, freq, r + 1, j, dp);
      // Total cost with current root
      int totalCost = leftCost + rightCost + rangeSum;

      minCost = min(minCost, totalCost);
   }
   dp[i][j] = minCost; // Cache the result
   return minCost;
}

int main() {
   int keys[] = {10, 12, 20};
   int freq[] = {34, 8, 50};
   int n = sizeof(keys) / sizeof(keys[0]);
   int dp[100][100];
   memset(dp, -1, sizeof(dp)); // Initialize dp array

   int minCost = optimalBST(keys, freq, 0, n - 1, dp);
   cout << "Minimum cost: " << minCost << endl;

   return 0;
}

The output of the above code will be:

Minimum cost: 142

DP Tabulation to Find Optimal Binary Search Tree

In this approach, we will use a bottom-up dynamic programming technique to fill the 2D table that stores the minimum cost for each range of keys. Because the table is filled iteratively, this approach avoids the recursion call stack that the memoized version needs.

  • Create a 2D array, dp of size n x n, where dp[i][j] will store the minimum cost for keys from i to j.
  • Initialize the diagonal elements (when i == j) with freq[i].
  • Iterate over lengths of subarrays from 2 to n.
  • For each subarray, calculate the minimum cost by trying each key as the root and updating the DP table.

Example

Here is the C++ implementation for finding the optimal binary search tree using dynamic programming tabulation:

#include <iostream>
#include <algorithm> // std::min
#include <climits>   // INT_MAX
using namespace std;

int sum(int freq[], int i, int j) {
   int total = 0;
   for (int k = i; k <= j; k++) {
      total += freq[k];
   }
   return total;
}

void optimalBST(int keys[], int freq[], int n) {
   int dp[100][100] = {0};

   // Initialize the diagonal elements
   for (int i = 0; i < n; i++) {
      dp[i][i] = freq[i];
   }

   // Fill the dp table
   for (int len = 2; len <= n; len++) { // length of subarray
      for (int i = 0; i <= n - len; i++) {
         int j = i + len - 1;
         dp[i][j] = INT_MAX;
         // The frequency sum is the same for every root, so compute it once
         int rangeSum = sum(freq, i, j);

         // Try each key as root
         for (int r = i; r <= j; r++) {
            int leftCost = (r > i) ? dp[i][r - 1] : 0;
            int rightCost = (r < j) ? dp[r + 1][j] : 0;
            int totalCost = leftCost + rightCost + rangeSum;
            dp[i][j] = min(dp[i][j], totalCost);
         }
      }
   }

   cout << "Minimum cost: " << dp[0][n - 1] << endl;
}

int main() {
   int keys[] = {10, 12};
   int freq[] = {34, 50};
   int n = sizeof(keys) / sizeof(keys[0]);
   optimalBST(keys, freq, n);
   return 0;
}

This example uses only two keys (10 and 12, with frequencies 34 and 50), so the output of the above code will be:

Minimum cost: 118

Time Complexity and Space Complexity

Here is a comparison of the time complexity and space complexity of the above-mentioned approaches.

| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Brute Force (Recursion) | Exponential, O(3^N) | O(N) |
| DP (Memoization) | O(N^3) | O(N^2 + N) |
| DP (Tabulation) | O(N^3) | O(N^2) |

Note that the memoization technique will take recursion stack space, which is O(N) in the worst case; hence, tabulation is the most space-efficient approach.

Farhan Muhamed

Updated on: 2025-08-12T16:23:36+05:30
