Skip to content

proposed program to solve a 2 sum variation problem

Notifications You must be signed in to change notification settings

d4h0y0sr/NBA2sumHeights

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

2-sum NBA players height problem

Overview

The problem to solve can be summarized as follows:

Given a set of NBA players, which are the pairs of players whose height in inches adds up to a given natural number?

Conditions

The solution program should meet the following conditions:

  • The program should download the dataset form https://mach-eight.uc.r.appspot.com
  • The program should print a list of all pairs of players whose height in inches adds up to the integer input to the application.
  • The algorithm to find the pairs must be faster than $O (n^2)$
  • If no matches are found, the application will print "No matches found"
  • All edge cases should be handled appropriately

Proposed Algorithm

I propose a variation of the 2-sum problem algorithm. The significant difference is that the numbers we consider (players height) can be repeated. This will impact the complexity as the algorithm will work better or worse depending not on the size of the set but on the sample distribution. A higher concentration in the distribution of heights of the players will mean a higher probability of an scenario where we could need to pair all the players of the dataset which will be always a $O (n^2)$ . Fortunately, is not the case of our dataset:

Distribution

As we can see the higher concentration is around 79. Thus, the worst case scenario will be probably around target input $2 * 79 ≈ 158$. Empirically I found it was 159.

Algorithm steps

1- Iterates the dataset creating a frequency dictionary with the different heights as key and the correspondent players who have that height as value.

       $O(n)$ where n is the number of players in the dataset .

2- Creates a sorted list of the different heights taking the keys of the dictionary which are unique.

       $O(c_1 log(c_1))$ where $c_1$ is the number of unique heights in the data set.

3- Iterates the sorted list of different heights until the height is greater than the input number/2. For each iteration check if the target input number minus the current height is present in the dictionary of height frequencies.

       $O(c_2)$

    3.1- - As the two numbers $(A,B)$ that adds up to the given input number are found in the dictionary we generate the combination between the two lists of players and printed it.

             $O(P_1P_2)$ where $P_1$ is the number of players with height $A$ , and $P_2$ is the number of players with height $B$.

The complexity of the algorithm is approximately:

     $O (n)+O(C_1log(C_1))+O(C_2)O(P_1P_2)$

In other words:

     $O(C_2P_1P_2)$

where

     $nlog(n) < C_2P_1P_2 < n^2$

Please be aware that the complexity could vary depending on the sample's distribution of heights.

Edge Cases

Handled edge cases found were:

  • Wrong input type: String, float.
  • input number out of range.

Execution

The program is written in python 3.9.7

You can run the file nba2sum.py from terminal to run the main program. After running it, insert the target sum number and press enter. It will display all the combinations.

image

You can run the unit test file unittesting.py

image

About

proposed program to solve a 2 sum variation problem

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages