The problem to solve can be summarized as follows:
Given a set of NBA players, which are the pairs of players whose height in inches adds up to a given natural number?
The solution program should meet the following conditions:
- The program should download the dataset form https://mach-eight.uc.r.appspot.com
- The program should print a list of all pairs of players whose height in inches adds up to the integer input to the application.
- The algorithm to find the pairs must be faster than
$O (n^2)$ - If no matches are found, the application will print "No matches found"
- All edge cases should be handled appropriately
I propose a variation of the 2-sum problem algorithm. The significant difference is that the numbers we consider (players height) can be repeated. This will impact the complexity as the algorithm will work better or worse depending not on the size of the set but on the sample distribution. A higher concentration in the distribution of heights of the players will mean a higher probability of an scenario where we could need to pair all the players of the dataset which will be always a
As we can see the higher concentration is around 79. Thus, the worst case scenario will be probably around target input
1- Iterates the dataset creating a frequency dictionary with the different heights as key and the correspondent players who have that height as value.
2- Creates a sorted list of the different heights taking the keys of the dictionary which are unique.
3- Iterates the sorted list of different heights until the height is greater than the input number/2. For each iteration check if the target input number minus the current height is present in the dictionary of height frequencies.
3.1- - As the two numbers
The complexity of the algorithm is approximately:
In other words:
where
Please be aware that the complexity could vary depending on the sample's distribution of heights.
Handled edge cases found were:
- Wrong input type: String, float.
- input number out of range.
The program is written in python 3.9.7
You can run the file nba2sum.py from terminal to run the main program. After running it, insert the target sum number and press enter. It will display all the combinations.
You can run the unit test file unittesting.py