This repository collects the data interview questions from the mailing list run by InterviewQs. I will gradually add my own answers to each of them, mostly as a self-improvement exercise.
Note: InterviewQs currently charges $9 a month for the answers, which I don't subscribe to. Any answers in this repository are therefore mine alone and may or may not be correct.
Note: The mailings do not always reach me, so where I don't have the data for a problem I will leave a marker, with the intention of filling in the details later if I manage to get hold of them.
- Question 1 - Fraudulent retail accounts
- Question 2 - Calculating a moving average using Python
- Question 3 - The carshare dilemma
- Question 4 - Active users on a messaging application
- Question 5 - Employee survey results
- Question 6 - Python function to express power sets
- Question 7 - Probability of passing through interview stages
- Question 8 - Python function to traverse a binary tree path
- Question 9 - Time for a response on a messaging application
- Question 10 - Calculating student attendance using SQL
- Question 11 - Bias-Variance Tradeoff
- Question 12 - Finding the value closest to 0
- Question 13 - Cleaning and analyzing employee data
- Question 14 - Popular songs
- Question 15 - Drawing cards from a standard deck
- Question 16 - Points within an interval
- Question 17 - Analyzing employee data using Python
- Question 18 - A hotel chain's loyal customers
- Question 19 - New strain of flu
- Question 20 - One edit away
- Question 21 - Expanding a data set
- Question 22 - Picking a survey group
- Question 23 - Drawing cards from a standard deck, once more
- Question 24 - Oh foo-ie!
- Question 25 - Application feedback
- Question 26 - Is red independent from an ace?
- Question 27 - Property revenue across cities
- Question 28 - Winning the lottery
- Question 29 - American Football Scoring
- Question 30 - Assigning grades
- Question 31 - Life expectancy
- Question 32 - Revenue per employee
- Question 33 - Testing a claim around student intelligence
- Question 34 - Identifying prime numbers with Python
- Question 35 - Filtering student information with Pandas
- Question 36 - Rolling to win!
- Question 37 - Employees who are managers
- Question 38 - Calculating earnings with Python
- Question 39 - Normalizing student grades with Pandas
- Question 40 - Calculating monthly revenue growth in SQL
- Question 41 - Rolling dice to win...again
- Question 42 - Sentiment analysis for app reviews
- Question 43 - Revenue trends for online vs. in store channels
- Question 44 - Joining to get total sales, a SQL problem
- Question 45 - Defective gaskets
- Question 46 - Smallest missing number in array
- Question 47 - Bayes' Theorem
- Question 48 - Top selling products
- Question 49 - Skittles from a bag
- Question 50 - Python function to count tuple elements
- Question 51 - Counting capital letters
- Question 52 - Naive Bayes
- Question 53 - Planning for a new office location, using SQL
- Question 54 - Sell, sell, sell!
- Question 55 - Categorizing foods
- Question 56 - Twitch content creators
- Question 57 - Probability of selecting a wardrobe
- Question 58 - Longest palindrome substring
- Question 59 - Chipotle item analysis
- Question 60 - Choosing two ice creams
- Question 61 - Ranking vendors by spend
- Question 62 - Subarray sums
- Question 63 - Baby names
- Question 64 - Reddit posts and comments
- Question 65 - Rolling 10 times
- Question 66 - Matrix frequencies
- Question 67 - Drop rows in dataframe that are between two dates
- Question 68 - Type I vs Type II error
- Question 69 - Airbnb stays by country
- Question 70 - Stepping through nested while loops in Python
- Question 71 - Employee tenure
- Question 72 - Shoe prices
- Question 73 - Testing the toxicity of water
- Question 74 - Temperature Conversion
- Question 75 - Retail revenue trends
- Question 76 - Resampling
- Question 77 - State populations
- Question 78 - Largest elements in an array
- Question 79 - Channel attribution
- Question 80 - Days to first sale
- Question 81 - Relationship between fitness and smoking
- Question 82 - Count of Triplet
- Question 83 - Sales by marketing channel using Pandas
- Question 84 - In SQL, 0 = 0?
- Question 85 - Testing user conversion
- Question 86 - Alternative array sorting
- Question 87 - Best ad group
- Question 88 - Best ad group, a SQL version
- Question 89 - Confidence intervals for a dataset
- Question 90 - Spiral matrix
- Question 91 - Replacing bad data with Pandas
- Question 92 - SQL Patterns
- Question 93 - Rental Car Locations
- Question 94 - Reverse it
- Question 95 - User messaging
- Question 96 - Tallying up absent students using SQL
- Question 97 - Life expectancy, revisited
- Question 98 - Counting numbers, letters, and other things from a text file
- Question 99 - Plotting stock prices
- Question 100 - Reddit comment activity
- Question 101 - Pricing a Coin Toss
- Question 102 - Births by state
- Question 103 - Cleaning termination dates
- Question 104 - Termination survey data
- Question 105 - Car Battery
- Question 106 - Histogram of years worked
- Question 107 - Customer recency (a multi-question problem)
- Question 108 - Twitch view times by creators
- Question 109 - Weather forecasting, a Markov Chain problem
- Question 110 - Customer frequency -- a multi-question problem
- Question 111 - Filtering student information with Pandas, revisited
- Question 112 - eCommerce Margins
- Question 113 - Students in a class
- Question 114 - Reorder an array by x elements in Python
- Question 115 - Binning employee experience levels in Python (Pandas)
- Question 116 - ???
- Question 117 - ???
- Question 118 - ???
- Question 119 - Chipotle item analysis (part 2)
- Question 120 - Student GPA by subject
- Question 121 - Rainy flights
- Question 122 - Middle matrix sums
- Question 123 - Normalizing student grades with Pandas, revisited
- Question 124 - Monthly revenue growth in SQL for Online channel
- Question 125 - Pricing a life insurance product
- Question 126 - Check whether two arrays are equal
- Question 127 - Animal classification with Python
- Question 128 - Adult animal weights
- Question 129 - Biased coin toss
- Question 130 - Pythagorean triplet
- Question 131 - Candy production increase
- Question 132 - Querying San Francisco public worker salaries
- Question 133 - Second price (Vickrey) auction
- Question 134 - Arrays of arrays
- Question 135 - Volume weighted price average (VWAP)
- Question 136 - Chocolate Bar Reviews
- Question 137 - Auto component quality
- Question 138 - Length of largest contiguous array
- Question 139 - Plotting daily earnings (given balances)
- Question 140 - Top Earners
- Question 141 - One Way ANOVA
- Question 142 - Smallest subarray with a sum greater than X
- Question 143 - Formatting dirty data
- Question 144 - Formatting dirty data (continued)
- Question 145 - Similarity Matrix and Heat Chart
- Question 146 - Smallest, unrepresented integer
- Question 147 - Filtering strings in Pandas dataframes
- Question 148 - Filtering grades based on city avg, in SQL
- Question 149 - Testing impact of exercise on memory
- Question 150 - Multiply array values, return the remainder
- Question 151 - PCA, Inertia, and Silhouette Plots
- Question 152 - Clustering function
- Question 153 - Calculating quarterly moving average of revenue in SQL
- Question 154 - MSE vs RMSE
- Question 155 - Calculating max distance between array elements in Python
- Question 156 - World Leader Exits
- Question 157 - Company acquisitions
- Question 158 - Ads in a newsfeed
- Question 159 - Monotonic array
- Question 160 - Leader life expectancy
- Question 161 - A hotel chain's loyal customers
- Question 162 - Overfitting vs. underfitting
- Question 163 - Counting the distance between words in Python
- Question 164 - Cereal ratings
- Question 165 - College interview results
- Question 166 - Cereal ratings: top 3 variables
- Question 167 - Identity Matrix