ctbk.dev Citi Bike Dashboard
- ctbk/ contains a Python library and CLI (ctbk) that derives various datasets from Citi Bike's public data at s3://tripdata
- s3://ctbk contains cleaned, public data output by ctbk
- www/ contains the static web app served at ctbk.dev.
- GitHub Actions in .github/workflows:
  - poll for new Citi Bike data at the start of each month,
  - compute new derived data when found, and
  - build and publish the ctbk.dev website
ctbk.dev?d=1406-2102&g=mf&pct&s=g&y=m:
(Gender labels stopped appearing in the data in February 2021)
This is a work-in-progress; red = newer, yellow = older:
You can get some interesting upper bounds on e-bike fee revenue from this:
That doesn't count the various reasons e-bike minutes end up being free (positive Bike Angel points for the ride, no classic bikes available at the station, etc.). With the discussion around recent price increases, you can do some back-of-the-envelope math like:
- Suppose an e-bike costs $1000 (probably a low-ball estimate)
- Suppose 10 rides per e-bike per day, averaging 15 minutes per ride ⟹ $30 in e-bike fees per e-bike per day ⟹ 1-2 months to break even on each e-bike, before accounting for any operational cost of maintaining the e-bike fleet. Most likely the bikes don't break even for years, I'd guess…
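The envelope math above can be written out as a quick sketch. All inputs are the guesses from the bullets; the per-minute rate is not stated anywhere above, so it is backed out from the $30/day figure rather than taken from a price list, and the variable names are just for illustration.

```python
# Rough inputs, copied from the bullets above (guesses, not measured values)
EBIKE_COST_USD = 1000            # assumed purchase cost per e-bike
RIDES_PER_DAY = 10               # assumed rides per e-bike per day
AVG_MINUTES_PER_RIDE = 15        # assumed average ride length
FEES_PER_BIKE_PER_DAY_USD = 30   # daily e-bike fee estimate from the text

# Total paid e-bike minutes per bike per day
minutes_per_day = RIDES_PER_DAY * AVG_MINUTES_PER_RIDE

# Per-minute fee implied by the $30/day figure (derived, not an official rate)
implied_fee_per_minute = FEES_PER_BIKE_PER_DAY_USD / minutes_per_day

# Days of fees needed to cover the purchase cost, ignoring all operating costs
break_even_days = EBIKE_COST_USD / FEES_PER_BIKE_PER_DAY_USD

print(f"{minutes_per_day} e-bike minutes per bike per day")
print(f"implied fee: ${implied_fee_per_minute:.2f} per minute")
print(f"break-even after about {break_even_days:.0f} days")
```

With these inputs the break-even lands a little over a month, which is the low end of the 1-2 month range; changing any one guess (bike cost, rides per day, free minutes) moves it quickly.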
This isn't live anywhere yet, but:
The peak of the distribution (rounded down to the nearest 10s) is 4m20s. See notebook here.
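For reference, here is a minimal sketch of what "rounded down to the nearest 10s" means when finding a peak, assuming the distribution is over durations measured in seconds. The helper names and the sample values are made up for illustration (the sample is chosen to land in the same bin as above); the linked notebook may compute this differently.

```python
from collections import Counter


def peak_bin(durations_s, bin_s=10):
    """Floor each duration to a multiple of bin_s; return the most common bin start."""
    bins = Counter((int(d) // bin_s) * bin_s for d in durations_s)
    return bins.most_common(1)[0][0]


def fmt(seconds):
    """Format a number of seconds like '4m20s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m{secs:02d}s"


# Made-up durations in seconds, NOT real Citi Bike data
sample = [255, 261, 263, 268, 300, 412]

peak = peak_bin(sample)
print(fmt(peak))
```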
Many great analyses of Citi Bike data have been done over the years!
My hope is that this dashboard will:
- stay up to date automatically
- support enough exploratory data analysis and visualization to answer most questions a layperson might have about system-wide stats
Feel free to file an issue here with any comments, bug reports, or feedback!