(split off from main readme to reduce clutter)
- fully integrate with the functional competency map, and replace this syllabus with that when it is ready
- split up the syllabus and the curriculum for this package (this makes it easier to look for new and better
materials, and to make updates to the curriculum, while at the same time making sure the scope of the training package
doesn't drift too much as stuff is added or removed)
- the syllabus notes down things that need to be learned and outcomes that should be achieved
- the curriculum collates and orders the material needed to cover the syllabus
- cover deep learning in the ML module in a bit more depth
- update ASR
- notes on timeseries data
- fb prophet
- changepoint detection
- model / data drift, and how to detect it
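The changepoint bullet above can be sketched with a minimal two-sided CUSUM detector. This is a hedged illustration in plain Python, not fb prophet; the `baseline_n`, `threshold`, and `drift` parameters are invented for the example and would need tuning on real data:

```python
def cusum_changepoint(series, baseline_n=10, threshold=5.0, drift=0.5):
    """Two-sided CUSUM: return the first index where the cumulative
    deviation from a baseline mean exceeds the threshold, else None."""
    mean = sum(series[:baseline_n]) / baseline_n
    pos = neg = 0.0  # cumulative positive / negative excursions
    for i, x in enumerate(series):
        pos = max(0.0, pos + (x - mean) - drift)  # drift term absorbs noise
        neg = max(0.0, neg + (mean - x) - drift)
        if pos > threshold or neg > threshold:
            return i
    return None

# mean jumps from 0 to 3 at index 20; detection lags by a couple of points
data = [0.0] * 20 + [3.0] * 20
print(cusum_changepoint(data))  # -> 22
```

The same accumulate-and-threshold idea is one simple way to flag data drift in a monitored feature, though real drift detection usually compares distributions rather than means.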
- how to split training and test sets (and how NOT to)
- split by time (or in some cases, some other time-dependent variable)
- splitting randomly leaks label information from the test period into the training set
- remember that near-100% results are suspicious
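The split-by-time point above can be sketched in plain Python. The row structure and `time_key` name are hypothetical, just to show the shape of the idea:

```python
# Split by time: everything before the cutoff trains, everything after tests.
# A random shuffle would put future rows into the training set and leak labels.
def time_split(rows, cutoff, time_key="timestamp"):
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test

rows = [{"timestamp": t, "label": t % 2} for t in range(10)]
train, test = time_split(rows, cutoff=7)
print(len(train), len(test))  # -> 7 3
```

In practice you'd split on whatever time-dependent variable drives the leakage (session, patient, deployment date), not necessarily the raw timestamp.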
- evaluating your results "by eye"
- is the accuracy weird
- is it too good? (98-100%, though 95+ is already where you'd get suspicious) -> leaked labels, or a too-easy problem
- is it too bad? -> low-quality labels or test set?
- rules of thumb
- would the opposite finding be surprising?
- if you were told that some other model predicted the opposite (or a different) result, would it be believable?
- if you were instructed to come up with the opposite result (given your current data), would you be able to do so easily?
- https://github.com/ahmedbahaaeldin/From-0-to-Research-Scientist-resources-guide
- https://www.scribbr.com/methodology/research-design/
- deep learning specialization course (free to audit)
- google ML crash course
- microsoft ML course
- CRISP-DM (CRoss-Industry Standard Process for Data Mining)
- seems a bit buzzwordy, but the workflow is generally valid
- glossary
- rules for ML
- how to work with users
- technical debt in ML
- wizard of oz models
- see sidebar for titanic walkthrough
- 10 rules for better Jupyter notebooks
- Elements of Statistical Learning (book)
- Data Science and Machine Learning: Mathematical and Statistical Methods
- More Unicode
- http://reedbeta.com/blog/programmers-intro-to-unicode/
- see also grapheme, a library for working with what you probably think of as unicode characters
- see also wcswidth, which gives you the display length of a string, counting CJK characters twice since those are double-wide
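To make the grapheme point concrete without third-party packages, here's a stdlib-only sketch of why len() doesn't count what users see as characters (the grapheme library handles the full segmentation rules; unicodedata only covers normalization):

```python
import unicodedata

composed = "\u00e9"      # é as one precomposed code point
decomposed = "e\u0301"   # e + combining acute accent: renders identically
print(len(composed), len(decomposed))  # -> 1 2

# NFC normalization folds the pair back into the precomposed form
assert unicodedata.normalize("NFC", decomposed) == composed
```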
- see also 'words of estimative probability' for an example of how categories may be only semi-ordinal
- https://www.rawgraphs.io/gallery
- https://flourish.studio/examples/
- https://www.storytellingwithdata.com/chart-guide
- https://seaborn.pydata.org/examples/index.html
- https://plotly.com/python/plotly-fundamentals/
- https://www.reddit.com/r/dataisbeautiful/top/?t=all
- https://www.reddit.com/r/Infographics/top/?t=all
- https://informationisbeautiful.net/
- https://design.google/library/exploring-color-google-maps/
- https://blog.datawrapper.de/colors-for-data-vis-style-guides/