How to prepare for Data Science Retreat

You probably have an interest in data, some coding experience with R or Python, and a few online Data Science courses behind you. All of this is great preparation for a career change and for participating in Data Science Retreat. But is there anything specific you can do in the weeks before you start?

We have one basic requirement and three suggestions for you. The requirement is that you set up your machine. As operating systems go, Mac and Linux are still much easier to use, and are therefore preferred. It is essential that the latest version of Python is installed, that you have tried it out, and that it is ready to go.
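One quick way to confirm that Python is set up is to run a short script that reports the interpreter version and operating system. The sketch below is only an illustration: the `MIN_VERSION` value and the `check_python` helper are assumptions made for this example, not an official DSR requirement.

```python
import platform
import sys

# Hypothetical minimum version, chosen for illustration only.
# Replace it with whatever version the course currently recommends.
MIN_VERSION = (3, 8)


def check_python(min_version=MIN_VERSION):
    """Return True if the running interpreter is at least min_version."""
    return sys.version_info[:2] >= tuple(min_version)


if __name__ == "__main__":
    print("Python", platform.python_version(), "on", platform.system())
    if check_python():
        print("Python looks ready to go.")
    else:
        print("Please install a newer Python version.")
```

Save it as a file of your choice and run it from the command line with the interpreter you plan to use during the course.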

The three suggestions we have are:

Math refresher for Data Science models
Practical computer science for Data Science deployment
Preparing for the Data Science project

Math refresher for Data Science models

Books

Linear algebra connected with computational techniques: No bullshit guide to linear algebra
Linear algebra and statistics applied in Python: linear_algebra_for_machine_learning

Online courses

Statistics and probability in Khan Academy
Linear Algebra in Khan Academy

Practical computer science for Data Science deployment

Command Line basics: Linux Command Line CheatSheet
Git: Everyday Git
Algorithms: algorithms in Khan Academy

Preparing for the Data Science project

Data Science Retreat asks participants to carry out one major data-driven project as a group effort. The objective is to build or improve a product prototype. Teams of four will invest 1,000 hours of work in the project. This equates to 200 to 250 hours per group member, or 25 hours per week. Teams are selected in the first few weeks, and an interim assessment follows in Week 8 at the latest. To get ready for the Data Science project, consider contributing one project proposal to the DSR pool. All proposals are evaluated for originality, feasibility, and methodological soundness across the following four dimensions:

Data
1. Sources: Describe one or more possible data sources, e.g. data type, quantity, volume and so on.
2. Availability: Is the data available? If so, how, e.g. scraping, database, API?
3. Accessibility: Is the data actually accessible to you (us)?
4. Quality: What is the quality of the data? How much pre-processing needs to be done?
Problem and Solution
1. What is the problem to which you want to provide a solution through Data Science?
2. Can you sketch the problem from the point-of-view of a business or organization?
3. What would you consider the 2-3 essential features of the solution?
Use case
1. Can you sketch the problem from the point-of-view of the user?
2. If you provide the solution, who would be your first and/or best user?
Approach
1. Which ML or DL approach do you recommend to provide the Data Science solution?
2. Which alternative approaches would you suggest as a comparison for your evaluation metrics and outcomes?
3. Are you aware of tools or packages that could help you develop your project?
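
For the Approach dimension, it can help to show how a proposed method would be compared against a simpler alternative on the same evaluation metric. The sketch below is a minimal illustration using only the Python standard library. The toy data, the `threshold_rule` stand-in "model", and the choice of accuracy as the metric are all assumptions made up for this example; a real proposal would use its own data, models, and metrics.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most frequent training label for each of n samples."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n


def threshold_rule(xs, threshold):
    """Toy stand-in for a model: predict 1 when the feature exceeds threshold."""
    return [1 if x > threshold else 0 for x in xs]


# Made-up toy data, for illustration only.
x_train, y_train = [0.1, 0.4, 0.35, 0.8, 0.9], [0, 0, 0, 1, 1]
x_test, y_test = [0.2, 0.7, 0.95, 0.3], [0, 1, 1, 0]

# Evaluate both approaches on the same held-out data with the same metric.
baseline_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))
model_acc = accuracy(y_test, threshold_rule(x_test, threshold=0.5))

print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"threshold-rule accuracy: {model_acc:.2f}")
```

Reporting a trivial baseline next to the proposed approach makes it easier to judge whether the more complex method is worth the effort.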
