Pre-Read Material:
Python
Linux
Docker Fundamentals & Installation- https://www.youtube.com/watch?v=jPdIRX6q4jA&list=PLy7NrYWoggjzfAHlUusx2wuDwfCrmJYcs
Github fundamentals: https://youtu.be/8JJ101D3knE
Quick setup:
- 1. Windows with GitHub and 2. Linux with GitHub (install Git on Windows first: https://git-scm.com/)
# open Git Bash
git config --global user.name "prabh8331"
git config --global user.email "prabh8331@gmail.com"
# SSH setup for Git and GitHub
cd ~/.ssh
ssh-keygen -t ed25519 -C "prabh8331@gmail.com"
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519
cat id_ed25519.pub   # copy the output
# go to GitHub > Settings > SSH and GPG keys > New SSH key > paste the key
ssh -T git@github.com   # verify the connection
- Windows with Linux (SSH from Windows into an Ubuntu server)
# Windows-to-Ubuntu SSH setup
# -- Ubuntu server part
cd ~/.ssh
ssh-keygen -t ed25519   # when prompted, name the key "windows"
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/windows
cat windows.pub >> authorized_keys   # or copy the public key and paste it into authorized_keys with nano
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys   # sshd rejects keys with looser permissions
# Windows part
# copy the "windows" private key to C:\Users\Komalpreet Kaur\.ssh\
# then edit the SSH config file (C:\Users\Komalpreet Kaur\.ssh\config), e.g. in VS Code:
Host ubuntu_server
    HostName 192.168.1.111
    User userver
    IdentityFile "C:\Users\Komalpreet Kaur\.ssh\windows"
# connect with: ssh ubuntu_server
MySQL Workbench setup on Windows: https://youtu.be/8JJ101D3knE
Create a GCP account: https://www.youtube.com/watch?v=m5hwU0jD0qc
What interview questions others are getting:
- Basic DSA is required (LeetCode)
- Incremental data refresh in Snowflake and Databricks
- SQL: window functions, common table expressions (iterative and recursive CTEs), how to create functions and stored procedures; know the theory of views and indexing
- Kubernetes is more part of DevOps
- NoSQL doesn't support ACID properties (where we would want consistency); NoSQL is best for high traffic, scalability, parallel and analytical queries
- Python: pandas (NumPy not required); DSA (LeetCode, 2 questions every day); system design is not needed, but the basics of data pipelines are; Scala
- Tokenization in Oracle Streams?
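A zero-setup way to practice the SQL topics above (recursive CTEs, window functions) is Python's built-in sqlite3 module; SQLite has supported both since version 3.25, and the syntax is close to Snowflake/Databricks SQL. A minimal sketch (the numbers table here is made up for illustration):

```python
import sqlite3

# In-memory database, nothing to install or configure.
conn = sqlite3.connect(":memory:")

rows = conn.execute("""
    WITH RECURSIVE nums(n) AS (          -- recursive CTE: generate 1..5
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM nums WHERE n < 5
    )
    SELECT n,
           SUM(n) OVER (ORDER BY n)      -- window function: running total
    FROM nums
""").fetchall()

print(rows)  # [(1, 1), (2, 3), (3, 6), (4, 10), (5, 15)]
```

The same query pattern (generate or load rows in a CTE, then aggregate with an OVER clause) carries over to the warehouse engines mentioned above.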
How to practice SQL
- LeetCode
- search for case studies on GitHub
Azure Fabric
DP-203 certificate
After the course, can cover the DevOps part
Data modeling and data warehousing, data lakes, Iceberg, Hudi, Kubernetes, DevOps, etc.
In interviews they ask about the ETL and data-processing parts with respect to Databricks
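The ETL and incremental-refresh ideas above can be sketched with only the standard library: extract rows, transform them, then upsert into a table keyed by id, so re-running the pipeline refreshes changed rows instead of duplicating them (the table, column names, and sample rows here are made up; real pipelines would use Databricks/Spark, but the extract-transform-load shape is the same):

```python
import sqlite3

def run_pipeline(conn, source_rows):
    """Tiny ETL sketch: extract -> transform -> incremental load (upsert)."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL)")
    # Transform: drop invalid (non-positive) amounts.
    cleaned = [(r["id"], r["amount"]) for r in source_rows if r["amount"] > 0]
    # Incremental load: upsert on the primary key, so reruns update rather than duplicate.
    conn.executemany(
        "INSERT INTO sales (id, amount) VALUES (?, ?) "
        "ON CONFLICT(id) DO UPDATE SET amount = excluded.amount",
        cleaned,
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
run_pipeline(conn, [{"id": 1, "amount": 10.5}, {"id": 2, "amount": -3.0}])
run_pipeline(conn, [{"id": 1, "amount": 12.5}])   # refresh: id 1 is updated, not duplicated
print(conn.execute("SELECT id, amount FROM sales ORDER BY id").fetchall())  # [(1, 12.5)]
```

The upsert-on-key step is the core of an incremental refresh: only changed or new source rows touch the target table, which is the same idea behind MERGE INTO in Snowflake and Databricks.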