Welcome to the GitHub repository for the Gravitational Wave Data Exploration Bootcamp Series! This course is designed to provide a solid foundation in programming, operational knowledge, and data-driven modeling skills centered on gravitational wave data analysis and research.
- Equip participants with robust programming and operational skills, and foundational training in data-driven modeling, focusing on gravitational wave data analysis and related research areas.
- Note: The course is conducted entirely in Mandarin Chinese to cater to a wide range of Chinese-speaking students and researchers.
- Discuss the common research methodologies combining gravitational wave data processing with AI technologies, with hands-on examples and projects for practical understanding and mastery.
- Analyze cutting-edge deep learning models and apply them to real-world gravitational wave data analysis problems through specific case studies.
- Undergraduate and graduate students interested in data analysis and algorithm development, especially those focusing on gravitational wave data processing and related research.
- Undergraduates with a basic programming background who want to strengthen their data analysis skills, or who are interested in gravitational wave data processing, are also welcome.
- Future professionals aspiring to work in space-based gravitational wave detection projects and related research fields.
- Drawing from past teaching experiences and identified knowledge gaps in student research projects, the course introduces relevant concepts and common methods to ensure comprehensive understanding and application in research.
- The course runs weekly or bi-weekly, with each session lasting about 3 hours. Sessions combine in-person teaching with online participation via Tencent Meeting (腾讯会议) to keep them interactive and practical.
- The curriculum is expected to be offered once per semester or annually, with continual updates and enrichment based on student feedback and research demands.
Part Zero: Motivational Introduction
Description
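- Motivation for the course and participant profile - Instructor introduction - Knowledge framework relevant to this course - Gravitational wave data analysis - Course syllabus - What this course is and is not - Learning approach and teaching team - Assessment rules and project assignments - The road to self-realization - How to self-study - How to ask questions - Q&A session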
- Date:2023/11/08 | Video recording | Slide: PDF or online
Part One: Programming Development Environment and Workflow
Description
- Linux Commands and Shell Scripting - Git Version Control (GitHub / GitLab) - SSH Remote Server Access (Shell / VSCode) - Containerization with Docker - Hands-On: Setting up Python / Jupyter Development Environment - Hands-On: Compiling LALsuite / LISAcode Source Code
- Date:2023/11/12 | Video recording | Slide: PDF or online
- Homework
- Docker Container Setup
- Remote Development for Python/Jupyter with GPU Support and LALsuite/LISAcode Compilation
- Date:2023/11/19 | Video recording | Slide: PDF or online
- Homework
- Introduction to Git and GitHub Workflow
- Setting Up and Submitting Homework via GitHub
Tech Talk: It's all about data (Guest Lecture by Xinyao Tian)
Description
- The origin of data
- What is data?
- The development momentum behind modern data technologies
- Current mainstream data technologies
  - Relational databases (RDBMS)
  - Non-relational databases (NoSQL, "Not-only SQL")
  - Big Data
  - Data Warehouse
  - Stream Processing
  - Data Lake
  - Data Lakehouse
- Reflection: understanding the world from a data perspective
- Recommended reading
- Q & A
- Date:2023/11/19 | Video recording | Slide: markdown
Part Two: Python-Based Data Analysis Fundamentals
Description
- Introduction to Python Programming - Algorithms with Numpy / Pandas / Scipy - Hands-On: Exploratory Data Analysis of GW Event Catalog / Glitch Data - Hands-On: Matched Filtering for GW150914 Data - Data Visualization in Python: Theory and Practice - Hands-On: Reproducing Figures from GWTC Papers
- Date:2023/11/29 | Video recording | Slide: PDF or online
- Date:2023/12/01 | Video recording | Slide: PDF or online
- Date:2023/12/03 | Video recording | Slide: PDF or online
- Homework
- Python, Numpy, Pandas Basic Homework
- Leetcode Extension Tasks
- Date:2023/12/08 | Video recording | Slide: PDF or online
- Date:2023/12/10 | Video recording
- Homework
- Data visualization and analysis using Python libraries, matplotlib and seaborn.
- Homework | Video recording
- Recreate Figure 7 from the GWTC-3 paper using numpy, pandas, matplotlib, and seaborn.
- Date:2023/12/15 | Video recording | ipynb | html
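The matched-filtering hands-on listed in the Part Two description can be illustrated with a toy example. The sketch below is not the course notebook: it assumes white Gaussian noise with a known standard deviation and a made-up chirp-like template, so no whitening or PSD weighting is needed. Real detector data such as GW150914 has coloured noise and requires those extra steps. The function name `matched_filter_snr` and all numerical values are hypothetical.

```python
import numpy as np


def matched_filter_snr(data, template, sigma):
    """Return the matched-filter SNR time series for white noise of std `sigma`."""
    # Slide the template over the data and take the inner product at each offset.
    corr = np.correlate(data, template, mode="valid")
    # Normalise so that the output has unit variance when the data is pure noise.
    norm = sigma * np.sqrt(np.sum(template ** 2))
    return corr / norm


rng = np.random.default_rng(seed=0)

# Toy template (assumption): a short windowed chirp sweeping upward in frequency.
fs = 1024                                  # sampling rate in Hz
t = np.arange(0, 0.25, 1 / fs)             # 256 samples
template = np.sin(2 * np.pi * (50 * t + 200 * t ** 2)) * np.hanning(t.size)

# Simulated data: white noise with the template injected at a known index.
sigma = 1.0
data = rng.normal(0.0, sigma, 4096)
inject_at = 1500
data[inject_at:inject_at + template.size] += 2.0 * template

snr = matched_filter_snr(data, template, sigma)
peak = int(np.argmax(snr))
print(f"injected at sample {inject_at}, SNR peak at sample {peak}, "
      f"peak SNR = {snr[peak]:.1f}")
```

With `mode="valid"`, index `k` of the output corresponds to the template starting at sample `k` of the data, so the peak location can be read directly as the estimated arrival sample.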
Sci Talk: Bayesian inference for gravitational-wave science (Guest Lecture by Junjie Zhao)
Description
- Brief introduction to gravitational waves
- Part I: Bayesian inference
  - Definition of "probability"
  - Rethinking the interpretations of probability
  - Frequentist statistics
  - Bayesian statistics
  - Bayes' theorem
  - Application to the detection of gravitational waves
  - Bayesian inference framework
  - Parameter estimation for gravitational-wave data
  - Model selection for gravitational-wave data
  - Q & A
- Part II: Bayesian computation
  - Markov Chain Monte Carlo (MCMC)
  - Hands-on: tiny MCMC example
  - Nested sampling
  - Hands-on: tiny nested-sampling example
- Part III: All in gravitational-wave data
  - Using Bilby & Parallel Bilby in GW data analysis
  - Show the complete pipeline for the data analysis
- The AMAZING Thomas Bayes ("A blessing of Bayes' theorem on this wonderful world")
- Q & A
- Date:2023/12/16 | Video recording | Slide: PDF
- Date:2023/12/17 | Video recording | Slide: PDF
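As an illustration of the kind of "tiny MCMC example" named in Part II of this talk, here is a minimal Metropolis sampler written with the Python standard library only. It is a sketch, not the lecturer's code: the problem (estimating the mean of Gaussian data with known unit variance and a flat prior), the function names, and the step size are all assumptions chosen for brevity.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log-posterior of the mean `mu` (flat prior), up to an additive constant."""
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(log_prob, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler with a symmetric Gaussian proposal."""
    chain = [start]
    current, current_lp = start, log_prob(start)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


rng = random.Random(42)
true_mu = 3.0
data = [rng.gauss(true_mu, 1.0) for _ in range(100)]

chain = metropolis(lambda mu: log_posterior(mu, data),
                   start=0.0, n_steps=5000, step_size=0.3, rng=rng)
samples = chain[1000:]                      # discard burn-in
posterior_mean = sum(samples) / len(samples)
sample_mean = sum(data) / len(data)
print(f"posterior mean = {posterior_mean:.3f}, sample mean = {sample_mean:.3f}")
```

For this model the exact posterior is Gaussian and centred on the sample mean, which gives a simple sanity check on the chain. Production analyses would normally use a tested sampler, for example through Bilby as covered in Part III.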
Part Three: Basics of Machine Learning
Description
- Overview of Artificial Intelligence - Definitions, Objectives, and Types of Machine Learning - Machine Learning Project Development and Preparation - Hands-On: Clustering Analysis of LIGO's Glitch Data
- Date:2023/12/22 | Video recording | Slide: PDF or online
- Homework
- Implement a classification model for credit scoring using the sklearn library in Python.
- Date:2023/12/24 | Video recording | Slide: PDF or online
- Homework
- Model Evaluation and Hyperparameter Tuning for a Credit Scoring Dataset.
Part Four: Introduction to Deep Learning
Description
- Overview of Deep Learning Technologies - Fundamentals of Artificial Neural Networks (ANN) - Convolutional Neural Networks (CNN) - Hands-On: Identifying Gravitational Waves from Binary Black Hole Systems using CNN - Frontiers of Gravitational Wave Data Analysis and AI
- Date:2023/12/27 | Video recording | Slide: PDF or online
- Date:2023/12/29 | Video recording | Slide: PDF or online
Tech Talk: AI Revolution: From Concept to GPT Breakthroughs (Guest Lecture by Minquan Gao)
Description
1. Why AI Was Proposed:
   - Exploring the historical context and reasoning behind the emergence of AI.
   - Initial challenges and needs that AI aimed to address.
2. Earliest Form of AI and Solutions:
   - Description of the first AI systems, such as simple computational machines.
   - Early AI applications and the problems they solved.
3. Similarities between AI and Physics Methodologies:
   - Comparing the theoretical frameworks and approaches used in both fields.
   - Identifying shared principles and methods.
4. From Symbolic Systems to Machine Learning:
   - Evolution of AI from early symbolic and numeric systems.
   - The transition to probabilistic and statistical methods.
   - The development of machine learning technologies.
5. Principles of Deep Learning:
   - Understanding the core concepts behind deep learning.
   - The architecture of neural networks and their functionality.
6. Breakthroughs Brought by Deep Learning:
   - Identifying key advancements and innovations due to deep learning.
   - Impact of deep learning on various AI applications.
7. Typical Deep Learning Scenarios:
   - Examples of deep learning applications in real-world scenarios.
   - Discussion of its effectiveness and adaptability.
8. Pre-trained Models and Large Models:
   - The role and significance of pre-trained models in AI.
   - Characteristics and implications of large-scale AI models.
9. Principles of GPT:
   - Explaining the foundational concepts of Generative Pre-trained Transformers.
   - Discussing its applications and impact.
10. Breakthroughs in AIGC (AI Generated Content):
    - Overview of advancements in AI-generated content.
    - Examples and implications of these breakthroughs.
11. Current Challenges in AI:
    - Discussing ethical, technical, and practical problems in AI.
    - Examination of ongoing debates and concerns in the field.
12. Frontiers of AI Research:
    - Exploring cutting-edge research and future directions in AI.
    - Innovations and potential developments on the horizon.
- Date:2023/12/31 | Video recording
Final Part: End-of-Camp Ceremony and Wrap-Up
Description
- Acknowledgements - Training Camp Course Review and Summary - Homework Completion & Competition Rankings - Awards Ceremony - Video Recording Sharing + Bilibili Channel Launch + Remember to Star the Course - Welcoming More Feedback and Suggestions
- Date:2024/01/14 | Video recording | Slide: PDF or online
Homepage: https://www.kaggle.com/competitions/2023-gwdata-bootcamp/
- Welcome to the final challenge of the Gravitational Wave Data Exploration: A Practical Training in Programming and Analysis (2023) - "Can you find the GW signals?" Kaggle Data Science Competition (Hackathon)!
- This competition is designed to apply the knowledge and skills you've learned throughout the course, focusing on gravitational wave data analysis and research.
- The objective of this competition is to develop a model that can accurately identify gravitational wave signals from the provided dataset.
- You will be given a dataset containing a mix of noise and gravitational wave signals. Your task is to develop a model that can accurately distinguish between the two.
- This competition will start at 10:00 PM (Beijing Time) on December 29, 2023, and end at 11:59 PM (Beijing Time) on January 6, 2024.
- Please make sure to submit your solutions before the deadline.
- Good luck and may the best team win!
- data_prep_bbh.py - script for data generation (credit: Dr. Hunter Gabbard)
- utils.py - supplemental script containing some useful functions
- main.py - main script for training / evaluation / submission
- test.npy - test data for submission (You can load the test data in the Kaggle notebook)
Anyway, just check the baseline notebook for everything!
- The complete assignment results can be viewed in the submissions committed by students with a total score of 6 or 7.
Total Score | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
---|---|---|---|---|---|---|---|
Frequency | 4 | 5 | 6 | 10 | 7 | 23 | 8 |
Top Percentage Ranking | | | | | | | |
高远坤 7️⃣ ✅️ |
郭印达 7️⃣ ✅️ |
刘守潘 7️⃣ ✅️ |
张徐蔚 7️⃣ ✅️ |
汤丰杰 7️⃣ ✅️ |
苏鸿 7️⃣ ✅️ |
李炳辰 7️⃣ ✅️ |
郝赵 7️⃣ ✅️ |
黄震洋 6️⃣ ✅️ |
范钧 6️⃣ ✅️ |
邹增慧 6️⃣ ✅️ |
蒙晓锋 6️⃣ ✅️ |
刘世睿 6️⃣ ✅️ |
孟德双 6️⃣ ✅️ |
董玉豪 6️⃣ ✅️ |
王尊 6️⃣ ✅️ |
薛亚东 6️⃣ ✅️ |
刘冉 6️⃣ ✅️ |
单磊磊 6️⃣ ✅️ |
邹靓 6️⃣ ✅️ |
沈萍 6️⃣ ✅️ |
韩佩佳 6️⃣ ✅️ |
吉祥 6️⃣ ✅️ |
张嘉宝 6️⃣ ✅️ |
潘洋 6️⃣ ✅️ |
周子力 6️⃣ ✅️ |
邱智翀 6️⃣ ✅️ |
李倾城 6️⃣ ✅️ |
何禹成 6️⃣ ✅️ |
王天龙 6️⃣ ✅️ |
汪一凡 6️⃣ ✅️ |
Rank | Team | Members | Score | Rank | Team | Members | Score |
---|---|---|---|---|---|---|---|
1 | XAO | 黄震洋, 张徐蔚 | 0.86173 | 16 | Yuanhao Zhang | 张渊皞 | 0.83317 |
2 | UCAS Li Jiahao | 李嘉豪 | 0.86160 | 17 | B4rRY_G | 郭意扬, 赖景祺 | 0.82729 |
3 | UCAS_212x2 | 刘洋毓 | 0.86157 | 18 | Shao dong zhao | 赵少东 | 0.82723 |
4 | sophiainshao | 沈萍 | 0.86145 | 19 | Sparkle79 | | 0.82595 |
5 | Haihao SHI | 史海浩 | 0.85954 | 20 | Shoupan Liu | 刘守潘 | 0.82526 |
6 | Yinda Guo | 郭印达 | 0.85832 | 21 | Capoo Cat | 孙文博 | 0.82387 |
7 | deslenlir | 温怡蓉 | 0.85753 | 22 | Tian_Jun | 田军 | 0.82204 |
8 | MengXiaofeng-UCAS | 蒙晓锋 | 0.85188 | 23 | Zhao_Hao | 郝赵 | 0.82047 |
9 | 1500!!! | 刘冉, 刘世睿, 王天龙 | 0.84868 | 24 | douking | 王尊 | 0.81871 |
10 | Qinglin Yan | 王霆澜, 闫庆琳 | 0.84530 | 25 | JunFan | 范钧 | 0.8141 |
11 | Zhiqing Zhu | 朱智清 | 0.84527 | 26 | Phi267 | 秦戈宇 | 0.81304 |
12 | knnbenn | | 0.84023 | 27 | tastonlyjust | | 0.81038 |
13 | HIAS | 苏鸿, 张景瑞, 汤丰杰 | 0.84023 | 28 | junda zhou | 周均达 | 0.80846 |
14 | HanPeijia | 韩佩佳 | 0.84003 | 29 | SCU_CTP | 曹旭, 高鸿飞, 李志威 | 0.80705 |
15 | Zenghui Zou | 邹增慧, 李倾城 | 0.83883 | 30 | DESHUANGMeng | 孟德双 | 0.80462 |
Welcome to the course project! To get started with your programming assignments, you'll need to set up your workspace. Here's a step-by-step guide to help you through the process.
Step 1: Set Up Your GitHub Account and Fork the Repository
- Create a GitHub Account: If you don't already have a GitHub account, go to GitHub and sign up.
- Fork the Course Repository:
- Navigate to the course's GitHub repository: GWData-Bootcamp.
- Click on the Fork button at the top right of the page.
- In the fork settings, make sure to uncheck the option 'copy main branch only'.
- Clone the Forked Repository:
- Open your terminal or Git Bash.
- Clone the forked repository to your local machine using the following command:
git clone git@github.com:<YourGitHubUsername>/GWData-Bootcamp.git
- Replace <YourGitHubUsername> with your actual GitHub username.
Step 2: Set Up Your Local Workspace
- Switch to the homework Branch:
- Navigate to your cloned repository's directory:
cd GWData-Bootcamp
- Switch to the homework branch using:
git switch homework
- Create Your Personal Homework Directory:
- Inside the GWData-Bootcamp directory, create a new directory path for your homework submissions:
mkdir -p 2023/homework/<YourName>
- Replace <YourName> with your name or a unique identifier.
Step 3: Submitting Your Homework
- Complete Your Assignments:
- Add your completed assignments to your personal homework directory that you created in the previous step.
- The assignments should be named python_submit.txt, numpy_submit.txt, or pandas_submit.txt, depending on the type of assignment.
- Push Your Changes:
- Stage and commit your changes. For example:
git add .
git commit -m "Add homework for <SpecificHomework>"
- Push your homework branch to your forked repository:
git push origin homework
- Create a Pull Request:
- Go to your forked repository on GitHub.
- Switch to the homework branch.
- Click on New Pull Request.
- Ensure the base repository is set to the original GWData-Bootcamp repository and the base branch is set to homework.
- Complete the PR form and submit.
- The GitHub Actions workflow will automatically check your submission (Homework) and compare it with the solution. If your submission passes the check, a merge request will be initiated. Please note that only the repository owners have the authority to merge the request.
- Do Not Modify Other Students' Work: It's crucial that you do not make any changes to other students' homework directories and contents.
- Regular Updates: Keep your fork synchronized with the main repository to get the latest updates and assignments.
- Automated Checks: The GitHub Actions workflow will automatically check your submission when you create a pull request. If the check fails, fix your submission and push to your homework branch again.
- Happy Coding! 🚀👩‍💻👨‍💻
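Before pushing, it can help to confirm that your files sit in the expected directory with the expected names. The following is a minimal local sketch based only on the path and file names given in the steps above. It is not the GitHub Actions check used by the repository, and the helper name `check_homework_dir` and the student name `alice` are hypothetical.

```python
import tempfile
from pathlib import Path

# File names taken from the submission instructions above.
EXPECTED_FILES = ("python_submit.txt", "numpy_submit.txt", "pandas_submit.txt")


def check_homework_dir(repo_root, student_name):
    """List which expected submission files exist under 2023/homework/<student_name>."""
    homework_dir = Path(repo_root) / "2023" / "homework" / student_name
    present = [name for name in EXPECTED_FILES if (homework_dir / name).is_file()]
    not_yet = [name for name in EXPECTED_FILES if name not in present]
    return present, not_yet


# Demonstration on a throwaway directory that mimics the repository layout.
with tempfile.TemporaryDirectory() as tmp:
    hw = Path(tmp) / "2023" / "homework" / "alice"
    hw.mkdir(parents=True)
    (hw / "python_submit.txt").write_text("my answers\n")
    present, not_yet = check_homework_dir(tmp, "alice")

print("present:", present)
print("not yet submitted:", not_yet)
```

To use it on a real clone, call `check_homework_dir(".", "<YourName>")` from inside the GWData-Bootcamp directory. A file listed as not yet submitted is not necessarily an error, since each file corresponds to a different assignment.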
This class is co-taught by He Wang and several colleagues, including guest lecturer Junjie Zhao and industry experts Xinyao Tian and Minquan Gao. Further names will be announced as they join.
For any inquiries regarding the course, please email us at 📧 taiji@ucas.ac.cn.
We look forward to your participation and contribution to this exciting field of study!
We welcome contributions to enhance course materials. Please fork the repository, make your changes, and submit a pull request.
- University of Chinese Academy of Sciences (UCAS)
- International Centre for Theoretical Physics Asia-Pacific (ICTP-AP)
- Taiji Laboratory for Gravitational Wave Universe
This project is licensed under the MIT License - see the LICENSE file for details.
- Contributions from the Gravitational Wave Open Science Center (gwosc.org).
- Educational resources and datasets from renowned institutions and projects in the field.