Study Notes Index (knowledge base structure)


🚧 ONGOING CONSTRUCTION 🚧

1 Statistics

1.1 General topics

1.2 Survival analysis

1.3 Bayesian Statistics

1.4 Missing data and imputation methods

1.5 Causal inference (Propensity score methods)

1.5.1 Use of external control data in RCT

1.6 Meta-analysis methods

1.7 Regression methods

1.9 Correlated/longitudinal data analysis

1.10 Ordinal data

1.xx Other topics


2 Data Science & Machine Learning

2.1 Data preprocessing pipeline

2.1.1 Data manipulation

  • NumPy
  • Pandas
  • SQL
  • Spreadsheets

2.2 ML methods

  • Fundamental ML concepts

    • Data import
    • Data manipulation (above)
    • Frame the ML problem
    • EDA (including data visualization)
    • Implement ML models (below)
    • Optimize ML models
      • Feature selection/engineering
      • Improve model generalization (hyperparameter tuning, model validation, etc.)
    • ML model interpretation
  • ML General

  • Stochastic gradient descent

2.2.1 Supervised learning

2.2.2 Unsupervised learning

2.2.3 Deep learning

2.2.4 Reinforcement learning

2.2.5 NLP

2.2.6 Recommendation systems

2.3 Data visualization

2.4 Analytical tools

2.4.1 R

2.4.2 Python

2.4.3 Using notebooks

2.4.4 Cloud services

2.5 Other topics

2.5.1 Real-world ML applications

2.5.2 Business sense

2.5.3 Data ethics


3 Math

3.1 Essence of Linear Algebra

3.2 Essence of Calculus


4 Computer Science & Programming

4.1 Python

4.2 SQL

4.3 Algorithms & Data Structures

4.4 Git & GitHub

4.5 Using Linux

4.6 Rust

4.7 Containers

4.8 Web Development

4.9 D3

4.x Learning resources


5 Soft skills

5.1 Communication & collaboration

5.2 Interview skills

5.3 Learning method/philosophy


Last updated: January 2026