# ⚔️ Battle of Myeongnyang Battle Log
**"I still have 12 ships."**
**Period**: 2025.10.21 (Tue) - 11.10 (Mon) (3 weeks, 21 days)
**Goal**: Table 3, Figure 2, 1 paper
---
# 🩸 Oath of 必死卽生 (Seek Death and You Shall Live)
### 見 → 利 → 思 → 義 → ↺
```
I awaken through 見 (seeing), move through 利 (benefit), build through 思 (thought), and finish with 義 (righteousness).
1. **Delay is a sin**: what is to be done today gets done today. There is no tomorrow.
2. Numbers over thoughts
3. Every hypothesis gets tested.
4. Daily [[전투일지🩸]]
```
---
## Overall Progress
```
Week 1: [■■■■□□□] 4/7 days (10.21 Tue-10.27 Mon) - building the data pipeline
Week 2: [□□□□□□□] 0/7 days (10.28 Tue-11.03 Mon)
Week 3: [□□□□□□□] 0/7 days (11.04 Tue-11.10 Mon)
Deliverables: [⏳] Table 1 [ ] Table 2 [ ] Table 3 [ ] Figure 1 [ ] Figure 2
```
**Key results of the last 4 days (10.21-10.24)**:
- ✅ Secured Pitchbook raw data (7 files, 4.69 GB)
- ✅ Script 01 ran successfully → company_master.csv (651 MB, created 10/22)
- ✅ Pipeline scripts 02-05 written (**completion date: per the 10/22 commit**)
  ↳ Evidence: "Complete ... pipeline with 5 processing scripts" (10/22, master)
- ✅ W0 commitment email sent (10/25, Day 5)
- ⏳ Deal data processing in progress (deal_panel.csv = 110 B, needs expansion)
- ⏳ LIWC vagueness scoring needs validation
---
## 🤖 Automation System (NEW!)
**"Saving the battle log = Git commit"**
Now, simply writing and saving the battle log automatically commits it to Git and pushes it to GitHub!
### ⚡ Quick Usage
#### Method 1: Immediate commit (recommended after each evening retrospective)
```bash
cd "/Users/hyunjimoon/MIT Dropbox/Angie.H Moon/tolzul/Front/On/π찰리μ€μΊ λ¬λΈλ ν° νμ /μΌλμκ΅°/automation"
./quick_commit.sh
```
#### Method 2: Auto-detection (leave it running all day)
```bash
cd "/Users/hyunjimoon/MIT Dropbox/Angie.H Moon/tolzul/Front/On/π찰리μ€μΊ λ¬λΈλ ν° νμ /μΌλμκ΅°/automation"
./watch_log.sh
```
Then: edit the battle log → save → automatic commit 3 seconds later!
### First-Time Setup (once only)
```bash
cd "/Users/hyunjimoon/MIT Dropbox/Angie.H Moon/tolzul/Front/On/π찰리μ€μΊ λ¬λΈλ ν° νμ /μΌλμκ΅°/automation"
chmod +x *.sh
./quick_commit.sh # test
```
### 💡 Automation Benefits
| Before | After |
|--------|-------|
| Writing the battle log (10 min) + Git commit (3 min) = **13 min** | Writing the battle log (10 min) = **10 min** |
| Forgetting to commit (as happened on 10/24) | **Impossible** (save = commit) |
| Agonizing over commit messages | **Auto-generated** |
| Irregular GitHub activity | **Automatic every day** |
### Detailed Usage
→ See `automation/README.md`
---
# 📅 Week 1: 適地 (Right Place) - Data + Model 1
**Goal**: Complete Tables 1 and 2
---
## 🗓️ Day 1 - 2025.10.21 (Tue)
### 🌅 Morning Plan (5 min)
**Today's goals**:
- [x] Understand the empirics/code/ pipeline
- [x] Draft START_HERE.md and END_HERE.md
- [x] Understand the data structure
**References**:
- `../strategic ambiguity/empirics/code/PIPELINE_GUIDE.md`
- `../strategic ambiguity/empirics/workflow.md`
---
### 💼 Work Log
#### 思 (Claude) - Overall structure design
```
Task: Design the overall project structure
Output:
- START_HERE.md (execution guide)
- END_HERE.md (compass/roadmap)
- 삼도수군 folder structure
- AI프롬프트.md (role definitions for the 3 AIs)
```
**Key decisions**:
- 5-script pipeline finalized (01~05)
- Vagueness = 100 - LIWC certitude
- Integration cost = binary (hardware vs software)
- 4-step heuristic for Series A/B identification
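
The two measurement decisions above (vagueness as 100 minus LIWC certitude, and a binary integration cost) can be sketched roughly as follows. This is only an illustration under stated assumptions: the certitude score is assumed to arrive as a number on a 0-100 scale, and `HARDWARE_KEYWORDS` is a hypothetical keyword list, not the one used in the project's scripts.

```python
# Hypothetical keyword list for illustration; not the project's actual list.
HARDWARE_KEYWORDS = ("hardware", "robot", "sensor", "chip", "device")


def vagueness_score(liwc_certitude: float) -> float:
    """Vagueness = 100 - LIWC certitude (certitude assumed to lie in [0, 100])."""
    if not 0.0 <= liwc_certitude <= 100.0:
        raise ValueError("certitude expected in [0, 100]")
    return 100.0 - liwc_certitude


def high_integration_cost(description: str) -> int:
    """Binary integration cost: 1 if the text mentions a hardware keyword, else 0."""
    text = description.lower()
    return int(any(keyword in text for keyword in HARDWARE_KEYWORDS))
```

Keeping both measures as small pure functions makes it easy to swap the keyword proxy for a real LIWC score later without touching the rest of the pipeline.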
---
### Evening Retrospective
**Completed**:
- ✅ Overall project structure design complete
- ✅ 21-day roadmap finalized
- ✅ Roles divided among the 3 AIs (利思義)
**Learned**:
- Dual-track strategy (empirics + theory)
- Scott's pitch deck corpus suggestion is key
**Blockers**:
- Locating the Pitchbook data
- Resolution: checked Dropbox sync
**Tomorrow (Day 2, 10.22 Wed)**:
- [ ] Run Script 01 (process company data)
- [ ] Implement vagueness scoring
---
## 🗓️ Day 2 - 2025.10.22 (Wed)
### 🌅 Morning Plan
**Today's goals**:
- [x] Run Script 01 (company data)
- [x] Implement the vagueness score calculation logic
- [⏳] Generate company_master.csv
---
### 💼 Work Log
#### 利 (ChatGPT) - Drafting
```
Task: Write 01_process_company_data.py
Logic:
1. Read Company*.dat files (5 files, 4GB+)
2. Filter on AI/ML keywords
3. Vagueness = keyword counting method
4. Integration cost = hardware keywords
Output: empirics/code/01_process_company_data.py
```
#### 思 (Claude) - Execution & verification
```
Task: Run and verify the script
Result:
- ✅ company_master.csv generated (651 MB)
- ✅ AI/ML firms extracted successfully
- Created at: 10/22 20:26:56
```
**Issues found**:
- Keyword counting is currently used instead of LIWC certitude scores
- The actual LIWC library still needs to be applied (a more refined measurement)
---
### Evening Retrospective
**Completed**:
- ✅ Script 01 written and run
- ✅ 651MB of company data processed successfully
- ✅ First version of vagueness implemented
- ✅ **Started drafting Scripts 02-05** (pipeline construction began on this day)
  ↳ Evidence: GitHub commit "Complete Pitchbook data analysis pipeline with 5 processing scripts"
**Learned**:
- Pitchbook .dat file structure (pipe-delimited)
- The Description field is the key to measuring vagueness
**Blockers**:
- Memory issue (4GB files)
- Resolution: chunked reading + pandas optimization
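
The chunked-reading fix for the memory issue can be illustrated with a dependency-free sketch. It uses the standard-library `csv` module rather than pandas (pandas' `read_csv` accepts a `chunksize` argument that follows the same pattern), and the column name used in the usage note is hypothetical.

```python
import csv
from itertools import islice
from typing import Iterator


def iter_chunks(path: str, chunk_size: int = 100_000) -> Iterator[list]:
    """Yield lists of row dicts from a pipe-delimited file, chunk_size rows at a time."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="|")
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


def count_matching(path: str, column: str, keyword: str) -> int:
    """Count rows whose column contains keyword without loading the whole file."""
    total = 0
    for chunk in iter_chunks(path):
        total += sum(keyword in (row.get(column) or "").lower() for row in chunk)
    return total
```

For example, `count_matching("Company.dat", "Description", "machine learning")` would scan a multi-gigabyte file while holding only one chunk in memory at a time.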
**Tomorrow (Day 3, 10.23 Thu)**:
- [ ] Script 02 (process deal data)
- [ ] Implement the Series A/B identification heuristic
---
## 🗓️ Day 3 - 2025.10.23 (Thu)
### 🌅 Morning Plan
**Today's goals (corrected)**:
- [x] **Merge the xarray refactor PR + incorporate the automatic file detection logic**
- [x] Review the overall workflow (documentation reinforced)
- [ ] Run Script 02 (deal data)
---
### 💼 Work Log
#### 利 (ChatGPT) - Code refactor
```
Task: xarray migration and auto-detection logic (improving the scripts written the day before)
Evidence: "auto detect file existence and pick up from there" (10/23, master)
Output:
- xarray-based data processing structure
- Automatic checkpoint detection system
```
#### 思 (Claude) - PR merge and structuring
```
Task: Merge PR #4 (xarray refactor)
Evidence: "Merge pull request #4 ..." (10/23, master)
Key improvements:
- xarray-based multidimensional data processing
- Automatic file-existence detection and resume logic
- Strengthened checkpoint system
```
**Status**:
- ✅ PR #4 merged (xarray migration)
- ✅ Auto-detection logic added
- ⏳ Deal data run pending (deal_panel.csv is only 110 B)
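
A minimal sketch of the "detect existing files and pick up from there" idea follows. The function name, the `min_bytes` threshold, and the step callback are all hypothetical; this is not the code merged in PR #4, only one way such a checkpoint check could look.

```python
from pathlib import Path
from typing import Callable


def run_step(output: Path, step: Callable[[Path], None], min_bytes: int = 1) -> bool:
    """Run step only if its output is missing or smaller than min_bytes.

    Returns True if the step ran, False if the existing output was reused.
    """
    if output.exists() and output.stat().st_size >= min_bytes:
        return False  # checkpoint found: resume from the next step
    step(output)
    return True
```

A size threshold rather than a bare existence check matters here: a 110 B deal_panel.csv exists on disk, but with a threshold of a few hundred bytes it would still be treated as unfinished and regenerated.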
---
### Evening Retrospective
**Completed**:
- ✅ xarray refactor PR #4 merged
- ✅ Automatic file detection logic added
- ✅ Pipeline structure improved
**Learned**:
- The panel data structure is key
- Distinguishing Series A from Series B is the trickiest part
**Tomorrow (Day 4, 10.24 Fri)**:
- [ ] Debug and run Script 02
- [ ] Process the deal data properly
---
## 🗓️ Day 4 - 2025.10.24 (Fri)
### 🌅 Morning Plan
**Commit status note**: **no commits on the master branch** (offline diagnosis/design in progress)
Future related changes should be recorded via a separate branch/PR.
**Today's goals**:
- [x] Diagnose the deal data problem
- [x] Debug Script 02 (offline)
- [ ] Generate the deal panel
---
### 💼 Work Log
#### 義 (Gemini) - Problem discovery
```
Verification results:
- ❌ deal_panel.csv = 110B (abnormally small)
- ❌ Series A/B matching failed
- Cause: VCRound field in the deal data is missing/unclear
```
#### 思 (Claude) - Proposed solution
```
Alternative design:
1. Heuristic based on DealType + DealSize + Sequence
2. Tighter date filters (A: 2021-22, B: 2023-25)
3. Create a manual validation sample
Resolution: 4-step heuristic documented in workflow.md
```
**Remaining work**:
- ⏳ Fix and rerun Script 02
- ⏳ Verify deal matching
---
### Evening Retrospective
**Completed**:
- ✅ Deal data problem diagnosed
- ✅ Solution designed
**Learned**:
- Pitchbook data is not perfect
- A heuristic approach is needed
- Manual validation matters
**Tomorrow (Day 5, 10.25 Sat)**:
- [ ] Finish fixing Script 02
- [ ] Send the W0 commitment email
---
## 🗓️ Day 5 - 2025.10.25 (Sat)
### 🌅 Morning Plan
**Today's goals**:
- [x] Send the W0 commitment email
- [⏳] Fix Script 02 (in progress)
- [ ] Update the battle log
---
### 💼 Work Log
#### 思 (Claude) - Strategy formulation
```
Task: Review the W0 commitment email draft
Key decisions:
1. Emotional commitment type (gratitude-focused)
2. Next comes the empirical progress email (W1-실증)
3. Dual-track strategy finalized (Tue/Fri rotation)
Output: W0.md → sent (10/25)
```
#### 義 (Gemini) - Process review
```
Verification questions:
- Now that W0 has been sent, what comes next?
- Is there empirical progress to show?
- The battle log needs updating
Conclusion: summarize Week 1 progress first
```
**Current status**:
- ✅ W0 sent (commitment declared)
- ⏳ Script 02 debugging in progress
- ⏳ Summarizing 4 days of results (Day 1-4)
---
### Evening Retrospective
**Completed**:
- ✅ W0 commitment email sent
- ✅ Battle log updated for Day 1-4
- ✅ Week 1 progress visualized (4/7 days)
**Learned**:
- The importance of the dual-track strategy
- The W0 (commitment) → W1 (empirical) flow is settled
- Battle log = accountability tool
**Remaining schedule this week (Day 6-7, 10.26 Sun-10.27 Mon)**:
- [ ] Finish fixing Script 02
- [ ] Generate the deal panel successfully
- [ ] Run Script 03 (panel merge)
- [ ] Write the Week 1 retrospective
**Week 2 preparation**:
- [ ] W1-실증 (empirical) email draft (Wed 10/29)
- [ ] W1-이론 (theory) email draft (Sat 11/1)
---
## 🗓️ Day 6 - 2025.10.26 (Sun)
### 🌅 Morning Plan
**Today's goals**:
- **[✅] Research design finalized:** H2 conditional hypothesis (Realizable/Unrealizable Option) and Model 3 (interaction) confirmed.
- **[✅] Data strategy finalized:** Era Pair (AV vs. 3DP) selected; 2-track strategy (Path A/B) set up to secure the T=1 original texts.
- **[✅] Data quality criteria strengthened:** evaluation metrics (v2.0) established around causal clarity, objective measurement, and longitudinal completeness.
- **[✅] Data collection prepared:** detailed prompts for collecting T=1 original texts (20 companies), aimed at ChatGPT/Claude, are written.
- **[✅] LLM collaboration prepared:** final handover document for the next LLM instance ("신 v3.0") is written.
- [-] Robustness checks (to follow after data collection)
- [-] Alternative models (to follow after data collection)
![[전투일지🩸 2025_10_26.excalidraw]]
---
### 💼 Work Log
- **H2 mechanism deepened:** established the "Realizable vs. Unrealizable Option Value" (Lego vs. battleship) framework, under which the option value of 'vagueness' is realized conditionally on the industry pivot cost (integration cost).
- **Model redefined:** confirmed that Model 3, which includes the `Vagueness × High_Integration_Cost` interaction term, is essential for testing H2. The existing Model 2 (Fixed Effects) is now treated as an incomplete baseline.
- **Era Pair finalized:** the **AV vs. 3D printing** pair was chosen to maximize theoretical contrast.
- **Data strategy and risk identification:** reaffirmed the importance of securing the **T=1 'original promise' text**. Confirmed that the Pitchbook Current Description cannot be used. Completed a comparison of the 'timestamp validity' verification risk of Path A (PB Historical) against the safety of Path B (YC Data).
- **Data quality criteria updated (v2.0):** strengthened the evaluation metrics around causal clarity, objective measurement, longitudinal completeness, and bias awareness, reflecting a critique from the perspective of prior work by El-Zayaty, Novelli, Yang, and others.
- **Data collection execution plan:** finished the detailed prompts, aimed at ChatGPT/Claude, for collecting verifiable T=1 original texts (pitches, accelerator materials, crowdfunding, early web archives, etc.) for 20 companies ([[AV10]], [[3DP10]], [[AI10]]).
- **LLM collaboration prepared:** final handover document ("신 v3.0") updated so that the next LLM clearly understands the research context and the top priority (actual data collection and verification).
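
As a reading aid, the Model 3 specification described above can be written out as follows. This is a sketch that assumes a logit link (the log elsewhere refers to an H2 logit); the symbols are illustrative, with $V_i$ for vagueness, $H_i$ for `High_Integration_Cost`, $X_i$ for controls, and $Y_i$ for the H2 outcome.

```latex
\operatorname{logit}\Pr(Y_i = 1)
  = \beta_0 + \beta_1 V_i + \beta_2 H_i + \beta_3 \,(V_i \times H_i) + \gamma^{\top} X_i
```

Under this notation, H2 is a statement about the interaction coefficient $\beta_3$, which is why a model without the interaction term can only serve as a baseline.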
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 7, 10.27 Mon)**: ___________
---
## 🗓️ Day 7 - 2025.10.27 (Mon)
### 🌅 Morning Plan
**Today's goals**:
- [✅] Diagnose the Claude Code pipeline run results
- [✅] Identify the root cause of the H2 singular matrix problem
- [✅] Set a strategy for redefining the survival variable
- [✅] Design the DV based on the Scott/Charlie framework
- [✅] Prepare the ChatGPT validation prompt
- [ ] Wrap up Week 1
- [ ] Final check of Tables 1 and 2
---
### 💼 Work Log
**Pipeline run results:**
- H1 succeeded (vagueness α₁ = -3.59e-07, p<0.05), but the coefficient is too small → suspect a units problem
- H2 failed: singular matrix (98% survival rate → no variation)
- NetCDF encoding error found
- Severe multicollinearity (founder_credibility is perfectly linearly dependent)
**Diagnostic interpretation (4 snapshots):**
- The DB is cumulative (420K→504K); only 0.6-1.6% "disappeared"
- LastFinancingDate contains future dates (2024, 2025) → **data leakage risk**
- VC-backed 34% recent funding rate (24mo window)
- A simple "exists in both" check is meaningless
**Theoretical reinterpretation (Scott/Charlie conversation):**
- **Core idea**: "Precise promisers disappoint" = did not receive Series B
- Contrast between Yet Ming (option provider) and Bob Langer (reputation)
- Timeline: Pitch → Series A → Series B outcome
- The H2 DV is **whether Series B+ was reached** (not just activity)
**Survival definition finalized:**
- Main H2: Series B+ success (consistent with theory)
- Robustness: Activity-based (LastFinancingDate recency)
- At-risk cohort: VC-backed, Seed/Series A at baseline
- Expected rate: 25-35%
**Pooling Strategy (Q1):**
- Option B: Panel (3 cohorts)
- 20211201→20230501 (17mo)
- 20220101→20230501 (16mo)
- 20220501→20230501 (12mo)
- Primary: Cohort 1 (longest window)
- Robustness: Full panel
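
The window lengths listed for the three cohorts can be checked with a few lines of arithmetic. The sketch below assumes the snapshot labels are `YYYYMMDD` strings, as they appear in the list; the helper name is illustrative.

```python
from datetime import datetime


def months_between(start: str, end: str) -> int:
    """Whole calendar months between two YYYYMMDD snapshot labels."""
    a = datetime.strptime(start, "%Y%m%d")
    b = datetime.strptime(end, "%Y%m%d")
    return (b.year - a.year) * 12 + (b.month - a.month)


COHORTS = [
    ("20211201", "20230501"),
    ("20220101", "20230501"),
    ("20220501", "20230501"),
]
windows = [months_between(start, end) for start, end in COHORTS]
print(windows)  # [17, 16, 12]
```

The first cohort has the longest window (17 months), which matches its choice as the primary specification.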
**Understanding the Deal Type fields:**
- LastFinancingDealType (primary deal), DealType2/3 (additional attributes), DealClass (investor)
- PE/M&A deals are mixed in → VC filtering is essential
**ChatGPT validation preparation:**
- Discussion of M&A coding (success exit vs distress sale)
- Wrote a prompt requesting an entrepreneurship expert perspective
- Emphasized the impact on the startup ecosystem
---
### Evening Retrospective
**Completed**:
- Diagnosed Claude Code's 25%-probability fix
- Established the survival definition based on Scott's framework
- Identified the data leakage risk
- Designed the panel architecture
**Tomorrow's priorities**:
1. Get the ChatGPT validation results
2. Write the code that removes data leakage
3. Verify the VC-backed filtering
4. Prepare the Week 1 deliverable (PDF with working H1, H2 design)
**Open issues**:
- M&A exit coding criteria (awaiting ChatGPT's answer)
- H1 coefficient size problem (units need checking)
- Whether to drop the founder_credibility variable
**Items added for saving to the qmd:**
1️⃣ Related Work (Yet Ming/Bob Langer motivation)
2️⃣ M&A coding decision (success exit vs distress sale)
3️⃣ **Survival Coding Strategy (Day 7)**
```python
# At-risk cohort: VC-backed, Seed/Series A at baseline
SERIES_B_PLUS = {"Series B", "Series C", "Series D+"}

def code_survival(last_financing_deal_type: str) -> int:
    # 1 = reached Series B+; 0 = 'Out of Business' or still Seed/Series A
    return int(last_financing_deal_type in SERIES_B_PLUS)
# M&A coding TBD (ChatGPT validation pending)
```
4️⃣ **Critical Data Issues Identified**
- LastFinancingDate future dates → filter by snapshot_date
- Cumulative DB → simple existence = 98% survival
- Need VC filter: CompanyFinancingStatus = 'Venture Capital-Backed'
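
Two of the fixes listed above, filtering future dates against the snapshot date and restricting to VC-backed firms, could look roughly like this. It is one possible reading rather than the project's implementation: future-dated values are set to missing instead of dropping the row, and rows are assumed to be plain dicts keyed by the Pitchbook field names.

```python
from datetime import date
from typing import Optional

VC_STATUS = "Venture Capital-Backed"


def cap_as_of(last_financing: Optional[date], snapshot: date) -> Optional[date]:
    """Treat financing dates later than the snapshot as unknown (not knowable then)."""
    if last_financing is None or last_financing > snapshot:
        return None
    return last_financing


def is_vc_backed(row: dict) -> bool:
    """Keep only firms whose CompanyFinancingStatus marks them as VC-backed."""
    return row.get("CompanyFinancingStatus") == VC_STATUS
```

For instance, a LastFinancingDate of 2024-03-01 seen from a 2023-05-01 snapshot comes back as `None`, so it cannot leak into a model fitted as of that snapshot.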
## Week 1 Retrospective (10.21-10.27)
**Goal**: Table 1, 2
**Achieved**: [⏳] Table 1 → pipeline built, awaiting run
[⏳] Table 2 → H2 design under revision (survival redefinition)
**What went well**:
- **Pipeline infrastructure complete** ([Day 2](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-2---20251022-%EC%88%98), [Day 3](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-3---20251023-%EB%AA%A9)): Scripts 01-05 written + xarray refactor + automatic checkpoints, giving a structure that can be rerun
- **Large-scale data processing** ([Day 2](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-2---20251022-%EC%88%98)): 651MB company_master.csv generated (420K+ firms, AI/ML filtering)
- **Theory-data alignment** ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94)): operationalized Scott's "precise promisers disappoint" framework as a Series B progression DV
- **Early problem detection** ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94)): identifying the 98% survival (singular matrix), future date leakage, and the cumulative DB structure prevents wasted effort in Week 2
**What was difficult**:
- **Deal matching failure** ([Day 4](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-4---20251024-%EA%B8%88)): deal_panel.csv is 110B (a nearly empty file); Series A/B cannot be identified because the VCRound field is missing
- **DV design error** ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94)): measuring mere "existence" gave 98% survival → H2 logit failed; ChatGPT validation confirmed the need to redesign around a 12-15% base rate
- **Data quality** ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94)): LastFinancingDate contains future dates from 2024-2025 → time leakage is unavoidable without as-of capping
**Learned**:
- **Cumulative DB ≠ Panel**: PitchBook is a cumulative DB growing from 420K→504K, so "present in both snapshots" only indicates tracking status, not activity. True survival must be measured by funding progression ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94))
- **Theory drives operationalization**: H2 "over-commit → can't adapt → miss B gate" has to be measured as whether Series B+ was reached; mere survival/activity does not capture the mechanism ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94))
- **Base rate calibration**: Median A→B = 28-31 months, so a 17-month window captures only 12-15% of transitions (the earlier 25-35% expectation was too high). Enough for logit convergence, but effect sizes need careful interpretation ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94))
- **Competing risk handling**: M&A cannot be classified as success or failure, so censoring it in the primary analysis and coding it as an upper bound (=1) / lower bound (=0) in robustness checks is the publication standard ([Day 7](https://claude.ai/chat/9e3023ab-8f5e-4a3b-a946-3874cb7e4cec#%F0%9F%97%93%EF%B8%8F-day-7---20251027-%EC%9B%94))
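
The competing-risk treatment of M&A described above (censor in the primary analysis, bound at 1 and 0 for robustness) can be sketched as a single coding function. The function name, the `scheme` labels, and the `is_mna` flag are illustrative assumptions, not the project's code.

```python
from typing import Optional

SERIES_B_PLUS = {"Series B", "Series C", "Series D+"}
SCHEMES = ("primary", "upper", "lower")


def code_outcome(deal_type: str, is_mna: bool, scheme: str = "primary") -> Optional[int]:
    """Return 1/0 for the H2 outcome, or None when the firm is censored.

    primary: M&A firms are censored (None)
    upper:   M&A is coded as success (1)
    lower:   M&A is coded as failure (0)
    """
    if scheme not in SCHEMES:
        raise ValueError(f"unknown scheme: {scheme}")
    if is_mna:
        return {"primary": None, "upper": 1, "lower": 0}[scheme]
    return int(deal_type in SERIES_B_PLUS)
```

Running the same model under all three schemes gives the primary estimate plus the upper and lower bounds on how much the M&A coding could move it.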
**Next week's plan**:
1. Remove data leakage (as-of capping)
2. Series A cohort filter + reimplement the DV (verify 12-15%)
3. Run H2 primary + robustness (M&A bounds)
4. Complete Tables 1 and 2 + Week 1 deliverable PDF
---
[[W1-실증]] with link to code and spec [[W1-tech_spec]]
# 📅 Week 2: 適時 (Right Time) - Model 2 + Plots
**Goal**: Complete Table 3 and Figures 1 and 2
---
## 🗓️ Day 8 - 2025.10.28 (Tue)
### 🌅 Morning Plan
**Today's goals**:
- [ ] [[10개 로깅 항목과 참고문헌 연결]]
- [ ] Add to the qmd
- [ ] Later success ~ Vagueness + Early funding
---
### 💼 Work Log
#### 利 (ChatGPT)
```
Task: Logistic regression draft
Output: 1_利_빠른실행/day8_model2.py
```
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 9, 10.29 Wed)**: ___________
---
## 🗓️ Day 9 - 2025.10.29 (Wed)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Refine Model 2
- [ ] Add control variables
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 10, 10.30 Thu)**: ___________
---
## 🗓️ Day 10 - 2025.10.30 (Thu)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Generate Table 3
- [ ] Review the Model 2 results
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 11, 10.31 Fri)**: ___________
---
## 🗓️ Day 11 - 2025.10.31 (Fri)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Generate Figure 1 (Vagueness → Early funding)
- [ ] Improve visualization quality
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 12, 11.1 Sat)**: ___________
---
## 🗓️ Day 12 - 2025.11.01 (Sat)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Generate Figure 2 (Vagueness → Later success)
- [ ] Improve visualization quality
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 13, 11.2 Sun)**: ___________
---
## 🗓️ Day 13 - 2025.11.02 (Sun)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Robustness checks (sector effects)
- [ ] Alternative vagueness measures
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 14, 11.3 Mon)**: ___________
---
## 🗓️ Day 14 - 2025.11.03 (Mon)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Wrap up Week 2
- [ ] Final check of all deliverables
---
### 💼 Work Log
---
### Evening Retrospective
**Week 2 completed**: ___________
**Next week's plan**: Write the paper
---
## Week 2 Retrospective (10.28-11.03)
**Goal**: Table 3, Figure 1, 2
**Achieved**: [ ] Table 3 → `empirics/output/table3.csv`
[ ] Figure 1 → `empirics/output/figure1.png`
[ ] Figure 2 → `empirics/output/figure2.png`
**What went well**:
- ___________
**What was difficult**:
- ___________
**Next week's plan**:
- Write the first paper draft
- Introduction, Theory, Method, Results
---
# 📅 Week 3: 適人 (Right Person) - Paper
**Goal**: Complete and submit the paper
---
## 🗓️ Day 15 - 2025.11.04 (Tue)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Abstract (150 words)
- [ ] Introduction draft
---
### 💼 Work Log
#### 利 (ChatGPT)
```
Task: Introduction draft
Output: drafts/introduction_v1.md
```
#### 思 (Claude)
```
Task: Structure the Introduction
Output: drafts/introduction_v2.md
```
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 16, 11.5 Wed)**: ___________
---
## 🗓️ Day 16 - 2025.11.05 (Wed)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Theory section
- [ ] Explain the OIL framework
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 17, 11.6 Thu)**: ___________
---
## 🗓️ Day 17 - 2025.11.06 (Thu)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Method section
- [ ] Explain Data & Models
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 18, 11.7 Fri)**: ___________
---
## 🗓️ Day 18 - 2025.11.07 (Fri)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Results section
- [ ] Insert Tables & Figures
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 19, 11.8 Sat)**: ___________
---
## 🗓️ Day 19 - 2025.11.08 (Sat)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Discussion
- [ ] Conclusion
---
### 💼 Work Log
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 20, 11.9 Sun)**: ___________
---
## 🗓️ Day 20 - 2025.11.09 (Sun)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Proofread the whole paper
- [ ] Clean up the references
---
### 💼 Work Log
#### 義 (Gemini)
```
Task: Verify the logic of the whole paper
Output: 3_義_검증/day20_paper_review.md
```
---
### Evening Retrospective
**Completed**: ___________
**Tomorrow (Day 21, 11.10 Mon)**: Final submission!
---
## 🗓️ Day 21 - 2025.11.10 (Mon)
### 🌅 Morning Plan
**Today's goals**:
- [ ] Final review
- [ ] Generate the PDF
- [ ] Submit to Charlie and Scott
---
### 💼 Work Log
---
### Final Victory
**Submission time**: ___:___
**Final deliverables**:
- ✅ Table 1, 2, 3
- ✅ Figure 1, 2
- ✅ Paper draft
**Three-week overall review**:
- Number of runs: ___
- Total working time: ___ hours
- Biggest lesson: ___________
- Hardest thing: ___________
- Charlie/Scott feedback: ___________
---
## Week 3 Retrospective (11.04-11.10)
**Goal**: Complete the paper
**Achieved**: [ ] Paper draft → `drafts/paper_final.pdf`
**What went well**:
- ___________
**What was difficult**:
- ___________
**Final achievements**:
- ___________
---
## End of the Battle of Myeongnyang
**"I still have 12 ships."**
**We did it.** ⚔️
---
**Start**: 2025.10.21 (Tue)
**End**: 2025.11.10 (Mon)
**Duration**: 21 days (3 weeks)
**必死卽生, 必生卽死** (Seek death and you shall live; seek life and you shall die)