# Paper Generation Pipeline (4-Phase Framework)
**전라좌수군 (Jeonla Naval Fleet) 기승전결 Architecture**
## The Four Commanders
This pipeline implements the **4-Phase Paper Generation Framework** based on 기승전결 (起承轉結), the traditional Korean narrative structure. Each phase is commanded by one of the 전라좌수군 (Jeonla Naval Fleet) admirals:
| Phase | Role | Commander | Color | Responsibility |
|-------|------|-----------|-------|----------------|
| **1. 起** | Introduction | 정운 (Jeong-un) | Teal (#20B2AA) | Open the door with a compelling narrative |
| **2. 承** | Theory & Conceptual | 권준 (Kwon-jun) | Orange (#FF8C00) | Build the intellectual structure |
| **3. 轉** | Empirics & Results | 김완 (Kim-wan) | Crimson (#DC143C) | Prove righteousness through evidence |
| **4. 結** | Discussion & Conclusion | 어영담 (Eo-yeong-dam) | Purple (#9370DB) | Close the story with wisdom |
---
## Quick Start (4-Phase Mode)
### Generate All 4 Phases
```bash
cd src/scripts/paper_generation
python generate_all.py
```
This generates:
- `01_Introduction.md` (정운's door-opening narrative)
- `02_Theory_Conceptual.md` (권준's theoretical structure)
- `03_Empirics_Results.md` (김완's empirical proof)
- `04_Discussion_Conclusion.md` (어영담's wisdom and closure)
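
As a rough illustration of what a master script of this kind does, the sketch below runs the four phase scripts in order and collects the outputs of those that succeed. It is not the actual `generate_all.py`: the `PHASES` list, `run_script`, `run_phases`, and the injectable `runner` argument are assumptions made for this example, and only the script and output file names come from this README.

```python
import subprocess
import sys
from typing import Callable, List, Tuple

# (script, expected output) pairs, taken from the file names in this README.
PHASES: List[Tuple[str, str]] = [
    ("generate_01_introduction.py", "01_Introduction.md"),
    ("generate_02_theory_conceptual.py", "02_Theory_Conceptual.md"),
    ("generate_03_empirics.py", "03_Empirics_Results.md"),
    ("generate_04_discussion.py", "04_Discussion_Conclusion.md"),
]


def run_script(script: str) -> bool:
    """Run one phase script with the current interpreter; True on exit code 0."""
    return subprocess.run([sys.executable, script]).returncode == 0


def run_phases(runner: Callable[[str], bool] = run_script) -> List[str]:
    """Run every phase in order and return the outputs of the phases that succeeded."""
    generated = []
    for script, output in PHASES:
        if runner(script):
            generated.append(output)
    return generated
```

Calling `run_phases()` from the script directory would run all four phases. Passing a custom `runner` makes the ordering logic testable without launching any subprocess.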
### Generate Individual Phases
```bash
# Phase 1: Introduction (起 - 정운)
python generate_01_introduction.py
# Phase 2: Theory & Conceptual Model (承 - 권준)
python generate_02_theory_conceptual.py
# Phase 3: Empirics & Results (轉 - 김완)
python generate_03_empirics.py
# Phase 4: Discussion & Conclusion (結 - 어영담)
python generate_04_discussion.py
```
### Generate Supplementary Materials
```bash
# Visual poster (전라좌수군 4-phase structure)
python generate_07_poster.py
# Industry-specific analysis (PR #13 integration)
python generate_08_industry_comparison.py
```
---
## Phase Descriptions
### Phase 1: 起 (Introduction) - 정운
**File**: `generate_01_introduction.py`
**Output**: `01_Introduction.md`
**Commander**: 정운 (Jeong-un) - "The Door Opener"
**Color**: Teal (#20B2AA)
**Responsibilities**:
- Hook readers with a vivid case study (Tesla vs Bosch paradox)
- Articulate the core puzzle: Why does vagueness help some but hurt others?
- Preview main findings with empirical results
- Outline three theoretical contributions
- Provide paper roadmap linking to other phases
**Content Structure**:
1. The Vagueness Paradox (2 paragraphs)
2. The Puzzle (1 paragraph)
3. Theoretical Contributions (3 bullet points)
4. Roadmap (1 paragraph introducing 권준, 김완, 어영담)
**Data Sources**:
- `outputs/all/models/h1_coefficients.csv` (H1 regression results)
- `outputs/all/models/h2_main_coefficients.csv` (H2 regression results)
**정운's Philosophy**: *"Open the door with stories that make readers want to enter. Hook first, theory later."*
---
### Phase 2: 承 (Theory & Conceptual Model) - 권준
**File**: `generate_02_theory_conceptual.py`
**Output**: `02_Theory_Conceptual.md`
**Commander**: 권준 (Kwon-jun) - "The Structure Builder"
**Color**: Orange (#FF8C00)
**Responsibilities**:
- Review theoretical foundations (Information Economics, Real Options, Modularity)
- Identify gaps in prior work
- Develop four-module conceptual framework (Customer-Technology-Organization-Competition)
- Formalize testable hypotheses (H1, H2, H2a, H2b)
- Present descriptive statistics (Table 1)
**Content Structure**:
1. **Literature Review** (3 subsections)
   - 2.1 Information Economics: Vagueness as Adverse Selection
   - 2.2 Real Options: Vagueness as Strategic Flexibility
   - 2.3 Modularity Theory: When is Flexibility Valuable?
2. **Conceptual Framework** (5 subsections)
   - 2.4 Four-Module Framework Overview
   - 2.5 Module 1: Customer Heterogeneity
   - 2.6 Module 2: Technology Modularity (CORE)
   - 2.7 Module 3: Organizational Slack
   - 2.8 Module 4: Competitive Intensity
3. **Hypotheses** (1 subsection)
   - 2.9 Formal Hypothesis Development (H1, H2)
4. **Table 1**: Descriptive Statistics
**Data Sources**:
- `data/processed/analysis_panel.csv` (for descriptive statistics)
**권준's Philosophy**: *"Build a fortress of theory strong enough to hold 김완's evidence. Structure before proof."*
---
### Phase 3: 轉 (Empirics & Results) - 김완
**File**: `generate_03_empirics.py`
**Output**: `03_Empirics_Results.md`
**Commander**: 김완 (Kim-wan) - "The Righteousness Prover"
**Color**: Crimson (#DC143C)
**Responsibilities**:
- Describe data sources and sample construction
- Explain measurement strategy (vagueness score, hardware classification)
- Present empirical specifications (H1 OLS, H2 Logit)
- Report main results with regression tables
- Challenge findings (Devil's Advocate: 4 alternative explanations)
- Demonstrate robustness (Specification curve analysis, subsample analyses)
- Generate figures (spec curve plot)
**Content Structure**:
**PART A: EMPIRICAL STRATEGY**
1. 3.1 Data Sources & Sample Construction
2. 3.2 Measurement Strategy
3. 3.3 Empirical Specifications
**PART B: RESULTS**
4. 3.4 H1 Results: Vagueness → Early Funding (Table 3)
5. 3.5 H2 Results: Vagueness × Hardware → Growth (Table 4)
6. 3.6 Robustness Checks
   - Devil's Advocate (4 alternatives: reverse causality, measurement error, selection bias, omitted variables)
   - Specification Curve Analysis (1,296 model variants)
   - Subsample Analyses (quantum, transportation, all companies)
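
Read literally, the two specifications named in this README (H1: Early Funding ~ Vagueness, OLS; H2: Growth ~ Vagueness × Hardware, Logit) correspond to equations of the following general form. The control vector $X_i$, the coefficient symbols, and the inclusion of both main effects in H2 are assumptions made for illustration; the exact covariates are defined in the generating scripts, not in this README.

```latex
\begin{align}
\text{EarlyFunding}_i &= \alpha_0 + \alpha_1\,\text{Vagueness}_i + \gamma^{\top} X_i + \varepsilon_i \\
\Pr(\text{Growth}_i = 1) &= \Lambda\!\left(\beta_0 + \beta_1\,\text{Vagueness}_i + \beta_2\,\text{Hardware}_i
  + \beta_3\,(\text{Vagueness}_i \times \text{Hardware}_i) + \delta^{\top} X_i\right),
\qquad \Lambda(z) = \frac{1}{1 + e^{-z}}
\end{align}
```

Under this reading, $\alpha_1$ carries the H1 effect and the interaction coefficient $\beta_3$ carries the H2 moderation effect.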
**Data Sources**:
- `outputs/all/models/h1_coefficients.csv`
- `outputs/all/models/h2_main_coefficients.csv`
**Figures Generated**:
- `spec_curve_analysis.png` (specification curve plot)
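
A specification curve enumerates every combination of analyst choices and re-estimates the model for each one. The sketch below shows only how such a grid can be built and counted. The four choice dimensions and their option labels are hypothetical placeholders, chosen so that the grid has 6 × 6 × 6 × 6 = 1,296 cells to match the count quoted above; the dimensions actually used by `generate_03_empirics.py` may differ.

```python
from itertools import product

# Hypothetical choice dimensions (six options each); the real ones may differ.
SPEC_GRID = {
    "vagueness_measure": [f"measure_{i}" for i in range(6)],
    "control_set": [f"controls_{i}" for i in range(6)],
    "sample_filter": [f"sample_{i}" for i in range(6)],
    "outcome_definition": [f"outcome_{i}" for i in range(6)],
}


def enumerate_specs(grid):
    """Return one dict per specification: the Cartesian product of all choices."""
    keys = list(grid)
    return [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]


specs = enumerate_specs(SPEC_GRID)
print(len(specs))  # 1296
```

Each dict in `specs` would then be passed to whatever function fits one model variant, and the resulting coefficients sorted to draw the curve.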
**김완's Philosophy**: *"Prove righteousness through uncompromising rigor. Challenge your own findings before critics do."*
---
### Phase 4: 結 (Discussion & Conclusion) - 어영담
**File**: `generate_04_discussion.py`
**Output**: `04_Discussion_Conclusion.md`
**Commander**: 어영담 (Eo-yeong-dam) - "The Story Closer"
**Color**: Purple (#9370DB)
**Responsibilities**:
- Summarize key findings
- Derive theoretical implications (Productive vs. Destructive Ambiguity)
- Provide managerial guidance (Tesla Rule, Waymo Rule)
- Offer policy and ecosystem implications
- Acknowledge limitations honestly
- Chart future research directions
- Close the narrative with wisdom
**Content Structure**:
1. 4.1 Summary of Findings
2. 4.2 Theoretical Implications
   - Productive vs. Destructive Ambiguity
   - Modularity → Communication Strategy
   - Reconciling Info Econ vs. Real Options
3. 4.3 Managerial Implications
   - The Tesla Rule (when vagueness works)
   - The Waymo Rule (when specificity works)
   - Decision Matrix (2×2: Modularity × Uncertainty)
4. 4.4 Policy and Ecosystem Implications
5. 4.5 Limitations
6. 4.6 Future Research Directions
7. 4.7 Conclusion
**Data Sources**:
- `outputs/all/models/h2_main_coefficients.csv` (for effect size interpretation)
**어영담's Philosophy**: *"Close the story with wisdom that transcends the data. Leave readers with actionable insights and intellectual humility."*
---
## Supplementary Materials
### Section 7: Academic Poster (Poster Workshop)
**File**: `generate_07_poster.py`
**Output**: `07_Poster.svg`, `07_Poster.md`
Visual representation of the 4-phase framework in a 2×2 grid format. Each quadrant corresponds to one phase (정운·권준·김완·어영담) with color coding.
**Generate**:
```bash
python generate_07_poster.py
# OR
python generate_all.py --sections 7
```
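
To make the 2×2 layout concrete, here is a minimal sketch that builds an SVG string with one colored quadrant per phase. It is not the code in `generate_07_poster.py`; the function name `build_poster_svg`, the cell sizes, and the text placement are assumptions, while the phase roles and hex colors are taken from the commander table in this README.

```python
from xml.sax.saxutils import escape

# Phase labels and colors as listed in the commander table of this README.
QUADRANTS = [
    ("1. Introduction", "#20B2AA"),
    ("2. Theory & Conceptual", "#FF8C00"),
    ("3. Empirics & Results", "#DC143C"),
    ("4. Discussion & Conclusion", "#9370DB"),
]


def build_poster_svg(cell_w: int = 400, cell_h: int = 300) -> str:
    """Return an SVG string with one colored rectangle per phase in a 2x2 grid."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{2 * cell_w}" height="{2 * cell_h}">'
    ]
    for index, (label, color) in enumerate(QUADRANTS):
        row, col = divmod(index, 2)  # phases 1-2 on the top row, 3-4 on the bottom
        x, y = col * cell_w, row * cell_h
        parts.append(
            f'<rect x="{x}" y="{y}" width="{cell_w}" height="{cell_h}" fill="{color}"/>'
        )
        # escape() turns "&" into "&amp;" so the label is valid XML text.
        parts.append(
            f'<text x="{x + 20}" y="{y + 40}" fill="white" font-size="24">'
            f"{escape(label)}</text>"
        )
    parts.append("</svg>")
    return "\n".join(parts)


svg = build_poster_svg()
```

Writing `svg` to a `.svg` file and opening it in a browser shows the four quadrants in reading order.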
### Section 8: Industry Comparison (PR #13 Integration)
**File**: `generate_08_industry_comparison.py`
**Output**: `08_IndustryComparison.md`
Analysis across 6 industries (Quantum, Transportation, Biotech, FinTech, Enterprise SW, Hardware) testing the "중간은 죽는다" ("The Middle Dies") phenomenon.
**Generate**:
```bash
python generate_08_industry_comparison.py
# OR
python generate_all.py --sections 8
```
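
The `--sections` flag shown above could be parsed along the following lines. This is a sketch using the standard `argparse` module, not the parser in `generate_all.py`: the function name `parse_sections`, the default of the four core phases, and the acceptance of several numbers at once are all assumptions (this README only shows the single-number form).

```python
import argparse
from typing import List, Optional

# Section numbers that appear in this README: four phases plus two supplements.
KNOWN_SECTIONS = [1, 2, 3, 4, 7, 8]


def parse_sections(argv: Optional[List[str]] = None) -> List[int]:
    """Parse --sections; default to the four core phases when the flag is absent."""
    parser = argparse.ArgumentParser(description="Generate paper sections")
    parser.add_argument(
        "--sections",
        type=int,
        nargs="+",
        choices=KNOWN_SECTIONS,
        default=[1, 2, 3, 4],
        help="section numbers to generate, e.g. --sections 7 8",
    )
    return parser.parse_args(argv).sections
```

With this sketch, `parse_sections(["--sections", "7"])` returns `[7]`, and an unknown section number makes `argparse` exit with a usage error.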
---
## Directory Structure
```
src/scripts/paper_generation/
├── __init__.py                          # Common configuration
├── README_4PHASE.md                     # This file
├── DEPRECATION_NOTICE.md                # Migration guide from 8-section to 4-phase
│
├── generate_all.py                      # Master script (4-phase mode)
│
├── generate_01_introduction.py          # Phase 1 (起 - 정운)
├── generate_02_theory_conceptual.py     # Phase 2 (承 - 권준)
├── generate_03_empirics.py              # Phase 3 (轉 - 김완)
├── generate_04_discussion.py            # Phase 4 (結 - 어영담)
│
├── generate_07_poster.py                # Supplementary: Visual poster
├── generate_08_industry_comparison.py   # Supplementary: Industry analysis
│
├── parallel_generator.py                # 8-agent parallel execution
├── parallel_test_guide.md               # Parallel testing guide
├── TESTING_GUIDE.md                     # Comprehensive testing guide
│
├── output/                              # Generated markdown files
│   ├── 01_Introduction.md
│   ├── 02_Theory_Conceptual.md
│   ├── 03_Empirics_Results.md
│   ├── 04_Discussion_Conclusion.md
│   ├── 07_Poster.svg
│   ├── 07_Poster.md
│   ├── 08_IndustryComparison.md
│   └── spec_curve_analysis.png
│
└── [DEPRECATED]                         # Legacy 8-section files (kept for reference)
    ├── generate_01_intro.py             # → Replaced by generate_01_introduction.py
    ├── generate_02_litreview.py         # → Merged into generate_02_theory_conceptual.py
    ├── generate_03_conceptual.py        # → Merged into generate_02_theory_conceptual.py
    ├── generate_04_method.py            # → Merged into generate_03_empirics.py
    ├── generate_05_results.py           # → Merged into generate_03_empirics.py
    └── generate_06_discussion.py        # → Enhanced as generate_04_discussion.py
```
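
The legacy-to-current mapping in the listing above can also be written as a lookup table, which may help when updating old invocations. The dictionary entries are copied from the `[DEPRECATED]` entries; the names `LEGACY_TO_CURRENT` and `resolve_script` are hypothetical and do not exist in the repository.

```python
# Legacy script -> replacement, copied from the [DEPRECATED] listing above.
LEGACY_TO_CURRENT = {
    "generate_01_intro.py": "generate_01_introduction.py",
    "generate_02_litreview.py": "generate_02_theory_conceptual.py",
    "generate_03_conceptual.py": "generate_02_theory_conceptual.py",
    "generate_04_method.py": "generate_03_empirics.py",
    "generate_05_results.py": "generate_03_empirics.py",
    "generate_06_discussion.py": "generate_04_discussion.py",
}


def resolve_script(name: str) -> str:
    """Return the 4-phase script that supersedes `name`, or `name` itself if current."""
    return LEGACY_TO_CURRENT.get(name, name)
```

For example, `resolve_script("generate_05_results.py")` yields `generate_03_empirics.py`, while a current script name is returned unchanged.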
---
## Data Dependencies
### Required for All Phases
```
outputs/all/models/
├── h1_coefficients.csv          # H1: Early Funding ~ Vagueness (OLS)
└── h2_main_coefficients.csv     # H2: Growth ~ Vagueness × Hardware (Logit)
```
### Optional (for Table 1 in Phase 2)
```
data/processed/
└── analysis_panel.csv           # For descriptive statistics
```
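
Before running the generators, it can help to confirm that these inputs exist. The following is a minimal pre-flight sketch, assuming the paths above are relative to the repository root; `check_inputs` and the two list names are hypothetical helpers, not part of the pipeline.

```python
from pathlib import Path
from typing import Dict, List

# Paths as listed in this README, assumed relative to the repository root.
REQUIRED_INPUTS = [
    "outputs/all/models/h1_coefficients.csv",
    "outputs/all/models/h2_main_coefficients.csv",
]
OPTIONAL_INPUTS = ["data/processed/analysis_panel.csv"]


def check_inputs(project_root: Path) -> Dict[str, List[str]]:
    """Report which required and optional input files are missing under project_root."""

    def missing(paths: List[str]) -> List[str]:
        return [p for p in paths if not (project_root / p).is_file()]

    return {
        "missing_required": missing(REQUIRED_INPUTS),
        "missing_optional": missing(OPTIONAL_INPUTS),
    }
```

A caller might abort when `missing_required` is non-empty and only warn when `missing_optional` is, since the optional file is described here as needed only for Table 1.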
### Generated Outputs
```
src/scripts/paper_generation/output/
├── 01_Introduction.md           # Phase 1 output (~4KB)
├── 02_Theory_Conceptual.md      # Phase 2 output (~15KB)
├── 03_Empirics_Results.md       # Phase 3 output (~20KB)
├── 04_Discussion_Conclusion.md  # Phase 4 output (~15KB)
└── spec_curve_analysis.png      # Figure from Phase 3 (~360KB)
```
---
## Testing
### Test All 4 Phases
```bash
cd src/scripts/paper_generation
python generate_all.py
```
Expected output:
```
✅ Successfully generated: 4/4 phases
Generated files:
✓ 01_Introduction.md
✓ 02_Theory_Conceptual.md
✓ 03_Empirics_Results.md
✓ 04_Discussion_Conclusion.md
```
### Test Individual Phase
```bash
python generate_01_introduction.py
```
Expected output:
```
======================================================================
PHASE 1: 起 - Introduction
Commander: 정운 (The Door Opener)
======================================================================
✅ Generated: output/01_Introduction.md
정운 says: 'The door is open. 권준, build the structure!'
```
### See Also
- `TESTING_GUIDE.md`: Comprehensive testing procedures
- `parallel_test_guide.md`: 8-agent parallel execution guide
---
## Design Philosophy
### Why 4 Phases?
The traditional academic paper structure (Intro, Lit Review, Methods, Results, Discussion) is **fragmented** and doesn't align with narrative flow. The 4-phase 기승전결 structure offers:
1. **Clearer narrative arc**: Setup → Development → Turn → Resolution mirrors natural storytelling
2. **Better modularity**: Each phase is self-contained with clear responsibilities
3. **Commander ownership**: Each phase has a designated leader who "owns" that narrative role
4. **Reduced redundancy**: Literature + Conceptual merged; Methods + Results merged
5. **Easier maintenance**: 4 core files instead of 6
### 기승전결 (起承轉結) Explained
- **起 (Setup)**: Introduce the problem, create intrigue
- **承 (Development)**: Build the theoretical structure and framework
- **轉 (Turn)**: Present the critical evidence that "turns" theory into proof
- **結 (Resolution)**: Synthesize findings into wisdom and close the narrative
This structure has been used in Korean poetry, prose, and military strategy for centuries. The 전라좌수군 (Jeonla Naval Fleet) successfully defended Korea using this strategic philosophy during the Imjin War (1592-1598).
---
## Next Steps
1. **Generate paper**: Run `python generate_all.py`
2. **Review outputs**: Check the `output/` directory for markdown files
3. **Expand sections**: Use the META_PROMPT in each script's source code to expand the text with an LLM
4. **Visual summary**: Open `output/07_Poster.svg` in a browser for the 4-phase visualization
5. **Integrate**: Copy the markdown content into a LaTeX template or Word document
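
For the integration step, one simple option is to concatenate the four phase files into a single manuscript before converting it. The sketch below assumes the output file names listed in this README; `combine_phases` and `build_manuscript` are hypothetical helpers, and the target file name is up to the caller.

```python
from pathlib import Path
from typing import Iterable

# Phase outputs in reading order, as named in this README.
PHASE_FILES = [
    "01_Introduction.md",
    "02_Theory_Conceptual.md",
    "03_Empirics_Results.md",
    "04_Discussion_Conclusion.md",
]


def combine_phases(texts: Iterable[str]) -> str:
    """Join phase texts in order, separated by one blank line, ending with a newline."""
    return "\n\n".join(t.strip() for t in texts) + "\n"


def build_manuscript(output_dir: Path, target: Path) -> None:
    """Read the four phase files from output_dir and write them to target as one file."""
    texts = [(output_dir / name).read_text(encoding="utf-8") for name in PHASE_FILES]
    target.write_text(combine_phases(texts), encoding="utf-8")
```

For example, `build_manuscript(Path("output"), Path("output/manuscript.md"))` would produce one file that can then be pasted or converted into the final template.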
---
## Additional Resources
- **Migration Guide**: See `DEPRECATION_NOTICE.md` for transitioning from the old 8-section structure
- **Testing Guide**: See `TESTING_GUIDE.md` for comprehensive testing procedures
- **Parallel Execution**: See `parallel_test_guide.md` for running 8 agents in parallel
- **Legacy Documentation**: See original `README.md` for 8-section structure (deprecated)
---
*The 전라좌수군 (Jeonla Naval Fleet) awaits your command.*
**기승전결 (起承轉結): From Setup to Resolution**
정운 → 권준 → 김완 → 어영담