# Paper Auto-Generation Pipeline Guide

## 🎯 Goal

**Input**: data + hypotheses
**Output**: a finished paper (PDF) made up of 32 paragraphs plus tables and figures

---

## 🏗️ Architecture

```
Raw Data (.dat files)
    ↓
[1] Data Processing (features.py)
    ↓ features_engineered.parquet
[2] Statistical Analysis (models.py)
    ↓ H1/H2 results
[3] Table Generation (scripts/generate_paper_tables.py)
    ↓ table1_h1.tex, table2_h2.tex
[4] Figure Generation (plotting.py)
    ↓ fig2_*.pdf, fig3_*.pdf
[5] Results Section (scripts/generate_paper_results_section.py)
    ↓ results_auto.tex (Module #23-27, fully automated)
[6] LaTeX Compilation (pdflatex)
    ↓
main.pdf (final paper)
```

---

## 🚀 Quick Start (5 minutes)

### **Run the whole pipeline in one command**

```bash
# 1. Run the full pipeline (data → analysis → paper)
make all

# Result:
# - paper/output/main.pdf   ← final paper
# - paper/tables/*.tex      ← 2 tables
# - paper/figures/*.pdf     ← 2-3 figures
# - paper/results_auto.tex  ← auto-generated Results section
```

### **Faster path when the data is already processed**

```bash
# Skip data processing and run the analysis only
make quick
```

---

## 📋 Step-by-Step Execution

### **Step 1: Data processing** (Module #14-16)

```bash
make data

# or directly:
python -m src.cli load-data
python -m src.cli engineer-features

# Output: data/processed/features_engineered.parquet
```

### **Step 2: Statistical analysis** (Module #23-27)

```bash
make analysis

# or directly:
python scripts/generate_paper_results_section.py \
    --data data/processed/features_engineered.parquet \
    --output paper/

# Output:
# - paper/results_auto.tex     (fully automated Results section)
# - paper/results_values.json  (all statistical values)
```

### **Step 3: Table generation**

```bash
make tables

# or directly:
python scripts/generate_paper_tables.py \
    --data data/processed/features_engineered.parquet \
    --output paper/tables/

# Output:
# - paper/tables/table1_h1.tex  (H1 regression table)
# - paper/tables/table2_h2.tex  (H2 logit table)
```

### **Step 4: Figure generation**

```bash
make figures

# or directly:
python -m src.cli generate-plots --dataset all --output paper/figures/

# Output:
# - paper/figures/fig2_early_funding.pdf
# - paper/figures/fig3_later_success.pdf
```

### **Step 5: Paper compilation**

```bash
make paper

# or directly:
cd paper/output
pdflatex main.tex
bibtex main
pdflatex main.tex
pdflatex main.tex

# Output: paper/output/main.pdf
```

---

## 📝 Automation Levels

### **Tier 1: Fully automated (Results section)**

**Module #23-27**: the Results section is **100% auto-generated**.

```latex
% Add to paper/main.tex
\input{paper/results_auto.tex}  % fully automated!
```

**What it includes:**
- Paragraph 23: interpretation of the H1 result (coefficients and p-values inserted automatically)
- Paragraph 24: interpretation of the H2 result (main effect + interaction)
- Table 1: H1 regression table (LaTeX)
- Table 2: H2 logit table (LaTeX)
- Figure 2-3: reference paths generated automatically

**Advantages:**
- When the data changes, `make analysis` updates the Results section automatically
- Coefficients cannot be mistyped, because they are extracted directly from the code
- The text changes with the p-value ("significant" vs "not significant")

### **Tier 2: Semi-automated (Methodology section)**

**Module #14-22**: descriptive statistics are filled in automatically.

```latex
% paper/templates/methodology.tex.j2
\subsection{Sample Construction}

Our final sample comprises \VAR{descriptive.n_total} ventures,
with \VAR{descriptive.n_software} (\VAR{descriptive.pct_software}\%)
software ventures and \VAR{descriptive.n_hardware} hardware ventures.

The average vagueness score is \VAR{descriptive.vagueness_mean}
(SD = \VAR{descriptive.vagueness_std}).
```

**Usage:**

```bash
python scripts/generate_paper_full.py
# → paper/output/methodology.tex (values inserted automatically)
```

### **Tier 3: Manual (everything else)**

**Module #1-13, #28-32**: must be written by hand.

- Introduction (#1-7): needs storytelling
- Literature (#8-10): needs a synthesis of prior work
- Theory (#11-13): needs theory development
- Discussion (#28-32): needs interpretation and implications

**However**, numbers in these sections can still be referenced automatically:

```latex
% paper/main.tex
% Reference a Results value from the Introduction.
% (\input on the JSON file would paste raw JSON, so use a placeholder
% that the template step fills from paper/results_values.json.)
Our analysis of \VAR{n_total} companies reveals...
```

---

## 🔄 Typical Workflows

### **While writing the paper**

```bash
# 1. When the data is updated
make data

# 2. Regenerate the Results section
make results-only

# 3. Check the paper
open paper/output/main.pdf
```

### **Before submission**

```bash
# 1. Run the full pipeline from scratch
make clean-all
make all

# 2. Run the tests (validates the values in the paper)
make test

# 3. Final check
make validate
```

### **After reviewer feedback**

```bash
# 1. Edit the code (e.g. change the H1 formula)
vi src/models.py

# 2. Regenerate Results (data unchanged)
make quick

# 3. Verify with the tests
make test

# 4. Commit once everything passes
git add . && git commit -m "Update H1 specification"
```

---

## 📊 File Structure

```
empirics_ent_strat_ops/
├── data/
│   ├── raw/*.dat                           # raw data
│   └── processed/
│       └── features_engineered.parquet     # processed data
├── src/
│   ├── models.py                           # H1/H2 functions
│   ├── features.py                         # data processing
│   └── plotting.py                         # figure generation
├── scripts/
│   ├── generate_paper_results_section.py   # auto-generates Results
│   ├── generate_paper_tables.py            # generates tables
│   └── generate_paper_full.py              # generates the full paper
├── paper/
│   ├── main.tex                            # main paper file (written by hand)
│   ├── results_auto.tex                    # auto-generated Results ✓
│   ├── results_values.json                 # all statistical values
│   ├── tables/
│   │   ├── table1_h1.tex                   # auto-generated ✓
│   │   └── table2_h2.tex                   # auto-generated ✓
│   ├── figures/
│   │   ├── fig2_early_funding.pdf          # auto-generated ✓
│   │   └── fig3_later_success.pdf          # auto-generated ✓
│   ├── templates/                          # Jinja2 templates (optional)
│   │   ├── main.tex.j2
│   │   └── results.tex.j2
│   └── output/
│       └── main.pdf                        # final paper PDF ✓
├── test/
│   └── integration/
│       └── test_paper_results.py           # validates paper values
├── Makefile                                # pipeline automation
└── README.md
```

---

## 🎨 Example main.tex Structure

```latex
\documentclass{article}
\begin{document}

% ============================================
% Tier 3: written by hand
% ============================================
\section{Introduction}
% Paragraph 1-7: write by hand
In 2008, Tesla Motors approached investors...

\section{Literature Review}
% Paragraph 8-10: write by hand
The information economics tradition...

\section{Theory}
% Paragraph 11-13: write by hand
Information and Real option value...

% ============================================
% Tier 2: semi-automated (template + values)
% ============================================
\section{Empirical Methodology}
% Paragraph 14-22: use the template (optional)
% \input{paper/methodology_auto.tex}

% or write by hand and reference values
Our final sample comprises \VAR{n_total} ventures...

% ============================================
% Tier 1: fully automated ✓
% ============================================
\section{Results}
\input{paper/results_auto.tex}  % Module #23-27, fully automated!

% ============================================
% Tier 3: written by hand
% ============================================
\section{Discussion}
% Paragraph 28-32: write by hand
Our findings reconcile...

\bibliographystyle{plainnat}
\bibliography{references}
\end{document}
```

---

## ⚙️ Advanced Features

### **Automatic rebuild (file change detection)**

```bash
# Recompile the paper automatically whenever a file changes
make watch

# or (requires entr):
ls paper/*.tex paper/tables/*.tex | entr make paper
```

### **CI/CD integration**

```yaml
# .github/workflows/paper.yml
# Note: `make all` needs the Python dependencies and a LaTeX toolchain on the runner.
name: Build Paper
on: [push]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make all
      - uses: actions/upload-artifact@v4
        with:
          name: paper
          path: paper/output/main.pdf
```

### **Version control**

```bash
# Save a snapshot for each version of the paper
make all
mkdir -p paper/versions
cp paper/output/main.pdf paper/versions/draft_v1_$(date +%Y%m%d).pdf

# Commit to Git (excluding the PDF)
git add paper/results_auto.tex paper/tables/*.tex
git commit -m "Update Results section - H2 interaction now significant"
```

---

## 📈 Outcomes

### **Manual work saved**

**Before (manual)**:
- Writing the Results section: 2-3 hours
- Writing Table 1-2 in LaTeX: 1-2 hours
- Risk of a mistyped coefficient: high
- When the data changes: start over from scratch

**After (automated)**:
- Results section: 2 minutes (`make analysis`)
- Tables: 1 minute (`make tables`)
- Coefficient transcription errors: none, since values are read straight from the model output
- When the data changes: `make quick` → 3 minutes

### **Reproducibility**

```bash
# When a reviewer asks to reproduce the results:
git clone https://github.com/user/paper.git
cd paper
make all

# 5 minutes later → the same paper PDF is generated ✓
```

---

## ❓ FAQ

**Q: Can all 32 paragraphs be generated automatically?**
A: No. **Only Results (Module #23-27) is 100% automated.** Introduction and Discussion must be written by hand because they need storytelling and interpretation.

**Q: What do I do when the data changes?**
A:
```bash
make data   # reprocess the data
make quick  # regenerate the analysis and the paper
```
This takes 3-5 minutes.

**Q: How do I create a paper template?**
A: Take an existing LaTeX paper and replace only the numbers with `\VAR{variable_name}`.
Example: `450 ventures` → `\VAR{n_total} ventures`

**Q: How do I automate figures?**
A: Add a figure-generating function to `src/plotting.py`:
```python
import matplotlib.pyplot as plt

def generate_figure2(df):
    # ... plotting code
    plt.savefig('paper/figures/fig2.pdf')
    plt.close()  # release the figure so repeated calls do not accumulate
```

**Q: Can the paper be built automatically in CI/CD?**
A: Yes. Run `make all` in GitHub Actions and the PDF is generated automatically.

---

## 🎓 Best Practices

### **1. Automate Results fully, write the rest by hand**

```latex
% Good example:
\section{Results}
\input{paper/results_auto.tex}  % automated ✓

\section{Discussion}
% Write by hand (do not try to automate)
```

### **2. Reference only the key values from the JSON**

```latex
% In main.tex
Our analysis of \VAR{n_total} companies reveals that
vagueness reduces early funding by \VAR{h1_coef_abs}\%.
% Write the rest of the text by hand
```

### **3. Version-control the sources only, not the PDFs**

```bash
# .gitignore
paper/output/*.pdf
paper/output/*.aux
```

### **4. Validate the paper's values with tests**

```bash
# Always run before submission
make test
make validate
```

---

## 🚀 Next Steps

1. **Try it now**:
   ```bash
   make all
   ```

2. **Inspect the Results section**:
   ```bash
   cat paper/results_auto.tex
   ```

3. **Integrate it into the paper**:
   ```latex
   % Add to paper/main.tex
   \input{paper/results_auto.tex}
   ```

4. **Compile and check**:
   ```bash
   make paper
   open paper/output/main.pdf
   ```

---

**Note**: read this guide together with `docs/PAPER_INTEGRATION_STRATEGY.md`.
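
### **Appendix: a sketch of placeholder substitution**

The two mechanisms this guide relies on, filling `\VAR{...}` placeholders from stored values (Tier 2) and switching the wording by p-value (Tier 1), can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the project's actual implementation: the repository's scripts may well use Jinja2 with custom delimiters instead. The function names `lookup`, `render`, and `significance_phrase`, the 0.05 threshold, and the sample values are all hypothetical.

```python
import json
import re

# Matches \VAR{dotted.name}; the delimiter mirrors the templates shown in this guide.
PLACEHOLDER = re.compile(r"\\VAR\{([A-Za-z0-9_.]+)\}")


def lookup(values, dotted_key):
    """Resolve 'descriptive.n_total' against nested dicts; raises KeyError if missing."""
    node = values
    for part in dotted_key.split("."):
        node = node[part]
    return node


def render(template, values):
    """Replace every \\VAR{...} placeholder with str() of its value from `values`."""
    return PLACEHOLDER.sub(lambda m: str(lookup(values, m.group(1))), template)


def significance_phrase(p_value, alpha=0.05):
    """Choose the wording for a result; alpha=0.05 is an assumed threshold."""
    return "significant" if p_value < alpha else "not significant"


if __name__ == "__main__":
    # Hypothetical values standing in for paper/results_values.json.
    values = json.loads(
        '{"descriptive": {"n_total": 450}, "h1": {"coef": -0.12, "p": 0.03}}'
    )
    values["h1"]["verdict"] = significance_phrase(values["h1"]["p"])
    template = (
        r"Our final sample comprises \VAR{descriptive.n_total} ventures. "
        r"The H1 coefficient (\VAR{h1.coef}) is \VAR{h1.verdict}."
    )
    print(render(template, values))
```

Raising `KeyError` on a missing placeholder is a deliberate choice in this sketch: a build that fails loudly is preferable to a PDF that silently contains an unfilled `\VAR{...}`.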