# TASK: Create Research Index System for Text Corpus

**Working Directory:** [specify path]

Analyze all .txt files in this directory and create a two-tier research indexing system.

---

## FILE 1: research-index.json (Machine-Readable Metadata)

Create a JSON file with this structure:

{
  "research_base": "[folder name]",
  "created": "[YYYY-MM-DD]",
  "total_transcripts": [count],
  "transcripts": [
    {
      "id": "[sequential: doc-001, doc-002, etc.]",
      "filename": "[original filename]",
      "path": "[relative path from working directory]",
      "title": "[descriptive title extracted or inferred]",
      "source_url": "[if mentioned in document]",
      "duration_mins": [if applicable],
      "topics": ["[keyword1]", "[keyword2]", "[keyword3]"],
      "key_insights": [
        "[Counter-intuitive finding or key takeaway]",
        "[Specific number, statistic, or concrete detail]",
        "[Actionable technique or method]",
        "[5-7 insights per document]"
      ],
      "relevant_to": ["[related_topic1]", "[related_topic2]"],
      "technical_depth": "[beginner|intermediate|advanced]"
    }
  ]
}
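
The mechanical fields in this structure (`id`, `filename`, `path`, `total_transcripts`, `created`) can be filled in before any document is read. The following is a minimal Python sketch, not a required implementation. It assumes the corpus may contain subfolders (hence the recursive search) and that IDs follow sorted path order. `build_skeleton` is a hypothetical helper name, and every field that needs actual analysis is left empty.

```python
import tempfile
from datetime import date
from pathlib import Path


def build_skeleton(workdir: Path) -> dict:
    """Fill in the index fields that need no reading of the documents."""
    root = workdir.resolve()
    # Assumption: search subfolders too, and number files in sorted path order.
    files = sorted(root.rglob("*.txt"))
    transcripts = []
    for n, path in enumerate(files, start=1):
        transcripts.append({
            "id": f"doc-{n:03d}",
            "filename": path.name,
            "path": path.relative_to(root).as_posix(),
            # Placeholders: these require analysis of the document text.
            "title": None,
            "source_url": None,
            "duration_mins": None,
            "topics": [],
            "key_insights": [],
            "relevant_to": [],
            "technical_depth": None,
        })
    return {
        "research_base": root.name,
        "created": date.today().isoformat(),
        "total_transcripts": len(transcripts),
        "transcripts": transcripts,
    }


# Demo on a throwaway directory with two made-up files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "b-talk.txt").write_text("second", encoding="utf-8")
    Path(tmp, "a-talk.txt").write_text("first", encoding="utf-8")
    demo = build_skeleton(Path(tmp))

print([(t["id"], t["path"]) for t in demo["transcripts"]])
```

Sorting before numbering keeps IDs stable across reruns as long as the file set does not change.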

**Key requirements:**
- Extract 5-7 key_insights per document (most valuable findings)
- Write topics as searchable keywords in lowercase_underscore format
- relevant_to tags create a cross-reference network between documents
- technical_depth helps filter by audience level
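
These requirements are mechanical enough to check once the index is written. The sketch below is one possible check, assuming each entry is a plain dict loaded from `research-index.json`. `check_entry` and the two sample entries are hypothetical and exist only for illustration.

```python
import re

# lowercase_underscore: lowercase letters/digits joined by single underscores
SNAKE_CASE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")
ALLOWED_DEPTHS = {"beginner", "intermediate", "advanced"}


def check_entry(entry: dict) -> list[str]:
    """Return a list of human-readable problems; empty means the entry passes."""
    problems = []
    doc_id = entry.get("id", "<no id>")

    count = len(entry.get("key_insights", []))
    if not 5 <= count <= 7:
        problems.append(f"{doc_id}: {count} key_insights (expected 5-7)")

    for tag in entry.get("topics", []):
        if not SNAKE_CASE.fullmatch(tag):
            problems.append(f"{doc_id}: topic {tag!r} is not lowercase_underscore")

    depth = entry.get("technical_depth")
    if depth not in ALLOWED_DEPTHS:
        problems.append(f"{doc_id}: unknown technical_depth {depth!r}")

    return problems


# Made-up sample entries, trimmed to the fields being checked.
good = {
    "id": "doc-001",
    "topics": ["trail_braking"],
    "key_insights": ["a", "b", "c", "d", "e"],
    "technical_depth": "advanced",
}
bad = {
    "id": "doc-002",
    "topics": ["Trail Braking"],
    "key_insights": ["a"],
    "technical_depth": "expert",
}

for problem in check_entry(bad):
    print(problem)
```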

---

## FILE 2: research-guide.md (Human Strategy Guide)

Create a markdown file with these sections:

### 1. Quick Topic Index
Group documents by subject matter clusters with IDs for quick lookup.

Example:
**Core Techniques (3 documents)**
- **doc-001** - [Title] ([duration]) - [one-line description]
- **doc-005** - [Title] ([duration]) - [one-line description]

### 2. Content Integration Notes
How to use this research base in practice:
- "For articles about [topic]: Start with doc-X, doc-Y"
- "Best documents for [voice patterns]: doc-A, doc-B"
- "Counter-intuitive insights: Check doc-Z"

### 3. Topic Clusters
Organize documents by use case:
- **Fundamentals Cluster** - Core concepts every article needs
- **Advanced Cluster** - Deep-dive technical content
- **Practical Examples Cluster** - Real-world applications

### 4. Search Strategy Examples
Concrete examples of how to query the index:
"For article on [specific topic]" → Load: doc-X, doc-Y, doc-Z
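
A query like the one above can be reduced to a tag lookup over the JSON index. The sketch below assumes a document counts as a hit when the tag appears in either its `topics` or its `relevant_to` list. `docs_for_topic` and the sample entries are hypothetical.

```python
def docs_for_topic(index: dict, topic: str) -> list[str]:
    """Return IDs of documents tagged with `topic`, in index order."""
    hits = []
    for entry in index["transcripts"]:
        if topic in entry["topics"] or topic in entry["relevant_to"]:
            hits.append(entry["id"])
    return hits


# Made-up index, trimmed to the fields used by the lookup.
sample_index = {
    "transcripts": [
        {"id": "doc-001", "topics": ["trail_braking", "corner_entry"],
         "relevant_to": ["tire_grip"]},
        {"id": "doc-002", "topics": ["token_optimization"],
         "relevant_to": ["prompt_design"]},
        {"id": "doc-003", "topics": ["tire_grip"],
         "relevant_to": ["trail_braking"]},
    ]
}

print(docs_for_topic(sample_index, "trail_braking"))
```

Only the documents returned by the lookup then need to be loaded in full.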

### 5. Key Counter-Intuitive Insights
Pre-extracted gold nuggets from key_insights with document IDs.
These are engagement hooks for articles.

### 6. Frequently Accessed Combinations
Common document sets that work well together.

---

## ANALYSIS GUIDELINES

**When extracting key_insights:**
✅ DO include: Counter-intuitive findings, specific numbers, actionable techniques
✅ DO extract: Unique analogies, real-world examples, specific recommendations
❌ DON'T include: Generic statements, obvious facts, vague descriptions

**When assigning topic tags:**
✅ DO use: Specific, searchable terms (trail_braking, token_optimization)
❌ DON'T use: Too broad (driving, writing) or too narrow (turn_1_spa)

**When setting relevant_to relationships:**
Think: "If someone is researching [this topic], which other documents complement it?"
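
One way to sanity-check these relationships is to resolve them back to documents. The sketch below rests on an assumption: `relevant_to` holds topic tags, and a document complements another when its `topics` overlap with that document's `relevant_to`. `complements` and the sample entries are hypothetical.

```python
def complements(index: dict, doc_id: str) -> list[str]:
    """Return IDs of other documents whose topics match doc_id's relevant_to tags."""
    by_id = {entry["id"]: entry for entry in index["transcripts"]}
    wanted = set(by_id[doc_id]["relevant_to"])
    return [
        entry["id"]
        for entry in index["transcripts"]
        if entry["id"] != doc_id and wanted & set(entry["topics"])
    ]


# Made-up index, trimmed to the fields used here.
sample_index = {
    "transcripts": [
        {"id": "doc-001", "topics": ["trail_braking"], "relevant_to": ["tire_grip"]},
        {"id": "doc-002", "topics": ["token_optimization"], "relevant_to": ["prompt_design"]},
        {"id": "doc-003", "topics": ["tire_grip"], "relevant_to": ["trail_braking"]},
    ]
}

print(complements(sample_index, "doc-001"))
```

An empty result for a document suggests its `relevant_to` tags do not match any topic in the corpus and may need revisiting.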

**When rating technical_depth:**
- Beginner: Assumes no prior knowledge, focuses on fundamentals
- Intermediate: Assumes basics, adds refinement and troubleshooting
- Advanced: Deep theory, counter-intuitive concepts, explanations of why techniques work

---

## OUTPUT LOCATIONS

Save files as:
- `./research-index.json`
- `./research-guide.md`

Create a third file `./INDEX-SUMMARY.md` documenting:
- Statistics (total documents, topic distribution, depth breakdown)
- How to use both files
- Example query workflows
- Token efficiency analysis (if applicable)
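
The statistics for `INDEX-SUMMARY.md` can be computed directly from the JSON index. The sketch below uses `collections.Counter` and covers only the three counts named above. `summarize` and the sample entries are hypothetical.

```python
from collections import Counter


def summarize(index: dict) -> dict:
    """Compute document count, topic distribution, and depth breakdown."""
    entries = index["transcripts"]
    return {
        "total_documents": len(entries),
        "topic_distribution": Counter(
            topic for entry in entries for topic in entry["topics"]
        ),
        "depth_breakdown": Counter(entry["technical_depth"] for entry in entries),
    }


# Made-up index, trimmed to the fields used here.
sample_index = {
    "transcripts": [
        {"id": "doc-001", "topics": ["trail_braking", "tire_grip"],
         "technical_depth": "advanced"},
        {"id": "doc-002", "topics": ["token_optimization"],
         "technical_depth": "beginner"},
        {"id": "doc-003", "topics": ["tire_grip"],
         "technical_depth": "advanced"},
    ]
}

stats = summarize(sample_index)
for topic, count in stats["topic_distribution"].most_common():
    print(f"{topic}: {count}")
```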