Exercises

Test your understanding of document processing, ETL pipelines, and DeepSeek-OCR. Progress from warm-up to stretch.

Warm-Up

1. Order the ETL Pipeline

2. What Gets Lost in Flattening?

When traditional OCR flattens a document to plain text, which kinds of information are lost? Select all that apply.

Core

3. Complete the Architecture

4. Compression vs Accuracy

Find the configuration that achieves >95% accuracy for financial documents.

5. Token Cost Visualizer

Drag the slider to change batch size. Watch how token counts and costs compare across VLM systems in real time.

Stretch

6. Match Tool to Layer

Which tool provides the most leverage at each pipeline layer?

7. Visual Document Router

Configure a document's properties, then click Route to see which OCR system it gets sent to and why. The path animates through the decision tree.
