Exercises
Test your understanding of document processing, ETL pipelines, and DeepSeek-OCR. Progress from warm-up to stretch.
Warm-Up
1. Order the ETL Pipeline
Interactive exercise — requires JavaScript to display
2. What Gets Lost in Flattening?
When traditional OCR flattens a document to plain text, which kinds of information are lost? Select all that apply.
Interactive exercise — requires JavaScript to display
Core
3. Complete the Architecture
Interactive exercise — requires JavaScript to display
4. Compression vs Accuracy
Find the compression configuration that achieves >95% accuracy on financial documents.
Interactive parameter tuner — requires JavaScript to display
5. Token Cost Visualizer
Drag the slider to change batch size. Watch how token counts and costs compare across VLM systems in real time.
Interactive parameter tuner — requires JavaScript to display
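If the slider is not available, the same comparison can be approximated by hand. The sketch below assumes cost scales linearly with batch size. The system names, tokens-per-page figures, and prices are hypothetical placeholders, not measurements of any real VLM system, so substitute your own numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VlmProfile:
    """Per-system assumptions; every value here is a hypothetical placeholder."""
    name: str
    tokens_per_page: int           # assumed vision tokens emitted per page
    usd_per_million_tokens: float  # assumed price per one million tokens


def batch_cost(profile: VlmProfile, batch_size: int) -> tuple[int, float]:
    """Return (total tokens, cost in USD) for a batch of `batch_size` pages."""
    tokens = profile.tokens_per_page * batch_size
    cost = tokens / 1_000_000 * profile.usd_per_million_tokens
    return tokens, cost


# Placeholder profiles for illustration only.
PROFILES = [
    VlmProfile("system_a", tokens_per_page=100, usd_per_million_tokens=1.0),
    VlmProfile("system_b", tokens_per_page=1000, usd_per_million_tokens=1.0),
]


def compare(batch_size: int) -> dict[str, tuple[int, float]]:
    """Map each system name to its (tokens, cost) at the given batch size."""
    return {p.name: batch_cost(p, batch_size) for p in PROFILES}


if __name__ == "__main__":
    for size in (1, 10, 100):
        for name, (tokens, cost) in compare(size).items():
            print(f"batch={size:>3} {name}: {tokens} tokens, ${cost:.4f}")
```

Under these assumptions the ratio between two systems stays constant as the batch grows, while the absolute gap in tokens and dollars widens. That widening gap is the effect to look for when moving the slider.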
Stretch
6. Match Tool to Layer
Which tool provides the most leverage at each pipeline layer?
Interactive exercise — requires JavaScript to display
7. Visual Document Router
Configure a document's properties, then click Route to see which OCR system it gets sent to and why. The path animates through the decision tree.
Interactive exercise — requires JavaScript to display