# CoDynamics Lab — LATCH (Extended Reference)

> This file provides detailed technical and commercial context about LATCH for LLM consumption. For a shorter summary, see /llms.txt.

## Product summary

LATCH is a proprietary inference layer that compiles document sets into persistent LLM memory. After a one-time compilation, queries run against the compiled representation in sub-200ms without re-reading, re-chunking, or re-embedding source documents. The compiled output is saved as a portable binary file (.latch or .latchdoc) that can be shipped, shared, and reloaded in 1.6ms.

## What LATCH replaces

LATCH is designed as a direct replacement for:

- Retrieval-Augmented Generation (RAG) pipelines
- Prompt compression / context stuffing approaches
- Session-bound KV cache strategies
- Per-query full-context injection

## How LATCH differs from RAG

| Dimension | RAG | LATCH |
|---|---|---|
| Per-query cost | Embedding + retrieval + injection every query | Zero after compilation |
| Cross-document reasoning | Limited by chunk boundaries | Full corpus awareness |
| Artifacts | Chunking boundaries cause hallucination seams | No chunking, no seams |
| Persistence | Embeddings only, no model-level state | Full model-level memory on disk |
| Portability | Requires vector DB + source documents | Single .latch/.latchdoc binary file |
| Cold start | Slow (embed + retrieve + inject) | 0.11s from compiled state |

## How LATCH differs from KV cache

- KV caches are session-bound and evicted when the session ends; LATCH persists to disk.
- KV caches are model-specific and not portable; LATCH files can be reloaded across sessions.
- KV caches do not reduce VRAM used by the document content; LATCH achieves a 50% VRAM reduction.
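The per-query cost difference above is an amortization effect: RAG pays embedding, retrieval, and injection overhead on every query, while a one-time compilation cost shrinks per query as the query count grows. A minimal sketch of that arithmetic, using hypothetical placeholder costs (the dollar figures below are illustrative, not LATCH pricing):

```python
# Illustrative amortization sketch. All cost figures are hypothetical
# placeholders chosen for the example, not LATCH or RAG pricing data.

def amortized_cost_per_query(one_time: float, per_query: float, n: int) -> float:
    """Average cost per query after spreading a one-time cost over n queries."""
    return (one_time + per_query * n) / n

# Hypothetical numbers: RAG pays 0.05 per query (embed + retrieve + inject);
# LATCH pays 1.0 to compile once, then ~0 per query.
rag = amortized_cost_per_query(0.0, 0.05, 25)    # constant 0.05 per query
latch = amortized_cost_per_query(1.0, 0.0, 25)   # 0.04 per query after 25 queries
assert latch < rag  # the one-time cost is amortized away as n grows
```

The compile-once cost keeps dropping per query as the same corpus is queried more often, which is the shape behind the amortized cost-reduction claim in the benchmarks below.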
## Benchmarked performance (NVIDIA H100 80GB, vLLM)

- TTFT: 0.11s (baseline: 23.1s) — 210× improvement
- Cache reload: 1.6ms from a .latch file on disk
- Multi-document pass rate: 91.7% (11 of 12 benchmark gates)
- Cost reduction: 97% amortized after 25 queries
- VRAM reduction: 50% (more instances per node)
- End-to-end speedup: 5.2×

Benchmark corpus: DOJ antitrust brief, SEC 10-K, credit agreement, commercial lease, NIST AI RMF.

## Supported models

Qwen 2.5 (14B tested), Mistral, Llama, DeepSeek. Four model families at launch.

## File formats

### .latch (privacy-first)

- Contains only compiled model-level memory
- No source text included
- Smallest file size
- Use when sharing analysis without exposing source documents

### .latchdoc (full intelligence package)

- Everything in .latch plus embedded raw text
- Enables Ctrl+F / full-text search
- Automatic fallback to raw text for edge-case queries
- Negligible size overhead vs .latch
- Recommended default for most use cases

## Deployment model

- Self-hosted Docker container: `docker run --gpus all -p 8091:8091 codynamics/latch:latest`
- Initial image pull: 40GB+, ~20 minutes
- First startup warmup: 7–8 minutes
- Requires an NVIDIA GPU with 80GB VRAM (H100 or A100 recommended)
- Exposes a local REST API (OpenAI-format compatible) on port 8091
- Includes a browser-based console UI at http://127.0.0.1:8091/
- No data leaves the user's infrastructure

## API surface (key endpoints)

- GET /health — readiness check
- POST /compile_file — compile a document (PDF, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, JSON, XML)
- POST /query — query against compiled documents
- Console UI at the root path for interactive use

## Pricing and licensing

| Tier | Price | Scope |
|---|---|---|
| Evaluation / Personal | $79 one-time | One person, up to 3 activations, eval or personal use |
| Commercial Deployment | Contact sales | Internal deployment or company-operated production |
| Enterprise / OEM | Custom | Embedding LATCH in third-party products, redistribution |

Early adopter policy: the first 100 customers get a free upgrade to v2; all v1 customers get 50% off v2.

## Use cases

- Legal teams compiling case law, contracts, and regulatory filings for persistent analysis
- Financial analysts compiling SEC filings, earnings transcripts, and credit agreements
- Compliance teams compiling policy documents for ongoing audit queries
- Research teams compiling literature corpora for cross-paper analysis
- Any workflow where the same large document set is queried repeatedly

## Company

CoDynamics Lab Corporation
Delaware C-Corp, Gilbert, AZ
Solo founder: 20+ years of multi-disciplinary product development engineering, 14 granted US patents
Contact: mike@codynamicslab.com

## Links

- Website: https://www.codynamicslab.com
- Documentation: https://www.codynamicslab.com/documentation/
- License: https://www.codynamicslab.com/license.html
- Privacy: https://www.codynamicslab.com/privacy.html
- Purchase: https://codynamicslab.gumroad.com/
- Hugging Face: https://huggingface.co/CoDynamicsLab/LATCH-Qwen2.5-14B
- LinkedIn: https://www.linkedin.com/in/mike-holford
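
## Example: querying the local API (illustrative sketch)

The key endpoints and port 8091 come from the API surface and deployment sections; the JSON field names used below (such as `question`) are assumptions for illustration, not a documented schema, so consult the official documentation for the real request and response formats.

```python
# Minimal client sketch for the local LATCH REST API on port 8091.
# Paths (/health, /compile_file, /query) come from this document;
# the JSON field name "question" is a hypothetical assumption.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8091"

def build_query_payload(question: str) -> bytes:
    # "question" is a placeholder field name; check the real API schema.
    return json.dumps({"question": question}).encode("utf-8")

def post_json(path: str, payload: bytes) -> dict:
    req = urllib.request.Request(
        BASE_URL + path,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # requires a running container
        return json.load(resp)

def query(question: str) -> dict:
    # After compilation, each call runs against the compiled state;
    # no per-query re-chunking or re-embedding happens client-side.
    return post_json("/query", build_query_payload(question))
```

With the Docker container from the deployment section running, `query("Summarize the credit agreement")` would POST to http://127.0.0.1:8091/query; a compiled document would first be created via POST /compile_file.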