# CoDynamics Lab — LATCH (Extended Reference)

> This file provides detailed technical and commercial context about LATCH for LLM consumption. For a shorter summary, see /llms.txt.

## Product summary

LATCH is a proprietary inference layer that compiles document sets into persistent LLM memory. After a one-time compilation, queries run against the compiled representation in sub-200ms without re-reading, re-chunking, or re-embedding source documents. The compiled output is saved as a portable binary file (.latch or .latchdoc) that can be shipped, shared, and reloaded in 1.6ms.

## What LATCH replaces

LATCH is designed as a direct replacement for:

- Retrieval-Augmented Generation (RAG) pipelines
- Prompt compression / context stuffing approaches
- Session-bound KV cache strategies
- Per-query full-context injection

## How LATCH differs from RAG

| Dimension | RAG | LATCH |
|---|---|---|
| Per-query cost | Embedding + retrieval + injection every query | Zero after compilation |
| Cross-document reasoning | Limited by chunk boundaries | Full corpus awareness |
| Artifacts | Chunking boundaries cause hallucination seams | No chunking, no seams |
| Persistence | Embeddings only, no model-level state | Full model-level memory on disk |
| Portability | Requires vector DB + source documents | Single .latch/.latchdoc binary file |
| Cold start | Slow (embed + retrieve + inject) | 0.11s from compiled state |

## How LATCH differs from KV cache

- KV caches are session-bound and evicted when the session ends; LATCH persists to disk.
- KV caches are model-specific and not portable; LATCH files can be reloaded across sessions.
- KV caches do not reduce VRAM used by the document content; LATCH achieves a 50% VRAM reduction.
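The per-query cost difference above is an amortization effect: RAG pays embedding, retrieval, and injection overhead on every query, while a one-time compilation cost shrinks per query as the query count grows. A minimal sketch of that arithmetic, using hypothetical placeholder costs (the dollar figures below are illustrative, not LATCH pricing):

```python
# Illustrative amortization sketch. All cost figures are hypothetical
# placeholders chosen for the example, not LATCH or RAG pricing data.

def amortized_cost_per_query(one_time: float, per_query: float, n: int) -> float:
    """Average cost per query after spreading a one-time cost over n queries."""
    return (one_time + per_query * n) / n

# Hypothetical numbers: RAG pays 0.05 per query (embed + retrieve + inject);
# LATCH pays 1.0 to compile once, then ~0 per query.
rag = amortized_cost_per_query(0.0, 0.05, 25)    # constant 0.05 per query
latch = amortized_cost_per_query(1.0, 0.0, 25)   # 0.04 per query after 25 queries
assert latch < rag  # the one-time cost is amortized away as n grows
```

The compile-once cost keeps dropping per query as the same corpus is queried more often, which is the shape behind the amortized cost-reduction claim in the benchmarks below.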
## Benchmarked performance (NVIDIA H100 80GB, vLLM)

- TTFT: 0.11s (baseline: 23.1s) — 210× improvement
- Cache reload: 1.6ms from a .latch file on disk
- Multi-document pass rate: 91.7% (11 of 12 benchmark gates)
- Cost reduction: 97% amortized after 25 queries
- VRAM reduction: 50% (more instances per node)
- End-to-end speedup: 5.2×

Benchmark corpus: DOJ antitrust brief, SEC 10-K, credit agreement, commercial lease, NIST AI RMF.

## Supported models

Qwen 2.5 (14B tested), Mistral, Llama, DeepSeek. Four model families at launch.

## File formats

### .latch (privacy-first)

- Contains only compiled model-level memory
- No source text included
- Smallest file size
- Use when sharing analysis without exposing source documents

### .latchdoc (full intelligence package)

- Everything in .latch plus embedded raw text
- Enables Ctrl+F / full-text search
- Automatic fallback to raw text for edge-case queries
- Negligible size overhead vs .latch
- Recommended default for most use cases

## Deployment model

- Self-hosted Docker container: `docker run --gpus all -p 8091:8091 codynamics/latch:latest`
- Initial image pull: 40GB+, ~20 minutes
- First startup warmup: 7–8 minutes
- Requires an NVIDIA GPU with 80GB VRAM (H100 or A100 recommended)
- Exposes a local REST API (OpenAI-format compatible) on port 8091
- Includes a browser-based console UI at http://127.0.0.1:8091/
- No data leaves the user's infrastructure

## API surface (key endpoints)

- GET /health — readiness check
- POST /compile_file — compile a document (PDF, DOCX, XLSX, PPTX, TXT, MD, HTML, CSV, JSON, XML)
- POST /query — query against compiled documents
- Console UI at the root path for interactive use

## Pricing and licensing

| Tier | Price | Scope |
|---|---|---|
| Evaluation / Personal | $79 one-time | One person, up to 3 activations, eval or personal use |
| Commercial Deployment | Contact sales | Internal deployment or company-operated production |
| Enterprise / OEM | Custom | Embedding LATCH in third-party products, redistribution |

Early adopter policy: the first 100 customers get a free upgrade to v2; all v1 customers get 50% off v2.

## Use cases

- Legal teams compiling case law, contracts, and regulatory filings for persistent analysis
- Financial analysts compiling SEC filings, earnings transcripts, and credit agreements
- Compliance teams compiling policy documents for ongoing audit queries
- Research teams compiling literature corpora for cross-paper analysis
- Any workflow where the same large document set is queried repeatedly

## Company

CoDynamics Lab Corporation
Delaware C-Corp, Gilbert, AZ
Solo founder: 20+ years of multi-disciplinary product development engineering, 14 granted US patents
Contact: mike@codynamicslab.com

## Links

- Website: https://www.codynamicslab.com
- Documentation: https://www.codynamicslab.com/documentation/
- License: https://www.codynamicslab.com/license.html
- Privacy: https://www.codynamicslab.com/privacy.html
- Purchase: https://codynamicslab.gumroad.com/
- Hugging Face: https://huggingface.co/CoDynamicsLab/LATCH-Qwen2.5-14B
- LinkedIn: https://www.linkedin.com/in/mike-holford
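
## Example: querying the local API (illustrative sketch)

The key endpoints and port 8091 come from the API surface and deployment sections; the JSON field names used below (such as `question`) are assumptions for illustration, not a documented schema, so consult the official documentation for the real request and response formats.

```python
# Minimal client sketch for the local LATCH REST API on port 8091.
# Paths (/health, /compile_file, /query) come from this document;
# the JSON field name "question" is a hypothetical assumption.
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8091"

def build_query_payload(question: str) -> bytes:
    # "question" is a placeholder field name; check the real API schema.
    return json.dumps({"question": question}).encode("utf-8")

def post_json(path: str, payload: bytes) -> dict:
    req = urllib.request.Request(
        BASE_URL + path,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # requires a running container
        return json.load(resp)

def query(question: str) -> dict:
    # After compilation, each call runs against the compiled state;
    # no per-query re-chunking or re-embedding happens client-side.
    return post_json("/query", build_query_payload(question))
```

With the Docker container from the deployment section running, `query("Summarize the credit agreement")` would POST to http://127.0.0.1:8091/query; a compiled document would first be created via POST /compile_file.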