# CoDynamics Lab — LATCH

## What is LATCH?

LATCH is a proprietary LLM inference layer that compiles document sets into persistent model-level memory. It is not RAG. It is not prompt compression. It is not a KV cache optimization. It is a new category: compiled document memory at the model level.

## How it works

1. Documents are processed through a proprietary compilation step.
2. A persistent binary representation is saved to disk as a `.latch` or `.latchdoc` file.
3. Subsequent queries run against the compiled memory — no raw document re-processing.
4. Compilation cost is paid once and amortized across all future queries.

## Key performance claims (benchmarked on NVIDIA H100 80GB, vLLM)

- Time-to-first-token: 0.11 s (vs. 23.1 s baseline cold start) — 210× faster
- Cache reload from `.latch` file: 1.6 ms
- Multi-document benchmark pass rate: 91.7% (11/12 gates)
- Cost reduction: 97% amortized after 25 queries
- VRAM reduction: 50%
- End-to-end speedup: 5.2×
- Supported model families: Qwen, Mistral, Llama, DeepSeek

## Portable file formats

- `.latch` — privacy-first binary containing only compiled model memory (no source text)
- `.latchdoc` — full package with embedded raw text for full-text search and quality fallback

## Deployment

- Self-hosted Docker image (40 GB+ initial pull)
- Runs on NVIDIA H100/A100 80GB GPUs
- OpenAI-format API compatible
- Local browser-based console UI included

## Pricing

- Evaluation / Personal: $79 one-time (up to 3 activations, one user)
- Commercial Deployment: contact sales (annual license)
- Enterprise / OEM: custom agreement

## Links

- Website: https://www.codynamicslab.com
- Documentation: https://www.codynamicslab.com/documentation/
- Purchase: https://codynamicslab.gumroad.com/
- Hugging Face: https://huggingface.co/CoDynamicsLab/LATCH-Qwen2.5-14B
- Contact: mike@codynamicslab.com

## Company

CoDynamics Lab Corporation, Delaware C-Corp, Gilbert, AZ.
Built and operated by a solo founder with 20+ years of product development engineering experience and 14 granted US patents.
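The pay-once-then-amortize cost model described under "How it works" can be sketched with simple arithmetic. All dollar figures below are hypothetical placeholders chosen for illustration; they are not LATCH pricing and do not attempt to reproduce the published 97% figure.

```python
# Sketch of cost amortization: pay a one-time compile cost, then serve
# many cheap queries from the compiled memory.
# All dollar figures are hypothetical placeholders, not LATCH pricing.

def amortized_cost_per_query(compile_cost: float,
                             query_cost: float,
                             num_queries: int) -> float:
    """Total spend divided by the number of queries served."""
    return (compile_cost + query_cost * num_queries) / num_queries

# Baseline: every query re-processes the raw documents (hypothetical $0.50/query).
baseline = 0.50
# Compiled: one compile (hypothetical $10.00), then cheap queries ($0.01 each).
for n in (1, 25, 1000):
    cost = amortized_cost_per_query(10.00, 0.01, n)
    print(f"{n:>5} queries: ${cost:.3f}/query vs ${baseline:.2f}/query baseline")
```

The key property is that the compile cost term shrinks toward zero as the query count grows, so the per-query cost approaches the cheap compiled-memory rate.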
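Because the deployment exposes an OpenAI-format API, a standard chat-completions request should work against it. A minimal sketch of building such a request — the base URL and model name below are assumptions (placeholders), so substitute the values from your own deployment and the documentation:

```python
import json
import urllib.request

# Hypothetical endpoint and model name -- substitute your deployment's values.
BASE_URL = "http://localhost:8000/v1"       # assumption: local self-hosted API
MODEL = "CoDynamicsLab/LATCH-Qwen2.5-14B"   # from the Hugging Face link above

def build_chat_request(question: str) -> urllib.request.Request:
    """Build an OpenAI-format /chat/completions request (constructed, not sent)."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize the compiled document set.")
# To send: urllib.request.urlopen(req)  -- requires a running deployment.
```

OpenAI-format compatibility means existing client libraries and tooling can typically be pointed at the deployment by overriding their base URL, with no request-format changes.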