Bundles

Compile once, ship the file, query anywhere.

.latch and .latchdoc are the portable bundle formats for LATCH. They let one operator pay the document-processing cost once, then move the finished compiled corpus to another compatible runtime without recompiling, re-extracting, or re-reading the original source set.

Format Thesis

The bundle is the finished product, not a request to do the work again.

PDF made document layout portable. LATCH bundles do the AI-era equivalent for compiled document intelligence. The expensive steps, such as extraction, tokenization, and neural compilation, are paid once by the creator. Every recipient opens the finished result and starts querying immediately.

On the validated path, bundle imports reload in roughly 1.6 to 2 ms. The destination runtime inherits the compiled state instead of rebuilding it.

Formats

.latch

Compact, privacy-first bundle. Carries the compiled latent state but omits extracted source text. Use it when the destination only needs the reasoning surface and should not carry readable documents.

.latchdoc

Full-fidelity bundle. Carries compiled latent state plus extracted source text for full-text lookup and text-based fallback. This is the recommended default for most operator-facing workflows.

Use Cases

Where each bundle format fits best.

M&A / Deal Room

.latchdoc

Compile the diligence corpus once, share the finished file with multiple analysts, and keep exact source language available for memos, committee materials, and fallback review.

Insurance Underwriting

.latch

Move the compiled reasoning surface across underwriting rooms without shipping readable policyholder text, loss runs, or other downstream-sensitive source material.

Regulatory Compliance

.latchdoc

Use when compliance teams need cross-document reasoning plus exact text lookup for audit responses, policy review, and citation-heavy work.

Game / Lore Memory

.latch

Ship compiled world knowledge as a runtime asset. The destination machine gets fast, portable memory without carrying a human-readable lore corpus.

FAQ

General

What is LATCH?

LATCH is a persistent document-intelligence runtime that compiles a document set into reusable latent memory instead of re-reading the raw corpus on every query.

How is this different from RAG?

RAG retrieves chunks per question. LATCH compiles the corpus into a persistent representation so cross-document reasoning does not depend on chunk retrieval quality at query time.

How is this different from a large context window?

Large-context prompting pays the full document-read cost on every query. LATCH pays the compilation cost once, then amortizes it across all later queries and recipients of the bundle.
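The amortization above can be sketched with a small illustrative cost model. All numbers here are assumptions chosen for illustration, not measured LATCH figures:

```python
# Illustrative cost model for the amortization described above.
# All numbers are assumptions for illustration, not measured LATCH figures.

def full_context_cost(read_cost_per_query: float, queries: int) -> float:
    """Large-context prompting pays the full document-read cost on every query."""
    return read_cost_per_query * queries

def compiled_cost(compile_cost: float, query_cost: float, queries: int) -> float:
    """Compilation is paid once, then each query pays only a small per-query cost."""
    return compile_cost + query_cost * queries

# Assume one full corpus read costs 100 units, compiling once costs 500 units,
# and a query against the compiled bundle costs 1 unit.
queries = 50
print(full_context_cost(100, queries))   # 5000 units
print(compiled_cost(500, 1, queries))    # 550 units
```

Under these assumed numbers the compiled path breaks even after roughly six queries, and every later query (or bundle recipient) widens the gap.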

Does LATCH require internet access?

The runtime and bundles stay local. The only network dependency in the supported customer path is the startup license validation call.

What models does LATCH support?

The current validated lineup is centered on Qwen 2.5 14B, with additional supported tuples across the Llama, Mistral, and DeepSeek families on the product roadmap and internal validation path.

Bundles And Fallback

When should I use .latch?

Use .latch when the destination should get the compiled reasoning surface but should not carry readable extracted text. This is the privacy-first choice.

When should I use .latchdoc?

Use .latchdoc when the destination also needs exact source language, full-text lookup, and text-based fallback for edge cases.

What is the fallback mechanism?

Fallback is the quality safety net for .latchdoc. If the primary LATCH answer is refusal-like, low-confidence, or errors out, the runtime can fall back to the embedded text path instead of leaving the query stranded.

Does fallback work with .latch?

No. Fallback needs embedded source text, so it is available only on .latchdoc. That tradeoff is intentional.

How large are bundle files?

Size depends on corpus size and memory settings, but typical enterprise bundles land in the low hundreds of megabytes. .latchdoc usually adds only modest text overhead compared with the compiled tensor payload.

Default operational guidance: use auto fallback for .latchdoc, use off for strict benchmarking, and reserve always for diagnostics because it runs both paths and returns the fallback answer regardless of primary quality.
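The three modes can be sketched as a small decision function. The mode names (auto, off, always) come from the guidance above; the function shape, field names, and fallback callable are hypothetical illustrations, not the LATCH API:

```python
# Sketch of the fallback decision described above. Mode names come from the
# operational guidance; everything else here is a hypothetical illustration.

def answer_with_fallback(primary_answer: dict, mode: str, text_fallback) -> str:
    """Choose between the primary compiled answer and the embedded-text fallback.

    primary_answer: dict with 'text' and a 'low_quality' flag (set when the
    primary answer is refusal-like, low-confidence, or errored).
    text_fallback: callable returning the embedded-text-path answer, which is
    only available for .latchdoc bundles.
    """
    if mode == "off":
        # Strict benchmarking: always return the primary answer as-is.
        return primary_answer["text"]
    if mode == "always":
        # Diagnostics: both paths run; the fallback answer is returned
        # regardless of primary quality.
        return text_fallback()
    # mode == "auto": fall back only when the primary answer is low quality.
    if primary_answer["low_quality"]:
        return text_fallback()
    return primary_answer["text"]
```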

Compilation, Security, And Operations

What processing cost does the recipient avoid?

The recipient avoids re-extraction, re-tokenization, and recompilation. They inherit the finished compiled artifact instead of paying the ingestion pipeline again.

Are bundles encrypted?

Yes. LATCH bundles are encrypted and portable only across compatible runtimes. Team sharing can use the instance license token or an explicit passphrase when cross-license sharing is required.

Can someone reverse a .latch file?

The supported privacy posture is that .latch omits extracted source text and carries compiled latent state rather than a readable copy of the original documents.

What hardware does LATCH target?

The primary validated path targets rooms built on NVIDIA H100 80GB and A100 80GB class GPUs. Customer-facing quickstarts currently assume that class of GPU.

Deployment And Commercial Questions

How do I install LATCH?

LATCH ships as a Docker image. The supported quickstarts cover private ACR pull, startup token placement, and first-room validation for RunPod and Ubuntu GPU hosts.

What file types can I compile?

The current product path supports PDF, TXT, MD, HTML, DOCX, XLSX, PPTX, CSV, JSON, and XML.
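A pre-flight check against that list can be sketched as follows; the helper name and behavior are hypothetical illustrations, not part of the LATCH API:

```python
# Hypothetical pre-flight check mirroring the supported-type list above.
from pathlib import Path

# Extensions from the supported file-type list.
SUPPORTED = {".pdf", ".txt", ".md", ".html", ".docx", ".xlsx", ".pptx",
             ".csv", ".json", ".xml"}

def is_compilable(path: str) -> bool:
    """Return True if the file's extension is on the supported list."""
    return Path(path).suffix.lower() in SUPPORTED
```

For example, `is_compilable("report.PDF")` is true (the check is case-insensitive), while an unsupported type such as a TIFF scan would need conversion before compilation.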

What does the evaluation license include?

The public evaluation path includes the self-hosted runtime, Console UI, local API, and current v1 update stream under the applicable customer license terms.

What if I want production or OEM use?

Production deployment and embedded distribution remain direct-conversation paths because the right terms depend on the workload, distribution model, and support expectations.

Is the API compatible with existing tools?

The supported local runtime exposes an OpenAI-format API surface, so many existing tools and internal integrations can switch to LATCH with an endpoint change rather than a full client rewrite.
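As a sketch of what that endpoint change looks like, the stdlib-only snippet below builds a standard OpenAI-style chat-completions request aimed at a local runtime. The base URL, port, and model name are assumptions for illustration, not documented LATCH values:

```python
# Build an OpenAI-format /chat/completions request for a local runtime.
# The endpoint and model name below are illustrative assumptions.
import json
from urllib import request

def build_chat_request(base_url: str, model: str, prompt: str) -> request.Request:
    """Construct a POST request in the standard OpenAI chat-completions shape."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# An existing OpenAI-format client switches by pointing at the local endpoint;
# urllib.request.urlopen(req) would send this to a running instance.
req = build_chat_request("http://localhost:8000/v1", "latch",
                         "Summarize the diligence corpus.")
```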