Grimvane AI
Intelligence,
engineered.
Inference engines, autonomous agents, knowledge retrieval, coding tools, and the infrastructure underneath them. Built from scratch. Local-first.
What we build
AI that works
for you.
Flagship
Corvath
An autonomous AI agent framework. Corvath handles multi-step reasoning, tool orchestration, and persistent memory, so you can build agents that actually get things done.
Explore Corvath
Engine
Crucible
From-scratch LLM inference engine. Direct GGUF parsing, full transformer forward pass, GPU compute via CUDA. No third-party wrappers, no framework dependencies.
Explore Crucible
Product
Shoal
Multi-dock AI coding platform. Connects local inference, frontier APIs, and CLI tools through a unified architecture. No telemetry, no cloud requirement.
Explore Shoal
Product
Cairn
Local knowledge retrieval. Feed documents and codebases, ask questions, get answers grounded in your data with source attribution. Fully offline RAG.
Product
Hearth
Self-hosted conversational AI. Backend-agnostic chat interface with model switching, session persistence, and automatic hardware detection. Your own ChatGPT, on your machine.
Library Suite
Flower Garden
Six zero-dependency libraries for AI infrastructure. Full-text search, vector storage, graph databases, tiered caching, encrypted storage, and structured data. Pluggable backends throughout.
Explore Garden
Engine
Built from scratch.
Crucible implements a full transformer forward pass with direct GGUF parsing, RoPE embeddings, grouped-query attention, and GPU compute. No wrappers. No framework dependencies. Every layer is ours.
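To give a feel for one piece of that stack, here is a minimal sketch of rotary position embeddings (RoPE) in plain Python. It is an illustration under stated assumptions, not Crucible's code: the function name `apply_rope`, the interleaved pairing of dimensions, and the base of 10000 are all choices made for this example, and real models may pair dimensions differently.

```python
import math


def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles.

    Hypothetical helper for illustration; assumes the interleaved
    (x0, x1), (x2, x3), ... pairing convention.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("RoPE needs an even head dimension")
    out = []
    for i in range(0, dim, 2):
        # Rotation frequency falls off geometrically with the pair index.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x0, x1 = vec[i], vec[i + 1]
        # Standard 2D rotation applied to the pair.
        out.append(x0 * cos_t - x1 * sin_t)
        out.append(x0 * sin_t + x1 * cos_t)
    return out
```

Because each pair is only rotated, vector length is preserved, and the dot product between a rotated query and a rotated key depends on the distance between their positions rather than on the absolute positions.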
Direct GGUF Parsing
Memory-mapped tensor access, with F16 conversion and dequantization for Q4_0, Q4_K, Q6_K, and Q8_0.
Full Forward Pass
RMSNorm, RoPE, grouped-query attention, SwiGLU FFN. The entire transformer stack, implemented from first principles.
GPU Compute
CUDA acceleration via CuPy with streaming token output and nucleus sampling.
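Nucleus (top-p) sampling, mentioned above, can be sketched in a few lines. This is a CPU-only illustration using the standard library, not Crucible's CuPy implementation; the function name `nucleus_sample` and its parameters are assumptions made for the example.

```python
import math
import random


def nucleus_sample(logits, top_p=0.9, temperature=1.0, rng=random):
    """Sample a token id from the smallest set of tokens whose
    cumulative probability reaches `top_p` (hypothetical helper)."""
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")

    # Numerically stable softmax over temperature-scaled logits.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Walk tokens from most to least likely until the mass reaches top_p.
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    nucleus, mass = [], 0.0
    for token in ranked:
        nucleus.append(token)
        mass += probs[token]
        if mass >= top_p:
            break

    # Draw from the nucleus in proportion to the original probabilities.
    weights = [probs[token] for token in nucleus]
    return rng.choices(nucleus, weights=weights, k=1)[0]
```

Lower `top_p` values cut off the long tail of unlikely tokens; a value of 1.0 keeps the full distribution.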
Privacy
Private by default.
Every product in the Grimvane AI ecosystem runs locally. No API keys required, no cloud round-trips, no telemetry. Your data never leaves your machine unless you choose otherwise.
Zero Telemetry
No usage tracking, no analytics callbacks, no phone-home behavior. Period.
Hardware Detection
Automatic platform and GPU detection. Metal, CUDA, ROCm, or CPU fallback.
Offline-capable
Cairn, Hearth, and Crucible run entirely offline once models are downloaded.
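As a rough idea of what automatic hardware detection with a CPU fallback can look like, here is a minimal sketch. It is a heuristic written for this page, not the detection logic Grimvane ships: the function name `detect_backend` is hypothetical, Apple Silicon is simply assumed to be Metal-capable, and CUDA or ROCm is inferred from the `nvidia-smi` or `rocminfo` tools being on the PATH.

```python
import platform
import shutil


def detect_backend(system=None, machine=None, which=shutil.which):
    """Return "metal", "cuda", "rocm", or "cpu" (illustrative heuristic).

    `system`, `machine`, and `which` can be overridden for testing.
    """
    system = system or platform.system()
    machine = machine or platform.machine()

    # Assumption: Apple Silicon Macs can use Metal.
    if system == "Darwin" and machine == "arm64":
        return "metal"
    # Assumption: the vendor tool on PATH implies a usable GPU stack.
    if which("nvidia-smi"):
        return "cuda"
    if which("rocminfo"):
        return "rocm"
    # Nothing recognised, so fall back to CPU.
    return "cpu"


if __name__ == "__main__":
    print(detect_backend())
```

Nothing in this check touches the network, which fits the local-first stance: the decision is made from the operating system name and the tools installed on the machine.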
Architecture
Composable by design.
Flower Garden provides six standalone libraries for search, vectors, graphs, caching, encryption, and structured data. Each one is zero-dependency at its core with pluggable storage backends.
Pluggable Backends
In-memory, SQLite, and PostgreSQL backends for every library. Swap without changing application code.
Domain-agnostic
Built for AI infrastructure but designed to work anywhere. No opinions on your architecture.
Interoperable
Cairn uses Camellia for vectors. Corvath uses pluggable memory. The pieces compose naturally.
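The pluggable-backend idea can be illustrated with a small sketch: application code depends on an interface, and the storage behind it is swappable. Everything named here (`KVBackend`, `MemoryBackend`, `SQLiteBackend`, `remember`) is hypothetical and written for this example; it does not reproduce the Flower Garden API.

```python
import sqlite3
from typing import Optional, Protocol


class KVBackend(Protocol):
    """Hypothetical storage interface that application code targets."""

    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> Optional[str]: ...


class MemoryBackend:
    """Keeps everything in a dict; nothing is persisted."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class SQLiteBackend:
    """Stores the same key/value pairs in a SQLite table."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def put(self, key: str, value: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
            )

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def remember(store: KVBackend, key: str, value: str) -> Optional[str]:
    """Application code: identical regardless of which backend is passed in."""
    store.put(key, value)
    return store.get(key)
```

Calling `remember(MemoryBackend(), "model", "q8_0")` and `remember(SQLiteBackend(), "model", "q8_0")` exercises the same application code against two different stores.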
Capabilities
What we do.
Core competencies across the AI stack, from low-level model work to high-level agent orchestration.
From-Scratch Inference
Direct GGUF parsing, full transformer forward pass, and GPU-accelerated token generation with no framework dependencies.
Autonomous Agents
Multi-step reasoning with task classification, tool orchestration, role-based security, and automatic rollback.
Knowledge Retrieval
Document and codebase indexing with vector search, source attribution, and fully offline question answering.
Code Generation
Multi-dock architecture connecting local inference, frontier APIs, and CLI tools through a unified coding platform.
Conversational AI
Backend-agnostic chat with model switching, session persistence, and automatic hardware detection.
AI Infrastructure
Full-text search, vector storage, graph databases, tiered caching, encrypted storage, and structured data libraries.
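Vector search with source attribution, as listed under Knowledge Retrieval, comes down to ranking stored chunks by similarity and returning where each one came from. The sketch below assumes embeddings have already been computed elsewhere; the `search` function, the index layout, and the toy two-dimensional vectors are all hypothetical and do not reflect Cairn's internals.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vec, top_k=2):
    """Rank (source, text, vector) entries against `query_vec`.

    Each hit keeps its source so an answer can cite where it came from.
    """
    scored = [
        (cosine(vec, query_vec), source, text) for source, text, vec in index
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [
        {"source": source, "text": text, "score": score}
        for score, source, text in scored[:top_k]
    ]


# Toy index with made-up vectors standing in for real embeddings.
index = [
    ("docs/install.md", "Run the installer.", [1.0, 0.0]),
    ("docs/gpu.md", "CUDA setup notes.", [0.0, 1.0]),
    ("src/main.py", "Entry point.", [0.7, 0.7]),
]
hits = search(index, [1.0, 0.1], top_k=2)
```

Here `hits[0]["source"]` is `docs/install.md`, the entry whose vector points closest to the query.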
Get started
Ready to build
with AI?
Whether you need an inference engine, an autonomous agent, or local AI infrastructure, we'd like to hear about it.