Bring your data. Take the weights. Run your own stack.
Logos is fine-tuning as a service for open-weight language models — 45 curated bases across 16 model families: Llama, Qwen 2.5 / 3 / 3.5 / 3.6 (dense + MoE up to 397B), NVIDIA Llama-Nemotron (Nano / Super / Ultra), Mistral & Mixtral, Google Gemma 3 + Gemma 4 (26B MoE + 31B Dense), Phi-4, DeepSeek-R1 distills, GLM-5 + GLM-5.1, Kimi K2.5, Nous Research Hermes 3 + 4.3, and MiniMax M1 + M2. Describe your objective, hand us your data, and we ship you the trained weights.
Need to fine-tune in-house going forward? Pick a framework at checkout — OpenClaw (open scripted), Hermes (Nous-style alignment + function-calling), NemoClaw (NVIDIA NeMo / Megatron), IronClaw (security-hardened), or your own choice — and we install it on your infrastructure during the same engagement. Frontier-model jobs come with a built-in setup discount of up to 35%. No vendor lock-in. No proprietary API. No model you can't take with you.
Three steps from prompt to production model
Configure
Choose a base model, declare your objective, tell us where the data lives. Takes about three minutes with our wizard.
We Forge
We upload your dataset to isolated job storage, tune hyperparameters against a held-out split, run training on our local GPU hardware or rented H100/B200 capacity, and evaluate against baseline. Your data is deleted on completion.
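The held-out split mentioned above can be sketched roughly like this — a minimal illustration, not our actual pipeline; the 95/5 ratio, the seed, and the JSONL input are assumptions:

```python
import json
import random

def split_dataset(path, holdout_frac=0.05, seed=42):
    """Shuffle a JSONL dataset and carve off a held-out evaluation split.

    Returns (train_examples, held_out_examples).
    """
    with open(path) as f:
        examples = [json.loads(line) for line in f if line.strip()]
    random.Random(seed).shuffle(examples)          # deterministic shuffle
    n_holdout = max(1, int(len(examples) * holdout_frac))
    return examples[n_holdout:], examples[:n_holdout]
```

Hyperparameters are then tuned on the training portion and scored on the held-out portion, so the evaluation numbers in your report reflect data the model never trained on.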
You Deploy
Signed download link with merged weights and LoRA adapter. HTML evaluation report. Inference snippets for vLLM, llama.cpp, and Python. You take the weights and run inference yourself, however and wherever you want — we do not host.
16 families. 45 selectable bases.
Start at the family level, then expand only the models you are evaluating. Each family shows the procurement-critical details first: publisher, model count, size coverage, starting tier, surcharge range, and best-fit workload.
Four tiers. One-time fees.
Every tier ships the weights to you. Subscriptions are for the vendor-lock-in world; we don't do that here. Pick the tier that matches your model size and training volume. The Custom tier is à la carte for off-catalog work: submit an inquiry and we settle payment by email rather than on-site checkout.
Starter
LoRA fine-tuning on small open-weight models (≤14B). Every Starter model is included at the base price — no surcharges. Perfect for prototyping, style transfer, or niche domain adaptation.
Solo developers · Prototypes · Side projects
- LoRA adapter fine-tuning
- Any of the 8 Starter-tier models — all included, zero surcharge
- Up to 10,000 training examples
- 1 training epoch
- Merged full-precision weights (.safetensors)
- Evaluation report (PDF) with baseline comparison
- Inference examples (Python + curl)
Studio
LoRA or QLoRA on mid-to-large models. Includes everything in Starter plus most dense models up to 32B and small MoE up to 50B-total/13B-active — all at base price. A small per-model surcharge applies to a handful of harder-to-train variants (35B-A3B MoE — 35B total / 3B active, 49B reasoning dense, R1-distilled 32B) that need materially more GPU-hours.
SMBs · Agencies · Product teams
- Everything in Starter
- LoRA or QLoRA fine-tuning
- Most dense models 12B–32B + small MoE: included, zero surcharge (Mixtral 8x7B, Gemma 4 26B-MoE, Phi-4, Mistral Nemo/Small, Qwen 2.5 32B, Gemma 3 27B, etc.)
- Surcharged Studio models (transparent, $200–$300 extra): DeepSeek-R1-Distill-Qwen-32B, Qwen 3.5/3.6 35B-A3B MoE, Nemotron Super 49B
- Up to 100,000 training examples
- Up to 3 training epochs
- Merged full-precision + quantized weights (.safetensors + LoRA adapter)
- Inference snippets (Python / vLLM / llama.cpp) so you can deploy locally or on any provider
- Training configuration export (reproducible)
Atelier
Production-grade fine-tuning. Includes everything in Studio plus 70B dense models (Llama 3.3 70B, Qwen 2.5 72B) and Mixtral 8x22B (141B MoE) — all at base price. Frontier MoE models (200B+) are accessible only at this tier, and carry a per-model surcharge that reflects the actual GPU-hours required (8× H100/H200 for hours-to-days). The surcharge is shown on the model selector before you commit.
Production deployments · Model specialization · IP-critical work
- Everything in Studio
- Atelier base models included, zero surcharge: Llama 3.3 70B, Qwen 2.5 72B, Mixtral 8x22B (141B MoE)
- Frontier MoE access (per-model surcharge transparently shown):
  - Qwen 3 235B-A22B — +$1,499
  - Nemotron Ultra 253B — +$1,799
  - Qwen 3.5 397B-A17B — +$2,499
  - GLM-5 (744B MoE) — +$2,999
  - GLM-5.1 (754B MoE) — +$3,499
  - Kimi K2.5 (1.04T MoE) — +$4,499
- Optional preference optimization (DPO/SimPO) if preference data supplied
- Custom evaluation harness tailored to your use case (HTML report delivered with weights)
- Unlimited training examples
- Priority queue position
- 1 revision round included
- Direct engineer consultation via email during the engagement
Custom
Off-catalog models, dedicated infrastructure, NDA-required engagements, multi-job programs, full-parameter fine-tuning on frontier MoE, exotic architectures (DBRX, vision-language fusion, custom MoE topologies), or anything that doesn't fit a standard tier.
Enterprise · Regulated industries · Custom architectures · Long-term programs
- Off-catalog model support (DBRX, Mixtral 8x22B full-FT, Qwen 235B full-FT, etc.)
- Multi-week or multi-job engagements with milestone billing
- Vision-language and multimodal fine-tuning
- NDA + dedicated communications channel
- For on-prem regulated/air-gapped: see the Erkos service
- Quoted per engagement based on scope, compute, and timeline
Need on-prem, regulated, or air-gapped? → Erkos (Ἕρκος)
Logos runs jobs on our own dedicated GPU hardware (smaller jobs) or rented H100 / B200 capacity from a SOC 2 Type II compliant vendor (larger jobs). For HIPAA, SOC 2, PCI, ITAR, sovereign data, or any deployment where training data cannot leave your environment, DDG offers Erkos — on-prem secure fine-tuning installed inside your infrastructure, built on the open-source IronClaw security framework. Quoted per engagement.
What Every Order Delivers
Merged full-precision .safetensors + LoRA adapter, signed download link
HTML report with before/after metrics, sample outputs, and recommendations
YAML config + training logs for reproducibility
Drop-in Python, vLLM, and llama.cpp deployment examples — you run inference, not us
Email a working engineer for technical questions during delivery window
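The drop-in deployment examples pair the delivered weights with any OpenAI-compatible local server such as vLLM or llama.cpp. A rough sketch of what that looks like — host, port, and model path are placeholders, not fixed values from our snippets:

```python
import json
import urllib.request

def build_chat_request(model_path, user_message, temperature=0.2):
    """Build an OpenAI-compatible chat-completions payload for a local model."""
    return {
        "model": model_path,  # e.g. the directory holding your merged weights
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }

def query_local_model(payload, base_url="http://localhost:8000/v1"):
    """POST the payload to a locally hosted OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Because the server speaks the OpenAI wire format, the same request shape works whether you serve the weights with vLLM, llama.cpp, or a managed inference provider.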
Configure your fine-tune
Six guided steps. About three minutes. We'll email you within one business day to confirm and trigger the job.
Choose the package that fits your project
Start with budget and scope; we'll show only the model options that make sense for that package. Most clients begin with Studio and adjust from there.
Questions, answered
What data formats do you accept?
JSONL (one example per line), CSV with clear column mappings, Parquet, or any public Hugging Face dataset. For Atelier-tier projects we also accept raw sources (PDFs, transcripts, call logs) and handle the pair generation ourselves.
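For reference, a JSONL file is just one JSON object per line. A minimal validation sketch — the `prompt`/`completion` field names are illustrative, not a required schema; match them to your own column mapping:

```python
import json

def validate_jsonl(path, required_keys=("prompt", "completion")):
    """Check that every line parses as JSON and carries the expected fields.

    Returns a list of (line_number, error) tuples; an empty list means clean.
    """
    errors = []
    with open(path) as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append((i, f"invalid JSON: {exc}"))
                continue
            missing = [k for k in required_keys if k not in obj]
            if missing:
                errors.append((i, f"missing fields: {missing}"))
    return errors
```

Running a check like this before upload catches the most common intake issue we see: a single malformed line buried in an otherwise clean file.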
How do I receive the model, and where can I run it?
You receive a time-limited signed download link for the merged weights (.safetensors) and LoRA adapter, valid for 14 days (Starter / Studio) or 30 days (Atelier). After you download, the model is yours — host it locally, on your own cloud, or with any inference provider (vLLM, llama.cpp, Ollama, Together, Fireworks, RunPod, etc.). We do not host your trained model and we do not provide an inference endpoint or API. We delete our copy of the weights once your download window closes.
What happens to my training data?
Training data is uploaded to isolated job storage, used only for your job, and deleted on job completion. We do not reuse customer data for any other model or purpose. For regulated industries (healthcare, finance, government), see Erkos — our on-prem secure fine-tuning service built on the IronClaw security framework.
How long does training take?
Training time depends on dataset size, base model, training technique, current GPU availability, and the add-ons selected. We provide a specific estimate after intake (once we've seen your data) and update it if anything changes during execution. We don't commit to a fixed turnaround on the order page because doing so would be misleading — a 10K-example LoRA on a 7B model takes hours; a full-parameter 70B with DPO on 100K examples takes days. We work as fast as quality permits.
What is your refund policy?
We refund in full if a job fails due to our infrastructure. Once a job runs successfully, fees are not refundable — the compute has already been consumed. The Atelier tier includes one revision round to tune hyperparameters if the first run doesn't hit your target.
How is this different from hosted fine-tuning services?
Those services fine-tune hosted models that you never own and can only run through their API. With Logos you get the actual weight files — full .safetensors plus LoRA adapter — and you run inference however and wherever you want. We do not host the model. We do not run your inference. You own the weights.
What if my data can't leave my environment?
For HIPAA, SOC 2, PCI, ITAR, or air-gapped environments, use Erkos — our on-prem secure fine-tuning service, built on the IronClaw security framework. Logos is great for prototyping and non-regulated production; Erkos is for when the data can't leave your building.
Where do jobs actually run?
Smaller jobs run on our own dedicated local GPU hardware. Larger jobs (frontier MoE, full-parameter, multi-GPU) run on rented H100 / B200 capacity from a SOC 2 Type II compliant vendor we have an enterprise relationship with; we do not name the vendor publicly. Jobs are containerized and isolated per customer. Training data is deleted on completion (Logos) or never leaves your environment (Erkos).
Still have questions?
Email us
Interested in Logos?
Whether you're an investor, a prospective partner, or an engineer who wants to use or integrate Logos, reach out — we'll get back to you within a business day.