AI SERVICES · FINE-TUNING AS A SERVICE · Live
Logos (Λόγος)

Bring your data. Take the weights. Run your own stack.

Logos is fine-tuning as a service for open-weight language models — 45 curated bases across 16 model families: Llama, Qwen 2.5 / 3 / 3.5 / 3.6 (dense + MoE up to 397B), NVIDIA Llama-Nemotron (Nano / Super / Ultra), Mistral & Mixtral, Google Gemma 3 + Gemma 4 (26B MoE + 31B Dense), Phi-4, DeepSeek-R1 distills, GLM-5 + GLM-5.1, Kimi K2.5, Nous Research Hermes 3 + 4.3, and MiniMax M1 + M2. Describe your objective, hand us your data, and we ship you the trained weights.

Need to fine-tune in-house going forward? Pick a framework at checkout — OpenClaw (open scripted), Hermes (Nous-style alignment + function-calling), NemoClaw (NVIDIA NeMo / Megatron), IronClaw (security-hardened), or your own choice — and we install it on your infrastructure during the same engagement. Frontier-model jobs come with a built-in setup discount of up to 35%. No vendor lock-in. No proprietary API. No model you can't take with you.

See Pricing
45
Base Models
Meta · Alibaba · NVIDIA · Mistral · Google · Microsoft · DeepSeek · Z.ai · Moonshot · Nous Research · MiniMax
6
Framework Choices
None · OpenClaw · Hermes · NemoClaw · IronClaw · BYO
5
Premium Add-ons
Unsloth · Quantization · Eval · DPO · Tool Use
100%
Weights Ownership
Deploy anywhere, no lock-in
How It Works

Three steps from prompt to production model

01

Configure

Choose a base model, declare your objective, tell us where the data lives. Takes about three minutes with our wizard.

02

We Forge

We upload your dataset to isolated job storage, tune hyperparameters against a held-out split, run training on our local GPU hardware or rented H100/B200 capacity, and evaluate against baseline. Your data is deleted on completion.
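The held-out split mentioned above is conceptually simple: a fraction of your examples is set aside before training and used only for evaluation. A minimal stdlib-only sketch (the 90/10 ratio and prompt/completion field names are illustrative, not our actual pipeline):

```python
import json
import random

def split_dataset(path, holdout_frac=0.1, seed=42):
    """Read a JSONL dataset and split it into train / held-out eval sets."""
    with open(path) as f:
        examples = [json.loads(line) for line in f if line.strip()]
    random.Random(seed).shuffle(examples)  # deterministic shuffle
    n_holdout = max(1, int(len(examples) * holdout_frac))
    return examples[n_holdout:], examples[:n_holdout]

# Example: 100 toy examples -> 90 train / 10 held out
with open("toy.jsonl", "w") as f:
    for i in range(100):
        f.write(json.dumps({"prompt": f"q{i}", "completion": f"a{i}"}) + "\n")

train, heldout = split_dataset("toy.jsonl")
print(len(train), len(heldout))  # 90 10
```

Fixing the shuffle seed makes the split reproducible, which is also why the exported training configuration records one.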

03

You Deploy

Signed download link with merged weights and LoRA adapter. HTML evaluation report. Inference snippets for vLLM, llama.cpp, and Python. You take the weights and run inference yourself, however and wherever you want — we do not host.
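For context on why both artifacts ship: "merged" means the trained LoRA update has been folded into the base weights, W' = W + (α/r)·BA, so the merged file runs anywhere with no adapter machinery. A numpy sketch with toy shapes (all values here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 8, 2, 4            # hidden size, LoRA rank, LoRA alpha (toy values)

W = rng.standard_normal((d, d))  # frozen base weight
A = rng.standard_normal((r, d))  # LoRA down-projection (trained)
B = np.zeros((d, r))             # LoRA up-projection (zero-initialized)
B[:, 0] = 1.0                    # pretend training moved it off zero

# Merging folds the low-rank update into the base matrix once:
W_merged = W + (alpha / r) * (B @ A)

# At inference, the merged matrix behaves exactly like base + adapter:
x = rng.standard_normal(d)
y_adapter = W @ x + (alpha / r) * (B @ (A @ x))
y_merged = W_merged @ x
print(np.allclose(y_adapter, y_merged))  # True
```

Keeping the separate adapter is still useful: it is tiny, and it lets you re-apply the same fine-tune to a freshly downloaded copy of the base model.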

Enterprise Model Catalog

16 families. 45 selectable bases.

Start at the family level, then expand only the models you are evaluating. Each family shows the procurement-critical details first: publisher, model count, size coverage, starting tier, surcharge range, and best-fit workload.

Qwen 3.5 2B · 2B dense · Apache 2.0 · Starter tier · no surcharge
  Smallest current-gen Qwen — fast LoRA iteration, edge-deployment friendly

Qwen 3.5 4B · 4B dense · Apache 2.0 · Starter tier · no surcharge
  Qwen 3.5 Small (March 2026) — 4B with 256K context, 201 languages, thinking/non-thinking modes. Edge-ready efficient reasoning

Qwen 3.5 9B · 9B dense · Apache 2.0 · Starter tier · no surcharge
  Qwen 3.5 Small (March 2026) — 9B sweet spot for sub-12B production agents. 256K context, strong agentic coding and multimodal

Qwen 3.5 27B · 27B dense · Apache 2.0 · Studio tier · no surcharge
  Current Qwen generation in the dense 27B class — released Feb 2026

Qwen 3.5 35B-A3B (MoE) · 35B total / 3B active · Apache 2.0 · Studio tier · +$300 surcharge
  Sparse MoE — strong reasoning with thinking mode

Qwen 3.5 122B-A10B (MoE) · 122B total / 10B active · Apache 2.0 · Atelier tier · +$799 surcharge
  Qwen 3.5 Medium (Feb 2026) — production-ready MoE with strong agentic capability; faster and cheaper than the 397B flagship

Qwen 3.5 397B-A17B (MoE) · 397B total / 17B active · Apache 2.0 · Atelier tier · +$2,499 surcharge
  Current Qwen flagship MoE — native multimodal agent capabilities
Pricing

Four tiers. One-time fees.

Every tier ships the weights to you. Subscriptions are for the vendor-lock world — we don't do that here. Pick the tier that matches your model size and training volume. The Custom tier is à la carte for off-catalog work — submit an inquiry and we settle payment by email rather than on-site checkout.

Starter

$349 one-time

LoRA fine-tuning on small open-weight models (≤14B). Every Starter model is included at the base price — no surcharges. Perfect for prototyping, style transfer, or niche domain adaptation.

Best for

Solo developers · Prototypes · Side projects

  • LoRA adapter fine-tuning
  • All 8 Starter-tier models — included, zero surcharge
  • Up to 10,000 training examples
  • 1 training epoch
  • Merged full-precision weights (.safetensors)
  • Evaluation report (PDF) with baseline comparison
  • Inference examples (Python + curl)
MOST POPULAR

Studio

$1,499 one-time

LoRA or QLoRA on mid-to-large models. Includes everything in Starter plus most dense models up to 32B and small MoE up to 50B-total/13B-active — all at base price. A small per-model surcharge applies to a handful of harder-to-train variants (the 35B-A3B MoE, the 49B reasoning dense, the R1-distilled 32B) that need materially more GPU-hours.

Best for

SMBs · Agencies · Product teams

  • Everything in Starter
  • LoRA or QLoRA fine-tuning
  • Most dense models 12B–32B + small MoE: included, zero surcharge (Mixtral 8x7B, Gemma 4 26B-MoE, Phi-4, Mistral Nemo/Small, Qwen 2.5 32B, Gemma 3 27B, etc.)
  • Surcharged Studio models (transparent, $200–$300 extra): DeepSeek-R1-Distill-Qwen-32B, Qwen 3.5/3.6 35B-A3B MoE, Nemotron Super 49B
  • Up to 100,000 training examples
  • Up to 3 training epochs
  • Merged full-precision + quantized weights (.safetensors + LoRA adapter)
  • Inference snippets (Python / vLLM / llama.cpp) so you can deploy locally or on any provider
  • Training configuration export (reproducible)

Atelier

$3,499 one-time

Production-grade fine-tuning. Includes everything in Studio plus 70B dense models (Llama 3.3 70B, Qwen 2.5 72B) and Mixtral 8x22B (141B MoE) — all at base price. Frontier MoE models (200B+) are accessible only at this tier, and carry a per-model surcharge that reflects the actual GPU-hours required (8× H100/H200 for hours-to-days). The surcharge is shown on the model selector before you commit.

Best for

Production deployments · Model specialization · IP-critical work

  • Everything in Studio
  • Atelier-base included, zero surcharge: Llama 3.3 70B, Qwen 2.5 72B, Mixtral 8x22B (141B MoE)
  • Frontier MoE access (per-model surcharge transparently shown):
      • Qwen3 235B-A22B — +$1,499
      • Nemotron Ultra 253B — +$1,799
      • Qwen 3.5 397B-A17B — +$2,499
      • GLM-5 (744B MoE) — +$2,999
      • GLM-5.1 (754B MoE) — +$3,499
      • Kimi K2.5 (1.04T MoE) — +$4,499
  • Optional preference optimization (DPO/SimPO) if preference data supplied
  • Custom evaluation harness tailored to your use case (HTML report delivered with weights)
  • Unlimited training examples
  • Priority queue position
  • 1 revision round included
  • Direct engineer consultation via email during the engagement

Custom

Quoted per engagement

Off-catalog models, dedicated infrastructure, NDA-required engagements, multi-job programs, full-parameter fine-tuning on frontier MoE, exotic architectures (DBRX, vision-language fusion, custom MoE topologies), or anything that doesn't fit a standard tier.

Best for

Enterprise · Regulated industries · Custom architectures · Long-term programs

  • Off-catalog model support (DBRX, Mixtral 8x22B full-FT, Qwen 235B full-FT, etc.)
  • Multi-week or multi-job engagements with milestone billing
  • Vision-language and multimodal fine-tuning
  • NDA + dedicated communications channel
  • For on-prem regulated/air-gapped: see the Erkos service
  • Quoted per engagement based on scope, compute, and timeline
ENTERPRISE TIER · Upgrade path

Need on-prem, regulated, or air-gapped? → Erkos (Ἕρκος)

Logos runs jobs on our own dedicated GPU hardware (smaller jobs) or rented H100 / B200 capacity from a SOC 2 Type II compliant vendor (larger jobs). For HIPAA, SOC 2, PCI, ITAR, sovereign data, or any deployment where training data cannot leave your environment, DDG offers Erkos — on-prem secure fine-tuning installed inside your infrastructure, built on the open-source IronClaw security framework. Quoted per engagement.

Contact for Erkos

What Every Order Delivers

Trained model weights

Merged full-precision .safetensors + LoRA adapter, signed download link

Evaluation report

HTML report with before/after metrics, sample outputs, and recommendations

Training configuration

YAML config + training logs for reproducibility
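An exported config might look like the fragment below (field names and values are illustrative, not our actual schema — your export reflects the hyperparameters your job actually ran with):

```yaml
# Illustrative training-config export (hypothetical field names)
base_model: qwen3.5-9b
method: lora
lora:
  rank: 16
  alpha: 32
  target_modules: [q_proj, k_proj, v_proj, o_proj]
training:
  epochs: 3
  learning_rate: 2.0e-4
  batch_size: 8
  seed: 42
data:
  train_file: dataset.jsonl
  holdout_fraction: 0.1
```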

Inference snippets

Drop-in Python, vLLM, and llama.cpp deployment examples — you run inference, not us

Direct engineer access (Atelier)

Email a working engineer for technical questions during delivery window

Start Your Order

Configure your fine-tune

Six guided steps. About three minutes. We'll email you within one business day to confirm and trigger the job.

Choose the package that fits your project

Start with budget and scope; we then show only the model options that make sense for that package. Most clients begin with Studio and adjust from there.

If you're unsure, choose Studio. It gives your team access to the strongest mid-size models without moving into frontier compute unless your project truly needs it.
FAQ

Questions, answered

What data formats do you accept?

JSONL (one example per line), CSV with clear column mappings, Parquet, or any public HuggingFace dataset. For Atelier-tier projects we also accept raw sources (PDFs, transcripts, call logs) and handle the pair-generation ourselves.
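A well-formed JSONL dataset is simply one JSON object per line. A minimal pre-submission validator, stdlib only (the prompt/completion field names are an assumption — other schemas work with column mappings):

```python
import json

def validate_jsonl(path, required_keys=("prompt", "completion")):
    """Check every line parses as JSON and carries the required fields.
    Returns (n_valid, errors) where errors lists (line_no, reason)."""
    n_valid, errors = 0, []
    with open(path) as f:
        for i, line in enumerate(f, start=1):
            if not line.strip():
                continue  # blank lines are skipped, not errors
            try:
                obj = json.loads(line)
            except json.JSONDecodeError as e:
                errors.append((i, f"invalid JSON: {e.msg}"))
                continue
            missing = [k for k in required_keys if k not in obj]
            if missing:
                errors.append((i, f"missing fields: {missing}"))
            else:
                n_valid += 1
    return n_valid, errors

# Two good examples and one line missing its completion:
with open("sample.jsonl", "w") as f:
    f.write(json.dumps({"prompt": "Translate: hello", "completion": "bonjour"}) + "\n")
    f.write('{"prompt": "2+2?"}\n')
    f.write(json.dumps({"prompt": "Capital of France?", "completion": "Paris"}) + "\n")

print(validate_jsonl("sample.jsonl"))  # (2, [(2, "missing fields: ['completion']")])
```

Running a check like this before upload catches the most common intake failure: a single malformed line in an otherwise clean dataset.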

How do I receive my model?

You receive a time-limited signed download link for the merged weights (.safetensors) and LoRA adapter, valid for 14 days (Starter / Studio) or 30 days (Atelier). After you download, the model is yours — host it locally, on your own cloud, or with any inference provider (vLLM, llama.cpp, Ollama, Together, Fireworks, RunPod, etc.). We do not host your trained model and we do not provide an inference endpoint or API. We delete our copy of the weights once your download window closes.

What happens to my training data?

Training data is uploaded to isolated job storage, used only for your job, and deleted on job completion. We do not reuse customer data for any other model or purpose. For regulated industries (healthcare, finance, government), see Erkos — our on-prem secure fine-tuning service built on the IronClaw security framework.

How long does training take?

Training time depends on the dataset size, base model, training technique, current GPU availability, and add-ons selected. We provide a specific estimate after intake (when we've seen your data) and update it if anything changes during execution. We do not commit to a fixed turnaround on the order page because doing so would be misleading — a 10K-example LoRA on a 7B model takes hours; a full-parameter 70B with DPO on 100K examples takes days. We work as fast as quality permits.

What is your refund policy?

We refund in full if a job fails due to our infrastructure. Once a job runs successfully, fees are not refundable — compute has already been consumed. Atelier tier includes one revision round to tune hyperparameters if the first run doesn't hit your target.

How is this different from hosted fine-tuning APIs?

Those services fine-tune hosted models you never own and can only run via their API. With Logos you get the actual weight files — full .safetensors plus LoRA adapter — and you run inference however and wherever you want. We do not host the model. We do not run your inference. You own the weights.

Can I use Logos in a regulated or air-gapped environment?

For HIPAA, SOC 2, PCI, ITAR, or air-gapped environments, use Erkos — our on-prem secure fine-tuning service, built on the IronClaw security framework. Logos is great for prototyping and non-regulated production; Erkos is for when the data can't leave your building.

What infrastructure do jobs run on?

Smaller jobs run on our own dedicated local GPU hardware. Larger jobs (frontier MoE, full-parameter, multi-GPU) run on rented H100 / B200 capacity from a SOC 2 Type II compliant vendor we have an enterprise relationship with. We do not name the vendor publicly. Jobs are containerized and isolated per-customer. Training data is deleted on completion (Logos) or never leaves your environment (Erkos).

Still have questions?

Email us
Let's Talk

Interested in Logos?

Whether you're an investor, a prospective partner, or an engineer who wants to use or integrate Logos, reach out — we'll get back to you within a business day.