COMPUTATIONAL RESEARCH · ACTIVEARCHITECTURE CONFIDENTIAL

Priscus
P1

A new class of language model — engineered for parameter efficiency and convergence speed.

At 431M parameters, our flagship configuration reaches 0.28 validation loss on a cross-domain held-out set — roughly 18× lower than comparable dense transformers trained on the same corpus and hardware. A compact configuration reaches competitive quality at under half the parameter count. The architecture, the training procedure, and the primitives that make this possible are held under NDA.

Visit priscus.ai · See the Numbers
Headline Results

What the architecture actually does

Every number below comes from a controlled training run on shared hardware and a common corpus. The measurement methodology, validation harness, and comparator runs are documented in full under NDA.

0.28
Best Validation Loss
Flagship configuration, held-out cross-domain validation
~18×
Lower Than Baselines
Versus comparable-scale dense transformers on the same corpus
~55%
Parameter Reduction
Equivalent quality at under half the parameter count of prior generation
351k
Tokens / Second
Peak training throughput on 4× H100 during full training

Full benchmark suite, validation methodology, and reproduction artifacts available under NDA.
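The peak-throughput figure above translates directly into wall-clock training time. A back-of-envelope sketch using the published 351k tokens/s on 4× H100; the 100B-token corpus size is a hypothetical placeholder for illustration, not a figure from this page:

```python
# Back-of-envelope training-time estimate from the published peak
# throughput (351k tokens/s on 4x H100, from the stats above).
PEAK_TOKENS_PER_SEC = 351_000

# Hypothetical corpus size -- NOT a number from this page.
corpus_tokens = 100e9  # 100B tokens

seconds = corpus_tokens / PEAK_TOKENS_PER_SEC
days = seconds / 86_400  # seconds per day
print(f"~{days:.1f} days per pass over the corpus at peak throughput")
```

At sustained peak rate, a single pass over a 100B-token corpus would take roughly 3.3 days on the same 4× H100 node.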

Priscus P1 configurations

Three configurations trained on shared hardware and a common corpus. Each was measured on the same held-out cross-domain validation set.

Configuration · Params · Val Loss
Priscus P1 (flagship) · 431M · 0.28
Best-performing configuration, held-out validation
Priscus P1-XL · 408M · 0.78
Alternate configuration under evaluation
Priscus P1-mini · 178M · 0.80
Competitive quality at under half the parameter count
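The validation losses above can also be read as perplexities. A minimal sketch, assuming the published losses are mean cross-entropy in nats (standard practice, though the log base is not stated on this page):

```python
import math

# Published validation losses for the three P1 configurations (from the
# table above). Perplexity = exp(loss), assuming loss is mean
# cross-entropy in nats -- an assumption, not stated on the page.
losses = {
    "Priscus P1 (flagship)": 0.28,
    "Priscus P1-XL": 0.78,
    "Priscus P1-mini": 0.80,
}

for name, loss in losses.items():
    print(f"{name}: perplexity ~ {math.exp(loss):.2f}")
```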

Validation loss vs industry baselines

Every model below was trained on the same hardware and the same corpus, so held-out cross-domain validation loss directly measures how much learning each training token produced. Lower is dramatically better, and the gap is not subtle.

DDG · 431M
Priscus P1 (flagship)
0.28
val loss
Flagship configuration — baseline
DDG · 178M
Priscus P1-mini
0.80
val loss
~2.9× the flagship loss · still ~6× lower than industry baselines
Alibaba · 403M
Qwen 2.5 0.5B
5.00
val loss
~17.9× the flagship loss
Hugging Face · 363M
SmolLM2 360M
5.02
val loss
~17.9× the flagship loss

Reading the chart: every training token consumed by Priscus P1 produces roughly 18× the validation-loss reduction of the same token consumed by a comparable-scale dense transformer. The compact 178M configuration still outperforms 400M-class baselines by ~6× at under half the parameter count.

Comparisons run on shared hardware and a common corpus. Validation harness, comparator configuration, and vocabulary-normalization methodology documented in full under NDA.
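The ratios quoted in the chart follow directly from the published validation losses. A short sketch recomputing them from the numbers above:

```python
# Recompute the headline ratios from the validation losses in the chart.
# All loss values are taken from this page.
flagship = 0.28  # Priscus P1 (flagship), 431M
mini = 0.80      # Priscus P1-mini, 178M
baselines = {
    "Qwen 2.5 0.5B": 5.00,
    "SmolLM2 360M": 5.02,
}

for name, loss in baselines.items():
    print(f"{name}: {loss / flagship:.1f}x the flagship loss, "
          f"{loss / mini:.1f}x the mini loss")
```

The quoted ~17.9× (flagship) and ~6× (mini) figures fall out of these divisions directly.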

Three Pillars

Why the numbers are what they are, without revealing how

The results on the page above are direct outcomes of three design principles. We can describe the principles. The mechanisms that implement them are held under NDA.

01

Learning Density

Every training token buys dramatically more validation-loss reduction than a comparable dense transformer on the same corpus. The separation is not marginal — it is multi-fold, and it holds across held-out domains.

02

Parameter Discipline

Reaching competitive quality at substantially lower parameter counts changes the economics of training and the reach of deployment. A configuration at under half the usual parameter count still outperforms larger prior-generation baselines.

03

Production Viability

Training throughput is on par with production open-weight models, not a research curiosity. The design is engineered from the outset for real inference workloads, not just research-bench numbers.

WHY THE DESIGN IS CONFIDENTIAL

The results page is public. The architecture is not.

The design is protected as a trade secret. We are not disclosing how the architecture is structured, how training proceeds, or what the underlying primitives are. Serious partners — investors, strategic customers, acquirers — can arrange an NDA briefing via the waitlist below.

Research Waitlist

Get notified when we share more

Paper releases, technical briefings, early access, and investor conversations all flow through this list. No newsletter spam — we email only when there's a concrete milestone.

We'll never share your email. Unsubscribe any time.