Research: Introducing Nico 2.5, our frontier GTM foundation model with 1M context. Read the paper.
Frontier AI research lab
Est. 2020 · New York · San Francisco · Bucharest · Zug

Agents that reason, collaborate, and deliver.

Macrodeep is a frontier AI research lab. We train our own specialized industry models and build the autonomous agents, platform, and tools that run on them — so every team can ship pipeline, code, and decisions at a scale no team has touched before.

Autonomous sales intelligence · Fleet-scale coding agents · Frontier GTM foundation model · Trust-gated execution · Human-in-the-loop by design
Specialized AI agents, shared brain
1M context window on Nico 2.5
Real GTM training trajectories
Average pipeline lift in month one
What we believe

Three things most labs still get wrong.

I.

Specialists beat generalists.

The next decade of AI is about depth, not breadth. A model post-trained on a real distribution will out-reason a generalist chat model an order of magnitude larger — at a tenth the cost. We build the specialists.

II.

Reasoning is not the same as completion.

Most models guess. We train ours to plan, act, and revise — against verifiable outcomes, not preference ratings. Replies sent. Meetings booked. Deals won. Code that compiles and ships.

III.

Autonomy belongs in the work, not the workflow.

The future is not chat. It is autonomous systems that own outcomes — research, draft, send, follow up, update the CRM, log the call, escalate to a human only when trust gates demand it.

Building the first artificial general agents.

Autonomous AI will reshape every industry. Macrodeep is building the agents — and the infrastructure to run them — so every business can operate at a scale that was never possible before.

Macrodeep Research

Frontier research, shipped as product.

We train our own specialized industry models and the agentic systems that run on them. Our research moves straight from the lab into production — no middlemen, no generalist chat wrappers.

Model releases

In production today.

Nico 2.5 · GTM · 1M context, retrieval-native · Production
Nico Code · Coding · agentic SWE-bench 76–82% · Roadmap
Olivia v3 · Conversation · grounded dialog, citations · Production
Cortex · Reasoning · trust-gated action policy · Production
Field notes

From the lab, this week.

  1. Nico 2.5 reaches 87% reply-rate uplift over Claude Sonnet on the internal outbound eval (n=12,000).

  2. Salestools GTM agent fleet processed 480K autonomous CRM writes across design partners last week.

  3. Cortex routing live in production for 12 enterprise customers — trust-gated execution, full audit trail.

  4. Grain ships native macOS desktop app with bundled local executor — zero SaaS lock-in.

  5. Nico 2.5 architecture, training stages, and corpus published. Read the paper.

  6. Training-data-at-scale post: how we curated the corpora behind Nico 2.5.

Evaluations

A specialist beats a generalist on every revenue task we measured.

Internal head-to-head benchmarks. Same prompts, same accounts, same scoring rubrics. Numbers refresh per release.

| Task | Nico 2.5 | GPT-4o | Claude Sonnet |
| --- | --- | --- | --- |
| First-touch reply rate (B2B outbound, n=12K) | 8.4% | 4.1% | 4.6% |
| Account brief, citation accuracy | 94% | 71% | 78% |
| Stakeholder identification @top-3 | 0.81 | 0.62 | 0.65 |
| Multi-step pipeline workflow (success) | 76% | 48% | 52% |
| Cost per million output tokens (USD) | $0.30 | $10.00 | $15.00 |

Methodology and full results in the Nico 2.5 model card.
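
For readers who want to compare the columns directly, the following is a minimal sketch of the arithmetic, not part of the published methodology. The figures are copied from the evaluation table; the names `SCORES`, `COST_USD`, `relative_lift`, `cost_ratio`, and `summarize` are illustrative assumptions.

```python
from typing import List

# Figures copied from the evaluation table; all names here are illustrative.
SCORES = {
    "First-touch reply rate (%)": {"Nico 2.5": 8.4, "GPT-4o": 4.1, "Claude Sonnet": 4.6},
    "Citation accuracy (%)": {"Nico 2.5": 94, "GPT-4o": 71, "Claude Sonnet": 78},
    "Stakeholder identification @top-3": {"Nico 2.5": 0.81, "GPT-4o": 0.62, "Claude Sonnet": 0.65},
    "Multi-step workflow success (%)": {"Nico 2.5": 76, "GPT-4o": 48, "Claude Sonnet": 52},
}
COST_USD = {"Nico 2.5": 0.30, "GPT-4o": 10.00, "Claude Sonnet": 15.00}


def relative_lift(candidate: float, baseline: float) -> float:
    """Fractional improvement of candidate over baseline (0.5 means +50%)."""
    return (candidate - baseline) / baseline


def cost_ratio(baseline_cost: float, candidate_cost: float) -> float:
    """How many times more the baseline costs than the candidate."""
    return baseline_cost / candidate_cost


def summarize(candidate: str = "Nico 2.5") -> List[str]:
    """Build one comparison line per (task, other model) pair, then cost lines."""
    lines = []
    for task, row in SCORES.items():
        for model, score in row.items():
            if model != candidate:
                lift = relative_lift(row[candidate], score)
                lines.append(f"{task}: {lift:+.1%} vs {model}")
    for model, cost in COST_USD.items():
        if model != candidate:
            ratio = cost_ratio(cost, COST_USD[candidate])
            lines.append(f"Output-token cost: {model} is {ratio:.1f}x {candidate}")
    return lines


if __name__ == "__main__":
    print("\n".join(summarize()))
```

Relative lift is computed against each baseline separately, so the same row yields a different percentage depending on which model it is compared with.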

Products

A full stack for autonomous work.

One mission, four surfaces. The flagship agent, the platform it runs in, the model that powers them, and the autonomous sales intelligence that feeds them.

Flagship · GTM Agent

Salestools GTM.

An autonomous revenue teammate, not another chat wrapper.

It researches each account, personalizes every touch, enriches pipeline, and runs multi-step GTM workflows inside the CRM and inbox you already use — with humans in the loop where trust requires it.

Principles

How we build.

Three commitments that shape every product decision.

01
Autonomy with oversight
Trust-gated execution. Agents do the work; humans approve what matters. Every action is logged, replayable, and reversible.
02
Grounded reasoning
No hallucinated claims. Every output ships with a citation trail — CRM fields, research sources, call transcripts, prior deal history.
03
Privacy by default
Your CRM and prospect data is never used to train models. Aggregate, opt-in telemetry only — and enterprise deployments can run entirely inside your network.
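
The first commitment, trust-gated execution, can be pictured with a short sketch. This is an illustration under stated assumptions, not Macrodeep's implementation: the `Action` and `TrustGate` classes, the numeric `risk` score, the `threshold`, and the `approve` callback are all hypothetical. Low-risk actions run directly, higher-risk ones wait for a human decision, every attempt is logged, and applied actions can be undone.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    name: str                      # e.g. "log_call" (hypothetical action name)
    risk: float                    # hypothetical risk score in [0, 1]
    apply: Callable[[], object]    # performs the side effect
    undo: Callable[[], object]     # reverses the side effect


@dataclass
class TrustGate:
    threshold: float                       # risk at or above this needs approval
    approve: Callable[[Action], bool]      # human-in-the-loop decision callback
    log: List[dict] = field(default_factory=list)
    applied: List[Action] = field(default_factory=list, repr=False)

    def execute(self, action: Action) -> bool:
        """Run the action if it is low-risk or a human approves; log either way."""
        escalated = action.risk >= self.threshold
        approved = self.approve(action) if escalated else True
        if approved:
            action.apply()
            self.applied.append(action)
        self.log.append({"action": action.name, "risk": action.risk,
                         "escalated": escalated, "executed": approved})
        return approved

    def rollback(self) -> None:
        """Undo applied actions in reverse order, logging each reversal."""
        while self.applied:
            action = self.applied.pop()
            action.undo()
            self.log.append({"action": action.name, "event": "undone"})


# Usage: a reviewer who declines every escalated action.
crm = {"stage": "prospect"}
gate = TrustGate(threshold=0.5, approve=lambda action: False)
log_call = Action("log_call", 0.1,
                  apply=lambda: crm.update(last_call="logged"),
                  undo=lambda: crm.pop("last_call", None))
send_contract = Action("send_contract", 0.9,
                       apply=lambda: crm.update(stage="contract_sent"),
                       undo=lambda: crm.update(stage="prospect"))
ran_low = gate.execute(log_call)        # below threshold, runs without review
ran_high = gate.execute(send_contract)  # escalated and declined, so not applied
```

The sketch covers logging and reversal only. Replaying a log would additionally require actions to be serializable, which is left out here.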

Start with the agents.

Deploy an autonomous team for sales or engineering in the time it takes to make a coffee.