OrcaPulse | AI Sales & Marketing Automation Platform

MODEL RATES

Per-token pricing by model

Rates below are OpenAI's published list prices × 1.2 OrcaPulse platform markup. Billing matches the exact tokens reported by the model after each request completes.

Model	Input (per 1M)	Output (per 1M)	Notes
GPT-4o mini (gpt-4o-mini)	$0.18	$0.72	Default model. Best fit for assistants, replies, and most workflow reasoning.
GPT-4.1 nano (gpt-4.1-nano)	$0.12	$0.48	Cheapest option. Short classification, tagging, and lightweight routing logic.
GPT-4.1 mini (gpt-4.1-mini)	$0.48	$1.92	Balanced option for richer reasoning without the GPT-4o price.
GPT-4o (gpt-4o)	$3.00	$12.00	Higher quality output for complex assistant conversations and generation.
GPT-4.1 (gpt-4.1)	$2.40	$9.60	Flagship reasoning model for demanding workflow logic.
GPT-3.5 Turbo (gpt-3.5-turbo)	$0.60	$1.80	Legacy chat model available for backwards compatibility.

All prices include the 20% OrcaPulse platform markup. Final cost per run depends on the prompt size and generated output length. Realtime and audio models also bill audio input/output tokens separately.
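As a rough illustration of how the table translates into a per-run charge, the sketch below multiplies token counts by the billed rates listed above. The function name `run_cost` and the example token counts are hypothetical; only the rates come from the table.

```python
from decimal import Decimal

# Billed rates in USD per 1M tokens (input, output), copied from the table
# above. The 20% platform markup is already included in these numbers.
BILLED_RATES = {
    "gpt-4o-mini": (Decimal("0.18"), Decimal("0.72")),
    "gpt-4.1-nano": (Decimal("0.12"), Decimal("0.48")),
    "gpt-4.1-mini": (Decimal("0.48"), Decimal("1.92")),
    "gpt-4o": (Decimal("3.00"), Decimal("12.00")),
    "gpt-4.1": (Decimal("2.40"), Decimal("9.60")),
    "gpt-3.5-turbo": (Decimal("0.60"), Decimal("1.80")),
}
PER_MILLION = Decimal(1_000_000)


def run_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the billed USD cost of one run (hypothetical helper)."""
    input_rate, output_rate = BILLED_RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / PER_MILLION


# Example with assumed token counts: 1,200 input and 300 output tokens.
print(run_cost("gpt-4o-mini", 1200, 300))
```

Decimal arithmetic is used here so that very small per-call amounts are not distorted by binary floating-point rounding.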

SCALE PLANNING

Token volume and cost planning

The published per-token rate is a clean starting point. As token volume grows, OrcaPulse can help tune model mix, throughput, and commercial structure.

Monthly token usage

Usage band	Commercial model
0 – 50M tokens / month: Self-serve per-token pricing with the 20% platform markup baked in.	Published rate
50M – 250M tokens / month: Teams moving into heavier workflow or assistant automation volume.	Talk to sales
250M+ tokens / month: Commercial review for larger AI workloads, throughput planning, and cost tuning.	Custom quote
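For capacity planning, the bands above can be expressed as a simple lookup. This is only a sketch: the function name `commercial_model` is hypothetical, and the handling of the exact boundaries (50M counted in the middle band, 250M in the top band) is an assumption, since the table does not say which band a boundary value belongs to.

```python
def commercial_model(monthly_tokens: int) -> str:
    """Map a monthly token volume to the commercial model in the table above."""
    if monthly_tokens < 0:
        raise ValueError("monthly_tokens must be non-negative")
    if monthly_tokens < 50_000_000:
        return "Published rate"
    if monthly_tokens < 250_000_000:  # boundary handling is an assumption
        return "Talk to sales"
    return "Custom quote"


print(commercial_model(120_000_000))
```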

Cost by workload

Workload type	Typical billed cost
Short routing logic	< $0.001 / call
Assistant response	~$0.001 – $0.003 / call
Long-form generation	~$0.005 – $0.02 / call
GPT-4o / 4.1 reasoning	5 – 20× the default rate

Estimates assume GPT-4o mini at typical prompt and completion sizes. Actual cost is recorded per run based on exact tokens used.
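To see where the multiplier for larger models comes from, the sketch below compares the billed cost of the same call on GPT-4o or GPT-4.1 against GPT-4o mini, using the published billed rates. The function name `multiplier` and the token counts are assumptions chosen for illustration, not documented typical sizes.

```python
from decimal import Decimal

# Billed rates in USD per 1M tokens (input, output) from the model rates table.
DEFAULT_RATES = (Decimal("0.18"), Decimal("0.72"))  # gpt-4o-mini
LARGER_RATES = {
    "gpt-4o": (Decimal("3.00"), Decimal("12.00")),
    "gpt-4.1": (Decimal("2.40"), Decimal("9.60")),
}


def multiplier(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost of a call on `model` divided by the same call on GPT-4o mini."""
    def cost(rates):
        return input_tokens * rates[0] + output_tokens * rates[1]
    return cost(LARGER_RATES[model]) / cost(DEFAULT_RATES)


# Assumed call size: 1,000 input tokens and 250 output tokens.
for name in LARGER_RATES:
    print(name, round(multiplier(name, 1000, 250), 2))
```

With these rates the input and output prices scale by the same factor for each model, so the multiplier does not depend on the prompt-to-completion mix: roughly 16.67× for GPT-4o and 13.33× for GPT-4.1, both inside the 5 – 20× range.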

PLATFORM DETAILS

How OrcaPulse LLM billing works

OrcaPulse charges for the exact tokens reported by OpenAI, priced at the OpenAI list rate × a flat 1.2 platform markup. There are no hidden per-use fees and no flat-rate abstraction on top.

Billing component	Current pricing	Notes
Billing unit	Per token	Input and output tokens are priced separately, matching the underlying OpenAI bill.
Platform markup	1.2× (20%)	Applied to actual OpenAI token cost. Covers platform hosting, orchestration, and support.
Credit equivalent	$1 USD = 1 credit	Credits are deducted after each operation based on the exact tokens used by the selected model.
Underlying token rates	OpenAI published	OrcaPulse tracks OpenAI list prices per model. See the model rates table for current numbers.
Audio & realtime	Separate audio rates	Realtime and audio models bill input/output audio tokens in addition to text tokens.
Whisper transcription	$0.006 / audio min × 1.2	Audio transcription is billed by input audio minute, also with the 20% platform markup.
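The transcription row can be worked through the same way. The sketch below applies the $0.006 per audio minute rate and the 1.2 markup from the table, then converts to credits at $1 = 1 credit. The function name `transcription_credits` is hypothetical, and prorating fractional minutes is an assumption; the page does not state how partial minutes are rounded.

```python
from decimal import Decimal

WHISPER_RATE_PER_MIN = Decimal("0.006")  # USD per audio minute, from the table
PLATFORM_MARKUP = Decimal("1.2")         # flat 20% markup
CREDITS_PER_USD = Decimal("1")           # $1 USD = 1 credit


def transcription_credits(audio_minutes: Decimal) -> Decimal:
    """Credits deducted for a transcription of the given length.

    Assumption: fractional minutes are prorated rather than rounded up.
    """
    billed_usd = audio_minutes * WHISPER_RATE_PER_MIN * PLATFORM_MARKUP
    return billed_usd * CREDITS_PER_USD


# A 10-minute recording: 10 × 0.006 × 1.2 = 0.072 credits.
print(transcription_credits(Decimal("10")))
```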

ENTERPRISE

LLM pricing built for automation at scale

Contact us

For teams running heavy AI workloads across assistants, workflow decisions, and content generation, we offer custom annual commitments with throughput planning and commercial review.

Volume-based terms: Lower committed pricing for sustained token volume

Model mix tuning: Align cost, latency, and output quality across different workload classes

Prompt efficiency support: Reduce cost through better automation design and shorter prompts

Reliability planning: Design for throughput, fallback behavior, and stable AI operations across the platform

FAQ

Frequently asked questions

How is OrcaPulse LLM priced?

Per token. Every AI call bills the exact input + output tokens returned by the model, priced at the OpenAI list rate × 1.2 OrcaPulse markup. For GPT-4o mini that is $0.18 per 1M input tokens and $0.72 per 1M output tokens.

Why is there a 20% markup?

The markup covers OrcaPulse platform hosting, orchestration, retries, logging, and support on top of the raw OpenAI token cost.

Which model does OrcaPulse use by default?

GPT-4o mini is the default. It balances cost and quality for assistants, workflow reasoning, generated replies, and most AI-powered tasks. You can pick a different model on assistants and workflow steps that need it.

Can I see the real token cost per run?

Yes. Every AI operation records the exact token usage, the actual OpenAI cost, the 1.2× billed amount, and the credits deducted. You can review this in the billing and usage views inside the app.
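A per-run record with those four pieces of information might look like the sketch below. The `RunRecord` class, its field names, and `record_run` are hypothetical and do not describe OrcaPulse's actual schema. The list rates in the example ($0.15 and $0.60 per 1M tokens for GPT-4o mini) are the billed rates divided by 1.2.

```python
from dataclasses import dataclass
from decimal import Decimal

PLATFORM_MARKUP = Decimal("1.2")
PER_MILLION = Decimal(1_000_000)


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical shape of one billing entry."""
    model: str
    input_tokens: int
    output_tokens: int
    openai_cost_usd: Decimal  # raw cost at list price
    billed_usd: Decimal       # raw cost × 1.2
    credits: Decimal          # $1 USD = 1 credit


def record_run(model: str, input_tokens: int, output_tokens: int,
               list_input_rate: Decimal, list_output_rate: Decimal) -> RunRecord:
    """Build a record from token counts and list rates (USD per 1M tokens)."""
    openai_cost = (input_tokens * list_input_rate
                   + output_tokens * list_output_rate) / PER_MILLION
    billed = openai_cost * PLATFORM_MARKUP
    return RunRecord(model, input_tokens, output_tokens,
                     openai_cost, billed, credits=billed)


# Example with assumed token counts for one GPT-4o mini call.
rec = record_run("gpt-4o-mini", 2000, 500, Decimal("0.15"), Decimal("0.60"))
print(rec.openai_cost_usd, rec.billed_usd, rec.credits)
```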

Do larger models cost more?

Yes. GPT-4o and GPT-4.1 cost significantly more per token than GPT-4o mini or GPT-4.1 nano. The markup stays flat at 1.2× regardless of model, so the price ratio between models is the same as in OpenAI's list prices.

Do you offer discounted LLM pricing at scale?

Yes. Once your monthly token volume becomes large enough, OrcaPulse can review throughput, model mix, and automation design for custom commercial terms.

Run assistants and AI workflows on real OpenAI token pricing.

Your team shouldn't stop when you sleep.
