OrcaPulse LLM

Run assistants and AI workflows on real OpenAI token pricing.

Pricing summary

OrcaPulse LLM usage is billed at OpenAI token rates plus a flat 20% platform markup. The default model, GPT-4o mini, is billed at $0.18 per 1M input tokens and $0.72 per 1M output tokens.

The blended average on the default model works out to about $0.45 per 1M tokens, assuming an even split between input and output tokens.

One credit ($1.00) covers roughly 2.2 million tokens (about 2,222,222) at the GPT-4o mini blended rate.

Model mix, prompt size, and output length all influence the final cost. Heavier models and longer completions cost proportionally more.
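The blended rate and the credit figure above follow from simple arithmetic. Here is a minimal Python sketch, assuming the blend is an even 50/50 split of input and output tokens (the page does not state the split, but an even split reproduces the $0.45 figure). The function names are illustrative and not part of any OrcaPulse API.

```python
# Billed GPT-4o mini rates from this page, in USD per 1M tokens (markup included).
INPUT_RATE = 0.18
OUTPUT_RATE = 0.72


def blended_rate(input_share: float = 0.5) -> float:
    """USD per 1M tokens when `input_share` of all tokens are input tokens.

    The 0.5 default is an assumption; real workloads rarely split evenly.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return INPUT_RATE * input_share + OUTPUT_RATE * (1.0 - input_share)


def tokens_per_credit(rate_per_million: float, credit_usd: float = 1.00) -> int:
    """Whole tokens covered by one credit at the given blended rate."""
    return int(credit_usd / rate_per_million * 1_000_000)


print(round(blended_rate(), 2))           # 0.45
print(tokens_per_credit(blended_rate()))  # 2222222
```

Shifting `input_share` toward 1.0 lowers the blended rate, because input tokens on this model cost a quarter of what output tokens cost.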

MODEL RATES

Per-token pricing by model

Rates below are OpenAI's published list prices multiplied by the 1.2× OrcaPulse platform markup. Billing uses the exact token counts reported by the model after each request completes.

| Model | Input (per 1M) | Output (per 1M) | Notes |
| --- | --- | --- | --- |
| GPT-4o mini (gpt-4o-mini) | $0.18 | $0.72 | Default model. Best fit for assistants, replies, and most workflow reasoning. |
| GPT-4.1 nano (gpt-4.1-nano) | $0.12 | $0.48 | Cheapest option. Short classification, tagging, and lightweight routing logic. |
| GPT-4.1 mini (gpt-4.1-mini) | $0.48 | $1.92 | Balanced option for richer reasoning without the GPT-4o price. |
| GPT-4o (gpt-4o) | $3.00 | $12.00 | Higher quality output for complex assistant conversations and generation. |
| GPT-4.1 (gpt-4.1) | $2.40 | $9.60 | Flagship reasoning model for demanding workflow logic. |
| GPT-3.5 Turbo (gpt-3.5-turbo) | $0.60 | $1.80 | Legacy chat model available for backwards compatibility. |

All prices include the 20% OrcaPulse platform markup. Final cost per run depends on the prompt size and generated output length. Realtime and audio models also bill audio input/output tokens separately.
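To make the per-request arithmetic concrete, here is a small Python sketch built on the rates in the table above. The helper names are hypothetical, and the "implied list rate" is simply the billed rate divided by 1.2; this page does not quote the pre-markup prices directly.

```python
# Billed rates in USD per 1M tokens as (input, output), copied from the table above.
# They already include the 20% platform markup.
BILLED_RATES = {
    "gpt-4o-mini": (0.18, 0.72),
    "gpt-4.1-nano": (0.12, 0.48),
    "gpt-4.1-mini": (0.48, 1.92),
    "gpt-4o": (3.00, 12.00),
    "gpt-4.1": (2.40, 9.60),
    "gpt-3.5-turbo": (0.60, 1.80),
}
MARKUP = 1.2  # flat platform markup stated on this page


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Billed USD for one request, pricing input and output tokens separately."""
    input_rate, output_rate = BILLED_RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def implied_list_rates(model: str) -> tuple[float, float]:
    """Back out the pre-markup rates by dividing the billed rates by the markup."""
    input_rate, output_rate = BILLED_RATES[model]
    return (input_rate / MARKUP, output_rate / MARKUP)


# Example with made-up token counts: 10,000 input and 2,000 output tokens.
print(f"{request_cost('gpt-4o-mini', 10_000, 2_000):.6f}")  # 0.003240
```

Audio and realtime tokens are left out of this sketch because the page does not list their rates.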

SCALE PLANNING

Token volume and cost planning

The published per-token rate is a clean starting point. As token volume grows, OrcaPulse can help tune model mix, throughput, and commercial structure.

Monthly token usage

| Usage band | Description | Commercial model |
| --- | --- | --- |
| 0 – 50M tokens / month | Self-serve per-token pricing with the 20% platform markup baked in. | Published rate |
| 50M – 250M tokens / month | Teams moving into heavier workflow or assistant automation volume. | Talk to sales |
| 250M+ tokens / month | Commercial review for larger AI workloads, throughput planning, and cost tuning. | Custom quote |
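For capacity planning scripts, the usage bands above can be expressed as a simple lookup. This is only a sketch: the function name is hypothetical, and because the page does not say which band a boundary value falls into, the code assumes that exactly 50M stays on the published rate and exactly 250M stays in the middle band.

```python
def commercial_model(monthly_tokens: int) -> str:
    """Map a monthly token volume to the commercial model named on this page.

    Boundary handling (<= 50M, <= 250M) is an assumption, not a documented rule.
    """
    if monthly_tokens < 0:
        raise ValueError("monthly_tokens must be non-negative")
    if monthly_tokens <= 50_000_000:
        return "Published rate"
    if monthly_tokens <= 250_000_000:
        return "Talk to sales"
    return "Custom quote"


print(commercial_model(12_000_000))   # Published rate
print(commercial_model(180_000_000))  # Talk to sales
print(commercial_model(400_000_000))  # Custom quote
```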

Cost by workload

| Workload type | Typical billed cost |
| --- | --- |
| Short routing logic | < $0.001 / call |
| Assistant response | ~$0.001 – $0.003 / call |
| Long-form generation | ~$0.005 – $0.02 / call |
| GPT-4o / 4.1 reasoning | 5 – 20× the default rate |

Estimates assume GPT-4o mini at typical prompt and completion sizes. Actual cost is recorded per run based on exact tokens used.
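A rough way to check these estimates against your own workloads is to plug token counts into the published rates. The sketch below is illustrative only: the token sizes are made-up examples, since the page does not define what "typical" prompt and completion sizes are, and the function names are hypothetical.

```python
# Billed USD per 1M tokens as (input, output), taken from the model rates table.
RATES = {
    "gpt-4o-mini": (0.18, 0.72),
    "gpt-4o": (3.00, 12.00),
    "gpt-4.1": (2.40, 9.60),
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Billed USD for a single call at the published rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def multiplier_vs_default(model: str, input_tokens: int, output_tokens: int) -> float:
    """How many times more the same call costs than on GPT-4o mini."""
    default = call_cost("gpt-4o-mini", input_tokens, output_tokens)
    return call_cost(model, input_tokens, output_tokens) / default


# Hypothetical assistant response: 3,000 prompt tokens and 800 completion tokens.
print(f"{call_cost('gpt-4o-mini', 3_000, 800):.6f}")         # 0.001116
print(f"{multiplier_vs_default('gpt-4o', 3_000, 800):.1f}")   # 16.7
print(f"{multiplier_vs_default('gpt-4.1', 3_000, 800):.1f}")  # 13.3
```

Because the markup is flat and both models keep the same 4:1 output-to-input price ratio as GPT-4o mini, the multiplier here does not depend on the token mix.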

PLATFORM DETAILS

How OrcaPulse LLM billing works

OrcaPulse charges for the exact tokens reported by OpenAI, multiplied by a flat 1.2× platform markup. There are no hidden per-use fees and no flat-rate abstraction on top.

| Billing component | Current pricing | Notes |
| --- | --- | --- |
| Billing unit | Per token | Input and output tokens are priced separately, matching the underlying OpenAI bill. |
| Platform markup | 1.2× (20%) | Applied to actual OpenAI token cost. Covers platform hosting, orchestration, and support. |
| Credit equivalent | $1 USD = 1 credit | Credits are deducted after each operation based on the exact tokens used by the selected model. |
| Underlying token rates | OpenAI published | OrcaPulse tracks OpenAI list prices per model. See the model rates table for current numbers. |
| Audio & realtime | Separate audio rates | Realtime and audio models bill input/output audio tokens in addition to text tokens. |
| Whisper transcription | $0.006 / audio min × 1.2 | Audio transcription is billed by input audio minute, also with the 20% platform markup. |
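The billing components above combine into a short calculation per operation. The following Python sketch is an assumption-laden illustration, not OrcaPulse's actual implementation: the `UsageRecord` fields and function names are hypothetical, no rounding is applied because the page does not describe any, and the GPT-4o mini list rates in the example ($0.15 and $0.60 per 1M) are inferred by dividing the published rates by 1.2.

```python
from dataclasses import dataclass

MARKUP = 1.2                  # flat platform markup from this page
CREDIT_USD = 1.00             # $1 USD = 1 credit
WHISPER_LIST_PER_MIN = 0.006  # USD per audio minute before markup


@dataclass(frozen=True)
class UsageRecord:
    """Hypothetical shape of one billed operation."""
    input_tokens: int
    output_tokens: int
    openai_cost_usd: float   # raw token cost before markup
    billed_usd: float        # raw cost x 1.2
    credits_deducted: float  # billed amount expressed in credits


def bill_tokens(input_tokens: int, output_tokens: int,
                list_input_per_million: float,
                list_output_per_million: float) -> UsageRecord:
    """Price input and output tokens separately, then apply the flat markup."""
    openai_cost = (input_tokens * list_input_per_million
                   + output_tokens * list_output_per_million) / 1_000_000
    billed = openai_cost * MARKUP
    return UsageRecord(input_tokens, output_tokens, openai_cost,
                       billed, billed / CREDIT_USD)


def bill_transcription(audio_minutes: float) -> float:
    """Billed USD for Whisper transcription: minutes x $0.006 x 1.2."""
    return audio_minutes * WHISPER_LIST_PER_MIN * MARKUP


record = bill_tokens(1_000_000, 1_000_000, 0.15, 0.60)
print(f"{record.openai_cost_usd:.2f} -> {record.billed_usd:.2f}")  # 0.75 -> 0.90
print(f"{bill_transcription(10):.3f}")                            # 0.072
```

Audio and realtime token billing is omitted because the page states that separate rates apply but does not list them.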

ENTERPRISE

LLM pricing built for automation at scale

Contact us

For teams running heavy AI workloads across assistants, workflow decisions, and content generation, we offer custom annual commitments with throughput planning and commercial review.

Volume-based terms: Lower committed pricing for sustained token volume
Model mix tuning: Align cost, latency, and output quality across different workload classes
Prompt efficiency support: Reduce cost through better automation design and shorter prompts
Reliability planning: Design for throughput, fallback behavior, and stable AI operations across the platform

FAQ

Frequently asked questions

How is LLM usage billed?
Per token. Every AI call bills the exact input + output tokens reported by the model, priced at the OpenAI list rate × the 1.2 OrcaPulse markup. For GPT-4o mini that is $0.18 per 1M input tokens and $0.72 per 1M output tokens.

What does the 20% markup cover?
The markup covers OrcaPulse platform hosting, orchestration, retries, logging, and support on top of the raw OpenAI token cost.

Which model is used by default?
GPT-4o mini is the default. It balances cost and quality for assistants, workflow reasoning, generated replies, and most AI-powered tasks. You can pick a different model on assistants and workflow steps that need it.

Can I see the token usage and cost of each AI operation?
Yes. Every AI operation records the exact token usage, the actual OpenAI cost, the 1.2× billed amount, and the credits deducted. You can review this in the billing and usage views inside the app.

Do different models cost different amounts?
Yes. GPT-4o and GPT-4.1 cost significantly more per token than GPT-4o mini or GPT-4.1 nano. The markup stays flat at 1.2× regardless of model, so the price ratio between models matches OpenAI's.

Is custom pricing available at higher volumes?
Yes. Once your monthly token volume becomes large enough, OrcaPulse can review throughput, model mix, and automation design for custom commercial terms.
