Per token. Every AI call bills the exact input + output tokens returned by the model, priced at the OpenAI list rate × 1.2 OrcaPulse markup. For GPT-4o mini that is $0.18 per 1M input tokens and $0.72 per 1M output tokens.

OrcaPulse LLM usage is billed on OpenAI token rates with a flat 20% platform markup. Default model (GPT-4o mini) starts at $0.18 per 1M input tokens and $0.72 per 1M output tokens.
Blended average on the default model works out to about $0.45 per 1M tokens.
One credit ($1.00) covers roughly 2,222,222 tokens at the GPT-4o mini blended rate.
Model mix, prompt size, and output length all influence the final cost. Heavier models and longer completions cost proportionally more.
Rates below are OpenAI's published list prices × 1.2 OrcaPulse platform markup. Billing matches the exact tokens reported by the model after each request completes.
| Model | Input (per 1M) | Output (per 1M) | Notes |
|---|---|---|---|
| GPT-4o minigpt-4o-mini | $0.18 | $0.72 | Default model. Best fit for assistants, replies, and most workflow reasoning. |
| GPT-4.1 nanogpt-4.1-nano | $0.12 | $0.48 | Cheapest option. Short classification, tagging, and lightweight routing logic. |
| GPT-4.1 minigpt-4.1-mini | $0.48 | $1.92 | Balanced option for richer reasoning without the GPT-4o price. |
| GPT-4ogpt-4o | $3.00 | $12.00 | Higher quality output for complex assistant conversations and generation. |
| GPT-4.1gpt-4.1 | $2.40 | $9.60 | Flagship reasoning model for demanding workflow logic. |
| GPT-3.5 Turbogpt-3.5-turbo | $0.6 | $1.80 | Legacy chat model available for backwards compatibility. |
All prices include the 20% OrcaPulse platform markup. Final cost per run depends on the prompt size and generated output length. Realtime and audio models also bill audio input/output tokens separately.
The published per-token rate is a clean starting point. As token volume grows, OrcaPulse can help tune model mix, throughput, and commercial structure.
| Usage band | Commercial model |
|---|---|
| 0 – 50M tokens / monthSelf-serve per-token pricing with the 20% platform markup baked in. | Published rate |
| 50M – 250M tokens / monthTeams moving into heavier workflow or assistant automation volume. | Talk to sales |
| 250M+ tokens / monthCommercial review for larger AI workloads, throughput planning, and cost tuning. | Custom quote |
| Workload type | Typical billed cost |
|---|---|
| Short routing logic | < $0.001 / call |
| Assistant response | ~$0.001 – $0.003 / call |
| Long-form generation | ~$0.005 – $0.02 / call |
| GPT-4o / 4.1 reasoning | 5 – 20× the default rate |
Estimates assume GPT-4o mini at typical prompt and completion sizes. Actual cost is recorded per run based on exact tokens used.
OrcaPulse charges the exact tokens returned by OpenAI × a flat 1.2 platform markup. No hidden per-use fees and no flat-rate abstraction on top.
| Billing component | Current pricing | Notes |
|---|---|---|
| Billing unit | Per token | Input and output tokens are priced separately, matching the underlying OpenAI bill. |
| Platform markup | 1.2× (20%) | Applied to actual OpenAI token cost. Covers platform hosting, orchestration, and support. |
| Credit equivalent | $1 USD = 1 credit | Credits are deducted after each operation based on the exact tokens used by the selected model. |
| Underlying token rates | OpenAI published | OrcaPulse tracks OpenAI list prices per model. See the model rates table for current numbers. |
| Audio & realtime | Separate audio rates | Realtime and audio models bill input/output audio tokens in addition to text tokens. |
| Whisper transcription | $0.006 / audio min × 1.2 | Audio transcription is billed by input audio minute, also with the 20% platform markup. |
Per token. Every AI call bills the exact input + output tokens returned by the model, priced at the OpenAI list rate × 1.2 OrcaPulse markup. For GPT-4o mini that is $0.18 per 1M input tokens and $0.72 per 1M output tokens.
The markup covers OrcaPulse platform hosting, orchestration, retries, logging, and support on top of the raw OpenAI token cost.
GPT-4o mini is the default. It balances cost and quality for assistants, workflow reasoning, generated replies, and most AI-powered tasks. You can pick a different model on assistants and workflow steps that need it.
Yes. Every AI operation records the exact token usage, the actual OpenAI cost, the 1.2× billed amount, and the credits deducted. You can review this in the billing and usage views inside the app.
Yes. GPT-4o and GPT-4.1 cost significantly more per token than GPT-4o mini or GPT-4.1 nano. The markup stays flat at 1.2× regardless of model, so the ratio between models matches OpenAI.
Yes. Once your monthly token volume becomes large enough, OrcaPulse can review throughput, model mix, and automation design for custom commercial terms.
Capture leads, respond instantly, automate follow-ups, and convert more customers across every channel - 24/7.
Start Free Today