The Trinity Beast — Translation Parameters

Complete reference for all 39 runtime parameters governing the Translation Engine — Bedrock pricing, token estimation, cost controls, customer pricing, and batch inference configuration.

Parameters: 39
Categories: 5
Storage: Aurora translation_parameters
Cache: Valkey tx:params (5-min TTL)
Updated: June 2026

Table of Contents

  1. Overview
  2. How Parameters Work
  3. The Cost Formula
  4. Parameter Categories
    1. Bedrock Pricing
    2. Token Estimation
    3. Cost Controls
    4. Customer Pricing
    5. Batch Configuration
  5. Managing Parameters
  6. Nightly Sync Job
  7. Complete Parameter Reference

Overview

Translation parameters control every cost, pricing, and operational aspect of the Translation Engine (TBTS). They are stored in the translation_parameters Aurora table — the authoritative source — and cached in Valkey as the tx:params hash for fast reads by ECS containers and Lambda workers.

  • Separate from application parameters: Translation is a standalone service with its own cost model. These parameters live in translation_parameters, not application_parameters. This prevents a bad translation config change from affecting the LPO server, and vice versa.
  • Startup: ECS containers load tx:params from Valkey on startup. Lambda workers load it on cold start.
  • Polling: Containers refresh every 5 minutes. Changes apply within one poll cycle — no restart needed.
  • Write-through: Any update via the admin API writes to Aurora and Valkey simultaneously. In-memory values on the responding container update immediately.
  • Nightly sync: The sync job re-syncs all parameters from Aurora to Valkey each night, ensuring consistency after any direct Aurora edits.
  • Force reload: POST /admin/translate/params/reload flushes Valkey and reloads from Aurora immediately. All containers pick up changes on their next 5-minute poll.

Important: The legacy Valkey keys translation:rate:{model}:input and translation:rate:{model}:output are superseded by tx:params. The app:config keys translation_markup_pct and translation_infra_per_pair are also superseded. All translation cost logic now reads exclusively from tx:params. The legacy keys are retained for backward compatibility only.

How Parameters Work

-- Aurora: authoritative source
SELECT key, value, description, category, updated_at, updated_by
FROM translation_parameters
ORDER BY category, key;

-- Valkey: fast cache (tx:params hash, 5-min TTL)
HGETALL tx:params

-- Update a parameter (writes to Aurora + Valkey; in-memory values update immediately on the responding container)
POST /admin/translate/params
{"key": "pricing.markup_pct", "value": "35", "updated_by": "cory"}

-- Reload Valkey from Aurora (containers pick up changes on their next 5-minute poll)
POST /admin/translate/params/reload

-- View parameters in the TBI Administration dashboard
Translation tab → Translation Parameters section

Fallback Chain

Every parameter accessor follows a three-level fallback to ensure the system never returns zero for a cost value:

  1. Valkey tx:params hash — fast path, ~0.1ms, refreshed every 5 minutes
  2. Aurora translation_parameters table — authoritative source, ~5ms, used when Valkey is unavailable
  3. Hardcoded defaults — safe floor values matching the seed data, used only if both Aurora and Valkey are unreachable
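
The three levels above can be sketched as a single accessor. This is a minimal illustration, not the engine's actual code: get_param, the two lookup callables, and the HARDCODED_DEFAULTS dictionary are hypothetical names, and treating a missing key the same as an unreachable store is an assumption.

```python
from typing import Callable, Optional

# Hypothetical defaults table; the two values shown match the documented seed data.
HARDCODED_DEFAULTS = {
    "pricing.markup_pct": "30",
    "pricing.minimum_quote_usd": "0.99",
}

Lookup = Callable[[str], Optional[str]]


def get_param(key: str, valkey_get: Lookup, aurora_get: Lookup) -> str:
    """Resolve a parameter via Valkey, then Aurora, then hardcoded defaults."""
    for source in (valkey_get, aurora_get):
        try:
            value = source(key)
        except ConnectionError:
            continue  # store unreachable: fall through to the next level
        if value is not None:  # assumption: a missing key also falls through
            return value
    return HARDCODED_DEFAULTS[key]


def unreachable(key: str) -> Optional[str]:
    """Stand-in for a store that cannot be reached."""
    raise ConnectionError("store unavailable")


# Valkey down, Aurora up: the Aurora value is returned.
print(get_param("pricing.markup_pct", unreachable, lambda k: "35"))
# Both stores down: the hardcoded default is returned.
print(get_param("pricing.markup_pct", unreachable, unreachable))
```

Passing the lookups in as callables keeps the sketch independent of any particular Valkey or Aurora client library.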

The Cost Formula

The Translation Engine tracks every token at the most granular level Bedrock reports. The ground-truth cost calculation for any translation job is:

BedrockCostForTokens — the single source of truth for all cost calculations:

cost = (uncached_input_tokens  × input_rate_per_token)
     + (cached_tokens          × cache_read_rate_per_token)   ← 90% cheaper than input
     + (cache_write_tokens     × cache_write_rate_per_token)  ← one-time per cache population
     + (output_tokens          × output_rate_per_token)

Where uncached_input_tokens = total_input_tokens - cached_tokens. This formula is used everywhere actual cost is recorded: the finalize Lambda, SQS cost messages, the Aurora bedrock_cost_usd column, and the daily spend counter in Valkey.
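
As a worked example, the formula can be written out in a few lines. This is a sketch only: the function name bedrock_cost_usd and its signature are illustrative and are not the engine's BedrockCostForTokens implementation. Rates are passed in USD per 1M tokens, as they are stored in tx:params, and the token counts are made up for the example.

```python
def bedrock_cost_usd(total_input_tokens: int, cached_tokens: int,
                     cache_write_tokens: int, output_tokens: int,
                     input_per_1m: float, cache_read_per_1m: float,
                     cache_write_per_1m: float, output_per_1m: float) -> float:
    """Apply the four-term cost formula; rates are USD per 1M tokens."""
    uncached_input_tokens = total_input_tokens - cached_tokens
    total = (uncached_input_tokens * input_per_1m
             + cached_tokens * cache_read_per_1m
             + cache_write_tokens * cache_write_per_1m
             + output_tokens * output_per_1m)
    return total / 1_000_000


# Sonnet 4.6 on-demand defaults: 10,000 input tokens (3,000 of them cached),
# no cache write on this request, 7,600 output tokens.
cost = bedrock_cost_usd(10_000, 3_000, 0, 7_600,
                        input_per_1m=3.00, cache_read_per_1m=0.30,
                        cache_write_per_1m=3.75, output_per_1m=15.00)
print(f"{cost:.4f}")  # 0.1359
```

The breakdown is 7,000 × 3.00 + 3,000 × 0.30 + 7,600 × 15.00 = 135,900, divided by 1,000,000.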

Customer Price Formula

What we charge customers is derived from the Bedrock cost plus infrastructure overhead and a service fee markup:

subtotal    = bedrock_cost + infra_per_pair_usd
with_markup = subtotal × (1 + markup_pct / 100)

-- Standard tier (batch, 12-24h SLA):
customer_price = with_markup × (1 - standard_tier_discount_pct / 100)

-- Express tier (real-time, 1-6h SLA):
customer_price = with_markup × (1 + express_tier_premium_pct / 100)

-- Floor:
customer_price = max(customer_price, minimum_quote_usd)

Savings Stack

The three optimization layers compound multiplicatively for internal document translation:

Optimization | Mechanism | Savings | Parameter
Delta processing | Skip pairs where translation is newer than source | 70–90% reduction in work | Controlled by options.delta at job submission
Batch inference | Bedrock batch API vs on-demand | 50% off per token | bedrock.*.batch.*_per_1m
Prompt caching | System prompt + protected terms cached | 90% off cached tokens | bedrock.*.cache_read_per_1m
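
To see how the three layers multiply, here is a rough sketch of the fraction of input-side cost that remains after all of them apply. The function is illustrative; the 80% delta skip rate is an assumed midpoint of the 70–90% range, and the 30% cache hit rate is the estimation.cache_hit_rate default. Output tokens are not cached, so this covers input cost only.

```python
def remaining_input_cost_fraction(delta_skip: float, batch_discount: float,
                                  cache_hit_rate: float,
                                  cache_read_discount: float) -> float:
    """Fraction of the original input-token cost left after all three layers."""
    work = 1 - delta_skip                # share of pairs still translated
    rate = 1 - batch_discount            # batch per-token rate vs on-demand
    caching = 1 - cache_hit_rate * cache_read_discount  # blended input rate
    return work * rate * caching


fraction = remaining_input_cost_fraction(
    delta_skip=0.80, batch_discount=0.50,
    cache_hit_rate=0.30, cache_read_discount=0.90)
print(f"{fraction:.3f}")  # 0.073
```

Under these assumptions the input-side cost falls to roughly 7% of a full on-demand, uncached run.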

Parameter Categories

Bedrock Pricing

Live: update when AWS changes rates.

The actual per-token rates Amazon Bedrock charges us. These are our costs — not what we charge customers. Separate keys exist for each model, execution mode (realtime vs batch), and caching operation.

Batch inference is 50% of on-demand. Prompt cache reads are 90% off on-demand input. Cache writes are 1.25× on-demand (paid once per cache population).

Key | Default | Unit | Description
bedrock.sonnet46.realtime.input_per_1m | 3.00 | USD / 1M tokens | Sonnet 4.6 on-demand input. The "Express tier" rate: real-time jobs, 1–6h SLA.
bedrock.sonnet46.realtime.output_per_1m | 15.00 | USD / 1M tokens | Sonnet 4.6 on-demand output.
bedrock.sonnet46.batch.input_per_1m | 1.50 | USD / 1M tokens | Sonnet 4.6 batch inference input. 50% off on-demand. Used for Standard tier (12–24h SLA).
bedrock.sonnet46.batch.output_per_1m | 7.50 | USD / 1M tokens | Sonnet 4.6 batch inference output. 50% off on-demand.
bedrock.sonnet46.cache_write_per_1m | 3.75 | USD / 1M tokens | Sonnet 4.6 prompt cache write. 1.25× on-demand input, paid once when the cache is first populated.
bedrock.sonnet46.cache_read_per_1m | 0.30 | USD / 1M tokens | Sonnet 4.6 prompt cache read. 90% off on-demand input. Applied to system prompt + protected terms tokens on every cached request.
bedrock.haiku35.realtime.input_per_1m | 0.80 | USD / 1M tokens | Haiku 3.5 on-demand input. Used for Latin-script languages in auto-routing mode.
bedrock.haiku35.realtime.output_per_1m | 4.00 | USD / 1M tokens | Haiku 3.5 on-demand output.
bedrock.haiku35.batch.input_per_1m | 0.40 | USD / 1M tokens | Haiku 3.5 batch inference input.
bedrock.haiku35.batch.output_per_1m | 2.00 | USD / 1M tokens | Haiku 3.5 batch inference output.
bedrock.opus4.realtime.input_per_1m | 15.00 | USD / 1M tokens | Opus 4 on-demand input. Highest quality: critical documents, complex grammar.
bedrock.opus4.realtime.output_per_1m | 75.00 | USD / 1M tokens | Opus 4 on-demand output.
bedrock.opus4.batch.input_per_1m | 7.50 | USD / 1M tokens | Opus 4 batch inference input.
bedrock.opus4.batch.output_per_1m | 37.50 | USD / 1M tokens | Opus 4 batch inference output.
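
When AWS changes an on-demand rate, the dependent values can be recomputed from the ratios stated above (batch at 50%, cache read at 10% of input, cache write at 1.25× input). The helper below is a hypothetical convenience for checking the table, not part of the engine, and it assumes those ratios still hold after the price change.

```python
def derive_rates(input_per_1m: float, output_per_1m: float) -> dict:
    """Derive batch and cache rates (USD per 1M tokens) from on-demand rates."""
    return {
        "batch.input_per_1m": round(input_per_1m * 0.50, 2),
        "batch.output_per_1m": round(output_per_1m * 0.50, 2),
        "cache_write_per_1m": round(input_per_1m * 1.25, 2),
        "cache_read_per_1m": round(input_per_1m * 0.10, 2),
    }


# Sonnet 4.6 on-demand defaults: 3.00 input, 15.00 output.
sonnet = derive_rates(3.00, 15.00)
for name, rate in sonnet.items():
    print(f"bedrock.sonnet46.{name} = {rate:.2f}")
```

The derived Sonnet 4.6 values reproduce the defaults in the table (1.50, 7.50, 3.75, 0.30).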

Token Estimation

Derived from job history: update after each batch run.

Ratios used to estimate cost before a job runs — for customer quotes and pre-flight spend checks. These are derived from actual Aurora translation_job_events data and should be updated as real batch job data accumulates.

Key | Default | Unit | Description
estimation.tokens_per_byte | 0.64 | tokens / byte | Input tokens per translatable byte. Derived from job history; consistent at 0.63–0.66 across all document types. Update after each batch run from the translation:actuals:{model}:{mode} Valkey keys written by the nightly sync job.
estimation.output_input_ratio | 0.76 | ratio | Output tokens divided by input tokens. Varies by document type: code-heavy docs ~0.46 (code passes through untranslated), prose-heavy docs ~1.01. The 0.76 average is a reasonable middle ground for mixed technical documentation.
estimation.quote_padding_pct | 6 | percent | Padding applied to token estimates for customer quotes. A 6% buffer keeps quotes slightly conservative; actual cost is usually at or below the estimate. Absorbed by the markup buffer on prose-heavy documents.
estimation.cache_hit_rate | 0.30 | fraction (0–1) | Expected fraction of input tokens served from the prompt cache. The system prompt + protected terms list accounts for approximately 30% of total input tokens. Update from actual cache_hit_rate values in translation:actuals:* Valkey keys after batch jobs run.
estimation.cached_system_tokens | 1200 | tokens | Approximate token count of the cacheable system prompt block (system instructions + protected terms list). This is the fixed overhead per request, identical for every doc/lang pair. Must be a multiple of 3 per Trinity Beast convention.

How to update estimation parameters: After each batch run, the nightly sync job writes actual token ratios to Valkey keys like translation:actuals:claude-sonnet-4.6:batch. Review these values in the TBI Administration dashboard (Translation tab → Token Usage) and update the estimation parameters if the actuals drift significantly from the defaults. A 5% drift is normal; a 15%+ drift warrants an update.
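
A pre-flight estimate and the drift check described above might look like the following sketch. Both function names are hypothetical. Applying the padding to input and output estimates alike, and measuring drift relative to the current default, are assumptions; the source only gives the 5% and 15% thresholds.

```python
def estimate_tokens(translatable_bytes: int, tokens_per_byte: float = 0.64,
                    output_input_ratio: float = 0.76,
                    quote_padding_pct: float = 6) -> tuple:
    """Return padded (input_tokens, output_tokens) estimates for a quote."""
    input_tokens = translatable_bytes * tokens_per_byte
    output_tokens = input_tokens * output_input_ratio
    padding = 1 + quote_padding_pct / 100
    return round(input_tokens * padding), round(output_tokens * padding)


def drift_pct(actual: float, default: float) -> float:
    """Relative drift of an observed ratio from the configured default, in %."""
    return abs(actual - default) / default * 100


print(estimate_tokens(50_000))  # (33920, 25779)

# An observed output/input ratio of 0.90 against the 0.76 default is about
# 18% drift, past the 15% threshold that warrants an update.
drift = drift_pct(0.90, 0.76)
print(f"{drift:.1f}% drift, update warranted: {drift >= 15}")
```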

Cost Controls

Safety guardrails: prevent runaway spend.

Operational limits that prevent runaway Bedrock costs. The daily spend cap is a hard stop — no new jobs are accepted after it is reached. The token limit is a secondary guard that catches unusually large documents before they hit the dollar cap.

Key | Default | Unit | Description
controls.daily_spend_limit_usd | 600 | USD | Daily Bedrock spend cap. No new translation jobs are accepted after this limit is reached. Resets at midnight UTC. Must be a multiple of 3 (200×3). Tracked in Valkey key autoops:bedrock:spend:daily.
controls.daily_token_limit | 51,000,000 | tokens | Daily combined token limit (input + output). Secondary, model-agnostic guard. Catches unusually large documents before they hit the dollar cap. 51M = 17M×3. Tracked in Valkey keys autoops:bedrock:tokens:input:daily and autoops:bedrock:tokens:output:daily.
controls.max_docs_per_request | 6 | documents | Maximum documents per single POST /admin/translate request. Multiple of 3 per Trinity Beast convention. Customer quote submissions (TBTS) are unconstrained at the request layer; this limit applies to internal admin submissions only.
controls.max_active_jobs | 3 | jobs | Maximum concurrent active translation jobs. Multiple of 3. A warning is logged when this threshold is reached, but jobs are still accepted (the limit is advisory, not a hard stop).
controls.max_queue_depth | 12 | jobs | Maximum jobs in the translation queue. Multiple of 3. New submissions are rejected with a 429 when this depth is reached.
controls.batch_poll_interval_seconds | 300 | seconds | How often the Step Function polls Bedrock for batch job status. 300 = 5 minutes. Multiple of 3 (100×3). Shorter intervals increase API call costs; longer intervals delay job completion notification.
controls.batch_timeout_hours | 6 | hours | Hours before a Bedrock batch job is considered timed out and marked as failed. Multiple of 3 (2×3). AWS typically completes batch jobs within 1–3 hours; 6 hours provides a safe buffer.
controls.cost_per_chunk_floor | 0.03 | USD | Minimum cost-per-chunk floor used by the nightly sync job when calculating rolling averages. Prevents unrealistically low averages from skewing customer quotes during low-activity periods.
controls.cost_per_chunk_ceiling | 0.09 | USD | Maximum cost-per-chunk ceiling used by the nightly sync job. Caps runaway averages caused by anomalous jobs (e.g., a single very large document).
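
The hard and advisory limits above could combine in a pre-flight check along these lines. This is a sketch under stated assumptions: the Controls and Decision types and the preflight function are hypothetical, the order of the checks is a guess, and treating the token limit as a rejection (like the spend cap) is an assumption. Only the queue-depth rejection is documented as returning a 429.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Controls:
    daily_spend_limit_usd: float = 600.0
    daily_token_limit: int = 51_000_000
    max_active_jobs: int = 3
    max_queue_depth: int = 12


@dataclass
class Decision:
    accepted: bool
    reason: str = ""
    warnings: List[str] = field(default_factory=list)


def preflight(spend_usd: float, tokens: int, active_jobs: int,
              queue_depth: int, c: Controls = Controls()) -> Decision:
    """Decide whether a new translation job may be accepted."""
    if spend_usd >= c.daily_spend_limit_usd:
        return Decision(False, "daily spend cap reached")
    if tokens >= c.daily_token_limit:
        return Decision(False, "daily token limit reached")
    if queue_depth >= c.max_queue_depth:
        return Decision(False, "queue full (HTTP 429)")
    decision = Decision(True)
    if active_jobs >= c.max_active_jobs:
        # Advisory only: the job is still accepted, but a warning is recorded.
        decision.warnings.append("max_active_jobs reached (advisory)")
    return decision


busy = preflight(spend_usd=120.0, tokens=9_000_000, active_jobs=3, queue_depth=4)
print(busy.accepted, busy.warnings)
capped = preflight(spend_usd=600.0, tokens=9_000_000, active_jobs=0, queue_depth=0)
print(capped.accepted, capped.reason)
```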

Customer Pricing

What we charge, separate from what Bedrock charges us.

These parameters control what customers pay for TBTS translations. They are applied at quote time by CustomerPriceForPair() in the translation engine. Changing these takes effect immediately — no deploy needed.

Key | Default | Unit | Description
pricing.infra_per_pair_usd | 0.003 | USD / pair | Infrastructure cost per language pair. Covers ECS task time, S3 storage, CloudFront invalidation, Lambda invocations, and Step Function state transitions. Added to Bedrock cost before markup is applied.
pricing.markup_pct | 30 | percent | Service fee markup applied to (Bedrock cost + infra cost). Covers support, margin, and operational overhead. Applied at quote time and in all customer-facing emails. This is the primary lever for adjusting customer pricing: change it here and it takes effect everywhere immediately.
pricing.infra_markup_pct | 9 | percent | Infrastructure markup applied by the nightly sync job to the rolling 7-day cost-per-chunk averages. Covers ECS, S3, CloudFront, SQS, and Step Functions. Baked into the translation:cost_per_chunk:{model} Valkey keys. Separate from pricing.markup_pct; this is the cost side, not the revenue side.
pricing.standard_tier_discount_pct | 30 | percent | Discount for Standard tier quotes (batch inference, 12–24h SLA). We use batch inference for Standard tier so our Bedrock cost is also 50% lower; this discount passes some of those savings to the customer while maintaining margin.
pricing.express_tier_premium_pct | 0 | percent | Premium for Express tier quotes (real-time inference, 1–6h SLA). Currently 0: Express is priced at the same markup as Standard before the Standard discount. Set to a positive value to charge a premium for faster delivery.
pricing.minimum_quote_usd | 0.99 | USD | Minimum customer quote price. Prevents sub-dollar quotes that cost more to process (Stripe fees, email, support) than they earn. Applied as a floor after all markup calculations.

Full customer price formula:

subtotal    = bedrock_cost + pricing.infra_per_pair_usd
with_markup = subtotal × (1 + pricing.markup_pct / 100)

Standard tier:  price = with_markup × (1 - pricing.standard_tier_discount_pct / 100)
Express tier:   price = with_markup × (1 + pricing.express_tier_premium_pct / 100)

Final:          price = max(price, pricing.minimum_quote_usd)
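
The formula can be exercised with the default parameter values as follows. This is an illustrative sketch, not the engine's CustomerPriceForPair(); the function name, the tier strings, and the sample Bedrock costs are assumptions made for the example.

```python
def customer_price_usd(bedrock_cost: float, tier: str,
                       infra_per_pair_usd: float = 0.003,
                       markup_pct: float = 30,
                       standard_tier_discount_pct: float = 30,
                       express_tier_premium_pct: float = 0,
                       minimum_quote_usd: float = 0.99) -> float:
    """Apply infra cost, markup, tier adjustment, then the minimum-quote floor."""
    subtotal = bedrock_cost + infra_per_pair_usd
    with_markup = subtotal * (1 + markup_pct / 100)
    if tier == "standard":
        price = with_markup * (1 - standard_tier_discount_pct / 100)
    elif tier == "express":
        price = with_markup * (1 + express_tier_premium_pct / 100)
    else:
        raise ValueError(f"unknown tier: {tier!r}")
    return max(price, minimum_quote_usd)


print(f"{customer_price_usd(2.00, 'standard'):.4f}")    # 1.8227
print(f"{customer_price_usd(2.00, 'express'):.4f}")     # 2.6039
print(f"{customer_price_usd(0.1359, 'standard'):.4f}")  # 0.9900 (floor applies)
```

For a 2.00 USD Bedrock cost, the subtotal is 2.003 and the marked-up price is 2.6039; the Standard discount brings that to 1.82273. The third call shows the pricing.minimum_quote_usd floor taking over for a small job.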

Batch Configuration

Operational settings for the Bedrock batch inference pipeline. These control where batch job files are stored, which IAM role Bedrock assumes, and how the Step Function manages batch jobs.

Key | Default | Unit | Description
batch.s3_bucket | trinity-beast-translation-jobs | bucket name | S3 bucket for batch job JSONL input and output files. Structure: {job_id}/input.jsonl and {job_id}/output.jsonl. Located in us-east-2 (same region as Bedrock endpoint).
batch.iam_role_arn | arn:aws:iam::211998422884:role/tbi-bedrock-batch-role | ARN | IAM role that Bedrock assumes to read input JSONL from S3 and write output JSONL back. Permissions: s3:GetObject on input, s3:PutObject on output, bedrock:InvokeModel on Sonnet 4.6.
batch.default_model | claude-sonnet-4.6 | model name | Default model for batch translation jobs when no model is specified in the job submission. Sonnet 4.6 is the recommended default: best balance of quality and cost for technical documentation.
batch.max_output_tokens | 200,000 | tokens | Maximum output tokens per Bedrock request in the batch JSONL. Sufficient for the largest documents in the library. Must be a multiple of 3 per Trinity Beast convention.
batch.job_state_ttl_seconds | 518,400 | seconds | Valkey TTL for batch job state keys (tx:job:{id}). 518,400 = 6 days (172,800×3). Batch jobs can take up to 6 hours to complete; the 6-day TTL ensures state is available for monitoring and debugging after completion.

Managing Parameters

Via the TBI Administration Dashboard

The easiest way to manage translation parameters is through the TBI Administration dashboard. Navigate to the Translation tab and scroll to the Translation Parameters section (admin only).

Via the Admin API

-- List all parameters
GET /admin/translate/params

-- Filter by category
GET /admin/translate/params?category=customer_pricing

-- Update a parameter (writes to Aurora + Valkey; in-memory values update immediately on the responding container)
POST /admin/translate/params
{"key": "pricing.markup_pct", "value": "35", "updated_by": "cory"}

-- Reload Valkey from Aurora (containers pick up changes on their next 5-minute poll)
POST /admin/translate/params/reload
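
For scripted changes, the update call can be built with the Python standard library. This is a sketch only: BASE_URL is a placeholder host, build_update_request is a hypothetical helper, and authentication is omitted because this document does not describe it. The request is constructed but not sent.

```python
import json
import urllib.request

BASE_URL = "https://admin.example.internal"  # placeholder, not a real endpoint


def build_update_request(key: str, value, updated_by: str) -> urllib.request.Request:
    """Build the POST /admin/translate/params request for one parameter."""
    body = json.dumps({
        "key": key,
        "value": str(value),  # the documented example sends values as strings
        "updated_by": updated_by,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/admin/translate/params",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_update_request("pricing.markup_pct", 35, "cory")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it: urllib.request.urlopen(req)
```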

Via Aurora SQL (direct)

-- View all parameters
SELECT key, value, category, updated_at, updated_by
FROM translation_parameters
ORDER BY category, key;

-- Update a parameter
UPDATE translation_parameters
SET value = '35', updated_at = NOW(), updated_by = 'cory'
WHERE key = 'pricing.markup_pct';

-- After a direct SQL update, reload Valkey via the API:
POST /admin/translate/params/reload

Important: Direct SQL updates to Aurora do NOT automatically update Valkey or in-memory values on running containers. Always follow a direct SQL update with a POST /admin/translate/params/reload call to ensure all containers pick up the change within their next 5-minute poll cycle.

Nightly Sync Job

The nightly sync job (trinity-beast-sync-job, runs at 1 AM EDT) performs three translation-related tasks:

1. Sync translation_parameters → tx:params

Reads all rows from the translation_parameters Aurora table and writes them to the tx:params Valkey hash with a 5-minute TTL. This ensures all containers have fresh parameter values even if the Valkey cache expired overnight.

2. Rolling 7-day cost averages

Queries translation_job_events for the last 7 days, grouped by model and execution mode. Calculates the average cost-per-chunk for each combination and writes to Valkey:

Valkey Key | Description
translation:cost_per_chunk:{model} | Real-time path (backward compatible)
translation:cost_per_chunk:{model}:batch | Batch inference path
translation:cost_per_chunk:{model}:batch_cached | Batch + prompt caching path
translation:cost_per_chunk | Blended average (all models, backward compat)

The 9% infrastructure markup (pricing.infra_markup_pct) is applied to these averages before writing to Valkey. The floor and ceiling from controls.cost_per_chunk_floor and controls.cost_per_chunk_ceiling are also applied.
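
The markup and clamping step might be sketched as below. The function name is hypothetical, and applying the markup before the floor and ceiling is an assumption; the source does not state the order.

```python
def cost_per_chunk(avg_cost_usd: float, infra_markup_pct: float = 9,
                   floor: float = 0.03, ceiling: float = 0.09) -> float:
    """Mark up a 7-day average cost-per-chunk, then clamp it to [floor, ceiling]."""
    marked_up = avg_cost_usd * (1 + infra_markup_pct / 100)
    return min(max(marked_up, floor), ceiling)


print(f"{cost_per_chunk(0.05):.4f}")  # 0.0545 (inside the band)
print(f"{cost_per_chunk(0.01):.4f}")  # 0.0300 (raised to the floor)
print(f"{cost_per_chunk(0.20):.4f}")  # 0.0900 (capped at the ceiling)
```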

3. Actual token ratio data

Writes actual token ratios from real jobs to Valkey for operator review:

-- Key pattern: translation:actuals:{model}:{execution_mode}
-- Example: translation:actuals:claude-sonnet-4.6:batch

Fields:
  avg_input_tokens   — average input tokens per pair
  avg_output_tokens  — average output tokens per pair
  avg_cached_tokens  — average cached tokens per pair
  output_input_ratio — derived: avg_output / avg_input
  cache_hit_rate     — derived: avg_cached / avg_input
  pair_count         — number of pairs in the 7-day window
  updated_at         — when this was last calculated

Review these values after each batch run and update estimation.output_input_ratio and estimation.cache_hit_rate in translation_parameters if the actuals drift significantly from the defaults.
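
The two derived fields follow directly from the averages. A minimal sketch, with a hypothetical function name and made-up sample averages:

```python
def derive_actuals(avg_input_tokens: float, avg_output_tokens: float,
                   avg_cached_tokens: float) -> dict:
    """Compute the derived ratio fields from the per-pair token averages."""
    if avg_input_tokens <= 0:
        raise ValueError("avg_input_tokens must be positive")
    return {
        "output_input_ratio": avg_output_tokens / avg_input_tokens,
        "cache_hit_rate": avg_cached_tokens / avg_input_tokens,
    }


# 4,000 input, 3,040 output, and 1,200 cached tokens per pair reproduce the
# default estimation values of 0.76 and 0.30.
actuals = derive_actuals(4_000, 3_040, 1_200)
print(actuals)
```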

Complete Parameter Reference

All 39 parameters in the translation_parameters table, sorted by category and key.

Key | Category | Default | Description
batch.default_model | batch_config | claude-sonnet-4.6 | Default model for batch jobs
batch.iam_role_arn | batch_config | arn:aws:iam::211998422884:role/tbi-bedrock-batch-role | IAM role for Bedrock batch jobs
batch.job_state_ttl_seconds | batch_config | 518400 | Valkey TTL for batch job state (6 days)
batch.max_output_tokens | batch_config | 200000 | Max output tokens per Bedrock request
batch.s3_bucket | batch_config | trinity-beast-translation-jobs | S3 bucket for batch JSONL files
bedrock.haiku35.batch.input_per_1m | bedrock_pricing | 0.40 | Haiku 3.5 batch input (USD/1M tokens)
bedrock.haiku35.batch.output_per_1m | bedrock_pricing | 2.00 | Haiku 3.5 batch output (USD/1M tokens)
bedrock.haiku35.realtime.input_per_1m | bedrock_pricing | 0.80 | Haiku 3.5 on-demand input (USD/1M tokens)
bedrock.haiku35.realtime.output_per_1m | bedrock_pricing | 4.00 | Haiku 3.5 on-demand output (USD/1M tokens)
bedrock.opus4.batch.input_per_1m | bedrock_pricing | 7.50 | Opus 4 batch input (USD/1M tokens)
bedrock.opus4.batch.output_per_1m | bedrock_pricing | 37.50 | Opus 4 batch output (USD/1M tokens)
bedrock.opus4.realtime.input_per_1m | bedrock_pricing | 15.00 | Opus 4 on-demand input (USD/1M tokens)
bedrock.opus4.realtime.output_per_1m | bedrock_pricing | 75.00 | Opus 4 on-demand output (USD/1M tokens)
bedrock.sonnet46.batch.input_per_1m | bedrock_pricing | 1.50 | Sonnet 4.6 batch input (USD/1M tokens)
bedrock.sonnet46.batch.output_per_1m | bedrock_pricing | 7.50 | Sonnet 4.6 batch output (USD/1M tokens)
bedrock.sonnet46.cache_read_per_1m | bedrock_pricing | 0.30 | Sonnet 4.6 cache read (USD/1M tokens, 90% off)
bedrock.sonnet46.cache_write_per_1m | bedrock_pricing | 3.75 | Sonnet 4.6 cache write (USD/1M tokens, 1.25×)
bedrock.sonnet46.realtime.input_per_1m | bedrock_pricing | 3.00 | Sonnet 4.6 on-demand input (USD/1M tokens)
bedrock.sonnet46.realtime.output_per_1m | bedrock_pricing | 15.00 | Sonnet 4.6 on-demand output (USD/1M tokens)
controls.batch_poll_interval_seconds | cost_controls | 300 | Batch job status poll interval (seconds)
controls.batch_timeout_hours | cost_controls | 6 | Hours before batch job is timed out
controls.cost_per_chunk_ceiling | cost_controls | 0.09 | Max cost-per-chunk for rolling averages
controls.cost_per_chunk_floor | cost_controls | 0.03 | Min cost-per-chunk for rolling averages
controls.daily_spend_limit_usd | cost_controls | 600 | Daily Bedrock spend cap (USD)
controls.daily_token_limit | cost_controls | 51000000 | Daily combined token limit
controls.max_active_jobs | cost_controls | 3 | Max concurrent active jobs
controls.max_docs_per_request | cost_controls | 6 | Max docs per translation request
controls.max_queue_depth | cost_controls | 12 | Max jobs in translation queue
estimation.cache_hit_rate | token_estimation | 0.30 | Expected prompt cache hit fraction
estimation.cached_system_tokens | token_estimation | 1200 | Tokens in cacheable system prompt block
estimation.output_input_ratio | token_estimation | 0.76 | Output / input token ratio
estimation.quote_padding_pct | token_estimation | 6 | Padding % applied to token estimates
estimation.tokens_per_byte | token_estimation | 0.64 | Input tokens per translatable byte
pricing.express_tier_premium_pct | customer_pricing | 0 | Express tier premium % (real-time, 1–6h)
pricing.infra_markup_pct | customer_pricing | 9 | Infra markup % applied by sync job
pricing.infra_per_pair_usd | customer_pricing | 0.003 | Infrastructure cost per language pair (USD)
pricing.markup_pct | customer_pricing | 30 | Service fee markup % for customer quotes
pricing.minimum_quote_usd | customer_pricing | 0.99 | Minimum customer quote price (USD)
pricing.standard_tier_discount_pct | customer_pricing | 30 | Standard tier discount % (batch, 12–24h)