Apollo Refinery™
Content Quality & Transformation Platform
Transform raw information into trusted information. Apollo Refinery normalizes, repairs, and transforms textual content at enterprise scale, turning what once required days of costly manual labor into minutes of automated, repeatable precision.
The cost of untrusted information
Manual data cleaning is slow, expensive, and unpredictable.
Every organization that works with large content repositories eventually faces the same problem: data that looked clean in one system reveals itself as broken in the next. Character encoding drift, legacy control characters, XML violations, and system-specific artifacts accumulate silently — until a migration, upgrade, or compliance requirement forces them into the open.
The traditional response is manual remediation — technical staff working record by record, making mistakes, repeating the same effort every time the problem recurs. It is slow, it is expensive, and it produces inconsistent results that create downstream failures weeks later.
Apollo Refinery systematizes what was manual. Operations that once consumed days of skilled technical labor execute in minutes — accurately, repeatably, at any scale.
Capabilities
Inspect. Normalize. Repair. Transform. Improve.
Every operation Apollo Refinery performs is deliberate, auditable, and repeatable. Nothing is changed without first being understood.
Normalize Character Sets & Encodings
Standardize character sets, encodings, and formatting inconsistencies across millions of records in a single pass. What enters the refinery as inconsistent raw content leaves as uniform, dependable output.
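As a rough illustration of what encoding normalization involves, the sketch below decodes raw bytes using a fallback list of encodings and then applies Unicode NFC normalization. It is not Apollo Refinery's implementation: the function name, the fallback order, and the choice of NFC are all assumptions made for the example.

```python
import unicodedata

# Hypothetical fallback order for this example; a real deployment would
# choose encodings based on what the legacy systems actually produced.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")


def decode_and_normalize(raw: bytes) -> tuple[str, str]:
    """Decode bytes with the first encoding that succeeds, then apply NFC.

    Returns the normalized text and the name of the encoding that was used.
    """
    for encoding in CANDIDATE_ENCODINGS:
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    else:
        # latin-1 maps every byte to a code point, so it never fails.
        encoding = "latin-1"
        text = raw.decode(encoding)
    # NFC composes sequences such as "e" + combining acute into one code point.
    return unicodedata.normalize("NFC", text), encoding
```

Returning the encoding alongside the text keeps the operation auditable: each record carries a note of how it was interpreted.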
Repair Violations & Broken Content
Identify and correct invalid characters, broken XML structures, encoding errors, and legacy artifacts that cause downstream failures. Repair what can be fixed automatically — flag what requires review.
Transform Content for System Compatibility
Convert content from the format a legacy system produced to the format a modern system requires. Apollo Refinery bridges the gap between what you have and what your next system expects.
Improve XML & Schema Compliance
Systematically improve content quality against XML standards, schema definitions, and organization-specific rules. Achieve compliance that manual inspection cannot guarantee at scale.
Inspect Without Altering
Run a full non-destructive scan that produces a detailed violation report before any changes are made. Understand exactly what needs to be addressed — and why — before committing to transformation.
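A non-destructive scan of this kind can be pictured with a short sketch. The example below checks each record against the character ranges permitted by the XML 1.0 `Char` production and returns a report without modifying any input. The function name, the record layout, and the report fields are hypothetical and chosen only for illustration.

```python
import re

# Characters NOT allowed by the XML 1.0 Char production:
# anything outside #x9 | #xA | #xD | #x20-#xD7FF | #xE000-#xFFFD | #x10000-#x10FFFF
_XML10_INVALID = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def scan(records: dict[str, str]) -> list[dict]:
    """Return one report entry per invalid character; inputs are left untouched."""
    report = []
    for record_id, text in records.items():
        for match in _XML10_INVALID.finditer(text):
            report.append({
                "record": record_id,
                "offset": match.start(),
                "codepoint": f"U+{ord(match.group()):04X}",
                "reason": "not allowed by the XML 1.0 Char production",
            })
    return report


if __name__ == "__main__":
    sample = {"doc-1": "clean text", "doc-2": "bad\x0bchar\x00"}
    for entry in scan(sample):
        print(entry)
```

Because the scan only reads, the same records can be reviewed, reported on, and only then passed to a repair step.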
Automate Repeatable Operations
Configure once, run repeatedly. Apollo Refinery executes the same precision operations on demand, on schedule, or as part of a larger data pipeline — without manual intervention each time.
Precision control
Two modes. Complete control over what stays and what goes.
Most data cleaning tools remove a fixed list of problem characters. Apollo Refinery gives you two distinct modes of control, each suited to a different class of problem.
Inclusion mode
Specify exactly which characters are unwanted. Everything else is preserved. Use when you know precisely what to remove and need a targeted denylist.
"Remove these specific control characters and legacy code-page artifacts from every record."
Exclusion mode
Specify only the characters that are permitted. Everything outside that set is identified and optionally removed. Use when you need to enforce a strict allowlist.
"Only standard UTF-8 printable characters are allowed. Flag or remove everything else."
Works across all your sources
Apollo Refinery reads content directly from disk, database, or content repository. The same operation runs uniformly across all three — whether you are processing a local file batch, querying a database table, or connecting to an ECM repository. No source requires a separate workflow.
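The difference between the two modes described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the product's API: the function names, the `permitted` predicate, and the use of Python's `str.isprintable` as the allowlist test are all hypothetical. The same functions apply to a string regardless of whether it came from a file, a database row, or a repository.

```python
from typing import Callable


def inclusion_mode(text: str, unwanted: set[str]) -> str:
    """Remove only the characters listed in `unwanted`; keep everything else."""
    return "".join(ch for ch in text if ch not in unwanted)


def exclusion_mode(
    text: str, permitted: Callable[[str], bool]
) -> tuple[str, list[tuple[int, str]]]:
    """Keep only characters accepted by `permitted`.

    Returns the filtered text plus a list of (offset, code point) pairs for
    every rejected character, so the caller can flag instead of remove.
    """
    kept = []
    flagged = []
    for offset, ch in enumerate(text):
        if permitted(ch):
            kept.append(ch)
        else:
            flagged.append((offset, f"U+{ord(ch):04X}"))
    return "".join(kept), flagged


def is_printable_or_layout(ch: str) -> bool:
    """Example allowlist: printable characters plus newline and tab."""
    return ch.isprintable() or ch in "\n\t"


if __name__ == "__main__":
    print(inclusion_mode("a\x07b\x1fc", {"\x07", "\x1f"}))
    print(exclusion_mode("ok\x00\u200b!", is_printable_or_layout))
```

Inclusion mode leaves unknown characters alone, which is the safer choice when the problem set is well understood. Exclusion mode is stricter: anything not explicitly permitted is surfaced, including characters nobody anticipated.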
Where it applies
Every organization that moves, upgrades, or publishes data eventually needs this.
Data migration preparation
Legacy systems accumulate decades of character encoding drift, control characters, and system-specific artifacts. Apollo Refinery normalizes and repairs content before it crosses into the target system — preventing the migration failures and rework that make projects run over budget.
XML & schema compliance
Invalid characters in content cause XML parsers to fail, publishing pipelines to break, and integrations to reject records. Apollo Refinery identifies every violation and repairs each one systematically, achieving the compliance standard that manual review cannot sustain across millions of documents.
Technology upgrades
Moving to a new ECM platform, search engine, or database often exposes content that the old system silently tolerated. Apollo Refinery cleans and transforms that content before the upgrade — so the new system receives data it can work with from day one.
Translation service quality
Translation pipelines are sensitive to character encoding issues that corrupt output in ways that are difficult to detect. Apollo Refinery validates and normalizes source content before it reaches the translation service, and can inspect the output afterward to verify fidelity.
Publishing & content delivery
Content destined for web, print, or digital delivery must meet strict encoding and formatting requirements. Apollo Refinery prepares content for the publishing stage, eliminating the last-minute manual cleanup that delays production cycles.
Ongoing quality assurance
Content quality degrades continuously as systems ingest new records from disparate sources. Apollo Refinery runs on schedule against your repositories, surfacing violations as they appear — before they accumulate into a remediation project.
Stop paying for manual data cleaning.
Tell us about your content environment — the sources, the volume, and the problem you need to solve. We'll show you what Apollo Refinery does with it.
