OCR Forge API Documentation

Convert PDFs, scans, and images into structured data with a simple REST API. AI-powered accuracy at R0.09 per 1,000 pages.

Base URL: https://api.ocrforge.co.za

Overview

OCR Forge provides a 3-step API for document processing:

  1. Upload: send file metadata and get a pre-signed upload URL
  2. Process: upload your file to S3; OCR processing starts automatically
  3. Retrieve: poll for status and download the results when complete

All API responses are JSON. All timestamps are in ISO 8601 format, in UTC.

Supported File Types

MIME Type          Extensions
application/pdf    .pdf
image/png          .png
image/jpeg         .jpg, .jpeg
image/tiff         .tiff, .tif
image/webp         .webp

Maximum file size: 50 MB.
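
The table and size limit above can be checked on the client before a job is created. The following is a minimal sketch, not part of any official SDK; the names SUPPORTED_TYPES, MAX_FILE_SIZE, and check_file are illustrative assumptions.

```python
from pathlib import Path

# Values copied from the supported file types table and the 50 MB limit.
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tiff": "image/tiff",
    ".tif": "image/tiff",
    ".webp": "image/webp",
}
MAX_FILE_SIZE = 50 * 1024 * 1024  # 52,428,800 bytes


def check_file(filename: str, size_bytes: int) -> str:
    """Return the MIME type for filename, or raise ValueError if it is unsupported."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported file extension: {ext or '(none)'}")
    if size_bytes > MAX_FILE_SIZE:
        raise ValueError("File too large. Max: 50MB")
    return SUPPORTED_TYPES[ext]
```

The returned MIME type can be passed as mimeType when creating a job and as the Content-Type of the upload.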

Authentication

All endpoints except /health require an API key. You can pass it in two ways:

Option 1: X-API-Key Header (Recommended)

Header
X-API-Key: ofk_your_api_key_here

Option 2: Bearer Token

Header
Authorization: Bearer ofk_your_api_key_here

Keep your API key secret. Never expose it in client-side code, public repositories, or URLs. If compromised, contact support to rotate your key immediately.

Getting an API Key

API keys are provisioned during onboarding. Keys follow the format ofk_<48-character-hex>. Contact support@ocrforge.co.za or sign up on the landing page to get started.
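
Both header options and the documented key format can be wrapped in one small helper. This is a sketch under stated assumptions: the function name auth_headers is hypothetical, and accepting both upper- and lower-case hex digits is a guess, since the documentation only says "48-character-hex".

```python
import re

# "ofk_" followed by 48 hex characters, per the documented key format.
# Accepting either letter case is an assumption.
API_KEY_PATTERN = re.compile(r"ofk_[0-9a-fA-F]{48}")


def auth_headers(api_key: str, use_bearer: bool = False) -> dict:
    """Build the authentication header for a request."""
    if not API_KEY_PATTERN.fullmatch(api_key):
        raise ValueError("API key does not match ofk_<48-character-hex>")
    if use_bearer:
        # Option 2: Bearer token
        return {"Authorization": f"Bearer {api_key}"}
    # Option 1: X-API-Key header (recommended)
    return {"X-API-Key": api_key}
```

Validating the format locally catches a truncated or mistyped key before any request is sent; it does not prove the key is active.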

Quick Start

Get your first OCR result in 3 steps.

Step 1: Create a Job

Send file metadata to get a pre-signed upload URL:

curl
curl -X POST https://api.ocrforge.co.za/documents \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ofk_your_api_key" \
  -d '{
    "filename": "invoice.pdf",
    "mimeType": "application/pdf",
    "outputFormat": "json"
  }'

Response:

Response — 202 Accepted
{
  "job_id": "doc_a1b2c3d4e5f6",
  "status": "awaiting_upload",
  "upload_url": "https://s3.af-south-1.amazonaws.com/...",
  "upload_expires_in": 900,
  "status_url": "/documents/doc_a1b2c3d4e5f6",
  "instructions": "PUT your file to upload_url..."
}

Step 2: Upload Your File

PUT the file directly to the pre-signed S3 URL:

curl
curl -X PUT "${UPLOAD_URL}" \
  -H "Content-Type: application/pdf" \
  --data-binary @invoice.pdf

Note: No API key is needed for the S3 upload because the pre-signed URL handles authentication. The URL expires in 15 minutes (upload_expires_in is 900 seconds).
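
The same upload can be done from Python with the standard library. This is a minimal sketch; build_upload_request and upload_file are hypothetical helper names, and the timeout value is an arbitrary choice.

```python
import urllib.request


def build_upload_request(upload_url: str, file_bytes: bytes, mime_type: str) -> urllib.request.Request:
    """Build the PUT request for the pre-signed URL returned in Step 1."""
    # No API key header is sent: the pre-signed URL carries the authentication.
    return urllib.request.Request(
        upload_url,
        data=file_bytes,
        method="PUT",
        headers={"Content-Type": mime_type},
    )


def upload_file(upload_url: str, path: str, mime_type: str, timeout: float = 60) -> int:
    """Upload the file at path and return the HTTP status code of the response."""
    with open(path, "rb") as fh:
        request = build_upload_request(upload_url, fh.read(), mime_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Usage would look like upload_file(job["upload_url"], "invoice.pdf", "application/pdf"), where job is the parsed response from Step 1.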

Step 3: Get Your Result

Poll the status endpoint until the job is complete, then download the result:

curl
# Check status
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6 \
  -H "X-API-Key: ofk_your_api_key"

# When status is "completed", get the result
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6/result \
  -H "X-API-Key: ofk_your_api_key"

POST /documents

Create a new OCR job. Returns a pre-signed upload URL for direct file upload to S3.

Request

Headers

Header          Value               Required
Content-Type    application/json    Yes
X-API-Key       Your API key        Yes

Body Parameters

Field          Type      Required   Description
filename       string    Yes        Name of the file (max 255 characters). Special characters are sanitised.
mimeType       string    No         MIME type. One of: application/pdf, image/png, image/jpeg, image/tiff, image/webp.
fileSize       integer   No         File size in bytes. Maximum: 52,428,800 (50 MB).
outputFormat   string    No         Default: json. One of: json, csv, text, pdf.
template       string    No         Template name for structured extraction (e.g. invoice).
webhookUrl     string    No         URL to receive a POST callback when processing completes.

Example

curl
curl -X POST https://api.ocrforge.co.za/documents \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ofk_your_api_key" \
  -d '{
    "filename": "invoice.pdf",
    "mimeType": "application/pdf",
    "fileSize": 245000,
    "outputFormat": "json",
    "webhookUrl": "https://your-app.com/webhook"
  }'

Response

202 Accepted

Response
{
  "job_id": "doc_a1b2c3d4e5f6",
  "status": "awaiting_upload",
  "upload_url": "https://s3.af-south-1.amazonaws.com/...",
  "upload_expires_in": 900,
  "status_url": "/documents/doc_a1b2c3d4e5f6",
  "instructions": "PUT your file to upload_url with the file bytes as body. Then poll status_url for progress."
}
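
A Python equivalent of the request above might look like the following sketch. build_create_job_request is a hypothetical helper, not an SDK function; it mirrors the documented body parameters and leaves out optional fields that are not set.

```python
import json
import urllib.request

BASE_URL = "https://api.ocrforge.co.za"
OUTPUT_FORMATS = {"json", "csv", "text", "pdf"}


def build_create_job_request(api_key, filename, mime_type=None, file_size=None,
                             output_format="json", template=None, webhook_url=None):
    """Build the POST /documents request without sending it."""
    if not filename:
        raise ValueError("filename is required")
    if len(filename) > 255:
        raise ValueError("filename must be at most 255 characters")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError("Invalid output format. Allowed: json, csv, text, pdf")

    payload = {"filename": filename, "outputFormat": output_format}
    optional = {
        "mimeType": mime_type,
        "fileSize": file_size,
        "template": template,
        "webhookUrl": webhook_url,
    }
    # Only include optional fields the caller actually provided.
    payload.update({key: value for key, value in optional.items() if value is not None})

    return urllib.request.Request(
        f"{BASE_URL}/documents",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )
```

Passing the request to urllib.request.urlopen and parsing the body with json.load should yield the job_id and upload_url fields shown in the response above.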

Error Responses

Status   Error                                                  Cause
400      filename is required                                   Missing filename field
400      filename must be under 255 characters                  Filename too long
400      Unsupported file type: ...                             Invalid mimeType
400      File too large. Max: 50MB                              fileSize exceeds limit
400      Invalid output format. Allowed: json, csv, text, pdf   Bad outputFormat value
401      Unauthorized                                           Missing or invalid API key
403      Monthly quota exceeded                                 Plan quota reached

GET /documents/{id}

Check the status and metadata of an OCR job. Clients can only access their own jobs.

Path Parameters

Parameter   Type     Description
id          string   Job ID returned from POST /documents (e.g. doc_a1b2c3d4e5f6)

Example

curl
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6 \
  -H "X-API-Key: ofk_your_api_key"

Response

200 OK

Response
{
  "job_id": "doc_a1b2c3d4e5f6",
  "status": "completed",
  "filename": "invoice.pdf",
  "file_size": 245000,
  "mime_type": "application/pdf",
  "output_format": "json",
  "model_used": "textract",
  "pages_total": 3,
  "pages_processed": 3,
  "confidence": 0.94,
  "progress": 100,
  "processing_ms": 4200,
  "result_url": "/documents/doc_a1b2c3d4e5f6/result",
  "created_at": "2026-04-02T09:00:00Z",
  "completed_at": "2026-04-02T09:00:05Z"
}

Job Statuses

Status            Description
awaiting_upload   Pre-signed URL generated, waiting for file upload
queued            File received, waiting for OCR processing
processing        OCR engine is working on the document
completed         OCR done, results available via result endpoint
failed            Processing failed; error details included in response
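
The statuses above suggest a simple polling loop: keep checking until the job is completed or failed. The sketch below assumes a caller-supplied get_status function that performs GET /documents/{id} and returns the parsed JSON; the interval and timeout defaults are arbitrary choices, not documented values.

```python
import time

# Statuses after which the job will not change again.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(get_status, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until the job is completed or failed, or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        job = get_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"Job still {job['status']} after {timeout} seconds")
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be tested without real delays; in normal use they can be left at their defaults.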

Error Responses

Status   Error                Cause
400      Job ID is required   Missing ID in path
401      Unauthorized         Missing or invalid API key
404      Job not found        Invalid ID or belongs to another client

GET /documents/{id}/result

Download the OCR result for a completed job. Returns a pre-signed download URL valid for 1 hour.

The job must be in completed status. If still processing, this endpoint returns 409 Conflict.

Path Parameters

Parameter   Type     Description
id          string   Job ID

Example

curl
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6/result \
  -H "X-API-Key: ofk_your_api_key"

Response

200 OK

Response
{
  "job_id": "doc_a1b2c3d4e5f6",
  "status": "completed",
  "download_url": "https://s3.af-south-1.amazonaws.com/...",
  "download_expires_in": 3600,
  "filename": "invoice.pdf",
  "pages": 3,
  "confidence": 0.94,
  "model_used": "textract"
}
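
Because this endpoint returns 409 Conflict until the job is complete, a client may want to treat that status as "not ready" rather than as a failure. A minimal sketch, assuming a hypothetical helper named get_result and an injectable urlopen parameter added purely for testing:

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://api.ocrforge.co.za"


def get_result(job_id, api_key, urlopen=urllib.request.urlopen):
    """Return the result metadata, or None if the job is not complete yet (409)."""
    request = urllib.request.Request(
        f"{BASE_URL}/documents/{job_id}/result",
        headers={"X-API-Key": api_key},
    )
    try:
        with urlopen(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 409:
            return None  # job exists but has not finished processing
        raise  # 401, 404, and other errors are left to the caller
```

The download_url in the returned data is valid for download_expires_in seconds (3600 in the example above), so it should be fetched promptly rather than stored.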

Error Responses

Status   Error                                             Cause
401      Unauthorized                                      Missing or invalid API key
404      Job not found                                     Invalid ID or belongs to another client
409      Job is not complete. Current status: processing   Job not yet finished

GET /health

Health check endpoint. No authentication required. Use this to verify the API is operational.

Example

curl
curl https://api.ocrforge.co.za/health

Response

200 OK

Response
{
  "status": "healthy",
  "version": "1.0.0",
  "region": "af-south-1"
}

Error Codes

All error responses follow the same format:

Error Response Format
{
  "error": "Human-readable error description"
}

HTTP Status Codes

Code   Meaning                 Description
200    OK                      Request succeeded. Response body contains the requested data.
202    Accepted                Job created successfully. Processing is asynchronous, so poll the status endpoint.
400    Bad Request             Validation error. Missing required field, invalid value, or unsupported file type.
401    Unauthorized            Missing API key, invalid key, expired key, or suspended account.
403    Forbidden               Valid key but over quota. Upgrade your plan or wait until the next billing cycle.
404    Not Found               The requested job does not exist or belongs to another client.
409    Conflict                Request is valid but the resource state doesn't allow it (e.g. requesting results for a job still processing).
429    Too Many Requests       Rate limited. Check the Retry-After header for seconds to wait.
500    Internal Server Error   Unexpected server error. Contact support if this persists.
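
Since every error shares one JSON shape, a client can map any failed response to a single exception type. The sketch below is illustrative: OCRForgeError and parse_error are hypothetical names, and treating only 429 and 500 as retryable is an assumption drawn from the descriptions in the table, not a documented rule.

```python
import json


class OCRForgeError(Exception):
    """An API error carrying the HTTP status and the "error" message."""

    def __init__(self, status, message):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message

    @property
    def retryable(self):
        # Assumption: 429 and 500 may succeed later; other codes need a changed request.
        return self.status in (429, 500)


def parse_error(status, body):
    """Build an OCRForgeError from a status code and a raw response body."""
    try:
        message = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        # Body was empty, not JSON, or did not have the documented shape.
        message = "Unknown error"
    return OCRForgeError(status, message)
```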

Rate Limits & Quotas

OCR Forge enforces rate limits at the API Gateway level and monthly quotas per client.

Rate Limits

Requests are throttled per API stage:

Setting               Value
Requests per second   100
Burst allowance       200

When rate limited, you will receive:

429 Too Many Requests
HTTP/1.1 429 Too Many Requests
Retry-After: 1

Wait for the number of seconds specified in Retry-After before retrying.
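
One way to honour Retry-After is a small retry wrapper around the request. This is a sketch under assumptions: send_with_retry is a hypothetical name, the limit of five attempts is arbitrary, and falling back to one second when the header is missing or not a whole number is a guess at sensible behaviour.

```python
import time
import urllib.error
import urllib.request


def send_with_retry(request, max_attempts=5,
                    urlopen=urllib.request.urlopen, sleep=time.sleep):
    """Send request, waiting as instructed by Retry-After on each 429 response."""
    for attempt in range(1, max_attempts + 1):
        try:
            with urlopen(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Give up on non-429 errors and on the final attempt.
            if err.code != 429 or attempt == max_attempts:
                raise
            retry_after = err.headers.get("Retry-After", "1")
            # Fall back to 1 second if the value is not a whole number of seconds.
            sleep(int(retry_after) if retry_after.isdigit() else 1)
```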

Monthly Quotas

Each client has a monthly page processing quota based on their plan:

Plan            Pages / Month   Price
Free            50              R0
Starter         5,000           R499/month
Business        25,000          R1,999/month
Enterprise      100,000+        Custom
Pay-as-you-go   Unlimited       R0.09 per 1,000 pages

When your monthly quota is exceeded:

403 Forbidden
{
  "error": "Monthly quota exceeded. Upgrade your plan."
}

Webhooks

Instead of polling the status endpoint, you can provide a webhookUrl when creating a job. OCR Forge will POST to your URL when processing completes or fails.

Setup

Include webhookUrl in your upload request:

curl
curl -X POST https://api.ocrforge.co.za/documents \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ofk_your_api_key" \
  -d '{
    "filename": "invoice.pdf",
    "mimeType": "application/pdf",
    "webhookUrl": "https://your-app.com/ocr-callback"
  }'

Callback Payload

On Success

POST your-app.com/ocr-callback
{
  "event": "document.completed",
  "job_id": "doc_a1b2c3d4e5f6",
  "status": "completed",
  "result_url": "https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6/result"
}

On Failure

POST your-app.com/ocr-callback
{
  "event": "document.failed",
  "job_id": "doc_a1b2c3d4e5f6",
  "status": "failed",
  "error": "OCR processing failed: corrupted image"
}

Webhook Behaviour

Setting        Value
HTTP Method    POST
Content-Type   application/json
Timeout        10 seconds
Retries        3 attempts with exponential backoff

Tip: To verify a webhook, use the job_id from the callback to fetch the job status from the API with your API key. If the status returned by the API matches the callback, the event reflects a real OCR Forge job.
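
The verification step can be sketched as a function that cross-checks the callback body against the API. handle_callback is a hypothetical name, and fetch_status stands for any caller-supplied function that performs GET /documents/{id} with your API key and returns the parsed JSON, or None if the job is not found.

```python
import json


def handle_callback(body, fetch_status):
    """Return the verified job, or None if the callback does not match the API."""
    try:
        payload = json.loads(body)
        job_id = payload["job_id"]
        claimed_status = payload["status"]
    except (ValueError, KeyError, TypeError):
        return None  # not JSON, or missing the documented fields
    if claimed_status not in ("completed", "failed"):
        return None  # callbacks are only documented for these two outcomes
    job = fetch_status(job_id)
    if job is None or job.get("status") != claimed_status:
        return None  # unknown job, or the API disagrees with the callback
    return job
```

Responding to the webhook quickly and doing any heavy work afterwards helps stay within the 10-second timeout listed above.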

SDKs

Official SDKs for quick integration with your favourite language.

Python

Python
pip install ocrforge

from ocrforge import OCRForge

client = OCRForge(api_key="ofk_your_api_key")

# Simple: upload, wait, get result
result = client.process("invoice.pdf")
print(result.text)

# Advanced: upload, then poll manually
job = client.upload("invoice.pdf", output_format="json")
job.wait()
print(job.result.text)

Node.js

Node.js
npm install ocrforge

const { OCRForge } = require('ocrforge');

const client = new OCRForge({ apiKey: 'ofk_your_api_key' });

// await is not available at the top level of a CommonJS module,
// so the calls run inside an async function.
async function main() {
  // Simple: upload, wait, get result
  const result = await client.process('./invoice.pdf');
  console.log(result.text);

  // Advanced: upload, then poll manually
  const job = await client.upload('./invoice.pdf', { outputFormat: 'json' });
  await job.waitForCompletion();
  console.log(job.result.text);
}

main().catch(console.error);

PHP

PHP
composer require ocrforge/ocrforge

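<?php
// Load Composer's autoloader so the SDK classes can be found.
require __DIR__ . '/vendor/autoload.php';

use OCRForge\Client;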

$client = new Client('ofk_your_api_key');

// Simple: upload, wait, get result
$result = $client->process('invoice.pdf');
echo $result->text;

// Advanced: upload, then poll manually
$job = $client->upload('invoice.pdf', ['outputFormat' => 'json']);
$job->waitForCompletion();
echo $job->result->structuredData;

.NET / C#

.NET / C#
dotnet add package OCRForge

using OCRForge;

var client = new OcrForgeClient("ofk_your_api_key");

// Simple: upload, wait, get result
var result = await client.ProcessAsync("invoice.pdf");
Console.WriteLine(result.Text);

// Advanced: upload, then poll manually
var job = await client.UploadAsync("invoice.pdf", new UploadOptions
{
    OutputFormat = "json"
});
job = await client.WaitForCompletionAsync(job.Id);
var detail = await client.GetResultAsync(job.Id);
Console.WriteLine(detail.DownloadUrl);

All SDKs handle file upload to pre-signed URLs, automatic polling, retry on rate limits, and typed error handling. See each SDK's README for full documentation.

Postman Collection

Import the official Postman collection to test every API endpoint interactively. Includes pre-built requests, example responses, and test scripts.

Setup

  1. Open Postman and click Import
  2. Drop in ocrforge-api.postman_collection.json
  3. Import the environment file ocrforge-dev.postman_environment.json
  4. Select the OCR Forge Dev environment from the top-right dropdown
  5. Set your API key in the apiKey environment variable
  6. Run requests in order: Health → Create Job → Check Status → Get Result

What's Included

Request               Method   Path                            Auth
Health Check          GET      /health                         No
Create Document Job   POST     /api/v1/documents               Yes
Check Job Status      GET      /api/v1/documents/{id}          Yes
Get Job Result        GET      /api/v1/documents/{id}/result   Yes

Tip: The collection includes test scripts that auto-save the jobId from the Create Job response. Run requests in order and subsequent requests will use the saved job ID automatically.