OCR Forge API Documentation
Convert PDFs, scans, and images into structured data with a simple REST API. AI-powered accuracy at R0.09 per 1,000 pages.
https://api.ocrforge.co.za
Overview
OCR Forge provides a 3-step API for document processing:
- Upload — Send file metadata, get a pre-signed upload URL
- Process — Upload your file to S3, OCR processing starts automatically
- Retrieve — Poll for status, download results when complete
All API responses are JSON. All timestamps are ISO 8601 format in UTC.
Supported File Types
| MIME Type | Extensions |
|---|---|
| application/pdf | .pdf |
| image/png | .png |
| image/jpeg | .jpg, .jpeg |
| image/tiff | .tiff, .tif |
| image/webp | .webp |
Maximum file size: 50 MB.
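The limits above can be checked on the client before any request is made. The following is a minimal sketch rather than part of the API: the `preflight` helper and its error wording are illustrative assumptions, and only the MIME types, extensions, and the 50 MB limit come from this page.

```python
from pathlib import PurePath

# Extension-to-MIME mapping based on the supported file types table.
# ".pdf" is the conventional extension for application/pdf.
SUPPORTED_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tiff": "image/tiff",
    ".tif": "image/tiff",
    ".webp": "image/webp",
}
MAX_FILE_SIZE = 50 * 1024 * 1024  # 52,428,800 bytes


def preflight(filename: str, size_bytes: int) -> str:
    """Return the MIME type for a supported file, or raise ValueError.

    Hypothetical helper: it mirrors the documented limits locally so that
    obviously invalid files are rejected before a job is created.
    """
    suffix = PurePath(filename).suffix.lower()
    mime_type = SUPPORTED_TYPES.get(suffix)
    if mime_type is None:
        raise ValueError(f"Unsupported file type: {suffix or filename}")
    if size_bytes > MAX_FILE_SIZE:
        raise ValueError("File too large. Max: 50MB")
    return mime_type
```

The returned MIME type can be passed as `mimeType` when creating a job. The check only looks at the file extension, so the server may still reject a file whose contents do not match.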
Authentication
All endpoints except /health require an API key. You can pass it in two ways:
Option 1: X-API-Key Header (Recommended)
X-API-Key: ofk_your_api_key_here
Option 2: Bearer Token
Authorization: Bearer ofk_your_api_key_here
Getting an API Key
API keys are provisioned during onboarding. Keys follow the format ofk_<48-character-hex>. Contact support@ocrforge.co.za or sign up on the landing page to get started.
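As a rough illustration of the key format and the two header options, the sketch below checks a key's shape and builds the matching headers. The helper names are hypothetical, and accepting both upper-case and lower-case hex digits is an assumption; the page only states the `ofk_<48-character-hex>` format.

```python
import re

# "ofk_" followed by 48 hexadecimal characters, as described above.
# Accepting upper-case hex digits as well is an assumption.
API_KEY_PATTERN = re.compile(r"ofk_[0-9a-fA-F]{48}")


def looks_like_api_key(key: str) -> bool:
    """Check the shape of a key only; it says nothing about whether the key is active."""
    return API_KEY_PATTERN.fullmatch(key) is not None


def auth_headers(key: str, use_bearer: bool = False) -> dict[str, str]:
    """Build the headers for Option 1 (X-API-Key) or Option 2 (Bearer token)."""
    if use_bearer:
        return {"Authorization": f"Bearer {key}"}
    return {"X-API-Key": key}
```

A shape check like this catches copy-paste mistakes early, such as a truncated key or a placeholder left in configuration.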
Quick Start
Get your first OCR result in 3 steps.
Step 1: Create a Job
Send file metadata to get a pre-signed upload URL:
curl -X POST https://api.ocrforge.co.za/documents \
-H "Content-Type: application/json" \
-H "X-API-Key: ofk_your_api_key" \
-d '{
"filename": "invoice.pdf",
"mimeType": "application/pdf",
"outputFormat": "json"
}'
Response:
{
"job_id": "doc_a1b2c3d4e5f6",
"status": "awaiting_upload",
"upload_url": "https://s3.af-south-1.amazonaws.com/...",
"upload_expires_in": 900,
"status_url": "/documents/doc_a1b2c3d4e5f6",
"instructions": "PUT your file to upload_url..."
}
Step 2: Upload Your File
PUT the file directly to the pre-signed S3 URL:
curl -X PUT "${UPLOAD_URL}" \
-H "Content-Type: application/pdf" \
--data-binary @invoice.pdf
Step 3: Get Your Result
Poll the status endpoint until the job is complete, then download the result:
# Check status
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6 \
-H "X-API-Key: ofk_your_api_key"
# When status is "completed", get the result
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6/result \
-H "X-API-Key: ofk_your_api_key"
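The same three steps can be sketched in Python with only the standard library. This is an illustration under assumptions, not an official client: the function names are hypothetical, and the request shapes simply mirror the curl commands above.

```python
import json
import urllib.request

BASE_URL = "https://api.ocrforge.co.za"


def create_job_request(api_key: str, filename: str, mime_type: str) -> urllib.request.Request:
    """Step 1: POST file metadata to /documents."""
    body = json.dumps(
        {"filename": filename, "mimeType": mime_type, "outputFormat": "json"}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/documents",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
    )


def upload_request(upload_url: str, file_bytes: bytes, mime_type: str) -> urllib.request.Request:
    """Step 2: PUT the raw file bytes to the pre-signed URL.

    As in the curl example, no API key header is sent with the upload.
    """
    return urllib.request.Request(
        upload_url,
        data=file_bytes,
        method="PUT",
        headers={"Content-Type": mime_type},
    )


def status_request(api_key: str, status_url: str) -> urllib.request.Request:
    """Step 3: GET the job status; status_url is the relative path from step 1."""
    return urllib.request.Request(
        f"{BASE_URL}{status_url}", method="GET", headers={"X-API-Key": api_key}
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body (an empty body becomes {})."""
    with urllib.request.urlopen(request, timeout=30) as response:
        raw = response.read()
    return json.loads(raw) if raw else {}
```

In use, `send(create_job_request(...))` returns the job with its `upload_url` and `status_url`, `send(upload_request(...))` performs the upload, and `send(status_request(...))` is repeated until the status is `completed`. Building the requests separately from sending them keeps each step easy to inspect and test.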
POST /documents
Create a new OCR job. Returns a pre-signed upload URL for direct file upload to S3.
Request
Headers
| Header | Value | Required |
|---|---|---|
| Content-Type | application/json | Yes |
| X-API-Key | Your API key | Yes |
Body Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| filename | string | Yes | Name of the file (max 255 characters). Special characters are sanitised. |
| mimeType | string | No | MIME type. One of: application/pdf, image/png, image/jpeg, image/tiff, image/webp. |
| fileSize | integer | No | File size in bytes. Maximum: 52,428,800 (50 MB). |
| outputFormat | string | No | Default: json. One of: json, csv, text, pdf. |
| template | string | No | Template name for structured extraction (e.g. invoice). |
| webhookUrl | string | No | URL to receive a POST callback when processing completes. |
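One way to assemble the request body from these parameters is sketched below. The `build_job_body` function is a hypothetical helper, and treating exactly 255 characters as valid is an assumption, since the table says "max 255" while the error message says "under 255".

```python
import json
from typing import Optional

OUTPUT_FORMATS = {"json", "csv", "text", "pdf"}


def build_job_body(
    filename: str,
    mime_type: Optional[str] = None,
    file_size: Optional[int] = None,
    output_format: str = "json",
    template: Optional[str] = None,
    webhook_url: Optional[str] = None,
) -> str:
    """Return the JSON body for creating a job, leaving out unset optional fields."""
    if not filename:
        raise ValueError("filename is required")
    if len(filename) > 255:  # assumption: 255 characters is still accepted
        raise ValueError("filename must be under 255 characters")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError("Invalid output format. Allowed: json, csv, text, pdf")
    optional = {
        "mimeType": mime_type,
        "fileSize": file_size,
        "template": template,
        "webhookUrl": webhook_url,
    }
    body = {"filename": filename, "outputFormat": output_format}
    # Only send optional fields the caller actually provided.
    body.update({key: value for key, value in optional.items() if value is not None})
    return json.dumps(body)
```

Validating locally gives the same messages the API is documented to return for these cases, without spending a request on them.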
Example
curl -X POST https://api.ocrforge.co.za/documents \
-H "Content-Type: application/json" \
-H "X-API-Key: ofk_your_api_key" \
-d '{
"filename": "invoice.pdf",
"mimeType": "application/pdf",
"fileSize": 245000,
"outputFormat": "json",
"webhookUrl": "https://your-app.com/webhook"
}'
Response
202 Accepted
{
"job_id": "doc_a1b2c3d4e5f6",
"status": "awaiting_upload",
"upload_url": "https://s3.af-south-1.amazonaws.com/...",
"upload_expires_in": 900,
"status_url": "/documents/doc_a1b2c3d4e5f6",
"instructions": "PUT your file to upload_url with the file bytes as body. Then poll status_url for progress."
}
Error Responses
| Status | Error | Cause |
|---|---|---|
| 400 | filename is required | Missing filename field |
| 400 | filename must be under 255 characters | Filename too long |
| 400 | Unsupported file type: ... | Invalid mimeType |
| 400 | File too large. Max: 50MB | fileSize exceeds limit |
| 400 | Invalid output format. Allowed: json, csv, text, pdf | Bad outputFormat value |
| 401 | Unauthorized | Missing or invalid API key |
| 403 | Monthly quota exceeded | Plan quota reached |
GET /documents/{id}
Check the status and metadata of an OCR job. Clients can only access their own jobs.
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| id | string | Job ID returned from POST /documents (e.g. doc_a1b2c3d4e5f6) |
Example
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6 \
-H "X-API-Key: ofk_your_api_key"
Response
200 OK
{
"job_id": "doc_a1b2c3d4e5f6",
"status": "completed",
"filename": "invoice.pdf",
"file_size": 245000,
"mime_type": "application/pdf",
"output_format": "json",
"model_used": "textract",
"pages_total": 3,
"pages_processed": 3,
"confidence": 0.94,
"progress": 100,
"processing_ms": 4200,
"result_url": "/documents/doc_a1b2c3d4e5f6/result",
"created_at": "2026-04-02T09:00:00Z",
"completed_at": "2026-04-02T09:00:05Z"
}
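Because timestamps are ISO 8601 in UTC, they can be parsed with the standard library. The sketch below uses hypothetical helper names and computes the elapsed time between `created_at` and `completed_at` from a status response like the one above.

```python
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2026-04-02T09:00:00Z."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return parsed.astimezone(timezone.utc)


def elapsed_seconds(job: dict) -> float:
    """Seconds between created_at and completed_at in a status response."""
    delta = parse_utc(job["completed_at"]) - parse_utc(job["created_at"])
    return delta.total_seconds()
```

For the example response this gives 5.0 seconds, which is measured from job creation and is therefore not the same figure as `processing_ms`.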
Job Statuses
| Status | Description |
|---|---|
| awaiting_upload | Pre-signed URL generated, waiting for file upload |
| queued | File received, waiting for OCR processing |
| processing | OCR engine is working on the document |
| completed | OCR done, results available via result endpoint |
| failed | Processing failed, with error details included in the response |
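A polling loop only needs to stop on the two terminal statuses, `completed` and `failed`. The following sketch assumes a caller-supplied `fetch_status` function that returns the parsed status response; the function name, the 2-second interval, and the 5-minute timeout are illustrative choices, not documented values.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    timeout_seconds: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job is completed or failed, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = fetch_status()
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still {job['status']} after {timeout_seconds} s")
        # awaiting_upload, queued, and processing all mean "check again later".
        sleep(interval_seconds)
```

The caller still has to inspect the returned status, since a `failed` job ends the loop just as a `completed` one does. Passing `sleep` as a parameter makes the loop testable without real delays.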
Error Responses
| Status | Error | Cause |
|---|---|---|
| 400 | Job ID is required | Missing ID in path |
| 401 | Unauthorized | Missing or invalid API key |
| 404 | Job not found | Invalid ID or belongs to another client |
GET /documents/{id}/result
Download the OCR result for a completed job. Returns a pre-signed download URL valid for 1 hour.
The job must be in completed status. If it is still processing, this endpoint returns 409 Conflict.
Path Parameters
| Parameter | Type | Description |
|---|---|---|
| id | string | Job ID |
Example
curl https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6/result \
-H "X-API-Key: ofk_your_api_key"
Response
200 OK
{
"job_id": "doc_a1b2c3d4e5f6",
"status": "completed",
"download_url": "https://s3.af-south-1.amazonaws.com/...",
"download_expires_in": 3600,
"filename": "invoice.pdf",
"pages": 3,
"confidence": 0.94,
"model_used": "textract"
}
Error Responses
| Status | Error | Cause |
|---|---|---|
| 401 | Unauthorized | Missing or invalid API key |
| 404 | Job not found | Invalid ID or belongs to another client |
| 409 | Job is not complete. Current status: processing | Job not yet finished |
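Client code usually wants to treat 409 differently from other errors, since it means "try again later" rather than "something is wrong". A minimal sketch, assuming the caller already has the HTTP status code and the response body as text; `JobNotReady` and `download_url_from_response` are hypothetical names.

```python
import json


class JobNotReady(Exception):
    """Raised when the result endpoint answers 409 Conflict."""


def download_url_from_response(status_code: int, body: str) -> str:
    """Return the pre-signed download URL, or raise for a non-200 response."""
    payload = json.loads(body)
    if status_code == 200:
        return payload["download_url"]
    if status_code == 409:
        # The job exists but has not finished; the caller can poll and retry.
        raise JobNotReady(payload.get("error", "Job is not complete"))
    raise RuntimeError(f"HTTP {status_code}: {payload.get('error', 'unknown error')}")
```

Since the download URL is valid for 1 hour (`download_expires_in` is 3600 in the example), it is safer to fetch the file soon after this call than to store the URL for later.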
GET /health
Health check endpoint. No authentication required. Use this to verify the API is operational.
Example
curl https://api.ocrforge.co.za/health
Response
200 OK
{
"status": "healthy",
"version": "1.0.0",
"region": "af-south-1"
}
Error Codes
All error responses follow the same format:
{
"error": "Human-readable error description"
}
HTTP Status Codes
| Code | Meaning | Description |
|---|---|---|
| 200 | OK | Request succeeded. Response body contains the requested data. |
| 202 | Accepted | Job created successfully. Processing is asynchronous, so poll the status endpoint. |
| 400 | Bad Request | Validation error. Missing required field, invalid value, or unsupported file type. |
| 401 | Unauthorized | Missing API key, invalid key, expired key, or suspended account. |
| 403 | Forbidden | Valid key but over quota. Upgrade your plan or wait until the next billing cycle. |
| 404 | Not Found | The requested job does not exist or belongs to another client. |
| 409 | Conflict | Request is valid but the resource state doesn't allow it (e.g. requesting results for a job still processing). |
| 429 | Too Many Requests | Rate limited. Check the Retry-After header for seconds to wait. |
| 500 | Internal Server Error | Unexpected server error. Contact support if this persists. |
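Since every error shares the same JSON shape, a client can turn any non-2xx response into one exception type. The sketch below is illustrative: `ApiError` and `error_from_response` are hypothetical names, and treating 429 and 500 as the only retryable codes is an assumption rather than documented behaviour.

```python
import json

# Assumption: only rate limiting and unexpected server errors are worth retrying.
RETRYABLE_CODES = {429, 500}


class ApiError(Exception):
    """An error response carrying the HTTP status and the "error" message."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status_code in RETRYABLE_CODES


def error_from_response(status_code: int, body: bytes) -> ApiError:
    """Build an ApiError, falling back to the raw body if it is not the expected JSON."""
    try:
        message = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        message = body.decode("utf-8", errors="replace")
    return ApiError(status_code, message)
```

The fallback matters for responses that never reach the application, such as a gateway error page, where the body may not be JSON at all.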
Rate Limits & Quotas
OCR Forge enforces rate limits at the API Gateway level and monthly quotas per client.
Rate Limits
Requests are throttled per API stage:
| Setting | Value |
|---|---|
| Requests per second | 100 |
| Burst allowance | 200 |
When rate limited, you will receive:
HTTP/1.1 429 Too Many Requests
Retry-After: 1
Wait for the number of seconds specified in Retry-After before retrying.
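A retry wrapper that honours Retry-After might look like the sketch below. It assumes a caller-supplied `send` function returning a `(status_code, headers, body)` tuple; the names, the default 1-second wait, and the limit of 5 attempts are illustrative choices.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as a number of seconds, falling back to a default."""
    # A plain dict lookup is case-sensitive; header objects from HTTP
    # libraries are usually case-insensitive.
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() again after each 429, waiting as instructed, up to max_attempts."""
    attempt = 1
    while True:
        response = send()
        if response[0] != 429 or attempt >= max_attempts:
            return response
        sleep(retry_after_seconds(response[1]))
        attempt += 1
```

If the last attempt is still rate limited, the 429 response is returned to the caller rather than hidden, so the caller can decide whether to give up or back off further.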
Monthly Quotas
Each client has a monthly page processing quota based on their plan:
| Plan | Pages / Month | Price |
|---|---|---|
| Free | 50 | R0 |
| Starter | 5,000 | R499/month |
| Business | 25,000 | R1,999/month |
| Enterprise | 100,000+ | Custom |
| Pay-as-you-go | Unlimited | R0.09 per 1,000 pages |
When your monthly quota is exceeded:
{
"error": "Monthly quota exceeded. Upgrade your plan."
}
Webhooks
Instead of polling the status endpoint, you can provide a webhookUrl when creating a job. OCR Forge will POST to your URL when processing completes or fails.
Setup
Include webhookUrl in your upload request:
curl -X POST https://api.ocrforge.co.za/documents \
-H "Content-Type: application/json" \
-H "X-API-Key: ofk_your_api_key" \
-d '{
"filename": "invoice.pdf",
"mimeType": "application/pdf",
"webhookUrl": "https://your-app.com/ocr-callback"
}'
Callback Payload
On Success
{
"event": "document.completed",
"job_id": "doc_a1b2c3d4e5f6",
"status": "completed",
"result_url": "https://api.ocrforge.co.za/documents/doc_a1b2c3d4e5f6/result"
}
On Failure
{
"event": "document.failed",
"job_id": "doc_a1b2c3d4e5f6",
"status": "failed",
"error": "OCR processing failed: corrupted image"
}
Webhook Behaviour
| Setting | Value |
|---|---|
| HTTP Method | POST |
| Content-Type | application/json |
| Timeout | 10 seconds |
| Retries | 3 attempts with exponential backoff |
Use the job_id from the callback to fetch the job status from the API. This confirms the webhook came from a real OCR Forge event.
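The verification step can be sketched as a small function that a webhook handler calls before acting on a callback. `verify_callback` is a hypothetical helper, and `fetch_status` stands in for whatever code performs the authenticated status request for a given job ID.

```python
from typing import Callable


def verify_callback(payload: dict, fetch_status: Callable[[str], dict]) -> dict:
    """Confirm a callback against the API and return the API's view of the job.

    Raises ValueError if the payload has no job_id or if its status
    disagrees with what the status endpoint reports.
    """
    job_id = payload.get("job_id")
    if not isinstance(job_id, str) or not job_id:
        raise ValueError("callback has no usable job_id")
    # The API's answer is trusted; the callback body is only a hint.
    job = fetch_status(job_id)
    if job.get("status") != payload.get("status"):
        raise ValueError("callback status does not match the API")
    return job
```

Working from the returned job rather than from the callback payload means a forged or stale callback cannot inject a result URL or status of its own.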
SDKs
Official SDKs for quick integration with your favourite language.
Python
pip install ocrforge
from ocrforge import OCRForge
client = OCRForge(api_key="ofk_your_api_key")
# Simple: upload, wait, get result
result = client.process("invoice.pdf")
print(result.text)
# Advanced: upload, then poll manually
job = client.upload("invoice.pdf", output_format="json")
job.wait()
print(job.result.text)
Node.js
npm install ocrforge
const { OCRForge } = require('ocrforge');
const client = new OCRForge({ apiKey: 'ofk_your_api_key' });
// await is not allowed at the top level of a CommonJS module,
// so the calls run inside an async function
async function main() {
  // Simple: upload, wait, get result
  const result = await client.process('./invoice.pdf');
  console.log(result.text);
  // Advanced: upload, then poll manually
  const job = await client.upload('./invoice.pdf', { outputFormat: 'json' });
  await job.waitForCompletion();
  console.log(job.result.text);
}
main().catch(console.error);
PHP
composer require ocrforge/ocrforge
use OCRForge\Client;
$client = new Client('ofk_your_api_key');
// Simple: upload, wait, get result
$result = $client->process('invoice.pdf');
echo $result->text;
// Advanced: upload, then poll manually
$job = $client->upload('invoice.pdf', ['outputFormat' => 'json']);
$job->waitForCompletion();
echo $job->result->structuredData;
.NET / C#
dotnet add package OCRForge
using OCRForge;
var client = new OcrForgeClient("ofk_your_api_key");
// Simple: upload, wait, get result
var result = await client.ProcessAsync("invoice.pdf");
Console.WriteLine(result.Text);
// Advanced: upload, then poll manually
var job = await client.UploadAsync("invoice.pdf", new UploadOptions
{
OutputFormat = "json"
});
job = await client.WaitForCompletionAsync(job.Id);
var detail = await client.GetResultAsync(job.Id);
Console.WriteLine(detail.DownloadUrl);
Postman Collection
Import the official Postman collection to test every API endpoint interactively. Includes pre-built requests, example responses, and test scripts.
Setup
- Open Postman and click Import
- Drop in ocrforge-api.postman_collection.json
- Import the environment file ocrforge-dev.postman_environment.json
- Select the OCR Forge Dev environment from the top-right dropdown
- Set your API key in the apiKey environment variable
- Run requests in order: Health → Create Job → Check Status → Get Result
What's Included
| Request | Method | Path | Auth |
|---|---|---|---|
| Health Check | GET | /health | No |
| Create Document Job | POST | /api/v1/documents | Yes |
| Check Job Status | GET | /api/v1/documents/{id} | Yes |
| Get Job Result | GET | /api/v1/documents/{id}/result | Yes |
The collection saves the jobId from the Create Job response. Run requests in order and subsequent requests will use the saved job ID automatically.