STATUS: PREVIEW (unstable) • build 2025.12.31+sha.3f2c9a1

FyreToolkit API Reference

Production-grade ML infrastructure APIs. Test everything right here with our interactive playground featuring realistic mock responses and state management.

Base URL: https://api.fyreops.com/v1
Auth:     HTTP Basic Auth (API key as username)
Playground: Use the right panel to test every endpoint with realistic mock data. Create models, deploy them, run predictions, manage batch jobs - all with persistent state during your session. No backend required!
  • JSON in / JSON out
  • HTTPS only
  • RESTful design
  • Comprehensive error codes

Quickstart

Get started in 60 seconds. Follow these steps to make your first successful request.

1) Get an API Key

Click Get API Key in the top navigation. For this playground, use the demo key: sk_live_demo_12345

2) Make your first request

List your deployed models to verify authentication:

GET /models

Authentication: Uses HTTP Basic Auth. Provide your API key as the username, leave password blank.
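As a minimal sketch, assuming Python's standard library as the client (the key and base URL are the demo values from this page; nothing is sent until you uncomment the final line):

```python
import base64
import urllib.request

API_KEY = "sk_live_demo_12345"           # demo key from step 1
BASE_URL = "https://api.fyreops.com/v1"

# Basic auth: the API key is the username and the password is empty,
# so the header value is base64("<api_key>:").
token = base64.b64encode(f"{API_KEY}:".encode()).decode()

request = urllib.request.Request(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Basic {token}"},
)

# models = urllib.request.urlopen(request).read()  # sends the real request
```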

3) Try it in the playground

Click List Models in the sidebar, then hit Send Request. You'll get a realistic response showing available models.

4) Explore capabilities

Use the Quick Test wizard to:

  • Run predictions on sentiment models
  • Create and monitor batch jobs
  • Deploy new models
  • View performance metrics

Authentication

The FyreToolkit API uses API keys to authenticate requests. API keys carry privileges, so keep them secure and never expose them in client-side code.

Authentication is performed via HTTP Basic Auth. Provide your API key as the basic auth username. Password should be left empty.

Header         Type    Description
Authorization  string  HTTP Basic Auth value. Format: Basic base64(api_key:)
                       Example: Basic c2tfbGl2ZV9kZW1vXzEyMzQ1Og==
Security Best Practices:
  • Never commit API keys to version control
  • Use environment variables for key storage
  • Rotate keys immediately if exposed
  • Use separate keys for dev/staging/production
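A short sketch of the environment-variable practice (FYRE_API_KEY is a hypothetical variable name; pick your own convention). Encoding the demo key this way yields exactly the example value in the table above:

```python
import base64
import os

# Keep the key out of source control: read it from the environment.
api_key = os.environ.get("FYRE_API_KEY", "sk_live_demo_12345")

# Authorization header value: Basic base64("<api_key>:")
auth_header = "Basic " + base64.b64encode(f"{api_key}:".encode()).decode()
```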

Rate Limits

Rate limits ensure platform stability and fair usage. Limits vary by endpoint type and subscription plan.

Category             Default Limit     Applies To
Read operations      120 requests/min  GET endpoints (models, metrics, job status)
Write operations     60 requests/min   POST/DELETE (deployments, batch jobs)
Real-time inference  Plan-dependent    POST /predict/* endpoints

Rate Limit Headers

Response headers provide limit information:

Header                 Description
X-RateLimit-Limit      Maximum requests per time window
X-RateLimit-Remaining  Requests remaining in current window
X-RateLimit-Reset      Unix timestamp when limit resets

Handling Rate Limits: When you receive 429 Too Many Requests, implement exponential backoff with jitter. The Retry-After header indicates when to retry.
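The backoff-with-jitter advice can be sketched as follows (send_request is a hypothetical stand-in for your HTTP call; only the delay schedule is implemented here):

```python
import random

def backoff_delays(retries: int, base: float = 0.5, cap: float = 30.0):
    """Yield exponentially growing delays with full jitter, capped at `cap` seconds."""
    for attempt in range(retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

# Usage sketch (send_request is hypothetical and returns an HTTP status code):
# for delay in backoff_delays(retries=5):
#     status = send_request()
#     if status != 429:
#         break
#     time.sleep(delay)  # prefer the Retry-After value when the header is present
```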

Errors

FyreToolkit uses conventional HTTP response codes and provides detailed error objects for debugging.

HTTP Status Codes

Code  Meaning                Description
200   OK                     Request succeeded
201   Created                Resource created successfully
202   Accepted               Request accepted for async processing
400   Bad Request            Invalid request (missing params, malformed JSON)
401   Unauthorized           Invalid or missing API key
404   Not Found              Resource doesn't exist
409   Conflict               Request conflicts with current state
422   Unprocessable Entity   Valid syntax but semantic errors
429   Too Many Requests      Rate limit exceeded
500   Internal Server Error  Server error (contact support with request_id)

Error Response Schema

Field             Type    Description
error.code        string  Machine-readable error code (e.g., "model_not_found")
error.message     string  Human-readable error description
error.request_id  string  Unique request identifier for support
error.details     object  Optional additional context (field errors, etc.)
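A sketch of handling an error body with this schema (the raw JSON below is a made-up example, not a captured response). Branch on the machine-readable code, and keep the request_id for support:

```python
import json

# Hypothetical error body following the schema above.
raw = """{"error": {"code": "model_not_found",
                    "message": "No model with id 'm_123'",
                    "request_id": "req_abc123",
                    "details": {"model_id": "m_123"}}}"""

error = json.loads(raw)["error"]

if error["code"] == "model_not_found":
    # Match on error.code, not the human-readable message, which may change.
    problem = f"{error['message']} (request_id: {error['request_id']})"
```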

Real-time Prediction

Generate low-latency predictions from deployed models. Optimized for real-time inference with sub-100ms response times for most models.

Request

POST /predict/{model_id}

Path Parameters

Name      Type    Description
model_id  string  Unique identifier of the deployed model

Request Body

Field     Type    Required  Description
inputs    array   Yes       Input data (text, JSON, tensors). Format depends on model.
metadata  object  No        Optional metadata (correlation_id, user_id, etc.)

Response

Field    Type     Description
id       string   Unique prediction identifier
object   string   Always "prediction"
created  integer  Unix timestamp (seconds)
model    string   Model ID used for prediction
results  array    Model outputs (one per input)
usage    object   Latency and compute usage metadata
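To illustrate the request shape, a hedged sketch (the model id, input text, and correlation id are made-up values; only the URL and JSON body are built, nothing is sent):

```python
import json

model_id = "sentiment-v2"  # hypothetical model id

payload = {
    "inputs": ["Great product, would buy again!"],  # format depends on the model
    "metadata": {"correlation_id": "corr-001"},     # optional
}

# Sent as JSON to POST /predict/{model_id}:
url = f"https://api.fyreops.com/v1/predict/{model_id}"
body = json.dumps(payload)
```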

Create Batch Job

Process large datasets asynchronously. Ideal for non-time-sensitive workloads requiring high throughput.

Request

POST /batch/{model_id}

Path Parameters

Name      Type    Description
model_id  string  Model to use for batch processing

Request Body

Field            Type    Required  Description
input_url        string  Yes       URL to CSV or JSONL file containing inputs
webhook          string  No        Callback URL for completion notification
idempotency_key  string  No        Prevents duplicate job creation on retries

Response

Field    Type     Description
id       string   Job identifier
status   string   "queued" initially
created  integer  Unix timestamp

Integration Pattern: Create job → poll GET /jobs/{job_id}, or receive a webhook callback when complete.
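The polling half of that pattern can be sketched as below. fetch_job is a stand-in for your GET /jobs/{job_id} call (injected so the sketch stays network-free) and is assumed to return a dict shaped like the Get Batch Job response:

```python
import time

def wait_for_job(fetch_job, job_id: str,
                 poll_interval: float = 5.0, timeout: float = 3600.0):
    """Poll fetch_job(job_id) until the job reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_job(job_id)
        if job["status"] in ("completed", "failed", "canceled"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```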

Get Batch Job

Retrieve status and results for a batch job. Poll this endpoint to monitor progress.

Request

GET /jobs/{job_id}

Response

Field       Type    Description
id          string  Job identifier
status      string  queued | processing | completed | failed | canceled
progress    object  Processed/total record counts
output_url  string  Results URL (when status=completed)

Cancel Batch Job

Cancel a queued or processing batch job. Completed jobs cannot be canceled.

Request

POST /jobs/{job_id}/cancel

Note: Cannot cancel jobs in "completed" or "failed" status.

List Models

Retrieve all models available in your organization. Supports pagination.

Request

GET /models

Query Parameters

Parameter       Type     Description
limit           integer  Max models to return (default: 20)
starting_after  string   Cursor for pagination

Response

Field     Type     Description
object    string   Always "list"
data      array    Array of model objects
has_more  boolean  Whether more models exist
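A sketch of following the cursor to exhaustion. fetch_page stands in for a GET /models?starting_after=... call (injected to keep the sketch network-free) and is assumed to return a dict with "data" and "has_more" as described above; using the last item's id as the next cursor is an assumption about how starting_after works:

```python
def list_all_models(fetch_page):
    """Collect every model by following the pagination cursor."""
    models, cursor = [], None
    while True:
        page = fetch_page(cursor)
        models.extend(page["data"])
        if not page["has_more"]:
            return models
        cursor = page["data"][-1]["id"]  # resume after the last model seen
```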

Get Model

Retrieve detailed information about a specific model.

Request

GET /models/{model_id}

Response

Field        Type     Description
id           string   Model identifier
status       string   ready | training | archived
created      integer  Unix timestamp
description  string   Model description

Create Deployment

Deploy a model to production infrastructure. Provisions resources and creates endpoints.

Request

POST /deployments

Request Body

Field            Type     Required  Description
model_uri        string   Yes       Model artifact URI (e.g., s3://...)
region           string   Yes       Target region (us-east-1, eu-west-1, etc.)
replicas         integer  No        Initial replica count (default: 1)
idempotency_key  string   No        Prevents duplicate deployments
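A sketch of building the request body (the model URI is a made-up example; generating the idempotency key as a UUID is one reasonable convention, not a requirement of the API). Reusing the same key when retrying the POST is what makes retries safe:

```python
import json
import uuid

deployment_request = {
    "model_uri": "s3://my-bucket/models/sentiment-v2",  # hypothetical artifact
    "region": "us-east-1",
    "replicas": 2,
    # Generate once, then reuse this exact value on every retry of this deploy.
    "idempotency_key": str(uuid.uuid4()),
}
body = json.dumps(deployment_request)
```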

List Deployments

Retrieve all active deployments in your organization.

Request

GET /deployments

Delete Deployment

Remove a deployment and stop serving traffic. Model artifacts are preserved.

Request

DELETE /deployments/{deployment_id}

Note: Deleting a deployment doesn't delete the model artifact.

Get Metrics

Retrieve performance metrics for a deployment (latency, throughput, errors).

Request

GET /metrics/{deployment_id}

Response

Field          Type    Description
deployment_id  string  Deployment identifier
period         string  Time window (e.g., "1h", "24h")
metrics        object  Performance metrics (requests, latency, errors)