FyreToolkit API Reference
Production-grade ML infrastructure APIs. Test everything right here with our interactive playground featuring realistic mock responses and state management.
Quickstart
Get started in 60 seconds. Follow these steps to make your first successful request.
1) Get an API Key
Click Get API Key in the top navigation. For this playground, use the demo key: sk_live_demo_12345
2) Make your first request
List your deployed models to verify authentication:
GET /models
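Outside the playground, the same call works from any HTTP client. A minimal sketch in Python (the base URL below is a placeholder; substitute the one for your environment, and see Authentication below for how the key is passed):

```python
import requests

API_KEY = "sk_live_demo_12345"                # demo key from step 1
BASE_URL = "https://api.fyretoolkit.example"  # placeholder base URL

# The API key goes in as the Basic Auth username; the password stays empty
response = requests.get(f"{BASE_URL}/models", auth=(API_KEY, ""))
response.raise_for_status()
print(response.json())
```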
3) Try it in the playground
Click List Models in the sidebar, then hit Send Request. You'll get a realistic response showing available models.
4) Explore capabilities
Use the Quick Test wizard to:
- Run predictions on sentiment models
- Create and monitor batch jobs
- Deploy new models
- View performance metrics
Authentication
The FyreToolkit API uses API keys to authenticate requests. API keys carry privileges, so keep them secure and never expose them in client-side code.
Authentication is performed via HTTP Basic Auth: provide your API key as the Basic Auth username and leave the password empty.
| Header | Type | Description |
|---|---|---|
| Authorization | string | HTTP Basic Auth value. Format: Basic base64(api_key:). Example: Basic c2tfbGl2ZV9kZW1vXzEyMzQ1Og== |
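If your HTTP client does not build Basic Auth headers for you, the value can be constructed directly. A minimal sketch of the encoding (note the trailing colon, which stands in for the empty password):

```python
import base64

api_key = "sk_live_demo_12345"
# Encode "api_key:" (the trailing colon means the password is empty)
token = base64.b64encode(f"{api_key}:".encode()).decode()
headers = {"Authorization": f"Basic {token}"}
# headers["Authorization"] == "Basic c2tfbGl2ZV9kZW1vXzEyMzQ1Og=="
```

However you build the header, keep the underlying key secure: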
- Never commit API keys to version control
- Use environment variables for key storage
- Rotate keys immediately if exposed
- Use separate keys for dev/staging/production
Rate Limits
Rate limits ensure platform stability and fair usage. Limits vary by endpoint type and subscription plan.
| Category | Default Limit | Applies To |
|---|---|---|
| Read operations | 120 requests/min | GET endpoints (models, metrics, job status) |
| Write operations | 60 requests/min | POST/DELETE (deployments, batch jobs) |
| Real-time inference | Plan-dependent | POST /predict/* endpoints |
Rate Limit Headers
Response headers provide limit information:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests per time window |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Unix timestamp when limit resets |
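These headers make it straightforward to back off before retrying after a 429. One possible approach, sketched in Python (the retry policy itself is an assumption, not something the API mandates):

```python
import time
import requests

def get_with_backoff(url, auth, max_attempts=5):
    """GET that waits for the rate-limit window to reset on 429 responses."""
    for _ in range(max_attempts):
        resp = requests.get(url, auth=auth)
        if resp.status_code != 429:
            return resp
        # X-RateLimit-Reset is a Unix timestamp; sleep until then (at least 1s)
        reset = int(resp.headers.get("X-RateLimit-Reset", "0"))
        time.sleep(max(reset - time.time(), 1))
    return resp
```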
Pagination
List endpoints return paginated results using cursor-based pagination for consistency and performance.
Request Parameters
| Parameter | Type | Description |
|---|---|---|
| limit | integer | Max records to return (default: 20, max: 100) |
| starting_after | string | Cursor ID. Returns items after this ID |
Response Format
| Field | Type | Description |
|---|---|---|
| object | string | Always "list" |
| data | array | Array of resource objects |
| has_more | boolean | Whether more records exist |
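Putting the two parameters together, you can walk a full collection by passing the id of the last item on each page as starting_after until has_more comes back false. A sketch against GET /models (base URL and auth are supplied by the caller):

```python
import requests

def list_all_models(base_url, auth):
    """Yield every model, fetching one page at a time."""
    params = {"limit": 100}
    while True:
        page = requests.get(f"{base_url}/models", auth=auth, params=params).json()
        yield from page["data"]
        if not page["has_more"]:
            break
        # Cursor: resume after the last item of the current page
        params["starting_after"] = page["data"][-1]["id"]
```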
Errors
FyreToolkit uses conventional HTTP response codes and provides detailed error objects for debugging.
HTTP Status Codes
| Code | Meaning | Description |
|---|---|---|
| 200 | OK | Request succeeded |
| 201 | Created | Resource created successfully |
| 202 | Accepted | Request accepted for async processing |
| 400 | Bad Request | Invalid request (missing params, malformed JSON) |
| 401 | Unauthorized | Invalid or missing API key |
| 404 | Not Found | Resource doesn't exist |
| 409 | Conflict | Request conflicts with current state |
| 422 | Unprocessable Entity | Valid syntax but semantic errors |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error (contact support with request_id) |
Error Response Schema
| Field | Type | Description |
|---|---|---|
| error.code | string | Machine-readable error code (e.g., "model_not_found") |
| error.message | string | Human-readable error description |
| error.request_id | string | Unique request identifier for support |
| error.details | object | Optional additional context (field errors, etc.) |
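In practice it helps to branch on the status code and to surface error.request_id when escalating to support. A sketch (the base URL is a placeholder and the model id is illustrative, chosen to trigger a 404):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder
MODEL_ID = "mdl_does_not_exist"               # illustrative, triggers a 404

resp = requests.get(f"{BASE_URL}/models/{MODEL_ID}", auth=(API_KEY, ""))
if not resp.ok:
    err = resp.json()["error"]
    if resp.status_code >= 500:
        # Include error.request_id when contacting support
        print(f"server error {err['code']} (request_id={err['request_id']})")
    else:
        print(f"{resp.status_code} {err['code']}: {err['message']}")
```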
Real-time Prediction
Generate low-latency predictions from deployed models. Optimized for real-time inference with sub-100ms response times for most models.
Request
POST /predict/{model_id}
Path Parameters
| Name | Type | Description |
|---|---|---|
| model_id | string | Unique identifier of the deployed model |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| inputs | array | Yes | Input data (text, JSON, tensors). Format depends on model. |
| metadata | object | No | Optional metadata (correlation_id, user_id, etc.) |
Response
| Field | Type | Description |
|---|---|---|
| id | string | Unique prediction identifier |
| object | string | Always "prediction" |
| created | integer | Unix timestamp (seconds) |
| model | string | Model ID used for prediction |
| results | array | Model outputs (one per input) |
| usage | object | Latency and compute usage metadata |
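For example, scoring a couple of text inputs against a deployed sentiment model might look like the sketch below (the base URL, model id, and input format are illustrative; the expected input format depends on the model):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder
MODEL_ID = "mdl_sentiment_v2"                 # illustrative model id

payload = {
    "inputs": ["great product, fast shipping", "never buying again"],
    "metadata": {"correlation_id": "req-42"},
}
resp = requests.post(f"{BASE_URL}/predict/{MODEL_ID}", auth=(API_KEY, ""), json=payload)
prediction = resp.json()
# "results" holds one output per input, in order
for text, result in zip(payload["inputs"], prediction["results"]):
    print(text, "->", result)
```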
Create Batch Job
Process large datasets asynchronously. Ideal for non-time-sensitive workloads requiring high throughput.
Request
POST /batch/{model_id}
Path Parameters
| Name | Type | Description |
|---|---|---|
| model_id | string | Model to use for batch processing |
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| input_url | string | Yes | URL to CSV or JSONL file containing inputs |
| webhook | string | No | Callback URL for completion notification |
| idempotency_key | string | No | Prevents duplicate job creation on retries |
Response
| Field | Type | Description |
|---|---|---|
| id | string | Job identifier |
| status | string | "queued" initially |
| created | integer | Unix timestamp |
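A sketch of creating a job from a hosted JSONL file (the base URL, model id, URLs, and idempotency key are all illustrative):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder
MODEL_ID = "mdl_sentiment_v2"                 # illustrative

job = requests.post(
    f"{BASE_URL}/batch/{MODEL_ID}",
    auth=(API_KEY, ""),
    json={
        "input_url": "https://example.com/reviews.jsonl",
        "webhook": "https://example.com/hooks/fyre-batch-done",
        "idempotency_key": "reviews-2024-06-01",  # safe to retry with the same key
    },
).json()
print(job["id"], job["status"])  # status starts as "queued"
```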
Get Batch Job
Retrieve status and results for a batch job. Poll this endpoint to monitor progress.
Request
GET /jobs/{job_id}
Response
| Field | Type | Description |
|---|---|---|
| id | string | Job identifier |
| status | string | One of "queued", "processing", "completed", "failed", "canceled" |
| progress | object | Processed/total record counts |
| output_url | string | Results URL (when status=completed) |
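A minimal polling loop might look like the sketch below (the polling interval and job id are illustrative; supplying a webhook at creation time avoids polling altogether):

```python
import time
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder

def wait_for_job(job_id, interval=10):
    """Poll GET /jobs/{job_id} until the job reaches a terminal state."""
    while True:
        job = requests.get(f"{BASE_URL}/jobs/{job_id}", auth=(API_KEY, "")).json()
        if job["status"] in ("completed", "failed", "canceled"):
            return job
        time.sleep(interval)

job = wait_for_job("job_123")  # illustrative job id
if job["status"] == "completed":
    print("results at:", job["output_url"])
```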
Cancel Batch Job
Cancel a queued or processing batch job. Completed jobs cannot be canceled.
Request
POST /jobs/{job_id}/cancel
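A sketch (the base URL and job id are illustrative):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder

# Only queued or processing jobs can be canceled
resp = requests.post(f"{BASE_URL}/jobs/job_123/cancel", auth=(API_KEY, ""))
print(resp.status_code)
```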
List Models
Retrieve all models available in your organization. Supports pagination.
Request
GET /models
Query Parameters
| Parameter | Type | Description |
|---|---|---|
| limit | integer | Max models to return (default: 20) |
| starting_after | string | Cursor for pagination |
Response
| Field | Type | Description |
|---|---|---|
| object | string | Always "list" |
| data | array | Array of model objects |
| has_more | boolean | Whether more models exist |
Get Model
Retrieve detailed information about a specific model.
Request
GET /models/{model_id}
Response
| Field | Type | Description |
|---|---|---|
| id | string | Model identifier |
| status | string | One of "ready", "training", "archived" |
| created | integer | Unix timestamp |
| description | string | Model description |
Create Deployment
Deploy a model to production infrastructure. Provisions resources and creates endpoints.
Request
POST /deployments
Request Body
| Field | Type | Required | Description |
|---|---|---|---|
| model_uri | string | Yes | Model artifact URI (e.g., s3://...) |
| region | string | Yes | Target region (us-east-1, eu-west-1, etc.) |
| replicas | integer | No | Initial replica count (default: 1) |
| idempotency_key | string | No | Prevents duplicate deployments |
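A sketch of a deployment request (the base URL, artifact URI, and idempotency key are illustrative):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder

deployment = requests.post(
    f"{BASE_URL}/deployments",
    auth=(API_KEY, ""),
    json={
        "model_uri": "s3://my-bucket/models/sentiment-v2.tar.gz",  # illustrative
        "region": "us-east-1",
        "replicas": 2,
        "idempotency_key": "sentiment-v2-prod",  # safe to retry with the same key
    },
)
print(deployment.json())
```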
List Deployments
Retrieve all active deployments in your organization.
Request
GET /deployments
Delete Deployment
Remove a deployment and stop serving traffic. Model artifacts are preserved.
Request
DELETE /deployments/{deployment_id}
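A sketch (the base URL and deployment id are illustrative):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder

# Stops serving traffic; the underlying model artifacts remain intact
resp = requests.delete(f"{BASE_URL}/deployments/dep_123", auth=(API_KEY, ""))
print(resp.status_code)
```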
Get Metrics
Retrieve performance metrics for a deployment (latency, throughput, errors).
Request
GET /metrics/{deployment_id}
Response
| Field | Type | Description |
|---|---|---|
| deployment_id | string | Deployment identifier |
| period | string | Time window (e.g., "1h", "24h") |
| metrics | object | Performance metrics (requests, latency, errors) |
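A sketch of fetching the latest metrics for a deployment (the base URL and deployment id are illustrative; only the path parameter is specified above):

```python
import requests

API_KEY = "sk_live_demo_12345"
BASE_URL = "https://api.fyretoolkit.example"  # placeholder

metrics = requests.get(f"{BASE_URL}/metrics/dep_123", auth=(API_KEY, "")).json()
print(metrics["period"], metrics["metrics"])
```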