Core Concepts
InfraMind introduces a protocol-level vocabulary that differs significantly from centralized infrastructure platforms. Its terminology reflects how compute is requested, routed, executed, and rewarded across an open network. These concepts are designed to be composable, deterministic, and implementation-agnostic. This section defines the foundational terms used throughout the system, with reference code, schemas, and behavioral expectations where applicable.
Model Containers
A model container in InfraMind is an OCI-compliant bundle that encapsulates an AI model, its runtime dependencies, execution script, and configuration metadata. This format allows any AI workload to run in a self-contained, portable manner without assumptions about the external environment. The container must expose at least one inference endpoint internally and may optionally support health probes, lifecycle events, and preloading.
Each container must include a model.yaml descriptor alongside the runtime package. Example:
name: text-generator-v1
version: 0.3.4
runtime: python3.10
entrypoint: app.py
resources:
  cpu: 4
  memory: 4Gi
  gpu: true
input_schema:
  type: text
output_schema:
  type: text
The entrypoint is called by the InfraMind agent within a sandboxed runtime. It must start a server or handler that listens for inference input. Supported protocols are HTTP (the default) and gRPC.
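As a minimal sketch of such an entrypoint, the app.py below serves the text-in/text-out schema declared in the model.yaml above over HTTP using only the Python standard library. The request shape, response shape, and port 8080 are illustrative assumptions, not part of the container spec:

# app.py -- minimal HTTP entrypoint sketch; payload shape and port are assumptions
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body forwarded by the InfraMind agent
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        # Placeholder model call: echoes input, matching the text/text schema
        result = {"output": f"generated: {payload.get('input', '')}"}
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # How the agent discovers the port is not specified here; 8080 is illustrative
    HTTPServer(("0.0.0.0", 8080), InferenceHandler).serve_forever()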
Inference Endpoints
An inference endpoint is a globally routed interface to a deployed model on the mesh. Endpoints are abstract: they do not resolve to any fixed server or instance. When a client calls an endpoint, the InfraMind scheduler dynamically assigns the request to a qualified node and proxies the result back.
Example REST request:
curl -X POST https://api.inframind.host/inference/v1/model/text-generator-v1 \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"input": "The rise of intelligent infrastructure begins"}'
Endpoint paths remain stable across calls; which node serves a given request varies with latency, routing decisions, and local execution success. Failover between nodes is automatic.
Mesh Scheduler
The mesh scheduler is a decentralized coordination layer responsible for routing jobs to nodes. It does not live on a single machine or contract; instead, it exists as a distributed set of indexers that receive model invocations, discover available compute, and assign tasks based on latency, trust, hardware, and stake profile.
Scheduler nodes do not need to agree by consensus but may use gossip to relay partial state, sync node health, and rebalance load. Scheduling is not blockchain-dependent. Nodes may discover schedulers via static peer lists, DHT, or a coordination registry.
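The assignment step can be thought of as a scoring function over candidate nodes. The sketch below assumes per-node latency, trust, and stake metrics have already been synced via gossip; the specific weighting is an illustrative assumption, not the protocol's actual formula:

# Sketch of a scheduler's node-selection step; the scoring formula is an assumption
from dataclasses import dataclass

@dataclass
class NodeProfile:
    node_id: str
    latency_ms: float
    trust_score: float  # 0..1, derived from past proofs of serving
    has_gpu: bool
    stake: float

def select_node(nodes: list[NodeProfile], needs_gpu: bool) -> NodeProfile:
    """Pick the best-scoring qualified node for a work unit."""
    qualified = [n for n in nodes if n.has_gpu or not needs_gpu]
    if not qualified:
        raise RuntimeError("no qualified node available")
    # Higher trust and stake raise the score; higher latency lowers it
    return max(qualified, key=lambda n: n.trust_score * n.stake / (1.0 + n.latency_ms))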
Work Units
A work unit is a single invocation of a model’s inference function. It includes all the contextual metadata required for execution:
{
  "job_id": "ca4a-1918",
  "model_ref": "QmXeD123...",
  "input": { "text": "InfraMind enables runtime sovereignty." },
  "requested_by": "0xA372...",
  "expires_at": 1718428912
}
Work units are assigned to nodes via a secure channel. Each job must be completed within the timeout window and must produce an output compatible with the model’s declared schema.
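On the node side, a work unit would typically be validated before execution. A minimal sketch mirroring the fields in the example above (the specific checks are assumptions about agent behavior, not mandated here):

# Sketch: node-side validation of an incoming work unit
import time

REQUIRED_FIELDS = {"job_id", "model_ref", "input", "requested_by", "expires_at"}

def validate_work_unit(unit: dict) -> None:
    """Reject malformed or expired work units before running inference."""
    missing = REQUIRED_FIELDS - unit.keys()
    if missing:
        raise ValueError(f"work unit missing fields: {missing}")
    if unit["expires_at"] <= time.time():
        raise ValueError("work unit expired before execution")
    # Schema compatibility of the eventual output is checked after inference,
    # against the input_schema/output_schema declared in model.yaml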
Proof of Serving
Upon completion of a work unit, a node must sign and return a proof of serving. This cryptographic receipt is used to verify that the node executed the job according to spec and to authorize payment.
Structure:
{
  "job_id": "ca4a-1918",
  "model_ref": "QmXeD123...",
  "output_hash": "e59c7f1...",
  "latency_ms": 127,
  "node_id": "0xB1Fc...",
  "signature": "0x49abf0e3..."
}
This payload is submitted to the reward oracle. InfraMind verifies that the job parameters match the original submission, confirms that the signature belongs to the correct node, and initiates payout if all conditions are met. Optionally, zkML proofs may be attached to attest that a specific model version was executed deterministically.
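The verification flow described above could look roughly like the sketch below. The hash algorithm (SHA-256), the exact byte string the node signs, and the verify_signature helper are all assumptions, since the signing scheme is not specified here:

# Sketch of the oracle's verification step; verify_signature is a stand-in
# for whatever signature scheme InfraMind actually uses
import hashlib
import json

def verify_proof(proof: dict, original_job: dict, output: bytes, verify_signature) -> bool:
    # 1. The proof must reference the job and model that were originally submitted
    if proof["job_id"] != original_job["job_id"]:
        return False
    if proof["model_ref"] != original_job["model_ref"]:
        return False
    # 2. Recompute the output hash and compare (SHA-256 is an assumption)
    if hashlib.sha256(output).hexdigest() != proof["output_hash"]:
        return False
    # 3. The signature must bind the proof fields to the claimed node
    signed_fields = json.dumps(
        {k: proof[k] for k in ("job_id", "model_ref", "output_hash", "latency_ms", "node_id")},
        sort_keys=True,
    ).encode()
    return verify_signature(proof["node_id"], signed_fields, proof["signature"])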
Node Staking
Node staking binds execution performance to economic guarantees. Every node maintains a stake weight that affects job assignment, trust scoring, and penalty thresholds. Stake may be self-deposited or delegated. Nodes with higher stake are eligible for higher-value workloads and incur greater penalties for SLA violations.
To stake on a node:
infra stake --node 0x9Ab3... --amount 500
Staking data is used by the scheduler to determine:
job eligibility tier
priority routing preference
slashing thresholds
stake-based weight in quorum jobs
Stake is non-custodial and can be withdrawn subject to an unbonding period. Slashing events are published and verifiable. Slashed stake is partially burned and partially redistributed to other nodes in the same epoch.
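As a worked illustration of the slashing split, assuming an even burn/redistribute ratio (the actual fractions are not specified here, only that slashed stake is partially burned and partially redistributed):

# Sketch of a slashing-event split; BURN_FRACTION is an illustrative assumption
BURN_FRACTION = 0.5

def apply_slash(slashed_amount: float, epoch_node_ids: list[str]) -> dict:
    """Split a slashed stake into a burned portion and per-node rewards."""
    burned = slashed_amount * BURN_FRACTION
    redistributed = slashed_amount - burned
    return {
        "burned": burned,
        "per_node_reward": redistributed / len(epoch_node_ids),
    }

# e.g. apply_slash(100.0, ["0x9Ab3...", "0xB1Fc..."])
#      -> {"burned": 50.0, "per_node_reward": 25.0}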
Together, these core components form the minimal language of InfraMind’s compute system. Containers define what runs. Endpoints abstract where. Schedulers determine who runs what. Work units define the job. Proofs verify the result. Staking governs the reputation and reliability of the nodes responsible.