TIVRA तीव्र

Turbo-Integrated Vectorized Runtime Acceleration

Heterogeneous hardware acceleration for post-quantum cryptography, hashing, and ML inference.

v3.0.0

The Sharp Note that Cuts Through Noise

In classical raga, tīvra is the raised note — sharp, intense, cutting through to reach higher frequencies. TIVRA does the same for computation: it finds every hardware shortcut available — SHA-NI instructions, AVX-512 vectors, GPU shader cores, NPU tensor units — and routes cryptographic operations through the fastest path. Like the tīvra note that elevates the melody, this layer elevates every operation above software-only performance.

Overview

TIVRA is the hardware acceleration layer that lets yakmesh exploit every silicon feature on modern hardware — AVX-512, SHA-NI, NVIDIA CUDA, AMD NPU (XDNA), and ONNX Runtime — for post-quantum cryptography operations and ML-based network intelligence.

Every call to SHA3-256, ML-DSA-65, ML-KEM-768, or model inference routes through TIVRA. If native acceleration is available, it is used automatically. If not, pure-JS fallbacks from @noble/hashes and @noble/post-quantum provide correct results on every platform.

Design Principle

Zero configuration. Call initialize() once at startup. TIVRA probes the hardware, wires native paths, and exposes a unified API. Consumer modules never check hardware flags — they just call sha3_256() or mlDsa65Sign() and get the fastest available path.
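The probe-once, route-everything pattern can be sketched in a few lines. This is a hypothetical simplification, not TIVRA's actual source, and it shows only the SHA3 path:

```javascript
// Hypothetical sketch: probe once for a native path, then route every
// call internally so callers never inspect hardware flags.
import { createHash } from 'node:crypto';

let nativeSha3 = false;

export function initialize() {
  try {
    createHash('sha3-256');   // throws if OpenSSL lacks SHA3-256
    nativeSha3 = true;
  } catch {
    nativeSha3 = false;       // real TIVRA would wire @noble/hashes here
  }
}

export function sha3_256hex(data) {
  // Routing happens here, invisibly to the caller.
  if (nativeSha3) return createHash('sha3-256').update(data).digest('hex');
  throw new Error('pure-JS fallback elided in this sketch');
}
```

The real module probes many more features (SIMD, GPU, NPU, native PQ backends), but the consumer-facing shape is the same: one `initialize()`, then plain function calls.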

Three-Tier Architecture

Tier 1: Hashing & PQ Crypto

SHA3-256, ML-DSA-65 sign/verify, ML-KEM-768 encapsulate/decapsulate. Native OpenSSL and liboqs paths with @noble fallback.

Tier 2: GPU Batch Verification

BatchVerifyQueue batches ML-DSA-65 verifications (8–256 per batch) for parallel GPU processing, falling back to parallel CPU verification when no CUDA GPU is available.

Tier 3: NPU Inference Engine

ONNX Runtime with DirectML (NPU) > CUDA (GPU) > CPU. Powers SAKSHI anomaly detection and KARMA trust prediction.

Hardware Detection

probe() runs once at startup and populates the global HW flags object:

Flag | Source | Description
cpuModel | os.cpus() | CPU brand string
avx512 / vaes / shaNI / gfni | CPU detection | SIMD instruction sets (Zen 4+, Intel 11th gen+)
nativeSha3 | crypto.createHash('sha3-256') | OpenSSL SHA3-256 support
nvGpu / nvGpuName / nvGpuVRAM | nvidia-smi | NVIDIA GPU name, VRAM, compute capability
amdNpu / amdNpuTops | PowerShell PnP | AMD XDNA NPU detection and TOPS rating
onnxRuntime / onnxProviders | dynamic import | ONNX Runtime availability and execution providers
nativePQ / nativePQBackend | dynamic import | liboqs, pqcrypto, or aspect native addon

import { initialize, HW } from 'yakmesh/utils/accel';

await initialize();

console.log(HW.cpuModel);    // 'AMD Ryzen 7 8700F'
console.log(HW.avx512);      // true  (Zen 4)
console.log(HW.nativeSha3);  // true  (OpenSSL 3.x)
console.log(HW.amdNpu);      // true  (XDNA NPU)
console.log(HW.amdNpuTops);  // 16    (TOPS rating)

SHA3-256 Acceleration

Every hash in yakmesh — content addresses, attestation fingerprints, seed expansion — uses SHA3-256. On Zen 4 hardware with native OpenSSL, TIVRA delivers 4.6× faster hashing vs pure JavaScript.

import { sha3_256, sha3_256hex } from 'yakmesh/utils/accel';

// Returns Uint8Array (32 bytes)
const hash = sha3_256(data);

// Returns hex string
const hex = sha3_256hex(data);

// Automatic path selection:
// HW.nativeSha3 = true  → crypto.createHash('sha3-256')  [4.6× faster]
// HW.nativeSha3 = false → @noble/hashes sha3_256          [always correct]

Post-Quantum Cryptography

ML-DSA-65 (Digital Signatures)

FIPS 204 lattice signatures used for DOKO certificates, MANTRA gossip, and gateway attestation. Native liboqs acceleration delivers roughly 10× faster signing and 8.5× faster verification.

Operation | Pure-JS (@noble) | Native (liboqs) | Speedup
sign | ~4.9 ms | ~0.5 ms | ~10×
verify | ~1.7 ms | ~0.2 ms | ~8.5×
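A usage sketch of the signature API, with function names and signatures as listed in the API reference below (the payload here is purely illustrative):

```javascript
import { mlDsa65Keygen, mlDsa65Sign, mlDsa65Verify } from 'yakmesh/utils/accel';

// Illustrative payload — not an actual DOKO certificate
const message = new TextEncoder().encode('doko-cert-payload');

// Key generation
const { publicKey, secretKey } = mlDsa65Keygen();

// Sign — native liboqs when available, @noble fallback otherwise
const signature = mlDsa65Sign(secretKey, message);

// Verify
const valid = mlDsa65Verify(signature, message, publicKey);
```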

ML-KEM-768 (Key Encapsulation)

FIPS 203 lattice KEM used by ANNEX for session establishment. Seeds are sourced from PRAHARI quantum entropy when available.

import { mlKem768Keygen, mlKem768Encapsulate, mlKem768Decapsulate } from 'yakmesh/utils/accel';

// Key generation (with optional PRAHARI seed)
const { publicKey, secretKey } = mlKem768Keygen(hybridSeed);

// Encapsulation (sender)
const { cipherText, sharedSecret } = mlKem768Encapsulate(publicKey);

// Decapsulation (receiver)
const sharedSecret2 = mlKem768Decapsulate(cipherText, secretKey);

GPU Batch Verification

The BatchVerifyQueue collects ML-DSA-65 verification requests and processes them in batches for higher throughput. When an NVIDIA GPU with CUDA is available, the queue targets GPU parallel processing; otherwise it uses parallel CPU verification.

Min batch size: 8 · Max batch size: 256 · Flush interval: 5 ms

import { batchVerify } from 'yakmesh/utils/accel';

// Enqueue verification — returns Promise<boolean>
const valid = await batchVerify.enqueue(signature, message, publicKey);

// Queue automatically flushes when:
// - batch reaches minBatchSize (8), or
// - flushInterval (5ms) elapses
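The size-or-timeout flush discipline can be sketched with a tiny generic batcher. This is a hypothetical illustration of the mechanism, not the actual BatchVerifyQueue:

```javascript
// Minimal size-or-timeout batcher: each enqueue() resolves once its
// batch has been processed as a unit.
class MicroBatchQueue {
  constructor(minBatch, flushMs, processBatch) {
    this.minBatch = minBatch;
    this.flushMs = flushMs;
    this.processBatch = processBatch; // (items[]) => results[]
    this.pending = [];
    this.timer = null;
  }

  enqueue(item) {
    return new Promise((resolve) => {
      this.pending.push({ item, resolve });
      if (this.pending.length >= this.minBatch) this.flush();
      else if (!this.timer) this.timer = setTimeout(() => this.flush(), this.flushMs);
    });
  }

  async flush() {
    clearTimeout(this.timer);
    this.timer = null;
    const batch = this.pending.splice(0);
    if (batch.length === 0) return;
    const results = await this.processBatch(batch.map((e) => e.item));
    batch.forEach((e, i) => e.resolve(results[i]));
  }
}
```

With `minBatch = 8` and `flushMs = 5` this mirrors the documented thresholds; the real queue would hand each batch to GPU or parallel-CPU verification instead of a plain callback.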

NPU Inference Engine

The InferenceEngine manages ONNX model sessions and routes inference to the best available accelerator. Provider priority:

1. DirectML (AMD NPU) — XDNA Neural Processing Unit: lowest latency, minimal power
2. CUDA (NVIDIA GPU) — discrete GPU: highest throughput for large batches
3. CPU (fallback) — always available: correct results on any platform

import { inference } from 'yakmesh/utils/accel';

// Load ONNX model
await inference.loadModel('sakshi-anomaly', './models/sakshi-anomaly.onnx');

// Run inference — automatically routes to NPU/GPU/CPU
const result = await inference.infer('sakshi-anomaly', {
  features: new Float32Array([0.8, 0.2, 0.5, ...]),
});

// Check provider
console.log(inference.provider);      // 'DmlExecutionProvider'
console.log(inference.isAccelerated); // true
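The priority amounts to a first-match scan over whatever providers ONNX Runtime reports as available. A hypothetical sketch, using the provider names the telemetry example above shows:

```javascript
// Provider names as surfaced by inference.provider; order = documented priority.
const PROVIDER_PRIORITY = [
  'DmlExecutionProvider',   // DirectML → AMD NPU
  'CUDAExecutionProvider',  // CUDA → NVIDIA GPU
  'CPUExecutionProvider',   // always available
];

function pickProvider(available) {
  return PROVIDER_PRIORITY.find((p) => available.includes(p))
      ?? 'CPUExecutionProvider';
}
```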

Telemetry

TIVRA tracks every call and native-path hit rate for performance monitoring:

import { getTelemetry, getStatus } from 'yakmesh/utils/accel';

const t = getTelemetry();
// {
//   sha3Calls: 14200,  sha3NativeHits: 14200,  sha3NativeRate: '100.0%',
//   signCalls: 380,    signNativeHits: 380,     signNativeRate: '100.0%',
//   verifyCalls: 1240, verifyNativeHits: 1240,  verifyNativeRate: '100.0%',
//   inferCalls: 45,    inferNpuHits: 45,        inferAccelRate: '100.0%',
// }

const status = getStatus();
// { hardware: { cpu, simd, gpu, npu }, acceleration: { sha3, pqCrypto, batchVerify, inference } }

Integration Points

TIVRA is wired into 12+ hot-path files across the codebase:

Consumer | Uses
ANNEX | ML-KEM-768 session keys + SHA3 KDF
DOKO | ML-DSA-65 certificate signing/verification
NAMCHE | Gateway attestation signatures
SAKSHI | NPU anomaly detection model inference
KARMA | NPU trust prediction model inference
PRAHARI | Entropy Sentinel NPU scoring + SHA3 seed expansion
YPC-27 | SHA3-256 + ML-DSA-65 for 27-trit checksums
SEVA | Distributed NPU compute sharing
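As an illustration of the ANNEX row, deriving a session key from a KEM shared secret with a SHA3 KDF might look like the following. The shared secret and label here are stand-ins, not ANNEX's actual derivation:

```javascript
import { createHash, randomBytes } from 'node:crypto';

// Stand-in: in yakmesh this would come from mlKem768Encapsulate().
const sharedSecret = randomBytes(32);

// Hypothetical domain-separation label; ANNEX's real label may differ.
const label = Buffer.from('annex-session-v1');

// One-step SHA3-256 KDF: sessionKey = SHA3-256(label || sharedSecret)
const sessionKey = createHash('sha3-256')
  .update(label)
  .update(sharedSecret)
  .digest();

console.log(sessionKey.length); // 32
```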

API Reference

initialize() → Promise<{ hw, telemetry }>

Probe hardware, initialize batch queue and inference engine. Call once at startup.

probe() → Promise<void>

Detect CPU SIMD, GPU, NPU, ONNX Runtime, and native PQ crypto.

sha3_256(data) → Uint8Array

SHA3-256 hash. Native or pure-JS, automatically selected.

mlDsa65Keygen() → { publicKey, secretKey }

Generate ML-DSA-65 keypair.

mlDsa65Sign(secretKey, message) → Uint8Array

Sign a message with ML-DSA-65.

mlDsa65Verify(signature, message, publicKey) → boolean

Verify an ML-DSA-65 signature.

mlKem768Keygen(seed?) → { publicKey, secretKey }

Generate ML-KEM-768 keypair with optional PRAHARI seed.

mlKem768Encapsulate(publicKey) → { cipherText, sharedSecret }

Encapsulate a shared secret using the receiver's public key.

mlKem768Decapsulate(cipherText, secretKey) → Uint8Array

Decapsulate and recover the shared secret.

batchVerify.enqueue(sig, msg, pk) → Promise<boolean>

Enqueue a verification for batch processing.

inference.infer(modelName, inputs) → Promise<Object|null>

Run inference on a loaded ONNX model.

getTelemetry() / getStatus()

Get call counts, native hit rates, and hardware summary.

Version History

Version | Changes
v3.0.0 | Full TIVRA module: 3-tier architecture, InferenceEngine, BatchVerifyQueue, telemetry. Wired into 12 hot-path files.
v2.9.0 | Initial probe() and SHA3-256 native path.