TIVRA तीव्र

Turbo-Integrated Vectorized Runtime Acceleration

Heterogeneous hardware acceleration for post-quantum cryptography, hashing, and ML inference.

v3.0.0

The Sharp Note that Cuts Through Noise

In classical raga, tīvra is the raised note — sharp, intense, cutting through to reach higher frequencies. TIVRA does the same for computation: it finds every hardware shortcut available — SHA-NI instructions, AVX-512 vectors, GPU shader cores, NPU tensor units — and routes cryptographic operations through the fastest path. Like the tīvra note that elevates the melody, this layer elevates every operation above software-only performance.

Overview

TIVRA is the hardware acceleration layer that lets yakmesh exploit every silicon feature on modern hardware — AVX-512, SHA-NI, NVIDIA CUDA, AMD NPU (XDNA), and ONNX Runtime — for post-quantum cryptography operations and ML-based network intelligence.

Every call to SHA3-256, ML-DSA-65, ML-KEM-768, or model inference routes through TIVRA. If native acceleration is available, it is used automatically. If not, pure-JS fallbacks from @noble/hashes and @noble/post-quantum provide correct results on every platform.

Design Principle

Zero configuration. Call initialize() once at startup. TIVRA probes the hardware, wires native paths, and exposes a unified API. Consumer modules never check hardware flags — they just call sha3_256() or mlDsa65Sign() and get the fastest available path.
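The probe-once, route-everything pattern can be sketched in a few lines. This is a hypothetical simplification, not TIVRA's actual source, and it shows only the SHA3 path:

```javascript
// Hypothetical sketch: probe once for a native path, then route every
// call internally so callers never inspect hardware flags.
import { createHash } from 'node:crypto';

let nativeSha3 = false;

export function initialize() {
  try {
    createHash('sha3-256');   // throws if OpenSSL lacks SHA3-256
    nativeSha3 = true;
  } catch {
    nativeSha3 = false;       // real TIVRA would wire @noble/hashes here
  }
}

export function sha3_256hex(data) {
  // Routing happens here, invisibly to the caller.
  if (nativeSha3) return createHash('sha3-256').update(data).digest('hex');
  throw new Error('pure-JS fallback elided in this sketch');
}
```

The real module probes many more features (SIMD, GPU, NPU, native PQ backends), but the consumer-facing shape is the same: one `initialize()`, then plain function calls.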

Three-Tier Architecture

Tier 1: Hashing & PQ Crypto

SHA3-256, ML-DSA-65 sign/verify, ML-KEM-768 encapsulate/decapsulate. Native OpenSSL and liboqs paths with @noble fallback.

Tier 2: GPU Batch Verification

BatchVerifyQueue batches ML-DSA-65 verifications (8–256 per batch) for parallel GPU processing, falling back to parallel CPU verification when no CUDA GPU is available.

Tier 3: NPU Inference Engine

ONNX Runtime with DirectML (NPU) > CUDA (GPU) > CPU. Powers SAKSHI anomaly detection and KARMA trust prediction.

Hardware Detection

probe() runs once at startup and populates the global HW flags object:

Flag | Source | Description
cpuModel | os.cpus() | CPU brand string
avx512 / vaes / shaNI / gfni | CPU detection | SIMD instruction sets (Zen 4+, Intel 11th gen+)
nativeSha3 | crypto.createHash('sha3-256') | OpenSSL SHA3-256 support
nvGpu / nvGpuName / nvGpuVRAM | nvidia-smi | NVIDIA GPU name, VRAM, compute capability
amdNpu / amdNpuTops | PowerShell PnP | AMD XDNA NPU detection and TOPS rating
onnxRuntime / onnxProviders | dynamic import | ONNX Runtime availability and execution providers
nativePQ / nativePQBackend | dynamic import | liboqs, pqcrypto, or aspect native addon

import { initialize, HW } from 'yakmesh/utils/accel';

await initialize();

console.log(HW.cpuModel);    // 'AMD Ryzen 7 8700F'
console.log(HW.avx512);      // true  (Zen 4)
console.log(HW.nativeSha3);  // true  (OpenSSL 3.x)
console.log(HW.amdNpu);      // true  (XDNA NPU)
console.log(HW.amdNpuTops);  // 16    (TOPS rating)

SHA3-256 Acceleration

Every hash in yakmesh — content addresses, attestation fingerprints, seed expansion — uses SHA3-256. On Zen 4 hardware with native OpenSSL, TIVRA delivers 4.6× faster hashing vs pure JavaScript.

import { sha3_256, sha3_256hex } from 'yakmesh/utils/accel';

// Returns Uint8Array (32 bytes)
const hash = sha3_256(data);

// Returns hex string
const hex = sha3_256hex(data);

// Automatic path selection:
// HW.nativeSha3 = true  → crypto.createHash('sha3-256')  [4.6× faster]
// HW.nativeSha3 = false → @noble/hashes sha3_256          [always correct]

Post-Quantum Cryptography

ML-DSA-65 (Digital Signatures)

FIPS 204 lattice signatures used for DOKO certificates, MANTRA gossip, and gateway attestation. Native liboqs acceleration delivers roughly 10× faster signing and 8.5× faster verification.

Operation | Pure-JS (@noble) | Native (liboqs) | Speedup
sign | ~4.9 ms | ~0.5 ms | ~10×
verify | ~1.7 ms | ~0.2 ms | ~8.5×
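A usage sketch of the signature API, with function names and signatures as listed in the API reference below (the payload here is purely illustrative):

```javascript
import { mlDsa65Keygen, mlDsa65Sign, mlDsa65Verify } from 'yakmesh/utils/accel';

// Illustrative payload — not an actual DOKO certificate
const message = new TextEncoder().encode('doko-cert-payload');

// Key generation
const { publicKey, secretKey } = mlDsa65Keygen();

// Sign — native liboqs when available, @noble fallback otherwise
const signature = mlDsa65Sign(secretKey, message);

// Verify
const valid = mlDsa65Verify(signature, message, publicKey);
```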

ML-KEM-768 (Key Encapsulation)

FIPS 203 lattice KEM used by ANNEX for session establishment. Seeds are sourced from PRAHARI quantum entropy when available.

import { mlKem768Keygen, mlKem768Encapsulate, mlKem768Decapsulate } from 'yakmesh/utils/accel';

// Key generation (with optional PRAHARI seed)
const { publicKey, secretKey } = mlKem768Keygen(hybridSeed);

// Encapsulation (sender)
const { cipherText, sharedSecret } = mlKem768Encapsulate(publicKey);

// Decapsulation (receiver)
const sharedSecret2 = mlKem768Decapsulate(cipherText, secretKey);

GPU Batch Verification

The BatchVerifyQueue collects ML-DSA-65 verification requests and processes them in batches for higher throughput. When an NVIDIA GPU with CUDA is available, the queue targets GPU parallel processing; otherwise it uses parallel CPU verification.

Min batch size: 8 · Max batch size: 256 · Flush interval: 5 ms

import { batchVerify } from 'yakmesh/utils/accel';

// Enqueue verification — returns Promise<boolean>
const valid = await batchVerify.enqueue(signature, message, publicKey);

// Queue automatically flushes when:
// - batch reaches minBatchSize (8), or
// - flushInterval (5ms) elapses
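The size-or-timeout flush discipline can be sketched with a tiny generic batcher. This is a hypothetical illustration of the mechanism, not the actual BatchVerifyQueue:

```javascript
// Minimal size-or-timeout batcher: each enqueue() resolves once its
// batch has been processed as a unit.
class MicroBatchQueue {
  constructor(minBatch, flushMs, processBatch) {
    this.minBatch = minBatch;
    this.flushMs = flushMs;
    this.processBatch = processBatch; // (items[]) => results[]
    this.pending = [];
    this.timer = null;
  }

  enqueue(item) {
    return new Promise((resolve) => {
      this.pending.push({ item, resolve });
      if (this.pending.length >= this.minBatch) this.flush();
      else if (!this.timer) this.timer = setTimeout(() => this.flush(), this.flushMs);
    });
  }

  async flush() {
    clearTimeout(this.timer);
    this.timer = null;
    const batch = this.pending.splice(0);
    if (batch.length === 0) return;
    const results = await this.processBatch(batch.map((e) => e.item));
    batch.forEach((e, i) => e.resolve(results[i]));
  }
}
```

With `minBatch = 8` and `flushMs = 5` this mirrors the documented thresholds; the real queue would hand each batch to GPU or parallel-CPU verification instead of a plain callback.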

NPU Inference Engine

The InferenceEngine manages ONNX model sessions and routes inference to the best available accelerator. Provider priority:

1. DirectML (AMD NPU) — XDNA Neural Processing Unit: lowest latency, minimal power
2. CUDA (NVIDIA GPU) — discrete GPU: highest throughput for large batches
3. CPU (fallback) — always available: correct results on any platform

import { inference } from 'yakmesh/utils/accel';

// Load ONNX model
await inference.loadModel('sakshi-anomaly', './models/sakshi-anomaly.onnx');

// Run inference — automatically routes to NPU/GPU/CPU
const result = await inference.infer('sakshi-anomaly', {
  features: new Float32Array([0.8, 0.2, 0.5, ...]),
});

// Check provider
console.log(inference.provider);      // 'DmlExecutionProvider'
console.log(inference.isAccelerated); // true
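The priority amounts to a first-match scan over whatever providers ONNX Runtime reports as available. A hypothetical sketch, using the provider names the telemetry example above shows:

```javascript
// Provider names as surfaced by inference.provider; order = documented priority.
const PROVIDER_PRIORITY = [
  'DmlExecutionProvider',   // DirectML → AMD NPU
  'CUDAExecutionProvider',  // CUDA → NVIDIA GPU
  'CPUExecutionProvider',   // always available
];

function pickProvider(available) {
  return PROVIDER_PRIORITY.find((p) => available.includes(p))
      ?? 'CPUExecutionProvider';
}
```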

Telemetry

TIVRA tracks every call and native-path hit rate for performance monitoring:

import { getTelemetry, getStatus } from 'yakmesh/utils/accel';

const t = getTelemetry();
// {
//   sha3Calls: 14200,  sha3NativeHits: 14200,  sha3NativeRate: '100.0%',
//   signCalls: 380,    signNativeHits: 380,     signNativeRate: '100.0%',
//   verifyCalls: 1240, verifyNativeHits: 1240,  verifyNativeRate: '100.0%',
//   inferCalls: 45,    inferNpuHits: 45,        inferAccelRate: '100.0%',
// }

const status = getStatus();
// { hardware: { cpu, simd, gpu, npu }, acceleration: { sha3, pqCrypto, batchVerify, inference } }

Integration Points

TIVRA is wired into 12+ hot-path files across the codebase:

Consumer | Uses
ANNEX | ML-KEM-768 session keys + SHA3 KDF
DOKO | ML-DSA-65 certificate signing/verification
NAMCHE | Gateway attestation signatures
SAKSHI | NPU anomaly detection model inference
KARMA | NPU trust prediction model inference
PRAHARI | Entropy Sentinel NPU scoring + SHA3 seed expansion
YPC-27 | SHA3-256 + ML-DSA-65 for 27-trit checksums
SEVA | Distributed NPU compute sharing
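As an illustration of the ANNEX row, deriving a session key from a KEM shared secret with a SHA3 KDF might look like the following. The shared secret and label here are stand-ins, not ANNEX's actual derivation:

```javascript
import { createHash, randomBytes } from 'node:crypto';

// Stand-in: in yakmesh this would come from mlKem768Encapsulate().
const sharedSecret = randomBytes(32);

// Hypothetical domain-separation label; ANNEX's real label may differ.
const label = Buffer.from('annex-session-v1');

// One-step SHA3-256 KDF: sessionKey = SHA3-256(label || sharedSecret)
const sessionKey = createHash('sha3-256')
  .update(label)
  .update(sharedSecret)
  .digest();

console.log(sessionKey.length); // 32
```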

API Reference

initialize() → Promise<{ hw, telemetry }>

Probe hardware, initialize batch queue and inference engine. Call once at startup.

probe() → Promise<void>

Detect CPU SIMD, GPU, NPU, ONNX Runtime, and native PQ crypto.

sha3_256(data) → Uint8Array

SHA3-256 hash. Native or pure-JS, automatically selected.

mlDsa65Keygen() → { publicKey, secretKey }

Generate ML-DSA-65 keypair.

mlDsa65Sign(secretKey, message) → Uint8Array

Sign a message with ML-DSA-65.

mlDsa65Verify(signature, message, publicKey) → boolean

Verify an ML-DSA-65 signature.

mlKem768Keygen(seed?) → { publicKey, secretKey }

Generate ML-KEM-768 keypair with optional PRAHARI seed.

mlKem768Encapsulate(publicKey) → { cipherText, sharedSecret }

Encapsulate a shared secret using the receiver's public key.

mlKem768Decapsulate(cipherText, secretKey) → Uint8Array

Decapsulate and recover the shared secret.

batchVerify.enqueue(sig, msg, pk) → Promise<boolean>

Enqueue a verification for batch processing.

inference.infer(modelName, inputs) → Promise<Object|null>

Run inference on a loaded ONNX model.

getTelemetry() / getStatus()

Get call counts, native hit rates, and hardware summary.

Version History

Version | Changes
v3.0.0 | Full TIVRA module: 3-tier architecture, InferenceEngine, BatchVerifyQueue, telemetry. Wired into 12 hot-path files.
v2.9.0 | Initial probe() and SHA3-256 native path.